So, we’re just retiring our first server. We used to name our servers after our clients, and our first client was MIND – Microsoft Internet Developer magazine, a precursor to MSDN Magazine. It was with nostalgia that I went looking for that first bit of technical prose… But alas, the Microsoft archives only go back to 2000… Duh??!!
Not to be discouraged, I turned to the amazing web archive and they had the copy. Amazing!
So, in honour of the now defunct MIND server, I reprint the original “Make Your Legacy Apps Work on the Internet” authored more-or-less 10 years ago, and still right on message. You have to love the old days when “VB6 COM Objects” were cutting edge, and XML was the be-all-and-end-all-that-it-should-be-the-whole-cover. Beautiful.
Check it on the web archive.
----
This article assumes you're familiar with XML, Microsoft Message Queue, and Visual Basic.
Download the code (22KB)
Make Your Legacy Apps Work on the Internet
Scott Howlett and Jeff Dunmall
The Internet is increasingly becoming an important path for business-to-business data services. You can use BizTalk and COM to keep your existing systems useful for longer.
In the coming years, most companies will integrate their Internet presence with their mission-critical line-of-business systems. Creating these applications will be the most difficult challenge Internet developers have yet to face. It means enabling interoperability between legacy systems, possibly from different companies, and doing it with the availability and scalability that Internet applications demand. By getting into the right design mindset and making the right technology choices, notably Extensible Markup Language (XML) and Microsoft® Message Queue Services (MSMQ), Internet applications can provide the return-on-investment that has been promised in the past, but rarely delivered.
First, let’s explore what legacy systems are and why they’re important. We’ll walk you through interoperable Internet application development and the issues involved in building them. Then we’ll build an Internet-based Address Change facility as a traditional Internet application that integrates with existing systems. We’ll also show you how the design is extensible, enabling business-to-business communication using the BizTalk framework. Surprisingly, the framework can be written with only about 100 lines of ASP and Visual Basic® code.
Microsoft will expand the technology available for integration with legacy systems when it releases the Microsoft Enterprise Interop Server, codenamed "Babylon" (see Figure 1). Babylon delivers application integration with COM Transaction Integrator (COMTI) and an MSMQ-to-MQSeries bridge, data integration with OLE DB providers and ODBC drivers, and network/platform integration via an SNA gateway or direct TCP/IP access. We’ll only address the challenges of getting data to the Interop server. To find out more about Microsoft’s interoperability strategy, go to http://web.archive.org/web/20000617215818/http://www.microsoft.com/isapi/gomscom.asp?target=/interoperability/.
Figure 1: Microsoft Enterprise Interop Server Architecture
Background
If you were at Microsoft TechEd this past May, you probably heard Paul Maritz’s discussion about the upcoming third generation of Internet applications. In his analysis, the first generation involved content publishing based on the HTML standard. The second generation was based on dynamic content using Windows® DNA technology, and the third generation will feature integrated Internet systems that enable more effective line-of-business applications. Building third-generation Internet systems will demand integration with legacy systems from one or more companies. Companies will be aggressively developing these applications because they know that customers and suppliers will demand integrated systems based on open Internet standards. Increased efficiency and streamlined business processes should have pleasant effects on the bottom line, in part through the competitive advantage gained by providing better and faster service. If a company doesn’t or can’t provide integrated services, business opportunities will likely be lost. If you read Don Box’s article, "Windows 2000 Brings Significant Refinements to the COM(+) Programming Model" (Microsoft Systems Journal, May 1999), you were probably surprised at the notion that COM components developed before Windows 2000 are often referred to as "legacy" components. You’ve never written any COBOL, so how could you have any legacy code? The terminology shouldn’t surprise anyone. It simply means that you have written software that is part of a production system. Having legacy code is a good thing because the alternative is that none of your code is working in production! So, simply put, a legacy system is any system that exists in production. It’s made of legacy code and the manual processes, such as call centers, that support the system. Legacy systems often have hundreds of development-years invested in them. This is why it’s so important to evaluate legacy systems and build Internet systems on top of their foundations. 
With the right design, you will be able to overcome the pitfalls associated with integration and interoperability. Building line-of-business systems in Internet time (less than six months) demands that existing systems be exploited. Usually, legacy systems have some manual processing where the combined expertise and experience of people is critical. You simply can’t reproduce that effort in your timeframe, and you probably don’t want to anyway—unless you have a penchant for pain or like to rewrite COBOL source code. Of course, legacy systems aren’t all good news for Internet applications. When you bring them into the fold, you’re adding the most dangerous software villain: the unknown quantity of legacy systems. But on the bright side, you’ve got an excellent starting point because you already have a working system. Legacy systems are frequently not documented, leaving everyone afraid that doing anything to them might break them. You can’t change any of their code and no one can or will tell you how they work. So there have to be legacy experts on the team, at least for the design period. They’ll be responsible for investigating, researching, and documenting the interfaces, business rules, and other facets of the legacy systems. Finding legacy experts is usually the first problem in building these systems. On the technology side—where we’ll focus the rest of this article—there are a number of hurdles to overcome. Integration isn’t always easy. In fact, it almost always involves some pain. But with the introduction of technologies such as XML and application services such as MSMQ, building interoperable systems has never been easier. Microsoft Windows 2000 offers more technology advancements that will make interoperability even easier, notably native support for queued components. Babylon, due in beta in the last quarter of 1999, will provide even more opportunity for legacy integration.
Availability, Performance, and Scalability
Generally, integration problems fall into three familiar categories: availability, performance/scalability, and exception handling. The availability of your application is related directly to the availability of the legacy systems you connect with. Your Internet application likely needs to run 24/7, while your legacy systems may not have the same constraints. They may be scheduled for nightly maintenance or weekend shutdowns. What will your Internet application do when legacy systems are unavailable? From a performance and scalability perspective, the legacy system probably can’t handle the volume of traffic that the Internet site can, at least not in the same form. Including legacy resources in your application increases transaction time, thereby decreasing scalability. Sometimes even connecting to the legacy systems can be a costly operation. If the only interface to the legacy system is through screen scraping (don’t laugh, it happens all the time), then you definitely have a back-end resource bottleneck to overcome in your design. If you have any combination of these issues, you must find a way to decouple your legacy systems from your Internet application, making it transparent to your customers. The most effective way to accomplish this is with MSMQ. On the Internet side, you can concurrently send thousands of requests and get excellent performance from MSMQ. On the legacy side, you can process those messages at a pace that the legacy system can handle. If there are only five database connections available through SNA, for example, you can process five messages at a time. This effectively allows you to throttle the load on the legacy system, while at the same time addressing your availability concerns and allowing your Internet application to scale appropriately. A third grade French teacher once said, "Pour toutes les règles, il y a des exceptions." (Every rule has an exception.) 
We don’t remember much French, but after years in the software business, we’ve refined these words of wisdom: "Every business rule that has existed for more than five years has at least one exception." And herein lies the third category of integration problem: if every rule can be broken, how do you write a middle tier that enforces the business rules? As a further complicating factor, many of these rules are entrenched in the legacy systems and manual processes (and people) that surround them. So to build a completely automated system, it’s necessary to discover every business rule that’s grown up around the process, and to uncover all the exceptions to the rules. Realistically, it takes time to flush all of these rules out and code them into your application, and they typically aren’t documented very well. In many cases, this is not possible in a six-month timeframe, so you’ll need to incorporate the manual processes to handle the remaining exceptions, at least in the first version. The golden rule is to make most things automatic and everything else possible, even if it means invoking a manual process. For example, if you move to another town, it’s likely that your auto insurance policy needs to be updated to reflect your new neighborhood. So when building an Internet address change system for an insurance company, you have two design choices. You can either reprogram the system to automate the adjustment of the policy, or you can integrate this change within the existing process. If you’ve ever worked with an insurance company, option one should send shivers down your spine, especially if you have to deliver your application in six months. Changing a policy involves numerous rules that need to be enforced, many of them for legal reasons. Besides, you’re supposed to be building an address change system, not an automatic policy update system. So the best alternative is to integrate the existing policy update process into the new system. 
This will save you time (the scarcest resource) in the development cycle because you won’t have to discover every rule and exception up front. By incorporating the existing manual processes, you can rely on the expertise of the people who know the system best. Integration makes your life easier by saving you time and headaches. In version 2, after you get the app up and running, you can seek to uncover and incorporate more business rules and thereby automate the system more completely. Don’t underestimate how complex this will be. Even though a division might look like a call center to you, it’s also the brain center for many businesses, and it can’t be easily replaced.
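The throttling idea described earlier in this section (a fast front end that only enqueues requests, and a small fixed number of workers that drain them at the legacy system's pace) can be sketched in a few lines. This is a minimal illustration in Python, not the article's Visual Basic code: the standard library's in-process `queue.Queue` stands in for MSMQ, and the names `MAX_LEGACY_CONNECTIONS`, `LegacySystem`, and `drain` are hypothetical. Unlike a real message queue, nothing here is durable or transactional.

```python
import queue
import threading
import time

MAX_LEGACY_CONNECTIONS = 5  # assumption: the legacy side allows five connections


class LegacySystem:
    """Hypothetical slow back end that records its peak concurrent load."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def change_address(self, request):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.001)  # pretend the legacy call is slow
        with self._lock:
            self._active -= 1
        return request["customer_id"]


def drain(requests, handler, workers=MAX_LEGACY_CONNECTIONS):
    """Process every queued request using at most `workers` concurrent calls."""
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                item = requests.get_nowait()
            except queue.Empty:
                return  # queue is drained, so this worker stops
            outcome = handler(item)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# The web tier enqueues quickly and never waits for the legacy system.
inbox = queue.Queue()
for customer_id in range(100):
    inbox.put({"customer_id": customer_id, "city": "Toronto"})

legacy = LegacySystem()
processed = drain(inbox, legacy.change_address)
```

However many requests arrive, the legacy system never sees more than `MAX_LEGACY_CONNECTIONS` calls at once; the cost is that a customer's change is applied some time after it is submitted rather than immediately.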
How to Bring Them Together
So how do you overcome the availability, performance/scalability, and exception handling pitfalls? How many new design patterns and technologies do you need to learn? There’s some good news. All three areas can be addressed by one technology choice: a combination of MSMQ and XML. But technology alone won’t save the day. To do it right, you’ll need to change the way you think about application development. Generally speaking, people are on-demand thinkers. When we’re hungry, we eat; when we’re thirsty, we drink. This type of thinking translates into a design pattern that will leave you dead in the water when it comes to building integrated Internet systems. You have to break this design pattern by thinking asynchronously first, and designing synchronous transactions only as a last resort. If you must use a synchronous transaction, hold on to your hat—in some cases it simply may not be possible with hundreds of users. Benchmark synchronous transactions early to avoid late-breaking scalability problems. You also have to get over the common mindset that goes something like this: "If I (re)build everything in the system, all the code will be mine and everything will work." While this may be true in some cases, it is not the proper approach, especially if you want to finish on time. The key here is to make use of the enormous amount of effort that has been invested in existing systems and processes. If you integrate with an existing process, it’s likely that the development time spent handling exceptions can be reduced by half because the existing process already has built-in exception handling mechanisms. Finally, you must embrace the versioned approach to software development. If version 1.0 of your application is also meant to be the last version, it is almost certain to fail. A one-version approach will compromise proper system design, the first casualty of increased scope inside the same development schedule. 
Furthermore, you’ll lose the ability to make course corrections, both in overall architecture and specific features.
MSMQ and XML
As we mentioned previously, the three main problems with integrated systems (availability, performance/scalability, and exception handling) can be addressed with a single technology choice: MSMQ and XML. Many systems already communicate by passing a simple string. In order for both systems to understand the format and location of the data in the string, it’s probably marked up with some kind of token system or based on fixed-length fields. XML is the formalization of this concept. It also includes the tools and techniques to make development easier. By using valid XML, meaning XML that conforms to a document type definition (DTD) or schema, industries can standardize on a data format, giving applications the ability to exchange data with a much larger audience. XML provides a great way to model data and represent that data in a simple and powerful format. What XML does for data, MSMQ does for transport. MSMQ gives systems a reliable and disconnected path for the transfer of XML between different systems in an organization. It guarantees that the XML is delivered once (and only once) when network conditions permit. If you’re integrating with systems that are widely distributed geographically, not always available, slow, or nonscalable, MSMQ allows you to isolate these systems from your customers. Your customers need not be concerned with your legacy systems because customers never interact directly with them.
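To make the contrast concrete, here is a small Python sketch of the same address record exchanged first as a fixed-length string and then as XML. The field widths, element names, and sample values are invented for illustration; they are not taken from the article's figures. Note also that `xml.etree.ElementTree` only checks that the document is well formed and does not validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Fixed-length layout: both sides must agree on the offsets out of band.
FIELDS = [("customer_id", 6), ("street", 20), ("city", 12)]  # hypothetical widths


def parse_fixed(record):
    """Slice a fixed-length record into named fields by position."""
    values, offset = {}, 0
    for name, width in FIELDS:
        values[name] = record[offset:offset + width].strip()
        offset += width
    return values


def parse_xml(document):
    """Read the same fields from self-describing XML, by element name."""
    root = ET.fromstring(document)
    return {child.tag: (child.text or "").strip() for child in root}


fixed_record = "000042" + "12 Main Street".ljust(20) + "Toronto".ljust(12)
xml_record = (
    "<customer>"
    "<customer_id>000042</customer_id>"
    "<street>12 Main Street</street>"
    "<city>Toronto</city>"
    "</customer>"
)

from_fixed = parse_fixed(fixed_record)
from_xml = parse_xml(xml_record)
```

Both parsers yield the same dictionary, but only the fixed-length version breaks when a field is widened or inserted, because every later offset shifts. The XML reader looks fields up by name, so an extra element does not disturb the ones it already understands.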
Sample Application
About a year ago, one of our cars was stolen from the airport. The settlement for my car showed up in the requisite 60 days, but a check for the car’s contents never arrived. After six months or so, we gave the insurance company a call. After all, the Smiths CD that was in the car had to be replaced. The agent said that the second check was indeed sent over five months ago. It turns out that while the auto policy had the new address, the home policy (which insured the contents of the car) still had the old address. The same insurance company held both policies, so the cause of this mishap was surely the lack of integration across their line-of-business systems. Let’s examine an Internet-based address change facility that solves the problem we just outlined. The insurance company has many internal processes, but we’ll focus on these three parts of the call center:
Typical Approach
The typical approach toward building an Internet-based front end over a legacy system would be to build an HTML form that posts to an ASP page. The ASP code would connect to each of the data sources and execute some SQL to make the address changes. Finally, you’d write some HTML back to the client, indicating the success or failure of the operation. The HTML form would look something like Figure 2.
Figure 2: A Traditional HTML Form
This solution does not address the availability, performance, scalability, or exception handling challenges. First, availability is not optimal because if any of the data sources are unavailable at submit time, the user will receive an error message. Furthermore, if the underlying data sources run on a mainframe (as most legacy systems do), there may also be transient problems with connectivity through the SNA gateway. And, of course, there are the usual network problems, which may be intensified if the database is located across the WAN. There is a big problem with performance: the user is waiting for all database transactions to complete. This can be disastrous, especially at busy periods (end-of-month, holidays, and so on) when legacy systems typically run at or near full capacity. Having this many database connections on a single page also presents scalability problems, especially if connections are limited (which they frequently are in legacy systems) and if there are contention issues. There may even be a show stopper if the transactions are lengthy and are locking resources. This system does not exploit existing systems, and there is no exception handling mechanism. There are also less obvious problems with this solution. What if complex business rules need to be enforced? Even worse, what if they had traditionally been enforced by a manual process? What if connectivity to the back-end database is simply not possible (existing instead in a flat file on the mainframe) or a subsystem is down for regular maintenance? What if you want to share this information with an affiliate company? This may sound like worst-case planning, but these are common concerns at large corporations.
A Better Solution
Now, let’s take a look at how this system could be built to address the shortcomings of the traditional approach. As you read, keep three "VIA" (Version, Integrate, Asynchronous) rules in mind:
Figure 7: Customer XML Document
At this point, you’ve defined your XML schema, created an XML document on the client, and validated it against the schema. Now it’s time to submit it to the server. To post the XML to the page, use the XMLHTTPRequest object included in the msxml.dll that ships with Internet Explorer 5.0. Using this object offers the best performance (by providing the leanest possible HTTP POST forms without compression) and the most straightforward code. The client-side source code to post the XML is shown in the SendXML method in Figure 6. Prior to Internet Explorer 5.0, the preferred method for HTTP posting would have been through the WinInet API. We’ve seen the code that does this, and it’s not pretty.
The code to receive the XML on the server is shown in the ProcessRequest method in Figure 8. Note the call to set the async property of the XMLDOM object to False. If you omit this line, the resolution of the schema and the eventual parsing of the document will be performed asynchronously, which is not what you want in this case. The SendMessage function sends an MSMQ message to the Distributor here as well.
The address change message is sent from the ASP page to the source queue shown in Figure 9. When it arrives, MSMQ notifies a Listener that calls the Distributor, a Microsoft Transaction Server (MTS) component. The Distributor pulls the message off the queue and sends the XML body to the destination queues as specified in the registry. Should something go wrong while sending the messages, the transaction rolls back, leaving the message on the source queue and nothing in the destination queues. More detailed information about sending and receiving MSMQ messages is available in Ted Pattison’s article, "Using Visual Basic to Integrate MSMQ into Your Distributed Applications" (Microsoft Systems Journal, May 1999).
Figure 9: Distributor Architecture
To take a message off a queue in an MTS transaction, the application removing the message must be running locally on the same machine as the queue and it must be running in MTS.
The Listener
MSMQ notifies the Listener application (see Figure 10) when a message arrives. To receive that notification, declare an MSMQEvent object using the WithEvents keyword:
' Event sink for MSMQ arrival notifications
Dim WithEvents msmqMsgEvent As MSMQEvent
Dim msmqQue As MSMQQueue
To set up notification in the Visual Basic-based listener, you’d then use the following code:
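Dim msmqInfo As MSMQQueueInfo
' Locate the source queue by format name (taken from the txtSourceQueue text box)
Set msmqInfo = CreateObject("MSMQ.MSMQQueueInfo")
Set msmqMsgEvent = CreateObject("MSMQ.MSMQEvent")
msmqInfo.FormatName = txtSourceQueue
' Open the queue for receiving and ask MSMQ to raise the Arrived event
Set msmqQue = msmqInfo.Open(MQ_RECEIVE_ACCESS, MQ_DENY_NONE)
msmqQue.EnableNotification Event:=msmqMsgEvent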
Instead of using a path name here, you should use a format name to increase performance. Using a path name requires a query to the Message Queue Information Store (MQIS). That RPC call adds significant overhead to the open queue request. Using a format name, on the other hand, requires only a single RPC call to the MQIS (if it is not a direct format name). After the first call, MSMQ caches the connection information, which removes the site controller from the picture and increases performance. This will be particularly relevant in the MTS Distributor component.
Figure 10: The Listener Application
When an MSMQEvent_Arrived event is fired, the Visual Basic-based component calls the Distributor in MTS with the format name of the source queue. It does not pass a reference to the source queue directly; calling ReceiveCurrent on the reference would not include the source message in the transaction because the queue was not opened in the context of the transaction. If the transaction aborted, the message would neither appear in the destination queue nor remain in the source queue.
The Distributor
Based on configuration data stored locally in the registry (see Figure 11), the Distributor sends one or more additional messages. The source code for the Distributor is shown in Figure 12. The messages sent by the Distributor will in turn be received by Handlers, which will initiate and execute an address change for a particular system. The architecture of a Handler is very similar to the Distributor, so we’re leaving out the details.
Figure 11: Registry Values
In a more robust system, the Distributor might determine the destination based on a database lookup and the type of XML schema used in the body of the message. This more flexible architecture could be used to process other message types as well. Figure 9 gives the impression that the Distributor and destination queues are all running side-by-side on the same machine. While this is possible, it is equally likely that the Handlers would be running in separate offices, maybe even in different countries. This transparency gives your application the ability to communicate over slow links or WAN connections, knowing that your message will be processed when network conditions permit.
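The Distributor's all-or-nothing behaviour, and the reason the source message has to be received inside the transaction, can be illustrated with a toy in-memory model. This Python sketch is not the Figure 12 code and uses neither MSMQ nor MTS; the `Transaction` class, the queue names, and the `check` hook are hypothetical, and the list of destination queues stands in for the registry values.

```python
from collections import deque


class Transaction:
    """Toy transaction: buffers sends and remembers receives until commit."""

    def __init__(self):
        self._received = []  # (queue, message) pairs to restore on abort
        self._pending = []   # (queue, message) pairs to deliver on commit

    def receive(self, source):
        message = source.popleft()
        self._received.append((source, message))
        return message

    def send(self, destination, message):
        self._pending.append((destination, message))

    def commit(self):
        for destination, message in self._pending:
            destination.append(message)
        self._received.clear()
        self._pending.clear()

    def abort(self):
        # Put received messages back at the head of their source queue.
        for source, message in reversed(self._received):
            source.appendleft(message)
        self._received.clear()
        self._pending.clear()


def distribute(source, destinations, check=lambda body: None):
    """Move one message from source to every destination, or to none."""
    tx = Transaction()
    try:
        body = tx.receive(source)
        for destination in destinations:
            check(body)  # hook where a real send could fail
            tx.send(destination, body)
        tx.commit()
        return True
    except Exception:
        tx.abort()
        return False


source = deque(["<customer><city>Toronto</city></customer>"])
auto_policy, home_policy = deque(), deque()
ok = distribute(source, [auto_policy, home_policy])

# A second message whose delivery fails at the second destination.
source.append("<customer><city>Ottawa</city></customer>")
calls = {"n": 0}


def fail_on_second(body):
    calls["n"] += 1
    if calls["n"] == 2:
        raise RuntimeError("second destination unreachable")


failed = distribute(source, [auto_policy, home_policy], check=fail_on_second)
```

After the failed attempt the Ottawa message is back on the source queue and neither destination has seen it. If the receive had happened outside the transaction, the abort would have had nothing to restore and the message would simply be gone.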
Benefits of this Solution
Before going into the benefits of this architecture, take a look at the amount of code written for this sample. Granted, the sample app is straightforward, but the table in Figure 13 shows just how little source code is required to build the framework around an integrated Internet site. Remember VIA? This solution fulfills all three parts:
Conclusion
To build third-generation Internet applications, you’ll need to build an interoperable system. Look at legacy systems and processes as opportunities to be exploited because the technology exists to address their shortcomings. By making the right technology choices now, communication with the systems of other companies will be a natural extension in future versions. It’s clear that XML is emerging as the standard data format for communication between systems. On the transport side, MSMQ is an easy-to-use, asynchronous mechanism that can increase both performance and scalability. It will also isolate legacy systems and increase the availability of the system. To build this new brand of applications, you’ll need to focus on the VIA design techniques we discussed in this article. Using these technologies and the VIA design concepts, you should be able to find the way to your third-generation Internet application. Good luck!
From the September 1999 issue of Microsoft Internet Developer. Get it at your local newsstand, or better yet, subscribe.
http://web.archive.org/web/20000617215818/http://www.microsoft.com/isapi/gomsdn.asp?TARGET=/xml/articles/hess061499.asp and http://web.archive.org/web/20000617215818/http://www.microsoft.com/isapi/gomsdn.asp?TARGET=/xml//default.asp
[Update (April 8, 2009) – I’ve just concluded that the source of the trouble is the OfficeLiveAddIn. I’m also hearing from colleagues that the OfficeLiveAddIn, which causes changes to the user agent, is wreaking havoc with SharePoint users as well. Apparently the WebDAV implementation for SharePoint has a specific check for “office”, presumably for different handling for non-browser-based clients.]
[Update: On Rez’s computer, both the 32 bit and the 64 bit versions of IE8 work just fine on Vista 64 bit. For me, only the 64 bit version of IE8 works on Vista 64 bit. <= i.e. inconclusive testing!]
I went to IE8 on my work computer today. Went to CRM, and got this nasty dialog:
No problem, I thought. I had read about “compatibility mode” in IE8, where IE8 pretends to be IE7 so that websites don’t break. I checked my compatibility settings (Tools | Compatibility View Settings) and my browser was *already* in compatibility mode. To be certain, I used Fiddler to check the user agent:
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 1.1.4322; MS-RTC LM 8; WWTClient2; .NET CLR 3.5.21022; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
Sure enough, IE8 was pretending to be IE7.
I switched compatibility mode off, just for kicks. The user-agent did change but CRM still didn’t work.
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 1.1.4322; MS-RTC LM 8; WWTClient2; .NET CLR 3.5.21022; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
Next, I issued the dire warning to my team – if you upgrade to IE8, CRM 3.0 won’t work. But then, strangely, my colleague Rez said that CRM 3.0 (the same installation) worked for him under IE8. Hmmm, I thought. What is going on? Here is the user-agent from Rez who is also running Vista 64 bit:
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)
Funny. The MSIE part was the same, but the o/s looked different. Now, I seemed to remember a few different icons in the start menu – Internet Explorer and Internet Explorer (64 bit). I knew I was running the 32 bit version, because there’s no 64 bit Flash plug-in. You can see that in my user-agent – “Windows NT 6.0; WOW64” (WOW is Windows on Windows) whereas Rez had “Windows NT 6.1”. Sure enough, if I ran the 64-bit IE8, CRM worked with this user-agent:
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Win64; x64; Trident/4.0; .NET CLR 2.0.50727; SLCC1; Tablet PC 2.0; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
So, from what I can tell IE8 32 bit does not work with CRM3.0 and IE8 64 bit does. The remaining mystery – what is Rez’s Windows NT6.1???
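Comparing these strings by eye is error-prone, so here is a small Python sketch that splits a user-agent into its tokens and lists what one browser sends that the other doesn't. The helper names are made up for illustration, and the NT-version table is general Windows knowledge rather than anything established by the testing above.

```python
import re

# General Windows version numbers; not derived from the tests in this post.
NT_NAMES = {
    "6.0": "Windows Vista / Server 2008",
    "6.1": "Windows 7 / Server 2008 R2",
}


def ua_tokens(user_agent):
    """Return the semicolon-separated tokens inside the first parentheses."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    if not match:
        return []
    return [token.strip() for token in match.group(1).split(";")]


def describe(user_agent):
    """Summarise the IE version, Windows version, and 64-bit markers."""
    tokens = ua_tokens(user_agent)
    msie = next((t.split()[1] for t in tokens if t.startswith("MSIE ")), None)
    nt = next((t.split()[-1] for t in tokens if t.startswith("Windows NT ")), None)
    if "WOW64" in tokens:
        bitness = "32-bit browser on 64-bit Windows"
    elif "Win64" in tokens:
        bitness = "64-bit browser"
    else:
        bitness = "no 64-bit marker"
    return {"msie": msie, "windows": NT_NAMES.get(nt, nt), "bitness": bitness}


# The two IE8 user-agents quoted above: mine (32 bit) and Rez's.
mine = (
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; "
    "SLCC1; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 1.1.4322; MS-RTC LM 8; "
    "WWTClient2; .NET CLR 3.5.21022; OfficeLiveConnector.1.3; "
    "OfficeLivePatch.0.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618)"
)
rez = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)"

rez_tokens = ua_tokens(rez)
only_mine = [token for token in ua_tokens(mine) if token not in rez_tokens]
mine_summary = describe(mine)
rez_summary = describe(rez)
```

The `only_mine` list includes the OfficeLiveConnector and OfficeLivePatch tokens, which fits the April 8 update at the top. Going by the version table, NT 6.1 would be a Windows 7 build rather than Vista, though that is an inference from the string alone and not something I checked on Rez's machine.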