Peter Vogel Gold Collection
You can automate your Access application from a .NET program–but you'd better watch out. If you're not careful, Access will unexpectedly become unavailable, for instance. More importantly, you need to be able to get rid of Access when you don't want it anymore. Peter Vogel shows how to handle both of these problems.
You can call your Access application from a .NET program. This lets you, for instance, take advantage of Access' fabulous report writer from a .NET application. To provide backward compatibility with existing Windows applications, the .NET Framework allows COM objects to be called from .NET applications. And, as far as .NET is concerned, Access is just a particularly large COM object. However, because Access is such a large application, in order to prevent memory leaks or have Access suddenly stop working, you must take advantage of some special features in the .NET Framework that are designed to manage COM objects.
The easiest way to work with COM objects from .NET, in general, is to use Interop Assemblies. When a .NET application uses Interop Assemblies, the .NET-based code interacts with a Runtime Callable Wrapper (RCW) that handles all communication with the COM object for the .NET-based application. If you're developing your .NET application in Visual Studio .NET, VS.NET will create an RCW for you automatically (there's also a command-line tool). However, this automatic generation of an RCW fails if you have an object that works with complex data types–in other words, with Access. It's possible to customize an Interop Assembly to handle these more complex data types.
Fortunately, Microsoft provides a customized Interop Assembly for working with Access (and with the rest of the Microsoft Office products). These are known as the Primary Interop Assemblies (PIAs) and are what you should use to talk to Access from .NET. You can download them from the MSDN site–search on "Office 2003 Update: Redistributable Primary Interop Assemblies."
However, the PIAs don't handle all of the potential problems with working with Access. This article will show you how to recognize those problems and how to handle them.
Understanding the problem
Some problems are easily solved. For example, one of the common problems in automating Access from a .NET application is the "dead command button." From your .NET-based code, you create a menu button and put it on an Access menu bar. You then tie a routine in a .NET-based application to the button's Click event. What could go wrong?
Suddenly, things do go wrong: The user clicks on the button but the Click event routine doesn't run. This problem occurs if the variable that refers to the button is declared locally (for example, inside a subroutine or function). When the subroutine or button finishes, the variable is destroyed, the related object (your button) is also destroyed, and the routine tied to the button stops working. You're probably knowledgeable enough about working with objects to recognize this problem and deal with it. In fact, you'd probably spot this problem immediately if the code stopped running as soon as the routine ended and the variable went out of scope.
But .NET doesn't work that way. In the .NET world, an object isn't necessarily destroyed with the variable that refers to it. Don't be misled: Sooner or later, the .NET Framework will destroy the object. However, when the variable is destroyed, the object that it refers to is merely put on the .NET garbage heap to be destroyed at some time in the future. As long as the object hangs around on the garbage heap, your Click event code will continue to run when the button is clicked. But when .NET's garbage collection finally gets around to destroying the object, your button will stop working.
In the world of COM, we're used to what's called a deterministic model of memory management. Like all COM objects, Access has an internal reference counter that counts how many variables are referring to it. When a new variable is aimed at the Access Application object through a Set command, the counter inside Access is incremented; as variables are set to nothing and stop pointing to the Application object, the counter is decremented. If the counter ever gets to 0, Access is destroyed–immediately. Typically, in a Visual Basic application that's using Access, only a single variable will be pointing to the Access Application object. As a result, setting that variable to Nothing causes Access to be removed from memory (see Doug Steele's discussion of automating Excel in this month's "Access Answers" column for information about the mechanics of automating Excel).
The .NET Framework, on the other hand, manages memory non-deterministically. You control when the object goes on the garbage heap; .NET controls when the garbage collection process actually occurs and deletes objects that aren't being used. To make matters more interesting, garbage collection is a lazy process. .NET only does a garbage collection when it has to, usually when .NET is running out of memory (for more on this non-deterministic process and a complete discussion of garbage collection in the .NET Framework, see "Programming for Garbage Collection" at http://msdn.microsoft.com/en-us/library/0xy59wtx.aspx
This lazy approach to freeing up memory is only part of what makes diagnosing the Click event problem so difficult. Remember that, from the .NET side of the process, the variable is pointing not to Access but to the RCW that's managing Access. From the Access side of things, Access is working with the RCW, not with your application. As long as the RCW is working with Access, the Application object's reference counter will be greater than 0, and the object won't be destroyed.
When the .NET routine ends, what's lost is the reference to the RCW, not to Access. The RCW is then placed on the garbage heap, still referencing the Access Application object. As long as the RCW sits on the garbage heap, the Access Application object's reference counter will stay greater than 0. Eventually, the RCW will be destroyed during garbage collection, the Application object's reference counter will drop to 0, and Access will be removed from memory. At this point, your button goes dead.
Preventing memory leaks
You can solve the previous problem by making sure that you declare your variables at the module level (a necessity in Visual Basic if you want to declare a variable that will catch events, anyway). But what if you're loading Access, running some reports, and then want to release Access? Now you have a different problem: Instead of having Access unexpectedly drop from memory, you now want to remove Access at your command. In your .NET application you could set the Access variable to Nothing, but, as you've seen, all you've done is break your connection to the Access Application's RCW. Access won't necessarily be removed from memory just because you've set the reference to RCW to Nothing. If your application repeatedly loads and reloads Access (or, in a Web application, if different pages keep loading and reloading Access), you can end up with multiple copies of Access hanging around in memory. Given the size of Access, this is probably bad news for your application.
Your first thought may be, "This isn't a problem. As long as I have enough memory to support all these copies of Access, I don't need to have any of those copies removed. When I don't have enough memory, garbage collection will be triggered and the copies of Access will be removed from memory."
But the sad news is that garbage collection may not remove those copies of Access from memory even when memory shortages trigger garbage collection. Garbage collection, while it may be lazy, is also intelligent. This means that garbage collection attempts to remove objects in some order: Objects that were used for a short time are assigned a higher priority in garbage collection than objects that were used for a long time, for instance; larger objects are assigned higher priority than smaller objects because larger objects free up more memory. The first criterion means that it's a good idea to release the RCW as soon as you're done with it to make it more likely the related copy of Access is removed from memory.
That was the good news. The bad news is that because large objects are given a higher priority than small objects, it's actually more likely that your Access object will hang around in memory through repeated garbage collections.
That may seem counterintuitive: There aren't many COM objects bigger than Access. However, what .NET monitors is the size of the RCW, not the size of the Access object. So when your .NET code releases the RCW, garbage collection may well ignore the RCW because, typically, RCWs are small. And as long as the RCW stays in memory, the (typically large) Access object will also stay in memory.
It's possible to force garbage collection in your .NET code. But the reason that garbage collection is best handled by .NET is the same reason that figuring out how to access your data is best left to the database engine: If you step in and take control, you'll probably get worse performance overall. In any case, forcing garbage collection doesn't ensure that instances of Access will be destroyed: Even when called explicitly, .NET garbage collection is always non-deterministic.
You have two tools for making sure that Access instances are released from memory:
• The AppDomain object
• The ReleaseComObject method
AppDomain is the simplest solution but has both an impact on performance and a security risk. If you're ready to do some more planning and coding, ReleaseComObject will give you the best performance.
A .NET AppDomain object actually generates a separate environment for Access to run in. If you load Access into an AppDomain, then unload the AppDomain, your Access application (and anything that it was using) will be removed from memory. As that last sentence implies, you can load multiple COM objects into a single AppDomain; unloading the AppDomain will remove all of those objects at one stroke.
The downside is that creating an AppDomain can be expensive. If you're worried about performance, you shouldn't use an AppDomain. While creating an AppDomain locally is completely secure, creating an AppDomain remotely destroys code security.
Given those restrictions, your best choice is to use the ReleaseComObject method. Effectively, you use this method to reduce the Access Application's reference count to 0. Once the reference count gets to 0, COM will take care of removing Access from memory. You can then set the variable pointing to the RCW to Nothing (or just let it go out of scope) so that it can be garbage collected at some later date. While the RCW may hang around in memory for a long time, you can count on .NET to make an intelligent decision about removing it from memory.
The reference counter for an object being managed by .NET is handled slightly differently than when the object is being using by other COM objects. When the .NET code creates the Access Application object, the Application object's reference counter is set to 1. After that, even if many variables in the same process on the .NET side are used to reference the Application object, the reference counter in the Application object stays at 1. This makes sense: The additional variables are pointing to the RCW, not the Access Application object. However, the COM object's reference counter will increment if the variable pointing to the RCW is passed across a process boundary, so you can't be sure that the reference counter will always be set to 1–this will be important later.
Peter writes in 2009: You can avoid the following loop in .Net 2.0 using the FinalReleaseComObject.
By using the ReleaseComObject method of the System.Marshal object, you can force the reference counter down by one. The method also returns the resulting value of the reference counter so you can check to see if it's reached 0. Since it's possible for the reference counter to be more than 1, checking this value is essential.
This example loads Access, then decrements the reference count until it reaches 0, and finally releases the RCW:
Dim acc As _
…working with Access…
intRefCount = _
Loop While intRefCount > 0
My comment about multiple variables pointing to a single RCW skips over a key problem: Because the one RCW that manages Access can be shared among multiple variables, calling the ReleaseComObject method may result in reducing the reference counter of a copy of Access that other routines are using. If the counter goes to 0 and COM removes Access from memory, then those other routines will find that while their RCW is still present, the copy of Access that it was managing is not. This generates a System.Runtime.InteropServices.InvalidComObjectException with the additional information "COM object that has been separated from its underlying RCW cannot be used."
If you're considering moving to .NET, you can still take advantage of legacy Access code. However, accessing COM objects from .NET has its issues, as outlined in this article. As in any application, you want to make sure that .NET variables referring to COM objects don't go out of scope–but in .NET, diagnosing the problem can be difficult because the effects can be delayed as objects wait to be garbage collected. Unique to the .NET environment is making sure that any copy of Access that you're using is removed from memory when you want it to be. The Marshal object's ReleaseComObject method gives you the best performance in ensuring that you remain in control of when Access is removed.
Why Not Consider These Pages