About a month ago, I was asked by a company to fix a bug in their huge C# application. From time to time their application just crashed. No exceptions, no hints in the logfiles, nothing.
Of course the bug was not reproducable (”it tends to occur under heavy load”), which made things even more interesting.
So what you do when you have no idea and should have fixed it yesterday?
Get A General Idea
Of course I have no idea about the whole project. To get an overview it’s generally a good idea to search for subprojects/solutions. They often have meaningful names and are stored in directories which may indicate some kind of hierarchy. You can get an idea about the structuring in a short time.
After doing that, heh, I know there is plenty code around, but at least I somehow have an idea what is going on and more important, what the application actually does.
Debugging
I add a listener for unhandled AppDomain exceptions so their details get flushed out into a file. Then I start the App.
“Tataaa”, after half an hour I have a crash. Great! My exception handler did not produce a file though.
So I can assume the problem is not in the application’s C# code itself. Thus cluttering the code with silly debug statements won’t help here (That’s what the guy before was doing).
What I need is a better debugger! After some research I even found something useful.
I run the App with my new debugger and after a while the program fails again. Catting trough the output files and there was my problem: 0xc0000005 (STATUS_ACCESS_VIOLATION)
Ou ou…
Well, I don’t think that the App runs into a .NET bug. Access violation… Pointers and stuff? C comes into my mind. In C# there are no pointers though. If they have been used anyway, my exception handler would have gotten a proper exception and would have flushed out some details.
Maybe there are some COM calls somewhere which are causing problems?
It is possible to directly call unmanaged code from a managed C# App. The only thing you have to do is declaring the API methods you want to use with the “DllImport” attribute:
[DllImport("wininet.dll")]
static extern IntPtr FindFirstUrlCacheGroup(
int dwFlags,
int dwFilter,
IntPtr lpSearchCondition,
int dwSearchCondition,
ref int lpGroupId,
IntPtr lpReserved);
A simple grep and I located the COM stuff. Some further research and this line turns up:
public int Write( byte[] buffer )
{
//...
if( WriteFile(
this.fileHandle,
buffer,
buffer.Length - 1,
out this.bytesWritten,
ref this.writeNativeOverlapped) )
return bytesWritten;
//...
return 0;
}
Whats wrong here?
Maybe you think it’s okay. Well, this is why this post got written.
Garbage Collection
The debugged program is written for the .NET framework. This means we have garbage collection support. This again makes everybody building and using objects and except the IDisposable guys nobody really cares when and how the objects get deleted because it just works.
Nothing new for you I guess.
The garbage collector does also other cool things.
For example it moves the memory used by the objects in the RAM so the objects stay together and thus prevent memory fragmentation.
We therefore also benefit from the locality effect and our fancy L1, L2 and Ln caches finally do something useful. That means the program gets faster, the user is happier, we have less performance issues and hence more time for doing other things.
Everything fantastic until here.
COM
The debugged program does not live just in that world though. By passing our empty buffer we do enter the dark and evil COM world.
Thats where the reason of my program crash lies.
What happens with the COM invoke?
The .NET framework copies some values (value types) and passes a pointer to the system for the others. It needs pointers so e.g. the operating system can fill the buffer. Of course I can also pass a copy of my buffer and the system can fill also that one, but well, I wont be able to get it back into my application anymore. By leaving a reference I can get the result when the system finished writing data into my buffer. Thats why we have to specify the size of the buffer. On the COM side the system just receives a pointer to a byte array and has no idea how big it is.
Anyway, guess what happens when the garbage collector in its infinite wisdom decides to move our buffer into another memory area?
On the .NET side everything will still be okay. Calls on the buffer will be mapped to the right place. The operating system however, still has its pointer from the last call and that pointer points to a memory location where there is no buffer anymore. The OS writes it’s result to some location in the heap. Maybe another data structure is using it or that location doesn’t even belong to us anymore.
However, it writes to that address and because of memory protection the call gets blocked, and somebody kill us. ha!
Holding memory
So I must prevent that the buffer gets moved from the garbage collector while a reference to it is held by the system. The .NET way preventing a buffer to be moved by the Garbage Collector is the fixed keyword.
byte[] buffer = new byte[512];
long size = buffer.Length;
unsafe
{
long* ptrSize = &size;
fixed( byte* ptrbuffer = buffer )
{
GetComputerName( ptrbuffer, ptrSize );
}
}
That doesn’t work for me though. I have two threads. One for asynchronous writing and one for asynchronous reading. I cant use that fixed stuff because I would need to wait until I can free the buffer, that would make me synchronous again.
Fortunately there is another solution: Object-Pinning. From msdn:
… This prevents the garbage collector from moving the object and hence undermines the efficiency of the garbage collector. …
ye! By pinning an object you tell the garbage collector it should let its dirty fingers from your shiny objects. That fixed my problem:
Classical example of framework magic striking back