Plugs and me

January 3rd, 2008

Earlier today I wanted to charge my laptop’s battery, but…
Outlet

Aw, that somehow doesn’t fit!

Of course it was already late and the shops were closed, so I couldn’t buy one of those converter things.

It was obvious, though, that the two power pins were in just the right position; only the third pin was in the way. And like MacGyver I carry a Swiss Army Knife for such situations, so I wanted to cut away the earth contact so the plug would fit into that Spanish power outlet.

Fortunately, my girlfriend was also around and, as usual, she thinks way more than I do. When I told her about my genius plan, she took a look at the transformer

and asked me why I don’t just use a standard two-pole cable like the one I use for charging the camera.

Of course, it just worked, and here you see what will provide the laptop with power until I return home.

(By the way, thanks to that other guy somewhere at Las Canteras beach near my hotel for providing an unprotected access point.)

Bitshifting

December 21st, 2007

Today I found this wonderful collection of cool bit twiddling hacks.

One bookmark more…

Typing

December 17th, 2007

Those of you who regularly poll tech sites will certainly have heard of Ruby in the last two years. The fanboyism of the average Ruby zealot is hard to ignore and impossible to escape.
Anyway, I won’t write about Ruby and its users today (though maybe they should broaden their horizons and try something real).

However, a year ago, during the big RoR hype (remember 37signals?), plenty of web developers jumped into the Ruby water. I have the impression that lots of them had never used a web framework before and were so amazed by the productivity boost they experienced that they just couldn’t stop writing about their great discovery. This affected the typical discussion and made things even worse.

ruby ownez teh noobs OMG

Fortunately, nowadays that is followed by the classical

ZOMG RUBY IS TEH SLOW

and most of the Ruby developers were exposed anyway by Steve Yegge’s line:

Ruby fans seemed like ferret owners. They could go on and on about how much they adored their beady-eyed albino stretch-limo rats, and how cute they were, but we all knew they were just looking for attention.

Anyway. Where was I going with this? Oh yes: if you are bored and look for entertainment among Ruby flame wars on the internet, you will certainly read about problematic DB design, scalability and performance issues. The performance issues usually get blamed on Ruby’s type system. But what is a type system?

Type system

When a program gets written, the developer normally has to deal with data. This may be anything from a simple integer to more complex data structures. Those values are stored as bytes in memory. As you can imagine, it’s not much fun to deal with a bunch of bytes, so people started to define “types” which represent specific byte arrays. Hence, instead of an array of four or eight bytes, a variable with type “integer” can be used. After that everybody could pass his new and fancy types to functions, but it was still a bit of a hassle.

Later somebody came up with the idea of adding functions to those types and called them “Objects”.

However, when you move around your variables on the stack, a mechanism is required which ensures that the type of your variables matches the type expected (We have a function whose only parameter is an integer. It must be forbidden that a string gets passed. Otherwise strange things would happen when the first few bytes of the string passed get interpreted as an integer). The thing which makes sure that this doesn’t happen is a Type Checking System.
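The “strange things” mentioned above can be made visible with a small sketch. It is only an illustration: the names raw, as_int and add_one are made up for this example, and the explicit isinstance check merely stands in for what a real type checking system does for you.

```python
import struct

raw = b"Hello, world"   # bytes that were meant to be a string

# Reinterpret the first four bytes as a little-endian 32-bit integer,
# which is roughly what an unchecked call would end up doing.
(as_int,) = struct.unpack("<i", raw[:4])
print(as_int)           # a meaningless number built from 'H', 'e', 'l', 'l'


def add_one(n):
    """A function whose only parameter must be an integer."""
    if not isinstance(n, int):
        # The type check rejects the call instead of misreading the bytes.
        raise TypeError(f"expected int, got {type(n).__name__}")
    return n + 1


print(add_one(41))
```

Without the check, the caller would silently get a number that has nothing to do with the text that was passed in.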

There are several ways to implement such a system. The verification may occur when the program gets compiled or on demand when the program is running.

Static typing

The compiler makes sure that the type contracts don’t get violated. This means type errors get detected very early, and the compiler can also perform further optimizations. Which errors are detected depends on how powerful the type system is, though.
In a statically typed language, refactoring tools are easier to implement and more accurate. The contract between caller and callee is clearer, but this normally leads to more verbose expressions. The type system has limitations, which developers need to override explicitly with casts or similar constructs. Static typing is used in Java and Haskell.

Dynamic typing

The type verification is done at run time. This adds some run-time overhead but gives the developer more flexibility.

var i = 3;
var j = "33";
var k = i + j;

Although a string is used for “j”, the statement is accepted, and the runtime decides only when it runs what “+” should mean here (in JavaScript, for example, it concatenates the two values into the string 333). More code can be written in less time like this, but there is no compiler which makes sure that the statement can be executed later. Dynamic typing is used in languages like Python and Lisp.
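How a dynamically typed runtime reacts to a statement like this depends on the language. As a small sketch, here is the same idea in Python, which refuses to guess and reports the mismatch only when the line is actually executed. The helper add and the variable names are made up for this example.

```python
i = 3
j = "33"


def add(a, b):
    # Nothing checks the argument types before this line actually runs.
    return a + b


try:
    k = add(i, j)
except TypeError as err:
    k = None
    print("run-time type error:", err)

# Being explicit about the intent works either way.
as_number = add(i, int(j))   # numeric addition
as_text = add(str(i), j)     # string concatenation
print(as_number, as_text)
```

The function add itself is perfectly fine for both calls at the bottom; whether a call is valid is decided per call, at run time.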

Duck typing

Duck typing is a subset of dynamic typing. Rather than requiring a variable to have a particular type, only the aspects that are actually used are required. The type itself doesn’t matter. When the execution environment discovers at run time that the required method is not implemented by the object passed in, the application stops with a run-time error.
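A minimal sketch of that behaviour, written in Python rather than Ruby. The classes Duck, Robot and Stone and the function make_it_quack are invented for the illustration.

```python
class Duck:
    def quack(self):
        return "Quack!"


class Robot:
    # Unrelated to Duck, but it offers the same method.
    def quack(self):
        return "Beep (quack mode)"


class Stone:
    pass


def make_it_quack(thing):
    # No particular type is required, only that the object has quack().
    return thing.quack()


print(make_it_quack(Duck()))
print(make_it_quack(Robot()))

try:
    make_it_quack(Stone())
except AttributeError as err:
    # The missing method is only discovered when the call is executed.
    print("run-time error:", err)
```

Duck and Robot share no base class or interface; the call works because the one method that is used happens to exist.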

Ruby

Ruby uses duck typing. The problem this language has is that those duck type checks need to be performed on every function call (and it really is a problem: check out the release notes of new versions and you will see they mostly talk about performance. Or watch the community get thrilled when a fast implementation is released).
Ruby needs to inspect the object, find the function which should be called, and then dispatch to it and execute it. In a statically typed language, most of those steps can be precomputed by the compiler. And guess what, Ruby has to redo that work on every function call! Unfortunately it’s pretty common for developers to call functions all the time. Now just imagine how slow recursion is.

Thus the anti-Ruby guys certainly have a point with their performance argument. It *is* slow.

However, it should also be considered that not everybody builds a website which attracts half the population on earth and shorter development cycles are worth more than an upgrade of your server.

Don’t Wait

December 9th, 2007

Let’s assume my car’s light is broken. I have no idea about anything inside my car and can’t fix it myself. So I bring it to a garage and let somebody who knows how fix the light.
I arrive there, but the mechanic tells me he has no time. There are other cars which need to be fixed first. He will text me when it’s done, though.

That’s okay with me. These days you should consider yourself lucky when they even find time to talk to you. However, will I hang around there for I don’t know how many hours? God no, of course not! I can do plenty of other things than sitting in the garage: go shopping, have dinner, work a few hours, meet somebody, etc.
If I told anybody I had waited five hours in the garage for my lights, he would tell me what a crazy guy I am. There are obviously better ways to spend five hours than looking at the garage’s pin-up calendars.

As soon as we change into the software world though, it is considered normal by the majority to wait that long.

Of course there are ways to get a beep there as well when things are ready for you, but many developers I have met prefer to leave their program waiting in the garage instead of letting it do something useful. Furthermore, when they mention how long they waited, they almost never get told what a crazy thing they just did.

An operating system provides exactly the mechanisms required to start an operation and then do more useful things than waiting for its result. How many of us use them consistently?

On Windows, for example, there are the overlapped I/O operations. You pass a special structure which contains an event when calling your long-running operation. If the system decides it should not execute it synchronously, the call returns immediately and you are free to do other things while your request gets executed by the operating system. After running out of stuff you could compute and prepare, you wait for your first call to end and get woken up when the system has completed your request and presents you with the results.

On Linux we have similar possibilities, of course. Asynchronous operations like “aio_read()” or “aio_write()” may be called, and the caller can then do other things like allocating new buffers, computing the result from the last read operation, etc. When the new result is needed, “aio_suspend()” can be called to await the I/O operation’s completion. Note for my PnProgging friends: In case you are wondering, I/O multiplexing (select/poll) is blocking and thus not the same as asynchronous operations.
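The calling pattern is the same on both systems: start the operation, do something useful, and only wait once you really need the result. Here is a rough sketch of that pattern in Python. Note the assumption: it hands the read to a worker thread via concurrent.futures instead of using overlapped I/O or aio_read(), so it only mirrors the structure, not the kernel mechanism. The names slow_read, checksum and read_and_work are made up.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def slow_read(path):
    """Stands in for a long-running I/O operation."""
    with open(path, "rb") as f:
        return f.read()


def checksum(data):
    """Some useful work we can do while the read is in flight."""
    return sum(data) % 256


def read_and_work(path, previous_block):
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_read, path)      # leave the car at the garage
        done_meanwhile = checksum(previous_block)   # go shopping in the meantime
        new_block = pending.result()                # wait only when the result is needed
    return done_meanwhile, new_block


# Small demonstration with a temporary file.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"fresh data")

result = read_and_work(demo_path, b"\x01\x02\x03")
os.remove(demo_path)
print(result)
```

The only extra thinking required is deciding what to do between submit() and result(), which is exactly the “what do I do while the car is in the garage” question.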

Not really hard, is it? I just almost never see it in other people’s code. They prefer letting their software wait instead of doing something more useful. Nobody would ever wait for hours in a garage because it’s obvious to anybody how inefficient that would be. In our software world, though, it is not considered that silly anymore. Most of the I/O operations I have seen in projects were done synchronously. As if our software were fast and we had enough time anyway.

When I see such code and investigate why it has been solved that way, the developers responsible start to rant about how complicated asynchronous operations are. That’s like saying you cannot go shopping after leaving your car in the garage because it requires you to think about what you will do when you have finished shopping.

The sad thing is, this is not just the case with low level I/O operations. Asynchronous behavior is another great thing which doesn’t get adopted because there are too many guys out there who don’t understand it and prefer their old, simple and inefficient solutions.

You can find this kind of thinking almost everywhere. I have experienced it many times. For example, when I provided asynchronous execution for long-running operations in my classes, people started to claim that it’s too complex and were happy that they could go on abusing their UI threads. Of course they have a point here: asynchronous things require further synchronization and thus make the code longer and harder to read. But to me it sounds like somebody telling me we shouldn’t use C++ templates because they are too hard. Or we shouldn’t target parallelization because of synchronization issues. Or recursion should be avoided because it’s hard to understand.

But guess what? Software is hard!

Haskell GUI

December 7th, 2007

In the last few days I’ve been checking out some Haskell GUI frameworks. This is what I have found so far.

Gtk2Hs

As the name says, Gtk2Hs is a library which is based on the Gtk+ project. It provides a native look and feel and is supported on Linux, MacOS and Windows. You can even use the Glade interface builder on Linux for creating your Haskell GUIs.

The user interface code is implemented in one big Monad (how else?) and a hello world application looks like

import Graphics.UI.Gtk
 
main :: IO ()
main = do
  initGUI
  window <- windowNew
  button <- buttonNew
  set window [ containerChild := button ]
  set button [ buttonLabel := "Hello World" ]
 
  onClicked button (putStrLn "Hello World")
  onDestroy window mainQuit
  widgetShowAll window
  mainGUI

wxHaskell

wxHaskell is a Haskell-style wrapper built on top of the C++ wxWidgets framework. It runs on all major platforms as well and supports a native look and feel. The MacOS port is available as a darcs repository.

The example:

module Main where
import Graphics.UI.WX
 
main :: IO ()
main = start hello   -- start initialises wxWidgets and runs the event loop

hello :: IO ()
hello = do
  f <- frame    [text := "Hello World"]
  quit <- button f [text := "Quit", on command := close f]
  set f [layout := widget quit]

HOC

On http://hoc.sourceforge.net/ there are Objective-C bindings so it is possible to access MacOS’ Cocoa library from Haskell and build Cocoa objects from Haskell.

[Update]

Eric pointed out some mistakes.

Centralization

November 28th, 2007

Last weekend I traveled to a place where it looks better than sunny Switzerland normally does. That’s not very hard, though, as you can see

Swiss weather

Wonderful things, those trips to Las Palmas, by the way. I strongly recommend going to anybody who has the time and lives reasonably close.

Anyway, I’m not talking about the sunny days I spent there. The really great thing was that I met an old friend I thought I had lost years ago. He turned up out of nowhere, so suddenly that it took me a few moments until I realized that it was really him. But there he was again, origin of pleasure and amusement, fountain of long and intense nightly discussions. Let me introduce him here!

TV

He even made it to TV. Boah!

Mixing Data

The problem we see here is not just the result of a crazy guy who put his long running service on an OS which is notorious for getting slow and unstable after using it for a long time.

I think it is a good example of how putting everything into a central place may mess up everything. If you save all you have into those database-oriented file/configuration pools, you give away all the control you have over your data and thus over your application as well.
Of course, while saving your application’s data to your own file you may also hit an error and end up with an invalid file the next time, but at least you can somehow handle that (by keeping tmp files or whatever). If a shared data storage gets corrupted, you may not even be able to access the incomplete data at all.
Moreover, you are not the only one who is accessing that storage. When the power cord gets pulled while hundreds of other clients are saving their stuff to the same place, it’s way more likely that the internal structure of that storage gets corrupted (more likely; it can be avoided, of course).
Then you face the problem that maybe it is your data which is not accessible anymore after the next reboot, because some other guy was saving his stuff and accidentally spoiled the data of all other clients, and you can do nothing about it.
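The “tmp files” trick for your own files can be sketched in a few lines of Python. This is one common way to do it and not the only one; the function name save_atomically and the file name in the usage example are made up.

```python
import os
import tempfile


def save_atomically(path, data):
    """Write data to a temporary file first, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the bytes reached the disk
        os.replace(tmp_path, path)   # the old file is replaced in one step
    except BaseException:
        # Something failed: the old file is untouched, drop the leftover.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage: the target either holds the old or the new content, never half of each.
demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, "settings.dat")
save_atomically(target, b"version 1")
save_atomically(target, b"version 2")
```

If the write fails halfway, the previous file is still complete, which is exactly the kind of control you give up when the data lives in somebody else’s central storage.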

In the Windows world there are a few failed DB-design based features:

Vista Filesystem (WinFS)

The guys at Redmond wanted to store all files in a huge database-like filesystem. You have transactions, can scale over several computers, everything! The real Wow effect!
However, it did not ship with Vista and I don’t know what happened to it. I guess it imploded under its own weight. Maybe it was too slow, too buggy or whatever.
You can ship any application with some bugs, but a filesystem has to be perfect. One small bug and data starts to disappear in random places, which will make your users run away in droves.

Anyway, it seems it’s not trivial to implement a storage with a DB design correctly so that it still achieves good performance and provides the essential features of a filesystem; otherwise it would have shipped.

Registry

Another nice example of what happens when data from different clients gets mixed together.
It is the last resting place of uninstalled applications and an everlasting producer of problems. One wrong edit and your system doesn’t boot anymore. Moreover, it is the unofficial waste bin of unsorted configuration data for all kinds of applications. No comparison with the relatively clean structuring of a multi-file /etc directory.

Outlook/Exchange

Normal users tend to use their mail account as a searchable file storage. Nothing wrong with that. They just noticed that the normal Windows search is crap and that in their mail program they can tag their files with emails and separate them into categories. The only guy who is not so happy about that and responds with mailbox size limits is the one who has to maintain all those things. He has to do that because all mails get stored in one gigantic file, and when that file exceeds the 16GB storage limit he has a problem. (If he wants more he needs an expensive update, has to hack the registry etc.) Why not provide a Unix-like mail spool with a single file for every user? Another example of how you create problems by storing everything in the same place.

Everything Is A File

The Unix philosophy, where everything from sockets to handles to drivers is a file, is a much simpler and thus more powerful and flexible way than inventing a new architectural paradigm for every emerging problem you face. Furthermore, coming up with new approaches for such basic things instead of using a proven abstraction model just introduces unnecessary complexity, which will sooner or later (maintenance) cause tremendous headaches.
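To show what that buys you, here is a small sketch that pushes the same bytes through a pipe, a socket pair and a regular file using nothing but plain file-descriptor calls. It assumes a Unix-like system (reading a socket through os.read does not work on Windows), and the helper send_and_receive is made up for the example.

```python
import os
import socket
import tempfile


def send_and_receive(write_fd, read_fd, message):
    """Uses only file-descriptor calls; it has no idea what is behind them."""
    os.write(write_fd, message)
    return os.read(read_fd, len(message))


# 1. A pipe
pipe_r, pipe_w = os.pipe()
via_pipe = send_and_receive(pipe_w, pipe_r, b"hello")
os.close(pipe_r)
os.close(pipe_w)

# 2. A pair of connected sockets
left, right = socket.socketpair()
via_socket = send_and_receive(left.fileno(), right.fileno(), b"hello")
left.close()
right.close()

# 3. A regular file, written through one descriptor and read through another
file_w, path = tempfile.mkstemp()
file_r = os.open(path, os.O_RDONLY)
via_file = send_and_receive(file_w, file_r, b"hello")
os.close(file_w)
os.close(file_r)
os.remove(path)

print(via_pipe, via_socket, via_file)
```

One function, three very different kinds of “file”, and no special API for any of them.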

Trouble

By making your code depend on others you increase the probability that it fails. This is especially true if your dependency is also accessed by other clients, as is the case with those file/configuration databases. Every external component you use may fail, and typically you have no way to fix it. There are already enough problems an application has to face. By requiring yet another fancy module you just add another hurdle which could break your program.

It’s that simple.

Flying objects

November 21st, 2007

About a month ago, I was asked by a company to fix a bug in their huge C# application. From time to time their application just crashed. No exceptions, no hints in the logfiles, nothing.
Of course the bug was not reproducible (“it tends to occur under heavy load”), which made things even more interesting.

So what do you do when you have no idea and should have fixed it yesterday?

Get A General Idea

Of course I have no idea about the whole project. To get an overview it’s generally a good idea to search for subprojects/solutions. They often have meaningful names and are stored in directories which may indicate some kind of hierarchy. You can get an idea about the structuring in a short time.

After doing that, heh, I know there is plenty of code around, but at least I somehow have an idea of what is going on and, more importantly, what the application actually does.

Debugging

I add a listener for unhandled AppDomain exceptions so their details get flushed out into a file. Then I start the App.

“Tataaa”, after half an hour I have a crash. Great! My exception handler did not produce a file, though.
So I can assume the problem is not in the application’s C# code itself. Thus cluttering the code with silly debug statements won’t help here (that’s what the guy before me was doing).
What I need is a better debugger! After some research I even found something useful.

I run the App with my new debugger and after a while the program fails again. Catting through the output files, there was my problem: 0xc0000005 (STATUS_ACCESS_VIOLATION)

Ou ou…

Well, I don’t think that the App runs into a .NET bug. Access violation… Pointers and stuff? C comes to mind. Normal C# code has no pointers, though. And if something had gone wrong in managed code anyway, my exception handler would have gotten a proper exception and would have written out some details.

Maybe there are some COM calls somewhere which are causing problems?

It is possible to call unmanaged code directly from a managed C# App (strictly speaking this is platform invoke rather than COM, but it is the same unmanaged world). The only thing you have to do is declare the API methods you want to use with the “DllImport” attribute:

[DllImport("wininet.dll")]
 static extern IntPtr FindFirstUrlCacheGroup(
 	int dwFlags,
 	int dwFilter,
 	IntPtr lpSearchCondition,
 	int dwSearchCondition,
 	ref int lpGroupId,
 	IntPtr lpReserved);

A simple grep and I had located the COM stuff. Some further research and this method turns up:

public int Write( byte[] buffer )
{
 
//...
 
if( WriteFile(
	this.fileHandle,
	buffer,
	buffer.Length - 1,
	out this.bytesWritten,
	ref this.writeNativeOverlapped) )
		return bytesWritten;
//...
 
return 0;
}

What’s wrong here?

Maybe you think it’s okay. Well, this is why this post got written.

Garbage Collection

The debugged program is written for the .NET framework. This means we have garbage collection support. This in turn means everybody builds and uses objects and, except for the IDisposable guys, nobody really cares when and how the objects get deleted, because it just works.

Nothing new for you I guess.

The garbage collector also does other cool things.
For example, it moves the objects around in memory so they stay together, which prevents memory fragmentation.
We therefore also benefit from locality, and our fancy L1, L2 and Ln caches finally do something useful. That means the program gets faster, the user is happier, we have fewer performance issues and hence more time for doing other things.

Everything is fantastic up to here.

COM

The debugged program does not live just in that world, though. By passing our empty buffer we enter the dark and evil COM world.
That’s where the reason for my program crash lies.

What happens with the COM invoke?
The .NET framework copies some values (value types) and passes a pointer to the system for the others. It needs pointers so that e.g. the operating system can fill the buffer. Of course I could also pass a copy of my buffer and the system could fill that one, but then I wouldn’t be able to get it back into my application anymore. By handing over a reference I can get the result once the system has finished writing data into my buffer. That’s also why we have to specify the size of the buffer: on the COM side the system just receives a pointer to a byte array and has no idea how big it is.

Anyway, guess what happens when the garbage collector in its infinite wisdom decides to move our buffer into another memory area?
On the .NET side everything will still be okay. Accesses to the buffer will be mapped to the right place. The operating system, however, still has its pointer from the call, and that pointer points to a memory location where there is no buffer anymore. The OS writes its result to some location in the heap. Maybe another data structure is using it, or that location doesn’t even belong to us anymore.
Either way it writes to that address; if the location is no longer ours, memory protection blocks the access and somebody kills us. Ha!

Holding memory

So I must prevent the buffer from being moved by the garbage collector while a reference to it is held by the system. The C# way of preventing a buffer from being moved by the garbage collector is the fixed keyword.

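byte[] buffer = new byte[512];
uint size = (uint)buffer.Length;	// GetComputerName expects a 32-bit size (DWORD)
unsafe
{
	uint* ptrSize = &size;
	// fixed pins the array only for the duration of this block
	fixed( byte* ptrbuffer = buffer )
	{
		GetComputerName( ptrbuffer, ptrSize );
	}
}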

That doesn’t work for me, though. I have two threads: one for asynchronous writing and one for asynchronous reading. I can’t use that fixed stuff because I would need to wait inside the block until I can release the buffer, and that would make me synchronous again.

Fortunately there is another solution: Object-Pinning. From msdn:

… This prevents the garbage collector from moving the object and hence undermines the efficiency of the garbage collector. …

Yeah! By pinning an object you tell the garbage collector to keep its dirty fingers off your shiny objects. That fixed my problem:

  • Create a wrapper class for the buffer which holds the size and the buffer array itself.
  • Pin those members in the constructor
     bufferHandle = GCHandle.Alloc(
    	localBuffer,
    	GCHandleType.Pinned);

    and unpin them when the wrapper gets disposed.

     bufferHandle.Free();
  • Keep a reference to the wrapper before writing and reading so it doesn’t get collected by the GC while an I/O action is pending.

A classic example of framework magic striking back.

Using your keys

November 15th, 2007

When I ask you (assuming you are an IT guy; I don’t think other people will find their way here) when you last used caps lock, you will probably answer never. Anyway, I’m sure you did, though without noticing it aND IT WAS JUST A BIT ANNOying @*#!^.

Because I use the ctrl key in my favorite editor all the time (not to copy/paste code ;) ) and it is in such an unnatural position, there are days when my hand starts to hurt or cramp because of the crippled position I force it into for hours. Then I usually have to stop typing (“Oh no!”) and wait a few hours until I can continue.

However, it’s possible to map the caps lock key to the ctrl key. Here is how:

Linux

Load

remove Lock = Caps_Lock
keysym Caps_Lock = Control_L
add Control = Control_L

with xmodmap

Mac

Tiger
In Keyboard & Mouse settings, Keyboard, Modifier Keys… choose ^ ctrl for caps lock

Panther
I don’t have Panther installed but maybe fKeys works for you.

Windows

Create a file called mapper.reg and paste

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Keyboard Layout]
"Scancode Map"=hex:00,00,00,00,00,00,00,00,02,00,00,00,1d,00,3a,00,00,00,00,00

in. Then double-click it to import it into your registry and reboot. (You may want to back up the existing value first.)
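In case you wonder what those bytes mean: the value is a small binary table. The following sketch builds it in Python, assuming the usual layout of the “Scancode Map” value (two zero header fields, an entry count that includes the terminator, then one four-byte entry per mapping). The function names are made up; 0x3a is the caps lock scancode and 0x1d is left ctrl (0x01 would be Escape).

```python
import struct


def scancode_map(mappings):
    """mappings: list of (new_scancode, original_scancode) pairs."""
    header = struct.pack("<II", 0, 0)              # version and flags, both zero
    count = struct.pack("<I", len(mappings) + 1)   # entries plus the terminator
    body = b"".join(struct.pack("<HH", new, old) for new, old in mappings)
    terminator = struct.pack("<I", 0)
    return header + count + body + terminator


def as_reg_hex(data):
    """Format the bytes the way a .reg file expects them."""
    return "hex:" + ",".join(f"{byte:02x}" for byte in data)


CAPS_LOCK = 0x3A
LEFT_CTRL = 0x1D

value = as_reg_hex(scancode_map([(LEFT_CTRL, CAPS_LOCK)]))
print(value)
```

Adding a second pair to the list is enough to remap another key; the count field grows accordingly.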

The clock change bug

November 14th, 2007

If you commute in those new trains, you may have noticed their new and shiny system which shows you which village you are in and when you will arrive at the next station.

However, since the clock change 18 days ago it just doesn’t work anymore: sometimes it displays three different times on the same screen, the ‘next stop’ field is wrong, or the driver has simply turned it off and announces the stations himself.

We all create software and know how hard it is to keep those little things in mind which could make our software fail. I was not very surprised when I saw it the first time, but that was almost one and a half years ago!

I wonder what those guys at SBB are doing (maybe they outsourced it to Swisscom IT Services (they grow by buying other companies and fail when they have to deliver on time because it’s all mixed up)). The bug showed up a long time ago, it’s easily reproducible, and it’s still around today.

Still, it’s a wonderful example of the type of dev who just presses ‘run’ in his editor to see if his program is okay and then hopes he did enough testing.

Your editor and you

November 7th, 2007

Yesterday I saw somebody implementing a piece of software without knowing what the actual problem was. There weren’t just a few details missing, almost everything was unclear at that point. He just started hacking like that: classes there, interfaces there, unit tests etc, etc. It seemed so ridiculous to me that I had to ask him. “Ey, we can refactor later” was his response and he continued. Well, of course everybody has his own strategies; I just couldn’t imagine that in the end he would achieve a clean design without throwing everything away. How can you be that crazy and just spill out code without having any concrete details about what you should actually do?

His next reply was “Ey, look mate: You can press here and here, and then magically everything gets reordered. It’s so fancy!”.
Soon he ran into a problem. Ha! You have a race condition somewhere, now what are you gonna do with all your refactoring tools? He calmly started his debugger and tried / failed / built / continued until he got it fixed. He wasn’t coding in vi, nano, EditPlus or TextMate. Of course not, it was one of those Java editors with trillions of features.

And he felt safe using it. Oh yes. He loved that warm and fuzzy feeling of live API browsing when pressing a single dot. Those underlined statements which warned him about possible pitfalls like Microsoft Word does with grammatical errors. The relaxed feeling he had because he knew he would be able to refactor all his code later into a nice class hierarchy anyway.

We all like it, don’t we? Those bloated and fat enterprise programming environments with the sweet features they provide. You don’t have to think about algorithms anymore because “I can debug them later anyway” and you don’t think about performance issues because “My performance tool will highlight the bad statements”. Soon you don’t even think about design anymore because “I can update my software architecture in that other UML tool”.

But do you remember the first time you opened that editor? Oh yes, it took ages to load (you realized why it shipped on a DVD). It was fat, cluttered with options, menus and wizards. Everything was so slow because of all the bloat it had. You missed the tactile feedback from your TextMate times.

Anyway, it was worth giving up that old stupid editor! Look at all the new and shiny features you have now: PerformanceAnalyzer which puts a red bullet in front of computation intensive statements. The UnitTestGenerator. The DocGenerator. The DataStructureWizard. The AutoPatternizer.

Soon you started to code before thinking. You ignored your carefully crafted code conventions: “Look it up in DocGenerator”, you stopped thinking about architecture: “UML Refactorer will do it later for me”. Anyway, you could do it: Your imaginary friends were waiting to help you with all your coding issues, weren’t they?
Hold on a second, did you forget? Where were they before that last release, when your boss came in and yelled about the disastrous performance of the module you wrote? PerformanceAnalyzer didn’t help you, eh? What about that other guy who had to write an extension for your application, but the “UML Refactoring Tool” failed because of an “internal error”? You gave up your lovely editor for a few fancy features which barely work.

However, time passed and the more code you were spilling out, the more you forgot about the old days. You unconsciously accepted everything which surrounded you. Slowly you became like the editor you use and feared the first time you opened: Fat, bloated and slow.

Close your eyes, take a deep breath and go back a few years in time. Now, enjoy the old but refreshing feelings you had with your simple editor.