Archive for the ‘philosophy’ Category

Ping

Monday, May 12th, 2008

Yes I’m still here.

As I wrote last time, my new laptop arrived. I still think it’s a great laptop, but not so great that it would explain more than a month of blogging inactivity.

Many things happened in the meantime: Firefox is becoming Emacs, so Emacs had better become the next Firefox first; jeff the obvious has something new going on; Subversion is considered a dead patient (at least where I hang around when I’m online).
Of course not everybody shares my observations. I heard others are more worried about missing potatoes for the next month than about the JVM’s balls of steel. Ah yeah, there was also a guy who asked me why I use Emacs, as it’s not installed anywhere. He got used to writing in vi, it’s on all boxes, and wherever he sshs into he finds a familiar environment. Heh, I demonstrated TRAMP and laughed in his face.

However, that doesn’t explain what I’ve been doing.

Well… I’ve been busy. Not just busy as usual, really busy. I guess the underlying problem is my new strategy for handling tasks. You know, I adopted maybe the greatest philosophy ever and converted myself into a lazy evaluation person.
That means I do nothing for a long time until something actually has to be delivered, and then, and only then, I start focusing on the tasks.
The huge advantage is that you only spend time on what is really necessary and have more time for other things.
The downside, however, is the work explosion, which you cannot predict in advance no matter how hard you try. Nobody can. So once you have finally started, you just pray you don’t run out of resources.

One thing I was dealing with was the .NET XmlSerializer.
And I tell you, that’s a piece of crap. Normal people like you and me tend to introduce interfaces, abstract base classes and many other things to keep things somehow modular, so they still fit into the context in our heads.
But exactly those abstractions cannot be serialized. As soon as the serializer discovers that the type of your property is an interface, it throws an exception back at you and tells you to change your design because it’s too complicated to serialize.
One way to solve the serialization problems is to duplicate all problem properties and map them all to object properties, but that’s exactly what you wanted to avoid in the first place.
The second approach is to tag the base class (if there is any) with XmlInclude attributes for all existing derived classes. But as you normally don’t have control over your base class, or just can’t update it whenever you add a new derived implementation, you cannot make use of it anyway.
The third approach is to use the IXmlSerializable interface. By implementing it, you can override the way your class is converted into an XML document and deal with interfaces and base classes on your own. Wohooo!
However, after five classes I noticed that it is a pretty tedious job to implement the same logic all over the place. So I implemented a somewhat generic solution which scans the properties of a type with reflection, and if it finds something with an interface or derived class, it stores additional type hints as attributes so it can convert it back later.
Of course I chose the last option, and in my base classes I can now write

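using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Car : IXmlSerializable
{
 public IEngine Engine { get; set; }
 public IList<IFoo> Foo { get; }
 public BaseDescriptor Desc { get { return new ExtendedDescriptor(); } }

 // Required by IXmlSerializable; returning null is the usual convention.
 public XmlSchema GetSchema() { return null; }

 // Serializer is my own reflection-based helper, not part of .NET.
 public void WriteXml( XmlWriter writer )
 {
  Serializer.WriteXml( this, writer );
 }

 public void ReadXml( XmlReader reader )
 {
  Serializer.ReadXml( this, reader );
 }
}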

And all kinds of interfaces, generics, lists and dictionaries can be serialized with the implementation in Serializer. Mishto!
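The type-hint idea itself is not tied to .NET. Here is a minimal sketch of it in Java, which is not the Serializer class from above: every name in it (Engine, DieselEngine, Vehicle, TypeHintSketch, typeHints, restore) is made up for illustration, and it assumes that each concrete class has a no-argument constructor.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical types, only here to have something to scan.
interface Engine {}

final class DieselEngine implements Engine {}

final class Vehicle {
    // Declared as an interface, but holds a concrete class at runtime.
    Engine engine = new DieselEngine();
}

final class TypeHintSketch {
    // Scan the fields via reflection and remember the concrete class name
    // wherever the runtime type differs from the declared type.
    static Map<String, String> typeHints(Object target) {
        Map<String, String> hints = new LinkedHashMap<>();
        try {
            for (Field field : target.getClass().getDeclaredFields()) {
                field.setAccessible(true);
                Object value = field.get(target);
                if (value != null && value.getClass() != field.getType()) {
                    hints.put(field.getName(), value.getClass().getName());
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return hints;
    }

    // Use the stored hints to create an instance of the right concrete class again.
    static void restore(Object target, Map<String, String> hints) {
        try {
            for (Map.Entry<String, String> hint : hints.entrySet()) {
                Field field = target.getClass().getDeclaredField(hint.getKey());
                field.setAccessible(true);
                Object instance = Class.forName(hint.getValue())
                        .getDeclaredConstructor()
                        .newInstance();
                field.set(target, instance);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real serializer the hints would be written next to the data (for example as an XML attribute) instead of being kept in a map, and the restored object would be filled with its own saved values afterwards.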

Though it’s not a very elegant solution, and neither are those to-XML conversions. Actually I still don’t get the hype around that text format. Obviously it’s great (and even a first) that your application achieves standards compliance and interoperability just by choosing the right library, and that not everybody is busy implementing their own buggy text parser, but still… for logical representations it’s not the right thing in my opinion. The only really cool thing I found is XPath queries, but that’s about it.

However, I spent most of the time on all my vapor projects, and I somehow doubt that any of them will be finished by summer. Obviously that’s very normal with vapor projects; I just made the mistake of actually talking about them and mentioning release dates.

But seriously guys, you actually think I have time for that? I mean, do I look like it?
And it is not just the actual coding itself. Using immature languages brings you enough trouble anyway, and when you don’t have time to fix the main libraries of that language, don’t want to monkey patch everything and don’t have the time to write new bindings every second hour, then you just have yet another unfinished project in your repository.

Yes, obviously I should have used something more proven, but limitations only get discovered above a certain project size anyway, when the program suddenly gets incredibly slow or shows (at least in the eyes of naive programmers like me) strange behavior. Though at least you start getting a feeling for which things might be problematic when faced with a new language, so you can explore its limits right at the beginning and don’t run into huge problems at a later stage because of some half-baked implementation you depend on.

Ah where was I… right, problems with static languages. Did you notice that everybody started to ditch their old fat static language and went over to the dynamic camp? Oh and I’m not just talking about far away from here somewhere in the interwebs. It seems to be happening right down the street at the next corner. Who knows where this will lead us to?
Will it end up in a maintenance hell with many different languages and their incompatible flavors, the result of constant forking and a too dynamic development path? Where it’s faster to rewrite the program for every new feature, and where the non-technical java-is-great managers have won and don’t want to hear about the introduction of yet another language and framework?
Or will we get a truly great environment with easy interoperability and robustness which attracts even more dynamic kids to jump in?

Today’s situation with the current languages seems to match the first option more, though the ongoing adoption of dynamic ideas lets us hope for a shift to the second one in the near future. As soon as it got widely accepted, there would be yet another division in our industry. It would be way harder to hide missing knowledge or unwillingness to contribute behind strategyBridgeAdapter patterns or endless abstractions over abstractions. What the blubs will do is another question, but there will be enough things left to do anyway.
Moreover, there would finally be a supporting type system which wouldn’t just put obstacles in our way.

Time will tell. Anyway, I wish our application had been written in a dynamic language like Lisp. Instead of converting some parts of the program into an XML file and then parsing it back, the saved data could be Lisp code itself, which could be loaded straight into the application without tedious parsing.

What a great world that would be.

OS essentials

Thursday, January 17th, 2008

The last time I took the train back home, I sat next to some computer science students complaining about the operating system course at their university.
“It’s way too low level”, “I’m never going to need this because in my Groovie world everything is painted in pink.”

Well, I do not agree here.

When you work on any project of a decent size, there will be moments where you find yourself struggling with the limitations and quirks of the OS you are using. This may be anything from performance issues to all kinds of allocation restrictions and filesystem limits.

If you attended the classes, you know about those problems in the first place and can code around those critical parts with appropriate methods. But if you have no idea about scheduling, threading, memory management, I/O stuff and security, to name just a few, there will certainly be a moment where the program is inexplicably slow or runs into strange errors and you have no idea what is going on. Or even worse, the external problem does not get identified as such and you start wasting ages looking for the point in your code base which is responsible for the unexpected behavior.

And what will you do then? Neither analysis programs nor your fancy tools will help you here. Suddenly, the pinkness of your Groovie world will become an all-covering, tacky, brown glue which slowly but steadily fills up your head-internal project cache and gives you that clammy feeling whenever something gets changed.

Frameworks and development environments get better and better over time and let developers create more reliable applications in less time. But even today, in the age of the Java VM, .NET VM, Whatever VM, you still have to deal with that thing which is running your hardware, because the perfect abstraction doesn’t exist.

That’s when your computer science knowledge from some dark place deep in your head comes into play. After the operating system course you can work around the limitations of your framework and operating system, because you know what you are actually doing and what you cause deep down in your framework or operating system. Otherwise you don’t have the slightest chance to fix, or even to detect, those kinds of problems. This ability also distinguishes you from the outsource-endangered species “code monkey”.

And that leads me to the point that you should generally understand the tools you are using, because one day they will strike back, you will find yourself in huge trouble and you cannot go on with whatever you were doing. In the moment where the fancy automation you are using fails, the only thing you can do is hope that there will be some kind of guru around who has the time, knowledge and inclination to help you. Or you just know what the whole thing is about, search the internet for somehow related keywords and find a solution for your problem within one or two hours.

However, obviously you cannot be familiar with every detail of all the apps you are using, but you should at least understand the core stuff. Making your work depend on something when you have no idea what it does, and just hoping that it will run yet another day, is a bit crazy.

Don’t Wait

Sunday, December 9th, 2007

Let’s assume my car’s light is broken. I have no idea about anything inside my car and can’t help myself. So I bring it to a garage and let somebody who knows how fix the light.
I arrive there but the mechanic tells me he has no time. There are other cars which need to be fixed first. He will text me when it’s done though.

That’s okay with me. These days you should consider yourself lucky when they even find time to talk to you. However, will I hang around there for I don’t know how many hours? God no, of course not! I can do plenty of other things than sitting in the garage: Go shopping, have dinner, work a few hours, meet somebody, etc.
If I told anybody I had waited five hours in the garage for my lights, he would tell me what a crazy guy I am. There are obviously better ways to spend five hours than watching the garage’s pin-up calendars.

As soon as we change into the software world though, it is considered normal by the majority to wait that long.

Of course there are ways to get a beep as well when things are ready for you, but many developers I met prefer to put their program waiting in the garage instead of letting it do something useful. Furthermore, when they mention how long they waited, they almost never hear what a crazy thing they just did.

An operating system provides exactly the mechanisms required to start an operation and then do more useful things than waiting for its result. How many of us use them consistently?

On Windows, for example, there are the overlapped I/O operations. You pass a special structure which contains an event when calling your long-running operation. If the system decides it should not execute it synchronously, the call returns immediately and you are free to do other things while your request gets executed by the operating system. After running out of stuff you could compute and prepare, you actively wait for your first call to end, and you will be woken up when the system has completed your request and presents you the results.

On Linux we have similar possibilities, of course. Asynchronous operations like “aio_read()” or “aio_write()” may be called, and the caller then has the possibility to do other things like allocating new buffers, computing the result from the last read operation, etc. When the new result is needed, “aio_suspend()” can be called to await the I/O operation’s completion. Note for my PnProgging friends: In case you are wondering, I/O multiplexing (select/poll) is blocking and thus not the same as asynchronous operations.
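The same start, do something else, then wait pattern exists in higher-level runtimes too. A minimal sketch in Java using AsynchronousFileChannel from the standard library; the class name AsyncReadSketch, the roundTrip helper and the "shopping" loop are invented for the example, and it assumes the file is small enough to arrive in a single read.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

final class AsyncReadSketch {
    // Writes the text to a temporary file and reads it back asynchronously.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("garage", ".txt");
            file.toFile().deleteOnExit();
            byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
            Files.write(file, bytes);

            try (AsynchronousFileChannel channel =
                         AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                ByteBuffer buffer = ByteBuffer.allocate(Math.max(1, bytes.length));

                // Leave the car in the garage: the call returns immediately.
                Future<Integer> pending = channel.read(buffer, 0);

                // Go shopping: any useful work that does not need the result yet.
                long shopping = 0;
                for (int i = 1; i <= 1000; i++) {
                    shopping += i;
                }

                // Only now wait for the read, roughly the role of aio_suspend().
                pending.get();
                buffer.flip();
                return StandardCharsets.UTF_8.decode(buffer).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("the light is fixed"));
    }
}
```

The point is the order of the three steps, not the API: the read is started first, the unrelated work runs while the operating system is busy, and the blocking wait comes last.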

Not really hard, is it? I just almost never see it in others’ code. They prefer letting their software wait instead of doing something more useful. Nobody would ever wait for hours in a garage because it’s obvious to anybody how inefficient that would be. In our software world though, it is not considered that silly anymore. Most of the I/O operations I saw in projects were done synchronously. As if our software was fast and we would have enough time anyway.

When I see such code and investigate why it has been solved that way, the responsible developers I talk to start to flame about how complicated asynchronous operations are. That’s like saying you cannot go shopping after leaving your car in the garage because it requires you to think about what you will do when you have finished shopping.

The sad thing is, this is not just the case with low level I/O operations. Asynchronous behavior is another great thing which doesn’t get adopted because there are too many guys out there who don’t understand it and prefer their old, simple and inefficient solutions.

You can find this kind of thinking almost everywhere. I experienced it many times. For example when I provided asynchronous execution for long-running operations in my classes. People started to claim that it’s too complex and were happy that they could go on abusing their UI threads. Of course they have a point here: asynchronous things require further synchronization and thus make the code longer and harder to read. But to me it sounds like somebody telling me we shouldn’t use C++ templates because they are too hard. Or we shouldn’t target parallelization because of synchronization issues. Or recursion should be avoided because it’s hard to understand.
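Offering an asynchronous variant of a long-running operation does not have to be much code. A minimal sketch in Java with CompletableFuture; slowSum stands in for any long-running operation, and the names AsyncOperationSketch, slowSum and slowSumAsync are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncOperationSketch {
    // Stand-in for a long-running operation.
    static long slowSum(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Asynchronous variant: returns at once, the work runs on a pool thread.
    static CompletableFuture<Long> slowSumAsync(int n) {
        return CompletableFuture.supplyAsync(() -> slowSum(n));
    }

    public static void main(String[] args) {
        // The continuation is the "text me when it's done" part.
        CompletableFuture<String> report =
                slowSumAsync(1_000_000).thenApply(sum -> "sum = " + sum);

        System.out.println("caller is free to do other things");
        System.out.println(report.join()); // wait only when the result is needed
    }
}
```

The caller decides whether to attach a continuation or to join later; in both cases the calling thread, for example a UI thread, is not blocked while the operation runs.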

But guess what? Software is hard!

Centralization

Wednesday, November 28th, 2007

Last weekend I traveled to a place where it looks better than sunny Switzerland normally does. Though that’s not very hard, as you can see:

[Image: Swiss weather]

Wonderful things, those trips to Las Palmas, by the way. I strongly suggest that anybody who has the time and lives somewhat close should go.

Anyway, I’m not talking about the sunny days I spent there. The really great thing was that I met an old friend I thought I had lost years ago. He turned up out of nowhere. So suddenly that it took me a few moments until I realized that it was really him. But there he was again, origin of pleasure and amusement, fountain of long and intense nightly discussions. Let me introduce him here!

[Image: TV]

He even made it to TV. Boah!

Mixing Data

The problem we see here is not just the result of a crazy guy who put his long running service on an OS which is notorious for getting slow and unstable after using it for a long time.

I think it gives an adequate example of how putting everything into a central place may mess everything up. If you save all you have into those database-oriented file/configuration pools, you give away all the control you have over your data and thus over your application as well.
Of course, while saving your application’s data on your own you may receive an error as well and end up with an invalid file the next time, but at least you can somehow handle that (by holding tmp files or whatever). If your central data storage gets corrupted, you may not even be able to access the incomplete data at all.
Moreover, you are not the only one who is accessing that storage. When the power cord gets pulled out while hundreds of other clients are saving their stuff to the same place, it’s way more likely that the internal structure of that storage gets corrupted (more likely; it may be avoided, of course).
Then you face the problem that it is maybe your data which is not accessible anymore after the next reboot, because some other guy was saving his stuff and accidentally spoiled the data of all other clients, and you can do nothing about it.
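Handling your own files “by holding tmp files” can be as small as the following Java sketch: write the new version to a temporary file next to the target, then move it into place in one step, so a crash leaves either the old or the new file. SafeSaveSketch, save and demo are hypothetical names, and whether an atomic move replaces an existing target is platform-specific (it does on the usual Linux and Windows filesystems).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeSaveSketch {
    // Write to a temporary file first, then swap it in.
    static void save(Path target, String content) {
        try {
            Path dir = target.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "save", ".tmp");
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            try {
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback for filesystems without atomic moves (no longer crash-safe).
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Saves two versions into a fresh directory and returns what is on disk.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("config");
            Path target = dir.resolve("settings.txt");
            save(target, "first");
            save(target, "second"); // replaces the old version in one step
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the write fails halfway, only the temporary file is damaged and the previous version of the target is still readable.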

In the Windows world there are a few failed DB-design based features:

Vista Filesystem (WinFS)

The guys at Redmond wanted to store all files in a huge database-like filesystem. You have transactions, can scale over several computers, everything! The real Wow effect!
However, it did not ship with Vista and I don’t know what happened to it. I guess it imploded under its own weight. Maybe it was too slow, too buggy or whatever.
You can ship any application with some bugs, but a filesystem has to be perfect. One small bug and data starts to disappear in random places, which will make your users run away in droves.

Anyway, it seems it’s not trivial to correctly implement storage with a DB design so that it still achieves good performance and provides the essential features of a filesystem; otherwise it would have been shipped.

Registry

Another nice example of what happens when data from different clients gets mixed together.
It is the last resort of uninstalled applications and an everlasting producer of problems. One wrong edit and your system doesn’t boot anymore. Moreover, it is the unofficial waste bin of unsorted configuration data for all kinds of applications. Not to be compared with the relatively clean structure of a multi-file /etc directory.

Outlook/Exchange

Normal users tend to use their mail account as a searchable file storage. Nothing wrong with that. They just noticed that the normal Windows search is crap and that in their mail program they can tag their files with emails and separate them into categories. The only guy who is not so happy about that and responds with mailbox size limits is the one who has to maintain all those things. He has to do that because all mails get stored in one gigantic file, and when that file exceeds the 16GB storage limit he has a problem. (If he wants more he needs an expensive update, has to hack the registry etc.) Why not provide a Unix-like mail spool with a single file for every user? Another example of how you create problems by storing everything in the same place.

Everything Is A File

The Unix philosophy, where everything from sockets to handles to drivers is a file, is a much simpler and thus more powerful and flexible way than inventing a new architectural paradigm for every emerging problem you face. Furthermore, coming up with new approaches for such basic things instead of using a proven abstraction model just introduces unnecessary complexity which will sooner or later (maintenance) cause tremendous headaches.

Trouble

By making your code depend on others you increase the probability that it fails. This is especially true if your dependency is accessed by other clients, as is the case with those file/configuration databases. Every external component you use may fail, and typically you have no way to fix it. There are already enough problems an application has to face. By requiring yet another fancy module you just add another hurdle which could break your program.

It’s that simple.

Your editor and you

Wednesday, November 7th, 2007

Yesterday I saw somebody implementing a piece of software without knowing what the actual problem was. There weren’t just a few details missing, almost everything was unclear by then. He just started hacking like that: Classes there, interfaces there, unit tests etc, etc. It seemed so ridiculous to me that I had to ask him. “Ey, we can refactor later” was his response and he continued. Well, of course everybody has his own strategies, I just couldn’t imagine that in the end he would have achieved a clean design without having thrown everything away. How can you be that crazy and just spill out code without having any concrete details about what you should actually do?

His next reply was “Ey, look mate: You can press here and here, and then magically everything gets reordered. It’s so fancy!”.
Soon he ran into a problem. Ha! You have a race condition somewhere, now what are you gonna do with all your refactoring tools? He calmly started his debugger, tried / failed / built / continued until he got it fixed. He wasn’t coding in vi, nano, EditPlus or TextMate. Of course not, it was one of those Java editors with trillions of features.

And he felt safe using it. Oh yes. He loved that warm and fuzzy feeling of live API browsing when pressing a single dot. Those underlined statements which warned him about possible pitfalls like Microsoft Word does with grammatical errors. The relaxed feeling he had because he knew he would be able to refactor all his code later into a nice class hierarchy anyway.

We all like it, don’t we? Those bloated and fat enterprise programming environments with the sweet features they provide. You don’t have to think about algorithms anymore because “I can debug them later anyway” and you don’t think about performance issues because “My performance tool will highlight the bad statements”. Soon you don’t even think about design anymore because “I can update my software architecture in that other UML tool”.

But do you remember the first time you opened that editor? Oh yes, it took ages to load (you realized why it shipped on a DVD). It was fat, cluttered with options, menus and wizards. Everything was so slow because of all the bloat it had. You missed the tactile feedback from your TextMate times.

Anyway, it was worth giving up that old stupid editor! Look at all the new and shiny features you have now: PerformanceAnalyzer which puts a red bullet in front of computation intensive statements. The UnitTestGenerator. The DocGenerator. The DataStructureWizard. The AutoPatternizer.

Soon you started to code before thinking. You ignored your carefully crafted code conventions: “Look it up in DocGenerator”. You stopped thinking about architecture: “UML Refactorer will do it later for me”. Anyway, you could do it: Your imaginary friends were waiting to help you with all your coding issues, weren’t they?
Hold on a second, did you forget? Where were they before that last release, when your boss came in and yelled about the disastrous performance of the module you wrote? PerformanceAnalyzer didn’t help you, eh? What about that other guy who had to write an extension for your application, but the “UML Refactoring Tool” failed because of an “internal error”? You gave up your lovely editor for a few fancy features which barely work.

However, time passed and the more code you were spilling out, the more you forgot about the old days. You unconsciously accepted everything which surrounded you. Slowly you became like the editor you use and feared the first time you opened: Fat, bloated and slow.

Close your eyes, take a deep breath and go back a few years in time. Now, enjoy the old but refreshing feelings you had with your simple editor.

You are not set to an instance of an object

Monday, October 29th, 2007

Today in school we had some exercises related to synchronization mechanisms in Java. We grabbed the exercise descriptions and templates from the server and started coding. After five minutes the guy next to me started to yell, asking why he gets a NullPointerException all the time and why Eclipse’s jiangadeege-bulb doesn’t offer a fix for his problem.

We looked at it and some stack trace hopping later we got it fixed. After a while though, the guy on the right told me that he was experiencing the same problem (later we found out that there was an error in the provided template) and asked me to help him as well. I started to list the wonderful advantages of having functional languages and told them that most of those problems magically disappear in a language like Haskell. Some of the guys got it “cool”, this post is for the other ones.

Basically, you can divide the different ways in which you structure your code into four paradigms. That is Procedural, Functional, Logical and Object Oriented. There are also others (e.g. Aspect Oriented) but I will ignore them for today.

The procedural way is approximately what you would do in C (assuming you are not using your own OO emulation with structs or so), where you have functions, variables and calls. In OO you have objects, methods and messages, but everybody does their own thing anyway.

And then there is the functional way. Unlike programs from other paradigms, in which you change state and data all the time to achieve your result, functional programs neither hold changeable state nor use mutable data (ignoring I/O for now). This is maybe the most obvious difference.

Maybe the exception before occurred because you forgot to call init() and used matchesRegex(…) directly. Or concurrent calls on getFac() resulted in race conditions. Or you were iterating over a collection while it got modified. Or you “cached” a state from an object because you thought it will be too costly to call every time but in the end failed to update it properly.

All those problems occurred because a common resource was accessed incorrectly. As soon as you start using those shared resources (that does not have to be the all-time-evil singleton, class attributes may get you into trouble too) you need to think about how you initialize them and in which order. Which values may be null? Where do I verify all my data? What happens on concurrent calls, do they mess everything up? What will happen when I throw an exception, do I reinitialize properly?

One of the main points in functional programming is that you avoid having shared resources as much as possible. The matchesRegex(…) does not have any mutable attributes in its class and initializes everything on its own. The getFac() does not calculate the factorial from a value passed to the class’ constructor, it accepts that value as a parameter. All needed data is passed directly from the caller to the method; you don’t have any locking problems anymore because nothing is shared.
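A minimal Java sketch of what that looks like. The exercise template is not shown in this post, so the bodies below are invented for illustration: factorial is a hypothetical parameter-taking replacement for getFac(), and matchesRegex builds its Pattern itself instead of relying on a prior init() call.

```java
import java.util.regex.Pattern;
import java.util.stream.LongStream;

final class PureFunctions {
    private PureFunctions() {} // no instances, hence no instance state to forget

    // Everything the function needs arrives as a parameter.
    // Note: the result overflows a long for n > 20.
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        return LongStream.rangeClosed(2, n).reduce(1L, (a, b) -> a * b);
    }

    // No init() required: the pattern is created from the arguments on every call.
    static boolean matchesRegex(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));
        System.out.println(matchesRegex("a+b", "aaab"));
    }
}
```

Compiling the pattern on every call costs some time; caching it would be the kind of shared state that has to be initialized and kept up to date again.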

Furthermore, because of its stateless design, passing the same parameter to a function will always return the same result. It does not depend on that other method DoEvil() which sets the class attribute google you use to a state you never thought of before. It’s more like math: the square root of four will still be two even if you calculated exp(4) a thousand times before. Those functions do not share data with each other and thus won’t affect each other. Because everything runs on its own, the compiler also has the possibility to optimize and split your program over several threads and CPUs automatically. Of course, no shared data means no need for synchronization mechanisms, which lets the program run truly faster because no synchronization information has to be passed between the CPUs.
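Java will not split a program automatically the way a Haskell compiler might, but a side-effect-free function can at least be handed to a parallel stream without any locking. A small sketch; ParallelSketch, square and sumOfSquares are hypothetical names.

```java
import java.util.stream.LongStream;

final class ParallelSketch {
    // Pure: the result depends only on the argument.
    static long square(long x) {
        return x * x;
    }

    // The same pipeline, run either on one thread or spread over several.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream numbers = LongStream.rangeClosed(1, n);
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers.map(ParallelSketch::square).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false));
        System.out.println(sumOfSquares(1000, true));
    }
}
```

Both calls produce the same number because square touches no shared data; a function that updated a class attribute would need synchronization before it could be used like this.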

“But we have to do it in Java” I hear you whining (I bet some of them are actually happy about it so they can continue using their lame editor… giusi?)

Java isn’t a pure functional language at all, but that functional stuff is just a programming paradigm and thus not bound to a language. Because Java is a multi-paradigm language and tries to be THE language for everything (which somehow explains why it’s so crappy), you can also implement your programs in a more or less functional way. Of course it’s not as easy as it would be in a special-purpose language, nor will you get the benefits of an FP environment. But if used correctly, you eliminate the whole category of those side-effect errors.

In most languages you can program in a functional style. While it won’t look as beautiful and won’t achieve the same parallelization as, for example, in Haskell, it does protect you from an annoying class of bugs.

Write functional code, even if you use a lame language!