What shouldn't you do to get out of a dead platform?

So far I’ve talked about the hidden costs of sticking with an old platform and discussed what some of those hidden costs are. Now I’d like to talk about what happens when you smell a dead platform rotting and try to do something… and then realize you’ve done the wrong thing…

At the aforementioned OpenVMS-using place where I used to work, my group was created when the company started to get dissatisfied with OpenVMS, so we were chartered to give the company experience with doing things a little differently. Naturally, each of us took things as far as we could in whatever directions we could dream up.

Sometimes it’s better to re-invent the wheel

In our group, we had no hope of being able to use any pieces of the old model right away. We didn’t want to write huge chunks of code all over again from scratch, so we licensed some libraries. We even got the source code for them. They plugged into MFC, so really our platform was Windows + MFC + these other libraries, not just Windows.

Now, I should note the biggest thing that people don’t get about how shared-source and open-source work differs from just getting the source code of a library. We found and fixed some bugs in their code. However, every time they released a new version, we had to re-apply our patches, because we sure as hell weren’t going to do their coding for them.

And these libraries sucked in a pretty extreme fashion. Stuff worked well enough for them to give us a nice friendly demo and it worked well enough for us to get fairly far into our code before we realized it wasn’t working… and then suddenly we’d hit a wall where we realized that, fundamentally, whoever wrote the code was fairly dumb.

As a rule, re-inventing the wheel is not a good thing. But I still feel that this is one case where we should have reduced the parts of the platform we didn’t control, even if that meant writing more of the code ourselves.

Switching platforms is easy. Switching to the right one is hard

I worked at another company that wanted to move from a purely Unix-based platform (generally SGIs and Suns) to Windows and maybe some other future platform. So they did what everybody was making a lot of noise about: they ported stuff over to Java. Except that Java wasn’t fast enough, so we left the heavy-lifting code in C++. That code was designed to be run as a command-line application on a beefy server (back when SGI made fancy colored cube machines that were 64 bit and fast and better than what you could pick up at Fry’s). Except that the UI for setting up that heavy-lifting code used OpenGL, so we had to write a bastard mixture of C++ and Java. And this was before anybody had bothered to write a good way to do this, or even to pass GUI handles across the C++/Java boundary.

This was a huge testing nightmare because they had decided to fix their longstanding architectural issues as part of the port to Java, so you had two categories of bugs to deal with: the ones that came from the port and the ones that came from the re-architecture.

So, by the time I was worn down by testing the heck out of it, I surmised the situation was this: They would have gotten just as much bang for the buck, if not more, if they had just ported their X Windows code over to Windows and maintained two codebases in parallel.

Furthermore, Java didn’t turn out to be the hyped platform that people made it out to be on the desktop, so the port turned out to be a really bad idea.

When you are really in trouble, tread carefully

Now, back to the OpenVMS company. See, they had some hardware that didn’t work out so well for them and realized it was time to port. They decided that a proper port would take five years, maybe more, to get together. So they decided to re-architect the system so that it no longer depended on OpenVMS and was built on a super-modern distributed platform.

(A slight digression here: The hardware was perfectly functional, just completely and utterly unsuitable for their workloads in a way that wasn't instantly apparent.)

The biggest problem, and the hardest one to beat, is that you wrote your initial code when you had ten folks who all reported to the same manager. But now you have ten times that many, and all they have in common is a high-level exec.

Oftentimes, you’ll have enough differences of opinion and schedule that nobody can come to an agreement, even if you start an architecture committee. You can easily discover that, five years later, you have made no progress.

I should note that this is even worse than doing the wrong port, because it is very likely that all of the effort you’ve spent on this problem is wasted.

Even without this people problem, you can get yourself into real trouble. I was working on a project at a different company and we were facing some scalability issues. We figured we could do a careful refactoring to get the architecture to the point where it was scalable. Except that, as we got further into our plan, we quickly realized that it would require a lot more effort to get right than we’d thought, and that we’d very likely run out of performance headroom before we were done.

Tread very carefully.

A vacuous long-term goal doesn’t help

Now, I’ve worked on some projects where a wrapper class was maintained, not because we had a palpable need to switch the implementation, but because we might need to one day. In every case, the wrapper’s been violated. On the other hand, if a wrapper’s necessary, people will make sure it works, even if it’s just there to enforce an organizational boundary between two teams.

Furthermore, if you know what’s different between the two platforms, you can actually make an effective wrapper, instead of wrapping everything “just in case”.
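To make that concrete, here’s a rough sketch of what a narrow wrapper might look like. It isn’t from any of the projects above. The function name `user_config_dir` and the choice of “where do per-user settings live?” as the one known difference between Windows and Unix are my own assumptions for illustration. The point is only that the wrapper covers the difference you actually know about and nothing else.

```python
import os
import sys


def user_config_dir(app_name, platform=None, env=None):
    """Return the per-user settings directory for app_name as a string.

    Hypothetical example: this is the single place where the application
    admits that Windows and Unix-like systems disagree. Everything that
    behaves the same on both platforms is called directly, unwrapped.
    """
    # The platform and environment can be passed in so that both branches
    # are testable from one machine; by default the real ones are used.
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env

    if platform.startswith("win"):
        # Windows: per-user application data lives under %APPDATA%.
        return env["APPDATA"] + "\\" + app_name

    # Unix-like: prefer $XDG_CONFIG_HOME, else fall back to ~/.config.
    base = env.get("XDG_CONFIG_HOME")
    if not base:
        base = env["HOME"] + "/.config"
    return base + "/" + app_name


if __name__ == "__main__":
    print(user_config_dir("demo"))
```

Because the wrapper is this small, there’s not much for anybody to violate: nobody has a reason to go around it, and if the two platforms turn out to differ somewhere else, you add one more function when you actually hit the difference.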

Next up…

I’m going to wrap this up with what I think are some useful rules to have in mind when you are picking or sticking with a platform…