The USS Quad Damage

Sea Monsters and OSGi

You are my white whale, OSGi! I'd do my captain Ahab yell as I pierced thee with a polearm of some sort but I haven't actually read the book.


If you’ve encountered OSGi anywhere, chances are you hated it with a burning, fiery passion, whether you were a software developer or a user. After using it for a while and wanting to stop the hurting, I had an epiphany and... well, I’ll let you uncover the aftermath for yourself:

Perhaps the problem isn't too much OSGI, but too little...

—Sunny Kalsi (@thesunnyk) January 7, 2014

Let me explain both the epiphany and “the problem with OSGi”, as well as the initial developer experience with it, because most people haven’t even heard of OSGi.

Read this before you start the adventure

You’ll likely encounter “OSGi” when you’re trying to achieve a relatively modest goal in a heavily pluggable Java application (like Eclipse). You’ll notice a sea of undocumented interfaces peppered throughout various packages that will enable you to achieve your goal. When you’ve finished dry-retching because there’s really nothing else left in your stomach you’ll just think pragmatically, write your code, and try and test your application (or plugin, or whatever it is). People have to work in abattoirs, your life isn’t that bad.

The unit tests will go well enough; everything’s an interface so you can mock things to your heart’s content. Unfortunately it isn’t clear how the interfaces are actually supposed to work so you’ve guessed a bit but you should figure that out when you run the thing, right?

You start the (whatever it is) up, or more likely, do a jiggery pokery thing. This gives you a ClassNotFoundException, even though the classes are clearly there and the ClassLoader knows they’re there, so you restart. Now you still get a ClassNotFoundException, but for different classes. After messing around for a bit it starts to work and you can’t figure out why, but oh well, at least you can make progress.

Then about an hour later while testing something unrelated, you get ClassNotFoundException again.

A week later, either a package somewhere has been updated, or you get a report from the “field”, or something minor and insignificant changes. Now, a thing is not working. Sometimes there are no error messages and sometimes there are, but it’s hard to tell because the app behemoth throws error messages around like they were candy and the app was trying to attract children.

Several years after therapy, you try and clear your head and figure out what the hell you were even thinking. In trying to make progress you forgot to do what you would normally do in this situation: read the documentation. You figure OSGi sucks but you’ll try and figure out how it ticks so you can defeat it.

What is OSGi?

Firstly, think about Java. Operating systems are designed to let processes load code at runtime; that’s what shared libraries and dynamically linked libraries are for. However, once a process is running, it’s quite difficult to add, replace, or remove a library in a controlled way (the machinery that manages this is called a “component model”). The JVM has class loaders, but no built-in mechanism for versioned components that can be installed, started, and stopped while the application runs. OSGi “solves” this problem in a fairly simple (for the user) way. Java JARs get some extra headers in their manifest with data about which packages they export and which they import, there’s stuff for starting up services and long running tasks... and... that’s it.
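To make the manifest part concrete, here is a minimal sketch that uses the JDK’s java.util.jar API to build the sort of headers an OSGi bundle carries. The header names (Bundle-SymbolicName, Export-Package, Import-Package) are the standard OSGi ones; the bundle and package names under com.example are made up for illustration, and a real build would normally generate this file with tooling rather than by hand.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class BundleManifestSketch {

    // Builds a manifest with typical OSGi headers.
    // All com.example.* names are hypothetical.
    static Manifest buildManifest() {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // A valid manifest needs a Manifest-Version entry.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Bundle-ManifestVersion", "2");
        main.putValue("Bundle-SymbolicName", "com.example.greeter");
        main.putValue("Bundle-Version", "1.0.0");
        // Only this package is visible to other bundles.
        main.putValue("Export-Package", "com.example.greeter.api;version=\"1.0.0\"");
        // Packages this bundle expects some other bundle to export.
        main.putValue("Import-Package", "com.example.logging;version=\"[1.0,2)\"");
        return manifest;
    }

    // Renders the manifest as it would appear in META-INF/MANIFEST.MF.
    static String render(Manifest manifest) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            manifest.write(out);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render(buildManifest()));
    }
}
```

Everything OSGi knows about what a bundle offers and needs comes from text like this, which is why a missing or mistyped entry is invisible to the compiler.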

OSGi doesn’t seem like a monster after all. I mean, you can see where all the problems come from! Because OSGi bundles may not be loaded, or because your manifest fails to import all the correct things, or because some other jar fails to export the correct things, even though your code compiles fine, it may fail at runtime with a ClassNotFoundException. Also, because interfaces are dynamically loaded, you could have the wrong version of an interface and then strange things will happen.
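The “classes are clearly there” failure is easier to see in miniature. The following is a toy, not how a real OSGi framework is implemented: a hypothetical ImportOnlyLoader that refuses to load anything outside a list of “imported” packages, which is roughly the visibility rule a bundle’s manifest imposes. Both classes used below ship with the JDK, yet only one of them is visible through the restricted loader.

```java
import java.util.Set;

class ImportVisibilitySketch {

    // Toy class loader: delegates to its parent only for "imported" packages.
    // This loosely mimics manifest-driven visibility; it is not OSGi itself.
    static final class ImportOnlyLoader extends ClassLoader {
        private final Set<String> importedPackages;

        ImportOnlyLoader(ClassLoader parent, Set<String> importedPackages) {
            super(parent);
            this.importedPackages = importedPackages;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            int lastDot = name.lastIndexOf('.');
            String pkg = lastDot < 0 ? "" : name.substring(0, lastDot);
            if (!importedPackages.contains(pkg)) {
                // The class exists in the JVM, but this "bundle" never imported it.
                throw new ClassNotFoundException(name + " (package not imported: " + pkg + ")");
            }
            return super.loadClass(name, resolve);
        }
    }

    // Creates a restricted loader for the given imported packages.
    static ClassLoader bundleLoader(String... imports) {
        ClassLoader parent = ImportVisibilitySketch.class.getClassLoader();
        return new ImportOnlyLoader(parent, Set.of(imports));
    }

    // Reports whether the loader can see the named class.
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader bundle = bundleLoader("javax.crypto");
        System.out.println(canLoad(bundle, "javax.crypto.Cipher"));     // imported
        System.out.println(canLoad(bundle, "javax.net.ssl.SSLContext")); // present, not imported
    }
}
```

Running main prints true and then false: the second class is on the machine and the parent loader could find it, but the restricted loader never asks.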

Worse, you cannot find out until runtime. There’s a whole class of compile-time problems — an interface method has been changed or removed, a bundle can’t be found for some reason, even though the jar is right there — that have become runtime problems.

This is why I don’t understand Neil Bartlett’s comment (maybe there’s something I still don’t understand):

@charlesofarrell @thesunnyk When OSGi is properly used, you just don't get ClassNotFoundException.

—Neil Bartlett (@nbartlett) January 7, 2014

OSGi apps tend to have that feel of clunkiness, which you see as a user as well as a developer. Nothing feels tightly integrated, nothing feels friction free. But looking at it from the other side, you can almost pity OSGi, because when you know its weaknesses, you can cater for them. OSGi is a problem, but that problem has a flipside: the dynamic loading. There’s just no other way of achieving that without having the same set of problems. The issue is almost not OSGi itself, but who OSGi hangs out with.

A Toxic Scene

The problem is, you never get “OSGi” all by itself, you get a menagerie of tools, each toxic on its own, but combining to create a concoction so vile I’ve... lost interest in this metaphor. Anyway, one of the major problems is dependency injection.

The issue with dependency injection is that it’s “viral”, which people originally argued was a good thing, but now I’m not sure. In any case, the problem is that everything is finely sliced into interfaces, and these interfaces tend to find their way into your component’s exports, and then into other components’ imports. That means that if you ever change an interface (one which you probably consider internal), something’s going to break, and there’s no way to even test for it, much less catch it at compile time.

On the other side of that equation, it’s always possible, and very enticing, to use an interface you can see to do a thing you want to do, even though whoever wrote the other component hasn’t “officially” let you see it (which might even matter if there was documentation for the “official” stuff). Now you have a dependency on a version of some code which will likely change under your nose right as you forgot that you were even using it.

Secondly, there are usually XML files everywhere. What they define is hard to know, the value they provide is dubious, they’re never edited outside of the development team (even though that was their original design intent), and they again serve to convert a whole bunch of compile-time issues into runtime issues. Error in the XML file? Runtime problem! String that’s supposed to be an enum value but doesn’t match? Runtime! Class with misspelled name or package? Runtime!
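Here is a small sketch of why names held in configuration strings only fail once the code runs. The Scope enum and the resolve helpers are hypothetical stand-ins for whatever a framework does with the values it reads out of an XML file; the misspellings are deliberate.

```java
class StringlyTypedConfigSketch {

    // Hypothetical enum that a config attribute is supposed to name.
    enum Scope { SINGLETON, PROTOTYPE }

    // Looks up an enum constant from a configured string.
    // A typo compiles fine and only fails when this line executes.
    static String resolveScope(String configured) {
        try {
            return "ok: " + Scope.valueOf(configured);
        } catch (IllegalArgumentException e) {
            return "runtime failure: no such scope " + configured;
        }
    }

    // Looks up a class from a configured fully qualified name.
    static String resolveClass(String configuredClassName) {
        try {
            return "ok: " + Class.forName(configuredClassName).getName();
        } catch (ClassNotFoundException e) {
            return "runtime failure: no such class " + configuredClassName;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveScope("SINGLETON"));
        System.out.println(resolveScope("SINGELTON"));          // typo in the "XML"
        System.out.println(resolveClass("java.util.ArrayList"));
        System.out.println(resolveClass("java.util.ArayList")); // misspelled class name
    }
}
```

Had the same names been written directly in Java source, both typos would have been compile errors instead.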

There’s also often a panoply of huge and useless services tagging along for the ride. OSGi necessarily makes application startup slower, but often, it’s the other things that are starting up that make the developer experience awful. This huge startup time makes you more desperate to never restart that thing, so you rely ever more heavily on OSGi.

The Epiphany

The epiphany is that everyone hates OSGi, and this is precisely the thing that causes this toxic cycle. Running away from the pain just makes it worse. The only way out of this madness is to embrace OSGi, get rid of its friends, and start building components only when they’re needed.

This means bundling components in a “static” way rather than via OSGi. This means exporting interfaces sparingly, having a very small touch area, small APIs which communicate efficiently and in a flexible way. This means considering what really constitutes a service and what’s really just a library. In the end, if you only have 5 or 10 components in your application instead of 500 or 1000, the classes of pain that can bite you are reduced by orders of magnitude, and OSGi starts to pay for itself in spades.
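As a rough illustration of a “small touch area”, the sketch below keeps exactly one interface public-facing and hides the implementation behind a factory method. It is squeezed into a single file so it runs on its own; in a real bundle the interface would sit in the one exported package and the implementation in a package that is never exported. All names here (Greeter, DefaultGreeter, newGreeter) are hypothetical.

```java
class SmallApiSketch {

    // The entire exported surface: one small interface.
    // In a bundle this would live in the only package listed in Export-Package.
    interface Greeter {
        String greet(String name);
    }

    // Implementation detail. In a bundle this would live in a non-exported
    // package, so other components could not come to depend on it.
    private static final class DefaultGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Callers obtain the service through the interface type only.
    static Greeter newGreeter() {
        return new DefaultGreeter();
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("Ahab"));
    }
}
```

Because callers can only name Greeter, the implementation class can be renamed, split, or replaced without touching anyone else’s imports.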

The problem isn’t too much OSGi, it’s too little.