Modeling Operating Systems in Avian Computing

One of the initial design goals of every operating system (OS) is that it be lightweight and have minimal performance impacts on the running applications. Unfortunately, as the OS matures, it begins to take on baggage and assume a heavier footprint.

The goal of lightness probably has an unintended consequence; it probably makes it harder for the developers to understand what their code actually did. Lightness generally means terseness, meaning no excess code, not even any diagnostic code.

An interesting way to overcome this apparent limitation would be to use Avian Computing and ConcX to model the OS being designed. Each of the processes to include in the OS would initially be a ConcX entity that performs the task(s) of the final process.

There would be several advantages of this method. First, and perhaps most importantly, using the built-in logging features in ConcX entities, it would be simpler to identify the conditions that lead to a failure. This would be increasingly true as the amount of parallelism built into the OS grows. The more sophisticated and parallel an OS, the greater the need for help locating the cause of any failure.

For example, assume that the operating system will use Semaphore X to control some resource and that semaphore became unavailable to the various processes. In ConcX, it would be relatively simple to find which of the threads had obtained Semaphore X, when exactly it happened and what it was waiting for that was preventing it from releasing Semaphore X. Assuming the developer had instrumented his bird properly, it would have recorded when it ate Semaphore X and any problems or issues that it encountered that prevented it from releasing Semaphore X. The developer might even have made it easier to diagnose by writing an error food object out to the TupleTree, such as when some value is expected to be Zero or One and instead it is a negative value or greater than One.

Which leads to the second advantage of modeling the OS in ConcX; the ability to modify the system with minimal effort. When a potential fix is identified, it can be inserted into the appropriate bird(s) and the system restarted. No major recompiles or installing the executable in the test system(s). And much like with Unit Testing, a test bird that is configured to always produce the error condition and the system run to verify that the system handles the error appropriately.

Beyond error correction, easy modification of the modules in an OS makes it easier for developers to experiment with how the functions in the OS are allocated. For example, is it better to have one code module with a huge IF statement that then calls sub-modules or is it better to have a bunch of separate special-purpose modules? Should Capability X be included in Function Y or should they be separate functions?

Additionally, it should be easier to identify which birds are the bottlenecks. If one or a couple of birds are performing some capabilities that always cause other birds to wait excessively, then those birds can be analyzed to see if they can be split into separate functions or simplified or streamlined, etc.

A third advantage is the ability to catch “black swan” events. Unexpected conditions are frequently difficult to identify because the developers “knows” that some value will always be Zero or One so never considers the possibility that it might be outside the range so won’t find that error until they consider what happens when it does fall outside the range.

If the developer codes his birds correctly, any unexpected values will be recorded in the bird’s history and/or will write an error object to the TupleTree. This assumption-trapping is easy to write in ConcX and has minimal impact to overall performance but pays huge dividends by catching unexpected conditions that can lead to unexpected behavior by (or crashing of)  the OS. Identifying the failures of code or values that are “too simple to fail” and identifying all the conditions that must be correct for the module to succeed effectively produce a “criteria list” that the developer of the final OS must be able to meet.

Another advantage of modeling in ConcX is that low-level errors could intentionally be allowed to propagate thru the OS to study the effect on the system. Errors are not all created equal. Some errors could cause catastrophic results; other errors might have zero overall impact because they null out and are internally eliminated. Knowing which errors have the greatest potential for affecting the OS allows developers to focus their limited time and attention to where they will have the greatest impact.

Perhaps most importantly, modeling the OS in ConcX will allow the developers to think about and interact with their new OS at a higher level of abstraction. Conceptually, they can move functionality around and adjust behaviors with minimal costs in time and effort. ConcX provides a loosely-coupled environment where changes to one piece of code will only affect other pieces of code thru a well-defined interface (the TupleTree).

And then at the end of the modeling phase, developers have a working “flowchart” from which to code the actual OS. All of the time spent coding the birds in ConcX is thrown away. Every bit of the analysis and effort to understand the new OS is kept. The most critical portions of the OS can be coded quickly and with confidence because they are already well understood and well defined because of the time spent modeling the OS in ConcX.

This entry was posted in New Development Model on by .

About AvianWriter

Microcomputing and I grew up around the same time (a long time ago). I've always been fascinated by parallel programming but always found it more difficult than it needed to be. After much consideration, the Avian Computing project was born. Oh yeah, and I like long walks on the beach.

Leave a Reply

Your email address will not be published. Required fields are marked *