Monday, August 18, 2014

Sunday, August 3, 2014

As I've been using PyOpenWorm, I've encountered some cracks in the design. The primary issue is the interaction between configuration, object properties, and database set-up. Some examples of problems that have resulted:
  1. Run-time configuration is set up so that each object has to receive its configuration anew from the object that creates it, or else fall back on a globally set default configuration. This framework is brittle: it falls apart whenever the appropriate configuration is not passed to Configureable objects contained within other Configureable objects, which can leave configuration values hanging around between connect() and disconnect(). This isn't a big deal for simple one-off scripts, but it can be annoying when testing or when designing more complex scripts.
  2. Registration of a Property doesn't take place until the first time its owner is initialized. As a result, we cannot resolve a Property to an object directly from the graph without first creating one of its owners. Because property names are created dynamically, identical properties also end up with different names in sub-classes.
  3. Each DataObject sub-class has to be registered as a separate step from defining the class. This is a minor annoyance when creating new classes.
  4. When referencing objects in another namespace, it's necessary either to know the structure of generated namespace URIs (e.g., http://openworm.org/entities/Neuron/) and recreate them, or to create an object of the desired type. The first approach is sub-optimal because we might have reason to change the structure of the URIs in the future, which is already a problem for some auxiliary methods designed to extract information from a URI. The second approach is also problematic since it requires creating objects which are never used otherwise, which adds overhead for the garbage collector as well as the programmer.
My proposed solution to these problems is to push the configuration and class-dependent initialization to the class-definition phase through the use of Python's class decorators, which run when the library is loaded. This takes care of the fourth point by making all of the namespaces and the namespace structure static by the time any object is created.

The third point is covered by folding the current registration into the class decoration.
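As a rough illustration of how a class decorator could cover both the fourth and third points, here is a minimal sketch. The names data_object, BASE_URI, REGISTRY and rdf_namespace are made up for this example and are not PyOpenWorm's actual API; URIs are plain strings to keep the sketch self-contained.

```python
# Hypothetical sketch: these names are illustrative, not PyOpenWorm's API.
BASE_URI = "http://openworm.org/entities/"
REGISTRY = {}  # maps class name -> class


def data_object(cls):
    """Class decorator; runs once, when the class statement is executed."""
    # Fix the namespace at class-definition time (fourth point).
    cls.rdf_namespace = BASE_URI + cls.__name__ + "/"
    # Register the class in the same step as defining it (third point).
    REGISTRY[cls.__name__] = cls
    return cls


@data_object
class Neuron:
    def __init__(self, name):
        self.name = name

    def identifier(self):
        # Build an instance URI from the class-level namespace.
        return self.rdf_namespace + self.name


# The namespace can be referenced without creating a throwaway Neuron.
print(Neuron.rdf_namespace)
print(REGISTRY["Neuron"] is Neuron)
```

With this arrangement the URI structure lives in one place (the decorator), so changing it later would not require touching code that only reads rdf_namespace.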

The second point is covered by attaching each Property subclass to the owner's class while retaining the initialization in the __init__() call for accessing owner-instances.
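A sketch of what attaching Property subclasses at class-definition time might look like follows. It assumes a hypothetical with_properties decorator and a simplified Property base class; the property names "receptor" and "neurotransmitter" are only examples.

```python
# Hypothetical sketch: a simplified Property and decorator, not the real API.
class Property:
    """Base class for properties; one subclass is created per owner class."""
    link_name = None
    owner_type = None

    def __init__(self, owner):
        # Owner-instance state is still set up in __init__.
        self.owner = owner
        self.values = []

    def set(self, value):
        self.values.append(value)


def with_properties(*names):
    """Create the named Property subclasses when the owner class is defined."""
    def decorate(cls):
        # Copy any inherited table so the parent's table is not mutated.
        table = dict(getattr(cls, "properties", {}))
        for name in names:
            table[name] = type(
                cls.__name__ + "_" + name,
                (Property,),
                {"link_name": name, "owner_type": cls},
            )
        cls.properties = table
        return cls
    return decorate


@with_properties("receptor", "neurotransmitter")
class Neuron:
    def __init__(self, name):
        self.name = name
        # Instantiate each class-level Property for this particular owner.
        for link_name, prop_cls in self.properties.items():
            setattr(self, link_name, prop_cls(self))


class MotorNeuron(Neuron):
    pass


# The Property class exists before any Neuron does, and the sub-class
# shares it rather than getting a differently named copy.
print(Neuron.properties["receptor"].__name__)
print(MotorNeuron.properties["receptor"] is Neuron.properties["receptor"])
```

Because the Property subclass is created once per defining class, a sub-class that adds nothing simply inherits the same property under the same name.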

Finally, the first point can be addressed independently of the proposed solution by not having configuration passed through __init__. Initially I conceived of a tree structure for configuration which could be augmented by internal nodes and passed on to children with specific configuration needs that didn't need to be communicated to higher-level users. This structure is exemplified by Data, which is Configureable but is also a Configure object suitable for configuring other objects. Unfortunately, this is the only case where the feature gets any use, and it isn't required for passing augmented configuration to the module as a whole, so it can simply be removed.
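One way this could look, purely as an assumption on my part, is a single module-level configuration that every Configureable object reads, so nothing has to be handed down through __init__. The connect() and disconnect() functions below are stand-ins written for this sketch, not the library's real implementations.

```python
# Hypothetical sketch: one shared configuration instead of per-object passing.
_config = {}


def connect(**settings):
    """Replace the shared configuration with the given settings."""
    _config.clear()
    _config.update(settings)


def disconnect():
    """Drop all settings so nothing lingers after disconnecting."""
    _config.clear()


class Configureable:
    @property
    def conf(self):
        # Every object reads the same configuration; none is passed in.
        return _config


class Muscle(Configureable):
    pass


class Worm(Configureable):
    def __init__(self):
        # The contained object needs no conf argument.
        self.muscle = Muscle()


connect(store="memory")
worm = Worm()
print(worm.muscle.conf["store"])
disconnect()
print("store" in worm.conf)
```

The trade-off is that per-object configuration is no longer possible, which matches the observation above that the tree structure was barely used.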

I hope to address some of these issues soon after outstanding issues are closed and features for the next release are well-defined.

UPDATE: Most of these issues have been addressed in the new-classes branch. Not yet merged into master.