Getting Started with dip.io

In this section we look at the information an application needs to provide in order to be able to read and write its data from and to persistent storage.

dip decouples an object from where it might be stored and what format it might be stored in. This allows new types of storage and new data formats to be defined that can be used to store an object without having to change the implementation of the object itself.

dip includes specific support for the most common case of storing objects that are instances of Model as an XML file in a filesystem.

Concepts

dip considers storage to be either streaming storage or structured storage. Streaming storage stores data as a byte stream. A filesystem is the most common example of streaming storage. Structured storage stores data in a storage specific structure. An SQL database would be an example of structured storage.

An object is stored according to a data format. When an object is written it is encoded according to a format. When it is read it is decoded from a format. An object will normally have a native format which is used when reading and writing the object. It may be possible to export an object to other formats and to import an object from other formats. Each data format has a unique string identifier.

When reading or writing an object from or to streaming storage a data format must be specified. The data format effectively imposes the structure of the data on top of the byte stream. Structured storage has, by definition, an implicit data format.
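The relationship between an object, its formats and a byte stream can be sketched in plain Python. This is only an illustration of the concepts above and does not use dip: JsonFormat, CsvLineFormat, their identifiers and the formats registry are all hypothetical names.

```python
import json


# Hypothetical formats for illustration; none of these names are part of dip.
class JsonFormat:
    """Plays the role of a native format: it can both encode and decode."""

    id = 'example.formats.json'

    def encode(self, obj):
        return json.dumps(obj, sort_keys=True).encode('utf-8')

    def decode(self, data):
        return json.loads(data.decode('utf-8'))


class CsvLineFormat:
    """Plays the role of an export-only format: it can only encode."""

    id = 'example.formats.csv_line'

    def encode(self, obj):
        return '{},{}'.format(obj['name'], obj['age']).encode('utf-8')


# Formats are looked up by their unique string identifier.
formats = {f.id: f for f in (JsonFormat(), CsvLineFormat())}

person = {'name': 'Alice', 'age': 30}

native = formats['example.formats.json']
stream = native.encode(person)      # object -> byte stream
restored = native.decode(stream)    # byte stream -> object

exported = formats['example.formats.csv_line'].encode(person)
```

The byte streams produced here could be handed to any streaming storage because neither format knows where its output will end up.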

An object is stored at a storage location that is unique within a particular piece of storage. For example, the location of an object stored in a filesystem is the absolute path name of the file containing the encoded object. A storage location may be implicit. This means that where an object is stored is determined by the value of the object and is not specified by the user.

Storing an Application Object

As it is the most common case we look at how to store an application object in a filesystem. This means that we need to define a native data format for our object, assuming one doesn’t already exist.

The first step is to provide an implementation of the ICodecsFactory interface, which is itself derived from the IFormat interface:

from dip.io import ICodecsFactory
from dip.io.codecs import XmlDecoder, XmlEncoder
from dip.model import implements, Int, Interface, Model, Str


class IExampleModel(Interface):

    name = Str()

    age = Int()


@implements(ICodecsFactory)
class ExampleModelCodecsFactory(Model):

    id = 'myapplication.formats.example_model'

    name = "Example Model"

    filter = "Example model files (*.exm)"

    def decodes(self, obj):

        return isinstance(obj, IExampleModel)

    def decoder(self):

        return XmlDecoder(factory=self)

    def encodes(self, obj):

        return isinstance(obj, IExampleModel)

    def encoder(self):

        return XmlEncoder(factory=self)

The above code defines an interface, IExampleModel, and a codecs factory that can handle an object that either implements, or can be adapted to, that interface. Of course, in a real application you would place the interface definition and the codecs factory in different files.

Codecs factories are registered with the io module by contributing to the dip.io.codecs extension point. Therefore your plugin should contain code similar to the following:

codecs = ContributionTo('dip.io.codecs')

@codecs.default
def codecs(self):

    from example_model_codecs_factory import ExampleModelCodecsFactory

    return [ExampleModelCodecsFactory()]

We will now look at the detail of the factory.

The factory itself is a Model instance that implements the ICodecsFactory interface as follows:

@implements(ICodecsFactory)
class ExampleModelCodecsFactory(Model):

Each format has a unique string identifier as follows:

id = 'myapplication.formats.example_model'

Each format also has a short descriptive name as follows:

name = "Example Model"

Most applications will support using the filesystem as streaming storage and so the factory allows a file filter to be specified as follows:

filter = "Example model files (*.exm)"

This will be used whenever a QFileDialog is used to obtain a storage location from the user.

Next we implement the decodes() method that determines if an object can be decoded from the format. In this case we simply need to check that the object implements the IExampleModel interface as follows:

def decodes(self, obj):

    return isinstance(obj, IExampleModel)

We then implement the decoder() method that creates an object that will actually do the work of decoding the object we are reading. The decoder that is created must implement the IDecoder interface, or be able to be adapted to that interface.

dip includes the XmlDecoder class which implements the IDecoder interface and is able to decode any Model instance from XML. We create it as follows:

def decoder(self):

    return XmlDecoder(factory=self)

The steps needed to encode an object are similar to those for decoding. In particular we implement the encodes() and encoder() methods as follows:

def encodes(self, obj):

    return isinstance(obj, IExampleModel)

def encoder(self):

    return XmlEncoder(factory=self)

The XmlEncoder class is an encoder provided by dip that implements the IEncoder interface and is able to encode any Model instance to XML.

You may want to sub-class XmlDecoder and XmlEncoder and reimplement the decode_attribute() and encode_attribute() methods to fine tune the behaviour.

Implementing a Decoder and Encoder

There will be cases where the XmlDecoder and XmlEncoder classes are not appropriate:

  • the application object is not a Model instance
  • you don’t wish to use XML
  • you already have an existing format you need to use
  • you already have code that reads and/or writes the application object and you want to re-use it.

Of course it is not necessary to implement both a decoder and an encoder for a format. You may need to support a legacy format which you will only ever read (i.e. decode). Or you may want to export (i.e. encode) an object in a format that you will never need to read (e.g. PDF). If your format factory doesn’t support decoding then your decodes() method should always return False. In the same way, if your format factory doesn’t support encoding then your encodes() method should always return False.
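A factory for a read-only legacy format might look something like the following sketch. It is plain Python with no dip imports, so the classes only mirror the shape of the factory shown earlier rather than being real ICodecsFactory or IDecoder implementations. LegacyRecord, LegacyDecoder, the one-line file format and the format identifier are all hypothetical, and raising an exception from encoder() is our assumption rather than documented behaviour.

```python
class LegacyRecord:
    """Hypothetical application object."""

    def __init__(self, name=''):
        self.name = name


class LegacyDecoder:
    """Hypothetical decoder for a one-line 'name=<value>' legacy format."""

    def decode(self, data):
        key, _, value = data.decode('ascii').strip().partition('=')
        if key != 'name':
            raise ValueError("not a legacy record")
        return LegacyRecord(name=value)


class LegacyCodecsFactory:
    """Stand-in for a codecs factory that supports decoding only."""

    id = 'myapplication.formats.legacy_record'

    name = "Legacy Record"

    def decodes(self, obj):
        return isinstance(obj, LegacyRecord)

    def decoder(self):
        return LegacyDecoder()

    def encodes(self, obj):
        # Encoding is not supported, so always refuse.
        return False

    def encoder(self):
        # Assumption: this is never called because encodes() returns False.
        raise NotImplementedError("this format is read-only")
```

An export-only format would be the mirror image, with decodes() always returning False.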

A decoder is either an implementation of the IDecoder interface or an object that can be adapted to that interface. The interface defines a single decode() method that decodes a byte stream and returns an application object.

An encoder is either an implementation of the IEncoder interface or an object that can be adapted to that interface. The interface defines a single encode() method that encodes an application object and returns a byte stream.

If you want to re-use existing code that implements a decoder and encoder then you can integrate it in one of two ways:

  • If the code is implemented as a class (or perhaps two separate classes) then you can provide an adapter between that class and the appropriate interface.
  • If the code is implemented as simple functions then you can provide a wrapper class that implements the IDecoder and IEncoder interfaces and calls those functions from the decode() and encode() methods as appropriate.
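The wrapper class approach can be sketched as follows. The example is plain Python: load_settings() and dump_settings() stand for the existing functions and SettingsCodec is a hypothetical name. The single-argument decode() and encode() signatures are a simplification based on the descriptions above, so the real interfaces may expect additional arguments, and a real wrapper would also need to declare that it implements IDecoder and IEncoder.

```python
import json


# Hypothetical pre-existing functions that we want to re-use unchanged.
def load_settings(data):
    return json.loads(data.decode('utf-8'))


def dump_settings(settings):
    return json.dumps(settings, sort_keys=True).encode('utf-8')


class SettingsCodec:
    """Wrapper that exposes the functions as decode() and encode()."""

    def decode(self, data):
        # Byte stream in, application object out.
        return load_settings(data)

    def encode(self, obj):
        # Application object in, byte stream out.
        return dump_settings(obj)


codec = SettingsCodec()
stream = codec.encode({'theme': 'dark', 'size': 12})
settings = codec.decode(stream)
```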

Another situation you may find is that you have existing code that will read and write an object from and to a file, rather than to a generic byte stream. There are two ways to handle this as well:

  • Your existing code is, in effect, coupling the data format with the storage. In dip terms this is structured storage and can be handled by defining a new type of storage specifically for it.
  • A more flexible approach is to use the local filesystem as a staging post. When decoding, the byte stream is first saved to a local file, the existing code is called to read that file and return the object, and the local file is then deleted. When encoding, the existing code is called to create a local file containing the object. This file is then read and its contents returned as a byte stream, after which the local file is deleted.

The advantages of the second approach are that it is simpler to implement and means that the decoder and encoder can be used with any other streaming storage that might be available now or in the future without requiring any changes to the code.
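A minimal sketch of the staging post approach, using only the Python standard library, is shown below. legacy_read() and legacy_write() stand for existing code that only works with file names, and StagedCodec is a hypothetical name. As elsewhere, the single-argument decode() and encode() signatures are an assumption.

```python
import os
import tempfile


# Hypothetical existing code that can only work with file names.
def legacy_read(path):
    with open(path, 'r', encoding='utf-8') as f:
        return f.read().splitlines()


def legacy_write(path, lines):
    with open(path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(lines))


class StagedCodec:
    """Adapts the file based code to byte streams via a temporary file."""

    def decode(self, data):
        fd, path = tempfile.mkstemp(suffix='.tmp')
        try:
            # Save the byte stream to the local file.
            with os.fdopen(fd, 'wb') as f:
                f.write(data)
            # Let the existing code read the file and return the object.
            return legacy_read(path)
        finally:
            os.remove(path)

    def encode(self, obj):
        fd, path = tempfile.mkstemp(suffix='.tmp')
        os.close(fd)
        try:
            # Let the existing code create the file.
            legacy_write(path, obj)
            # Read the file back and return it as a byte stream.
            with open(path, 'rb') as f:
                return f.read()
        finally:
            os.remove(path)
```

The temporary file is removed in a finally clause so that it does not outlive the call, even if the existing code raises an exception.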

Defining a New Type of Storage

The need to define a new type of storage arises less often than the need to define a new data format. When the need does arise it is typically as a result of some new technology or service becoming available that can be used by many applications rather than something that is application specific. For example, an organisation may subscribe to a cloud based file service. A new type of storage would then be defined to implement access to it. All existing applications could then use it without making changes to those applications.

The other situation that would require a new type of storage to be defined is when a database is being used as structured storage.

In this section we describe the high level steps taken to define a new type of storage, including the interfaces and classes that need to be implemented.

Storage is defined following the same pattern used to define data formats. A storage factory is defined that contributes to an extension point. The factory creates instances of storage, as required, that are used to actually read and write objects.

The identifier of the extension point is dip.io.storage.

The IStreamingStorageFactory interface must be implemented by a streaming storage factory, and its __call__() method must return an implementation of the IStreamingStorage interface.

The IStructuredStorageFactory interface must be implemented by a structured storage factory, and its __call__() method must return an implementation of the IStructuredStorage interface.

Both IStreamingStorage and IStructuredStorage are derived from the IStorage interface. This defines read() and write() methods to do the reading and writing of an object from and to a specific storage location.
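The factory pattern described above can be sketched in plain Python. MemoryStorage and MemoryStorageFactory are hypothetical names and do not use dip. The read() and write() signatures shown here are assumptions chosen to keep the example short, so the methods defined by IStorage may well take different arguments.

```python
class MemoryStorage:
    """Hypothetical streaming storage that keeps byte streams in a dict."""

    def __init__(self, streams):
        self._streams = streams

    def read(self, location):
        return self._streams[location]

    def write(self, location, data):
        self._streams[location] = bytes(data)


class MemoryStorageFactory:
    """Hypothetical factory: calling it creates a storage instance."""

    def __init__(self):
        # Shared by every storage instance the factory creates.
        self._streams = {}

    def __call__(self):
        return MemoryStorage(self._streams)


factory = MemoryStorageFactory()

storage = factory()
storage.write('notes/today', b'hello')

# A second instance created by the same factory sees the same data.
data = factory().read('notes/today')
```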

IStorage also defines the ui attribute which is an implementation of the IStorageUi interface. This interface defines methods that create the user interfaces that the user will use to select a storage location. For example, the filesystem storage type included with dip provides access to QFileDialog using this mechanism. A storage type that handles a database might implement a QWidget database browser.

The filesystem storage type included with dip is derived from the QIODeviceStorage class. This class, as the name implies, can be used as a base class for any streaming storage that can be accessed using a QIODevice.

Querying Available Storage

So far in this section we haven’t said anything about how an application can choose a particular type of storage to store an object. This is because, in a typical application, it is handled automatically by the dip.tools module.

Each application has an i/o manager which is an implementation of the IIOManager interface. It is a singleton that can be obtained by calling IOManager(). Like any other part of dip you can provide an alternative implementation if required.

An IIOManager provides the readable_storage() method that will return a list of IStorage implementations from which an object can be read. It also provides the writeable_storage() method that will return a list of IStorage implementations to which an object can be written.

Defining a Storage Policy

Sometimes you may have a situation where an object can be read from or written to a particular storage type, but you want to place restrictions on that access and the options presented to the user. For example, a certain type of user may only be able to read from the storage, or access to the storage may be limited to certain times of the day.

The i/o manager will consult a storage policy to determine if an object should actually be allowed to be read from or written to a particular storage type. A policy is an implementation of the IStoragePolicy interface and is held in the policy attribute of the IIOManager. The default policy is None, which allows all access.
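The way a policy filters the available storage can be sketched as follows. This is plain Python and not dip code: SketchIOManager and ReadOnlyPolicy are hypothetical, the read_allowed() and write_allowed() method names are assumptions rather than the documented IStoragePolicy methods, and the real readable_storage() and writeable_storage() methods may take arguments that are omitted here.

```python
class ReadOnlyPolicy:
    """Hypothetical policy that allows reading but never writing."""

    def read_allowed(self, storage):
        return True

    def write_allowed(self, storage):
        return False


class SketchIOManager:
    """Hypothetical stand-in for an i/o manager."""

    def __init__(self, storages, policy=None):
        self.storages = list(storages)
        # A policy of None allows all access.
        self.policy = policy

    def readable_storage(self):
        if self.policy is None:
            return list(self.storages)
        return [s for s in self.storages if self.policy.read_allowed(s)]

    def writeable_storage(self):
        if self.policy is None:
            return list(self.storages)
        return [s for s in self.storages if self.policy.write_allowed(s)]


# Strings stand in for storage instances to keep the sketch short.
unrestricted = SketchIOManager(['filesystem', 'cloud'])
restricted = SketchIOManager(['filesystem', 'cloud'], policy=ReadOnlyPolicy())
```

With the restricted manager the user would still be offered both storage types when reading an object but none when writing one.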