The MOAI code can be easily extended in several ways.
The first way is preferred, because you can then easily maintain your own setup in your version control system, and deploy just by running a buildout. The second and third ways might be easier if you’re just doing some experiments.
In the buildout.cfg file, you will find a section called moai, which looks like this:
[moai]
configuration_profile = example
extension_modules = moai.extensions
The extension_modules line lists the modules that MOAI will examine for extensions. You can add your own modules here, or put some scripts in the default extensions directory.
The configuration_profile line determines which profile will be used. You will probably want to create your own profile and change this line.
After any change to the buildout.cfg file, you need to run the buildout script again:
> bin/buildout
First, you will need to choose a name for your profile; we will use ‘myuni’ in this example. Change the configuration_profile value in buildout.cfg:
configuration_profile = myuni
Let’s also add some more specific configuration that our profile will use:
[myuni]
fedora = http://library.myuni.edu/fedora
datastream = DATA
path = /tmp/myuni
database = /tmp/moai
Don’t forget to run buildout:
> bin/buildout
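To make the relation between the two sections concrete, here is a small sketch of how the profile name in the moai section selects the section whose values the profile later reads as self.config['fedora'] and so on. It uses Python's standard configparser purely for illustration; buildout has its own parser, and the load_profile_config function is a hypothetical helper, not part of MOAI.

```python
import configparser

# Same sections and values as in the buildout.cfg fragments above.
BUILDOUT_SNIPPET = """
[moai]
configuration_profile = myuni
extension_modules = moai.extensions

[myuni]
fedora = http://library.myuni.edu/fedora
datastream = DATA
path = /tmp/myuni
database = /tmp/moai
"""


def load_profile_config(text):
    """Return (profile_name, options) for the profile selected in [moai].

    Hypothetical helper: it only illustrates the lookup, it is not how
    MOAI or buildout actually load their configuration.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    profile_name = parser.get('moai', 'configuration_profile')
    return profile_name, dict(parser.items(profile_name))


profile_name, config = load_profile_config(BUILDOUT_SNIPPET)
print(profile_name)        # myuni
print(config['fedora'])    # http://library.myuni.edu/fedora
```

The point of the sketch is only that the section name and the configuration_profile value have to match; a typo in either one leaves the profile without its settings.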
To create your own configuration profile, create a Python file and put it in a Python module that is listed as an extension module.
In the Python file, add the following code:
from moai import ConfigurationProfile, name
from moai.provider.fedora import FedoraBasedContentProvider
from moai.database.btree import BTreeDatabase
from moai.update import DatabaseUpdater
from moai.server import Server, FeedConfig
from moai.http.cherry import start_server
from moai.extensions.myuni.content import MyUniContent
class MyUniConfiguration(ConfigurationProfile):

    name('myuni')  # configuration profile name used in buildout.cfg

    def get_content_provider(self):
        provider = FedoraBasedContentProvider(
            self.config['fedora'],
            self.config['path'],
            self.config['datastream'])
        provider.set_logger(self.log)
        return provider

    def get_database_updater(self):
        return DatabaseUpdater(self.get_content_provider(),
                               MyUniContent,
                               BTreeDatabase(self.config['database'], 'w'),
                               self.log)

    def get_database(self):
        return BTreeDatabase(self.config['database'], 'r')

    def get_server(self):
        server = Server('http://localhost:8080/oai',
                        self.get_database())
        server.add_config(
            FeedConfig('publications',
                       'MyUni Publication Server',
                       'http://localhost:8080/oai/publications',
                       self.log,
                       sets_allowed=['publication'],
                       metadata_prefixes=['oai_dc', 'mods', 'didl']))
        server.add_config(
            FeedConfig('stuff',
                       'MyUni Stuff that are Publications Server',
                       'http://localhost:8080/oai/stuff',
                       self.log,
                       sets_disallowed=['publication'],
                       metadata_prefixes=['oai_dc', 'mods', 'didl']))
        return server

    def start_server(self):
        start_server('127.0.0.1', 8080, 10, 'oai', self.get_server())
Although this might be a bit more complicated than a config file, it exposes all the hooks where you can plug in different implementations, which makes it a very powerful way to customize things. In the released version of MOAI, we will ship a number of configuration profiles, so you won’t have to write a complete configuration profile if it’s not needed.
A content object takes an object (handed over by a ContentProvider) and transforms it so it can be inserted into the MOAI database. Since every university will probably use a different metadata format, this object will need to be implemented. In the above configuration profile, a class called MyUniContent is imported from a separate Python module.
This Python module (content.py) could look something like this:
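The two feeds in the profile differ only in their sets_allowed and sets_disallowed arguments. The following sketch shows one plausible reading of those arguments. The record_in_feed function and its exact semantics are assumptions made for illustration; they are not taken from the MOAI source, which may filter differently.

```python
def record_in_feed(record_sets, sets_allowed=None, sets_disallowed=None):
    """Decide whether a record with the given sets belongs in a feed.

    Assumed semantics (not taken from the MOAI source):
    - if sets_allowed is given, the record needs at least one of those sets;
    - if sets_disallowed is given, the record must have none of those sets.
    """
    record_sets = set(record_sets)
    if sets_allowed and not record_sets & set(sets_allowed):
        return False
    if sets_disallowed and record_sets & set(sets_disallowed):
        return False
    return True


publication = ['publication', 'physics']
person = ['staff']

# The 'publications' feed only allows records in the 'publication' set.
print(record_in_feed(publication, sets_allowed=['publication']))     # True
print(record_in_feed(person, sets_allowed=['publication']))          # False

# The 'stuff' feed takes everything except publications.
print(record_in_feed(publication, sets_disallowed=['publication']))  # False
print(record_in_feed(person, sets_disallowed=['publication']))       # True
```

Under this reading, every record ends up in exactly one of the two feeds, which is presumably why the example profile configures them as a pair.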
import os
import time
from datetime import datetime
from lxml import etree
from moai.content import XMLContentObject
class MyUniContent(XMLContentObject):

    def update(self, path, provider):
        self.provider = provider
        self.nsmap = {'uni': 'http://library.myuni.edu/ns'}
        doc = etree.parse(path)
        self.root = doc.getroot()
        # Use the file's modification time (UTC) as the record timestamp
        self.when_modified = datetime(*time.gmtime(os.path.getmtime(path))[:6])
        self.deleted = False
        self.sets = []
        # The name of the root element decides what kind of object this is
        self.content_type = self.root.xpath('local-name()')
        self.is_set = self.content_type == 'set'
        self._fields = {}
        if self.content_type == 'person':
            self.id = self.xpath('uni:id/text()', 'id', unicode, required=True)
            self.label = self.xpath('uni:id/text()', 'id', unicode, required=True)
        else:
            self.id = self.xpath('uni:uuid/text()', 'id', unicode, required=True)
            self.label = self.xpath('uni:title/text()', 'id', unicode, required=True)
            self.sets = self.xpath('uni:oai-sets/text()', 'id', unicode, multi=True)
            self.sets.append('publication')
            self._fields = self.add_publication_fields()

    def add_publication_fields(self):
        fields = {
            'language': self.xpath('uni:language/text()', 'id', unicode, multi=True),
            'contributor': self.xpath('uni:creator/text()', 'id', unicode, multi=True),
            'title': [self.label],
            'publisher': self.xpath('uni:publisher/text()', 'id', unicode, multi=True),
            'type': self.xpath('uni:type/text()', 'id', unicode, multi=True),
            'subject': self.xpath('uni:subject/text()', 'id', unicode, multi=True),
        }
        dates = self.xpath('uni:date/text()', 'id', unicode, multi=True)
        if dates:
            # parse_date is not defined in this example; it is assumed to be
            # your own helper that returns a datetime, or None on failure
            dateval = parse_date(dates[0])
            assert dateval is not None, 'Unknown date format: %s' % dates[0]
            fields['date'] = [dateval]
        fields['author'] = fields['contributor']
        fields['url'] = ['http://dx.doi.org/????/%s' % self.id]
        fields['dare_id'] = ['urn:NBN:nl:ui:??-%s' % self.id]
        # The element name is illustrative; use whatever your format provides
        filename = self.xpath('uni:filename/text()', 'id', unicode, multi=True)
        if filename:
            fields['assets'] = [{'filename': filename[0],
                                 'mimetype': 'application/binary'}]
        return fields
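The content object above calls parse_date, which is neither defined nor imported in the snippet. Here is a minimal sketch of what such a helper could look like, assuming it should return a datetime on success and None on failure (which is what the assert in the example implies). The list of accepted formats is an assumption; replace it with the formats your own metadata actually uses.

```python
from datetime import datetime

# Assumed formats, tried in order from most to least specific.
DATE_FORMATS = ('%Y-%m-%dT%H:%M:%SZ', '%Y-%m-%d', '%Y-%m', '%Y')


def parse_date(value):
    """Return a datetime for value, or None if no known format matches."""
    value = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None


print(parse_date('2009-03-17'))   # 2009-03-17 00:00:00
print(parse_date('2009'))         # 2009-01-01 00:00:00
print(parse_date('last week'))    # None
```

Returning None instead of raising keeps the decision about bad dates in the content object, where the record id is available for a useful error message.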
Note that this is just an example of how a content object might look. It will depend entirely on your existing content format and on which properties you want to store in the MOAI database. This in turn determines which metadata prefixes you can use in the OAI server.
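To see how a content format maps onto stored properties, it can help to try the lookups on a sample record outside MOAI first. The sketch below does this with the standard library's xml.etree.ElementTree rather than lxml, so it runs without extra dependencies. The sample document and the texts helper are hypothetical; only the namespace and element names are taken from the example above.

```python
import xml.etree.ElementTree as ET

NSMAP = {'uni': 'http://library.myuni.edu/ns'}

# Hypothetical record; element names follow the XPath expressions
# used in the content object example.
SAMPLE = """<publication xmlns="http://library.myuni.edu/ns">
  <uuid>abc-123</uuid>
  <title>On Harvesting</title>
  <creator>A. Author</creator>
  <creator>B. Writer</creator>
  <oai-sets>physics</oai-sets>
</publication>"""


def texts(root, tag):
    """Return the text of every direct child element named uni:<tag>."""
    return [el.text for el in root.findall('uni:' + tag, NSMAP)]


root = ET.fromstring(SAMPLE)

# ElementTree tags look like '{namespace}name'; keep only the local name.
content_type = root.tag.split('}', 1)[-1]

record = {
    'id': texts(root, 'uuid')[0],
    'content_type': content_type,
    'sets': texts(root, 'oai-sets') + ['publication'],
    'title': texts(root, 'title'),
    'contributor': texts(root, 'creator'),
}

print(record['content_type'])   # publication
print(record['sets'])           # ['physics', 'publication']
print(record['contributor'])    # ['A. Author', 'B. Writer']
```

A field that comes back empty here will also be empty in the MOAI database, so a quick check like this shows which metadata formats your records can realistically fill.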