OAI-PMH facilities for Python, Zope, and Silva

About OAI-PMH

The Open Archives Initiative Protocol for Metadata Harvesting, OAI-PMH, is a well-established and increasingly important standard in the content management and library science worlds. The protocol provides an application-independent interoperability framework for metadata exchange between online parties. Many academic libraries and other organisations expose OAI-PMH compliant repositories on the web, from which metadata can be harvested.

The OAI-PMH standard defines the following parties and software components:

  • A ‘Data Provider’, such as an academic library, runs a Repository that supports OAI-PMH as a means of exposing metadata about resources, for instance academic publications.
  • A ‘Service Provider’ uses Harvester software to harvest metadata from such Repositories. The harvested metadata can then be used to provide value-added services, such as a website that allows browsing and searching through the harvested catalog.
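
To make the harvesting step concrete: an OAI-PMH request is an ordinary HTTP request to the repository's base URL, carrying a verb such as ListRecords plus arguments such as metadataPrefix and set. The following is a minimal sketch using only the Python standard library. It is not the API of the oaipmh module; the helper name build_request_url and the repository URL are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_request_url(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    # Leave out arguments that were not set, so they do not appear in the query.
    params.update({k: v for k, v in arguments.items() if v is not None})
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL, used for illustration only.
BASE_URL = "http://repository.example.org/oai"

# Ask for all Dublin Core records in the (hypothetical) set "physics".
url = build_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc", set="physics")
print(url)
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned resumptionToken until the repository reports no further records.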

OAI-PMH Pack

Infrae has extended Silva so it allows users to browse and search harvested metadata, further enriching the extensive feature-set of this open source CMS. An organization that uses Silva can thus easily become an OAI-PMH Service Provider.

In the process, Infrae also developed a module for accessing OAI-PMH compliant repositories in Python, and developed a sophisticated harvesting and indexing system for using harvested metadata in Zope. These reusable components are designed to be building blocks for other Python or Zope-based applications.

Infrae calls this full stack of OAI-PMH services the OAI-PMH Pack. It is possible to use a combination of Silva and Railroad to become both Service Provider and Data Provider, but since the OAI-PMH Pack builds on an interoperable protocol, the components can also be used individually.

The OAI-PMH Pack comprises:

oaipmh Python module

The oaipmh Python module enables high-level access to an OAI-PMH metadata repository. Arbitrary repositories can be accessed and harvested using an easy-to-use Python API. The module has built-in support for the default Dublin Core metadata set (oai_dc), and can be extended with support for other metadata sets using a simple declarative system based on industry-standard XPath expressions.
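
The idea of a declarative, XPath-based metadata description can be sketched as follows. This is an illustration only: it uses the standard library's xml.etree.ElementTree (which supports a subset of XPath) rather than the oaipmh module, and the names DC_FIELDS, read_metadata and the sample record are assumptions, not part of the module's API. The two namespace URIs are the standard ones for oai_dc and Dublin Core.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Declarative mapping: field name -> XPath expression relative to <oai_dc:dc>.
DC_FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "subject": "dc:subject",
    "date": "dc:date",
}

# Made-up sample record, for illustration only.
SAMPLE = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example publication</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:date>2004-01-15</dc:date>
</oai_dc:dc>
"""


def read_metadata(element, fields, namespaces):
    """Evaluate each declared XPath and collect the text of matching elements."""
    return {
        name: [node.text for node in element.findall(path, namespaces)]
        for name, path in fields.items()
    }


record = read_metadata(ET.fromstring(SAMPLE), DC_FIELDS, NAMESPACES)
print(record["creator"])
```

Supporting another metadata set in this style would only mean supplying a different field-to-XPath mapping and namespace table; the reading code stays the same.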

The oaipmh module can be integrated with any Python application. The only requirement is libxml2 and its Python bindings.

Download oaipmh

OAICore Zope Product

The OAICore Zope product extends Zope with OAI-PMH harvesting and indexing abilities. It builds on the Python oaipmh module. OAICore can be used to harvest arbitrary OAI-PMH compliant repositories. All harvested data is stored in a scalable BTreeFolder in Zope, using the Zope catalog for indexing the data.

OAICore has a declarative system for declaring metadata indexes, and can autogenerate user interfaces for searching and browsing metadata. It can be easily extended with support for arbitrary metadata sets, building on the functionality of the oaipmh Python module.
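
As a rough illustration of what declaring metadata indexes means, the sketch below keeps a small in-memory index over a declared list of fields. It is a stand-in written in plain Python, not OAICore code and not the Zope catalog; the class MetadataIndex, the INDEXED_FIELDS declaration and the record identifiers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical index declaration: which metadata fields should be searchable.
INDEXED_FIELDS = ("creator", "subject")


class MetadataIndex:
    """Tiny in-memory stand-in for a catalog: maps (field, value) to record ids."""

    def __init__(self, indexed_fields):
        self.indexed_fields = tuple(indexed_fields)
        self._postings = defaultdict(set)

    def index_record(self, identifier, metadata):
        # Only declared fields are indexed; all other fields are ignored.
        for field in self.indexed_fields:
            for value in metadata.get(field, []):
                self._postings[(field, value.lower())].add(identifier)

    def search(self, field, value):
        # .get avoids creating empty entries in the defaultdict on a miss.
        return sorted(self._postings.get((field, value.lower()), set()))


index = MetadataIndex(INDEXED_FIELDS)
index.index_record("oai:example.org:1", {"creator": ["Doe, Jane"], "subject": ["Physics"]})
index.index_record("oai:example.org:2", {"creator": ["Doe, Jane"], "title": ["Untitled"]})
print(index.search("creator", "doe, jane"))
```

Because the list of indexed fields is data rather than code, a search form could in principle be generated from the same declaration, which is the kind of autogeneration the paragraph above describes.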

This flexible component is intended to be integrated with Zope-based systems such as content management systems. One such integration we’ve provided is with the Silva CMS.

Download OAICore

SilvaOAI Silva extension

SilvaOAI builds on OAICore to make OAI-PMH-harvested metadata browseable and searchable in Silva. Once installed, it adds a new content object to Silva called an “OAI Query”. Silva authors can specify in the Silva UI which collection of harvested metadata they want to expose on the query’s public web page, for instance all entries in an OAI-PMH defined set. End-users can then browse this metadata in the Silva site’s public layout, request more detailed information, and search the collection further using a search form.

Download SilvaOAI

More information about OAI-PMH:
http://www.openarchives.org/
http://www.oaforum.org/tutorial/
OAI-PMH Pack SVN repository:
https://infrae.com/svn/OAICore/
More information about Silva:
https://infrae.com/products/silva