Cheshire3 Object Model

Introduction

This page gives a schematic layout of the object structure of a typical Cheshire3 database. Each of the brief descriptions below links to further information about the class of object, including the default implementations available and their API.

Overview

[Figure: schematic of the Cheshire3 object model (image unavailable)]

Base Class

The base class from which all configurable objects in the framework derive. As such, they all share this common API.

Server

A protocol-neutral collection of databases, users and their dependent objects. It acts as an initial entry point for all requests and handles such things as user authentication, multi-database searching, and global object configuration.

ProtocolHandler

Handles requests from a protocol and transforms them into their internal representation before handing them off to a server to process. It takes back the results of the processing, transforms them into the protocol's response and returns it.
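The request/response round trip described above can be sketched in a few lines. This is an illustrative toy, not the Cheshire3 API: ToyServer, ToyProtocolHandler, InternalQuery and the "db=...&q=..." request format are all assumptions made up for the example.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs


@dataclass
class InternalQuery:
    """Hypothetical protocol-neutral form of a search request."""
    database: str
    terms: list


class ToyServer:
    """Stand-in for a Server: maps database names to lists of record titles."""

    def __init__(self, databases):
        self.databases = databases

    def process(self, query):
        records = self.databases.get(query.database, [])
        # A record matches when it contains every query term.
        return [r for r in records if all(t in r.lower() for t in query.terms)]


class ToyProtocolHandler:
    """Handles a made-up 'db=<name>&q=<words>' protocol."""

    def __init__(self, server):
        self.server = server

    def handle(self, raw_request):
        # 1. Protocol request -> internal representation.
        params = parse_qs(raw_request)
        query = InternalQuery(params["db"][0], params["q"][0].lower().split())
        # 2. Hand off to the server for processing.
        hits = self.server.process(query)
        # 3. Internal result -> protocol response.
        return "hits=%d\n%s" % (len(hits), "\n".join(hits))


server = ToyServer({"db_test": ["Alice in Wonderland", "Through the Looking-Glass"]})
handler = ToyProtocolHandler(server)
response = handler.handle("db=db_test&q=alice+wonderland")
```

The point of the split is that the server never sees protocol syntax: swapping in a different handler would not require any change to ToyServer.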

Database

A virtual collection of records which may be interacted with. A database includes indexes, which contain data extracted from the records, as well as configuration details. The database is responsible for handling the queries which come to it and distributing them among its component objects.

Index

A collection of terms extracted from records, along with their frequency. Extraction is done via a chain of objects before being stored in an IndexStore.
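As a rough mental model, an index can be pictured as a mapping from each term to the records it occurs in, together with a frequency. The sketch below assumes whitespace tokenization and lowercasing as the whole extraction chain, and build_index is a hypothetical helper, not a Cheshire3 function.

```python
from collections import Counter, defaultdict


def build_index(records):
    """Map each term to {record id: frequency of the term in that record}."""
    index = defaultdict(dict)
    for rec_id, text in records.items():
        # Extraction chain reduced to: lowercase, then split on whitespace.
        for term, freq in Counter(text.lower().split()).items():
            index[term][rec_id] = freq
    return dict(index)


index = build_index({1: "the cat sat", 2: "the cat saw the dog"})
```

In the framework itself the resulting structure would be handed to an IndexStore rather than kept in a dictionary.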

DocumentFactory

An object that will act as an interface to a collection of one or more documents outside of the Cheshire framework. This might load from a file, or directory, but equally could be an interface to another database or a remote system.

Document

Unparsed data of any format. Documents are turned into Records by Parsers, or into other Documents by PreParsers. Transformers turn Records back into Documents.

Record

Parsed, machine-readable data (most commonly XML) plus any associated metadata. Records are usually modelled internally using XML, with both DOM and SAX2 interfaces.

ResultSet

A collection of records, typically created as the result of a search on a database.

PreParser

An interface to transform a Document into a different Document. For example, a PreParser might turn USMARC into unparsed MARCXML.

Transformer

An interface to turn a Record into a Document. This might involve using XSLT to transform the record into a different schema.

Parser

An interface to transform a Document containing raw XML into a Record.
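The relationship between PreParser, Parser and Transformer can be sketched with the standard library. The functions preparse, parse and transform and the "key: value" input format are assumptions for illustration; strings stand in for Documents and an ElementTree element stands in for a Record.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def preparse(document):
    """PreParser stand-in (Document -> Document): 'key: value' lines to raw XML."""
    fields = (line.split(":", 1) for line in document.splitlines() if ":" in line)
    body = "".join(
        "<%s>%s</%s>" % (key.strip(), escape(value.strip()), key.strip())
        for key, value in fields
    )
    return "<record>%s</record>" % body


def parse(document):
    """Parser stand-in (Document -> Record): raw XML text to a parsed tree."""
    return ET.fromstring(document)


def transform(record):
    """Transformer stand-in (Record -> Document): parsed tree to a text summary."""
    return "; ".join("%s=%s" % (child.tag, child.text) for child in record)


raw = "title: Object Model\nformat: text"
xml_document = preparse(raw)
record = parse(xml_document)
summary = transform(record)
```

Each stage has a single input and output type, which is what lets the stages be combined in different orders for different data sources.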

Selector

A simple wrapper around an XPath or other means of selecting data from a parsed structure in a Record.

Extractor

An interface to extract data of a certain format from the results of evaluating an XPath expression on a Record.
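A minimal sketch of the Selector and Extractor roles, using the limited XPath support in Python's xml.etree.ElementTree. The function names select and extract_text are hypothetical, and the sample record is invented for the example.

```python
import xml.etree.ElementTree as ET

record = ET.fromstring(
    "<record>"
    "<title>Object <i>Model</i></title>"
    "<subject>indexing</subject>"
    "<subject>retrieval</subject>"
    "</record>"
)


def select(record, path):
    """Selector stand-in: evaluate a path expression against the parsed record."""
    return record.findall(path)


def extract_text(nodes):
    """Extractor stand-in: flatten each selected node to whitespace-normalized text."""
    return [" ".join("".join(node.itertext()).split()) for node in nodes]


subjects = extract_text(select(record, "./subject"))
title = extract_text(select(record, "./title"))
```

Keeping selection separate from extraction means the same path can feed different extractors, for example one producing text and another producing dates.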

Tokenizer

Takes a string of natural-language text and processes it to produce an ordered list of tokens. For example, a Tokenizer might turn a string into keywords.

Normalizer

An interface to turn raw data from one form into another. Each normalizer performs one atomic operation and may be strung together in chains with other objects to create the required end result. For example, a normalizer might turn a string into its lowercase form, or turn a string of numerals into a number.
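The idea of chaining atomic normalizers can be sketched with plain functions. The names below (chain, lowercase, strip_whitespace, to_int) are illustrative assumptions, not Cheshire3 classes.

```python
def strip_whitespace(value):
    return value.strip()


def lowercase(value):
    return value.lower()


def to_int(value):
    return int(value)


def chain(*steps):
    """Compose single-purpose normalizers into one callable, applied left to right."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


normalize_term = chain(strip_whitespace, lowercase)
normalize_number = chain(strip_whitespace, to_int)

term = normalize_term("  Cheshire ")
number = normalize_number(" 042 ")
```

Because each step does exactly one thing, new behaviour comes from reordering or adding steps rather than from writing a larger normalizer.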

TokenMerger

Takes an ordered list of tokens (i.e. as produced by a Tokenizer) and merges them into a hash. This might involve merging multiple tokens per key, while maintaining frequency, proximity information etc.
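A small sketch of the Tokenizer to TokenMerger hand-off, assuming a regular-expression word tokenizer and a merged structure that keeps an occurrence count and token positions per key. The functions tokenize and merge and the dictionary layout are assumptions for illustration only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Tokenizer stand-in: an ordered list of word tokens."""
    return re.findall(r"\w+", text)


def merge(tokens):
    """TokenMerger stand-in: one entry per distinct token, keeping count and positions."""
    merged = defaultdict(lambda: {"occurrences": 0, "positions": []})
    for position, token in enumerate(tokens):
        entry = merged[token]
        entry["occurrences"] += 1
        entry["positions"].append(position)
    return dict(merged)


merged = merge(tokenize("to be or not to be"))
```

Retaining positions alongside the count is what makes proximity and phrase matching possible later.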

RecordStore

An interface to a persistent storage mechanism for records and their associated metadata.
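To make "persistent storage for records and their associated metadata" concrete, here is a toy store backed by a single JSON file. JsonRecordStore and its store/fetch methods are hypothetical and chosen for brevity; they are not the Cheshire3 RecordStore API, and a real store would use a proper database backend.

```python
import json
import os
import tempfile


class JsonRecordStore:
    """Toy store: one JSON file holding {id: {"data": ..., "metadata": ...}}."""

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            self._write({})

    def _read(self):
        with open(self.path, encoding="utf-8") as handle:
            return json.load(handle)

    def _write(self, content):
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(content, handle)

    def store(self, rec_id, data, **metadata):
        content = self._read()
        content[str(rec_id)] = {"data": data, "metadata": metadata}
        self._write(content)

    def fetch(self, rec_id):
        return self._read()[str(rec_id)]


path = os.path.join(tempfile.mkdtemp(), "records.json")
JsonRecordStore(path).store(1, "<record/>", size=9)
# A second instance on the same path sees the stored record.
reopened = JsonRecordStore(path).fetch(1)
```

The other store types described on this page follow the same pattern, differing in what kind of object they persist.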

IndexStore

An interface to a persistent storage mechanism for indexes and their extracted terms.

DocumentStore

An interface to a persistent storage mechanism for Documents and their associated metadata.

ResultSetStore

An interface to a persistent storage mechanism for saving ResultSets.

ObjectStore

An interface to a persistent storage mechanism for configured Cheshire3 objects.

Workflow

Objects that can interpret XML configuration and execute code based on it to perform tasks.
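The notion of interpreting XML configuration and executing code based on it can be sketched as follows. The <workflow>/<step ref="..."/> vocabulary and the OPERATIONS table are invented for this example and should not be read as Cheshire3's actual configuration schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry mapping configuration references to callables.
OPERATIONS = {"strip": str.strip, "lower": str.lower, "split": str.split}


def build_workflow(config_xml):
    """Read the configured steps in order and return a callable that runs them."""
    root = ET.fromstring(config_xml)
    steps = [OPERATIONS[element.get("ref")] for element in root.iter("step")]

    def run(value):
        for step in steps:
            value = step(value)
        return value

    return run


config = '<workflow><step ref="strip"/><step ref="lower"/><step ref="split"/></workflow>'
workflow = build_workflow(config)
result = workflow("  Hello Cheshire World ")
```

Changing the task then means editing the configuration rather than the code that runs it.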

Other Objects

Other modules in the Cheshire3 framework: