fr.gouv.culture.sdx.oai
Class LuceneDocumentBaseOAIRepository

java.lang.Object
  extended by org.apache.avalon.framework.logger.AbstractLogEnabled
      extended by org.apache.cocoon.xml.AbstractXMLProducer
          extended by org.apache.cocoon.xml.AbstractXMLPipe
              extended by fr.gouv.culture.oai.OAIObjectImpl
                  extended by fr.gouv.culture.oai.AbstractOAIRepository
                      extended by fr.gouv.culture.sdx.oai.AbstractDocumentBaseOAIRepository
                          extended by fr.gouv.culture.sdx.oai.LuceneDocumentBaseOAIRepository
All Implemented Interfaces:
OAIObject, OAIRepository, DocumentBaseOAIRepository, org.apache.avalon.excalibur.pool.Poolable, org.apache.avalon.excalibur.pool.Recyclable, org.apache.avalon.framework.configuration.Configurable, org.apache.avalon.framework.context.Contextualizable, org.apache.avalon.framework.logger.LogEnabled, org.apache.avalon.framework.service.Serviceable, org.apache.cocoon.xml.XMLPipe, org.apache.cocoon.xml.XMLProducer, org.apache.excalibur.xml.sax.XMLConsumer, org.apache.excalibur.xml.sax.XMLizable, org.xml.sax.ContentHandler, org.xml.sax.ext.LexicalHandler

public class LuceneDocumentBaseOAIRepository
extends AbstractDocumentBaseOAIRepository


Nested Class Summary
 
Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.oai.DocumentBaseOAIRepository
DocumentBaseOAIRepository.ConfigurationNode
 
Nested classes/interfaces inherited from interface fr.gouv.culture.oai.OAIObject
OAIObject.Node
 
Field Summary
protected static java.lang.String ATTRIBUTE_NAME_EXCLUDE_QUERY
           
protected static java.lang.String ATTRIBUTE_NAME_INCLUDE_QUERY
           
protected  java.lang.String ATTRIBUTE_NAME_SDXFIELD
           
protected  java.lang.String ELEMENT_NAME_EXCLUDE
           
protected  java.lang.String ELEMENT_NAME_INCLUDE
           
protected  java.lang.String ELEMENT_NAME_OAI_FORMAT
           
protected  java.lang.String ELEMENT_NAME_OAI_SUBSET
           
protected  SimpleQuery excludeQuery
          query string for document selection
protected  SimpleQuery includeQuery
          query string for document selection
protected  LuceneIndex luceneSearchIndex
          Search index of the underlying LuceneDocumentBase
protected  java.lang.String newResumptionToken
           
static java.lang.String PARAMETER_NAME_EXCLUDE_QUERY
           
static java.lang.String PARAMETER_NAME_INCLUDE_QUERY
           
protected  java.lang.String resumptionToken
           
protected  java.util.Hashtable setMappings
          Set mappings from configuration file
 
Fields inherited from class fr.gouv.culture.sdx.oai.AbstractDocumentBaseOAIRepository
_database, context, documentBase, documentBaseId, externalIdPrefix, id, isDefault, manager, numRecordsPerResponse, PARAMETER_NAME_SDX_FIELD, PARAMETER_NAME_SET_NAME, PARAMETER_NAME_SET_SPEC, resumptionTokenIdGen
 
Fields inherited from class fr.gouv.culture.oai.AbstractOAIRepository
adminEmails, baseURL, compression, deletedRecord, description, earliestDatestamp, granularity, metadataFormats, protocolVersion, repositoryName
 
Fields inherited from class fr.gouv.culture.oai.OAIObjectImpl
_context, logger
 
Fields inherited from class org.apache.cocoon.xml.AbstractXMLProducer
contentHandler, EMPTY_CONTENT_HANDLER, lexicalHandler, xmlConsumer
 
Fields inherited from interface fr.gouv.culture.oai.OAIObject
HTTP_HEADER_NAME_FROM, HTTP_HEADER_NAME_USER_AGENT, NUMBER_RECORDS_PER_RESPONSE, STRING_DATEFORMAT_GRANULARITY_DAY, STRING_DATEFORMAT_GRANULARITY_SECOND
 
Constructor Summary
LuceneDocumentBaseOAIRepository(LuceneDocumentBase base)
           
 
Method Summary
 void addDeletedRecord(java.lang.String id)
          Adds a deleted-record Lucene document to the search index for the given id
protected  org.apache.lucene.search.BooleanQuery addSetQuery(java.lang.String setSpec)
          Returns a boolean query for a specific set
 void configure(org.apache.avalon.framework.configuration.Configuration configuration)
          Configure a LuceneDocumentBaseOAIRepository
protected  void configureFormats(org.apache.avalon.framework.configuration.Configuration configuration)
          Configure the supported formats of an OAIRepository
protected  void configureSets(org.apache.avalon.framework.configuration.Configuration configuration)
          Configure the sets of an OAIRepository
protected  void configureSubset(org.apache.avalon.framework.configuration.Configuration configuration)
           
protected  org.apache.lucene.search.Hits executeIdQuery(OAIRequest request)
          Executes a query from an OAI request for the identifier param
protected  org.apache.lucene.search.Hits executeQueryForRequestParams(OAIRequest request)
          Executes a query for a request and sorts the results in ascending date order
protected  org.apache.lucene.search.BooleanQuery getBaseQuery()
          Builds a basic boolean query adding the global inclusion and exclusion queries.
protected  org.apache.lucene.search.BooleanQuery getBaseQueryForResults()
           
protected  org.apache.avalon.framework.parameters.Parameters getDocumentSetParameters(org.apache.lucene.document.Document doc)
          Returns a parameters object containing set information for a lucene document
 java.lang.String getEarliestDatestamp()
          Returns the earliest _datestamp for this repository
protected  org.apache.lucene.search.TermQuery getIdQuery(java.lang.String id)
           
protected  org.apache.lucene.search.Query getLuceneSimpleQuery(java.lang.String query)
          Builds an SDX simple query for the provided query string and returns the underlying Lucene query object
 void getRecord(OAIRequest request)
          Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface
 java.lang.String getResumptionToken()
           
protected  SearchLocations getSearchLocation()
          Builds a search locations object using the underlying LuceneIndex
 void listIdentifiers(OAIRequest request)
          Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface
protected  void listIdentifiersOrRecords(OAIRequest request)
          Searches documents in the LuceneDocumentBase and constructs OAI records.
 void listMetadataFormats(OAIRequest request)
          Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface
 void listRecords(OAIRequest request)
          Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface
 void listSets(OAIRequest request)
          Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface
 void purgeDeletedRecords()
          Destroys all deleted record entries
 void removeDeletedRecord(java.lang.String id)
          Removes the deleted-record Lucene document for the given id from the search index
protected  void resetInternalObjects(java.lang.String rt, java.lang.String newrt)
           
protected  void sendHeaderEvents(OAIRequest request, org.apache.lucene.document.Document doc)
          Sends the OAI-PMH header events for the provided lucene document
protected  void sendIdentifier(OAIRequest request, org.apache.lucene.document.Document doc)
           
protected  void sendRecord(OAIRequest request, org.apache.lucene.document.Document doc, java.lang.String mdPrefix)
           
protected  java.util.ArrayList verifyMetadataFormatForRecord(org.apache.lucene.document.Document document)
          Verifies that a document/record is available in the oai_dc format
protected  java.util.ArrayList verifyMetadataFormatForRecord(org.apache.lucene.document.Document document, java.lang.String mdPrefix)
          Verifies that a document/record is available in the specified format
 
Methods inherited from class fr.gouv.culture.sdx.oai.AbstractDocumentBaseOAIRepository
buildExternalOaiId, buildExternalOaiId, buildUrlLocator, configureAdminEmails, configureBaseURL, configureDatabase, configureDefault, configureDescription, configureExternalIdPrefix, configureId, configureResumptionTokenIDGenerator, createResumptionToken, deriveInternalSdxId, deriveInternalSdxId, getId, getRepositoryId, getResumptionTokenCursor, getResumptionTokenProperty, isDefault, service, setDefault, unsetDefault
 
Methods inherited from class fr.gouv.culture.oai.AbstractOAIRepository
endVerbEvent, getAdminEmails, getBaseURL, getCompression, getDeletedRecord, getDescription, getGranularity, getGranularity, getProtocolVersion, getRepositoryName, identify, sendNoSetHierarchyError, sendResumptionToken, sendResumptionToken, sendResumptionToken, sendResumptionTokensNotSupportedError, startVerbEvent, toSAX, verifyGranularity, verifyParameters
 
Methods inherited from class fr.gouv.culture.oai.OAIObjectImpl
contextualize, enableLogging, endElement, getContext, sendElement, sendElementContent, startElement
 
Methods inherited from class org.apache.cocoon.xml.AbstractXMLPipe
characters, comment, endCDATA, endDocument, endDTD, endElement, endEntity, endPrefixMapping, ignorableWhitespace, processingInstruction, setDocumentLocator, skippedEntity, startCDATA, startDocument, startDTD, startEntity, startPrefixMapping
 
Methods inherited from class org.apache.cocoon.xml.AbstractXMLProducer
recycle, setConsumer, setContentHandler, setLexicalHandler
 
Methods inherited from class org.apache.avalon.framework.logger.AbstractLogEnabled
getLogger, setupLogger, setupLogger, setupLogger
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface fr.gouv.culture.oai.OAIRepository
getAdminEmails, getBaseURL, getCompression, getDeletedRecord, getDescription, getGranularity, getProtocolVersion, getRepositoryName, identify, verifyParameters
 
Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled
enableLogging
 
Methods inherited from interface org.apache.avalon.framework.context.Contextualizable
contextualize
 
Methods inherited from interface org.apache.excalibur.xml.sax.XMLizable
toSAX
 
Methods inherited from interface org.xml.sax.ContentHandler
characters, endDocument, endElement, endPrefixMapping, ignorableWhitespace, processingInstruction, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping
 
Methods inherited from interface org.xml.sax.ext.LexicalHandler
comment, endCDATA, endDTD, endEntity, startCDATA, startDTD, startEntity
 
Methods inherited from interface org.apache.cocoon.xml.XMLProducer
setConsumer
 

Field Detail

luceneSearchIndex

protected LuceneIndex luceneSearchIndex
Search index of the underlying LuceneDocumentBase


includeQuery

protected SimpleQuery includeQuery
query string for document selection


excludeQuery

protected SimpleQuery excludeQuery
query string for document selection


setMappings

protected java.util.Hashtable setMappings
Set mappings from configuration file


ATTRIBUTE_NAME_SDXFIELD

protected final java.lang.String ATTRIBUTE_NAME_SDXFIELD
See Also:
Constant Field Values

ELEMENT_NAME_OAI_FORMAT

protected final java.lang.String ELEMENT_NAME_OAI_FORMAT
See Also:
Constant Field Values

resumptionToken

protected java.lang.String resumptionToken

newResumptionToken

protected java.lang.String newResumptionToken

ELEMENT_NAME_OAI_SUBSET

protected final java.lang.String ELEMENT_NAME_OAI_SUBSET
See Also:
Constant Field Values

ELEMENT_NAME_INCLUDE

protected final java.lang.String ELEMENT_NAME_INCLUDE
See Also:
Constant Field Values

ELEMENT_NAME_EXCLUDE

protected final java.lang.String ELEMENT_NAME_EXCLUDE
See Also:
Constant Field Values

ATTRIBUTE_NAME_INCLUDE_QUERY

protected static final java.lang.String ATTRIBUTE_NAME_INCLUDE_QUERY
See Also:
Constant Field Values

ATTRIBUTE_NAME_EXCLUDE_QUERY

protected static final java.lang.String ATTRIBUTE_NAME_EXCLUDE_QUERY
See Also:
Constant Field Values

PARAMETER_NAME_INCLUDE_QUERY

public static final java.lang.String PARAMETER_NAME_INCLUDE_QUERY
See Also:
Constant Field Values

PARAMETER_NAME_EXCLUDE_QUERY

public static final java.lang.String PARAMETER_NAME_EXCLUDE_QUERY
See Also:
Constant Field Values

Constructor Detail

LuceneDocumentBaseOAIRepository

public LuceneDocumentBaseOAIRepository(LuceneDocumentBase base)

Method Detail

configure

public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
               throws org.apache.avalon.framework.configuration.ConfigurationException
Configure a LuceneDocumentBaseOAIRepository

Specified by:
configure in interface org.apache.avalon.framework.configuration.Configurable
Overrides:
configure in class AbstractDocumentBaseOAIRepository
Parameters:
configuration - The configuration
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureSets

protected void configureSets(org.apache.avalon.framework.configuration.Configuration configuration)
                      throws org.apache.avalon.framework.configuration.ConfigurationException
Configure the sets of an OAIRepository

Parameters:
configuration - The configuration
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureFormats

protected void configureFormats(org.apache.avalon.framework.configuration.Configuration configuration)
                         throws org.apache.avalon.framework.configuration.ConfigurationException
Configure the supported formats of an OAIRepository

Parameters:
configuration - The configuration
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureSubset

protected void configureSubset(org.apache.avalon.framework.configuration.Configuration configuration)
                        throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

listSets

public void listSets(OAIRequest request)
              throws org.xml.sax.SAXException
Description copied from interface: OAIRepository
Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface

Throws:
org.xml.sax.SAXException

listIdentifiers

public void listIdentifiers(OAIRequest request)
                     throws org.xml.sax.SAXException
Description copied from interface: OAIRepository
Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface

Throws:
org.xml.sax.SAXException

getRecord

public void getRecord(OAIRequest request)
               throws org.xml.sax.SAXException
Description copied from interface: OAIRepository
Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface

Throws:
org.xml.sax.SAXException

listRecords

public void listRecords(OAIRequest request)
                 throws org.xml.sax.SAXException
Description copied from interface: OAIRepository
Sends XML data via SAX events to the consumer; see Cocoon's XMLProducer interface

Throws:
org.xml.sax.SAXException

listIdentifiersOrRecords

protected void listIdentifiersOrRecords(OAIRequest request)
                                 throws org.xml.sax.SAXException
Searches documents in the LuceneDocumentBase and constructs OAI records. The same search is used for the verbs ListIdentifiers and ListRecords.

Throws:
org.xml.sax.SAXException
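
The shared-search idea in the description above can be pictured with a minimal sketch. It is not the SDX implementation: the class SharedListingSketch, its Verb enum, and the stubbed search() method are hypothetical names, and plain strings stand in for the SAX events the real class emits. The only protocol fact relied on is that ListIdentifiers returns headers while ListRecords returns headers plus metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: all names here are hypothetical, not part of SDX.
class SharedListingSketch {
    enum Verb { LIST_IDENTIFIERS, LIST_RECORDS }

    // Stand-in for the index search; the real class queries a Lucene index.
    static List<String> search() {
        return Arrays.asList("doc1", "doc2");
    }

    // One search serves both verbs; only the per-hit output differs.
    static List<String> listIdentifiersOrRecords(Verb verb) {
        List<String> out = new ArrayList<>();
        for (String id : search()) {
            out.add("header(" + id + ")");          // both verbs send a header
            if (verb == Verb.LIST_RECORDS) {
                out.add("metadata(" + id + ")");    // only ListRecords adds metadata
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(listIdentifiersOrRecords(Verb.LIST_IDENTIFIERS));
        System.out.println(listIdentifiersOrRecords(Verb.LIST_RECORDS));
    }
}
```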

resetInternalObjects

protected void resetInternalObjects(java.lang.String rt,
                                    java.lang.String newrt)

listMetadataFormats

public void listMetadataFormats(OAIRequest request)
                         throws org.xml.sax.SAXException
Description copied from interface: OAIRepository
Send's xml data via SAX events to the Consumer, etc. see Cocoon's XMLProducer interface

Throws:
org.xml.sax.SAXException

verifyMetadataFormatForRecord

protected java.util.ArrayList verifyMetadataFormatForRecord(org.apache.lucene.document.Document document)
Verifies that a document/record is available in the oai_dc format

Parameters:
document - The document to verify
Returns:
an ArrayList of the available formats

verifyMetadataFormatForRecord

protected java.util.ArrayList verifyMetadataFormatForRecord(org.apache.lucene.document.Document document,
                                                            java.lang.String mdPrefix)
Verifies that a document/record is available in the specified format

Parameters:
document - The document to verify
mdPrefix - The format to verify
Returns:
an ArrayList of the available formats

getResumptionToken

public java.lang.String getResumptionToken()

getSearchLocation

protected SearchLocations getSearchLocation()
Builds a search locations object using the underlying LuceneIndex


getBaseQuery

protected org.apache.lucene.search.BooleanQuery getBaseQuery()
Builds a basic boolean query adding the global inclusion and exclusion queries.
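
As a rough illustration of combining a global inclusion query with a global exclusion query, the sketch below treats the inclusion as a required clause and the exclusion as a prohibited clause. It uses java.util.function.Predicate in place of Lucene query objects, and the names BaseQuerySketch, Doc, baseQuery, and matchingIds are hypothetical; how the real method behaves when either query is absent is an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustration only: Predicate stands in for a Lucene query, and Doc is a
// hypothetical document type, not an SDX or Lucene class.
class BaseQuerySketch {
    static final class Doc {
        final String id;
        final String type;
        Doc(String id, String type) { this.id = id; this.type = type; }
    }

    // include acts as a required clause, exclude as a prohibited clause.
    // A null argument is assumed to mean "clause not configured".
    static Predicate<Doc> baseQuery(Predicate<Doc> include, Predicate<Doc> exclude) {
        Predicate<Doc> query = doc -> true;
        if (include != null) query = query.and(include);
        if (exclude != null) query = query.and(exclude.negate());
        return query;
    }

    // Returns the ids of the documents the combined query accepts.
    static List<String> matchingIds(List<Doc> docs, Predicate<Doc> query) {
        List<String> ids = new ArrayList<>();
        for (Doc d : docs) {
            if (query.test(d)) ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("a", "book"), new Doc("b", "draft"), new Doc("c", "book"));
        Predicate<Doc> include = d -> d.type.equals("book");
        Predicate<Doc> exclude = d -> d.id.equals("c");
        System.out.println(matchingIds(docs, baseQuery(include, exclude)));
    }
}
```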


getBaseQueryForResults

protected org.apache.lucene.search.BooleanQuery getBaseQueryForResults()

getIdQuery

protected org.apache.lucene.search.TermQuery getIdQuery(java.lang.String id)

executeIdQuery

protected org.apache.lucene.search.Hits executeIdQuery(OAIRequest request)
                                                throws org.xml.sax.SAXException
Executes a query from an OAI request for the identifier param

Throws:
org.xml.sax.SAXException

sendRecord

protected void sendRecord(OAIRequest request,
                          org.apache.lucene.document.Document doc,
                          java.lang.String mdPrefix)
                   throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

sendIdentifier

protected void sendIdentifier(OAIRequest request,
                              org.apache.lucene.document.Document doc)
                       throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

sendHeaderEvents

protected void sendHeaderEvents(OAIRequest request,
                                org.apache.lucene.document.Document doc)
                         throws org.xml.sax.SAXException
Sends the OAI-PMH header events for the provided lucene document

Parameters:
request - The request
doc - The lucene document
Throws:
org.xml.sax.SAXException
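
For readers unfamiliar with the shape of an OAI-PMH header, the following sketch emits one through a SAX ContentHandler. It is not the SDX code: HeaderEventsSketch, sendHeader, sendText, and Recorder are hypothetical names, and the header values are passed in directly rather than read from a Lucene document. The element names (header, identifier, datestamp, setSpec) and the status="deleted" attribute follow the OAI-PMH 2.0 specification.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: all class and method names here are hypothetical.
class HeaderEventsSketch {
    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    // Emits <header>, marking it deleted when requested.
    static void sendHeader(ContentHandler out, String identifier, String datestamp,
                           String[] setSpecs, boolean deleted) throws SAXException {
        AttributesImpl atts = new AttributesImpl();
        if (deleted) atts.addAttribute("", "status", "status", "CDATA", "deleted");
        out.startElement(OAI_NS, "header", "header", atts);
        sendText(out, "identifier", identifier);
        sendText(out, "datestamp", datestamp);
        for (String setSpec : setSpecs) sendText(out, "setSpec", setSpec);
        out.endElement(OAI_NS, "header", "header");
    }

    // Emits a simple text-only child element.
    static void sendText(ContentHandler out, String name, String text) throws SAXException {
        out.startElement(OAI_NS, name, name, new AttributesImpl());
        out.characters(text.toCharArray(), 0, text.length());
        out.endElement(OAI_NS, name, name);
    }

    // Records events as text so the result can be inspected.
    // It does no escaping, so it is not a real serializer.
    static final class Recorder extends DefaultHandler {
        final StringBuilder sb = new StringBuilder();
        @Override public void startElement(String uri, String local, String qName, Attributes a) {
            sb.append('<').append(qName);
            for (int i = 0; i < a.getLength(); i++) {
                sb.append(' ').append(a.getQName(i)).append("=\"").append(a.getValue(i)).append('"');
            }
            sb.append('>');
        }
        @Override public void endElement(String uri, String local, String qName) {
            sb.append("</").append(qName).append('>');
        }
        @Override public void characters(char[] ch, int start, int length) {
            sb.append(ch, start, length);
        }
    }

    public static void main(String[] args) throws SAXException {
        Recorder rec = new Recorder();
        sendHeader(rec, "oai:example:1", "2010-01-15", new String[] {"books"}, true);
        System.out.println(rec.sb);
    }
}
```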

addDeletedRecord

public void addDeletedRecord(java.lang.String id)
                      throws SDXException
Adds a deleted-record Lucene document to the search index for the given id

Parameters:
id - The id of the deleted record
Throws:
SDXException

removeDeletedRecord

public void removeDeletedRecord(java.lang.String id)
                         throws SDXException
Removes the deleted-record Lucene document for the given id from the search index

Parameters:
id - The id of the deleted record
Throws:
SDXException

addSetQuery

protected org.apache.lucene.search.BooleanQuery addSetQuery(java.lang.String setSpec)
Returns a boolean query for a specific set

Parameters:
setSpec - The setSpec to use

getLuceneSimpleQuery

protected org.apache.lucene.search.Query getLuceneSimpleQuery(java.lang.String query)
                                                       throws SDXException
Builds an SDX simple query for the provided query string and returns the underlying Lucene query object

Parameters:
query - The query string
Throws:
SDXException

getDocumentSetParameters

protected org.apache.avalon.framework.parameters.Parameters getDocumentSetParameters(org.apache.lucene.document.Document doc)
Returns a parameters object containing set information for a lucene document

Parameters:
doc - The lucene document

executeQueryForRequestParams

protected org.apache.lucene.search.Hits executeQueryForRequestParams(OAIRequest request)
                                                              throws SDXException,
                                                                     org.xml.sax.SAXException
Executes a query for a request and sorts the results in ascending date order

Parameters:
request - The request from which to build the query to execute
Returns:
Hits
Throws:
SDXException
org.xml.sax.SAXException
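
To make "ascending date order" concrete, here is a small sketch that filters records by the OAI-PMH from and until arguments (both inclusive) and sorts the survivors by datestamp. It works on an in-memory list rather than Lucene Hits, and DateOrderSketch, Rec, and select are hypothetical names. It assumes all datestamps are UTC strings of the same granularity, so that string comparison matches chronological order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: an in-memory stand-in for a date-restricted index query.
class DateOrderSketch {
    static final class Rec {
        final String id;
        final String datestamp;   // e.g. "2010-01-15", same granularity for all records
        Rec(String id, String datestamp) { this.id = id; this.datestamp = datestamp; }
    }

    // Keeps records with from <= datestamp <= until (null means unbounded),
    // then returns their ids in ascending datestamp order.
    static List<String> select(List<Rec> all, String from, String until) {
        List<Rec> hits = new ArrayList<>();
        for (Rec r : all) {
            boolean afterFrom = from == null || r.datestamp.compareTo(from) >= 0;
            boolean beforeUntil = until == null || r.datestamp.compareTo(until) <= 0;
            if (afterFrom && beforeUntil) hits.add(r);
        }
        hits.sort(Comparator.comparing((Rec r) -> r.datestamp));
        List<String> ids = new ArrayList<>();
        for (Rec r : hits) ids.add(r.id);
        return ids;
    }

    public static void main(String[] args) {
        List<Rec> all = Arrays.asList(
            new Rec("a", "2010-03-01"), new Rec("b", "2009-12-31"),
            new Rec("c", "2010-01-15"), new Rec("d", "2011-01-01"));
        System.out.println(select(all, "2010-01-01", "2010-12-31"));
    }
}
```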

getEarliestDatestamp

public java.lang.String getEarliestDatestamp()
Returns the earliest _datestamp for this repository

Specified by:
getEarliestDatestamp in interface OAIRepository
Overrides:
getEarliestDatestamp in class AbstractOAIRepository

purgeDeletedRecords

public void purgeDeletedRecords()
Destroys all deleted record entries



Copyright © 2000-2010 Ministere de la culture et de la communication / AJLSM. All Rights Reserved.