fr.gouv.culture.sdx.documentbase
Class LuceneDocumentBase

java.lang.Object
  extended by fr.gouv.culture.sdx.utils.AbstractSdxObject
      extended by fr.gouv.culture.sdx.utils.database.DatabaseBacked
          extended by fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
              extended by fr.gouv.culture.sdx.documentbase.SDXDocumentBase
                  extended by fr.gouv.culture.sdx.documentbase.LuceneDocumentBase
All Implemented Interfaces:
DocumentBase, SDXDocumentBaseTarget, Searchable, Describable, Encodable, Identifiable, Localizable, Saveable, SdxObject, Target, org.apache.avalon.framework.configuration.Configurable, org.apache.avalon.framework.context.Contextualizable, org.apache.avalon.framework.logger.LogEnabled, org.apache.avalon.framework.service.Serviceable, org.apache.excalibur.xml.sax.XMLizable
Direct Known Subclasses:
LuceneThesaurus

public class LuceneDocumentBase
extends SDXDocumentBase

Author:
mpichot

Nested Class Summary
 
Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.documentbase.SDXDocumentBaseTarget
SDXDocumentBaseTarget.ConfigurationNode
 
Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase
DocumentBase.ConfigurationNode
 
Field Summary
protected  FieldList _fieldList
          The (Lucene) fields that are to be handled by the index.
protected  java.util.HashMap _xmlFieldList
          The list of fields with an XML type
static java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
          The implied attribute stating whether this document base is to be exposed to remote access or not.
static java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
          The element used to define system fields in sdx.xconf.
protected  java.lang.String INDEX_DIR_CURRENT
          Directory names for indexes
protected  java.lang.String INDEX_DIR_MAIN
           
protected  long lastDocCount
          Number of documents indexed since the last split
protected  LuceneIndex luceneActiveIndex
          The active index for this document base
protected  LuceneIndex luceneCurrentIndex
          The temporary index for this document base
protected  java.util.Vector luceneSearchIndexList
          The sub-indexes for this document base (first entry is the activeIndex)
protected  java.lang.String SEARCH_INDEX_DIRECTORY_NAME
          The directory name for the index that stores documents' indexation.
protected  int subIndexCount
          Number of subindexes
 
Fields inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBase
_configuration, _documentAdditionStatus, _ilevel, _ilogger, _isIndexOptimized, autoOptimize, baseIndexDir, DOC_ADD_STATUS_ADDED, DOC_ADD_STATUS_FAILURE, DOC_ADD_STATUS_IGNORED, DOC_ADD_STATUS_REFRESHED, DOC_ADD_STATUS_REPLACED, DOC_URL, ELEMENT_NAME_DEFAULT_HPP, ELEMENT_NAME_DEFAULT_MAXSORT, isDatadirShared, keepOriginalDocuments, scheduler, SDX_DATABASE_FORMAT, SDX_DATABASE_VERSION, SDX_DATABASE_VERSION_2_3, SDX_DATE, SDX_DATE_MILLISECONDS, SDX_ISO8601_DATE, SDX_USER, splitActive, splitDoc, splitSize, splitUnit, useCompoundFiles
 
Fields inherited from class fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
_indexationPipeline, _oaiHarv, ATTRIBUTE_AUTO_OPTIMIZE, ATTRIBUTE_COMPOUND_FILES, ATTRIBUTE_SPLIT_DOC, ATTRIBUTE_SPLIT_SIZE, ATTRIBUTE_SPLIT_UNIT, DBELEM_ATTRIBUTE_DEFAULT, DBELEM_ATTRIBUTE_HPP, DBELEM_ATTRIBUTE_KEEP_ORIGINAL, DBELEM_ATTRIBUTE_MAXSORT, defaultHitsPerPage, defaultMaxSort, defaultRepository, ELEMENT_NAME_INDEX_SPLIT, ELEMENT_NAME_OPTIMIZE, INTERNAL_FIELD_NAME_SDX_OAI_DELETED_RECORD, INTERNAL_FIELD_NAME_SDXALL, INTERNAL_FIELD_NAME_SDXAPPID, INTERNAL_FIELD_NAME_SDXCONTENTLENGTH, INTERNAL_FIELD_NAME_SDXDBID, INTERNAL_FIELD_NAME_SDXDOCID, INTERNAL_FIELD_NAME_SDXDOCTYPE, INTERNAL_FIELD_NAME_SDXMODDATE, INTERNAL_FIELD_NAME_SDXREPOID, INTERNAL_SDXALL_FIELD_VALUE, isDefault, locale, oaiRepo, oaiRepositories, PROPERTY_NAME_ATTACHED, PROPERTY_NAME_CONTENT_LENGTH, PROPERTY_NAME_DOCTYPE, PROPERTY_NAME_MIMETYPE, PROPERTY_NAME_ORIGINAL, PROPERTY_NAME_PARENT, PROPERTY_NAME_REPO, PROPERTY_NAME_SUB, repoConnectionPool, repositories, useMetadata
 
Fields inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked
_database, CLASS_NAME_SUFFIX, DATABASE_DIR_NAME, databaseConf, dbLocation, dbPath, DEFAULT_DATABASE_TYPE
 
Fields inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject
_context, _description, _encoding, _id, _locale, _logger, _manager, _xmlizable_objects, _xmlLang, isToSaxInitialized
 
Fields inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase
CLASS_NAME_SUFFIX, PACKAGE_QUALNAME
 
Fields inherited from interface fr.gouv.culture.sdx.utils.Encodable
DEFAULT_ENCODING
 
Fields inherited from interface fr.gouv.culture.sdx.utils.save.Saveable
ALL_SAVE_ATTRIB, PATH_ATTRIB, SAVE_DIRECTORY_PARAM
 
Constructor Summary
LuceneDocumentBase()
          Creates the document base.
 
Method Summary
protected  void addSubIndex()
          Adds a split sub-index and updates the configuration afterwards
protected  void addSubIndex(LuceneIndex index)
          Adds a split sub-index and updates the configuration afterwards
protected  void addToSearchIndex(java.lang.Object indexationDoc, boolean batchIndex)
          Writes a document to the search index
 void backup(SaveParameters save_config)
          Saves the DocumentBase data objects
protected  void backupIndexes(SaveParameters save_config)
          Saves the index files
protected  void backupTimeStamp(SaveParameters save_config)
          Saves the timestamp files
protected  void compactSearchIndex()
           
 void configure(org.apache.avalon.framework.configuration.Configuration configuration)
          Sets the configuration options for this document base.
protected  void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
          Configures the Lucene document base
protected  void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
          Configures the fields list
protected  void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
          Configures the OAI harvester of this Lucene document base.
protected  void configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration)
          Configures one or more OAI repositories.
protected  void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
          Configures an OAIRepository based on the configuration element <oai-repository>
protected  void configureSearchIndex()
          Configures Lucene search index
 OAIRepository createOAIRepository()
          Creates the default OAIRepository for the documentbase, using the older configuration
 OAIRepository createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
          Creates the OAIRepository for the document base, based on a configuration that must start with an <oai-repository> element
 OAIRepository createOAIRepository(java.lang.String repoId)
          Creates an OAIRepository for the documentbase, using the older configuration
 java.util.Date creationDate()
          Returns the creation date of the Lucene search index.
 void delete(Document[] docs, org.xml.sax.ContentHandler handler)
          Deletes documents from this base.
protected  void deleteFromSearchIndex(java.lang.String docId)
           
 int docCount()
          Returns the number of documents in all Lucene sub indexes.
protected  java.lang.String getFormatedSubIndexId(int subIndexNumber)
          Gets the formatted sub-index number (for directory names)
 Index getIndex()
          Gets the Index object for indexing and searching.
protected  java.lang.Object getIndexationDocument(IndexableDocument doc, java.lang.String storeDocId, java.lang.String repoId, IndexParameters params)
           
 org.apache.lucene.index.IndexReader getIndexReader()
          Returns the Lucene index reader for all of this document base's indexes.
protected  long getIndexSize(LuceneIndex index)
          Returns the index size
 LuceneIndex getLuceneIndex()
           
 org.apache.lucene.search.Searcher getSearcher()
          Returns the Lucene index searcher for all of this document base's indexes.
 java.util.HashMap getXMLFieldList()
          Returns the list of fields with an XML type
 void index(IndexableDocument[] docs, Repository repository, IndexParameters params, org.xml.sax.ContentHandler handler)
          Adds one or more indexable documents to the Lucene search index.
 void indexModified()
          Modifies the last modification timestamp file
 void init()
          Initializes the document base.
protected  void initializeVectorizedIndex()
          Initializes the index vector by searching for all sub-indexes in its directory.
NB: working as intended.
protected  boolean initToSax()
          Initializes the LinkedHashMap _xmlizable_objects with the objects to be described in XML
protected  void initVolatileObjectsToSax()
          Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects to be described in XML.
 java.util.Date lastModificationDate()
          Returns the last modification date of the Lucene search index.
 void mergeBatch()
          Deprecated. This method is deprecated since SDX v. 2.3. Use mergeCurrentBatch() instead.
 void mergeCurrentBatch()
          Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base).
 void optimize()
          Optimizes the indexes, the repositories and the system databases
 void reloadFieldList(java.lang.String appConfString)
          Reloads the fieldList of an application
protected  void removeSubIndex()
          Removes a split sub-index and updates the configuration afterwards. Currently unused, as there is no plan to remove sub-indexes; kept as a reminder for future functionality.
protected  void renewKeyIndex()
          Refreshes data for the main and current index
 void replaceFieldList(FieldList fieldList)
          Replaces the current fieldList by the new one
 void restore(SaveParameters save_config)
          Restores the DocumentBase data objects
protected  void restoreIndexes(SaveParameters save_config)
          Restores the index files
protected  void restoreTimeStamp(SaveParameters save_config)
          Restores the timestamp files
protected  IndexParameters setBaseParameters(IndexParameters params)
          Sets the default pipeline parameters and ensures the params have a pipeline
protected  void setSearchIndexParameters(LuceneIndexParameters params)
          Sets the search index parameters for indexation performance
 boolean splitCheck(boolean currentIndex)
          Tests the splitting conditions; returns true when a splitting condition is reached.
 void splitIndex(boolean currentIndex)
          Splits the current large index into two smaller ones
 
Methods inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBase
add, checkIntegrity, configureBase, configureIdGenerator, configureOAIComponents, configureOptimizeTriggers, configureRepositories, configureSplit, delete, deleteIndexableDocumentComponents, deleteRelationsToMastersFromDatabase, getByteSplitSize, getConfiguration, getDocument, getDocument, getDocument, getDocument, getIndexationInformations, getIndexationLogger, getOwners, getRelated, getRepositoryConfigurationList, getRepositoryForDocument, getRepositoryForStorage, getSplitDoc, getSplitSize, getSplitUnit, getUseCompoundFiles, handleParameters, index, index, isAutoOptimized, isIndexOptimized, rollbackIndexation, setConfiguration, targetTriggered
 
Methods inherited from class fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
addOaiDeletedRecord, addOAIRepository, configurePipeline, createEntityForDocMetaData, delete, deletePhysicalDocument, getDefaultHitsPerPage, getDefaultMaxSort, getDefaultOAIRepository, getDefaultRepository, getIdGenerator, getIndexationPipeline, getMimeType, getOAIHarvester, getOAIRepositoriesSize, getOAIRepository, getOAIRepository, getPooledRepositoryConnection, getRepository, getSourceValidity, isDefault, isUseMetadata, managedOaiDeletedRecord, optimizeDatabase, optimizeRepositories, releasePooledRepositoryConnections, removeOaiDeletedRecord
 
Methods inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked
configure, getClassNameSuffix, getDatabase
 
Methods inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject
configureDescription, contextualize, enableLogging, getBaseAttributes, getContext, getDescription, getEncoding, getId, getLocale, getLog, getServiceManager, getXmlLang, service, setDescription, setEncoding, setId, setLocale, setUpSdxObject, setUpSdxObject, setXmlLang, toSAX, verifyConfigurationResources
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface fr.gouv.culture.sdx.utils.SdxObject
getLog
 
Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled
enableLogging
 
Methods inherited from interface org.apache.avalon.framework.context.Contextualizable
contextualize
 
Methods inherited from interface org.apache.avalon.framework.service.Serviceable
service
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Identifiable
getId, setId
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Describable
getDescription, setDescription
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Encodable
getEncoding, setEncoding
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Localizable
getLocale, getXmlLang, setLocale, setXmlLang
 
Methods inherited from interface org.apache.excalibur.xml.sax.XMLizable
toSAX
 
Methods inherited from interface fr.gouv.culture.sdx.search.Searchable
getId
 

Field Detail

luceneSearchIndexList

protected java.util.Vector luceneSearchIndexList
The sub-indexes for this document base (first entry is the activeIndex)


luceneActiveIndex

protected LuceneIndex luceneActiveIndex
The active index for this document base


luceneCurrentIndex

protected LuceneIndex luceneCurrentIndex
The temporary index for this document base


_fieldList

protected FieldList _fieldList
The (Lucene) fields that are to be handled by the index.


_xmlFieldList

protected java.util.HashMap _xmlFieldList
The list of fields with an XML type


subIndexCount

protected int subIndexCount
Number of subindexes


lastDocCount

protected long lastDocCount
Number of documents indexed since the last split


INDEX_DIR_CURRENT

protected final java.lang.String INDEX_DIR_CURRENT
Directory names for indexes

See Also:
Constant Field Values

INDEX_DIR_MAIN

protected final java.lang.String INDEX_DIR_MAIN
See Also:
Constant Field Values

SEARCH_INDEX_DIRECTORY_NAME

protected final java.lang.String SEARCH_INDEX_DIRECTORY_NAME
The directory name for the index that stores documents' indexation.

See Also:
Constant Field Values

DBELEM_ATTRIBUTE_REMOTE_ACCESS

public static final java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
The implied attribute stating whether this document base is to be exposed to remote access or not.

See Also:
Constant Field Values

ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS

public static final java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
The element used to define system fields in sdx.xconf.

See Also:
Constant Field Values
Constructor Detail

LuceneDocumentBase

public LuceneDocumentBase()
Creates the document base. After a document base is created, its logger (super.getLog()) can be set (optional, but recommended for error messages); the document base must then be configured and, after that, initialized in order to work properly.

See Also:
AbstractSdxObject.enableLogging(org.apache.avalon.framework.logger.Logger), configure(org.apache.avalon.framework.configuration.Configuration), init()
Method Detail

configure

public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
               throws org.apache.avalon.framework.configuration.ConfigurationException
Sets the configuration options for this document base.

Specified by:
configure in interface org.apache.avalon.framework.configuration.Configurable
Overrides:
configure in class SDXDocumentBase
Parameters:
configuration - The configuration object from which to build a document base.

Sample configuration entry:

<sdx:documentBase sdx:id = "myDocumentBaseName" sdx:type = "lucene">
     <sdx:fieldList xml:lang = "fr-FR" sdx:variant = "" sdx:analyzerConf = "" sdx:analyzerClass = "">
          <sdx:field code = "fieldName" type = "word" xml:lang = "fr-FR" sdx:analyzerClass = "" sdx:analyzerConf = ""/>
          <sdx:field code = "fieldName2" type = "field" xml:lang = "fr-FR" brief = "true"/>
          <sdx:field code = "fieldName3" type = "date" xml:lang = "fr-FR"/>
          <sdx:field code = "fieldName4" type = "unindexed" xml:lang = "fr-FR"/>
     </sdx:fieldList>
     <sdx:index>
          <sdx:pipeline sdx:id = "sdxIndexationPipeline">
               <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step2" sdx:type = "xslt"/>
               <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step3" sdx:type = "xslt" keep = "true"/>
          </sdx:pipeline>
     </sdx:index>
     <sdx:repositories>
          <sdx:repository baseDirectory = "blah4" depth = "3" extent = "100" sdx:type = "FS" sdx:default = "true" sdx:id = "blah4"/>
          <sdx:repository ref = "blah2"/>
     </sdx:repositories>
</sdx:documentBase>
     
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
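
As a rough illustration of the field list in the sample entry above, the following sketch reads the code and type of each sdx:field element with the JDK's DOM parser. It is not how SDX reads its configuration (configure() receives an Avalon Configuration object); the class FieldListSketch and its method fieldTypes are hypothetical names, and the XML is a shortened copy of the sample.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of SDX.
class FieldListSketch {

    // Shortened copy of the sample configuration entry.
    static final String SAMPLE =
          "<sdx:documentBase sdx:id=\"myDocumentBaseName\" sdx:type=\"lucene\">"
        + "<sdx:fieldList xml:lang=\"fr-FR\">"
        + "<sdx:field code=\"fieldName\" type=\"word\" xml:lang=\"fr-FR\"/>"
        + "<sdx:field code=\"fieldName2\" type=\"field\" xml:lang=\"fr-FR\" brief=\"true\"/>"
        + "<sdx:field code=\"fieldName3\" type=\"date\" xml:lang=\"fr-FR\"/>"
        + "<sdx:field code=\"fieldName4\" type=\"unindexed\" xml:lang=\"fr-FR\"/>"
        + "</sdx:fieldList>"
        + "</sdx:documentBase>";

    // Returns field code -> field type, in document order.
    static Map<String, String> fieldTypes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The sample does not declare the sdx prefix, so parse without
            // namespace awareness and match on the raw tag name.
            factory.setNamespaceAware(false);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList fields = doc.getElementsByTagName("sdx:field");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                result.put(field.getAttribute("code"), field.getAttribute("type"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse configuration", e);
        }
    }

    public static void main(String[] args) {
        fieldTypes(SAMPLE).forEach((code, type) -> System.out.println(code + " -> " + type));
    }
}
```

Running main lists the four fields of the sample with their declared types (word, field, date, unindexed).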

configureDocumentBase

protected void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
                              throws org.apache.avalon.framework.configuration.ConfigurationException
Configures the Lucene document base

Specified by:
configureDocumentBase in class SDXDocumentBase
Parameters:
configuration - The configuration object.
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureFieldList

protected void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
                           throws org.apache.avalon.framework.configuration.ConfigurationException
Configures the fields list

Parameters:
configuration -
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

reloadFieldList

public void reloadFieldList(java.lang.String appConfString)
                     throws SDXException
Reloads the fieldList of an application

Parameters:
appConfString - The path of the configuration file which contains the new fieldList (e.g. file:///myFiles/application.xconf, cocoon://myApplication/conf/application.xconf)
Throws:
SDXException

replaceFieldList

public void replaceFieldList(FieldList fieldList)
                      throws org.apache.avalon.framework.configuration.ConfigurationException
Replaces the current fieldList by the new one

Parameters:
fieldList - The new fieldList, which replaces the old one
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureSearchIndex

protected void configureSearchIndex()
                             throws org.apache.avalon.framework.configuration.ConfigurationException
Configures Lucene search index

Throws:
org.apache.avalon.framework.configuration.ConfigurationException

createOAIRepository

public OAIRepository createOAIRepository(java.lang.String repoId)
Creates an OAIRepository for the documentbase, using the older configuration

Overrides:
createOAIRepository in class AbstractDocumentBase
Parameters:
repoId - String The id of the repository to create
Returns:
OAIRepository

createOAIRepository

public OAIRepository createOAIRepository()
Creates the default OAIRepository for the documentbase, using the older configuration

Specified by:
createOAIRepository in interface DocumentBase
Overrides:
createOAIRepository in class AbstractDocumentBase
Returns:
OAIRepository
See Also:
createOAIRepository(String)

createOAIRepository

public OAIRepository createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
Creates the OAIRepository for the document base, based on a configuration that must start with an <oai-repository> element.

Parameters:
configuration - The configuration
Returns:
OAIRepository

configureOAIRepositories

protected void configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration)
                                 throws org.apache.avalon.framework.configuration.ConfigurationException
Configures one or more OAI repositories.

Specified by:
configureOAIRepositories in class SDXDocumentBase
Parameters:
configuration -
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureOAIRepository

protected void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
                               throws org.apache.avalon.framework.configuration.ConfigurationException
Configures an OAIRepository based on the configuration element <oai-repository>

Specified by:
configureOAIRepository in class SDXDocumentBase
Parameters:
configuration - The configuration
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
See Also:
SDXDocumentBase.configureOAIRepository(org.apache.avalon.framework.configuration.Configuration)

configureOAIHarvester

protected void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
                              throws org.apache.avalon.framework.configuration.ConfigurationException
Configures the OAI harvester of this Lucene document base.

Specified by:
configureOAIHarvester in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

index

public void index(IndexableDocument[] docs,
                  Repository repository,
                  IndexParameters params,
                  org.xml.sax.ContentHandler handler)
           throws SDXException,
                  org.xml.sax.SAXException,
                  org.apache.cocoon.ProcessingException
Adds one or more indexable documents to the Lucene search index.

After adding the documents to the search index, this method recycles the Lucene searcher if:

  1. The auto-optimize option is false
  2. More than one document is added to the search index

Specified by:
index in interface DocumentBase
Overrides:
index in class SDXDocumentBase
Parameters:
docs - The documents to add.
repository - The repository in which to store the documents. If null is passed, the default repository will be used.
params - The parameters for this indexing operation.
handler - A content handler to which information about the process is sent (may be null). TODO: what kind of information? -pb
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
See Also:
SDXDocumentBase.index(fr.gouv.culture.sdx.document.IndexableDocument[], fr.gouv.culture.sdx.repository.Repository, fr.gouv.culture.sdx.documentbase.IndexParameters, org.xml.sax.ContentHandler)

delete

public void delete(Document[] docs,
                   org.xml.sax.ContentHandler handler)
            throws SDXException,
                   org.xml.sax.SAXException,
                   org.apache.cocoon.ProcessingException
Deletes documents from this base.

Deletes one or more documents from this LuceneDocumentBase and recycles the Lucene searcher if only one document is deleted or if the LuceneDocumentBase is not auto-optimized.

Specified by:
delete in interface DocumentBase
Overrides:
delete in class SDXDocumentBase
Parameters:
docs - The documents to delete.
handler - A content handler to feed with information.
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
See Also:
AbstractDocumentBase.delete(Document, ContentHandler)

setBaseParameters

protected IndexParameters setBaseParameters(IndexParameters params)
Sets the default pipeline parameters and ensures the params have a pipeline

Overrides:
setBaseParameters in class SDXDocumentBase
Parameters:
params - The params object provided by the user at indexation time

getXMLFieldList

public java.util.HashMap getXMLFieldList()
Description copied from class: SDXDocumentBase
Returns the list of fields with an XML type

Specified by:
getXMLFieldList in class SDXDocumentBase

getIndex

public Index getIndex()
Gets the Index object for indexing and searching.

Returns:
The LuceneIndex object.

getLuceneIndex

public LuceneIndex getLuceneIndex()

setSearchIndexParameters

protected void setSearchIndexParameters(LuceneIndexParameters params)
Sets the search index parameters for indexation performance

Parameters:
params - The Lucene-specific parameters to use

addToSearchIndex

protected void addToSearchIndex(java.lang.Object indexationDoc,
                                boolean batchIndex)
                         throws SDXException
Writes a document to the search index

Specified by:
addToSearchIndex in class SDXDocumentBase
Parameters:
indexationDoc - The Document to add
batchIndex -
Throws:
SDXException

deleteFromSearchIndex

protected void deleteFromSearchIndex(java.lang.String docId)
                              throws SDXException
Specified by:
deleteFromSearchIndex in class SDXDocumentBase
Throws:
SDXException

compactSearchIndex

protected void compactSearchIndex()
                           throws SDXException
Specified by:
compactSearchIndex in class SDXDocumentBase
Throws:
SDXException

getIndexationDocument

protected java.lang.Object getIndexationDocument(IndexableDocument doc,
                                                 java.lang.String storeDocId,
                                                 java.lang.String repoId,
                                                 IndexParameters params)
                                          throws SDXException
Specified by:
getIndexationDocument in class SDXDocumentBase
Throws:
SDXException

lastModificationDate

public java.util.Date lastModificationDate()
Returns the last modification date of the Lucene search index.


creationDate

public java.util.Date creationDate()
Returns the creation date of the Lucene search index.


init

public void init()
          throws SDXException
Description copied from interface: DocumentBase
Initializes the document base.

This method must be called after the logger (super.getLog()) has been set and the configuration is done.

Specified by:
init in interface DocumentBase
Overrides:
init in class SDXDocumentBase
Throws:
SDXException

initToSax

protected boolean initToSax()
Description copied from class: AbstractSdxObject
Initializes the LinkedHashMap _xmlizable_objects with the objects to be described in XML

Overrides:
initToSax in class SDXDocumentBase

initVolatileObjectsToSax

protected void initVolatileObjectsToSax()
Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects to be described in XML.

Some objects need to be refreshed each time toSAX is called.

Overrides:
initVolatileObjectsToSax in class SDXDocumentBase

optimize

public void optimize()
Optimizes the indexes, the repositories and the system databases

Specified by:
optimize in interface DocumentBase
Specified by:
optimize in class SDXDocumentBase

mergeCurrentBatch

public void mergeCurrentBatch()
Merges a batch of documents

Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base).

Specified by:
mergeCurrentBatch in class SDXDocumentBase

indexModified

public void indexModified()
Modifies the last modification timestamp file

Specified by:
indexModified in class SDXDocumentBase

splitIndex

public void splitIndex(boolean currentIndex)
                throws java.io.IOException,
                       SDXException
Splits current index

Splits the current large index into two smaller ones

Specified by:
splitIndex in class SDXDocumentBase
Throws:
java.io.IOException
SDXException

initializeVectorizedIndex

protected void initializeVectorizedIndex()
                                  throws org.apache.avalon.framework.configuration.ConfigurationException
Initializes the index vector

Initializes the index vector by searching for all sub-indexes in its directory.
NB: working as intended.

Throws:
org.apache.avalon.framework.configuration.ConfigurationException

addSubIndex

protected void addSubIndex()
                    throws org.apache.avalon.framework.configuration.ConfigurationException
Adds a split sub-index and updates the configuration afterwards

Throws:
SDXException - If it's impossible to configure or initialize the sub-index to add.
org.apache.avalon.framework.configuration.ConfigurationException

removeSubIndex

protected void removeSubIndex()
Removes a split sub-index and updates the configuration afterwards. Currently unused, as there is no plan to remove sub-indexes; kept as a reminder for future functionality.


splitCheck

public boolean splitCheck(boolean currentIndex)
                   throws SDXException
Tests splitting conditions

Returns true when a splitting condition is reached. If so, the call should be followed by a splitIndex() call. The conditions are checked in this order:

  1. size of the index
  2. number of documents

Specified by:
splitCheck in class SDXDocumentBase
Parameters:
currentIndex - boolean to indicate the test concerns the current index (true) or the active one (false)
Returns:
true when a splitting condition is reached, false otherwise.
Throws:
SDXException
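
The check order described above (index size first, then number of documents) can be sketched as a small standalone predicate. This is only an illustration under stated assumptions, not the SDX implementation: the class SplitCheckSketch, its parameter names, the use of >= for the comparisons, and the convention that a threshold of 0 or less disables a condition are all hypothetical.

```java
// Hypothetical stand-in for illustration only; not part of SDX.
class SplitCheckSketch {

    /**
     * Returns true when one of the splitting conditions is reached.
     * Assumption: a threshold <= 0 means the condition is disabled.
     */
    static boolean shouldSplit(long indexSizeBytes, long docsSinceLastSplit,
                               long maxSizeBytes, long maxDocs) {
        // 1. size of the index
        if (maxSizeBytes > 0 && indexSizeBytes >= maxSizeBytes) {
            return true;
        }
        // 2. number of documents
        return maxDocs > 0 && docsSinceLastSplit >= maxDocs;
    }

    public static void main(String[] args) {
        // Size threshold reached: a split would follow.
        System.out.println(shouldSplit(2048L, 0L, 1024L, 0L));
        // Neither threshold reached: no split.
        System.out.println(shouldSplit(10L, 499L, 1024L, 500L));
    }
}
```

In the real class, a true result is meant to be followed by a splitIndex() call with the same currentIndex argument.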

getIndexSize

protected long getIndexSize(LuceneIndex index)
Returns the index size

Parameters:
index - LuceneIndex
Returns:
the index size, as a long

getSearcher

public org.apache.lucene.search.Searcher getSearcher()
                                              throws SDXException
Returns the Lucene index searcher

Returns the index searcher for all of this document base's indexes.

Returns:
Searcher
Throws:
SDXException - If it's not possible to build MultiSearcher.
See Also:
ParallelMultiSearcher

getIndexReader

public org.apache.lucene.index.IndexReader getIndexReader()
                                                   throws SDXException
Returns the Lucene index reader

Returns the index reader for all of this document base's indexes.

Returns:
IndexReader
Throws:
SDXException - If it's not possible to build MultiReader.
See Also:
MultiReader

getFormatedSubIndexId

protected java.lang.String getFormatedSubIndexId(int subIndexNumber)
Gets the formatted sub-index number (for directory names)

Parameters:
subIndexNumber - int representing the number of the sub-index
Returns:
sub-index number formatted as a String
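
The exact format SDX uses for sub-index directory names is not stated here. As a minimal sketch, assuming a zero-padded, fixed-width number (the three-digit width and the class name SubIndexIdSketch are hypothetical), formatting could look like this. Zero-padding keeps directory names in numeric order when they are sorted as strings.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration only; not part of SDX.
class SubIndexIdSketch {

    // Assumption: three digits, zero-padded. The real width may differ.
    static String format(int subIndexNumber) {
        // Locale.ROOT avoids locale-specific digit characters.
        return String.format(Locale.ROOT, "%03d", subIndexNumber);
    }

    public static void main(String[] args) {
        System.out.println(format(7));
        // Padded names sort correctly as strings: "002" comes before "010".
        System.out.println(format(2).compareTo(format(10)) < 0);
    }
}
```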

addSubIndex

protected void addSubIndex(LuceneIndex index)
                    throws SDXException
Adds a split sub-index and updates the configuration afterwards

Parameters:
index - LuceneIndex
Throws:
SDXException - If it's not possible to configure and initialize the sub-index.

renewKeyIndex

protected void renewKeyIndex()
                      throws SDXException
Refreshes data for the main and current index

Throws:
SDXException - If it's impossible to free resources or to initialize the Lucene index.

backup

public void backup(SaveParameters save_config)
            throws SDXException
Saves the DocumentBase data objects

Specified by:
backup in interface Saveable
Overrides:
backup in class SDXDocumentBase
Parameters:
save_config - SaveParameters
Throws:
SDXException
See Also:
Saveable.backup(fr.gouv.culture.sdx.utils.save.SaveParameters)

backupIndexes

protected void backupIndexes(SaveParameters save_config)
                      throws SDXException
Saves the index files

Specified by:
backupIndexes in class SDXDocumentBase
Throws:
SDXException

backupTimeStamp

protected void backupTimeStamp(SaveParameters save_config)
                        throws SDXException
Saves the timestamp files

Specified by:
backupTimeStamp in class SDXDocumentBase
Throws:
SDXException

restore

public void restore(SaveParameters save_config)
             throws SDXException
Restores the DocumentBase data objects

Specified by:
restore in interface Saveable
Overrides:
restore in class SDXDocumentBase
Throws:
SDXException
See Also:
Saveable.restore(fr.gouv.culture.sdx.utils.save.SaveParameters)

restoreIndexes

protected void restoreIndexes(SaveParameters save_config)
                       throws SDXException
Restores the index files

Specified by:
restoreIndexes in class SDXDocumentBase
Throws:
SDXException

restoreTimeStamp

protected void restoreTimeStamp(SaveParameters save_config)
                         throws SDXException
Restores the timestamp files

Specified by:
restoreTimeStamp in class SDXDocumentBase
Throws:
SDXException

docCount

public int docCount()
Returns the number of documents in all Lucene sub indexes.

Returns:
the number of documents in all sub-indexes
TODO - This needs to be periodically written to a .properties file
TODO - we need a configurable generic mechanism to save such information (certain queries, terms, etc.) to a .properties file, which should be updated after indexation/deletion

mergeBatch

public void mergeBatch()
                throws SDXException
Deprecated. This method is deprecated since SDX v. 2.3. Use mergeCurrentBatch() instead.

Merges a batch of documents (in memory) into the physical index on the file system.

Specified by:
mergeBatch in class SDXDocumentBase
Throws:
SDXException


Copyright © 2000-2010 Ministere de la culture et de la communication / AJLSM. All Rights Reserved.