java.lang.Object
fr.gouv.culture.sdx.utils.AbstractSdxObject
fr.gouv.culture.sdx.utils.database.DatabaseBacked
fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
fr.gouv.culture.sdx.documentbase.SDXDocumentBase
fr.gouv.culture.sdx.documentbase.LuceneDocumentBase
public class LuceneDocumentBase
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.documentbase.SDXDocumentBaseTarget |
---|
SDXDocumentBaseTarget.ConfigurationNode |
Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase |
---|
DocumentBase.ConfigurationNode |
Field Summary | |
---|---|
protected FieldList |
_fieldList
The (Lucene) fields that are to be handled by the index. |
protected java.util.HashMap |
_xmlFieldList
The list of fields with an XML type |
static java.lang.String |
DBELEM_ATTRIBUTE_REMOTE_ACCESS
The implied attribute stating whether this document base is to be exposed to remote access or not. |
static java.lang.String |
ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
The element used to define system fields in sdx.xconf. |
protected java.lang.String |
INDEX_DIR_CURRENT
Directory names for indexes |
protected java.lang.String |
INDEX_DIR_MAIN
|
protected long |
lastDocCount
Number of documents indexed since the last split |
protected LuceneIndex |
luceneActiveIndex
The active index for this document base |
protected LuceneIndex |
luceneCurrentIndex
The temporary index for this document base |
protected java.util.Vector |
luceneSearchIndexList
The sub-indexes for this document base (first entry is the activeIndex) |
protected java.lang.String |
SEARCH_INDEX_DIRECTORY_NAME
The directory name for the index that stores documents' indexation. |
protected int |
subIndexCount
Number of subindexes |
Fields inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked |
---|
_database, CLASS_NAME_SUFFIX, DATABASE_DIR_NAME, databaseConf, dbLocation, dbPath, DEFAULT_DATABASE_TYPE |
Fields inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject |
---|
_context, _description, _encoding, _id, _locale, _logger, _manager, _xmlizable_objects, _xmlLang, isToSaxInitialized |
Fields inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase |
---|
CLASS_NAME_SUFFIX, PACKAGE_QUALNAME |
Fields inherited from interface fr.gouv.culture.sdx.utils.Encodable |
---|
DEFAULT_ENCODING |
Fields inherited from interface fr.gouv.culture.sdx.utils.save.Saveable |
---|
ALL_SAVE_ATTRIB, PATH_ATTRIB, SAVE_DIRECTORY_PARAM |
Constructor Summary | |
---|---|
LuceneDocumentBase()
Creates the document base. |
Method Summary | |
---|---|
protected void |
addSubIndex()
Adds a split sub-index and updates the configuration afterwards |
protected void |
addSubIndex(LuceneIndex index)
Adds a split sub-index and updates the configuration afterwards |
protected void |
addToSearchIndex(java.lang.Object indexationDoc,
boolean batchIndex)
Writes a document to the search index |
void |
backup(SaveParameters save_config)
Saves the DocumentBase data objects |
protected void |
backupIndexes(SaveParameters save_config)
Saves the index files |
protected void |
backupTimeStamp(SaveParameters save_config)
Saves the timestamp files |
protected void |
compactSearchIndex()
|
void |
configure(org.apache.avalon.framework.configuration.Configuration configuration)
Sets the configuration options for this document base. |
protected void |
configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the Lucene document base |
protected void |
configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the fields list |
protected void |
configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the OAI harvester of this Lucene document base. |
protected void |
configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration)
Configures one or more OAI repositories. |
protected void |
configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
Configures an OAIRepository based on the configuration element <oai-repository> |
protected void |
configureSearchIndex()
Configures Lucene search index |
OAIRepository |
createOAIRepository()
Creates the default OAIRepository for the documentbase, using the older configuration |
OAIRepository |
createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
Creates the OAIRepository for the documentbase, based on a configuration that must start with an <oai-repository> element |
OAIRepository |
createOAIRepository(java.lang.String repoId)
Creates an OAIRepository for the documentbase, using the older configuration |
java.util.Date |
creationDate()
Returns the creation date of the Lucene search index. |
void |
delete(Document[] docs,
org.xml.sax.ContentHandler handler)
Deletes documents from this base. |
protected void |
deleteFromSearchIndex(java.lang.String docId)
|
int |
docCount()
Returns the number of documents in all Lucene sub indexes. |
protected java.lang.String |
getFormatedSubIndexId(int subIndexNumber)
Gets the formatted sub-index number (used for directory names) |
Index |
getIndex()
Gets the Index object for indexing and searching. |
protected java.lang.Object |
getIndexationDocument(IndexableDocument doc,
java.lang.String storeDocId,
java.lang.String repoId,
IndexParameters params)
|
org.apache.lucene.index.IndexReader |
getIndexReader()
Returns the Lucene index reader for all of this document base's indexes. |
protected long |
getIndexSize(LuceneIndex index)
Returns the index size |
LuceneIndex |
getLuceneIndex()
|
org.apache.lucene.search.Searcher |
getSearcher()
Returns the Lucene index searcher for all of this document base's indexes. |
java.util.HashMap |
getXMLFieldList()
Returns the list of XML type fields |
void |
index(IndexableDocument[] docs,
Repository repository,
IndexParameters params,
org.xml.sax.ContentHandler handler)
Adds one or more indexable documents to the Lucene search index. |
void |
indexModified()
Modifies the last modification timestamp file |
void |
init()
Initializes the document base. |
protected void |
initializeVectorizedIndex()
Initializes the index vector by searching for all sub-indexes in its directory. NB: working as intended. |
protected boolean |
initToSax()
Initializes the LinkedHashMap _xmlizable_objects with the objects in order to describe them in XML |
protected void |
initVolatileObjectsToSax()
Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects in order to describe them in XML. |
java.util.Date |
lastModificationDate()
Returns the last modification date of the Lucene search index. |
void |
mergeBatch()
Deprecated. This method is deprecated since SDX v. 2.3. Use mergeCurrentBatch() instead. |
void |
mergeCurrentBatch()
Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base). |
void |
optimize()
Optimizes the indexes, repositories, and system databases |
void |
reloadFieldList(java.lang.String appConfString)
Reloads the fieldList of an application |
protected void |
removeSubIndex()
Removes a split sub-index and updates the configuration afterwards. Currently unused, as there is no plan to do so; kept as a reminder for future functionality. |
protected void |
renewKeyIndex()
Refreshes data for the main and current index |
void |
replaceFieldList(FieldList fieldList)
Replaces the current fieldList by the new one |
void |
restore(SaveParameters save_config)
Restore the DocumentBase data objects |
protected void |
restoreIndexes(SaveParameters save_config)
Restores the index files |
protected void |
restoreTimeStamp(SaveParameters save_config)
Restore the timestamp files |
protected IndexParameters |
setBaseParameters(IndexParameters params)
Sets the default pipeline parameters and ensures the params have a pipeline |
protected void |
setSearchIndexParameters(LuceneIndexParameters params)
Sets the search index parameters for indexation performance |
boolean |
splitCheck(boolean currentIndex)
Tests the splitting conditions and returns true when they are reached. |
void |
splitIndex(boolean currentIndex)
Splits the current big index into two smaller ones |
Methods inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked |
---|
configure, getClassNameSuffix, getDatabase |
Methods inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject |
---|
configureDescription, contextualize, enableLogging, getBaseAttributes, getContext, getDescription, getEncoding, getId, getLocale, getLog, getServiceManager, getXmlLang, service, setDescription, setEncoding, setId, setLocale, setUpSdxObject, setUpSdxObject, setXmlLang, toSAX, verifyConfigurationResources |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface fr.gouv.culture.sdx.utils.SdxObject |
---|
getLog |
Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled |
---|
enableLogging |
Methods inherited from interface org.apache.avalon.framework.context.Contextualizable |
---|
contextualize |
Methods inherited from interface org.apache.avalon.framework.service.Serviceable |
---|
service |
Methods inherited from interface fr.gouv.culture.sdx.utils.Identifiable |
---|
getId, setId |
Methods inherited from interface fr.gouv.culture.sdx.utils.Describable |
---|
getDescription, setDescription |
Methods inherited from interface fr.gouv.culture.sdx.utils.Encodable |
---|
getEncoding, setEncoding |
Methods inherited from interface fr.gouv.culture.sdx.utils.Localizable |
---|
getLocale, getXmlLang, setLocale, setXmlLang |
Methods inherited from interface org.apache.excalibur.xml.sax.XMLizable |
---|
toSAX |
Methods inherited from interface fr.gouv.culture.sdx.search.Searchable |
---|
getId |
Field Detail |
---|
protected java.util.Vector luceneSearchIndexList
protected LuceneIndex luceneActiveIndex
protected LuceneIndex luceneCurrentIndex
protected FieldList _fieldList
protected java.util.HashMap _xmlFieldList
protected int subIndexCount
protected long lastDocCount
protected final java.lang.String INDEX_DIR_CURRENT
protected final java.lang.String INDEX_DIR_MAIN
protected final java.lang.String SEARCH_INDEX_DIRECTORY_NAME
public static final java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
public static final java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
Constructor Detail |
---|
public LuceneDocumentBase()
AbstractSdxObject.enableLogging(org.apache.avalon.framework.logger.Logger), configure(org.apache.avalon.framework.configuration.Configuration), init()
Method Detail |
---|
public void configure(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configure
in interface org.apache.avalon.framework.configuration.Configurable
configure
in class SDXDocumentBase
configuration
- The configuration object from which to build a document base.
Sample configuration entry:
<sdx:documentBase sdx:id = "myDocumentBaseName" sdx:type = "lucene">
    <sdx:fieldList xml:lang = "fr-FR" sdx:variant = "" sdx:analyzerConf = "" sdx:analyzerClass = "">
        <sdx:field code = "fieldName" type = "word" xml:lang = "fr-FR" sdx:analyzerClass = "" sdx:analyzerConf = ""/>
        <sdx:field code = "fieldName2" type = "field" xml:lang = "fr-FR" brief = "true"/>
        <sdx:field code = "fieldName3" type = "date" xml:lang = "fr-FR"/>
        <sdx:field code = "fieldName4" type = "unindexed" xml:lang = "fr-FR"/>
    </sdx:fieldList>
    <sdx:index>
        <sdx:pipeline sdx:id = "sdxIndexationPipeline">
            <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step2" sdx:type = "xslt"/>
            <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step3" sdx:type = "xslt" keep = "true"/>
        </sdx:pipeline>
    </sdx:index>
    <sdx:repositories>
        <sdx:repository baseDirectory = "blah4" depth = "3" extent = "100" sdx:type = "FS" sdx:default = "true" sdx:id = "blah4"/>
        <sdx:repository ref = "blah2"/>
    </sdx:repositories>
</sdx:documentBase>
org.apache.avalon.framework.configuration.ConfigurationException
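The field declarations in the sample configuration entry can be inspected with the JDK alone. The following is a minimal sketch, not the way SDX itself reads its configuration (SDX receives an Avalon Configuration object). The class FieldListSketch and the method fieldTypes are hypothetical names introduced for illustration; the sketch only assumes that each sdx:field element carries code and type attributes, as in the sample.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of SDX: lists the code -> type pairs
// declared by <sdx:field> elements in a configuration fragment.
final class FieldListSketch {

    static Map<String, String> fieldTypes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The sample does not declare the sdx prefix, so the fragment is
            // parsed without namespace awareness and matched by qualified name.
            factory.setNamespaceAware(false);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagName("sdx:field");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                result.put(field.getAttribute("code"), field.getAttribute("type"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse configuration fragment", e);
        }
    }

    public static void main(String[] args) {
        String xml =
              "<sdx:documentBase sdx:id = \"myDocumentBaseName\" sdx:type = \"lucene\">"
            + "<sdx:fieldList xml:lang = \"fr-FR\">"
            + "<sdx:field code = \"fieldName\" type = \"word\" xml:lang = \"fr-FR\"/>"
            + "<sdx:field code = \"fieldName3\" type = \"date\" xml:lang = \"fr-FR\"/>"
            + "</sdx:fieldList>"
            + "</sdx:documentBase>";
        // Prints the declared fields in document order.
        System.out.println(fieldTypes(xml));
    }
}
```

A LinkedHashMap is used so that the fields come back in the order in which they are declared.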
protected void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configureDocumentBase
in class SDXDocumentBase
configuration
- Configuration
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configuration
-
org.apache.avalon.framework.configuration.ConfigurationException
public void reloadFieldList(java.lang.String appConfString) throws SDXException
appConfString
- The path of the configuration file which contains the new fieldList (e.g., file:///myFiles/application.xconf, cocoon://myApplication/conf/application.xconf)
SDXException
public void replaceFieldList(FieldList fieldList) throws org.apache.avalon.framework.configuration.ConfigurationException
fieldList
- The new fieldList which replaces the old one
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureSearchIndex() throws org.apache.avalon.framework.configuration.ConfigurationException
org.apache.avalon.framework.configuration.ConfigurationException
public OAIRepository createOAIRepository(java.lang.String repoId)
createOAIRepository
in class AbstractDocumentBase
repoId
- String The id of the repository to create
public OAIRepository createOAIRepository()
createOAIRepository
in interface DocumentBase
createOAIRepository
in class AbstractDocumentBase
createOAIRepository(String)
public OAIRepository createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
configuration
- The configuration
protected void configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIRepositories
in class SDXDocumentBase
configuration
-
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIRepository
in class SDXDocumentBase
configuration
- The configuration
org.apache.avalon.framework.configuration.ConfigurationException
SDXDocumentBase.configureOAIRepository(org.apache.avalon.framework.configuration.Configuration)
protected void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIHarvester
in class SDXDocumentBase
org.apache.avalon.framework.configuration.ConfigurationException
public void index(IndexableDocument[] docs, Repository repository, IndexParameters params, org.xml.sax.ContentHandler handler) throws SDXException, org.xml.sax.SAXException, org.apache.cocoon.ProcessingException
After adding the document to the search index, this method recycles the Lucene searcher if :
index
in interface DocumentBase
index
in class SDXDocumentBase
docs
- The documents to add.
repository
- The repository in which to store the documents. If null is passed, the default repository will be used.
params
- The parameters for this add operation.
handler
- A content handler to which information about the process is sent (may be null)
TODO : what kind of "informations" ? -pb
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
SDXDocumentBase.index(fr.gouv.culture.sdx.document.IndexableDocument[], fr.gouv.culture.sdx.repository.Repository, fr.gouv.culture.sdx.documentbase.IndexParameters, org.xml.sax.ContentHandler)
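The handler argument is a standard SAX ContentHandler. Which events SDX actually sends to it is not documented on this page, so the following is only a minimal sketch of a handler that tallies whatever start-element events it receives. The class name EventCollector and its count method are hypothetical; the element names in the usage example are placeholders, not SDX output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of SDX: counts start-element events by
// qualified name so a caller can see what the indexing process reported.
final class EventCollector extends DefaultHandler {

    private final Map<String, Integer> counts = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        counts.merge(qName, 1, Integer::sum);
    }

    // Number of start-element events seen for the given qualified name.
    int count(String qName) {
        return counts.getOrDefault(qName, 0);
    }

    public static void main(String[] args) {
        EventCollector collector = new EventCollector();
        // Placeholder events standing in for whatever the caller would emit.
        collector.startElement("", "placeholder", "placeholder", new AttributesImpl());
        collector.startElement("", "placeholder", "placeholder", new AttributesImpl());
        System.out.println(collector.count("placeholder"));
    }
}
```

An instance of such a handler could be passed as the handler argument of index(); since null is also accepted there, a collector like this is only needed when the caller wants to inspect the reported information.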
public void delete(Document[] docs, org.xml.sax.ContentHandler handler) throws SDXException, org.xml.sax.SAXException, org.apache.cocoon.ProcessingException
Deletes one or more documents from this LuceneDocumentBase and recycles the Lucene searcher if only one document is deleted or if the LuceneDocumentBase is not set to autoOptimize.
delete
in interface DocumentBase
delete
in class SDXDocumentBase
docs
- The documents to delete.
handler
- A content handler to feed with information.
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
AbstractDocumentBase.delete(Document, ContentHandler)
protected IndexParameters setBaseParameters(IndexParameters params)
setBaseParameters
in class SDXDocumentBase
params
- The params object provided by the user at indexation time
public java.util.HashMap getXMLFieldList()
SDXDocumentBase
getXMLFieldList
in class SDXDocumentBase
public Index getIndex()
public LuceneIndex getLuceneIndex()
protected void setSearchIndexParameters(LuceneIndexParameters params)
params
- The Lucene-specific params to use
protected void addToSearchIndex(java.lang.Object indexationDoc, boolean batchIndex) throws SDXException
addToSearchIndex
in class SDXDocumentBase
indexationDoc
- The Document to add
batchIndex
-
SDXException
protected void deleteFromSearchIndex(java.lang.String docId) throws SDXException
deleteFromSearchIndex
in class SDXDocumentBase
SDXException
protected void compactSearchIndex() throws SDXException
compactSearchIndex
in class SDXDocumentBase
SDXException
protected java.lang.Object getIndexationDocument(IndexableDocument doc, java.lang.String storeDocId, java.lang.String repoId, IndexParameters params) throws SDXException
getIndexationDocument
in class SDXDocumentBase
SDXException
public java.util.Date lastModificationDate()
public java.util.Date creationDate()
public void init() throws SDXException
DocumentBase
This method must be called after the logger (super.getLog()) has been set and the configuration is done.
init
in interface DocumentBase
init
in class SDXDocumentBase
SDXException
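The ordering constraint on init() (logger set first, configuration done, then initialization) can be sketched with a small stand-in class. LifecycleSketch below is hypothetical and does not reproduce the SDX or Avalon types; it only illustrates how a guard in init() would reject a call made too early.

```java
// Hypothetical stand-in, not part of SDX: enforces that init() is only
// called once logging has been enabled and configuration has been done.
final class LifecycleSketch {

    private boolean loggingEnabled;
    private boolean configured;
    private boolean initialized;

    void enableLogging() {
        loggingEnabled = true;
    }

    void configure() {
        configured = true;
    }

    void init() {
        if (!loggingEnabled || !configured) {
            throw new IllegalStateException(
                    "init() requires enableLogging() and configure() to have been called first");
        }
        initialized = true;
    }

    boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        LifecycleSketch base = new LifecycleSketch();
        base.enableLogging();
        base.configure();
        base.init();
        System.out.println(base.isInitialized());
    }
}
```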
protected boolean initToSax()
AbstractSdxObject
initToSax
in class SDXDocumentBase
protected void initVolatileObjectsToSax()
Some objects need to be refreshed each time toSAX is called.
initVolatileObjectsToSax
in class SDXDocumentBase
public void optimize()
optimize
in interface DocumentBase
optimize
in class SDXDocumentBase
public void mergeCurrentBatch()
Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base).
mergeCurrentBatch
in class SDXDocumentBase
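The merge behaviour described for mergeCurrentBatch() can be pictured with plain collections. This is a minimal sketch under stated assumptions: BatchMergeSketch, its lists, and its optimize counter are hypothetical stand-ins for the in-memory batch, the on-disk index, and the optimization step, and do not use Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of SDX: an in-memory batch is merged
// into a "physical" index, which is optimized only when autoOptimize is set.
final class BatchMergeSketch {

    private final List<String> batch = new ArrayList<>();
    private final List<String> physicalIndex = new ArrayList<>();
    private final boolean autoOptimize;
    private int optimizeRuns;

    BatchMergeSketch(boolean autoOptimize) {
        this.autoOptimize = autoOptimize;
    }

    // Buffers a document in memory without touching the physical index.
    void addToBatch(String docId) {
        batch.add(docId);
    }

    // Moves the buffered documents into the physical index, then
    // optimizes if the document base is configured to do so.
    void mergeCurrentBatch() {
        physicalIndex.addAll(batch);
        batch.clear();
        if (autoOptimize) {
            optimizeRuns++; // stand-in for the real optimization work
        }
    }

    int physicalSize() {
        return physicalIndex.size();
    }

    int batchSize() {
        return batch.size();
    }

    int optimizeRuns() {
        return optimizeRuns;
    }

    public static void main(String[] args) {
        BatchMergeSketch base = new BatchMergeSketch(true);
        base.addToBatch("doc1");
        base.addToBatch("doc2");
        base.mergeCurrentBatch();
        System.out.println(base.physicalSize() + " documents merged, "
                + base.optimizeRuns() + " optimization run(s)");
    }
}
```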
public void indexModified()
indexModified
in class SDXDocumentBase
public void splitIndex(boolean currentIndex) throws java.io.IOException, SDXException
Splits the current big index into two smaller ones
splitIndex
in class SDXDocumentBase
java.io.IOException
SDXException
protected void initializeVectorizedIndex() throws org.apache.avalon.framework.configuration.ConfigurationException
Initializes the index vector by searching for all sub-indexes in its directory.
NB: working as intended.
org.apache.avalon.framework.configuration.ConfigurationException
protected void addSubIndex() throws org.apache.avalon.framework.configuration.ConfigurationException
SDXException
- If it's impossible to configure or initialize the sub-index to add.
org.apache.avalon.framework.configuration.ConfigurationException
protected void removeSubIndex()
public boolean splitCheck(boolean currentIndex) throws SDXException
Returns true when the splitting conditions are reached. If so, it should be followed by a splitIndex() call. Controls order:
splitCheck
in class SDXDocumentBase
currentIndex
- boolean indicating whether the test concerns the current index (true) or the active one (false)
true when the splitting conditions are reached, false otherwise.
SDXException
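The check-then-split pattern between splitCheck() and splitIndex() can be sketched as follows. The actual splitting conditions are not listed on this page, so the sketch assumes a single hypothetical threshold on the number of documents indexed since the last split; SplitSketch and maxDocsPerIndex are illustrative names, not SDX members.

```java
// Hypothetical stand-in, not part of SDX: splitCheck() reports when an
// assumed document-count threshold is reached, and splitIndex() starts a
// new sub-index and resets the counter.
final class SplitSketch {

    private final long maxDocsPerIndex; // assumed threshold, not an SDX setting
    private long docsSinceLastSplit;
    private int subIndexCount = 1;

    SplitSketch(long maxDocsPerIndex) {
        this.maxDocsPerIndex = maxDocsPerIndex;
    }

    void documentIndexed() {
        docsSinceLastSplit++;
    }

    // True when the assumed splitting condition is reached.
    boolean splitCheck() {
        return docsSinceLastSplit >= maxDocsPerIndex;
    }

    // Opens a new sub-index and restarts the per-index document count.
    void splitIndex() {
        subIndexCount++;
        docsSinceLastSplit = 0;
    }

    int subIndexCount() {
        return subIndexCount;
    }

    public static void main(String[] args) {
        SplitSketch base = new SplitSketch(3);
        for (int i = 0; i < 7; i++) {
            base.documentIndexed();
            // A positive check is followed immediately by the split.
            if (base.splitCheck()) {
                base.splitIndex();
            }
        }
        System.out.println(base.subIndexCount());
    }
}
```

Keeping the check and the split as separate calls mirrors the two public methods documented here and lets the caller decide when the split actually happens.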
protected long getIndexSize(LuceneIndex index)
index
- LuceneIndex
public org.apache.lucene.search.Searcher getSearcher() throws SDXException
Returns the index searcher for all of this document base's indexes.
SDXException
- If it's not possible to build the MultiSearcher.
ParallelMultiSearcher
public org.apache.lucene.index.IndexReader getIndexReader() throws SDXException
Returns the index reader for all of this document base's indexes.
SDXException
- If it's not possible to build the MultiReader.
MultiReader
protected java.lang.String getFormatedSubIndexId(int subIndexNumber)
subIndexNumber
- int representing the number of the sub-index
protected void addSubIndex(LuceneIndex index) throws SDXException
index
- LuceneIndex
SDXException
- If it's not possible to configure and initialize the sub-index.
protected void renewKeyIndex() throws SDXException
SDXException
- If it's impossible to free resources or initialize the Lucene index.
public void backup(SaveParameters save_config) throws SDXException
backup
in interface Saveable
backup
in class SDXDocumentBase
save_config
- SaveParameters
SDXException
Saveable.backup(fr.gouv.culture.sdx.utils.save.SaveParameters)
protected void backupIndexes(SaveParameters save_config) throws SDXException
backupIndexes
in class SDXDocumentBase
SDXException
protected void backupTimeStamp(SaveParameters save_config) throws SDXException
backupTimeStamp
in class SDXDocumentBase
SDXException
public void restore(SaveParameters save_config) throws SDXException
restore
in interface Saveable
restore
in class SDXDocumentBase
SDXException
Saveable.restore(fr.gouv.culture.sdx.utils.save.SaveParameters)
protected void restoreIndexes(SaveParameters save_config) throws SDXException
restoreIndexes
in class SDXDocumentBase
SDXException
protected void restoreTimeStamp(SaveParameters save_config) throws SDXException
restoreTimeStamp
in class SDXDocumentBase
SDXException
public int docCount()
public void mergeBatch() throws SDXException
mergeBatch
in class SDXDocumentBase
SDXException