fr.gouv.culture.sdx.search.lucene.analysis
Class Analyzer_fr
java.lang.Object
org.apache.lucene.analysis.Analyzer
fr.gouv.culture.sdx.search.lucene.analysis.AbstractAnalyzer
fr.gouv.culture.sdx.search.lucene.analysis.DefaultAnalyzer
fr.gouv.culture.sdx.search.lucene.analysis.Analyzer_fr
- All Implemented Interfaces:
- Analyzer, java.io.Serializable, org.apache.avalon.framework.configuration.Configurable, org.apache.avalon.framework.logger.LogEnabled, org.apache.excalibur.xml.sax.XMLizable
public class Analyzer_fr
- extends DefaultAnalyzer
An analyzer for the French language.
This analyzer performs these tasks:
- all letters are converted to lower case
- stop words are removed; the list of removed words can come from a configuration file, or a default list is used
- accented ISO-8859-1 characters can be converted to their unaccented form
The possible configurations of this analyzer are:
- A list of stop words can be given in the configuration file; otherwise a hardcoded default list is used.
- Accented characters are converted to their unaccented form by default, but this can be overridden with the keepAccent attribute.
- See Also:
- Serialized Form
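The three tasks listed above can be illustrated with a small standalone sketch. This is not the SDX or Lucene implementation: the class name FrenchAnalysisSketch, the analyze and stripAccents helpers, the tokenization rule, and the short stop-word list are all assumptions made for illustration, and the order in which stop words and accents are handled is a guess. The keepAccents parameter only mirrors the idea behind the keepAccent attribute.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; it does not use or reproduce Analyzer_fr.
class FrenchAnalysisSketch {

    // Assumed sample stop words, NOT the default list hardcoded in Analyzer_fr.
    static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("le", "la", "les", "de", "des", "et", "un", "une"));

    // Decompose accented characters (NFD), then drop the combining marks.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Lower-case each token, drop stop words, optionally remove accents.
    static List<String> analyze(String text, boolean keepAccents) {
        List<String> tokens = new ArrayList<>();
        // Assumed tokenization: split on anything that is not a letter or digit.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            String token = raw.toLowerCase(Locale.FRENCH);
            if (STOP_WORDS.contains(token)) {
                continue;
            }
            tokens.add(keepAccents ? token : stripAccents(token));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Unicode escapes spell "Le Musée et les Élèves" independently of file encoding.
        String sample = "Le Mus\u00e9e et les \u00c9l\u00e8ves";
        System.out.println(analyze(sample, false)); // prints [musee, eleves]
        System.out.println(analyze(sample, true));  // same tokens, accents kept
    }
}
```

With accents removed, a query for "musee" and a document containing "Musée" reduce to the same token, which is the practical reason for making unaccenting the default.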
Method Summary
void configure(org.apache.avalon.framework.configuration.Configuration configuration)
Configures this analyzer.
protected java.lang.String getAnalyzerType()
org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldName, java.io.Reader reader)
Builds a chain for filtering words.
Methods inherited from class org.apache.lucene.analysis.Analyzer
getPositionIncrementGap
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ANALYZER_TYPE
protected static final java.lang.String ANALYZER_TYPE
- See Also:
- Constant Field Values
Analyzer_fr
public Analyzer_fr()
getAnalyzerType
protected java.lang.String getAnalyzerType()
- Overrides:
getAnalyzerType
in class DefaultAnalyzer
- See Also:
fr.gouv.culture.sdx.search.lucene.analysis.AbstractAnalyzer#getAnalyserType()
configure
public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
- Configures this analyzer.
- Specified by:
configure
in interface org.apache.avalon.framework.configuration.Configurable
- Overrides:
configure
in class DefaultAnalyzer
- Throws:
org.apache.avalon.framework.configuration.ConfigurationException
tokenStream
public final org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldName,
java.io.Reader reader)
- Builds a chain for filtering words.
The chain is this one :
- Specified by:
tokenStream
in interface Analyzer
- Overrides:
tokenStream
in class DefaultAnalyzer
Copyright © 2000-2010 Ministere de la culture et de la communication / AJLSM. All Rights Reserved.