java.lang.Object
  org.apache.lucene.analysis.Analyzer
    fr.gouv.culture.sdx.search.lucene.analysis.AbstractAnalyzer
      fr.gouv.culture.sdx.search.lucene.analysis.DefaultAnalyzer
        fr.gouv.culture.sdx.search.lucene.analysis.Analyzer_cz
public final class Analyzer_cz
Analyzer for the Czech language. Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.
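The stopword behaviour described above can be illustrated without SDX or Lucene on the classpath. The following is a minimal sketch in plain Java, not the actual `Analyzer_cz` implementation: the class `StopWordSketch`, its methods, and the three sample words are hypothetical and chosen only for illustration, and the real contents of `DEFAULT_STOP_WORDS` are not reproduced here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of SDX or Lucene.
class StopWordSketch {
    // Sample default list (assumption): NOT the real DEFAULT_STOP_WORDS.
    static final String[] SAMPLE_DEFAULT_STOP_WORDS = {"a", "se", "na"};

    private final Set<String> stopWords;

    // No-argument form: fall back to the sample default list.
    StopWordSketch() {
        this(SAMPLE_DEFAULT_STOP_WORDS);
    }

    // A list supplied by the caller replaces the default one.
    StopWordSketch(String[] stopWords) {
        this.stopWords = new HashSet<>(Arrays.asList(stopWords));
    }

    // Reads all text from the reader, splits it on non-letter characters,
    // lowercases each piece, and drops stopwords so that they would
    // never reach an index.
    List<String> tokens(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> result = new ArrayList<>();
        for (String raw : text.toString().split("[^\\p{L}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !stopWords.contains(token)) {
                result.add(token);
            }
        }
        return result;
    }
}
```

Calling `new StopWordSketch().tokens(reader)` filters with the sample defaults, while `new StopWordSketch(new String[] {"pes"})` filters with the caller's list only. This mirrors the choice between the no-argument and stopword-taking constructors listed below.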
Field Summary | |
---|---|
protected static java.lang.String | ANALYZER_TYPE |
static java.lang.String[] | DEFAULT_STOP_WORDS: List of typical stopwords. |
Fields inherited from class fr.gouv.culture.sdx.search.lucene.analysis.DefaultAnalyzer |
---|
ATTRIBUTE_EXCLUDE_STEMS, ATTRIBUTE_USE_STOP_WORDS, EXCLUDE_STEM_ELEMENT, EXCLUDE_STEMS_ELEMENT, excludeTable, stopTable |
Fields inherited from class fr.gouv.culture.sdx.search.lucene.analysis.AbstractAnalyzer |
---|
logger |
Constructor Summary | |
---|---|
Analyzer_cz() | Builds an analyzer. |
Analyzer_cz(java.io.File stopwords) | Builds an analyzer with the given stop words. |
Analyzer_cz(java.util.Set stopwords) | Builds an analyzer with the given stop words. |
Analyzer_cz(java.lang.String[] stopwords) | Builds an analyzer with the given stop words. |
Method Summary | |
---|---|
protected java.lang.String | getAnalyzerType() |
void | loadStopWords(java.io.InputStream wordfile, java.lang.String encoding): Loads stopwords hash from resource stream (file, database...). |
org.apache.lucene.analysis.TokenStream | tokenStream(java.lang.String fieldName, java.io.Reader reader): Creates a TokenStream which tokenizes all the text in the provided Reader. |
Methods inherited from class fr.gouv.culture.sdx.search.lucene.analysis.DefaultAnalyzer |
---|
buildExcludeTable, buildStopTable, configure, getDefaultStopWords, tokenStream |
Methods inherited from class fr.gouv.culture.sdx.search.lucene.analysis.AbstractAnalyzer |
---|
enableLogging, toSAX |
Methods inherited from class org.apache.lucene.analysis.Analyzer |
---|
getPositionIncrementGap |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final java.lang.String ANALYZER_TYPE
public static final java.lang.String[] DEFAULT_STOP_WORDS
Constructor Detail |
---|
public Analyzer_cz()
public Analyzer_cz(java.lang.String[] stopwords)
Parameters: stopwords
public Analyzer_cz(java.util.Set stopwords)
Parameters: stopwords
public Analyzer_cz(java.io.File stopwords) throws java.io.IOException
Parameters: stopwords
Throws: java.io.IOException
Method Detail |
---|
protected java.lang.String getAnalyzerType()
getAnalyzerType in class DefaultAnalyzer
fr.gouv.culture.sdx.search.lucene.analysis.AbstractAnalyzer#getAnalyserType()
public void loadStopWords(java.io.InputStream wordfile, java.lang.String encoding)
wordfile - File containing the word list
encoding - Encoding used (win-1250, iso-8859-2, ...), null for default system encoding
public final org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldName, java.io.Reader reader)
tokenStream in interface Analyzer
tokenStream in class DefaultAnalyzer
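As a rough illustration of the `loadStopWords(InputStream, String)` contract described above (a word list read from a stream, with `null` selecting the default system encoding), here is a sketch in plain Java. It is not the SDX implementation: the class `StopWordLoaderSketch`, the one-word-per-line file format, and the skipping of blank lines are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not part of SDX or Lucene.
class StopWordLoaderSketch {
    // Reads one stopword per line (assumed format) from the stream.
    // A null encoding selects the platform default charset.
    static Set<String> load(InputStream wordfile, String encoding) {
        Charset charset = (encoding == null)
                ? Charset.defaultCharset()
                : Charset.forName(encoding);
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(wordfile, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) { // skip blank lines (assumption)
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

Note that `Charset.forName` expects Java charset names such as `windows-1250` or `ISO-8859-2`. Whether the real method accepts other spellings of these names is not stated in this page.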