gpl.pierrick.brihaye.aramorph.lucene
Class ArabicStemmer

java.lang.Object
  extended by org.apache.lucene.analysis.TokenStream
      extended by org.apache.lucene.analysis.TokenFilter
          extended by gpl.pierrick.brihaye.aramorph.lucene.ArabicStemmer

public class ArabicStemmer
extends org.apache.lucene.analysis.TokenFilter

A stemmer that returns the possible stems for Arabic tokens.


Field Summary
protected  boolean debug
          Whether or not the analyzer should output debug messages
protected  boolean outputBuckwalter
          Whether or not the analyzer should output tokens in the Buckwalter transliteration system
 
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
 
Constructor Summary
ArabicStemmer(org.apache.lucene.analysis.TokenStream input)
          Constructs a stemmer that returns the possible stems for Arabic tokens in the Buckwalter transliteration system.
ArabicStemmer(org.apache.lucene.analysis.TokenStream input, boolean debug)
          Constructs a stemmer that returns the possible stems for Arabic tokens in the Buckwalter transliteration system, optionally printing debug messages.
ArabicStemmer(org.apache.lucene.analysis.TokenStream input, boolean debug, boolean outputBuckwalter)
          Constructs a stemmer that returns the possible stems for Arabic tokens, optionally in the Buckwalter transliteration system.
 
Method Summary
 AraMorph getAramorph()
          Returns the Arabic stemmer (the AraMorph instance) in use.
 org.apache.lucene.analysis.Token next()
          Returns the next token in the stream, or null at end of stream (EOS).
 
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

debug

protected boolean debug
Whether or not the analyzer should output debug messages


outputBuckwalter

protected boolean outputBuckwalter
Whether or not the analyzer should output tokens in the Buckwalter transliteration system

Constructor Detail

ArabicStemmer

public ArabicStemmer(org.apache.lucene.analysis.TokenStream input)
Constructs a stemmer that returns the possible stems for Arabic tokens in the Buckwalter transliteration system.

Parameters:
input - The token stream from a tokenizer

ArabicStemmer

public ArabicStemmer(org.apache.lucene.analysis.TokenStream input,
                     boolean debug)
Constructs a stemmer that returns the possible stems for Arabic tokens in the Buckwalter transliteration system.

Parameters:
input - The token stream from a tokenizer
debug - Whether or not the stemmer should print debug messages to System.out

ArabicStemmer

public ArabicStemmer(org.apache.lucene.analysis.TokenStream input,
                     boolean debug,
                     boolean outputBuckwalter)
Constructs a stemmer that returns the possible stems for Arabic tokens.

Parameters:
input - The token stream from a tokenizer
debug - Whether or not the stemmer should print debug messages to System.out
outputBuckwalter - Whether or not the analyzer should output tokens in the Buckwalter transliteration system
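
The three overloads read like a delegation chain in which the shorter constructors supply defaults for the missing flags. The sketch below illustrates that pattern only. StemmerOptionsSketch is a hypothetical stand-in, not the real ArabicStemmer, and both defaults are assumptions: outputBuckwalter = true is inferred from the one- and two-argument descriptions, and debug = false is a guess that this page does not confirm.

```java
/** Hypothetical stand-in, NOT the real ArabicStemmer: it models only the overload chain. */
public class StemmerOptionsSketch {
    public final String input;              // stands in for the TokenStream argument
    public final boolean debug;
    public final boolean outputBuckwalter;

    // Assumed default (not confirmed by the documentation): no debug output.
    public StemmerOptionsSketch(String input) {
        this(input, false);
    }

    // Assumed default: Buckwalter output, as the shorter constructors' descriptions imply.
    public StemmerOptionsSketch(String input, boolean debug) {
        this(input, debug, true);
    }

    // The full constructor stores every option explicitly.
    public StemmerOptionsSketch(String input, boolean debug, boolean outputBuckwalter) {
        this.input = input;
        this.debug = debug;
        this.outputBuckwalter = outputBuckwalter;
    }

    public static void main(String[] args) {
        StemmerOptionsSketch s = new StemmerOptionsSketch("tokens");
        System.out.println(s.debug + " " + s.outputBuckwalter); // prints "false true"
    }
}
```

Under these assumptions, passing all three arguments is the only way to turn Buckwalter output off.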

Method Detail

getAramorph

public AraMorph getAramorph()
Returns the Arabic stemmer (the AraMorph instance) in use.

Returns:
The Arabic stemmer
See Also:
AraMorph

next

public final org.apache.lucene.analysis.Token next()
                                            throws java.io.IOException
Returns the next token in the stream, or null at end of stream (EOS).

Returns:
The token with its type set to the morphological identification of the stem. Tokens with no grammatical identification have their type set to NO_RESULT. The token's termText is the romanized form of the stem.
Throws:
java.io.IOException - If an I/O problem occurs while reading the input stream
See Also:
Token.type()
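
The contract of next() (null at end of stream, NO_RESULT as the type of unidentified tokens) suggests a simple draining loop on the caller's side. The sketch below is self-contained and does not depend on Lucene or AraMorph: Tok and ListStream are hypothetical stand-ins for Token and the stemmer, the literal string "NO_RESULT" is an assumption about how that type value is spelled, and the sample stems and the type labels NOUN and VERB are placeholders rather than real AraMorph output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class NextLoopSketch {

    /** Hypothetical stand-in for a Lucene token: term text plus a type label. */
    public static final class Tok {
        public final String termText;
        public final String type;
        public Tok(String termText, String type) {
            this.termText = termText;
            this.type = type;
        }
    }

    /** Hypothetical stand-in for the stemmer: next() returns null at end of stream. */
    public static final class ListStream {
        private final Iterator<Tok> it;
        public ListStream(List<Tok> toks) {
            this.it = toks.iterator();
        }
        public Tok next() {
            return it.hasNext() ? it.next() : null;
        }
    }

    /** Drains the stream, keeping only stems that carry a grammatical identification. */
    public static List<String> identifiedStems(ListStream stream) {
        List<String> out = new ArrayList<>();
        Tok t;
        while ((t = stream.next()) != null) {       // null signals end of stream
            if (!"NO_RESULT".equals(t.type)) {      // assumed spelling of the type value
                out.add(t.termText + "/" + t.type);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder stems and type labels, not real analyzer output.
        ListStream s = new ListStream(Arrays.asList(
                new Tok("kitAb", "NOUN"),
                new Tok("xyz", "NO_RESULT"),
                new Tok("katab", "VERB")));
        System.out.println(identifiedStems(s)); // prints "[kitAb/NOUN, katab/VERB]"
    }
}
```

With the real classes the loop should keep the same shape, reading the stem and its identification through the token's termText and type() accessors, and the caller would additionally need to handle the IOException that next() declares.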