public abstract class AbstractParser extends Object implements Parser
Modifier and Type | Field and Description |
---|---|
protected Set<String> | annotatedWords: Set of words already annotated since parser creation or the last call to reset. |
protected Set<String> | exclusions: Set of words excluded from annotation by the user. |
protected boolean | firstOccurrenceOnly: Flag if only the first occurrence of a word should be annotated. |
protected boolean | ignoreNewlines: Flag if newlines in a text should be ignored by the parser. |
protected int | parsePosition: Offset in the array of chars currently parsed. |
Constructor and Description |
---|
AbstractParser(Set<String> exclusions, boolean ignoreNewlines, boolean firstOccurrenceOnly) |
Modifier and Type | Method and Description |
---|---|
int | getParsePosition(): Returns the position in the text the parser is currently parsing. |
protected boolean | ignoreWord(String word): Test if the word should not be annotated, either because it appears in the set of ignored words or the set of already annotated words. |
boolean | isAnnotateFirstOccurrenceOnly(): Test if only the first occurrence of a word should be annotated. |
boolean | isIgnoreNewlines(): Test if the parser skips newlines in the imported text. |
void | reset(): Clears any caches which may have been filled during parsing. |
void | setAnnotateFirstOccurrenceOnly(boolean firstOccurrenceOnly): Set if only the first occurrence of a word should be annotated. |
void | setIgnoreNewlines(boolean ignoreNewlines): Set if the parser should skip newlines in the imported text. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Parser: getLanguage, getName, parse
protected int parsePosition
protected boolean ignoreNewlines
protected Set<String> annotatedWords
Set of words already annotated since parser creation or the last call to reset. If firstOccurrenceOnly is set to false, the variable is set to null. Derived classes are responsible for adding annotated words to this set.
protected boolean firstOccurrenceOnly
public int getParsePosition()
Specified by: getParsePosition in interface Parser
public void reset()
public void setIgnoreNewlines(boolean ignoreNewlines)
Specified by: setIgnoreNewlines in interface Parser
public boolean isIgnoreNewlines()
Specified by: isIgnoreNewlines in interface Parser
public void setAnnotateFirstOccurrenceOnly(boolean firstOccurrenceOnly)
Set if only the first occurrence of a word should be annotated. If set to true, an annotated word will be cached and further occurrences will be ignored. The cache of annotated words will be cleared when reset is called.
Specified by: setAnnotateFirstOccurrenceOnly in interface Parser
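The interaction between the first-occurrence flag, the cache of annotated words, and reset can be illustrated with a minimal sketch. This is not the JGloss source: the class FirstOccurrenceSketch and its annotate helper are hypothetical stand-ins, and the choice to allocate the set when the flag is switched on is an assumption. Only the documented behavior (the set is null while the flag is false, and reset clears the cache) is taken from the descriptions above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in, not the JGloss implementation: it models only the
// first-occurrence flag and the cache of annotated words.
class FirstOccurrenceSketch {
    // Null while the flag is off, as the annotatedWords field description states.
    protected Set<String> annotatedWords;
    protected boolean firstOccurrenceOnly;

    FirstOccurrenceSketch(boolean firstOccurrenceOnly) {
        setAnnotateFirstOccurrenceOnly(firstOccurrenceOnly);
    }

    public void setAnnotateFirstOccurrenceOnly(boolean firstOccurrenceOnly) {
        this.firstOccurrenceOnly = firstOccurrenceOnly;
        if (firstOccurrenceOnly) {
            // Assumption: allocate the cache lazily when the flag is turned on.
            if (annotatedWords == null) {
                annotatedWords = new HashSet<>();
            }
        } else {
            annotatedWords = null;
        }
    }

    public boolean isAnnotateFirstOccurrenceOnly() {
        return firstOccurrenceOnly;
    }

    // Clears the cache so that words may be annotated again.
    public void reset() {
        if (annotatedWords != null) {
            annotatedWords.clear();
        }
    }

    // Assumed helper standing in for a derived class's annotation step.
    // Returns true if the word would be annotated on this call.
    boolean annotate(String word) {
        if (annotatedWords == null) {
            return true; // flag off: every occurrence is annotated
        }
        return annotatedWords.add(word); // false if the word is already cached
    }
}
```

With the flag on, a second call to annotate for the same word returns false until reset is called; with the flag off, every call returns true.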
public boolean isAnnotateFirstOccurrenceOnly()
Specified by: isAnnotateFirstOccurrenceOnly in interface Parser
protected boolean ignoreWord(String word)
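As a rough illustration of the check described for ignoreWord in the method summary, the following sketch tests a word against both the user exclusions and the already annotated words. The class IgnoreWordSketch and the annotateAll helper are hypothetical and are not part of JGloss; treating a null exclusions set as "no exclusions" is likewise an assumption. The helper also shows the documented responsibility of a derived class to add each annotated word to annotatedWords.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative stand-in, not the JGloss implementation.
class IgnoreWordSketch {
    protected final Set<String> exclusions; // assumption: null means no exclusions
    protected Set<String> annotatedWords;   // null when first-occurrence mode is off

    IgnoreWordSketch(Set<String> exclusions, boolean firstOccurrenceOnly) {
        this.exclusions = exclusions;
        this.annotatedWords = firstOccurrenceOnly ? new HashSet<String>() : null;
    }

    // True if the word is excluded by the user or has already been annotated.
    protected boolean ignoreWord(String word) {
        boolean excluded = exclusions != null && exclusions.contains(word);
        boolean seen = annotatedWords != null && annotatedWords.contains(word);
        return excluded || seen;
    }

    // Hypothetical derived-class step: returns the words that would be annotated
    // and records each one in annotatedWords when first-occurrence mode is on.
    List<String> annotateAll(String... words) {
        List<String> annotated = new ArrayList<>();
        for (String word : words) {
            if (ignoreWord(word)) {
                continue;
            }
            annotated.add(word);
            if (annotatedWords != null) {
                annotatedWords.add(word);
            }
        }
        return annotated;
    }
}
```

In this sketch, with "wa" excluded and first-occurrence mode on, annotateAll("neko", "wa", "neko", "inu") returns the list [neko, inu]; with the mode off, the repeated "neko" is kept.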
Copyright © 2001-2013 the JGloss developers. All Rights Reserved.