public abstract class FileBasedDictionary extends Object implements IndexedDictionary, Indexable, BaseEntry.MarkerDictionary
This class provides a framework for implementing dictionaries which are stored in a file.
It is assumed that the dictionary file is a sequence of dictionary entries which are separated by
byte-sized entry separator markers. Each entry is in turn subdivided into several fields like word,
reading or translation. Field division markers can be more complex. Common tasks like index
management, searching and search type management are implemented in this class. Derived classes
which implement specific dictionary formats must implement the abstract methods which deal
with the differences in file formats, especially the parsing of dictionary entries to
DictionaryEntry instances.
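The file layout described above can be illustrated with a small sketch that locates entries by scanning for a one-byte separator. The class name `EntrySplitter`, the newline separator and the sample data are assumptions made for illustration; they are not part of JGloss, where the actual parsing is left to the format-specific subclasses.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JGloss.
final class EntrySplitter {
    /** Returns the start offset of every non-empty entry; entries are separated by one separator byte. */
    static List<Integer> entryStarts(byte[] data, byte separator) {
        List<Integer> starts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == separator) {
                if (i > start) {
                    starts.add(start); // skip empty entries
                }
                start = i + 1;
            }
        }
        if (start < data.length) {
            starts.add(start); // last entry without a trailing separator
        }
        return starts;
    }

    /** Decodes the entry beginning at start, up to the next separator or the end of the data. */
    static String entryAt(byte[] data, int start, byte separator) {
        int end = start;
        while (end < data.length && data[end] != separator) {
            end++;
        }
        return new String(data, start, end - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] file = "inu /dog/\nneko /cat/\n".getBytes(StandardCharsets.UTF_8);
        for (int start : entryStarts(file, (byte) '\n')) {
            System.out.println(start + ": " + entryAt(file, start, (byte) '\n'));
        }
    }
}
```

The start offsets returned here play the same role as the marker accepted by createEntryFromMarker, which is documented as the start offset of an entry.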
Nested classes/interfaces inherited from interface Indexable: Indexable.CharData
Modifier and Type | Field and Description |
---|---|
protected EncodedCharacterHandler | characterHandler - Stores the character handler created by a call to createCharacterHandler and used throughout this class. |
protected static ResourceBundle | NAMES - Localized messages and strings for the dictionary implementations. |
protected Map<Attribute<?>,Set<AttributeValue>> | supportedAttributes - Set of attributes supported by this dictionary implementation. |
protected Map<SearchMode,SearchFieldSelection> | supportedSearchModes - Stores the supported search modes of this dictionary. |
Modifier | Constructor and Description |
---|---|
protected | FileBasedDictionary(FileBasedDictionaryStructure structure, jgloss.dictionary.filebased.EntryParser entryParser, File _dicfile, String _encoding) - Initializes the dictionary. |
Modifier and Type | Method and Description |
---|---|
void | buildIndex() - Rebuild the index or add missing index data to an already existing index file. |
int | compare(ByteBuffer data, int position) - Compare the data in a buffer to an index entry. |
int | compare(int pos1, int pos2) - Compares two index entries. |
DictionaryEntry | createEntryFromMarker(int marker) - Create a dictionary entry from a marker, which is the start offset of the entry. |
void | dispose() - Called when the dictionary is no longer needed. |
protected abstract boolean | escapeChar(char c) - Test if a character must be escaped if it is to be used in a dictionary entry. |
<T extends AttributeValue> Set<T> | getAttributeValues(Attribute<T> att) - Return the set of known attribute values for an attribute. |
Indexable.CharData | getChar(int position, Indexable.CharData result) - Decode the character at a given position in the indexable data. |
EncodedCharacterHandler | getEncodedCharacterHandler() - Return a character handler which understands the character encoding format used by this dictionary. |
String | getName() - Returns the name of this dictionary. |
Set<Attribute<?>> | getSupportedAttributes() - Get a set of all attributes used by this dictionary. |
SearchFieldSelection | getSupportedFields(SearchMode mode) - Return the search fields for which a search of the given mode is supported for this dictionary implementation. |
protected void | initSearchModes() - Initialize the map of search modes supported by this dictionary implementation. |
protected void | initSupportedAttributes() - Initialize the set of supported attributes. |
boolean | loadIndex() - Load the index for the dictionary. |
Iterator<DictionaryEntry> | search(SearchMode searchmode, Object[] parameters) - Searches for entries in the dictionary. |
boolean | supports(SearchMode mode, boolean fully) - Test if this dictionary supports searches of a certain type. |
protected String | unescape(String str) - Replace any escape sequences in the string by the character represented. |
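
protected static final ResourceBundle NAMES
Localized messages and strings for the dictionary implementations. Resource bundle: resources/messages-dictionary.

protected final EncodedCharacterHandler characterHandler
Stores the character handler created by a call to createCharacterHandler and used throughout this class.

protected final Map<SearchMode,SearchFieldSelection> supportedSearchModes
Stores the supported search modes of this dictionary. See initSearchModes.

protected final Map<Attribute<?>,Set<AttributeValue>> supportedAttributes
Set of attributes supported by this dictionary implementation. See initSupportedAttributes.

protected FileBasedDictionary(FileBasedDictionaryStructure structure, jgloss.dictionary.filebased.EntryParser entryParser, File _dicfile, String _encoding) throws IOException
Initializes the dictionary. Before the dictionary can be used, loadIndex must be successfully called.
Parameters:
_dicfile - File which holds the dictionary.
_encoding - Character encoding of the dictionary file.
Throws:
IOException - if the dictionary or the index file cannot be read.

protected void initSearchModes()
Initialize the map of search modes supported by this dictionary implementation.

protected void initSupportedAttributes()
Initialize the set of supported attributes.
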
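public boolean supports(SearchMode mode, boolean fully)
Description copied from interface: Dictionary
Test if this dictionary supports searches of a certain type. Calling search with an unsupported search mode will throw an exception.
Specified by: supports in interface Dictionary
Parameters:
mode - The search mode to test.
fully - If true, test if the search mode is fully supported; if false, test if it is partially supported.

public Set<Attribute<?>> getSupportedAttributes()
Description copied from interface: Dictionary
Get a set of all attributes used by this dictionary.
Specified by: getSupportedAttributes in interface Dictionary
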
public <T extends AttributeValue> Set<T> getAttributeValues(Attribute<T> att)
Description copied from interface: Dictionary
Return the set of known attribute values for an attribute. Known values are constant values such as those of PartOfSpeech attributes. An example of non-constant values, which will not be returned by this method, is InformationAttributeValues. In this case, an empty set is returned. For unsupported attributes, null will be returned.
Specified by: getAttributeValues in interface Dictionary

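public SearchFieldSelection getSupportedFields(SearchMode mode)
Description copied from interface: Dictionary
Return the search fields for which a search of the given mode is supported for this dictionary implementation. If the search mode takes a SearchFieldSelection parameter, at least one search field must be selected in the SearchFieldSelection object returned.
Specified by: getSupportedFields in interface Dictionary
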
public boolean loadIndex() throws IndexException
Description copied from interface: IndexedDictionary
Load the index for the dictionary.
Specified by: loadIndex in interface IndexedDictionary
Returns:
true if the index was loaded successfully, false if the index does not exist, does not contain all needed index data or is damaged. In this case, buildIndex must be called.
Throws:
IndexException - if reading the index failed for an unforeseeable reason. In this case, calling buildIndex will likely also fail and the dictionary object can't be used.

public void buildIndex() throws IndexException
Description copied from interface: IndexedDictionary
Rebuild the index or add missing index data to an already existing index file. Calling loadIndex after buildIndex is not necessary.
Specified by: buildIndex in interface IndexedDictionary
Throws:
IndexException - if the index creation failed.

public EncodedCharacterHandler getEncodedCharacterHandler()
Return a character handler which understands the character encoding format used by this dictionary.
Specified by: getEncodedCharacterHandler in interface Indexable

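The loadIndex and buildIndex contract documented above suggests a simple call pattern: try to load the index and rebuild it only when loading reports a missing, incomplete or damaged index. The sketch below uses a stand-in interface, `IndexedDictLike`, and a helper class, `IndexSetup`; both are hypothetical and only mirror the two method names. The real IndexedDictionary methods additionally declare IndexException, which is omitted here to keep the example self-contained.

```java
// Hypothetical stand-in that mirrors the two IndexedDictionary methods by name.
// The real JGloss methods also declare "throws IndexException".
interface IndexedDictLike {
    boolean loadIndex();
    void buildIndex();
}

// Hypothetical helper, not part of JGloss.
final class IndexSetup {
    /**
     * Loads the index and rebuilds it if loadIndex returns false.
     * Returns true if a rebuild was needed.
     */
    static boolean ensureIndex(IndexedDictLike dictionary) {
        if (dictionary.loadIndex()) {
            return false; // index loaded, nothing else to do
        }
        dictionary.buildIndex(); // per the documentation, no loadIndex call is needed afterwards
        return true;
    }
}
```

With the real classes, a caller would presumably also handle IndexException, since the documentation states that the dictionary object cannot be used when loading fails for an unforeseeable reason.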
public Iterator<DictionaryEntry> search(SearchMode searchmode, Object[] parameters) throws SearchException
Description copied from interface: Dictionary
Searches for entries in the dictionary.
Specified by: search in interface Dictionary
Parameters:
searchmode - The requested search mode. The search mode must be supported by this dictionary.
parameters - Search parameters as required by the searchmode. The parameters must be valid for the selected search mode according to List.isValid.
Throws:
SearchException - if the search mode is not supported or there was an error during the search.

public DictionaryEntry createEntryFromMarker(int marker) throws SearchException
Create a dictionary entry from a marker, which is the start offset of the entry. Used by BaseEntryRef to recreate a dictionary entry.
Specified by: createEntryFromMarker in interface BaseEntry.MarkerDictionary
Throws:
SearchException

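protected abstract boolean escapeChar(char c)
Test if a character must be escaped if it is to be used in a dictionary entry.
Parameters:
c - The character to test.
Returns:
true, if the character must be escaped.

protected String unescape(String str)
Replace any escape sequences in the string by the character represented. See also escape. This implementation calls StringTools.unicodeUnescape.

public void dispose()
Description copied from interface: Dictionary
Called when the dictionary is no longer needed.
Specified by: dispose in interface Dictionary
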
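The unescape method above is documented as replacing escape sequences by the character they represent, delegating to StringTools.unicodeUnescape. The behaviour of that JGloss utility is not shown on this page, so the following is only a sketch under the assumption that an escape sequence has the form `\uXXXX` with four hexadecimal digits. The class `UnescapeSketch` is hypothetical.

```java
// Hypothetical sketch; the real StringTools.unicodeUnescape may behave differently.
final class UnescapeSketch {
    /** Replaces each backslash, 'u' and four hex digits with the character they encode. */
    static String unicodeUnescape(String str) {
        StringBuilder out = new StringBuilder(str.length());
        int i = 0;
        while (i < str.length()) {
            char c = str.charAt(i);
            if (c == '\\' && i + 6 <= str.length()
                    && str.charAt(i + 1) == 'u' && isHex(str, i + 2, i + 6)) {
                out.append((char) Integer.parseInt(str.substring(i + 2, i + 6), 16));
                i += 6; // skip the whole six-character escape sequence
            } else {
                out.append(c); // ordinary character or incomplete sequence: copy as-is
                i++;
            }
        }
        return out.toString();
    }

    /** Returns true if every character of s in [from, to) is a hexadecimal digit. */
    private static boolean isHex(String s, int from, int to) {
        for (int k = from; k < to; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Incomplete or malformed sequences are copied unchanged in this sketch; whether the real implementation does the same is not stated in the documentation.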
public String getName()
Returns the name of this dictionary.
Specified by: getName in interface Dictionary

public int compare(int pos1, int pos2) throws IndexException
Description copied from interface: Indexable
Compares two index entries. The ordering is defined by the Indexable object. Since the index entries are usually strings of text, the ordering is expected to be lexicographical, although there is no guarantee that the compare method will impose the same ordering as String.compareTo.
Specified by: compare in interface Indexable
Parameters:
pos1 - Position of the first index entry.
pos2 - Position of the second index entry.
Returns:
<0 if the entry at pos1 is smaller than the entry at pos2; >0 if it is greater and 0 if the entries at both positions are equal.
Throws:
IndexException

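To make the role of compare(int, int) more concrete, the sketch below orders entry start offsets by comparing the bytes found at those offsets. It is a simplified illustration: the class `IndexOrderSketch`, the separator-terminated entries and the unsigned byte-wise ordering are all assumptions, and the real implementation may compare decoded characters instead of raw bytes.

```java
import java.util.Arrays;

// Hypothetical sketch, not the JGloss implementation.
final class IndexOrderSketch {
    /** Compares the separator-terminated entries starting at pos1 and pos2 byte by byte. */
    static int compare(byte[] data, int pos1, int pos2, byte separator) {
        while (true) {
            boolean end1 = pos1 >= data.length || data[pos1] == separator;
            boolean end2 = pos2 >= data.length || data[pos2] == separator;
            if (end1 || end2) {
                // Both ended: equal. Otherwise the shorter entry sorts first.
                return end1 == end2 ? 0 : (end1 ? -1 : 1);
            }
            int a = data[pos1++] & 0xFF; // compare as unsigned bytes
            int b = data[pos2++] & 0xFF;
            if (a != b) {
                return a < b ? -1 : 1;
            }
        }
    }

    /** Returns a copy of the entry offsets, sorted by the ordering that compare defines. */
    static Integer[] sortedIndex(byte[] data, Integer[] offsets, byte separator) {
        Integer[] index = offsets.clone();
        Arrays.sort(index, (p1, p2) -> compare(data, p1, p2, separator));
        return index;
    }
}
```

A sorted array of offsets like this is one plausible shape for an index that can be searched by binary search; the actual index file format is not described here.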
public int compare(ByteBuffer data, int position) throws IndexException
Description copied from interface: Indexable
Compare the data in a buffer to an index entry. The buffer must use the same encoding as the index entries: for example, if the index entries are stored by the Indexable class as EUC-JP encoded text, the buffer must also contain EUC-JP encoded text. The ordering by this compare method must be consistent with compare(int,int). The only allowed difference is that comparisons may be truncated to the length of the buffer; i.e., a comparison may return equality even if the index entry is longer than the data in the buffer (the buffer data is a prefix of the index entry). This is allowed to make substring searches possible.
Specified by: compare in interface Indexable
Returns:
<0 if the data in the buffer is smaller than the index entry; >0 if it is greater and 0 if the buffer data and the entry are identical.
Throws:
IndexException

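The truncation rule described for compare(ByteBuffer, int) can be sketched as a comparison that stops when the buffer is exhausted, so that a buffer which is a prefix of the index entry compares as equal. The class `PrefixCompare`, the in-memory byte array standing in for the dictionary file and the raw unsigned byte comparison are assumptions for illustration only.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not the JGloss implementation.
final class PrefixCompare {
    /**
     * Compares the remaining bytes of data with the bytes of dictionary starting at position.
     * Returns 0 when all buffer bytes match, even if the entry continues (prefix match).
     */
    static int compare(ByteBuffer data, byte[] dictionary, int position) {
        ByteBuffer view = data.duplicate(); // leave the caller's buffer position untouched
        int pos = position;
        while (view.hasRemaining()) {
            if (pos >= dictionary.length) {
                return 1; // buffer is longer than the remaining dictionary data
            }
            int a = view.get() & 0xFF;        // compare as unsigned bytes
            int b = dictionary[pos++] & 0xFF;
            if (a != b) {
                return a < b ? -1 : 1;
            }
        }
        return 0; // every buffer byte matched: identical or prefix
    }
}
```

Because equality is reported for any entry that begins with the buffer contents, a binary search built on such a comparison would find the entries starting with the search text, which matches the stated purpose of the truncation rule.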
public Indexable.CharData getChar(int position, Indexable.CharData result) throws IndexException
Description copied from interface: Indexable
Decode the character at a given position in the indexable data.
Specified by: getChar in interface Indexable
Parameters:
result - The result of the method invocation will be stored in this object. This avoids creating a new object every time the method is invoked. If null is passed, a new instance will be created.
Throws:
IndexException

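The result parameter of getChar follows a common reuse pattern: the caller passes in a holder object that is filled and returned, so a loop over many positions allocates nothing. The sketch below shows that pattern with a hypothetical `CharData` holder and single-byte characters; the fields of the real Indexable.CharData and the actual decoding logic are not documented on this page and may differ.

```java
// Hypothetical sketch of the result-reuse pattern; not the JGloss classes.
final class CharDataSketch {
    /** Hypothetical result holder; the fields of the real Indexable.CharData may differ. */
    static final class CharData {
        int character;    // decoded character
        int nextPosition; // position of the byte following the character
    }

    /** Decodes the single-byte character at position, reusing result when it is non-null. */
    static CharData getChar(byte[] data, int position, CharData result) {
        if (result == null) {
            result = new CharData(); // only allocate when the caller did not supply a holder
        }
        result.character = data[position] & 0xFF;
        result.nextPosition = position + 1;
        return result;
    }

    /** Counts occurrences of target while reusing one CharData instance for every call. */
    static int count(byte[] data, char target) {
        CharData holder = new CharData();
        int matches = 0;
        int pos = 0;
        while (pos < data.length) {
            getChar(data, pos, holder);
            if (holder.character == target) {
                matches++;
            }
            pos = holder.nextPosition;
        }
        return matches;
    }
}
```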
Copyright © 2001-2013 the JGloss developers. All Rights Reserved.