public interface EncodedCharacterHandler
The methods use ints to represent character values. Implementing classes may guarantee that only values in char or short range are used.
Implementation classes of this interface do not have to use Unicode code points as character values; in this case directly typecasting the values to a Java char will not give the expected results. The only guarantee is that the ordering of int character values is equivalent to the alphabetical ordering of the represented characters.
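To illustrate why a direct cast to char can fail, here is a minimal sketch of a decoder that returns raw two-byte EUC-JP codes rather than Unicode code points. The class RawEucJpSketch and its readRaw method are hypothetical and not part of JGloss; the sketch also ignores the EUC-JP prefixes 0x8E and 0x8F for brevity.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: returns raw EUC-JP codes instead of Unicode code points. */
class RawEucJpSketch {
    /** Reads one character; a two-byte sequence is packed into a single int. */
    static int readRaw(ByteBuffer buffer) {
        int first = buffer.get() & 0xFF;
        if (first < 0x80) {
            return first; // single-byte ASCII
        }
        int second = buffer.get() & 0xFF;
        return (first << 8) | second;
    }

    public static void main(String[] args) {
        // EUC-JP bytes for hiragana A (U+3042) followed by hiragana I (U+3044)
        ByteBuffer buffer = ByteBuffer.wrap(
                new byte[] {(byte) 0xA4, (byte) 0xA2, (byte) 0xA4, (byte) 0xA4});
        int a = readRaw(buffer);
        int i = readRaw(buffer);
        System.out.println((char) a == '\u3042'); // false: the cast does not yield hiragana A
        System.out.println(a < i);                // true: the int ordering still follows kana order
    }
}
```

With values like these, callers can compare the ints with one another, but a real Unicode character has to come from the charset named by the handler.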
Modifier and Type | Method and Description |
---|---|
boolean | canEncode(char c): Returns whether or not the character encoding can encode the given character. |
int | convertCharacter(int character): Modify a character returned by readCharacter to make different character classes compare equal. |
CharacterClass | getCharacterClass(int character, boolean inWord): Test the character class of a character returned by readCharacter. |
String | getEncodingName(): Return the name of the encoding supported by this handler. |
int | readCharacter(ByteBuffer buffer): Decode the character at the current buffer position. |
int | readPreviousCharacter(ByteBuffer buffer): Decode the character before the character at the current buffer position. |
int readCharacter(ByteBuffer buffer) throws BufferUnderflowException, IndexOutOfBoundsException, CharacterCodingException
Decode the character at the current buffer position. After the call, the buffer's position() will be at the start of the next character.
buffer - The buffer which contains the encoded character.
BufferUnderflowException - if the end of the buffer is reached before a character is completely decoded.
CharacterCodingException - if the bytes at the current buffer position are not a legal encoded character.
IndexOutOfBoundsException
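The forward-reading contract can be sketched with a stand-in decoder. The class ForwardReadSketch below is hypothetical: it handles only US-ASCII and is not the JGloss implementation, but it advances the buffer position by one character per call and reports illegal bytes with a CharacterCodingException subclass.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the readCharacter contract, restricted to US-ASCII. */
class ForwardReadSketch {
    static int readCharacter(ByteBuffer buffer) throws CharacterCodingException {
        byte b = buffer.get(); // relative get: throws BufferUnderflowException if nothing remains
        if (b < 0) {
            // Bytes 0x80..0xFF are not legal US-ASCII.
            throw new MalformedInputException(1);
        }
        return b;
    }

    /** Decodes every character from the current position up to the buffer limit. */
    static List<Integer> readAll(ByteBuffer buffer) throws CharacterCodingException {
        List<Integer> characters = new ArrayList<>();
        while (buffer.hasRemaining()) {
            characters.add(readCharacter(buffer));
        }
        return characters;
    }

    public static void main(String[] args) throws CharacterCodingException {
        ByteBuffer buffer = ByteBuffer.wrap("abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readAll(buffer)); // prints [97, 98, 99]
    }
}
```

Checking hasRemaining() before each call keeps the loop from relying on BufferUnderflowException for normal termination.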
int readPreviousCharacter(ByteBuffer buffer) throws BufferUnderflowException, IndexOutOfBoundsException, CharacterCodingException
Decode the character before the character at the current buffer position. After the call, the buffer's position() will be at the start of the character returned. Calling this method multiple times will effectively read the encoded string backwards.
buffer - The buffer which contains the encoded character.
BufferUnderflowException - if the end of the buffer is reached before a character is completely decoded.
CharacterCodingException - if the bytes at the current buffer position are not a legal encoded character.
IndexOutOfBoundsException
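Reading backwards by repeated calls might look like the following sketch. BackwardReadSketch is hypothetical and assumes a single-byte ISO-8859-1 encoding, where the byte value happens to equal the Unicode code point; that is a property of this stand-in, not of the interface. Stepping before the start of the buffer surfaces here as the IndexOutOfBoundsException thrown by ByteBuffer's absolute get, which is an assumption about how an implementation would behave.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the readPreviousCharacter contract (ISO-8859-1 only). */
class BackwardReadSketch {
    static int readPreviousCharacter(ByteBuffer buffer) {
        int previous = buffer.position() - 1;
        // Absolute get: throws IndexOutOfBoundsException if previous is negative.
        int value = buffer.get(previous) & 0xFF;
        // Leave the position at the start of the character just returned.
        buffer.position(previous);
        return value;
    }

    /** Reads the whole buffer from its limit back to index 0. */
    static String readBackwards(ByteBuffer buffer) {
        StringBuilder reversed = new StringBuilder();
        buffer.position(buffer.limit());
        while (buffer.position() > 0) {
            // The cast is valid only because this sketch uses ISO-8859-1 values.
            reversed.append((char) readPreviousCharacter(buffer));
        }
        return reversed.toString();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap("abc".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readBackwards(buffer)); // prints cba
    }
}
```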
int convertCharacter(int character)
Modify a character returned by readCharacter to make different character classes compare equal. This is used for searching and indexing to treat certain character classes as identical with respect to comparison. Examples are uppercase and lowercase western characters, or katakana and hiragana. Which characters are converted depends on the class implementing this interface and may be configured by modifying the object's state.
character - The character to convert, as returned by readCharacter
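One possible conversion is sketched below. ConvertSketch is hypothetical and assumes the character values are Unicode code points, which the interface does not require. The two flags illustrate configuration through object state, and the direction of the kana mapping (katakana to hiragana) is an arbitrary choice for this example.

```java
/** Hypothetical sketch of convertCharacter over Unicode code points. */
class ConvertSketch {
    // Configuration held in object state; the field names are illustrative only.
    boolean foldCase = true;
    boolean foldKana = true;

    int convertCharacter(int character) {
        // Katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana.
        if (foldKana && character >= 0x30A1 && character <= 0x30F6) {
            return character - 0x60;
        }
        if (foldCase) {
            return Character.toLowerCase(character);
        }
        return character;
    }

    public static void main(String[] args) {
        ConvertSketch handler = new ConvertSketch();
        System.out.println(handler.convertCharacter('A') == 'a');       // true
        System.out.println(handler.convertCharacter(0x30AB) == 0x304B); // true: katakana KA maps to hiragana KA
    }
}
```

Applying the conversion to both the search key and the indexed text lets the two compare equal without altering the stored data.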
CharacterClass getCharacterClass(int character, boolean inWord)
Test the character class of a character returned by readCharacter. These are specialized character classes which do not directly map to any Unicode character classes. They are used during index creation to decide if the current character is part of an indexable word.
character - The character to test.
inWord - true, if the character before the current character was in character class ROMAN_WORD. This may influence the character class of the tested character.
boolean canEncode(char c)
Returns whether or not the character encoding can encode the given character.
String getEncodingName()
Return the name of the encoding supported by this handler.
java.nio.charset.Charset.
Copyright © 2001-2013 the JGloss developers. All Rights Reserved.