//readium-shared/org.readium.r2.shared.util.tokenizer
Package-level declarations
Types
| Name | Summary |
|---|---|
| DefaultTextContentTokenizer | [androidJvm] `class DefaultTextContentTokenizer : Tokenizer<String, IntRange>` A default cluster `TextTokenizer` taking advantage of the best capabilities of each Android version. |
| IcuTextTokenizer | [androidJvm] `@RequiresApi(value = 24) class IcuTextTokenizer(language: Language?, unit: TextUnit) : Tokenizer<String, IntRange>` Implementation of a `TextTokenizer` that uses ICU components to perform the actual tokenization while taking language-specific rules into account. |
| NaiveTextTokenizer | [androidJvm] `class NaiveTextTokenizer(unit: TextUnit) : Tokenizer<String, IntRange>` A naive `Tokenizer` relying on `java.text.BreakIterator` to split the content. |
| TextTokenizer | [androidJvm] `typealias TextTokenizer = Tokenizer<String, IntRange>` A tokenizer splitting a `String` into range tokens (e.g. words, sentences, etc.). |
| TextUnit | [androidJvm] `enum TextUnit : Enum<TextUnit>` A text token unit which can be used with a `TextTokenizer`. |
| Tokenizer | [androidJvm] `fun interface Tokenizer<D, T>` A tokenizer splits a piece of data `D` into a list of `T` tokens. |
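To illustrate how the `Tokenizer<String, IntRange>` shape can be used, here is a minimal, self-contained sketch. It does not use the Readium classes: it declares a local stand-in for the `Tokenizer` interface (the method name `tokenize` is an assumption, since this page does not show it) and a hypothetical `wordTokenizer` helper built on `java.text.BreakIterator`, similar in spirit to what `NaiveTextTokenizer` is described as doing. Filtering out whitespace and punctuation segments is a choice made for this example, not documented library behavior.

```kotlin
import java.text.BreakIterator
import java.util.Locale

// Local stand-in mirroring the documented declaration `fun interface Tokenizer<D, T>`.
// ASSUMPTION: the single abstract method is named `tokenize` and returns List<T>.
fun interface Tokenizer<D, T> {
    fun tokenize(data: D): List<T>
}

// Same alias as documented: a tokenizer splitting a String into range tokens.
typealias TextTokenizer = Tokenizer<String, IntRange>

// Hypothetical helper (not part of the library): splits text into word ranges
// using java.text.BreakIterator.
fun wordTokenizer(locale: Locale = Locale.ENGLISH): TextTokenizer =
    Tokenizer<String, IntRange> { text ->
        val iterator = BreakIterator.getWordInstance(locale)
        iterator.setText(text)

        val ranges = mutableListOf<IntRange>()
        var start = iterator.first()
        var end = iterator.next()
        while (end != BreakIterator.DONE) {
            // Keep only segments containing a letter or digit,
            // which skips whitespace and punctuation segments.
            if (text.substring(start, end).any { it.isLetterOrDigit() }) {
                ranges += start until end
            }
            start = end
            end = iterator.next()
        }
        ranges
    }

fun main() {
    val text = "Hello, world!"
    val tokens = wordTokenizer().tokenize(text)
    // Each token is an IntRange into the original string.
    println(tokens.map { text.substring(it) })
}
```

Returning `IntRange` values rather than substrings lets a caller map each token back to its position in the original text, for example to highlight a word or sentence in place.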