Uses of Class
org.apache.lucene.util.AttributeSource
Packages that use AttributeSource
Package
Description
Text analysis.
Analyzer for Arabic.
Analyzer for Bulgarian.
Analyzer for Bengali Language.
Provides various convenience classes for creating boosts on Tokens.
Analyzer for Brazilian Portuguese.
Analyzer for Chinese, Japanese, and Korean, which indexes bigrams.
Analyzer for Sorani Kurdish.
Fast, general-purpose grammar-based tokenizers.
Analyzer for Simplified Chinese, which indexes words.
Construct n-grams for frequently occurring terms and phrases.
A filter that decomposes compound words found in many Germanic languages into their word parts.
Basic, general-purpose analysis components.
Analyzer for Czech.
Analyzer for German.
Analyzer for Greek.
Fast, general-purpose tokenizers for URLs and email addresses.
Analyzer for English.
Analyzer for Spanish.
Analyzer for Persian.
Analyzer for Finnish.
Analyzer for French.
Analyzer for Irish.
Analyzer for Galician.
Analyzer for Hindi.
Analyzer for Hungarian.
A Java implementation of Hunspell stemming and spell-checking algorithms (Hunspell), and a stemming TokenFilter (HunspellStemFilter) based on it.
Analysis components based on ICU
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
Analyzer for Indonesian.
Analyzer for Indian languages.
Analyzer for Italian.
Analyzer for Japanese.
Analyzer for Korean.
Analyzer for Latvian.
MinHash filtering (for LSH).
Miscellaneous Tokenstreams.
Character n-gram tokenizers and filters.
Analyzer for Norwegian.
Analysis components for path-like strings such as filenames.
Set of components for pattern-based (regex) analysis.
Provides various convenience classes for creating payloads on Tokens.
Analysis components for phonetic search.
Analyzer for Portuguese.
Filter to reverse token text.
Analyzer for Russian.
Word n-gram filters.
Analyzer for Serbian.
Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
Stempel: Algorithmic Stemmer
Analyzer for Swedish.
Analysis components for Synonyms.
Analysis components for Synonyms using Word2Vec model.
Analyzer for Telugu Language.
Analyzer for Thai.
Analyzer for Turkish.
Utility functions for text analysis.
Tokenizer that is aware of Wikipedia syntax.
Codecs API: API for customization of the encoding and structure of the index.
Pluggable term index / block terms dictionary implementations.
The logical representation of a Document for indexing and searching.
Code to maintain and access indices.
Monitoring framework
Code to search indices.
Highlighting search terms.
Support for index-time and query-time joins.
Analyzer based autosuggest.
Support for document suggestion
The UnifiedHighlighter -- a flexible highlighter that can get offsets from postings, term vectors, or analysis.
Some utility classes.
Utility classes for working with token streams as graphs.
Uses of AttributeSource in org.apache.lucene.analysis
Subclasses of AttributeSource in org.apache.lucene.analysis
Modifier and Type | Class | Description
private static final class | |
private static class | | Token Stream that outputs tokens from a topo sorted graph.
final class | | This class can be used if the token attributes of a TokenStream are intended to be consumed more than once.
class | | Abstract base class for TokenFilters that may remove tokens.
class | | An abstract TokenFilter that exposes its input stream as a graph.
class | | Normalizes token text to lower case.
class | | Removes stop words from a token stream.
class | | A TokenFilter is a TokenStream whose input is another TokenStream.
class | | A Tokenizer is a TokenStream whose input is a Reader.
class | |
Fields in org.apache.lucene.analysis declared as AttributeSource
Modifier and Type | Field | Description
(package private) final AttributeSource | GraphTokenFilter.Token.attSource |
Methods in org.apache.lucene.analysis with parameters of type AttributeSource
Modifier and Type | Method | Description
(package private) void | GraphTokenFilter.Token.reset(AttributeSource attSource) |
Constructors in org.apache.lucene.analysis with parameters of type AttributeSource
Modifier | Constructor | Description
(package private) | Token(AttributeSource attSource) |
protected | TokenStream(AttributeSource input) | A TokenStream that uses the same attributes as the supplied one.
Uses of AttributeSource in org.apache.lucene.analysis.ar
Subclasses of AttributeSource in org.apache.lucene.analysis.ar
Modifier and Type | Class | Description
final class | | A TokenFilter that applies ArabicNormalizer to normalize the orthography.
final class | | A TokenFilter that applies ArabicStemmer to stem Arabic words.
Uses of AttributeSource in org.apache.lucene.analysis.bg
Subclasses of AttributeSource in org.apache.lucene.analysis.bg
Modifier and Type | Class | Description
final class | | A TokenFilter that applies BulgarianStemmer to stem Bulgarian words.
Uses of AttributeSource in org.apache.lucene.analysis.bn
Subclasses of AttributeSource in org.apache.lucene.analysis.bn
Modifier and Type | Class | Description
final class | | A TokenFilter that applies BengaliNormalizer to normalize the orthography.
final class | | A TokenFilter that applies BengaliStemmer to stem Bengali words.
Uses of AttributeSource in org.apache.lucene.analysis.boost
Subclasses of AttributeSource in org.apache.lucene.analysis.boost
Modifier and Type | Class | Description
final class | | Characters before the delimiter are the "token", those after are the boost.
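The "token, delimiter, boost" convention described above can be illustrated without Lucene. The following is a plain-JDK sketch, not the filter's implementation: the class name DelimitedBoostSketch, the method name parse, and the fallback boost of 1.0 for input without a delimiter are assumptions made for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

class DelimitedBoostSketch {
    /**
     * Splits "term<delimiter>boost" at the first delimiter.
     * Input without a delimiter gets boost 1.0 (an assumed default).
     */
    static Map.Entry<String, Float> parse(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            return new SimpleEntry<>(token, 1.0f);
        }
        String term = token.substring(0, idx);
        float boost = Float.parseFloat(token.substring(idx + 1));
        return new SimpleEntry<>(term, boost);
    }

    public static void main(String[] args) {
        Map.Entry<String, Float> e = parse("lucene|2.5", '|');
        System.out.println(e.getKey() + " -> " + e.getValue());
    }
}
```

In an actual analysis chain the boost would be carried on a token attribute rather than returned as a pair; the pair is used here only to keep the sketch self-contained.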
Uses of AttributeSource in org.apache.lucene.analysis.br
Subclasses of AttributeSource in org.apache.lucene.analysis.br
Uses of AttributeSource in org.apache.lucene.analysis.cjk
Subclasses of AttributeSource in org.apache.lucene.analysis.cjk
Modifier and Type | Class | Description
final class | | Forms bigrams of CJK terms that are generated from StandardTokenizer or ICUTokenizer.
final class | | A TokenFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent Basic Latin, and folds halfwidth Katakana variants into the equivalent kana.
Uses of AttributeSource in org.apache.lucene.analysis.ckb
Subclasses of AttributeSource in org.apache.lucene.analysis.ckb
Modifier and Type | Class | Description
final class | | A TokenFilter that applies SoraniNormalizer to normalize the orthography.
final class | | A TokenFilter that applies SoraniStemmer to stem Sorani words.
Uses of AttributeSource in org.apache.lucene.analysis.classic
Subclasses of AttributeSource in org.apache.lucene.analysis.classic
Modifier and Type | Class | Description
class | | Normalizes tokens extracted with ClassicTokenizer.
final class | | A grammar-based tokenizer constructed with JFlex.
Uses of AttributeSource in org.apache.lucene.analysis.cn.smart
Subclasses of AttributeSource in org.apache.lucene.analysis.cn.smart
Modifier and Type | Class | Description
class | | Tokenizer for Chinese or mixed Chinese-English text.
Uses of AttributeSource in org.apache.lucene.analysis.commongrams
Subclasses of AttributeSource in org.apache.lucene.analysis.commongrams
Modifier and Type | Class | Description
final class | | Construct bigrams for frequently occurring terms while indexing.
final class | | Wrap a CommonGramsFilter optimizing phrase queries by only returning single words when they are not a member of a bigram.
Uses of AttributeSource in org.apache.lucene.analysis.compound
Subclasses of AttributeSource in org.apache.lucene.analysis.compound
Modifier and Type | Class | Description
class | | Base class for decomposition token filters.
class | | A TokenFilter that decomposes compound words found in many Germanic languages.
class | | A TokenFilter that decomposes compound words found in many Germanic languages.
Uses of AttributeSource in org.apache.lucene.analysis.core
Subclasses of AttributeSource in org.apache.lucene.analysis.core
Modifier and Type | Class | Description
final class | | Folds all Unicode digits in [:General_Category=Decimal_Number:] to Basic Latin digits (0-9).
final class | | Converts an incoming graph token stream, such as one from SynonymGraphFilter, into a flat form so that all nodes form a single linear chain with no side paths.
final class | | Emits the entire input as a single token.
class | | A LetterTokenizer is a tokenizer that divides text at non-letters.
final class | | Normalizes token text to lower case.
final class | | Removes stop words from a token stream.
final class | | Removes tokens whose types appear in a set of blocked types from a token stream.
final class | | A UnicodeWhitespaceTokenizer is a tokenizer that divides text at whitespace.
final class | | Normalizes token text to UPPER CASE.
final class | | A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).
Uses of AttributeSource in org.apache.lucene.analysis.cz
Subclasses of AttributeSource in org.apache.lucene.analysis.cz
Modifier and Type | Class | Description
final class | | A TokenFilter that applies CzechStemmer to stem Czech words.
Uses of AttributeSource in org.apache.lucene.analysis.de
Subclasses of AttributeSource in org.apache.lucene.analysis.de
Modifier and Type | Class | Description
final class | | A TokenFilter that applies GermanLightStemmer to stem German words.
final class | | A TokenFilter that applies GermanMinimalStemmer to stem German words.
final class | | Normalizes German characters according to the heuristics of the German2 snowball algorithm.
final class | | A TokenFilter that stems German words.
Uses of AttributeSource in org.apache.lucene.analysis.el
Subclasses of AttributeSource in org.apache.lucene.analysis.el
Modifier and Type | Class | Description
final class | | Normalizes token text to lower case, removes some Greek diacritics, and standardizes final sigma to sigma.
final class | | A TokenFilter that applies GreekStemmer to stem Greek words.
Uses of AttributeSource in org.apache.lucene.analysis.email
Subclasses of AttributeSource in org.apache.lucene.analysis.email
Modifier and Type | Class | Description
final class | | This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs.
Uses of AttributeSource in org.apache.lucene.analysis.en
Subclasses of AttributeSource in org.apache.lucene.analysis.en
Modifier and Type | Class | Description
final class | | A TokenFilter that applies EnglishMinimalStemmer to stem English words.
final class | | TokenFilter that removes possessives (trailing 's) from words.
final class | | A high-performance kstem filter for English.
final class | | Transforms the token stream as per the Porter stemming algorithm.
Uses of AttributeSource in org.apache.lucene.analysis.es
Subclasses of AttributeSource in org.apache.lucene.analysis.es
Modifier and Type | Class | Description
final class | | A TokenFilter that applies SpanishLightStemmer to stem Spanish words.
final class | | Deprecated. Use SpanishPluralStemFilter instead.
final class | | A TokenFilter that applies SpanishPluralStemmer to stem Spanish words.
Uses of AttributeSource in org.apache.lucene.analysis.fa
Subclasses of AttributeSource in org.apache.lucene.analysis.fa
Modifier and Type | Class | Description
final class | | A TokenFilter that applies PersianNormalizer to normalize the orthography.
final class | | A TokenFilter that applies PersianStemmer to stem Persian words.
Uses of AttributeSource in org.apache.lucene.analysis.fi
Subclasses of AttributeSource in org.apache.lucene.analysis.fi
Modifier and Type | Class | Description
final class | | A TokenFilter that applies FinnishLightStemmer to stem Finnish words.
Uses of AttributeSource in org.apache.lucene.analysis.fr
Subclasses of AttributeSource in org.apache.lucene.analysis.fr
Modifier and Type | Class | Description
final class | | A TokenFilter that applies FrenchLightStemmer to stem French words.
final class | | A TokenFilter that applies FrenchMinimalStemmer to stem French words.
Uses of AttributeSource in org.apache.lucene.analysis.ga
Subclasses of AttributeSource in org.apache.lucene.analysis.ga
Modifier and Type | Class | Description
final class | | Normalises token text to lower case, handling t-prothesis and n-eclipsis (i.e., that 'nAthair' should become 'n-athair').
Uses of AttributeSource in org.apache.lucene.analysis.gl
Subclasses of AttributeSource in org.apache.lucene.analysis.gl
Modifier and Type | Class | Description
final class | | A TokenFilter that applies GalicianMinimalStemmer to stem Galician words.
final class | | A TokenFilter that applies GalicianStemmer to stem Galician words.
Uses of AttributeSource in org.apache.lucene.analysis.hi
Subclasses of AttributeSource in org.apache.lucene.analysis.hi
Modifier and Type | Class | Description
final class | | A TokenFilter that applies HindiNormalizer to normalize the orthography.
final class | | A TokenFilter that applies HindiStemmer to stem Hindi words.
Uses of AttributeSource in org.apache.lucene.analysis.hu
Subclasses of AttributeSource in org.apache.lucene.analysis.hu
Modifier and Type | Class | Description
final class | | A TokenFilter that applies HungarianLightStemmer to stem Hungarian words.
Uses of AttributeSource in org.apache.lucene.analysis.hunspell
Subclasses of AttributeSource in org.apache.lucene.analysis.hunspell
Modifier and Type | Class | Description
final class | | TokenFilter that uses hunspell affix rules and words to stem tokens.
Uses of AttributeSource in org.apache.lucene.analysis.icu
Subclasses of AttributeSource in org.apache.lucene.analysis.icu
Modifier and Type | Class | Description
final class | | A TokenFilter that applies search term folding to Unicode text, applying foldings from UTR#30 Character Foldings.
class | | Normalize token text with ICU's Normalizer2.
final class | | A TokenFilter that transforms text with ICU.
Uses of AttributeSource in org.apache.lucene.analysis.icu.segmentation
Subclasses of AttributeSource in org.apache.lucene.analysis.icu.segmentation
Modifier and Type | Class | Description
final class | | Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/)
Uses of AttributeSource in org.apache.lucene.analysis.id
Subclasses of AttributeSource in org.apache.lucene.analysis.id
Modifier and Type | Class | Description
final class | | A TokenFilter that applies IndonesianStemmer to stem Indonesian words.
Uses of AttributeSource in org.apache.lucene.analysis.in
Subclasses of AttributeSource in org.apache.lucene.analysis.in
Modifier and Type | Class | Description
final class | | A TokenFilter that applies IndicNormalizer to normalize text in Indian Languages.
Uses of AttributeSource in org.apache.lucene.analysis.it
Subclasses of AttributeSource in org.apache.lucene.analysis.it
Modifier and Type | Class | Description
final class | | A TokenFilter that applies ItalianLightStemmer to stem Italian words.
Uses of AttributeSource in org.apache.lucene.analysis.ja
Subclasses of AttributeSource in org.apache.lucene.analysis.ja
Modifier and Type | Class | Description
final class | | Replaces term text with the BaseFormAttribute.
final class | | A TokenFilter that adds Japanese romanized tokens to the term attribute.
final class | | A TokenFilter that normalizes small letters (捨て仮名) in hiragana into normal letters.
final class | | A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
final class | | A TokenFilter that normalizes small letters (捨て仮名) in katakana into normal letters.
class | | A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
final class | | Removes tokens that match a set of part-of-speech tags.
final class | | A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
final class | | Tokenizer for Japanese that uses morphological analysis.
Uses of AttributeSource in org.apache.lucene.analysis.ko
Subclasses of AttributeSource in org.apache.lucene.analysis.ko
Modifier and Type | Class | Description
class | | A TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.
final class | | Removes tokens that match a set of part-of-speech tags.
final class | | Replaces term text with the ReadingAttribute which is the Hangul transcription of Hanja characters.
final class | | Tokenizer for Korean that uses morphological analysis.
Uses of AttributeSource in org.apache.lucene.analysis.lv
Subclasses of AttributeSource in org.apache.lucene.analysis.lv
Modifier and Type | Class | Description
final class | | A TokenFilter that applies LatvianStemmer to stem Latvian words.
Uses of AttributeSource in org.apache.lucene.analysis.minhash
Subclasses of AttributeSource in org.apache.lucene.analysis.minhash
Modifier and Type | Class | Description
class | | Generate min hash tokens from an incoming stream of tokens.
Uses of AttributeSource in org.apache.lucene.analysis.miscellaneous
Subclasses of AttributeSource in org.apache.lucene.analysis.miscellaneous
Modifier and Type | Class | Description
final class | | This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.
final class | | A filter to apply normal capitalization rules to Tokens.
final class | | Removes words that are too long or too short from the stream.
final class | | Concatenates/Joins every incoming token with a separator into one output token for every path through the token stream (which is a graph).
final class | | A TokenStream that takes an array of input TokenStreams as sources, and concatenates them together.
class | | Allows skipping TokenFilters based on the current set of attributes.
private final class | |
class | | Filters all tokens that cannot be parsed to a date, using the provided DateFormat.
final class | | Characters before the delimiter are the "token", the textual integer after is the term frequency.
final class | | Allows Tokens with a given combination of flags to be dropped.
final class | | An always exhausted token stream.
class | | Filter outputs a single token which is a concatenation of the sorted and de-duplicated set of input tokens.
final class | | Deprecated. Fix the token filters that create broken offsets in the first place.
final class | | When the plain text is extracted from documents, we will often have many words hyphenated and broken into two lines.
final class | | A TokenFilter that only keeps tokens with text contained in the required words.
class | | Marks terms as keywords via the KeywordAttribute.
final class | | This TokenFilter emits each incoming token twice, once as keyword and once as non-keyword; in other words, once with KeywordAttribute.setKeyword(boolean) set to true and once set to false.
final class | | Removes words that are too long or too short from the stream.
final class | | This TokenFilter limits the number of tokens while indexing.
final class | | Lets all tokens pass through until it sees one with a start offset <= a configured limit, which won't pass and ends the stream.
final class | | This TokenFilter limits its emitted tokens to those with positions that are not greater than the configured limit.
final class | | Marks terms as keywords via the KeywordAttribute.
class | | A ConditionalTokenFilter that only applies its wrapped filters to tokens that are not contained in a protected set.
final class | | A TokenFilter which filters out Tokens at the same position and Term text as the previous token in the stream.
final class | | This filter folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o.
final class | | This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
final class | | Marks terms as keywords via the KeywordAttribute.
final class | | Provides the ability to override any KeywordAttribute aware stemmer with custom dictionary-based stemming.
final class | | Trims leading and trailing whitespace from Tokens in the stream.
final class | | A token filter for truncating the terms into a specific length.
final class | | Adds the TypeAttribute.type() as a synonym, i.e.
final class | | Deprecated. Use WordDelimiterGraphFilter instead: it produces a correct token graph so that e.g.
final class | | Splits words into subwords and performs optional transformations on subword groups, producing a correct token graph so that e.g.
Methods in org.apache.lucene.analysis.miscellaneous that return AttributeSource
Modifier and Type | Method | Description
private static AttributeSource | ConcatenatingTokenStream.combineSources(TokenStream... sources) |
Constructors in org.apache.lucene.analysis.miscellaneous with parameters of type AttributeSource
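One entry in the table above describes a filter that outputs a single token built from the sorted, de-duplicated set of input tokens. A plain-JDK sketch of that idea follows. It is not the Lucene implementation: the class name FingerprintSketch, the caller-supplied separator, and the use of natural String ordering are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class FingerprintSketch {
    /** Joins the distinct input tokens, in natural String order, into one output token. */
    static String fingerprint(List<String> tokens, char separator) {
        // A TreeSet sorts and de-duplicates in a single step.
        TreeSet<String> unique = new TreeSet<>(tokens);
        return String.join(String.valueOf(separator), unique);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "the", "brown");
        System.out.println(fingerprint(tokens, ' '));
    }
}
```

Because the output does not depend on token order or repetition, two inputs containing the same set of words produce the same fingerprint, which is what makes this kind of token useful for grouping near-duplicate text.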
Uses of AttributeSource in org.apache.lucene.analysis.ngram
Subclasses of AttributeSource in org.apache.lucene.analysis.ngram
Modifier and Type | Class | Description
final class | | Tokenizes the given token into n-grams of given size(s).
class | | Tokenizes the input from an edge into n-grams of given size(s).
final class | | Tokenizes the input into n-grams of the given size(s).
class | | Tokenizes the input into n-grams of the given size(s).
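To make "n-grams of the given size(s)" concrete, here is a plain-JDK sketch of character n-gram and edge n-gram generation. It is not the Lucene code: the class and method names are hypothetical, the emission order (by start position, then by length) is an assumption, and the sketch works on UTF-16 chars rather than code points.

```java
import java.util.ArrayList;
import java.util.List;

class NGramSketch {
    /** All substrings of length minGram..maxGram, ordered by start position, then by length. */
    static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int n = minGram; n <= maxGram && start + n <= text.length(); n++) {
                out.add(text.substring(start, start + n));
            }
        }
        return out;
    }

    /** Edge n-grams: only the substrings anchored at the front edge of the input. */
    static List<String> edgeNgrams(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= text.length(); n++) {
            out.add(text.substring(0, n));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("abcd", 2, 3));     // [ab, abc, bc, bcd, cd]
        System.out.println(edgeNgrams("abcd", 1, 3)); // [a, ab, abc]
    }
}
```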
Uses of AttributeSource in org.apache.lucene.analysis.no
Subclasses of AttributeSource in org.apache.lucene.analysis.no
Modifier and Type | Class | Description
final class | | A TokenFilter that applies NorwegianLightStemmer to stem Norwegian words.
final class | | A TokenFilter that applies NorwegianMinimalStemmer to stem Norwegian words.
final class | | This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (ae, oe, aa) by transforming them to åÅæÆøØ.
Uses of AttributeSource in org.apache.lucene.analysis.path
Subclasses of AttributeSource in org.apache.lucene.analysis.path
Modifier and Type | Class | Description
class | | Tokenizer for path-like hierarchies.
class | | Tokenizer for domain-like hierarchies.
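The two hierarchy styles above can be illustrated with plain string handling. This sketch is not the Lucene tokenizers: the class and method names are hypothetical, and it assumes that path-like input yields growing prefixes while domain-like input yields shrinking suffixes.

```java
import java.util.ArrayList;
import java.util.List;

class PathHierarchySketch {
    /** Path-like: one token per ancestor prefix, ending with the full path. */
    static List<String> pathPrefixes(String path, char delimiter) {
        List<String> out = new ArrayList<>();
        // Start at 1 so a leading delimiter does not produce an empty token.
        for (int i = 1; i < path.length(); i++) {
            if (path.charAt(i) == delimiter) {
                out.add(path.substring(0, i));
            }
        }
        if (!path.isEmpty()) {
            out.add(path);
        }
        return out;
    }

    /** Domain-like: the full name first, then each suffix after a delimiter. */
    static List<String> domainSuffixes(String domain, char delimiter) {
        List<String> out = new ArrayList<>();
        if (domain.isEmpty()) {
            return out;
        }
        out.add(domain);
        for (int i = 0; i < domain.length() - 1; i++) {
            if (domain.charAt(i) == delimiter) {
                out.add(domain.substring(i + 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(pathPrefixes("/usr/local/lib", '/'));
        System.out.println(domainSuffixes("www.site.co.uk", '.'));
    }
}
```

Indexing every prefix (or suffix) lets a single exact-match term query find all documents under a directory or within a parent domain.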
Uses of AttributeSource in org.apache.lucene.analysis.pattern
Subclasses of AttributeSource in org.apache.lucene.analysis.pattern
Modifier and Type | Class | Description
final class | | CaptureGroup uses Java regexes to emit multiple tokens - one for each capture group in one or more patterns.
final class | | A TokenFilter which applies a Pattern to each token in the stream, replacing match occurrences with the specified replacement string.
final class | | This tokenizer uses regex pattern matching to construct distinct tokens for the input stream.
class | | Set a type attribute to a parameterized value when tokens are matched by any of several regex patterns.
final class | |
final class | |
Uses of AttributeSource in org.apache.lucene.analysis.payloads
Subclasses of AttributeSource in org.apache.lucene.analysis.payloads
Modifier and Type | Class | Description
final class | | Characters before the delimiter are the "token", those after are the payload.
class | | Assigns a payload to a token based on the TypeAttribute
class | | Adds the OffsetAttribute.startOffset() and OffsetAttribute.endOffset() First 4 bytes are the start
class | | Makes the TypeAttribute a payload.
Uses of AttributeSource in org.apache.lucene.analysis.phonetic
Subclasses of AttributeSource in org.apache.lucene.analysis.phonetic
Modifier and Type | Class | Description
final class | | TokenFilter for Beider-Morse phonetic encoding.
final class | | Create tokens for phonetic matches based on Daitch–Mokotoff Soundex.
final class | | Filter for DoubleMetaphone (supporting secondary codes)
final class | | Create tokens for phonetic matches.
Uses of AttributeSource in org.apache.lucene.analysis.pt
Subclasses of AttributeSource in org.apache.lucene.analysis.pt
Modifier and Type | Class | Description
final class | | A TokenFilter that applies PortugueseLightStemmer to stem Portuguese words.
final class | | A TokenFilter that applies PortugueseMinimalStemmer to stem Portuguese words.
final class | | A TokenFilter that applies PortugueseStemmer to stem Portuguese words.
Uses of AttributeSource in org.apache.lucene.analysis.reverse
Subclasses of AttributeSource in org.apache.lucene.analysis.reverse
Modifier and Type | Class | Description
final class | | Reverse token string, for example "country" => "yrtnuoc".
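The reversal in the example above can be reproduced with the JDK alone. This is a minimal sketch rather than the filter itself; the class name ReverseSketch is hypothetical. StringBuilder.reverse() is used because it keeps surrogate pairs intact.

```java
class ReverseSketch {
    /** Reverses the characters of a token; surrogate pairs stay in the correct order. */
    static String reverse(String token) {
        return new StringBuilder(token).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("country")); // yrtnuoc
    }
}
```

Indexing reversed tokens is a common way to turn a leading-wildcard search such as `*try` into a cheaper prefix search on `yrt*`.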
Uses of AttributeSource in org.apache.lucene.analysis.ru
Subclasses of AttributeSource in org.apache.lucene.analysis.ru
Modifier and Type | Class | Description
final class | | A TokenFilter that applies RussianLightStemmer to stem Russian words.
Uses of AttributeSource in org.apache.lucene.analysis.shingle
Subclasses of AttributeSource in org.apache.lucene.analysis.shingle
Modifier and Type | Class | Description
final class | | A FixedShingleFilter constructs shingles (token n-grams) from a token stream.
final class | | A ShingleFilter constructs shingles (token n-grams) from a token stream.
Fields in org.apache.lucene.analysis.shingle declared as AttributeSource
Modifier and Type | Field | Description
(package private) final AttributeSource | ShingleFilter.InputWindowToken.attSource |
private AttributeSource | ShingleFilter.nextInputStreamToken | When the next input stream token has a position increment greater than one, it is stored in this field until sufficient filler tokens have been inserted to account for the position increment.
Constructors in org.apache.lucene.analysis.shingle with parameters of type AttributeSource
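Shingles, as described above, are n-grams over whole tokens rather than characters. The following plain-JDK sketch shows fixed-size shingling only. It is not the Lucene filter: the class name ShingleSketch and the caller-supplied separator are assumptions, and position increments and filler tokens are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ShingleSketch {
    /** Joins every run of `size` consecutive tokens into one shingle. */
    static List<String> shingles(List<String> tokens, int size, String separator) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(separator, tokens.subList(i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        System.out.println(shingles(tokens, 2, " "));
    }
}
```

The real filters additionally have to account for gaps in the stream (for example removed stop words), which is what the nextInputStreamToken field listed above is used for.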
Uses of AttributeSource in org.apache.lucene.analysis.sinks
Subclasses of AttributeSource in org.apache.lucene.analysis.sinks
Modifier and Type | Class | Description
final class | | This TokenFilter provides the ability to set aside attribute states that have already been analyzed.
static final class | | TokenStream output from a tee.
Constructors in org.apache.lucene.analysis.sinks with parameters of type AttributeSource
Modifier | Constructor | Description
private | SinkTokenStream(AttributeSource source, TeeSinkTokenFilter.States cachedStates) |
Uses of AttributeSource in org.apache.lucene.analysis.snowball
Subclasses of AttributeSource in org.apache.lucene.analysis.snowball
Modifier and Type | Class | Description
final class | | A filter that stems words using a Snowball-generated stemmer.
Uses of AttributeSource in org.apache.lucene.analysis.sr
Subclasses of AttributeSource in org.apache.lucene.analysis.sr
Modifier and Type | Class | Description
final class | | Normalizes Serbian Cyrillic and Latin characters to "bald" Latin.
final class | | Normalizes Serbian Cyrillic to Latin.
Uses of AttributeSource in org.apache.lucene.analysis.standard
Subclasses of AttributeSource in org.apache.lucene.analysis.standard
Modifier and Type | Class | Description
final class | | A grammar-based tokenizer constructed with JFlex.
Uses of AttributeSource in org.apache.lucene.analysis.stempel
Subclasses of AttributeSource in org.apache.lucene.analysis.stempel
Modifier and Type | Class | Description
final class | | Transforms the token stream as per the stemming algorithm.
Uses of AttributeSource in org.apache.lucene.analysis.sv
Subclasses of AttributeSource in org.apache.lucene.analysis.sv
Modifier and Type | Class | Description
final class | | A TokenFilter that applies SwedishLightStemmer to stem Swedish words.
final class | | A TokenFilter that applies SwedishMinimalStemmer to stem Swedish words.
Uses of AttributeSource in org.apache.lucene.analysis.synonym
Subclasses of AttributeSource in org.apache.lucene.analysis.synonym
Modifier and Type | Class | Description
final class | | Deprecated. Use SynonymGraphFilter instead, but be sure to also use FlattenGraphFilter at index time (not at search time) as well.
final class | | Applies single- or multi-token synonyms from a SynonymMap to an incoming TokenStream, producing a fully correct graph output.
Uses of AttributeSource in org.apache.lucene.analysis.synonym.word2vec
Subclasses of AttributeSource in org.apache.lucene.analysis.synonym.word2vec
Modifier and Type | Class | Description
final class | | Applies single-token synonyms from a Word2Vec trained network to an incoming TokenStream.
Uses of AttributeSource in org.apache.lucene.analysis.te
Subclasses of AttributeSource in org.apache.lucene.analysis.te
Modifier and Type | Class | Description
final class | | A TokenFilter that applies TeluguNormalizer to normalize the orthography.
final class | | A TokenFilter that applies TeluguStemmer to stem Telugu words.
Uses of AttributeSource in org.apache.lucene.analysis.th
Subclasses of AttributeSource in org.apache.lucene.analysis.th
Uses of AttributeSource in org.apache.lucene.analysis.tr
Subclasses of AttributeSource in org.apache.lucene.analysis.tr
Modifier and Type | Class | Description
final class | | Strips all characters after an apostrophe (including the apostrophe itself).
final class | | Normalizes Turkish token text to lower case.
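The apostrophe-stripping behavior listed for the Turkish package can be sketched in a few lines of plain Java. This is an illustration, not the Lucene filter: the class name ApostropheSketch is hypothetical, and treating both the ASCII apostrophe and the right single quotation mark (U+2019) as apostrophes is an assumption of this example.

```java
class ApostropheSketch {
    /** Returns the part of the token before the first apostrophe, or the token unchanged. */
    static String stripAfterApostrophe(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            // Assumed apostrophe set: ASCII ' and the typographic U+2019.
            if (c == '\'' || c == '\u2019') {
                return token.substring(0, i);
            }
        }
        return token;
    }

    public static void main(String[] args) {
        System.out.println(stripAfterApostrophe("Türkiye'de"));
    }
}
```

In Turkish, the apostrophe separates a proper noun from its inflectional suffixes, so dropping everything from the apostrophe onward leaves the bare name.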
Uses of AttributeSource in org.apache.lucene.analysis.util
Subclasses of AttributeSource in org.apache.lucene.analysis.util
Modifier and Type | Class | Description
class | | An abstract base class for simple, character-oriented tokenizers.
final class | | Removes elisions from a TokenStream.
class | | Breaks text into sentences with a BreakIterator and allows subclasses to decompose these sentences into words.
Uses of AttributeSource in org.apache.lucene.analysis.wikipedia
Subclasses of AttributeSource in org.apache.lucene.analysis.wikipedia
Modifier and Type | Class | Description
final class | | Extension of StandardTokenizer that is aware of Wikipedia syntax.
Uses of AttributeSource in org.apache.lucene.codecs
Methods in org.apache.lucene.codecs that return AttributeSource
Uses of AttributeSource in org.apache.lucene.document
Subclasses of AttributeSource in org.apache.lucene.document
Modifier and Type | Class | Description
private static final class | |
private static final class | |
private static final class | |
Uses of AttributeSource in org.apache.lucene.index
Fields in org.apache.lucene.index declared as AttributeSource
Modifier and Type | Field | Description
(package private) AttributeSource | FieldInvertState.attributeSource |
private AttributeSource | BaseTermsEnum.atts |
Methods in org.apache.lucene.index that return AttributeSource
Modifier and Type | Method | Description
| BaseTermsEnum.attributes() |
| FilteredTermsEnum.attributes() | Returns the related attributes; the returned AttributeSource is shared with the delegate TermsEnum.
| FilterLeafReader.FilterTermsEnum.attributes() |
abstract AttributeSource | TermsEnum.attributes() | Returns the related attributes.
| FieldInvertState.getAttributeSource() | Returns the AttributeSource from the TokenStream that provided the indexed tokens for this field.
Methods in org.apache.lucene.index with parameters of type AttributeSource
Modifier and Type | Method | Description
(package private) void | FieldInvertState.setAttributeSource(AttributeSource attributeSource) | Sets attributeSource to a new instance.
Uses of AttributeSource in org.apache.lucene.monitor
Subclasses of AttributeSource in org.apache.lucene.monitor
Modifier and Type | Class | Description
(package private) final class | |
(package private) class | | A TokenStream created from a TermsEnum
Uses of AttributeSource in org.apache.lucene.search
Fields in org.apache.lucene.search declared as AttributeSource
Modifier and Type | Field | Description
final AttributeSource | TermCollectingRewrite.TermCollector.attributes | attributes used for communication with the enum
private final AttributeSource | FuzzyTermsEnum.atts |
Methods in org.apache.lucene.search that return AttributeSource
Methods in org.apache.lucene.search with parameters of type AttributeSource
Modifier and Type | Method | Description
protected TermsEnum | AutomatonQuery.getTermsEnum(Terms terms, AttributeSource atts) |
protected TermsEnum | FuzzyQuery.getTermsEnum(Terms terms, AttributeSource atts) |
protected abstract TermsEnum | MultiTermQuery.getTermsEnum(Terms terms, AttributeSource atts) | Construct the enumeration to be used, expanding the pattern term.
protected TermsEnum | MultiTermQuery.RewriteMethod.getTermsEnum(MultiTermQuery query, Terms terms, AttributeSource atts) | Returns the MultiTermQuery's TermsEnum
protected TermsEnum | TermInSetQuery.getTermsEnum(Terms terms, AttributeSource atts) |
Constructors in org.apache.lucene.search with parameters of type AttributeSource
Modifier | Constructor | Description
(package private) | FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, int maxEdits, int prefixLength, boolean transpositions) | Constructor for enumeration of all terms from specified reader which share a prefix of length prefixLength with term and which have at most maxEdits edits.
private | FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, Supplier<FuzzyAutomatonBuilder> automatonBuilder) |
Uses of AttributeSource in org.apache.lucene.search.highlight
Subclasses of AttributeSource in org.apache.lucene.search.highlight
Modifier and Type | Class | Description
(package private) final class | | This is a simplified version of org.apache.lucene.analysis.miscellaneous.LimitTokenOffsetFilter to prevent a dependency on analysis-common.jar.
final class | | This TokenFilter limits the number of tokens while indexing by adding up the current offset.
final class | | TokenStream created from a term vector field.
Uses of AttributeSource in org.apache.lucene.search.join
Methods in org.apache.lucene.search.join with parameters of type AttributeSource
Modifier and Type | Method | Description
protected TermsEnum | TermsQuery.getTermsEnum(Terms terms, AttributeSource atts) |
Uses of AttributeSource in org.apache.lucene.search.suggest.analyzing
Subclasses of AttributeSource in org.apache.lucene.search.suggest.analyzing
Modifier and Type | Class | Description
final class | | Like StopFilter except it will not remove the last token if that token was not followed by some token separator.
Uses of AttributeSource in org.apache.lucene.search.suggest.document
Subclasses of AttributeSource in org.apache.lucene.search.suggest.document
Modifier and Type | Class | Description
final class | | A ConcatenateGraphFilter but we can set the payload and provide access to config options.
private static final class | | The ContextSuggestField.PrefixTokenFilter wraps a TokenStream and adds a set of prefixes ahead.
Uses of AttributeSource in org.apache.lucene.search.uhighlight
Subclasses of AttributeSource in org.apache.lucene.search.uhighlight
Modifier and Type | Class | Description
private static final class | | Wraps an Analyzer and string text that represents multiple values delimited by a specified character.
Uses of AttributeSource in org.apache.lucene.util
Methods in org.apache.lucene.util that return AttributeSource
Modifier and Type | Method | Description
final AttributeSource | AttributeSource.cloneAttributes() | Performs a clone of all AttributeImpl instances returned in a new AttributeSource instance.
Methods in org.apache.lucene.util with parameters of type AttributeSource
Modifier and Type | Method | Description
final void | AttributeSource.copyTo(AttributeSource target) | Copies the contents of this AttributeSource to the given target AttributeSource.
Constructors in org.apache.lucene.util with parameters of type AttributeSource
Modifier | Constructor | Description
| AttributeSource(AttributeSource input) | An AttributeSource that uses the same attributes as the supplied one.
Uses of AttributeSource in org.apache.lucene.util.graph
Subclasses of AttributeSource in org.apache.lucene.util.graph
Modifier and Type | Class | Description
private class | |
Fields in org.apache.lucene.util.graph declared as AttributeSource
Methods in org.apache.lucene.util.graph that return types with arguments of type AttributeSource
Modifier and Type | Method | Description
| GraphTokenStreamFiniteStrings.getTerms(int state) | Returns the list of tokens that start at the provided state