Package org.opensextant.extraction
Extraction Fundamentals
Extraction fundamentals include TextEntity, a span in free text, and TextMatch, a TextEntity generated by an extractor, matcher, or rule. A span is defined by a character start offset and end offset. A TextEntity provides basic span logic and math: testing whether one span is before, after, within, or overlapping another, etc.
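The span comparisons described above can be illustrated with a minimal sketch. The Span class and its method names (isBefore, isAfter, isWithin, overlaps) are hypothetical and are not the actual TextEntity API. The sketch also assumes half-open offsets, where end points one character past the last character of the span; the real class may use a different convention.

```java
/** Illustrative only: a hypothetical stand-in for span logic, not the TextEntity API. */
public class SpanDemo {

    /** A character span [start, end). The half-open convention is an assumption of this sketch. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            if (start < 0 || end < start) {
                throw new IllegalArgumentException("invalid span: " + start + ".." + end);
            }
            this.start = start;
            this.end = end;
        }

        /** Number of characters covered by this span. */
        int length() { return end - start; }

        /** True if this span ends at or before the point where the other begins. */
        boolean isBefore(Span other) { return end <= other.start; }

        /** True if this span begins at or after the point where the other ends. */
        boolean isAfter(Span other) { return start >= other.end; }

        /** True if this span lies entirely inside the other (equal bounds count as within). */
        boolean isWithin(Span other) { return start >= other.start && end <= other.end; }

        /** True if the two spans share at least one character. */
        boolean overlaps(Span other) { return start < other.end && other.start < end; }
    }

    public static void main(String[] args) {
        String text = "Flights from Boston to New York City";
        Span boston = new Span(13, 19);      // "Boston"
        Span newYorkCity = new Span(23, 36); // "New York City"
        Span newYork = new Span(23, 31);     // "New York"
        Span yorkCity = new Span(27, 36);    // "York City"

        System.out.println(text.substring(boston.start, boston.end));
        System.out.println("before:   " + boston.isBefore(newYorkCity));
        System.out.println("within:   " + newYork.isWithin(newYorkCity));
        System.out.println("overlaps: " + newYork.overlaps(yorkCity));
    }
}
```

Comparisons like these are typically what allows a post-processing step to discard a shorter match that sits within a longer one, or to order matches by position in the text.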
Beyond that, the extraction helpers here provide specific Solr tagger support, match filtering, match navigation, and match metrics.
Class Description
An exception to be thrown when place name matching goes awry.
This is a holder for tracking various common measures: No.
For now, this interface is closer to an AbstractExtractor, where a clean interface might be output = Extractor.extract(input). This interface specifies more
The Class MatchFilter.
A very simple struct to hold data useful for post-processing entities once found.
A variation on TextEntity that also records pattern metadata.