Package org.opensextant.extraction


Extraction Fundamentals

Extraction fundamentals include TextEntity, a span in free text, and TextMatch, a TextEntity generated by an extractor, matcher, or rule. A span is defined by a character start offset and an end offset. A TextEntity provides basic span logic and arithmetic: testing whether one span is before, after, within, or overlapping another, and so on.
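
The span comparisons described above can be illustrated with a minimal sketch. The class SpanSketch and its method names are hypothetical and are not the TextEntity API; the sketch also assumes the end offset is exclusive (one past the last character), which this page does not specify.

```java
// Hypothetical illustration of span logic; NOT the actual TextEntity class.
// Assumption: 'end' is exclusive, i.e. the offset just past the last character.
final class SpanSketch {
    final int start; // offset of the first character (inclusive)
    final int end;   // offset just past the last character (exclusive, assumed)

    SpanSketch(int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        this.start = start;
        this.end = end;
    }

    // Number of characters covered by this span.
    int length() {
        return end - start;
    }

    // True if this span ends at or before the point where 'other' begins.
    boolean isBefore(SpanSketch other) {
        return end <= other.start;
    }

    // True if this span begins at or after the point where 'other' ends.
    boolean isAfter(SpanSketch other) {
        return start >= other.end;
    }

    // True if this span lies entirely inside 'other' (equal spans count).
    boolean isWithin(SpanSketch other) {
        return start >= other.start && end <= other.end;
    }

    // True if the two spans share at least one character position.
    boolean overlaps(SpanSketch other) {
        return start < other.end && other.start < end;
    }
}
```

For the text "Boston, Massachusetts", a span for "Boston" would be (0, 6) and a span for "Massachusetts" would be (8, 21) under these assumptions. The first is before the second, the two do not overlap, and both are within the span (0, 21) covering the whole string.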

Beyond that, the extraction helpers here provide Solr tagger support, match filtering, match navigation, and match metrics.