org.opensextant.extraction (Xponents Core API)

package org.opensextant.extraction

Extraction Fundamentals

Extraction fundamentals include TextEntity, a span in free text, and TextMatch, a TextEntity generated by an extractor, matcher, or rule. A span is defined by a character start offset and an end offset. A TextEntity provides basic span logic and arithmetic: testing whether one span falls before, after, or within another, or overlaps it.

Beyond that, the extraction helpers here provide Solr tagger support, match filtering, match navigation, and match metrics.
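The span comparisons described above (before, after, within, overlap) can be sketched in a few lines of Java. This is only an illustration: SpanSketch, its fields, and its method names are hypothetical and are not the TextEntity API. It also assumes the end offset is exclusive, which this page does not state.

```java
/** Hypothetical span type for illustration only; NOT the Xponents TextEntity class. */
final class SpanSketch {
    final int start; // offset of the first character (inclusive)
    final int end;   // offset just past the last character (exclusive; an assumption)

    SpanSketch(int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        this.start = start;
        this.end = end;
    }

    /** Builds a span for the first occurrence of phrase in text. */
    static SpanSketch find(String text, String phrase) {
        int at = text.indexOf(phrase);
        if (at < 0) {
            throw new IllegalArgumentException("phrase not found: " + phrase);
        }
        return new SpanSketch(at, at + phrase.length());
    }

    int length() { return end - start; }

    /** True if this span ends at or before the point where other begins. */
    boolean isBefore(SpanSketch other) { return end <= other.start; }

    /** True if this span begins at or after the point where other ends. */
    boolean isAfter(SpanSketch other) { return start >= other.end; }

    /** True if this span lies entirely inside other (equal spans count). */
    boolean isWithin(SpanSketch other) { return start >= other.start && end <= other.end; }

    /** True if the two spans share at least one character. */
    boolean overlaps(SpanSketch other) { return start < other.end && other.start < end; }

    public static void main(String[] args) {
        String text = "Flights from Boston to New York resumed.";
        SpanSketch boston = find(text, "Boston");
        SpanSketch newYork = find(text, "New York");
        SpanSketch york = find(text, "York");

        System.out.println("Boston before New York: " + boston.isBefore(newYork));
        System.out.println("York within New York:   " + york.isWithin(newYork));
        System.out.println("York overlaps New York: " + york.overlaps(newYork));
        System.out.println("Boston overlaps York:   " + boston.overlaps(york));
    }
}
```

With half-open offsets, two adjacent spans do not overlap, and a span's length is simply end minus start.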

Related Packages

Package                        Description
org.opensextant
org.opensextant.annotations    DeepEye is an approach for simplifying typical NLP annotation exchanges.
org.opensextant.data           Xponents Data Model
org.opensextant.output         Xponents Output Formatting using GISCore
org.opensextant.processing     Processing Basics: Parameters, Results Handlers, Pipelining
org.opensextant.util           Utilities for Extraction

Class                     Description
ExtractionException       An exception to be thrown when place name matching goes awry.
ExtractionMetrics         This is a holder for tracking various common measures: No.
ExtractionResult
Extractor                 For now, this interface is closer to an AbstractExtractor where a clean interface might be output = Extractor.extract(input). This interface specifies more
MatcherUtils
MatchFilter               The Class MatchFilter.
NormalizationException
TextEntity                A very simple struct to hold data useful for post-processing entities once found.
TextMatch                 A variation on TextEntity that also records pattern metadata.
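
The Extractor description mentions that a clean interface might look like output = Extractor.extract(input). A minimal sketch of that single-method shape follows. SimpleExtractor and CapitalizedPhraseExtractor are hypothetical names invented for this example, and the capitalized-word rule is a toy stand-in; none of it reflects the actual Xponents Extractor interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical single-method extractor; NOT the Xponents Extractor interface. */
interface SimpleExtractor<I, O> {
    O extract(I input);
}

/** Toy rule: report runs of capitalized words as {start, end} offset pairs (end exclusive). */
final class CapitalizedPhraseExtractor implements SimpleExtractor<String, List<int[]>> {
    private static final Pattern PHRASE =
            Pattern.compile("[A-Z][a-z]+(?: [A-Z][a-z]+)*");

    @Override
    public List<int[]> extract(String input) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = PHRASE.matcher(input);
        while (m.find()) {
            // Matcher.start() is inclusive and Matcher.end() is exclusive.
            spans.add(new int[] {m.start(), m.end()});
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Flights from Boston to New York resumed.";
        for (int[] span : new CapitalizedPhraseExtractor().extract(text)) {
            System.out.println(span[0] + ".." + span[1] + " "
                    + text.substring(span[0], span[1]));
        }
    }
}
```

Keeping the interface to one method makes an extractor easy to chain or replace, at the cost of leaving configuration and resource cleanup to the implementing class.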
