Package org.opensextant.extraction
Interface Extractor
- All Known Implementing Classes:
AbstractFlexPat, PatternsOfLife, XCoord, XTemporal
public interface Extractor
For now, this interface is closer to an AbstractExtractor, whereas a clean interface might be just
output = Extractor.extract(input)
This interface specifies more: naming, configuration, and resource cleanup, in addition to extraction.
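The contrast between the fuller interface and the single-call form can be sketched as follows. This is a minimal, self-contained illustration and not the OpenSextant source: `MiniExtractor`, `YearExtractor`, `CleanInterfaceSketch`, and the use of plain strings as matches are assumptions made so the example runs without the library. The documented interface returns a list of TextMatch and declares ConfigException and ExtractionException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for the documented interface; matches are plain strings here. */
interface MiniExtractor {
    String getName();
    void configure();                    // the documented method throws ConfigException
    List<String> extract(String input);  // the documented method returns a list of TextMatch
    void cleanup();
}

/** Toy extractor (an assumption for illustration): reports four-digit tokens as matches. */
class YearExtractor implements MiniExtractor {
    private boolean configured = false;

    public String getName() { return "YearExtractor"; }

    public void configure() { configured = true; }

    public List<String> extract(String input) {
        if (!configured) {
            throw new IllegalStateException("configure() must be called before extract()");
        }
        List<String> matches = new ArrayList<>();
        for (String token : input.split("\\W+")) {
            if (token.matches("\\d{4}")) {
                matches.add(token);
            }
        }
        return matches;
    }

    public void cleanup() { configured = false; }
}

class CleanInterfaceSketch {
    /** Hides configuration so the caller sees only output = extract(input). */
    static Function<String, List<String>> asFunction(MiniExtractor extractor) {
        extractor.configure();
        return extractor::extract;
    }

    public static void main(String[] args) {
        MiniExtractor extractor = new YearExtractor();
        Function<String, List<String>> extract = asFunction(extractor);
        System.out.println(extract.apply("Built in 1998, rebuilt in 2014."));
        extractor.cleanup();  // the caller still owns the extractor's lifecycle
    }
}
```

The wrapper shows why the interface carries more than one method: configuration and cleanup have to happen somewhere, even when callers only want the extract call.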
- Author:
- ubaldino
-
Field Summary
-
Method Summary
Modifier and Type | Method | Description
void | cleanup() | Resource management.
void | configure() | Configure an Extractor using defaults for that extractor.
void | configure(patfile) | Configure an Extractor using a config file named by a path.
void | configure(patfile) | Configure an Extractor using a config file named by a URL.
list of TextMatch | extract(input) | Useful for working with text buffers ad hoc.
list of TextMatch | extract(input) | Useful for working with batches of inputs that have an innate row ID + buffer pairing.
String | getName() |
-
Field Details
-
NO_DOC_ID
Optional constant: a universal doc ID holder.
- See Also:
-
-
Method Details
-
getName
String getName()
-
configure
Configure an Extractor using defaults for that extractor.
- Throws:
ConfigException
- the config exception
-
configure
Configure an Extractor using a config file named by a path.
- Parameters:
patfile
- configuration file path
- Throws:
ConfigException
- the config exception
-
configure
Configure an Extractor using a config file named by a URL.
- Parameters:
patfile
- configuration URL
- Throws:
ConfigException
- the config exception
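The three configure variants can be sketched as below. This is a hedged illustration, not the library's implementation: `ConfigurableSketch`, `DEFAULT_PATTERNS`, the one-pattern-per-line file format, and the `String` and `java.net.URL` parameter types are all assumptions, and `IOException` stands in for the documented ConfigException. One common arrangement, shown here, is for the path variant to delegate to the URL variant.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical configurable component; not the OpenSextant implementation. */
class ConfigurableSketch {
    /** Assumed built-in default, used when no config file is named. */
    static final List<String> DEFAULT_PATTERNS = List.of("\\d{4}");

    private List<String> patterns = new ArrayList<>();

    /** Configure using defaults. */
    void configure() {
        patterns = new ArrayList<>(DEFAULT_PATTERNS);
    }

    /** Configure from a file path by delegating to the URL variant. */
    void configure(String patfile) throws IOException {
        configure(Paths.get(patfile).toUri().toURL());
    }

    /** Configure from a URL: one pattern per line; blank lines and '#' comments are skipped. */
    void configure(URL patfile) throws IOException {
        List<String> loaded = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(patfile.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                    loaded.add(trimmed);
                }
            }
        }
        patterns = loaded;
    }

    List<String> patterns() { return patterns; }

    public static void main(String[] args) throws IOException {
        // Write a throwaway pattern file so the path variant has something to read.
        Path tmp = Files.createTempFile("patterns", ".cfg");
        Files.write(tmp, List.of("# hypothetical pattern file", "[A-Z]{3}", "", "\\d{2}"));

        ConfigurableSketch sketch = new ConfigurableSketch();
        sketch.configure();
        System.out.println(sketch.patterns());
        sketch.configure(tmp.toString());
        System.out.println(sketch.patterns());
        Files.delete(tmp);
    }
}
```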
-
extract
Useful for working with batches of inputs that have an innate row ID + buffer pairing.
- Parameters:
input
- text input
- Returns:
- the list of TextMatch
- Throws:
ExtractionException
- error if underlying extractor(s) fail
-
extract
Useful for working with text buffers ad hoc. Fewer assumptions about input data here.
- Parameters:
input
- text input, as a string
- Returns:
- the list of TextMatch
- Throws:
ExtractionException
- error if underlying extractor(s) fail
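The relationship between the two extract variants and the NO_DOC_ID constant can be sketched as below. Everything here is an assumption for illustration: `RowInput` stands in for the ID + buffer input type, which this page does not name, `BatchSketch` is a toy extractor whose matches are plain "id:token" strings, and the value given to `NO_DOC_ID` is a placeholder because the page does not state the constant's value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical ID + buffer pair; the documented input type is not named on this page. */
class RowInput {
    final String id;
    final String buffer;

    RowInput(String id, String buffer) {
        this.id = id;
        this.buffer = buffer;
    }
}

/** Toy extractor (an assumption for illustration): matches are "docId:token" strings. */
class BatchSketch {
    /** Placeholder value; the documented constant's value is not given on this page. */
    static final String NO_DOC_ID = "no-doc-id";

    /** Row-oriented variant: each match is tagged with the ID of the row it came from. */
    static List<String> extract(RowInput input) {
        List<String> matches = new ArrayList<>();
        for (String token : input.buffer.split("\\W+")) {
            if (token.matches("\\d{4}")) {
                matches.add(input.id + ":" + token);
            }
        }
        return matches;
    }

    /** Ad hoc variant: no ID is known, so the placeholder ID is used. */
    static List<String> extract(String input) {
        return extract(new RowInput(NO_DOC_ID, input));
    }

    public static void main(String[] args) {
        // A small batch keyed by row ID, in insertion order.
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("row-1", "Founded 1901.");
        rows.put("row-2", "No dates here.");
        rows.put("row-3", "From 1939 to 1945.");
        for (Map.Entry<String, String> row : rows.entrySet()) {
            System.out.println(extract(new RowInput(row.getKey(), row.getValue())));
        }
        System.out.println(extract("Ad hoc text from 2001."));
    }
}
```

Tagging each match with its row ID keeps batch results traceable to their source record, while the string variant trades that traceability for convenience.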
-
cleanup
void cleanup()
Resource management. This cleanup routine usually calls, in turn, some shutdown, disconnect, or similar routine.
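A caller can make sure cleanup runs even when extraction fails by pairing it with try/finally. The sketch below is an assumption-laden illustration, not library code: `ResourceSketch`, its `log` field, and `runOnce` are hypothetical names, and the boolean flag stands in for whatever connection or index a real extractor would hold.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical extractor holding a resource that must be released. */
class ResourceSketch {
    /** Records lifecycle events so the order of calls can be inspected. */
    final List<String> log = new ArrayList<>();
    private boolean open = false;

    void configure() {
        open = true;               // stands in for opening a connection or index
        log.add("configure");
    }

    List<String> extract(String input) {
        if (!open) {
            throw new IllegalStateException("extractor is not configured");
        }
        if (input == null) {
            throw new IllegalArgumentException("input must not be null");
        }
        log.add("extract");
        return List.of(input.trim());
    }

    void cleanup() {
        if (open) {
            open = false;          // stands in for shutdown, disconnect, etc.
            log.add("cleanup");
        }
    }

    /** Runs one extraction and guarantees cleanup() is called, even on failure. */
    static List<String> runOnce(ResourceSketch extractor, String input) {
        extractor.configure();
        try {
            return extractor.extract(input);
        } finally {
            extractor.cleanup();
        }
    }

    public static void main(String[] args) {
        ResourceSketch ok = new ResourceSketch();
        runOnce(ok, " some text ");
        System.out.println(ok.log);

        ResourceSketch failing = new ResourceSketch();
        try {
            runOnce(failing, null);
        } catch (IllegalArgumentException e) {
            // cleanup has already run by the time the exception reaches this point
            System.out.println(failing.log);
        }
    }
}
```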
-