Package org.opensextant.extraction
Interface Extractor
- All Known Implementing Classes:
- AbstractFlexPat, PatternsOfLife, XCoord, XTemporal
public interface Extractor
For now, this interface is closer to an AbstractExtractor, where a clean interface might be output = Extractor.extract(input). This interface specifies more
- Author:
- ubaldino
Field Summary
Fields
Modifier and Type        Field      Description
static java.lang.String  NO_DOC_ID  optional constant - a universal doc ID holder
Method Summary
Modifier and Type          Method                               Description
void                       cleanup()                            Resource management.
void                       configure()                          Configure an Extractor using defaults for that extractor.
void                       configure(java.lang.String patfile)  Configure an Extractor using a config file named by a path.
void                       configure(java.net.URL patfile)      Configure an Extractor using a config file named by a URL.
java.util.List<TextMatch>  extract(java.lang.String input)      Useful for working with text buffers ad hoc.
java.util.List<TextMatch>  extract(TextInput input)             Useful for working with batches of inputs that have an innate row ID + buffer pairing.
java.lang.String           getName()
Field Detail
NO_DOC_ID
static final java.lang.String NO_DOC_ID
optional constant - a universal doc ID holder
- See Also:
- Constant Field Values
Method Detail
getName
java.lang.String getName()
configure
void configure() throws ConfigException
Configure an Extractor using defaults for that extractor.
- Throws:
- ConfigException - the config exception
configure
void configure(java.lang.String patfile) throws ConfigException
Configure an Extractor using a config file named by a path.
- Parameters:
- patfile - configuration file path
- Throws:
- ConfigException - the config exception
configure
void configure(java.net.URL patfile) throws ConfigException
Configure an Extractor using a config file named by a URL.
- Parameters:
- patfile - configuration URL
- Throws:
- ConfigException - the config exception
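The three configure overloads can be combined into a simple fallback chain. The following is a minimal sketch, not the library's own logic: Extractor and ConfigException here are cut-down local stand-ins that mirror only the signatures documented above, and RecordingExtractor, ConfigureSketch, configureFrom, and the file names are hypothetical names introduced for illustration.

```java
import java.net.URL;

// Local stand-ins so the sketch compiles on its own; the real types
// live in org.opensextant.extraction and have more members.
class ConfigException extends Exception {
    ConfigException(String msg) { super(msg); }
}

interface Extractor {
    void configure() throws ConfigException;
    void configure(String patfile) throws ConfigException;
    void configure(URL patfile) throws ConfigException;
}

// Hypothetical extractor that only records which source configured it.
class RecordingExtractor implements Extractor {
    String source = "unconfigured";

    public void configure() { source = "defaults"; }

    public void configure(String patfile) throws ConfigException {
        if (patfile == null || patfile.isEmpty()) {
            throw new ConfigException("empty config path");
        }
        source = "path:" + patfile;
    }

    public void configure(URL patfile) throws ConfigException {
        if (patfile == null) {
            throw new ConfigException("null config URL");
        }
        source = "url:" + patfile;
    }
}

class ConfigureSketch {
    // Prefer an explicit file path, then a classpath resource, then defaults.
    static void configureFrom(Extractor ex, String path, String resourceName)
            throws ConfigException {
        if (path != null) {
            ex.configure(path);
            return;
        }
        URL url = (resourceName == null)
                ? null
                : ConfigureSketch.class.getResource(resourceName);
        if (url != null) {
            ex.configure(url);
        } else {
            ex.configure();
        }
    }

    public static void main(String[] args) throws ConfigException {
        RecordingExtractor ex = new RecordingExtractor();
        // The resource name is made up; when it is absent, defaults are used.
        configureFrom(ex, null, "/hypothetical-patterns.cfg");
        System.out.println(ex.source);
    }
}
```

Because every overload declares ConfigException, a caller of this kind can surface configuration failures at one place regardless of which source was chosen.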
extract
java.util.List<TextMatch> extract(TextInput input) throws ExtractionException
Useful for working with batches of inputs that have an innate row ID + buffer pairing.
- Parameters:
- input - text input
- Returns:
- the list of TextMatch
- Throws:
- ExtractionException - error if underlying extractor(s) fail
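To illustrate the row ID + buffer pairing, here is a minimal sketch of batch extraction that keeps results keyed by row ID. It rests on assumptions: TextInput, TextMatch, Extractor, and ExtractionException are simplified local stand-ins (the real classes have more members and may differ in field names), and KeywordExtractor, BatchExtractSketch, and extractAll are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Local stand-ins so the sketch compiles on its own.
class ExtractionException extends Exception {
    ExtractionException(String msg) { super(msg); }
}

class TextInput {
    final String id;      // row ID
    final String buffer;  // text to scan
    TextInput(String id, String buffer) { this.id = id; this.buffer = buffer; }
}

class TextMatch {
    final String text;
    final int start;
    TextMatch(String text, int start) { this.text = text; this.start = start; }
}

interface Extractor {
    List<TextMatch> extract(TextInput input) throws ExtractionException;
}

// Hypothetical extractor: finds every occurrence of one fixed keyword.
class KeywordExtractor implements Extractor {
    private final String keyword;

    KeywordExtractor(String keyword) {
        if (keyword == null || keyword.isEmpty()) {
            throw new IllegalArgumentException("keyword must be non-empty");
        }
        this.keyword = keyword;
    }

    public List<TextMatch> extract(TextInput input) throws ExtractionException {
        if (input.buffer == null) {
            throw new ExtractionException("no text for row " + input.id);
        }
        List<TextMatch> matches = new ArrayList<>();
        int at = input.buffer.indexOf(keyword);
        while (at >= 0) {
            matches.add(new TextMatch(keyword, at));
            at = input.buffer.indexOf(keyword, at + keyword.length());
        }
        return matches;
    }
}

class BatchExtractSketch {
    // Run one extractor over many rows, preserving input order by row ID.
    static Map<String, List<TextMatch>> extractAll(Extractor ex, List<TextInput> rows)
            throws ExtractionException {
        Map<String, List<TextMatch>> byRow = new LinkedHashMap<>();
        for (TextInput row : rows) {
            byRow.put(row.id, ex.extract(row));
        }
        return byRow;
    }

    public static void main(String[] args) throws ExtractionException {
        List<TextInput> rows = Arrays.asList(
                new TextInput("row-1", "UTM grid, UTM zone"),
                new TextInput("row-2", "no coordinates here"));
        Map<String, List<TextMatch>> out = extractAll(new KeywordExtractor("UTM"), rows);
        for (Map.Entry<String, List<TextMatch>> e : out.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " match(es)");
        }
    }
}
```

Carrying the ID on the input object means the caller never has to re-associate matches with rows after the fact, which is the main convenience of this overload over extract(java.lang.String).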
extract
java.util.List<TextMatch> extract(java.lang.String input) throws ExtractionException
Useful for working with text buffers ad hoc. Fewer assumptions about input data here.
- Parameters:
- input - text input, as a string
- Returns:
- the list of TextMatch
- Throws:
- ExtractionException - error if underlying extractor(s) fail
cleanup
void cleanup()
Resource management. This cleanup routine usually calls some shutdown, disconnect, or similar routine in turn.
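Since cleanup() releases resources, a caller would typically pair it with configure() in a try/finally. The sketch below shows that lifecycle under stated assumptions: the types are cut-down local stand-ins mirroring the documented signatures, and DigitRunExtractor, ExtractorLifecycleSketch, and run are hypothetical names that do not exist in the library.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-ins so the sketch compiles on its own; the real types
// live in org.opensextant.extraction and have more members.
class ConfigException extends Exception {
    ConfigException(String msg) { super(msg); }
}

class ExtractionException extends Exception {
    ExtractionException(String msg) { super(msg); }
}

class TextMatch {
    final String text;
    final int start;
    TextMatch(String text, int start) { this.text = text; this.start = start; }
}

interface Extractor {
    String getName();
    void configure() throws ConfigException;
    List<TextMatch> extract(String input) throws ExtractionException;
    void cleanup();
}

// Hypothetical extractor: reports every run of digits in the input.
class DigitRunExtractor implements Extractor {
    private boolean configured = false;
    boolean cleanedUp = false;

    public String getName() { return "DigitRunExtractor"; }

    public void configure() { configured = true; }

    public List<TextMatch> extract(String input) throws ExtractionException {
        if (!configured) {
            throw new ExtractionException("configure() was not called");
        }
        List<TextMatch> matches = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (Character.isDigit(input.charAt(i))) {
                int begin = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                matches.add(new TextMatch(input.substring(begin, i), begin));
            } else {
                i++;
            }
        }
        return matches;
    }

    public void cleanup() { cleanedUp = true; }
}

class ExtractorLifecycleSketch {
    // Configure once, extract, and release resources even if extraction fails.
    static List<TextMatch> run(Extractor ex, String text)
            throws ConfigException, ExtractionException {
        ex.configure();
        try {
            return ex.extract(text);
        } finally {
            ex.cleanup();
        }
    }

    public static void main(String[] args) throws Exception {
        for (TextMatch m : run(new DigitRunExtractor(), "Rooms 12 and 305")) {
            System.out.println(m.start + ": " + m.text);
        }
    }
}
```

In a long-running service the extractor would more likely be configured once and reused across many extract calls, with cleanup() deferred to shutdown; the single-call form here only keeps the example short.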