Package org.opensextant.processing
Class XtractorGroup
- java.lang.Object
-
- org.opensextant.processing.XtractorGroup
-
public class XtractorGroup extends java.lang.Object
A group of Xponent Extractors. An Extractor has a simple interface: configure() and extract().
Configure each Extractor, then add it to the stack here. Once you have added Extractors to your XtractorGroup, call XtractorGroup.setup(). Because one processor among several may throw an exception while the others succeed, the API does not throw exceptions that would fail a document completely. If you need access to the exceptions thrown by each processor or formatter, adapt XtractorGroup by re-implementing its internal loops.
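The failure-isolation behavior described above can be sketched with stand-in types. This is only an illustration of the pattern, not the XtractorGroup source: SketchExtractor, SketchGroup and errors() are hypothetical names, and plain strings stand in for TextInput and TextMatch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an extractor; NOT the OpenSextant Extractor interface.
interface SketchExtractor {
    List<String> extract(String text) throws Exception;
}

// Hypothetical stand-in for a group that isolates per-extractor failures.
final class SketchGroup {
    private final List<SketchExtractor> extractors = new ArrayList<>();
    private final List<String> currErrors = new ArrayList<>();

    void addExtractor(SketchExtractor extractor) {
        extractors.add(extractor);
    }

    // Run every extractor in order. A failure in one extractor is recorded
    // as an error message and does not prevent the remaining ones from running.
    List<String> process(String text) {
        List<String> matches = new ArrayList<>();
        for (SketchExtractor extractor : extractors) {
            try {
                matches.addAll(extractor.extract(text));
            } catch (Exception err) {
                currErrors.add(String.valueOf(err.getMessage()));
            }
        }
        return matches;
    }

    // Errors accumulated since the last reset().
    List<String> errors() {
        return currErrors;
    }

    // Clear accumulated errors before handling the next input.
    void reset() {
        currErrors.clear();
    }
}
```

A subclass that needs the original exception objects, rather than messages, would collect the caught exceptions themselves in the loop.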
- Author:
- ubaldino
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected java.util.List<java.lang.String> | currErrors | API: child implementations have access to accumulated errors; reset() clears errors and other state.
protected java.util.List<Extractor> | extractors | API: child implementations have access to the core list of extractors.
protected java.util.List<ResultsFormatter> | formatters | API: child implementations have access to the core list of formatters.
protected org.slf4j.Logger | log | API: child implementations should recreate their own logger.
-
Constructor Summary
Constructors
Constructor | Description
XtractorGroup() |
-
Method Summary
Modifier and Type | Method | Description
void | addExtractor(Extractor xprocessor) |
void | addFormatter(ResultsFormatter formatter) |
void | cleanupAll() | Use only if you intend to shut down.
int | format(ExtractionResult compilation) | Format each result. Some formatters may decline some results; for example, the Shapefile formatter accepts only geocoding-capable TextMatch objects.
java.util.List<TextMatch> | process(TextInput input) | Process one input.
int | processAndFormat(TextInput input) | Processes input content against all extractors and all formatters. This does not throw exceptions, as some processing may fail while others succeed.
void | reset() | DRAFT: still figuring out the rules for 'reset' between processing or inputs.
-
-
-
Field Detail
-
extractors
protected java.util.List<Extractor> extractors
API: child implementations have access to the core list of extractors.
-
formatters
protected java.util.List<ResultsFormatter> formatters
API: child implementations have access to the core list of formatters.
-
log
protected org.slf4j.Logger log
API: child implementations should recreate their own logger.
-
currErrors
protected java.util.List<java.lang.String> currErrors
API: child implementations have access to accumulated errors; reset() clears errors and other state.
-
-
Method Detail
-
addExtractor
public void addExtractor(Extractor xprocessor)
-
addFormatter
public void addFormatter(ResultsFormatter formatter)
-
process
public java.util.List<TextMatch> process(TextInput input)
Process one input. Use this method if you do not need formatted output at this time, or if you have complex ExtractionResults to which you want to add meta attributes before formatting.
-
format
public int format(ExtractionResult compilation)
Format each result. Some formatters may decline some results; for example, the Shapefile formatter accepts only geocoding-capable TextMatch objects.
-
cleanupAll
public void cleanupAll()
Use only if you intend to shut down.
-
reset
public void reset()
DRAFT: still figuring out the rules for 'reset' between processing or inputs.
-
processAndFormat
public int processAndFormat(TextInput input)
Processes input content against all extractors and all formatters. This does not throw exceptions, as some processing may fail while others succeed.
TODO: Processing/formatting details would have to be retrieved by calling some other method that statefully tracks such things.
- Parameters:
input
- Returns:
status: -1 failure; 0 nothing found; 1 found matches and formatted them; 2 found content but nothing formatted.
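Because processAndFormat() reports its outcome as an integer rather than by throwing, callers have to branch on the returned status. The helper below is a minimal sketch of that mapping; StatusSketch and describe() are hypothetical names, not part of XtractorGroup, and only the four documented values are taken from this page.

```java
// Hypothetical helper; the cases mirror the documented return values
// of processAndFormat(). It is not part of the OpenSextant API.
final class StatusSketch {
    private StatusSketch() {
    }

    // Translate a status code into a human-readable description.
    static String describe(int status) {
        switch (status) {
            case -1:
                return "failure";
            case 0:
                return "nothing found";
            case 1:
                return "found matches and formatted them";
            case 2:
                return "found content but nothing formatted";
            default:
                // Any value outside the documented set is reported as unknown.
                return "unknown status " + status;
        }
    }
}
```

A caller might log describe(status) and then consult the accumulated errors when the status is -1.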
-
-