Package org.opensextant.processing
Class XtractorGroup
java.lang.Object
org.opensextant.processing.XtractorGroup
A group of Xponent Extractors. An Extractor has a simple interface:
+ configure() + extract()
Configure any Extractor and add it to the stack here. Once you have added Extractors to your XtractorGroup, call XtractorGroup.setup(). Because one processor of several may throw an exception while others succeed, the API does not throw exceptions that would fail a document completely. If you need access to the exceptions thrown by each processor or formatter, adapt XtractorGroup by re-implementing the internal loops.
- Author:
- ubaldino
-
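The error-handling approach described above (run every extractor, record failures, never fail the whole document) can be sketched as follows. This is a minimal illustration, not the library's implementation: SketchExtractor, SketchGroup, demoGroup and the map-based currErrors are hypothetical stand-ins invented for this example, and the real Extractor interface and field types may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class XtractorGroupSketch {

    /** Hypothetical stand-in; NOT the real OpenSextant Extractor interface. */
    interface SketchExtractor {
        default void configure() throws Exception { }
        List<String> extract(String text) throws Exception;
    }

    /** Hypothetical group: runs every extractor and records failures instead of throwing. */
    static class SketchGroup {
        protected final Map<String, SketchExtractor> extractors = new LinkedHashMap<>();
        // Assumption: errors are kept per extractor name; the real field type is not shown in the docs.
        protected final Map<String, String> currErrors = new LinkedHashMap<>();

        void addExtractor(String name, SketchExtractor x) {
            extractors.put(name, x);
        }

        /** Clears accumulated errors before the next input. */
        void reset() {
            currErrors.clear();
        }

        /** The internal loop: one failing extractor does not fail the whole document. */
        List<String> process(String text) {
            reset();
            List<String> matches = new ArrayList<>();
            for (Map.Entry<String, SketchExtractor> entry : extractors.entrySet()) {
                try {
                    matches.addAll(entry.getValue().extract(text));
                } catch (Exception err) {
                    currErrors.put(entry.getKey(), String.valueOf(err.getMessage()));
                }
            }
            return matches;
        }
    }

    /** Builds a demo group with one working and one always-failing extractor. */
    static SketchGroup demoGroup() {
        SketchGroup group = new SketchGroup();
        // Toy extractor: returns tokens that start with an upper-case letter.
        group.addExtractor("upper", text -> {
            List<String> found = new ArrayList<>();
            for (String token : text.split("\\s+")) {
                if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                    found.add(token);
                }
            }
            return found;
        });
        // Toy extractor that always fails, to exercise the error path.
        group.addExtractor("broken", text -> {
            throw new IllegalStateException("resource not loaded");
        });
        return group;
    }

    public static void main(String[] args) {
        SketchGroup group = demoGroup();
        List<String> matches = group.process("Flights from Boston to Kabul");
        System.out.println("matches: " + matches);
        System.out.println("errors: " + group.currErrors);
    }
}
```

A subclass that needs the exceptions themselves, rather than their messages, would replace the loop in process() with its own, which is the adaptation the class description suggests.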
Field Summary
Modifier and Type | Field | Description
currErrors
API: child implementations have access to accumulated errors; reset() clears errors and other state.
extractors
API: child implementations have access to the core list of extractors.
protected List<ResultsFormatter> formatters
API: child implementations have access to the core list of formatters.
protected org.slf4j.Logger log
API: child implementations should recreate their own logger.
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
void addExtractor(Extractor xprocessor)
void addFormatter(ResultsFormatter formatter)
void cleanupAll()
Use only if you intend to shut down.
int format(ExtractionResult compilation)
Format each result. Some formatters may pass on results; for example, the Shapefile formatter accepts only geocoding-capable TextMatch objects.
process
Process one input.
int processAndFormat(TextInput input)
Processes input content against all extractors and all formatters. This does not throw exceptions, as some processing may fail while others succeed.
void reset()
DRAFT: still figuring out the rules for 'reset' between processing or inputs.
-
Field Details
-
extractors
API: child implementations have access to the core list of extractors. -
formatters
API: child implementations have access to the core list of formatters.
log
protected org.slf4j.Logger log
API: child implementations should recreate their own logger.
currErrors
API: child implementations have access to accumulated errors; reset() clears errors and other state.
-
-
Constructor Details
-
XtractorGroup
public XtractorGroup()
-
-
Method Details
-
addExtractor
-
addFormatter
-
process
Process one input. Use this if you have no need to format output at this time, or if you have complex ExtractionResults to which you want to add meta attributes.
format
Format each result. Some formatters may pass on results; for example, the Shapefile formatter accepts only geocoding-capable TextMatch objects.
cleanupAll
public void cleanupAll()
Use only if you intend to shut down.
reset
public void reset()
DRAFT: still figuring out the rules for 'reset' between processing or inputs.
processAndFormat
Processes input content against all extractors and all formatters. This does not throw exceptions, as some processing may fail while others succeed. TODO: Processing/formatting details would have to be retrieved by calling some other method that is statefully tracking such things.
- Parameters:
- input
- Returns:
- status: -1 failure; 0 nothing found; 1 found matches and formatted them; 2 found content but nothing formatted.
-
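A caller might branch on the status returned by processAndFormat along these lines. This is only a sketch: the class StatusCodeSketch and the helper describe() are hypothetical names introduced here and are not part of XtractorGroup. Only the four status values and their meanings come from the Returns note above.

```java
class StatusCodeSketch {

    /** Hypothetical helper: maps a processAndFormat status to its documented meaning. */
    static String describe(int status) {
        switch (status) {
            case -1:
                return "failure";
            case 0:
                return "nothing found";
            case 1:
                return "found matches and formatted them";
            case 2:
                return "found content but nothing formatted";
            default:
                // Values outside the documented set are reported, not guessed at.
                return "unknown status " + status;
        }
    }

    public static void main(String[] args) {
        // In real use the status would come from XtractorGroup.processAndFormat(input).
        for (int status : new int[] {-1, 0, 1, 2}) {
            System.out.println(status + ": " + describe(status));
        }
    }
}
```

Since processAndFormat does not throw, a status of -1 or 2 is the caller's only signal that some extractor or formatter did not complete.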