Class XtractorGroup

java.lang.Object
org.opensextant.processing.XtractorGroup

public class XtractorGroup extends Object
A Group of Xponent Extractors. An Extractor has a simple interface:
 + configure()
 + extract()
 

Configure each Extractor, then add it to the stack here. Once you have added Extractors to your XtractorGroup, call XtractorGroup.setup(). Because one of several processors may throw an exception while the others succeed, the API does not throw exceptions that would fail a document completely. If you need access to the exceptions thrown by each processor or formatter, adapt the XtractorGroup by re-implementing its internal loops.
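
The failure-isolation idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the XtractorGroup source: SketchExtractor, TolerantGroup, and their methods are hypothetical stand-ins (they are not the Xponents Extractor or TextMatch types), and plain strings stand in for inputs and matches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an extractor; NOT the Xponents Extractor interface. */
interface SketchExtractor {
    String name();

    List<String> extract(String text) throws Exception;
}

/** Illustrative group that isolates failures per extractor instead of throwing. */
class TolerantGroup {
    private final List<SketchExtractor> extractors = new ArrayList<>();
    private final List<String> currErrors = new ArrayList<>();

    void addExtractor(SketchExtractor x) {
        extractors.add(x);
    }

    /** Runs every extractor; a failure in one does not stop the others. */
    List<String> process(String text) {
        List<String> matches = new ArrayList<>();
        for (SketchExtractor x : extractors) {
            try {
                matches.addAll(x.extract(text));
            } catch (Exception err) {
                // Assumption: record a message rather than rethrow, so the
                // remaining extractors still see the input.
                currErrors.add(x.name() + ": " + err.getMessage());
            }
        }
        return matches;
    }

    /** Returns a copy of the errors accumulated since the last reset(). */
    List<String> errors() {
        return new ArrayList<>(currErrors);
    }

    /** Clears accumulated errors between inputs. */
    void reset() {
        currErrors.clear();
    }
}
```

A caller that needs the underlying exceptions, rather than messages, could collect the Exception objects in the catch block instead; that is the kind of adaptation the paragraph above suggests.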

Author:
ubaldino
  • Field Details

    • extractors

      protected List<Extractor> extractors
      API: child implementations have access to the core list of extractors.
    • formatters

      protected List<ResultsFormatter> formatters
      API: child implementations have access to the core list of formatters.
    • log

      protected org.slf4j.Logger log
      API: child implementations should recreate their own logger.
    • currErrors

      protected List<String> currErrors
      API: child implementations have access to accumulated errors; reset() clears errors and other state.
  • Constructor Details

    • XtractorGroup

      public XtractorGroup()
  • Method Details

    • addExtractor

      public void addExtractor(Extractor xprocessor)
    • addFormatter

      public void addFormatter(ResultsFormatter formatter)
    • process

      public List<TextMatch> process(TextInput input)
      Process one input. Use this if you have no need to format output at this time. Also use this approach if you have complex ExtractionResults to which you want to add meta attributes.
    • format

      public int format(ExtractionResult compilation)
      Format each result. Some formatters may pass on (skip) some results; for example, the Shapefile formatter accepts only geocoding-capable TextMatch.
    • cleanupAll

      public void cleanupAll()
      Use only if you intend to shut down.
    • reset

      public void reset()
      DRAFT: still figuring out the rules for 'reset' between processing or inputs.
    • processAndFormat

      public int processAndFormat(TextInput input)
      Processes input content against all extractors and all formatters. This does not throw exceptions, because some processing may fail while other processing succeeds.

      TODO: Processing/Formatting details would have to be retrieved by calling some other method that is statefully tracking such things.

      Parameters:
      input -
      Returns:
      status: -1 failure; 0 nothing found; 1 found matches and formatted them; 2 found content but nothing formatted.
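
Because processAndFormat reports its outcome only through an integer, callers may find it convenient to name the documented values. The helper below is a hypothetical sketch: StatusCodes and its constant names are not part of the API, and only the four numeric values and their meanings come from the description above.

```java
/** Hypothetical helper; the numeric values mirror the documented return codes. */
final class StatusCodes {
    static final int FAILURE = -1;
    static final int NOTHING_FOUND = 0;
    static final int FOUND_AND_FORMATTED = 1;
    static final int FOUND_NOT_FORMATTED = 2;

    private StatusCodes() {
        // Utility class; not meant to be instantiated.
    }

    /** Maps a status code to a short human-readable description. */
    static String describe(int status) {
        switch (status) {
            case FAILURE:
                return "failure";
            case NOTHING_FOUND:
                return "nothing found";
            case FOUND_AND_FORMATTED:
                return "found matches and formatted them";
            case FOUND_NOT_FORMATTED:
                return "found content but nothing formatted";
            default:
                // Assumption: any other value is treated as unknown.
                return "unknown status " + status;
        }
    }
}
```

A caller might log StatusCodes.describe(status) after each call and inspect the accumulated errors only when the status is -1.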