Class AbstractFlexPat

  • All Implemented Interfaces:
    Extractor
    Direct Known Subclasses:
    PatternsOfLife, XCoord, XTemporal

    public abstract class AbstractFlexPat
    extends java.lang.Object
    implements Extractor
    FlexPat extractor: given a set of pattern families, extracts, filters, and normalizes matches.
    Author:
    ubaldino
    • Field Detail

      • match_width

        protected int match_width
        Match width, in characters. The SHP DBF limit is 255 bytes, so shapefile (SHP) output writers should decide at write time how and when to truncate the match width. In practice, the most useful amount of pre/post text has been about 200-250 characters.
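        The note about curtailing width at output time can be illustrated with a minimal sketch. It assumes the output is encoded as UTF-8 and that the 255-byte limit applies to the encoded bytes. The class and method names (DbfWidthSketch, truncateUtf8) are hypothetical and are not part of this API.

```java
final class DbfWidthSketch {
    /**
     * Returns the longest prefix of s whose UTF-8 encoding fits in maxBytes,
     * without splitting a code point. Hypothetical helper, not a library method.
     */
    static String truncateUtf8(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // UTF-8 length of one code point: 1, 2, 3 or 4 bytes.
            int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + size > maxBytes) {
                break;
            }
            bytes += size;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }
}
```

        An output writer could call truncateUtf8(context, 255) just before writing a field, leaving match_width itself unchanged for other output formats.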
      • log

        protected org.slf4j.Logger log
      • debug

        protected boolean debug
      • patterns_file

        protected java.lang.String patterns_file
    • Constructor Detail

      • AbstractFlexPat

        public AbstractFlexPat()
      • AbstractFlexPat

        public AbstractFlexPat​(boolean b)
    • Method Detail

      • createPatternManager

        protected abstract RegexPatternManager createPatternManager​(java.io.InputStream s,
                                                                    java.lang.String name)
                                                             throws java.io.IOException
        Create a pattern manager given the input stream and the file name.
        Parameters:
        s - stream of patterns config file
        name - app name
        Returns:
        the regex pattern manager
        Throws:
        java.io.IOException - Signals that an I/O exception has occurred.
      • configure

        public void configure​(java.lang.String patfile)
                       throws ConfigException
        Configure using a particular pattern file.
        Specified by:
        configure in interface Extractor
        Parameters:
        patfile - a pattern file.
        Throws:
        ConfigException - if the pattern file is not found
      • configure

        public void configure​(java.net.URL patfile)
                       throws ConfigException
        Configure using a URL pointer to the pattern file.
        Specified by:
        configure in interface Extractor
        Parameters:
        patfile - patterns file URL
        Throws:
        ConfigException - if the pattern file is not found
      • setMatchWidth

        public void setMatchWidth​(int w)
        Match width is the amount of text buffered before and after a TextMatch. Match buffers are used to create a match ID.
        Parameters:
        w - width
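        As a rough illustration of what a match width means, the sketch below cuts up to w characters of text on each side of a match span. It is not the library's implementation; MatchContextSketch, Context, and around are hypothetical names, and the span is assumed to use an exclusive end offset.

```java
final class MatchContextSketch {
    /** Pre- and post-match text around a span. Hypothetical holder, not a library type. */
    static final class Context {
        final String pre;
        final String post;

        Context(String pre, String post) {
            this.pre = pre;
            this.post = post;
        }
    }

    /**
     * Returns up to width characters before start and after end,
     * clamped to the bounds of text. The end offset is exclusive.
     */
    static Context around(String text, int start, int end, int width) {
        if (start < 0 || end < start || end > text.length() || width < 0) {
            throw new IllegalArgumentException("invalid span or width");
        }
        int preStart = Math.max(0, start - width);
        // Long arithmetic avoids int overflow for very large widths.
        int postEnd = (int) Math.min((long) text.length(), (long) end + width);
        return new Context(text.substring(preStart, start), text.substring(end, postEnd));
    }
}
```

        For the text "Meet at 42N 071W near the bridge" with a match at offsets 8 to 16 and a width of 5, this yields "t at " as the pre-match text and " near" as the post-match text.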
      • set_match_id

        protected void set_match_id​(TextMatch m,
                                    int count)
        Optional. Assigns an identifier to each TextMatch found. The identifier is an MD5 hash of the match in situ. If context is provided, it is used to generate the identity. If a count is provided, it is used. Otherwise, only the pattern ID and text value are used.
        Parameters:
        m - a TextMatch
        count - incrementor used for uniqueness
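        The ID scheme described above could look roughly like the following sketch. The way the seed string is assembled is an assumption: the documentation states only which inputs are preferred, not their format. MatchIdSketch, md5Hex, and matchId are hypothetical names, and a negative count is assumed to mean "no count provided".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class MatchIdSketch {
    /** Hex-encoded MD5 of the UTF-8 bytes of seed. */
    static String md5Hex(String seed) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(seed.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Picks the seed in the documented order of preference:
     * context if present, else the count, else pattern ID + text only.
     * The seed layout itself is an assumption made for this sketch.
     */
    static String matchId(String patternId, String text, String context, int count) {
        String seed;
        if (context != null && !context.isEmpty()) {
            seed = context;
        } else if (count >= 0) {
            seed = patternId + text + count;
        } else {
            seed = patternId + text;
        }
        return md5Hex(seed);
    }
}
```

        With this layout, two matches that share a pattern ID and text but carry different counts get different IDs, which is the uniqueness the count parameter is meant to provide.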
      • enableAll

        public void enableAll()
      • disableAll

        public void disableAll()
      • updateProgress

        public void updateProgress​(double progress)
      • markComplete

        public void markComplete()