Class PatternManager


  • public final class PatternManager
    extends RegexPatternManager

    This is the culmination of various coordinate extraction efforts in Python and Java. This API makes no assumptions about input data or execution.

    Common Coordinate Enumeration (CCE) is a scheme for enumerating coordinate representations. See XConstants for details. The basics of CCE are a family (DD, DMS, MGRS, etc.) and a style (enumerated in the patterns config file).

    Features of the REGEX patterns file:

    • DEFINE - a component of a coord pattern to match
    • RULE - a complete pattern to match
    • TEST - an example of the text the pattern should match in part or whole.

    The rules file is an external text file containing rules, written as regular expressions, that identify geocoords. Below is an example of what a simple rule might look like:

     // Parts of a decimal degree Latitude/Longitude
     #DEFINE  decDegLat   \d?\d\.\d{1,20}
     #DEFINE  decDegLon   [0-1]?\d?\d\.\d{1,20}
    
     // TARGET: DD-xx, Decimal Deg, Preceding Hemisphere (a) H DD.DDDDDD° HDDD.DDDDDD°, optional deg symbol
     #RULE   DD      01      <hemiLatPre>\s?<decDegLat><degSym>?\s*<latlonSep>?\s*<hemiLonPre>\s?<decDegLon><degSym>?
     #TEST   DD      01      N42.3, W102.4
     
    The DEFINE statements declare named fields that the PatternManager recalls at runtime. A RULE is a composition of DEFINEs, other literals, and regex patterns; each rule must have a family and a rule ID within that family. A TEST statement is enumerated with the same family and rule ID as the RULE it exercises. At runtime every test is further labeled with an incrementor: for example, if a TEST for "DD-01" is the eighth test in the pattern file, it is labeled internally as DD-01#8.
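The composition of DEFINEs into a RULE can be illustrated with a minimal, standalone sketch. It is not the PatternManager implementation: the class name, the `expand` helper, and the bodies of `hemiLatPre`, `hemiLonPre`, `degSym`, and `latlonSep` are assumptions made for illustration, since the example above only defines `decDegLat` and `decDegLon`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DefineSubstitutionSketch {

    // RULE body from the example above, with <name> placeholders for DEFINEs.
    static final String RULE =
        "<hemiLatPre>\\s?<decDegLat><degSym>?\\s*<latlonSep>?\\s*<hemiLonPre>\\s?<decDegLon><degSym>?";

    static Map<String, String> exampleDefines() {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("decDegLat", "\\d?\\d\\.\\d{1,20}");        // taken from the example above
        d.put("decDegLon", "[0-1]?\\d?\\d\\.\\d{1,20}");  // taken from the example above
        d.put("hemiLatPre", "[NS]");   // assumed for illustration
        d.put("hemiLonPre", "[EW]");   // assumed for illustration
        d.put("degSym", "\u00B0");     // assumed for illustration (degree sign)
        d.put("latlonSep", "[,/]");    // assumed for illustration
        return d;
    }

    /** Replaces every <name> placeholder with its DEFINE body in a non-capturing group. */
    static String expand(String rule, Map<String, String> defines) {
        Matcher m = Pattern.compile("<(\\w+)>").matcher(rule);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = defines.get(m.group(1));
            if (body == null) {
                throw new IllegalArgumentException("Undefined field: " + m.group(1));
            }
            // quoteReplacement keeps backslashes in the DEFINE body literal.
            m.appendReplacement(out, Matcher.quoteReplacement("(?:" + body + ")"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(expand(RULE, exampleDefines()));
        Matcher m = p.matcher("N42.3, W102.4");   // the TEST text for DD-01
        System.out.println(m.find() ? "matched: " + m.group() : "no match");
    }
}
```

Non-capturing groups are used here because `degSym` appears twice in the rule, and Java rejects a pattern that declares the same named group more than once. Under the assumed DEFINEs, the expanded rule should match the full TEST text.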
    Author:
    dlutz, MITRE creator (lutzdavp), ubaldino, MITRE adaptor, swainza
    • Field Detail

      • CCE_family_state

        public java.util.Map<java.lang.Integer,​java.lang.Boolean> CCE_family_state
    • Constructor Detail

      • PatternManager

        public PatternManager​(java.io.InputStream s,
                              java.lang.String n)
                       throws java.io.IOException
        Throws:
        java.io.IOException
    • Method Detail

      • initialize

        public void initialize​(java.io.InputStream io)
                        throws java.io.IOException
        Description copied from class: RegexPatternManager
        Initializes the pattern manager implementations. Reads the DEFINEs and RULEs from the pattern file and does the requisite substitutions. After initialization patterns HashMap will be populated.
        Overrides:
        initialize in class RegexPatternManager
        Parameters:
        io - stream
        Throws:
        java.io.IOException
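As a rough illustration of the kind of work initialization involves, the sketch below reads `#DEFINE`, `#RULE`, and `#TEST` lines from a stream and labels each test with a file-wide incrementor. It is a standalone approximation under stated assumptions, not the RegexPatternManager code: the class `PatternFileSketch`, its fields, and the whitespace-splitting approach are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PatternFileSketch {
    final Map<String, String> defines = new LinkedHashMap<>();  // NAME -> regex fragment
    final Map<String, String> rules = new LinkedHashMap<>();    // FAMILY-RID -> raw rule
    final List<String> testLabels = new ArrayList<>();          // FAMILY-RID#n

    /** Reads directive lines; anything else (comments, blanks) is ignored. */
    void read(InputStream in) throws IOException {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int testCount = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.startsWith("#DEFINE")) {
                String[] f = line.split("\\s+", 3);   // #DEFINE NAME REGEX
                if (f.length == 3) defines.put(f[1], f[2]);
            } else if (line.startsWith("#RULE")) {
                String[] f = line.split("\\s+", 4);   // #RULE FAMILY RID REGEX
                if (f.length == 4) rules.put(f[1] + "-" + f[2], f[3]);
            } else if (line.startsWith("#TEST")) {
                String[] f = line.split("\\s+", 4);   // #TEST FAMILY RID TEXT
                if (f.length == 4) {
                    testCount++;                       // incrementor spans the whole file
                    testLabels.add(f[1] + "-" + f[2] + "#" + testCount);
                }
            }
        }
    }

    /** Convenience wrapper for in-memory text. */
    static PatternFileSketch parse(String text) {
        PatternFileSketch sketch = new PatternFileSketch();
        try {
            sketch.read(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sketch;
    }

    public static void main(String[] args) {
        PatternFileSketch s = parse(String.join("\n",
            "// Parts of a decimal degree Latitude/Longitude",
            "#DEFINE  decDegLat   \\d?\\d\\.\\d{1,20}",
            "#RULE   DD      01      <hemiLatPre>\\s?<decDegLat>",
            "#TEST   DD      01      N42.3, W102.4"));
        System.out.println(s.defines.keySet() + " " + s.rules.keySet() + " " + s.testLabels);
    }
}
```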
      • enable_CCE_family

        public void enable_CCE_family​(int cce_fam,
                                      boolean enabled)
        Parameters:
        cce_fam - the CCE family identifier
        enabled - whether that family is enabled
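The family switch pairs naturally with a `Map<Integer, Boolean>` such as `CCE_family_state`. The following is a minimal sketch of that idea only: the class, the family constants, and the choice to treat unknown families as disabled are assumptions, not the documented behavior of PatternManager (the real family codes are described in XConstants).

```java
import java.util.HashMap;
import java.util.Map;

final class FamilyStateSketch {
    // Hypothetical family codes, for illustration only.
    static final int DD_FAMILY = 1;
    static final int MGRS_FAMILY = 2;

    private final Map<Integer, Boolean> familyState = new HashMap<>();

    /** Records whether patterns of the given family should be used. */
    void enableFamily(int family, boolean enabled) {
        familyState.put(family, enabled);
    }

    /** Assumption: a family that was never configured counts as disabled. */
    boolean isEnabled(int family) {
        return familyState.getOrDefault(family, Boolean.FALSE);
    }

    public static void main(String[] args) {
        FamilyStateSketch state = new FamilyStateSketch();
        state.enableFamily(DD_FAMILY, true);
        state.enableFamily(MGRS_FAMILY, false);
        System.out.println("DD enabled: " + state.isEnabled(DD_FAMILY));
        System.out.println("MGRS enabled: " + state.isEnabled(MGRS_FAMILY));
    }
}
```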
      • create_pattern

        protected RegexPattern create_pattern​(java.lang.String fam,
                                              java.lang.String rule,
                                              java.lang.String desc)
        Implementation must create a RegexPattern given the basic RULE definition, #RULE FAMILY RID REGEX. PatternManager here adds the compiled pattern and DEFINEs.
        Specified by:
        create_pattern in class RegexPatternManager
        Parameters:
        fam -
        rule -
        desc -
        Returns:
      • validate_pattern

        protected boolean validate_pattern​(RegexPattern repat)
        Implementation has the option to check a pattern; for now, invalid patterns are only logged.
        Specified by:
        validate_pattern in class RegexPatternManager
        Parameters:
        repat -
        Returns:
      • create_testcase

        protected PatternTestCase create_testcase​(java.lang.String id,
                                                  java.lang.String fam,
                                                  java.lang.String text)
        Implementation must create TestCases given the #TEST directive, #TEST RID TID TEXT
        Specified by:
        create_testcase in class RegexPatternManager
        Parameters:
        id -
        text -
        fam -
        Returns: