Package org.opensextant.extractors.xcoord
XCoord: Geographic Coordinate Extraction
XCoord is a developer toolkit for extracting 3 major forms of coordinate patterns from any textual data:
- UTM - Universal Transverse Mercator
- MGRS - Military Grid Reference System
- Degrees, Minutes, Seconds and variants (DD, DM, DMS)
XCoord allows the user to define their own coordinate patterns or
      extend the default patterns.  There are about 2 dozen
      coordinate patterns, defined here:  ./doc/XCoord_Patterns.htm
      
    
Usage
From the command line you can quickly test XCoord on a set of
      given test cases or provide a file of your own.
    
ant -f ./script/testing.xml test-xcoord
file? test/mytest.txt
ant test-default
... runs internal unit tests coupled with the given patterns
      configuration file
    
Programmatically, the essential usage is:
    
XCoord xc = new XCoord();
xc.configure();
TextMatchResult geocodes = xc.extract_coordinates(text, text_id);
//... Now iterate over geocodes.matches
Alternatively, the Extractor.extract() interface implemented by XCoord is leaner still:
// 1. just text as an input.
List<TextMatch> geocodes1 = xc.extract(text);
// 2. pass in a TextInput argument, for example DocInput that represents a document.
List<TextMatch> geocodes2 = xc.extract( new TextInput(text, text_id));
In the first case you can extract coordinates from any string of
      text. In the second case, if you are managing your input records
      using some identifiers and want to carry such IDs on through your
      extraction results, use the TextInput method.
    
Tuning performance happens at many levels. XCoord can toggle each coordinate pattern family (UTM, MGRS, DM, DMS, DD) if only a limited or known set of formats is desired. In addition, when embedding XCoord into other systems (such as its parent project OpenSextant), XCoord can be given a configuration file, for example, xc.configure("mypatterns.cfg"). Such configuration files must currently be on the CLASSPATH.
When interpreting GeocodingResults, the caller of XCoord should
      check whether an individual match is a submatch (GeocoordMatch.is_submatch).
      Each pattern is assessed individually, so multiple matches may
      produce overlapping annotations.  The intention is that the longest
      distinct match is the most relevant for any given span of text, although
      in some uses all matches are worth seeing.  To be clear,
      matches that are contained entirely within other matches are
      marked as submatches and are therefore less likely to be the item of
      interest for geocoding.  Other matches may partially overlap (GeocoordMatch.is_overlap = true).
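The submatch and overlap distinction described above can be illustrated with a small, self-contained sketch. SpanDemo and its fields are hypothetical stand-ins invented for this example; they are not the GeocoordMatch class, and XCoord's own marking logic may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a match span; not the XCoord GeocoordMatch class. */
final class SpanDemo {
    final int start;      // inclusive character offset
    final int end;        // exclusive character offset
    boolean isSubmatch;   // fully contained in a longer span
    boolean isOverlap;    // partially overlaps another span

    SpanDemo(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /** Mark each span as a submatch or a partial overlap relative to the others. */
    static void mark(List<SpanDemo> spans) {
        for (SpanDemo a : spans) {
            for (SpanDemo b : spans) {
                if (a == b) continue;
                boolean insideB = b.start <= a.start && a.end <= b.end;
                boolean containsB = a.start <= b.start && b.end <= a.end;
                boolean intersects = a.start < b.end && b.start < a.end;
                if (insideB && (b.end - b.start) > (a.end - a.start)) {
                    a.isSubmatch = true;
                } else if (intersects && !insideB && !containsB) {
                    a.isOverlap = true;
                }
            }
        }
    }

    /** Keep only the spans that are not contained in a longer span. */
    static List<SpanDemo> withoutSubmatches(List<SpanDemo> spans) {
        List<SpanDemo> kept = new ArrayList<>();
        for (SpanDemo s : spans) {
            if (!s.isSubmatch) kept.add(s);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<SpanDemo> spans = Arrays.asList(
                new SpanDemo(0, 20), new SpanDemo(5, 12), new SpanDemo(15, 30));
        mark(spans);
        System.out.println(withoutSubmatches(spans).size() + " spans remain after dropping submatches");
    }
}
```

A caller that only wants the longest distinct match per span of text would apply a filter like withoutSubmatches; a caller that wants every candidate would skip it and inspect the flags instead.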
    
Pattern Definition
FlexPat (derived from a few other MITRE efforts) allows XCoord to
      define the coordinate patterns as regular expressions, using named
      pattern groups.  As of Java version 6, the Java regular
      expression (regex) capability does not support the full regex
      grammar, notably named pattern groups.  FlexPat was
      designed to address this gap in functionality as well as to
      provide a foundation for simple text matches, pattern definition,
      and pattern test cases.  See documentation in XCoord's
      PatternManager.
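To show what named pattern groups buy you, here is a minimal sketch that uses the named-group syntax built into java.util.regex since Java 7. It is not FlexPat and does not use FlexPat's pattern definition format. The pattern, the group names and the NamedGroupDemo class are assumptions made up for this illustration, and the pattern is far simpler than any of XCoord's defined patterns.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a simplified latitude pattern, not one of XCoord's patterns. */
final class NamedGroupDemo {
    // Named groups (Java 7+): dd-mm-ss separated by dashes, then a hemisphere letter.
    private static final Pattern LAT_DMS = Pattern.compile(
            "(?<deg>\\d{1,2})-(?<min>\\d{2})-(?<sec>\\d{2})\\s?(?<hemi>[NS])");

    /** Return the first latitude found in the text as decimal degrees, if any. */
    static OptionalDouble latitude(String text) {
        Matcher m = LAT_DMS.matcher(text);
        if (!m.find()) {
            return OptionalDouble.empty();
        }
        // Fields are read by name rather than by positional index.
        double deg = Integer.parseInt(m.group("deg"));
        double min = Integer.parseInt(m.group("min"));
        double sec = Integer.parseInt(m.group("sec"));
        double value = deg + min / 60.0 + sec / 3600.0;
        return OptionalDouble.of("S".equals(m.group("hemi")) ? -value : value);
    }

    public static void main(String[] args) {
        System.out.println(latitude("Site at 42-30-00N today"));
    }
}
```

Reading fields by name keeps the parsing code stable when a pattern gains or loses optional groups, which is the practical benefit of named groups for a family of coordinate patterns. This sketch does no range validation (for example, minutes below 60).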
    
    
Runtime Flags and Optimization
The use of configuration file parameters implies that you have
      one value for a parameter at runtime for the duration of the
      current process.  Since processing may be
      context-sensitive, we use static runtime flags (a bit mask of
      flags from XConstants) to influence and tune
      behavior.  Current flags include toggling
      coordinate pattern families and the option to extract context
      text.
    
XCoord.RUNTIME_FLAGS ^= XConstants.FILTER_DMS_ON; // Toggle the DMS filter using XOR (turns it OFF when it is currently on)
XCoord.RUNTIME_FLAGS |= XConstants.FLAG_ALL_FILTERS; // turn all filters back on, leaving other flags untouched
XCoord.RUNTIME_FLAGS = XConstants.FLAG_ALL_FILTERS; // return to default behavior with all filters.
      Other FLAG parameters will be added over time to allow XCoord
      behavior to be adapted at runtime.
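Because XOR flips a bit rather than clearing it, applying it twice turns a filter back on. The sketch below shows the difference between toggling, clearing and setting bits in a mask. The constant names and values in FlagDemo are hypothetical; the real flags live in XConstants and their values may differ.

```java
/** Hypothetical flag constants; the real values live in XConstants and may differ. */
final class FlagDemo {
    static final int FILTER_DMS = 0x01;   // assumed bit for the DMS filter
    static final int FILTER_MGRS = 0x02;  // assumed bit for the MGRS filter
    static final int ALL_FILTERS = FILTER_DMS | FILTER_MGRS;

    /** Clear a flag regardless of its current state. */
    static int disable(int flags, int flag) { return flags & ~flag; }

    /** Set a flag regardless of its current state. */
    static int enable(int flags, int flag) { return flags | flag; }

    /** Flip a flag: on becomes off, off becomes on. */
    static int toggle(int flags, int flag) { return flags ^ flag; }

    /** Test whether every bit of the flag is set. */
    static boolean isSet(int flags, int flag) { return (flags & flag) == flag; }

    public static void main(String[] args) {
        int flags = ALL_FILTERS;
        flags = toggle(flags, FILTER_DMS);   // DMS filter is now off
        flags = toggle(flags, FILTER_DMS);   // a second XOR turns it back on
        flags = disable(flags, FILTER_DMS);  // off, whatever the prior state
        System.out.println("DMS filter enabled: " + isSet(flags, FILTER_DMS));
    }
}
```

When the current state of the mask is not known, the AND-NOT form in disable is the safer way to switch a filter off.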
      
    
    
Class Description
- DMS Filters include ignoring these patterns: dd-dd-dd HH:MM:ss (where dd-dd-dd HH-MM-ss would be a valid coordinate as the field separators for lat/lon are the same).
- DMSOrdinate represents all the various fields a WGS84 cartesian coordinate could have.
- Resolution field for DMS.ms
- GeocoordMatch holds all the annotation data for the actual raw and normalized coordinate.
- Filtering matches is a matter of practicality.
- Represent a Hemisphere symbol and value.
- MGRS Filters include ignoring these patterns: 1234 123456 12345678 1234567890; recent calendar dates of the form ddMMMyyyy, "14DEC1990" (MGRS: 14D EC 19 90); recent calendar dates with time, ddMMMHHmm, "14DEC1200" (noon on 14DEC).
- This is the culmination of various coordinate extraction efforts in python and Java.
- Use this XCoord class for both test and development of patterns, as well as to extract coordinates at runtime.
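The MGRS filter rules listed above (purely numeric strings, and date-like strings such as "14DEC1990" or "14DEC1200") can be sketched roughly as follows. This is a simplified illustration, not XCoord's MGRS filter: the class name, the method name and the year window used for "recent" dates are all assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical filter sketch; not the XCoord MGRS filter implementation. */
final class MgrsFilterDemo {
    private static final Pattern DIGITS_ONLY = Pattern.compile("\\d+");
    private static final Pattern DATE_LIKE = Pattern.compile("(\\d{2})([A-Z]{3})(\\d{2})(\\d{2})");
    private static final Set<String> MONTHS = new HashSet<>(Arrays.asList(
            "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
            "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"));
    // Assumed window for what counts as a "recent" year.
    private static final int MIN_YEAR = 1900;
    private static final int MAX_YEAR = 2099;

    /** Return true if the candidate text looks like a number or a date, not MGRS. */
    static boolean looksLikeNonCoordinate(String text) {
        String t = text.replace(" ", "").toUpperCase(Locale.ROOT);
        if (DIGITS_ONLY.matcher(t).matches()) {
            return true; // e.g. 1234, 123456, 12345678, 1234567890
        }
        Matcher m = DATE_LIKE.matcher(t);
        if (!m.matches() || !MONTHS.contains(m.group(2))) {
            return false;
        }
        int day = Integer.parseInt(m.group(1));
        if (day < 1 || day > 31) {
            return false;
        }
        int hi = Integer.parseInt(m.group(3));
        int lo = Integer.parseInt(m.group(4));
        int year = hi * 100 + lo;                       // ddMMMyyyy reading
        boolean recentYear = year >= MIN_YEAR && year <= MAX_YEAR;
        boolean timeOfDay = hi <= 23 && lo <= 59;       // ddMMMHHmm reading
        return recentYear || timeOfDay;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeNonCoordinate("14DEC1990"));
    }
}
```

The month check is what separates a date from a grid reference in this sketch: a string shaped like two digits, three letters and four digits is only rejected when the letters form a month abbreviation.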