Class MajorPlaceRule

java.lang.Object
org.opensextant.extractors.geo.rules.GeocodeRule
org.opensextant.extractors.geo.rules.MajorPlaceRule

public class MajorPlaceRule extends GeocodeRule
Major Place rule. Fire this rule after the Country rule: find all countries in scope first, then major places. Inferring country from major places first yields many false positives, because the country name space is smaller and more reliable. There are many caveats: these rules enforce the notion that country names are the drivers and major places amplify them. Two cases apply. (1) If we see a national capital, we can infer its country, provided no countries have been seen in the document. (2) If we see a major place, we add that evidence, weighting it higher if the country of that major place is also mentioned in the document.
Author:
ubaldino
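The two cases in the class description can be sketched as follows. This is a minimal illustration, not the OpenSextant implementation: the class name, method names, country codes, and the weight constants are all hypothetical, chosen only to show the country-first logic.

```java
import java.util.Collections;
import java.util.Set;

/** Illustrative sketch only; names and weights are assumptions, not OpenSextant code. */
final class MajorPlaceEvidenceSketch {

    // Assumed values: the real rule's weights are not given in this page.
    static final double BASE_WEIGHT = 1.0;
    static final double COUNTRY_BONUS = 1.0;

    /**
     * Case 1: a national capital implies its country, but only when
     * no country has been seen anywhere in the document.
     * Returns the inferred country code, or null if nothing is inferred.
     */
    static String inferCountry(boolean isCapital, String capitalCountry, Set<String> countriesSeen) {
        return (isCapital && countriesSeen.isEmpty()) ? capitalCountry : null;
    }

    /**
     * Case 2: a major place always contributes evidence, and contributes
     * more when its country is also mentioned in the document.
     */
    static double majorPlaceWeight(String placeCountry, Set<String> countriesSeen) {
        return countriesSeen.contains(placeCountry) ? BASE_WEIGHT + COUNTRY_BONUS : BASE_WEIGHT;
    }

    public static void main(String[] args) {
        Set<String> none = Collections.emptySet();
        Set<String> france = Collections.singleton("FR");
        System.out.println(inferCountry(true, "FR", none));
        System.out.println(majorPlaceWeight("FR", france));
    }
}
```

Keeping country detection separate from the major-place weighting mirrors the ordering described above: countries drive, major places amplify.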
  • Constructor Details

    • MajorPlaceRule

      public MajorPlaceRule(Map<String,Integer> populationStats)
      Major Place assigns a score to places that are national capitals, provinces, or cities with sizable population. Log(population) adds up to one point to the place weight. Population data is indexed by location/grid using geohash (source: geonames.org). Population stats are deterministic: they do not change during processing and are not context-specific, so population is assessed once per location, not once per mention.
      Parameters:
      populationStats - optional population stats.
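One way to read "Log(population) adds up to one point" is a log-scaled weight capped at 1.0, computed once per grid cell. The sketch below assumes a base-10 logarithm and a cap at 10 million inhabitants. Both are illustrative choices, not the library's actual formula, and the class, method names, and geohash keys are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; the scaling is an assumption, not OpenSextant code. */
final class PopulationWeightSketch {

    // Assumed cap: a population of 10 million or more earns the full point.
    static final double LOG_CAP = 7.0; // log10(10,000,000)

    /** Maps a population count to a weight in [0, 1] on a log10 scale. */
    static double populationWeight(int population) {
        if (population <= 1) {
            return 0.0; // no data, or too small to matter
        }
        return Math.min(1.0, Math.log10(population) / LOG_CAP);
    }

    /**
     * Computes the weight once per geohash cell. Because population stats
     * do not change during processing, every later mention of a place in
     * the same cell can reuse the precomputed value.
     */
    static Map<String, Double> precompute(Map<String, Integer> populationStats) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : populationStats.entrySet()) {
            weights.put(e.getKey(), populationWeight(e.getValue()));
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Integer> stats = new HashMap<>();
        stats.put("u4pr", 1_000_000); // hypothetical geohash prefix and population
        stats.put("9q8y", 50_000);    // hypothetical geohash prefix and population
        System.out.println(precompute(stats));
    }
}
```

Precomputing by cell reflects the point made above: the stats are not context-specific, so there is nothing to gain from recomputing them per mention.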
  • Method Details

    • reset

      public void reset()
      Description copied from class: GeocodeRule
      No-op, unless overridden.
      Overrides:
      reset in class GeocodeRule
    • evaluate

      public void evaluate(List<PlaceCandidate> names)
      Overrides:
      evaluate in class GeocodeRule
      Parameters:
      names - list of found place names
    • isRuleFor

      public static boolean isRuleFor(PlaceCandidate pc)
      Determine if this rule was applied to the candidate.
      Parameters:
      pc - the place candidate to check
      Returns:
      true if this rule was applied to the candidate
    • evaluate

      public void evaluate(PlaceCandidate name, org.opensextant.data.Place geo)
      Attaches either a Capital or Admin region ID to the candidate, giving it some weight based on various properties or context.
      Specified by:
      evaluate in class GeocodeRule
      Parameters:
      name - matched name in text
      geo - gazetteer entry or location