Class PersonNameFilter
java.lang.Object
org.opensextant.extractors.geo.rules.GeocodeRule
org.opensextant.extractors.geo.rules.PersonNameFilter
-
Field Summary
Modifier and Type, Field, and Description
static final String NAME_IN_ORG_RULE
Rule fired if a location is found in an organization name; only the organization should be filtered out.
Fields inherited from class org.opensextant.extractors.geo.rules.GeocodeRule
AVG_WORD_LEN, boundaryObserver, coordObserver, countryObserver, defaultMethod, LEX1, LEX2, locationOnly, log, LOWERCASE, NAME, textCase, UPPERCASE, weight
-
Constructor Summary
Constructor and Description
PersonNameFilter(String namesPath, String persTitlesPath, String persSuffixesPath)
Default constructor; takes resource paths, which are retrieved with getResourceAsStream() instead of as resource URLs or files.
PersonNameFilter(URL names, URL persTitles, URL persSuffixes)
Constructor for general usage if you know your files might come from the file system or a JAR.
-
Method Summary
Modifier and Type, Method, and Description
void evaluate(List<PlaceCandidate> names)
Evaluate the place name purely based on previous rules or the lexical nature of the name, and not any geography, so this parent method is overridden and returns true always.
void evaluate(PlaceCandidate name, org.opensextant.data.Place geo)
The one evaluation scheme that all rules must implement.
void evaluateNamedEntities(org.opensextant.data.TextInput input, List<PlaceCandidate> placeNames, List<TaxonMatch> persons, List<TaxonMatch> orgs, List<TaxonMatch> others)
Use known person names to distinguish well-known persons that may or may not overlap in the text and the namespace.
void reset()
No-op, unless overridden.
Methods inherited from class org.opensextant.extractors.geo.rules.GeocodeRule
filterByNameOnly, filterOutByFrequency, internalPlaceID, isRelevant, isShort, logMsg, sameBoundary, sameCountry, sameCountry, sameLexicalName, setBoundaryObserver, setCountryObserver, setDefaultMethod, setGeohash, setLocationObserver, setTextCase, textCase
-
Field Details
-
NAME_IN_ORG_RULE
Rule fired if a location is found in an organization name; only the organization should be filtered out.
-
-
Constructor Details
-
PersonNameFilter
public PersonNameFilter(URL names, URL persTitles, URL persSuffixes) throws org.opensextant.ConfigException
Constructor for general usage if you know your files might come from the file system or a JAR.
Parameters:
names
persTitles
persSuffixes
Throws:
org.opensextant.ConfigException - when filter files are missing.
-
PersonNameFilter
public PersonNameFilter(String namesPath, String persTitlesPath, String persSuffixesPath) throws org.opensextant.ConfigException
Default constructor; takes resource paths, which are retrieved with getResourceAsStream() instead of as resource URLs or files. This works best if you know your resource files will come from a JAR only.
Parameters:
namesPath
persTitlesPath
persSuffixesPath
Throws:
org.opensextant.ConfigException - when filter files are missing.
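The difference between the two constructor styles (a classpath resource path read with getResourceAsStream() versus a URL that may point at the file system or into a JAR) can be illustrated without the library. The following is a minimal sketch, not the PersonNameFilter implementation: the class NameListLoadingSketch, its methods, and the one-name-per-line file format are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of org.opensextant.
public class NameListLoadingSketch {

    /** Reads trimmed, non-blank lines from a stream (assumed format: one name per line). */
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    lines.add(trimmed);
                }
            }
        }
        return lines;
    }

    /** URL style: works for file: URLs and for jar: URLs alike. */
    public static List<String> loadFromUrl(URL url) throws IOException {
        return readLines(url.openStream());
    }

    /** Resource-path style: the path is resolved on the classpath only. */
    public static List<String> loadFromResource(String path) throws IOException {
        InputStream in = NameListLoadingSketch.class.getResourceAsStream(path);
        if (in == null) {
            // getResourceAsStream returns null rather than throwing when nothing is found.
            throw new IOException("Resource not found on classpath: " + path);
        }
        return readLines(in);
    }
}
```

In this sketch a missing resource surfaces as an IOException; the documented constructors report missing filter files with org.opensextant.ConfigException instead.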
-
-
Method Details
-
reset
public void reset()
Description copied from class: GeocodeRule
No-op, unless overridden.
Overrides:
reset in class GeocodeRule
-
getPersonNames
-
getOrgNames
-
evaluateNamedEntities
public void evaluateNamedEntities(org.opensextant.data.TextInput input, List<PlaceCandidate> placeNames, List<TaxonMatch> persons, List<TaxonMatch> orgs, List<TaxonMatch> others)
Use known person names to distinguish well-known persons that may or may not overlap in the text and the namespace. Example:
Hillary Clinton visited New York state today.
Here, Clinton is part of a well-known celebrity's name and does not refer to Clinton, NY, a town upstate. We identify all such person names and mark any overlaps and co-references that coincide with tagged place names.
Parameters:
placeNames - places to negate
persons - named persons in doc
orgs - named orgs in doc
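The overlap check described for evaluateNamedEntities can be sketched in isolation. This is an illustration under stated assumptions, not the library's code: the Span class and markPersonOverlaps method are hypothetical stand-ins for PlaceCandidate and TaxonMatch, offsets are assumed to be half-open character ranges, and co-reference handling is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of org.opensextant.
public class PersonOverlapSketch {

    /** Stand-in for a tagged span: matched text plus half-open offsets [start, end). */
    public static final class Span {
        public final String text;
        public final int start;
        public final int end;

        public Span(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        /** True when the two character ranges share at least one position. */
        public boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }
    }

    /**
     * Returns the place spans that overlap any person span. A real filter would
     * mark such candidates as filtered out instead of returning them.
     */
    public static List<Span> markPersonOverlaps(List<Span> places, List<Span> persons) {
        List<Span> negated = new ArrayList<>();
        for (Span place : places) {
            for (Span person : persons) {
                if (place.overlaps(person)) {
                    negated.add(place);
                    break; // one overlapping person is enough
                }
            }
        }
        return negated;
    }

    public static void main(String[] args) {
        // Offsets refer to: "Hillary Clinton visited New York state today."
        Span person = new Span("Hillary Clinton", 0, 15);
        Span clinton = new Span("Clinton", 8, 15);
        Span newYork = new Span("New York", 24, 32);

        List<Span> negated = markPersonOverlaps(List.of(clinton, newYork), List.of(person));
        for (Span span : negated) {
            System.out.println(span.text);
        }
    }
}
```

With these spans only the "Clinton" candidate is returned, because it lies inside the person span, while "New York" is left for later rules.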
-
evaluate
Evaluate the place name purely based on previous rules or the lexical nature of the name, and not any geography, so this parent method is overridden and returns true always. That shunts the geo evaluation.
Overrides:
evaluate in class GeocodeRule
Parameters:
names - list of found place names
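How overriding the list-level method bypasses per-location evaluation can be sketched with a small class hierarchy. Everything here is hypothetical (Candidate, Rule, GeoRule, LexicalOnlyRule, and the assumption that the parent method loops over each candidate's locations); it illustrates the pattern only and does not reproduce GeocodeRule.

```java
import java.util.List;

// Hypothetical sketch; not part of org.opensextant.
public class RuleOverrideSketch {

    /** Stand-in for a place candidate: matched text plus its possible gazetteer entries. */
    public static class Candidate {
        public final String text;
        public final List<String> geos;

        public Candidate(String text, List<String> geos) {
            this.text = text;
            this.geos = geos;
        }
    }

    /** Assumed base behavior: every candidate is checked against every location. */
    public abstract static class Rule {
        public int geoEvaluations = 0;

        public void evaluate(List<Candidate> names) {
            for (Candidate name : names) {
                for (String geo : name.geos) {
                    geoEvaluations++;
                    evaluate(name, geo);
                }
            }
        }

        public abstract void evaluate(Candidate name, String geo);
    }

    /** A geography-based rule keeps the default loop. */
    public static class GeoRule extends Rule {
        @Override
        public void evaluate(Candidate name, String geo) {
            // A geography check would go here.
        }
    }

    /** A purely lexical rule overrides the list-level method, so no location is visited. */
    public static class LexicalOnlyRule extends Rule {
        @Override
        public void evaluate(List<Candidate> names) {
            // Decisions would be made from the name alone.
        }

        @Override
        public void evaluate(Candidate name, String geo) {
            // Intentionally a no-op; never reached through the list-level method.
        }
    }

    public static void main(String[] args) {
        Candidate clinton = new Candidate("Clinton", List.of("Clinton, NY", "Clinton, IA"));

        Rule geoRule = new GeoRule();
        geoRule.evaluate(List.of(clinton));

        Rule lexicalRule = new LexicalOnlyRule();
        lexicalRule.evaluate(List.of(clinton));

        System.out.println(geoRule.geoEvaluations + " vs " + lexicalRule.geoEvaluations);
    }
}
```

The counter exists only to make the difference observable: the default loop visits both locations, while the overriding rule visits none.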
-
evaluate
Description copied from class: GeocodeRule
The one evaluation scheme that all rules must implement. Given a single text match and a location, consider if the geo is a good geocoding for the match.
Specified by:
evaluate in class GeocodeRule
Parameters:
name - matched name in text
geo - gazetteer entry or location
-