Package org.opensextant.extraction
Class TextMatch
java.lang.Object
org.opensextant.extraction.TextEntity
org.opensextant.extraction.TextMatch
- All Implemented Interfaces: Comparable<TextMatch>, MatchSchema
- Direct Known Subclasses: DateMatch, GeocoordMatch, PoliMatch
A variation on TextEntity that also records pattern metadata
- Author:
- ubaldino
-
Field Summary
Modifier and Type, Field, Description:
- pattern_id: the ID of the pattern that extracted this
- producer: A short label or tag representing the matcher, extractor, tagger, etc.
- protected String type: Type, as in Annotation type or code.
Fields inherited from class org.opensextant.extraction.TextEntity
end, is_duplicate, is_overlap, is_submatch, match_id, postChar, preChar, start, text
Fields inherited from interface org.opensextant.data.MatchSchema
VAL_COORD, VAL_COUNTRY, VAL_PLACE, VAL_POSTAL, VAL_TAXON
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description:
- int compareTo(TextMatch other): this match, A compared to B. Order: A B then A > B; Order: B A then A < B; Order: same spans then A == B
- void copy
- void defaultMatchId(): If called, this overwrites existing match_id. Match ID is typically entity label @ offset.
- getContentId: create a simple text-based identifier with form of value + start offset ...
- getMatchId: Future planning -- match_id may become private field in future API.
- getTextnorm: Get a normalized version of the text, lower case, punctuation and diacritics removed.
- getType()
- boolean isDefault(): Users of this class should set a non-default type via setType(String), otherwise the match remains default and generic.
- boolean isFilteredOut()
- boolean isSame: Case-insensitive comparison to another string
- boolean isSameNorm: Compare the normalized string for this match to that of another.
- void setFilteredOut(boolean b)
- void setType: Allow matchers and taggers to set a type label, e.g., pattern family or other string.
- toString()
Methods inherited from class org.opensextant.extraction.TextEntity
contains, copy, getContext, getContextAfter, getContextBefore, getLength, getText, isAfter, isASCII, isBefore, isLeftMatch, isLower, isMixedCase, isOverlap, isRightMatch, isSameMatch, isUpper, isWithin, isWithinChars, setContext, setContext, setText, setTextOnly
-
Field Details
-
pattern_id
The ID of the pattern that extracted this match.
-
producer
A short label or tag representing the matcher, extractor, tagger, etc. that produced this match.
-
type
Type, as in Annotation type or code.
-
-
Constructor Details
-
TextMatch
public TextMatch(int x1, int x2)
-
-
Method Details
-
getType
-
setType
Allow matchers and taggers to set a type label, e.g., pattern family or other string.
- Parameters:
t - type
-
toString
- Overrides: toString in class TextEntity
- Returns: string representation
-
copy
- Parameters:
m - a text match to copy to this instance
-
isSame
Case-insensitive comparison to another string.
- Parameters:
m - match
- Returns: true if the two texts are equal, ignoring case
-
isSameNorm
Compare the normalized string for this match to that of another.
- Parameters:
m
- Returns: true if getTextnorm() yields the same string for both matches.
-
isFilteredOut
public boolean isFilteredOut()
-
setFilteredOut
public void setFilteredOut(boolean b)
-
getTextnorm
Get a normalized version of the text: lower case, with punctuation and diacritics removed. If you want only part of this normalization, you may override this method.
- Returns: normalized version of text.
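The normalization described here can be approximated in plain Java. The following is a minimal sketch, not the library's implementation: the class name TextNormSketch and both helper methods are hypothetical, and the exact character classes removed by getTextnorm() are an assumption based only on the description above.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-in; this is NOT the OpenSextant implementation.
class TextNormSketch {

    // Lower-case the text and strip diacritics and punctuation.
    static String normalize(String text) {
        // Decompose accented letters into base letter + combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // Remove the combining marks (the diacritics).
        String noMarks = decomposed.replaceAll("\\p{M}+", "");
        // Remove Unicode punctuation; symbols such as '$' are left alone.
        String noPunct = noMarks.replaceAll("\\p{P}+", "");
        return noPunct.toLowerCase(Locale.ROOT).trim();
    }

    // Assumed analogue of isSameNorm: equal after normalization.
    static boolean sameNorm(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    public static void main(String[] args) {
        // "S\u00e3o" is "Sao" with a tilde on the 'a'.
        System.out.println(normalize("S\u00e3o Paulo, Brasil!")); // sao paulo brasil
        System.out.println(sameNorm("S\u00e3o Paulo", "SAO PAULO.")); // true
    }
}
```

A subclass that wants only part of this behavior, for example case folding without diacritic removal, would override getTextnorm() and apply just the steps it needs.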
-
isDefault
public boolean isDefault()
Users of this class should set a non-default type via setType(String); otherwise the match remains default and generic.
- Returns:
-
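The interplay between setType(String) and isDefault() can be illustrated with a small stand-alone class. This is a sketch under stated assumptions: TypedMatchSketch and the default label "generic" are hypothetical, since this page does not say what the default type value actually is.

```java
// Hypothetical stand-in; this is NOT the OpenSextant TextMatch class.
class TypedMatchSketch {
    // Assumed default label; the real default value is not documented here.
    static final String DEFAULT_TYPE = "generic";

    private String type = DEFAULT_TYPE;

    // A matcher or tagger sets a type label, e.g., a pattern family.
    void setType(String t) {
        this.type = t;
    }

    String getType() {
        return type;
    }

    // True until a caller replaces the default type.
    boolean isDefault() {
        return DEFAULT_TYPE.equals(type);
    }

    public static void main(String[] args) {
        TypedMatchSketch m = new TypedMatchSketch();
        System.out.println(m.isDefault()); // true
        m.setType("coordinate");
        System.out.println(m.isDefault()); // false
    }
}
```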
defaultMatchId
public void defaultMatchId()
If called, this overwrites the existing match_id. A match ID is typically entity label @ offset. Alternatively, a match ID could also be label + value + start offset ... to distinguish this text span from others.
-
getContentId
Create a simple text-based identifier of the form value + start offset ...
- Returns:
-
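The identifier forms mentioned for defaultMatchId() and getContentId() can be sketched as simple string builders. Everything here is illustrative: the class MatchIdSketch, its method names, and the "@" and "/" separators are assumptions, because this page gives the ingredients of each identifier but not its exact format.

```java
// Hypothetical helpers; the real identifier formats may differ.
class MatchIdSketch {

    // "entity label @ offset"; the "@" separator is assumed.
    static String labelAtOffset(String label, int start) {
        return label + "@" + start;
    }

    // "value + start offset"; the separator is assumed.
    static String contentId(String value, int start) {
        return value + "@" + start;
    }

    // "label + value + start offset", which also distinguishes
    // different labels that cover the same text span.
    static String labelValueAtOffset(String label, String value, int start) {
        return label + "/" + value + "@" + start;
    }

    public static void main(String[] args) {
        System.out.println(labelAtOffset("PLACE", 42));                // PLACE@42
        System.out.println(contentId("Boston", 42));                   // Boston@42
        System.out.println(labelValueAtOffset("PLACE", "Boston", 42)); // PLACE/Boston@42
    }
}
```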
getMatchId
Future planning: match_id may become a private field in a future API.
- Returns:
-
compareTo
This match, A, compared to B:
Order: A B then A > B
Order: B A then A < B
Order: same spans then A == B
- Specified by: compareTo in interface Comparable<TextMatch>
- Parameters:
other
- Returns:
-