Package org.opensextant.extraction
Class TextMatch
java.lang.Object
  org.opensextant.extraction.TextEntity
    org.opensextant.extraction.TextMatch
- Direct Known Subclasses:
DateMatch, GeocoordMatch, PoliMatch
public class TextMatch extends TextEntity
A variation on TextEntity that also records pattern metadata.
- Author:
- ubaldino
-
-
Field Summary
Fields
Modifier and Type           Field       Description
java.lang.String            pattern_id  The ID of the pattern that extracted this match.
java.lang.String            producer    A short label or tag representing the matcher, extractor, tagger, etc.
protected java.lang.String  type        Type, as in Annotation type or code.
-
Fields inherited from class org.opensextant.extraction.TextEntity
end, is_duplicate, is_overlap, is_submatch, match_id, postChar, preChar, start, text
-
-
Constructor Summary
Constructors
Constructor                Description
TextMatch(int x1, int x2)
-
Method Summary
Modifier and Type  Method                       Description
void               copy(TextMatch m)
void               defaultMatchId()             If called, this overwrites the existing match_id. Match ID is typically entity label @ offset.
java.lang.String   getMatchId()                 Future planning -- match_id may become private field in future API.
java.lang.String   getTextnorm()                Get a normalized version of the text: lower case, punctuation and diacritics removed.
java.lang.String   getType()
boolean            isDefault()                  Users of this class should set a non-default type via setType(String), otherwise the match remains default and generic.
boolean            isFilteredOut()
boolean            isSame(java.lang.String m)   Case-insensitive comparison to another string.
boolean            isSameNorm(TextMatch m)      Compare the normalized string for this match to that of another.
void               setFilteredOut(boolean b)
void               setType(java.lang.String t)  Allow matchers and taggers to set a type label, e.g., pattern family or other string.
java.lang.String   toString()
-
Methods inherited from class org.opensextant.extraction.TextEntity
contains, copy, getContext, getContextAfter, getContextBefore, getLength, getText, isAfter, isASCII, isBefore, isLeftMatch, isLower, isMixedCase, isOverlap, isRightMatch, isSameMatch, isUpper, isWithin, isWithinChars, setContext, setContext, setText, setTextOnly
-
-
-
-
Field Detail
-
pattern_id
public java.lang.String pattern_id
the ID of the pattern that extracted this
-
producer
public java.lang.String producer
A short label or tag representing the matcher, extractor, tagger, etc. that produced this match.
-
type
protected java.lang.String type
Type, as in Annotation type or code.
-
-
Method Detail
-
getType
public java.lang.String getType()
-
setType
public void setType(java.lang.String t)
Allow matchers and taggers to set a type label, e.g., pattern family or other string.
- Parameters:
t - type
-
toString
public java.lang.String toString()
- Overrides:
toString in class TextEntity
- Returns:
- string representation
-
copy
public void copy(TextMatch m)
- Parameters:
m - a text match to copy to this instance
-
isSame
public boolean isSame(java.lang.String m)
Case-insensitive comparison to another string.
- Parameters:
m - match
- Returns:
- true if the text of this match equals m, ignoring case
-
isSameNorm
public boolean isSameNorm(TextMatch m)
Compare the normalized string for this match to that of another.
- Parameters:
m - the match to compare against
- Returns:
- true if both getTextnorm() calls yield the same string.
-
isFilteredOut
public boolean isFilteredOut()
-
setFilteredOut
public void setFilteredOut(boolean b)
-
getTextnorm
public java.lang.String getTextnorm()
Get a normalized version of the text: lower case, with punctuation and diacritics removed. To apply only part of this normalization, override this method in a subclass.
- Returns:
- normalized version of text.
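The normalization described above, and the comparison that isSameNorm performs on it, can be sketched with the JDK alone. This is a minimal illustration under stated assumptions, not the OpenSextant implementation: the class TextNormSketch and its methods normalize and sameNorm are hypothetical names, and the exact order of steps and the whitespace handling are guesses.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical stand-in for illustration; not the OpenSextant implementation. */
class TextNormSketch {

    /** Strip diacritics and punctuation, collapse whitespace, then lower-case. */
    static String normalize(String text) {
        if (text == null) {
            return null;
        }
        // Decompose accented characters into base letter + combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // Drop the combining marks, i.e. the diacritics.
        String noMarks = decomposed.replaceAll("\\p{M}+", "");
        // Drop punctuation, then collapse any runs of whitespace (assumed step).
        String noPunct = noMarks.replaceAll("\\p{P}+", "").replaceAll("\\s+", " ").trim();
        return noPunct.toLowerCase(Locale.ROOT);
    }

    /** Analogue of isSameNorm: compare two strings by their normalized forms. */
    static boolean sameNorm(String a, String b) {
        String na = normalize(a);
        return na != null && na.equals(normalize(b));
    }

    public static void main(String[] args) {
        // "S\u00e3o Tom\u00e9" is "Sao Tome" with a tilde and an acute accent.
        System.out.println(normalize("S\u00e3o Tom\u00e9, D.C."));
        System.out.println(sameNorm("U.S.A.", "usa"));
    }
}
```

A subclass that wants only part of this behavior, for example case folding without diacritic removal, would override getTextnorm() and apply just the steps it needs.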
-
isDefault
public boolean isDefault()
Users of this class should set a non-default type via setType(String), otherwise the match remains default and generic.
- Returns:
- true if this match still has the default type
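The relationship between setType(String) and isDefault() can be illustrated with a small stand-in class. This is a sketch under assumptions: TypedMatchSketch is a hypothetical class, and the constant DEFAULT_TYPE with the value "generic" is a placeholder, since this page does not state the actual default label.

```java
/** Hypothetical stand-in for illustration; not the OpenSextant class. */
class TypedMatchSketch {

    // Assumed placeholder; the real default label is not given on this page.
    static final String DEFAULT_TYPE = "generic";

    protected String type = DEFAULT_TYPE;

    void setType(String t) {
        this.type = t;
    }

    String getType() {
        return type;
    }

    /** True when the current type equals the default label. */
    boolean isDefault() {
        return DEFAULT_TYPE.equals(type);
    }

    public static void main(String[] args) {
        TypedMatchSketch m = new TypedMatchSketch();
        System.out.println(m.isDefault()); // no type set yet
        m.setType("date");                 // a matcher labels its output
        System.out.println(m.isDefault());
    }
}
```

A matcher or tagger that calls setType on every match it emits lets downstream code tell its output apart from generic, unlabeled matches.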
-
defaultMatchId
public void defaultMatchId()
If called, this overwrites the existing match_id. A match ID is typically entity label @ offset. Alternatively, a match ID could be label + offset + value + ... to distinguish this text span from others.
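The two ID shapes mentioned above can be sketched as follows. The class MatchIdSketch, both method names, and the "@" and ":" separators are assumptions made for illustration; the page does not specify the exact string format that defaultMatchId() produces.

```java
/** Hypothetical helper for illustration; separators and names are assumptions. */
class MatchIdSketch {

    /** "entity label @ offset" form, e.g. "place@42". */
    static String labelAtOffset(String label, int start) {
        return label + "@" + start;
    }

    /** Longer form that also includes the matched value. */
    static String labelOffsetValue(String label, int start, String value) {
        return labelAtOffset(label, start) + ":" + value;
    }

    public static void main(String[] args) {
        // Two overlapping spans that share a label and a start offset.
        System.out.println(labelAtOffset("place", 42));
        System.out.println(labelOffsetValue("place", 42, "New York"));
        System.out.println(labelOffsetValue("place", 42, "New York City"));
    }
}
```

The short form collides when two matches with the same label start at the same offset, which is why adding the value (or further fields) helps distinguish one text span from another.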
-
getMatchId
public java.lang.String getMatchId()
Future planning -- match_id may become a private field in a future API.
- Returns:
- the match ID
-
-