Class TextMatch

java.lang.Object
org.opensextant.extraction.TextEntity
org.opensextant.extraction.TextMatch
All Implemented Interfaces:
Comparable<TextMatch>, MatchSchema
Direct Known Subclasses:
DateMatch, GeocoordMatch, PoliMatch

public class TextMatch extends TextEntity implements MatchSchema, Comparable<TextMatch>
A variation on TextEntity that also records pattern metadata.
Author:
ubaldino
  • Field Details

    • pattern_id

      public String pattern_id
      The ID of the pattern that extracted this match.
    • producer

      public String producer
      A short label or tag representing the matcher, extractor, tagger, etc. that produced this match.
    • type

      protected String type
      Type, as in Annotation type or code.
  • Constructor Details

    • TextMatch

      public TextMatch(int x1, int x2)
  • Method Details

    • getType

      public String getType()
    • setType

      public void setType(String t)
      Allow matchers and taggers to set a type label, e.g., pattern family or other string.
      Parameters:
      t - type
    • toString

      public String toString()
      Overrides:
      toString in class TextEntity
      Returns:
      string representation
    • copy

      public void copy(TextMatch m)
      Parameters:
      m - a text match to copy to this instance
    • isSame

      public boolean isSame(String m)
      Case-insensitive comparison to another string.
      Parameters:
      m - the string to compare against
      Returns:
      true if the text of this match equals m, ignoring case
    • isSameNorm

      public boolean isSameNorm(TextMatch m)
      Compare the normalized string for this match to that of another.
      Parameters:
      m - the match to compare against
      Returns:
      true if getTextnorm() yields the same string for both matches
    • isFilteredOut

      public boolean isFilteredOut()
    • setFilteredOut

      public void setFilteredOut(boolean b)
    • getTextnorm

      public String getTextnorm()
      Get a normalized version of the text: lower case, with punctuation and diacritics removed. If you want only part of this normalization, you may override this method.
      Returns:
      normalized version of text.
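      The normalization described above (lower case, punctuation and diacritics removed) can be sketched with the standard java.text.Normalizer. This is a minimal illustration, not the library's implementation: the class TextnormSketch and its methods normalize and sameNorm are hypothetical names, and the real getTextnorm() may treat whitespace or non-ASCII punctuation differently.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-in; NOT part of org.opensextant.extraction.
class TextnormSketch {

    // Lower-case the text and strip diacritics and ASCII punctuation.
    static String normalize(String text) {
        // NFD splits accented letters into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // Drop the combining marks (Unicode category M), i.e. the diacritics.
        String noMarks = decomposed.replaceAll("\\p{M}+", "");
        // \p{Punct} covers ASCII punctuation only; an assumption of this sketch.
        String noPunct = noMarks.replaceAll("\\p{Punct}+", "");
        return noPunct.toLowerCase(Locale.ROOT).trim();
    }

    // Mirrors the idea behind isSameNorm: compare the normalized forms.
    static boolean sameNorm(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

      Under these assumptions, "Zürich" and "zurich." normalize to the same string, which is the kind of equivalence isSameNorm(TextMatch) is documented to test.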
    • isDefault

      public boolean isDefault()
      Users of this class should set a non-default type via setType(String); otherwise the match remains default and generic.
      Returns:
      true if no non-default type has been set
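      How a matcher might label a match and how isDefault() could behave is sketched below. TypedMatchSketch is a hypothetical stand-in that mirrors only the fields and signatures listed on this page; the default label "generic", the producer tag "demo-tagger" and the pattern ID "DATE-01" are assumptions made for illustration.

```java
// Hypothetical stand-in mirroring the documented members of TextMatch.
class TypedMatchSketch {
    // Assumed placeholder; the library's real default label is not given here.
    static final String DEFAULT_TYPE = "generic";

    public String pattern_id;              // ID of the pattern that extracted the match
    public String producer;                // short tag naming the matcher or tagger
    protected String type = DEFAULT_TYPE;  // annotation type or code
    final int start;
    final int end;

    TypedMatchSketch(int x1, int x2) {
        this.start = x1;
        this.end = x2;
    }

    public String getType() { return type; }

    public void setType(String t) { this.type = t; }

    // True while the type is still the assumed default label.
    public boolean isDefault() { return DEFAULT_TYPE.equals(type); }

    // Example of a tagger filling in metadata for one span.
    static TypedMatchSketch taggedExample() {
        TypedMatchSketch m = new TypedMatchSketch(10, 20);
        m.producer = "demo-tagger";
        m.pattern_id = "DATE-01";
        m.setType("date");
        return m;
    }
}
```

      In this sketch a freshly constructed match reports isDefault() as true, and it stops doing so once setType(String) is given a specific label.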
    • defaultMatchId

      public void defaultMatchId()
      If called, this overwrites the existing match_id. A Match ID is typically entity label @ offset. Alternatively, a Match ID could also be label + value + start offset ... to distinguish this text span from others.
    • getContentId

      public String getContentId()
      Create a simple text-based identifier of the form value + start offset ...
      Returns:
      the content identifier
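      The two identifier styles mentioned for defaultMatchId() and getContentId() can be illustrated as follows. This is only a sketch: MatchIdSketch and its methods are hypothetical, and the "@" separator is an assumption, since this page does not specify the exact string format the library produces.

```java
// Hypothetical helper; the real ID formats are not specified on this page.
class MatchIdSketch {

    // "entity label @ offset" style, e.g. a label joined to the start offset.
    static String labelAtOffset(String label, int start) {
        return label + "@" + start;   // separator is an assumption
    }

    // "value + start offset" style, keyed on the matched text itself.
    static String contentId(String value, int start) {
        return value + "@" + start;   // separator is an assumption
    }
}
```

      Including the start offset is what lets two occurrences of the same text in one document receive different identifiers.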
    • getMatchId

      public String getMatchId()
      Future planning: match_id may become a private field in a future API.
      Returns:
      the match ID
    • compareTo

      public int compareTo(TextMatch other)
      Compares this match (A) to another match (B):
      Order: A B then A > B
      Order: B A then A < B
      Order: same spans then A == B
      Specified by:
      compareTo in interface Comparable<TextMatch>
      Parameters:
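      other - the match to compare against
      Returns:
      a negative integer, zero, or a positive integer as this match is less than, equal to, or greater than other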
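      Because the class implements Comparable, a list of matches can be sorted with Collections.sort. The sketch below shows span-based ordering on a hypothetical SpanSketch class. It assumes comparison by start offset and then by end offset, with the earlier span sorting first; only the "same spans compare as equal" case is taken directly from the description above, so the sign convention for differing spans should be checked against the library before relying on it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a span-bearing match; not the library class.
class SpanSketch implements Comparable<SpanSketch> {
    final int start;
    final int end;

    SpanSketch(int x1, int x2) {
        this.start = x1;
        this.end = x2;
    }

    // Assumed convention: order by start offset, then by end offset.
    // Identical spans return 0.
    @Override
    public int compareTo(SpanSketch other) {
        int byStart = Integer.compare(start, other.start);
        return byStart != 0 ? byStart : Integer.compare(end, other.end);
    }

    // Return a sorted copy, leaving the caller's list untouched.
    static List<SpanSketch> sorted(List<SpanSketch> spans) {
        List<SpanSketch> copy = new ArrayList<>(spans);
        Collections.sort(copy);
        return copy;
    }
}
```

      Sorting a copy rather than the original list keeps the extractor's output order available for later inspection.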