Class XTemporal

java.lang.Object
org.opensextant.extractors.flexpat.AbstractFlexPat
org.opensextant.extractors.xtemporal.XTemporal
All Implemented Interfaces:
Extractor

public class XTemporal extends AbstractFlexPat
Date/time pattern extractor -- detects, parses, and normalizes dates. Each date/time found is returned as a DateMatch (TextMatch) object.
Author:
ubaldino
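
The detect, parse, normalize flow described above can be pictured with a minimal sketch. It is not the XTemporal implementation: the class `DateSketch`, its single month/day/year regex, and the choice of ISO-8601 output are assumptions made for illustration, and only the Java standard library is used.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not part of the XTemporal API. */
final class DateSketch {
    // One illustrative pattern: numeric month/day/4-digit year, e.g. 12/31/2012.
    private static final Pattern MDY =
        Pattern.compile("\\b(\\d{1,2})/(\\d{1,2})/(\\d{4})\\b");

    /** Detects candidate dates, parses them, and returns ISO-8601 strings. */
    static List<String> normalize(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = MDY.matcher(text);
        while (m.find()) {
            try {
                LocalDate d = LocalDate.of(
                    Integer.parseInt(m.group(3)),   // year
                    Integer.parseInt(m.group(1)),   // month
                    Integer.parseInt(m.group(2)));  // day
                found.add(d.toString());            // yyyy-MM-dd
            } catch (DateTimeException e) {
                // The text matched the pattern but is not a valid calendar date.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [2011-03-07, 2012-12-31]
        System.out.println(
            normalize("Filed 3/7/2011, amended 12/31/2012, not 13/45/2020."));
    }
}
```

A regex match alone is not a date: the parse step rejects `13/45/2020` even though it fits the pattern.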
  • Field Details

    • DEFAULT_XTEMP_CFG

      public static final String DEFAULT_XTEMP_CFG
      The Constant DEFAULT_XTEMP_CFG.
    • TODAY

      public Date TODAY
      Application constants -- note that TODAY is relative to the caller's notion of today. If you are processing data from the past but know what TODAY should be for that data, found dates that fall before or after it are treated as relative PAST and relative FUTURE, respectively.
    • TODAY_EPOCH

      public long TODAY_EPOCH
      TODAY expressed as an epoch value.
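
A minimal sketch of the relative PAST and relative FUTURE idea, assuming the comparison is made on epoch milliseconds against a caller-supplied reference day. The class `RelativeTimeSketch` and its method names are hypothetical and are not XTemporal members; how XTemporal itself performs the comparison is not stated here.

```java
import java.time.Instant;
import java.util.Date;

/** Hypothetical illustration only; not part of the XTemporal API. */
final class RelativeTimeSketch {
    // Defaults to the wall clock; the caller may move it to suit older data.
    private long referenceEpoch = new Date().getTime();

    /** Resets the reference point that "today" means for the data at hand. */
    void setReferenceDay(Date d) {
        referenceEpoch = d.getTime();
    }

    /** Labels a found date relative to the reference day. */
    String classify(Date found) {
        long t = found.getTime();
        if (t > referenceEpoch) {
            return "FUTURE";
        }
        if (t < referenceEpoch) {
            return "PAST";
        }
        return "PRESENT";
    }

    public static void main(String[] args) {
        RelativeTimeSketch ctx = new RelativeTimeSketch();
        // Processing an archive whose "today" is 1 June 2005.
        ctx.setReferenceDay(Date.from(Instant.parse("2005-06-01T00:00:00Z")));
        // prints FUTURE
        System.out.println(ctx.classify(Date.from(Instant.parse("2010-01-01T00:00:00Z"))));
        // prints PAST
        System.out.println(ctx.classify(Date.from(Instant.parse("2001-09-15T00:00:00Z"))));
    }
}
```

A date in 2010 is in the past by the wall clock, yet it is FUTURE relative to a 2005 archive, which is the reason the reference day is adjustable.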
    • JAVA_0_DATE_YEAR

      public static final int JAVA_0_DATE_YEAR
      The Constant JAVA_0_DATE_YEAR.
    • ONE_YEAR_MS

      public static final long ONE_YEAR_MS
      The Constant ONE_YEAR_MS.
  • Constructor Details

    • XTemporal

      public XTemporal(boolean debugmode)
      Constructs an XTemporal extractor.
      Parameters:
      debugmode - true if debugging
    • XTemporal

      public XTemporal()
      Constructs an XTemporal extractor with debugging disabled.
  • Method Details

    • getName

      public String getName()
      Extractor interface: getName.
      Returns:
      extractor name
    • createPatternManager

      protected RegexPatternManager createPatternManager(InputStream strm, String name) throws IOException
      Description copied from class: AbstractFlexPat
      Create a pattern manager given the input stream and the file name.
      Specified by:
      createPatternManager in class AbstractFlexPat
      Parameters:
      strm - stream of patterns config file
      name - app name
      Returns:
      the regex pattern manager
      Throws:
      IOException - Signals that an I/O exception has occurred.
    • extract

      public List<TextMatch> extract(TextInput input)
      Supports the standard Extractor interface. This provides access to the most common extraction.
      Parameters:
      input - text
      Returns:
      list of TextMatch
    • extract

      public List<TextMatch> extract(String input_buf)
      Supports the standard Extractor interface. This provides access to the most common extraction.
      Parameters:
      input_buf - text
      Returns:
      list of TextMatch
    • extract_dates

      public TextMatchResult extract_dates(String text, String text_id)
      A direct call to extract dates, which is useful for diagnostics and development/testing.
      Parameters:
      text - text
      text_id - text ID
      Returns:
      TextMatchResult, a wrapper around a list of TextMatch
    • match_DateTime

      public void match_DateTime(boolean flag)
      Enables or disables date/time patterns.
      Parameters:
      flag - true if enabling date/time matching
    • match_MonDayYear

      public void match_MonDayYear(boolean flag)
      Enables or disables month-day-year patterns.
      Parameters:
      flag - true if enabling MonthDayYear family
    • match_DayMonYear

      public void match_DayMonYear(boolean flag)
      Enables or disables day-month-year patterns.
      Parameters:
      flag - true if enabling DayMonthYear family
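
The three match_* switches turn whole pattern families on or off. A minimal sketch of that idea follows, assuming one regex per family and a per-family boolean. The class `FamilyToggleSketch`, the family names `MDY` and `DMY`, and both regexes are hypothetical and do not come from the XTemporal patterns file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not part of the XTemporal API. */
final class FamilyToggleSketch {
    private final Map<String, Pattern> families = new LinkedHashMap<>();
    private final Map<String, Boolean> enabled = new LinkedHashMap<>();

    FamilyToggleSketch() {
        register("MDY", "\\b[A-Z][a-z]{2} \\d{1,2}, \\d{4}\\b"); // e.g. Mar 7, 2011
        register("DMY", "\\b\\d{1,2} [A-Z][a-z]{2} \\d{4}\\b");  // e.g. 9 Apr 2012
    }

    private void register(String name, String regex) {
        families.put(name, Pattern.compile(regex));
        enabled.put(name, Boolean.TRUE); // every family starts enabled in this sketch
    }

    /** Enables or disables one family of patterns. */
    void enable(String family, boolean flag) {
        if (!families.containsKey(family)) {
            throw new IllegalArgumentException("Unknown family: " + family);
        }
        enabled.put(family, flag);
    }

    /** Runs only the enabled families and tags each hit with its family name. */
    List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Pattern> e : families.entrySet()) {
            if (!enabled.get(e.getKey())) {
                continue; // family switched off by the caller
            }
            Matcher m = e.getValue().matcher(text);
            while (m.find()) {
                hits.add(e.getKey() + ":" + m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        FamilyToggleSketch sketch = new FamilyToggleSketch();
        String text = "Signed Mar 7, 2011 and 9 Apr 2012.";
        System.out.println(sketch.find(text)); // prints [MDY:Mar 7, 2011, DMY:9 Apr 2012]
        sketch.enable("DMY", false);
        System.out.println(sketch.find(text)); // prints [MDY:Mar 7, 2011]
    }
}
```

Switching off a family is a cheap way to suppress false positives when the input is known to use only one date convention.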
    • setToday

      public void setToday(Date d)
      Optionally reset your context... what is TODAY with respect to your data?
      Parameters:
      d - date
    • setDistantPastYear

      public static void setDistantPastYear(int y)
      Application thresholds -- chosen by the user.
      Parameters:
      y - 4-digit year
    • isFuture

      public boolean isFuture(long epoch)
      Given the set MAX_DATE_CUTOFF_YEAR, determines whether the date epoch is later than this.
      Parameters:
      epoch - epoch time, measured from 1970-01-01
      Returns:
      true, if is future
    • isFuture

      public boolean isFuture(Date dt)
      Checks whether the given date is in the future.
      Parameters:
      dt - date
      Returns:
      true, if is future
    • isDistantPast

      public boolean isDistantPast(long epoch)
      Checks whether the given epoch is in the distant past.
      Parameters:
      epoch - epoch
      Returns:
      true if earlier than DISTANT_PAST_THRESHOLD
    • isDistantPast

      public boolean isDistantPast(Date dt)
      Checks whether the given date is in the distant past.
      Parameters:
      dt - date
      Returns:
      true, if is distant past
    • isDistantPastYMD

      public boolean isDistantPastYMD(Date dt)
      Checks whether a date is too far in the past to likely be a date of the format YYYY-MM-DD.
      Parameters:
      dt - date
      Returns:
      true if date is distant
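
A minimal sketch of a distant-past check driven by a user-chosen 4-digit year. It assumes the threshold is midnight UTC on 1 January of that year and that epochs are in milliseconds. Both are assumptions, as are the class `DistantPastSketch` and the year 1900 used in the example; no XTemporal default is implied.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

/** Hypothetical illustration only; not part of the XTemporal API. */
final class DistantPastSketch {
    private final long thresholdMillis;

    /** @param fourDigitYear user-chosen cutoff year, e.g. 1900 */
    DistantPastSketch(int fourDigitYear) {
        // Assumption: the cutoff is the first instant of the year in UTC.
        thresholdMillis = toEpochMillis(fourDigitYear, 1, 1);
    }

    /** Epoch milliseconds for midnight UTC on the given calendar day. */
    static long toEpochMillis(int year, int month, int day) {
        return LocalDate.of(year, month, day)
            .atStartOfDay(ZoneOffset.UTC)
            .toInstant()
            .toEpochMilli();
    }

    /** True when the epoch falls before the cutoff. */
    boolean isDistantPast(long epochMillis) {
        return epochMillis < thresholdMillis;
    }

    public static void main(String[] args) {
        DistantPastSketch check = new DistantPastSketch(1900);
        System.out.println(check.isDistantPast(toEpochMillis(1850, 6, 1))); // prints true
        System.out.println(check.isDistantPast(0L));                        // prints false
    }
}
```

Epochs before 1970 are negative, so the comparison works unchanged for historical dates.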