Class SolrMatcherSupport

java.lang.Object
org.opensextant.extraction.SolrMatcherSupport
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
GazetteerMatcher, TaxonMatcher

public abstract class SolrMatcherSupport extends Object implements Closeable
Connects to a Solr server via HTTP and tags place names in a document. The SOLR_HOME environment variable must be set to the location of the Solr server.

Author:
David Smiley - dsmiley@mitre.org, Marc Ubaldino - ubaldino@mitre.org
  • Field Details

    • DEFAULT_TAG_LIMIT

      public static final int DEFAULT_TAG_LIMIT
    • log

      protected org.slf4j.Logger log
    • requestHandler

      protected String requestHandler
    • solr

      protected SolrProxy solr
    • tagNamesTime

      protected int tagNamesTime
    • getNamesTime

      protected int getNamesTime
    • totalTime

      protected int totalTime
  • Constructor Details

    • SolrMatcherSupport

      public SolrMatcherSupport()
  • Method Details

    • setTaggerHandler

      public void setTaggerHandler(String nonDefault)
      Use this to set a non-default tagger path, e.g., /tag1 or /tag-lang1.
      Parameters:
      nonDefault - path of tagger.
    • close

      public void close()
      Close solr resources.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
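Because the class implements Closeable, a concrete matcher can be managed with try-with-resources. The sketch below illustrates only that pattern: DemoMatcher and its tag() method are hypothetical stand-ins, not part of OpenSextant, so the example runs without a Solr index.

```java
import java.io.Closeable;

public class CloseSketch {
    /** Hypothetical stand-in for a concrete matcher; NOT an OpenSextant class. */
    static class DemoMatcher implements Closeable {
        boolean closed = false;

        /** Placeholder for real tagging work; refuses to run once closed. */
        String tag(String text) {
            if (closed) {
                throw new IllegalStateException("matcher is closed");
            }
            return "tagged:" + text;
        }

        @Override
        public void close() {
            // A real matcher would release its Solr resources here.
            closed = true;
        }
    }

    /** Kept only so the caller can observe that close() ran. */
    static DemoMatcher lastMatcher;

    static String run(String text) {
        // close() is invoked automatically when the try block exits,
        // whether normally or through an exception.
        try (DemoMatcher matcher = new DemoMatcher()) {
            lastMatcher = matcher;
            return matcher.tag(text);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("Boston"));
        System.out.println("closed=" + lastMatcher.closed);
    }
}
```

With a real subclass such as GazetteerMatcher, the same shape would apply, assuming the matcher is constructed and initialized according to its own documentation.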
    • getCoreName

      public abstract String getCoreName()
      Be explicit about the solr core to use for tagging.
      Returns:
      the core name
    • getMatcherParameters

      public abstract org.apache.solr.common.params.SolrParams getMatcherParameters()
      Return the Solr Parameters for the tagger op.
      Returns:
      SolrParams
    • createTag

      public abstract Object createTag(org.apache.solr.common.SolrDocument doc)
      Callers must implement their own domain objects (POJOs); this callback handler only hashes them.
      Parameters:
      doc - record to convert to Place record
      Returns:
      object representing a Place
    • initialize

      public void initialize() throws org.opensextant.ConfigException
      Initialize. This capability does not support taggers/matchers that use an HTTP server. For now, it is intended for an in-memory, local embedded Solr server.
      Throws:
      org.opensextant.ConfigException - if solr server cannot be established from local index or from http server
    • getTaggingNamesTime

      public int getTaggingNamesTime()
      Ephemeral metric for the current tagText() call. Callers must read these numbers immediately after the call.
      Returns:
      time to tag
    • getRetrievingNamesTime

      public int getRetrievingNamesTime()
      Returns:
      time to get reference records.
    • getTotalTime

      public int getTotalTime()
      Returns:
      time to get gazetteer records.
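Since the timing getters describe only the most recent tagging call, a caller that wants to keep the numbers needs to copy them before tagging again. The sketch below shows that habit with a hypothetical DemoMatcher whose durations are simulated from the input length; the getter names mirror this class, but the way the values are computed (including total = tagging + retrieval) is an assumption made for illustration.

```java
public class TimingSketch {
    /** Hypothetical stand-in; its timings are simulated, not measured. */
    static class DemoMatcher {
        private int tagNamesTime;
        private int getNamesTime;
        private int totalTime;

        /** Every call overwrites the metrics from the previous call. */
        void tagText(String buffer) {
            tagNamesTime = buffer.length();       // simulated tagging time
            getNamesTime = buffer.length() / 2;   // simulated record retrieval time
            totalTime = tagNamesTime + getNamesTime; // assumption: total is the sum
        }

        int getTaggingNamesTime() { return tagNamesTime; }
        int getRetrievingNamesTime() { return getNamesTime; }
        int getTotalTime() { return totalTime; }
    }

    /** Tag, then copy the three metrics right away so a later call cannot clobber them. */
    static int[] tagAndSnapshot(DemoMatcher matcher, String text) {
        matcher.tagText(text);
        return new int[] {
            matcher.getTaggingNamesTime(),
            matcher.getRetrievingNamesTime(),
            matcher.getTotalTime()
        };
    }

    public static void main(String[] args) {
        DemoMatcher matcher = new DemoMatcher();
        int[] first = tagAndSnapshot(matcher, "abcdefgh");
        int[] second = tagAndSnapshot(matcher, "ab");
        // The first snapshot is still intact; the matcher itself now reports the second call.
        System.out.println("first total=" + first[2] + ", second total=" + second[2]);
    }
}
```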
    • tagTextCallSolrTagger

      protected org.apache.solr.client.solrj.response.QueryResponse tagTextCallSolrTagger(String buffer, String docid, Map<Object,Object> refDataMap) throws org.opensextant.extraction.ExtractionException
      Solr call: tag the input buffer, returning all candidate reference data that matched during tagging.
      Parameters:
      buffer - text to tag
      docid - id for text, only for tracking purposes
      refDataMap - a map of reference data in Solr; it will store the caller's domain objects, e.g., rec.id => domain(rec)
      Returns:
      solr response
      Throws:
      org.opensextant.extraction.ExtractionException - tagger error
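The rec.id => domain(rec) mapping described for refDataMap can be pictured as follows. This is a minimal sketch of one plausible reading, not the library's implementation: plain Map records stand in for SolrDocument, a static createTag() stands in for the abstract callback, and Place, record(), and collect() are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class RefDataSketch {
    /** Hypothetical domain object, standing in for whatever createTag() returns. */
    static final class Place {
        final String id;
        final String name;

        Place(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Stand-in for the createTag(SolrDocument) callback: record -> domain object. */
    static Object createTag(Map<String, Object> rec) {
        return new Place(String.valueOf(rec.get("id")), String.valueOf(rec.get("name")));
    }

    /** Store one domain object per record id; repeated ids are converted only once. */
    static void collect(List<Map<String, Object>> records,
                        Function<Map<String, Object>, Object> converter,
                        Map<Object, Object> refDataMap) {
        for (Map<String, Object> rec : records) {
            Object id = rec.get("id");
            if (id == null) {
                continue; // skip records without an id
            }
            refDataMap.computeIfAbsent(id, key -> converter.apply(rec));
        }
    }

    /** Helper that builds a fake record for the demo. */
    static Map<String, Object> record(String id, String name) {
        Map<String, Object> rec = new HashMap<>();
        rec.put("id", id);
        rec.put("name", name);
        return rec;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<>();
        records.add(record("p1", "Boston"));
        records.add(record("p2", "Paris"));
        records.add(record("p1", "Boston")); // duplicate id

        Map<Object, Object> refDataMap = new HashMap<>();
        collect(records, RefDataSketch::createTag, refDataMap);
        System.out.println(refDataMap.size() + " distinct records cached");
    }
}
```

Keying the map by record id means each reference record is converted into a domain object at most once per call, however many times it matched in the text.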