Package org.opensextant.extraction
Class SolrMatcherSupport
java.lang.Object
org.opensextant.extraction.SolrMatcherSupport
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
GazetteerMatcher, TaxonMatcher
Connects to a Solr server via HTTP and tags place names in a document. The
SOLR_HOME
environment variable must be set to the location of
the Solr server.
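Because a missing SOLR_HOME setting tends to surface only when the matcher is initialized, it can help to check for it up front. The following is a minimal sketch, not part of this class: the helper name `resolveSolrHome` and the class `SolrHomeCheck` are hypothetical, and the environment is passed in as a map so the check is easy to exercise.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;

class SolrHomeCheck {
    /**
     * Hypothetical helper (not part of SolrMatcherSupport): returns the
     * SOLR_HOME path if the variable is present and non-blank.
     */
    static Optional<Path> resolveSolrHome(Map<String, String> env) {
        String value = env.get("SOLR_HOME");
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Paths.get(value.trim()));
    }

    public static void main(String[] args) {
        Optional<Path> home = resolveSolrHome(System.getenv());
        if (home.isPresent()) {
            System.out.println("SOLR_HOME = " + home.get());
        } else {
            System.err.println("SOLR_HOME is not set; the matcher cannot locate Solr.");
        }
    }
}
```

Failing early with a clear message is usually easier to diagnose than a configuration error raised later during initialization.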
- Author:
- David Smiley - dsmiley@mitre.org, Marc Ubaldino - ubaldino@mitre.org
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
void | close() | Close Solr resources.
abstract Object | createTag(org.apache.solr.common.SolrDocument doc) | Caller must implement their domain objects, POJOs...
abstract String | getCoreName() | Be explicit about the Solr core to use for tagging.
abstract org.apache.solr.common.params.SolrParams | getMatcherParameters() | Return the Solr parameters for the tagger op.
int | getRetrievingNamesTime() |
int | getTaggingNamesTime() | Ephemeral metric for the current tagText() call.
int | getTotalTime() |
void | initialize() | Initialize.
void | setTaggerHandler(String nonDefault) | Use this if you intend to set a non-default tagger path.
protected org.apache.solr.client.solrj.response.QueryResponse | tagTextCallSolrTagger(String buffer, String docid, Map<Object, Object> refDataMap) | Solr call: tag input buffer, returning all candidate reference data that matched during tagging.
-
Field Details
-
DEFAULT_TAG_LIMIT
public static final int DEFAULT_TAG_LIMIT
-
log
protected org.slf4j.Logger log -
requestHandler
-
solr
-
tagNamesTime
protected int tagNamesTime -
getNamesTime
protected int getNamesTime -
totalTime
protected int totalTime
-
-
Constructor Details
-
SolrMatcherSupport
public SolrMatcherSupport()
-
-
Method Details
-
setTaggerHandler
public void setTaggerHandler(String nonDefault)
Use this if you intend to set a non-default tagger path, e.g., /tag1 or /tag-lang1.
- Parameters:
nonDefault - path of tagger.
-
close
public void close()
Close Solr resources.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
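Since the class implements Closeable and AutoCloseable, a matcher can be managed with try-with-resources so that close() runs even if tagging throws. The sketch below uses a stand-in class, `DemoMatcher`, which is hypothetical and only records that close() was called; it is not the real SolrMatcherSupport.

```java
import java.io.Closeable;

class CloseSketch {
    /** Stand-in for a SolrMatcherSupport subclass; hypothetical, not the real class. */
    static class DemoMatcher implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            // The real close() releases Solr resources; this stand-in only records the call.
            closed = true;
        }
    }

    /** Uses a matcher inside try-with-resources and reports whether it was closed afterwards. */
    static boolean closedAfterUse() {
        DemoMatcher seen = null;
        try (DemoMatcher matcher = new DemoMatcher()) {
            seen = matcher;
            // ... tagging work would happen here ...
        }
        return seen.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed after block: " + closedAfterUse());
    }
}
```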
-
getCoreName
public abstract String getCoreName()
Be explicit about the Solr core to use for tagging.
- Returns:
- the core name
-
getMatcherParameters
public abstract org.apache.solr.common.params.SolrParams getMatcherParameters()
Return the Solr parameters for the tagger op.
- Returns:
- SolrParams
-
createTag
public abstract Object createTag(org.apache.solr.common.SolrDocument doc)
Caller must implement their domain objects, POJOs... this callback handler only hashes them.
- Parameters:
doc - record to convert to Place record
- Returns:
- object representing a Place
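The callback pattern described for createTag can be sketched as follows. To keep the example runnable without SolrJ, a plain `Map<String, Object>` stands in for `SolrDocument`, and `MatcherBase`, `PlaceMatcher`, `Place`, and the field names `id` and `name` are all hypothetical; a real subclass would read whatever fields its Solr core defines.

```java
import java.util.HashMap;
import java.util.Map;

class CreateTagSketch {
    /** Hypothetical domain object (POJO) that the caller owns. */
    static final class Place {
        final String id;
        final String name;

        Place(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Stand-in for the abstract base class: it declares the callback, subclasses supply it. */
    static abstract class MatcherBase {
        abstract Object createTag(Map<String, Object> doc);
    }

    /** Hypothetical subclass that converts a record into a Place. */
    static class PlaceMatcher extends MatcherBase {
        @Override
        Object createTag(Map<String, Object> doc) {
            // Field names "id" and "name" are assumptions made for this sketch.
            return new Place(String.valueOf(doc.get("id")), String.valueOf(doc.get("name")));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("id", "42");
        record.put("name", "Boston");
        Place place = (Place) new PlaceMatcher().createTag(record);
        System.out.println(place.id + " -> " + place.name);
    }
}
```

Returning `Object` keeps the base class independent of any one domain model, which is presumably why the two known subclasses can each produce their own record type.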
-
initialize
public void initialize() throws org.opensextant.ConfigException
Initialize. This capability does not yet support taggers/matchers that use an HTTP server. For now, it is intended to be an in-memory, local embedded Solr server.
- Throws:
org.opensextant.ConfigException - if solr server cannot be established from local index or from http server
-
getTaggingNamesTime
public int getTaggingNamesTime()
Ephemeral metric for the current tagText() call. Caller must get these numbers immediately after the call.
- Returns:
- time to tag
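The note about reading the metric immediately can be illustrated with a small sketch. `DemoMatcher` here is a hypothetical stand-in that fakes the timing (one unit per character) so the behavior is deterministic; the real class measures elapsed time, and its tagText() signature is not shown on this page.

```java
import java.util.Arrays;

class TimingSketch {
    /** Hypothetical stand-in for a matcher; it fakes the metric instead of timing a Solr call. */
    static class DemoMatcher {
        private int tagNamesTime;

        void tagText(String buffer) {
            // Fake "cost" of one unit per character, overwriting the previous value.
            tagNamesTime = buffer.length();
        }

        int getTaggingNamesTime() {
            return tagNamesTime;
        }
    }

    /** Tags each buffer and captures the metric right after each call. */
    static int[] timesPerCall(String... buffers) {
        DemoMatcher matcher = new DemoMatcher();
        int[] times = new int[buffers.length];
        for (int i = 0; i < buffers.length; i++) {
            matcher.tagText(buffers[i]);
            // Read now: the next tagText() call replaces this value.
            times[i] = matcher.getTaggingNamesTime();
        }
        return times;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(timesPerCall("Paris", "Springfield, Illinois")));
    }
}
```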
-
getRetrievingNamesTime
public int getRetrievingNamesTime()
- Returns:
- time to get reference records.
-
getTotalTime
public int getTotalTime()
- Returns:
- time to get gazetteer records.
-
tagTextCallSolrTagger
protected org.apache.solr.client.solrj.response.QueryResponse tagTextCallSolrTagger(String buffer, String docid, Map<Object, Object> refDataMap) throws org.opensextant.extraction.ExtractionException
Solr call: tag input buffer, returning all candidate reference data that matched during tagging.
- Parameters:
buffer - text to tag
docid - id for text, only for tracking purposes
refDataMap - a map of reference data in Solr. It will store the caller's domain objects, e.g., rec.id => domain(rec)
- Returns:
- solr response
- Throws:
org.opensextant.extraction.ExtractionException - tagger error
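The `rec.id => domain(rec)` mapping described for refDataMap can be sketched as below. This is an illustration under assumptions, not the method's actual implementation: records are modeled as plain maps rather than Solr documents, `Place`, `cacheRecords`, and the field names `id` and `name` are hypothetical, and skipping already-seen ids is one plausible way to fill such a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RefDataSketch {
    /** Hypothetical domain object built from a matched record. */
    static final class Place {
        final String name;

        Place(String name) {
            this.name = name;
        }
    }

    /** Builds a record with the assumed fields "id" and "name". */
    static Map<String, Object> record(String id, String name) {
        Map<String, Object> rec = new HashMap<>();
        rec.put("id", id);
        rec.put("name", name);
        return rec;
    }

    /**
     * Stores one domain object per record id in refDataMap and returns how many
     * new objects were created. Records with a missing or already-seen id are skipped.
     */
    static int cacheRecords(List<Map<String, Object>> records, Map<Object, Object> refDataMap) {
        int created = 0;
        for (Map<String, Object> rec : records) {
            Object id = rec.get("id");
            if (id == null || refDataMap.containsKey(id)) {
                continue;
            }
            refDataMap.put(id, new Place(String.valueOf(rec.get("name"))));
            created++;
        }
        return created;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<>();
        records.add(record("1", "Paris"));
        records.add(record("2", "Lyon"));
        records.add(record("1", "Paris")); // same id matched again elsewhere in the text
        Map<Object, Object> refDataMap = new HashMap<>();
        int created = cacheRecords(records, refDataMap);
        System.out.println(created + " created, " + refDataMap.size() + " cached");
    }
}
```

Keying by record id means a place mentioned many times in the buffer is converted to a domain object only once.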
-