Package org.opensextant.annotations
DeepEye is an approach for simplifying typical NLP annotation exchanges. It provides a practical data model for annotations: any span of text tagged with some metadata in the context of a document. The resulting annotation can be serialized as JSON, stored in a database, and later deserialized or retrieved from that database. Each of these transformations from Java object state to representational state incurs some loss and some interpretation. DeepEye offers best practices and conveniences that support rapid prototyping where NLP is invoked natively or RESTfully and the outputs are persisted in databases.
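As a rough illustration of the serialization step described above, the following sketch writes one annotation out as a JSON string by hand. The class `JsonSketch`, its methods, and the JSON keys other than `rec_id` are assumptions made for this example; they are not the DeepEye API, and a real project would normally use a JSON library.

```java
/** Hypothetical sketch: hand-written JSON output for a single annotation. */
final class JsonSketch {

    /** Wraps a string in quotes, escaping backslashes and double quotes only.
     *  Control characters are not handled in this sketch. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes the fields of one annotation as a flat JSON object. */
    static String toJson(String recId, String name, String value, long offset) {
        // The Java long becomes a plain JSON number. A reader of this JSON must
        // decide which numeric type to restore, which is one small example of
        // the interpretation involved in a round trip.
        return "{"
            + quote("rec_id") + ":" + quote(recId) + ","
            + quote("name") + ":" + quote(name) + ","
            + quote("value") + ":" + quote(value) + ","
            + quote("offset") + ":" + offset
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(toJson("doc-1", "place", "Boston", 13L));
    }
}
```

The point of the sketch is only that the Java types do not survive in the output: everything becomes a JSON string or number, and the consumer has to reinterpret it.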
The key concepts are the Record and the Annotation. A Record object represents the original data and any associated metadata. A Record must have an identifier and usually relates to a single source. An Annotation is any key-value pair derived from a Record by some processing routine.
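The relationship between the two concepts can be sketched in a few lines of Java. The classes `SketchRecord`, `SketchAnnotation` and `DataModelDemo` below are hypothetical stand-ins written for this illustration; they are not the package's actual `Record` and `Annotation` classes, whose fields and constructors may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a Record: an identifier, the original text, and metadata. */
final class SketchRecord {
    final String id;
    final String text;
    final Map<String, Object> attributes = new LinkedHashMap<>();

    SketchRecord(String id, String text) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("A record must have an identifier");
        }
        this.id = id;
        this.text = text;
    }
}

/** Hypothetical stand-in for an Annotation: a name/value pair derived from a record. */
final class SketchAnnotation {
    final String recId;   // link back to the record
    final String name;
    final String value;
    final int start;      // span start offset, inclusive
    final int end;        // span end offset, exclusive

    SketchAnnotation(SketchRecord rec, String name, String value, int start, int end) {
        this.recId = rec.id;
        this.name = name;
        this.value = value;
        this.start = start;
        this.end = end;
    }

    /** Returns the text of the record covered by this annotation's span. */
    String spanOf(SketchRecord rec) {
        return rec.text.substring(start, end);
    }
}

final class DataModelDemo {
    /** A toy "processing routine": tags the first occurrence of a term, or returns null. */
    static SketchAnnotation tagFirst(SketchRecord rec, String name, String term) {
        int start = rec.text.indexOf(term);
        if (start < 0) {
            return null;
        }
        return new SketchAnnotation(rec, name, term, start, start + term.length());
    }

    public static void main(String[] args) {
        SketchRecord rec = new SketchRecord("doc-1", "Flights from Boston to Paris resume Monday.");
        SketchAnnotation ann = tagFirst(rec, "place", "Boston");
        System.out.println(ann.recId + " " + ann.name + "=" + ann.value + " @" + ann.start);
    }
}
```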
Data structure:
- Records have an id, text, and attributes. Optional fields include a processing state, which indicates that processing was done and that annotations were contributed by that processor.
- Annotations link to a Record by rec_id and have a name, value, offset(s), and attributes. Because processing may yield spurious, repetitive annotations, the AnnotationHelper can be used to cache the same name/value annotation as it appears over many span offsets. We call this convenience annotation compression.
- AnnotationHelper is a utility class that can be used to formulate common OpenSextant annotations from Java classes. This utility class also helps with annotation compression and distilling large results in memory.
- DeepEyeStore is a NoSQL-style API for finding and updating Records, saving and updating Annotations, and recording Record state. MongoDB, PostgreSQL and SQLite implementations have been attempted, of which MongoDB has been the most successful. This class is only an interface specification without implementation.
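The idea of annotation compression mentioned in the list above can be sketched as grouping repeated name/value pairs and collecting their offsets. This is a minimal illustration under stated assumptions: `CompressionSketch`, `Mention` and `compress` are hypothetical names, the tab-joined key is a simplification, and none of this reflects how AnnotationHelper is actually implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of annotation compression: one entry per name/value, many offsets. */
final class CompressionSketch {

    /** One raw mention: a name/value pair found at a single offset. */
    static final class Mention {
        final String name;
        final String value;
        final int offset;

        Mention(String name, String value, int offset) {
            this.name = name;
            this.value = value;
            this.offset = offset;
        }
    }

    /** Groups mentions by name/value, keeping offsets in order of appearance. */
    static Map<String, List<Integer>> compress(List<Mention> mentions) {
        Map<String, List<Integer>> compressed = new LinkedHashMap<>();
        for (Mention m : mentions) {
            // Simplification: a tab-joined key assumes names and values contain no tabs.
            String key = m.name + "\t" + m.value;
            compressed.computeIfAbsent(key, k -> new ArrayList<>()).add(m.offset);
        }
        return compressed;
    }

    public static void main(String[] args) {
        List<Mention> raw = new ArrayList<>();
        raw.add(new Mention("place", "Boston", 13));
        raw.add(new Mention("place", "Paris", 23));
        raw.add(new Mention("place", "Boston", 60));
        // Three raw mentions collapse into two entries; "Boston" keeps both offsets.
        System.out.println(compress(raw));
    }
}
```

Storing one entry with a list of offsets, rather than one entry per occurrence, is what keeps large result sets manageable in memory.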
Xponents sub-project "DeepEye", NLP methodology
Copyright 2013-2021 MITRE Corporation
Classes:
- Annotation: An annotation is at least a typed name/value pair created by something.
- AnnotationHelper: Basis for this optional helper class was three or four different projects using DeepEye as a model for persisting annotations from the typical Named Entity and Geo/Time extraction work.
- A base class for Record, Annotation and other structures.
- Exception used when there is a user or system error related to data serialization or any sort of Java object-to-JSONification error.
- DeepEyeStore: DeepEyeStore is an abstraction of a data store that stores records and annotations.
- Record: A record is a representation of the raw original.