DeepEye is an approach for simplifying typical NLP annotation exchanges. It provides a practical data model for annotations: any span of text tagged with some metadata in the context of a document. An annotation can be serialized as JSON, stored in a database, and later deserialized or retrieved from that database. Each of these transformations between Java object state and a representational state incurs some loss or requires some interpretation. DeepEye offers best practices and conveniences that support rapid prototyping where NLP is invoked natively or RESTfully and the outputs are persisted in databases.

The key concepts are the Record and the Annotation. A Record object represents the original data and any associated metadata. A Record must have an identifier and usually relates to a single source. An Annotation is any key-value pair derived from a Record by some processing routine.

Data structure:
- Records have an id, text, and attributes. Optional fields may also be present, such as a processing state indicating that a given processor has run and contributed annotations.
- Annotations link to a Record by rec_id and have a name, value, offset(s), and attributes. Because processing may yield spurious or repetitive annotations, the AnnotationHelper can cache a single name/value annotation as it appears over many span offsets. We term this convenience annotation compression.
- AnnotationHelper is a utility class that can be used to formulate common OpenSextant annotations from Java classes. This utility class also helps with annotation compression and distilling large results in memory.
- DeepEyeStore is a NoSQL-style API for finding and updating Records, saving and updating Annotations, and recording Record state. MongoDB, PostgreSQL and SQLite implementations have been attempted, of which MongoDB has been the most successful. DeepEyeStore itself is only an interface specification, without an implementation.
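The annotation compression idea described in the list above can be illustrated with a minimal sketch. The class names `SketchAnnotation` and `CompressionSketch`, their fields, and the `add` method are hypothetical and chosen only for illustration; they are not the actual Annotation or AnnotationHelper API, whose signatures may differ. The sketch assumes that compression means keeping one annotation per name/value pair and accumulating its span offsets.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for an annotation; not the real DeepEye class. */
class SketchAnnotation {
    final String recId;   // links the annotation back to its Record
    final String name;    // annotation type, e.g. "place"
    final String value;   // annotation value, e.g. "Boston"
    final List<int[]> offsets = new ArrayList<>(); // each entry is {start, end}

    SketchAnnotation(String recId, String name, String value) {
        this.recId = recId;
        this.name = name;
        this.value = value;
    }
}

/** Hypothetical helper showing one way to compress repeated annotations. */
class CompressionSketch {
    // Keyed by name and value so repeated mentions share a single annotation.
    private final Map<String, SketchAnnotation> cache = new LinkedHashMap<>();
    private final String recId;

    CompressionSketch(String recId) {
        this.recId = recId;
    }

    /** Records one mention; a repeated name/value pair only gains another offset. */
    void add(String name, String value, int start, int end) {
        String key = name + '\t' + value;
        SketchAnnotation a =
            cache.computeIfAbsent(key, k -> new SketchAnnotation(recId, name, value));
        a.offsets.add(new int[] {start, end});
    }

    /** Returns the compressed annotations in first-seen order. */
    List<SketchAnnotation> annotations() {
        return new ArrayList<>(cache.values());
    }

    public static void main(String[] args) {
        CompressionSketch helper = new CompressionSketch("doc-1");
        helper.add("place", "Boston", 10, 16);
        helper.add("place", "Boston", 52, 58); // same name/value, new span
        helper.add("person", "Alice", 0, 5);
        for (SketchAnnotation a : helper.annotations()) {
            System.out.println(a.name + "=" + a.value + " spans=" + a.offsets.size());
        }
    }
}
```

Under these assumptions, three raw mentions collapse to two stored annotations, which is the kind of reduction that matters when a processor emits many repeated entities for a long document.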
Xponents sub-project "DeepEye", NLP methodology
Copyright 2013-2021 MITRE Corporation
Interface Summary
- DeepEyeStore: DeepEyeStore is an abstraction of a data store that stores records and annotations.
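As a rough illustration of what a store abstraction over records and annotations might look like, here is a minimal in-memory sketch. The interface `SketchStore`, its method names, and the `SketchRecord` class are all hypothetical; they are not the DeepEyeStore method signatures, and annotations are reduced to `name=value` strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified record: an id, the raw text, and a processing state. */
class SketchRecord {
    final String id;
    final String text;
    String state = "new";

    SketchRecord(String id, String text) {
        this.id = id;
        this.text = text;
    }
}

/** Hypothetical store contract; the real DeepEyeStore methods may differ. */
interface SketchStore {
    void save(SketchRecord rec);
    SketchRecord findRecord(String recId);            // returns null when absent
    void updateState(String recId, String state);
    void saveAnnotation(String recId, String name, String value);
    List<String> findAnnotations(String recId);       // "name=value" strings
}

/** In-memory implementation, useful only for prototyping and tests. */
class InMemorySketchStore implements SketchStore {
    private final Map<String, SketchRecord> records = new HashMap<>();
    private final Map<String, List<String>> annotations = new HashMap<>();

    @Override
    public void save(SketchRecord rec) {
        records.put(rec.id, rec);
    }

    @Override
    public SketchRecord findRecord(String recId) {
        return records.get(recId);
    }

    @Override
    public void updateState(String recId, String state) {
        SketchRecord rec = records.get(recId);
        if (rec == null) {
            throw new IllegalArgumentException("No such record: " + recId);
        }
        rec.state = state;
    }

    @Override
    public void saveAnnotation(String recId, String name, String value) {
        annotations.computeIfAbsent(recId, k -> new ArrayList<>()).add(name + "=" + value);
    }

    @Override
    public List<String> findAnnotations(String recId) {
        // Return a copy so callers cannot mutate the store's internal list.
        return new ArrayList<>(annotations.getOrDefault(recId, new ArrayList<>()));
    }

    public static void main(String[] args) {
        SketchStore store = new InMemorySketchStore();
        store.save(new SketchRecord("doc-1", "Alice flew to Boston."));
        store.saveAnnotation("doc-1", "place", "Boston");
        store.updateState("doc-1", "geotagged");
        System.out.println(store.findRecord("doc-1").state);
        System.out.println(store.findAnnotations("doc-1"));
    }
}
```

Keeping the contract as an interface means a prototype can start with an in-memory map like this one and later swap in a database-backed implementation without changing the calling code.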
Class Summary
- Annotation: An annotation is at least a typed name/value pair created by something.
- AnnotationHelper: Basis for this optional helper class was three or four different projects using DeepEye as a model for persisting annotations from the typical Named Entity and Geo/Time extraction work.
- DeepEyeData: A base class for Record, Annotation and other structures.
- Record: A record is a representation of the raw original.
Exception Summary
- DeepEyeException: Exception used when there is a user or system error related to data serialization, or any error converting a Java object to JSON.