UIMA

Apache UIMA
Developer(s)	IBM, Apache Software Foundation (since October 2006)
Stable release	3.1.1 / November 8, 2019
Repository	svn.apache.org/repos/asf/uima/
Written in	Java with C++ enablement
Operating system	cross-platform
Type	text mining, information extraction
License	Apache License 2.0
Website	uima.apache.org

UIMA (/juˈiːmə/ yoo-EE-mə),^[1] short for Unstructured Information Management Architecture, is an OASIS standard^[2] for content analytics, originally developed at IBM. It provides a component software architecture for the development, discovery, composition, and deployment of multi-modal analytics for the analysis of unstructured information and integration with search technologies.

Structure

The UIMA architecture can be thought of in four dimensions:

It specifies component interfaces in an analytics pipeline.
It describes a set of design patterns.
It suggests two data representations: an in-memory representation of annotations for high-performance analytics and an XML representation of annotations for integration with remote web services.
It suggests development roles allowing tools to be used by users with diverse skills.
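
The first and third dimensions above can be illustrated with a minimal Java sketch. It is not the Apache UIMA API: every name in it (PipelineSketch, AnalysisDocument, Annotation, Annotator, WhitespaceTokenizer, CapitalizedWordAnnotator) and the XML layout are assumptions made up for illustration. The sketch only shows the general idea of components sharing one interface, writing annotations into a common in-memory structure, and exporting those annotations as XML for exchange with other services.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names and the XML layout are hypothetical,
// not part of the Apache UIMA API.
class PipelineSketch {
    /** Runs each component in order over one shared in-memory document. */
    static AnalysisDocument run(String text, List<Annotator> pipeline) {
        AnalysisDocument doc = new AnalysisDocument(text);
        for (Annotator annotator : pipeline) {
            annotator.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        AnalysisDocument doc = run("Watson uses UIMA",
                List.of(new WhitespaceTokenizer(), new CapitalizedWordAnnotator()));
        System.out.println(doc.toXml());
    }
}

/** A typed span over the document text (begin inclusive, end exclusive). */
final class Annotation {
    final String type;
    final int begin;
    final int end;

    Annotation(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

/** In-memory representation: the text plus the annotations added so far. */
class AnalysisDocument {
    final String text;
    final List<Annotation> annotations = new ArrayList<>();

    AnalysisDocument(String text) {
        this.text = text;
    }

    /** XML representation of the annotations, in a made-up layout. */
    String toXml() {
        StringBuilder sb = new StringBuilder("<annotations>");
        for (Annotation a : annotations) {
            sb.append("<annotation type=\"").append(a.type)
              .append("\" begin=\"").append(a.begin)
              .append("\" end=\"").append(a.end).append("\"/>");
        }
        return sb.append("</annotations>").toString();
    }
}

/** The single interface every analysis component implements. */
interface Annotator {
    void process(AnalysisDocument doc);
}

/** Marks each maximal run of non-whitespace characters as a Token. */
class WhitespaceTokenizer implements Annotator {
    public void process(AnalysisDocument doc) {
        int start = -1;
        for (int i = 0; i <= doc.text.length(); i++) {
            boolean boundary = i == doc.text.length()
                    || Character.isWhitespace(doc.text.charAt(i));
            if (!boundary && start < 0) {
                start = i;
            }
            if (boundary && start >= 0) {
                doc.annotations.add(new Annotation("Token", start, i));
                start = -1;
            }
        }
    }
}

/** Builds on the tokenizer's output: flags tokens starting with an uppercase letter. */
class CapitalizedWordAnnotator implements Annotator {
    public void process(AnalysisDocument doc) {
        List<Annotation> found = new ArrayList<>();
        for (Annotation a : doc.annotations) {
            if (a.type.equals("Token")
                    && Character.isUpperCase(doc.text.charAt(a.begin))) {
                found.add(new Annotation("Capitalized", a.begin, a.end));
            }
        }
        doc.annotations.addAll(found);
    }
}
```

For the input "Watson uses UIMA", the tokenizer adds three Token spans and the second component adds two Capitalized spans. The second component depends only on the annotations already stored in the shared document, not on the tokenizer class itself, which is what lets components be developed separately and composed later.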

Apache UIMA, a reference implementation of UIMA, is maintained by the Apache Software Foundation.

UIMA is used in a number of software projects:

IBM Research's Watson uses UIMA for analyzing unstructured data.^[4]
The Clinical Text Analysis and Knowledge Extraction System (Apache cTAKES) is a UIMA-based system for information extraction from medical records.
DKPro Core is a collection of reusable UIMA components for general-purpose natural language processing.