In digital lexicography, natural language processing, and digital humanities, a lexical resource is a language resource consisting of one or several dictionaries, e.g., in the form of a database (Gil Francopoulo).
Several standards exist for the machine-readable representation of lexical resources, e.g., the Lexical Markup Framework (LMF), an ISO standard for encoding lexical resources that comprises an abstract data model and an XML serialization, and OntoLex-Lemon, an RDF vocabulary for publishing lexical resources as knowledge graphs on the web, e.g., as Linguistic Linked Open Data.
Depending on the languages it covers, a lexical resource may be qualified as monolingual, bilingual or multilingual. In bilingual and multilingual lexical resources, the words of one language may or may not be connected to those of another. When they are connected, the equivalence between languages is expressed through a bilingual link (for bilingual lexical resources, e.g., using the relation vartrans:translatableAs in OntoLex-Lemon) or through multilingual notations (for multilingual lexical resources, e.g., by reference to the same ontolex:LexicalConcept in OntoLex-Lemon).
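The two ways of connecting words across languages can be illustrated with a minimal Python sketch. It is not an implementation of OntoLex-Lemon or LMF: the names `Entry`, `MultilingualLexicon`, `concept_id` and the `concept:cat` identifier are hypothetical and chosen only for illustration. A direct bilingual link is stored as an unordered pair of entries, while multilingual equivalence is derived from a shared concept identifier.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Entry:
    """One word of one language (hypothetical, simplified structure)."""
    lemma: str
    language: str
    concept_id: Optional[str] = None  # language-independent concept, if any


class MultilingualLexicon:
    """Toy container holding entries plus direct bilingual links."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []
        self.links: Set[FrozenSet[Entry]] = set()

    def add(self, entry: Entry) -> Entry:
        self.entries.append(entry)
        return entry

    def link(self, a: Entry, b: Entry) -> None:
        # Bilingual link: an unordered pair, so it holds in both directions.
        self.links.add(frozenset((a, b)))

    def linked_translations(self, entry: Entry) -> List[Entry]:
        # Follow only the explicit bilingual links.
        found = [e for pair in self.links if entry in pair
                 for e in pair if e != entry]
        return sorted(found, key=lambda e: (e.language, e.lemma))

    def concept_equivalents(self, entry: Entry) -> List[Entry]:
        # Multilingual notation: entries of other languages that point
        # to the same concept identifier.
        if entry.concept_id is None:
            return []
        return [e for e in self.entries
                if e.concept_id == entry.concept_id
                and e.language != entry.language]


lexicon = MultilingualLexicon()
cat_en = lexicon.add(Entry("cat", "en", concept_id="concept:cat"))
chat_fr = lexicon.add(Entry("chat", "fr", concept_id="concept:cat"))
katze_de = lexicon.add(Entry("Katze", "de", concept_id="concept:cat"))
lexicon.link(cat_en, chat_fr)  # explicit English-French link only

print([e.lemma for e in lexicon.linked_translations(cat_en)])  # ['chat']
print([e.lemma for e in lexicon.concept_equivalents(cat_en)])  # ['chat', 'Katze']
```

In this sketch, adding a further language requires one link per language pair under the bilingual approach, but only one concept reference per entry under the multilingual approach.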
It is also possible to build and manage a lexical resource consisting of different lexicons of the same language, for instance, one dictionary for general words and one or several dictionaries for different specialized domains.
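As a rough illustration of such an arrangement, the following Python sketch keeps one general lexicon and two domain lexicons for the same language and consults the requested domain lexicons before the general one. The lexicon names, the sample definitions and the `lookup` function are all assumptions made for this example. The lookup order is one possible design choice, not a prescribed one.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical lexicons of a single language, keyed by lexicon name.
LEXICONS: Dict[str, Dict[str, str]] = {
    "general": {
        "mouse": "small rodent",
        "virus": "infectious agent",
    },
    "computing": {
        "mouse": "hand-held pointing device",
        "virus": "self-replicating malicious program",
    },
    "medicine": {
        "virus": "infectious agent that replicates inside living cells",
    },
}


def lookup(word: str, domains: Sequence[str] = ()) -> List[Tuple[str, str]]:
    """Return (lexicon name, definition) pairs for a word.

    The requested domain lexicons are consulted first, in the order
    given, followed by the general lexicon as a fallback.
    """
    order = [d for d in domains if d != "general"] + ["general"]
    results = []
    for name in order:
        definition = LEXICONS.get(name, {}).get(word)
        if definition is not None:
            results.append((name, definition))
    return results


print(lookup("mouse", ["computing"]))
print(lookup("mouse"))
```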
Machine-readable dictionary vs. NLP dictionary
Lexical resources in digital lexicography are often referred to as machine-readable dictionaries (MRDs): dictionaries stored as machine (computer) data instead of being printed on paper. An MRD is thus both an electronic dictionary and a lexical database. The term MRD is often contrasted with NLP dictionary: an MRD is the electronic form of a dictionary that was previously printed on paper, whereas the term NLP dictionary is preferred when the dictionary was built from scratch with NLP in mind. Both kinds are used by programs.
A lexical database is a lexical resource with an associated software environment, a database, that permits access to its contents. The database may be custom-designed for the lexical information or be a general-purpose database into which lexical information has been entered.
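The second case, lexical information entered into a general-purpose database, can be sketched with Python's built-in `sqlite3` module. The schema (tables `entry` and `sense`), the sample data and the helper `senses_of` are assumptions made for illustration and do not follow any particular standard.

```python
import sqlite3

# In-memory general-purpose relational database holding lexical data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entry (
        id    INTEGER PRIMARY KEY,
        lemma TEXT NOT NULL,
        pos   TEXT NOT NULL
    );
    CREATE TABLE sense (
        id         INTEGER PRIMARY KEY,
        entry_id   INTEGER NOT NULL REFERENCES entry(id),
        definition TEXT NOT NULL
    );
    """
)

# Hypothetical sample entry with two senses.
conn.execute("INSERT INTO entry (id, lemma, pos) VALUES (1, 'bank', 'noun')")
conn.executemany(
    "INSERT INTO sense (entry_id, definition) VALUES (?, ?)",
    [(1, "financial institution"), (1, "sloping land beside a river")],
)
conn.commit()


def senses_of(lemma: str) -> list:
    """Return the definitions recorded for a lemma, in insertion order."""
    rows = conn.execute(
        "SELECT s.definition FROM sense AS s "
        "JOIN entry AS e ON e.id = s.entry_id "
        "WHERE e.lemma = ? ORDER BY s.id",
        (lemma,),
    )
    return [row[0] for row in rows]


print(senses_of("bank"))
# ['financial institution', 'sloping land beside a river']
```

Here the query interface of the database plays the role of the software environment that gives access to the lexical content.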
- Lexical Markup Framework (LMF), ISO standard for encoding lexical resources, comprising an abstract data model and an XML serialization
- OntoLex-Lemon, RDF vocabulary for publishing lexical resources on the web, e.g., as Linguistic Linked Open Data
- Lexical database
- LREC conference series
- Machine-readable dictionary
- Sarma, Shikhar Kr.; et al. (2012). "Building multilingual lexical resources using wordnets: Structure, design and implementation". Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon. pp. 161–170.
- Francopoulo, Gil; Bel, Nuria; George, Monte; Calzolari, Nicoletta; Monachini, Monica; Pet, Mandy; Soria, Claudia (2009-03-01). "Multilingual resources for NLP in the lexical markup framework (LMF)" (PDF). Language Resources and Evaluation. 43 (1): 57–70. doi:10.1007/s10579-008-9077-5. ISSN 1574-0218. S2CID 7697316.
- Cimiano, Philipp; Chiarcos, Christian; McCrae, John P.; Gracia, Jorge (2020), Linguistic Linked Data: Representation, Generation and Applications, Springer International Publishing, pp. 45–59, doi:10.1007/978-3-030-30225-2_4, ISBN 978-3-030-30225-2
- Cimiano, Philipp; McCrae, John P.; Buitelaar, Paul. "Lexicon Model for Ontologies: Community Report, 10 May 2016 Final Community Group Report 10 May 2016". W3C. Retrieved 6 December 2019.
- Francopoulo, Gil, ed. (2013). LMF Lexical Markup Framework. ISTE / Wiley. ISBN 978-1-84821-430-9.