LexMan

From String
Revision as of 13:43, 7 March 2012 by Njm
Acronym

LexMan stands for Lexical Morphological Analyzer.


Brief Description

LexMan is responsible for assigning to each token its part-of-speech (POS) and any other relevant morphosyntactic features (gender, number, tense, mood, case, degree, etc.), using finite-state transducers.
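To make the idea of tagging with a finite-state transducer concrete, here is a minimal sketch of a toy deterministic transducer that maps a surface token to a lemma plus features. It is not LexMan's implementation: the transition table, the English example words, and the output notation are all assumptions chosen for illustration, and real morphological transducers are usually nondeterministic and compiled from a lexicon.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy transducer: (state, input symbol) -> (next state, output).
# It accepts only "cat" and "cats"; everything here is illustrative.
TRANSITIONS: Dict[Tuple[int, str], Tuple[int, str]] = {
    (0, "c"): (1, "c"),
    (1, "a"): (2, "a"),
    (2, "t"): (3, "t"),
    (3, "s"): (4, ""),   # the plural marker is consumed but not copied
}

# Accepting states and the feature string emitted on acceptance.
FINAL_OUTPUT: Dict[int, str] = {
    3: "+noun+singular",
    4: "+noun+plural",
}


def analyze(token: str) -> Optional[str]:
    """Run the transducer over the token; return its analysis or None."""
    state = 0
    emitted = []
    for symbol in token:
        step = TRANSITIONS.get((state, symbol))
        if step is None:
            return None  # no transition: the token is not recognized
        state, output = step
        emitted.append(output)
    if state not in FINAL_OUTPUT:
        return None  # input ended in a non-accepting state
    return "".join(emitted) + FINAL_OUTPUT[state]
```

Calling `analyze("cats")` walks the states 0 to 4 and returns the lemma followed by the features of the accepting state; tokens outside the toy lexicon yield `None`.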

LexMan uses a very rich, highly granular tagset, featuring 12 categories (noun, verb, adjective, pronoun, article, adverb, preposition, conjunction, numeral, interjection, punctuation, and symbol) and 11 fields: category (CAT), subcategory (SCT), mood (MOD), tense (TEN), person (PER), number (NUM), gender (GEN), degree (DEG), case (CAS), syntactic features (SYN), and semantic features (SEM). No category uses all eleven fields.
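One way to picture a tag with these 11 fields is as a record in which unused fields stay empty. The sketch below is only an illustration: the field names come from the list above, but the `Tag` class, the `used_fields` helper, and the example values ("noun", "common", "singular", "masculine") are assumptions and do not reflect LexMan's actual tag codes.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# Field names taken from the tagset description, in the order listed there.
FIELD_NAMES = ("CAT", "SCT", "MOD", "TEN", "PER", "NUM",
               "GEN", "DEG", "CAS", "SYN", "SEM")


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag record: every field except CAT is optional."""
    CAT: str
    SCT: Optional[str] = None
    MOD: Optional[str] = None
    TEN: Optional[str] = None
    PER: Optional[str] = None
    NUM: Optional[str] = None
    GEN: Optional[str] = None
    DEG: Optional[str] = None
    CAS: Optional[str] = None
    SYN: Optional[str] = None
    SEM: Optional[str] = None

    def used_fields(self) -> List[str]:
        """Names of the fields that carry a value, in declaration order."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) is not None]


# Illustrative values only; a noun would leave mood, tense, etc. unset.
noun_tag = Tag(CAT="noun", SCT="common", NUM="singular", GEN="masculine")
```

Here `noun_tag.used_fields()` lists only the four fields that were set, which mirrors the observation that no category fills in every field.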

LexMan is used to generate and validate all the inflected forms associated with lexical lemmas, along with the corresponding morphosyntactic information. LexMan also provides an efficient, fast, and flexible way of maintaining and updating the lexicons.
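Generating and validating inflected forms from lemmas can be sketched as expanding each lemma through an inflection paradigm. The example below is a simplified illustration under stated assumptions: the `PARADIGMS` and `LEXICON` tables, the suffix-only paradigm format, and the English words are hypothetical and are not taken from LexMan's lexicons.

```python
from typing import Dict, List, Tuple

# Hypothetical paradigm table: paradigm name -> (suffix, features) pairs.
PARADIGMS: Dict[str, List[Tuple[str, str]]] = {
    "regular_noun": [("", "NUM=singular"), ("s", "NUM=plural")],
}

# Hypothetical lexicon: lemma -> name of the paradigm it follows.
LEXICON: Dict[str, str] = {
    "cat": "regular_noun",
    "dog": "regular_noun",
}


def generate(lexicon: Dict[str, str],
             paradigms: Dict[str, List[Tuple[str, str]]]
             ) -> Dict[str, List[Tuple[str, str]]]:
    """Expand every lemma into its inflected forms with (lemma, features)."""
    forms: Dict[str, List[Tuple[str, str]]] = {}
    for lemma, paradigm_name in lexicon.items():
        for suffix, features in paradigms[paradigm_name]:
            forms.setdefault(lemma + suffix, []).append((lemma, features))
    return forms


FORMS = generate(LEXICON, PARADIGMS)


def validate(form: str) -> bool:
    """A form is valid if some lemma and paradigm generate it."""
    return form in FORMS
```

Under this layout, updating the lexicon amounts to adding a lemma with its paradigm name and regenerating the form table; each form maps to a list so that ambiguous forms can carry more than one analysis.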


Module evolution

A new version of LexMan, capable of performing tokenization, is currently being developed by Alexandre Vicente.


Publications