LexMan stands for Lexical Morphological Analyzer.

Brief Description

LexMan is responsible for assigning to each token its part of speech (POS) and any other relevant morphosyntactic features (gender, number, tense, mood, case, degree, etc.).

LexMan uses a very rich, highly granular tagset, featuring 12 categories (noun, verb, adjective, pronoun, article, adverb, preposition, conjunction, numeral, interjection, punctuation, and symbol) and 11 fields: category (CAT), subcategory (SCT), mood (MOD), tense (TEN), person (PER), number (NUM), gender (GEN), degree (DEG), case (CAS), syntactic features (SYN), and semantic features (SEM). No category uses all eleven fields.
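
As an illustration of the tagset's shape, the categories and fields listed above can be modelled as a record in which unused fields are simply left empty. This is a minimal sketch in Python, not LexMan's actual representation: the class name `Tag`, the helper `used_fields`, and the feature values such as "singular" are assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import Optional

# The 12 categories named in the description above.
CATEGORIES = (
    "noun", "verb", "adjective", "pronoun", "article", "adverb",
    "preposition", "conjunction", "numeral", "interjection",
    "punctuation", "symbol",
)


@dataclass(frozen=True)
class Tag:
    """Hypothetical container for the 11 tagset fields (not LexMan's own format)."""
    CAT: str                   # category
    SCT: Optional[str] = None  # subcategory
    MOD: Optional[str] = None  # mood
    TEN: Optional[str] = None  # tense
    PER: Optional[str] = None  # person
    NUM: Optional[str] = None  # number
    GEN: Optional[str] = None  # gender
    DEG: Optional[str] = None  # degree
    CAS: Optional[str] = None  # case
    SYN: Optional[str] = None  # syntactic features
    SEM: Optional[str] = None  # semantic features

    def __post_init__(self):
        # Reject anything outside the 12 known categories.
        if self.CAT not in CATEGORIES:
            raise ValueError(f"unknown category: {self.CAT!r}")

    def used_fields(self):
        """Return only the fields that carry a value for this tag."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


# Illustrative values only; the real value inventory is not given here.
noun_tag = Tag("noun", NUM="singular", GEN="feminine")
print(noun_tag.used_fields())
```

Leaving a field empty reflects the point that each category fills in only the subset of fields that is relevant to it.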

LexMan is used to generate and validate all the inflected forms associated with lexical lemmas, along with the corresponding morphosyntactic information. LexMan also provides a fast, efficient and flexible way of maintaining and updating the lexicons.
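
The idea of generating and validating inflected forms from a lemma can be sketched with a toy suffix-rule paradigm. LexMan's actual lexicon and paradigm formats are not described here, so everything in this example is an assumption: the `Rule` type, the `generate` and `validate` functions, the two paradigms, and the English sample words are hypothetical and serve only to show the general technique.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical inflection rule: strip a suffix, add another, attach features."""
    strip: str
    add: str
    features: dict


# Toy paradigms (assumptions for illustration, not LexMan data).
NOUN_REGULAR = [
    Rule("", "", {"NUM": "singular"}),
    Rule("", "s", {"NUM": "plural"}),
]
NOUN_Y = [
    Rule("", "", {"NUM": "singular"}),
    Rule("y", "ies", {"NUM": "plural"}),
]


def generate(lemma, paradigm):
    """Return every (form, features) pair the paradigm yields for the lemma."""
    forms = []
    for rule in paradigm:
        if not lemma.endswith(rule.strip):
            raise ValueError(f"{lemma!r} does not end in {rule.strip!r}")
        stem = lemma[: len(lemma) - len(rule.strip)]
        forms.append((stem + rule.add, rule.features))
    return forms


def validate(form, lemma, paradigm):
    """Return the feature sets under which `form` is an inflection of `lemma`."""
    return [feats for f, feats in generate(lemma, paradigm) if f == form]


print(generate("city", NOUN_Y))
print(validate("cats", "cat", NOUN_REGULAR))
```

With lemmas stored alongside a paradigm identifier, updating the lexicon amounts to adding a lemma or editing one paradigm, rather than listing every inflected form by hand.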

Module evolution

A new version of LexMan, capable of performing tokenization, is currently being developed by Alexandre Vicente.