Difference between revisions of "LexMan"


Revision as of 13:43, 7 March 2012

Acronym

LexMan stands for Lexical Morphological Analyzer.


Brief Description

LexMan is responsible for assigning each token its part of speech (POS) and any other relevant morphosyntactic features (gender, number, tense, mood, case, degree, etc.), using finite state transducers.
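
To illustrate how a finite state transducer can map a token to a part-of-speech analysis, here is a minimal sketch. It is not LexMan's actual transducer: the toy word, the state numbers, the `TRANSITIONS` and `FINAL_OUTPUT` tables, and the output notation (`+NOUN+SG`) are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical toy transducer: (state, input symbol) -> (next state, output string).
# It accepts only "cat" and "cats"; a real lexicon would hold many more paths.
TRANSITIONS = {
    (0, "c"): (1, "c"),
    (1, "a"): (2, "a"),
    (2, "t"): (3, "t"),
    (3, "s"): (4, "+NOUN+PL"),
}

# Output emitted when the input ends in a final state.
FINAL_OUTPUT = {
    3: "+NOUN+SG",
    4: "",
}


def transduce(token: str) -> Optional[str]:
    """Return the analysis string for token, or None if the token is rejected."""
    state = 0
    output = []
    for symbol in token:
        step = TRANSITIONS.get((state, symbol))
        if step is None:
            return None  # no transition: the token is not in the toy lexicon
        state, emitted = step
        output.append(emitted)
    if state not in FINAL_OUTPUT:
        return None  # input ended in a non-final state
    output.append(FINAL_OUTPUT[state])
    return "".join(output)


print(transduce("cat"))
print(transduce("cats"))
print(transduce("ca"))
```

The transducer reads the token one symbol at a time and emits the lemma plus its features, so lookup cost grows with the token length rather than with the lexicon size.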

LexMan uses a very rich, highly granular tagset, featuring 12 categories (namely noun, verb, adjective, pronoun, article, adverb, preposition, conjunction, numeral, interjection, punctuation, and symbol) and 11 fields (namely category (CAT), subcategory (SCT), mood (MOD), tense (TEN), person (PER), number (NUM), gender (GEN), degree (DEG), case (CAS), syntactic features (SYN), and semantic features (SEM)). No category uses all eleven fields.
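
One way to picture a tag built from the fields listed above is as a record in which unused fields stay empty. The sketch below is only an illustration: the `Tag` class, the `used_fields` helper, and the value codes (`"noun"`, `"common"`, `"s"`, `"m"`) are hypothetical and do not reproduce LexMan's actual tag encoding.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class Tag:
    """Hypothetical record with one attribute per tagset field."""
    CAT: str                   # category (always present)
    SCT: Optional[str] = None  # subcategory
    MOD: Optional[str] = None  # mood
    TEN: Optional[str] = None  # tense
    PER: Optional[str] = None  # person
    NUM: Optional[str] = None  # number
    GEN: Optional[str] = None  # gender
    DEG: Optional[str] = None  # degree
    CAS: Optional[str] = None  # case
    SYN: Optional[str] = None  # syntactic features
    SEM: Optional[str] = None  # semantic features

    def used_fields(self) -> Dict[str, str]:
        """Return only the fields that carry a value for this tag."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


# A noun tag with placeholder value codes; mood, tense, and person stay unset.
noun_tag = Tag(CAT="noun", SCT="common", NUM="s", GEN="m")
print(noun_tag.used_fields())
```

Leaving fields unset mirrors the statement that no category uses every field: a noun has no mood or tense, so those slots simply remain empty.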

LexMan is used to generate and validate all the inflected forms associated with lexical lemmas, along with the corresponding morphosyntactic information. LexMan also provides an efficient, fast, and flexible way of maintaining and updating the lexicons.
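
Generating and validating inflected forms from a lemma can be sketched with a small paradigm table. This is an assumption-laden toy, not LexMan's lexicon format: the `REGULAR_VERB` paradigm, its rule layout (suffix to strip, suffix to add, feature label), and the `generate` and `validate` helpers are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical paradigm: (suffix to strip from the lemma, suffix to add, feature label).
Paradigm = List[Tuple[str, str, str]]

REGULAR_VERB: Paradigm = [
    ("", "", "inf"),
    ("", "s", "3sg"),
    ("", "ed", "past"),
    ("", "ing", "ger"),
]


def generate(lemma: str, paradigm: Paradigm) -> List[Tuple[str, str]]:
    """Return (inflected form, feature label) pairs for a lemma."""
    forms = []
    for strip, add, label in paradigm:
        if not lemma.endswith(strip):
            continue  # rule does not apply to this lemma
        stem = lemma[: len(lemma) - len(strip)]
        forms.append((stem + add, label))
    return forms


def validate(form: str, lemma: str, paradigm: Paradigm) -> bool:
    """Check whether form is one of the forms generated for lemma."""
    return any(generated == form for generated, _ in generate(lemma, paradigm))


print(generate("walk", REGULAR_VERB))
print(validate("walked", "walk", REGULAR_VERB))
print(validate("walken", "walk", REGULAR_VERB))
```

Keeping the rules in a table separate from the lemmas means a new lemma only needs a paradigm name, which is one plausible reading of how a lexicon can stay easy to maintain and update.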


Module evolution

A new version of LexMan, capable of performing tokenization, is currently being developed by Alexandre Vicente.


Publications