Difference between revisions of "LexMan"

From String
Line 1: Line 1:
 
===== Acronym =====
 
===== Acronym =====
'''''LexMan''''' stands for '''''Lex'''''lical '''''M'''''orphological '''''an'''''alizer
+
'''''LexMan''''' stands for '''''Lex'''''ical '''''M'''''orphological '''''an'''''alyzer
  
  
Line 6: Line 6:
 
LexMan is responsible for assigning to each token its part of speech (POS) and any other relevant morphosyntactic features (gender, number, tense, mood, case, degree, etc.).
 
LexMan is responsible for assigning to each token its part of speech (POS) and any other relevant morphosyntactic features (gender, number, tense, mood, case, degree, etc.).
  
The rich tag set has a high granularity featuring 12 POS categories and 11 fields.
+
LexMan uses a very rich, highly granular tagset, featuring 12 '''categories''' (namely noun, verb, adjective, pronoun, article, adverb, preposition, conjunction, numeral, interjection, punctuation, and symbol) and 11 '''fields''' (namely category (CAT), subcategory (SCT), mood (MOD), tense (TEN), person (PER), number (NUM), gender (GEN), degree (DEG), case (CAS), syntactic features (SYN), and semantic features (SEM)). No category uses all eleven fields.
 
 
 
 
LexMan is responsible for the morphosyntactic tagging (POS tagging) of the chain. LexMan assigns to each of the previously identified segments all of its possible morphosyntactic tags, that is, it classifies a segment as a symbol, a number, a verb, etc. For inflected categories, it also indicates the respective inflectional values (tense, mood, person-number, gender, number, degree, etc.). A word with more than one tag is ambiguous from a morphosyntactic point of view.
 
 
 
  
 +
LexMan is used to generate and validate all the inflected forms associated with lexical lemmas, along with the corresponding morphosyntactic information.
 +
LexMan also provides an efficient, fast, and flexible way of maintaining and updating the lexicons.
  
 
===== Module evolution =====
 
===== Module evolution =====
a new version, capable of performing tokenization is being developed by Alexandre Vicente.
+
A new version of LexMan, capable of performing tokenization, is currently being developed by Alexandre Vicente.
  
  
 
===== Publications =====
 
===== Publications =====

Revision as of 10:17, 7 March 2012

Acronym

LexMan stands for Lexical Morphological analyzer


Brief Description

LexMan is responsible for assigning to each token its part of speech (POS) and any other relevant morphosyntactic features (gender, number, tense, mood, case, degree, etc.).

LexMan uses a very rich, highly granular tagset, featuring 12 categories (namely noun, verb, adjective, pronoun, article, adverb, preposition, conjunction, numeral, interjection, punctuation, and symbol) and 11 fields (namely category (CAT), subcategory (SCT), mood (MOD), tense (TEN), person (PER), number (NUM), gender (GEN), degree (DEG), case (CAS), syntactic features (SYN), and semantic features (SEM)). No category uses all eleven fields.
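
As a rough illustration of the tagset layout described above, the following Python sketch models a tag as a record with the 11 named fields, where fields that do not apply to a category are simply left empty. This is not LexMan's actual tag encoding, which is not given here: the class name `Tag`, the helper `used_fields`, and the field values (`"common"`, `"singular"`, `"feminine"`) are assumptions made only for the example.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# The 12 categories listed in the description above.
CATEGORIES = (
    "noun", "verb", "adjective", "pronoun", "article", "adverb",
    "preposition", "conjunction", "numeral", "interjection",
    "punctuation", "symbol",
)


@dataclass(frozen=True)
class Tag:
    """Hypothetical record with the 11 tagset fields; None means 'not used'."""
    CAT: str                    # category
    SCT: Optional[str] = None   # subcategory
    MOD: Optional[str] = None   # mood
    TEN: Optional[str] = None   # tense
    PER: Optional[str] = None   # person
    NUM: Optional[str] = None   # number
    GEN: Optional[str] = None   # gender
    DEG: Optional[str] = None   # degree
    CAS: Optional[str] = None   # case
    SYN: Optional[str] = None   # syntactic features
    SEM: Optional[str] = None   # semantic features

    def __post_init__(self) -> None:
        # Reject categories outside the 12 listed ones.
        if self.CAT not in CATEGORIES:
            raise ValueError(f"unknown category: {self.CAT}")

    def used_fields(self) -> List[str]:
        """Return the names of the fields that carry a value, in field order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


# Illustrative values only: a noun tag that fills 4 of the 11 fields.
tag = Tag(CAT="noun", SCT="common", NUM="singular", GEN="feminine")
print(tag.used_fields())
```

Leaving unused fields empty mirrors the remark that no category uses all eleven fields: a noun has no mood or tense, for instance, while a verb has no degree.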

LexMan is used to generate and validate all the inflected forms associated with lexical lemmas, along with the corresponding morphosyntactic information. LexMan also provides an efficient, fast, and flexible way of maintaining and updating the lexicons.

Module evolution

A new version of LexMan, capable of performing tokenization, is currently being developed by Alexandre Vicente.


Publications