RuDriCo2

Revision as of 11:48, 5 March 2012 by Njm

The RuDriCo2 module is responsible for word splitting, i.e. resolving contractions (e.g. comigo = com/Prep + eu/Pron). It also applies a considerably large set of disambiguation rules and, finally, identifies many unambiguous compound words.
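The contraction-splitting step can be illustrated with a minimal Python sketch. This is not RuDriCo2's rule language or implementation: the lookup table, the function name `split_contractions` and the `Art` tag are assumptions made for illustration, and only the comigo entry comes from the text above.

```python
# Hypothetical contraction table: surface form -> list of (form, POS tag).
# Only "comigo" is taken from the page; the other entries and the "Art"
# tag are illustrative assumptions.
CONTRACTIONS = {
    "comigo": [("com", "Prep"), ("eu", "Pron")],
    "do": [("de", "Prep"), ("o", "Art")],
    "na": [("em", "Prep"), ("a", "Art")],
}


def split_contractions(tokens):
    """Expand every token found in the table into its component segments."""
    result = []
    for token in tokens:
        parts = CONTRACTIONS.get(token.lower())
        if parts is None:
            # Not a known contraction: keep the token, tag left unknown.
            result.append((token, None))
        else:
            result.extend(parts)
    return result


segments = split_contractions(["Ele", "veio", "comigo"])
```

Here `segments` holds the two untouched tokens followed by the pair com/Prep and eu/Pron. A real system needs context-sensitive rules rather than a plain lookup, since some surface forms are ambiguous.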

This new version, RuDriCo2, is significantly (10 times) faster than the previous version, uses a more expressive rule language (allowing negation, disjunction, and regular expressions over both the lemma and the surface form), and brings its syntax closer to that of the XIP parser (see below). It also validates the input data and reports error messages and warnings for potential problems.
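To make the expressiveness claim concrete, the following Python sketch shows how negation, disjunction, and regular expressions over lemma and surface form can be combined into a single match condition. It does not reproduce RuDriCo2 or XIP syntax: the `Token` class, the helper names, and the sample condition are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    surface: str  # word form as it appears in the text
    lemma: str    # dictionary form


def surface_matches(pattern):
    """Condition: the whole surface form matches the regular expression."""
    return lambda tok: re.fullmatch(pattern, tok.surface) is not None


def lemma_matches(pattern):
    """Condition: the whole lemma matches the regular expression."""
    return lambda tok: re.fullmatch(pattern, tok.lemma) is not None


def negate(cond):
    """Negation of a condition."""
    return lambda tok: not cond(tok)


def any_of(*conds):
    """Disjunction: at least one condition holds."""
    return lambda tok: any(c(tok) for c in conds)


def all_of(*conds):
    """Conjunction: every condition holds."""
    return lambda tok: all(c(tok) for c in conds)


# Hypothetical condition: the lemma is "ser" or "estar", and the surface
# form does NOT end in "mos".
condition = all_of(
    any_of(lemma_matches("ser"), lemma_matches("estar")),
    negate(surface_matches(r".*mos")),
)

matches_e = condition(Token("é", "ser"))
matches_somos = condition(Token("somos", "ser"))
```

With these definitions `matches_e` is true and `matches_somos` is false, because the second token is excluded by the negated surface pattern.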



PUBLICATIONS

* Cláudio Diniz, Um Conversor baseado em regras de transformação declarativas, MSc thesis, Instituto Superior Técnico, Universidade Técnica de Lisboa, Lisboa, Portugal, October 2010

* Cláudio Diniz, Nuno Mamede, João D. Pereira, RuDriCo2 - a faster disambiguator and segmentation modifier, in II Simpósio de Informática (INForum 2010), Universidade do Minho, pages 573-584, September 2010