User contributions for Eugenio
10 January 2024
- 14:24, 10 January 2024 +29,172 N Dictionaries Created page with "<div style="float:right;">__TOC__</div> === Description === STRING operates based on large-sized, comprehensive, highly granular lexical resources. Much emphasis is put in building them, under the conviction that the lexicon is key to many NLP tasks and applications. This page, constantly under construction, describes briefly the main resources already available and being used by STRING. === LexMan Dictionary === LexMan uses a dictionary of lemmas containing, for the m..."
- 14:12, 10 January 2024 +13,158 N Corpora Created page with "=== Zero Anaphora Corpus (ZAC) === <div style="float:right;">__TOC__</div> ZAC - Zero Anaphora Corpus is a corpus of Brazilian Portuguese texts built in view of the construction of an Anaphora Resolution system, which is part of the STRING system. The ZAC corpus is aimed at the resolution of the so-called zero-anaphora, that is, an anaphora relation where the anaphoric expression (or anaphor) has been zeroed. In the following, we briefly present the main linguistic asp..."
- 13:54, 10 January 2024 0 MARv4 No edit summary current
- 13:52, 10 January 2024 +5,752 N MARv4 Created page with "<div style="float:right;">__TOC__</div> ==== Acronym ==== '''''MARv''''' stands for '''M'''orphossyntactic '''A'''mbiguity '''R'''esol'''v'''er ==== Introduction ==== MARv2's architecture comprehends two submodules: a set of linguistically-oriented disambiguation rules module and a probabilistic disambiguation module. The linguistic-oriented is no longer used in the STRING chain because that function is now implemented by the RuDriCo module. MARv2..."
- 13:48, 10 January 2024 +2,936 N InverseDic Created page with "{{DISPLAYTITLE: Inverse Vocabulary of Contemporary Portuguese (InVoc-PT)}} === Presentation === <div style="float:right;">__TOC__</div> An inverse vocabulary is a particular type of vocabulary in which words are presented in alphabetical order but sorted from the last to first first character. For example, here are some words (non-contiguous in the alphabet) in the order as they are shown in the inverse vocabulary: aba, alba, alga, malga, salga, ala, bala, pala, tala, e..." current
- 13:33, 10 January 2024 +5,984 N RuDriCo2 Created page with "<div style="float:right;">__TOC__</div> ==== Acronym ==== '''''RuDriCo''''' stands for '''''Ru'''''le '''''Dri'''''ven '''''Co'''''nverter ==== Brief Description ==== RuDriCo2's main goal is to provide for an adjustment of the results produced by the LexMan morphological analyzer to the specific needs of each parser. In order to achieve this, it modifies the segmentation that is done by the former. For example, it might contract expressions provided by the morp..." current
- 13:31, 10 January 2024 +2,134 N LexMan Created page with "<div style="float:right;">__TOC__</div> ==== Acronym ==== '''''LexMan''''' stands for '''Lex'''ical '''M'''orphological '''an'''alyzer ==== Brief Description ==== LexMan is responsible for according to each token its part-of-speech (POS) and any other relevant morphosyntactic feature (gender, number, tense, mood, case, degree, etc.), using [http://en.wikipedia.org/wiki/Finite_state_transducer finite state transducers]. LexMan uses very rich, highly granular ta..." current
- 13:24, 10 January 2024 +1,040 N Contact Created page with "Any comments, suggestions, doubts or ideas, please contact us! We would like to hear from you! We are located in Lisbon, [http://www.l2f.inesc-id.pt/wiki/index.php/Location near the Saldanha area].<br> A general path finder is [http://www.transporlis.sapo.pt/index.cfm here]. Special options can be found [http://www.l2f.inesc-id.pt/wiki/index.php/Contacts_and_Directions here]. ==== Contacts ==== {| width="400" cellspacing="2" cellpadding="2" |- ! width="16%" valign="TO..." current
- 13:22, 10 January 2024 +20,426 N XIP Created page with "<div style="float:right;">__TOC__</div> ==== Acronym ==== '''''XIP''''' stands for '''''X'''''EROX '''''I'''''ncremental '''''P'''''arsing ==== Introduction ==== XIP is a <span class="plainlinks">[http://www.xrce.xerox.com/Research-Development/Document-Content-Laboratory/Parsing-Semantics/Robust-Parsing XEROX]</span> parser, based on finite-state technology and able to perform several tasks, namely: * adding lexical, syntactic and semantic information; * applying..." current
- 13:15, 10 January 2024 +24,448 N Publications Created page with "<div style="float:right;">__TOC__</div> ====in 2016==== '''[73]''' Francisco Dias [http://www.inesc-id.pt/ficheiros/publicacoes/10593.pdf Multilingual Automated Text Anonymization]. MSc thesis, Instituto Superior Técnico, Universidade Técnica de Lisboa, Lisboa, Portugal, June 2016 (bibtex) '''[72]''' Joana Pinto [http://www.inesc-id.pt/ficheiros/publicacoes/10639.pdf Fine-grained POS-tagging: Full disambiguation of verbal morpho-synta..." current
- 13:11, 10 January 2024 +181 N Template:Coordinator Created page with "{| width='100%' cellspacing='0' | '''{{{name}}}''' |- | style='vertical-align: top; font-size: 12px; text-align: justify;' | [[Image:{{{photo}}}|right|top|border|130px]] {{{cv}}} |}" current
- 13:10, 10 January 2024 +201 N Template:Teammember Created page with "{| width='100%' cellspacing='0' | '''{{{name}}}''' |- | style='vertical-align: top; font-size: 12px; text-align: justify;' | [[Image:{{{photo}}}|right|top|border|130px]] {{{work}}} <br />[{{{pub}}}] |}" current
- 13:07, 10 January 2024 +40,630 N Team Created page with "== Coordination == {| width="100%" valign="top" cellpadding="10px" |style="vertical-align: top; text-align: left; width: 35%;" | {{Coordinator |name=[http://www.l2f.inesc-id.pt/wiki/index.php/Nuno_Mamede Nuno Mamede] (Computer Science Coordination) |photo=Nuno.png |cv=Nuno J. Mamede received his graduation, MSc and PhD degrees in Electrical and Computer Engineering by the [http://www.ist.utl.pt Instituto Superior Técnico], Lisbon, in 1981, 1985 and 1992, respectively...." current
- 13:03, 10 January 2024 +8,709 N Architecture Created page with "<div style="float:right;">__TOC__</div> '''STRING''' is a '''St'''atistical and '''R'''ule-Based '''N'''atural Lan'''g'''uage Processing Chain for Portuguese developed at <span class="plainlinks">[https://www.hlt.inesc-id.pt/wiki/ HLT]</span> and it consists of several modules, which are represented in the next figure: 800px ==== Tokenizer ==== The first module is responsible for text segmentation, and it divides the text into tokens. Besides..." current
- 13:00, 10 January 2024 0 m Main Page No edit summary current
- 11:58, 10 January 2024 +1,101 Main Page No edit summary
- 11:50, 10 January 2024 +442 N MediaWiki:Sidebar Created page with "* Navigation ** Main_Page|STRING ** architecture|architecture ** team|team ** publications|publications ** transfer|technology transfer ** https://string.hlt.inesc-id.pt/demo|demo ** contact|contact * Modules ** LexMan|LexMan ** RuDriCo2|RuDriCo2 ** MARv4|MARv4 ** XIP|XIP ** other|other * Lexical Resources ** dictionaries|dictionaries ** InverseDic|inverse dictionary ** disambiguation|disambiguation ** grammar|grammar ** corpora|corpora" current