Inverse Vocabulary of Contemporary Portuguese (InVoc-PT)
 
 
 
 

Latest revision as of 17:29, 22 May 2013

Presentation

An inverse vocabulary is a particular type of vocabulary in which words are listed in alphabetical order, but compared from the last character to the first. For example, the following words (not contiguous in ordinary alphabetical order) appear in this order in the inverse vocabulary: aba, alba, alga, malga, salga, ala, bala, pala, tala, etc.
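
As an illustration only, the ordering can be sketched in Python by sorting each word on its reversed spelling. The function name `inverse_sort` is hypothetical, and the sketch assumes plain code-point comparison; accented Portuguese characters would need locale-aware collation, which is not shown here.

```python
def inverse_sort(words):
    """Return the words ordered by their reversed spelling (last character first)."""
    # Reversing each word makes ordinary string comparison start at the word's end.
    return sorted(words, key=lambda w: w[::-1])


words = ["tala", "malga", "aba", "pala", "alga", "ala", "salga", "bala", "alba"]
print(inverse_sort(words))
# ['aba', 'alba', 'alga', 'malga', 'salga', 'ala', 'bala', 'pala', 'tala']
```

The result matches the sample sequence given above: words sharing an ending, such as alga, malga and salga, end up next to each other.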

It is a vocabulary and not exactly a dictionary, as it does not provide definitions of the words it lists, though it usually shows them with their grammatical categories (parts of speech). Sometimes quantitative information is also given for groups of endings (e.g. the number of entries ending in the same 2, 3 or 4 characters).
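
The kind of quantitative information mentioned here could be computed roughly as follows. This is a minimal sketch, not the method used by any of the published vocabularies; the helper name `ending_counts` is an assumption made for the example.

```python
from collections import Counter


def ending_counts(words, n):
    """Count how many words share each ending of exactly n characters."""
    # Words shorter than n characters have no ending of that length and are skipped.
    return Counter(w[-n:] for w in words if len(w) >= n)


words = ["aba", "alba", "alga", "malga", "salga", "ala", "bala", "pala", "tala"]
for n in (2, 3):
    print(n, ending_counts(words, n).most_common())
```

For the sample list, four words end in "ala" and three in "lga", which is the sort of figure that helps when studying how productive a given suffix is.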

Because of their formal nature, inverse vocabularies are often by-products of "normal" machine-readable dictionaries, and they are built using software specially designed for that purpose. Inverse vocabularies are important tools for the study of several linguistic phenomena, in particular the mechanisms and productivity of morphological derivation by suffixation.

As far as we know, only two inverse vocabularies of Portuguese have been published to date. In chronological order:

  • the Dicionário inverso da língua portuguesa, by E. M. Wolf et al. (1971): it contains about 12,740 word forms, with POS tags, the gender of nouns, and transitivity information for verbs.
  • the Dicionário inverso do Português, by Ernesto d'Andrade (1993): it contains 42,300 word forms, with POS tags.

The first book is virtually impossible to find today, except in libraries and specialised booksellers. Neither of these resources, however, is available in digital format, at least to the general public, which makes them less practical to use.

It was this gap that the STRING team at L2F/INESC ID Lisboa intended to fill, by making the InVoc-PT available to a broader public through web consultation. Several sources were used to produce the vocabulary. The InVoc-PT contains 150,700 entries, each consisting of the inverted form, its lemma and POS tags; for some words the plural inflection is also given. Entries with the same lemma but different POS are collapsed into a single entry.
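
The collapsing of entries can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the function name `build_entries`, the tag names "N" and "V", the sample words, and the reading of "inverted form" as the reversed spelling of the lemma. It does not describe how the InVoc-PT was actually built.

```python
from collections import defaultdict


def build_entries(pairs):
    """Collapse (lemma, POS) pairs into one entry per lemma, in inverse order."""
    pos_by_lemma = defaultdict(set)
    for lemma, pos in pairs:
        pos_by_lemma[lemma].add(pos)
    entries = [
        # "inverted" is assumed here to be the reversed spelling of the lemma.
        {"inverted": lemma[::-1], "lemma": lemma, "pos": sorted(tags)}
        for lemma, tags in pos_by_lemma.items()
    ]
    return sorted(entries, key=lambda e: e["inverted"])


# Hypothetical input: "jantar" is listed both as a verb and as a noun.
pairs = [("jantar", "V"), ("falar", "V"), ("jantar", "N"), ("mar", "N")]
for entry in build_entries(pairs):
    print(entry["inverted"], entry["lemma"], "/".join(entry["pos"]))
```

In this sketch the two input pairs for "jantar" produce a single entry carrying both tags, and the entries come out in inverse order (falar, mar, jantar).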

References

[1] D'Andrade, E. Dicionário Inverso do Português. Lisboa: Cosmos (1993).

[2] Wolf, E.M.; Narumov, B.P.; Vaisbord, A.S.; Kosarik, M.A. Dicionário Inverso da Língua Portuguesa [Обратный словарь португальского языка]. Moscovo: Hayka (1971).