The processing chain of L2F consists of several modules, which are represented in the next figure:
The modules of the processing chain communicate with one another using XML (eXtensible Markup Language).
The first module is responsible for segmentation: it divides the text into tokens.
LexMan performs the morphosyntactic labeling.
The next module of the processing chain divides the text into sentences.
The morphosyntactic disambiguation module, RuDriCo2, corrects the output of the morphosyntactic labeling module.
MARv3, the statistical morphosyntactic disambiguation module, performs a statistical disambiguation.
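The order of the modules described so far can be illustrated with a small Python sketch. It is only a toy stand-in, not the actual code of LexMan, RuDriCo2 or MARv3: the lexicon, the counts, the single correction rule and all function names (`tokenize`, `label`, `split_sentences`, `rule_disambiguate`, `statistical_disambiguate`, `run_chain`) are hypothetical, and the real modules rely on far richer resources and exchange XML rather than Python lists.

```python
import re

# Hypothetical toy lexicon: each word maps to its candidate tags.
TOY_LEXICON = {
    "the": ["DET"],
    "can": ["NOUN", "VERB"],
    "rusts": ["VERB", "NOUN"],
    ".": ["PUNCT"],
}

# Hypothetical (word, tag) counts used by the statistical step.
TOY_COUNTS = {
    ("can", "NOUN"): 3,
    ("can", "VERB"): 7,
    ("rusts", "VERB"): 8,
    ("rusts", "NOUN"): 2,
}


def tokenize(text):
    """Segmentation: split raw text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def label(tokens):
    """Morphosyntactic labeling: attach every candidate tag to each token."""
    return [(tok, list(TOY_LEXICON.get(tok.lower(), ["UNK"]))) for tok in tokens]


def split_sentences(labeled):
    """Sentence splitting: close a sentence after '.', '!' or '?'."""
    sentences, current = [], []
    for item in labeled:
        current.append(item)
        if item[0] in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def rule_disambiguate(sentence):
    """Correction step: right after an unambiguous determiner, drop the VERB reading."""
    result = []
    for i, (tok, tags) in enumerate(sentence):
        prev_tags = sentence[i - 1][1] if i > 0 else []
        if prev_tags == ["DET"] and len(tags) > 1 and "VERB" in tags:
            tags = [t for t in tags if t != "VERB"]
        result.append((tok, tags))
    return result


def statistical_disambiguate(sentence):
    """Statistical step: keep the remaining candidate tag with the highest toy count."""
    return [
        (tok, max(tags, key=lambda t: TOY_COUNTS.get((tok.lower(), t), 0)))
        for tok, tags in sentence
    ]


def run_chain(text):
    """Run the toy modules in the order described in the text."""
    sentences = split_sentences(label(tokenize(text)))
    return [statistical_disambiguate(rule_disambiguate(s)) for s in sentences]
```

With this toy data, `run_chain("The can rusts.")` tags "can" as `NOUN`: the counts alone would prefer `VERB`, but the correction rule removes that reading before the statistical step runs, which is why the rule-driven corrections come first in the chain.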
XIP performs the syntactic analysis. This analyzer allows the introduction of lexical, syntactic, and semantic information; it also allows the application of local grammars and morphosyntactic disambiguation rules, as well as the calculation of chunks and dependencies. XIP is composed of different modules:
- Lexicons - allow information to be added to the different tokens. XIP has a pre-existing lexicon, which can be enriched by adding lexical entries or changing existing ones.
- Local Grammars - XIP enables the writing of rules that take the left and right contexts into account. These rules are intended to define entities formed by more than one lexical unit, grouping the elements together into a single entity.
- Chunking Module - chunking rules perform a syntactic analysis of the text: for each sentence, sequences of categories are built and grouped into structures (chunks).
- Dependency Module - dependencies are syntactic relationships between different chunks. They allow a deeper and richer knowledge of a text. The dependency rules use the node sequences previously identified by the chunking rules to calculate the relationships between them.
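The relationship between the chunking and dependency modules can be sketched in Python. This is not XIP's rule formalism, only a minimal illustration of the idea: the tag names, the chunk patterns, the choice of the last word of a noun chunk as its head, and the `SUBJ`/`OBJ` relation names are all assumptions made for this example.

```python
def chunk(tagged):
    """Group (word, tag) pairs into NP and VP chunks with a left-to-right scan.

    Assumed toy patterns: NP = optional DET, any ADJ, one or more NOUN;
    VP = a single VERB. Tokens that fit no pattern are skipped.
    """
    chunks, i = [], 0
    while i < len(tagged):
        tag = tagged[i][1]
        if tag in {"DET", "ADJ", "NOUN"}:
            j = i
            if tagged[j][1] == "DET":
                j += 1
            while j < len(tagged) and tagged[j][1] == "ADJ":
                j += 1
            first_noun = j
            while j < len(tagged) and tagged[j][1] == "NOUN":
                j += 1
            if j > first_noun:  # at least one noun was found: emit the NP
                chunks.append(("NP", [word for word, _ in tagged[i:j]]))
                i = j
                continue
        if tag == "VERB":
            chunks.append(("VP", [tagged[i][0]]))
        i += 1
    return chunks


def dependencies(chunks):
    """Relate each VP to the NP chunks immediately to its left (SUBJ) and right (OBJ)."""
    deps = []
    for k, (kind, words) in enumerate(chunks):
        if kind != "VP":
            continue
        verb = words[-1]
        if k > 0 and chunks[k - 1][0] == "NP":
            deps.append(("SUBJ", verb, chunks[k - 1][1][-1]))
        if k + 1 < len(chunks) and chunks[k + 1][0] == "NP":
            deps.append(("OBJ", verb, chunks[k + 1][1][-1]))
    return deps
```

For example, chunking the tagged sentence "The/DET old/ADJ dog/NOUN chased/VERB a/DET cat/NOUN" gives an NP, a VP and a second NP, and `dependencies` then works only on that chunk sequence, never on the raw tokens. This mirrors the point made above: the dependency rules operate on the nodes that the chunking rules have already identified.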