
Donatus: a user-friendly interface for the study of formal syntax using the Python NLTK Library

This paper first aims to show the usefulness of context-free grammars (CFG) and feature-based context-free grammars (FCFG) in the study of formal syntax. Applying parsers based on these formalisms to the analysis of a corpus may reveal consequences of a theoretical approach that would otherwise go unnoticed. The Natural Language Toolkit (NLTK) includes, among other facilities, tools for generating parsers in a variety of architectures. However, non-trivial use of this library for automatic syntactic processing requires programming skills. To give non-programmers access to parser implementation and testing, we developed Donatus, a user-friendly graphical interface to NLTK's parsing facilities, with additional utilities that also make it useful to programmers. We explain how the tool works and demonstrate its relevance to formal syntactic investigation by comparing computer implementations of two alternative approaches to adjectival modification in Portuguese. The first approach, based on traditional X-bar theory, generated a large number of false ambiguities. A parser based on an approach within the Minimalist Program avoided this problem. Without resorting to the computer, this difference between the two approaches would not be easily revealed.
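To make the notion of a false ambiguity concrete, the following is a minimal sketch in plain Python. It is not the grammar used in the paper and it does not use Donatus; the toy rules, the four-word lexicon, and the function name count_parses are assumptions made purely for illustration. It counts the distinct trees a small context-free grammar assigns to a Portuguese noun phrase when adjectives may adjoin on either side of the noun. In NLTK itself, a comparable grammar could be loaded with nltk.CFG.fromstring and parsed with nltk.ChartParser.

```python
from collections import defaultdict

# Hypothetical toy grammar (NOT the grammar from the paper).
# Binary rules only, written as (parent, left child, right child).
BINARY_RULES = [
    ("NP", "D", "Nbar"),    # determiner + nominal projection
    ("Nbar", "A", "Nbar"),  # prenominal adjective adjoined to Nbar
    ("Nbar", "Nbar", "A"),  # postnominal adjective adjoined to Nbar
]

# Hypothetical lexicon: each word is mapped directly to one category.
LEXICON = {"a": "D", "casa": "Nbar", "bela": "A", "amarela": "A"}


def count_parses(tokens, start="NP"):
    """Count the distinct parse trees for tokens with a CKY-style chart.

    Raises KeyError if a token is missing from LEXICON.
    """
    n = len(tokens)
    # chart[(i, j)][cat] = number of trees of category cat spanning tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, token in enumerate(tokens):
        chart[(i, i + 1)][LEXICON[token]] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][start]


if __name__ == "__main__":
    for phrase in ("a casa amarela", "a bela casa amarela"):
        print(phrase, "->", count_parses(phrase.split()), "parse(s)")
```

Under these assumed rules, "a casa amarela" receives a single analysis, whereas "a bela casa amarela" receives two, [bela [casa amarela]] and [[bela casa] amarela], which differ only in the order of adjunction. Analyses of this kind multiply as more modifiers are added, which is hard to see by hand but immediate once a parser counts the trees.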

Keywords: Computational linguistics; Formal syntax; Generative grammar; X-bar theory; Context-free grammar; Unification grammar; Adjectival modification


Universidade Estadual Paulista Júlio de Mesquita Filho Rua Quirino de Andrade, 215, 01049-010 São Paulo - SP, Tel. (55 11) 5627-0233 - São Paulo - SP - Brazil
E-mail: alfa@unesp.br