
Syntactic parser for Brazilian Portuguese: challenges and solutions

Abstract

This article presents Parsero, a syntactic parser for Brazilian Portuguese developed on the basis of Generative Grammar (CHOMSKY, 2015) as refined by X-bar Theory (CHOMSKY, 2014). To that end, the rules that Othero (2009) developed specifically for Brazilian Portuguese were used and adapted to meet the needs of our parser. As its lexical collection, used to populate a Structured Query Language (SQL) database, the research drew on the Dictionary of Simple Inflected Words for Brazilian Portuguese (DELAF_PB), made available by the Unitex-PB Project, developed by the Núcleo Interinstitucional de Linguística Computacional (NILC) and the Instituto de Ciências Matemáticas e de Computação (ICMC). This resource, in turn, was built on the French formalism Dictionnaire Électronique du LADL (DELA) (MUNIZ, 2004). As a result of our project, we have made available to researchers interested in the topic the SQL database with 1,193,295 classified lexical units, the address of Parsero's open-source code, and a link to run the application. Throughout the development of the natural language processor (NLP), we had to put into practice interdisciplinary studies from the language sciences and computer science, a practice necessary for developing intelligent programs that can interact with writers or speakers of Brazilian Portuguese.
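As a rough illustration of the step that populates an SQL database from DELAF_PB, the sketch below parses entries in the general DELA layout (`inflected form,lemma.CLASS:inflection codes`) and loads them into an in-memory SQLite table. It is a minimal sketch under stated assumptions, not the project's actual schema or loader: the table name `lexicon`, its columns, the function names, and the two sample entries are all hypothetical and were chosen only for illustration.

```python
import sqlite3

# Illustrative entries in DELA-style layout; not copied from DELAF_PB.
SAMPLE_LINES = [
    "meninas,menino.N:fp",
    "casa,.N:fs",  # assumption: an empty lemma means "same as the inflected form"
]


def parse_delaf_line(line):
    """Split 'form,lemma.CLASS:codes' into (form, lemma, word_class, features)."""
    form, rest = line.strip().split(",", 1)
    lemma_and_class, _, features = rest.partition(":")
    lemma, _, word_class = lemma_and_class.rpartition(".")
    return form, lemma or form, word_class, features


def build_lexicon(lines):
    """Create a hypothetical 'lexicon' table in memory and fill it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lexicon ("
        "form TEXT, lemma TEXT, word_class TEXT, features TEXT)"
    )
    conn.executemany(
        "INSERT INTO lexicon VALUES (?, ?, ?, ?)",
        [parse_delaf_line(line) for line in lines],
    )
    conn.commit()
    return conn


def lookup(conn, form):
    """Return every (lemma, word_class, features) row for an inflected form."""
    cursor = conn.execute(
        "SELECT lemma, word_class, features FROM lexicon WHERE form = ?",
        (form,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = build_lexicon(SAMPLE_LINES)
    print(lookup(connection, "meninas"))
```

A parser could use a lookup of this kind to retrieve the word class of each token before applying phrase-structure rules; how Parsero itself does this is not described in the abstract.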

Keywords:
Computational linguistics; Natural Language Processing; Generative Grammar; Syntactic parser; Brazilian Portuguese

Universidade Federal de Minas Gerais - UFMG, Av. Antônio Carlos, 6627 - Pampulha, CEP 31270-901, Belo Horizonte - MG - Brazil, Tel: +55 (31) 3409-6009
E-mail: revistatextolivre@letras.ufmg.br