Determining the geographical origin of lettuce with data mining applied to micronutrients and soil properties

ABSTRACT

Lettuce (Lactuca sativa) is the main leafy vegetable produced in Brazil. Because its production is widespread throughout the country, lettuce traceability and quality assurance are hampered. In this study, we propose a new method to identify the geographical origin of Brazilian lettuce. The method applies a data mining technique called support vector machines (SVM) to the elemental composition and soil properties of the samples analyzed. We investigated lettuce produced in São Paulo and Pernambuco, two states in the southeastern and northeastern regions of Brazil, respectively. We evaluated the efficiency of the SVM model by comparing its results with those achieved by traditional linear discriminant analysis (LDA). The SVM models outperformed the LDA models in the two scenarios investigated, achieving an average prediction accuracy of 98 % in discriminating between lettuce from the two states. A feature evaluation formula, called F-score, was used to measure the discriminative power of the variables analyzed. The soil cation exchange capacity, the soil content of poorly crystallized Al, and the Zn content in lettuce samples were the most relevant variables for differentiation. Our results reinforce the potential of data mining and machine learning techniques to support traceability strategies and authentication of leafy vegetables.

ICP–OES; traceability; tropical soils; heavy metals; feature selection
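
The abstract names the F-score but does not give its formula. The sketch below assumes the two-class definition that is commonly paired with SVM feature ranking: the squared distances of each class mean from the overall mean, divided by the sum of the two within-class sample variances. The function name `f_score` and all values are hypothetical and chosen only for illustration; they are not data from the study.

```python
from statistics import mean, variance


def f_score(class_a, class_b):
    """Two-class F-score of a single feature (assumed definition).

    Numerator: squared distance of each class mean from the overall mean.
    Denominator: sum of the within-class sample variances (n - 1 divisor).
    A larger value suggests the feature separates the two classes better.
    """
    a, b = list(class_a), list(class_b)
    overall = mean(a + b)
    between = (mean(a) - overall) ** 2 + (mean(b) - overall) ** 2
    within = variance(a) + variance(b)
    return between / within


# Hypothetical measurements of two features in samples from two origins.
separating_a = [30.0, 32.0, 31.0, 29.0]
separating_b = [50.0, 52.0, 49.0, 51.0]
overlapping_a = [1.0, 5.0, 3.0, 7.0]
overlapping_b = [2.0, 6.0, 4.0, 8.0]

score_separating = f_score(separating_a, separating_b)
score_overlapping = f_score(overlapping_a, overlapping_b)

# Rank features by descending F-score.
ranking = sorted(
    {"separating": score_separating, "overlapping": score_overlapping}.items(),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

Under this definition a feature whose class means are far apart relative to its within-class spread ranks first. The score treats each feature independently, so it does not capture information shared between variables.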

Escola Superior de Agricultura "Luiz de Queiroz" USP/ESALQ - Scientia Agricola, Av. Pádua Dias, 11, 13418-900 Piracicaba SP Brazil, Phone: +55 19 3429-4401 / 3429-4486
E-mail: scientia@usp.br