Automatic detection of deprived urban areas using Google Earth™ images of cities from the Brazilian semi-arid region

Abstract

Automatic classification of deprived urban areas provides vital information for implementing pro-poor policies. This paper presents an approach for classifying these areas in Brazilian cities. Satellite images of six cities in the Brazilian semi-arid region were obtained free of charge using the Google Earth Engine software. To assess the discriminative power of census data, data made publicly available by the Brazilian Institute of Geography and Statistics (IBGE) were used, together with features extracted from the images, to train SVM classifiers. The image features were extracted using color histograms, LBP histograms, and lacunarity. Four evaluation tests were carried out, defined by two criteria: whether census data were used and which cross-validation method was applied (10-fold or leave-one-city-out). The use of census data had a negative impact on the results. This impact can be explained by the criteria used to map census tracts in the country, which are not only morphological and visually perceptible in satellite images, unlike the information captured by the adopted extraction approaches. The best results were an average accuracy of 91.81% and an average F1-score of 92.27%. This research contributes to the recognition of deprived urban areas and urban socio-spatial dynamics, supporting urban-territorial planning.

Keywords:
Deprived urban areas; Remote sensing; Machine learning

Introduction

Deprived Urban Areas (DUA) are urban areas that often lack basic services, such as clean water and sanitation. Moreover, their dwellers are often exposed to unhealthy and unsuitable physical environments (Georganos et al., 2021). Classifications of DUA in Brazil are the subject of controversy due to a lack of consensus on a “generic slum ontology” (Kohli et al., 2016). The Brazilian Institute of Geography and Statistics (IBGE) classifies such settlements as Special Sectors of Subnormal Agglomerates (SEAS), based on the census survey carried out every 10 years. According to IBGE (2011):

Set consisting of a minimum of 51 housing units, occupying or having occupied, until recently, land owned by others (public or private), arranged, in general, in a disorderly and dense manner, and lacking, for the most part, essential public services.

This process is slow, and its results are susceptible to change as space-time dynamics evolve. Social movements promote actions to guide public policies for land and urban regularization and to guarantee housing rights. In the 1980s, some DUA became Special Zones of Social Interest (ZEIS), which guaranteed the permanence of their residents and ensured attention to the planning and provision of basic infrastructure services. Such initiatives were consolidated and spread with the City Statute (Brasil, 2001). SEAS and ZEIS represent consolidated attempts at mapping DUA, although they were not classified exclusively by morphological criteria and there is great diversity among them in Brazil.

The contribution of this paper lies in automating the classification of remote sensing images of urban areas by employing texture features, such as lacunarity and LBP (Local Binary Patterns), together with color histograms and census data. The combination of these features yields a robust approach to mapping DUA, which can support and serve as an alternative to the traditional mapping process, which is slow, expensive and unable to keep up with the dynamics of intense urban growth, especially in developing countries.

What distinguishes this work from the state of the art in DUA detection using machine learning is: (i) the use of freely available satellite images with low spatial resolution; (ii) the use of census data freely available on the IBGE website, in order to analyze how the classification behaves with georeferenced socioeconomic information; (iii) leave-one-city-out assessment, which allows a more adequate assessment of the generalization capacity of the approach; (iv) the study of DUA in medium-sized Brazilian cities, since most studies concentrate on large metropolitan centers and disregard the presence of DUA in smaller cities; and (v) the combination of texture and color features of the image, namely LBP, lacunarity and color histograms.
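To make the image features named in item (v) more concrete, the following is a minimal pure-Python sketch of a basic 3 x 3 LBP histogram and a per-channel color histogram. It is an illustration only: the function names, the 8-neighbor LBP variant, the neighbor ordering and the number of color bins are assumptions made here, not the configuration used in this research.

```python
# Illustrative sketch only; names and parameters are hypothetical.

# Offsets of the 8 neighbours, clockwise from the top-left pixel
# (the order is arbitrary but must stay fixed).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    gray: 2-D list of gray levels. Border pixels are skipped because
    they do not have all 8 neighbours.
    """
    rows, cols = len(gray), len(gray[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # A neighbour at least as bright as the center sets its bit.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def color_histogram(pixels, bins=8):
    """Concatenated, normalized per-channel histogram.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    """
    hist = [0] * (3 * bins)
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1
        count += 1
    return [h / count for h in hist] if count else hist
```

In a pipeline of this kind, the two histograms would typically be concatenated (possibly with other descriptors) into a single feature vector per image patch before being passed to a classifier.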

This research studied six medium-sized cities located in the Brazilian Semi-arid region (Table 1), one of the poorest regions of the country. These cities have populations between 249,000 and 556,000 people (IBGE, 2011) and, since they are in the Northeast region of Brazil, the predominant biomes are the Caatinga and the Atlantic Forest. They are diverse in urban area, housing density, and indexes such as the Human Development Index (HDI), Gini and GDP (Gross Domestic Product) per capita. Furthermore, they are within the expectations for medium-sized Brazilian cities, and they have in common the fact of being regional centers with economic, political, cultural and social influence in their respective states.

Table 1 - Socioeconomic-spatial data of the cities

This paper is composed of four sections. The next section presents and discusses works that have objectives similar to those of this research but slightly different approaches. Then, the proposed approach is presented; briefly, it consists of the fusion of census data with features extracted from satellite images. The third section shows the obtained results and compares them with state-of-the-art results. Finally, some considerations are made and the references are presented.

Literature review

Kit et al. (2012) proposed a methodology to identify slums in Quickbird images of Hyderabad, India, using lacunarity features. They applied two methods to binarize the images that would later be taken as input for the lacunarity algorithm: Principal Component Analysis (PCA) and line detection. Data collected in surveys were used as ground truth, and a matrix was superimposed on the image of the city, classifying its cells as slums or non-slums. This classification was obtained from the result of the binarization methods, after comparing the ground truth with the results of 18 variations of threshold parameters and sliding window sizes. The results showed that the method that used line detection as a step prior to the lacunarity calculation was consistently superior to the method that used PCA.
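As an illustration of how lacunarity can be computed on a binarized image, the sketch below implements one common gliding-box formulation, in which the lacunarity for a box size is the second moment of the box masses divided by the square of their mean. The function name and the input format are assumptions made here, and the sketch is not claimed to reproduce the exact algorithm or parameters of Kit et al. (2012).

```python
# Illustrative sketch only; not the implementation of the cited work.

def lacunarity(binary_image, box_size):
    """Gliding-box lacunarity of a 2-D binary image (rows of 0/1 values).

    A box of side `box_size` is slid over every position; the "mass" of a
    box is its number of foreground pixels. The result is E[mass^2] / E[mass]^2.
    """
    rows, cols = len(binary_image), len(binary_image[0])
    masses = []
    for top in range(rows - box_size + 1):
        for left in range(cols - box_size + 1):
            mass = sum(
                binary_image[r][c]
                for r in range(top, top + box_size)
                for c in range(left, left + box_size)
            )
            masses.append(mass)
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("image has no foreground pixels")
    second_moment = sum(m * m for m in masses) / len(masses)
    return second_moment / (mean * mean)


# A homogeneous texture versus one where all foreground pixels are clumped.
uniform = [[1, 1, 1, 1] for _ in range(4)]
clumped = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(lacunarity(uniform, 2), lacunarity(clumped, 2))
```

Under this formulation a perfectly homogeneous image gives a lacunarity of 1, while images with large gaps between foreground clusters give higher values, which is why the measure is sensitive to the spatial arrangement of built-up structures.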

Owen and Wong (2013) proposed an approach for classifying neighborhoods in developing countries as formal or informal, exploring the following features: shape, terrain geomorphology, texture, road networks, dominant settlement materials and explicit settlement structure assessment. These features were used to train two types of algorithms: discriminant function analysis and regression trees. Owen and Wong (2013) mention the research of Kit et al. (2012), as they also used lacunarity as an indicator of settlement informality. However, in the research of Owen and Wong (2013), lacunarity was not useful because of the small area units of the samples. The area studied was the region of Guatemala, from a Quickbird image covering 46 km² of the chosen region. From this image, 24 indicators were extracted. Professionals from the National Geographic Institute of Guatemala were consulted to determine six regions of formal settlements and six regions of informal settlements. After statistical analysis and removal of non-informative indicators, cross-validation was applied, obtaining an accuracy of 87.5% in the training of a Classification and Regression Tree (CART) using only four variables: entropy on the roads, size of vegetation cuttings, compactness of vegetation cuttings and GLCM correlation. Among the analyzed variables, the two that contributed most to the CART were, in order, entropy on the roads and average size of vegetation cuttings.

The classification of regions in images can use several types of features, from low-level features (texture, colors and gray levels) to high-level features (buildings, types of roofs and the presence of water bodies). Geographic Object-Based Image Analysis (GEOBIA) is a high-level approach capable of assisting in the description and explanation of the results of classification of regions in satellite images. Ribeiro (2015) proposed an approach that employs data mining techniques to map informal settlements, using C4.5 decision trees. The author employed the TerraAIDA algorithm (Baatz & Schape, 2000) for image segmentation of the city of Embu. The image used was obtained by the Worldview-2 satellite on April 19, 2011. The city image was segmented into nine categories of land cover: grass and shrubs, trees, bare soil, clay tile roof, metallic roof, asphalt, clear and dark cement material, and shadow. The land cover results were evaluated using the following metrics: accuracy (0.9094), Kappa (0.8898) and confusion matrix. Ribeiro (2015) used the InterIMAGE (PUC-Rio, 2022) framework to manually create a knowledge model representing prior knowledge about the geographic region studied. The resulting knowledge model made it possible to achieve 100% accuracy in land use classification.

Kuffer et al. (2016) proposed the use of the Gray Level Co-occurrence Matrix (GLCM) variance together with the Normalized Difference Vegetation Index (NDVI) as training input for a Random Forest Classifier (RFC) to discriminate between formal built-up and slum HUP (Homogeneous Urban Patches). Kuffer et al. (2016) say that Kit et al. (2012) showed the robustness and capacity of texture-based features for dealing with urban land use. Kuffer et al. (2016) affirm that the image-based features employed vary depending on the set sizes, and they cite Owen and Wong (2013) as an example of research that selected a small set of image-based features from a larger set. Images from Mumbai, Ahmedabad and Kigali were used. In each city, the HUP regions of interest were segmented according to the following criteria: areas of homogeneous texture; presence of diverse types of land cover; HUP limits that follow physical limits; and a segmented region that is reasonably large (greater than the size of objects). Six images obtained from different sensors were used: WorldView-2, GCP, OrbView-3, Resourcesat-1, and Quickbird. The image resolutions vary from 0.5 m to 2.4 m. The data were analyzed following two categorizations: one with three classes (formal, slum, soil) using only variance, and another with four classes (formal, slum, soil, vegetation) using RFC. The RFC was evaluated using the overall accuracy and Kappa of 180 randomly selected points. The best result obtained an accuracy of 90% with a Kappa of 87%.
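Since several of the reviewed studies report both overall accuracy and Kappa, the following minimal sketch shows how the two metrics can be derived from a confusion matrix. The function name and the example matrix are hypothetical and chosen only for illustration; the numbers are not taken from any of the cited works.

```python
# Illustrative sketch only; the confusion matrix below is made up.

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    if expected == 1.0:
        # Degenerate case: every sample falls in a single class.
        return observed, 1.0
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical two-class example (e.g. formal vs. slum) with 100 samples.
accuracy, kappa = accuracy_and_kappa([[45, 5], [5, 45]])
print(accuracy, kappa)
```

Kappa discounts the agreement that would be expected by chance, so it is usually lower than the overall accuracy computed on the same points, as in the results reported above.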

Aiming to explore the power of Convolutional Neural Networks (CNN), Mboga et al. (2017) proposed an end-to-end approach to detect informal settlements in satellite images. Mboga et al. (2017) used the same Quickbird dataset as Kuffer et al. (2016), but do not mention any comparison with the results obtained by Kuffer et al. (2016). The network architecture employed is similar to that proposed by Bergado et al. (2016), which follows popular principles for the creation of CNNs. The network is divided into two parts. The first contains blocks of standard CNN layers: a convolutional layer, pooling, and an activation layer with a non-linear function. The second part is a Multi-Layer Perceptron (MLP) with a final softmax layer. They used Quickbird images of the city of Dar es Salaam (Tanzania) with 0.60 m resolution. The regions were labeled in four categories: formal settlement, informal settlement, urban and vacant/agriculture. Patches were extracted from the city’s image to train the network. Several patch sizes were tested, and the best results (90% accuracy) were obtained for patches of around 135 x 135 pixels and a training set containing 3,060 patches. For this same data configuration, they also evaluated a Support Vector Machine (SVM) with Gray Level Co-occurrence Matrix (GLCM) features as input, which obtained 79.19%.

Persello and Stein (2017) investigated the application of convolutional networks to the problem of detecting informal settlements in very high resolution (VHR) images. They also mention the use of the same Quickbird dataset as Kuffer et al. (2016), but do not mention any comparison with the results obtained by Kuffer et al. (2016). The networks evaluated are capable of inferring labels pixel by pixel, for patches, or for the entire image. The proposed architecture is made of six convolutional layers using Dilated Kernels (DK) and a final classification layer with 1 x 1 convolution and a softmax loss function. The classifiers were trained using a Quickbird image of the city of Dar es Salaam with a spatial resolution of 60 cm. The proposed approach was trained with two input variations: patch-based (PB-CNN) and deconvolutional FCN (FCN-DEC). The results were compared with those of SVMs trained with pixel-based and GLCM features, and evaluated using the following metrics: Overall Accuracy (OA), Average Class Accuracy (AA) and Mean Producer’s Accuracy (PA). For all metrics, the proposed approach obtained results almost 10% higher than those obtained using SVM.

Maxwell et al. (2018) presented a review of machine learning classification methods for remote sensing. They focused especially on the following methods: SVMs, DTs (Decision Trees), RFs (Random Forests), boosted DTs, Artificial Neural Networks (ANNs), and K-Nearest Neighbor (K-NN). They applied these methods to two publicly available data sets: Indian Pines Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and Geographic Object-Based Image Analysis (GEOBIA). The authors say that selecting a machine learning classifier for a particular task is challenging, perhaps because the procedures applied in different studies are not comparable. In their experiments, SVM achieved the highest accuracy for the Indian Pines data set, and RF achieved the highest accuracy for the GEOBIA data set. The authors say that performance may be affected not only by the chosen classifier, but also by factors such as training data, user-defined parameter settings, and the predictor variable feature space. All these factors were addressed in the present research: training data were balanced, SVM parameters were chosen by grid search, and PCA (Principal Component Analysis) was applied for dimensionality reduction.

Gadiraju et al. (2018) proposed the evaluation of three classification problems (differing in the number of labels) using classical approaches and one deep approach based on a deep convolutional neural network. Gadiraju et al. (2018) cite a conference paper version of the journal paper published by Mboga et al. (2017); the citation appears in the reference section but is not discussed in the paper. The three classification types used the following labels: (i) Urban vs. Others; (ii) Formal, Informal and Other; and (iii) Single Story, Multi-Story, Semi-Permanent, Temporary, Formal and Other. The images used were obtained from the Bengaluru region, in Karnataka, India. Two models of feature extraction were used: pixel-based and patch-based. The traditional classifiers used were Naive Bayes, Decision Tree, K-Nearest Neighbors, Multi Layer Perceptron, Gradient Boosting, Random Forest and Adaboost Classifier. For the classical classifiers, the following types of features were used: Haralick Texture Features, Normalized Difference Built-Up Index (NDBI), Edge Density and Pan Sharpened Bands. The Convolutional Neural Network (CNN) has a typical architecture, containing seven convolutional layers, two dropout layers, two Max Pool layers, and three Fully Connected layers. The authors do not mention the number of patches used to train the CNN, but they report that they used the 0.5 m resolution grid to extract image patches of 40 x 40 pixels and that they used data augmentation with random rotations of 90 degrees and flips. For the problems with two and three classes, the CNN obtained superior results in all metrics. However, for the six-class problem, the best results were obtained by the Gradient Boosting classifier.
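The kind of augmentation described above, based on 90-degree rotations and flips, can be sketched as follows. This is a hedged illustration: the function names are hypothetical, patches are represented as plain 2-D lists, and the sketch enumerates all eight rotation/flip variants of a patch instead of sampling them at random as the cited authors report doing.

```python
# Illustrative sketch only; not the augmentation code of the cited work.

def rotate90(patch):
    """Rotate a 2-D patch (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def flip_horizontal(patch):
    """Mirror a 2-D patch left to right."""
    return [row[::-1] for row in patch]


def augment(patch):
    """Return the 8 variants produced by 90-degree rotations and flips."""
    variants = []
    current = [list(row) for row in patch]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


# A tiny 2 x 2 "patch" stands in for a 40 x 40 image patch.
for variant in augment([[1, 2], [3, 4]]):
    print(variant)
```

Because overhead imagery has no canonical orientation, such transformations produce plausible new training samples without changing the class label of the patch.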

Ibrahim et al. (2019) proposed an approach for detecting informal settlements using data from street intersections. Ibrahim et al. (2019) mention the work of Kit et al. (2012) as an application of machine learning for remote sensing. The model is built using spatial statistics to train a deep artificial neural network to classify regions as formal or informal. The modeling is divided into two steps: (i) generation of statistics using Nearest Neighbors and Getis-Ord; (ii) training of a deep artificial neural network with the features from the previous step. The authors called the features “incident points” since they do not carry any attribute other than their coordinates. The approach was evaluated with data from five cities, four from Egypt and one from India. For the cities in Egypt, official GIS data provided by the General Organization of Physical Planning in Egypt (GOPP) were used. For the Indian city of Mumbai, data obtained from Google Maps and from the official city development plan were used. The deep artificial neural network was evaluated with 10-fold cross-validation; the best accuracy was 98% for the city of Minya and the lowest accuracy was 80% for the city of Mumbai.

Fallatah et al. (2020) used a GeoEye-1 image of 0.5 m resolution for informal settlement mapping. Fallatah et al. (2020) say that, although mapping informal settlement indicators in countries where key geographical data are missing is a challenge, Owen and Wong (2013) proposed 14 such indicators, among them dwelling size and dwelling separation. Fallatah et al. (2020) affirm that in previous literature, such as Kuffer et al. (2016), the terms informal settlement and slum have often been used interchangeably. As examples of successful application of CNN (Convolutional Neural Networks) to automatic detection of informal settlements, Fallatah et al. (2020) cite the works of Mboga et al. (2017) and Persello and Stein (2017). Fallatah et al. (2020) used ontological classes proposed by other researchers. The proposed approach maps informal settlement indicators via OBIA by grouping pixels into coherent groups to form segments. Afterwards, Random Forest was employed to classify the data. The proposed approach faced problems in the classification of informal areas due to failure in the definition of individual buildings. The most significant information sources for classifying informal settlements were density (dwelling separation) and texture measures. Among the texture measures, GLCM entropy was the most successful.

The use of high resolution images aids the process of extracting features, but such images are not always easy to obtain, mainly because of their high cost. The Landinfo website presents price tables for the purchase of high resolution satellite images. It is possible to see, for example, that the value of a panchromatic QuickBird image with 0.6 m resolution and a 16.5 km imaging swath width (DigitalGlobe, 2022) is US$ 14, and the minimum ordering area is 25 km². We argue that it is impracticable for public agencies to buy high resolution satellite images periodically, since the dynamics of territorial occupation change constantly and new images need to be obtained. Sentinel-2 has a resolution from 10 m to 60 m in the visible and short-wave infrared spectral zones. The design of the telescope used by Sentinel-2 allows for a 290 km field of view.

Another important aspect that can be observed in the analyzed studies is the predominant use of feature extraction approaches. Among the most popular features investigated are LBP, lacunarity, GLCM and statistical indicators. The most popular classifiers are those based on decision trees. Although Deep Convolutional Neural Networks (DCNN) have been very successful in solving image processing problems, their application to satellite images is still incipient. A factor that draws our attention when evaluating the approaches in the literature is that most, if not all, of them evaluate their methods using images from the same region or city that supplied the training images. The training and test images are of course different, but to assess the generalization power of a classifier, it should be tested with images from cities or regions other than those used for training.

Table 2 summarizes the main factors of the analyzed literature and presents, for comparison purposes, the results obtained in this research. The most reported metric in the literature is accuracy. Among the reported accuracies, only that reported by Gadiraju et al. (2018) was 100%, but this result is questionable in at least two aspects: (i) the authors do not mention how many images (or image patches) were used for training and testing; and (ii) images from the same city were used for training and testing. The accuracy reported by Ibrahim et al. (2019) was also quite high, 98%. Although the authors used images from four cities, they did not evaluate the generalization capacity of their approach through leave-one-city-out cross-validation, as proposed in this paper. Besides, Ibrahim et al. (2019) report neither the number of images used nor their spatial resolution. Among the approaches that reported the number of images used, the one proposed in this article used the most, a total of 3,406 patches. Regarding the spatial resolution of the images, most of the works used high resolutions. The articles that used the lowest resolutions were this one, with a resolution of 10 m, and Owen and Wong (2013), with a resolution of 30 m. Comparing our cross-validation results with those of Owen and Wong (2013), the accuracy for this paper was 87.22%, and for Owen and Wong (2013) it was 87.5%. Although the result of Owen and Wong (2013) was slightly higher, their article does not state whether cross-validation was used, while the result reported here is the average of 10-fold cross-validation.

Table 2
- Summary of the evaluated works and comparison with the proposed approach

Proposed approach

This Section is composed of three subsections. In Subsection Data Acquisition, the procedures for data acquisition are described. Subsection Delimitation Criteria describes the criteria for delimiting the regions of interest, and Subsection Machine Learning Procedures explains the Machine Learning techniques that were employed in the experiments.

Data acquisition

The proposed approach used two types of data: satellite images and census data. The satellite images were obtained from the Google Earth Engine™ geospatial processing service, which provides images from several satellites. In this research, we chose to use images from the Sentinel-2 satellite, filtered via the Google Earth Engine™ API. Two filters were applied to obtain the images: a date filter, selecting the period between January 1, 2018 and June 30, 2019, and a filter removing images with more than 20% cloud coverage. The bands B4, B3 and B2 were selected, which respectively represent the RGB color channels. The spatial resolution of the images is approximately 10 m. Polygons delimiting the areas recognized as DUA were added to the satellite images, according to the delimitation criteria described below.

The census data are available on the government domain website of the Brazilian Institute of Geography and Statistics (IBGE), which refer to the last census carried out in the country in 2010 and encompass a wide variety of socioeconomic information of the inhabitants and the characteristics of their households. For this study, the urban census tracts of each one of the six municipalities were considered and, from their attribute tables, a total of eight variables were selected: Income, Education, Longevity, Water, Sewage, Garbage, Residential Typology and Occupation. Universal data from IBGE references were used in order to ensure greater statistical reliability.

Delimitation criteria

After cutting the satellite image patches, the existing limits for DUA in each of the six cities were determined. The zoning proposed in the master plans for the Special Zones of Social Interest (ZEIS), together with the Special Sectors of Subnormal Agglomerates (SEAS) of IBGE (2011), was taken into account. These limits are distributed in the urban zones, sometimes overlapping or complementing each other, since in some cases more recent master plans consider the surveys already made by IBGE to zone the DUA.

Taking the available delimitations of the six cities as the starting point, it was verified that not all cities have both SEAS and ZEIS; Feira de Santana, for example, does not have SEAS. Moreover, not all ZEIS represent DUA: some are open spaces set aside by the State for the construction of Housing of Social Interest (HIS) for poor populations, and others are housing projects already built. Of the master plans considered, four are in need of revision: Campina Grande (2006), Juazeiro do Norte (2000), Mossoró (2006), and Arapiraca (2004). In some of them, the limits of deprived areas are not available in georeferenced form.

Afterwards, the satellite images were visually interpreted to delimit areas with the typical morphology of deprived urban areas. These areas have special features, such as: (i) building geometry, with small and irregular buildings; (ii) high building density; (iii) irregular street or pedestrian path layout; (iv) typical roof materials; (v) lack of shade and vegetation; and (vi) location in environmentally sensitive areas (12). These new limits were drawn in the QGIS software for each one of the six cities, consulting free higher resolution satellite images available on the Google Earth platform to complement the existing information. This complementation was made to compensate for the outdated and imprecise legislation of the municipalities, and to add areas that are difficult to recognize at the resolution of Sentinel-2.

The following indicators from the Demographic Census (IBGE, 2011) were considered: water supply, destination of garbage and sewage, as household information; and income, education and longevity, as population information. Each indicator comprises a number of sub-indicators with specific information. The handling of these data is an adaptation of the work of Anjos and Lacerda (2015), which guided the calculation of averages and the allocation of weights for each sub-indicator.

After finishing the mapping (defined by the sum of the limits of the Special Zones of Social Interest (ZEIS), the limits of the Special Sectors of Subnormal Agglomerates (SEAS) and the limits of DUA identified through the visual interpretation of satellite images), the image samples with more than 50% of their area inside these limits were labeled as DUA for training the classifiers.
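The 50% labeling rule can be illustrated with a minimal sketch. It assumes the combined delimitations have already been rasterized to a binary mask on the same pixel grid as the image; the function names and the toy mask are hypothetical and are not taken from the authors' implementation.

```python
def overlap_fraction(mask, top, left, size):
    """Fraction of a size x size patch that lies inside the delimited areas.

    `mask` is a list of rows holding 1 (inside a delimited area) or 0 (outside).
    """
    covered = sum(
        mask[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    )
    return covered / (size * size)


def label_patch(mask, top, left, size, threshold=0.5):
    """Return 1 (DUA) when MORE than `threshold` of the patch is inside the mask."""
    return 1 if overlap_fraction(mask, top, left, size) > threshold else 0


# Toy 4 x 8 mask (hypothetical): the five leftmost columns are delimited.
toy_mask = [[1, 1, 1, 1, 1, 0, 0, 0] for _ in range(4)]

label_left = label_patch(toy_mask, 0, 0, 4)   # fully inside the delimitation
label_mid = label_patch(toy_mask, 0, 3, 4)    # exactly half inside, not "more than 50%"
label_right = label_patch(toy_mask, 0, 4, 4)  # one quarter inside
```

With a strict "more than 50%" comparison, a patch that is exactly half covered is not labeled as DUA; whether ties were handled this way in the original experiments is not stated.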

Machine learning procedures

Following the delimitation criteria described in Subsection Delimitation Criteria, images from the following cities of the Brazilian Semi-arid region were obtained, followed by their respective states in parentheses: Arapiraca (Alagoas), Campina Grande (Paraíba), Caruaru (Pernambuco), Feira de Santana (Bahia), Juazeiro do Norte (Ceará) and Mossoró (Rio Grande do Norte). These images were cut into patches of 40 x 40 pixels via a sliding window, with an overlap of 10 pixels. Although more than 100,000 patches were available, only 3,406 were used, because not the entire area of the images was mapped with census data.
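As a rough illustration of the patching step, the sketch below lists the top-left corners produced by a 40 x 40 sliding window. It assumes that "an overlap of 10 pixels" means a stride of 30 pixels in both directions and that incomplete patches at the image borders are discarded; both are assumptions, and the function name is hypothetical.

```python
def patch_origins(height, width, size=40, overlap=10):
    """Top-left (row, col) corners of every full patch cut by a sliding window."""
    stride = size - overlap  # assumed meaning of a 10-pixel overlap: step of 30
    rows = range(0, height - size + 1, stride)
    cols = range(0, width - size + 1, stride)
    return [(r, c) for r in rows for c in cols]


# A hypothetical 100 x 130 pixel image yields 3 rows x 4 columns of patches.
origins = patch_origins(100, 130)
```

Each origin `(r, c)` identifies the patch covering rows `r` to `r + 39` and columns `c` to `c + 39`.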

From the image patches, the following features were extracted: color histograms, LBP histograms and lacunarity. The color histograms were extracted for each RGB channel, with 64 bins each, and the LBP histogram had 10 bins. The LBP type used was the rotation-invariant one (Ojala et al., 2002), also known as uniform. The LBP parameters were: number of neighbors (P) equal to 8 and radius (R) equal to 1. The lacunarity was calculated with an algorithm implemented by the authors of this paper, based on the work of Plotnick et al. (1993). The lacunarity features were obtained from ten sizes of sliding boxes: 2, 3, 4, 5, 6, 7, 8, 9, 10 and 18. Besides the features extracted from the images, this approach also used census data. The census feature vector contains 4,056 attributes that characterize the number of people per household, income, age, sanitation and occupation, among others.
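Since the lacunarity code itself is not reproduced in the text, the following is only a minimal sketch of the gliding-box formulation commonly attributed to Plotnick et al. (1993), in which lacunarity for a box size is the mean of the squared box masses divided by the squared mean box mass. It assumes a binarized patch (how the authors converted patches to a suitable input is not stated); the function names are hypothetical, and only the list of box sizes comes from the text.

```python
BOX_SIZES = [2, 3, 4, 5, 6, 7, 8, 9, 10, 18]  # box sizes listed in the text


def integral_image(grid):
    """Summed-area table with one extra row and column of zeros."""
    rows, cols = len(grid), len(grid[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = grid[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c]
    return sat


def lacunarity(grid, box):
    """Gliding-box lacunarity of a binary grid for one box size."""
    rows, cols = len(grid), len(grid[0])
    if box > rows or box > cols:
        raise ValueError("box does not fit inside the grid")
    sat = integral_image(grid)
    # Mass of every box position = number of occupied cells inside the box.
    masses = [
        sat[r + box][c + box] - sat[r][c + box] - sat[r + box][c] + sat[r][c]
        for r in range(rows - box + 1)
        for c in range(cols - box + 1)
    ]
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("lacunarity is undefined for an empty grid")
    mean_sq = sum(m * m for m in masses) / len(masses)
    return mean_sq / (mean * mean)


def lacunarity_features(grid, box_sizes=BOX_SIZES):
    """One lacunarity value per box size, in the order given."""
    return [lacunarity(grid, box) for box in box_sizes]


# Hypothetical inputs: a fully occupied 40 x 40 patch and a half-filled 4 x 4 grid.
full_patch = [[1] * 40 for _ in range(40)]
half_grid = [[1, 1, 0, 0] for _ in range(4)]
full_features = lacunarity_features(full_patch)
half_value = lacunarity(half_grid, 2)
```

A homogeneous patch gives a lacunarity of 1 at every box size, while patches with larger or more irregular gaps give higher values, which is why the measure is useful as a texture descriptor.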

After the concatenation of the image features with the census data features, the resulting vector was submitted to dimensionality reduction using PCA. The leave-one-city-out results are presented first (Table 3); Tables 4 and 5 show the number of features used after the dimensionality reduction for the k-fold-cross-validation training. Since the resulting number of features after the dimensionality reduction was low relative to the number of image patches, we chose Support Vector Machines (SVM), a very popular classifier for which researchers have historically reported high classification results.

In this paper, unlike the articles reviewed in Section Literature review, we also evaluated the proposed approach using the leave-one-city-out principle. Following this principle, six models were trained, each one with images of five cities only, and evaluated with the images from the city left out of the training. This mode of training and evaluation allows us to assess the generalization capacity of the algorithm. When the algorithm is trained and evaluated with images from the same city, the results can be biased, because the test and training images are very similar.
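The leave-one-city-out principle amounts to a grouped split in which the city is the group. The sketch below shows only the index bookkeeping, with the PCA and SVM fitting omitted; the function name and the toy list of patch cities are hypothetical.

```python
def leave_one_city_out(cities):
    """Yield (held_out_city, train_indices, test_indices), one split per city.

    `cities` holds the city name of each patch, in patch order.
    """
    for held_out in sorted(set(cities)):
        train = [i for i, city in enumerate(cities) if city != held_out]
        test = [i for i, city in enumerate(cities) if city == held_out]
        yield held_out, train, test


# Hypothetical toy data: six patches drawn from three cities.
patch_cities = ["Caruaru", "Mossoró", "Caruaru", "Arapiraca", "Mossoró", "Arapiraca"]
splits = list(leave_one_city_out(patch_cities))
```

With six cities, this produces six splits, and in each one no patch from the held-out city reaches the training set.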

Table 3
- Results considering leave-one-city-out classification using census data and image features
Table 4
- Results considering leave-one-city-out classification using only image features (w\o using census data)
Table 5
- Results from the 10-fold-cross-validation evaluation with image and census data features

Results and discussion

The results were evaluated under two aspects: mixing (or not) the images from all cities and using (or not) census data. The objective of evaluating the performance of the approach by mixing images from all cities or leaving a city out is to evaluate the generalization power of the algorithm. The evaluation with and without census data had the objective to evaluate the discrimination capacity of the classifier to distinguish DUA with this kind of data and also to highlight possible discrepancies in the standardization of precarious regions by the responsible agencies. From these two aspects of evaluation, the results are presented from four perspectives: (i) leave-one-city-out with image and census data features (Table 3); (ii) leave-one-city-out with image features only (Table 4); (iii) 10-fold-cross-validation with image and census data features (Table 5); and (iv) 10-fold-cross-validation with image features only (Table 6).

Table 6
- Results from the 10-fold-cross-validation without census data

The results in Table 3 show that the average results for all metrics fluctuate around 70%, and that the highest average was obtained for macro precision (75.46%), with a standard deviation of 9.51%. Considering that the evaluation presented in Table 3 results from leave-one-city-out, we can say that the result was quite good, as it demonstrates the generalization capacity of the algorithm. The test city with the lowest results was Feira de Santana. As can be observed in Figure 1, the reasons for these results are: (i) the specific way in which the occupation of the precarious regions of the city took place; (ii) the fact that the leave-one-city-out approach is expected to give lower results than n-fold-cross-validation; and (iii) the spatial changes caused by the construction of new housing estates for poor populations.

Figure 1
- Spatialization of the areas detected as deprived (blue) from the classification overlapping the areas recognized as deprived (hatches in yellow) and the urban legislation of the 6 municipalities. Source: Classification in Sentinel-2 images, authors (2021).

Comparing the results in Table 3 with those in Table 4 shows that the use of census data negatively impacted the training of the models. For example, for the F1 metric, the classifier trained with census data obtained an average of 72.98%, whereas the classifier trained without census data obtained an average of 85.73%. In other words, there was an increase of more than 10 percentage points in the metric. The increase in the average values occurs for all the metrics evaluated. For the city with the lowest metrics (Feira de Santana), the results were higher with the use of census data only for recall. However, since the precision values were lower, the F1 metric shows that the increase in recall was offset by the decrease in precision.
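The trade-off between recall and precision follows from F1 being their harmonic mean, F1 = 2PR / (P + R). The short sketch below uses made-up precision and recall values (they are not taken from Tables 3 or 4) to show how a model with higher recall can still end up with a lower F1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values, for illustration only.
f1_high_recall = f1_score(precision=0.50, recall=0.80)  # higher recall, lower precision
f1_balanced = f1_score(precision=0.75, recall=0.70)     # lower recall, higher precision
```

Here the first configuration has the better recall but the lower F1, which is the same pattern described for Feira de Santana.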

Because almost all the papers analyzed in this research do not evaluate their results with the leave-one-city-out approach, we also report our results with 10-fold-cross-validation. The values in Table 5 show a marked increase in the metrics relative to the results shown in Table 3. The average values obtained in the 10-fold-cross-validation evaluation are around 10% higher than the average values obtained for leave-one-city-out. An increase in the metrics was already expected for the 10-fold-cross-validation results, since samples from different regions of the same city can occur in both the training and test sets. Since these samples belong to the same city, they share similarities, which makes the classification problem easier. In this regard, it is necessary to criticize the way the approaches available in the literature are evaluated. Most of them do not mention the use of cross-validation, nor do they evaluate their results by training and testing with images from different cities.
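The reason pooled 10-fold-cross-validation is easier can be seen in a small sketch: when patches from all cities are shuffled together, every test fold contains cities that are also present in the training folds. The splitting function, the seed and the toy data below are hypothetical and are not the authors' code.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical toy data: 30 patches from each of two cities.
patch_cities = ["Caruaru"] * 30 + ["Mossoró"] * 30

shares_city = []
for train, test in k_fold_indices(len(patch_cities), k=10):
    train_cities = {patch_cities[i] for i in train}
    test_cities = {patch_cities[i] for i in test}
    shares_city.append(bool(train_cities & test_cities))
```

Every entry of `shares_city` is true in this setting, whereas a leave-one-city-out split never shares a city between training and test sets.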

As in the leave-one-city-out comparison with and without census data, the metrics of the classifiers increased in the 10-fold-cross-validation evaluation when census data were not used. The average values of all metrics in Table 6 were above 90%. It is worth highlighting that the proposed approach obtained results above 90% for several metrics using only 119 features (PCA components) to train an SVM classifier.

The classification results represented in Figure 1 allow us to conclude that the areas recognized as DUA, painted in blue, go beyond the limits established by the ZEIS, SEAS and complementary delimitation boundaries. The complementary delimitation boundaries, marked in red, were added according to morphological criteria perceptible in the visual interpretation of the images. In short, there are many areas of the cities that are not officially recognized as DUA, although they show similar morphological features in satellite images and census data.

Although the addition of georeferenced census data to the image portrays the reality more accurately, models trained using census data obtained inferior results to models that did not use this data. This decrease is related to the incompatibility of criteria between census tract mapping and the boundaries generated from image texture criteria. The mapping of the census tracts does not adopt exclusively morphological and perceptible criteria through the image, and follows the rules for the number of households surveyed. Therefore, it is comprehensible that census tracts with less building density are larger, and may have internal socio-spatial diversity.

In many Brazilian cities, DUA are large and officially recognized as ZEIS, and these occurrences generally have more spatial compatibility with census tracts. However, there are cases of well-delimited streets, alleys or invasions that present features of precariousness noticeable in the visual interpretation of the images, but that have less spatial compatibility with the census tracts. The diversity of typologies for DUA explains the decrease in the results involving census tracts. Although the cities have similarities because they are in the same biome, there is diversity in what each city considers as DUA.

The six cities analyzed each have urban, geographical and historical particularities. Despite this, the results observed in the mapping point to common trends, among which it is worth highlighting: (i) the existence of transition areas between deprived and non-deprived areas, where limits are not always well-defined barriers, and where urban precariousness is dispersed and appears in patterns that are not typical of deprived urban areas and at different scales; (ii) in the 6 municipalities, DUA are mostly in peripheral areas, far from the historical and consolidated center of each municipality; (iii) deprived areas are dispersed on the periphery of the cities, although in some cases, as in Campina Grande - PB, they are mostly in the west part of the city; (iv) it is not possible to say that there is a concentration of deprived urban areas in a specific region of each municipality; (v) deprived areas were detected in peri-urban areas, close to the rural areas of the municipalities; and (vi) deprived areas occur close to water bodies. This last phenomenon was observed in all the municipalities analyzed, but not in all water bodies nor along their whole extent, as some areas are being targeted by the real estate market due to the environmental value provided by the landscape.

The results showed that many other areas were recognized as deprived, in addition to the areas initially recognized, leading to the following findings: (i) some areas have textural characteristics similar to deprived areas, although they are not recognized as such by the public authorities; (ii) some deprived areas are housing developments of social interest built by the government, and have different textural characteristics from unplanned areas; in addition, urbanization projects in deprived areas can modify street patterns and paving and roofing materials, increasing the diversity of textures; (iii) the diversity of texture present in DUA, considering the 6 municipalities, may have contributed to the excessive recognition of new areas; (iv) the 10 m resolution made it difficult to detect small DUA and limited the accuracy of the mapping; and (v) DUA characterized by tenement slums in vertical buildings in central areas are difficult to detect with the approach presented in this work. Nevertheless, the approach proved effective in recognizing the socio-spatial dynamics of segregation, in which the DUA are mostly outside the central areas, and thus constitutes an important resource for urban-territorial planning.

The six cities analyzed are not the capital cities of their respective states, but they act as regional centers from an economic, political, cultural and social point of view, which also makes them similar in the way urban deprivation shows itself. They are not coastal cities, and they follow a more concentric and less spread-out pattern of urban growth than cities along the coast. These aspects contribute to the slum typologies found being similar from a building point of view, with a predominance of dwellings that appear to be built in masonry, ceramic roofs and narrow streets.

The lower metric values achieved, considering the census data, demonstrate that the latter are not and need not necessarily be correlated to the morphological data. In other words, in some DUA with regular road layouts, there are people living in precarious habitability conditions, as it is in many allotments and housing complex projects located in urban peripheries. Likewise, it is also possible that the residents of some spontaneous and unplanned settlements with irregular building plans and arrangements — as it is in many historical farms and in some more consolidated slums — have more favorable socio-economic conditions than others that live in planned urban areas.

In spatial terms, the census data are aggregated in tracts, spatial units considerably larger than the morphological elements analyzed in the images, such as lots and buildings. It is very likely that different morphological patterns occur within the same census tract, which prevents a direct association among the features analyzed. In temporal terms, census data are generally much more dynamic than morphological data; that is, the address and socioeconomic conditions of residents (income, education, age, health) usually change more quickly than the physical structure and size of their homes. As the census data used in this paper are from 2010, the date of the last census carried out in Brazil, there is a lag of almost 10 years, so the data may not represent the current characteristics of the people who inhabit these areas.

Conclusions

In this paper, results from an investigation about the classification of DUA of six cities from the Brazilian Semi-arid region using satellite images and census data were presented. The following features were extracted from the images: color histograms, LBP histograms and lacunarity. In order to assess the discriminative power of census data, a hybrid feature vector was created from the concatenation of the features extracted from the images with the census data. This hybrid vector had its dimensionality reduced using PCA, and the reduced vector was given as input for the training of an SVM classifier.

From the analysis of related articles, we verified that most researchers use satellite images of high spatial resolution. Although high resolution images are more appropriate for the classification of DUA, these images are generally not available for free and are expensive to obtain. High prices make it difficult or impracticable for public agencies to periodically analyze DUA using satellite images. For this reason, we chose to use images from the Google Earth Engine™ software.

Another factor observed in the literature analysis was that the articles generally assess the models using images from the same city for training and testing. In order to assess the generalizability of the proposed approach, in this article we also evaluated the results using leave-one-city-out cross-validation. The results obtained with this approach were inferior to the results with 10-fold cross-validation, as expected, but they were still competitive with the state of the art.

The main factor revealed in our experiments was the inadequacy of census data for detecting DUA in different cities. This inadequacy is due to several factors presented in the analysis. The experiment is nevertheless valid insofar as it invites reflection on the way census surveys are mapped, and on the way each municipality establishes its master plan and legislation with the criteria for an area to be classified as deprived or not. It is not possible to standardize the classification of DUA given the differences in city legislation. This difficulty leads us to propose, as future work, the investigation of approaches that consider the city legislation but are able to generalize the classification of DUA to different cities.

In regard to the use of satellite images, we propose as future work the investigation of deep learning architectures for automatic feature extraction and model training. Although there are already articles in the literature that investigate deep models, the investigations focus on specific cities and there are few works, for example, using images from Brazilian cities. A factor to be investigated in future research is the interpretation of features extracted from satellite images using deep convolutional neural networks. A further proposal is the development of indexes, interpreted by specialists, for classifying settlement features based on features automatically obtained using DCNN. There are also methodological aspects that can be enhanced in the existing approaches, such as the use of leave-one-city-out cross-validation.

A factor to be observed in the literature is the non-existence of public databases for comparing approaches. This is due mainly to the fact that high resolution images are expensive. As we propose in this paper the use of public images, another proposal for future work would be to create a labeled database of images of Brazilian cities. Factors that were not investigated in this article and that may be part of further research are: the evaluation of models trained only with census data, the analysis of which features were selected by the PCA (and why), and the application of other dimensionality reduction approaches that allow the choice of features to be explained.

  • How to cite: Pereira, E. T., Barros Filho, M. N. M., Simões, M. B., & Bezerra Neto, J. A. (2022). Automatic detection of deprived urban areas using Google Earth™ images of cities from the Brazilian semi-arid region. urbe. Revista Brasileira de Gestão Urbana, v.14, e20210209. https://doi.org/10.1590/2175-3369.014.e20210209
  • Data availability statement

    The dataset that supports the results of this paper is available at SciELO Data and can be accessed via https://doi.org/10.48331/scielodata.MAEYDV

References

  • Anjos, K., & Lacerda, N. (2015). Urban and environmental transformations in poor areas of the metropolitan region of Recife (Brazil). Ambiente & Sociedade, 18(1), 37-58. http://dx.doi.org/10.1590/1809-4422ASOC516V1812015en
  • Baatz, M., & Schape, A. (2000). Multiresolution segmentation: an optimization approach for high quality multi-scale image segmentation. In Angewandte Geographische Informationsverarbeitung Salzburg: Wichmann.
  • Bergado, R., Persello, C., & Gevaert, C. (2016). A deep learning approach to the classification of sub-decimetre resolution aerial images. In IEEE International Geoscience and Remote Sensing Symposium (IGARSS) (pp. 1516-1519). New York: IEEE.
  • Brasil. (2001, 10 de julho). Lei n. 10.257, de 10 de julho de 2001. Regulamenta os arts. 182 e 183 da Constituição Federal, estabelece diretrizes gerais da política urbana e dá outras providências. Brasília: Diário Oficial da União.
  • DigitalGlobe. (2022). QuickBird. Retrieved in 2022, March 07, from http://www.digitalglobe.com/index.php/85/QuickBird
    » http://www.digitalglobe.com/index.php/85/QuickBird
  • Fallatah, A., Jones, S., & Mitchell, D. (2020). Object-based random forest classification for informal settlements identification in the Middle East: jeddah a case study. International Journal of Remote Sensing, 41(11), 4421-4445. http://dx.doi.org/10.1080/01431161.2020.1718237
    » http://dx.doi.org/10.1080/01431161.2020.1718237
  • Gadiraju, K., Vatsavai, R., Kaza, N., Wibbels, E., & Krishna, A. (2018). Machine learning approaches for slum detection using very high resolution satellite images. In IEEE International Conference on Data Mining Workshops (ICDMW) (pp. 1397-1404). New York: IEEE. http://dx.doi.org/10.1109/ICDMW.2018.00198
    » http://dx.doi.org/10.1109/ICDMW.2018.00198
  • Georganos, S., Abascal, A., Kuffer, M., Wang, J., Owusu, M., Wolff, E., & Vanhuysse, S. (2021). Is it all the same? Mapping and characterizing deprived urban areas using WorldView-3 superspectral imagery: a case study in Nairobi, Kenya. Remote Sensing, 2021(13), 4986. http://dx.doi.org/10.3390/rs13244986
    » http://dx.doi.org/10.3390/rs13244986
  • Ibrahim, R., Titheridge, H., Cheng, T., & Haworth, J. (2019). predictSLUMS: a new model for identifying and predicting informal settlements and slums in cities from street intersections using machine learning. Computers, Environment and Urban Systems, 76, 31-56. http://dx.doi.org/10.1016/j.compenvurbsys.2019.03.005
    » http://dx.doi.org/10.1016/j.compenvurbsys.2019.03.005
  • Instituto Brasileiro de Geografia e Estatística - IBGE. (2011). Censo demográfico 2010: características da população e dos domicílios: resultados do universo. Rio de Janeiro: IBGE.
  • Kit, O., Lüdeke, M., & Reckien, D. (2012). Texture-based identification of urban slums in Hyderabad, India using remote sensing data. Applied Geography (Sevenoaks, England), 32(2), 660-667. http://dx.doi.org/10.1016/j.apgeog.2011.07.016
    » http://dx.doi.org/10.1016/j.apgeog.2011.07.016
  • Kohli, D., Sliuzas, R., & Stein, A. (2016). Urban slum detection using texture and spatial metrics derived from satellite imagery. Journal of Spatial Science, 61(2), 405-426. http://dx.doi.org/10.1080/14498596.2016.1138247
    » http://dx.doi.org/10.1080/14498596.2016.1138247
  • Kuffer, M., Pfeffer, K., Sliuzas, R., & Baud, I. (2016). Extraction of slum areas from VHR imagery using GLCM variance. Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(5), 1830-1840. http://dx.doi.org/10.1109/JSTARS.2016.2538563
    » http://dx.doi.org/10.1109/JSTARS.2016.2538563
  • Maxwell, A., Warner, T., & Fang, F. (2018). Implementation of machine-learning classification in remote sensing: an applied review. International Journal of Remote Sensing, 39(9), 2784-2817. http://dx.doi.org/10.1080/01431161.2018.1433343
    » http://dx.doi.org/10.1080/01431161.2018.1433343
  • Mboga, N., Persello, C., Bergado, J., & Stein, A. (2017). Detection of informal settlements from VHR satellite images using convolutional neurais networks. In IEEE International Geoscience and Remote Sensing Symposium (IGARSS) (pp. 5169-5172). New York: IEEE.
  • Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971-987. http://dx.doi.org/10.1109/TPAMI.2002.1017623
    » http://dx.doi.org/10.1109/TPAMI.2002.1017623
  • Owen, K., & Wong, D. (2013). An approach to differentiate informal settlements using spectral, texture, geomorphology and road accessibility metrics. Applied Geography, 38, 107-118. http://dx.doi.org/10.1016/j.apgeog.2012.11.016
    » http://dx.doi.org/10.1016/j.apgeog.2012.11.016
  • Persello, C., & Stein, A. (2017). Deep Fully Convolutional Networks for the Detection of Informal Settlements in VHR Images. IEEE Geoscience and Remote Sensing Letters, 14(12), 2325-2329. http://dx.doi.org/10.1109/LGRS.2017.2763738
    » http://dx.doi.org/10.1109/LGRS.2017.2763738
  • Plotnick, R. E., Gardner, R. H., & O’Neill, R. V. (1993). Lacunarity índices as measures of landscape texture. Landscape Ecology, 8(3), 201-211. http://dx.doi.org/10.1007/BF00125351
    » http://dx.doi.org/10.1007/BF00125351
  • Pontifícia Universidade Católica do Rio de Janeiro - PUC-Rio. (2022). InterIMAGE. Retrieved in 2022, March 07, from http://www.lvc.ele.puc-rio.br/projects/interimage/
    » http://www.lvc.ele.puc-rio.br/projects/interimage/
  • Ribeiro, B. (2015). Mapping informal settlements using WorldView- 2 imagery and C4.5 decision tree classifier. In Joint Urban Remote Sensing Event (pp. 1-4). Crete: JURSE.

Edited by

Guest editors: Vasco Barbosa, Lakshmi Rajendran and Mónica Suárez.

Publication Dates

  • Publication in this collection
    05 Dec 2022
  • Date of issue
    2022

History

  • Received
    04 July 2021
  • Accepted
    04 Aug 2022