A Spectral Clustering Approach for the Evolution of the COVID-19 Pandemic in the State of Rio Grande do Sul, Brazil

ABSTRACT

The aim of this paper is to analyse the evolution of the COVID-19 pandemic in Rio Grande do Sul by applying graph-theoretical tools, particularly spectral clustering techniques, on weighted graphs defined on the set of 167 municipalities in the state with population 10,000 or more, which are based on data provided by government agencies and other sources. To respond to this outbreak, the state has adopted a system by which pre-determined regions are assigned flags on a weekly basis, and different measures go into effect according to the flag assigned. Our results suggest that considering a flexible approach to the regions themselves might be a useful additional tool to give more leeway to cities with lower incidence rates, while keeping the focus on public safety. Moreover, simulations show that the combination of pendulum migration and isolation data used in this paper leads to a coherent qualitative description of the evolution of the pandemic in Rio Grande do Sul. These simulations also confirm the dampening effect of isolation on the dissemination of the disease.

Keywords:
Spectral clustering; COVID-19 pandemic; discrete epidemiological model

1 INTRODUCTION

The aim of this paper is to employ graph-theoretical tools to understand the dissemination of COVID-19 in the Brazilian state of Rio Grande do Sul. These tools may be useful sources of additional information for decision making by health and government authorities.

The year 2020 has been marked by the global outbreak and spread of the virus SARS-CoV-2, which causes the coronavirus disease (COVID-19) in humans [8] (Timeline of WHO's response to COVID-19, https://www.who.int/news-room/detail/2906-2020-covidtimeline, 2020; accessed 1 Aug. 2020). In December 2019, several patients with an unknown severe respiratory disease were traced back to a wholesale market in Wuhan, China. Researchers were quick to detect and isolate a novel strain of coronavirus [26] (N. Zhu, D. Zhang, W. Wang, X. Li, B. Yang, J. Song, X. Zhao, B. Huang, W. Shi, R. Lu, P. Niu, F. Zhan, X. Ma, D. Wang, W. Xu, G. Wu & G. Gao, A novel coronavirus from patients with pneumonia in China, 2019, New England Journal of Medicine, 382 (2020), doi: 10.1056/NEJMoa2001017). It was soon discovered that the virus is highly contagious, and that it can be transmitted by infected individuals before they show the first symptoms and even by infected individuals who remain asymptomatic throughout the course of the disease [25] (Y. Wang, J. Tong, Y. Qin, T. Xie, J. Li, J. Li, J. Xiang, Y. Cui, E.S. Higgs, J. Xiang & Y. He, Characterization of an asymptomatic cohort of SARS-COV-2 infected individuals outside of Wuhan, China, Clinical Infectious Diseases (2020), doi: 10.1093/cid/ciaa629). This led to unprecedented public health measures by the Chinese authorities. A lockdown of Wuhan and 15 other cities in Hubei Province took effect on January 23 [11] (D. Cyranoski, What China's coronavirus response can teach the rest of the world, Nature, 579(7800) (2020), 479-480). On January 30, the World Health Organization (WHO) declared COVID-19 a public health emergency of international concern [8]. In the next month, a large number of countries implemented measures aiming to prevent a global pandemic, ranging from travel restrictions, contact tracing and social isolation to border closures and lockdowns [8]. These actions turned out to be unsuccessful in eradicating the disease, and the WHO characterised the outbreak as a pandemic on March 11. Two days later, it assessed that Europe had become the epicenter of the pandemic [8]. The virus then quickly reached Brazil.

The first confirmed case of COVID-19 in Brazil dates back to February 26, in the state of São Paulo, and the Brazilian Health Ministry declared a state of nationwide community transmission on March 20 [3] (Diário Oficial da União, Portaria 454, 20 de março de 2020, Ministério da Saúde (2020)). At that point, the number of confirmed cases in the state of Rio Grande do Sul was 37 [6] (Painel Coronavírus RS, Secretaria Estadual de Saúde, https://ti.saude.rs.gov.br/covid19/, 2020; accessed 1 Aug. 2020), and the state government had already instated measures aimed at slowing down the spread of the virus, including school closures and a ban on commercial interstate travel [4] (Diário Oficial do Estado do Rio Grande do Sul, Decreto 55.128, 19 de março de 2020 (2020)). In the next month, a large number of restrictions were imposed on activities that were deemed inessential. By the end of April, there were 1466 cases and 51 deaths officially attributed to COVID-19 in the state of Rio Grande do Sul [6]. At that point, and as of this writing, there was no vaccine or proven effective treatment for patients with severe cases of COVID-19 [8]. Recognizing the seriousness of the health crisis and the social and economic impact of widespread isolation, the state government unveiled a regulatory model for controlled distancing [5] (Diário Oficial do Estado do Rio Grande do Sul, Decreto 55.240, 10 de maio de 2020 (2020); see also https://distanciamentocontrolado.rs.gov.br/), which went into effect on May 11.

This regulatory model divided the state into 20 (pre-determined) regions based on the availability of beds in intensive care units (ICU beds) for COVID-19 patients. Every week, each region is assigned one of four possible flags, yellow (low risk), orange (medium risk), red (high risk) or black (very high risk), according to a numerical value based on several indices that measure the spread of the disease and the availability of ICU beds. Each flag entails different social distancing measures and imposes different constraints on businesses (or even their mandatory closure). This regulation has legal precedence over more flexible measures determined by local authorities or by the federal government [2] (Diário Oficial da União, Decisão Ação Direta de Inconstitucionalidade 6343, 1 de junho de 2020, Supremo Tribunal Federal (2020)). Due to its effect on daily life and on economic activity, this model has been in the spotlight, and it has garnered praise, but also faced criticism. We should mention that, after the adoption of this system in Rio Grande do Sul, other states have followed suit and devised similar models (for instance, Acre, Mato Grosso, Mato Grosso do Sul, Pará, Rio de Janeiro, and São Paulo).

The general aim of this paper is to analyse the evolution of the COVID-19 pandemic in Rio Grande do Sul by applying graph-theoretical tools to data provided by government agencies and other sources. Given the flag system described above, we believe that clustering techniques are particularly well-suited for this analysis. The general idea of clustering is to partition a (typically large) data set into (a much smaller number of) clusters in such a way that data in the same cluster are similar, and data in different clusters are dissimilar. Formal measures of affinity and of the quality of a given partition rely heavily on the context of the problem being considered. In this paper, we address clustering from a graph-theoretical perspective. We consider weighted graphs $G=(V, E, \omega)$, where the vertex set $V$ is the data set, the edge set $E$ contains edges connecting elements of $V$ and the function $\omega: E \to \mathbb{R}_{>0}$ assigns a positive weight $w_{ij}$ to each edge $ij \in E$. Our main tool is spectral clustering, which is widely used in exploratory data analysis but, to the best of our knowledge, has not been explored in connection with epidemiological models. Specifically, we would like to contribute in the following directions:

  1. So far, social distancing is the only measure to contain the spread of the disease. What would be a sensible way of dividing the state into smaller regions, so that each city lies in a cluster with cities to which it is strongly connected? How does this division relate with geographical divisions used by the state government? Is this interconnection reflected in the manner in which the disease has actually spread in the state? Did the response to self-isolation measures affect the way in which cities were interconnected?

  2. The flag system proposed by the government prescribes constraints on activities and businesses according to the risk assigned to each region. Would a more flexible approach, in which cities can be assigned to different regions on a weekly basis, allow more cities to be assigned lower risk flags?

  3. The availability of data, and the quality of the data, is fundamental to get meaningful results from any mathematical model. Did our data accurately capture the movement between cities and rates of social isolation? Can we see the impact of social isolation on the dissemination of the disease?

To address the first two questions, we consider two types of affinity measures. The first type is based on pendulum migration between cities, by which we mean the daily flow of commuters for work or education, into which we incorporate data about self-isolation. Our method gives a partition based solely on pre-pandemic data that captures the connection between the clusters and the spread of the disease. Moreover, we observed that incorporating isolation had a negligible effect on the way cities are clustered together. We believe that this suggests that the reaction to appeals for isolation was similar throughout the state, regardless of the particular way in which each city was affected by the disease. This seems to highlight the importance of a coordinated message by federal, state and local authorities, and by the media. The second type of affinity measure is based on the availability of ICU beds. In this case, considering a more flexible approach to the regions, by which new clusters are determined on a weekly basis, more cities are assigned lower-risk flags. This may be useful complementary information for the flag system used in Rio Grande do Sul.

As a means to assess the quality of our data, we have also used a discrete SEIR compartmental model to simulate the spread of the disease and the effect of the social distancing measures that have been implemented, based on the migration and isolation data used for clustering. In contrast to clustering techniques, models of this type are a basic tool in the epidemiological toolbox, both in their discrete and continuous versions, and there is a vast literature related to them; see [13] (M.Y. Li, H.L. Smith & L. Wang, Global dynamics of an SEIR epidemic model with vertical transmission, SIAM Journal on Applied Mathematics, 62(1) (2001), 58-69), [14] (K. Linka, M. Peirlinck, F. Sahli Costabal & E. Kuhl, Outbreak dynamics of COVID-19 in Europe and the effect of travel restrictions, Computer Methods in Biomechanics and Biomedical Engineering (2020), 1-8) and the references therein. Our contribution in this respect was to show that the data for pendulum migration and isolation, combined with the available disease information, described a scenario that is coherent with the evolution of the disease in the state. Extrapolating from this, we conclude that isolation measures have been very important in slowing down the spread of the disease (often referred to as flattening the curve of new cases).

The remainder of the paper is organized as follows. In Section 2, we describe the data used in this paper. Section 3 is concerned with spectral clustering and its mathematical foundations. The affinity measures mentioned above are discussed in that section, and we also analyse the partitions that have been obtained by spectral methods. The SEIR model is introduced and analysed in Section 4. We finish the paper with concluding remarks.

2 DATA

In this section, we describe the data used in our study. The actual matrices are available in our git repository (https://www.github.com/Lucassib/Cluster-COVID-19-RS). We consider the 167 municipalities in the state of Rio Grande do Sul whose estimated population in 2019 is above 10,000 according to the Brazilian Institute of Geography and Statistics (IBGE; https://www.ibge.gov.br/estatisticas/sociais/populacao). Hereafter they will be referred to as cities. The distance between cities is given by a square matrix $\mathcal{D}=(d_{ij})$ of order $n=167$, where $d_{ij}$ denotes the average of the road distances from the seat of municipality $i$ to the seat of municipality $j$ and vice versa, as calculated by the web mapping service Google Maps.

Using data from the population census of 2010, which is the most recent census performed in Brazil, we define square matrices $\mathcal{T}=(t_{ij})$ and $\mathcal{E}=(e_{ij})$ of order $n$, where $t_{ij}$ is the number of daily commuters who reside in $i$ and work in $j$ and $e_{ij}$ is the number of commuters who reside in $i$ and go to school in $j$. These matrices have been obtained by extracting anonymized census microdata related to long-form questionnaires, which are publicly available (https://www.ibge.gov.br/estatisticas/sociais/populacao/9662-censo-demografico-2010.html?=&t=downloads), and by extrapolating them to the entire city population (adjusted to the 2019 values) using the survey weights that are part of the census microdata. To extract the data from this large dataset, we used the commercial statistical software Stata.

We also considered data directly related to the spread of the disease, and to the response to it, which has been extracted directly from the state health authorities [6] (Painel Coronavírus RS, Secretaria Estadual de Saúde, https://ti.saude.rs.gov.br/covid19/, 2020; accessed 1 Aug. 2020).

In our approach, the time $t$ is measured in weeks, where our weeks correspond to the state's epidemiological weeks, which go from Saturday to Friday. Regarding epidemiological data, we consider $N=17$ weeks, starting at the week of March 7-13, when the first cases of COVID-19 were officially confirmed in the state, and ending on July 3. We note that most pandemic-related data is actually released on a daily basis, but contains fluctuations that may be attributed to administrative procedures. For instance, the number of reported cases and deaths regularly goes down on weekends and holidays, and surges in the first business days thereafter, which suggests that it does not reflect the actual behavior of the disease. Regarding cases and deaths, the weekly data that we collect is simply the overall number reported in the week. Regarding self-isolation and ICU bed occupancy rates, we take the average over the week. We should point out that the number of ICU beds in the state expanded considerably during the weeks considered, so that the total number of ICU beds in each city is also tracked on a weekly basis. Finally, we point out that the ICU data are only used for $N=8$ weeks, starting at the week between May 2 and 8 (when the model for controlled distancing was unveiled).
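
The weekly aggregation described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `weekly_totals` and `weekly_means`, the array layout (one row per day, one column per city, first row a Saturday) and the sample numbers are all assumptions made for the example.

```python
import numpy as np

def _split_into_weeks(daily):
    """Reshape a (days x cities) array into (weeks x 7 x cities).

    Days beyond the last complete week are dropped.
    """
    daily = np.asarray(daily, dtype=float)
    n_weeks, n_cities = daily.shape[0] // 7, daily.shape[1]
    return daily[: n_weeks * 7].reshape(n_weeks, 7, n_cities)

def weekly_totals(daily_counts):
    """Total reported cases (or deaths) per week and city."""
    return _split_into_weeks(daily_counts).sum(axis=1)

def weekly_means(daily_rates):
    """Average rate (e.g. self-isolation or ICU occupancy) per week and city."""
    return _split_into_weeks(daily_rates).mean(axis=1)

# Hypothetical two-week series for a single city: reporting dips on the
# weekend (first two days of each Saturday-to-Friday week) and surges after.
daily_cases = np.array([[0], [0], [9], [4], [3], [3], [2],
                        [1], [0], [12], [5], [4], [4], [2]])
totals = weekly_totals(daily_cases)  # one row per week, one column per city
```

Summing over a full week removes the weekday reporting pattern, while rates such as the isolation index are averaged instead of summed.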

The information about self-isolation in each city $i \in [n]=\{1,\ldots,n\}$ is given by values $\beta_i(t) \in [0,1]$ for all $t \in \{1, 2, \ldots, N\}$. This is an index developed by In Loco (https://www.inloco.com.br/), a technology firm with offices in Brazil and in the United States, calculated from granular anonymized geolocation data from more than 60 million mobile devices across Brazil. It is defined as the proportion of devices in a city $i$ that stayed within a radius of 450 meters from their habitual home during day $t$ [10] (N. Ajzenman, T. Cavalcanti & D. Da Mata, More Than Words: Leaders' Speech and Risky Behavior during a Pandemic, SSRN (2020), doi: 10.2139/ssrn.3582908), [21] (P.S. Peixoto, D.R. Marcondes, C.M. Peixoto, L. Queiroz, R. Gouveia, A. Delgado & S.M. Oliva, Potential dissemination of epidemics based on Brazilian mobile geolocation data. Part I: Population dynamics and future spreading of infection in the states of Sao Paulo and Rio de Janeiro during the pandemic of COVID-19, medRxiv (2020), doi: 10.1101/2020.04.07.20056739).

3 CLUSTERING

Consider a set of points $M=\{p_1, \ldots, p_n\}$ such that a weight $w_{ij} \geq 0$ is assigned to each pair of points $p_i$ and $p_j$, where $i \neq j$ and $i, j \in [n]$. The aim of data clustering is to partition this set of points into classes such that elements of the same class are more alike, while elements of different classes are less alike. The weight $w_{ij}$ measures affinity or similarity in this context; the larger the value, the larger the affinity between $p_i$ and $p_j$. (We use the word affinity because, in some of our examples, cities that are more different in some aspects will have more affinity to each other.) For general data sets, a large number of similarity measures appear in the literature, and their quality depends on the context in which they are used [23] (U. von Luxburg, A tutorial on spectral clustering, Statistics and Computing, 17(4) (2007), 395-416).

Here, points are cities and weights are used to measure whether cities are highly interconnected or not. Several such measures will be considered. For instance, a simple way to measure interconnection between cities is to count the number of people who commute between them. This leads to the following matrix, where the weight $\alpha_{ij}$ between cities $i$ and $j$ is defined through the matrices $\mathcal{T}$ and $\mathcal{E}$ defined in Section 2:

$$A_0 = (\alpha_{ij}), \quad \text{where } \alpha_{ij} = t_{ij} + t_{ji} + e_{ij} + e_{ji}. \tag{3.1}$$

This choice is justified because, in the context of affinity measures, it is natural to consider symmetric weights.
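
As a small illustration of (3.1), the sketch below builds the symmetric affinity matrix from two commuting matrices. The function name `commuting_affinity` and the toy numbers are hypothetical, and setting the diagonal to zero is an assumption that reflects the fact that weights are only defined for pairs of distinct cities.

```python
import numpy as np

def commuting_affinity(T, E):
    """Symmetric affinity alpha_ij = t_ij + t_ji + e_ij + e_ji."""
    flow = np.asarray(T, dtype=float) + np.asarray(E, dtype=float)
    A0 = flow + flow.T          # adds the flow in both directions
    np.fill_diagonal(A0, 0.0)   # assumption: no weight from a city to itself
    return A0

# Toy data for two cities: T[i, j] workers and E[i, j] students living in i
# and commuting to j.
T = [[0, 3], [1, 0]]
E = [[0, 2], [0, 0]]
A0 = commuting_affinity(T, E)
```

Here `A0[0, 1]` and `A0[1, 0]` both equal 3 + 1 + 2 + 0 = 6, so the direction of the commute no longer matters.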

In order to understand how the interconnection between cities was affected during the pandemic, we also considered weights given by matrices $A(t)$, for $t \in \{1, \ldots, N\}$. To incorporate self-isolation data, we first adjust the rate of self-isolation in each city $i$ in terms of the average isolation $\bar{\beta}_i$, which was calculated using the same cell-phone data for $i$ in the entire month of February, before the implementation of measures to contain the dissemination of COVID-19. We define

$$\beta_i^*(t) = \max\left\{ \frac{\beta_i(t) - \bar{\beta}_i}{1 - \bar{\beta}_i},\ 0 \right\}, \tag{3.2}$$

so that $\beta_i^*(t)=0$ if rates of self-isolation are below average (this actually does not happen in our data set after the first week); otherwise, it is a linear interpolation where 0 corresponds to the average rate and 1 to full isolation. We are now ready to define

$$A(t) = (a_{ij}), \quad \text{where } a_{ij} = \left(1 - \beta_j^*(t)\right)\left(t_{ij} + e_{ij}\right) + \left(1 - \beta_i^*(t)\right)\left(t_{ji} + e_{ji}\right). \tag{3.3}$$

The definition of A(t) reflects our belief that it is conceptually more relevant to consider information about isolation in city j to assess the impact on commuting from i to j than information about isolation in city i. On the other hand, we understand that the nature of our isolation index, which estimates the number of individuals who never leave their home, could suggest using indices in city i to limit commutes from i to j. This has been tested and would have negligible impact on the results. Moreover, it would have been natural to ignore data related to student mobility as of the third week because all in-person school and university operations had already been suspended by then. However, this turned out to make clustering more unstable, perhaps because entries associated with smaller or more remote cities became too small.
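
A minimal sketch of (3.2) and (3.3) for a single week is given below. The function names and the toy values are hypothetical; `beta_t` and `beta_bar` stand for the vectors of weekly and baseline (February) isolation rates, and a zero diagonal is again assumed.

```python
import numpy as np

def adjusted_isolation(beta_t, beta_bar):
    """beta*_i(t) from (3.2): rescaled isolation, clipped at 0 below the baseline."""
    beta_t = np.asarray(beta_t, dtype=float)
    beta_bar = np.asarray(beta_bar, dtype=float)
    return np.maximum((beta_t - beta_bar) / (1.0 - beta_bar), 0.0)

def isolation_affinity(T, E, beta_t, beta_bar):
    """Affinity matrix A(t) from (3.3)."""
    flow = np.asarray(T, dtype=float) + np.asarray(E, dtype=float)  # t_ij + e_ij
    keep = 1.0 - adjusted_isolation(beta_t, beta_bar)               # 1 - beta*_j(t)
    damped = flow * keep[np.newaxis, :]   # commutes i -> j damped by isolation in j
    A = damped + damped.T                 # add commutes j -> i damped by isolation in i
    np.fill_diagonal(A, 0.0)              # assumption: no weight from a city to itself
    return A

# Toy data: city 0 isolates above its baseline, city 1 stays at its baseline.
T = np.array([[0.0, 10.0], [4.0, 0.0]])
E = np.zeros((2, 2))
A_t = isolation_affinity(T, E, beta_t=[0.5, 0.3], beta_bar=[0.25, 0.3])
```

In this toy case the adjusted rate for city 0 is (0.5 - 0.25)/(1 - 0.25) = 1/3, so only the commutes into city 0 are damped. When every city is at or below its baseline, the matrix reduces to $A_0$ from (3.1).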

3.1 Normalized cut

Before introducing the other affinity measures used in this paper, we first describe the framework of our analysis. We think of the data points as vertices in a graph $G=(V, E)$, where we use $V=[n]$ for simplicity. The weight between $p_i$ and $p_j$ is viewed as a weight $\omega(ij)=w_{ij}$ associated with the edge $ij$ of $G$ (if $w_{ij}=0$, we assume that vertices $i$ and $j$ are not adjacent in $G$).

In general terms, a clustering problem in $G=(V, E)$ consists of finding a partition $V=V_1 \cup \cdots \cup V_k$ of the vertex set into a pre-determined number $k$ of classes, where the partition optimizes some measure of quality. There are several such measures proposed in the literature [23]. In this paper, we work with the normalized cut introduced by Shi and Malik in [12] (Jianbo Shi & J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8) (2000), 888-905). To define it, some additional notation is needed. Given $U \subseteq V$, let $\bar{U}=V \setminus U$ be the complement of $U$ with respect to $V$. Moreover, for $S, T \subseteq V$, let $W(S, T)=\sum_{i \in S,\, j \in T} w_{ij}$. For a partition $\mathcal{P}=\{V_1, \ldots, V_k\}$ of $V$, let

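$$\mathrm{NCut}(\mathcal{P}) = \sum_{\ell=1}^{k} \frac{\mathrm{Cut}(V_\ell, \bar{V}_\ell)}{\mathrm{Vol}(V_\ell)}, \tag{3.4}$$

where

$$\mathrm{Cut}(\mathcal{P}) = \frac{1}{2} \sum_{\ell=1}^{k} W(V_\ell, \bar{V}_\ell) \quad \text{and} \quad \mathrm{Vol}(V_\ell) = \sum_{i \in V_\ell} \sum_{j \in V} w_{ij}.$$

In particular, $\mathrm{Cut}(V_\ell, \bar{V}_\ell)$ is the cut of the two-class partition $\{V_\ell, \bar{V}_\ell\}$, which equals $W(V_\ell, \bar{V}_\ell)$ because the weights are symmetric.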

Finding an optimal partition in this context is to find a partition $\mathcal{P}$ of $V$ that minimizes the value of $\mathrm{NCut}(\mathcal{P})$. Note that this objective function takes both aims of clustering into account. On the one hand, the only weights that appear in the numerators of terms in (3.4) are weights of edges whose endpoints lie in distinct classes, so that minimizing the function favors partitions in which edges between different classes have small weight. On the other hand, the denominator of the term associated with $V_\ell$ in (3.4) counts the weight of each edge with both endpoints in $V_\ell$ twice, while the other edges incident with $V_\ell$ are only counted once. So, increasing the weight of internal edges would decrease the value of the cut. Unfortunately, the authors of [12] showed that the problem of finding such a partition is NP-hard for general graphs (even if $k=2$).
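
To make the objective (3.4) concrete, the sketch below evaluates the normalized cut of a given partition. The function name `ncut`, the encoding of a partition as a vector of class labels and the toy graph (two triangles joined by one light edge) are assumptions made for the example; the graph is assumed to have no isolated vertices, so every volume is positive.

```python
import numpy as np

def ncut(W, labels):
    """Normalized cut: sum over classes of W(V_l, complement) / Vol(V_l)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    degrees = W.sum(axis=1)
    value = 0.0
    for c in np.unique(labels):
        inside = labels == c
        cut = W[inside][:, ~inside].sum()   # weight leaving the class
        vol = degrees[inside].sum()         # internal edges are counted twice here
        value += cut / vol
    return value

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} with unit weights,
# joined by the edge 2-3 of weight 0.1.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

good = ncut(W, [0, 0, 0, 1, 1, 1])   # separates the two triangles
bad = ncut(W, [0, 0, 1, 1, 1, 1])    # splits the first triangle
```

For the partition into the two triangles, each class has cut 0.1 and volume 6.1, so the value is 0.2/6.1; splitting a triangle cuts two unit-weight edges and gives a much larger value.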

However, this problem is well-suited for a spectral approach. The following definitions are well known in spectral graph theory. The weighted adjacency matrix $W=(w_{ij})$ of a graph $G=(V, E)$ with weight function $\omega$ is defined by $w_{ij}=\omega(ij)$ if $ij \in E$ and $w_{ij}=0$ otherwise. The degree of a vertex $i \in V$ in $G$ is given by $d_i=\sum_{j=1}^{n} w_{ij}$. The diagonal matrix with the degrees $d_1, \ldots, d_n$ on the diagonal is called the degree matrix $D$.

At this point, we could simply present the procedure that we use to cluster our data; however, we believe that explaining how it works, and its connection to linear algebra, clarifies our approach. The following computations are performed in detail in [23]. Given positive integers $n$ and $k$ and a partition $\mathcal{P}=\{V_1, \ldots, V_k\}$ of the vertex set of a graph $G=(V, E)$ with weight function $\omega$ and no isolated vertices, consider the matrix $X_{\mathcal{P}} \in \mathbb{R}^{n \times k}$ whose columns are the $k$ vectors $x^{(\ell)}=(x_1^{(\ell)}, x_2^{(\ell)}, \ldots, x_n^{(\ell)})^T$ with coordinates

$$x_j^{(\ell)} = \begin{cases} 1/\sqrt{\mathrm{Vol}(V_\ell)} & \text{if } j \in V_\ell; \\ 0 & \text{otherwise,} \end{cases}$$

for all $\ell \in \{1, \ldots, k\}$ and $j \in \{1, \ldots, n\}$. Using the Laplacian matrix $L=D-W$ associated with the weighted graph $G$, it turns out that

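$$\mathrm{NCut}(\mathcal{P}) = \sum_{\ell=1}^{k} \frac{\mathrm{Cut}(V_\ell, \bar{V}_\ell)}{\mathrm{Vol}(V_\ell)} = \sum_{\ell=1}^{k} \left(x^{(\ell)}\right)^T L\, x^{(\ell)} = \mathrm{tr}\left(X_{\mathcal{P}}^T L X_{\mathcal{P}}\right).$$

Writing $Y_{\mathcal{P}}=D^{1/2} X_{\mathcal{P}}$, so that $X_{\mathcal{P}}=D^{-1/2} Y_{\mathcal{P}}$, we obtain that

$$\mathrm{NCut}(\mathcal{P}) = \mathrm{tr}\left(Y_{\mathcal{P}}^T D^{-1/2} L D^{-1/2} Y_{\mathcal{P}}\right) = \mathrm{tr}\left(Y_{\mathcal{P}}^T \mathcal{L}\, Y_{\mathcal{P}}\right),$$

where $\mathcal{L}=D^{-1/2} L D^{-1/2}$ is the normalized Laplacian matrix associated with $G$. Therefore finding an optimal partition in the sense of [12] is equivalent to finding a partition $\mathcal{P}$ that minimizes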

$$\mathrm{ncut}_k(G) = \min_{\mathcal{Q}} \mathrm{NCut}(\mathcal{Q}) = \min_{\mathcal{Q}} \mathrm{tr}\left(Y_{\mathcal{Q}}^T \mathcal{L}\, Y_{\mathcal{Q}}\right),$$

where $\mathcal{Q}$ ranges over all partitions of $V$ into exactly $k$ sets. It is easy to see that $Y_{\mathcal{Q}}^T Y_{\mathcal{Q}}=I$, and by the Rayleigh-Ritz Theorem [17, Theorem 13] (J. Magnus & H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics (Revised Edition), John Wiley & Sons Ltd (1999)), we have

$$\min_{Y \in \mathbb{R}^{n \times k},\ Y^T Y = I} \mathrm{tr}\left(Y^T \mathcal{L}\, Y\right) = \lambda_1 + \cdots + \lambda_k, \tag{3.5}$$

where $0=\lambda_1 \leq \cdots \leq \lambda_k$ are the $k$ smallest eigenvalues of the symmetric matrix $\mathcal{L}$. Moreover, equality is achieved by matrices $Y$ whose columns are orthogonal unit eigenvectors associated with the eigenvalues $\lambda_1, \ldots, \lambda_k$. As we have discussed, each partition of $V$ into $k$ parts is associated with a matrix $Y$ as above. However, there are matrices $Y$ that are feasible for (3.5), but are not of the form $Y_{\mathcal{Q}}$ for any partition $\mathcal{Q}$. This leads to the following inequality:

$$\mathrm{ncut}_k^{\mathrm{rel}}(G) = \min_{Y \in \mathbb{R}^{n \times k},\ Y^T Y = I} \mathrm{tr}\left(Y^T \mathcal{L}\, Y\right) \leq \mathrm{ncut}_k(G). \tag{3.6}$$
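
The identities above can be checked numerically on a small example. The sketch below is only a sanity check under stated assumptions: the toy graph (two triangles joined by a light edge) and all variable names are hypothetical, and the indicator matrix is scaled as $Y = D^{1/2} X$ with entries $1/\sqrt{\mathrm{Vol}(V_\ell)}$ in $X$.

```python
import numpy as np

# Toy graph: triangles {0, 1, 2} and {3, 4, 5}, joined by the edge 2-3 (weight 0.1).
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w
labels = np.array([0, 0, 0, 1, 1, 1])
k = 2

d = W.sum(axis=1)                      # degrees d_i
L = np.diag(d) - W                     # Laplacian L = D - W
L_norm = L / np.sqrt(np.outer(d, d))   # D^{-1/2} L D^{-1/2}, entry L_ij / sqrt(d_i d_j)

# NCut from the definition (3.4).
ncut_direct = sum(
    W[labels == c][:, labels != c].sum() / d[labels == c].sum() for c in range(k)
)

# Indicator matrix X with entries 1/sqrt(Vol(V_l)), and Y = D^{1/2} X.
X = np.zeros((len(d), k))
for c in range(k):
    X[labels == c, c] = 1.0 / np.sqrt(d[labels == c].sum())
Y = np.sqrt(d)[:, None] * X

ncut_trace = np.trace(Y.T @ L_norm @ Y)               # tr(Y^T L_norm Y)
relaxed_bound = np.linalg.eigvalsh(L_norm)[:k].sum()  # lambda_1 + ... + lambda_k
```

On this graph `ncut_direct` and `ncut_trace` agree, the columns of `Y` are orthonormal, and `relaxed_bound` does not exceed the normalized cut, as (3.6) predicts.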

As in usual LP-relaxations, the left-hand side of inequality (3.6) may be computed efficiently and gives a lower bound on the value of an optimal partition. On the other hand, there is no obvious connection between a matrix $Y$ that achieves $\mathrm{ncut}_k^{\mathrm{rel}}(G)$ in (3.6) (i.e. a matrix constructed from eigenvectors associated with the smallest eigenvalues of $\mathcal{L}$) and a partition $\mathcal{P}$ into $k$ parts such that $\mathrm{NCut}(\mathcal{P})$ is close to $\mathrm{ncut}_k(G)$. The following heuristic tries to find good quality partitions. To turn the matrix $Y$ into a partition $\mathcal{P}$, it uses a well-known geometric method, known as K-means [16] (J. Macqueen, Some methods for classification and analysis of multivariate observations, in "5-th Berkeley Symposium on Mathematical Statistics and Probability" (1967), p. 281-297). One way of assessing the quality of the output partition $\mathcal{P}$ is by looking at the ratio $\mathrm{NCut}(\mathcal{P})/\mathrm{ncut}_k^{\mathrm{rel}}(G) \geq 1$. If this ratio is exactly 1, the partition $\mathcal{P}$ is optimal. Otherwise, it gives an upper bound on the actual value of the ratio $\rho_{\mathcal{P}}=\mathrm{NCut}(\mathcal{P})/\mathrm{ncut}_k(G)$ (however, we should mention that the gap between $\mathrm{ncut}_k^{\mathrm{rel}}(G)$ and $\mathrm{ncut}_k(G)$ may be very large in general). This heuristic has been quite successful in practice; we refer to [18] (B. Nadler, S. Lafon, R.R. Coifman & I.G. Kevrekidis, Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators, in "Proceedings of the 18th International Conference on Neural Information Processing Systems", NIPS'05, MIT Press, Cambridge, MA, USA (2005), p. 955-962), [19] (A.Y. Ng, M.I. Jordan & Y. Weiss, On Spectral Clustering: Analysis and an Algorithm, in "Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic", MIT Press, Cambridge, MA, USA (2001), p. 849-856) and [24] (U. von Luxburg, M. Belkin & O. Bousquet, Consistency of spectral clustering, Ann. Statist., 36(2) (2008), 555-586, doi: 10.1214/009053607000000640) for more explanation about these empirical findings. Moreover, choosing the number of clusters $k$ is an important problem with no definitive solution. Parameters that are often used to indicate a good choice of $k$ are the spectral gap (the ratio between consecutive eigenvalues: small ratios followed by a larger jump $\lambda_{k+1}/\lambda_k$ indicate that $k$ is a good choice), the closeness to 0 ($k$ is the number of eigenvalues below a certain threshold), and the stability of the clusters obtained in repeated iterations of the procedure, but other criteria also appear in the literature [23].
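
The spectral-gap criterion mentioned above can be sketched as follows. This is one possible reading of the criterion, not the authors' implementation: the function names are hypothetical, the graph is assumed to be connected (so that $\lambda_2 > 0$ and the ratios are well defined), and the search starts at $k=2$ because $\lambda_1=0$.

```python
import numpy as np

def normalized_laplacian(W):
    """D^{-1/2} (D - W) D^{-1/2} for a symmetric weight matrix without isolated vertices."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    return (np.diag(d) - W) / np.sqrt(np.outer(d, d))

def eigengap_choice(W, k_max):
    """Return (k, ratios), where k in {2, ..., k_max} maximizes lambda_{k+1} / lambda_k.

    Requires k_max < n so that lambda_{k_max + 1} exists.
    """
    eigenvalues = np.linalg.eigvalsh(normalized_laplacian(W))  # ascending order
    # eigenvalues[k - 1] is lambda_k and eigenvalues[k] is lambda_{k+1}.
    ratios = {k: eigenvalues[k] / eigenvalues[k - 1] for k in range(2, k_max + 1)}
    return max(ratios, key=ratios.get), ratios
```

On a graph made of two dense groups joined by a light edge, the second eigenvalue is close to zero and the third is much larger, so the largest jump occurs at $k=2$.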

We now state the heuristic of Shi and Malik [12], iterated $S$ times. Given an affinity matrix $W$ associated with an $n$-vertex graph $G=(V, E)$, do the following:

  • (1) Let $D$ be the degree matrix associated with $W$ and construct the normalized Laplacian matrix $\mathcal{L}=D^{-1/2} L D^{-1/2}$, where $L=D-W$.

  • (2) Compute vectors $x_1, x_2, \ldots, x_k \in \mathbb{R}^n$, where each $x_i$ is a unit eigenvector associated with the eigenvalue $\lambda_i$, and $\lambda_1, \ldots, \lambda_k$ are the $k$ smallest eigenvalues of $\mathcal{L}$ (counting multiplicity). In the case of repeated eigenvalues, the eigenvectors associated with them must be orthogonal. Form the matrix $X=[x_1\ x_2\ \ldots\ x_k] \in \mathbb{R}^{n \times k}$ by stacking these eigenvectors in columns.

  • (3) Form the matrix $Y=(y_{ij})$ from $X=(x_{ij})$ by renormalizing each of the rows to have unit length (i.e. $y_{ij}=x_{ij}/(\sum_{m=1}^{k} x_{im}^2)^{1/2}$).

  • (4) for s=1, ..., S do (let 𝒬 denote the best partition obtained up to a given step, where the starting partition is arbitrary.)

  • (4.1) Treating the $i$th row of $Y$ as a point $y_i \in \mathbb{R}^k$, split $\{y_1, \ldots, y_n\}$ into $k$ clusters $S_1, \ldots, S_k$ via K-means.

  • (4.2) Let $\mathcal{P}$ be the partition such that vertex $i$ is assigned to cluster $\ell$ if and only if $y_i$ lies in $S_\ell$.

  • (4.3) If NCut(𝒫)<NCut(𝒬), redefine 𝒬 as 𝒫.

  • (5) Return 𝒬, the partition with minimum Ncut obtained in step (4).
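
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper: the function names, the default number of runs and the seed are hypothetical, and the K-means step is a plain Lloyd iteration with random starting points written out here to keep the example self-contained.

```python
import numpy as np

def ncut(W, labels):
    """Normalized cut of the partition encoded by `labels` (no isolated vertices assumed)."""
    degrees = W.sum(axis=1)
    value = 0.0
    for c in np.unique(labels):
        inside = labels == c
        value += W[inside][:, ~inside].sum() / degrees[inside].sum()
    return value

def kmeans(points, k, rng, n_iter=100):
    """Plain Lloyd iterations started from k distinct data points chosen at random."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_clustering(W, k, n_runs=20, seed=0):
    """Return (labels, ncut value) of the best partition found in n_runs K-means runs."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    L_norm = (np.diag(d) - W) / np.sqrt(np.outer(d, d))   # step (1)
    _, eigenvectors = np.linalg.eigh(L_norm)              # eigenvalues in ascending order
    X = eigenvectors[:, :k]                               # step (2)
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)      # step (3): unit-length rows
    rng = np.random.default_rng(seed)
    best_labels, best_value = None, np.inf
    for _ in range(n_runs):                               # step (4)
        labels = kmeans(Y, k, rng)
        if len(np.unique(labels)) < k:
            continue                                      # skip runs that lose a class
        value = ncut(W, labels)
        if value < best_value:
            best_labels, best_value = labels, value
    return best_labels, best_value                        # step (5)
```

`np.linalg.eigh` returns orthonormal eigenvectors, which covers the orthogonality requirement for repeated eigenvalues in step (2). Dividing the returned value by the sum of the $k$ smallest eigenvalues gives the quality ratio discussed before the algorithm.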

3.2 Affinity based on pendulum migration

When we compute the eigenvalues of the normalized Laplacian matrix $\mathcal{L}$ associated with the affinity measure $A_0$ defined in (3.1), we find a considerable eigenvalue gap between $\lambda_{10}$ and $\lambda_{11}$, which suggests that $k=10$ is a good choice for the number of clusters. When we apply the above procedure to the affinity measure $A_0$ for $S=500$, we obtain the partition $\mathcal{P}$ given in Figure 1, whose gap is $\mathrm{NCut}(\mathcal{P})/\mathrm{ncut}_k^{\mathrm{rel}}(G) \approx 1.3256$. This means that $\mathrm{NCut}(\mathcal{P})$ is at most 32.56% above the actual value of $\mathrm{ncut}_k(G)$, but the true gap is typically much smaller (and the partition may even be optimal). Regarding stability, this partition $\mathcal{P}$ has been obtained 183 times out of the 500 iterations of the procedure.

Figure 1
Clustering obtained by spectral clustering with respect to measure A 0 for k=10 clusters. The largest city in each cluster is marked with a larger circle. The regions defined by the government are on the right.

Even though the data used to obtain this partition is not related to the pandemic, if we look at the evolution of the number of cases during this time period, the connection between the clusters and the spread of the disease is perceptible. For instance, Figure 2 shows how real data about the disease evolved in cities of two neighboring clusters of Figure 1 (left), namely the black and red clusters, in four different weeks (detailed material for all clusters is available in our git repository). The red cluster consists of four cities: Gramado, Canela, Nova Petrópolis and São Francisco de Paula (which are part of a nationally renowned touristic area), and the other cluster is centered in Caxias do Sul, the second largest city in the state by population. One important feature of these clusters is that they are subsets of the same region, according to the state's 20 pre-determined regions. Note that the largest cities of all remaining clusters are also the largest city in their pre-determined region. The first cases appear in the cluster of Caxias do Sul quite early, and they quickly spread to cities in the same cluster, which has a relatively large number of active cases by May 2, the first week displayed in the figure (and the ninth week with cases in the state). On the other hand, there are no recorded cases in the cluster of Gramado until the week of May 9. After the first case is identified, all the other cities in the cluster record cases in a span of three weeks.

Figure 2
Clockwise, starting from the top left. Cases on the weeks from May 2-8, May 9-15, May 16-22 and from May 30 to June 5. Gray stands for no active cases, green for cases in the interval [1, 50], blue for [51, 100] and red above 100. Dark colors mean that the number of active cases has increased from the previous week, light colors mean that they have decreased.

This behavior supports the choice of pendulum migration as a footprint for the spread of the disease, as was done in [22], for instance. However, instead of census data, the authors of [22] used mobile geolocation data from [21] to monitor the movement between cities.

As mentioned at the beginning of this section, instead of using A0, one could adjust the measure to incorporate rates of isolation, using a different measure A(t) (defined in (3.3)) at each time t. As it turns out, the difference in the partition obtained when performing the above clustering procedure for A(t) instead of A0 is minor. Indeed, the Hamming distance at any time t between the two partitions was at most 1 (out of 167). This may indicate that public response to self-isolation has been rather uniform throughout the state.
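To make the comparison between two partitions concrete, the sketch below computes a Hamming-type distance between two cluster labelings. It assumes that the distance is the minimum number of cities whose label must change, minimized over all ways of matching the clusters of one partition to those of the other; the precise definition used in the paper is given earlier in the text and may differ. The function name `partition_hamming` is hypothetical, and the subset dynamic program is only practical for a moderate number of clusters.

```python
from functools import lru_cache


def partition_hamming(labels_a, labels_b):
    """Minimum number of items that disagree, over all matchings of clusters."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    ids_a = sorted(set(labels_a))
    ids_b = sorted(set(labels_b))
    k = max(len(ids_a), len(ids_b))  # pad with empty clusters if needed
    row = {c: r for r, c in enumerate(ids_a)}
    col = {c: r for r, c in enumerate(ids_b)}
    overlap = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        overlap[row[a]][col[b]] += 1

    @lru_cache(maxsize=None)
    def best(r, used):
        # Largest total overlap when rows r..k-1 are matched to unused columns.
        if r == k:
            return 0
        return max(
            overlap[r][c] + best(r + 1, used | (1 << c))
            for c in range(k)
            if not used & (1 << c)
        )

    return len(labels_a) - best(0, 0)


# Toy example: the labelings agree up to renaming, except for the last item.
week_1 = [0, 0, 1, 1, 2, 2]
week_2 = [1, 1, 0, 0, 2, 0]
distance = partition_hamming(week_1, week_2)  # distance == 1
```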

3.3 Affinity based on available ICU beds

As mentioned in the introduction, the state government introduced regulation to define when mandatory protocols of social distancing must be put into effect. Every Saturday, each region, out of a pre-determined set of 20 regions (which in turn are sorted into seven macroregions), is assigned one of four possible flags, yellow, orange, red or black, according to a numerical value based on several indices, which take the number of cases, the number of hospitalizations, the number of deaths and the availability of ICU beds into account. Once a flag has been assigned, cities in the region must adapt to the state regulations associated with that flag (local governments may enforce stricter rules, if desired).

Even though this method was met with a very positive reception from health and local authorities, its implementation quickly led to complaints by cities and economic agents who felt that they had been treated unfairly. For instance, in the first weeks using this method, it was pointed out that several cities where no cases had ever been recorded had been assigned orange or red flags (owing to an outbreak or a shortage of ICU beds in their region, for instance). Moreover, since the index for a region incorporates data from the macroregion to which it belongs, a high-risk flag can be assigned to a region in which no city had a substantial number of cases. In some instances, this has led to loud public outcry and threats of disobedience by local authorities, which in turn led to negotiations and adjustments. At the present moment, regulations include automatic 'flag reductions' for cities that meet certain criteria. This is the case for cities where no new cases have been recorded in the past two weeks, for instance. Moreover, each city can appeal to a board after its weekly classification has been revealed. When this happens, the city is allowed to present new data, such as an expansion of the total number of ICU beds.

Given this reality, we aim to look at the partition into regions from a more flexible perspective. To this end, we propose affinity measures that consider the availability of ICU beds (updated weekly) and examine what happens when we re-organize the regions on a weekly basis. For a city i, let $u_i(t)$ be the average total number of ICU beds in i at time t, and let $\ell_i(t)$ be the average number of ICU beds that are available (i.e. unoccupied and ready to accommodate new patients) in i at time t. The first measure is 'static', as it only considers the total number of ICU beds at the beginning of the recording process:

$$C_0 = (\gamma_{ij}), \quad \text{where } \gamma_{ij} = \frac{|u_i(0) - u_j(0)|}{d_{ij} + c}, \tag{3.7}$$

where $u_i(0)$ denotes the total number of ICU beds in city i on May 2, $d_{ij}$ is the distance between i and j given by matrix 𝒟 (see Section 2), and the constant c=10 avoids the effect of very small distances.

Figure 3
Partitions obtained using the affinity measure (3.8) using data from the weeks from June 13-19 (left) and June 20-26 (right).

The intuition behind this definition is that the health systems of two cities i and j that are geographically close, but whose health infrastructure is very different, would tend to be interconnected (with the city with small health capability transferring patients to the other), while two cities whose health capacities are equivalent would be less dependent on each other.
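As an illustration of the static measure, the following sketch builds the matrix $C_0$ from a list of total ICU beds and a distance matrix. It reads (3.7) as the absolute difference in total ICU beds divided by $d_{ij} + c$, which is an assumption about the intended formula; the function name and the toy data are hypothetical.

```python
def static_icu_affinity(total_beds, dist, c=10.0):
    """Affinity gamma_ij = |u_i(0) - u_j(0)| / (d_ij + c) for all pairs i, j."""
    n = len(total_beds)
    return [
        [abs(total_beds[i] - total_beds[j]) / (dist[i][j] + c) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: a large hospital hub (city 0), a nearby city without
# ICU beds (city 1) and a distant city with similar capacity (city 2).
beds = [100, 0, 90]
dist = [
    [0, 10, 200],
    [10, 0, 190],
    [200, 190, 0],
]
C0 = static_icu_affinity(beds, dist)
```

In this toy example, the nearby pair with very different capacities (cities 0 and 1) receives a much larger affinity than the distant pair with similar capacities (cities 0 and 2), in line with the intuition described above.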

The second measure is ‘dynamic’, not only updating the number of ICU beds, but also considering the actual number of ICU beds that are ready to accommodate new patients:

$$C(t) = (c_{ij}(t)), \quad \text{where } c_{ij}(t) = \max\left\{ \frac{\eta_i(t)\,(\ell_j(t) - \ell_i(t))}{d_{ij} + c},\ \frac{\eta_j(t)\,(\ell_i(t) - \ell_j(t))}{d_{ij} + c} \right\}, \tag{3.8}$$

where c=10, $\ell_i(t)$ is the number of available ICU beds in city i at time t, and $\eta_i(t) = \frac{u_i(t) - \ell_i(t) + 1}{u_i(t) + 1}$. This quantity $\eta_i(t)$ may be viewed as a rate of urgency for city i to look for ICU beds outside its borders. This rate is 1 if the city does not have any ICU beds or if all its ICU beds are occupied, and decreases as the percentage of available beds gets larger. The term $\ell_j(t) - \ell_i(t)$ accounts for the fact that a city j with more available ICU beds than i would be a desirable destination for patients from i. In other words, the affinity measure of interconnection between i and j goes up from the perspective of i if its health system is strained and j is geographically close and has more available beds.
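The dynamic measure can be sketched along the same lines. The code below assumes the reading of (3.8) in which the urgency rate of a city multiplies the difference in available beds (other city minus the city itself), and the larger of the two directed values is kept. The function names and the sample numbers are hypothetical.

```python
def urgency(total, available):
    """Rate eta = (u - l + 1) / (u + 1); equals 1 when no bed is available."""
    return (total - available + 1) / (total + 1)


def dynamic_icu_affinity(total, available, dist, c=10.0):
    """Affinity c_ij(t) built from total beds u, available beds l and distances."""
    n = len(total)
    eta = [urgency(total[i], available[i]) for i in range(n)]
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            from_i = eta[i] * (available[j] - available[i]) / (dist[i][j] + c)
            from_j = eta[j] * (available[i] - available[j]) / (dist[i][j] + c)
            affinity[i][j] = max(from_i, from_j)
    return affinity


# Hypothetical week: city 0 has 10 ICU beds, all occupied; city 1 has 50
# beds, 20 of which are free; the two cities are 30 km apart.
total = [10, 50]
available = [0, 20]
dist = [[0, 30], [30, 0]]
C_t = dynamic_icu_affinity(total, available, dist)
```

Here the affinity between the two cities is driven by the strained city 0, whose urgency rate is 1, and equals 20/(30 + 10) = 0.5.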

Applying the above spectral partitioning procedure with the affinity measure defined in (3.8) for k=20 (the number chosen by the state) and S=500 produces the partitions in Figure 3 in two consecutive weeks. In this particular case, 26 cities switched regions from one week to the next.

Our aim using this measure is to assess whether allowing the regions to be re-organized on a weekly basis can bring meaningful additional information to one of the features of the state flag system, namely that the state consists of 20 pre-determined regions, which are in turn combined into seven macroregions. To this end, we shall first give a general description of the way in which the state assigns flags to regions (the formula is in the appendix). The flag is based on 11 individual indices, classified in two main types, disease propagation or healthcare capacity, and computed in one of three levels (within each region, within each macroregion or statewide). For each index, four intervals have been defined, and a flag is assigned to the index according to the interval it belongs to. The flag actually assigned to the region is obtained from a weighted average of the flags assigned to the different indices.

Here, we have devised an alternative formula (the formula is in the appendix), which uses exactly the same indices wherever possible. An important difference is that we do not use any indices related to macroregions, as it would not make sense to assign a city to a new region every week while assuming that cities lie in a fixed macroregion. Unfortunately, some of the data available for macroregions were not publicly available, or were less reliable, for the cities themselves. Because of this, we transferred the weight of these indices to other indices measuring similar features for cities. To assess what the dynamic clustering obtained using our matrices might say about the clustering defined by the state, we proceed in two steps. The first compares the flags assigned to the 20 pre-determined regions using the state's formula and this new formula. Figure 4 does this for the weeks from June 13-19 and June 20-26. (A comparison for all seven weeks under consideration may be found in our git repository.)

The second step is to split the state into 20 regions on a weekly basis (which we call the dynamic partition) and compare the flags assigned by the new formula to these regions and to the 20 pre-determined regions. This is done in Figure 5, which suggests that more cities would be assigned a lower-risk flag in the dynamic partition. In the week from June 13-19, 26 cities had a lower-risk flag for the dynamic regions, 13 cities had a higher-risk flag for the dynamic regions, and 128 cities remained the same. In the week from June 20-26, these numbers were 50, 2 and 115, respectively.

In short, our computations suggest that the flags assigned with the new formula are related with the flags from the original formula. Moreover, flags assigned in the second step show that partitioning on a weekly basis allows for more flexibility than considering the same partition throughout. This sends the message that it might be possible to devise a formula that takes more, or more reliable, information into account (as in the government’s formula), and that allows regions to be adapted on a weekly basis.

We should emphasize that we do not believe that the new formula presented here is better than the formula used by the state government, quite the opposite, but simulations suggest that the new formula was able to capture the main features of the government’s formula using the data available to us. We are also not suggesting that our regions are necessarily better than the predetermined regions defined by the state government. Even though our results show that a more flexible approach would allow more cities to be assigned lower-risk flags, implementing weekly changes to the regions would bring its own challenges. The government’s regions are heavily based on the way in which the public health system is organized and on the reality that many cities of small and average size do not have hospitals, particularly hospitals equipped with ICU beds to treat complex cases, and therefore need to establish formal agreements with one or more cities to which their patients can be transferred. Because of this, periodic changes to the regions would require that some cities direct their patients to hospitals in different cities every week, which is certainly not easy to implement. However, in exceptional situations such as a pandemic, this might be justified, and accepted by local governments, given the benefit of more leeway to cities that are not as directly affected by the disease.

Figure 4
Flags assigned by the state formula (left) and by our formula (right) on the weeks from June 13-19 (top) and June 20-26 (bottom).

Figure 5
Flags assigned by our formula to the state’s regions (left) and to the dynamic regions (right) on the weeks from June 13-19 (top) and June 20-26 (bottom). Flags are given by colors, regions are labelled 0 to 19.

4 SEIR MODEL

Using the data collected in the previous sections, it is possible to define a discrete model for the spread of the disease, which gives a qualitative description of its evolution and helps us understand the effect of different parameters associated with the disease and of measures to contain it. We consider a discrete susceptible-exposed-infectious-recovered (SEIR) epidemiological model, where the spread of the disease is represented by a recurrence relation indexed by a discrete parameter t ∈ {0, 1, . . .}. These recurrence relations give the expected behavior of a stochastic process defined on a digraph G = (V, E, ω), where each vertex represents a city and the weight $\omega_{ij}$ of an arc ij represents the number of commuters from i to j on an average day. Each city i ∈ V has population $P_i$ and, for all t ≥ 0, the vector $x_i(t) = (S_i(t), E_i(t), I_i(t), R_i(t))$ stands for the number of susceptible, exposed, infected and removed inhabitants of city i at time t, respectively. As usual, all susceptible individuals are assumed to be prone to contracting the disease. Exposed individuals have been infected, but are not yet contagious, while infected individuals are capable of infecting susceptible individuals. Removed individuals either recovered (and became immune to the disease) or passed away. Initially, each city i is assigned a vector $x_i(0)$ with the number of individuals in each class at the start of the process.

We now describe how our system evolves. As in the work of Silva, Pereira and Nonato [22], we assume that most of the movement between cities may be attributed to daily commutes. On day t, part of the population of each city leaves their city to work or study, and comes back in the evening. This leads to a row stochastic matrix $M = (p_{ij})$ of order n, where n = |V|. We interpret $p_{ij}$ as the relative flow from city i to city j, given by $p_{ij} = (t_{ij} + e_{ij})/P_i$ for $j \neq i$, where $t_{ij}$ and $e_{ij}$ come from the matrices 𝒯 and ℰ from Section 2. This corresponds to the proportion of the population of i that regularly commutes to j. The diagonal entries are given by $p_{ii} = 1 - \sum_{j \neq i} p_{ij}$.
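The construction of the commuting matrix M can be sketched as follows. The arguments `t` and `e` stand for the two commuting matrices from Section 2 (whose exact content is not restated here), and the function name and toy numbers are hypothetical.

```python
def commuting_matrix(t, e, population):
    """Row-stochastic matrix with p_ij = (t_ij + e_ij) / P_i for j != i."""
    n = len(population)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                p[i][j] = (t[i][j] + e[i][j]) / population[i]
        # The diagonal entry is the share of residents who stay in their city.
        p[i][i] = 1.0 - sum(p[i])
        if p[i][i] < 0:
            raise ValueError(f"more commuters than residents in city {i}")
    return p


# Hypothetical two-city example.
t = [[0, 100], [50, 0]]
e = [[0, 20], [10, 0]]
population = [1000, 600]
M = commuting_matrix(t, e, population)
```

In this example 12% of the residents of city 0 commute to city 1, and 10% of the residents of city 1 commute to city 0, so each row of M sums to 1.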

As a consequence, during the day each city j has an effective population of

$$P'_j = \sum_{i \in V} p_{ij} P_i.$$

We shall also assume that all classes of individuals are equally likely to move between cities, so that the effective number of individuals of each class in city j on day t is given by

$$S'_j(t) = \sum_{i \in V} p_{ij}(t)\, S_i(t), \quad E'_j(t) = \sum_{i \in V} p_{ij}(t)\, E_i(t), \quad I'_j(t) = \sum_{i \in V} p_{ij}(t)\, I_i(t), \quad R'_j(t) = \sum_{i \in V} p_{ij}(t)\, R_i(t).$$
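The effective daytime quantities are all obtained by the same weighted sum over origin cities, which the following sketch makes explicit. The function name `daytime_counts` and the sample matrix are hypothetical.

```python
def daytime_counts(p, counts):
    """Effective daytime count in each city j: sum over i of p_ij * counts_i."""
    n = len(counts)
    return [sum(p[i][j] * counts[i] for i in range(n)) for j in range(n)]


# Hypothetical row-stochastic commuting matrix and resident populations.
p = [[0.88, 0.12], [0.10, 0.90]]
population = [1000, 600]
day_population = daytime_counts(p, population)
```

Since every row of the commuting matrix sums to 1, the total population is preserved: here the daytime populations are approximately 940 and 660, which again add up to 1600. The same function applies to the S, E, I and R counts.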

In our model, infections only occur during the day (in the city where each individual spends the day). Each individual is assumed to meet L other individuals on a normal day. However, assuming that a susceptible individual spends the day in city j, the number of actual meetings on day t is assumed to be $L(1 - \beta_j^*(t))^2$, where $\beta_j^*(t)$ is the relative rate of isolation of city j on day t, given in (3.2). This number is derived under the simplifying assumption that, for a meeting to happen, neither participant can be under self-isolation, which would happen with probability $(1 - \beta_j^*(t))^2$ if the decision to self-isolate were taken by each individual spending the day in city j, independently of all others, with probability $\beta_j^*(t)$. When an individual is infected, we assume that the disease takes its course in 14 days, following the phases described in guidelines of the Centers for Disease Control and Prevention (CDC) [9]. In the first four days [15], incubation occurs; in the next five days, infected individuals are contagious [20]; and, in the final five days, individuals are still convalescent, but do not transmit the disease [7]. While the individual is infectious, we assume that the probability that an encounter between a susceptible and an infected individual leads to an infection is given by τ(k), where k is the number of days since the infected individual became contagious. As in [20], we assume that τ(k) follows a triangular distribution over the five days, with a peak on the third day. We have τ(k) = 0 for k > 5. The area of the triangle in the definition of this distribution is given by $R_0/L$, to ensure that the basic reproductive number (assuming no isolation) is $R_0 = 2.4$, following a situation report by the WHO [1] (see also [7]).
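The two ingredients described above, the reduced number of meetings and the triangular infectiousness profile, can be sketched as follows. The discretization of the triangle into daily weights in the proportion 1:2:3:2:1 is an assumption made here for illustration, as are the function names; the total `area` is left as a parameter, since the text scales it by $R_0/L$.

```python
def effective_meetings(meetings, beta):
    """Expected number of meetings L * (1 - beta)^2 under isolation rate beta."""
    return meetings * (1 - beta) ** 2


def triangular_tau(area=1.0, days=5, peak=3):
    """Daily weights tau(1..days) rising to `peak`, then falling, summing to `area`.

    Assumption: the triangle is discretized with linearly increasing and
    decreasing daily weights (1:2:3:2:1 for five days with peak on day 3).
    """
    raw = {}
    for k in range(1, days + 1):
        if k <= peak:
            raw[k] = k / peak
        else:
            raw[k] = (days + 1 - k) / (days + 1 - peak)
    total = sum(raw.values())
    return {k: area * w / total for k, w in raw.items()}


def tau_at(tau, k):
    """tau(k), with tau(k) = 0 outside the contagious window."""
    return tau.get(k, 0.0)


tau = triangular_tau()
meetings = effective_meetings(10, 0.5)  # 10 * 0.25 = 2.5
```

With an isolation rate of 50%, only a quarter of the usual meetings take place, which illustrates why isolation enters the model quadratically.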

The recurrence relations become

$$
\begin{aligned}
S_i(t+1) &= S_i(t) - R_0\, S_i(t) \sum_{j} p_{ij} \sum_{k=t-4}^{t} \frac{(1-\beta_j^*(t))^2\, {I'}_j^{\mathrm{new}}(k)\, \tau(k-t+5)}{P'_j(t)}, \\
I_i^{\mathrm{new}}(t+1) &= R_0\, S_i(t) \sum_{j} p_{ij} \sum_{k=t-4}^{t} \frac{(1-\beta_j^*(t))^2\, {I'}_j^{\mathrm{new}}(k)\, \tau(k-t+5)}{P'_j(t)}, \\
E_i(t+1) &= E_i(t) + I_i^{\mathrm{new}}(t+1) - I_i^{\mathrm{new}}(t-2), \\
I_i(t+1) &= I_i(t) + I_i^{\mathrm{new}}(t-2) - I_i^{\mathrm{new}}(t-13), \\
R_i(t+1) &= R_i(t) + I_i^{\mathrm{new}}(t-13).
\end{aligned}
$$
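One step of the recurrence can be sketched as below. This is a literal transcription of the displayed equations under several assumptions that are not stated in the text: the effective number of new infections in city j is computed as $\sum_i p_{ij} I_i^{\mathrm{new}}(k)$, by analogy with the other effective quantities; the weights τ(1), ..., τ(5) are passed as a dictionary normalized to sum to 1, with $R_0$ appearing as an explicit factor; and the daytime population is computed from the constant total population of each city. All names are hypothetical.

```python
def seir_step(t, state, new_cases, p, beta, r0, tau):
    """Advance the model from day t to day t + 1.

    state:     dict with lists "S", "E", "I", "R" (one entry per city) at day t
    new_cases: dict mapping day s to the list of new infections per city;
               days s <= 0 and missing days count as zero
    p:         row-stochastic commuting matrix
    beta:      isolation rate of each city on day t
    tau:       dict with the weights tau(1), ..., tau(5)
    Returns the state at day t + 1 and the list of new infections at t + 1.
    """
    n = len(p)

    def new(s, i):
        return new_cases[s][i] if s > 0 and s in new_cases else 0.0

    pop = [sum(state[c][i] for c in "SEIR") for i in range(n)]
    day_pop = [sum(p[i][j] * pop[i] for i in range(n)) for j in range(n)]

    # Infection pressure felt by a susceptible individual spending the day in j.
    pressure = []
    for j in range(n):
        total = 0.0
        for k in range(t - 4, t + 1):
            day_new = sum(p[i][j] * new(k, i) for i in range(n))
            total += day_new * tau[k - t + 5]
        pressure.append((1 - beta[j]) ** 2 * total / day_pop[j])

    nxt = {c: list(state[c]) for c in "SEIR"}
    infections = []
    for i in range(n):
        inf = r0 * state["S"][i] * sum(p[i][j] * pressure[j] for j in range(n))
        leave_e = new(t - 2, i)   # I_i^new(t - 2), moves from E to I
        leave_i = new(t - 13, i)  # I_i^new(t - 13), moves from I to R
        nxt["S"][i] -= inf
        nxt["E"][i] += inf - leave_e
        nxt["I"][i] += leave_e - leave_i
        nxt["R"][i] += leave_i
        infections.append(inf)
    return nxt, infections


# Hypothetical two-city run: ten new infections in city 0 on day 1.
p = [[0.9, 0.1], [0.2, 0.8]]
tau = {1: 1 / 9, 2: 2 / 9, 3: 3 / 9, 4: 2 / 9, 5: 1 / 9}
state = {"S": [990.0, 500.0], "E": [10.0, 0.0], "I": [0.0, 0.0], "R": [0.0, 0.0]}
history = {1: [10.0, 0.0]}
free, free_new = seir_step(1, state, history, p, [0.0, 0.0], 2.4, tau)
locked, locked_new = seir_step(1, state, history, p, [1.0, 1.0], 2.4, tau)
```

In this toy run, the total population of each city is preserved by the update, and no new infections occur when the isolation rate is 1 in both cities.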

In the above, for simplicity, we assume that $I_i^{\mathrm{new}}(s) = 0$ for all s ≤ 0 and i ∈ [n]. Just to illustrate where these equations come from, we discuss the case where an individual in city i does not contract the disease at time t + 1 when there is no social distancing. With probability $p_{ij}$, the individual moved to city j on day t + 1. The probability that an encounter leads to an infection is

$$\frac{R_0}{L} \sum_{k=t-4}^{t} \frac{\tau(k-t+5)\, {I'}_j^{\mathrm{new}}(k)}{P'_j(t)},$$

so that the probability that no encounter leads to an infection, given that the individual spends the day in city j, is

$$\left(1 - \frac{R_0}{L} \sum_{k=t-4}^{t} \frac{\tau(k-t+5)\, {I'}_j^{\mathrm{new}}(k)}{P'_j(t)}\right)^{L} \approx 1 - R_0 \sum_{k=t-4}^{t} \frac{\tau(k-t+5)\, {I'}_j^{\mathrm{new}}(k)}{P'_j(t)}.$$

Since the same holds for each susceptible individual in i and knowing the proportion of susceptible individuals that commute from i to j, the first equation in the above system gives the expected number of susceptible individuals at time t that remain susceptible at time t + 1.

We run this model starting with the official state data on May 26 to simulate the evolution of the disease until July 9. The number of new infections in the days before this date is estimated using data from May 20-26, where we assume that new cases correspond to 10% of the number of active cases. The results for the cities of Porto Alegre (the state capital and largest city), Rio Grande (the largest port in Southern Brazil and the city with the highest average rate of self-isolation) and Antônio Prado (a small city with a population of about 13,000, where the average rates of self-isolation are lowest) appear in Figure 6.

It is striking to compare these results with the behavior of the same quantities in the case where there is no social distancing (that is, $\beta_j^*(t) = 0$ for all j and t) and with the situation in which the high rates of self-isolation observed in the week between March 21 and 27 had been maintained after May 26 ($\beta_i = 0.614$, on average). This appears in Figure 7.

Figure 6
Number of new cases and the cumulative number of cases (per 100,000 inhabitants) in three cities of Rio Grande do Sul. The x axis represents the number of days after May 26.

To see the effect of self-isolation in this model, in Figure 8 we plot the number of cases in Porto Alegre on July 9 assuming that the rate of self-isolation remained constant throughout the time period, and is given by the corresponding value on the x-axis.

According to our data, the average rate of isolation in Porto Alegre has been about 44.3% during this time period. We note that a simple calculation shows that, while the number of susceptible individuals is much higher than the number of individuals in the other classes, isolation would need to be above 55% to keep the effective reproductive number of the disease below 1.

Even though we have opted to plot the evolution of the disease from May 26 to avoid intrinsic errors coming from initial conditions where the number of infected individuals was very small, we should mention that the isolation data were successful at explaining the ups and downs in the number of cases in the first weeks of the pandemic in Porto Alegre. According to the simulations, the number of cases remained stable between March 31 and the week of May 26, and has grown rapidly since then. State data report that the number was stable until early June, and has grown rapidly since then. (Specific data are in the ancillary files.)

Figure 7
On the top: Number of new cases in Porto Alegre assuming the actual isolation data (left), no isolation (center) and strict isolation (right). In the middle: number of active cases in each scenario. On the bottom: cumulative number of cases in each scenario.

Figure 8
Number of active cases, and the cumulative number of cases in Porto Alegre on July 9, assuming that the rate of isolation remains constant, and is given by the value on the x axis.

5 CONCLUDING REMARKS

In this paper, we looked at the evolution of the COVID-19 pandemic in Rio Grande do Sul using graph theory. We applied spectral clustering techniques on weighted graphs defined on the set of 167 municipalities in the state with population 10,000 or more, using official data provided by government agencies and isolation data by In Loco. Results related to our first measure, based on data for pendulum migration, provided a partition of the state into 10 clusters. The largest city in all but one of the clusters is also the largest city in its own governmental region, and the only exception gives two clusters where the evolution of the disease was quite different. This supports the view that pendulum migration is an important means of spreading the disease. Our results have also shown that, in this situation, considering dynamic clusters that incorporate self-isolation data would give essentially the same clusters. In future work, it would be interesting to see if this can also be observed in other regions, particularly if they are more heterogeneous.

Given the specific context of the flag system in Rio Grande do Sul, our main contribution was obtained using an affinity measure based on the availability of ICU beds. Our results suggest that considering a flexible approach to the regions themselves would be a useful additional tool in giving more leeway to cities with lower incidence rates, while keeping the focus on public safety. However, this is just a first step in evaluating the adequacy of such an approach. Future work could look for more data (at the municipal level), which would allow a direct comparison with the government system. Moreover, implementing this approach on the ground would require state and local authorities to assess the practicality of periodic changes to the regions. For instance, this would need to be accompanied by changes in patient transfer protocols.

To evaluate the quality of the data used for clustering, we have observed that disease information from the literature, combined with the isolation data, has provided a coherent qualitative description of the evolution of the pandemic in Rio Grande do Sul using a simple discrete SEIR model. Extrapolating from this, we conclude that isolation measures have been very important in slowing down the spread of the disease. Of course, better results would be achieved with a better understanding of the behavior of the disease and with a model that takes more information into account.

Acknowledgments

The authors are particularly indebted to In Loco for providing data about self-isolation in the cities of Rio Grande do Sul and to Prof. Márcia Barbian for sharing her data about availability and occupancy of ICU beds in the state. The authors also thank Alisson Matheus Fachini Soares, Guilherme Tadewald Varella and Lucas da Rocha Schwengber for helpful discussions leading to this paper. C. Hoppen acknowledges the support of CNPq 308054/2018-0 and FAPERGS 19/2551-0001727-8. L. E. Allem acknowledges the support of FAPERGS 21/25510002053-9. M. M. Marzo acknowledges the support of CAPES. L. S. Sibemberg acknowledges the support of CNPq. We thank the anonymous reviewers for their careful reading of our manuscript and their many insightful comments and suggestions.

REFERENCES

  • 1
    Coronavirus disease 2019 (COVID-19) Situation Report 46, World Health Organization (2020).
  • 2
    Diário Oficial da União, Decisão Ação Direta de Inconstitucionalidade 6343, 1 de junho de 2020, Supremo Tribunal Federal (2020).
  • 3
    Diário Oficial da União, Portaria 454, 20 de março de 2020, Ministério da Saúde (2020).
  • 4
    Diário Oficial do Estado do Rio Grande do Sul, Decreto 55.128, 19 de março de 2020 (2020).
  • 5
    Diário Oficial do Estado do Rio Grande do Sul, Decreto 55.240, 10 de maio de 2020 (2020).
  • 6
    Painel Coronavírus RS, Secretaria Estadual de Saúde (2020. Access on: 1 Aug. 2020). URL https://ti.saude.rs.gov.br/covid19/
  • 7
    Report 9: Impact of non-pharmaceutical interventions to reduce COVID-19 mortality and healthcare demand, Imperial College COVID-19 Response Team (2020. Access on: 1 Aug. 2020).
  • 8
    Timeline of WHO’s response to COVID-19 (2020. Access on: 1 Aug. 2020). URL https://www.who.int/news-room/detail/2906-2020-covidtimeline
  • 9
    B. Adhikari, L. Fischer, B. Greening, S. Jeon, E. Kahn, G. Kang, G. Rainisch, M. Meltzer & M. Washington. COVID19Surge: a manual to assist state and local public health officials and hospital administrators in estimating the impact of a novel coronavirus pandemic on hospital surge capacity (2020).
  • 10
    N. Ajzenman, T. Cavalcanti & D. Da Mata. More Than Words: Leaders’ Speech and Risky Behavior during a Pandemic. SSRN, (2020). doi: 10.2139/ssrn.3582908. URL https://ssrn.com/abstract=3582908
  • 11
    D. Cyranoski. What China’s coronavirus response can teach the rest of the world. Nature, 579(7800) (2020), 479-480.
  • 12
    Jianbo Shi & J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8) (2000), 888-905.
  • 13
    M.Y. Li, H.L. Smith & L. Wang. Global dynamics of an SEIR epidemic model with vertical transmission. SIAM Journal on Applied Mathematics, 62(1) (2001), 58-69.
  • 14
    K. Linka, M. Peirlinck, F. Sahli Costabal & E. Kuhl. Outbreak dynamics of COVID-19 in Europe and the effect of travel restrictions. Computer Methods in Biomechanics and Biomedical Engineering, (2020), 1-8.
  • 15
    S. Ma, J. Zhang, M. Zeng, Q. Yun, W. Guo, Y. Zheng, S. Zhao, M.H. Wang & Z. Yang. Epidemiological parameters of coronavirus disease 2019: a pooled analysis of publicly reported individual data of 1155 cases from seven countries. medRxiv, (2020). doi: 10.1101/2020.03.21.20040329. URL https://www.medrxiv.org/content/early/2020/03/24/2020.03.21.20040329
  • 16
    J. MacQueen. Some methods for classification and analysis of multivariate observations. In “5th Berkeley Symposium on Mathematical Statistics and Probability” (1967), p. 281-297.
  • 17
    J. Magnus & H. Neudecker. “Matrix Differential Calculus with Applications in Statistics and Econometrics (Revised Edition)”. John Wiley & Sons Ltd (1999).
  • 18
    B. Nadler, S. Lafon, R.R. Coifman & I.G. Kevrekidis. Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators. In “Proceedings of the 18th International Conference on Neural Information Processing Systems”, NIPS’05. MIT Press, Cambridge, MA, USA (2005), p. 955-962.
  • 19
    A.Y. Ng, M.I. Jordan & Y. Weiss. On Spectral Clustering: Analysis and an Algorithm. In “Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic”. MIT Press, Cambridge, MA, USA (2001), p. 849-856.
  • 20
    C.M. Peak, R. Kahn, Y.H. Grad, L.M. Childs, R. Li, M. Lipsitch & C.O. Buckee. Comparative Impact of Individual Quarantine vs. Active Monitoring of Contacts for the Mitigation of COVID-19: a modelling study. medRxiv, (2020).
  • 21
    P.S. Peixoto, D.R. Marcondes, C.M. Peixoto, L. Queiroz, R. Gouveia, A. Delgado & S.M. Oliva. Potential dissemination of epidemics based on Brazilian mobile geolocation data. Part I: Population dynamics and future spreading of infection in the states of Sao Paulo and Rio de Janeiro during the pandemic of COVID-19. medRxiv, (2020). doi: 10.1101/2020.04.07.20056739. URL https://www.medrxiv.org/content/early/2020/04/11/2020.04.07.20056739
  • 22
    P.J.S. Silva, T. Pereira & L.G. Nonato. Robot dance: a city-wise automatic control of Covid-19 mitigation levels. medRxiv, (2020). doi: 10.1101/2020.05.11.20098541. URL https://www.medrxiv.org/content/early/2020/05/18/2020.05.11.20098541
  • 23
    U. Von Luxburg. A tutorial on spectral clustering. Statistics and computing, 17(4) (2007), 395-416.
  • 24
    U. von Luxburg, M. Belkin & O. Bousquet. Consistency of spectral clustering. Ann. Statist., 36(2) (2008), 555-586. doi: 10.1214/009053607000000640. URL https://doi.org/10.1214/009053607000000640
  • 25
    Y. Wang, J. Tong, Y. Qin, T. Xie, J. Li, J. Li, J. Xiang, Y. Cui, E.S. Higgs, J. Xiang & Y. He. Characterization of an asymptomatic cohort of SARS-COV-2 infected individuals outside of Wuhan, China. Clinical Infectious Diseases, (2020). doi: 10.1093/cid/ciaa629. URL https://doi.org/10.1093/cid/ciaa629
  • 26
    N. Zhu, D. Zhang, W. Wang, X. Li, B. Yang, J. Song, X. Zhao, B. Huang, W. Shi, R. Lu, P. Niu, F. Zhan, X. Ma, D. Wang, W. Xu, G. Wu & G. Gao. A novel coronavirus from patients with pneumonia in China, 2019. New England Journal of Medicine, 382 (2020). doi: 10.1056/NEJMoa2001017.

A FORMULA FOR FLAGS

This appendix contains the formulae used to compute the flags. The formulae are weighted means of a series of parameters.

A.1 Government formula

The parameters used in the government formula are as follows:

$R_j$ = Number of cities in region $j \in \{1, 2, \ldots, 20\}$.

$M_k$ = Number of cities in macroregion $k \in \{1, 2, 3, 4, 5, 6, 7\}$, where $M_{k(j)}$ is the macroregion containing region $j$.

$S$ = Total number of cities (167)

$P_j$ = Population of region $j$

a(t) = New hospitalizations due to COVID-19 in week t

b(t) = SARS patients in ICU beds in week t

c(t) = New confirmed COVID-19 patients in regular hospital beds in week t

d(t) = New confirmed COVID-19 patients in ICU beds in week t

e(t) = Active cases in week t

f(t) = Recovered people during the seven weeks prior to t

g(t) = Deaths due to COVID-19 in week t

h(t) = COVID-19 patients in ICU beds in week t

i(t) = Free ICU beds in week t

\[
B_1^{R_j}(t) = \frac{a(t)}{1 + a(t-1)}; \quad
B_2^{M_{k(j)}}(t) = \frac{b(t)}{1 + b(t-1)}; \quad
B_3^{M_{k(j)}}(t) = \frac{c(t)}{1 + c(t-1)}; \quad
B_4^{M_{k(j)}}(t) = \frac{d(t)}{1 + d(t-1)};
\]
\[
B_5^{R_j}(t) = \frac{e(t)}{1 + f(t)}; \quad
B_6^{R_j}(t) = \frac{a(t) \cdot 100{,}000}{P_j}; \quad
B_7^{R_j}(t) = \frac{g(t) \cdot h(t)}{h(t-1)}; \quad
B_8^{M_{k(j)}}(t) = \frac{i(t)}{h(t)};
\]
\[
B_9^{M_{k(j)}}(t) = \frac{i(t)}{i(t-1)}; \quad
B_{10}^{S}(t) = \frac{i(t)}{h(t)}; \quad
B_{11}^{S}(t) = \frac{i(t)}{i(t-1)}
\]
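As a minimal sketch of how indicators of this kind could be evaluated, the Python snippet below computes a ratio of the form x(t)/(1 + x(t-1)) and a per-100,000 rate from weekly counts. The function names `growth_ratio` and `per_100k`, the sample counts and the population figure are hypothetical and are not taken from the paper. One effect of the 1 added to the denominator is that the ratio stays defined when the previous week's count is zero.

```python
def growth_ratio(series, t):
    """Return series[t] / (1 + series[t - 1]), the form shared by B1 to B4."""
    if t < 1:
        raise ValueError("t must be at least 1 so that week t - 1 exists")
    return series[t] / (1 + series[t - 1])


def per_100k(count, population):
    """Return count * 100,000 / population, the form used by B6."""
    return count * 100_000 / population


# Hypothetical weekly new hospitalizations a(t) for one region (illustrative only).
a = [0, 4, 9, 12]
b1 = growth_ratio(a, 3)       # 12 / (1 + 9)
b6 = per_100k(a[3], 250_000)  # 12 * 100,000 / 250,000
```

The remaining indicators are plain quotients of the same weekly series and could be computed in the same way.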

Each parameter is associated with a ‘flag’ according to the following ranges:

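\[
\beta_1^{R_j}(t) = \begin{cases}
0, & \text{if } B_1^{R_j}(t) < 1.05 \\
1, & \text{if } 1.05 \le B_1^{R_j}(t) < 1.2 \\
2, & \text{if } 1.2 \le B_1^{R_j}(t) < 1.5 \\
3, & \text{if } 1.5 \le B_1^{R_j}(t)
\end{cases}
\]
\[
\beta_2^{R_j}(t) = \begin{cases}
0, & \text{if } B_2^{M_{k(j)}}(t) < 1.05 \\
1, & \text{if } 1.05 \le B_2^{M_{k(j)}}(t) < 1.3 \\
2, & \text{if } 1.3 \le B_2^{M_{k(j)}}(t) < 1.5 \\
3, & \text{if } 1.5 \le B_2^{M_{k(j)}}(t)
\end{cases}
\]
\[
\beta_3^{R_j}(t) = \begin{cases}
0, & \text{if } B_3^{M_{k(j)}}(t) < 1.05 \\
1, & \text{if } 1.05 \le B_3^{M_{k(j)}}(t) < 1.2 \\
2, & \text{if } 1.2 \le B_3^{M_{k(j)}}(t) < 1.5 \\
3, & \text{if } 1.5 \le B_3^{M_{k(j)}}(t)
\end{cases}
\]
\[
\beta_4^{R_j}(t) = \begin{cases}
0, & \text{if } B_4^{M_{k(j)}}(t) < 1.05 \\
1, & \text{if } 1.05 \le B_4^{M_{k(j)}}(t) < 1.1 \\
2, & \text{if } 1.1 \le B_4^{M_{k(j)}}(t) < 1.25 \\
3, & \text{if } 1.25 \le B_4^{M_{k(j)}}(t)
\end{cases}
\]
\[
\beta_5^{R_j}(t) = \begin{cases}
0, & \text{if } B_5^{R_j}(t) < 0.25 \\
1, & \text{if } 0.25 \le B_5^{R_j}(t) < 0.5 \\
2, & \text{if } 0.5 \le B_5^{R_j}(t) < 0.75 \\
3, & \text{if } 0.75 \le B_5^{R_j}(t)
\end{cases}
\]
\[
\beta_6^{R_j}(t) = \begin{cases}
0, & \text{if } B_6^{R_j}(t) < 1.5 \\
1, & \text{if } 1.5 \le B_6^{R_j}(t) < 3 \\
2, & \text{if } 3 \le B_6^{R_j}(t) < 5 \\
3, & \text{if } 5 \le B_6^{R_j}(t)
\end{cases}
\]
\[
\beta_7^{R_j}(t) = \begin{cases}
0, & \text{if } B_7^{R_j}(t) < 0.25 \\
1, & \text{if } 0.25 \le B_7^{R_j}(t) < 0.6 \\
2, & \text{if } 0.6 \le B_7^{R_j}(t) < 1 \\
3, & \text{if } 1 \le B_7^{R_j}(t)
\end{cases}
\]
\[
\beta_8^{R_j}(t) = \begin{cases}
0, & \text{if } 4 < B_8^{M_{k(j)}}(t) \\
1, & \text{if } 2.35 < B_8^{M_{k(j)}}(t) \le 4 \\
2, & \text{if } 1.5 < B_8^{M_{k(j)}}(t) \le 2.35 \\
3, & \text{if } B_8^{M_{k(j)}}(t) \le 1.5
\end{cases}
\]
\[
\beta_9^{R_j}(t) = \begin{cases}
0, & \text{if } 4 < B_9^{M_{k(j)}}(t) \\
1, & \text{if } 2.35 < B_9^{M_{k(j)}}(t) \le 4 \\
2, & \text{if } 1.5 < B_9^{M_{k(j)}}(t) \le 2.35 \\
3, & \text{if } B_9^{M_{k(j)}}(t) \le 1.5
\end{cases}
\]
\[
\beta_{10}^{R_j}(t) = \begin{cases}
0, & \text{if } 1.001 < B_{10}^{S}(t) \\
1, & \text{if } 0.8 < B_{10}^{S}(t) \le 1.001 \\
2, & \text{if } 0.7 < B_{10}^{S}(t) \le 0.8 \\
3, & \text{if } B_{10}^{S}(t) \le 0.7
\end{cases}
\]
\[
\beta_{11}^{R_j}(t) = \begin{cases}
0, & \text{if } 1.001 < B_{11}^{S}(t) \\
1, & \text{if } 0.95 < B_{11}^{S}(t) \le 1.001 \\
2, & \text{if } 0.8 < B_{11}^{S}(t) \le 0.95 \\
3, & \text{if } B_{11}^{S}(t) \le 0.8
\end{cases}
\]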

Fix the weights $\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0.375$, $\alpha_5 = 1$, $\alpha_6 = \alpha_7 = \alpha_8 = \alpha_9 = \alpha_{10} = \alpha_{11} = 1.25$. The flag assigned to cities in region $j$ in week $t$ is

\[
B_j(t) = \sum_{n=1}^{11} \alpha_n \cdot \beta_n^{R_j}(t).
\]
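The weighted sum above can be sketched in a few lines of Python. This is an illustration only, not the government's or the authors' implementation: the names `RISING`, `FALLING`, `partial_flag` and `weighted_score` are hypothetical, the cut points are transcribed from the ranges in this appendix, and the sketch assumes that each range is closed on the side of the higher flag (for instance, a value of exactly 1.05 for the first indicator gives flag 1).

```python
from bisect import bisect_left, bisect_right

# Cut points transcribed from the ranges above, in ascending order.
RISING = {  # indicators 1 to 7: a larger value gives a higher flag
    1: (1.05, 1.2, 1.5), 2: (1.05, 1.3, 1.5), 3: (1.05, 1.2, 1.5),
    4: (1.05, 1.1, 1.25), 5: (0.25, 0.5, 0.75), 6: (1.5, 3, 5),
    7: (0.25, 0.6, 1),
}
FALLING = {  # indicators 8 to 11: a smaller value gives a higher flag
    8: (1.5, 2.35, 4), 9: (1.5, 2.35, 4),
    10: (0.7, 0.8, 1.001), 11: (0.8, 0.95, 1.001),
}
WEIGHTS = {1: 0.375, 2: 0.375, 3: 0.375, 4: 0.375, 5: 1.0,
           6: 1.25, 7: 1.25, 8: 1.25, 9: 1.25, 10: 1.25, 11: 1.25}


def partial_flag(n, value):
    """Map the value of indicator n to a flag in {0, 1, 2, 3}."""
    if n in RISING:
        # Number of cut points that are <= value.
        return bisect_right(RISING[n], value)
    # 3 minus the number of cut points that are < value.
    return 3 - bisect_left(FALLING[n], value)


def weighted_score(indicators):
    """Return the weighted sum of the eleven partial flags."""
    return sum(WEIGHTS[n] * partial_flag(n, indicators[n]) for n in WEIGHTS)


# Hypothetical indicator values for one region and week (illustrative only).
example = {1: 1.1, 2: 0.9, 3: 1.6, 4: 1.0, 5: 0.3, 6: 2.0,
           7: 0.1, 8: 3.0, 9: 5.0, 10: 0.75, 11: 1.2}
score = weighted_score(example)
```

Keeping the cut points in ascending order lets a binary search handle both the rising and the falling indicators without writing out eleven separate chains of comparisons.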

A.2 Our formula

Our formula uses those parameters of the government formula that are available at the level of individual cities.

$C_j$ = Number of municipalities in cluster $j \in \{1, 2, \ldots, 20\}$

\[
B_1^{C_j}(t) = \frac{e(t-1)}{e(t-2)}; \quad
B_2^{C_j}(t) = \frac{d(t)}{1 + d(t-1)}; \quad
B_3^{C_j}(t) = \frac{e(t)}{1 + f(t)}; \quad
B_4^{C_j}(t) = \frac{g(t) \cdot h(t)}{h(t-1)};
\]
\[
B_5^{C_j}(t) = \frac{e(t) \cdot 100{,}000}{P_j}; \quad
B_6^{C_j}(t) = \frac{i(t)}{h(t)}; \quad
B_7^{C_j}(t) = \frac{i(t)}{i(t-1)}; \quad
B_8^{S}(t) = \frac{i(t)}{h(t)}; \quad
B_9^{S}(t) = \frac{i(t)}{i(t-1)}
\]

Each parameter is associated with a ‘flag’ according to the following ranges:

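\[
\beta_1^{C_j}(t) = \begin{cases}
0, & \text{if } B_1^{C_j}(t) < 1.05 \\
1, & \text{if } 1.05 \le B_1^{C_j}(t) < 1.2 \\
2, & \text{if } 1.2 \le B_1^{C_j}(t) < 1.5 \\
3, & \text{if } 1.5 \le B_1^{C_j}(t)
\end{cases}
\]
\[
\beta_2^{C_j}(t) = \begin{cases}
0, & \text{if } B_2^{C_j}(t) < 1.05 \\
1, & \text{if } 1.05 \le B_2^{C_j}(t) < 1.1 \\
2, & \text{if } 1.1 \le B_2^{C_j}(t) < 1.25 \\
3, & \text{if } 1.25 \le B_2^{C_j}(t)
\end{cases}
\]
\[
\beta_3^{C_j}(t) = \begin{cases}
0, & \text{if } B_3^{C_j}(t) < 0.25 \\
1, & \text{if } 0.25 \le B_3^{C_j}(t) < 0.5 \\
2, & \text{if } 0.5 \le B_3^{C_j}(t) < 0.75 \\
3, & \text{if } 0.75 \le B_3^{C_j}(t)
\end{cases}
\]
\[
\beta_4^{C_j}(t) = \begin{cases}
0, & \text{if } B_4^{C_j}(t) < 0.25 \\
1, & \text{if } 0.25 \le B_4^{C_j}(t) < 0.6 \\
2, & \text{if } 0.6 \le B_4^{C_j}(t) < 1 \\
3, & \text{if } 1 \le B_4^{C_j}(t)
\end{cases}
\]
\[
\beta_5^{C_j}(t) = \begin{cases}
0, & \text{if } B_5^{C_j}(t) < 30 \\
1, & \text{if } 30 \le B_5^{C_j}(t) < 90 \\
2, & \text{if } 90 \le B_5^{C_j}(t) < 270 \\
3, & \text{if } 270 \le B_5^{C_j}(t)
\end{cases}
\]
\[
\beta_6^{C_j}(t) = \begin{cases}
0, & \text{if } 4 < B_6^{C_j}(t) \\
1, & \text{if } 2.35 < B_6^{C_j}(t) \le 4 \\
2, & \text{if } 1.5 < B_6^{C_j}(t) \le 2.35 \\
3, & \text{if } B_6^{C_j}(t) \le 1.5
\end{cases}
\]
\[
\beta_7^{C_j}(t) = \begin{cases}
0, & \text{if } 1 < B_7^{C_j}(t) \\
1, & \text{if } 0.8 < B_7^{C_j}(t) \le 1 \\
2, & \text{if } 0.7 < B_7^{C_j}(t) \le 0.8 \\
3, & \text{if } B_7^{C_j}(t) \le 0.7
\end{cases}
\]
\[
\beta_8^{C_j}(t) = \begin{cases}
0, & \text{if } 4 < B_8^{S}(t) \\
1, & \text{if } 2.35 < B_8^{S}(t) \le 4 \\
2, & \text{if } 1.5 < B_8^{S}(t) \le 2.35 \\
3, & \text{if } B_8^{S}(t) \le 1.5
\end{cases}
\]
\[
\beta_9^{C_j}(t) = \begin{cases}
0, & \text{if } 1.001 < B_9^{S}(t) \\
1, & \text{if } 0.95 < B_9^{S}(t) \le 1.001 \\
2, & \text{if } 0.8 < B_9^{S}(t) \le 0.95 \\
3, & \text{if } B_9^{S}(t) \le 0.8
\end{cases}
\]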

Fix the weights $\alpha_1 = \alpha_2 = 0.75$, $\alpha_3 = 1$, $\alpha_4 = \alpha_5 = \alpha_6 = \alpha_7 = \alpha_8 = \alpha_9 = 1.25$. The flag assigned to the cities in cluster $j$ in week $t$ is

\[
B_j(t) = \sum_{n=1}^{9} \alpha_n \cdot \beta_n^{C_j}(t).
\]
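As a quick check on the scale of these scores (a worked calculation added for illustration, not part of the original appendix), note that the weights in both formulas sum to 10 and that every partial flag lies between 0 and 3:

\[
\underbrace{4(0.375) + 1 + 6(1.25)}_{\text{government weights}} = 10,
\qquad
\underbrace{2(0.75) + 1 + 6(1.25)}_{\text{our weights}} = 10,
\qquad
0 \le B_j(t) \le 3 \cdot 10 = 30 .
\]

Dividing either score by 10 therefore gives a weighted mean of the partial flags, with values between 0 and 3.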

Publication Dates

  • Publication in this collection
    14 Nov 2022
  • Date of issue
    Oct-Dec 2022

History

  • Received
    01 Aug 2020
  • Accepted
    31 May 2022
Sociedade Brasileira de Matemática Aplicada e Computacional - SBMAC Rua Maestro João Seppe, nº. 900, 16º. andar - Sala 163, Cep: 13561-120 - SP / São Carlos - Brasil, +55 (16) 3412-9752 - São Carlos - SP - Brazil
E-mail: sbmac@sbmac.org.br