Analysis

Key Dimensions of Analysis
Data Quality

Accuracy, coherence and reliability of datasets.

Legal

Open data laws and reuse rights in Brazil.

Ethical

Respect for privacy, equity and transparency.

Technical

Metadata structure and FAIR principles compliance.

Data Quality Analysis
In accordance with Brazil’s Open Data Policy (Decree No. 8.777/2016) and the reuse principles described in the national open data platform dados.gov.br, we conducted a comprehensive quality assessment of all datasets used in this study. This evaluation ensures that our analysis of urbanization, deforestation, and flood-related disasters in Brazil is grounded in reliable and ethically reusable public data.

The quality assessment was conducted using four key dimensions: Accuracy, Coherence, Completeness, and Timeliness.

1. Accuracy (Syntactic and Semantic)

  • Disaster Records (S2ID) (D1): Data is self-reported by municipalities, which introduces variability in accuracy due to local reporting capabilities and technical standards.
  • Rainfall Data (INMET) (D2): Rainfall data is collected from calibrated meteorological stations, ensuring a high degree of measurement reliability.
  • Population Estimates (IBGE) (D3): Official population projections for 2024, based on statistics from the 2022 census; syntactically and semantically consistent.
  • Urban Expansion (MapBiomas) (D4): Derived from satellite imagery with validated classification processes. Urban land cover classes were extracted consistently.
  • Deforestation (MapBiomas) (D5): Follows the same rigorous image classification process as D4, with a strong accuracy track record across Brazilian biomes.
  • Civil Defense (Transparency Portal) (D6): Based on official data from the Portal da Transparência (Transparency Portal), the website that publishes government spending in Brazil.


2. Coherence

  • Municipality names and state identifiers were standardized to ensure interoperability between datasets.
  • Rainfall (INMET) and disaster records (S2ID) were temporally and spatially aligned and showed expected patterns in high-risk regions.
  • Urban expansion and forest loss trends (MapBiomas) complemented each other and reflected known environmental transitions.
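
The name standardization described above could look roughly like the following sketch. It is an illustration only, not the project's actual code: the function names `normalize_name` and `join_key` are hypothetical, and the choice to strip accents and upper-case everything is an assumption about what "standardized" means here.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Return an accent-free, upper-case, whitespace-collapsed name."""
    # Decompose accented characters (e.g. "ã" becomes "a" + combining tilde).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks left over from the decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse repeated whitespace and upper-case the result.
    return " ".join(stripped.split()).upper()


def join_key(municipality: str, uf: str) -> tuple[str, str]:
    """Build a (municipality, state) key usable across datasets (hypothetical helper)."""
    return (normalize_name(municipality), uf.strip().upper())
```

Pairing the name with the state abbreviation (UF) matters because the same municipality name can occur in more than one state. Where a dataset carries official IBGE municipality codes, joining on those codes is generally more robust than joining on names.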


3. Completeness

  • S2ID (D1): Underreporting is a known issue in certain municipalities. The dataset required cleaning to handle encoding issues and numeric inconsistencies.
  • INMET (D2): Complete for 2024, though metadata extraction for each weather station required manual merging.
  • IBGE (D3): Fully complete for 2018. No updated census values are available at the municipal level for more recent years.
  • MapBiomas (D4 & D5): Complete and consistent at the national level, covering 1985–2022 with annual updates and no missing years.
  • Portal da Transparência (D6): Complete and consistent at the national level, covering 2014-2024 without missing years.
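
As an illustration of the numeric cleaning mentioned for S2ID, the sketch below parses count cells into integers. It assumes, without confirmation from the source files, that counts are whole numbers and that "." may appear as a thousands separator (the usual Brazilian convention); `parse_count` is a hypothetical helper name.

```python
import re
from typing import Optional

# Either digit groups separated by "." (e.g. "10.000") or plain digits ("6000").
_COUNT_PATTERN = re.compile(r"\d{1,3}(\.\d{3})+|\d+")


def parse_count(raw: str) -> Optional[int]:
    """Parse a count cell into an int, or return None if it is not a clean count."""
    text = raw.strip()
    # Reject empty cells, text placeholders, and ambiguous values such as "1.5".
    if not _COUNT_PATTERN.fullmatch(text):
        return None
    # Assumption: "." is a thousands separator, so it can simply be removed.
    return int(text.replace(".", ""))
```

Returning `None` instead of guessing keeps ambiguous cells visible, so they can be reviewed by hand rather than silently converted.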


Some inconsistencies were noted during the validation of demographic and disaster data. In a few municipalities, the sum of people affected by multiple disaster events slightly exceeded the total projected population. This could be explained by:

  • Duplicate representation: Individuals may be counted more than once if affected by multiple events throughout the year.
  • Rounding and approximation: Reported figures like "10,000" or "6,000" suggest estimates rather than exact counts.
  • Population projections: IBGE estimates may not be fully up-to-date or precise, especially in smaller municipalities. In some cases, percentages exceed 100% due to these discrepancies.

Despite these challenges, using official sources like IBGE remains preferable to assuming unknown values, as it ensures transparency and consistency in comparative analysis.
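
The consistency check described above can be sketched as follows. The record layout, the names `MunicipalityYear`, `affected_share` and `over_100`, and the example figures are all hypothetical; the sketch only shows how summing people affected across events can push the share of the projected population above 100%.

```python
from dataclasses import dataclass


@dataclass
class MunicipalityYear:
    name: str
    population: int                 # projected population (IBGE estimate)
    affected_per_event: list[int]   # people affected, one entry per disaster event


def affected_share(rec: MunicipalityYear) -> float:
    """Sum of people affected across events, divided by projected population."""
    if rec.population <= 0:
        raise ValueError("population must be positive")
    return sum(rec.affected_per_event) / rec.population


def over_100(records: list[MunicipalityYear]) -> list[str]:
    """Names of municipalities whose summed affected count exceeds their population."""
    return [r.name for r in records if affected_share(r) > 1.0]


# Hypothetical figures: two rounded event counts exceed a 15,000 projection.
example = [
    MunicipalityYear("Municipality A", 15_000, [10_000, 6_000]),
    MunicipalityYear("Municipality B", 50_000, [6_000]),
]
flagged = over_100(example)
```

In this example only "Municipality A" is flagged. Flagged rows are candidates for manual review rather than automatic correction, since double counting across events, rounded reports, and imprecise projections can all produce the same symptom.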


4. Timeliness

  • S2ID: Includes records up to 2024. Reporting delays may limit visibility of recent disaster impacts.
  • INMET: Fully up-to-date with daily records through 2024.
  • IBGE: Timeliness is limited by the date of the last census (2018), affecting population-based indicators.
  • MapBiomas: Last available version is 2022 (Collection 9), with annual updates generally released mid-year.
  • Portal da Transparência: Includes general data from 2014 to 2025.


5. Summary Table – Data Quality Dimensions
ID | Dataset                             | Accuracy | Coherence | Completeness | Timeliness
D1 | S2ID – Disaster Records             | Medium   | Medium    | Medium       | Medium
D2 | INMET – Rainfall Data               | High     | High      | High         | High
D3 | IBGE – Population Data (2018)       | High     | High      | High         | Low
D4 | MapBiomas – Urban Expansion         | High     | High      | High         | Medium
D5 | MapBiomas – Deforestation           | High     | High      | High         | Medium
D6 | Civil Defense – Transparency Portal | High     | High      | High         | High

All datasets were obtained through official Brazilian open data platforms and are used in accordance with the country's open data reuse policy. Any pre-processing performed preserved the datasets' original structure, and all transformations are documented to support reproducibility and transparency.

Legal Analysis
The assessment is grounded in Brazilian open data regulation, especially the Política de Dados Abertos (Decree No. 8.777/2016), the Lei de Acesso à Informação (LAI), and reuse principles from the national platform dados.gov.br.

This legal analysis evaluates the permissibility and sustainability of using and publishing the datasets incorporated into this project.

1. Privacy Issues

All datasets used are classified as non-personal data under the Lei Geral de Proteção de Dados (LGPD – Law No. 13.709/2018). None of the datasets (S2ID, INMET, IBGE, MapBiomas, Portal da Transparência) contain names, document numbers, biometric data, or any attribute that could identify individuals. S2ID includes aggregated figures on population affected by disasters, but these are fully anonymized and reported at the municipal level, presenting no risk of deanonymization.


2. Intellectual Property Rights

All datasets originate from official Brazilian public agencies and are released under public or open terms:

  • S2ID – Ministério da Integração e do Desenvolvimento Regional
  • INMET – Instituto Nacional de Meteorologia
  • IBGE – Instituto Brasileiro de Geografia e Estatística
  • MapBiomas – Scientific and civic coalition with open data policies
  • Portal da Transparência – Official source of data on Brazilian government costs, payments and investments

These sources do not impose restrictions on reuse, as long as proper attribution is ensured.


3. Licensing and Reuse

Datasets are licensed as follows:

  • IBGE: CC BY 4.0
  • MapBiomas: CC BY-SA 4.0
  • S2ID and INMET: Public domain or institutional open terms
  • Portal da Transparência: CC BY 4.0

The final mashup datasets produced in this project are published under the CC BY 4.0 license to ensure compatibility and openness.


4. Access Limitations

All datasets are openly available without registration or access restrictions. No sensitive data is included. There are no diplomatic, military, or classified elements requiring special handling, in accordance with Art. 7 of the Lei de Acesso à Informação.


5. Economic Conditions

All data were accessed free of charge from public repositories. The reuse is non-commercial and respects the original platforms’ terms of service. No resale or licensing fees are involved.


6. Temporal Aspects

Datasets vary in update frequency:

  • INMET: Updated daily (2024 complete)
  • S2ID: Updated by local governments, may have delays
  • IBGE: Latest complete data from 2018 census
  • MapBiomas: Updated annually, most recent version is from 2022
  • Portal da Transparência – Civil Defense Action 22BO: Updated monthly, with data available through May 2025

Documentation includes date references to avoid misinterpretation and ensure transparent temporal alignment.


7. Final Note on Publication

This project adopts best practices in open data reuse as recommended by dados.gov.br, including:

  • Attributing all original sources
  • Documenting transformations and derived variables
  • Maintaining transparency in all methodological choices

Furthermore, Brazil’s Open Data Policy aligns with international standards such as:

  • The International Open Data Charter, promoting open-by-default public data
  • The Open Government Partnership (OGP), of which Brazil is a founding member
  • The UNESCO Open Science Recommendation and UN SDGs, emphasizing open data in disaster resilience
  • The Digital Public Goods Alliance, which recognizes open geospatial and environmental data as infrastructure

Publishing these integrated datasets under the CC BY 4.0 license ensures alignment with national and global open access principles.


Ethical Analysis
The ethical analysis of this project follows the principles of the Data Ethics EU Guidelines and the Open Data Institute’s Ethics Canvas. Given the project’s focus on climate-related disasters (e.g., floods) and their impact on human populations and ecosystems, we ensured an ethical handling of public data from acquisition to interpretation.

Our datasets include information aggregated at both the municipality and state (UF) levels, which required specific attention to regional equity, data granularity, and avoiding misinterpretation of territorial vulnerabilities.

1. Data Ethics Principles
  • Human-Centric Design: The analysis aims to understand how urbanization, forest loss, and climate dynamics contribute to disaster exposure—particularly among vulnerable communities. The use of demographic indicators and disaster impact data seeks to inform equitable urban and environmental policies centered on human resilience.
  • Fairness and Equity: Aggregated indicators were used to avoid identifying individuals or small groups. In comparing municipalities and states, we were careful not to stigmatize certain regions with higher reported disasters or exposure. Instead, we emphasized structural and historical factors (e.g., infrastructure, land use) to explain disparities.
  • Transparency: The data sources—S2ID, INMET, MapBiomas, and IBGE—are publicly accessible and cited throughout the project. All transformations, filtering methods, and indicators (e.g., urban growth, deforestation rate, precipitation volume) are documented in a reproducible workflow.
  • Accountability: The project adheres to the Brazilian Open Data Policy, and metadata were respected for each reused dataset. Data reuse policies were checked through the Reuso de Dados guidelines. The team also ensured internal responsibility for every data-cleaning or aggregation operation.
  • Privacy and Respect for Affected Populations: No microdata or personally identifiable information was used. When handling sensitive metrics (e.g., number of people affected or displaced), we ensured these were aggregated at the municipal or state level. Regional summaries are framed to inform risk reduction—not to judge the effectiveness of local responses.

2. Ethical Concerns and Mitigation
  • Avoiding Sensationalism: While 2024 saw historic flooding in southern Brazil, the project refrains from spotlighting dramatic figures without proper context. Instead, it contextualizes flood patterns within national trends and regional vulnerabilities.
  • Geographical Sensitivity: Because data are available at both municipal and state level, we took care to ensure that analyses do not reinforce existing inequalities between wealthier and poorer regions. Disparities in data reporting across states were acknowledged and mitigated with proportional indicators.
  • Responsibility in Interpretation: Visualizations and analyses are explicitly marked as non-causal. Correlation between urbanization and flood frequency, for example, is interpreted as a signal—not proof—of systemic risk.
  • Public Engagement and Literacy: The project includes simplified graphs and charts to communicate findings to broader audiences, particularly civil society, educators, and local administrators. Technical notebooks and a GitHub repository are provided for specialists who wish to audit or extend the methodology.

3. Final Note

The ethical safeguards adopted in this project reflect both Brazilian public data governance standards and international ethical frameworks. By treating both individual and regional vulnerabilities with respect and care, the project promotes the ethical reuse of public data to support resilience-building and informed policymaking.


Technical Analysis
This technical analysis provides a comprehensive overview of the source datasets used in our project on hydrological disasters and climate vulnerability in Brazil. All datasets were evaluated in light of the Brazilian Open Data Policy and metadata expectations published on dados.gov.br. Additionally, we applied the FAIR principles (Findable, Accessible, Interoperable, Reusable) to assess their potential for long-term reuse and integration.

We classified metadata quality using the AGID-inspired model based on syntactic and semantic correctness, completeness, and consistency of documentation.

FAIR Principles Evaluation

We also evaluated each dataset according to the FAIR data principles to assess their reuse potential:

FAIR Principle | Description | Assessment
Findable | Data should have globally unique, persistent identifiers and searchable metadata. | Portal da Transparência, IBGE and MapBiomas datasets are fully findable via dados.gov.br. INMET and S2ID lack persistent identifiers.
Accessible | Data and metadata should be retrievable using open protocols. | All datasets are downloadable via open protocols. IBGE and MapBiomas are the most robust. INMET provides access via a portal but with minimal documentation.
Interoperable | Data should use formal standards and shared vocabularies. | Portal da Transparência and MapBiomas use standard classification codes (land cover classes for MapBiomas). IBGE relies on official geographic codes. S2ID uses COBRADE, while INMET lacks a standard structure.
Reusable | Data should include clear licenses, provenance, and rich documentation. | Portal da Transparência, IBGE and MapBiomas provide complete licensing and documentation. S2ID is open but lacks a formal reuse policy. INMET does not specify license terms.

Results
12 Datasets
Including meteorological, land-use, demographic and disaster records supporting the analysis
2.3M+ Affected people
Total affected people during the 2024 flood in Rio Grande do Sul, based on official S2ID data
431 Municipalities
Officially impacted by the event, nearly 90% of the state's municipalities.
400mm+ Extreme rainfall
Millimeters of rainfall measured in several areas during May 2024, the highest in decades.
+12% Urbanization growth
Average urbanization growth across RS municipalities between 2013 and 2023.
R$ 2.1B Investments
Allocated to Civil Defense nationwide (2014–2024), based on Transparency Portal data (values in Brazilian reais).