Climate Datasets
Country: Philippines
Contact: Thinking Machines Data Science | data-for-development@thinkingmachin.es
The Project CCHAIN dataset is an open, linked, analysis-ready dataset of validated health, climate, environmental, and socioeconomic variables collected at the village (“barangay”) level in 12 Philippine cities over 20 years (2003-2022). It includes observations of about 17 diseases collected through field visits to the Philippine Department of Health (DOH) and the Philippine Statistics Authority (PSA). Focusing on the barangay, the smallest administrative unit in the Philippines, helps disaggregate health risks for vulnerable communities, particularly those in informal settlements, and provides actionable insights for local governments.
Authors and Affiliations:
- Thinking Machines Data Science, Inc.: Patricia Anne Faustino, JC Albert Peralta, Veronica Marie Araneta, Dafrose Camille Bajaro, Abigail Moreno
- Epimetrics, Inc.: John Q. Wong, Anne Kathlyn Baladad, Luis Antonio Desquitado, Matthew Limlengco, Carlos Miguel Resurreccion
- Manila Observatory: Faye Abigail Cruz, Dr. Julie Mae Dado, Leia Pauline Tonga
- Philippine Action for Community-led Shelter Initiatives, Inc.: Ericka Lynne Nava
Country: Nigeria
Contact: Emmanuel Chukwuma | emmanuel.chukwuma@apse-ngo.org
This air quality dataset is the first of its kind in the country from abattoir centers. The localized dataset is crucial for air quality monitoring and prediction, as well as for accurate modeling of the air quality index for early warning signals and of health risk. The data was obtained from abattoir centers in Southern Nigeria. The team collected data from representative samples across several states (Anambra, Enugu, Abia, Imo, Ebonyi, and Delta) within the research area. The team visited 27 stations and conducted on-site investigations, collecting over 200,000 numerical values of particulate matter (PM) concentrations using 10 air quality sensors for PM1, PM2.5, and PM10. Additionally, aerial images were captured using a drone at varying heights (10 m, 20 m, 30 m) during operational hours; these images will be used together with satellite imagery to train models for predicting PM values.
Previous studies have shown that exposure to particulate matter and black carbon released in abattoirs is associated with detrimental health outcomes, including elevated morbidity and mortality. This project was undertaken by the Alliance for Progressive and Sustainable Environment (APSE), a local NGO focused on environmental sustainability (more details: www.apse-ngo.org).
Authors and Affiliations:
- Alliance for Progressive and Sustainable Environment: Emmanuel Chukwuma, Uche Okonkwo, Chukwuemeka Umeobi, Jervis Okafor, Sixtus Ezenwankwo, Shadrach Ugwu, Awonge Precious, Cynthia Egdede, Esther Eyo
Dataset: https://drive.google.com/drive/folders/1BRrVgYN-O6s7EsnEgAUCGqINvvfiXZC8?usp=drive_link
Country: Mauritius, Rodrigues, and Agaléga Islands
This dataset includes 146,025 records of real-time solar irradiance data from different locations around Mauritius, Rodrigues, and Agaléga. The solar irradiance data (global horizontal irradiance, GHI, in W/m²) spans 2017 to 2021 at an interval of one hour and covers the hours of 07:00-18:00 each day. This dataset allows for the real-time visualization of the solar irradiance profile at the specified locations, supporting better assessment and planning of solar-generated power. The team is now collecting data (from 2023 onward) at an interval of 15 minutes and plans to update this data repository to reflect that in the future.
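Because the 2017-2021 series is hourly and the newer readings arrive every 15 minutes, users who want to combine the two may need to bring them to a common resolution. The following is a minimal sketch of one way to do that, averaging 15-minute readings into one value per clock hour. The function name `hourly_mean_ghi`, the `(timestamp, ghi)` record layout, and the sample values are all hypothetical, and whether the published hourly values are averages or instantaneous readings is not stated here, so averaging is an assumption.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_mean_ghi(readings):
    """Average (timestamp, ghi) readings into one value per clock hour.

    `readings` is an iterable of (datetime, float) pairs. This layout is an
    assumption for illustration, not the repository's actual schema.
    """
    buckets = defaultdict(list)
    for timestamp, ghi in readings:
        # Truncate each timestamp to the start of its hour.
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(ghi)
    # Return hours in chronological order with their mean GHI.
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical 15-minute GHI readings in W/m² (illustrative, not from the dataset).
sample = [
    (datetime(2023, 1, 1, 7, 0), 100.0),
    (datetime(2023, 1, 1, 7, 15), 140.0),
    (datetime(2023, 1, 1, 7, 30), 180.0),
    (datetime(2023, 1, 1, 7, 45), 220.0),
    (datetime(2023, 1, 1, 8, 0), 260.0),
]
hourly = hourly_mean_ghi(sample)
```

Averaging is only one possible choice; taking the reading at the top of each hour would be the alternative if the hourly series records instantaneous values.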
The targeted beneficiaries of this project include the Government of Mauritius, which has set a goal of generating 60% of its electricity from renewable energy sources by 2030. The Mauritius Renewable Energy Agency, which is tasked with ensuring that the country’s energy demand is increasingly met by renewable energy and with keeping up with international commitments, can use this solar irradiance data and forecasting mechanisms to better manage the utility’s power plants, minimize carbon emissions, prevent loss of load (blackouts), and allow higher penetration of photovoltaic (PV) projects in the country. With free online solar maps and accuracy-enhanced solar energy data, local PV plant operators will also have accurate information for PV performance appraisal. Additionally, the public at large can benefit from a free online solar energy platform, improving acceptance of solar PV technology and increasing penetration of clean technologies in the country to further reduce greenhouse gas emissions. Finally, machine learning models can be trained on this data for intra-day, daily, and even weekly predictions of solar irradiance profiles.
Authors and Affiliations:
- University of Mauritius: Yogesh Beeharry, Ravish Gokool, Yatindra Kumar Ramgolam, Aatish Chiniah
Dataset: https://www.scidb.cn/en/detail?dataSetId=2b499b91a4464fffa9f60fc8b51da03e&version=V2
Country: Madagascar
Contact: Fabienne Rafidiharinirina | f.rafidiharinirina@association-maidi.mg or assomaidi@gmail.com
This team annotated 2,125 Google Earth satellite images and 9,202 drone images, forming a combination of low- and high-definition views of solar panels in Madagascar. The Madagascar Initiatives for Digital Innovation (MAIDI) team performed field checks for up to 25% of the satellite images and, in total, annotated 22,488 polygons.
This dataset will help data scientists and other users develop solar panel detection algorithms to measure solar energy adoption across Madagascar. Notably, the project covers all regions of the country: instead of focusing only on big cities, it also includes medium-sized and small villages as well as coastal and mountain areas.
Authors and Affiliations: Fabienne Rafidiharinirina (Madagascar Initiatives for Digital Innovation)
Country: Pakistan
Contact: Dr. Zeeshan Shafiq | zeeshanshafiq@uetpeshawar.edu.pk
This dataset comprises real-time electrical measurements from a specific climate zone in Pakistan, the Kalam Region, showcasing energy generation and demand within an off-grid electricity infrastructure. It can be used for research in energy systems analysis, climate change studies, electrical engineering, and artificial intelligence applications. It includes voltages, currents, and power factors for three-phase and single-phase systems across the generation, distribution, and consumption stages. Additionally, the dataset incorporates seven climate parameters from the ERA5 dataset (provided by the Copernicus Climate Change Service), totaling 85,596 data points for variables such as temperature, dew point, wind components, precipitation, snowfall, and snow cover.
Collected every five minutes from June 3, 2023, to October 24, 2024, the dataset includes over 45 million instances covering data from four micro-hydropower generators, 26 transformers (in addition to four data acquisition systems installed at Micro Hydro Power Plants (MHPs)), and 585 end users. With local support, the team will continue monitoring the data until June 2025.
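When working with a fixed-interval series like this one, a common first check is how many timestamps a gap-free record would contain, so that missing readings can be spotted. The sketch below computes that count for a five-minute grid over the stated collection window. The helper names are hypothetical, and the midnight-to-midnight window is an assumption, since only the start and end dates are given; actual per-device coverage may differ.

```python
from datetime import datetime, timedelta


def expected_sample_count(start, end, step):
    """Number of evenly spaced timestamps from start to end, both inclusive."""
    if end < start:
        raise ValueError("end must not precede start")
    # Dividing one timedelta by another with // yields a whole number of steps.
    return (end - start) // step + 1


def completeness(observed, expected):
    """Fraction of the expected timestamps that were actually observed."""
    return observed / expected


# Assumption: the window runs from midnight on the first day to midnight on
# the last day; the source states only the dates.
start = datetime(2023, 6, 3)
end = datetime(2024, 10, 24)
per_device = expected_sample_count(start, end, timedelta(minutes=5))
```

Comparing the number of rows actually present for a generator, transformer, or end user against `per_device` with `completeness` gives a rough measure of how much of the window that device covers.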
Authors and Affiliations:
- CISNR UET Peshawar: Zeeshan Shafiq, Prof. Dr. Gul Muhammad Khan, Engr. Sarmad Rafique, Engr. Muhammad Bilal Khan, Engr. Umer Khan, Engr. Mansoor Khan, Engr. Niaz Khan, Engr. Musa Khan, Engr. Abdul Moiz
Dataset: https://zenodo.org/records/14195731
All Lacuna Fund datasets are licensed under the CC-BY 4.0 International license unless otherwise noted.