Agriculture Domain

Lacuna Fund agriculture datasets unlock the power of machine learning to alleviate food security challenges, spur economic opportunities, and give researchers, farmers, communities, and policymakers access to superior agricultural datasets. Learn more and download released datasets below.

2020 Awards

Description: This machine learning dataset of smallholder farmers’ fields includes georeferenced crop images along with labels on input use, crop management, phenology, crop damage, and yields, collected across 8 counties in Kenya.

Authors: Lilian Waithaka, Koen Hufkens, Berber Kramer and Benson Njuguna

Dataset: access here

Description: This dataset includes corrected geolocations of fields, improving the usability of the most expansive crop-cut yield estimation dataset in Eastern Africa. Collected by the non-profit One Acre Fund from 2015 to 2019, it covers major crop-producing regions in Kenya, Rwanda, and Tanzania.

Dataset: access here

Description: This project built a remotely monitored and controlled Internet of Things (IoT) fish pond water quality management system to generate labeled datasets for both conventional ponds and aquaponic pond systems.

Authors: Udanor Collins, Blessing Ogbuokiri, and Nweke Onyiny

Dataset: access here

Description: This repository contains image and spectrometry datasets for five main food security crops in Sub-Saharan Africa: cassava, maize, beans, bananas, and cocoa. Collected and curated in collaboration with in-country agricultural experts, the datasets support a wide range of machine learning applications, including classification, object detection, early crop disease detection, and spatial analysis. The team collected and annotated 127,046 images and 39,300 spectral data points.

Authors: Joyce Nakatumba-Nabende, Andrew Katumba, Claire Babirye, Jeremy Francis Tusubira, Godliver Owomugisha, Neema Mduma, Darlington Akogo, Blessing Sibanda

Dataset: access here

This dataset focuses on locations with predominantly pastoral communities in northern Tanzania to identify fine- and broad-scale movements of livestock and land use patterns and to understand how these relate to communal conflicts. It is a high-quality, accurate, and labeled (image, location, and time stamps) dataset containing detailed information on approximately 2,000 communal resources (e.g., rangelands, water points, and dips) and their use patterns for over 220 villages across four large districts in northern Tanzania, representative of pastoral systems of livestock production in East Africa. The dataset can be used to describe forage and livestock resource management in managed ecosystems such as community rangelands; identify major migration routes among pastoralist herds and the location and type of infrastructure required to support livestock production; anticipate the location of conflicts with crop farmers; and determine the best locations to establish forage banks and support infrastructure along livestock migratory routes.

Contact: Gladness Mwanga and Divine Ekwem

Authors and Affiliations: Dr. Divine Ekwem (University of Glasgow); Gladness Mwanga (Nelson Mandela African Institution of Science and Technology); Professor Gabriel Shirima (Nelson Mandela African Institution of Science and Technology); Professor Mizech Chagunda (University of Hohenheim)

Dataset: access here.

2021 Awards

The project created labeled yield estimates from 3,000 farmers, which were used to train yield prediction models across the country and to generate high-resolution crop mask layers for the different value chains. The yield prediction models were enhanced with other biophysical datasets, ranging from soil properties to climate-related indicators. The datasets served as a proof of concept for scalable training of machine learning models, which may be able to respond more appropriately and cost-effectively to agricultural stressors, thereby ensuring a positive impact on agricultural practices (e.g., good agricultural practices), yields (e.g., harvest quality and quantity), and farmer access to financing (e.g., crop insurance).

Contact: Seth Odhiambo

Authors and Affiliations: Pula Advisors 

Dataset: access here.

All Lacuna Fund datasets are licensed under the CC-BY 4.0 International license unless otherwise noted.