

Lacuna Fund is the world’s first collaborative effort to provide data scientists, researchers, and social entrepreneurs in low- and middle-income contexts globally with the resources they need to produce labeled datasets that address urgent problems in their communities.

The Need

Machine learning has shown great potential to revolutionize everything from how farmers increase their crop yields, to how governments communicate with their citizens during natural disasters, to how healthcare providers respond to global pandemics. But in low- and middle-income contexts globally, a lack of labeled and unbiased data puts the benefits of machine learning out of reach. In many cases, data required to build AI applications for real-world problems doesn’t exist. And where it does exist, it’s often outdated, missing key information, or not representative of underserved populations, leading to bias and decreased accuracy. Machine learning tools then “learn” these biases, which can lead to harmful outcomes for people of color, women, and other marginalized populations.

Filling in the International Data Map

Worldwide, there are data gaps that lead to less robust and potentially harmful machine learning outcomes. Lacuna Fund is working to fill these gaps in several domains.


If automated speech recognition (ASR) systems are built for only a small percentage of the world's languages, the result is a pronounced gap in access to the digital world, especially in low- and middle-income contexts.


It is difficult to economically provide personalized services to smallholder farmers if the labeled data needed to accurately estimate crop yields does not exist.


AI can inform healthcare responses, but only with available and representative data.

The World’s First Collaborative Effort to Fund Labeled Data for Social Impact

Lacuna Fund will provide data scientists, researchers, and social entrepreneurs in low- and middle-income contexts globally with the resources they need to produce new datasets to address an underserved population or problem, augment existing datasets to be more representative, or update old datasets to be more sustainable.

Guided by machine learning professionals worldwide, Lacuna Fund is designed for and by the communities it will serve. All datasets produced will be locally developed and owned, and they will be openly accessible to the international community while adhering to best practices regarding ethics and privacy.

Our Grantmaking

The Steering Committee has chosen to focus Lacuna Fund’s grantmaking to support the creation, expansion, and maintenance of labeled datasets in three domain areas with key needs: agriculture, health, and languages.

Lacuna Fund awards grants through requests for proposals (RFPs). We accept applications from non-profit entities, research institutions, for-profit social enterprises, or teams of such organizations. Organizations seeking to apply should have the technical capacity to conduct dataset labeling, creation, expansion, and/or maintenance.

One requirement of applicants is that the datasets are locally developed and owned. We believe that local researchers need to create this data and see it through to implementation because that’s where the greatest potential for systemic change lies.

See open, upcoming, and past RFPs, as well as more information about domain areas and the application.

Galvanizing a step change in machine learning’s potential worldwide.

Lacuna Fund aims to:

  • Disburse funds to institutions to create, expand, and/or maintain datasets that fill gaps and reduce bias in training data used for machine learning. 
  • Make it possible for underserved populations to take advantage of advances offered by AI. 
  • Deepen understanding by the machine learning and philanthropy communities of how to most effectively and efficiently fund development and maintenance of equitably labeled datasets.  

While Lacuna Fund’s objective is to produce more equitable open source datasets, the work is bigger than just data collection and labeling. We are working towards data collection protocols that ensure the data is scalable and replicable. We endeavor to see these protocols applied to other regions and sectors in the future, thereby catalyzing sustainable funding for this essential work.

By helping build the capacity of local organizations to be data collectors, curators, and owners, we aim to empower these changemakers with the data resources they need to uncover new insights, embrace innovative solutions, and ultimately fuel systemic change within their communities.

Funders and Governance

Our Funders

Lacuna Fund began as a funder collaborative between The Rockefeller Foundation,, and Canada’s International Development Research Centre, with the first call for proposals on underserved languages supported by the German development agency GIZ on behalf of the Federal Ministry for Economic Cooperation and Development (BMZ).

Our Governance

Lacuna Fund has since evolved into a multi-stakeholder engagement composed of technical experts, thought leaders, local beneficiaries, and end users. Collectively, we are committed to creating and mobilizing labeled datasets that both solve urgent local problems and lead to a step change in machine learning’s potential worldwide.