Applying big data to realize big social impact

George Molina
Product Manager
Published On
April 7, 2020

90% of the world’s stored data was created in the past two years (Forbes). With new devices capable of recording ever-increasing amounts of information, much has been written about the use of big data in the private sector. This is also a crucial time to step back and understand the impact that this data could have on social causes. To show what social impact empowered by big data could look like, this post provides an overview of five sectors, from economic development to environmental conservation, where data is helping organizations take big strides towards improving outcomes and providing novel solutions that can improve the quality of life for those who need it most.

1. Economic Development
To begin with, the volume of data being produced today is becoming a useful source of information for forward-thinking decisions by countries and organizations focused on increasing financial security and economic development. Thanks to improvements in analysis and predictive modeling, these groups can now make more accurate decisions in a variety of scenarios.

In one case, an open data initiative is looking to make data more open and increasingly accessible to government agencies and private companies. This would enable them to use data to “reduce transaction costs, generate new forms of economic growth and prosperity, generate new revenue models, and disrupt traditional business models.” In this way, a more transparent model would benefit a variety of sectors, as citizens and users can leverage it to create economic opportunity and attempt to solve societal problems.


2. Education

Improving education is one of the most important building blocks towards better societal outcomes. School systems have always gathered large amounts of data, from test scores to behavioral information, making them the perfect place to start. Today, there is an opportunity for schools to use new methods of analysis to see what students are learning, how relevant their education is, and how well resources are being used to produce more well-rounded students.

One organization, the Data Quality Campaign (DQC), is looking to ensure that this useful information is collected and used to make change where it is most impactful. Among the DQC’s various initiatives, one aims to improve public reporting of data on school enrollment, student performance, teacher effectiveness and more. The goal is to improve the quality of this data and tailor it to better meet the needs of specific students, parents and their communities. To protect student privacy, publicly reported data would not include personal information, but it could still help these students and their parents get a better understanding of their desired academic path.

For example, parents would be able to see information related to school performance and the availability of special programs or services that are a better fit for their child’s interests. Communities would be able to use the data to hold schools more accountable by reviewing how overall school performance has improved – or not – over a span of time. By doing so, these communities can foster continued academic achievement across their respective school systems and take an informed stand when these systems require course correction.

3. Health
Data could also play an important part in furthering health initiatives where they are needed most. In one case, OpenStreetMap (OSM) provided incredibly useful mapping information to Sierra Leone’s National Ebola Response Centre as well as the United Nations Humanitarian Data Exchange to improve the coordination of the public health strategies enacted to tackle the disease. Due to the remote nature of some towns in the region, health workers had access to very little of the data they needed to make important decisions about providing care.

According to the CDC, once volunteer mappers performed the hard work of scouring satellite images to identify villages and paths that were previously not recorded, “The OSM data was then often mashed up with open data from affected governments and international organizations.” This combination of non-profit volunteer efforts and government participation resulted in the mapping of over “750,000 buildings and hundreds of kilometers of roads,” which had an immeasurable impact on the Ebola response efforts at the time.

Another example is a research organization based in New Zealand known as CBG Health Research. Through their HealthStat research tool, healthcare professionals across the country are able to identify trends – informed by data collected from national hospitals – of flu and viral outbreaks in real time. Having this data in real time allows doctors to respond faster and more effectively, which helps limit the spread of infection and generates useful data points for future needs.


4. Food Scarcity
While developed and wealthy nations are beginning to face the issue of food scarcity due to overconsumption and global warming, developing countries facing these same challenges will see more severe consequences and must look for ways to adapt in order to shield themselves from irreversible damage. Here, data may be a valuable resource for developing nations looking to improve the quantity and quality of their overall food production.

According to a study on Enabling the IoT in Developing Countries, introducing initiatives backed by IoT (Internet of Things) devices could bring improvements in “transportation safety, agriculture, environment, utility management, health monitoring, and more.” To begin with, IoT implementation could improve crop yield by equipping growing environments and greenhouses with computer vision and environment monitoring technology that provides data on variables such as air pollution, temperature, and more.
Almost 40% of food loss in developing countries occurs at the post-harvest and processing stages. Keeping track of this data would empower producers to create “smart farms” that provide valuable information for themselves and the wider community around them.

5. Environmental Conservation
Lastly, the use of data to improve environmental outcomes should not be understated. As the world continues to grapple with the ever-increasing effects of climate change, organizations such as Global Forest Watch are using NASA satellite imagery to help conservation groups and governments monitor the rate of deforestation across the world. Backed by Google, Global Forest Watch is using a vast amount of data to create up-to-date, high-resolution maps that reveal the decrease of forests worldwide. This tool has the potential to provide evidence-backed arguments for new and aggressive environmental policy changes to curb the devastating results of deforestation.

A similar product, Aqueduct, looks to teach governments, companies, and others about water risks in their area. By identifying areas at risk of flooding through Aqueduct, governments are given the time and knowledge to create a flood response plan should the worst-case scenario occur. In this case, using this data could save lives while also helping countries learn more about water shortages so that they can decrease consumption and take steps towards avoiding a drought that could harm the population.


Challenges to using Data for Social Impact

While it is clear that there is a range of invaluable applications for data across social sectors, the use of this data does not come without a few shortcomings that must be considered.
To begin with, data accessibility is an issue for social-impact uses, as most useful data is in the hands of private companies that are not willing to share it for reasons such as intellectual property or regulatory concerns. Another challenge is data quality: raw data is often not good enough for analysis and needs to undergo a few rounds of cleaning before it yields any valuable information. One example is data from IoT devices, which can occasionally be faulty or inaccurate due to recording issues.
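
To make the idea of "a few rounds of cleaning" concrete, here is a minimal Python sketch of what two cleaning passes over IoT sensor data might look like. It is an illustration only: the temperature range, the deviation threshold, and the sample readings are all hypothetical assumptions, not figures from any of the projects mentioned in this post.

```python
from statistics import median

# Hypothetical plausible range for a greenhouse air-temperature sensor, in degrees Celsius.
VALID_RANGE = (-20.0, 60.0)


def clean_readings(readings, valid_range=VALID_RANGE, max_deviation=10.0):
    """Return readings with missing, out-of-range, and outlier values removed."""
    low, high = valid_range
    # Round 1: drop missing values and physically implausible ones.
    in_range = [r for r in readings if r is not None and low <= r <= high]
    if not in_range:
        return []
    # Round 2: drop values far from the median, which stays stable
    # even when a few readings are faulty.
    mid = median(in_range)
    return [r for r in in_range if abs(r - mid) <= max_deviation]


# Made-up sample: one dropped reading (None), one sensor glitch (999.0),
# and one in-range but suspicious spike (45.0).
raw = [21.5, 22.0, None, 21.8, 999.0, 22.3, 45.0, 21.9]
cleaned = clean_readings(raw)
print(cleaned)
```

Real pipelines would likely need more than this, such as handling timestamps or sensor drift, but even a simple filter shows why cleaning has to come before analysis.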

As a result, translating data into meaningful and understandable explanations will require specialized roles, such as data scientists, to be incorporated into social impact organizations. These data scientists would be able to interpret the data and turn conclusions into actionable insights. The demand for these roles in the American job market, however, outweighs the current supply, with LinkedIn estimating that more than 150,000 positions are left unfilled due to the lack of trained talent. These challenges all present threats to the use of data by organizations looking to make the world a better place.

With that said, there is hope that these issues will improve over time. As companies look to improve their triple bottom line by implementing corporate social responsibility (CSR) initiatives, they will be required to be more transparent about their supply chains and the numbers behind them. This in turn would improve the quality of data gathered and hopefully make it more accessible for external fact checking and validation purposes. In terms of the data scientist shortage, there are promising signs that the supply will soon catch up as data science undergraduate programs, certifications, and boot camps have seen a sharp rise in recent years.

While the world continues to grapple with matters such as poverty, food security, and environmental decay, the time to pair purpose and technology through the use of big data is now. In spite of challenges such as data accessibility, quality, and a lack of data-focused talent, there are signs that these issues will be addressed in the near future. Whether that is creating platforms to inform governmental policy initiatives – Aqueduct, Global Forest Watch, OpenStreetMap – or painting a more transparent picture of our educational systems via the Data Quality Campaign, the impact data will continue to have on furthering social causes is consequential.

These examples represent early adopters of what could be an increasing number of organizations in the social impact space that will begin to hire data scientists and improve their data recording and insight generation to help solve the biggest issues facing us today. Here’s hoping that governments and the private sector will do their part to help that vision become a reality.

– George Molina is a Product Manager at Perpetual