Data has been called “the new oil” as private companies discover its value for generating business insights and driving profits. Likewise, the nonprofit and civil society sector — including healthcare organizations — is increasingly using data-driven insights to design better interventions and programs.
Unfortunately, the unethical and unsafe use of data in the nonprofit healthcare sector is a widespread and often overlooked problem. It can lead to security breaches, political manipulation of data, or the sorting of patients into more and less “valuable” groups in ways that reinforce social and economic inequality. As NPR recently reported, a surprising amount of health data isn’t protected by U.S. medical privacy laws.
Why has this problem been overlooked? First, there are simply not enough experts in data science to address the market need. Second, many organizations focus too much on the benefits of data and not enough on the potential risks. Finally, many organizations are overwhelmed by the operational and legal aspects of handling data safely and ethically. Given that many people nowadays have been conditioned by private companies to automatically accept legal terms and conditions about data use, patients have not pushed nonprofit health organizations to reform their data management practices.
Here I offer suggestions for best practices that nonprofit healthcare organizations can follow to use data safely, ethically, and effectively. These are organized according to the four guiding principles of data management — privacy, consent, openness and pluralism — recommended by the Digital Civil Society Lab at the Stanford Center on Philanthropy and Civil Society.
Privacy
Before sharing or publishing any health data, organizations should develop processes for data management and security, and for handling sensitive data.
Any data that are stored can be hacked, and hackers tend to find new ways into datasets faster than organizations find new ways to secure them. There are two privacy concerns regarding the technical infrastructure of data: where the data are stored (on a cloud service or on an organization’s server) and where they are analyzed (external or internal platforms). When using third-party service providers, be sure to check their legal terms (and any updates to these terms) and look closely for whether and how the provider intends to use the data for its own purposes.
What qualifies as sensitive data? In healthcare, this is most commonly a person’s medical history. But it can also include data from sensors or wearable technology, or lifestyle and location data that, if misapplied, can have a negative impact on whole communities. Health organizations should engage data scientists not only in finding answers to our research questions, but also in understanding what personally identifiable information can be deduced from health datasets. Machine learning is making it much simpler to find these patterns in data, and this is a tool that health organizations can leverage to improve their data privacy practices.
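One simple way to start asking what can be deduced from a dataset is to count how many records share the same combination of seemingly harmless fields. This is the idea behind k-anonymity, a technique not named in this article and used here only as an illustration. The sketch below is a minimal example in Python, assuming hypothetical field names (`zip3`, `birth_year`, `sex`) and made-up rows. A group of size 1 means that combination points to a single person.

```python
from collections import Counter

# Hypothetical quasi-identifiers: fields that are not names or IDs but can
# single a person out when combined. Real datasets will have different ones.
QUASI_IDENTIFIERS = ("zip3", "birth_year", "sex")


def group_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[f] for f in quasi_identifiers) for r in records)


def smallest_group(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return k, the size of the smallest group (0 for an empty dataset)."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def records_at_risk(records, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return records whose quasi-identifier combination appears fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records
            if sizes[tuple(r[f] for f in quasi_identifiers)] < k]


# Made-up example rows, not real patient data.
records = [
    {"zip3": "941", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip3": "941", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip3": "941", "birth_year": 1975, "sex": "M", "diagnosis": "HIV"},
]

k = smallest_group(records)
at_risk = records_at_risk(records, k=2)
```

A check like this only flags the most obvious risk. It says nothing about what an outside dataset could reveal when joined with this one, so it is a starting point for review rather than a guarantee of privacy.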
Consent
Most patients have no idea what happens to their data. It is not that they don’t care what happens, but rather that they are not even aware of what is collected, and they are not given the chance to choose.
Andrew Means has identified the two most important factors for nonprofits to build trust with their data donors: securing the data and clearly specifying its use. The main challenge for health organizations is helping patients regain full control over their data by making them aware of what kind of information is collected and how it is used.
The nonprofit health sector should commit to the use of voluntary terms and conditions that are more robust than simply “opting in” or “opting out.” Our consent practices should convey the personal benefits of contributing data (for example, to develop a customized schedule of insulin injections for diabetics) but also explain the data’s role in the bigger picture (for example, developing a better insulin monitoring device). It is important to designate a person at the organization who is responsible for communicating with patients about their data and responding to any concerns or questions about data use.
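To make "more robust than opting in or out" concrete, consent can be recorded per purpose rather than as a single yes/no flag. The sketch below is one possible shape for such a record, not a prescribed design. The class name, field names, purpose labels, and contact address are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict


@dataclass
class ConsentRecord:
    """Per-purpose consent for one patient (illustrative structure only)."""
    patient_id: str
    contact_person: str  # who at the organization answers data questions
    granted: Dict[str, date] = field(default_factory=dict)  # purpose -> date granted

    def grant(self, purpose: str, on: date) -> None:
        """Record that the patient agreed to one specific use of their data."""
        self.granted[purpose] = on

    def withdraw(self, purpose: str) -> None:
        """Remove consent for one purpose; other purposes are unaffected."""
        self.granted.pop(purpose, None)

    def allows(self, purpose: str) -> bool:
        """Data may be used for a purpose only if it was explicitly granted."""
        return purpose in self.granted


# Hypothetical patient who agreed to the personal benefit (a customized
# insulin schedule) but has not agreed to wider device research.
consent = ConsentRecord(patient_id="patient-001",
                        contact_person="data-steward@example.org")
consent.grant("personal_insulin_schedule", date(2024, 1, 15))
```

Because anything not explicitly granted is refused, a new use of the data requires going back to the patient rather than relying on a blanket agreement.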
The nonprofit and civil society sector should prioritize consent even if it hurts their short-term ability to access data. In the long term, it will build trust with the communities we serve and open up access to even more data opportunities.
Openness
Publishing and sharing health data enable us to find new insights and tackle problems on a larger scale. But there are clear risks to sharing health data. Which organization is responsible for the data? Who gets access within each organization? Can the findings of a data-sharing project then also be shared?
Research enquiries from third parties can be a benefit rather than a burden if the right safeguards are in place. At the most basic level, a policy for sharing or publishing data and analysis should include three components:
- An overview of the research and the data used
- An explanation of the differences between aggregated and anonymized data
- A comprehensive review of data security and privacy, including any potential for combining datasets to re-identify individuals
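The difference between the second and third components can be shown with a small Python sketch. It assumes made-up records and hypothetical field names, and it treats "anonymized" loosely as "direct identifiers removed". That is only one common reading of the term, and such rows can still be re-identified when combined with other datasets. Aggregation publishes counts instead of rows, and here it hides any group smaller than a chosen threshold.

```python
from collections import Counter

# Hypothetical list of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "phone"}


def strip_direct_identifiers(records):
    """Row-level release: one row per person, with direct identifiers dropped."""
    return [{k: v for k, v in r.items() if k not in DIRECT_IDENTIFIERS}
            for r in records]


def aggregate_counts(records, group_by, min_cell_size=5):
    """Aggregated release: counts per group, with small groups suppressed (None)."""
    counts = Counter(r[group_by] for r in records)
    return {group: (n if n >= min_cell_size else None)
            for group, n in counts.items()}


# Made-up example rows, not real patient data.
records = [
    {"name": "A", "phone": "555-0100", "neighborhood": "north", "condition": "diabetes"},
    {"name": "B", "phone": "555-0101", "neighborhood": "north", "condition": "diabetes"},
    {"name": "C", "phone": "555-0102", "neighborhood": "north", "condition": "diabetes"},
    {"name": "D", "phone": "555-0103", "neighborhood": "south", "condition": "diabetes"},
]

row_level = strip_direct_identifiers(records)
by_neighborhood = aggregate_counts(records, "neighborhood", min_cell_size=2)
```

In the row-level output, the single "south" record is still one person's row, which is exactly the kind of case a security and privacy review should catch. The aggregated output suppresses that group instead of publishing a count of one.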
Access to data must be strictly controlled, and there must be a plan for further sharing or publishing once the original data-sharing project is complete. Every organization involved in a data-sharing arrangement (whether they are the ones giving or receiving the data) needs formalized access policies, as well as a dedicated administrator responsible for managing the data, granting access and monitoring risks.
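A formalized access policy can be as simple as a list of time-limited grants, each approved by the named administrator, plus a log of every access check. The Python sketch below illustrates that idea under stated assumptions. The class names, user names, and dataset name are hypothetical, and a real deployment would rely on the access controls of its storage platform rather than on application code like this.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessGrant:
    """One approved, time-limited permission to read one dataset."""
    user: str
    dataset: str
    approved_by: str  # the dedicated data administrator
    expires: date     # access ends when the sharing project ends


class AccessRegistry:
    """Holds grants and keeps an audit trail of every access check."""

    def __init__(self):
        self._grants = []
        self.audit_log = []  # (date, user, dataset, allowed) tuples

    def grant(self, access_grant):
        self._grants.append(access_grant)

    def can_read(self, user, dataset, today):
        """Allow access only with a matching grant that has not expired."""
        allowed = any(g.user == user and g.dataset == dataset and today <= g.expires
                      for g in self._grants)
        self.audit_log.append((today, user, dataset, allowed))
        return allowed


# Hypothetical grant for a partner researcher, ending with the project.
registry = AccessRegistry()
registry.grant(AccessGrant(user="partner-researcher",
                           dataset="clinic-visits",
                           approved_by="data-admin",
                           expires=date(2024, 6, 30)))
```

Tying each grant to an expiry date forces a decision about what happens to access when the original project ends, and the audit log gives the administrator something concrete to review when monitoring risks.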
When it comes to handling data requests, whether from nonprofit partners or from the private sector, health organizations have to consider the best interests of their patients and constituents, even if that means saying no to some requests for data sharing.
Pluralism
Health data systems have the potential to reflect and learn from diverse cultures, backgrounds and voices. Research on health behaviors and outcomes across communities can yield valuable insights, such as a better understanding of risk factors for cancer or heart disease among specific demographic groups.
But health data can also serve to further disadvantage already marginalized groups. For example, a dataset showing higher than average diabetes or HIV rates in a poor neighborhood could help a community clinic do better preventative outreach to high-risk patients, but it could also help drug companies target price increases to those patients and motivate health insurers to raise cost-sharing tiers (a process known as adverse tiering).
In addition to addressing privacy concerns, nonprofits should publish research only after analyzing potential negative effects on the target population.
The role of civil society in the data economy
Data repositories are seen by many as gold mines. The biggest new businesses in the world (Facebook, Google, Tesla, Uber, Airbnb, etc.) run on data. It is their most valuable asset, and one that consumers often share without thinking and sometimes without knowing. When the right datasets are combined, our lives become transparent to anyone who gets hold of them.
The civil society sector is strategically placed to help people regain control over their data at a time when their data are more valuable than ever. An ethical approach to data management in the nonprofit and civil society space — particularly in the healthcare sector — must be a top priority.
This responsibility can seem overwhelming, especially when we compare the resources of the social sector with those of big business. Yet the goal is not to stop the use of data, but to build ethical frameworks for managing it in a way that minimizes the risks and maximizes positive social outcomes.
Many thanks to Agata Piekut for kicking off a conversation about “healthy” data practices in the social sector.