1. What is the Common Data Project?
The Common Data Project is a 501(c)(3) not-for-profit organization dedicated to changing the way we think about and use personal data. Data can be a valuable and significant tool in decision-making, whether the federal government is trying to decide how best to reform healthcare in the U.S. or whether we as individuals are trying to pick a good health plan. But the data that’s already been collected from us isn’t ours—it belongs to Microsoft or Walmart or your local Department of Health—and it supposedly needs to be locked up because it’s too sensitive and personal.
We refuse to believe that. We actively support the development of new technologies and standards that will allow us to use valuable, sensitive data without jeopardizing our individual privacy. We work to create a “common space” for data, which we call a “datatrust.” In the same way our public libraries are a common space for us to share the knowledge we've captured in the printed word, we hope that datatrusts, with the appropriate technologies and controls for privacy and data integrity, can become a common space for data.
2. Why is it called “The Common Data Project”?
Although the data that’s currently being collected comes from us, we don’t have access to it. We want to create a public resource of “common data” through an online service we're calling a datatrust, which will allow organizations to make sensitive data available to the public and provide researchers, policymakers and application developers with a way to directly query that data.
I understand that more data for the public sector is a good thing, but why does this data need to be held in “common” (i.e., openly and publicly)?
Any single data set can support countless conclusions, and countless interpretations of those conclusions. As we make more and more data-driven decisions, we believe it is important to have open access to the data that will become the basis of public discourse (e.g., unemployment rates), so that analysis is open and vetted by many, not held in private by a few.
3. Who’s involved?
Right now, the team consists of four people with very different skills and experiences. Alex Selkirk has 10 years of experience helping software development teams understand and improve their products through data. Alex currently runs a consulting company (Shan Gao Ma) designing and managing large-scale data collection and business intelligence systems for clients like Microsoft. Mimi is the product designer for Chandler, a personal information manager with built-in sharing and collaboration capabilities. Geoff is an Associate Professor at the San Francisco State University Business School specializing in technology social ventures, while Grace is a lawyer and writer whose experience with nonprofits and client confidentiality issues informs her work with CDP every day.
4. How did it get started?
The Common Data Project was started in 2007 by Alex Selkirk.
Over the past 10 years, while designing and working with data collection systems, first at Epinions and then Microsoft, Alex saw both how valuable data collection was to the companies he worked for and the intrinsic limitations of how it was being done. Privacy policies were pitted against data collection in a tug of war that hamstrung data collection efforts while failing to truly empower individuals to become the keepers of their own privacy.
In the meantime, he became interested in how data collection techniques being developed by the tech industry could be repurposed for collecting data for public use.
The Common Data Project was started to explore alternative models for privacy and data collection that would result in a large public data store accessible to researchers, policymakers, individuals and businesses.
5. How does privacy fit into this?
“Privacy” is a word that gets used a lot, but no one seems to agree on its definition. Legally, our ideas about privacy are still catching up to the technological realities of the world we live in today. For example, the laws we developed around the wiretapping of telephones didn’t anticipate a world where your browser could capture as much, if not more, information about you than a month of telephone conversations.
We see better privacy protection as an integral part of our efforts to create a place for “common data.” To offer a more meaningful privacy guarantee, we believe we need to shift away from an “either-or” mental model of privacy toward a more graduated approach. In that approach, privacy cost can be measured and therefore spent in increments, and limits can be enforced when someone exceeds the agreed-upon amount (i.e., a privacy budget).
We are currently working with new privacy technologies to enable calculating and tracking the expenditure of privacy risk.
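To make the idea of a privacy budget concrete, here is a minimal sketch of how spending might be tracked. It is an illustration only, not a description of the datatrust itself: the names `PrivacyBudget`, `BudgetExceeded` and `spend`, the numeric limit, and the assumption that the costs of successive queries simply add up are all hypothetical.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a query would spend more privacy than remains."""


@dataclass
class PrivacyBudget:
    # Total privacy cost agreed for one data set (illustrative value, assumed).
    limit: float
    spent: float = 0.0
    # Running record of who spent how much, so expenditure can be audited.
    ledger: list = field(default_factory=list)

    @property
    def remaining(self) -> float:
        return self.limit - self.spent

    def spend(self, who: str, cost: float) -> float:
        """Record one query's cost, or refuse it if the limit would be exceeded."""
        if cost <= 0:
            raise ValueError("cost must be positive")
        # Small tolerance so floating-point rounding does not reject an exact fit.
        if cost > self.remaining + 1e-12:
            raise BudgetExceeded(
                f"{who} asked for {cost}, but only {self.remaining} remains"
            )
        self.spent += cost
        self.ledger.append((who, cost))
        return self.remaining


if __name__ == "__main__":
    budget = PrivacyBudget(limit=1.0)
    budget.spend("researcher", 0.25)
    budget.spend("developer", 0.5)
    try:
        budget.spend("researcher", 0.5)  # would exceed the agreed limit
    except BudgetExceeded as err:
        print("refused:", err)
```

The point of the sketch is the shift in model: access is not simply granted or denied, but metered, and a request is refused only once the agreed-upon total has been used up.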
6. Why should we trust CDP to hold our data?
First of all, we’re not holding any data yet. Second, a great deal of data about us is already being held without any transparency or oversight on our part. Finally, we have no desire to be a monopolistic master and holder of all data in the universe.
We want to start a new institution called a datatrust, to show that a common space for data is both possible and desirable. We will likely create the first datatrust, but we imagine that if the idea takes off, there will be competitors, both nonprofit and for-profit, that could create datatrusts of their own.
In structuring our organization, we’ve thought long and hard about what it takes for an organization to be trustworthy, and we’re not done thinking about it yet. We chose to be a nonprofit so we would be driven by our nonprofit mission, rather than a profit motive. We’re working on building an active and committed board of respected individuals. We are researching ways to build checks and balances into our organizational structure, including a shut-down fund to destroy any and all data we’re holding in the event that the organization is dissolved. We strive to be transparent in everything we do, and we welcome thoughtful commentary and questions on the way we build our organization.
Learn more about our work on creating a trustworthy organization.
7. What are you working on now?
The barriers we see to creating a space for “common data” are both technological and cultural. We know there are promising new technologies being developed, and we work to stay abreast of these developments. But the other barrier is understanding—what do we want as a society from our data and what do we need to feel our privacy is safe?
Our work falls into the following broad categories:
Education: We are participating in a public discussion of privacy and data collection issues on our blog at myplaceinthecrowd.org and through publications on our website.
Building the Datatrust: We are conducting technical explorations of differential privacy, building a prototype of the datatrust, and maintaining a running list of governance and policy issues, including how to define a quantifiable privacy guarantee and Fourth Amendment privacy rights.
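For readers unfamiliar with differential privacy, the following is a minimal sketch of one standard technique from that field, the Laplace mechanism applied to a counting query. It is offered as background, not as the datatrust prototype's design. The function names, the sample records and the choice of `epsilon` are assumptions made for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer "how many records match?" with noise calibrated to epsilon.

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon. A smaller epsilon means more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random()
    ages = [23, 67, 45, 71, 34, 80, 52]  # made-up sample data
    answer = noisy_count(ages, lambda age: age >= 65, epsilon=0.5, rng=rng)
    print(f"about {answer:.1f} people are 65 or older")
```

Each answer is deliberately imprecise, but the noise averages out over large data sets while making it hard to tell whether any one person's record was included.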
8. How is CDP funded?
We are currently funded by donations from the founder. We have minimal expenses at this point, as we do not yet have any paid employees. We plan to apply for grants in the coming year and eventually solicit donations from the general public.
9. How do I get involved?
If you work at a nonprofit and want to help shape our understanding of how nonprofits use their data, please contact us and we can set up an interview with you by telephone or in person, depending on your location. If you have a data problem you think we could help solve, please contact us and tell us your story. If you are a lawyer with expertise in nonprofit organizations, privacy, or intellectual property, and would like to offer us pro bono assistance, we would love to hear from you!
10. How do I donate?
You can make a tax-deductible donation to the Common Data Project through Amazon's donation service. In 2009, all donations will go towards "program development" to complete the projects and papers described on this website.