Datatrust Governance and Policies: Questions, Concerns and Bright Ideas.

A running list of open issues for governing a datatrust.

  1. What is the datatrust? What is its purpose?
  2. Who builds the datatrust technology? Who gets to use it?
  3. Who holds the data?
  4. What is our privacy policy?

  5. Who runs the datatrust and how?
  6. The Community.
  7. The Board.
  8. The Staff.

  9. How is the datatrust funded?
  10. How do we monitor datatrust health?
  11. Can the datatrust change its mission? Does it have a living will?


Authority over the datatrust and the power to make decisions about its policies will be shared among three separate bodies: the Board of Directors, the Staff, and the Community of Donors, Borrowers, and Curators. We seek to create a system of checks and balances in which the risk of abuse is minimized because authority is decentralized and because transparency allows many eyes to monitor the datatrust.


A. A Community of Data Donors, Users, and Curators

The datatrust, as a nonprofit institution that benefits the public, will be stewarded by the public. The people and institutions who donate data, use data, and curate the data will be members of a community that will take an active part in decisions about how data gets collected and how it’s released.

Our goal is to build a community of members with a broad and diverse range of expertise and experience. Members will help to:

  1. Maintain the community charter;
  2. Nominate and elect a certain number of board members;
  3. Monitor the practical effectiveness of our privacy guarantee (i.e., is it too strict or not strict enough?);
  4. Set standards for, vet, and accept data donations;
  5. Set standards for and evaluate Borrower applications for access to data;
  6. Curate data;
  7. Provide feedback on which data is useful, meaning scarce, in high demand, valuable, and/or frequently used.

Community members will also be involved in monitoring the community and preventing abuse, so that the authority to monitor is not centralized in the datatrust's management staff.

  1. Will we require real identities or permit pseudonyms? Will we have different requirements for different roles (i.e., Donor versus Borrower versus Curator)? Will we permit anonymous activities and to what extent?
  2. What kind of "reputation management" will the community need? How will we reward good contributions and community participation? Will we require everyone to donate data to be a member of the community? (E.g., can you be a Borrower without being a Donor?)
  3. How will we enable community members to monitor the community without preventing growth in new directions?
  4. In making decisions, will each member get one vote? Or will voting rights be a reward for participation and engagement? Do community votes always carry the day or does the staff have veto powers in certain situations?
  5. Can you get kicked out of the community? For what reasons? Who has input into that decision? How is it done?

B. Donors' Rights And Responsibilities

Data Donors can be universities and nonprofit organizations, government agencies, independent researchers, businesses, or interested individuals.

Individuals and organizations that donate data will have clearly stated rights with regard to their data. They will include:

  1. The Privacy Guarantee described above.
  2. The right to delete your account and remove your data at any time.

We will also provide tools to view how your data is being used by others, how it's being found, what it's being used for, and where the data is cited.

  1. Do we allow Donors to request that their data be released at a more or less "strict" level of privacy guarantee?
  2. If you delete your account, how will that impact the data that people have already used? (I.e., how will that affect data that has already been cited in studies and papers?)
  3. What kind of choices will Donors have in determining how their data is used? Will we offer Creative Commons-style licenses that allow a user to choose between Commercial/Non-Commercial?
  4. If we do offer choices, how will we incentivize Donors to be as open as possible?
  5. How do we ensure that Data Donors provide adequate documentation of the contents of their donation, how it was collected, and any known issues?
  6. Can we crowdsource data curation without risking privacy violations? Or do volunteer curators enter into individual confidentiality agreements with Data Donors?
  7. How do we encourage Donors to work with each other to collate data sets that would be even more valuable if combined?

C. Data Users' Promises (and Responsibilities)

Individuals and organizations that want to make use of data will have to agree to certain terms of use: both the terms of the datatrust and the terms the Data Donor has set.

Data Users can be researchers, government agencies, businesses and application developers or simply interested individuals.

  1. We want to reward community participation with increased access to data. Does that mean only those who donate data or are members of the community may have access to the data? Will any data be open to the general public?
  2. Will we require Data Users to share their queries and answers with the public? Will we allow any lag time for publication?
  3. How will Data Users vet the quality of data?
  4. What kind of communication channels will exist between Data Users and Data Donors (e.g., to provide feedback on data or to request more data)?
  5. How do we help Data Users work together to make more efficient use of limited privacy budget?
  6. Can Data Users request anonymity?


Today, the Common Data Project is governed by a board of three directors, Alex Selkirk, Mimi Yin, and Geoffrey Desa.

We will need to create a larger, more diverse board, with domain experts in security and privacy, and representatives of the diverse range of individuals and organizations that will donate to and make use of the datatrust. The board will also include community members, elected to the board by the community.

  1. What domains of expertise should be represented in the make-up of the Board? Who has input into that list?
  2. How large will the Board be? How many of the directors will be nominated by and/or elected from the community?
  3. The Board's responsibilities will likely include:
    • Bring in new data donors and conduct outreach to find creative new uses for datatrust data.
    • Monitor the practical effectiveness of the privacy guarantee;
    • Monitor community health: participation levels, diversity, Donor and Borrower activity. Is there a large backlog of uncurated data?
    • Monitor "data health": quantity, quality, and diversity of data in terms of what kinds of data are available, where the data is coming from, and what it's being used for.
    • Regularly review fee structure and revenue stream to make sure they are equitable and enable a healthy, diverse Borrower community.
    • Monitor software audits, both internal and external.
    • Monitor financial health of the datatrust and manage external budget audits.
    • Fundraising.
    • Initiate the shutdown procedure in the event of a fiscal (or other) emergency.
    • Review staff salaries.
    • Review size, make-up and effectiveness of staff.
    • Amend community charter?
    • Override community decisions?
  4. What is the board election cycle?
  5. How should the board be structured? Should there be a core Board of Directors for organizational oversight plus an Advisory Board of domain experts?
    • How do we stagger and balance term length across different board roles?
    • How do we define responsibilities to separate powers among members and minimize the risk of abuse?
  6. Should we have an Ombudsman, who represents the public?


The primary duties of the staff are to enable the community and inform the board. The success of the staff will be measured in terms of how active the community and the board are in the daily functioning, oversight, and long-term planning of the organization.

In this way, we imagine that the datatrust staff will be able to remain small even as the datatrust itself grows in terms of the amount of data it holds, the number of transactions it enables, and the number of members in its community.

Specific responsibilities will likely include:

  1. Building software for the front-end experience of the datatrust (e.g., donating data, curating data, borrowing data, account management, community data analytics, fee payment system)
  2. Running the datatrust service (e.g., keeping the servers up, running, and secure)
  3. Managing community tools (e.g., discussion forums, RSS feeds)
  4. Managing the community charter
  5. Setting up processes and proposing standards for donating and borrowing data
  6. Proposing Borrower fee structure
  7. Evaluating and incorporating software donations
  8. Licensing datatrust technology
  9. Managing day-to-day finances
  10. Reporting to both the board and the community on community health, data health, datatrust revenue stream, and organizational finances.