The 50 shades of shared data
A lot of data is already available for reuse, but far from all of it. Some data is not shared because sharing it is too complicated. Much privacy-sensitive or commercially sensitive data could be shared if the owner's conditions were met. The conditions that a data owner can pose for sharing data are diverse and not always easy to describe. We propose a standard for describing the various types of conditions for sharing data in a way that is readable for man and machine.
Open, shared and closed data
We distinguish open, shared and closed data. Open data is published for reuse without any conditions: anyone can reuse it, and no requirements are imposed on the user.
Shared data is data that comes with conditions for reuse. The conditions describe what a reuser must satisfy. Examples are “only an employee of firm X may access and use this data and its results”, “this API can be used by anyone who pays 10 euros per month”, “all organisations that are members of this cooperative and have signed the agreement can use this dataset”, or “all users who have signed this disclaimer may use this data”. A dataset will probably need more than one of these conditions to be met before it can be shared.
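To make the idea of combined conditions concrete, here is a minimal sketch in Python. It is not the proposed standard: the names Condition, Reuser, kind, value and unmet_conditions are hypothetical and chosen only for illustration. It assumes that a dataset is shared only when every condition is matched by an attribute the reuser can present.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Condition:
    """One requirement posed by a data owner (hypothetical structure)."""
    kind: str   # e.g. "employer", "signed_document", "subscription"
    value: str  # e.g. "Firm X", "disclaimer-v1", "10-eur-per-month"


@dataclass
class Reuser:
    """Attributes a reuser can present, grouped by condition kind."""
    attributes: dict[str, set[str]] = field(default_factory=dict)


def unmet_conditions(conditions: list[Condition], reuser: Reuser) -> list[Condition]:
    """Return the conditions the reuser does not satisfy.

    An empty result means all conditions are met and the data can be shared.
    """
    return [
        c for c in conditions
        if c.value not in reuser.attributes.get(c.kind, set())
    ]


# Two conditions that must both hold (illustrative values only).
conditions = [
    Condition("employer", "Firm X"),
    Condition("signed_document", "disclaimer-v1"),
]
alice = Reuser({"employer": {"Firm X"}, "signed_document": {"disclaimer-v1"}})
bob = Reuser({"employer": {"Firm X"}})  # has not signed the disclaimer

print(unmet_conditions(conditions, alice))
print(unmet_conditions(conditions, bob))
```

Returning the unmet conditions rather than a plain yes or no is a design choice in this sketch: it lets a data provider tell the reuser exactly what is still missing.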
Transparency in conditions makes more data available
For many governments, meeting a simple condition like “I have read the disclaimer that comes with this dataset” is enough to open up more data. When the data owner knows that the reuser understands the context, there is less chance of misuse of the data.
Many data owners worry about misuse of their data: data that is not used or interpreted in the right way may lead to situations that are unwanted or even dangerous for society. The data owner wants to make sure that any reuser reads the background information on the dataset's structure and possible limitations before using it. An example: the department for infrastructure wants to open its data about the heights of tunnels and bridges but is worried about the interpretation of “height”. Large trucks may get stuck in tunnels if their navigation system routes them through a tunnel based on an erroneous interpretation of the data. The department's worries can be taken away by having reusers sign that they have read the disclaimer and leave their email address as a record of this.
“Signing a disclaimer” is one of the simplest conditions a data owner can pose, and it makes the data available to reusers. No reuser will have a problem signing and leaving their email address. Why is this not happening yet? In my opinion the problem is that there is no standard way to be transparent about the conditions for reuse and to store the agreement between the data owner and the reuser. There is no way to describe this condition such that it can be validated by any data provider and the results can be fed back to the data owner.
Shades of shared data
So there are many shades of shared data, determined by the conditions for sharing it. The types of conditions can be endless. But there are a few basic conditions that almost all data owners might pose, and that could make data available that is not being shared yet. We only need a way to describe these conditions so that man and machine (and lawyers) can read and understand them.
So, open data is data that has no conditions at all for sharing: anyone can share it with anyone. Shared data has conditions, ranging from conditions that are easy to comply with to more complex conditions that require strict validation and authorisation of users. Closed data is data with the ultimate condition that it may not be reused by anyone at all. In our opinion there is not a lot of closed data that is valuable to share with others.
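The three categories can be read as a simple classification over a dataset's conditions. The sketch below is one possible encoding, not part of any existing standard. The names Shade and shade_of are hypothetical, as is the convention that None means "may not be reused at all" and an empty list means "no conditions".

```python
from enum import Enum
from typing import Optional


class Shade(Enum):
    OPEN = "open"
    SHARED = "shared"
    CLOSED = "closed"


def shade_of(conditions: Optional[list[str]]) -> Shade:
    """Classify a dataset by its reuse conditions.

    Assumed convention (for illustration only):
    - None       -> reuse is not allowed at all (closed)
    - empty list -> no conditions (open)
    - otherwise  -> one or more conditions must be met (shared)
    """
    if conditions is None:
        return Shade.CLOSED
    if not conditions:
        return Shade.OPEN
    return Shade.SHARED


print(shade_of([]))                     # Shade.OPEN
print(shade_of(["signed-disclaimer"]))  # Shade.SHARED
print(shade_of(None))                   # Shade.CLOSED
```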
A standard to describe conditions for sharing data
It is necessary to be transparent about the conditions for reuse of data in order to make more data available. Part of this transparency is having a standard to describe these conditions. A standard for data-sharing conditions helps to communicate the conditions, to validate them and to exchange them as part of a dataset description in a data catalogue.
Data catalogues like the national data portals of the EU member states should make it possible to describe the conditions for sharing datasets. For open data there is already a very useful open licensing method from Creative Commons or the OpenDataCommons [link:https://www.opendatacommons.org/licenses/index.html]. We need an extension of this framework to describe conditions.
Commons for reuse conditions
We will need a Creative Commons for shared data to describe all the shades of shared data there are. When a data owner can describe her conditions as a set of standard conditions, the data can be shared over a network of data markets. A data owner in Copenhagen can then easily close a datadeal and share access to her data with a programmer in Amsterdam.
Work to be done: datadeals
Dexes is working on a standard to describe “datadeals”, which includes a standard to describe conditions for sharing data. A datadeal is an agreement that is closed when the conditions of the data owner are validated against the attributes of the reuser. It records that, at a certain moment in time, the conditions for reuse of a certain dataset were positively validated. The datadeal is an irrefutable piece of information that both data owner and reuser can use as a reference for their agreement to share data. We will tell you more about datadeals in a following article.
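As a rough illustration of what such a record could contain, here is a minimal Python sketch. It is not the Dexes standard, which is still to be drafted: the function close_datadeal and all field names are hypothetical. It assumes that conditions and reuser attributes are plain strings, that a deal is closed only when every condition is matched, and that a SHA-256 fingerprint over the record is enough to detect later changes. A real standard would probably need digital signatures from both parties to make the deal truly irrefutable.

```python
import hashlib
import json
from datetime import datetime, timezone


def close_datadeal(dataset_id, owner, reuser, conditions, reuser_attributes, now=None):
    """Return a datadeal record if every condition is met, otherwise None.

    All field names are illustrative assumptions, not a published format.
    """
    if not all(c in reuser_attributes for c in conditions):
        return None  # validation failed: no deal is closed

    deal = {
        "dataset": dataset_id,
        "owner": owner,
        "reuser": reuser,
        "conditions": sorted(conditions),
        # The moment at which the conditions were positively validated.
        "validated_at": (now or datetime.now(timezone.utc)).isoformat(),
    }
    # Fingerprint over a canonical JSON form, so that both parties can
    # check that the record they hold has not been altered afterwards.
    canonical = json.dumps(deal, sort_keys=True, separators=(",", ":"))
    deal["fingerprint"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return deal


when = datetime(2020, 5, 1, tzinfo=timezone.utc)
deal = close_datadeal(
    "tunnel-heights", "owner@example.org", "dev@example.org",
    conditions=["signed-disclaimer"],
    reuser_attributes={"signed-disclaimer", "email-provided"},
    now=when,
)
print(deal["validated_at"])  # 2020-05-01T00:00:00+00:00

rejected = close_datadeal(
    "tunnel-heights", "owner@example.org", "dev@example.org",
    conditions=["signed-disclaimer"],
    reuser_attributes=set(),
    now=when,
)
print(rejected)  # None
```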
To describe the datadeal and the conditions for sharing data, we gratefully make use of existing ideas in Creative Commons, data commons and ODR. We will publish the first draft in May 2020 and propose it to our working group. Then we should be able to describe the 50 shades of shared data in a standard that is readable for man and machine.
Maybe we did not find all possible input for our work on describing conditions for data sharing. Do you have a suggestion for us? Want to join our effort to standardise the description of conditions for sharing data? Please contact us.