The need for DataDeals: sharing information conditionally
author: Joep Meindertsma
Sharing data is what the internet is all about, whether it’s webpages, images, videos or JSON files. Although just viewing other people’s data is often very easy (just enter the URL in your browser), legally re-using it can quickly get complicated. If you want to re-use someone else’s creation, you have to deal with intellectual property rights and licenses.
To stimulate data re-use, an increasing amount of data is shared under permissive licenses (CC-BY or Public Domain, for example). A license like this turns data into Open Data. Many of the apps and online services that we all use would not be possible without these licenses. But because these licenses are extremely permissive, they do not fit all use cases. Many people value their data, and only want to share it if they’re paid.
At Dexes, one of our core ambitions is to increase data re-use. If we can make dealing with licensing easier and faster, data re-use will increase as a result. We think this can be achieved by tackling two problems: implicit agreements and the lack of machine-readable licenses.
Licenses are often agreed to implicitly. Some text field on the web page that hosts the data tells you: this is the license, deal with it. There’s often nothing that you have to sign. These licenses are not one-to-one, but one-to-many. That is great for many open data sets, where maximizing re-use is the only goal, but the lack of verifiable agreements can create issues. In a legal dispute, how do you prove that the license was there when the data was used? What if the license was added after you used the data? This is where explicit agreements can save the day, especially when they’re cryptographically proven with a digital signature. The data user now knows that they can’t be sued as long as they keep their end of the deal, and the data sharer now has evidence of what the user agreed to. Having a signed agreement removes uncertainty, and that’s a good thing for both parties involved.
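To make the idea of a verifiable agreement a bit more concrete, here is a minimal Python sketch. It binds an agreement record to the exact license text through a SHA-256 digest, then authenticates the record. Note the assumptions: the field names, the example URL and the key are all hypothetical, and the sketch uses an HMAC from the standard library as a stand-in for a real digital signature. A real system would use public-key signatures, so that each party signs with their own private key and anyone can verify.

```python
import hashlib
import hmac
import json


def license_digest(license_text: str) -> str:
    """Return a SHA-256 hex digest that identifies the exact license text."""
    return hashlib.sha256(license_text.encode("utf-8")).hexdigest()


def canonical_bytes(agreement: dict) -> bytes:
    # Sorted keys and fixed separators, so the same agreement
    # always serializes to the same bytes.
    return json.dumps(agreement, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_agreement(agreement: dict, key: bytes) -> str:
    # Stand-in for a digital signature: an HMAC-SHA256 tag over the agreement.
    return hmac.new(key, canonical_bytes(agreement), hashlib.sha256).hexdigest()


def verify_agreement(agreement: dict, tag: str, key: bytes) -> bool:
    # Constant-time comparison of the expected tag and the presented tag.
    return hmac.compare_digest(sign_agreement(agreement, key), tag)


# Hypothetical example values, for illustration only.
license_text = "You may re-use this dataset if you attribute the source."
agreement = {
    "data_url": "https://example.org/dataset.json",
    "license_sha256": license_digest(license_text),
    "user": "alice@example.org",
    "agreed_at": "2020-01-01T12:00:00Z",
}
key = b"demo-key-do-not-use"
tag = sign_agreement(agreement, key)
```

Because the agreement stores a digest of the license text, a license that is changed or added later no longer matches the record, and any change to the record itself makes verification fail.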
Most licenses are only written in human language instead of some machine-readable format. Human language is the de facto standard for legal documents such as licenses, of course, but computers can’t really understand it. This means that computers can’t use their search capabilities to find licenses with specific characteristics. Having a machine-readable license could not only help with finding the right data, it could also improve the UX of understanding a license. Instead of having to read a large piece of text, we could show a couple of checkboxes. Of course, these would have to correspond to legal text, but still: it would help to make licenses more modular, and in turn easier to understand. And last but not least, machine-readable licenses could help users to comply with the conditions. For example, if the license requires attribution while sharing, software could include the right attribution while displaying the data. Or if re-sharing is not permitted, the share button might show an alert.
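As a rough illustration of what machine-readable license terms make possible, the Python sketch below models a license as a few boolean flags (the "checkboxes"), searches a catalog by those flags, and lets display code follow the conditions. The class names, fields and sample datasets are hypothetical and not part of any standard; real license terms are richer than three flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    name: str
    requires_attribution: bool
    allows_resharing: bool
    requires_payment: bool


@dataclass(frozen=True)
class Dataset:
    title: str
    author: str
    license: License


def find_datasets(datasets, **wanted):
    """Return the datasets whose license matches every requested flag."""
    return [
        d for d in datasets
        if all(getattr(d.license, flag) == value for flag, value in wanted.items())
    ]


def display_caption(dataset: Dataset) -> str:
    # Add the attribution automatically when the license asks for it.
    if dataset.license.requires_attribution:
        return f"{dataset.title} (by {dataset.author}, {dataset.license.name})"
    return dataset.title


def can_share(dataset: Dataset) -> bool:
    # A share button could check this and show an alert when it is False.
    return dataset.license.allows_resharing


# Hypothetical catalog, for illustration only.
cc_by = License("CC-BY", requires_attribution=True, allows_resharing=True, requires_payment=False)
public_domain = License("Public Domain", requires_attribution=False, allows_resharing=True, requires_payment=False)
paid = License("Paid, no re-sharing", requires_attribution=True, allows_resharing=False, requires_payment=True)

catalog = [
    Dataset("Bus stops", "City Transit", cc_by),
    Dataset("Old maps", "Archive", public_domain),
    Dataset("Sales figures", "Acme", paid),
]

free_and_shareable = find_datasets(catalog, requires_payment=False, allows_resharing=True)
```

Here `free_and_shareable` contains the first two datasets, and `display_caption(catalog[0])` includes the author and license name because CC-BY requires attribution.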
We’re not the first to realize these problems exist, and that the solution is mostly about standardization. The W3C (the organization behind many of the web standards) has two projects that are relevant to the problems described above.
The Verifiable Credentials spec standardizes how cryptographically verifiable signatures could be represented, and is a good candidate for facilitating explicit agreements. The Open Digital Rights Language (ODRL) spec standardizes concepts such as policies, rules, and permissions for digital rights.
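To give a feel for what an ODRL policy can look like, here is a small Python sketch that builds one as a dictionary and serializes it to JSON. It is an illustration under assumptions, not a validated policy and not how Dexes models its agreements: all `example.org` identifiers are made up, and the structure follows our reading of the ODRL Information Model (an Agreement with a permission to use the data, a duty to attribute, and a prohibition on distributing it).

```python
import json

# Hypothetical identifiers, for illustration only.
DATASET = "https://example.org/dataset.json"
OWNER = "https://example.org/party/data-owner"
USER = "https://example.org/party/data-user"

policy = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Agreement",
    "uid": "https://example.org/policy/1",
    "permission": [
        {
            "target": DATASET,
            "assigner": OWNER,
            "assignee": USER,
            "action": "use",
            # The user must attribute the source when using the data.
            "duty": [{"action": "attribute"}],
        }
    ],
    "prohibition": [
        {
            "target": DATASET,
            "assigner": OWNER,
            "assignee": USER,
            # Re-sharing the data is not allowed under this agreement.
            "action": "distribute",
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Because the policy is plain structured data, software can both render it for a human and check it before enabling an action such as re-sharing.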
Later this year, Dexes will be releasing open source software that helps to make the process of sharing data easier by providing a simple interface to create licenses and explicit agreements.
Do you want to know more about data sharing or data marketplaces? Please contact us at email@example.com.