Open data, the digital equivalent of letting visitors in for free

This is a repost of a blog post on licensing that I wrote as the External Relationships Manager for the Research and Education Space (RES). It looks at how best to license your catalogue data so that your collections are more visible via products built on RES, as well as other open products, like Wikipedia and Wikimedia Commons. The original is here.

The central ambition of RES is to make it easier for teachers, academics and students to access materials held in publicly funded collections and archives.

When the BBC and its partners, Jisc and Learning on Screen (the British Universities and Colleges Film and Video Council), were designing RES, we realised that, alongside the technical architecture of the platform, getting the licensing regime right would be a vital part of the project.

The RES licensing regime would need to balance two potentially competing demands:

  • Offering archive and collections holders enough protection to give them the confidence to publish their data in a RES-compatible format.
  • Making it straightforward for people building things on top of RES to access the collections and archives indexed by the platform.

The approach the project took was to distinguish between the licences that apply to the catalogue data, which describes collections, and the licences that apply to the collections themselves. The RES platform will only index catalogue data that is openly licensed, under one of the licences listed at the bottom of this post. But we recognise that it’s up to each institution how it licenses the assets in its collections.

In this blog post we examine the issues around licensing collections data; in a subsequent post we will look at issues around licensing the assets themselves. (By collections data we mean information about a digitised object: its title, descriptive text, URL, technical information and, preferably, relevant links.)
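To make that definition concrete, here is a minimal sketch of what a single collections data record might look like. The class name, field names and example values are all assumptions made for illustration; they are not a RES schema or format.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionRecord:
    """Hypothetical catalogue record; field names are illustrative only."""

    title: str
    description: str
    url: str
    # Technical information about the digitised object, e.g. its file format.
    technical_info: dict[str, str] = field(default_factory=dict)
    # Links to related people, places or objects, where available.
    related_links: list[str] = field(default_factory=list)


# Example values are made up and point at a placeholder domain.
record = CollectionRecord(
    title="Example object",
    description="A short descriptive text about the digitised object.",
    url="https://example.org/objects/1",
    technical_info={"format": "image/jpeg"},
    related_links=["https://example.org/people/1"],
)
```

Note that the record describes the object and points to it; it does not contain the object itself, which is why the two can be licensed separately.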

The RES team recently interviewed Chris Michaels, [then] Head of Digital and Publishing at The British Museum, for a short film about why major institutions are publishing data about their collections under an open licence. This was his main reason:

“The British Museum’s job is to share its collection with everyone in the whole world, and just as we do our job to put all our objects on display … now our job is to share our data about our objects with everyone and linked open data is the best way to do that.”

So the argument for publishing your data under an open licence is that it is the digital equivalent of allowing free entry to your institution: it gives access to as many people as possible.

“The original logo of the web had a lovely byline which was: ‘Let’s share what we know’. In many ways, I would hope cultural institutions would be willing to embrace that philosophy”,

remarked Tom Scott, Head of Digital Engagement at the Wellcome Trust, when interviewed for the film. He points out that publishing data under an open licence is integral to one of the founding philosophies of the web, a philosophy shared by the cultural sector.

“Each institution should put as much information about their collection online as possible … then we can link them together, we can make them more available to researchers and students, for the general public to follow their interests”,

explained Dr Mia Ridge, Digital Curator at the British Library.

Dr Ridge also points out that publishing the British Library’s collections data under an open licence makes its collection more valuable and useful, as it can be linked to other collections. The data can also be reused by other people, which helps the British Library meet its educational obligations: people in education can find the data they need at their fingertips rather than having to spend time searching for it themselves.

We’re obviously not the first to grapple with issues around the licensing of data:

‘The Milkmaid’, one of Johannes Vermeer’s most famous pieces, depicts a scene of a woman quietly pouring milk into a bowl. During a survey the Rijksmuseum discovered that there were over 10,000 copies of the image on the internet – mostly poor, yellowish reproductions. As a result of all of these low-quality copies on the web, according to the Rijksmuseum, “people simply didn’t believe the postcards in our museum shop were showing the original painting. This was the trigger for us to put high-resolution images of the original work with open metadata on the web ourselves. Opening up our data is our best defence against the ‘yellow Milkmaid’.”

In analysing the Rijksmuseum’s response to ‘The Problem of the Yellow Milkmaid’, as illustrated above in an image taken from a presentation by Lizzy Jongma, Databeheerder (data manager) at Rijksmuseum Amsterdam, Europeana came up with a list of 10 benefits to publishing data under an open licence, which can be found on page 14 of the attachment: Whitepaper_2-The_Yellow_Milkmaid.pdf linked to here.

Which licence should I use?

There are many open licences out there on the Web. What’s suitable for your collection may not be suitable for everyone. Note that RES does not accept licences with the Creative Commons ‘NonCommercial’ module, and neither do other open platforms like Wikipedia and Wikimedia Commons, as these are not considered to be open licences.
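If you record the licence of each dataset as a Creative Commons short code (for example ‘CC BY-SA 3.0’), the NonCommercial rule can be checked mechanically. The sketch below is an illustration only: the function name is hypothetical, it assumes licences are stored as short codes, and it tests for the ‘NC’ module and nothing else, so it is not a complete test of whether a licence is accepted by RES.

```python
def has_noncommercial_module(licence_code: str) -> bool:
    """Return True if a CC short code such as 'CC BY-NC-SA 4.0' includes NC.

    Assumes the licence is given as a Creative Commons short code; the
    function name and input format are illustrative, not part of RES.
    """
    # Split 'CC BY-NC-SA 4.0' into ['CC', 'BY', 'NC', 'SA', '4.0'].
    parts = licence_code.upper().replace("-", " ").split()
    return "NC" in parts


# The licence this post is published under has no NonCommercial module.
print(has_noncommercial_module("CC BY-SA 3.0"))
# A NonCommercial licence is flagged.
print(has_noncommercial_module("CC BY-NC 4.0"))
```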

“One of the questions that I was asked when I was talking to people internally about RES was: Will they be taking copies of our content? And I could say no, all they will be doing is making an index to that content … that made it a lot easier to get approval internally”, 

said Dr Ridge. This neatly illustrates the difference between openly licensing collections data and licensing the assets themselves. That’s an issue we’ll be looking at in a subsequent post.

If you would like to know more about open data and how it might be useful to you, The Open Data Institute offers many examples of how open data is being used to innovate; and this blog post on the perils of making data free, but not open is a pertinent reminder of the importance of getting your licensing right.

Meanwhile our partner Jisc has published a guide to making your collection available for learning.

List of open licences accepted by RES:

The text of this blog is published freely under the following Creative Commons licence: Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0).

