EU bill on data mining lacks ambition
By Nick Wallace
European researchers have become frustrated in recent years by the restrictions European copyright laws put on their freedom to use text and data mining - two automated techniques for analysing data - on resources they can legally access and analyse by non-automated means.
As part of its recent proposals to reform copyright laws, the European Commission has recommended lifting these restrictions, but only for academics. This is a good first step, but the EU should also allow everyone to take advantage of these more efficient and effective data-driven research methods.
The commission’s proposal is good for European researchers in a wide range of disciplines, from bioinformatics to digital humanities. For scholars and scientists, access to the rigorously scrutinised work of their peers, such as academic journals and databases, has always been a vital resource.
Researchers who subscribe to these sources can explore them using traditional keyword searches and meta-tags predefined by publishers, but these methods have serious limitations.
Manually reviewing all of these sources is a slow and tedious process, the results of which are often inaccurate and incomplete.
Text and data mining is a powerful tool that allows researchers to plough into texts and datasets and interpret minute details.
Data mining gives researchers the ability to not only find a needle in a haystack, but to quickly find and categorise all manner of small objects hidden in many hundreds or thousands of haystacks.
For example, medical researchers can use technologies like natural language processing to quickly analyse the outcomes of thousands of clinical trials.
This type of analysis supports efforts to develop data-driven precision medicine initiatives that use the latest evidence to deliver personalised treatments.
Data mining cannot provide all of the insights gained from human experts closely studying texts, but it does allow researchers to use rapidly developing tools to draw on a much larger pool of literature and data to support their work.
The use of data mining on copyrighted material often falls foul of existing intellectual property laws because the technical process involves extracting data from its original source and copying it into another database for analysis.
The proposed exemption is reasonable because it creates a special dispensation for data mining and does not alter other laws that prohibit the unauthorised extraction or reproduction of copyrighted works.
After all, there is nothing illegal about “mining” databases manually; this technology only automates the process.
A researcher could legally sift through many thousands of published works, note their findings with pen and paper, and then analyse the assembled notes. Anyone with lawful access may do this by hand, so the automated equivalent should be legal for anyone too. This is why an exemption for academics is not enough.
Copyright law should allow publishers to set the subscription fees for access to their content, prohibit unauthorised reproductions of their content, and receive appropriate compensation. But it should not require people with lawful access to content, such as paid subscribers, to seek approval from publishers for using automated research methods.
Some member states - such as the United Kingdom - have already implemented similar (and similarly inadequate) exceptions. But national legislation is insufficient; the issue should be tackled at the EU level, because research is often cross-border.
Researchers and sources are spread across different countries. Unless the same rule applies throughout Europe, cross-border research of this kind is very difficult. For example, Europeana.eu is an online repository of books, films, art, and other materials that have been digitised in various member states.
A researcher could legally mine this archive from the UK while a colleague elsewhere could not - or the former could inadvertently commit a crime by mining a resource in the latter’s country.
Not far enough
The commission’s proposal does not go far enough, but it does address this problem for the academic community.
If the commission is serious about building a Digital Single Market, then it should introduce a rule change that applies to everybody, not just academics.
If everyone had this freedom, Europe would enjoy far greater opportunities for data-driven innovation in several sectors. Nevertheless, this exemption could be the first step towards a rule that covers everyone, so the Council and the Parliament should support it.