Blog posts

#LoveData24: Interview with Yves-Alexandre from the Computational Privacy Group

As part of #LoveData24, the Research Data Management team had a chance to catch up with Yves-Alexandre, Associate Professor of Applied Mathematics and Computer Science at Imperial College London, who also heads the Computational Privacy Group (CPG). The CPG are a young research group at Imperial College London studying the privacy risks arising from large-scale behavioural datasets. In this short interview we discussed the interests of the group, the challenges of managing sensitive research data and whether we need to reevaluate what we think we know about anonymisation.

How did you become involved with the Computational Privacy Group (CPG)?

Yves-Alexandre de Montjoye (YD)
So, my career as a researcher started when I was actually doing my master’s thesis at the Santa Fe Institute in New Mexico. That was in 2009. This was pre-A.I. and the beginning of the Big Data era. People were extremely excited about the potential for working with large amounts of data to revolutionise the sciences, ranging from social science to psychology, to urban analytics or urban studies and medicine.

So many things suddenly became possible and people were like, “this is the microscope”, or any other kind of analogy you can think of in terms of this being a true revolution for the scientific process. Some even went as far as saying, “this is the end of theory, right?” or “this spells the end of hypothesis testing. The data are going to basically speak for themselves”. There was a huge hype of expectation which, as time went on, gradually decreased and eventually plateaued to what it is now. It did have a transformative impact on the sciences, but working with these data as a student, it became quite obvious to me just how reidentifiable all these types of data potentially were.

Back in the day, we were looking at location data across the country and on the one hand, everyone was talking about how the data were anonymous. As a student, I was working with the data and I could see people moving around on the map, so to speak. And it just blew my mind. It didn’t seem like it would take very much for these data to not be anonymous anymore.

Anonymisation and the way we’ve been using it to protect data have been well documented in the literature. There has also been extensive research on how to properly anonymise data. I think what has taken a bit of time for people to grasp is that anonymisation, in the context of big data, is its own new, different question, and that actually a lot of the techniques that had been developed from around 1990 to 2010 were basically not applicable to the world of big data anymore.
This is mostly due to two factors. The first one is just the sheer amount of data that is being collected about every single person in any given dataset that we are interested in, from social science to medicine.
Combining these with social media and the availability of auxiliary data (meaning data from an external source, such as census data) means that not only are there a lot of data about you in those datasets, but there are also a lot of data about you that can be cross-referenced with sources elsewhere to reidentify you. And I think what took us quite a bit of time to get across to people was that this was a novel and unique issue that had to be addressed. It’s really about big data and the availability of auxiliary data. I think that’s really what led a lot of our research into privacy. Regarding anonymisation, we are interested in the conversation around whether there is still a way to make it work as intended given everything we know, or whether we need to invent something fundamentally new. If that is the case, what should our contribution to a new method look like?

At the end of the day, I think the main message that we have is that anonymisation is a powerful guarantee because it is basically a type of promise that is made to you that the data are going to be used as part of statistical models, et cetera, but they’re never going to be linked back to you.

The challenge lies in the way we go about achieving this in practice. Deidentification techniques and principles such as k-anonymity are (unfortunately) often considered a good way of protecting privacy. These techniques, which basically take a given dataset and modify it in one way or another, might have been considered robust enough when they were invented in the 90s and 2000s, but because of the world we live in today and the amount of data available about every single person in those datasets, they basically fall short.
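To make the term concrete: a table is k-anonymous with respect to a chosen set of quasi-identifiers if every combination of their values is shared by at least k records. The following is a minimal Python sketch of that check, offered as an illustration rather than as anything the CPG uses. The records, the column names and the helper `k_anonymity` are all invented for the example.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    values for the given quasi-identifier columns."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values())

# Toy, invented records: no real data.
records = [
    {"postcode": "SW7", "birth_year": 1985, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "SW7", "birth_year": 1985, "sex": "F", "diagnosis": "flu"},
    {"postcode": "SW7", "birth_year": 1990, "sex": "M", "diagnosis": "flu"},
    {"postcode": "W12", "birth_year": 1990, "sex": "M", "diagnosis": "asthma"},
]

# Grouping on a single coarse attribute leaves everyone in a group of two.
k_coarse = k_anonymity(records, ["sex"])
# Grouping on three attributes at once isolates individual records.
k_fine = k_anonymity(records, ["postcode", "birth_year", "sex"])
print(k_coarse, k_fine)
```

In this toy table the value of k drops as soon as more columns are treated as quasi-identifiers, which is the weakness described above: the more that is known about each person, the smaller the groups they can hide in.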

There is a need for a real paradigm shift in terms of what we are using and there are a lot of good techniques out there. Fundamentally, the question comes down to what is necessary for you to make sure that the promise of anonymisation holds true, now and in the future.

Yves-Alexandre de Montjoye

Could you talk to me a little bit about the Observatory of Anonymity and what this project set out to achieve? And then as a second part of that question, are there any new projects that you’re currently working on?

YD: The Observatory of Anonymity comes from a research project published by a former postdoc of mine, and the idea is basically to demonstrate to people, with very specific examples, how little it takes to potentially reidentify someone.

Fundamentally, you could spend time trying to write down the math to make sense of why, for a certain number of reasons, a handful of pieces of information are going to be sufficient to link back to you. The other option is to look at an actual model of the population of the UK. As a starting point, we know that there are roughly 66 million people living across the country. Even if you take London, there are still 10 million of us. And yet, as you start to focus on a handful of characteristics, you begin to realise very quickly that those characteristics, when put together, are going to make you stand out. A significant fraction of the time, that can mean that you will be the only person in the whole country to match that set of characteristics.
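The effect described here can be reproduced on purely synthetic data. The sketch below is an illustration only and is not the Observatory of Anonymity's model: it draws a hypothetical population whose attributes are independent and uniformly random (the attribute names, the number of values each can take and the population size are all assumptions), then measures what fraction of people are the only match as attributes are combined one by one.

```python
import random
from collections import Counter

# Hypothetical attributes and the number of distinct values each can take.
ATTRIBUTES = {
    "birth_year": 80,
    "postcode_district": 200,
    "sex": 2,
    "occupation": 50,
}

def synthetic_population(size, seed=0):
    """Draw `size` people with independent, uniformly random attributes."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(n_values) for n_values in ATTRIBUTES.values())
        for _ in range(size)
    ]

def fraction_unique(population, n_attributes):
    """Fraction of people who are the only match on the first n attributes."""
    keys = [person[:n_attributes] for person in population]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(population)

population = synthetic_population(100_000)
fractions = [
    fraction_unique(population, n) for n in range(1, len(ATTRIBUTES) + 1)
]
for name, fraction in zip(ATTRIBUTES, fractions):
    print(f"up to and including {name}: {fraction:.1%} unique")
```

Someone who is unique on a set of attributes stays unique when another attribute is added, so the fraction can only grow as the list gets longer. Real attributes are neither independent nor uniform, so the numbers printed here say nothing about the actual UK population; they only show the mechanism.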

The interesting part is what do we do?

You’re working within Research Data Management and your team are increasingly dealing with sensitive data and the question of how they can be safely shared.
Clearly there are huge benefits to data being shared in science, in terms of verifying research findings and reproducing the results and so on. The question is how do we go about this? What meaningful measures can you put in place to ensure that you sufficiently lower the risks of harmful disclosure in such a way that you know that the benefit of sharing these data will clearly outweigh those risks?

I think from our perspective, it’s really about focusing on supporting modern privacy-ensuring approaches that are fit for purpose. We know that there is a range of techniques, from controlled access to query-based systems to some of the encryption techniques, that, depending on the use case, who needs to access your data, and the size of your dataset, would allow someone to use your data, run analysis, and replicate your results fully without endangering people’s privacy. For us, it’s about recognising the right combination of those approaches and how we develop some of these tools and test them.

I think there has been a big push towards open data, under the de-identification model, for very good reasons. But this should continue to be informed by considerations around appropriate modern tools, to safeguard data while preserving some utility. Legally at least, you cannot not care about privacy and if you want to care about privacy properly, this will affect the utility. So we need to continue to handle questions around data sharing on a case-by-case basis rather than imply that everything should be fully open all of the time. Otherwise this will be damaging to the sciences and to privacy.

Yes, it is important to acknowledge that the tension between privacy and utility of research data exists and that a careful balance needs to be struck, but this may not always be possible to achieve. This is something that we try to communicate in our training and advocacy work within Research Data Management services.
We have adopted a message that can hopefully be helpful (and which originated from Horizon Europe[1]), which states that open science operates on the principle of being ‘as open as possible, as closed as necessary’. In practice this means that results and data may be kept closed if making them open access is against the researcher’s legitimate interests or obligations to personal data protection. This is where a mechanism such as controlled access could play a role.

YD: Just so. I think you guys have quite a unique role to play. A controlled access mechanism that allows a researcher to run some code on someone else’s data without seeing the data, on the other hand, requires systems of management, authorisation and verification of users, et cetera. This is simply out of the reach of many individual researchers. As a facility or as a form of infrastructure, however, this is actually something that isn’t too difficult to provide.
I think France has something called the CASD, which is the Center for Secure Access to Data (or Centre d’Accès Sécurisé aux Données) and this is how the National Institute of Statistics and Economic Studies (INSEE) is able to share a lot of sensitive data. Oxford’s OpenSAFELY in the UK is another great example of this. They are ahead in this regard. We need similar mechanisms when it comes to research data to facilitate replicability, reuse and for validating and verifying results. It is absolutely necessary. But we need proper tools to do this and it’s something that we need to tackle as a collective. No individual researcher can do this alone.

What in your experience are common misconceptions around anonymisation in the context of research data?

YD: I think the most common misunderstanding is a general underestimation of the scale of data already available. Concerns often revolve around a notion of, could someone search another person’s social media and deduce a piece of information to reidentify them in my medical dataset? In the world of big data, I would argue that what we strive to protect against also includes far stronger threat models than this.
We had examples in the US in which you had right-wing organisations with significant resources buying access to location data and matching them manually, potentially at scale, with travel records and other pieces of information they could find about clerics, to potentially identify them in this dataset in an attempt to see if anyone was attending a particular seminar[2].

We had the same with Trump’s tax record. Everyone was searching for the tax record and it turns out that it was available as part of an ‘anonymous’ dataset, made available by the IRS and again these were data that were released years and years ago.
They remained online and then suddenly they’re an extremely sensitive set of information that you can no longer meaningfully protect.

This goes back to what you were saying again about anticipating how certain techniques could be used in the future to potentially exploit these data.

YD: Actually, on this precise point, we know from cryptography that good cryptographic solutions are actually fully open and that the cryptographic solution is solid. I can describe to you the entire algorithm. I can give you the exact source code. The secrecy is protected by the process but the process itself is fully open.
If the security depends on the secrecy of your process, often you’re in trouble, right? And so a good solution actually doesn’t rely on you hiding something, something being secret, or you hoping that someone is not going to figure something out. And I think that this is another very important aspect.
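As a small illustration of this principle, here is a sketch in Python using HMAC-SHA256 from the standard library. It is an example chosen for this post, not something discussed in the interview: the algorithm is published and standardised, the source code is open, and the only secret is the key. The message and the helper name `tag` are hypothetical.

```python
import hashlib
import hmac
import secrets

def tag(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag; the algorithm is public, the key is not."""
    return hmac.new(key, message, hashlib.sha256).digest()

key = secrets.token_bytes(32)        # the only secret in the system
other_key = secrets.token_bytes(32)  # stands in for an attacker's guess
message = b"results-table-v1"        # hypothetical message

genuine = tag(message, key)
# Knowing the algorithm and the message is not enough to reproduce the tag.
forged = tag(message, other_key)

same_key_matches = hmac.compare_digest(genuine, tag(message, key))
wrong_key_matches = hmac.compare_digest(genuine, forged)
print(same_key_matches, wrong_key_matches)
```

Nothing about how the tag is computed needs to be hidden, which is why the procedure can be documented, scrutinised and improved in the open.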

And this perhaps goes back again, to the type of general misunderstandings which sometimes arise where someone might assume that because some data have to be kept private, as you were saying, that the documentation behind the process of ensuring that security also has to be kept private, when in fact you need open community standards that can be scrutinised and that people can build upon and improve. This is very relevant to our work in supporting things like data management plans, which require clear documentation.

We have reached our final question: There is arguably a tendency to focus on data horror stories to communicate the limitations of anonymisation (if applied for example without a proportionate risk-based approach for a research project). Are there positive messages we can promote when it comes to engaging with good or sensible practice more broadly?

YD: In addition to being transparent about developing and following best practices as we have just talked about, I think there need to be more conversations around infrastructure. To me, it is not about someone coming up with and deploying a better algorithm.
We very much need to be part of an infrastructure building community that works together to instill good governance.

There are plenty of examples already in existence. We worked, for example, a lot on a project called Opal which is a great use case of how we can safely share very sensitive data for good. I think OpenSAFELY is another really good case study from Oxford, as is the CASD in France, as I already mentioned.
These case studies offer very pragmatic solutions, but are an order of magnitude better, both from the privacy and the utility side, than any existing legacy solutions that I know of.

[1] https://rea.ec.europa.eu/open-science_en

[2] https://www.washingtonpost.com/religion/2021/07/21/catholic-official-grindr-reaction/

Useful links:

CASD

https://www.casd.eu/en/le-centre-dacces-securise-aux-donnees-casd/gouvernance-et-missions/

CPG

https://cpg.doc.ic.ac.uk/

Introduction to research data management

https://www.imperial.ac.uk/research-and-innovation/support-for-staff/scholarly-communication/research-data-management/introduction-to-research-data-management/

Opal project

https://www.opalproject.org/

OpenSAFELY

https://www.opensafely.org/about/

Open Science – European Commission

https://rea.ec.europa.eu/open-science_en

Open Access Week 2023: Imperial’s Research Publications Open Access Policy

This post was written by Ruth Harrison, Head of Scholarly Communications Management at Imperial College London.

After many years of work, the College will soon be able to announce that we are updating our institutional open access policy to allow researchers to make their peer-reviewed journal articles and conference proceedings available on open access under a CC BY licence at the point of publication with no embargo. This will apply to accepted manuscripts, and enable staff and students to retain their right to reuse the content of those outputs in teaching, research and further sharing of their work.  

Why? 

I don’t think many people would disagree with the moral and ethical case for open access to research, and that the principles of open research should be more widely applied. This is a global endeavour – in 2022, UNESCO published its recommendation on Open Science stating: 

“By promoting science that is more accessible, inclusive and transparent, open science furthers the right of everyone to share in scientific advancement and its benefits as stated in Article 27.1 of the Universal Declaration of Human Rights.” 

Open access publishing has existed for more than two decades now, and in the past 10 years, funders have increasingly required open access to the published outputs of research which public money, ultimately, has enabled. In the UK (and internationally) this has resulted in various policies which researchers, libraries and publishers have had to keep track of, and there are now many models through which open access can be achieved. But this also means considerable ‘policy stack’ and confusion, with varying workflows and messaging for researchers to keep up with.  

Introducing a policy through which author rights to their accepted manuscript are retained is a solution to the policy stack. Based on the lead taken by MIT with their open access policy, introduced over a decade ago, and other institutions around the world, within the UK the case has been made that we should adopt the same approach. At Imperial, this began with the introduction of the concept of the UK-SCL – Scholarly Communications Licence – and has now developed into what will be our Research Publications Open Access Policy (RPOAP). Generally such policies are referred to as rights retention policies or strategies, and we will join over 20 other UK universities who have already implemented similar policies, including the universities of Edinburgh, Cambridge, Oxford and Glasgow, as well as Sheffield Hallam, Swansea, Queen’s University Belfast and the N8 institutions.

How does a rights retention policy work? 

There are some key points to make: 

  • Authors will retain copyright over their work 
  • Under the policy, each author grants the College a non‐exclusive, irrevocable, sub-licensable, worldwide licence (effective from acceptance of publication) to make the author accepted manuscript (AAM) publicly available under the terms of a Creative Commons Attribution (CC BY) licence 
  • The right being granted is that of allowing the College to make the accepted manuscript openly available in Spiral without an embargo 
  • The College does not retain the copyright to research outputs – that is waived in favour of academics 
  • The policy applies to peer-reviewed journal articles and conference proceedings
  • There is no restriction on choosing where to publish. 

For the policy to be effectively implemented: 

  • Publishers need to be informed when an institution is going to implement a rights retention policy  
  • On behalf of all staff and students, the College will notify publishers of the policy 
  • There will be a list available of notified publishers. 

What will authors need to do? 

Authors should continue to upload their accepted manuscripts to Symplectic Elements which means for many people, there will be no change in their workflow at acceptance. When an accepted manuscript is received, the Library Services open access team will process it including managing any accompanying APC (article processing charge) application.  

We would recommend that authors: 

  • familiarise themselves with the RPOAP when it is published 
  • consult the list of notified publishers when they are preparing a manuscript for submission – this will be available in the next few weeks 
  • use our publisher agreements search tool to find out if the Library Services has covered the cost of open access publishing for the version of record 
  • upload their accepted manuscripts (or a link to where a copy is already deposited, such as arXiv or another institutional repository) as soon as they can after acceptance 

What’s next? 

When the policy implementation date is agreed by University Management Board, there will be further communications across College, and contact information and guidance will be available online at the Scholarly Communication website. This will include the list of notified publishers, and advice on what to do if your intended publisher is not on that list. And it is not only staff who will be able to take advantage of the policy: students are included as well. If you are a student publishing a journal article or conference paper, you will grant and retain the same rights as outlined above. 

In the spirit of this year’s International Open Access Week theme, Community over Commercialisation, the ultimate question is: who decides? Should publishers get to decide what research readers see and what they can do with it, or should it be for the research community to decide for itself? RPOAP answers the question in favour of the community. 

Introducing a new journal search tool for open access publisher agreements

Screenshot of journal title search results for search term energy

We have a new tool available that allows you to search for journals that are included in publisher open access agreements for Imperial College London-affiliated corresponding authors. You can search by journal title, ISSN, or enter a keyword and be provided with a list of journal titles containing that word.

The tool (powered by SciFree) is part of our revamped publisher agreements and discounts webpage, which has also been reformatted for ease of navigation as the number of agreements Imperial is part of has grown. A full list of journals with fully covered APCs (.xls) is also available from the webpage to view in an Excel spreadsheet (Imperial members only).

The search tool allows users to see whether titles are included in agreements that fully cover the open access fee, offer a discount, or whether they are not covered but you can apply to the Imperial Open Access Fund (see the three examples below). Each of these icons links to instructions or further information for the relevant option.

 

Screenshot showing search results table with columns: Journals, Included in agreements, License option, Publishing model

 

The results also give the default open access license for the journal, and whether it is a fully open access journal, or hybrid (a subscription journal offering an open access option).

Also featured are links to the Directory of Open Access Journals (DOAJ), and an embedded version of the Plan S Journal Checker Tool (JCT). Journals listed in DOAJ are eligible for the Imperial Open Access Fund, so if your chosen journal is not part of a publisher agreement, but is listed in DOAJ, you should apply to the Imperial Fund. (Eligibility also requires that you have no access to alternative funding for open access, and that the paper is a research article). The Plan S JCT allows authors with UKRI or Wellcome Trust funding to check their options for meeting their funder’s open access requirements. Contact the open access team at openaccess@imperial.ac.uk if you need any help interpreting the search results.

Screenshot showing link to search DOAJ, and embedded Plan S journal checker search tool

If you want to feed back on whether this search tool was helpful, or access a link to book a one-to-one training session with the open access team, you can use the chat icon at the bottom right of the page. You can also book a training session via our website, or email us at openaccess@imperial.ac.uk

We hope you find this useful!

The changing state of Gold Open Access at Imperial

Publisher Agreements 

As was highlighted by Imperial’s Director of Library Services Chris Banks in her blog post earlier in this International Open Access Week 2022, the past few years have seen a rapid increase in the number of publisher agreements that Imperial College has signed up to. We now have 33 agreements in place that allow for open access (OA) fees to be fully covered for corresponding authors affiliated with Imperial College London at no further cost. 

This has unsurprisingly led to a significant increase in the number of papers being made OA through such agreements. The graph below shows the number of papers covered over the last year via four of the most used Read & Publish agreements that we currently have:

Imperial papers made OA through publisher agreements (1 Oct 2021 – 30 Sep 2022)

This adds up to almost 1000 OA papers from these four agreements alone, which does not include the figures from other publishers we have agreements with such as SAGE, Oxford University Press, Taylor & Francis, and Cambridge University Press.

A shift away from individual APC payments?

As was predicted in an earlier blog post from OA Week 2020, the number of papers being covered through publisher agreements has now overtaken the number of individual Article Processing Charges (APCs) that we pay for from the OA funds that we administer. For the period from 1 October 2021 to 30 September 2022 we paid for a total of 759 APCs, compared to well over 1000 covered through the agreements.

While we have only seen a slight drop in the total number of individual APCs paid for compared to last year, the most significant change has been an ongoing reduction in the number of APCs we have paid for papers in hybrid journals specifically (i.e. subscription journals that have an OA option), as shown in the graph below:

Individual APCs paid for from OA funds

This reduction in individual payments for APCs in hybrid journals should not be attributed to the increase in publisher agreements alone, as changes to funder policies in recent years have also introduced tighter restrictions on hybrid APC payments, and have offered authors alternative routes to compliance via the green OA route through rights retention. However, it is certainly one of the main reasons behind this shift and is a desired outcome in the transition away from a publishing model that allowed for ‘double-dipping’.

Imperial Open Access Fund

As most publisher agreements do not require authors to be funded, they have allowed many papers to be made OA via the gold route that would otherwise not have been eligible. As well as our funder OA block grants, we are also fortunate to be able to offer our authors the Imperial Open Access Fund. This is available for those without alternative funds available, and can be used to pay APCs for original research papers in fully OA journals listed in the Directory of Open Access Journals.

Although some of our publisher agreements do cover fully OA as well as hybrid journals (e.g. Wiley’s), most of them do not, and there are many publishers who exclusively offer fully OA journals with compulsory APCs. This means the Imperial OA Fund continues to have a big part to play in enabling our authors to publish OA and covered 363 APCs in the last year (nearly half of the total amount):

APCs paid for by each fund (1 Oct 2021 – 30 Sep 2022)

For details on Imperial’s current publisher agreements, please see our newly revamped Publisher agreements and discounts page, and for details on our OA funds and how Imperial authors can apply for APC funding please see our Applying for funding page.

 

Springer Nature negotiations

UK higher education institutions along with Jisc are currently in negotiation for a new “read and publish” agreement (also referred to as “transitional” or “transformative” agreements) with the publisher Springer Nature. Our current agreement runs to the end of December 2022 and we are seeking a new agreement that will not only enable us to read the journals covered by the deal, but also enable researchers to publish open access in those journals at no additional cost.

The sector has agreed criteria for our negotiations. Agreements should:

  • Reduce and constrain costs
  • Provide full and immediate open access publishing
  • Aid compliance with funder open access requirements
  • Be transparent, fair, and reasonable
  • Deliver improvements in service, workflows, and discovery

We achieved these aims with last year’s negotiations with Elsevier and are seeking to do so with Springer Nature. In addition to seeking a renewal of the existing Springer Compact agreement which has been running since 2016, we are also seeking to include Nature research journals and Palgrave journals.

If you are reading this and wondering what a “transitional” agreement is, my colleague David Phillips wrote about these in an earlier blog. At the time David noted that we had 11 such agreements in place at Imperial. This has now risen to 33 with fully covered publishing costs plus further agreements which include discounted article processing charges (APCs). Back in 2019, only 9% of sector spend enabled full OA publishing. That figure is now over 80%.

Why are the negotiations criteria important for researchers?

It is worth taking a moment to reflect on the sector criteria and what they mean for academic authors:

Reduce and constrain costs

  • To be sustainable, the costs of reading and publishing cannot continue rising faster than inflation. Back at the turn of the century, under 44% of Imperial’s Library Services budget was spent on content. Today it is closer to 60% and further increases are simply not sustainable either for Imperial or for the sector. Our most recent Jisc negotiations went some way to stem the rise and we need the agreement with Springer Nature to similarly deliver. To illustrate the impact of increasing content prices, the chart below shows the breakdown of expenditure on staff, operations, and content costs.

100% stacked bar chart showing Breakdown of Imperial College Library Expenditure between content, operations and staff 2000 to 2021. In 2000 the cost of content was 45.5% of total budget and in 2021 it was 56.6% having reached 59% in 2020

Provide full and immediate open access publishing

Aid compliance with funder open access requirements

  • One of the questions that libraries frequently get asked is what authors should do to ensure both that they meet funder obligations and that their research outputs are eligible for the Research Excellence Framework – the REF. Our agreement with Springer Nature needs to enable both, affordably.

Be transparent, fair, and reasonable

  • As researchers you have secured the grant funding, you have assembled the team, drawn up the protocols, undertaken the research, undertaken the analysis and written up the findings. You then undertake the peer review. All of the above without payment from the publisher. You may also act as editors for journals, often on a voluntary basis with no compensation. Libraries then pay the publisher for publishing and content provision services. We need those payments to be transparent, fair and reasonable, reflecting the contribution researchers already make to the system.

Deliver improvements in service, workflows, and discovery

  • We are in a transition from paying for content to paying for publishing services on behalf of researchers. It is really important that those services are efficient for all parties, otherwise we simply introduce additional administrative costs into the system. For authors, time spent battling a clunky submissions system or an unclear or conflicting publishing contract, especially in processes which involve back and forth with libraries, takes time away from your research activities as well as adding to admin burdens.
  • It is of course vital that research is discoverable for it to be built on and to have impact.

What next?

Negotiations

Researchers can continue to publish in SN journals and meet both funder OA obligations and have REF eligibility

“It is the intention that the UK higher education funding bodies will consider a UKRI open access compliant publication to meet any future national research assessment open access policy without additional action from the author and/or institution”

  • To be sure that your research output both meets funder requirements and is eligible for the next REF, we advise that you insert the following Rights Assertion Statement on all submitted articles (not just Springer Nature):

“For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising”

If you have questions or want further information

For other activities during #OAWeek2022 see this post by my colleague John Murtagh.

About me: I am Director of Library Services at Imperial College London. My profile is here and you can find me on Twitter @ChrisBanks. I have an ORCiD and you can get yours here.

Open Access Week 2022 (24-30 October 2022)

This year’s International Open Access Week takes place from 24–30 October, and the theme is Open For Climate Justice.

This year’s theme seeks to encourage connection and collaboration among the climate movement and the international open community. Sharing knowledge is a human right, and tackling the climate crisis requires the rapid exchange of knowledge across geographic, economic, and disciplinary boundaries.

12 Month Highlights

At Imperial College London, we provide advice and guidance on an ever more rapidly changing open access landscape. The last 12 months have seen:

    • the successful results of our REF 2021 submission. A significant proportion of published research was made available on open access as a result of the 2018 REF OA policy to deposit the manuscript within 3 months of acceptance into a repository.
    • the start of a new UKRI open access policy from 1 April 2022 which requires immediate open access, without any embargo, under an open licence which applies to peer-reviewed research articles submitted for publication on or after 1 April 2022. We created a UKRI Open Access Policy YouTube video to explain the workflow.
    • the start of a new NIHR open access policy from 1 April 2022 which requires immediate open access, without any embargo, under an open licence which applies to peer-reviewed research articles submitted for publication on or after 1 April 2022
    • for published research that is funded by UKRI, Wellcome Trust, NIHR, and Horizon Europe, a new Rights Retention Statement requirement on submissions: “For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) license to any Author Accepted Manuscript version arising”
    • the increase to 33 in the number of publisher agreements and discounts, many of which cover open access fees in full for Imperial corresponding authors. This includes a three-year agreement with Elsevier, the largest publisher of Imperial research
    • the Library’s support for several publishing initiatives within Jisc’s open access community framework (OACF) 2022-24 which aims to provide financial support for innovative open access content models
    • our first ever training session for all researchers new to Imperial covering open access and research data management (RDM) 
    • the launch of an open research education for doctoral students webpage that includes a roadmap for open research courses and support

Open Access Week logos with multiple global languages

OA Week 22 Activities

The upcoming Open Access Week will allow us to announce several initiatives and news items relevant to Imperial researchers and the wider community.

They will include the following:

  • daily tweets and Yammer posts highlighting statistics on OA and climate justice related publications held in Spiral, using the hashtags #OpenAccessWeek, #OAWeek2022 and this year’s theme hashtag, #OpenForClimateJustice
  • the launch of a new journal search tool for open access publisher agreements
  • a revamped publisher agreements and discounts webpage, which has also been reformatted to ease navigation as the number of agreements grows
  • a blog post on the Transformative Agreements that we have signed with publishers, including statistics on the number of papers published
  • a blog post on UK negotiations with the publisher Springer Nature written by our Director of Library Services, Chris Banks
  • a blog post on the Imperial Open Access Fund and how many publications we support

In the meantime, for open access advice, guidance and updates, email openaccess@imperial.ac.uk or visit the open access website. You can also request a one to one either via Teams or in person, or a training session for your group or team. You can also sign up to the Imperial Open Research Newsletter (requires an Imperial email address), and follow us on Twitter at @OAImperial and Yammer at Open Access publishing – LI.

Open Access Week 2021 (25–31 October)

This year’s International Open Access Week takes place 25–31 October, and the theme is It Matters How We Open Knowledge: Building Structural Equity.

This theme aligns with the UNESCO Recommendation on Open Science, which was released in draft in May, and will be put forward for adoption by UNESCO’s General Conference in November. Since it is the first global standard-setting framework on Open Science, it presents an important opportunity to build equity into the foundations of new policies.

Here at Imperial, we continue to provide advice and guidance on an ever more rapidly changing open access landscape. The last 12 months have seen:

Our year in statistics
(Download statistics are from https://irus.jisc.ac.uk/; other statistics are from Imperial College London’s records)

85% of Imperial College research in 2021 is open access.

Pie chart showing breakdown of open access types for 2021. Most is Open Access: In Spiral and/or DOAJ, followed by Likely to be Open Access: In Europe PubMed Central only. 15% is not known to be in an open access source, and small amounts are in arXiv only, and in gold OA journals

The Imperial Open Access Fund for unfunded researchers paid for open access for 367 papers between October 2020 and September 2021.

More than 700 research outputs have been published through Imperial’s transformative agreements with publishers in the last 12 months.

There were 200,000 downloads of COVID-19 publications in Spiral between March 2020 and October 2021.

There are 841 Imperial authored COVID-19 research publications in Spiral (https://spiral.imperial.ac.uk/handle/10044/1/78555)

The average monthly downloads of publications from Spiral is 175,000 (between October 2020 and September 2021).

The top three most downloaded research outputs from Spiral from October 2020 – September 2021 are:

Downloads | Title | URL
79,128 | REACT-1 round 13 final report: exponential growth, high prevalence of SARS-CoV-2 and vaccine effectiveness associated with Delta variant in England during May to July 2021 | http://hdl.handle.net/10044/1/90800
22,592 | Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand | http://hdl.handle.net/10044/1/77482
16,400 | REACT-1 round 7 updated report: regional heterogeneity in changes in prevalence of SARS-CoV-2 infection during the second national COVID-19 lockdown in England | http://hdl.handle.net/10044/1/84879

 

Monthly Spiral downloads (the top 4 downloaded research output types) October 2020 – September 2021
Stacked bar chart showing downloads of articles, reports, PhD theses, and working papers each month from October 2020 to September 2021. Shows increased working paper downloads in August 2021

 

Downloads of reports and preprints between January 2017 and September 2021

Stacked bar chart showing downloads of reports and preprints each year from 2017 to 2021. Shows marked increase in report downloads in 2020 and preprints in 2021

 

There are 19,500 PhD theses available in Spiral.

There were 741,248 Imperial PhD theses downloads between October 2020 and September 2021.

The number of research items in Spiral by output type

Pie chart showing research outputs by type in Spiral. Comprising 52,572 journal articles, 27,389 thesis or dissertation, 5,323 conference papers, and 2,589 other types

 

For open access advice, guidance and updates, email openaccess@imperial.ac.uk or visit the open access website. You can also request a one to one either via Teams or in person, or a training session for your group or team. You can also sign up to the Imperial Open Research Newsletter (requires an Imperial email address), and follow us on Twitter at @OAImperial and Yammer at Open Access publishing – LI.

No double dipping! The rise of transformative publisher agreements in the transition to full Open Access

The impact of Plan S

In 2018 a group of funders and national research agencies launched Plan S, an initiative with the central aim that by January 2021 “…all scholarly publications on the results from research funded by public or private grants provided by national, regional and international research councils and funding bodies, must be published in Open Access Journals, on Open Access Platforms, or made immediately available through Open Access Repositories without embargo.” Implicit in this goal is the intention of funders to move away from supporting the ‘hybrid’ model of publishing, whereby journals offer a paid open access (OA) option for authors to make their paper freely available upon publication but continue to charge a subscription fee for the rest of their content.

As with many other institutions, at Imperial we are recipients of block grants from certain funders, which authors acknowledging support from those funders can use to pay for individual Article Processing Charges (APCs) in both fully OA and hybrid journals. Although we have already introduced some restrictions on when we will pay for hybrid APCs, due to limited funds, with funders increasingly adopting the Plan S Principles authors may be concerned that they will soon be completely prevented from choosing OA publishing options in hybrid journals.

This is where Plan S Principle 8 comes in, which states that “…as a transitional pathway towards full Open Access within a clearly defined timeframe, and only as part of transformative arrangements, Funders may contribute to financially supporting such arrangements”. So, while Plan S funders will no longer support the payment of individual APCs to hybrid journals, institutions are able to redirect OA funds to pay for arrangements with publishers to transition away from the hybrid model towards being fully OA (until the end of 2024).

Read & Publish agreements

There are several types of transformative arrangements, but perhaps the most common are Read & Publish agreements. Instead of institutions (generally via their libraries) paying separately for subscriptions and OA fees for the same journals (aka ‘double-dipping’), Read & Publish agreements combine the costs. This provides those affiliated with the institution access to journal content that is still paywalled, as well as allowing authors to choose the OA option for their publications at no further cost.

As more of the content in hybrid journals becomes free for all to read in the transition to becoming fully OA, the proportion paid for the ‘Read’ part of the deal will decrease, and the proportion paid for the ‘Publish’ part will increase accordingly. While these kinds of arrangements precede the announcement of Plan S, their uptake has undeniably been accelerated by the initiative. Prior to 2020 Imperial had signed up to one Read & Publish agreement (with Springer in 2016), but we now have 11 in place, all negotiated by Jisc for Imperial and other institutions.

just take one dip and end it, Peyri Herrera, CC BY-ND 2.0, https://www.flickr.com/photos/54552940@N00/2483791713

Read & Publish agreements can offer an alternative route for authors to publish their work OA in cases where we would normally not be able to provide funding for an APC. Unlike our OA block grants from funders, which only authors acknowledging the relevant funding can use, these agreements can be made available to all Imperial staff and students (usually with the requirement that they are the corresponding author). The process should generally be much quicker and easier for authors, as they do not need to request an invoice or make a separate payment for an APC, and publishers have also been encouraged to improve the workflows and dashboards used by authors and the staff who administer the agreements within institutions.

Not a panacea

However, it can be argued that such agreements do not solve all of the problems that are present in the existing hybrid OA model. Authors who are eligible for these agreements may feel that they are getting free and unlimited OA for their work, but there are still high costs involved in signing up for the deals in the first place, and often there are limits on how many papers can be made OA in a year. This has recently been seen with the restrictions introduced to the Wiley agreement, whereby only authors supported by certain funders are currently eligible for inclusion in the agreement due to high levels of demand.

During an OA Week with a theme of “Taking Action to Build Structural Equity and Inclusion”, it is also important to highlight that such agreements can be seen as perpetuating global inequalities in access to OA publishing, as is argued by Jefferson Pooley on the LSE Impact Blog. A transition away from the hybrid model towards journals being fully OA should benefit everyone wanting to access the outputs of research as a reader. Nevertheless, it is only those authors who are affiliated with institutions wealthy enough to pay for the agreements (predominantly research intensive and in the global North) who are in a position to directly benefit from the OA publishing aspect.

Others who wish to publish OA will continue needing to find alternative routes, such as applying for APC waivers, submitting to OA journals that do not charge APCs, or self-archiving. This is not to say that these other routes are not valid – the option to self-archive (aka ‘green’ OA) is also a key part of the Plan S principles – but for those authors who do not have ready access to APC funds or publisher agreements there is understandably a sense of inequality.

This diagram by Imperial’s Director of Library Services, Chris Banks, demonstrates the complexity of a transition to full OA when considering the different levels of research intensity across institutions
(https://twitter.com/ChrisBanks/status/1169530088276340736)

A shift in gold OA at Imperial?

At Imperial we are fortunate to be able to offer our authors a range of different ways to make their research outputs OA, via both the green and gold routes. While the majority of our time (and money) in the gold section of the OA Team is still spent on individual APC payments from the funds that we administer (totalling 853 payments from 1 Oct 2019 – 30 Sep 2020), an increasing number of articles are now being made OA through our aforementioned Read & Publish agreements.

Imperial papers made OA through Read & Publish agreements (1 Oct 2019 – 30 Sep 2020)

The graph above shows the numbers of papers made OA via our four most used agreements (with Springer, Wiley, the Royal Society of Chemistry and SAGE) totalling 567 papers between 1 Oct 2019 – 30 Sep 2020. We also have agreements in place with the Company of Biologists, European Respiratory Society, IOP, IWA, Microbiology Society, Portland Press and Thieme. As previously mentioned, only the Springer agreement was in place prior to 2020, and we are in the process of signing more agreements. We would therefore expect the figures for next year to be even higher, and to perhaps even overtake the number of APCs we pay for individually.

For details on Imperial’s current Read & Publish agreements, as well as other publisher arrangements and discounts available to Imperial authors, please see our Publisher agreements and discounts page.

Protecting your assets: copyright and licensing advice for online reports, briefing papers and working papers

In times of crisis it is important that research is shared rapidly but what else should researchers consider before informally publishing their report, briefing paper or working paper on a website, Spiral or a pre-print server?

Will this work become a journal article?

The first thing to consider is whether this informal publication is the final write-up of your research or only a staging post on the way to formal publication in a journal. Most publishers accept that the research they receive as a paper may have already been presented in other formats, for example as a conference paper, a pre-print on arXiv or another pre-print server, or a working paper on RePEc or SSRN, and do not reject papers because these earlier versions already exist. However, it is always wise to read the prior publication policies of the key journals in your field to make sure putting your research online now won’t stop you publishing later in your chosen journal. This information is normally included in the ‘for authors’ section of the journal website, but if you can’t find it or you have questions then you can always contact the editorial team.

Which is the best platform?

The first location most researchers think about for an informal publication is a personal or departmental website. This works well when you or your research group have a strong brand and the traffic to these sites is already high, but when you are starting out in your research career it is good to share a platform with others in your university or subject. You can do this by depositing your publication in Spiral, Imperial’s research repository, or a pre-print server in your subject area.

Spiral offers a secure home for your publication, a DOI link that will never break, and usage metrics via Altmetric so you can track who is discussing your work and where. This is useful when you are asked to explain the real-world impact of your research or write an impact statement. Once you have uploaded your publication to Spiral you can link to it from departmental webpages, networking sites and social media sites using a DOI link (e.g. https://doi.org/10.25561/76707).


Depositing your work in Spiral also has copyright and licensing advantages because there is just one copy, with one copyright and licensing statement of your choice, rather than multiple copies on multiple platforms, all with different licensing options and use licences.


If you do decide to upload your publication to another platform, read the service’s terms of use and copyright policies so that you are clear about what you are permitted to upload and how others can use your publication once it is publicly available. For comparison, ResearchGate simply hosts what you upload but the pre-print server bioRxiv asks users to choose a Creative Commons Licence for each uploaded paper to make them easier to share and reuse. Both licensing approaches have their advantages and disadvantages so you should pick the platform that works best for you and your research.


Who is the copyright holder?

The authors or the department can be named as the copyright holder. Through the College’s Intellectual Property Policy, Imperial has waived its automatic right to copyright in research publications. Therefore it is recommended that copyright should be assigned jointly to the authors and that any alternative is agreed with them when work is commissioned.
This approach will avoid a situation whereby authors must request a department’s permission each time they want to reuse and publish extracts from the publication in journal articles. It also allows a department to own copyright when a report or paper is the final work and it is more practical for a department to handle reproduction and translation requests.


How do I show ownership?

The next thing to think about is protecting your intellectual property and making sure you get the credit for your work. A myth has grown up that if you can view something on the web then you can reuse it in any way that you like. Make it clear to others this isn’t true by adding a copyright statement like the one below.

© 2020 The Authors. Published by Imperial College Business School


What is the advantage of a Creative Commons Licence?

When you add a Creative Commons licence to your work, you make it clear that it can be copied and redistributed so long as you are acknowledged as the author. If you make something easy to share then more people will do this and your research is more likely to get noticed and discussed.


Creative Commons Licences permit others to copy and share all or part of your work but only on the condition that the original author and source are credited. They are simple for others to read because they are written in plain English and familiar because they are already used in open access journal publishing. An earlier blog post, Your choice! Selecting a Creative Commons Licence, will help you understand the pros and cons of the six different licences. This is a sample copyright statement taken from an Imperial report:

© 2020 The Authors. Published by The Grantham Institute for Climate Change under the terms of the Creative Commons Attribution License https://creativecommons.org/licenses/by/4.0/

In this example, if the report were uploaded to Spiral, anyone reading it should note the Creative Commons Attribution License displayed on the document. The default licence applied to work deposited in Spiral is a Creative Commons Attribution NonCommercial NoDerivatives License. If you apply a more permissive licence to your work (as above), this will override the Spiral default licence.


How do I make sure others cite my work?


The best approach is to remove the intellectual effort of creating a citation by providing a suggested citation that readers can copy and paste. You can take your inspiration from journals or adapt the example below. This report has a DOI because it was uploaded to Spiral, but if your report has no DOI then insert a URL link to the hosting website.

SUGGESTED CITATION

Ghafur S, Fontana G, Halligan J, O’Shaughnessy J, Darzi A. NHS data: Maximising its impact on the health and wealth of the United Kingdom. Imperial College London (2020) doi: 10.25561/76409
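If you produce several reports, it can help to generate the suggested citation from the report's details so that every publication follows the same pattern. The Python sketch below is only an illustration: the function name suggested_citation and its parameters are hypothetical, and the layout simply mirrors the example above (authors, title, publisher, year, then a DOI, or a URL if the report has no DOI). It is not an official Imperial or Spiral format.

```python
def suggested_citation(authors, title, publisher, year, doi=None, url=None):
    """Build a copy-and-paste citation string.

    Hypothetical helper: the layout mirrors the sample citation above
    and is an assumption, not a prescribed house style.
    """
    if not authors:
        raise ValueError("at least one author is required")
    # Prefer the DOI; fall back to the URL of the hosting website.
    locator = f"doi: {doi}" if doi else url
    if not locator:
        raise ValueError("provide a DOI or a URL")
    # Avoid a doubled full stop if the title already ends with one.
    clean_title = title.strip().rstrip(".")
    return f"{', '.join(authors)}. {clean_title}. {publisher} ({year}) {locator}"


example = suggested_citation(
    authors=["Ghafur S", "Fontana G", "Halligan J", "O’Shaughnessy J", "Darzi A"],
    title="NHS data: Maximising its impact on the health and wealth of the United Kingdom",
    publisher="Imperial College London",
    year=2020,
    doi="10.25561/76409",
)
print(example)
```

For a report without a DOI, pass url= with the address of the hosting page instead of doi=.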


What if all the content is not yours?


Sometimes you will include text and figures from previously published papers, yours and others, in a new publication. When you do this, you must be confident that your use is covered by: the UK copyright exception Quotation, Criticism & Review, a compatible Creative Commons License or direct permission from the publisher. Publishing agreements, even open access publishing agreements, often still ask authors to give the publisher the exclusive right to publish the paper’s contents.

While citing the source of reproduced text and figures is second nature, copyright, licensing and permission statements are often forgotten, leaving the reader to assume that the copied figure is owned and licensed under the same terms as the new publication. This may not always be the case, especially in a review paper, and may result in another researcher inadvertently reusing the figure without permission in a future paper.

For example, a figure in a paper has a copyright status ‘© 2020 Elsevier. All rights reserved.’ but you reuse it in a new publication which will be licensed under a Creative Commons NonCommercial License. It is important to alert the reader to the fact that the reuse terms of the copied figure are different and that you are unable to provide them with permission to copy and share it along with the original parts of your paper.

A visual representation of the text example in the paragraph above. An all rights reserved figure sits within a Creative Commons Licensed paper
Figure 1: A copyright protected figure within a Creative Commons licensed paper.



Figures have a commercial value to publishers and the expectation is that the first journal is paid for re-use of a figure by the second journal or that both are members of STM and follow the STM guidelines on reciprocal reuse of figures.


In summary

When you make a publication available on the web you become the publisher. This is positive as it puts you in control of copyright and licensing decisions and allows you to license your publication in the way that is best for you and your research. However, it also means that you must take on some of the tasks automatically done by your publisher and that you normally wouldn’t think about. Hopefully this article has shown you that this is not as hard as you might think and that a little bit of knowledge will get you a long way.


Help and support

The Library’s Scholarly Communications team are happy to speak to you about any of the topics mentioned in this blog post. You can contact us via ASK the Library.
You may also like to read our webpages about Publishing with Spiral. Much of this advice also applies to informally publishing on other platforms.

Philippa Hatch
Copyright and Licensing Manager, Library Services.

UKRI Open Access Policy Consultation: Imperial College London Response

Imperial College London has provided a response to UKRI’s Open Access Review consultation:

In addition to signposting the full UKRI consultation documentation and list of questions, consultation on the Imperial College response to the UKRI OA review has been undertaken as follows:

  • Presentation and discussion at the Vice Provost’s Advisory Group for Research
  • Presentations at each of the four Faculty Research Committee meetings
  • Via a recorded online presentation accompanied by a short questionnaire
  • Through information circulated via faculty and departmental mailing lists
  • Via social media, including Twitter and Yammer

    Responses to multiple choice questions are highlighted

    The response was submitted by Chris Banks, Assistant Provost (Space) & Director of Library Services on behalf of the College and is available via Spiral, the institutional repository.