How compliant are we with HEFCE’s REF open access policy? (Why Open Access reporting is difficult, part 2)

In what is hopefully not going to become a long series, I am today dealing with the joys of compliance reporting in the context of HEFCE’s Policy for open access in the post-2014 Research Excellence Framework (REF). The policy requires that conference papers and journal articles that will be submitted to the next REF – a research assessment through which funding is allocated to UK universities – be deposited in a repository within three months of acceptance for publication. Outputs that are published as open access (“gold OA”) are also eligible, and during the first year of the policy the deposit deadline has been extended to three months of publication. The policy comes into force on 1st April and, considering the importance of the REF, the UK higher education sector is now pondering the question: how compliant are we?

As far as Imperial College is concerned, I can give two answers: ‘100%’ and ‘we don’t know’.

‘100%’ is the correct answer as until 1 April all College outputs remain eligible for the next REF. While correct, the answer is not very helpful when trying to assess the risks of non-compliance or to understand where to focus communications activities. Therefore we have recently gone through a number-crunching exercise to work out how compliant we would be if the policy had been in force since May last year. In May 2015 we made a new workflow available to academics, allowing them to deposit outputs ‘on acceptance’. The same workflow allows academics to apply for article processing charges for open access, should they wish to.

You would imagine that with ten months of data we would be able to give an answer to the question for ‘trial’ compliance, but we cannot, at least not reliably. In order to assess compliance we need to know the type of output, date of acceptance (to work out if the output falls under the policy), the date of deposit and the date of publication (to calculate if the output was deposited within three months). Additionally it would help to know whether the output has been made open access through the publisher (gold/immediate open access).
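To make these data requirements concrete, here is a minimal sketch of what such a check could look like if all the dates were available. It is an illustration under stated assumptions, not the College's actual reporting logic: the names `Output`, `assess` and `add_months` are hypothetical, the three-month window is treated as inclusive, and the ISSN test for conference proceedings is left out.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Output types covered by the policy, as described in the text above.
IN_SCOPE_TYPES = {"journal article", "conference paper"}


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


@dataclass
class Output:
    output_type: str
    accepted: Optional[date]    # often missing from publisher feeds
    deposited: Optional[date]
    published: Optional[date]   # ideally the 'first online' date
    gold_oa: bool = False       # True only if we know it is gold OA


def assess(o: Output, policy_start: date, first_year: bool = True) -> str:
    """Return 'out of scope', 'compliant', 'non-compliant' or 'unknown'."""
    if o.output_type not in IN_SCOPE_TYPES:
        return "out of scope"
    if o.accepted is None:
        return "unknown"        # cannot tell whether the policy applies
    if o.accepted < policy_start:
        return "out of scope"
    if o.gold_oa:
        return "compliant"
    if o.deposited is None:
        return "non-compliant"  # assumption: no deposit known to us
    # Deadline: three months from acceptance; in the first year of the
    # policy, three months from publication also counts.
    deadline = add_months(o.accepted, 3)
    if first_year and o.published is not None:
        deadline = max(deadline, add_months(o.published, 3))
    if o.deposited <= deadline:
        return "compliant"
    if first_year and o.published is None:
        return "unknown"        # may still be in time once the date arrives
    return "non-compliant"
```

Every `None` in this sketch corresponds to a piece of metadata that has to come from somewhere, and each "unknown" result is an output that cannot be classified without manual checking.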

Below are eight issues that prevent us from calculating compliance:

  1. Publisher data feeds do not provide the date of acceptance
    Publishers do not usually include the date of acceptance in their data feeds, so we have to rely on authors manually entering the correct date on deposit. Corresponding authors would usually be alerted to acceptance, but co-authors will not always find out about acceptance, or there may be a substantial delay.
  2. Deposit systems do not always require date of acceptance
    The issue above is made worse by not all deposit systems requiring academics to enter the date of acceptance. In Symplectic Elements, the system used by Imperial, the date is mandatory only in the ‘on acceptance’ workflow; when authors deposit an output that is already registered in the system as published there is currently no requirement to add the date – resulting in the output being listed as non-compliant even if it was deposited in time. Some subject repositories do not even include fields for date of acceptance.
  3. Difficulties with establishing the status of conference proceedings
    Policy requirements only apply to conference proceedings with an ISSN. Because of the complexities with the publishing of conference proceedings we often cannot establish whether an output falls under the policy, or at least there is a delay (and possible additional manual effort).
  4. Delays in receiving the date of publication
    It takes a while for publication metadata to make it from publishers’ into institutional systems. During this time (weeks, sometimes months) outputs cannot be classed as compliant.
  5. Publisher data feeds do not always provide the date of publication
    This may come as a surprise to some, but a significant number of metadata records do not state the full date of publication. The year is usually included, but metadata records for 18% of 2015 College outputs did not specify year or month. This percentage will be much higher for other universities as the STEM journals (in which most College outputs are published) tend to have better metadata than arts, humanities and social sciences journals.
  6. Publisher data feeds usually do not provide the ‘first online’ date
    Technically, even where a full publication date is provided the information may not be sufficient to establish compliance. To get around the problem that publishers define publication dates differently, HEFCE’s policy states that outputs have to be deposited within three months of when the output was first published online. This information is not usually included in our data feeds.
  7. Publisher data feeds do not usually provide licence information
    Last year, Library Services at Imperial College processed some 1,000 article processing charges (APCs) for open access. We know that these outputs would meet the policy requirements. However, when the corresponding author is not based at Imperial College – last year around 55% of papers had external co-authors – we have no record of whether they requested that the output be made open access by a publisher. For full open access journals we can work this out by cross-referencing the Directory of Open Access Journals. However, for ‘hybrid’ journals, where open access is an (often expensive) option, we cannot track this as publisher metadata does not usually include licence information.
  8. We cannot reliably track deposits in external repositories
    Considering the effort universities across the UK in particular have put into raising awareness of open access there is a chance that outputs co-authored with academics in other institutions have been deposited in their institutional repository. Sadly, we cannot reliably track this due to issues with the metadata. If all authors and repositories used the ORCID identifier it would be easier, but even then institutional repositories would have to track the ORCID iD of all authors involved in a paper, not just those based at their university. If we had DOIs for all outputs in the repositories it would be much easier to identify external deposits.
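The cross-referencing mentioned under issue 7 can be sketched as a simple ISSN lookup. This is an illustration only: it assumes a locally held list of ISSNs taken from the Directory of Open Access Journals, the function names are hypothetical, and the ISSNs shown are placeholders rather than real journals.

```python
from typing import Iterable


def normalise_issn(raw: str) -> str:
    """Reduce an ISSN to eight upper-case characters without the hyphen."""
    return raw.strip().upper().replace("-", "")


def classify_journal(journal_issns: Iterable[str], doaj_issns: Iterable[str]) -> str:
    """Label a journal 'full open access' if any of its ISSNs is in the list."""
    known = {normalise_issn(i) for i in doaj_issns}
    if any(normalise_issn(i) in known for i in journal_issns):
        return "full open access"
    # Hybrid and subscription journals look the same at this point: without
    # licence information in the metadata we cannot tell them apart.
    return "licence unknown"


# Placeholder values standing in for a DOAJ-derived ISSN list.
doaj_issns = {"1234-5678", "2345-678X"}

print(classify_journal(["2345-678x"], doaj_issns))
print(classify_journal(["9999-0000"], doaj_issns))
```

The second result is the crux of the problem: a paper in a hybrid journal may well be open access, but a journal-level lookup cannot show it.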

Considering the issues above, reliably establishing ‘compliance’ is at this stage a largely manual effort that would take too much staff time for an institution that annually publishes some 10,000 articles and conference proceedings – certainly while the policy is not yet in force. Even come April I would rate such an activity as perhaps not the best use of public money. Arguably, publisher metadata should include at least the (correct) date of publication and also the licence, although I cannot see a reason not to include the date of acceptance. If we had that, reporting would be much easier. If we had DOIs for all outputs (delivered close to acceptance) it would be even easier as we could track deposits in external repositories reliably.
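To illustrate why DOIs would help, here is a rough sketch of matching our outputs against records from other repositories. It assumes that both sides expose a DOI and that the external records arrive as (repository name, DOI) pairs; the function names and the example DOIs are made up for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Common ways a DOI is written in metadata records.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw: str) -> Optional[str]:
    """Return a bare lower-case DOI, or None if the value is not a DOI."""
    doi = raw.strip().lower()  # DOIs are case-insensitive
    for prefix in _PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi if doi.startswith("10.") else None


def find_external_deposits(
    our_dois: Iterable[str],
    external_records: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each of our DOIs to the external repositories that hold it."""
    wanted = {d for d in map(normalise_doi, our_dois) if d}
    found: Dict[str, List[str]] = {}
    for repository, raw_doi in external_records:
        doi = normalise_doi(raw_doi)
        if doi in wanted:
            found.setdefault(doi, []).append(repository)
    return found
```

With a DOI on both sides the match is a dictionary lookup. Without one, the same task means comparing titles and author names, which is where the manual effort comes from.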

Therefore I call on all publishers: if you want to help your authors to meet funder requirements, improve your metadata. This should be in everyone’s interest.

Colleagues at Jisc have put together a document to help publishers understand and implement these and other requirements: http://scholarlycommunications.jiscinvolve.org/wp/2015/03/26/how-publishers-might-help-universities-implement-oa/

What we can report on with confidence is the number of deposits (excluding theses) to our repository Spiral during 2015: 5,511. Please note: 2015 is the year of deposit, not necessarily year of publication.

A Universal Open Access Policy?

Despite claims to the contrary, open access as such is not very complicated. Either publish your scholarly output with a publisher who will immediately make it available as open access, or put a copy of the (peer-reviewed) manuscript in a repository. What makes open access complicated is the myriad of policies that regulate it.

The Registry of Open Access Repository Mandates and Policies (ROARMAP) alone lists well over 700 OA policies – just from research organisations and funders. If you add publisher policies it gets even more confusing. As a sector we often complain about the difficulties publishers create with journal embargoes. We also criticise funders for not aligning their policies. These criticisms are valid, but we tend to gloss over the fact that universities are not always aligning their policies either. Policies that vary across universities make it more difficult for third parties to provide solutions as they need to map onto a wide range of workflows resulting partly from different policies. Different institutional policies also make it harder to communicate open access to academics.

I have on a few occasions suggested that we should aim to align institutional policies more, and that we should also simplify them. Thankfully, I am not the only one thinking about this. Jisc, SHERPA Services and ROARMAP have jointly developed a Schema for Open Access policies. The schema should help policymakers “to express their policies in a systematic manner”, as “an initial step to ensure greater clarity and uniformity in the way information about OA policies is recorded and made available”. Imperial College was one of 30 institutions that provided information to the new initiative. You can read more about the schema, initial findings and how to engage on the Jisc blog.

My ideal would be that over time we move to a single open access policy, or at least to a core policy to which institutions can add a selection of clearly defined elements to reflect their specific needs – where this is really necessary, of course. In the UK we do already have what could be considered the core of an OA policy, the Policy for open access in the post-2014 Research Excellence Framework. Leaving the details aside, the policy requires deposit on acceptance (for publication). Currently it only applies to scholarly articles and conference proceedings, but I would argue that that makes it ideal as a starting point as these more formalised outputs (compared to e.g. performances) are easier to deal with across institutions.

Therefore, my suggestion for a minimal universal OA policy would be:

  • Publish in the journal of your choice, including full open access journals (subject to availability of funding).
  • Deposit a copy of the peer reviewed manuscript of your journal article or conference proceeding into a repository on acceptance for publication.

Incidentally, that is effectively the OA policy at Imperial College. As the vast majority of College publications are articles or conference proceedings we can effectively limit the policy to these, at least for the moment. An institution with a more diverse range of outputs may decide to add monographs, videos, websites etc., and those who cover costs for hybrid open access (Imperial’s own fund does not support it) may want this included as well.

I fully understand that just two bullet points will not be enough. However, I would like to put out a challenge: look at your institution’s open access policy and think about which elements you really need, and how you could simplify it in a way that would help us move towards a universal policy. And make sure to check out the schema!

Making Open Access simple – The Imperial College approach to OA

When you come at it for the first time, open access looks pretty complicated. Funder policies, institutional policies, publisher policies, different flavours of OA including ‘green’, ‘gold’, ‘libre’ and ‘gratis’ and a whole new language with mystifying terms like ‘hybrid journal’, ‘article processing charge’ and ‘author accepted manuscript’ await. Even librarians sometimes struggle to understand journal policies, or what certain licensing conditions actually mean.

It was perhaps for this reason that, when we started the College open access project, academics gave us a clear mission: a one button solution to open access.

We haven’t quite achieved that yet, but since May we have been running a new workflow that reduces the complexity to one sentence: ‘When you have a paper accepted, deposit the peer-reviewed manuscript – we do the rest, no matter what type of open access.’

The workflow is based on two ideas:

  1. Ask authors for the minimum information required.
  2. Offer authors a single publications workflow that covers green and gold OA as well as information required for funder reporting.

The frontend for this workflow is Symplectic Elements, the system used by our academics to manage their scholarly outputs. We have worked with the vendor to deliver an OA workflow that kicks in on acceptance for publication, and then we customised the system to interface with ASK OA, our in-house APC management system.

On acceptance for publication, authors add minimal metadata and the manuscript to Elements, link the article to relevant grants and if they want the College to pay an open access charge they simply tick a box. Colleagues in the Library’s open access team then check the submission, set necessary embargoes and make the output available through Spiral, the College repository. If payment is requested, the data is automatically transferred to ASK OA, the cloud-based, workflow-driven system that we launched last year. Through that process, authors receive a purchase order number to send to their publisher. When the College receives the electronic invoice, our finance system matches the PO and the payment process starts. No author interaction needed.

[Screenshot: OA form]

Above you see a screenshot of the information we require from authors. In addition, they deposit the manuscript (or share a link if it was already deposited in an external repository) and link the output to relevant grants. That allows us to charge costs for open access publishing to the correct funders and, once funder systems are ready, will enable the College to automate funder reporting on research outputs. If you want to see a demonstration, check out this video guide produced by the College Library:

The feedback we had from academics has been positive so far, and the numbers show that as well:

[Chart: June 2015]

While the workflow is working well so far, we are still far away from what I would consider the ideal scenario. There are still enough journals with difficult and unhelpful policies, and no university workflow will be able to fix that. Publishers being unable to issue correct invoices is another issue. We also have the problem of reliably matching the metadata entered on acceptance with the metadata for the published output. Publishers could help by issuing authors with a DOI on acceptance.

Even better, publishers could feed publication metadata into systems like CrossRef on the date of acceptance. If the metadata had funder, licence and embargo information attached and a link to the manuscript, then open access would indeed become a one-click-problem. Authors enter their data on submission, and following acceptance it automatically travels through all relevant systems, until it ends up in an institutional repository. There would be no additional effort for authors, and admin overhead would be reduced greatly. The components to enable this already exist, for example the author identifier ORCID that was rolled out across the College last year.

We are still working towards the goal of a “one button” solution for open access with our partners. Until then the message remains: deposit the manuscript on acceptance, we do the rest.

New Workflow: “Be REF compliant”

The library has released a new workflow on how to make your publications REF compliant. Authors can now deposit their journal articles and conference proceedings on acceptance in Spiral via Symplectic. At the same time an application can be made for APC funding to pay open access fees.

[Image: New REF workflow]

Find out more about how the library can support you in making your work open access at our new web pages, or contact us: openaccess@imperial.ac.uk.

Open Access News, March-April 2014: HEFCE OA policy and Wellcome APC data

For the College’s Open Access Publishing group I put together a semi-regular digest of news and recent developments around Open Access and related topics. As this might be of interest to others, we have decided to make it available via the blog too. For more information on OA, take a look at the Open Access website of the College Library.

General News

HEFCE have released their Open Access policy. We will discuss this in more detail later, but this policy is likely to be a game changer as far as Open Access in the UK is concerned.

The Research Information Network have released a report on Monitoring Progress in the Transition to Open Access, including proposals for a framework of indicators to monitor progress towards open access. Jisc have, informally, confirmed that their OA Monitor project is likely to address at least part of this if institutions find this useful.

From April 2014 onwards, the National Institute for Health Research will expect peer-reviewed articles to be made available as Gold OA, expecting full compliance within four years.

Wellcome and NIH are withholding grant payments when OA obligations are not met (Imperial scholars have not been affected by this).

The University of Konstanz has broken off license negotiations with Elsevier and will no longer subscribe to any Elsevier content. “The publisher’s prices are too high, said university Rector Ulrich Rüdiger in a statement, and the institution ‘will no longer keep up with this aggressive pricing policy and will not support such an approach.’ […] Adding to tensions, the university hinted, was a feeling that academia is essentially paying twice for its own work. ‘Universities are in a way forced to purchase a good back in the form of expensive subscription fees – a good which is actually produced by their own scientists,’ said Petra Hätscher, a university administrator, in a statement.”

The Open Access Scholarly Publishers Association has suspended Springer’s membership because of systematic problems with the editorial process at Springer revealed by the so-called “Open Access sting”.

Jisc, RLUK, RCUK, Wellcome Trust and others published a report that examines the potential risks associated with the APC open access market (APC = Article Processing Charge for OA articles). The economic analyses undertaken provided a strong indication that the full open access journal market is functioning well in creating pressure for journals to moderate the price of APCs. On the other hand, the current hybrid market was found to be extremely dysfunctional, with significantly higher charges and low levels of uptake. Indeed, the average APC in a hybrid journal was found to be almost twice that for a born-digital full open access journal ($2,727 compared to $1,418). The authors suggest different approaches, including only paying APCs to hybrid journals that offer reductions for subscriptions payments or setting caps to APCs in relation to the quality and range of services offered by the journal.

Wellcome Trust releases data on Article Processing Charges for Open Access

The Wellcome Trust released the full data on the APC spend 2012-13. A community effort led to that data being cleaned up (Google doc spreadsheet) and analysed within a few days. The analysis revealed that the average APC paid by Wellcome is £1,820.

In her analysis, Michelle Brook from the Open Knowledge Foundation highlighted that most of the money goes to hybrid journals:

In Oct 2012 – Sept 2013, academics spent £3.88 million to publish articles in journals with immediate online access – of which £3.17 million (82 % of costs, 74 % of papers) was paying for publications that Universities would then be charged again for. For perspective, this is a figure slightly larger than the Wellcome Trust paid in 2012/2013 on their Society & Ethics portfolio. Only £0.70 million of the charity’s £3.88m didn’t have any form of double charging (ie, was published in a “Pure Open Access” journal) – with this total being dominated by articles published in PLOS and BioMed Central journals (68 % of total ‘pure’ hybrid journal costs, 80 % of paper total).

Ernesto Priego is concerned that high APCs may effectively just shift the serials crisis from the library to the research budget and that arts and humanities researchers in particular might be priced out of publishing. He has created a visualisation of the lowest and highest APCs charged by 11 publishers (image licensed CC BY SA 3.0):

[Figure: Lowest and highest APCs levied by 11 major publishers, by Ernesto Priego]

Analysis of the Wellcome data has also identified issues with the licensing information on publishers’ websites:

  • Michelle Brook has shown that Wiley-Blackwell wrongly claim that CC BY licenses do not allow others to re-use the article commercially.
  • Peter Murray-Rust has identified several cases where Elsevier has put OA content behind paywalls, charged for the full text or mislabelled the license. This has been picked up by Times Higher and Elsevier have admitted that they mischarged 50 people for use of OA content; they are refunding money.

Building on the community effort, Wellcome have released a statement on the APC data. They thanked the community and criticised publishers for not delivering the quality of service expected. It is worth quoting this in more detail:

Inevitably, with a dataset of over 2000 articles, published by 94 different publishers, problems have been identified. These include:

  • Content remaining hidden behind a publisher pay-wall;
  • Content freely available on the publisher site, but not available in PMC/Europe PubMed Central;
  • Missing, incorrect, or contradictory licence information
  • CC-BY licensed articles still linked to sites such as the Copyright Clearance Centre, where readers may be charged for re-using open content.

In summary we contacted 20 publishers in relation to 150 articles (approximately 7% of the total number of articles for which an APC had been paid).

We expect every publisher who levies an open access fee to provide a first class service to our researchers and their institutions. […] Even though there are only a small number of articles that the Wellcome Trust has paid to be open access that have remained behind a pay-wall, this is not an acceptable situation in any instance.

The bigger issue concerns the high cost of hybrid open access publishing, which we have found to be nearly twice that of born-digital fully open access journals. We need to find ways of balancing this by working with others to encourage the development of a transparent, competitive and reasonably priced APC market.