Cultivating ORCID: Encouraging Growth
Report on the Jisc UK ORCID Consortium Members’ Event, June 2017.
This event was part of a series of members’ events organised by the Jisc support team for the UK ORCID Consortium.
The aim of the event was to bring the UK ORCID community together for peer-to-peer networking and support and to help progress key issues to aid better integration and implementation of ORCID iDs in member institutions.
Specific objectives were to:
- Reflect on progress to date, both in Jisc activity and community implementation
- Help members understand how to capitalise on the ORCID Collect & Connect integration and engagement programme
- Brief members on key technical developments
- Review next opportunities at a community and institutional level.
The day opened with an introduction by Nicky Ferguson, Project Manager for the Jisc ORCID UK Support team, followed by two updates.
Update 1: Jisc UK ORCID Support Service
Update 2: ORCID Update
The morning was then split into three parallel sessions focused on exploring new processes and technical developments.
Workshop 1 – ORCID integration use cases: principles and workflows
The group divided into three smaller groups at the outset. One group discussed the limitations of some types of system, notably HR systems, which lack functionality for integration with ORCID. HR systems also cannot be a complete source of truth: they hold only employed researchers, not students or external researchers who contribute to papers but don’t work at the institution.
One group also discussed criteria for choosing which systems to integrate with ORCID and identified 1) frequency of updating and 2) key institutional drivers as criteria that may help institutions to decide. For example, if the main driver is identifying links to funders, the grant management system may be most relevant. One other consideration is whether the ORCID iDs and access tokens collected would be locked into a vendor system, and whether there would be a path to exporting them should the institution move to a different system.
In the group led by Neil Jefferies, each member of the group described their experience of integrating ORCID with their systems. Members described different points of integration with ORCID (eg identity system, CRIS, Symplectic Elements, HR, Repository or not integrating at all) and the pros and cons of those choices.
A major ‘con’ that was identified related to how ORCID links with outsourced systems. Under present arrangements, users grant the institution’s identity system access to their ORCID account. If the institution uses a vendor system, users also have to grant the vendor system access to their ORCID data. These are two separate permissions controlled by the user, with nothing connecting them, so the two can easily fall out of step.
The model that seems to be needed from ORCID is as follows:
- The researcher grants the institution rights.
- The institution then has a delegation mechanism which allows the institution to delegate some of its rights to a user’s ORCID data to a third party.
- The institution now controls the data without split permissions. The institution has visibility and control of the relationship and can make sure everything links up.
- This also allows the institution to revoke permissions if it discontinues its contract with the outsourced system provider.
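The model described above can be sketched in a few lines of code. This is purely illustrative: ORCID did not offer such a delegation mechanism at the time of the event, and the class, method and scope names below are all hypothetical. The sketch only shows the intended behaviour: the institution can pass on a subset of its own rights and can withdraw them later.

```python
from dataclasses import dataclass, field


@dataclass
class InstitutionGrant:
    """Rights a researcher has granted to their institution (hypothetical model)."""

    orcid_id: str
    scopes: frozenset
    # Maps a vendor name to the subset of scopes the institution has delegated.
    delegations: dict = field(default_factory=dict)

    def delegate(self, vendor: str, scopes: set) -> None:
        """Pass some of the institution's rights on to a third party."""
        requested = frozenset(scopes)
        if not requested <= self.scopes:
            raise PermissionError("cannot delegate rights the institution does not hold")
        self.delegations[vendor] = requested

    def revoke(self, vendor: str) -> None:
        """Withdraw a vendor's rights, e.g. when a contract ends."""
        self.delegations.pop(vendor, None)

    def allowed(self, vendor: str, scope: str) -> bool:
        """Check whether a vendor currently holds a given delegated right."""
        return scope in self.delegations.get(vendor, frozenset())


# Example with made-up scope and vendor names.
grant = InstitutionGrant(
    orcid_id="0000-0002-1825-0097",
    scopes=frozenset({"read", "update-works", "update-affiliations"}),
)
grant.delegate("cris-vendor", {"read", "update-works"})
```

Because every delegation is recorded against the single institutional grant, the institution keeps visibility of which supplier holds which rights, which is the point made in the list above.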
NB: Neil Jefferies has sent out a subsequent email to the ORCID members list as follows:
At the recent Cultivating ORCID’s Meeting in Birmingham I ran a working group looking at different approaches to implementing ORCID’s. One of the outcomes was a common issue when it came to ORCID implementations and third party suppliers, namely, that institutional users needed to explicitly grant access to third party suppliers in addition to their own institution. This behaviour has a number of undesirable side effects:
- Communicating this to users can be difficult since they are not always aware of these third parties
- Getting consistent takeup across multiple systems can be difficult (user loses interest) which makes downstream integration more awkward than necessary
- Institution has little visibility of these third party interactions – which can cause problems when suppliers are dropped or other issues arise
- The only way round this is to let a supplier use the institutional key – which however then grants them *ALL* the rights and access that the institution has
The solution to this would be to have a mechanism that allows an institution to grant certain of its rights to a user’s ORCID data to a third party via a delegation mechanism, so the user only has to deal with the institutional grant. Laure Haak of ORCID was supportive of getting a group together to look at this in more detail (with ORCID involvement along the way). So here we are…
There are a number of issues to be looked at, not all of them technical:
- How would this work in general?
- What are the rights we are interested in granting?
- Users need to be aware institutions can do this, and authorise them to do so
- How do we handle supplier termination/revocation?
- Do we need reporting/visibility of supplier actions since they now operate on our behalf?
I am looking to get together a group of around a dozen people from various institutions to commit some time to get together (virtually in the first instance, physically at a suitable juncture if needs be) to work through this in further detail. I am aware that we are now getting into the summer holiday dead-zone so I am looking for expressions of interest in the first instance with, if possible, an indication of availability over the next 8 weeks or so.
Members should contact Neil about joining this working group.
Workshop 2 – ORCID Collect & Connect: what, how and why?
Paula began by setting out that ORCID is part of the plumbing for the research infrastructure: integrations can be simple like the water spout or complex like the water fountain (see slides), but the key is to adopt best practice guidelines from ORCID to avoid flooding (problems)!
Goals of Collect & Connect are to:
- Clarify goals and expectations across sectors
- Standardize and improve the user experience
- Improve understanding and trust in connections made between ORCID and other identifiers
- Increase efficiency and quality of integrations
- Help achieve the ORCID vision through a community approach
ORCID has introduced badges for each stage of the Collect & Connect process to signify that an organisation has met a set of standards. In Phase 1, the badges will be displayed on the ORCID website, where a new integration chart will indicate members’ levels of progress. In Phase 2, member institutions will also be able to display them on pages that explain their ORCID integration to users and, for example, in blogs or promotional material about ORCID.
Paula went on to describe expectations for each of the five Collect & Connect badges. There was a lot of discussion around Authenticate & Collect.
- Collect and store authenticated iDs and provenance
Authentication means that the member institution asks users themselves to validate their ORCID iD through the OAuth process. The institution collects the researcher’s iD and, in the same step, can obtain permission to update their record.
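As a minimal sketch of the first step of that OAuth process, the code below builds the link a researcher would follow to sign in at ORCID and authorise the institution. The client iD and redirect address are placeholders, and the function name is an assumption for illustration; the endpoint and parameter names should be checked against the current ORCID API documentation before use.

```python
from urllib.parse import urlencode

ORCID_AUTHORIZE_URL = "https://orcid.org/oauth/authorize"


def build_authorize_url(client_id: str, redirect_uri: str,
                        scope: str = "/authenticate") -> str:
    """Build the URL that sends a researcher to ORCID to sign in and grant access."""
    params = {
        "client_id": client_id,        # issued to the member organisation
        "response_type": "code",       # authorisation-code flow
        "scope": scope,                # "/authenticate" only collects the iD
        "redirect_uri": redirect_uri,  # where ORCID sends the researcher back
    }
    return f"{ORCID_AUTHORIZE_URL}?{urlencode(params)}"


# Placeholder values; a real integration uses its own credentials.
url = build_authorize_url(
    client_id="APP-XXXXXXXXXXXXXXXX",
    redirect_uri="https://research.example.ac.uk/orcid/callback",
)
```

After the researcher approves the request, ORCID redirects back with a short-lived code that the institution exchanges for the authenticated iD and an access token. Requesting a broader scope at this point is how an institution would also ask for permission to update the record.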
Leveraging the identity management system of the University makes this less onerous. This works already for many institutions using Shibboleth who participate in SURFconext and/or eduGAIN federations. The researcher can be presented with an option to sign in or register – a good time to tie this in is when institutions onboard new students and staff.
For Display, ORCID recommends members publicly display the ORCID iD that’s been collected with the green iD icon, following the ORCID guidelines.
ORCID also recommends displaying the member logo when describing the institution’s connection with ORCID, with an explanation of the benefits of ORCID.
For Connect, ORCID is looking for member institutions to push affiliation, using the scope in the API to request permission from the user to update the ORCID record, emphasising the benefits of ORCID to researchers:
- Request permissions to write to/update the user’s ORCID record to assert affiliation, works, and more (with provenance) AND/OR
- Ingest information from ORCID records to auto-populate forms
A pilot project is now open to look at the Institutional Connect process: how to automatically request iDs and access permissions from users who sign in to ORCID using institutional credentials, and how to push in affiliation information.
NB: Note this subsequent piece in the ORCID newsletter:
Seeking early adopters for Institutional Connect
Are you part of the eduGAIN interfederation service? Have you already built a custom integration with ORCID? We invite you to participate in our pilot project to test out Institutional Connect, a program to facilitate affiliation assertions into ORCID records.
The pilot will test a workflow for researchers using institutional sign-in, enabling them to simultaneously grant permission to their institution to update their ORCID record, including asserting the user’s affiliation with the university. This creates a trusted connection between the individual’s ORCID iD and the member’s name and organization ID, with the source of this connection clearly stated as the institution.
Interested in learning more? Check out our Institutional Connect documentation, and view the slides and video from our recent webinar on Institutional Connect. If you have any questions, or are interested in joining the pilot, please contact email@example.com.
Members should contact Paula to discuss participation in the institutional connect pilot.
For Synchronise, the aim is to:
- Create bidirectional information flow (synchronisation) between ORCID and your system AND/OR
- Automatically update ORCID records AND/OR
- Create a search & link wizard
- In all cases, explain benefits to individual
Premium members can use webhook functionality
To meet Synchronise, some members can create search & link wizards
NB: Since the workshop…
Paula is doing a mapping document of where members stand in relation to the Collect & Connect process, with the potential level that members can reach and next steps. The focus is on members with an active integration.
Most members are using vendor systems for integration. ORCID has been working with the CRIS vendor systems PURE, CONVERIS and Symplectic regarding Collect & Connect and most have a few changes that they need to make to iD display. As soon as PURE, CONVERIS and Symplectic update their iD display ORCID can issue the relevant members with further badges without members having to complete any additional work.
The Jisc Support service will be requesting members who don’t yet display their ORCID member logo to do so on the page where they explain their integration.
Jisc Support will work with members to ensure that the individual elements of Collect & Connect have been met and will advise ORCID Support once certain levels have been achieved. ORCID will award the badges directly and will ensure that the member integration is recognised on the ORCID website in the Collect & Connect integration chart.
Workshop 3 – What questions can ORCID answer for my institution? Exploring the API and common queries
Matt explained that ORCID provides tools for organisations to manage:
- How your name is represented
- Who can claim affiliation with you
- Email addresses
- Research activities
ORCID collects iDs using an authenticated API to ensure:
- The person and the iD belong together
- The iD is correctly entered
- Privacy is respected
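Authenticated collection avoids typing errors altogether, but institutions that already hold manually entered iDs can at least check that they are well formed. ORCID iDs end in a check character calculated with the ISO 7064 MOD 11-2 algorithm. The sketch below (function names are illustrative) verifies it; passing this check shows only that the iD is well formed, not that it belongs to the person.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid_id: str) -> bool:
    """Return True if the iD has 16 characters and a correct check character."""
    compact = orcid_id.replace("-", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == check


# 0000-0002-1825-0097 is the iD of ORCID's fictional test researcher.
print(is_well_formed_orcid("0000-0002-1825-0097"))
```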
Authentication also provides the opportunity to ask for permissions to read/write/update ORCID records:
- Add affiliation data
- Add degree completion and thesis data
- Update affiliation end date
ORCID can enable institutional systems to automatically track a researcher’s activity (see slides). But there is a tension between the questions institutions want to ask of ORCID and the data available from it, either because the data is not in ORCID or because privacy settings make it unavailable.
The biggest question is: who in the institution has an ORCID iD but hasn’t informed the institution? They need to be asked to add affiliations. There are two challenges: 1) institutions need to be able to push data to ORCID, and 2) they need to be able to pull data back when individuals update their ORCID record, but individuals may not make that data available, which raises issues of permission.
Matt provided feedback on what institutions can do via the ORCID API:
Option 1: Search by Affiliation – institution name or Ringgold ID
Option 2: Search by email domain name, although that is imperfect as many researchers use Gmail accounts; or search the ORCID registry by DOI and look for names to match, using approximate matching. There should not be an issue with privacy as this is data that an individual has made public, so it is there to be used.
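The options above might translate into search queries along the following lines. This is a hedged sketch: the public API address and the search field names (`ringgold-org-id`, `affiliation-org-name`, `email`, `digital-object-ids`) are assumptions that should be checked against the current ORCID search documentation, the helper names are illustrative, and the code only builds the query URLs rather than calling the API. The last function shows one simple way to do approximate name matching using Python's standard library.

```python
from difflib import SequenceMatcher
from urllib.parse import urlencode

# Assumed public search endpoint; verify the version in use.
SEARCH_ENDPOINT = "https://pub.orcid.org/v3.0/search/"


def search_url(query: str) -> str:
    """Wrap a search expression in a public-API search URL."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query})}"


def by_ringgold(ringgold_id: str) -> str:
    """Option 1: records asserting an affiliation with a Ringgold ID."""
    return search_url(f"ringgold-org-id:{ringgold_id}")


def by_affiliation_name(name: str) -> str:
    """Option 1: records asserting an affiliation with a named institution."""
    return search_url(f'affiliation-org-name:"{name}"')


def by_email_domain(domain: str) -> str:
    """Option 2: records with a public email address at the given domain."""
    return search_url(f"email:*@{domain}")


def by_doi(doi: str) -> str:
    """Option 2: records that list a work with the given DOI."""
    return search_url(f'digital-object-ids:"{doi}"')


def name_similarity(a: str, b: str) -> float:
    """Score two names between 0 and 1, ignoring case (approximate matching)."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()
```

A threshold on `name_similarity` (for example, treating scores above 0.8 as candidates) would still need manual review, since similar names do not prove that two records describe the same person.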
At the end of the session, Matt also set up a GoogleDoc to allow delegates to ask further questions – a pdf document of questions from the community and answers from ORCID is now available.
The morning concluded with a plenary session chaired by Nicky Ferguson with feedback from each of the sessions.
Networking Session – How are institutions dealing with uptake and advocacy?
Delegates were invited to add post-it note comments to questions posted round the walls about ORCID in their institutions. A write-up of these comments is available here.
Delegates had been asked to bring examples of ORCID advocacy materials that had worked effectively. Colleagues from University of York brought a bookmark and a flyer that had worked very effectively and colleagues from University of Worcester brought a guide they had produced for university staff.
Each delegate could choose two from the following sessions, swapping over at 1450.
Breakout 1 – ORCID and data: Re-using metadata to submit data reports and connect researchers with their outputs
Owen described the Cambridge workflow, which involves submitting a dataset to the data repository and obtaining a DOI for the dataset from Datacite, which includes submitting metadata containing the user’s ORCID iD. Datacite then updates the ORCID registry so that the user’s ORCID iD is connected to the dataset DOI (see handout).
One of the general problems regarding ORCID uptake is getting people to log in to Symplectic Elements and use it, and getting OA to work. Cambridge has developed a solution to make this simpler.
The first step in the dataset workflow happens behind institutional sign-in, so at that stage the individual’s identity is already known, which is an enormous help. When the researcher logs in to Elements, there is now a big red button on the screen, and if they want to submit a publication through OA they press the red button. The first part of the de-duplication step is to see if a publication that the researcher is about to submit already exists. The system then asks the researcher to complete a stripped-down form with four fields. The form has a prefilled title and journal, and the researcher can add co-authors, linking other identities which are linked to ORCID iDs. They can show links to grants, which can be passed to Researchfish as part of an interoperability pilot Cambridge is doing, so the system tackles OA and Researchfish simultaneously.
Once the form is completed, it’s uploaded and publication data goes into the repository (DSpace). Apollo (the name of DSpace data repository) requests a DOI from Datacite, Datacite returns the DOI and then Apollo generates a metadata package.
Datacite is one of the trusted parties that can push information to ORCID records. Every 8 hours, Datacite takes packages from Apollo (and many other data centres around the world) and puts the data in an index that many Datacite services around the world have access to. Once a day, Datacite takes another package, extracts and repackages the ORCID-DOI relationships, then pushes them back to the ORCID registry.
The key step is writing the DOI-ORCID iD relationships to the ORCID registry, which Datacite makes possible. Cambridge can then access the records that appear in ORCID, having been updated there by Datacite.
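To make the DOI-ORCID iD relationship concrete, the sketch below shows how an ORCID iD might sit inside a dataset's creator metadata and how the pairs could be read back out. The report does not describe Apollo's actual metadata package, so the structure here is an assumption modelled on Datacite-style JSON metadata, and the function names, DOI and title are invented for illustration.

```python
def creator_with_orcid(name: str, orcid_id: str) -> dict:
    """Describe a dataset creator and attach their ORCID iD as a name identifier."""
    return {
        "name": name,
        "nameIdentifiers": [
            {
                "nameIdentifier": f"https://orcid.org/{orcid_id}",
                "nameIdentifierScheme": "ORCID",
                "schemeUri": "https://orcid.org",
            }
        ],
    }


def orcid_doi_pairs(record: dict) -> list:
    """List (ORCID iD, DOI) pairs found in one metadata record."""
    pairs = []
    for creator in record.get("creators", []):
        for ident in creator.get("nameIdentifiers", []):
            if ident.get("nameIdentifierScheme") == "ORCID":
                # Keep only the bare iD from the end of the identifier URL.
                orcid_id = ident["nameIdentifier"].rsplit("/", 1)[-1]
                pairs.append((orcid_id, record["doi"]))
    return pairs


# Invented example record; the DOI is a placeholder.
record = {
    "doi": "10.1234/example-dataset",
    "titles": [{"title": "Example dataset"}],
    "creators": [creator_with_orcid("Carberry, Josiah", "0000-0002-1825-0097")],
}
pairs = orcid_doi_pairs(record)
```

Each pair is the kind of relationship that, in the workflow described above, ends up on the researcher's ORCID record once the researcher has granted the necessary permission.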
The payoff is twofold – disambiguating not just the author in the repository but also the publication.
There are currently two complex permission steps, which means the process is not intuitive for the researcher. Authenticating ORCID in Elements requires the researcher to access a third-level menu option. The researcher also has to grant permission for Datacite to write to the ORCID record. This needs to happen at the ORCID registry web interface: by logging into ORCID and going into the dropdown from Works to find Datacite and confirm it. Ideally it would be possible to capture authentication in Elements and Datacite at the same time. In the meantime, to encourage researchers to grant permissions all in one go, Cambridge has found video guides useful.
Drivers for pursuing this process are potentially REF, OA, but Researchfish is the key – this is a potential pain that everyone recognises, so automating Researchfish updating is a big win.
Demo 2: Using ORCID metadata to simplify submission of data reports to publishers: Demo of the data2paper project funded by Jisc by Neil Jefferies, University of Oxford
Neil showed a video of the data2paper demo. He explained that the data2paper project addresses requirements connected to the growth in ‘data papers’ in the sciences. These are methodological papers based on open data sets that describe the methodology used to undertake scientific procedures so they can be replicated. Most regular papers don’t have sufficient detail to allow for replicability. In fact, in some disciplines there is an 80% failure rate.
Oxford spotted an opportunity as most publication submission systems require data that is nearly entirely captured in Datacite and ORCID profiles. The starting point is a repository that supports both Datacite and ORCID iDs. If you have a dataset with a Datacite DOI and an ORCID iD attached, you can connect it to a cloud app that Oxford has developed, which will pick up that metadata and package it so that it can go straight into a publisher’s data paper submission system. All the researcher then needs to do is to attach the text of the paper, and it by-passes all the publisher’s subsystems.
When you log in to the cloud app, you see the dataset selected from the repository – the cloud app is like a mini repository that holds datasets in transit. It also gives easy access to publication templates, shows publishers’ Ts & Cs etc, so, in effect, provides a mini journal data policy system. You (the researcher) can edit the metadata if need be. You select the journal you would like to publish with, select the record and add publisher details. Depending on the journal and the state of their template, you can fill in fields. You upload the edited template and supplementary files, add images, choose a licence etc. You can now submit the data paper straight into the data journal submission system. It takes 8 minutes end-to-end.
If you go to the publisher’s article management system, it should have received a data package in the submission queue with all data already validated because it comes from Datacite and ORCID.
Oxford has done this pilot project with F1000 Research which is an Open Access publisher but other publishers are interested. The system is now ready to go to production.
Breakout 2 – Career and person tracking with ORCID: Implications, challenges and way forward
Presentation 1: How ORCID can help with career tracking led by Matt Buys, ORCID
Matt gave a presentation showing how ORCID can help with career tracking. He explained that each institution needs to store ORCID iDs and Access Tokens in its system so that it can display, connect and synchronise information about its researchers. He stressed the reciprocity between employer, publisher and funder for the researcher, as well as the benefit of entering information once to reuse often.
Institutional collection of ORCID iDs through sign-in works already for many institutions, though some configuration may be needed. ORCID can then be the source of truth of information with asserted affiliations, trust in metadata, clear provenance and an open utility for career tracking. Assertions are key to enabling reciprocity across sectors and the broader community.
Presentation 2: RAPIIDs Demo led by Lucy May, University of Manchester
Lucy May explained the RAPIIDS (Raising Academic Profiles and Implementing IDs) project at University of Manchester, which started in 2016, run by Scott Taylor in the library. This was aimed at getting both academic staff and post-graduate researchers to register for ORCIDs, to help the university track the research activities of its post-graduates. Existing systems and departmental relationships at the university made it easier to implement this. Post-grads are not required to register for ORCIDs when they first start, but after their first year. This has resulted in 60% uptake from communication and encouragement alone, though it is expected that uptake will approach 100% during formal collection at the Y1 review stage. The project is also working with the Alumni office to use ORCID data to track subsequent activity after leaving the university.
Presentation 3: Including ORCID iDs in Wikipedia (and Wikidata, and…) led by Andy Mabbett
This session was led by Andy Mabbett, ORCID’s Wikipedian in Residence. The slides are available online at: https://figshare.com/articles/Including_ORCID_iDs_in_Wikipedia_and_Wikidata_and_/5114329
Andy explained how ORCID iDs are used in Wikipedia, using the example of Christer Fuglesang, a Swedish physicist and astronaut with the European Space Agency. His Infobox (to the right of the main article on Wikipedia) can be replicated automatically in 296 different-language Wikipedias through the use of Wikidata and his ORCID iD, without the need to update manually on each one. Andy showed how Wikidata can be used to research a vast range of facts. Within Wikidata is Wikicite, which allows you to find, for example, all statements citing the works of a particular author; all statements citing journal articles by physicists at your University in 2016; or all authors who have published on a topic, such as the Zika virus.
During the presentation Andy Mabbett made the point (slide 15) that there were only 2,347 ORCID iDs in Wikidata, the linked-, open-data sister project of Wikipedia, and after the event he put out an appeal for help coding a solution to automate the fetching of ORCID iDs, for authors whose papers are already in Wikidata. Magnus Manske wrote a tool to do this, which he has blogged at http://magnusmanske.de/wordpress/?p=464
As a result, there are now over 5,300 ORCID iDs in Wikidata, and the number is steadily growing.
As ORCID’s Wikipedian in Residence, Andy Mabbett is ready to work with institutions to add ORCID iDs to any staff who are already the subject of a Wikipedia biography (or otherwise listed in Wikidata), and ensure the ORCID iDs are displayed on the related Wikipedia articles. If anyone wishes to take Andy up on that offer they can find the details at https://en.wikipedia.org/wiki/Wikipedia:ORCID/Institutions.
Breakout 3 – Disambiguation of authors using ORCID in other services
Demo 1: Utilising ORCID iDs with IRUS-UK, to be able to uniquely distinguish every researcher/author in harvested metadata for statistical reporting purposes, led by Paul Needham, IRUS-UK
This demo and presentation was led by Paul Needham from Cranfield University, one of the founders of IRUS-UK, which is a Jisc-funded national aggregation service for institutional repository usage statistics. It collects raw download data from 127 repositories with data for all item types including non-peer reviewed resources. Download figures are increasing all the time. It then processes the data into a form that adheres to the COUNTER Code of Practice, so institutions have comparable, consistent and credible statistics on the usage of their institution’s research to share with key stakeholders e.g. to report to faculty or compare with other institutions.
In IRUS-UK you can search for a specific item with results by author name. The difficulty is author names, which are very unreliable, and the solution is ORCID as it is a persistent identifier. John Salter from University of Leeds did some work on the White Rose Consortium repository, one of the IRUS-UK participants. He had a ready-made source of ORCID iDs via PURE and was able to create a demonstrator, utilising ORCID records, to be able to distinguish uniquely every researcher/author and so remove ambiguities.
White Rose exposed ORCIDs in the RIOXX profile and John created a custom ‘irus-orcid’ dataset to allow IRUS-UK to harvest ORCIDs via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) without having to trawl through all the records without ORCIDs. That allowed IRUS-UK to do a simple, but fully searchable, Author Index showing each author and their associated ORCID, along with the number of items and total download figures for each author. Clicking on a linked ‘ORCID’ lets you see an ORCID profile page for an individual showing a biography, publications list etc.
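A harvest of that kind could be requested with a standard OAI-PMH `ListRecords` call restricted to one set. The sketch below only builds the request URLs. The repository address is a placeholder, and the `rioxx` metadata prefix and `irus-orcid` set name are assumptions based on the description above rather than confirmed configuration values.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "rioxx",
                     set_spec: str = "irus-orcid",
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request for a single set."""
    if resumption_token:
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {
            "verb": "ListRecords",
            "metadataPrefix": metadata_prefix,
            "set": set_spec,
        }
    return f"{base_url}?{urlencode(params)}"


# Placeholder repository endpoint.
first_page = list_records_url("https://repository.example.ac.uk/cgi/oai2")
```

Restricting the request to a dedicated set is what lets the harvester skip records without ORCID iDs instead of filtering them after download.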
IRUS-UK now just need more institutions to adopt and incorporate ORCID records in their repository metadata and install RIOXX plugins (EPrints) or patches (DSpace) to expose ORCID metadata via RIOXX in their OAI-PMH interfaces.
Demo 2: Use of ORCID iDs in the CORE project, led by Matteo Cancellieri
This demo was led by Matteo Cancellieri of the Open University. CORE aggregates research papers from institutional repositories and Open Access journals. It now has 77m metadata records and 8m full-text scientific articles. It provides an analytical dashboard to give a benchmark of how each institution is performing relative to others.
After the ORCID Hack Day last October, CORE ran an experiment to see how many ORCID iDs they could obtain using DOIs already harvested: of 5m papers with a DOI, 16% have at least one ORCID iD connected, rising to 27% if the focus is restricted to the UK. This work led to an idea for a new dashboard tab in CORE to show information about papers with DOIs in institutional repositories. This could give an idea of how many authors connected to the institution have ORCID iDs, even if they didn’t go back to the CRIS or repository but registered on their own. CORE can give the institution an export of this data to feed back into the institution’s repository. CORE is not a source of trust for ORCID iDs, but institutions could use this data to ask the researchers identified in this way to authenticate. A unique identifier is key for CORE. CORE is planning to offer login through ORCID single sign-in, which opens up many opportunities, e.g. to start searching collaborations.
Optional ‘Birds of a Feather’ sessions
- DSpace session
- EPrints session
- Pure session
- Repository session
The EPrints Birds of a Feather session was mainly spent looking at the EPrints/ORCID integration specification developed by Jisc in collaboration with the EPrints community. Participants reviewed the specification and Owen Stephens from the Jisc UK ORCID support service was able to answer questions from EPrints users about the plugin. Overall there seemed to be agreement that the specification was a thorough document and that the plugin for EPrints developed from the specification would meet the needs of EPrints users, perhaps especially in institutions where EPrints operated independently of any CRIS system (or there was no CRIS system).
NB: Since the workshop a plugin based on the specification has been commissioned by Jisc from EPrints Services, and that work is scheduled to start in July and aims to deliver the plugin for the start of the academic year in the autumn.
In the Pure session, conversation focused mainly on the level and future direction of ORCID integration with Pure. The main take-home message was probably a shared frustration at the pace of progress towards synchronisation between Pure and ORCID. Since the meeting, Helen Newham has initiated a discussion among the UK Pure user group on how to take this further and made contact with members of the Dutch Pure user group in order to link up with them on this matter.
We also briefly discussed experience with the Pure-ORCID-Researchfish interoperability. However, while interest is there, so far experience with this had been limited among those present.
The afternoon ended with feedback from the afternoon sessions and agreement to share the presentations and follow-through from the day with the ORCID mailing list.
If you have any queries about the event, please contact the UK ORCID Support Service at firstname.lastname@example.org.