Interviewing Repository Managers for the NORF Open Access Repository Project: The Process and Some Preliminary Data

 


My previous blog post on the NORF Open Access Repositories project outlined the preparatory work for interviews with stakeholders working in or with Irish repositories. Since then, I have conducted seven interviews with repository managers overseeing academic and governmental repositories across Ireland. Five further interviews with repository strategists will follow later in September. We designed the semi-structured interviews with repository managers not only to cover technical questions around metadata, but also to determine what issues most affect open access repositories in Ireland, what barriers to greater metadata compliance they face, and what other factors affect the daily running and strategic development of such repositories. Below I outline how interviewees were selected, how the interviews were conducted, and some preliminary insights from the repository manager interviews.

Selecting Interviewees

Data gathering through qualitative, semi-structured interviews with seven institutional repository managers and five institutional repository strategists is underway. Together, the interviewees represent a range of governmental and educational research institutions spread geographically across Ireland. They were chosen from a limited pool of potential interviewees drawn from 30 registered repositories. The interviewees are all subject matter experts (SMEs) in the field and share homogeneous characteristics, including both specialist technical metadata knowledge and deep institutional knowledge. These shared characteristics allow comparison between institutional practices, norms, and expectations. At the same time, they represent a heterogeneous mix of small and large academic and governmental institutions from different areas of Ireland, with differing remits in terms of research audiences and institutional missions, allowing important contrasts to be drawn based upon institutional characteristics.

The two groups of interviewees also differ in their professional functions and roles. Repository managers have daily working knowledge of metadata issues in repositories and, in interview, can yield valuable knowledge about deficiencies in current metadata practices and policies. Those with responsibility for repository strategy can inform the project about broader issues and trends that affect institutional repositories at the policy, institutional, and national levels.

Advantages to Small Interviewee Numbers

There are advantages to interviewing a small number of experts in qualitative research. SMEs possess a high level of knowledge and expertise in their field. Due to their extensive experience and specialized understanding, even a small number of experts can provide valuable insights and in-depth perspectives on a research topic. Small sample sizes in qualitative research provide rich, detailed, and nuanced data. Researchers can delve deeply into each expert’s responses during the interviews, exploring the intricacies and complexities of their viewpoints. This level of depth will provide valuable and comprehensive information for analysis and interpretation alongside the survey data already collected, allowing for much greater insight into the survey snapshot of the repository landscape. SMEs are usually selected based on their expertise and relevance to the research question. Their inclusion in the study ensures that the insights gained are from some of the most knowledgeable and qualified individuals in the field, which, in this case, focuses on institutional repository standards and strategies.

Interview Process

Although the interviews primarily focused on metadata issues, we asked interviewees to talk freely about any aspects of their role or institutional environment that came to mind during the questions, including things that work well and any difficulties or barriers they face in their role. This qualitative data will help guide the project towards a national metadata roadmap that is more holistic and inclusive than one based solely on technical issues, and that takes a community-driven approach fostering mutual interdependence among Irish repositories.

The 21 questions were grouped into themes covering metadata generation, metadata compliance, compliance support, barriers and concerns, training and development, and questions about the interviewees’ particular role. Interviews were conducted and recorded over Microsoft Teams; the automatic transcripts will be corrected and anonymised.

Preliminary Data

According to interviewees, there is a great deal of variety in working practices when submitting articles to repositories and in generating metadata. For some repositories, much of the metadata is generated according to set criteria during deposit, usually with some fields populated by the author and the rest entered by repository staff. For others, the authors have sole responsibility for submitting the article and for ensuring that fields are correctly filled at the time of deposit, although staff provide a secondary level of oversight, spending time correcting mistakes or filling in forgotten or missing metadata such as funder information. Levels of oversight and mediation varied by institution. This was often heavily dependent on the staffing available, which was closely related to resource pressures (of which there were many!) including time, training, and financial support. Research items are often sought out or gathered by staff themselves from sources such as Elsevier and PubMed, adding another time pressure to their already heavy workloads. For those repositories ingesting research from publishers and aggregators, the relevant metadata is supplied along with the article itself. In certain repositories some fields can also be prepopulated, as their values are generally stable over time and across submissions. However, even when metadata is imported alongside articles, there are usually fields that do not populate and must be added manually.

Scope of Manager Role

Many interviewees worked as the sole staff member on the repository, handling all aspects of operation, management, maintenance, and development even when the repository was not the only or major part of their job. As one manager described their work:

“I mean, the repository has a staff complement of one and that's me, and so that means a lot of my time is spent on comparatively low-level stuff such as, you know, just data entry, or just approving other people’s entries. And you could say again, I don't really have an IT background myself and some of the more technical work that would be required is beyond what my present skill set would be.”

Others had teams working for them, with multiple staff responsible for different aspects of repository work:

“I would say it's about 40% of my role…and it's a very big priority of my role and it's in my job description. It's very well described and such and I do have a permanent job. So I would say that, you know, that's a commitment on the university's behalf for this area and also significant resources in the senior library assistant and library assistant who work full time on [the repository].”

Metadata Alignment & International Best Practices

Another topic of conversation centred on the importance of adhering to and aligning with international best practices and guidelines for metadata. While all considered alignment with international guidelines important, there were significant differences in the level of control repositories had over this issue, particularly among those hosted by a third-party commercial vendor. While having a commercial vendor solves many issues involved in developing, hosting, and running the repository, it also forces repositories to rely on the platform capabilities and development timetables of the vendor. One interviewee mentioned that their customer support was based in California, meaning they could only speak to them during certain hours of the morning and evening. Although the level of customer support was very good, this still created some additional barriers to resolving issues in a timely manner.

Some repositories were also waiting for upgrades to the latest version of their platform, in a customer queue behind larger repositories. This affected the level of alignment they could achieve. While many repositories align with OpenAIRE Guidelines for Literature Repositories v3, they were waiting for platform improvements to align with v4. For others, upgrading to the latest OpenAIRE guidelines would be a great deal of work, especially considering staff and resource pressures, but it was work they deemed important and necessary.

OpenAIRE

A number of interviewees mentioned the importance of OpenAIRE compliance in the context of their membership of European research alliances and, more generally, for maintaining a strong research presence in Europe and internationally. One interviewee tied this metadata backbone into the integration of the repository with their institutional CRIS (Current Research Information System) alongside the wider impact of their open access research outputs:

“We had our CRIS system, we had our repository and this was our guiding force, the outside community both in Ireland and internationally…So what I meant was that the metadata was going to be very strongly governed by the CRIS system, which was built originally on a very early version of CERIF, the Common European Research Information Format based on people, publications, projects and affiliations…

Research is an international enterprise, and we are a tiny country and we collaborate hugely with other countries, including with the UK, but increasingly with other parts of the world.
So it's very important for us to be compliant with that for harvesting purposes and everything else, for example, with OpenAIRE. That's an outstanding example of harvesting, it’s probably the primary one that that we have at the moment, especially since we no longer have our own national Open Access portal.”

Data Mining

When asked whether they would support organisations such as OpenAIRE using data mining and machine learning to add to and improve repository metadata, all interviewees said they would. One said about developments in machine learning:

“I'm obviously a strong advocate of Open Access and manage an Open Access repository. And I'm also very much an advocate of using machine learning and data mining to make our lives easier. And we've conducted a sort of test project as regards machine learning and data mining, which we carried out in conjunction with somebody from IBM and they mined the content of certain collections in [the repository] and integrated that with a chatbot on the [institution] library website. So the people who enter questions that are potentially answerable by [the repository] can have the answers delivered to them rather than go searching for it directly. We've carried out at least proof of concept experiments in that direction ourselves. The results of that were there were positive enough and we had no trouble with it.”

Difficulties and Barriers Faced by Managers

One of the major themes to emerge in interviews was the difficulties and barriers faced by those working with little help or support, often working alone and trying, at the same time, to keep their repositories and their own metadata subject knowledge up to date. Interviewees spoke of their desire for support and resources, particularly from outside their own institutions, where they knew the resources to assist them were limited and unlikely to change. They were excited at the prospect of community guides to support them, of training workshops and materials to assist their ongoing development and subject knowledge, and of a formal community network of managers and repository staff. While some had developed small, informal networks over the years, these often fizzled out as people changed roles, or were inadequate to address all their informational and support needs. A strong desire for a formalised network was voiced multiple times by different interviewees, alongside more national training and a knowledge base for finding answers to the many queries that arise during daily repository management, not least dealing with the endless acronyms! One interviewee described the difficulty of not knowing everything you might need to know:

“Maybe even just an online course that could actually say, well, there's these aspects that you need to know about. These are the main aspects, and then there's these aspects that branch out from that, because inevitably it's like a tree. You know, you get to the trunk and you think that's the tree, but then there's so many branches off it that you don't realize it underpins everything else that you need to know about. And you're maybe starting off at the leaf and not realizing you have to kind of work your way back, so that for me is the issue and obviously being quite new to the job as well in terms of managing the repository, I find that difficult. And then also asking questions and maybe people not knowing the answer and people being afraid to say ‘I don't know,’ which I find is a massive thing.”

Conclusion

These are just a few sample snippets from the detailed interviews with repository managers, which provided great insight from subject matter experts working in the field of repository management in Ireland. There will be much more to come as the project interviews more people and writes up the findings. The major milestone due in December is the landscape report, which will lay out the issues in detail before we work with the community to draft metadata guidelines for the Irish repository ecosystem. These interviews will form a valuable resource for drafting guidelines that deliver a truly community-focused approach. My thanks to all those who agreed so readily to be interviewed and to contribute their time and expertise to this valuable work.


Dr Christopher Loughnane is the NORF Open Access Repositories Project Manager at the University of Galway Library
