Citation indexes integrated management for Institutional Repositories data enrichment

Dimitrios Kouis, George Veranis, Marios Zervas, Petros Artemis, Andreas Giannakopoulos, Christos Bellas, Konstantina Christopoulou

An important problem for researchers and for agencies (e.g., Quality Assurance Units) that are responsible for evaluating the research activity of academic entities (e.g., laboratories, departments, or entire institutions) is to automatically locate and retrieve bibliographic records (e.g., scientific papers) and their citations from the various citation indexes.

To calculate uniform bibliometric indicators, the documents collected from the different citation indexes must first be deduplicated. In addition, a tool that automates this retrieval and deduplication could help academic libraries upgrade their Research Repositories with auto-enrichment capabilities, saving their staff valuable working time.

In this context, the initial results of implementing such a tool are presented. The tool extracts data from four popular citation indexes (Scopus, Google Scholar, Web of Science and PubMed) and the ORCID service. It aims to provide integrated management of multiple citation indexes, namely the collection of data per researcher and the application of deduplication algorithms so that a list of unique publications is obtained for each researcher. The processed data are combined with the data of the Institutional Repository and converted into a suitable format for ingestion.
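
The abstract does not specify which deduplication algorithms the tool applies, so the following is only a minimal sketch of one common approach, under stated assumptions: records are matched first on DOI and, when a DOI is missing, on a normalized title plus publication year. The record fields (`source`, `title`, `year`, `doi`) and the function names `normalize_title` and `deduplicate` are hypothetical and chosen for illustration.

```python
import re
import unicodedata


def normalize_title(title):
    """Strip accents, lowercase, and collapse punctuation/whitespace runs."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # [\W_]+ is Unicode-aware, so non-Latin letters are preserved.
    return re.sub(r"[\W_]+", " ", text.lower()).strip()


def deduplicate(records):
    """Merge records that describe the same publication.

    Assumed (hypothetical) record shape:
    {"source": str, "title": str, "year": int or None, "doi": str or None}
    """
    unique = []    # merged publications, in first-seen order
    by_doi = {}    # lowercased DOI -> merged publication
    by_title = {}  # (normalized title, year) -> merged publication

    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        title_key = (normalize_title(rec["title"]), rec.get("year"))

        # 1) Prefer an exact DOI match.
        match = by_doi.get(doi) if doi else None

        # 2) Fall back to title + year, unless both sides carry different DOIs.
        if match is None:
            candidate = by_title.get(title_key)
            if candidate is not None:
                conflict = doi and candidate["doi"] and candidate["doi"] != doi
                if not conflict:
                    match = candidate

        # 3) No match: this is a publication we have not seen yet.
        if match is None:
            match = {"title": rec["title"], "year": rec.get("year"),
                     "doi": doi, "sources": []}
            unique.append(match)
            by_title.setdefault(title_key, match)

        # Record the DOI if this source supplied one the match lacked.
        if doi:
            match["doi"] = match["doi"] or doi
            by_doi[doi] = match

        if rec["source"] not in match["sources"]:
            match["sources"].append(rec["source"])

    return unique
```

Exact matching on a normalized title is deliberately conservative; a production tool would likely need fuzzy matching to handle truncated or translated titles, which this sketch does not attempt.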

The Institutional Repository of the Cyprus University of Technology has been selected as a testbed. The obtained results should be applicable to other universities as well.

University of West Attica, Cyprus University of Technology, Aristotle University of Thessaloniki
University of West Attica
Journal of Integrated Information Management, 2021, vol. 6, no. 1, pp. 14-24