Developing a guideline to standardize the citation of bioresources in journal articles (CoBRA)

Sharing bioresources: needs and impediments

Bioresources are collections of data and/or samples that are scientifically built
and systematically documented. These include physical resources like human biobanks
with associated health information, databases, plant and animal repositories, registries,
and bioinformatics tools. Although an increasing proportion of biomedical research
relies on bioresources, these are seldom shared [1]. In clinical research, for instance, nearly half of trials remain unpublished [2]. In some instances, data sharing may be prohibited to protect subject/patient/victim
confidentiality, proprietary interests or national security, or for political reasons,
but these are very specific cases. Most of the time, sharing does not occur because
it entails a long and time-consuming process whose cost is far from fully appreciated
and is hardly recognized by stakeholders.

Sharing bioresources does not simply involve providing access to other users. Good
sharing requires specific work on metadata, on quality control for data and samples
management, and on documentation. Sharing also requires regular bioresource updating
and the development of a clear access policy. This work is not trivial. It is neither
recognized nor valued in the present academic world, nor is it considered an important
aspect of evaluation processes. This lack of recognition is a major obstacle to sharing.
The more difficult, costly, time-consuming, and highly specialized the development
of a useful bioresource is, the more researchers and institutions will expect recognition
for their contribution. If the scientific community does not value this work, it is
unlikely to be performed adequately.

Sharing research outputs, data, and resources contributes to building reliable knowledge
and generating innovation. This is now supported by funding agencies that foster data
sharing, especially in health research [3]. The European Commission emphasizes that research data are as important as publications,
and it has committed to openness in its present funding scheme, Horizon 2020 [4]. A number of initiatives are being developed to encourage exploiting existing healthcare
databases and exchanges in the medical world [5]. Initiatives of interest include the Canadian Network for Observational Drug Effect
Studies [6] and the initiative for Quality Assessment of Administrative Data [7] of the Institute for Clinical Evaluative Sciences. Some journals, including Annals
of Internal Medicine [8], provide guidance on data sharing and a number of authors have published articles
addressing the issue of sharing health data [9],[10].

However, these kinds of initiatives mainly concern patient/trial data, and no standard
exists regarding the use of bioresources in general [1],[11],[12]. One potential solution, outlined in this article, is to develop a way to incentivize
the biomedical community through harmonized citation and recognition processes.

Why do we need a standard for bioresource citation?

As explored in a previous article [1], it is extremely difficult to identify the contribution of any specific bioresource
to research published in scientific articles because bioresources are either cited
in a confusing, heterogeneous, and unstandardized way, or they are not cited at all.

In most cases, because of the current modalities of citation (summarized below), the
use of a bioresource in a research article is not retrievable via PubMed or other
bibliographic databases, which only index abstracts. This prevents proper traceability
and visibility of bioresources in scientific literature or in other (online) sources that
would highlight their use and thus encourage bioresource sharing. For instance, an
analysis published in 2011 showed that half of the papers published in biomarker
research contained no information about the biospecimens used [13].

The points below summarize some of the negative effects of citation heterogeneity
on the tracking of bioresource use, and the limitations generated by this situation.
Adopting a citation standard would bring several advantages, which are also outlined.

Current modes of bioresource citation

Bioresources may be acknowledged in various sections, including Materials and Methods,
Acknowledgements, and References

Bioresource acknowledgements or citations may be placed outside the main paper, or
in online supplementary materials

Citations may acknowledge different resource levels, for example, citation of the
consortia or network, but not the individual bioresource

Secondary use, such as the use of derivatives from the original bioresource (e.g.,
extracts from biospecimens)

Typing errors or approximation of the bioresource name/identification; a multiplicity
of names for a given bioresource; various names in different languages

Acknowledgement of persons and authorship instead of the bioresource itself

Absence of bioresource citation (negligence)

Websites: references to websites that no longer exist, are not updated, or are not
informative

Absence of a Material Transfer Agreement or Data Transfer Agreement number and/or
of reported information on access

Effects of variations in bioresource citations on tracking bioresource use

Full-text mining is required to trace bioresource use

No traceability

Difficult or impossible to recognize the use and contribution of individual bioresources

Disregarded criteria for correct reporting

At worst, no tracking is possible if there is no citation at all

Misleading or incomplete information about bioresource use

Effect of the lack of standardized citations

Bioresources are not indexed in PubMed or Web of Science

Incomplete information on the bioresources used in research articles

Lack of recognition and traceability

Studies may be impossible to replicate by others

No information on the number and type of biosamples, the amount of associated data,
or the data exchanged between the biobank and the user

Limited ability to develop accurate indicators of bioresource activity

Limited ability to adopt suitable policy for stakeholders at any level

Underestimation of the utility of bioresources

Advantages of standardized citation

The possibility to search literature for the use of a bioresource

Available information about bioresource use

Development of indexing tools in PubMed and Web of Science to track the use of bioresources

Support for the development of metrics to assess bioresource impact

Facilitation of stakeholders’ work (e.g., policy, decision-making, assessment, etc.)

Increased recognition of infrastructure and institutions involved in creating and
maintaining a bioresource

Improved specific knowledge of biospecimens and databases used in research articles

Increased sharing of data and biological samples

Improved trust of bioresource contributors including patients and donors

To track publications involving a bioresource, it is essential that researchers consistently
acknowledge the use of the bioresource by placing unique and traceable information
in all relevant publications in a defined section of the article. Ideally, this information
should include an actionable digital identifier (ID) assigned to the bioresource.
To date, such an ID is not available. In order to fulfil the requirements of the scholarly
record, a bioresource ID should be persistent, globally unique, citable, and easily
retrievable through the Internet. There is major debate over which body or bodies
should be responsible for assigning and managing bioresource IDs 11].
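
To illustrate why a unique identifier placed in a defined section would simplify tracking, the following minimal Python sketch extracts identifiers from the text of a single article section. It is only an illustration under stated assumptions: the `BRID:` prefix, the digit format, the function name, and the sample sentence are all hypothetical, since no bioresource ID scheme exists to date.

```python
import re

# Hypothetical ID format, assumed purely for illustration:
# the prefix "BRID:" followed by 4 to 8 digits. No such scheme exists yet.
BIORESOURCE_ID = re.compile(r"\bBRID:\d{4,8}\b")


def find_bioresource_ids(section_text: str) -> list[str]:
    """Return the distinct IDs found in one article section,
    in order of first appearance."""
    seen: list[str] = []
    for match in BIORESOURCE_ID.findall(section_text):
        if match not in seen:
            seen.append(match)
    return seen


# Invented sample text standing in for a Materials and Methods section.
methods = (
    "Samples were obtained from the Example Biobank (BRID:000123) "
    "and linked to registry data (BRID:004567). "
    "Aliquots from BRID:000123 were re-analysed."
)

print(find_bioresource_ids(methods))
```

With an identifier of this kind, a single pattern match replaces the full-text mining and name disambiguation that heterogeneous citations currently require.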

The systematic and standardized citation of bioresources in journal articles is needed
for the fair recognition of the impact of bioresources on health research, both in
qualitative and quantitative terms. While textual resource citation follows clear
editorial guidelines, citation rules for bioresources are yet to be defined. Most
current initiatives address data sharing policies or technical aspects of sharing,
such as interoperability, quality of data, and standards for data management, but
not the recognition of the work required in data sharing and the traceability of such
sharing [1],[14],[15]. However, a recent initiative by publishers has begun to address this issue [16]. The Bioresource Research Impact Factor (BRIF) initiative has taken the lead on activities
aimed toward standardizing the bioresource citation process, together with biomedical
editors [17],[18] and other scientific organizations involved in bioresource monitoring and sharing.
This process of collaboration and its principal outcomes will be discussed in detail
in the following paragraphs.

The BRIF initiative

The BRIF initiative [19] is an on-going international framework, the major features of which are summarized
in Table 1. Five dedicated working subgroups deal with priority tasks: adapting ways of identifying
bioresources, parameters to consider when calculating impact, standardized ways of
citing bioresources in scientific literature, the embedding of BRIF in data sharing
policies, and dissemination. The “BRIF and Journal Editors” subgroup first established
connections with editors to standardize bioresource citation. The next phase is to
implement BRIF, once it has been assessed, with a specific task being to assess proposals
for amending available editorial guidelines. The subgroup brings together three editorial
experts and two biomedical researchers involved in biobanking based at the National
Institute of Health (Istituto Superiore di Sanità) in Italy, as well as the BRIF leader
and manager, who are from a joint research unit between the French National Institute
for Health and Medical Research (Inserm) and the University Toulouse III. All members
of the workgroup are involved in coordinating the work carried out with journal editors
and in leading the development of a guideline for bioresource citation in the scientific
literature.