Career paths and projections in Biocuration: Panel discussion from the Biocuration2021 virtual conference

By Nicole Vasilevsky and Sabrina Toro

The ISB hosted the second session of the Virtual Biocuration Conference on June 15, 2021. The session, chaired by Peter Uetz, Ph.D., from Virginia Commonwealth University, focused on career paths and projections in Biocuration and hosted three panelists: Pankaj Jaiswal, Ph.D., Professor in Plant Genomics at Oregon State University (OSU) in Corvallis, Oregon; Tanya Berardini, Ph.D., co-founder and Chief Scientific Officer at Phoenix Bioinformatics in Newark, California; and Nicola Mulder, Ph.D., Professor of Computational Biology at the University of Cape Town in South Africa. The session recording is available here.

Panelist paths in Biocuration

Dr. Tanya Berardini entered the biocuration field after completing a Ph.D. and a post-doc, when she joined The Arabidopsis Information Resource (TAIR) as a curator. When TAIR underwent a funding crisis after many years of serving the plant genome community, Dr. Berardini and her colleagues founded the non-profit Phoenix Bioinformatics, which developed a sustainable model to support the TAIR database through subscriptions. Phoenix Bioinformatics has subsequently expanded into helping other databases and resources address funding issues through subscription and membership models. Dr. Berardini’s career path is unique, as she initially performed database curation for a single resource, TAIR, and now also works in an entrepreneurial position. She has learned various aspects of running a business (such as Human Resources, insurance requirements, and contract negotiation), as well as curation in additional domains outside of plant biology. Dr. Berardini noted that her detail-oriented curation skills and experience with databases were very transferable to the business world.

Dr. Pankaj Jaiswal’s work on sequencing plant molecules (his initial training was in biochemistry and plant molecular biology) prompted his interest in bioinformatics analyses and genome biology curation. He currently runs a wet lab (“on the bench”) and a dry lab (“at the computer”) at OSU in the Comparative Plant Genomics department. Dr. Jaiswal leads the curation efforts for the Gramene database and the Planteome projects, which require the creation of ontologies for the standardization of plant characteristics such as gene function, phenotypes, pathways, and gene expression. Dr. Jaiswal started curating during his basic science training, as he read papers, learned about specific subjects, and synthesized information to address biological questions. His efforts to make information easier to synthesize, interpret, search, and access included networking with peers, including Gene Ontology and Model Organism Database curators, and brought him to the field of biocuration. Dr. Jaiswal currently trains his students, post-docs, and researchers to apply data standards and learn the curation process to build upon the foundations laid by the biocuration community.

Dr. Nicola Mulder holds a Ph.D. from the University of Cape Town in South Africa, where she did basic science research and studied molecular biology of infectious diseases, which ultimately led her to bioinformatics. She became a curator at the European Bioinformatics Institute (EBI), first at SwissProt, then as part of the InterPro project, which she went on to lead. Dr. Mulder currently leads the Pan African Bioinformatics Network for the Human Heredity and Health in Africa (H3Africa) in Cape Town, which supports bioinformatics and genomic analysis in Africa. Her team brought together a global community of experts, including clinicians, biocurators, and ontologists, which led to the development of the Sickle Cell Disease Ontology (SCDO) in response to the need to standardize information around Sickle Cell Disease, and the Hearing Impairment Ontology. Dr. Mulder and her team’s curation efforts include standardizing phenotype data for research cohorts and curating genomic data for African relevance, such as curating single nucleotide polymorphisms (SNPs) from African populations and curating diseases that are relevant to Africans.

Becoming a Biocurator

The field of biocuration is still relatively new and small; colleges and universities do not typically offer a degree in biocuration. Therefore, unlike in many other fields, the path to becoming a biocurator rarely follows a straightforward trajectory. Many biocurators are subject matter experts in various subdomains of biology who completed a Ph.D. in a biological area, or have a background in some aspect of computer science or semantic technologies, and have an interest in standardizing data. Our panelists shared some suggestions for those interested in joining the field:

  • Draw on your area of expertise: Most databases focus on specific subject areas, and expert community contributions (such as contributions to open biomedical ontologies, including all of the OBO Foundry ontologies) are always needed, welcomed, and greatly appreciated. If you notice missing information or content in a database, reach out and share your knowledge.
  • As a researcher, curate your data before it is published: Work with the databases to make sure your data is prepared in a proper format for completeness and efficiency before you publish. Dr. Berardini mentioned that over 10,000 labs work on Arabidopsis, creating a massive backlog of papers to curate. Structuring data before and at the time of publication dramatically assists with the curation process.
  • Volunteer at databases: If you have expertise in a particular field, contact the databases directly to discuss opportunities to contribute. Volunteering can help you build experience, contribute to biocuration efforts, and network within the community. In addition, volunteering can reveal whether the field is right for you. Biocuration requires a particular personality, including attention to detail and a desire to organize. While some people derive extreme satisfaction from it, others can find it quite tedious. Dr. Berardini noted, “If through volunteering, you find biocuration brings you joy, this is the right career for you.”
  • Participate in hackathons, data jamborees, biomedical competitions: these events bring together researchers across various career stages, from junior biologists to practicing clinicians, and are opportunities to network, build your CV, and contribute to impactful work. Examples are biomedical competitions like Dream Challenges, and hackathons, data jamborees, face-to-face meetings, and online events hosted by Dr. Mulder to facilitate community curation of H3Africa projects. 
  • Do as much training as you can: Courses are available, such as massive open online courses (MOOCs), college courses, and the newer Post-Graduate Certificate in Biocuration offered by the University of Cambridge.
  • Build your skill set: Search for job advertisements to determine what qualifications are needed, and work towards enhancing your skill set and competencies that meet job requirements. As an outcome of the Careers in Biocuration Workshop at the Biocuration 2018 conference, we created a generic position description for a biocuration profession, which is available here.

Biocuration career opportunities

Many opportunities exist in the biocuration field: biocuration in academia, which may entail biocuration for grant-funded database projects and ontology development, such as the work of Dr. Jaiswal; community-based bioinformatics and curation projects, such as those led by Dr. Mulder; and biocuration in a non-profit business setting, such as Dr. Berardini’s work at Phoenix Bioinformatics. Biocuration opportunities are also available in industry, as companies are recognizing the importance of curating and standardizing data (for example, standardizing clinical trial data); in government agencies; and even for independent consultants.

The skills gained as biocurators, such as attention to detail, the ability to take in and synthesize data, and computational skills, are very valuable and can be translated to different areas, such as other areas of science or technologies.

Biocuration is a growing field and we anticipate that, as the amount of biological data being generated increases, so will the demand for curators. The ISB aims to promote the field and support our community by disseminating job openings (see regular posts on our website here), training opportunities, and networking. The ISB also promotes collaborations and exchanges between biocuration groups and offers funding for exchange fellowships. These fellowships fund members to visit another laboratory or organization for training or knowledge sharing; more information is available here.

Researchers have the opportunity to better structure their datasets, share their data in repositories, and better structure the content that they publish; however, they are often unaware of the career opportunities in biocuration. We have not only an opportunity to promote the biocuration field, but also the responsibility to train future generations, provide knowledge transfer, and have succession plans for those coming up after us.

EXECUTIVE COMMITTEE ELECTION 2021

The election of the new International Society for Biocuration Executive Committee (ISB EC) will be held from September 27 – October 04, 2021.

A list of candidates for 2021 is available here.

The Executive Committee is composed of nine (9) members, each with a 3-year term. Being a member of the Executive Committee is a great way to become directly involved with the work of our society, and contribute to the decisions that are taken on behalf of the biocuration community. We would like to encourage all members interested in running for election to get involved in the process.

Serving on the ISB EC minimally involves attending monthly (1 hour) teleconference meetings, following up on any action items from meetings, and promoting the ISB’s activity to members and non-members. Examples of activities performed by EC members include reviewing micro-grant submissions, preparing calls for participation for hosting Biocuration meetings, preparing materials for the ISB election, monitoring ISB mail, and maintaining the website. We particularly encourage candidates with web development skills or experience working with WordPress to apply. The typical workflows involve basic knowledge of Git; PHP/HTML/CSS for fine-tuning the WordPress website; WordPress plugin management; and domain name/email redirection management.

There are specific positions such as Chair, Secretary and Treasurer that do require a larger time commitment, as they are in charge of leading the steps of the EC and by extension the membership. There is no expectation that new EC members would take on these responsibilities.

Four positions on the Executive Committee are up for election in 2021/2022. These positions are currently held by Mary Ann Tuli, Frederic Bastian, Jane Lomax, and Sandra Orchard. Mary Ann, Frederic, and Jane can re-stand for election. (The current ISB EC members are here.)

2021 Electoral Process

A) The Nominating Committee:

A Nominating Committee (NC) has been formed to oversee the electoral process, to review applications, and establish the final list of candidates. We are very grateful for their assistance with the execution of this election. The members of the 2021 Nominating Committee are:

Cristina Casals
Moni Munoz-Torres
Leonore Reiser
Laurens Wilming
Val Wood

B) Instructions to Candidates: 

  1. If you would like to run for a position on the Executive Committee, you must first register your intent with the NC by emailing isb@biocurator.org
  2. Please fill out this form by 27 August 2021, which includes a ‘statement of intent‘, a brief biographical sketch, and a ‘conflict of interests‘ statement describing any activities, memberships of other associations, editorial positions on journals, etc. (Please email us at isb@biocurator.org if you are unable to access this form.)

C) Timeline:

  • Nominations will be received until 27 August 2021.
  • The NC will review all candidacies and share their selections with the ISB Executive Committee by 13 September 2021.
  • Candidates must be announced to the membership and on website (with letters of intent) by 20 September 2021.
  • Voting will take place online over the course of one week from 27 September – 04 October 2021. (Further details about the voting process will be shared soon). The election officer is Petra Fey.
  • Only current members, as of 20 September 2021, who receive an email* will be allowed to vote.

*Note – if you do not receive the email, please contact us at isb@biocurator.org

The Nominating Committee is looking forward to receiving your applications!

The Future of Biocuration: Panel discussion from the Biocuration2021 virtual conference

By: Nicole Vasilevsky and Jane Lomax

Like all in-person gatherings in this past year, the annual International Society for Biocuration conference went virtual in 2021. At the inaugural session on April 13, 2021, a group of panelists discussed ‘the future of biocuration’. The panel was moderated by Rama Balakrishnan, who has served on the ISB Executive Committee since 2017, and is the co-chair (along with Susan Bello from the Jackson Laboratory) of the Biocuration2021 conference. Rama was joined by four panelists from various roles in academia and industry to discuss what is in store for our community. The recording is available here.

What is curation: Distilling knowledge from information

Rama initiated the discussion with the fundamental and relevant question, ‘what does the word curation mean to you?’ Many curators can probably relate to this question, which is frequently asked by people outside the field. The role of a curator at a museum, for example, may be more familiar, but biocuration is a less well-understood field. Rama, who has held varying roles as a curator (academic and industry), sought to explore how the actual task of curation may differ amongst us. Sandra Orchard, from EBI, shared a classical definition of ‘turning unstructured data into structured searchable data’, but recognized that this is not always true: whilst some curation tasks involve making data more structured, text-minable and machine-readable, the outcome of data curation is not always completely structured data. Carol Bult from MGI defined curation as “applying semantic standards to ensure data findability and aggregation.”

Coming from the industry perspective, both Kambiz Karimi (Myriad Women’s Health) and James Malone (SciBite) agreed that curation involves meaning-based capture and structuring of content using controlled vocabularies. Data curation can also include data cleaning, which is often a pre-curation task. Curation can help improve and enrich data interpretability and ultimately add value. It allows for enhanced search, querying, semantic integration and meta-analysis.
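
To make the idea of structuring content with a controlled vocabulary concrete, here is a minimal sketch in Python. It is only an illustration, not a description of any panelist's workflow: the vocabulary, the `EX:` identifiers, and the function names are all hypothetical placeholders rather than real ontology terms or tools.

```python
import re

# Hypothetical controlled vocabulary. The "EX:" identifiers are placeholders
# for illustration and are NOT real ontology identifiers.
VOCAB = {
    "EX:0001": {"label": "hearing impairment", "synonyms": ["hearing loss", "deafness"]},
    "EX:0002": {"label": "sickle cell disease", "synonyms": ["sickle cell anemia", "scd"]},
}


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())


def build_index(vocab):
    """Map every normalized label and synonym to its term identifier."""
    index = {}
    for term_id, entry in vocab.items():
        for name in [entry["label"], *entry["synonyms"]]:
            index[normalize(name)] = term_id
    return index


def curate(free_text_values, vocab=VOCAB):
    """Split free-text values into those mapped to a term and those left for a curator."""
    index = build_index(vocab)
    mapped, unmapped = {}, []
    for value in free_text_values:
        term_id = index.get(normalize(value))
        if term_id is None:
            unmapped.append(value)  # needs human review
        else:
            mapped[value] = term_id
    return mapped, unmapped
```

Anything that lands in the unmapped list is where the human curator comes in; exact matching like this only handles the easy cases.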

How can we ensure quality?

Given that the panelists all agreed on a high-level definition of curation, Rama then asked about ensuring data quality. What does good quality mean and what are metrics to assess quality? Different quality control (QC) and quality assurance (QA) processes apply, depending on the type of curation that is being done, whether you are curating tax forms (as James did in a summer job long ago) or curating the mouse biology literature. Some processes that were discussed by Carol and others included intercurator checks, crowdsourcing feedback from downstream users, practices to ensure collaboration, and regression testing to ensure continuity and consistency across datasets. Sandra pointed out that curators cannot be experts in everything, and stressed the importance of specialist databases with curators who are domain experts, who can take the first pass at the curation and build re-processing pipelines or scoring mechanisms to export high quality subsets to other data resources.

James and Rama noted how detecting outliers can assist with quality checks. However, it may not always be easy to detect outliers without expert knowledge in a specific area. For example, Rama curates patient data at Genentech, and once came across a record reporting that a patient had a 100℃ fever (rather than 100℉), which was easy to spot as an error. However, in a more complicated clinical use case, erroneous data points may not be so obvious and may require more specialized knowledge to detect.
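
A simple range check of the kind that would catch the fever example can be sketched as follows. This is a rough illustration only: the plausible range, the unit codes, and the function names are assumptions made for this example, not values or tools used by any of the panelists.

```python
# Assumed plausible range for human body temperature in Celsius.
# These bounds are illustrative, not clinical guidance.
PLAUSIBLE_CELSIUS = (30.0, 45.0)


def fahrenheit_to_celsius(value):
    return (value - 32.0) * 5.0 / 9.0


def check_temperature(value, unit):
    """Return 'ok', 'possible unit mix-up', or 'implausible' for a reading.

    unit is 'C' or 'F' (hypothetical unit codes for this sketch).
    """
    if unit not in ("C", "F"):
        raise ValueError(f"unknown unit: {unit!r}")
    low, high = PLAUSIBLE_CELSIUS
    celsius = value if unit == "C" else fahrenheit_to_celsius(value)
    if low <= celsius <= high:
        return "ok"
    # Reinterpret the same number under the other unit: if it then becomes
    # plausible, the unit was probably recorded incorrectly.
    alternative = fahrenheit_to_celsius(value) if unit == "C" else value
    if low <= alternative <= high:
        return "possible unit mix-up"
    return "implausible"
```

Here `check_temperature(100, "C")` is flagged as a possible unit mix-up, while `check_temperature(100, "F")` passes. A rule this simple only works because the valid range is well known; the harder clinical cases described above still need a domain expert.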

Kambiz shared that Myriad has several QC approaches, including a peer review process, a spot checking program to have curators spot check each other’s work and a quality check process that compares their classification to previous classifications from the community. 
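
As a rough illustration of the last of these checks, comparing new classifications against earlier ones, the sketch below sorts items into concordant, discordant, and novel groups. The item identifiers, class labels, and function name are hypothetical; this is not Myriad's actual process.

```python
def compare_classifications(new, previous):
    """Compare new classifications against previously recorded ones.

    Both arguments map an item identifier to a class label.
    Returns a dict with three lists:
      - concordant: items whose label matches the previous label
      - discordant: (item, new_label, previous_label) tuples that disagree
      - novel: items with no previous classification
    """
    report = {"concordant": [], "discordant": [], "novel": []}
    for item_id, label in sorted(new.items()):
        if item_id not in previous:
            report["novel"].append(item_id)
        elif previous[item_id] == label:
            report["concordant"].append(item_id)
        else:
            report["discordant"].append((item_id, label, previous[item_id]))
    return report
```

The discordant list is the useful output: those are the items a second curator would want to look at first.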

Sandra also noted the importance of researchers collaborating with curators prior to publication. She shared an anecdote where an author published a paper with an erroneous dataset, a simple mistake where a row in a spreadsheet had been accidentally deleted, causing nonsensical results. The curator picked this up and contacted the author, who was able to correct it, but this speaks to the importance of pre-submitting data to the database before publication and the important role a curator can play with the research community. 

Opportunities with Machine Learning and Automation 

While a lot of biocuration is done manually, more and more processes and workflows are being automated with text mining, machine learning (ML), natural language processing (NLP) and AI. The panel was asked how AI and ML will affect the work of biocurators. Sandra assured us that machine learning will enhance our work, and said she is not concerned that it will replace human curation. Data is too messy, the literature is too unstructured, and human review and curation is going to be needed for the foreseeable future. James echoed her sentiments in saying, “[Machine Learning] will become an assistant, it will not replace subject matter experts who are biologists, scientists, curators. It will play a role in helping us.” James sees it as an opportunity for biocuration, where we should work to exploit advances in deep learning, noting that the importance of biocuration is more pronounced now than ever. We can train AI to aid in biocuration and we can work together. In addition, quality Machine Learning/AI requires training sets that have been human-curated, and the advances of these technologies will require more curators; this is a new opportunity for this community. Carol agreed, but raised the point that there may be a perception that these technologies are advanced to the point where curators can be replaced. This is causing challenges with funding for biocuration, due to the notion that machine learning can do all or most of what human curators do. While machine learning can assist with making biocuration scalable, we need to do better as a community at communicating how these things interrelate and feed off each other.

“Biocuration has never been more valuable than it is now and yet underappreciated.” This is something the Society can help us tackle: addressing this perception and articulating how manual and machine learning biocuration can go hand in hand. – Carol Bult

Approaching authors

An audience member inquired whether database curators approached authors for clarification about their published data, and whether authors were responsive. Kambiz shared that they did approach authors when there was ambiguity with the content or data in an article. Sandra concurred, and alluded to the challenge with time dependencies: if a paper was recently published (1 year – 18 months ago), they frequently got a response. If a paper was over 3 years old, they were generally less likely to get a reply, as the first author may have moved on and the PI may be unfamiliar with the details of the data.

This may speak to an opportunity to better train researchers in becoming familiar with curation methods and standards, to allow for unambiguous reporting in their publications. Requirements to share data at the time of publication will also help address this need.

Getting the journals involved

This led to the next question about working with the journals to publish data in a more structured way. Carol has had some experience working with journals in the mouse community, who are careful about publishing mouse names with the accepted terminology and nomenclature. She did mention that sometimes there is pushback as to whether the recommended standard is the accepted standard, and whether this is going to evolve or change in the future. We all may be familiar with the situation illustrated in the xkcd comic “Standards”, cited below.

Source: https://xkcd.com/927/

This is an opportunity for a systematic community approach: the ISB should promote standards adoption to the journals.

Sandra pointed out that a challenge with approaching journals to use our standards is the sheer number of journals. A more targeted approach may be more appropriate. For example, the proteomics community was successful in getting a restricted number of journals in their field to require data sharing to ProteomeXchange (http://www.proteomexchange.org/) prior to publication.

Sandra also recommended that we first talk amongst ourselves as a community and define our needs, and what standards to adopt and promote, and then approach the journals.

The elephant in the room: Funding

In recent years, NIH funding to various databases has decreased. How do we sustain our own careers, and train the next generation of curators?

Kambiz felt it is easier to justify the need for curation due to the regulatory aspect of his industry. Even if there are NLP-based processes to extract gene-to-disease relationships, manual review will always be needed. He foresees that automated processes will assist with manual curation going forward.

Carol emphasized that we need to promote how important curation is to data science. Data science is recognized as an important field; therefore, we should frame curation within its role in data science. We have to be better about explaining the return on investment in curation: what can we do when data is curated that we wouldn’t be able to do if it wasn’t? She pointed out the reality that biocuration is considered infrastructure, which is largely ignored until it is broken. As a Society, can we demonstrate the impact that biocuration has on advancing data science?

Sandra reiterated that we need to make ourselves more visible; we need people outside the community to understand what we do. We need to work together efficiently as a community so that we do not duplicate efforts. We need to align on standards, use specialist databases for initial analysis and data cleaning, use baseline resources like accession numbers, and show examples of good curation.

Continue the conversation on Slack.

Do you have topics you’d like to discuss in a future panel, or suggested speakers? Please let us know (intsocbio@gmail.com).

EBI Training: A guide to molecular interactions

During this webinar, we will give you an introduction to molecular interactions and how to find these through the molecular interaction database IntAct. We will show you examples of how you can search for interaction data, how to create molecular interaction networks using our network viewer based on Cytoscape.js and how to download this data for further analysis.

We will also have a quick look at two other resources, PSICQUIC and IMEx, that integrate molecular interactions from several sources.

Who is this course for?

This webinar is aimed at students or early researchers beginning to use bioinformatics resources in their studies/research who wish to learn more about molecular interactions and IntAct. No prior knowledge of bioinformatics is required, but undergraduate level knowledge of biology would be useful.

Outcomes

By the end of the webinar you will be able to:

Explain what molecular interactions are
Describe what IntAct can be used for
Search for interaction data

26 May 2021

15:30 – 16:30 (BST)

Online and Free

2021 Biocuration Awards Nominations

The International Society for Biocuration is happy to announce the 2021 Biocuration Awards.

In 2021, the ISB will give two different awards to people who have made a significant impact in the field of biocuration. We welcome your nominations!

Description of the awards:

1) Award for Exceptional Contributions to Biocuration
ISB’s Exceptional Contributions Award recognizes a person who is a leader or a pioneer in the field of biocuration, and whose work has been fundamental to the advancement of biocuration.

2) Biocuration Career Award
The Biocuration Career Award recognizes biocurators in non-leadership positions who have made sustained contributions to the field of biocuration. Those who hold Principal Investigator or Group Leader positions are not eligible for the Biocuration Career Award.

Each award recipient will be invited to present a talk at the 2021 International Biocuration Conference, which will be held virtually this year (the dates and details are to be determined).

Nomination process:
Nominations will be reviewed by the 2021 ISB Awards Committee, comprised of one member of the ISB’s Executive Committee (ISB-EC) and six (6) additional members from the wider research community; these members were nominated by the ISB-EC based on diversity in area of expertise, organization type, role, and geographic location.

Who can nominate and/or be nominated?

  • Any currently active ISB member may nominate anyone in the field of biocuration, whether the potential nominee is a member of ISB or not.
  • Members of the ISB can make no more than 1 nomination per award.
  • Current members of the Executive Committee or the ISB Award Committee are not eligible for the awards.
  • Self-nominations will not be considered.

How to submit a nomination:

Nominations should be sent via email to the awards committee at intsocbio@gmail.com with the subject line “Biocuration Awards Nominations”.

The nomination email should contain all the following fields:

  • Nominator details (name, e-mail and affiliation, member of ISB);
  • Nominee details (name, e-mail and affiliation);
  • Type of award nomination (either Exceptional Contributions to Biocuration or Biocuration Career Award);
  • Short list of scholarly contributions (a maximum of 50 words);
  • Brief description of why you are recommending this person (a maximum of 350 words).

Deadline for submitting nominations:  Friday, February 26, 2021

Please welcome the new 2020-2021 ISB Executive Committee

We welcome Robin Haw as our newest member to the ISB EC. Nicole Vasilevsky and Rama Balakrishnan are returning for their second term.

Our new Chair/Secretary/Treasurer are as follows:

Thanks to Sylvain Poux, our outgoing EC member and Treasurer (EC member 2014-2020, Treasurer 2018-2020), for your years of service. Thanks to Sandra Orchard, our outgoing chair (Chair 2018-2020; Sandra will continue on the EC for another year).

Please click here for the composition of the subcommittees. Please note that the Equity, Diversity and Inclusion subcommittee is open to all members; if you would like to join, please reply to this email.

2020 has been quite a year with COVID, quarantines, the Black Lives Matter movement, the US election and more. We feel optimistic about the year to come and we want to serve our community as best we can.

Biocuration 2020 online workshops

As part of the Biocuration 2020 conference, we received excellent workshop proposals from several groups. Since the cancellation of the meeting, we have been working with interested workshop organizers to bring this part of the conference online. We are excited to announce that we now have 3 workshops scheduled for the fall:

  • September 24, 9am PT, 12pm ET, 5pm CET – Biocompute Objects: Methods for communicating provenance of data and analysis 
    • Organizers: Charles Hadley King, Raja Mazumder, Jonathon Keeney; George Washington University
    • Register here
    • Recording here
  • October 29, 12pm PT, 3pm ET, 8pm CET – Gene Wiki: how to synchronize and curate primary sources with and in Wikidata 
    • Organizers: Andra Waagmeester, Lynn Schriml and Sabeh Ul-Hasan, Gene Wiki
    • Register here
    • Recording here
  • Dec 04, 8am PT, 11am ET, 4pm UK, 5pm CET – Biolink Model – A community driven data model for life sciences
    • Organizers: Deepak Unni, Chris Mungall, Lawrence Berkeley National Laboratory
    • Register here
    • Recording here

All workshops will be free to all participants.

There is a Slack workspace set up to facilitate communication between organizers and participants. If you are interested in attending any of these workshops please email biocuration2020 @ gmail.com and we will send you an invite to the Slack workspace.

EXECUTIVE COMMITTEE ELECTION 2020

The election of the new International Society for Biocuration Executive Committee (ISB EC) will be held from September 27 – October 04, 2020.

The list of 7 candidates for 2020 can be viewed here.

The Executive Committee is composed of nine (9) members, each with a 3-year term. Being a member of the Executive Committee is a great way to become directly involved with the work of our society, and contribute to the decisions that are taken on behalf of the biocuration community. We would like to encourage all members interested in running for election to get involved in the process.

Serving on the ISB EC minimally involves attending monthly (1 hour) teleconference meetings, following up on any action items from meetings, and promoting the ISB’s activity to members and non-members. Examples of activities performed by EC members include reviewing micro-grant submissions, preparing calls for participation for hosting Biocuration meetings, preparing materials for the ISB election, monitoring ISB mail, and maintaining the website. There are specific positions such as Chair, Secretary and Treasurer that will require a larger time commitment, as they will be in charge of leading the steps of the EC and by extension the membership.

Three positions on the Executive Committee are up for election in 2020/2021. These positions are currently held by Nicole Vasilevsky, Rama Balakrishnan and Sylvain Poux. Nicole and Rama can re-stand for election. (The current ISB EC members are here.)

2020 Electoral Process

A) The Nominating Committee:

A Nominating Committee (NC) has been formed to oversee the electoral process, to review applications, and establish the final list of candidates. We are very grateful for their assistance with the execution of this election. The members of the 2020 Nominating Committee are TBD.

B) Instructions to Candidates: 

  1. If you would like to run for a position on the Executive Committee, you must first register your intent with the NC by emailing intsocbio@simplelists.com
  2. Please fill out this form by 28 August 2020, which includes a ‘statement of intent‘, a brief biographical sketch, and a ‘conflict of interests‘ statement describing any activities, memberships of other associations, editorial positions on journals, etc. (Please email us at intsocbio@simplelists.com if you are unable to access this form.)

C) Timeline:

  • Nominations will be received until 28 August 2020.
  • The NC will review all candidacies and share their selections with the ISB Executive Committee by 14 September 2020.
  • Candidates must be announced to the membership and on website (with letters of intent) by 21 September 2020.
  • Voting will take place online over the course of one week from 27 September – 04 October 2020. (Further details about the voting process will be shared soon). Sue Bello will act as election officer.
  • Only paying members* with registration fees cleared on or before 21 September 2020 will be allowed to vote. If you pay your registration via bank transfer, please allow at least 2-3 working days for the payment to be processed.

*Note – please contact us at intsocbio@simplelists.com if you have issues with registering or renewing your membership. Known issues exist with our membership payment system.

The Nominating Committee is looking forward to receiving your applications!
