Microgrant Report: Krysiak 2018

The second CIViC-hosted hackathon and curation workshop was held as an open-format, one-and-a-half-day pre-conference event to the 2018 ASHG meeting in San Diego. Over 50 attendees were present, representing over 20 organizations and institutions from multiple countries. Session topics were suggested by attendees and CIViC team members and covered coding (hackathon) and issues in cancer variant representation and curation. On the morning of the second day, groups presented the outcome of each session in short presentations. Topics in the cancer variant sessions included the expansion of cancer variant databases to cover structural variants and copy number variants. The current capabilities of CIViC and other cancer variant knowledgebases to represent such variants were assessed, and strategies for future instances of such knowledgebases to implement these representations were covered. In addition, a system for quantifying somatic cancer variant oncogenicity/pathogenicity was proposed and discussed extensively. This system was derived from the current standard for germline variant pathogenicity assessment based on ACMG codes. These discussions informed subsequent proposals for potential future guidelines. Other topics in the cancer variant sessions included machine learning in cancer variant annotation and the standardization of generalized categories for cancer variant classification. SEPIO modeling of cancer variants was also covered. Parallel curation sessions covered a broad set of topics, including methods to incentivize community curation of free and public knowledgebases such as CIViC, and hands-on curation of diagnostic evidence in pediatric cancer was performed. In multiple hackathon sessions, work was performed on integrating CIViC with CRAVAT, with the igv.js, JBrowse and IGV genome browsers, and with Wikidata.
A session was held to work on a system for data transfer between the ClinGen VCI and CIViC named Linked Data Hub. NDEx made available CIViC drug-variant, gene-disease, gene-variant and variant-disease association networks.

Microgrant Report: ICBO2018

9th annual International Conference on Biological Ontologies (ICBO2018)
Ontologies for Health, Food, Nutrition and Environment: A partnership with BIG-Data and Analytics

Conference website: http://icbo2018.cgrb.oregonstate.edu

Oregon State University hosted the 9th annual International Conference on Biological Ontologies (http://icbo2018.cgrb.oregonstate.edu). The theme of ICBO2018 was “Ontologies for Health, Food, Nutrition and Environment: A partnership with BIG-Data and Analytics”. ICBO2018 was a marquee event celebrating the 150th anniversary of the founding of Oregon State University (OSU150).

ICBO2018, attended by over 130 participants from 10 countries, provided a venue for presenting and discussing the research, development and usefulness of biomedical ontologies (including human health and diseases, vectors, drugs, bio-chemicals, biodiversity, plants, agriculture, food and environment) in building data standards, annotation workflows and data analytics. Attendees represented significant areas of biology, medicine, ecology, computer science, mathematics, text-mining, data analytics, and software and tool development. Dr. Pankaj Jaiswal from Oregon State University was the Conference Chair. The Conference Program co-Chairs, Dr. Chris Mungall from the Lawrence Berkeley National Laboratory (LBNL) and Dr. Melissa Haendel from Oregon State University, organized the conference program with help from the Program Committee members. The scientific presentations were in the form of 30 plenary talks and 32 posters and software demonstrations.

The three thought-provoking ICBO2018 keynote talks were given by Dr. Kwan-Liu Ma from the University of California Davis on “Visualization: A Powerful Tool for Data Exploration and Storytelling”, Josh Clark from Big Medium Inc. on “The Care and Feeding of Algorithms” for design, analytics and user engagement, and Dr. Parag Chitnis from the National Institute of Food and Agriculture (NIFA) on “Changing Face of Agriculture: Data-driven opportunities for nutrition and health”.

The four invited talks were by Niklaus Grunwald from USDA ARS on “Taxa, metacoder, poppr and vcfR: Four packages for parsing, visualization, and manipulation of genetic, genomic and metagenomic data in R”, David LeBauer from the TerraRef project on “Vocabularies, Ontologies, APIs, and Formats for Heterogeneous High Throughput Crop Phenotyping Data”, Carolyn Lawrence from Iowa State University on “GO-MAP Implements CAFA Tools: Improved Automated Gene Function Annotation for Plants” and Matthew Lange from UC Davis on “Designing and Building the IC-FOODS Foundry: Community, Technology, and Standards for a Semantic Web of Food”.

The thirteen pre- and post-conference workshops held at ICBO2018 included Phenotype Ontologies Traversing All The Organisms (POTATO): Aligning phenotype ontologies using design patterns; ONCONTO 2018: 2nd International Workshop on Oncology and Ontology; Ontology-driven text-mining analysis and normalization of free-text specimen descriptions; Data Standards and Knowledge Sharing in Biodiversity - Tools and Applications; Deep Learning in the Life Sciences; and the Biological pathway curation jamboree. Each workshop session included talks, demos, hands-on exercises and discussion forums relevant to its theme.

The Biological pathway curation jamboree was organized by Sushma Naithani of the NSF-funded Gramene database. In the jamboree, participants learned about the biocuration process, literature and data mining, pathway analysis, and biocuration tools, with particular emphasis on using the Reactome Curator Tool and on plant pathways. The curation of plant pathways is ongoing work of the Plant Reactome database. The workshop report is available from Gramene News.

The day-long pre-conference workshop on Phenotype Ontologies Traversing All The Organisms (POTATO) was a venue to discuss data standards on phenotype annotation tools for pattern-based development (Dead Simple Ontology Design Patterns (DOSDP) and the Ontology Development Kit (ODK)). The workshop report is available at Medium.

The two-day post-conference workshop “Deep Learning in the Life Sciences” was an introductory hands-on workshop on Machine Learning to train students and researchers working on various biological datasets. The workshop was co-organized with the Center for Genome Research and Biocomputing (CGRB). The instruction was provided by experts from IBM.

The plenary talks and posters were selected after peer review of over 60 scientific articles. The ICBO2018 conference abstracts are available online. The articles will be published later in online open-access conference proceedings.

We thank our sponsors: the International Society for Biocuration; the College of Agricultural Sciences, Department of Botany and Plant Pathology, Department of Environmental and Molecular Toxicology, College of Engineering (EECS) and the Sponsored Research Office at Oregon State University; and industry partners Illumina Inc., Sanmita Inc. and Sensiplicity LLC. The conference was partially supported by the grants to Pankaj Jaiswal for the Gramene database (IPGA: Gramene – Exploring Function through Comparative Genomics and Network Analysis; NSF-PGRP Award 1127112) and the Planteome project (cROP: Common Reference Ontologies and Applications for Plant Biology; NSF-PGRP Award 1340112), and by the NIH conference grant to Melissa Haendel and Peter Robinson (Forums for Integrative phenomics; NIH award 1U13CA221044).

ICBO2018 concluded with a vote of thanks and the announcement of the 10th ICBO (ICBO2019), to be held at the University at Buffalo, New York, USA.

Microgrant report: GCCBOSC 2018


By Karsten Hokamp on behalf of the GCCBOSC 2018 organizing committee

The first joint event of the Galaxy Community Conference and the Bioinformatics Open Source Conference (GCCBOSC 2018) was held from June 25-30 at Reed College in Portland, OR. The Galaxy Community supports data-intensive biomedical research through the open-source Galaxy platform. BOSC is organized by the Open Bioinformatics Foundation, a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. This was the 19th annual BOSC, but the first one to be held together with GCC.

The conference brought together over 300 bioinformatics researchers, biocurators, developers and users of open source software from academic and private institutions around the world in a relaxed and collegial atmosphere. A wide range of topics in bioinformatics open source projects, open science and open data were covered. These included workflows, developer tools and libraries, translational/medical bioinformatics, community building, and standards for representing and sharing data.

Posters, software demos, birds-of-a-feather meetings, talks, invited keynotes, training, and collaborative work events were presented and held over six days. A panel session discussed the importance and underfunding of documentation and training in open source bioinformatics. Presentations of specific interest to biocurators included reports on miRTop, InterMine and the Mammalian Ortholog and Annotation Database, amongst many others. Several presentations covered resources that support biocurators in their work, such as BioThings Hub, Apollo and JBrowse.

GCCBOSC 2018 sought to be a family-friendly conference, and the ISB Micro-grant helped make this happen. These funds allowed the conference to offer subsidized child care and enabled parents with young children to attend (including one of the keynote speakers). This support for families received a lot of attention, both at the conference and online.

The International Society for Biocuration was listed as a sponsor in the conference materials, including the printed program, presentation slides and web pages.

Luana Licata’s fellowship report

By Luana Licata

The short-term fellowship conferred by the International Society for Biocuration (ISB) gave me the opportunity to spend two weeks, from the 2nd to the 13th of July 2018, as a visitor at the EMBL-EBI, Hinxton, UK.

At the EMBL-EBI, I was hosted by the IntAct team and worked with the Protein Function Team (EMBL-EBI), the Gene Annotation Team of the Centre for Cardiovascular Genetics (UCL, London) and the Molecular Interaction Team (IntAct, EMBL-EBI).

During my stay, I worked on the following topics:

I worked with both the Protein Function and Gene Annotation Teams to learn Gene Ontology (GO) annotation and how to use Protein2GO. In particular, Ruth Lovering and Rachael Huntley (Gene Annotation, UCL, London) introduced me to GO annotation practices, extensions and rules, and to the curation tool Protein2GO. This allowed me to start annotating some proteins and protein relationships involved in the Acute Myeloid Leukemia (AML) pathway that are already annotated in the SIGNOR database, one of the databases that I curate at the Bioinformatics and Computational Biology Unit at the University of Rome Tor Vergata. Moreover, with the help of Penelope Garmiri (Protein Function, EMBL-EBI), I had the opportunity to learn the basics of Noctua annotation and how to use the Noctua platform. Noctua annotation allows simple GO annotations to be combined in order to generate a network of annotations. This knowledge allowed me to start annotating, at a basic level, some relationships relevant to the AML pathway from the SIGNOR database in the Noctua platform, in order to produce some GO-CAM models (the models produced with Noctua). The final goal of this collaboration was not only to improve my knowledge of current GO annotation practice but also to be able, in the near future, to represent and compare information relevant to the AML pathway that I have annotated in different ways, namely SIGNOR, GO and Noctua annotation.

I worked with the Molecular Interaction Team to further develop protein-nucleic acid interaction annotation in the MINT database. In particular, I learned how biocurators in the Molecular Interaction Team capture information about protein-nucleic acid interactions, and I annotated, in the IntAct editor (the IntAct curation tool), articles containing information on interactions between transcription factors and transcribed genes. Moreover, working in close contact with colleagues from the Molecular Interaction Team during my visit has strengthened the work of the MINT database (the other database that I coordinate and curate) within the IMEx Consortium through better curation coordination.

Microgrant report: Arighi Oct.2017

ISB-Microgrant report of the BioCreative VI workshop
By Cecilia Arighi

The BioCreative VI workshop took place on October 18-20, 2017 in Bethesda, Maryland, USA.

BioCreative is a community-wide effort for evaluating text mining systems applied to the biological and biomedical domain. The meeting attracted participants from the biomedical natural language processing, biocuration, literature/publishing, research and funding domains (over 60 workshop participants, one third being students), and 36 teams participated in the track activities (with representation from America, Asia, Europe and Australia).

The scientific program covered:

  1. talks related to the individual tracks (run prior to the workshop) on biocuration-relevant topics (assignment of bioentity IDs to facilitate downstream curation; text mining services for triage for human kinases; extraction of causal network information using the Biological Expression Language; mining protein-protein interactions affected by mutations; and annotation of chemical-protein interactions);
  2. a panel on innovation in biomedical digital curation, with views presented by users, publishers and literature service providers;
  3. a panel on funding stakeholders, where funding opportunities and needs for text mining and collaborations were presented by representatives from various funding agencies;
  4. a general session for text mining topics that showcased other interesting bioNLP work;
  5. two keynote speakers (Dr. Patricia Flatley Brennan, Director, National Library of Medicine, talked about the future of data-powered health, and Dr. Hongfang Liu, Mayo Clinic, discussed opportunities and challenges of text mining in precision medicine);
  6. a poster session.

Additional points discussed included the challenges of using real data over a gold standard; the strategic direction of BioCreative; and the relationship with other NLP challenge evaluations.

Corpora and datasets from the different tracks are publicly available (with prior registration). The workshop proceedings are publicly available on the BioCreative website.

Funds were used towards the rental of a room for the poster session, and the ISB was listed as a sponsor.

The OHSU Library Data Science Institute Promotes Biocuration and the ISB to Librarians and Researchers

By Nicole Vasilevsky

The Oregon Health & Science University Library in Portland, Oregon hosted the “OHSU Library Data Science Institute” (GitHub repo) from November 6-8, 2017 in downtown Portland.

The event was targeted towards researchers, librarians and information specialists with an interest in gaining beginner-level skills in data science. The goal was to provide face-to-face, interactive instruction over a three-day workshop. The learning objectives for the training were:

  • Increase awareness of key skills in data science and how these can be applied to the participants' own daily practices, such as research or serving patrons
  • Increase confidence with using data science techniques
  • Increase the ability of participants to use or apply data science techniques in problems outlined in the course

Over 75 participants attended the three-day event. Participants came from within and outside Portland, including Washington, Idaho, California, British Columbia and Kansas. The workshop covered several topics relevant to the biocuration community, such as biomedical data standards; data description, sharing and reuse; and data cleaning and preparation. All of the materials are openly available on our website. I gave a brief talk on the “Trials and Tribulations of a Biocurator”, in which I described the lessons I learned as a biocurator, how I wish I had known as a bench researcher what I know now, and how my biocuration skills can be applied in my current role as well. We hope that we instilled the value of biocuration and proper data management in researchers and librarians alike, and that they will apply the skills they learned to better manage and curate their data. We informed participants about efforts that are currently underway at the International Society for Biocuration, and distributed ISB stickers as well. Funds from the micro-grant were used to provide coffee each morning for attendees, which was greatly appreciated, and the ISB is listed as a sponsor on our website.