
Your domain: Intrinsically disordered proteins

Introduction

The intrinsically disordered proteins (IDP) domain brings together the databases and tools needed to organize IDP data and knowledge in a Findable, Accessible, Interoperable and Reusable (FAIR) manner. Experimental data created by users must be complemented by metadata before it can be deposited in an IDP resource. This document describes which community standards must be followed and where to find the information needed to complete the metadata of an IDP experiment or study.

Description

As a researcher in the field of Intrinsically Disordered Proteins (IDPs), you want to know how to process an experimental result in a FAIR way. As a final aim, you want to deposit the data in a community database or registry for wider adoption.

Considerations

You can split the experimental process into several steps:

  • How should you properly describe an IDP experiment? Are there any community standards that you should follow?
  • How do you add metadata in order to make IDP data more machine-readable?
  • How should you publish IDP data to a wider audience?

Solutions

  • The IDP community developed the Minimum Information About Disorder Experiments (MIADE) standard under the PSI-ID workgroup. The standard specifies the minimum information required to comprehend the result of a disorder experiment.

    The standard is available in XML and TAB format. You can check the example annotations in XML and TAB format and adapt them to your data.

  • The IDP community developed the Intrinsically Disordered Proteins Ontology (IDPO). The ontology is an agreed consensus of the terms used in the community, organised in a structured way.

    The ontology is available in OWL and OBO format.

  • You should deposit primary data in the relevant community databases (BMRB, PCDDB, SASBDB). You should deposit literature data in the manually curated database DisProt. DisProt is built on the MIADE standard and IDPO, so it requires curators to annotate all new data according to community standards. IDP data from primary databases, together with curated experimental annotations and software predictions, is integrated in the comprehensive MobiDB database. DisProt and MobiDB add Bioschemas markup to all data records, which increases data findability and interoperability.
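
Because the ontology is distributed in OBO format, its terms can be read with a few lines of code before you annotate your data. The following is a minimal sketch, assuming the standard OBO stanza layout (`[Term]` followed by `tag: value` lines). The sample text, its `EX:` identifiers and the function names `parse_obo_terms` and `find_term_ids` are illustrative placeholders, not content copied from the released IDPO file.

```python
from typing import Dict, List

# Illustrative OBO text. The IDs, names and hierarchy are placeholders
# (assumption), not taken from the real IDPO release.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: structural state

[Term]
id: EX:0000002
name: disorder
is_a: EX:0000001 ! structural state

[Typedef]
id: part_of
name: part of
"""


def parse_obo_terms(text: str) -> Dict[str, Dict[str, List[str]]]:
    """Collect the tag/value pairs of every [Term] stanza, keyed by term id."""
    stanzas: List[Dict[str, List[str]]] = []
    current = None  # tags of the stanza being read; None outside a [Term]
    for raw in text.splitlines():
        if raw.strip().startswith("!"):
            continue  # whole-line comment
        line = raw.split(" ! ")[0].strip()  # drop trailing comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; only [Term] stanzas are kept.
            current = {} if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
            continue
        if current is None or ":" not in line:
            continue  # header lines and non-Term stanzas are skipped
        tag, value = line.split(":", 1)
        current.setdefault(tag.strip(), []).append(value.strip())
    return {s["id"][0]: s for s in stanzas if "id" in s}


def find_term_ids(terms: Dict[str, Dict[str, List[str]]], name: str) -> List[str]:
    """Return the ids of all terms whose name matches, ignoring case."""
    wanted = name.strip().lower()
    return [
        term_id
        for term_id, tags in terms.items()
        if wanted in (n.lower() for n in tags.get("name", []))
    ]


terms = parse_obo_terms(SAMPLE_OBO)
matches = find_term_ids(terms, "Disorder")
```

A lookup like this only helps to check that a term you intend to use exists in the ontology file. It does not replace curation against the MIADE standard.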

Description

The IDP field is actively evolving. Newly published experimental evidence of protein disorder is integrated and translated into machine-readable form in an IDP database. This mapping process relies on accurate knowledge of protein identifiers, the protein regions under study and the functional annotation of disordered regions.

Considerations

The most common issues that you as a researcher can encounter during the mapping process are:

  • How do you properly and uniquely identify the protein (or fragment) under study?
  • How do you deal with missing terms in IDPO?

Solutions

  • To uniquely identify the protein under study, look it up in the UniProt reference protein database. Where needed, complement the protein identifier with an isoform identifier so that it matches the experimental protein sequence exactly.

    Use the SIFTS database to precisely map the experimental protein fragment (deposited in PDB) to a reference protein database (UniProt) at the amino acid level.

  • Experimental evidence from the literature must be mapped to relevant IDPO terms. If no suitable term can be found in IDPO, try the following resources:

    If no appropriate term exists in any ontology or vocabulary, you can submit a proposal for a new term for community review through DisProt feedback.
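
The two identification steps above (an accession with an optional isoform suffix, and a residue-level mapping from PDB to UniProt numbering) can be sketched as follows. The accession pattern follows the format described in the UniProt help pages, extended here with an optional `-N` isoform suffix. The `Segment` class, the function names and the segment boundaries are hypothetical and assume gap-free aligned segments. Real SIFTS records carry more detail, so treat this as an illustration of the idea rather than a client for the SIFTS service.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# UniProt accession pattern plus an optional "-N" isoform suffix.
ACCESSION_RE = re.compile(
    r"^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
    r"(?:-([0-9]+))?$"
)


def split_isoform(identifier: str) -> Tuple[str, Optional[int]]:
    """Split 'P04637-2' into ('P04637', 2); the isoform is None if absent."""
    match = ACCESSION_RE.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a UniProt accession: {identifier!r}")
    accession, isoform = match.groups()
    return accession, (int(isoform) if isoform else None)


@dataclass(frozen=True)
class Segment:
    """One gap-free stretch aligned between PDB and UniProt numbering.

    Hypothetical structure for illustration only.
    """
    pdb_start: int
    pdb_end: int
    unp_start: int


def pdb_to_uniprot(segments: List[Segment], pdb_residue: int) -> Optional[int]:
    """Translate a PDB residue number to UniProt numbering.

    Returns None when the residue lies outside every mapped segment,
    for example in an expression tag or an unobserved loop.
    """
    for seg in segments:
        if seg.pdb_start <= pdb_residue <= seg.pdb_end:
            return seg.unp_start + (pdb_residue - seg.pdb_start)
    return None


# Made-up segment boundaries for a single chain (assumption).
segments = [Segment(1, 50, 101), Segment(60, 80, 160)]
accession, isoform = split_isoform("P04637-2")
mapped = pdb_to_uniprot(segments, 65)
```

Returning `None` for unmapped residues, instead of guessing an offset, makes regions without a reference position explicit, which matters when the disordered region boundaries are later recorded in a database.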


More information

Tool or resource | Description | Related pages | Registry
Bioschemas | Bioschemas aims to improve the Findability on the Web of life sciences resources such as datasets, software, and training materials | Machine actionability | Standards/Databases, Training
BMRB | Biological Magnetic Resonance Data Bank | Structural Bioinformatics | Tool info
DisProt | A database of intrinsically disordered proteins | - | Tool info, Standards/Databases, Training
Intrinsically disordered proteins ontology (IDPO) | Intrinsically disordered proteins ontology | - | Tool info
MIADE | Minimum Information About Disorder Experiments (MIADE) standard | - | Training
MobiDB | A database of protein disorder and mobility annotations | - | Tool info, Standards/Databases, Training
PCDDB | The Protein Circular Dichroism Data Bank | - | Tool info
PDB | The Protein Data Bank (PDB) | Galaxy, Structural Bioinformatics, Data publication | Tool info, Training
SASBDB | Small Angle Scattering Biological Data Bank | - | -
SIFTS | Structure integration with function, taxonomy and sequence | - | -
UniProt | Comprehensive resource for protein sequence and annotation data | Galaxy, Proteomics, Single-cell sequencing, Structural Bioinformatics, Machine actionability | Tool info, Standards/Databases, Training