HORIZON-MSCA-2023-SE-01

Scientific focus


WP1

FAIRification of IDR data

Objectives

The overall objective of WP1 is to provide a wide range of scalable resources (databases and web tools) capable of capturing high-quality data on IDRs and their functional features. WP1 aims to develop new standards for IDR metadata and new protocols for their exchange. A particular focus will be the representation of understudied IDR functions such as transient binding.
The main resource for disorder-related information is the DisProt database, which contains manually curated data about disordered regions and their function. DisProt also serves as the gold standard for the development and evaluation of prediction methods. However, the number of experimentally characterised IDRs is still limited, so prediction tools remain vital for obtaining information about protein disorder and its function at the genomic scale.
Starting from the available outcomes of the IDPfun Consortium, such as database schemas and annotation guidelines as described in the Minimum Information About Disorder Experiments (MIADE) standard, we will advance the state of the art in IDR data representation by developing new standards that comply with the FAIR (Findable, Accessible, Interoperable, Reusable) principles and that can capture IDR complexity.
In addition to new standards, a new curation web infrastructure will be implemented and a better infrastructure for IDR data exchange will be developed, thanks to the expertise and know-how provided by the EMBL-EBI IDPfun partner.
Collaboration of the IDPfun Consortium with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia, as well as with the International Society of Biocuration, will help improve functional annotation for IDRs and extend existing IDR ontologies. Working closely with the Molecular Interactions and Disordered Proteins working groups of the HUPO Proteomics Standards Initiative (HUPO-PSI) will enable the enhancement of MIADE, including the annotation of IDR molecular interactions, so that it complies with the standards developed by the HUPO-PSI.
Close collaboration with the ELIXIR IDP Community will provide the ground for validating the technology developed in this WP. This will improve the quality and consistency of the content of IDR databases and of the EMBL-EBI core resources connected to them. The WP will also integrate information about protein interaction networks and pathways and will provide additional information about the function of IDRs.

Task list

Task 1.1 – Standardisation of IDR data

Details

The aim of Task 1.1 is to improve the quality of annotation for IDRs by developing and extending the MIADE standard. This will enable the capture of information and generation of metadata for transient interactions and structural ensembles. Updated and extended annotation guidelines will also be provided to support the research community in contributing to IDR-related databases.

Task 1.2 – FAIRification of the curation process

Details

Task 1.2 aims to increase the amount of annotation for IDRs by enhancing existing curation tools and developing a new curation tool based on the Data Stewardship Wizard (DSW), implementing common, FAIR curation standards. New DSW models will be created to capture the complexity of IDR-related experimental settings.
The use of DSW, a portable module, will benefit databases like DisProt and PED by enabling them to adopt the same system and generate consistent metadata. Additionally, the APICURON service will be consistently implemented to acknowledge and credit curation efforts. APICURON allows curators to quantify and visualize their contributions in real time, while database managers can implement gamification strategies to direct the curation task towards specific objectives.

Task 1.3 – Network context for IDP resources

Details

Extending the integration of IDP resources with major databases (MobiDB, InterPro, UniProtKB) achieved in the previous IDPfun project, Task 1.3 will establish links between IDP resources and network-oriented database resources to support network-oriented functional analysis of IDP data and to extend and capture annotation of IDP interactions. We will implement set-oriented queries from DisProt to IntAct, Reactome, and the Complex Portal, supporting efficient analysis of molecular interactions, pathways, and molecular complex context for user-defined DisProt subsets, and we will also enable reverse connectivity from these resources to DisProt. The technical connectivity will be flanked by comprehensive targeted curation of IDP interactions, key pathways and molecular complexes in these resources, implementing the curation processes defined in Task 1.2.
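
As an illustration only, the idea of a set-oriented query and its reverse lookup could be sketched as follows. This is a minimal sketch on in-memory toy data, not the planned implementation: the accessions, entry identifiers and function names are hypothetical placeholders, and no real DisProt, IntAct, Reactome or Complex Portal interface is called.

```python
from collections import defaultdict

# Hypothetical toy data: placeholder identifiers, not real database records.
INTERACTIONS = [("ACC1", "ACC2"), ("ACC1", "ACC3"), ("ACC4", "ACC5")]
DISORDER_ENTRIES = {"ACC1": "DP_A", "ACC4": "DP_B", "ACC5": "DP_C"}


def build_partner_index(pairs):
    """Index binary interactions by participant, in both directions."""
    index = defaultdict(set)
    for a, b in pairs:
        index[a].add(b)
        index[b].add(a)
    return index


def partners_for_subset(subset, index):
    """Forward query: interaction partners for every accession in a user-defined subset."""
    return {acc: sorted(index.get(acc, ())) for acc in subset}


def disorder_entries_for(accessions, entries):
    """Reverse query: which of the given accessions have a disorder entry."""
    return {acc: entries[acc] for acc in accessions if acc in entries}


index = build_partner_index(INTERACTIONS)
subset_partners = partners_for_subset(["ACC1", "ACC4"], index)
reverse_hits = disorder_entries_for(["ACC2", "ACC5"], DISORDER_ENTRIES)
```

Here the whole subset is resolved in a single call rather than one protein at a time, which is the property the set-oriented queries described above are meant to provide.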

Deliverables
Guidelines on how to curate IDRs and their function
Standards and formats for IDR data exchange

WP Information

WP type:

Research

Duration:

M1 – M48

Person months:

110

WP leader:

ELTE

Zsuzsanna Dosztányi