- Task 1.1 – PED pipeline: Technical task implying engineering activity
- Task 1.2 – SLiMs from structure: Detecting short linear motifs (SLiMs) from structure is difficult. Because the motifs are very short (ca. 5 residues), simple pattern searches generate many false positives (FPs). The task will provide new strategies to limit FPs by filtering out matches located in buried residues outside linear structural stretches.
- Task 1.3 – IDP text-mining: Text-mining is widely used for retrieving functional data from the literature, but a feasibility study focused on intrinsic disorder is missing. The task will test different tools and methods to identify the solution best suited to large-scale automatic searches. The extracted information will then be integrated into existing web server interfaces (WP4) to increase knowledge on experimental IDPs.
- Task 1.4 – IDR from structure: IDR detection from structural data depends on the type of source. Elongated structures in multi-chain protein X-ray structures will be detected by comparing intra- vs. inter-chain residue contacts, with the parameters trained on the ANCHOR dataset. Moreover, flexible regions will be calculated from B-factor information.
- Task 1.5 – IDR homology transfer: Homology transfer is well established for structured domains but very problematic for disordered regions, as IDP evolution seems to follow different rules. The task will provide an extensive analysis of the parameters and constraints necessary to safely apply transfer by homology to IDPs based on sequence similarity.
- Task 1.6 – Mobi 2.0: Technical task implying engineering activity
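
As an illustration of the filtering idea in Task 1.2, the following is a minimal sketch and not the method to be developed in the task. It assumes that a per-residue relative solvent accessibility (`rsa`) and a per-residue flag marking linear (extended, non-globular) stretches (`is_linear`) have already been computed from the structure by some external tool. The function name, the cutoff values and the rule that a match is kept only when most of its residues are both exposed and linear are all hypothetical choices made for the example.

```python
import re


def filter_motif_matches(sequence, pattern, rsa, is_linear,
                         rsa_cutoff=0.25, min_fraction=0.8):
    """Return regex motif matches that are mostly exposed and mostly linear.

    sequence   : one-letter amino acid sequence
    pattern    : motif as a regular expression (e.g. "P..P")
    rsa        : per-residue relative solvent accessibility, 0..1 (assumed input)
    is_linear  : per-residue booleans, True inside linear stretches (assumed input)
    rsa_cutoff, min_fraction : illustrative thresholds, not tuned values
    """
    if not (len(sequence) == len(rsa) == len(is_linear)):
        raise ValueError("sequence, rsa and is_linear must have equal length")

    kept = []
    # A lookahead wrapper lets overlapping motif occurrences be reported too.
    for match in re.finditer(f"(?=({pattern}))", sequence):
        start, end = match.start(1), match.end(1)
        length = end - start
        if length == 0:
            continue  # ignore empty matches from degenerate patterns
        exposed = sum(1 for value in rsa[start:end] if value >= rsa_cutoff)
        linear = sum(1 for flag in is_linear[start:end] if flag)
        # Discard matches that are buried or lie outside linear stretches.
        if exposed / length >= min_fraction and linear / length >= min_fraction:
            kept.append((start, end, match.group(1)))
    return kept


if __name__ == "__main__":
    # Toy data: the first PxxP match is buried, the second is exposed and linear.
    seq = "AAPLLPAAAPGGPAA"
    toy_rsa = [0.05] * 8 + [0.6] * 7
    toy_linear = [False] * 8 + [True] * 7
    print(filter_motif_matches(seq, "P..P", toy_rsa, toy_linear))
    # prints [(9, 13, 'PGGP')]
```

In this toy case a plain pattern search reports two matches, and the structural filter removes the buried one. How accessibility and linearity are actually derived, and which thresholds limit FPs best, is what the task itself would have to establish.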
Deliverables
Software package for the automatic extraction of PED entry data from protein ensembles
Software for automatic detection of IDRs and linear motifs from protein structures
Software tool for the automatic extraction of IDR-relevant information from the literature
Software for the identification of homologous IDR clusters from sequence databases
New version of the Mobi software