Translating Deep Learning protein bioinformatics methods to accelerate structural determination by X-ray crystallography

Description

Even after the AlphaFold 2 (AF2) revolution, experimental structural biology continues to contribute enormously to advances across all areas of biology. X-ray crystallography is the major workhorse of structure determination and has been served for decades by the CCP4 software suite. This project sits at the intersection of structural bioinformatics and structural biology and will produce new methods and software, with the aim of distributing them via CCP4.

In protein crystallography, the project will improve methods for predicting which proteins will crystallise, an absolute prerequisite for the field. Flexible surface loops and termini are known to be linked to a reluctance to crystallise. Recent Deep Learning methods, including structure modelling methods such as AlphaFold 2/3, protein language models, and sequence design methods such as ProteinMPNN, offer new routes to predicting problematic flexibility. Such flexible regions, especially at the protein termini, can often be removed by genetic engineering without affecting protein function. Bioinformatics also offers the options of looking across different species to find a protein that is predicted to be more tractable, or of using protein design to provide a more stable context for key functional determinants. Separately, thermostable proteins are known to crystallise more easily in many cases, and Deep Learning methods and resources can help to identify such sequences.
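As a rough illustration of how predicted flexibility at the termini could guide construct design, the sketch below trims residues from each end of a sequence until it reaches a run of confidently modelled residues. It assumes that per-residue AlphaFold pLDDT scores are already available as a list and that low pLDDT is an acceptable proxy for flexibility. The function name, the threshold of 70 and the window of 5 residues are illustrative choices, not part of the project or of any CCP4 tool.

```python
from typing import List, Tuple


def trim_flexible_termini(
    sequence: str,
    plddt: List[float],
    threshold: float = 70.0,  # assumed cut-off for "confident" residues
    window: int = 5,          # assumed length of the confident run required
) -> Tuple[int, int, str]:
    """Suggest construct boundaries by removing low-confidence termini.

    Returns (start, end, trimmed_sequence) using 0-based, end-exclusive
    indices. Internal low-confidence loops are deliberately left alone.
    """
    if len(sequence) != len(plddt):
        raise ValueError("sequence and plddt must have the same length")
    n = len(sequence)
    if n < window:
        return 0, n, sequence  # too short to assess; leave unchanged

    # Start positions of every window whose residues are all above threshold.
    confident_starts = [
        i for i in range(n - window + 1)
        if min(plddt[i:i + window]) >= threshold
    ]
    if not confident_starts:
        return 0, n, sequence  # no confident core found; leave unchanged

    start = confident_starts[0]
    end = confident_starts[-1] + window
    return start, end, sequence[start:end]


# Hypothetical 20-residue example with poorly modelled ends.
example_sequence = "MSGSAKVLIVGDPETWRKKE"
example_plddt = [35, 40, 45, 48] + [90] * 12 + [55, 42, 38, 30]
start, end, construct = trim_flexible_termini(example_sequence, example_plddt)
print(start, end, construct)
```

Requiring a whole window of confident residues, rather than a single one, avoids anchoring the construct boundary on an isolated high-scoring residue inside an otherwise disordered tail.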

Once suitable diffraction data have been obtained from a crystal, structure solution is attempted, typically using AlphaFold models for Molecular Replacement. Its success depends on an accurate estimate of the number of protein molecules in the asymmetric unit of the crystal. There is an opportunity to use Machine Learning to develop new methods for this estimate, thereby saving time and reducing environmental impact.
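For context, the conventional starting point for estimating crystal content is the Matthews coefficient, V_M = V_cell / (Z × n × M), where V_cell is the unit cell volume in Å³, Z is the number of asymmetric units per cell, n is the number of copies per asymmetric unit and M is the molecular weight in Da. The sketch below tabulates V_M and the approximate solvent fraction (1 − 1.23 / V_M) for candidate copy numbers. It shows this classical baseline only, not the Machine Learning methods the project would develop, and the function names and example cell are hypothetical.

```python
import math
from typing import List, Tuple


def unit_cell_volume(a: float, b: float, c: float,
                     alpha: float, beta: float, gamma: float) -> float:
    """Volume in cubic angstroms of a general (triclinic) unit cell.

    Cell lengths are in angstroms and angles in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_table(volume: float, n_symops: int, mw_da: float,
                   max_copies: int = 8) -> List[Tuple[int, float, float]]:
    """List (copies, V_M, solvent_fraction) for each physically possible n.

    n_symops is the number of asymmetric units per unit cell.
    Stops as soon as the implied solvent fraction is no longer positive.
    """
    rows = []
    for n in range(1, max_copies + 1):
        vm = volume / (n_symops * n * mw_da)
        solvent = 1.0 - 1.23 / vm
        if solvent <= 0:
            break  # the cell cannot hold this many copies
        rows.append((n, vm, solvent))
    return rows


# Hypothetical example: orthorhombic cell with 4 asymmetric units, 30 kDa protein.
volume = unit_cell_volume(80.0, 90.0, 100.0, 90.0, 90.0, 90.0)
rows = matthews_table(volume, n_symops=4, mw_da=30000.0)
for copies, vm, solvent in rows:
    print(f"n={copies}  V_M={vm:.2f} A^3/Da  solvent={solvent:.0%}")
```

Matthews' original survey found most protein crystals in roughly the 1.7 to 3.5 Å³/Da range, so several copy numbers often remain plausible, which is the ambiguity that improved estimators would aim to resolve.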

The successful candidate will develop new bioinformatics methods to guide crystallographers towards proteins and protein constructs that stand the best chance of crystallising. The student will also develop new methods to estimate crystal structure content. Finally, there may also be opportunities to spend time in a crystallographic lab with exposure to practical methods relevant to protein crystallography. The student will develop a range of valuable skills and benefit from abundant networking opportunities in the context of the premier UK consortium for crystallography software.

Availability

Open to students worldwide

Funding information

Self-funded project

The project is open to both European/UK and International students. It is UNFUNDED, and applicants are encouraged to contact the Principal Supervisor directly to discuss their application and the project.

Assistance will be given to those who are applying to international funding schemes.

Details of costs can be found on the University website: 

https://www.liverpool.ac.uk/study/postgraduate-research/fees-and-funding/fees-and-costs/

Supervisors

References

  1. Tertiary structure assessment at CASP15. (2023) Proteins. doi: 10.1002/prot.26593.
  2. Predicted models and CCP4. (2023) Acta Cryst D 79, 806-819. doi: 10.1107/S2059798323006289.
  3. MrParse: finding homologues in the PDB and the EBI AlphaFold database for molecular replacement and more. (2022) Acta Cryst D 78, 553-559. doi: 10.1107/S2059798322003576.
  4. The CCP4 suite: integrative software for macromolecular crystallography. (2023) Acta Cryst D 79, 449-461.
  5. Machine-Learned Fragment-Based Energies for Crystal Structure Prediction. (2019) J. Chem. Theory Comput. 15(4), 2743-2758. doi: 10.1021/acs.jctc.9b00038.