Wednesday, 9 March 2016

Evolution and Assembly of Protein Complexes

Since 2003, all Birkbeck researchers in structural biology and allied disciplines have collaborated with colleagues at UCL in the Institute of Structural and Molecular Biology. The Institute holds a varied series of events throughout the year, including a programme of research seminars arranged termly around current themes in molecular and structural biology research. The theme for the spring term 2016 seminar programme has been ‘Protein Dynamics: from Folding to Function’; one of the first of the distinguished scientists invited to present their research under that theme was Sarah Teichmann from the EMBL – European Bioinformatics Institute and the Sanger Institute at Hinxton, near Cambridge, UK. She gave a fascinating talk that linked evolution and protein folding to the topic of Section 7 of the PPS course, quaternary structure (or the assembly of protein complexes).

Teichmann has won many awards in what is still quite a short research career, including the Biochemical Society’s Colworth Medal for ‘an outstanding research biochemist under the age of 35’ (2011) and the EMBO Gold Medal (2015). Last year she was elected a Fellow of the prestigious Academy of Medical Sciences, with a citation that commended her as representing ‘a new breed of scientists at the interface between computational and experimental molecular biology’. She is also an advocate for women in science and has written a children’s novel.

She began her seminar by asking two related questions, ‘how do protein complexes assemble?’ and ‘how do protein complexes evolve?’, and by misquoting the poet John Donne: ‘no protein [man] is an island’. Many proteins are functional only when bound to others to form complexes, and in the crowded environment of a cell each newly synthesised protein has only a limited amount of time to find its partners and form a stable complex. Much can be learned about the evolution and dynamics of complex formation by studying the complexes that are available in the Protein Data Bank. Her group’s evaluation of these structures has contributed to the software that the PDB uses to predict the functional biological unit (monomer, dimer or multimer) for each structure it holds, and has led to the 3Dcomplex database of protein complexes. This database provides a hierarchical classification of what are now more than 30,000 protein quaternary structures. Each complex is represented using graph theory as a simple 2D figure or ‘mini-graph’, with each polypeptide chain as a node and each interaction surface between two chains as an edge. These little graphs make it easier to distinguish between topologies involving the same number of subunits: for example, a complex of six identical protein chains may be a simple hexamer with 6-fold rotational symmetry (such as the traffic ATPase [PDB 1g6o]) or a dimer of trimers with 32 symmetry (such as annexin XII [PDB 1aei]). The links here are to the pages describing those proteins in Section 7 of the PPS course material.
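The ‘mini-graph’ idea can be illustrated with a few lines of Python. This is only a sketch: the chain labels and the contacts listed are idealised assumptions chosen to match the two symmetry types, not the actual interfaces recorded for 1g6o or 1aei, and the function name is invented for the example.

```python
from collections import Counter

def degree_sequence(edges):
    """Return the sorted number of interfaces made by each chain (node)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.values())

chains = "ABCDEF"

# Idealised ring hexamer: each chain touches only its two neighbours.
ring_hexamer = [(chains[i], chains[(i + 1) % 6]) for i in range(6)]

# Idealised dimer of trimers: two closed trimers, with each chain also
# contacting one partner in the opposite trimer (an assumption).
trimer_1 = [("A", "B"), ("B", "C"), ("C", "A")]
trimer_2 = [("D", "E"), ("E", "F"), ("F", "D")]
between = [("A", "D"), ("B", "E"), ("C", "F")]
dimer_of_trimers = trimer_1 + trimer_2 + between

print(len(ring_hexamer), degree_sequence(ring_hexamer))          # 6 [2, 2, 2, 2, 2, 2]
print(len(dimer_of_trimers), degree_sequence(dimer_of_trimers))  # 9 [3, 3, 3, 3, 3, 3]
```

Both complexes have six nodes, but the number of edges and the number of interfaces per chain differ, which is the kind of distinction the graphs make visible at a glance.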

Alongside the hierarchy described in the 3Dcomplex database, protein complexes can be divided into two large groups: homomers, which consist of multiple copies of the same polypeptide chain, and heteromers, which involve different chains. (Haemoglobin, a tetramer with two alpha and two closely related beta chains, is arguably an intermediate between the two main types.) Teichmann spent the rest of her lecture addressing three related questions about the assembly of both homomers and heteromers:

i) Does the assembly of protein complexes drive evolution?
ii) What are the mutational mechanisms involved in complex formation?
iii) Can the principles of protein assembly be used to predict topologies that have not yet been seen?

Starting with the first question, from an evolutionary point of view the simplest complex to form is a homodimer with two copies of the same monomer; one mutation that turns part of a protein surface into a ‘sticky patch’ is all that is necessary to stabilise dimer formation. Not surprisingly, the homodimer is also the commonest type of quaternary structure found in the PDB. Once a protein has dimerised, additional monomers can be added to form larger complexes with cyclic symmetry, or the dimer itself can (for example) dimerise. The order in which the interfaces in a multimer arose during evolution can be predicted from the amount of surface area buried by the formation of each interface, with the largest interfaces forming first. This simple rule applies to complicated assemblies as much as to simple ones, and to heteromers as much as to homomers. It also means that, for all but the very simplest structures, it is almost impossible to predict the form that a complex will take unless you know the order in which the subunits assemble. Joseph Marsh, a former postdoc in Teichmann’s group now working at the MRC Human Genetics Unit in Edinburgh, represents this here in an analogy with the assembly of flat-pack furniture, with and without instructions.

Illustration © Joseph Marsh, MRC Human Genetics Edinburgh

Teichmann tested some of her predictions of protein assembly pathways using mass spectrometry in collaboration with Professor Dame Carol Robinson’s group at the University of Oxford, and found that seven out of nine pathways and 22 out of 27 steps within those pathways had been predicted correctly. This hierarchy of subunit assembly can also be used to predict the evolution of a complex, so it is clear that the assembly of protein complexes can indeed drive evolution.
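The ‘largest interface first’ rule lends itself to a short sketch in Python. The chain labels and buried surface areas below are invented for illustration, and the function is a simplification of the idea as described in the talk rather than the method used by Teichmann’s group: it simply walks through the interfaces from largest to smallest and records each step that joins two separate subcomplexes.

```python
def predict_assembly_order(interfaces):
    """Predict assembly steps from (chain_a, chain_b, buried_area) tuples.

    Interfaces are visited from largest to smallest buried area; an
    interface counts as an assembly step only if it joins two
    subcomplexes that are still separate.
    """
    # Every chain starts out as its own one-member subcomplex.
    group = {}
    for a, b, _ in interfaces:
        group.setdefault(a, frozenset([a]))
        group.setdefault(b, frozenset([b]))

    steps = []
    for a, b, _ in sorted(interfaces, key=lambda item: item[2], reverse=True):
        if group[a] == group[b]:
            continue  # both chains are already in the same subcomplex
        steps.append((tuple(sorted(group[a])), tuple(sorted(group[b]))))
        merged = group[a] | group[b]
        for chain in merged:
            group[chain] = merged
    return steps

# Hypothetical tetramer; the areas (in square angstroms) are made up.
example = [
    ("A", "B", 1800.0),
    ("C", "D", 1700.0),
    ("A", "C", 900.0),
    ("B", "D", 850.0),
    ("A", "D", 300.0),
]

for first, second in predict_assembly_order(example):
    print("+".join(first), "joins", "+".join(second))
```

With these numbers the two large interfaces form the dimers A+B and C+D, and the next largest interface then brings the two dimers together; the remaining interfaces close up within an already assembled complex and add no further steps.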

Turning to the second question, Teichmann used specific examples of protein families that take up different quaternary structures in different species, including the PyrR family of bacterial pyrimidine operon attenuators, to explore the evolutionary mechanisms that convert a protein that is most stable as a monomer into different multimeric forms. These can involve direct mutations at the interface between subunits (for example, making the protein surface ‘stickier’ or creating a salt bridge) and other so-called ‘allosteric’ mutations that change the protein structure to allow different interfaces to form. Often, the difference between (for example) a protein that is stable as a dimer and one that is stable as a tetramer will come down to changes in a few amino acids. In the case of the PyrR attenuator family, mutations away from the interface drive a conformational change that is equivalent to the one that occurs when the protein binds DNA, and so stabilise multimer formation.

Finally, Teichmann considered how the assembly principles that she had outlined could be used to predict from scratch the form that a protein complex will take. Most basic steps in complex assembly, as described earlier, can be grouped into one of three categories: dimerisation of one or more chains, adding an identical subunit or subunits to a complex (cyclisation) and adding a different type of subunit. These can be combined in different ways to form a large number of possible quaternary structure topologies. So far, about 120 different topologies are represented in the PDB, with four or five new ones being added each year, and the vast majority of these fit into one of Teichmann’s predicted topologies. She assembled all the predicted topologies, including those not yet observed, into a ‘periodic table of protein complexes’ (S.E. Ahnert et al., Science 350, aaa2245 (2015)). This table has already correctly predicted the topology of some newly determined complexes that were not included in the original list.
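As a toy illustration of how three step types can combine into many topologies, the Python sketch below tracks just two numbers for each complex: how many different subunit types it contains and how many times each is repeated. This bookkeeping, the step names and the restriction of cyclisation to three-fold rings are all simplifying assumptions made for the example; it is not the enumeration used to build the published periodic table.

```python
from itertools import product

# Each step maps a (subunit_types, repeats) state to a new state.
STEPS = {
    "dimerise": lambda types, repeats: (types, repeats * 2),
    "cyclise_x3": lambda types, repeats: (types, repeats * 3),
    "add_subunit": lambda types, repeats: (types + 1, repeats),
}

def enumerate_paths(max_steps):
    """Map every reachable (types, repeats) state to the step sequences
    that produce it, starting from a single monomer."""
    results = {}
    for n_steps in range(1, max_steps + 1):
        for path in product(STEPS, repeat=n_steps):
            state = (1, 1)  # one subunit type, one copy
            for step in path:
                state = STEPS[step](*state)
            results.setdefault(state, []).append(path)
    return results

table = enumerate_paths(2)
for (types, repeats), paths in sorted(table.items()):
    print(types, "type(s) x", repeats, "repeat(s):", paths)
```

Even with at most two steps, nine different combinations of subunit types and repeats are reached, and the six-copy homomer appears by two routes (a dimer of trimers and a trimer of dimers), which is why the order of assembly matters as well as the final subunit count.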