Principles of Protein Structure: 2016

Monday, 10 October 2016

Welcome to PPS students 2016-17!

This post is extremely like those I have written at this time of year for the past few years. This is because what I have to say now is very, very similar...

I would like to offer a warm welcome to the Principles of Protein Structure blog to all students who have just started studying Birkbeck's Principles of Protein Structure (PPS) course, and a welcome back to any who have taken a break in studies and intend to complete the course this year.

I run this blog to link the material that you will be studying in the course to new research developments in the areas of protein structure and function and related aspects of biotechnology and medicine. I might, for example, report on talks given in the ISMB seminar series run jointly by the Department of Biological Sciences at Birkbeck and research departments in neighbouring University College London. The programme for Autumn 2016 focuses on the molecular and structural biology of infectious disease; there is material on similar topics in section 10 of the PPS course, 'Protein Interactions and Function'. Other posts may be reports from conferences or summaries of recently published papers in protein structure, protein bioinformatics and allied areas. Look out for an account of a conference on structural assemblies at Birkbeck in December that will honour the 50-year career of one of our emeritus professors, Steve Wood.

Some earlier posts on this blog were written by "guest blogger" Jill Faircloth, who took the MSc in Structural Molecular Biology a few years ago and is now working as a freelance science communicator. She introduces herself in this post written in March 2012, in which she also describes how she found the later part of the PPS course and her thoughts on the two choices available for the second year of the MSc.

Do, if you get a chance, look through some blog posts from earlier years to see the kind of topics that we will be discussing. However, don't be discouraged if at this stage of the course you find the science presented there difficult to understand. I can assure you that it will get easier!

I particularly recommend that you look at a couple of posts from December 2013 and July 2014 about the history of structural science, particularly X-ray crystallography. Crystallography was the first method to be developed for solving the structure of biological macromolecules, and it is still the most important. The year 2014 was designated by the United Nations as the International Year of Crystallography, marking the year between the centenaries of the publication of the first papers on X-ray diffraction and the award of the 1915 Nobel Prize for Physics to the father-and-son team of William and Lawrence Bragg who made the principal discoveries.

So - the best of luck for the 2016-17 PPS course and for your studies at Birkbeck! We hope that many of you will go on to complete our MSc in Structural Molecular Biology.

Best wishes,

Dr Clare Sansom
Senior Associate Lecturer, Biological Sciences, Birkbeck and Tutor, Principles of Protein Structure

Monday, 19 September 2016

Shining light on the 3D structures of membrane proteins

A symposium was held at Birkbeck on August 10, 2016 to honour one of the college’s most distinguished structural biologists, Professor Bonnie Wallace. This was organised by a postdoctoral member of her group, Lee Whitmore, and her long-term colleague and collaborator Dr Bob Janes from Queen Mary, University of London to celebrate her 65th birthday. It featured speakers from five continents, all of whom had collaborated with her or worked in her lab at some point in their careers.

Wallace joined what was then the Department of Crystallography at Birkbeck in 1990 after holding several research positions in her native USA. Her research on the structure and function of membrane proteins has won her several prestigious awards, including the Biochemical Society’s triennial AstraZeneca award in 2010. This is given for outstanding research in a UK or Irish laboratory that leads to the development of a new method or reagent. She has made significant contributions to both the development of circular dichroism spectroscopy as a tool for investigating the structures of proteins (including membrane proteins) at less than atomic resolution, and to studies of membrane protein structures using crystallography and electron microscopy. Her studies of voltage-gated sodium channel structures have led to some important insights about their functions in health and disease.

The symposium was divided into three sessions, with the first devoted to circular dichroism spectroscopy and the other two to membrane proteins. A general report of the day has been published on the Biological Sciences website; here, to fit in with the remit of the PPS course, I concentrate on the sections on membrane proteins.

The first talk was on electron microscopy, given, appropriately enough, by one of the pioneers of the field: Richard Henderson from the MRC Laboratory of Molecular Biology in Cambridge. Throughout most of the 1970s and 1980s he and his collaborator, Nigel Unwin, worked on the development of electron microscopy techniques for the study of protein structures. Most of their work involved the proton pump, bacteriorhodopsin, which is found in very high concentrations in the purple membranes of Halobacteria. At the beginning, this work was very time-consuming: it took them a year to locate the C-terminus of the protein, and another to determine the binding site of its ligand, retinal.

The first near atomic resolution structures of this protein were obtained in the mid-1990s. At about that time, too, he switched the focus of his interest from the structures of ‘2D crystals’ of bacteriorhodopsin to those of ‘blob-like’ single particles: isolated protein chains or, more often, membrane-embedded protein complexes. The list of biologically and medically important complexes to have been solved using this technique is now growing rapidly, and includes rotary ATP synthase (see the previous post on this blog); the next complex in the electron transport chain, known as respiratory complex I; and gamma secretase, which is a potential drug target for Alzheimer’s disease.

Molecular simulation and modelling techniques have developed alongside those of structural biology and for almost as long. Mark Sansom, a professor of structural bioinformatics at the University of Oxford, described simulations of membrane proteins. He started his talk describing a program to visualise and analyse the pores through the centres of membrane proteins that was written by Oliver Smart (now at the EBI) when he was a postdoc in Wallace’s group. This program, HOLE, is relatively simple but is still widely used. Sansom’s current work uses molecular dynamics to model the membrane bilayer with numbers of embedded proteins, focusing particularly on interactions between those proteins and the lipids of the membrane.

Not surprisingly, there were several talks about the ion channels that have been a focus of so much of Wallace’s more recent research: voltage-gated sodium channels. Hugh Hemmings from Weill Cornell Medicine, New York, USA described how these channels have become useful targets for anaesthetic drugs. General anaesthesia is a drug-induced coma characterised by unconsciousness, immobility and amnesia; an effective anaesthetic will achieve all these and a wide variety of molecules have been employed to greater or lesser effect since the nineteenth century. Many of these target proteins involved in the release of neurotransmitters by pre-synaptic nerves, including ion channels; sodium channels were first proposed as anaesthetic targets in the late 1970s but fell out of favour for several decades. Interest in this mechanism of anaesthesia has revived with the use of the bacterial proteins – a focus of Wallace’s structural studies – as a model system. Hemmings’ current studies focus on the mechanism through which volatile anaesthetics such as isoflurane inhibit the passage of sodium ions through these channels.

Crystal structure of the NavMs voltage-gated sodium channel pore depicted in ribbon motif and viewed from the membrane normal direction. Each of the four monomers in the tetrameric structure is depicted in a different colour (from Naylor et al., 2016 - Wallace lab paper). The transmembrane sodium pathway runs through the middle of the structure, from top to bottom.

Lin Field of Rothamsted Research, Harpenden, UK, described research leading to a very different application of sodium channel blockers: as insecticides. Insects cause an immense amount of crop damage worldwide, but non-specific insecticides might be toxic either to humans or to beneficial insects such as bees. The mechanism of the pyrethroid class of insecticides was unknown when the first members of this class were patented, but they are now known to bind to voltage-gated sodium channels and prevent their closure. Structural studies of these proteins have shown how mutations that are known to lead to pyrethroid resistance can prevent the molecules from binding, and why these compounds have very little effect on the very similar mammalian channels. Researchers hope that these studies are taking us nearer to the development of ideal, ‘designer insecticides’ that are only harmful to pest species.

Further talks were given by Wallace’s first Ph.D. student at Birkbeck, Declan Doyle, who is now at the University of Southampton; by Per Bullough from the University of Sheffield; and by Dame Carol Robinson, the first woman to be appointed as a full professor of chemistry at the University of Oxford. The symposium ended with a summary and vote of thanks from Janes, who stressed that it did not mark Wallace’s retirement: she still loves science and has many questions to answer. I hope that I will be blogging innovative research from the Wallace lab for many years to come.

Wallace’s research has been described in this blog on several previous occasions – see in particular this post from April 2013 and this one from November 2010. The use of cryo-electron microscopy to determine atomic resolution structures of proteins is covered in depth in our Techniques in Structural Molecular Biology course, which is one of the options for the second year of the Structural Molecular Biology MSc.

Thursday, 4 August 2016

ATP synthase: a new drug target for tuberculosis

The London Structural Biology Club (LSBC) is a network for students and researchers working in all aspects of structural molecular biology and based in London and the south-east of England. Once a term, members get together for an afternoon of research talks and discussion followed by refreshments (generally featuring pizza and beer). These meetings are often held at Birkbeck, and we have featured them on the PPS blog before (see this post from 2008 and this one from 2012).

The LSBC meeting for the summer term of 2016 was also held at Birkbeck, and hosted and chaired by Alfonso de Simone, a lecturer in NMR spectroscopy at Imperial College London. One of the four talks was given by Thomas Meier, also from Imperial College. Meier, who has worked at ETH Zurich, Switzerland, and the Max Planck Institute of Biophysics in Frankfurt, Germany, was appointed to a chair of structural biology at Imperial just over a year ago. His research concerns the structure and function of a tiny, complex ‘molecular machine’, ATP synthase (the link is to our page on that enzyme in section 10 of the PPS course). This enzyme is a ‘rotary ATP synthase’, catalysing the conversion of the electrochemical energy of ion transfer across the cell membrane into chemical energy stored in ATP. Meier and his group have solved the complete structure of this multi-subunit enzyme complex using a combination of X-ray crystallography and electron microscopy, and shed light on its role as a target for a novel class of drugs against tuberculosis.

ATP synthase is a highly dynamic enzyme complex, which makes it particularly difficult to study structurally. The complete enzyme comprises two motor sub-complexes; most of one, termed F_o, is embedded in the membrane and the other, F₁, is in the matrix. They are tightly coupled with each other and linked by subunits forming an outer stalk and an inner, central stalk. From a functional point of view, the ATP synthase consists of a rotor part and a stator part, both of which include subunits from the F₁ and the F_o sectors. The motors are driven either by the proton (or sometimes Na⁺) motive force to form ATP, or in the opposite direction by the hydrolysis of ATP to pump ions across the membrane. Similar enzymes are found in the inner membranes of bacteria and mitochondria and in the thylakoid membranes of chloroplasts.

John Walker of the MRC Laboratory of Molecular Biology in Cambridge was awarded a share of the 1997 Nobel Prize in Chemistry for determining the structure of the bovine F₁ motor using X-ray crystallography (e.g. PDB 1E79). This subcomplex consists of five different types of subunits; three pairs of similar alpha and beta subunits are arranged alternately around the rotor (gamma / delta / epsilon subunits), which harbours an asymmetric coiled coil domain in the gamma subunit. The nucleotides ADP and ATP can bind to the interfaces between the beta and alpha subunits; the central gamma subunit rotates in 120° steps, causing conformational changes that in turn change the affinity of the three catalytic sites for the nucleotides. Rotation in one direction, driven by the energy of ion transfer across the membrane, leads to synthesis of ATP from ADP and inorganic phosphate (P_i); rotation in the other direction will hydrolyse ATP back into ADP and P_i, thus releasing the energy required to pump ions.

The membrane-embedded F_o motor consists of the rotor part, a ring of identical c-subunits termed the c-ring, and the stator part, a single-chain a-subunit that provides access and release pathways for ions. Each c-ring subunit is a helical hairpin with its N and C termini on the cytoplasmic side; the number of subunits is constant within a species but varies between species: as far as we know today, it ranges from 8 to 15. Meier’s group has solved the structures of a number of c-rings from different species of bacteria, helping to elucidate the rotor’s mechanism of action; essentially, protons (or in some cases Na⁺ ions) reach one of the c-ring subunits by an ion pathway mediated by the stator a-subunit, where they lock to a free ion binding site on the c-ring, and then rotate with the c-ring for almost a complete 360° turn to reach the second, release pathway that leads to the other side of the membrane. The ion translocation causes rotation of the F_o ring and with it the complete central stalk that protrudes into the F₁ headpiece.
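The coupling between the two motors can be put as simple arithmetic. The sketch below is only an idealisation: it assumes perfectly tight coupling, with one ion carried per c-subunit per full turn and three ATP molecules made per turn (one for each 120° step of the gamma subunit). The function name and the constant are illustrative, not taken from any published code.

```python
from fractions import Fraction

# Assumption: three catalytic sites, so one full turn of the rotor yields 3 ATP.
CATALYTIC_SITES = 3


def ions_per_atp(c_subunits: int) -> Fraction:
    """Ideal ion cost per ATP for a c-ring with the given number of subunits.

    Assumes one ion is translocated per c-subunit per full 360-degree turn.
    """
    if c_subunits <= 0:
        raise ValueError("a c-ring must contain at least one subunit")
    return Fraction(c_subunits, CATALYTIC_SITES)


if __name__ == "__main__":
    # Range of c-ring sizes mentioned in the text: 8 to 15 subunits.
    for n in range(8, 16):
        print(f"c{n} ring: {float(ions_per_atp(n)):.2f} ions per ATP")
```

On these assumptions, a larger c-ring means that more ions must cross the membrane for each ATP made, which is one way of seeing why the ring size matters to the organism.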

Artist's impression of an ATP synthase molecule embedded in a membrane. Image © Laura Preiss, Max Planck Institute of Biophysics, Frankfurt, Germany

Until recently, ATP synthase has not been thought of as a drug target, principally because the structures and mechanisms of the bacterial and human enzymes are so similar. Now, however, it has emerged that it is the target for bedaquiline, the first novel drug to be approved for treating tuberculosis (TB) for over 40 years. And new drugs for TB are needed very badly: in 2015, over 9 million people contracted this disease and about 1.5 million died from it. Tens of thousands of TB cases each year are multidrug-resistant (MDR) or even of the extensively drug-resistant (XDR) variant, for which no other clinically approved antibiotic remains available. Bedaquiline, however, is effective against both MDR- and XDR-TB strains.

Functional analysis has shown that bedaquiline, which is a diarylquinoline, acts by binding to and halting the rotation of the F_o rotor. Meier and his co-workers have now solved the structure of the drug bound to the c-ring from a similar, non-pathogenic bacterial species, Mycobacterium phlei. This structure has nine c-subunits and shares over 80% sequence identity with the M. tuberculosis c-subunit variant (100% match at and around the drug binding surface). The crystal structure (PDB 4V1F; Preiss et al. (2015)) shows the drug occupying the proton-binding site on each of the nine subunits and thus preventing proton transfer. This stalls the F_o motor and thus prevents the synthesis of ATP in F₁. Small differences between the structures of the proton-binding sites account for the exquisite specificity of bedaquiline for the F_o rings of mycobacteria and thus for its efficacy and safety as an anti-tubercular drug.

Meier ended his talk by explaining that TB-causing bacteria will eventually – perhaps sooner rather than later – develop resistance to bedaquiline, just as they have to every previous drug that has entered the clinic. There is therefore a pressing need to develop further drugs that act at the same target, and his group’s structural studies are proving useful in the search for bedaquiline analogues.

The London Structural Biology Club has a public Facebook group, which can be found here.

Tuesday, 31 May 2016

Molecules that Walk

The Department of Biological Sciences’ contribution to Science Week 2016 kicked off on 11 April with a lecture by Dr Anthony Roberts, a young Principal Investigator who arrived at Birkbeck in 2014. Anthony received his B.Sc. from Imperial College in London and his Ph.D. from the University of Leeds, and spent four years as a postdoc at Harvard in the USA before moving here to start his own research group as a Sir Henry Dale Fellow of the Wellcome Trust and Royal Society.

Anthony began his lecture by explaining that he was going to talk about molecules that have the capacity to produce directed movement – or to ‘walk’ – and their importance for human health. These molecules are all proteins, and the context in which they move is the interior of living cells. Both the proteins he studies, kinesin and dynein, ‘walk’ on a network of highways conceptually not unlike the transport system that we use to move around London. These cellular highways are filaments called microtubules, which, unlike our roads and railway tracks, are able to self-assemble and also to self-destruct.

The ability to move is one of the fundamental properties of life, and scientists and philosophers have been studying it for millennia. Muscles were identified as the organs of movement in antiquity, but it was not until the mid-twentieth century that the molecules involved in muscle contraction could be identified. The Hungarian physiologist Albert Szent-Györgyi discovered the muscle proteins now named actin and myosin using very simple equipment during the Second World War.

These proteins have similarities with kinesin and dynein, although historically they have been easier to study due to their abundance in muscle; actin forms fibrils and the enzyme myosin binds to and ‘walks’ along these filaments. This process, like all movement, requires energy, and this is obtained from the cell’s power source, the small molecule adenosine triphosphate (ATP). The part of the myosin molecule that binds to actin, which is called its head, breaks a phosphate bond in this molecule to liberate energy and power the walking motion; many of these ‘power strokes’ together cause the muscle fibre to contract.

Ideally, we would want to watch this, or any other form of molecular motion, in real time, but this is impossible because molecules are far too small: smaller than the wavelength of light, so they cannot be viewed in a light microscope. Studies of molecular structure require techniques like X-ray crystallography and electron microscopy, both of which have been used to study motor molecules.

However, neither of these techniques can do more than generate still images. Movement can only be inferred by taking lots of snapshots of the molecules at different points during the movement cycle, rather like the earliest movies. We have now built up a complete picture of actin and myosin that is detailed enough for the positions of individual atoms to be seen clearly.

Not all movement in nature, however, uses muscles. Single-celled organisms – the ‘animalcules’ observed by pioneer microscopist Antonie van Leeuwenhoek in the 1670s – have directed movement, as do bacteria, and these have neither muscles nor nervous systems. And directed movement also occurs inside cells. A good example of this is the division of replicated DNA between daughter cells during cell division.

The interior of all cells is a viscous mixture, crowded with molecules; it is possible for small molecules to move from one part of a cell to another through diffusion, but this process would be impossibly slow for larger ones. Motor proteins, on the other hand, can carry ‘cargo’ molecules across cells remarkably quickly and efficiently. Motor proteins can traverse a distance of 0.1 mm – the length of a large animal cell – in two minutes, which in terms of lengths per second is approximately three times faster than a car.
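The comparison with a car is easy to check as a back-of-envelope calculation. The distance and time below come from the paragraph above; the length of the motor protein and the length and speed of the car are my own round-number assumptions, chosen only for illustration, and the final ratio depends heavily on them.

```python
def speed_m_per_s(distance_m: float, time_s: float) -> float:
    """Average speed in metres per second."""
    return distance_m / time_s


def lengths_per_second(speed: float, body_length_m: float) -> float:
    """Speed expressed in the mover's own body lengths per second."""
    return speed / body_length_m


# From the text: 0.1 mm covered in two minutes.
MOTOR_SPEED = speed_m_per_s(0.1e-3, 120.0)

# Assumptions (not from the lecture): a motor protein about 60 nm long,
# and a 4.5 m car travelling at 20 m/s (72 km/h).
MOTOR_LENGTH_M = 60e-9
CAR_LENGTH_M = 4.5
CAR_SPEED = 20.0

MOTOR_RELATIVE = lengths_per_second(MOTOR_SPEED, MOTOR_LENGTH_M)
CAR_RELATIVE = lengths_per_second(CAR_SPEED, CAR_LENGTH_M)
RATIO = MOTOR_RELATIVE / CAR_RELATIVE

if __name__ == "__main__":
    print(f"motor speed: {MOTOR_SPEED * 1e6:.2f} micrometres per second")
    print(f"motor: {MOTOR_RELATIVE:.1f} lengths/s, car: {CAR_RELATIVE:.1f} lengths/s")
    print(f"ratio (motor / car): {RATIO:.2f}")
```

With these particular values the motor comes out roughly three times faster than the car in relative terms; a shorter motor or a slower car would push the ratio higher.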

Both the motor proteins studied in Anthony’s lab, kinesin and dynein, ‘walk’ along microtubules inside cells. These filaments typically form with one end towards the centre of the cell, and its nucleus, and the other towards the cell periphery, and the motor proteins move in opposite directions: dynein towards the nucleus, and kinesin towards the cell edge.

Any kind of directed movement by molecules is challenging for several reasons. Motor molecules have no equivalent of our nervous systems for controlling movement, and they are far too small to be held on their tracks by gravity; instead, they grip the microtubules using chemical forces. They experience negligible inertia, and are constantly buffeted by other molecules in the cell. It would therefore be catastrophic for the whole of a walking molecule to leave its path at once.

The structure and function of conventional kinesin are now fairly well understood. It consists of two identical protein chains, and each chain has two major domains separated by a short linker. The larger domains of the two chains coil together to form a single long stalk; the smaller domain of each chain is globular and attaches to the microtubule, so the molecule looks rather like a single leg with two feet. Each of the feet is an enzyme that generates the energy for the motion by breaking down ATP to form ADP and release a phosphate group, and it cycles between ATP-bound, ADP-bound and empty states.

The step between ADP-bound and empty is a bottleneck that can be relieved when the foot attaches to the microtubule in a particular position, ensuring that the whole molecule moves in the correct direction. The trailing foot is released from the microtubule and the cycle begins again once ATP has bound to the front foot, triggering a conformational change in the whole molecule.

The core of kinesin is similar in structure to myosin, suggesting that these two proteins have a common ancestor. The other microtubule-bound motor protein, dynein, has a different origin. Although we still know comparatively little about it, it was actually the first of the microtubule-bound motor proteins to be discovered: this was in the 1960s, when it was found as the protein that generates the force that allows protozoa and sperm cells to swim. Anthony’s group, however, has been studying how it functions inside cells to move ‘cargo’ – often nucleic acids or other proteins – from the edges of the cell towards its interior. It also helps to pull the duplicated genetic material between the two halves of the cell during cell division.

Dynein is a much larger and more complex molecule than the other motor proteins. Its structure, like those of the other proteins, has several components: in this case, a stalk, a ring and a tail, with a linker between the stalk and the ring. Much of what we know about this large structure has come from electron microscopy, and more recently X-ray crystallography.

The structure of dynein; the stalk is shown in yellow and the linker in magenta.

Anthony’s group and others have developed a model in which the main mechanical element is the linker, which bends and straightens to displace the cargo-bound end of the structure along the microtubule in the direction of travel. The image shown here is a still from an animated model of how dynein generates movement, which remains speculative in places and is helping to stimulate new experiments in these areas. It is also incomplete, as it only shows one half of the molecule: we do know that dynein, like kinesin, is a biped, but exactly how its ‘feet’ are coordinated remains at the frontier of our knowledge.

Anthony ended his talk by discussing some actual and potential medical applications of studies of walking molecules. Some commonly used anti-cancer drugs, including taxol, work by stabilising microtubules to prevent motion and therefore stop cancer cells from dividing. Molecules that interact with motor proteins are also being studied as potential treatments for neurodegenerative diseases and for some types of heart disease. One such compound is a myosin activator, omecamtiv mecarbil, which is showing promise as a treatment for heart failure. And we are likely to discover further applications as we learn more about these fascinating walking molecules.

Tuesday, 10 May 2016

Crystallography: from Chocolate to Drug Discovery

Birkbeck has already established lecture series in honour of some of its most distinguished alumni. Until 2016, however, Rosalind Franklin – co-discoverer of the DNA structure and perhaps the most widely recognisable of its ‘famous names’ – was missing from the list of honourees. This gap has now been filled; the annual Rosalind Franklin lecture forms part of the college’s Athena SWAN programme and will always be given by a distinguished woman scientist. And fittingly, the inaugural lecture, which was part of Science Week 2016, was devoted to Rosalind Franklin’s own discipline, crystallography. Elspeth Garman, Professor of Molecular Biophysics at Oxford University, gave an entertaining and illuminating lecture to a large audience that included Rosalind’s sister, the author Jenifer Glynn.

Garman began her lecture by showing a short video that she had produced for OxfordSparks.net that used a ‘little green man’ to illustrate the method of X-ray crystallography that is used to obtain molecular structures from crystals. The rest of the lecture, she said, would simply go through that process more slowly. She started by showing some beautiful examples of crystals. All crystals are formed from ordered arrays of molecules. They can be enormous, such as the crystals of the mineral selenite in a cave in Mexico that measure over 30 feet (about 9 metres) long, or too small to be visible with the naked eye.

In the early decades of crystallography, structures could only be obtained from crystals of the smallest, simplest molecules: the first structure of all, published in 1913 by the father-and-son team of W.H. and W.L. Bragg, was of table salt. When they were jointly awarded the Nobel Prize for Physics in 1915, the younger Bragg was a 25-year-old officer in the trenches on the Western Front. His record as the youngest Nobel Laureate was unbroken until Malala Yousafzai’s Peace Prize in 2014.

The Braggs’ discoveries paved the way for studies of the structures of many, many substances: including the chocolate of the lecture title. Few of the audience can have known that chocolate exists in six different crystal forms, or that only one of these (Form V) is good to eat. The process of ‘tempering’ – a series of heating and cooling steps – is used to ensure that it solidifies in the correct form.

Garman then moved on to talk about her own field of protein crystallography. Proteins are the ‘active’ molecules in physiology, and they are formed from long, linear strings of 20 different ‘beads’ (actually, small organic molecules known as amino acids). Chemists can quite easily find out the sequence of these beads in a protein, but it is impossible to work out from this the way that the string will fold up into a definite structure ‘like a piece of wet spaghetti’. And it is this structure that places different units with different chemical properties on the surface or in the interior of the protein, or near each other, and that therefore determines what the protein will do.

Protein crystallography only became technically possible in the mid-twentieth century, and even then it was a painfully slow and complex process that could only be used to study the smallest, simplest proteins. Dorothy Hodgkin, also a professor at Oxford, won her Nobel Prize in Chemistry in 1964 for the structures of two biologically important but fairly small molecules: penicillin, with 25 non-hydrogen atoms and vitamin B12, with 80. She is perhaps better known for solving the structure of insulin, the protein that is missing or malfunctioning in diabetics. This has 829 non-hydrogen atoms; in contrast, the 2009 Chemistry Nobel Prize was awarded for the structure of the ribosome, the large (by molecular standards) ‘molecular machine’ that synthesises proteins from a nucleic acid template. The bacterial ribosome used for the Nobel-winning structural studies is well over 300 times larger than insulin, with over a quarter of a million atoms.

Protein structures are not only beautiful to look at and fascinating to study, but they can be useful, particularly for drug discovery. Many useful drugs have already been designed at least partly by looking at a protein structure and working out the kinds of molecule that would bind tightly to it, perhaps blocking its activity. Some viral proteins have been particularly amenable to this approach. Rosalind Franklin did some of the first research into virus structure when she was based at Birkbeck, towards the end of her tragically short life, and her student Aaron Klug cited her inspiration in his own Nobel lecture in 1982. X-ray crystal structures were used in the design of the anti-flu drugs Relenza™ and Tamiflu™ and of HIV protease inhibitors, and more recently still structures of the foot-and-mouth virus are helping scientists develop new vaccines for tackling this potentially devastating animal disease. The foot-and-mouth virus structure even made the front page of the Daily Express.

The equipment that Dorothy Hodgkin and her contemporaries used to solve protein structures in the 1960s and 1970s looks primitive today. Now, almost every step of protein crystallography has been automated. Powerful beams of X-rays generated by synchrotron radiation sources, such as the UK’s Diamond Light Source in Oxfordshire, allow structures to be determined quickly from the smallest crystals. It is even possible to control some of these machines remotely; Garman has operated the one at Grenoble from her sitting room. Yet there is one step that has changed remarkably little. It is still almost as difficult to get proteins to crystallise as it was in the early decades. Researchers have to select which of a large number of combinations of conditions (temperature, pH and many others) will persuade a protein to form viable crystals. Guesswork still plays a large part and some researchers seem to be ‘better’ at this than others: Garman adds the acronym ‘GMN’ or ‘Grandmother’s maiden name’ to her list of conditions to reflect this.

Yet, with every step other than crystallisation speeded up and automated beyond recognition, the trickle of new structures in the 70s and even 80s has become a torrent. Publicly available structures are stored online in the Protein Data Bank, which started in 1976 with about a dozen structures: it now (May 2016) holds over 118,000. Protein crystallography as a discipline is thriving, but there are many challenges ahead. We are only now beginning to tackle the 70% or so of human proteins that are only stable when embedded in fatty cell membranes and are therefore insoluble in water. It is possible to imagine a time when we will be able to solve the structure of a single molecule, with no more need for time-consuming crystallisation. And, hopefully, women scientists will play at least as important a role in the second century of crystallography as they – from Quaker Kathleen Lonsdale, who developed important equations while jailed for conscientious objection during World War II, through Franklin and Hodgkin to Garman and her contemporaries – have in the first.

Wednesday, 9 March 2016

Evolution and Assembly of Protein Complexes

Since 2003, all Birkbeck researchers in structural biology and allied disciplines have collaborated with colleagues at UCL in the Institute of Structural and Molecular Biology. The Institute holds a varied series of events throughout the year, including a programme of research seminars arranged termly around current themes in molecular and structural biology research. The theme for the spring term 2016 seminar programme has been ‘Protein Dynamics: from Folding to Function’; one of the first of the distinguished scientists invited to present their research under that theme was Sarah Teichmann from the EMBL – European Bioinformatics Institute and the Sanger Institute at Hinxton, near Cambridge, UK. She gave a fascinating talk that linked evolution and protein folding to the topic of Section 7 of the PPS course, quaternary structure (or the assembly of protein complexes).

Teichmann has won many awards in what is still quite a short research career, including the Biochemical Society’s Colworth Medal for ‘an outstanding research biochemist under the age of 35’ (2011) and the EMBO Gold Medal (2015). Last year she was elected a Fellow of the prestigious Academy of Medical Sciences, with a citation that commended her as representing ‘a new breed of scientists at the interface between computational and experimental molecular biology’. She is also an advocate for women in science and has written a children’s novel.

She began her seminar by asking two related questions: ‘how do protein complexes assemble?’ and ‘how do protein complexes evolve?’ and by misquoting the poet John Donne: ‘no protein [man] is an island’. Many proteins are functional only when bound to others to form complexes, and in the crowded environment of a cell each newly synthesised protein has only a limited amount of time to find its partners and form a stable complex. Much can be learned about the evolution and dynamics of complex formation by studying the complexes that are available in the Protein Data Bank. Her group’s evaluation of these structures has contributed to the software that the PDB uses to predict the functional biological unit (monomer, dimer or multimer) for each structure in the PDB, and has led to the 3Dcomplex.org database of protein complexes. This database provides a hierarchical classification of what are now over 30,000 protein quaternary structures. Each complex is represented using graph theory as a simple 2D figure or ‘mini-graph’, with each polypeptide chain as a node and each interaction surface between two chains as an edge. These little graphs make it easier to distinguish between topologies involving the same number of subunits: for example, a complex of six identical protein chains may be a simple hexamer with 6-fold rotational symmetry (such as the traffic ATPase [PDB 1g6o]) or a dimer of trimers with 32 symmetry (such as annexin XII [PDB 1aei]). Both of these proteins are described in Section 7 of the PPS course material.
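The 'mini-graph' idea can be sketched in a few lines of Python. This is only an illustration: the chain labels and contact lists below are idealised assumptions (a ring for the 6-fold hexamer, two stacked triangles for the dimer of trimers) and have not been taken from the 1g6o or 1aei coordinate files or from the 3Dcomplex database.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph: chains are nodes, interfaces are edges."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_sequence(graph):
    """Sorted number of interfaces made by each chain."""
    return sorted(len(neighbours) for neighbours in graph.values())


def edge_count(graph):
    """Number of distinct interfaces (each edge is stored twice)."""
    return sum(len(neighbours) for neighbours in graph.values()) // 2


# Idealised ring hexamer: each chain touches its two neighbours in the ring.
ring_hexamer = build_graph(
    [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "A")]
)

# Idealised dimer of trimers: two triangles (ABC and DEF), with each chain
# also touching its partner in the other trimer.
dimer_of_trimers = build_graph(
    [("A", "B"), ("B", "C"), ("C", "A"),
     ("D", "E"), ("E", "F"), ("F", "D"),
     ("A", "D"), ("B", "E"), ("C", "F")]
)

for name, graph in [("ring hexamer", ring_hexamer),
                    ("dimer of trimers", dimer_of_trimers)]:
    print(name, degree_sequence(graph), edge_count(graph))
```

Both graphs have six nodes, but the number of interfaces per chain differs, which is why the two arrangements are easy to tell apart once they are drawn as graphs.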

Alongside the hierarchy described in the 3Dcomplex database, protein complexes can be divided into two large groups: homomers, which consist of multiple copies of the same polypeptide chain, and heteromers involving different chains. (Haemoglobin, a tetramer with two alpha and two closely related beta chains, is arguably an intermediate between the main two types.) Teichmann spent the rest of her lecture addressing three related questions about the assembly of both homomers and heteromers:

i) Does the assembly of protein complexes drive evolution?
ii) What are the mutational mechanisms involved in complex formation?
iii) Can the principles of protein assembly be used to predict topologies that have not yet been seen?

Starting with the first question, from an evolutionary point of view the simplest complex to form is a homodimer with two copies of the same monomer; one mutation that turns part of a protein surface into a ‘sticky patch’ is all that is necessary to stabilise dimer formation. Not surprisingly, the homodimer is also the commonest type of quaternary structure found in the PDB. Once a protein has dimerised, additional monomers can be added to form larger complexes with cyclic symmetry, or the dimer itself can (for example) dimerise. The order in which the interfaces in a multimer formed during evolution can be predicted from the amount of surface area buried by the formation of each interface, with the largest surface areas being buried first. This simple rule applies to complicated assemblies as much as to simple ones, and to heteromers as much as to homomers. Therefore, for all but the very simplest structures, it is almost impossible to predict the form that a complex will take unless you know the order in which the subunits assemble. Joseph Marsh, a former postdoc in Teichmann’s group now working at the MRC Human Genetics Unit in Edinburgh, represents this in an analogy with the assembly of flat-pack furniture, with and without instructions.
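The 'largest interface first' rule lends itself to a short worked example. The sketch below is a deliberately simplified reading of that rule and not the method used by Teichmann's group: the function name `assembly_order` is hypothetical, the buried areas (in square ångströms) are invented for illustration, and each interface is treated independently.

```python
def assembly_order(interfaces):
    """Predict assembly steps from (chain_a, chain_b, buried_area) tuples.

    Interfaces are visited from largest to smallest buried area. An
    interface that joins two separate sub-complexes counts as an assembly
    step; one that lies within an existing sub-complex is skipped.
    """
    # Every chain starts out as its own sub-complex.
    complex_of = {}
    for a, b, _ in interfaces:
        complex_of.setdefault(a, frozenset([a]))
        complex_of.setdefault(b, frozenset([b]))

    steps = []
    for a, b, area in sorted(interfaces, key=lambda item: item[2], reverse=True):
        first, second = complex_of[a], complex_of[b]
        if first == second:
            continue  # both chains are already in the same sub-complex
        merged = first | second
        for chain in merged:
            complex_of[chain] = merged
        steps.append((sorted(first), sorted(second), area))
    return steps


# Hypothetical tetramer: two large interfaces and two small ones.
example_interfaces = [
    ("A", "B", 1800.0),
    ("C", "D", 1800.0),
    ("A", "C", 650.0),
    ("B", "D", 650.0),
]

example_steps = assembly_order(example_interfaces)
for first, second, area in example_steps:
    print("+".join(first), "binds", "+".join(second), "burying", area)
```

With these invented numbers the two dimers (A with B, C with D) form first and then come together as a dimer of dimers, so the predicted pathway is monomer, dimer, tetramer.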

Teichmann tested some of her predictions of protein assembly pathways using mass spectrometry in collaboration with Professor Dame Carol Robinson’s group at the University of Oxford, and found that seven out of nine pathways and 22 out of 27 steps within those pathways had been predicted correctly. This hierarchy of subunit assembly can also be used to predict the evolution of a complex, so it is clear that the assembly of protein complexes can indeed drive evolution.

Turning to the second question, Teichmann used specific examples of protein families that adopt different quaternary structures in different species, including the PyrR family of bacterial pyrimidine operon attenuators, to explore the evolutionary mechanisms that take a protein from one that is most stable as a monomer to different multimeric forms. These can involve direct mutations at the interface between subunits (for example, making the protein surface ‘stickier’ or creating a salt bridge) and other so-called ‘allosteric’ mutations that change the protein structure to allow different interfaces to form. Often, the difference between (for example) a protein that is stable as a dimer and one that is stable as a tetramer will come down to changes in a few amino acids. In the case of the PyrR attenuator family, mutations away from the interface drive a conformational change that is equivalent to the one that occurs when the protein binds DNA, and so stabilise multimer formation.

Finally, Teichmann considered the use of the assembly principles that she had outlined in predicting the form that a protein complex would take from scratch. Most basic steps in complex assembly, as described earlier, can be grouped into one of three categories: dimerisation of one or more chains, adding an identical subunit or subunits to a complex (cyclisation) and adding a different type of subunit. These can be combined in different ways to form a large number of possible quaternary structure topologies. So far, about 120 different topologies are represented in the PDB, with four or five new ones being added each year, and the vast majority of these fit into one of Teichmann’s topologies. She assembled all the predicted topologies, including those not yet observed, into a ‘periodic table of protein complexes’ (S.E. Ahnert et al., Science 350, aaa2245 (2015)). This table has already correctly predicted the topology of some newly determined complexes that were not included in the original list.
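To see how three kinds of step can generate many possibilities, here is a toy enumeration in Python. It is my own bookkeeping and not the classification used in the paper: a complex is reduced to a pair (number of subunit types, copies of each type), dimerisation is assumed to double the copy number, cyclisation to multiply it by a ring size of three or more, and adding a different subunit to increase the number of types by one. The function name `reachable_states` and the limits are hypothetical.

```python
from collections import deque


def reachable_states(max_types, max_copies, ring_sizes=(3, 4, 5, 6)):
    """List the (types, copies) pairs reachable from a monomer.

    Uses a breadth-first search over three assumed step types:
    dimerisation, cyclisation and addition of a new subunit type.
    """
    start = (1, 1)  # a monomer: one subunit type, one copy
    seen = {start}
    queue = deque([start])
    while queue:
        types, copies = queue.popleft()
        candidates = [(types, copies * 2)]                       # dimerisation
        candidates += [(types, copies * n) for n in ring_sizes]  # cyclisation
        candidates.append((types + 1, copies))                   # new subunit type
        for state in candidates:
            new_types, new_copies = state
            within_limits = new_types <= max_types and new_copies <= max_copies
            if within_limits and state not in seen:
                seen.add(state)
                queue.append(state)
    return sorted(seen)


# Small example: up to two subunit types, up to six copies, trimeric rings only.
small_table = reachable_states(max_types=2, max_copies=6, ring_sizes=(3,))
for types, copies in small_table:
    print(types, "type(s) x", copies, "copies")
```

Even this crude model shows the pattern: a small set of steps, applied repeatedly, fills out a grid of possible complexes, some of which may not yet have been observed.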
