Principles of Protein Structure

Tuesday, 8 July 2014

Mimicking DNA: How a Repressor Meets its Waterloo

Bacillus subtilis is a non-toxic bacterium commonly found in soil, usually in the form of a dormant spore. It is extremely hardy due to a remarkable list of adaptations to environmental threats. These include the production of antibiotics and degrading enzymes, amalgamation into biofilms, the formation of an endospore and even the destruction of sibling bacteria.

This impressive range of threat responses is triggered by many different adaptation genes, which are governed by an army of regulators, some specific and some global.

One notable global regulator protein is AbrB, which represses several adaptation genes when cell conditions are favourable. When the bacterium faces a threat, two mechanisms block this repressor.

Under stress, an upstream regulator, Spo0A, is phosphorylated and, in an envy-inducing display of multi-tasking, is able to bind to and inhibit the gene for AbrB and also activate the gene for the anti-repressor AbbA.

AbbA binds to the repressor AbrB and blocks it, thereby allowing adaptation to environmental threat. AbbA has recently been revealed as a DNA mimic, which competes very effectively for the AbrB binding site by copying key DNA characteristics.

Inspection of the primary structure of AbbA reveals 65 residues, of which 20 are polar. Given that the core of a protein tends to be hydrophobic, this implies a polar surface.
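A residue count of this kind is easy to script. The following is a minimal sketch, not the authors' method: the set of amino acids treated as polar is one common textbook convention (an assumption here), and the sequence is a made-up toy peptide, not AbbA. For the figures quoted above, 20 of 65 residues is roughly 31%.

```python
from typing import Tuple

# One common convention for "polar" side chains (charged plus uncharged polar).
# Other classifications exist; this set is an assumption for illustration.
POLAR = set("DEKRHNQSTY")


def polar_fraction(sequence: str) -> Tuple[int, float]:
    """Return (number of polar residues, fraction of the sequence that is polar)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    n_polar = sum(1 for residue in seq if residue in POLAR)
    return n_polar, n_polar / len(seq)


# Hypothetical toy peptide (NOT the AbbA sequence).
toy = "MKLEDAVIRS"
count, fraction = polar_fraction(toy)
print(f"{count} of {len(toy)} residues are polar ({fraction:.0%})")
```

Run on a real sequence, the same function would give a first impression of how polar a protein is overall, although it says nothing about where those residues sit in the folded structure.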

The next step was to establish the oligomerisation of the protein in its native state. Size-exclusion liquid chromatography (SELC) and native mass spectrometry both showed that the natural size of AbbA was double the mass of the monomer, that is, it dimerises.

NMR was used to solve the dimer structure.

This figure shows the ten lowest energy structures of the AbbA dimer obtained using NMR. One monomer has helices coloured blue to green whilst the second is yellow to red. (PDB 2LZF).

Figure taken from Tucker, A.T. et al (2014)

The monomer consists of three alpha helices, connected by two loops with a fairly unstructured N terminus. In the dimer there are substantial interactions between helices two and three of each monomer, which are largely hydrophobic, and further hydrophobic interactions between helices one and two.

To determine the binding site for AbbA and AbrB, SELC studies were performed on AbbA with the C terminus of AbrB and then with the N terminus. The team found that AbbA binds only to the N terminus, the region of the protein that is responsible for binding to DNA.

Previous mutation studies had shown that four arginine residues are responsible for AbrB binding to DNA, namely R8, R15, R23 and R24. Each of these was mutated to observe the impact on AbbA binding and three of them, R8, R15 and R23, were found to be critical, potentially indicating a similar binding mechanism.

To determine the strength of the competition offered to AbrB-DNA binding by AbrB-AbbA binding, isothermal titration calorimetry was used to measure the dissociation constants. This technique is covered in TSMB, one of the courses available after PPS. The dissociation constants showed that the binding strengths of the two pairs of molecules were similar, such that AbbA offers a significant threat to AbrB-DNA binding.
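To see why similar dissociation constants make AbbA a serious competitor, it may help to work through a simple equilibrium model. The sketch below assumes a single shared binding site on AbrB, free ligand concentrations that are not depleted by binding, and purely hypothetical Kd and concentration values; none of these numbers come from the paper.

```python
def fraction_bound_to_dna(dna_conc, kd_dna, mimic_conc, kd_mimic):
    """Fraction of repressor bound to DNA when a competitor shares the same site.

    Simple competitive-binding model: concentrations are free concentrations,
    expressed in the same units as the dissociation constants.
    """
    dna_term = dna_conc / kd_dna
    mimic_term = mimic_conc / kd_mimic
    return dna_term / (1.0 + dna_term + mimic_term)


# Hypothetical values in arbitrary units: the same Kd for both partners.
kd = 1.0
without_mimic = fraction_bound_to_dna(dna_conc=1.0, kd_dna=kd, mimic_conc=0.0, kd_mimic=kd)
with_mimic = fraction_bound_to_dna(dna_conc=1.0, kd_dna=kd, mimic_conc=10.0, kd_mimic=kd)

print(f"DNA-bound fraction without mimic: {without_mimic:.2f}")
print(f"DNA-bound fraction with tenfold excess of mimic: {with_mimic:.2f}")
```

In this toy model, a competitor with the same Kd as DNA displaces most of the repressor once it is present in excess, which is the qualitative point made by the calorimetry results.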

The last stage of the investigation was to determine the interaction between the AbbA homodimer and the AbrB homodimer.

Molecular docking was used to model this interaction and showed a sizeable interaction area with a complex pattern of 18 hydrogen bonds and 16 salt bridges. This site is at the highly negatively charged terminal of AbbA and the DNA binding face of N terminal AbrB.

The first helix of each AbbA monomer has to pull back to allow this strong interaction but AbbA's second and third helices maintain their conformation and act as a stabilising anchor.

The picture below gives a striking illustration of the extent of AbbA's mimicry of DNA.

This figure illustrates the similarities between AbbA and the DNA phosphate backbone. (a) is the NMR structure of AbbA with positively charged residues shown in blue and negatively charged in red. This can be compared with (c), a DNA fragment (PDB 1BNA), showing the charge distribution in the same colour scheme. The length of one turn of the helix and the minor groove are shown with yellow dotted lines. (b) shows the structure of AbbA in cartoon format with the side chain oxygens of glutamic acid residues 16, 29, 33 and 67 as red spheres and (d) is the same structure with the backbone of the DNA from PDB 1BNA superimposed in yellow.

Figure taken from Tucker, A.T. et al (2014).

Few DNA mimics have been discovered but they share the tactic of using negatively charged residues (glutamic acids and aspartic acids) to present similar bonding opportunities to the DNA backbone phosphates. It would seem that, faced with the challenge of competing with the strongly charged backbone of DNA, the mimics are following the logic that if you can't beat 'em, join 'em.

A DNA Mimic: The Structure and Mechanism of Action for the Anti-Repressor Protein AbbA

Tucker, A.T., Bobay, B.G., Banse, A.V., Olson, A.L., Soderblom, E.G., Moseley, M.A., Thompson, R.J., Varney, K.M., Losick, R., Cavanagh, J. J Mol Biol. (2014); 426(9):1911-24

Wednesday, 7 May 2014

The Many Uses of Bioinformatics

Every year, Birkbeck hosts a lecture by a distinguished scientist to honour the memory of the founder of its Crystallography Department, J.D. Bernal. “Sage”, as he was called by all who worked with him, had an enormous range of research interests spanning both science and society; he is widely considered one of the most brilliant scientists never to have won a Nobel Prize. The 2014 Bernal Lecture, held on March 27, was given by Professor Janet Thornton, the director of the European Bioinformatics Institute (EBI) at Hinxton near Cambridge.

Professor Dame Janet Thornton, © BBSRC 2014

Introducing the lecture, Professor David Latchman, Master of Birkbeck, described it as a unique occasion: the only time he has introduced as a guest lecturer someone whom he had interviewed for a job. Thornton includes both Birkbeck and UCL on her CV: appropriately, her last post in London was that of Bernal Professor, held jointly at both colleges. She moved on to “even greater heights” as director of one of Europe’s top bioinformatics institutions in 2003.

Thornton began her lecture with a quote from Bernal: “We [academics] can go on being useless up to a point, with confidence that sooner or later some use will be found for our studies”. That quote is of particular relevance to the subject that she has made her own: bioinformatics. She had already begun her research career in 1977, when Fred Sanger invented the process that was used to obtain the DNA sequence of the human genome. That endeavour, which was completed in 2003, took over ten years and cost billions of dollars. Sequencing a human-sized genome, which has about 3 billion base pairs of DNA, now takes maybe 10 minutes and costs about a thousand dollars. While a decade ago we had one “Human Genome”, we now have lots. Mega-sequencing projects already planned or in progress include projects to sequence about 8,000 Finns, and the entire 50,000 population of the Faeroe Islands; one to sequence paired tumour and normal genomes from 20,000 cancer patients; and the UK10K project, which is investigating the genetic causes of rare diseases.

It is now almost extraordinarily simple and cheap to obtain genomic data, but real challenges remain in interpreting and understanding it so that it can be used in medicine. This is the province of bioinformatics, and Thornton devoted much of her presentation to explaining five ways in which gene (and protein) sequence information is being applied to both basic and clinical medical research:

Understanding the molecular basis of disease
Investigating differences in disease risk caused by human genetic variation
Understanding the genomics of cancer
Developing drugs for infectious diseases, including neglected diseases
Investigating susceptibility to infectious disease

There are rather more than 20,000 genes in the human genome, far fewer than were originally predicted. Tiny differences between individuals in many of these either directly cause a genetic disorder or confer an increased – or in some cases decreased – risk of developing a disease. The genetic causes of some diseases, such as the bleeding disorder haemophilia, were known many years before the “genome era”: others have been discovered more recently. Mapping known mutations onto the structure of the enzyme copper, zinc superoxide dismutase has revealed the cause of the inherited disorder amyotrophic lateral sclerosis, a form of motor neurone disease. And knowing the genome sequence has already made an enormous contribution to our understanding of the mechanisms of disease development, contributing to improvements in diagnosis and the design of novel drugs.

We now understand that cancer is a genetic disease: it arises when mutations in a group of cells cause them to grow and divide excessively. A cancer is no longer classified just by its location (for example, a breast or lung cancer) but by the particular spectrum of genetic variations in its cells. About 500 different genes are known to be mutated in cancer, some much more often than others. For example, about 60% of cases of melanoma, a type of skin cancer, contain one specific mutation in the gene BRAF. This codes for a protein that can direct cells to grow and divide, and the cancer-causing mutation sticks this protein into the ON position, so this signal is always sent. Scientists in a company called Plexxikon used their knowledge of this mutation and the structure of the protein to design a drug, vemurafenib, which prevents the BRAF protein from signalling. This can cause a dramatic, if short-term, improvement in melanoma patients, but, crucially, it only works in patients whose cancers carry this mutation. It is one of the first developed examples of a “personalised medicine” that is only used alongside a diagnostic test for a genetic variation. There will soon be many more.

Genomics is also proving very useful in the fight against infectious disease. Antibiotic resistance is one of the greatest emerging threats to human health, and scientists have to use all the tools at their disposal, including genomics and bioinformatics, as they try to stay one step ahead of rapidly mutating pathogens. Sequencing is widely used to track the sources of outbreaks of infection and of resistant bacteria such as methicillin-resistant Staphylococcus aureus (MRSA) in hospitals, and it is the only way of determining the exact nature of an infection. One of the most dramatic examples of the use of genomics in infectious disease control occurred in 2011, when a novel strain of E. coli O104 caused about 4,000 cases of serious food-borne illness and 50 deaths in Germany. This was originally linked to cucumbers imported from Spain but a global effort to trace its specific sequence variants proved that the source of the infection was beansprouts grown on a farm near Hamburg.

There was much more to Thornton’s wide-ranging lecture than simply bioinformatics and medicine: more, indeed, than it is possible to do justice to in a single blog post. She went on to describe some of the benefits of genomics for agriculture and food security. These included designing new strategies for controlling pests and diseases, maximising the efficiency of biomass processing, and even managing biodiversity. It is necessary to measure biodiversity in order to manage it properly; it is now possible to define a short stretch of DNA sequence that fully identifies a species or sub-species (a so-called “DNA barcode”) and these are beginning to be used to track some very diverse organisms, including the 400,000 known species of beetle.
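The idea behind a DNA barcode can be sketched in a few lines. This is a toy illustration only: the reference sequences and species labels are invented, real barcodes are hundreds of bases long, and real identification pipelines rely on sequence alignment rather than the simple mismatch count assumed here.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def identify(query: str, references: dict) -> str:
    """Return the label of the reference barcode closest to the query."""
    return min(references, key=lambda name: mismatches(query, references[name]))


# Invented reference barcodes (hypothetical labels and sequences).
REFERENCES = {
    "beetle species A": "ACGTACGTAC",
    "beetle species B": "ACGTTCGAAC",
    "beetle species C": "TCGTACGTTG",
}

# A query read carrying one sequencing error relative to species B.
print(identify("ACGTTCGAAG", REFERENCES))
```

The design choice worth noting is that the query does not have to match a reference exactly: the closest barcode wins, which is what makes the approach tolerant of small sequencing errors.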

The lecture ended with a short discussion of some of the challenges facing bioinformatics and genomics in the second decade of this century, largely relating to difficulties with storing, manipulating and understanding the enormous quantity of data that is being generated. Mining this data mountain for the benefit of mankind is a task that is beyond either the academic community or the biotech industry alone. It will require novel ways of doing science that involve governments and charities as well as academia and industry. The new Centre for Therapeutic Target Validation, launched at Hinxton on the same day as Thornton’s Bernal Lecture, is a pioneering example of such a partnership. It has been set up by the EBI, the Sanger Institute where a third of the original human genome sequence was obtained, and pharmaceutical giant GSK, and its scientists aim to use the whole range of available genomic data to select and evaluate new targets for novel drugs.

Bioinformatics is covered in section 6 of the PPS course. Students who take the second-year option Techniques in Structural Molecular Biology will return to it then, where the material focuses on selecting protein targets for structural genomics initiatives: a task that is linked to that of selecting drug discovery targets.

This post will be cross-posted on the Birkbeck Events blog.

Thursday, 6 March 2014

Crystallins under the Lens

Written by Jill Faircloth

For generations, anyone who argued against evolutionary theory would point to the human eye and exclaim that nothing so perfectly adapted to its purpose could have evolved in a series of random steps. The well-rehearsed counter argument is that even a very basic recognition of light and shadow via an organic pinhole camera is useful as an aid to survival and that this could provide the first stepping stone towards the sophistication of the vertebrate eye (see references 1 and 2). The theory is supported by a succession of organisms with gradually increasing vision.

On a molecular level, the proof is harder to achieve but Christine Slingsby of Birkbeck's Department of Biological Sciences has used crystallography to do just that. In investigating the structure of the proteins of the vertebrate eye lens, Slingsby has not only greatly increased our understanding of their characteristics and mechanisms but also provided fascinating insights into their evolution.

Professor Slingsby's work is featured in several pages in the PPS course: Greek Key Motif, Beta Sandwiches, Lens Proteins and Cataract and Eukaryotic Genomes. Last year she published a paper (reference 3) which summarised the key conclusions of her research during the last ten years. This review is available in the Birkbeck e-library.

Vertebrate lenses comprise layers of highly elongated fibre cells which give transparency and focus but the refractive power is given by high concentrations of transparent proteins from two superfamilies: the alpha crystallins and the beta-gamma crystallins. These proteins, which are all mainly made up of beta strands, have been co-opted from their original functions to generate a functioning lens.

You don’t need to look far to find the probable origin of α crystallins. They are small heat shock proteins (sHsps), molecular chaperones that are present in most types of cell in most organisms. They are upregulated: that is, produced in greater quantities, by cells under environmental stress as part of the protein homeostasis response.

Despite their name, βγ crystallins are unrelated to α crystallins; all crystallins interact to form a refractive index gradient which can vary as required. Apart from the vertebrate lens where they are very prevalent, and in stark contrast to α crystallins, βγ crystallins are found only in other vertebrate eye tissues (except as a component of a much larger gene/protein known as Aim1) and this makes their origin harder to identify. Beta and gamma crystallins each contain four Greek key motifs organised as two βγ-crystallin domains.

There are several requirements for an eye lens protein. It must be expressed at very high levels, unlike sHsps, so the sHsp gene promoters would have required modification. The proteins must pack tightly and uniformly enough so that there is no irregularity on the scale of the wavelength of light and they must be soluble but must not crystallise or separate into different phases. In addition, lens fibre cells have lost their organelles, which could cause light scattering, and so have no mechanisms for protein repair or disintegration. Accordingly, these proteins need to have a lifespan as long as the vertebrate using them.

One of the main reasons for crystallins having been adopted as lens proteins could be that the two α crystallins are able to dynamically form polymers with highly diverse size and shape. This ability was demonstrated as the first crystal structure of a sHsp revealed a hollow octahedral structure of 24 α crystallin monomers. The next one to be solved showed point group 32 symmetry and was constructed from six dimers arranged in two interlocking rings.

This figure shows the beta-sandwich structure of the alpha-crystallin domain of a monomer, the formation of the dimer with the B6 beta strand exchanging into the partner beta-sandwich, and the oligomer with six dimers forming interconnecting discs. The dimers link using motifs on the C terminal extension which insert into the pocket between the B4 and B8 strands, shown in dark blue, and by interaction of the N terminal helices.

Figure taken from Slingsby, C. et al. (2013). PDB 1GME

In addition to the wide range of alpha-crystallin oligomers, the numerous βγ-crystallin chains can be assembled to create a wide range of polymers which coexist in a polydisperse stable but flexible arrangement of varying density.

Beta-crystallins thus appear to function in a similar way to α-crystallins, forming a diverse range of differently sized hetero-oligomers that adjust the refractive index throughout the lens.

Gamma crystallins are different because they are monomeric and polar. They are present in differing concentrations throughout the lens and their polarity results in distinctive orientations towards other crystallins which may regulate inter-crystallin interactions. There is evidence that disruption of these dipoles results in cataracts.

By examining genomes of organisms which predate the development of the camera eye, Slingsby has shed light on the evolutionary pathway of crystallins as lens proteins. PPS students will have read about the single-domain βγ-crystallin in the urochordate (invertebrate) sea squirt, Ciona intestinalis, that has exactly the same double Greek key structure as a vertebrate crystallin but includes a calcium binding sequence in each Greek key motif. This ancestral link was further demonstrated by the remarkable discovery that the gene promoter for Ciona-crystallin could successfully target reporter gene expression for proteins associated with vision in vertebrates.

Investigation of the genome of a cephalochordate, which is part of the lineage of both vertebrates and urochordates, revealed a less complex ancestor to βγ-crystallins. Signature sequences from the βγ-crystallins have also been found in bacterial and archaeal proteins. The implication of this is that all of the proteins of the vertebrate lens could well have evolved from proteins present in ancient species with no visual function. An interesting twist is introduced by the knowledge that the nonchordates, or animals without a spine, can use quite similar cellular lenses that involve non-crystallin proteins.

This suggests that lenses evolved independently in different animal kingdoms, relatively late on an evolutionary timescale, utilising different proteins that were available in the respective phyla: that is, proteins which already had an established purpose but which had qualities allowing them to form lenses. Since all species seem to have had access to at least a basic form of βγ-crystallin, it is an impressive demonstration that evolution can not only capitalise on the multiple possibilities presented by one family of proteins to develop a functioning visual system, but also repeat the trick from a different starting point.

References

1. Dawkins, R. (1994). The eye in a twinkling. Nature 368:690-691.
2. Nilsson, D.E., Pelger, S. (1994). A pessimistic estimate of the time required for an eye to evolve. Proc. Biol. Sci. 256(1345):53-58.
3. Slingsby, C., Wistow, G.J. and Clark, A.R. (2013). Evolution of crystallins for a role in the vertebrate eye lens. Protein Sci. 22(4):367-80.

Thursday, 9 January 2014

New protein structures presented at the 2013 BCA Winter Meeting

The work of crystallographers in the UK is supported through the British Crystallographic Association, which has about 700 members based in academia and industry. It is organised into four groups representing different disciplines within crystallography, including one for structural biologists called, not surprisingly, the Biological Structures Group. Every December, this group organises a one-day conference to present some of the most recent developments in structural biology. I have blogged these meetings before, and searching this blog for "winter meeting" will find a few of those posts.

The 2013 meeting was billed as both a "final" event in the centenary year of the Braggs' landmark discoveries and part of the build-up to the International Year of Crystallography, but these were not the only anniversaries highlighted there. 2013 also marked the sixtieth anniversary of the publication of the structure of DNA. The 2013 Winter Meeting was held in King's College London, which played a very important part in that discovery: Maurice Wilkins and Rosalind Franklin, who obtained the X-ray diffraction patterns that led to the discovery of the double helix, were based there. (Wilkins shared the Nobel Prize for this discovery with Watson and Crick; Franklin died in 1958, four years before that prize was awarded.) And the first precise physical model of the double helix is still on display in the college.

Maurice Wilkins' original DNA model

The first researcher to speak at the meeting was Birkbeck's own Professor Bonnie Wallace. Her work on the structures of voltage-gated sodium channels has been described on this blog before, most recently in April 2013. These proteins are responsible for the transport of ions in and out of cells, an essential signalling mechanism in all multi-cellular organisms. Their structures, however, are among the most intractable of all membrane proteins (PPS section 11, to be released in May, covers this fast-moving field). Wallace has used a combination of X-ray crystallography, spectroscopy and molecular dynamics to explore the structure and mechanism of sodium channels in bacteria. The bacterial sodium channel is simpler than the mammalian equivalent, consisting of a tetramer in which helices from each monomer line the pore. The Wallace group's most recent structure (PDB 3ZJZ) shows the position of the C-terminal domain of these channels for the first time. This domain consists of a coiled coil formed by one helix from each monomer that is linked to the rest of the protein by a flexible region. Moving the coiled coil up and down causes a conformational change that allows the channel to open and close.

The technique of rational or structure-based drug design, which involves modelling the interactions between a library of potential ligands and a protein binding site, has proved particularly successful in the design of anti-viral drugs. Several inhibitors of HIV protease and of influenza virus neuraminidase that were designed in this way have become very successful drugs. David Stuart from the University of Oxford and the Diamond synchrotron gave a talk illustrating how structure-based in silico techniques are now being applied to design drugs against another virus family: the Picornaviruses. Members of this large family are responsible for a diverse range of diseases, ranging in humans from polio to the common cold. The foot-and-mouth virus, which affects livestock and which devastated parts of the UK countryside in 2001, is also a member of this family.

One of the viruses studied in Stuart's group is a human picornavirus that causes similar symptoms to the foot-and-mouth virus and that represents a serious threat to public health in East Asia. The disease is known as hand, foot and mouth disease, and the virus as CAV16: like all picornaviruses, it consists of a single strand of RNA enclosed within an icosahedral (20-sided) protein capsid. The intact virus particles are very fragile and diffraction patterns must be captured before the particles disintegrate in the X-ray beam. Stuart and his Chinese collaborators have used one of the microfocus beamlines at Diamond to take snapshots of the virus structure at several points during its life cycle. One of these is of an "uncoating intermediate" that shows one of the viral proteins (VP1) emerging from the capsid so that it can be embedded in the membrane of a host cell (Ren et al., 2013). Stuart and his co-workers are now designing compounds to bind to these intermediate structures and prevent the virus from entering its human host cells.

All cells, whether prokaryotic or eukaryotic, contain long molecules of DNA that must be packaged in order to fit into the confined space available. Fortunately for developers of anti-bacterial drugs (and users of antibiotics) bacterial cells package DNA using a different mechanism from mammalian ones. In bacteria, enzymes called topoisomerases bind to, cut and re-join double-stranded DNA so that it can be unwound or untangled ahead of replication. Ivan Laponogov, a postdoctoral research assistant at King's College, described recent work in his group on the structure of one of these enzymes. Bacterial topoisomerase II is a target for an important class of antibiotics, the fluoroquinolones, but resistance to these drugs is increasing.

These enzymes are powered by ATP and act as "clamps", capturing one double-helical strand of DNA and passing it through a break in another to remove supercoils and knots in the nucleic acid structure. The structure presented at this meeting was the first of a complete topoisomerase dimer bound to DNA in the "open clamp" position. This structure was solved with and without a fluoroquinolone drug (levofloxacin) bound. The structure with drug bound showed that molecule intercalating between DNA bases at the point where the nucleic acid would be cleaved, preventing that cleavage. The structure without the drug showed the DNA in a different position; the position of a functionally important magnesium ion also changed between the structures.

Many essential cellular processes involve a post-translational modification in which poly-(ADP ribose) or PAR is added to amino acid side chains, and the processing of this molecule involves a wide variety of enzymes. Inhibitors of one of these, poly-(ADP ribose) polymerase or PARP, have recently been developed as drugs against cancer. David Leys from the University of Manchester described his work on the structure of another enzyme in the PAR life cycle: poly-ADP-ribose glycohydrolase (PARG), which catalyses the removal of PAR from proteins.

Mammalian PARG enzymes have three domains, an N-terminal regulatory region and two C-terminal domains forming the catalytic region; the equivalent bacterial enzymes lack the N-terminus. Leys and his group first solved structures of a bacterial PARG bound to ADP-ribose (PDB 3SIG) and to a known inhibitor with a similar structure. They found that a C-terminal helix in the protein was clamped around the terminal ribose of PAR, enabling the release of a single ADP-ribose from the polymer. This basic mechanism is similar in the mammalian enzyme. More recently, the Leys group has solved the structure of PARG bound to an intact PAR substrate (PDB 4L2H); modelling studies based on this structure suggest that the enzyme acts predominantly as an exo-glycohydrolase, that is, it catalyses the removal of one residue at the end of the polymer chain. Understanding the structure and mechanism of these enzymes should enable us to develop small-molecule inhibitors of PARG, and these may one day rival the PARP inhibitors as anti-cancer drugs.

A hundred years on from the "invention" of crystallography and sixty years on from the structure of DNA, these elegant, fascinating and complex structures presented at one meeting give a snapshot of recent progress in structural biology. Furthermore, each of these structures has already provided insights into human disease that may yet lead to the development of useful drugs.