Tuesday, 8 July 2014
Mimicking DNA: How a Repressor Meets its Waterloo
Wednesday, 7 May 2014
The Many Uses of Bioinformatics
Introducing the lecture, Professor David Latchman, Master of Birkbeck, described it as a unique occasion: the only time he has introduced as a guest lecturer someone who he had interviewed for a job. Thornton includes both Birkbeck and UCL on her CV: appropriately, her last post in London was that of Bernal Professor, held jointly at both colleges. She moved on to “even greater heights” as director of one of Europe’s top bioinformatics institutions in 2003.
Thornton began her lecture with a quote from Bernal: “We [academics] can go on being useless up to a point, with confidence that sooner or later some use will be found for our studies”. That quote is of particular relevance to the subject that she has made her own: bioinformatics. She had already begun her research career in 1977, when Fred Sanger invented the process that was used to obtain the DNA sequence of the human genome. That endeavour, which was completed in 2003, took over ten years and cost billions of dollars. Sequencing a human-sized genome, which has about 3 billion base pairs of DNA, now takes maybe 10 minutes and costs about a thousand dollars. While a decade ago we had one “Human Genome”, we now have lots. Mega-sequencing projects already planned or in progress include projects to sequence about 8,000 Finns, and the entire 50,000 population of the Faeroe Islands; one to sequence paired tumour and normal genomes from 20,000 cancer patients; and the UK10K project, which is investigating the genetic causes of rare diseases.
It is now almost extraordinarily simple and cheap to obtain genomic data, but real challenges remain in interpreting and understanding it so that it can be used in medicine. This is the province of bioinformatics, and Thornton devoted much of her presentation to explaining five ways in which gene (and protein) sequence information is being applied to both basic and clinical medical research:
- Understanding the molecular basis of disease
- Investigating differences in disease risk caused by human genetic variation
- Understanding the genomics of cancer
- Developing drugs for infectious diseases, including neglected diseases
- Investigating susceptibility to infectious disease
We now understand that cancer is a genetic disease: it arises when mutations in a group of cells cause them to grow and divide excessively. A cancer is no longer classified just by its location (for example, a breast or lung cancer) but by the particular spectrum of genetic variations in its cells. About 500 different genes are known to be mutated in cancer, some much more often than others. For example, about 60% of cases of melanoma, a type of skin cancer, contain one specific mutation in the gene BRAF. This codes for a protein that can direct cells to grow and divide, and the cancer-causing mutation sticks this protein into the ON position, so this signal is always sent. Scientists in a company called Plexxikon used their knowledge of this mutation and the structure of the protein to design a drug, vemurafenib, which prevents the BRAF protein from signalling. This can cause a dramatic, if short-term, improvement in melanoma patients, but, crucially, it only works in patients whose cancers carry this mutation. It is one of the first developed examples of a “personalised medicine” that is only used alongside a diagnostic test for a genetic variation. There will soon be many more.
Genomics is also proving very useful in the fight against infectious disease. Antibiotic resistance is one of the greatest emerging threats to human health, and scientists have to use all the tools at their disposal, including genomics and bioinformatics, as they try to stay one step ahead of rapidly mutating pathogens. Sequencing is widely used to track the sources of outbreaks of infection and of resistant bacteria such as methicillin-resistant Staphylococcus aureus (MRSA) in hospitals, and it is the only way of determining the exact nature of an infection. One of the most dramatic examples of the use of genomics in infectious disease control occurred in 2011, when a novel strain of E. coli O104 caused about 4,000 cases of serious food-borne illness and 50 deaths in Germany. This was originally linked to cucumbers imported from Spain but a global effort to trace its specific sequence variants proved that the source of the infection was beansprouts grown on a farm near Hamburg.
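The idea behind tracing an outbreak through its sequence variants can be illustrated with a very small sketch. It assumes that the sequences have already been aligned to the same length, and it simply counts the positions at which an outbreak isolate differs from each candidate source. The sequences and the names source_A, source_B and source_C are invented for illustration; real outbreak investigations compare whole genomes with far more sophisticated methods.

```python
def snp_distance(a: str, b: str) -> int:
    """Count the positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_source(outbreak: str, candidates: dict[str, str]) -> tuple[str, int]:
    """Return the candidate with the fewest differences from the outbreak isolate."""
    name = min(candidates, key=lambda n: snp_distance(outbreak, candidates[n]))
    return name, snp_distance(outbreak, candidates[name])


# Toy data: these short sequences are made up and stand in for aligned genomes.
outbreak_isolate = "ACGTTGCAAG"
candidate_sources = {
    "source_A": "ACGTAGCTAG",
    "source_B": "ACGTTGCTAG",
    "source_C": "TCGTTGCATG",
}

if __name__ == "__main__":
    print(closest_source(outbreak_isolate, candidate_sources))
```

The candidate with the smallest number of differences is the most plausible source, although in practice such a result would only be one piece of evidence alongside epidemiological data.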
There was much more to Thornton’s wide-ranging lecture than simply bioinformatics and medicine: more, indeed, than it is possible to do justice to in a single blog post. She went on to describe some of the benefits of genomics for agriculture and food security. These included designing new strategies for controlling pests and diseases, maximising the efficiency of biomass processing, and even managing biodiversity. It is necessary to measure biodiversity in order to manage it properly; it is now possible to define a short stretch of DNA sequence that fully identifies a species or sub-species (a so-called “DNA barcode”) and these are beginning to be used to track some very diverse organisms, including the 400,000 known species of beetle.
The lecture ended with a short discussion of some of the challenges facing bioinformatics and genomics in the second decade of this century, largely relating to difficulties with storing, manipulating and understanding the enormous quantity of data that is being generated. Mining this data mountain for the benefit of mankind is a task that is beyond either the academic community or the biotech industry alone. It will require novel ways of doing science that involve governments and charities as well as academia and industry. The new Centre for Therapeutic Target Validation, launched at Hinxton on the same day as Thornton’s Bernal Lecture, is a pioneering example of such a partnership. It has been set up by the EBI, the Sanger Institute where a third of the original human genome sequence was obtained, and pharmaceutical giant GSK, and its scientists aim to use the whole range of available genomic data to select and evaluate new targets for novel drugs.
Bioinformatics is covered in section 6 of the PPS course. Students who take the second-year option Techniques in Structural Molecular Biology will return to it then, where the material focuses on selecting protein targets for structural genomics initiatives: a task that is linked to that of selecting drug discovery targets.
This post will be cross-posted on the Birkbeck Events blog.
Thursday, 6 March 2014
Crystallins under the Lens
For generations, anyone who argued against evolutionary theory would point to the human eye and exclaim that nothing so perfectly adapted to its purpose could have evolved in a series of random steps. The well-rehearsed counter argument is that even a very basic recognition of light and shadow via an organic pinhole camera is useful as an aid to survival and that this could provide the first stepping stone towards the sophistication of the vertebrate eye (see references 1 and 2). The theory is supported by a succession of organisms with gradually increasing vision.
On a molecular level, the proof is harder to achieve but Christine Slingsby of Birkbeck's Department of Biological Sciences has used crystallography to do just that. In investigating the structure of the proteins of the vertebrate eye lens, Slingsby has not only greatly increased our understanding of their characteristics and mechanisms but also provided fascinating insights into their evolution.
Professor Slingsby's work is featured in several pages in the PPS course: Greek Key Motif, Beta Sandwiches, Lens Proteins and Cataract and Eukaryotic Genomes. Last year she published a paper (reference 3 here) which summarised the key conclusions of her research during the last ten years. This review is available in the Birkbeck e-library here.
Vertebrate lenses comprise layers of highly elongated fibre cells which give transparency and focus but the refractive power is given by high concentrations of transparent proteins from two superfamilies: the alpha crystallins and the beta-gamma crystallins. These proteins, which are all mainly made up of beta strands, have been co-opted from their original functions to generate a functioning lens.
You don’t need to look far to find the probable origin of α crystallins. They are small heat shock proteins (sHsps), molecular chaperones that are present in most types of cell in most organisms. They are upregulated: that is, produced in greater quantities, by cells under environmental stress as part of the protein homeostasis response.
Despite their name, βγ crystallins are unrelated to α crystallins; all crystallins interact to form a refractive index gradient which can vary as required. Apart from the vertebrate lens where they are very prevalent, and in stark contrast to α crystallins, βγ crystallins are found only in other vertebrate eye tissues (except as a component of a much larger gene/protein known as Aim1) and this makes their origin harder to identify. Beta and gamma crystallins each contain four Greek key motifs organised as two βγ-crystallin domains.
There are several requirements for an eye lens protein. It must be expressed at very high levels, unlike sHsps, so the sHsp gene promoters would have required modification. The proteins must pack tightly and uniformly enough so that there is no irregularity on the scale of the wavelength of light and they must be soluble but must not crystallise or separate into different phases. In addition, lens fibre cells have lost their organelles, which could cause light scattering, and so have no mechanisms for protein repair or disintegration. Accordingly, these proteins need to have a lifespan as long as the vertebrate using them.
One of the main reasons for crystallins having been adopted as lens proteins could be that the two α crystallins are able to dynamically form polymers with highly diverse size and shape. This ability was demonstrated as the first crystal structure of a sHsp revealed a hollow octahedral structure of 24 α crystallin monomers. The next one to be solved showed point group 32 symmetry and was constructed from six dimers arranged in two interlocking rings.
This figure shows the beta-sandwich structure of the alpha-crystallin domain of a monomer, the formation of the dimer with the B6 beta strand exchanging into the partner beta-sandwich, and the oligomer with six dimers forming interconnecting discs. The dimers link using motifs on the C terminal extension which insert into the pocket between the B4 and B8 strands, shown in dark blue, and by interaction of the N terminal helices.
Figure taken from Slingsby, C. et al. (2013). PDB 1GME
In addition to the wide range of alpha-crystallin oligomers, the numerous βγ-crystallin chains can be assembled to create a wide range of polymers which coexist in a polydisperse stable but flexible arrangement of varying density.
Beta-crystallins thus appear to function in a similar way to α-crystallins, forming a diverse range of differently sized hetero-oligomers that adjust the refractive index throughout the lens.
Gamma crystallins are different because they are monomeric and polar. They are present in differing concentrations throughout the lens and their polarity results in distinctive orientations towards other crystallins which may regulate inter-crystallin interactions. There is evidence that disruption of these dipoles results in cataracts.
By examining genomes of organisms which predate the development of the camera eye, Slingsby has shed light on the evolutionary pathway of crystallins as lens proteins. PPS students will have read about the single-domain βγ-crystallin in the urochordate (invertebrate) sea squirt, Ciona intestinalis, that has exactly the same double Greek key structure as a vertebrate crystallin but includes a calcium binding sequence in each Greek key motif. This ancestral link was further demonstrated by the remarkable discovery that the gene promoter for Ciona-crystallin could successfully target reporter gene expression for proteins associated with vision in vertebrates.
Investigation of the genome of a cephalochordate, which is part of the lineage of both vertebrates and urochordates, revealed a less complex ancestor to βγ-crystallins. Signature sequences from the βγ-crystallins have also been found in bacterial and archaeal proteins. The implication of this is that all of the proteins of the vertebrate lens could well have evolved from proteins present in ancient species with no visual function. An interesting twist is introduced by the knowledge that the nonchordates, or animals without a spine, can use quite similar cellular lenses that involve non-crystallin proteins.
This suggests that lenses evolved independently in different animal kingdoms, relatively late on an evolutionary timescale, utilizing different proteins that were available in the respective phyla, that is proteins which already had an established purpose but which had qualities allowing them to form lenses. Since all species seem to have had access to at least a basic form of βγ-crystallin, it is an impressive demonstration that evolution can not only capitalise on the multiple possibilities presented by one family of proteins to develop a functioning visual system, but also repeat the trick from a different starting point.
References
- Dawkins, R. (1994). The eye in a twinkling. Nature 368, 690-691
- Nilsson, D.E., Pelger, S. (1994). A pessimistic estimate of the time required for an eye to evolve. Proc. Biol. Sci. 256(1345): 53-8.
- Slingsby, C., Wistow, G.J. and Clark, A.R. (2013). Evolution of crystallins for a role in the vertebrate eye lens. Protein Sci. 22(4):367-80.
Thursday, 9 January 2014
New protein structures presented at the 2013 BCA Winter Meeting
The 2013 meeting was billed as both a "final" event in the centenary year of the Braggs' landmark discoveries and part of the build-up to the International Year of Crystallography, but these were not the only anniversaries highlighted there. 2013 also marked the sixtieth anniversary of the publication of the structure of DNA. The 2013 Winter Meeting was held in King's College London, which played a very important part in that discovery: Maurice Wilkins and Rosalind Franklin, who obtained the X-ray diffraction patterns that led to the discovery of the double helix, were based there. (Wilkins shared the Nobel Prize for this discovery with Watson and Crick; Franklin died in 1958, four years before that prize was awarded.) And the first precise physical model of the double helix is still on display in the college.
Maurice Wilkins' original DNA model
The first researcher to speak at the meeting was Birkbeck's own Professor Bonnie Wallace. Her work on the structures of voltage gated sodium channels has been described on this blog before, most recently in April 2013. These proteins are responsible for the transport of ions in and out of cells, an essential signalling mechanism in all multi-cellular organisms. Their structures, however, are among the most intractable of all membrane proteins (PPS section 11, to be released in May, covers this fast moving field). Wallace has used a combination of X-ray crystallography, spectroscopy and molecular dynamics to explore the structure and mechanism of sodium channels in bacteria. The bacterial sodium channel is simpler than the mammalian equivalent, consisting of a tetramer in which helices from each monomer line the pore. The Wallace group's most recent structure (PDB 3ZJZ) shows the position of the C-terminal domain of these channels for the first time. This domain consists of a coiled coil formed by one helix from each monomer that is linked to the rest of the protein by a flexible region. Moving the coiled coil up and down causes a conformational change that allows the channel to open and close.
The technique of rational or structure-based drug design, which involves modelling the interactions between a library of potential ligands and a protein binding site, has proved particularly successful in the design of anti-viral drugs. Several inhibitors of HIV protease and of influenza virus neuraminidase that were designed in this way have become very successful drugs. David Stuart from the University of Oxford and the Diamond synchrotron gave a talk illustrating how structure-based in silico techniques are now being applied to design drugs against another virus family: the Picornaviruses. Members of this large family are responsible for a diverse range of diseases, ranging in humans from polio to the common cold. The foot-and-mouth virus, which affects livestock and which devastated parts of the UK countryside in 2001, is also a member of this family.
One of the viruses studied in Stuart's group is a human picornavirus that causes similar symptoms to the foot-and-mouth virus and that represents a serious threat to public health in East Asia. The disease is known as hand, foot and mouth disease, and the virus as CAV16: like all picornaviruses, it consists of a single strand of RNA enclosed within an icosahedral (20-sided) protein capsid. The intact virus particles are very fragile and diffraction patterns must be captured before the particles disintegrate in the X-ray beam. Stuart and his Chinese collaborators have used one of the microfocus beamlines at Diamond to take snapshots of the virus structure at several points during its life cycle. One of these is of an "uncoating intermediate" that shows one of the viral proteins (VP1) emerging from the capsid so that it can be embedded in the membrane of a host cell (Ren et al., 2013). Stuart and his co-workers are now designing compounds to bind to these intermediate structures and prevent the virus from entering its human host cells.
All cells, whether prokaryotic or eukaryotic, contain long molecules of DNA that must be packaged in order to fit into the confined space available. Fortunately for developers of anti-bacterial drugs (and users of antibiotics) bacterial cells package DNA using a different mechanism from mammalian ones. In bacteria, enzymes called topoisomerases bind to, cut and re-join double-stranded DNA so that it can be unwound or untangled ahead of replication. Ivan Laponogov, a postdoctoral research assistant at King's College, described recent work in his group on the structure of one of these enzymes. Bacterial topoisomerase II is a target for an important class of antibiotics, the fluoroquinolones, but resistance to these drugs is increasing.
These enzymes are powered by ATP and act as "clamps", capturing one double-helical strand of DNA and passing it through a break in another to remove supercoils and knots in the nucleic acid structure. The structure presented at this meeting was the first of a complete topoisomerase dimer bound to DNA in the "open clamp" position. This structure was solved with and without a fluoroquinolone drug (levofloxacin) bound. The structure with drug bound showed that molecule intercalating between DNA bases at the point where the nucleic acid would be cleaved, preventing that cleavage. The structure without the drug showed the DNA in a different position; the position of a functionally important magnesium ion also changed between the structures.
Many essential cellular processes involve a post-translational modification in which poly-(ADP ribose) or PAR is added to amino acid side chains, and the processing of this molecule involves a wide variety of enzymes. Inhibitors of one of these, poly-(ADP ribose) polymerase or PARP, have recently been developed as drugs against cancer. David Leys from the University of Manchester described his work on the structure of another enzyme in the PAR life cycle: poly-ADP-ribose glycohydrolase (PARG), which catalyses the removal of PAR from proteins.
Mammalian PARG enzymes have three domains, an N-terminal regulatory region and two C-terminal domains forming the catalytic region; the equivalent bacterial enzymes lack the N-terminus. Leys and his group first solved structures of a bacterial PARG bound to ADP-ribose (PDB 3SIG) and to a known inhibitor with a similar structure. They found that a C-terminal helix in the protein was clamped around the terminal ribose of PAR, enabling the release of a single ADP-ribose from the polymer. This basic mechanism is similar in the mammalian enzyme. More recently, the Leys group has solved the structure of PARG bound to an intact PAR substrate (PDB 4L2H); modelling studies based on this structure suggest that the enzyme acts predominantly as an exo-glycohydrolase, that is, it catalyses the removal of one residue at the end of the polymer chain. Understanding the structure and mechanism of these enzymes should enable us to develop small-molecule inhibitors of PARG, and these may one day rival the PARP inhibitors as anti-cancer drugs.
A hundred years on from the "invention" of crystallography and sixty years on from the structure of DNA, these elegant, fascinating and complex structures presented at one meeting give a snapshot of recent progress in structural biology. Furthermore, each of these structures has already provided insights into human disease that may yet lead to the development of useful drugs.


