Thursday, 11 December 2008

An Enzyme in Histidine Biosynthesis

The School of Crystallography at Birkbeck has a regular programme of research seminars held on Monday lunchtimes during term. Many of these describe recent developments in protein structure, and, from time to time, I will be reporting these here and linking in to course material where relevant. This week's seminar was by Adrian Lapthorn, a protein crystallographer based at Glasgow University, who just happens also to be the external examiner for the MSc in Structural Molecular Biology as well as the second year module TSMB. Adrian's research is concerned with solving the structures of enzymes, including those involved in the biosynthesis of the amino acid histidine.

Histidine is known as an essential amino acid; that is, it cannot be synthesised de novo in humans, but must be supplied from the diet. However, bacteria, fungi and plants all have enzymes that enable them to synthesise histidine from simple chemical precursors. The enzymes in the histidine synthesis pathway are therefore, at least potentially, good targets for novel antibiotics and herbicides as there are no equivalent human enzymes for them to inhibit, so they should be relatively free of side effects.

In bacteria, the histidine synthesis pathway consists of 10 steps, catalysed by a total of eight enzymes (some of which are bifunctional). The first step in this pathway is catalysed by an enzyme called HisG (or ATP-phosphoribosyltransferase), which carries out the following reaction:


ATP + PRPP → PR-ATP + PPi

(PRPP is phosphoribosyl pyrophosphate; PR-ATP is phosphoribosyl-ATP; PPi is inorganic pyrophosphate.) The action of the enzyme is, therefore, to transfer a phosphoribosyl group on to the ATP molecule.

This enzyme is interesting for several reasons, besides the pharmaceutical and biotechnological interest in its inhibition. For one thing, unusually, there are no specific active site residues; the substrate is stabilised in the active site cleft by binding to magnesium ions.

Adrian's talk was subtitled "the long and the short of it", because some bacteria have a "short" form of this enzyme, and others a "long" form with an extra 80-odd residues at its C-terminus. All bacteria with the short form also have an additional enzyme, HisZ, which binds to HisG in an equivalent position to the C-terminal domain of the long form during catalysis. The long form of the enzyme consists of three discrete folded units called domains. There are two similar ones at the N terminus, followed by a long alpha helix and the C terminal domain, which is absent in the short form and has a similar structure to that of the small protein ferredoxin. The active site is between the two similar domains.

You will learn much more about domains and their folds in the next section, Towards Tertiary Structure. For now, look at this structure (PDB 1Q1K) of the long form of HisG (from E. coli) and try to identify the three domains and the active site. You might find it helpful to look at a single chain only.

Monday, 8 December 2008

Welcome 2008-9 Students to the PPS Blog

Yes - welcome!

I run this blog to help PPS students understand how what they study throughout the course relates to recent research, mainly in structural biology but also in bioinformatics and some other related subjects. I plan to update it every few weeks (or more frequently if there is enough to say) between the end of the Autumn term and the end of the course, with reports of lectures, conferences and new papers in the areas covered by the PPS course, and link these to the course material and the wider scientific literature. Some other lecturers at Birkbeck may also occasionally add posts.

The first blog post linked to the 2008-9 course will appear before the end of this week, and will describe crystal structures of enzymes presented at one of the talks in our weekly seminar programme here at Birkbeck. In the meantime, do scroll down and look at some of last year's posts, although you won't be able to follow the links to the course material there.

You will be encouraged to comment on blog posts and, if you wish, to make your own - you will all be added as authors in the New Year. Anyone who can view the blog should be able to comment, but only authors can make posts.

Enjoy the blog - and let us know what you think of it!

Clare Sansom

Tuesday, 29 July 2008

London Structural Biology Club meeting

We at Birkbeck have just hosted a meeting of the London Structural Biology Club. This is a network of students and researchers in structural biology based in London and the South-East of England. Members get together for a couple of afternoons a year to hear research presentations, and the talks are followed by further informal discussion over refreshments (usually pizza and beer).

Four talks were given at the Birkbeck meeting, with each presenting not only new structural studies but also novel insights into molecular function and mechanisms derived from those structures. First to talk was Carien Dekker from the Institute of Cancer Research in London. She described the protein interaction network - known as the "interactome" by analogy with "genome" and "proteome" - of a eukaryotic cytosolic chaperonin, CCT. Chaperonins are a sub-class of chaperones, the proteins that assist other proteins in forming their stable three-dimensional structures, and they consist of two ring-like structures that associate back-to-back forming a cavity in which their substrate proteins fold. Dekker and her co-workers have used a number of different proteomics techniques, including the insertion of a long internal tag into a loop of the protein, to discover the range of substrates for this chaperonin. These include proteins involved in functions as diverse as protein import into the nucleus, protein degradation, and chromatin remodelling. CCT is also necessary for the formation of the septin ring complex, and thence for cytokinesis (the last stage of cell division).

This work was published very recently in the EMBO Journal. Dekker and her colleagues are now working on the structure of CCT, which they hope will reveal more details of the function of this chaperonin: watch this space.

The second talk was given by David Komander, who has just moved from the Institute of Cancer Research to set up his own lab at the prestigious MRC Laboratory of Molecular Biology in Cambridge. He described some intriguing details of the ubiquitin system, through which proteins can be tagged for degradation. Ubiquitin (mentioned briefly in section 7 of the PPS course material) is a small protein (only 76 amino acids) with an alpha+beta fold (see PDB entry 1UBQ). Its C-terminus can be covalently linked to lysine side chains or N termini of other proteins. As ubiquitin itself has seven lysine residues (as well, of course, as an N-terminus) it can polymerise to form short chains. Proteomics has shown that all possible combinations of ubiquitin linkages can exist, but linkages in which the molecules are connected through lysines K48 and K63 are the most common. Poly-ubiquitin tags composed of different linkages have been linked with different functions; for example, binding a K48-linked ubiquitin chain to a protein will tag it for proteasomal degradation, whereas a K63-linked chain will tag a protein for signalling. The structures of these two forms of poly-ubiquitin have been shown to be very different, with K63-linked poly-ubiquitin forming an extended chain and K48-linked poly-ubiquitin a compact fold.
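
The counting of possible linkage sites described above can be checked with a few lines of Python. This is only a minimal sketch: the sequence is human ubiquitin as I have typed it in from memory (do check it against PDB entry 1UBQ before relying on it), and the function names are my own, chosen for illustration.

```python
# Human ubiquitin, one-letter code, written in blocks of ten residues so the
# length is easy to check by eye. Assumption: typed in by hand, so verify it
# against PDB entry 1UBQ before using it for anything serious.
UBIQUITIN = (
    "MQIFVKTLTG" "KTITLEVEPS" "DTIENVKAKI" "QDKEGIPPDQ"
    "QRLIFAGKQL" "EDGRTLSDYN" "IQKESTLHLV" "LRLRGG"
)


def lysine_positions(sequence):
    """Return the 1-based positions of every lysine (K) in a sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "K"]


def linkage_sites(sequence):
    """List possible attachment points: the N-terminus plus each lysine."""
    n_terminus = f"{sequence[0]}1"
    return [n_terminus] + [f"K{pos}" for pos in lysine_positions(sequence)]


if __name__ == "__main__":
    print(len(UBIQUITIN), "residues")
    print("Possible linkage sites:", ", ".join(linkage_sites(UBIQUITIN)))
```

Running this lists eight possible attachment points, including the K48 and K63 sites mentioned in the talk.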

Ubiquitinylation is a reversible process, and Komander has been studying the enzymes (deubiquitinases, or DUBs) that catalyse the hydrolysis of the peptide bonds between two ubiquitins, or between ubiquitin and another protein. These DUBs are analogous to the phosphatases that remove phosphate groups from protein side chains; their specificity, however, is more complex than that of phosphatases. Earlier this year, Komander and David Barford published the structure of the N-terminal domain of one such protein, A20 (Komander & Barford (2008), Biochem. J. 409, 771-785; full text available). This is a cysteine protease domain known as the ovarian tumour (OTU) domain. These structural studies suggest both a novel architecture for the protein's catalytic triad and a novel mechanism - reversible oxidation - for the regulation of protein ubiquitinylation.

The "home team" at Birkbeck contributed a talk from Han Renaut, in Professor Gabriel Waksman's group. Waksman's own account of this work, on the structure of bacterial secretion systems, was blogged back in May and will not be described in more detail now.

Lastly, we heard from Erhard Hohenester, who described an unpublished structure of SPARC, a protein that binds collagen. About 30% of the dry weight of the human body is composed of fibrils of collagen, a structural protein. It has a unique structure, being composed of three strands wound round each other in a triple helix. Every third residue of each strand must be a glycine, and the protein also contains a high percentage of proline. Some proline residues are post-translationally modified with the addition of an -OH group to form hydroxyproline. Besides being the major structural component of animal tissue, collagen binds to and forms complexes with many proteins including integrins and some tyrosine kinases. However, until now the only structure of a complex of collagen with another protein was with integrin (see PDB entry 1DZI).
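
The "glycine at every third position" rule for collagen is easy to test on a sequence. Here is a minimal sketch in Python; it assumes the sequence is written so that it starts on a glycine, and it uses "O" as the one-letter code for hydroxyproline. The function names and the example peptide are made up for illustration and do not come from any real collagen entry.

```python
def follows_gly_x_y(sequence):
    """True if every third residue, counting from the first, is glycine (G).

    Assumption: the sequence is given in frame, i.e. it begins with a glycine.
    """
    return len(sequence) > 0 and all(residue == "G" for residue in sequence[::3])


def proline_fraction(sequence):
    """Fraction of residues that are proline (P) or hydroxyproline (O)."""
    if not sequence:
        return 0.0
    return sum(residue in "PO" for residue in sequence) / len(sequence)


if __name__ == "__main__":
    # Hypothetical collagen-like peptide, invented for this example.
    peptide = "GPOGPOGVMGFOGPO"
    print(follows_gly_x_y(peptide))            # glycine at positions 1, 4, 7, ...
    print(round(proline_fraction(peptide), 2))
```

A sequence with any other residue at a "G" position fails the check, which mirrors why the triple helix cannot tolerate a larger side chain there.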

SPARC, or osteonectin, is secreted by osteoblasts during bone formation, and binds calcium as well as collagen. Its structure as an isolated protein has been known for over ten years; it has two domains, one alpha-helical and the other containing many disulphide bonds (PDB 1BMO). Details of the new crystal structure of the collagen-SPARC complex, solved by Hohenester's group, must wait until the paper is published, but it is possible to say that it binds the hydrophobic sequence GVMGFO (which is a rare sequence in collagens, although one often involved in protein-protein interactions) into a hydrophobic pocket on the SPARC molecule. [Note that the "O" is not a mistake; it is the single letter amino acid code for hydroxyproline.]

This London Structural Biology Club meeting was sponsored by Alpha Laboratories Ltd.

Tuesday, 1 July 2008

Cell invasion by the malaria parasite

Last Monday's seminar in Crystallography (and the last of the summer term) was given by Dr Mike Blackman of the National Institute for Medical Research (NIMR) in Mill Hill, near London. His title was "Protease involvement in host cell invasion and exit by the malaria parasite". The following short report is contributed by Christine Slingsby:

Dr Blackman's lecture and discussion were on the characterisation of several serine proteases identified from the genome of the malaria parasite, Plasmodium falciparum. One appears to operate in the membrane and one in the cytoplasm.

These are key enzymes used by the parasite to invade a host cell. Although they cleave with great precision certain proteins on the surface of the merozoite (blood stage of the parasite), it is unclear at the molecular level how the enzymes recognise their substrates. In other words, unlike, say, trypsin, which cleaves on the C-terminal side of a lysine or arginine, these subtilisin-like serine proteases have little sequence specificity.
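
To see what a simple sequence rule like trypsin's looks like in practice, here is a minimal Python sketch that cuts a sequence after every lysine (K) or arginine (R). It is deliberately simplified: real trypsin cleaves poorly when the next residue is proline, which this sketch ignores, and both the function name and the example peptide are invented for illustration.

```python
def trypsin_digest(sequence):
    """Split a protein sequence on the C-terminal side of every K or R.

    Simplification: the well-known exception for a following proline
    is not modelled here.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            # Cut after this residue, so it ends the current peptide.
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Whatever follows the last cleavage site is the final peptide.
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    # Hypothetical peptide, made up for this example.
    print(trypsin_digest("AGKTLRMMKGG"))
```

No such one-line rule can be written for the malarial proteases described above, which is exactly the puzzle.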

Dr Blackman used the analogy of the success of HIV treatments based on the HIV aspartic protease, to enthusiastically push his work forward to try and discover inhibitors of these new enzymes as potential anti-malarial drugs.

Much more information is available on his website:

Friday, 27 June 2008

Insights into arginine methylation in histones

This week's seminar at Birkbeck was given by another alumnus of the School of Crystallography, Wyatt Yue, who studied for his PhD here from 1999 to 2003, working with Susan Buchanan. Wyatt is now at the Structural Genomics Consortium at Oxford University, but the work he presented was all done when he was a post-doc with Lawrence Pearl at the Institute of Cancer Research in London. He described some elegant work on the structure and function of one member of a family of enzymes that catalyse the addition of methyl groups to arginine residues in histones.

Histones are proteins that are involved in the coiling and compacting of DNA, so the long DNA molecules fit into the cell structures. In eukaryotes, DNA is first wound round an assembly of eight histone molecules to form a structure called a nucleosome. The DNA then resembles beads on a string; simplistically, this structure is further compacted by coiling into a chromatin fibre, from which the chromosome structures are formed. (Links here are to Wikipedia.) The histone molecules have N-terminal "tails" which protrude from the folded structure and are typically not seen in crystal structures. Residues on these tails can be chemically modified by e.g. phosphorylation, acetylation and methylation. These modifications are dynamic and form the so-called "histone code" which is one of the processes that control gene expression.

One of the most important of these changes is the methylation of nitrogen atoms on the side chain of arginine residues; this process is catalysed by a family of enzymes called protein arginine methyltransferases (PRMTs). Six members of this family are found in mammals; the protein that Wyatt has been working on is PRMT4, otherwise known as coactivator-associated methyltransferase 1 (CARM1). All these proteins share a common, catalytically active core domain; CARM1 is the only member with an additional C-terminal domain. The structure of the core is known (PDB 2oqb); it has two domains, an N-terminal Rossmann fold, with two extra helices, and a C-terminal beta barrel with an insertion of a helix-turn-helix motif known as the arm. Core structures of two other PRMT proteins are also known; compared to the others, the CARM1 core has a piece of ordered structure at the end of its C-terminal beta strand (the beginning of the C-terminal domain) and the helices of the arm are longer. Therefore, the cavity between the monomers of the crystallographic dimer is larger, leaving space for the C-terminal domains.

Lawrence Pearl's group, including Wyatt, has now solved the structure of the core domain bound to its co-factor, S-adenosyl methionine (PDB 2V74). This has revealed that it is only the co-factor binding that creates the cavity into which the arginine substrate is bound, indicating that the co-factor must bind first. This binding order has also been demonstrated in kinetic studies. The arginine pocket is close to the methyl group that is transferred during catalysis and is lined by negatively charged residues to attract the positively charged arginine.

CARM1 methylates three arginine residues in the N-terminal tail of the histone H3: R2, R17 and R26. However, the mechanism of each methylation is not exactly the same. Studies with chimeras have shown that the pre-core region is necessary for methylation of R26 but not R17; R17 methylation is faster if the neighbouring residue K18 is acetylated.

You can read more about this work in the publication: Yue et al. (2007), EMBO J. 26(20): 4402-12. (Link to PubMed). The online version of EMBO Journal is available in the Birkbeck e-library.

Tuesday, 10 June 2008

J.D. Bernal and Crystallography's Beginnings

Last week at Birkbeck we held the annual lecture celebrating the life and work of J.D. Bernal, the founder and first head of the School of Crystallography at Birkbeck. Bernal - "Sage" as many of his contemporaries knew him - was one of the most influential figures in the early development of structural biology. In 1934, while at Cambridge University, he and his student Dorothy Crowfoot obtained the first X-ray diffraction pattern from a protein. That protein was pepsin; its structure was only solved many decades later. (The link is to a structure of porcine pepsin by a group led by another head of Birkbeck Crystallography, Professor Tom Blundell.)

Dorothy Crowfoot - as Dorothy Hodgkin - went on to win the 1964 Nobel Prize for Chemistry, for her work on the structures of vitamin B12 and penicillin. She also solved the structure of insulin. Bernal's mind was no less brilliant than his student's, but the ultimate prize still eluded him. Most commentators agree that he is one of the greatest scientists never to win the Nobel, although several of those he worked alongside and inspired, including Francis Crick, Aaron Klug and Max Perutz, did so.

It is possible that Bernal simply never allowed himself to stay focused on any one area for long enough to win the Nobel. This was essentially the view of Sir Lawrence Bragg, himself a Nobel Laureate (Physics, 1915): "if one traces back almost any fruitful line of crystallographic work, it will be found that Bernal assisted at its conception but left the child to be brought up by foster-parents. This is particularly so in the case of molecular biology and in the analysis of protein crystals. Immediately on seeing the first X-ray diffraction pictures from protein crystals...he assumed that protein structures would sooner or later be solved, and handed out problems to his students and to anyone whom he could persuade to take them up."

Bernal's politics may also have estranged him from the Establishment. He was a life-long and well-known Communist (in an era when this was far more common among intellectuals than it is today) and an internationalist, and visited the Soviet Union many times. The painter Picasso was among the giants of the European Left who knew him well; one meeting between the two, after which Picasso left behind a mural on the wall of Bernal's flat in the college, has even been made into a play. (The Wellcome Trust now own the mural, and have recently installed it in their London headquarters.) He was also a pacifist, and chaired the World Peace Council from 1959 to 1965.

Bernal's contributions to scientific thought go far beyond the practice of what is known as "hard science". He gave much attention to the interaction between science and society, and how science should be managed and funded. One of his many books, The Social Function of Science, published in 1939, is considered the earliest text on the sociology of science. And this year's Bernal Lecture focused on an earlier book still: The World, the Flesh and the Devil (1929), subtitled An Enquiry into the Future of the Three Enemies of the Rational Soul.

There is not enough space in a blog post to do justice to this fascinating man. Those of you who would like to know more about him are referred to a biographical memoir written in 1980 by Dorothy Hodgkin and available online.

Wednesday, 14 May 2008

Gabriel Waksman on bacterial secretion systems

Gabriel Waksman, head of the School of Crystallography at Birkbeck, has given a seminar in Birkbeck's School of Biological and Chemical Sciences describing his group's work on bacterial secretion systems. Gabriel studied at the University of Paris and then spent the first part of his academic career at Washington University in St. Louis, Missouri, before joining Birkbeck in 2003. He now combines his position here with a chair in Biochemistry at University College London and directs the new Institute of Structural Molecular Biology, which links the research in the two colleges.

Gabriel's research group works mainly on elucidating the structure and function of bacterial secretion systems, which produce hair-like appendages, or pili, on the surfaces of Gram negative pathogens. These are very important for infection as they allow bacteria to recognise, and then attach to, host cells. Mutant bacteria in which these proteins are not expressed are not pathogenic.

Two of the best understood of these systems are found in variants of E. coli which infect the human urinary tract. The P pilus recognises and binds to kidney cells, causing the kidney infection pyelonephritis, and the type I pilus recognises bladder epithelial cells and causes urinary infections. Both pili can be found on the same bacterium. The structures of both pili are similar, consisting of many protein subunits; it is the tip subunit that recognises its target cell types by binding to different cell surface sugar residues.

All subunits associated with a particular pilus are encoded by genes within a gene cluster and named accordingly: the P pilus genes are Pap genes and the type I genes Fim genes. Structures of at least one representative of each type of subunit have now been solved, many by Gabriel's group and its collaborators. Knowing these structures has enabled the group to understand the mechanism through which the pili are formed.

The pilus subunits that polymerise to form the main part of the fibrous structure all have similar structures. They are immunoglobulin-like, mainly-beta structures in which one sheet is lacking a central beta strand, and they are therefore unstable unless they are bound to a chaperone protein. A strand from the chaperone fits into the gap, forming regular hydrogen bonds with the neighbouring strands, and this stabilises the chaperone-subunit complex. This complex is then transported to the growing pilus, where the N-terminal peptide from a subunit already in the structure replaces the chaperone strand in the new subunit, adding it to the polymer via a mechanism called "donor strand exchange". The resulting fibre therefore consists of a string of similar subunits, with the N-terminal peptide of one subunit forming a strand in the central beta sheet of the previous subunit in the assembly.

The Waksman group's most recent structural studies concern the protein through which the pilus is assembled, known as the usher. This is a mainly beta membrane protein (link is to material in PPS section 11, which will be released next week) which is embedded in the E. coli outer membrane. The structure of the E. coli P pilus usher, solved by X-ray crystallography, shows the beta barrel and a middle or plug domain which interrupts the main beta sheet of the barrel. With 24 strands, it is the largest outer membrane beta barrel protein structure to be elucidated so far. In its inactive form, the plug domain fits inside the barrel, completely blocking it. The group also used cryo-electron microscopy to determine the structure of a type I pilus complex during pilus assembly. The usher forms a dimer within the cell membrane but, interestingly, the EM studies show that a pilus is secreted through only one monomer of the dimer.

This is very complex work which can only be touched on in a blog post. If you would like to know more, have a look at a few of these papers (links to abstracts in PubMed):

Monday, 21 April 2008

Predicting RNA binding from protein sequences

The first Birkbeck seminar of the new term was given by Sue Jones, from the University of Sussex. Sue is no stranger to Birkbeck as she did her Ph.D. with Janet Thornton at University College, and later worked with her at the European Bioinformatics Institute and the biotech company Inpharmatica. Today she described a piece of software that she and her colleagues have developed for predicting motifs in protein sequences that are likely to bind to RNA.

Proteins function largely by interacting with other molecules - they are "social" molecules. Protein interaction partners include other proteins, carbohydrates, "small" molecules and ions, and the focus of today's talk: nucleic acids. The structures and functions of RNA molecules are diverse and include protein coding (mRNA), protein synthesis (tRNA and ribosomal RNA), splicing, hydrolysis of nucleic acid bonds (in RNA enzymes or "ribozymes") and control of gene expression (the so-called "micro-RNAs" or miRNAs). RNA-binding domains in proteins include RNP domains, dsRNA binding domains, and K homology (KH) domains - all these are mixed (alpha and beta) structures.

Jones and her colleagues surveyed known structures of protein-RNA complexes and marked residues that were in close contact (through van der Waals or hydrogen bonding) with the RNA. They described each amino acid in terms of predicted accessible surface area, conservation within the family of homologous proteins, and chemical properties. Not surprisingly, positively charged and polar amino acids were favoured in binding to the negatively charged nucleic acid over negatively charged and hydrophobic ones; glycine, which is flexible, and tryptophan, which can form base stacking interactions, were also favoured.

Jones then built these features, averaged over a "window" of 5-25 amino acids, into a support vector machine to predict RNA binding features in proteins of unknown function. (This technique is a form of "machine learning"; you don't need to know about it for this course, but if you're interested in knowing more and can cope with maths at a relatively high level, see the Wikipedia entry.) This was found to be at least as reliable as any similar tools that are publicly available.
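
The windowing step can be illustrated without any machine learning at all. The Python sketch below averages one per-residue property over a sliding window, which is the kind of smoothed feature that could then be passed to a classifier. It is not the method used in the talk: the crude charge scale, the choice of a centred odd-length window that is truncated at the chain ends, and the function names are all my assumptions, and the support vector machine itself is left out.

```python
# Crude per-residue charge: +1 for Lys/Arg, -1 for Asp/Glu, 0 otherwise.
# Assumption: a toy property chosen for illustration only.
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def residue_charges(sequence):
    """Map each residue to its toy charge value."""
    return [CHARGE.get(residue, 0.0) for residue in sequence]


def window_average(values, window):
    """Average values over a centred window; windows are truncated at the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        averaged.append(sum(segment) / len(segment))
    return averaged


if __name__ == "__main__":
    # Hypothetical sequence fragment, invented for this example.
    fragment = "AKRKDEA"
    smoothed = window_average(residue_charges(fragment), window=3)
    for residue, value in zip(fragment, smoothed):
        print(residue, round(value, 2))
```

Stretches with a high positive average stand out as candidate RNA-binding patches, in line with the preference for positively charged residues noted above; a real predictor would combine several such features.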

There will be more about protein-nucleic acid binding in the next section of course material, Protein Interactions and Function, which is due to be released at the end of April.

Monday, 14 April 2008

British Crystallographic Association Spring Meeting

I spent Tuesday to Thursday of last week at the British Crystallographic Association Spring Meeting. The meeting has 4 strands: Biological, Chemical, Industrial and Physical Crystallography. Each contributes a plenary and then has its own separate sessions.

The Biological Plenary was the Bragg Lecture, where one famous crystallographer speaks about another, usually older, crystallographer and their work. This year Tony Crowther from the MRC in Cambridge talked about his own work and that of Michael Rossmann from Purdue University. Both made seminal contributions to the method of molecular replacement in protein crystallography. More on that in TSMB. Michael Rossmann, when a postdoc with Max Perutz at the LMB in Cambridge, was the first person to realise that the chains of hemoglobin looked like the chain of myoglobin and hence that you could solve structures of related proteins by molecular replacement. Michael then developed the mathematics and early software for molecular replacement. Tony Crowther did a Ph.D. with David Blow at the LMB and developed an improved form of the translation function. While working on natural language processing in Edinburgh, Tony also realised how to give a much faster and more accurate version of the rotation function, which was the basis of the molecular replacement method for a long time. Tony's career was actually mainly in electron microscopy; he returned to the LMB from Edinburgh to work for Aaron Klug and became a group leader in his own right. Both he and Michael Rossmann have done most of their work on viruses, and he talked about Michael's work on bacteriophage and his own work on hepatitis.

The Biological Group sessions were on membrane proteins. Chris Tate from the LMB in Cambridge described work they have been doing to stabilise membrane proteins by mutation. They search for alanine mutations that increase the stability of the protein in detergent and then carry out mutations in combination until the protein is stable for half an hour at a temperature 15-20 degrees hotter than the original. They have succeeded in crystallising the beta-1 adrenergic receptor, which will give an important comparison to the beta-2 structure published just before Christmas.

The most interesting talk for me was from Thomas Sorenson, now at Diamond, on the work he and colleagues had done in the group of Poul Nissen in Aarhus. The group have published structures in several states of eukaryotic ATPase transporters (calcium, sodium, proton). Interestingly, these proteins were discovered by a Dane, Jens Skou, and the group used proteins provided by various groups in Denmark that have worked on the systems for many years. They used natural sources and did not purify the proteins down columns, but just used differential extraction. This means that they isolated the membrane fraction that contained most of the protein and then extracted with detergents, and this material was pure enough to crystallise in the presence of the right combination of detergents and lipids.

The other biological sessions were on neutron diffraction, probing fast biological reactions, complementary methods, and ligand binding and drug design. Neutron diffraction gives the positions of hydrogen atoms because hydrogen and, even more so, deuterium scatter neutrons much more strongly, relative to heavier atoms, than they scatter X-rays, so you get density for hydrogen atoms. The catch is that neutron fluxes are much weaker and you need crystals that are 0.1-1 mm³, compared with 0.0001 mm³ for a protein crystal in an X-ray experiment.

Studying reactions in crystals often means trapping intermediates by freezing out. Arwen Pearson from Leeds gave a good talk about a redox system that she had worked on in Minnesota where the reaction cycle can be carried out in the crystal, even changing space group between states. The catch, and this is common, is that X-rays themselves generate free radicals which can reduce redox centres, so by collecting the data the redox state is altered. This meant that they had to collect data from several crystals, before each became too damaged.

The highlight for me of the Ligand and Drug session was a talk from Chris Phillips at Pfizer about their new non-nucleoside HIV reverse transcriptase inhibitor. Inhibitors of this class target a hydrophobic pocket in the protein, and tend to be rather 'greasy'. The Pfizer group had carefully designed a ligand that was both smaller and more hydrophilic and hence a better drug in terms of bioavailability.

There were many more great talks, but I hope this gives you a flavour of the meeting.

Greetings from Poznan

This is just by way of an apology for my relative silence on PPS blogs and forums lately.

I am half way through two weeks' teaching at Adam Mickiewicz University, Poznan, Poland, funded by a grant to Birkbeck through the EU Erasmus programme (formerly known as Socrates) which funds student and lecturer exchanges between EU countries (and some others). I am teaching a two-week course on bioinformatics mostly, this year, to postgraduate Physics students but I have also taught in other departments.

I have known my host here, Professor Mariusz Jaskolski, since we were both working in the same lab in the States, NCI Frederick, in the early 1990s. Mariusz was involved in some of the early work on the structure of HIV protease which is covered extensively in the PPS course. Since then he has gone on to found the first X-ray crystallography group in central-eastern Europe and to solve the structures of viral integrases, asparaginases and others... and many of his students, and others at AMU, have taken PPS and/or other distance learning courses from Birkbeck.

Normal service will be resumed next Monday.

Monday, 17 March 2008

Diamond Beamlines

Today's seminar was by Dr Liz Duke from the Diamond synchrotron. Liz was the first scientist employed to develop the protein crystallography beamlines, over 5 years ago, and now the first 3 beamlines are taking users. She outlined the design and some of the highlights (the first time she went into the experimental hall, the first beam and the first diffraction) and the ongoing issues that they want to improve. For example, the sample robot takes 4 1/2 minutes to mount a new sample and be ready to collect data, which people find slow (although in tests the quickest a person could do it was 6 1/2 minutes). By really understanding the steps where corners can be cut they hope to take another minute or so off the time. They are now finding the balance between giving users access and finding enough time for themselves to implement the improvements that people want to see. However, Diamond, like all large projects, is under some threat from the STFC (Science and Technology Facilities Council) review of its spending commitments. The STFC does not have enough money in the next Comprehensive Spending Review to continue all the projects currently being funded and has launched a consultation which closes at the end of this week. Although Diamond is a high priority, there may not be as much money for the ongoing running and development costs as is needed.

Friday, 7 March 2008

Regulation of the EGF receptor

The epidermal growth factor (EGF) receptor is (as its name implies!) a receptor that sits at the surface of cell membranes. Like many other such receptors, its intracellular region contains a tyrosine kinase; the extracellular part binds to a small, soluble protein, epidermal growth factor (EGF). When EGF binds it stimulates a conformational change that leads to a dimerisation of two EGFR molecules. This activates the tyrosine kinases so they pass a signal - essentially saying "EGF has bound here" - through the cell in a cascade of phosphorylation reactions.

EGFR is one member of a family of four similar receptors, known as the ErbB family; it can also be known as ErbB1. You may have heard, indirectly, of another member of this family, ErbB2; this is over-expressed on the surface of breast cancer cells in about a quarter of breast cancers. It is the target of the drug herceptin, which has transformed the lives of many women with so-called "herceptin receptor positive" breast cancer.

This week, Professor Mark Lemmon from the University of Pennsylvania gave a seminar at University College, London, about the structural basis for the regulation of the EGF receptor. This was one of the regular seminars organised through the Institute of Structural Molecular Biology, which brings together researchers at Birkbeck and UCL working in structural biology, chemical biology, biophysics, proteomics and bioinformatics.

Lemmon's research is concerned with the structure of the extracellular, EGF-binding regions of these receptors. These are made up of four domains - two "L-domains" and two cysteine-rich domains, arranged in the order L-C-L-C starting from the N terminus of the protein. EGF binds between the two L-domains, and the two C-domains form the interface between the monomers in the dimer. In the absence of ligand, the extracellular region adopts a "tethered" conformation in which the dimerisation domain is occluded. However, this inactive, auto-inhibited conformation can also exist in the presence of EGF; Lemmon and colleagues solved the structure of the entire extracellular region, with ligand bound and in an inactive conformation (PDB file 1nql).

Lemmon and his colleagues have now studied the transition between the inactive, tethered state and the active, extended one using the technique of small-angle X-ray scattering (SAXS), which is used for observing large conformational changes in molecules. They found, importantly, that introducing mutations into the "tether" region of the protein cannot drive the transition to the active conformation. Rather, in this protein (but not in the apparently ligand-less herceptin receptor) it is only EGF binding that can cause the transition to the active form.

A useful (if a few years old) review of tyrosine kinase structure and function is Hubbard & Till (2000), Annu. Rev. Biochem. 69, 373-398. This is accessible from the Birkbeck e-library with your username and password.

Tuesday, 19 February 2008

The bioinformatics of the 'flu

What was the most lethal epidemic of infectious disease in modern times? AIDS perhaps? You might think so, but you would be wrong. Between the beginning of the epidemic and the end of 2007, AIDS killed 25 million people; the influenza epidemic of 1918 killed 40 million. This vast figure is also about double the number who died in the First World War, which ended the same year. Furthermore, a high proportion of those deaths were of healthy individuals in the prime of life. Yet most of the time we think of influenza as little more than a very nasty nuisance...

However, there has recently been a renewed interest in past influenza epidemics as a result of the virulent strain of influenza currently sweeping through populations of wild and domestic birds worldwide. The influenza virus is endemic in birds, and strains tend to spread periodically from them to mammalian hosts: pigs as well as humans. Mapping genetic changes in the influenza virus, and how these affect its spread, is an important research area. Richard Goldstein of the National Institute for Medical Research, based in Mill Hill, London, gave a fascinating Monday seminar on this topic yesterday, looking into the past to see how genetic changes could have led to the lethal 1918 epidemic.

Influenza viruses contain two proteins on their spherical surfaces: a neuraminidase and a hemagglutinin. These proteins come in various forms: 16 different hemagglutinins are known, and 9 different neuraminidases. Any influenza virus can be characterised by these variants - for example, the most common type of influenza currently afflicting humans is H3N2, and the feared bird flu is H5N1. Recent flu epidemics appear to have been caused by reassortment events, where the genomes of different viral subtypes combine to form an entirely new one that will not be recognised by human immune systems. An epidemic in 1957, for instance, coincided with a shift from influenza H1N1 (which had been circulating since 1918) to H2N2.
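
To get a feel for how many subtypes this naming scheme allows, here is a small Python sketch (my own illustration, not something shown in the seminar). It simply pairs every hemagglutinin number with every neuraminidase number, using the counts quoted above. The function name all_subtypes is made up for this example, and the list is only the set of possible labels: many of these combinations may never have been seen in a real virus.

```python
from itertools import product

# Counts quoted in the text above: 16 hemagglutinins, 9 neuraminidases.
N_HEMAGGLUTININS = 16
N_NEURAMINIDASES = 9


def all_subtypes(n_h=N_HEMAGGLUTININS, n_n=N_NEURAMINIDASES):
    """Return every possible HxNy label (hypothetical helper for illustration)."""
    return [
        f"H{h}N{n}"
        for h, n in product(range(1, n_h + 1), range(1, n_n + 1))
    ]


subtypes = all_subtypes()
print(len(subtypes))         # 16 * 9 possible labels
print("H3N2" in subtypes)    # the common human subtype mentioned above
print("H5N1" in subtypes)    # the bird flu subtype mentioned above
```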

Influenza virus hemagglutinin and neuraminidase are both mainly-beta proteins, and their structures are described further on this page of PPS section 5.

So, what happened in, or before, 1918 to cause the epidemic of H1N1 flu? Molecular geneticists, such as Goldstein, study this by reconstructing phylogenetic trees showing the evolutionary distance between viral isolates taken in different places at different times. Yet most of these calculations can only show evolutionary distance, not the direction of change - in the jargon of phylogeny, they produce unrooted, rather than rooted, trees (there is no known "top"). The research was at rather an impasse until postdoc Mario dos Reis (a Birkbeck Ph.D.) noticed that the GC content of viruses infecting humans, but not of those infecting birds, decreased over time. This enabled the group to add an evolutionary "clock" to the phylogenetic tree for each of the influenza virus' 11 genes. This showed that some of the genes had entered the human population at different times, indicating that the variant that caused the 1918 flu had arisen from several recombination events. Interestingly, only one gene (neither H nor N) could have made the jump in 1918; most viral proteins were present in the human population in their 1918 forms well before that year. M1, like some other proteins, appears to have made the shift in 1899.
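
The GC-content idea is simple enough to sketch in a few lines of Python. What follows is only an illustration of the principle, not the group's actual method: the three short "isolates" and their years are invented toy data, and the helper names gc_content and trend_per_year are my own. The sketch measures the fraction of G and C bases in each sequence and fits a straight line against the year of isolation; a negative slope corresponds to the steady decrease described above.

```python
def gc_content(seq):
    """Fraction of bases in a nucleotide sequence that are G or C."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    total = sum(1 for base in seq if base in "ACGTU")
    if total == 0:
        raise ValueError("sequence contains no nucleotides")
    return gc / total


def trend_per_year(years, values):
    """Least-squares slope of values against years (change per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


# Invented toy data: year of isolation -> made-up 10-base sequence.
# Real work would use full gene sequences from dated viral isolates.
TOY_ISOLATES = {
    1920: "GCGCGCATAT",
    1960: "GCGCACATAT",
    2000: "GCACACATAT",
}

years = sorted(TOY_ISOLATES)
gc_values = [gc_content(TOY_ISOLATES[y]) for y in years]
slope = trend_per_year(years, gc_values)

for year, gc in zip(years, gc_values):
    print(year, round(gc, 2))
print("change in GC fraction per year:", slope)
```

With a steady trend like this, the GC content of an undated human isolate could in principle be read against the fitted line to estimate when its lineage entered the human population, which is the sense in which the trend acts as a "clock".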

So, what did happen in 1918? There do appear to have been changes to the H gene then. But it may also be possible that the world population was so debilitated, and susceptible, after four years of war that a variant that had already been around for a few years was, unusually, able to cause such an epidemic...

Monday, 11 February 2008

Function of vFLIP - a protein from the virus associated with Kaposi's sarcoma

Today's Monday seminar was given by Professor Mary Collins from the Division of Infection and Immunity at University College London's medical school. She described some of her studies of the function and mechanism of vFLIP, a protein from the human herpesvirus 8 (HHV8). This is a cancer-causing (oncogenic) virus and is responsible for the AIDS-associated tumour, Kaposi's sarcoma.

While very many people - about 10-15% of blood donors in London, for example - have been exposed to, and have antibodies for, this virus, it only causes problems in people with defective immune systems. In these people, however, it can cause lymphoma or multicentric Castleman's disease as well as Kaposi's sarcoma. Kaposi's sarcoma is now - thanks to HIV and AIDS - one of the commonest cancers worldwide, and in fact the commonest in some African countries. AIDS-related Kaposi's sarcoma is fairly rapidly fatal if untreated; however, it can now be controlled very well (although not completely cured) with anti-retroviral therapy.

HHV8 is a large (by viral standards!) double-stranded DNA virus. The protein vFLIP is one of a cluster of proteins expressed while the virus is latent. It binds to a protein called I-kappa B kinase gamma in virus-infected cells. This causes the kinase to phosphorylate inhibitors of the transcription factor NF-kappa B, leading to the release of NF-kappa B from inhibitor complexes and ultimately the degradation of the inhibitors. This leads to a cascade of gene expression that is essential for the prevention of apoptosis; in contrast, knockdown of vFLIP levels by siRNA will lead to induction of apoptosis. vFLIP is therefore implicated in the survival of virally infected cells.

Endothelial cells that have been infected by HHV8 adopt a characteristic "spindle cell" phenotype. Microarrays have shown that the virus induces complex re-programming of gene transcription in these cells, which is likely to induce this dramatic change in their morphology. Future work will elucidate the precise role of the signal transduction cascade induced by vFLIP in this "morphological reprogramming".

There is no structure available for vFLIP, but there are a large number of kinase structures known. From the UniProt database, the closest to I-kappa B kinase is the intracellular Ser/Thr protein kinase domain of Mycobacterium tuberculosis PknB (PDB entry 1MRU).

Tuesday, 5 February 2008

Seminar by Prof Armitage, Oxford

This week's Monday seminar was by Prof Judy Armitage from Oxford. Do go to her website, as it will help you to see some of the pictures. She was interesting to us as a biologist (microbial physiologist) heading up a Systems Biology centre. Systems Biology is a trendy phrase but has almost as many meanings as there are centres. Systems Biology is really putting biology on a quantitative basis by developing mathematical models that accurately predict experiments. This can be at various scales: whole organisms, individual organs, whole cells or just a particular pathway. Prof Armitage has been studying the photosynthetic bacterium Rhodobacter sphaeroides for many years. It has now become more significant as Craig Venter has found that a close relation is the most abundant organism in sea water and therefore central to photosynthesis in the sea.
However, what Judy is currently studying is the chemotactic response, in which these bacteria swim towards their food source. In contrast to the well-studied E. coli system, where there is only one set of proteins, there are 4 operons in R. sphaeroides. Although in vitro the proteins were able to cross-react and phosphorylate, they did not compensate for each other when deleted.
Her group showed, by attaching fluorescent proteins (GFP, CFP etc.) to the various components, that unlike E. coli, where the chemotactic proteins are all at the pole of the cell, R. sphaeroides has two systems, one at the pole and the other in mid-cell: one to sense the internal "happiness" of the cell (i.e. whether it is well fed) and the other to go and find new food sources when it is not. She pointed out the importance of a number of their experimental details. Firstly, they add the fluorescent protein in frame with the protein of interest in the genome, so as to get natural abundance rather than overexpression. Secondly, they only pay attention to those mutants where the pathway still functions, as the fusions can sometimes disrupt the true localisation and give a false result, but when this happens the pathway does not work. The group has also done work on the flagella that drive the bacteria and, using a special fluorescent microscope, were able to count the number of subunits that drive the motor: 22 per complex.
Her take-home message was that you cannot assume that all things work the same as in E. coli!

Saturday, 2 February 2008

Dear Dr. Clare;

Thank you for inviting us to this blog. It should be an interesting experience! Take care, bye!

Best regards,
Dr. Nadia

Monday, 21 January 2008

Structure and mechanism of Hsp90, a heat shock protein

Heat shock proteins are over-expressed when cells are exposed to heat or stress. Their function is to help other proteins to fold "correctly" into their mature, functional forms and, as such, they are classified as "chaperones". The structures and functions of these proteins will be described in much more detail in section 8 of the PPS course ("The Protein Lifecycle").

Today (21 January), in the School of Crystallography's Monday seminar programme, Maruf Ali from the Institute of Cancer Research in London spoke about his research on the structure of the heat shock protein Hsp90. This protein is found in all kingdoms except for the Archaea; it interacts with many other proteins (known as "client proteins") to help them enter their mature, active structural forms. Hsp90 client proteins include kinases that are important targets for anti-cancer drugs - hence the ICR's interest.

Hsp90 is a three-domain protein. The N-terminal domain binds ATP, which is necessary for the protein's activity; the middle domain binds client proteins; and the C-terminal one is involved in dimerisation. Structures of each domain separately were already known when Maruf started his post-doc a few years ago; each of these domains has a fold in SCOP's alpha+beta class. Maruf's work involved solving the structure of a mutated form of the intact protein bound to a co-chaperone (Ali et al. (2006), Nature 440, 1013-1019; PDB code 2CG9). This gives a clear picture of a complex structure, showing how a "lid" segment closes to enable the client protein to bind, and supports a previously proposed model in which the N-terminal domain is also involved in dimerisation.

Try downloading the structure, loading it into Jmol or a similar program, and seeing if you can identify the three domains.

Tuesday, 15 January 2008

CCP4 2008 Study Weekend

CCP4 (Collaborative Computational Project Number 4) is responsible for one of the main X-ray crystallography computer packages. It holds a study weekend on the first weekend in January each year, attended by 300-500 delegates. It is the social event of the UK protein crystallographers' calendar as well as being the best methods meeting each year, certainly in Europe. This year the topic was "Low Resolution Structure Determination and Validation", inspired by the retraction in Dec 2006 of five high-profile structures of membrane proteins. These had been wrongly determined because the author used an incorrect piece of software (not CCP4!), which meant that his maps were inverted, i.e. his helices were all left-handed rather than right-handed. Because the resolution was low, as it often is for membrane protein crystals, this mistake was not immediately obvious from the maps, and he forced through the refinement of right-handed helices into left-handed density by some dubious methods. Less was said about another structure published in Nature, in which the copies of the protein essentially do not touch in the crystal, as this is still under investigation by the American university from where this structure originates. Analysing these mistakes should help the community to avoid them and possibly to spot them as referees: all these papers had got past expert referees.
On a more positive note, the first example of an ab initio (i.e. without a sequence homologue) prediction of a protein structure that was good enough to solve a crystal structure was presented. The main conclusion of the meeting was that the resolution of the structure is pretty much the only thing that determines the quality of the structure. Big structures at the same resolution are just as good as small structures. However, big structures tend not to be at as high a resolution, so that on average big structures are less well determined than small structures, but this is because they are lower resolution. Structural genomics groups are no better or worse than targeted labs at determining structures.

From a PPS point of view, probably the most interesting talk was by Chris Tate from MRC in Cambridge, who said that the retracted structure had to be wrong because it was incompatible with the biochemical data. Although structure is powerful, it has to be compatible with the biology. The system he worked on (EmrE) has another peculiarity in that half the protein inserts into the membrane in one direction (N terminus in) and half in the other (N terminus out), and the active molecule consists of one of each of these protein chains.

Friday, 11 January 2008

Tim Hunt's Lecture

The Institute of Structural Molecular Biology, based at Birkbeck and University College, hosted a star performer as its first seminar speaker of 2008: Tim Hunt of the Cancer Research UK London Research Institute. Tim was awarded the Nobel Prize for Physiology or Medicine in 2001, with Leland Hartwell and Paul Nurse, for his discovery of cyclins - proteins that control the activity of the cyclin-dependent kinases (CDKs) which control a cell's passage through the cell cycle. So cyclins can be described as regulators of the regulators of the cell cycle.

Maybe only a Nobel Laureate could do it. Tim started his lecture with a quick tour through several hundred years of Physics, inspired by his small daughter's (unanswered) question "Why is the sky opaque?" He introduced (or re-introduced) his audience to Schrödinger's equation and Maxwell's idea of a "field" before confessing that he didn't understand quantum mechanics: no one should ever be ashamed of admitting as much.

The cell cycle, and its control, is, like quantum mechanics, "very interesting and very complicated". Tim's critical observation, which he made studying frog oocytes, was that cell division is controlled by the concentrations of the proteins that we now know as cyclins. They were given this name because their concentration in cells goes up and down according to where those cells are in the cell cycle - whether they are growing, replicating their DNA, undergoing mitosis...

Cyclins control the progress of cells through cell division by regulating the function of cyclin dependent kinases (CDKs). By binding to CDKs, cyclins control their activation state, and active CDKs drive the cells through the cell cycle. The press release for the 2001 Nobel Prize succinctly described CDKs as the cells' "motors" and cyclins as the gear boxes that control whether cells will be in idle or overdrive.

Cyclins are all-alpha proteins (link is to the PPS material) with 5-helical cores. PDB entry 1H1S shows human cyclin A bound to CDK2.

It is impossible to do justice to such a complex topic, and lecture, in a few paragraphs. To learn more, the resources on the Nobel website would be a good first port of call.

Happy New Year!

It must have been the way the holidays fell in the middle of the week, this year, but that was certainly a long break! But we are now back in earnest.

And I hope that none of you have forgotten what you have been learning over the long holiday. Section 5 of the course will be released on Monday 14th Jan.

And there have already been a few interesting scientific events of 2008...