Vincentelli, R., Romier, C. (2013) Expression in Escherichia coli: becoming faster and more complex. Current Opinion in Structural Biology. In press, corrected proof.
Escherichia coli, or E. coli, has long been the trusty workhorse of the structural biologist. It is by far the most popular expression host, that is, an organism used to produce target proteins from introduced DNA. An astonishing 90% of structures deposited in the Protein Data Bank (PDB) have at least one subunit, if not the entire protein, produced by this bacterium. Targets are becoming more ambitious, however, as the importance of larger protein complexes in biological pathways becomes increasingly apparent. A recent review (Vincentelli and Romier, 2013) examines whether E. coli is fit to tackle these new challenges. Can the workhorse be taught to jump fences?
E. coli as host for single expression
The virtual monopoly held by E. coli as a host is due to a number of factors, not least its ease of use and low cost. It is amenable to several different methods of genetic engineering, grows rapidly and benefits from an ever-expanding range of host-specific tools.
There are mature methodologies available for the production of a single protein. Reliable and straightforward screening processes, which can discover the conditions required for optimised protein solubility, are accessible even for challenging targets. Since the solubility of a protein has a direct impact on the size and quality of the crystals that can be grown, optimising it is a critical aim.
Genomics laboratories tend to use robotic platforms which can work with thousands of cultures side by side in order to optimise the culture conditions for their long shopping list of proteins. This wealth of experience now allows biologists to make a rational selection of a smaller set of expression parameters when targeting proteins which are more difficult to produce. If this restricted method fails then the team can revert to the broader set of conditions.
Supporting this trend towards more efficient screening processes are studies that have identified the parameters with the greatest impact on expression and solubility, a key example being specific fusion protein tags which enhance solubility.
Another recent development which has increased the complexity of proteins open to expression by E. coli is the ability to co-express factors responsible for post-translational modification. These factors can be critical in protein folding, complex assembly and catalysis, for example by promoting glycosylation and the formation of disulphide bridges.
Streamlining for maximum solubility and quality
The traditional approach to protein production was to perform small scale expression tests designed to discover the conditions for the highest soluble yield. These conditions would be scaled up and only at this stage would consideration be given to the quality of the sample, specifically protein aggregation, oligomeric states, protein stability and correct folding.
Initial yields have been increased by using growth media that support high cell densities, particular E. coli strains and protein fusions designed to enhance solubility. At the same time, biophysical characterization has improved to the point where assays to examine protein quality can be performed on just micrograms of sample. Together, these advances mean that culture conditions can be optimised for both solubility and quality at the first stage, thereby streamlining the expression protocols.
E. coli and the expression of macromolecular complexes
Most biological processes involve macromolecular complexes alongside single proteins and there is an increasing desire to understand the structural and biochemical basis of these larger structures. The co-expression of partner proteins has been demonstrated to be advantageous for complex formation as protein-protein interactions are often the platform which allows co-folding and co-stabilization.
One example is the fimbrial tip complex of E. coli. Many Gram-negative and some Gram-positive bacteria are covered in a fringe of short, thin fimbriae, which they use to attach both to eukaryotic cells and to each other, although the mechanism was previously unknown. Co-expression of the Fim proteins of the fimbrial tip established that each subunit inserts a β strand into its neighbouring subunit, such that allosteric changes in the tip protein trigger signals which can be passed down the fimbria.
Image adapted from Le Trong, I. et al. (2010). A view of a fimbrial tip complex of E. coli (PDB 3JWN).
Birkbeck’s head of biological sciences, Prof Gabriel Waksman, has also used this technique to elucidate the interaction of FimH with its transmembrane translocation channel. His work has been the subject of previous blogs in June ’11 and May ’08.
Another impressive example, which illustrates the size of complex that can be achieved through co-expression, is the 1.8 MDa baseplate of the lactococcal phage TP901-1. The baseplate is responsible for adhesion of the phage to the host and for delivery of the genome at infection. This particular version consists of 6 subunits of DIT, 18 of BppU and 54 of RBP.
Image adapted from Veesler, D. et al. (2012). The baseplate of the lactococcal phage TP901-1 (PDB 4DIW).
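To get a feel for the scale of the baseplate, the subunit numbers quoted above can be tallied in a few lines of Python. This is only a back-of-the-envelope sketch: the copy numbers and the 1.8 MDa total come from the text, while the mean mass per chain is a crude average that ignores the fact that the three proteins differ in size.

```python
from functools import reduce
from math import gcd

# Copy numbers and total mass as quoted in the text for the TP901-1 baseplate.
COPIES = {"DIT": 6, "BppU": 18, "RBP": 54}
TOTAL_MASS_DA = 1.8e6  # 1.8 MDa

# Total number of polypeptide chains in the assembly.
total_chains = sum(COPIES.values())

# Simplest whole-number ratio between the three subunits.
common = reduce(gcd, COPIES.values())
ratio = {name: n // common for name, n in COPIES.items()}

# Crude average mass per chain (illustrative only; the real subunits differ in size).
mean_chain_kda = TOTAL_MASS_DA / total_chains / 1000

print(f"chains: {total_chains}")
print(f"ratio DIT:BppU:RBP = {ratio['DIT']}:{ratio['BppU']}:{ratio['RBP']}")
print(f"mean mass per chain: about {mean_chain_kda:.1f} kDa")
```

The tally makes the point of the example concrete: a single co-expression experiment yields an assembly of 78 chains in a 1:3:9 ratio.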
Despite the advantages of co-expression, only a small percentage of large complexes listed in the PDB have been fully produced this way. Often, subunits are produced using co-expression but then labour-intensive in vitro reconstitution strategies are employed to form the complete complex.
In some cases this is because the protein complexes have critical interactions with nucleic acids, but that does not explain every example. The question, then, is what barriers are discouraging complex formation via co-expression in E. coli.
Approaching the jumps
Studies have found various parameters that influence the quality and yield of co-expression using E. coli. Results can vary depending on whether a single vector or multiple vectors are used to introduce the target genes, whether multiple genes are used rather than cis and trans copies of the same gene, and on the precise location of the affinity tag.
The best way to tackle such a large number of possible combinations of conditions is to use the high-throughput technologies which have been refined so effectively for single-protein production. These will need to be combined with the tools in development for miniaturization of biophysical characterization, so that sample quality can be considered during the initial stages rather than creating a further bottleneck when a second round of tests is performed with greater quantities.
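To illustrate how quickly the number of conditions grows, the kind of screen described above can be sketched in Python. All parameter names and values here are hypothetical placeholders chosen for illustration, not conditions recommended by the review, and the 96-well layout is simply an assumed plate format.

```python
from itertools import product

# Hypothetical parameter levels, for illustration only; the review does not
# prescribe these specific names or values.
PARAMETERS = {
    "vector_strategy": ["single_vector", "multiple_vectors"],
    "tag_position": ["N_terminal", "C_terminal"],
    "tagged_subunit": ["subunit_A", "subunit_B"],
    "strain": ["strain_1", "strain_2", "strain_3"],
}


def enumerate_conditions(parameters):
    """Return every combination of parameter levels as a list of dicts."""
    names = list(parameters)
    return [dict(zip(names, combo)) for combo in product(*parameters.values())]


def assign_wells(conditions, rows="ABCDEFGH", columns=12):
    """Map conditions onto plate wells (A1, A2, ...), filling row by row."""
    wells = [f"{row}{col}" for row in rows for col in range(1, columns + 1)]
    if len(conditions) > len(wells):
        raise ValueError("more conditions than wells on one plate")
    return dict(zip(wells, conditions))


conditions = enumerate_conditions(PARAMETERS)
plate = assign_wells(conditions)

print(f"{len(conditions)} conditions to test")
for well in ("A1", "B12"):
    print(well, plate[well])
```

Even this toy screen, with only two or three levels per parameter, fills a quarter of a 96-well plate, and each extra parameter multiplies the total. That is why a full factorial design soon calls for robotic handling, and why the rational pre-selection of a smaller set of conditions described earlier is attractive.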
As techniques for performing characterization assays on minute samples improve further and, hopefully, a co-expression system can be evolved which allows the production of protein/RNA and protein/DNA complexes, there is optimism that E. coli can extend its hosting duties into ever larger and more intricate protein complexes.
Protein-protein interactions and protein expression for structural biology are covered in detail in the TSMB course.