
The Language of the Genes
Another tool uses enzymes extracted from bacteria to divide the landscape into manageable pieces. Bacteria are attacked by viruses which insert themselves into their genetic message and force the host to copy the invader. They have a defence: enzymes which cut foreign DNA in specific places. These ‘restriction enzymes’ can be used to slice human genes into pieces. Dozens are available, each able to cut a particular group of DNA letters. The length of the pieces that emerge depends on how often the cutting-site is repeated. If each sentence in this volume was severed whenever the word ‘and’ appeared, there would be thousands of short fragments. If the enzyme recognised the word ‘but’, there would be fewer, longer sections; and an enzyme that sliced through the much less frequent word ‘banana’ (which, I assure you, does appear now and again) would produce just a few fragments thousands of letters long.
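The analogy of cutting a text at a chosen word can be tried out directly. What follows is a minimal sketch, not a model of any real enzyme: the function name `digest` and the sample sentence are invented for illustration, and the cut is assumed to fall just after the recognised word, whereas real restriction enzymes cut at a fixed point within their site.

```python
def digest(text: str, site: str) -> list[str]:
    """Cut `text` immediately after every occurrence of `site`.

    Assumption for illustration: the cut falls at the end of the
    recognised word, and occurrences do not overlap.
    """
    fragments = []
    start = 0
    pos = text.find(site)
    while pos != -1:
        cut = pos + len(site)
        fragments.append(text[start:cut])
        start = cut
        pos = text.find(site, cut)
    if start < len(text):
        # Whatever is left after the last cut is the final fragment.
        fragments.append(text[start:])
    return fragments


sample = "salt and pepper and bread but no banana and no jam but tea"
for word in ("and", "but", "banana"):
    pieces = digest(sample, word)
    print(word, len(pieces), [len(p) for p in pieces])
```

The commoner the word, the more fragments there are and the shorter they tend to be; joining the fragments end to end always gives back the original text.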
The positions of the cuts (like those of the words ‘and’, ‘but’ and ‘banana’) provide a set of landmarks along the DNA. To track them down is a first step to reconstituting the book itself. The process is close to that carried out by the students who stormed the American Embassy in Tehran after the fall of the Shah. With extraordinary labour they pieced together secret documents which had been put through a shredder. By putting the fragments together the students reconstituted a long, complicated and compromising message.
Molecular biology does much the same. First, it needs to multiply the number of copies of the message to allow each short piece to be surveyed in detail as a preliminary to the complete map. Various tricks allow cut pieces of DNA to be inserted into that of a bacterium or yeast. The DNA has been cloned. Whenever the host divides, it multiplies not only its own genetic message but the foreign gene. As a result, millions of copies of an original are ready for study in the exquisite detail needed for genetic geography.
Cloning has been supplemented by another contrivance, the polymerase chain reaction. This takes advantage of an enzyme used in the natural replication of DNA to make replicas of the molecule in the laboratory. To pursue our rather tortured literary analogy, the method is a biological photocopier which can produce many duplicates of each page in the genetic manual. The photocopying enzyme comes from a bacterium which lives in hot springs. The reaction is started with a pair of short artificial DNA sequences which bind to the natural DNA on either side of the length to be amplified. By heating and cooling the reaction mixture and feeding it with a supply of the four bases, the targeted strands of DNA unwind, copy themselves with the help of the enzyme, and re-form. Each time the cycle is repeated, the number of copies doubles and millions of replicas of the original piece of DNA are soon generated.
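The arithmetic of doubling explains why millions of replicas appear so soon. The sketch below assumes, for simplicity, that every cycle doubles the number of copies exactly; a real reaction falls somewhat short of that, and the function names are made up for the purpose.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Copies present after `cycles` rounds, assuming perfect doubling."""
    return starting_copies * 2 ** cycles


def cycles_needed(target: int, starting_copies: int = 1) -> int:
    """Smallest number of doubling cycles that reaches `target` copies."""
    cycles = 0
    copies = starting_copies
    while copies < target:
        copies *= 2
        cycles += 1
    return cycles


print(copies_after(20))          # 1048576
print(cycles_needed(1_000_000))  # 20
```

Under this idealised assumption, a single molecule passes the million mark after twenty cycles.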
Another piece of trickery exploits DNA’s ability to bind to a mirror image of itself. DNA bases form two matched pairs; A with T and G with C. To find a gene, a complementary copy is made in the laboratory. When added to a cell this seeks out and binds to its equivalent on the chromosome. My computer can do the same. On a simple command, it will search for any word I choose and highlight it in an attractive purple. It does the job best with rare words (like ‘banana’). A DNA probe labelled with a fluorescent dye shows up genes in the same way. The method is known as FISHing (for Fluorescent In-Situ Hybridisation) for genes. A modified kind of FISH involves unwinding the DNA before it is stained. This makes the method more sensitive.
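The pairing rule (A with T, G with C) is simple enough to write down as a search. This is only an illustrative sketch: the names `reverse_complement` and `probe_sites` and the short sequences are invented, and the probe is assumed to bind only where the match is perfect, which is stricter than the chemistry demands.

```python
# A pairs with T and G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the strand that would pair with `strand`, read in its own direction."""
    return strand.translate(COMPLEMENT)[::-1]


def probe_sites(chromosome: str, probe: str) -> list[int]:
    """Positions on `chromosome` where `probe` would bind.

    Assumption: binding needs a perfect match along the whole probe.
    """
    target = reverse_complement(probe)
    sites = []
    pos = chromosome.find(target)
    while pos != -1:
        sites.append(pos)
        pos = chromosome.find(target, pos + 1)
    return sites


print(reverse_complement("GATTACA"))            # TGTAATC
print(probe_sites("CCGATTACAGG", "TGTAATC"))    # [2]
```

As with the word search, a long or unusual probe turns up in very few places, which is what makes it useful as a marker.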
All this and much more has revolutionised the mapping of human DNA. First, it has improved the linkage map. Patterns of inheritance of short sequences of DNA can be tracked through the generations just as well as can those of colour-blindness or stubby fingers. There are millions of sites which vary from person to person. All can be used in pedigree studies. Another scheme is to use the polymerase chain reaction to multiply copies of DNA from single sperm cells. The linkage map is made from a comparison of the reordered chromosomes in the sperm with those in the man who made them. This avoids the problem of family size altogether.
Linkage mapping in humans took a long time to get started and still has some way to go. Before the days of high technology the great problem was a shortage of differences; of variable genes, or segments of genetic material, whose joint patterns of inheritance could be studied. That problem has been solved. Our DNA is now known to be saturated with hundreds of thousands of variable sites, many based on individual variation in the numbers and positions of repeats of the two letters C and A. As a result, a whole new industry based on the most traditional kind of genetics has burst into existence.
It needs, like any industry, raw material. The French, together with the Americans, have identified sixty or so large families with long and complicated pedigrees, well suited for gene mapping. They come from various parts of the world, from Venezuela to Bangladesh. From each individual, lines of cells are kept alive in the laboratory and thousands of variants have been identified, tightly packed along the entire length of the chromosomes. Patients with, say, heart disease can be screened to see whether they also tend to carry other inherited variants. If they do, there is a good chance that the actual gene involved is nearby, and is dragging its anonymous fellows along with it. To find such a milestone may be the first step to the gene itself.
The descendants of Morgan have at last managed to do for humans what was long ago achieved for the fruit fly, and a linkage map of man is close at hand. That of woman, it transpires, is rather longer. Such maps depend on the sexual reshuffling of genes. This takes place, for some reason, more in females than in males and, as a result, their chart works to a different scale.
The human linkage map is useful, but biologists have always wanted to make a different kind of chart, one rather like that used by geographers, based on a straightforward description of the genetic material. Now, it is here. The approach was brutal: to assault the genome with time, money and tedium until the whole lot was read from one end to the other.
The first move in tying the linkage map to one based on the physical structure of DNA depended on a stroke of luck. Morgan noticed that in one of his fly stocks a gene which was usually sex-linked started behaving as if it was not on the X chromosome at all. A glance down the microscope showed why. The X was stuck to one of the other chromosomes and was inherited with it. A change in the linkage relationships of the gene was due to a shift in its physical position.
Such chromosomal accidents were used to begin the human physical map. Sometimes, because of a mistake in the formation of sperm or egg, part of a chromosome shifts to a new home. Any parallel change in the pattern of inheritance of a particular gene shows where it must be. Now and again a tiny segment of chromosome is absent. That can lead to several inborn diseases at once. One unfortunate American boy had a deficiency of the immune system, a form of inherited blindness, and muscular dystrophy. A minute section of his X chromosome had been deleted. It must have included the length of DNA which carried these genes. He gave a vital hint as to just where the gene for muscular dystrophy – one of the most frequent and most distressing of all inherited diseases – was located. The absent segment was a landmark upon which a physical map of the area around this gene could be anchored.
To map genes with changes in chromosomes need not wait for natural accidents. Human cells, or those of mice or hamsters, can be cultured in the laboratory. When mixtures of mouse and human cells are grown together, the cells may fuse to give a hybrid with chromosomes from both species. As the hybrids divide, they lose the chromosomes (and the genes) from one species or the other. Some specifically human genes are lost each time a human chromosome is ejected. To match the loss of particular genes with that of chromosomes (or of their short segments) shows where they must be.
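The logic of matching lost genes to lost chromosomes can be set out as a small table look-up. The sketch below is hypothetical throughout: the hybrid cell lines, the chromosomes they retain and the function name `locate_gene` are all invented. The rule it applies is the one described above: the gene must sit on a chromosome that is present in every line where the gene is detected and absent from every line where it is not.

```python
def locate_gene(panel: dict[str, tuple[set[int], bool]]) -> list[int]:
    """Return the human chromosomes whose presence matches the gene in every line.

    `panel` maps a hybrid cell line's name to a pair:
    (set of human chromosomes it has kept, whether the human gene is detected).
    """
    all_chromosomes = set().union(*(kept for kept, _ in panel.values()))
    matches = []
    for chromosome in sorted(all_chromosomes):
        if all((chromosome in kept) == detected for kept, detected in panel.values()):
            matches.append(chromosome)
    return matches


# Invented panel of four mouse-human hybrid lines.
panel = {
    "line_a": ({1, 7, 12}, True),
    "line_b": ({1, 12, 21}, False),
    "line_c": ({7, 21}, True),
    "line_d": ({1, 4}, False),
}
print(locate_gene(panel))  # [7]
```

With too few cell lines, several chromosomes may fit the pattern equally well, which is why the function returns a list rather than a single answer.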
All these methods hint at a gene’s position rather than giving its precise coordinates. Small-scale cartography (or mindless sequencing, as it is affectionately known) involves various clever ruses. One depends on the ability of DNA to copy itself when a special enzyme is provided and the mixture fed with the A, G, C and T bases. It is possible to gradually lengthen pieces of a DNA strand from one end to the other, in four separate experiments (each using a different base). By chemical trickery, some of the growing strands are stopped each time a base is added. This produces a set of DNA pieces of different length, each stopped at an A, a G, a C or a T. Electrophoresis of the mixtures on the same gel gives four parallel lines of DNA fragments arranged by length. A scan across and down the gel gives the order of the bases. This is a most tedious task. It has been supplanted by machines that do the job in other ways.
The most important change in genetics is a conceptual one. Because the three-letter code for each amino acid is known, it is possible to deduce the order of the amino acids made by a piece of the DNA once its sequence of bases has been established. What any gene does can be inferred by comparing that sequence with the computer database of others whose job is known. The fit need not be precise; after all, a French dictionary contains thousands of words similar enough to those in English to allow its meaning to be guessed at. It is also sometimes possible to work out the three-dimensional structure of the protein from its amino acid sequence and to deduce what its function might be.
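Both steps, reading the gel and turning bases into amino acids, can be sketched in a few lines. The example is a toy under stated assumptions: the four lanes hold invented fragment lengths, each length is assumed to appear in exactly one lane, the function names are made up, and the table is the standard genetic code written in the conventional T, C, A, G order with `*` marking a stop.

```python
from itertools import product


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Rebuild the base order from four lanes of fragment lengths.

    `lanes` maps each base to the lengths of the fragments that stopped
    at that base. Reading the lengths from shortest to longest gives
    the sequence.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA three letters at a time, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented gel: the lengths at which strands stopped in each lane.
lanes = {"A": [1, 11, 12], "T": [2, 7, 10], "G": [3, 4, 8, 9], "C": [5, 6]}
sequence = read_gel(lanes)
print(sequence)             # ATGGCCTGGTAA
print(translate(sequence))  # MAW
```

The one-letter amino acid names are the usual shorthand (M for methionine, A for alanine, W for tryptophan); comparing such a string with a database of known proteins is a separate job that this sketch does not attempt.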
There are some remarkable similarities among inherited vocabularies. The genes that control development are similar in humans and fruit flies, as are those that make their brains. Genes that, when they go wrong, damage the nervous system have close analogues in yeast (which do not have nerves at all) and one of our own genes is almost identical to another that alters the pattern of veins on an insect wing. Such conservatism has had a radical influence on human genetics.
The parts catalogue for a Mercedes C-class car contains four and a half thousand named items, from accelerator pedal to wing mirror to wheel nuts. Some (like individual bolts or washers) may be repeated dozens of times; but the factory has to make fewer than five thousand pieces to feed its assembly line and, in the end, to make its contribution to the European traffic jam. To make a human takes ten times as many – an executive jet’s worth – and the task of seeing how that vast number of pieces is bolted together might seem almost impossible. Even the yeast cell (scarcely the Mercedes of the living world) needs more than the car, with six thousand proteins.
The yeast gene sequence itself, like any other, is no more than a factory manual, containing information on castings, mouldings and blanks but also on various extraneous bits which are removed before the assembly line gets them. Then, as in the Mercedes factory, the parts have to be put together to make a functional piece of machinery. Even that is of no use to someone who cannot drive, and even a skilled driver is no help when dumped in a strange city without a road map. To understand the workings of the cell demands even more.
DNA dismantlers, like car wreckers, generate only a box of bits and pieces; the biological equivalents of the nuts, bolts, relays, springs, struts, wires and all the other things needed to make an automobile. The shape of a human protein can be inferred from a DNA sequence, but even that usually gives no hint as to how it fits into the cellular machinery. Yeasts are simpler, and rather more is known about their mechanics. Life’s unwillingness to change allows the yeast machine to be used to explore our own cells. One approach in the human gene hunt is rather like fishing. Take a protein whose job is known, and attach a molecular hook and a separate float to it. Insert it into a male (or a cell showing what passes for maleness in yeasts). Then, mate that alluring individual to a female and drift his gene past all her thousands of cell parts until one takes the bait by slotting into it. The float causes the female cell to light up and the match is made.
A fishing expedition with two hundred or so bait proteins from yeast captured more than a thousand genes in human cells. One whole set of yeast proteins attached themselves to a single human protein that tells the cell when to start dividing and when to stop. The yeast bait is similar to one that, when it goes wrong, causes human cancer: and a quick test proved that the newly hooked human equivalents represented crucial parts of our own cells’ brake and accelerator systems. Such a discovery is of great interest to medicine, and marked the first step in what may become an era of hunting for genes in complex creatures with a lure based on more humble beings.
The genetic languages spoken by different organisms are close indeed; close enough, in fact, to give an even chance that a newly-discovered human gene sequence will be related to something else, either another of our genes or one from a creature remote from ourselves. Human genetics has been transformed. No longer does it start with an inherited change (such as a genetic disease) and search for its location. Instead, it uses the opposite strategy, with a logic precisely opposite to that of Mendel: from inherited particle to function, rather than the other way around. Genetics is the first science to have accelerated by going into reverse.
The first breakthrough of this new approach was the successful hunt for the cystic fibrosis gene in 1990. It gave a hint as to what was possible and was the introduction to the advances that led to the complete map a mere decade or so later. The job cost one hundred and fifty million dollars, but the costs per gene have now dropped by hundreds of times.
Cystic fibrosis is the most common inherited abnormality among white-skinned people. In Europe, it affects about one child in two thousand five hundred. Until a few years ago those with the disease died young. Their lungs filled with mucus and became infected. Those with the illness find it hard to digest food as they cannot produce enough gut enzymes. Its dangers have long been recognised. Swiss children sing a song that says ‘The child will die whose brow tastes salty when kissed.’ These symptoms seem at first sight unrelated, but all are due to a failure to pump salt across the membranes which surround cells. Medicine has improved the lives of those affected, but few survive beyond their mid-thirties.
Family studies showed long ago that the disease is due to a recessive gene that is not carried on the sex chromosomes. In 1985, pedigrees revealed that it was linked to another DNA sequence which controls a liver enzyme, although it was not then known upon which chromosome that was. Within a year or so, a kindred was discovered in which this pair of genes was linked to a DNA variant that had already been mapped to chromosome seven. The relevant segment of that chromosome was inserted into a mouse cell line, cut into short lengths and the painful task of sequencing begun. By 1988 the crucial region had been tracked down to a segment of DNA one and a half million base-pairs long. Fragments were tested to see if (like the yeast and human sequences later found to control cell division) they had sequences in common with the DNA of other animals as, if they did, the order of letters must have been retained through evolution because they did some unknown but useful job. Several such sections were uncovered. One had an order of DNA letters similar to that of other proteins involved in transport across membranes. It followed the pattern of inheritance of cystic fibrosis. The gene had been tracked down.
The cystic fibrosis gene is a quarter of a million DNA bases long, although the protein has only about one and a half thousand amino acids. Computer models of its shape show that it spans the cell membrane several times, just as expected for a molecule whose job is to act as a pump. Many families with the disease have just one change in the protein: a single amino acid is missing. That changes its shape and stops the new protein from going to the right place in the cell. Instead it is picked up and destroyed by the internal quality-control network.
The discovery of the gene allowed carriers (together with foetuses bearing two copies) to be identified. Unfortunately, cystic fibrosis, which once seemed a simple disorder, can, we now know, be caused by many different DNA changes that vary from place to place and from family to family. The illness gave the first hint about the unexpected and unwelcome complexity that the full map was to reveal.
Mapping exploded after that first discovery. At first, the mappers behaved like any explorer in a new territory. A cartographer does not start with a plan of the beach which is then extended in excruciating detail until the whole country is covered. Instead he picks out the major landmarks and leaves the details until later, when he knows what is likely to be interesting. Before today’s triumph of technology, most mappers were concerned with a small proportion of the genes, those that lead to inherited disease.
All the most important single-gene inherited illnesses were tracked down within a few years. Huntington’s Disease leads to a degeneration of the nervous system and death in middle age. It was once called Huntington’s Chorea (a word with the same root as choreography) after the involuntary dancing movements of those afflicted. An eighteenth-century Harvard professor claimed that those with the disease were blasphemers as their gestures were imitations of the movements of Christ on the Cross and some sufferers were burned. It is a dominant, but with a nasty twist: because of the late onset of symptoms, those at risk are left in uncertainty about their predicament. In 1983 came a breakthrough helped by great good luck. Soon after the search started, the approximate site of the Huntington’s gene was found by following its association with a linked DNA variant some distance away on the same chromosome. Then, luck ran out, and it took ten years to find the gene. It has now been tracked to the tip of chromosome 4. The shape of the protein which has gone wrong – huntingtin, as it is with some lack of imagination called – has been worked out to give, for the first time, some insight into the nature of the disease, which involves nerve cells in effect committing suicide when the aberrant protein (which looks like nothing else in the cell) instructs them to do so. Many more damaged genes soon fell victim to the genetic explorers and were pinned onto the map.
Type in the four letters OMIM – Online Mendelian Inheritance in Man – into any search engine and a list of ten thousand inherited diseases at once appears; symptoms, inheritance patterns, and, for nearly all, chromosomal grid reference. From the hunt for inherited illness, the search shifted to a wider set of genes. No longer were diseases needed as a first clue. To look for genes only when they go wrong is like trying to work out the principles of the internal combustion engine from car breakdowns. Now, the machine itself can be dismantled and its mechanism inferred directly.
When a gene makes something, it generates a complementary molecule – a messenger, as it is known – which transfers information from DNA to the main part of the cell. Because it produces nothing, most DNA generates no messengers at all. To find such molecules is hence an excellent way to search out working genes. There are tens of thousands of distinct messengers. What most do is quite unknown. In most cells, most are switched off but in the brain a large proportion are at work at any time. The brain is more active than is any other tissue (which may help to explain why more than a quarter of all inherited diseases lead to mental illness).
The hunt for genes is more like that for Timbuctu than for El Dorado. The mappers soon found that genes are oases of sense in a desert of nonsense. At one time, it seemed scarcely worth sifting the sands between the genetic cities, but, in the end, the complete map was made mainly on the grounds that it was worth while as one never knows what might turn up. It reaffirmed one of the most misunderstood facts in science; that it is possible to solve most problems by throwing money at them.
The assault on the physical map is best compared to surveying a country with a six-inch ruler, starting at one end and driving on to the opposite frontier. Twenty and more years ago, when the job began, one person could do about five thousand DNA bases a year. Now, it is routine to do thousands of times as many. Much of the intellectual effort of the job has moved from the simple accumulation of information to understanding it. Computer wizardry has played as important a part in the gene map as has biochemical machinery.
Once a segment of DNA has been sequenced, the local maps – the town plans – must be put in the right order. One way to build up a larger chart is to make a series of overlapping sequences of short pieces of DNA. The approach is a little like putting pages ripped out of a street guide back together by looking at the overlaps at the edge of each page in an attempt to find streets which run into each other. Sophisticated programs look for superimposed segments, long or short, and reassemble the torn fragments of DNA. That is much harder than it seems. An alphabet of just four letters and – like the map of an American city – many repeats of the same pattern of streets, gives plenty of chances for confusion. There are some short cuts. One trick, useful in the early days, was to jump several pages in the guide in the hope of missing out particularly tedious parts of the neighbourhood, but for completion even the dullest parts of town must be charted.
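The matching of page edges can be illustrated with a deliberately naive program. It is a sketch of one simple strategy (repeatedly gluing together the two pieces with the longest shared edge), not a description of the software the mappers used; the function names, the minimum overlap of three letters and the fragments are all assumptions made for the example.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest end of `left` that matches the start of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge the pair with the largest overlap until none remains."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; return the pieces as they stand
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


torn_pages = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(assemble(torn_pages))  # ['ATGCGTACGTTAGC']
```

A scheme this simple is easily misled by the repeats mentioned above: if the same run of letters turns up in several places, the longest overlap need not be the right one.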
New and powerful computers have made it possible, in principle at least, to make a whole genetic atlas at once, rather than piecing it together page by page. The ‘random shotgun’ approach lives up to its name. It blasts copies of the genome into thousands of segments, again and again, and, like a taxidermist rebuilding a single pheasant from the casual slaughter of many by a blind man with a twelve-bore, reconstitutes the whole thing from scratch. A giant program puts all the shattered pieces together, until at last they look like a map (or a game-bird). That approach worked well in fruit-flies, whose genome was sequenced before our own, but flies have a tenth as many DNA letters and far less repetition of easily-confused short sequences than we do. The less audacious ‘clone by clone’ approach takes tiny fragments (each about a twenty-thousandth of the whole of human DNA) and sequences them one by one. Then, it reassembles short segments of genes and, in time, re-forms the whole atlas. The approach, plodding as it may be, has worked well with humans and was used by the publicly-funded mappers to publish each clone as it appeared and to help thwart the privatised plan to sequence (and patent) the whole of our DNA at one fell swoop.
The physical map does not look at all like the linkage maps which emerged from family studies. The central difficulty is one of scale. A few tens of thousands of functional genes fit into three thousand million DNA letters. As most genes use only the information coded into several thousand bases there seems to be far more DNA than is needed. Mapping shows that just one part in twenty represents part of a gene. Our genome has an extraordinary and quite unexpected structure.
A geographical analogy may help. Imagine the journey along the whole of your own DNA as a trip from Land’s End to John o’Groat’s via London; about a thousand miles altogether. To fit in all the DNA letters into a road map on this scale, there have to be fifty DNA bases per inch, or about three million per mile. The journey passes through twenty-three counties of different sizes. These administrative divisions, conveniently enough, are the same in number as the twenty-three chromosomes into which human DNA is packaged. With the exception of some short segments a few hundred yards long which, for various technical reasons, have proved recalcitrant, the whole lot has been mapped out with an accuracy of one part in fifty thousand – an inch in a mile (which is as good or better than the maps sold by the Ordnance Survey).
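The scale of the road map is easy to check. The figures below are the round numbers used in the text (three thousand million letters, a thousand miles); only the number of inches in a mile is added, and the variable names are invented for the example.

```python
GENOME_BASES = 3_000_000_000   # three thousand million DNA letters
JOURNEY_MILES = 1_000          # Land's End to John o'Groat's via London, roughly
INCHES_PER_MILE = 63_360       # 1,760 yards of 36 inches each

bases_per_mile = GENOME_BASES / JOURNEY_MILES
bases_per_inch = bases_per_mile / INCHES_PER_MILE

print(f"{bases_per_mile:,.0f} bases per mile")   # 3,000,000 bases per mile
print(f"{bases_per_inch:.1f} bases per inch")    # 47.3 bases per inch
print(f"an inch in a mile is one part in {INCHES_PER_MILE:,}")
```

So ‘fifty bases per inch’ is a fair rounding of about forty-seven, and ‘one part in fifty thousand’ is a round figure for an inch in a mile, which is strictly one part in 63,360.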