Julia Draznin Maltzman, MD
Abramson Cancer Center of the University of Pennsylvania
Last Modified: November 2, 2003
The recent completion of the human genome sequence has greatly increased interest in genetics. However, knowing the human genome sequence is only the beginning; the next step is to understand the function of each gene and how its malfunction may cause disease. From what we know thus far, the human genome has approximately 35,000 genes. Only about half of these have recognizable DNA sequences that would suggest a possible function. Disease-causing mutations have been recognized in ~1000 genes. Furthermore, the old teaching that one gene makes one protein has been rejected with the discovery of alternative splicing and post-translational modification. Alternative splicing allows a single gene to produce multiple related proteins, and post-translational modification is the addition of small molecules to an already existing protein, giving it a novel function. It is thought that more than 100,000 proteins can be produced from 35,000 genes. Genetics is the study of a single gene and how it may be passed on from one generation to the next. Genomics, by contrast, is the study of all the genes in an organism as well as their interactions with each other. Understanding genomics may hold the key to a better understanding of diseases such as cancer.
Furthermore, genomics may enable oncologists to tailor therapy to each individual patient according to their genes. The hope is that one day science will be advanced enough that a patient's genome may be ascertained by a simple, non-invasive test and compared with a "normal" or "healthy" genome within a matter of minutes. By knowing which genes are responsible for a specific disease, doctors could target the therapy to that particular gene. In an ideal world, this technology could also be used for screening, identifying individuals at risk of developing a disease (cancer or otherwise) based on their genetic makeup.
Proteomics, by contrast, is the study of all the proteins produced by a cell. Proteomics requires identifying these proteins and analyzing their roles in both the physiologic and the pathologic state. This is actually much more complicated than it seems at first glance. While the DNA itself remains the same, the proteins in any given cell change as genes are turned off and on in response to environmental influences and stimuli. Thus, the set of proteins expressed in a cell differs from one moment to the next. Given the dynamic nature of protein expression, the study of proteins has so far been a challenge.
The biologic behavior of a cell is determined by the pattern of gene expression within that cell. Each human cell has billions of DNA subunits, which encode tens of thousands of genes. In any given cell only a small fraction of these genes are active. Cancer can be thought of as a disease that results from multiple genetic abnormalities. These abnormalities in the genome result in changes in gene expression, that is, in the proteins, which functionally represent the genome. Changes in proteins result in aberrant cell behavior such as uncontrolled cell growth, loss of gene repair mechanisms, loss of contact inhibition, and overall genetic instability.
Working with proteins is much more difficult than working with nucleic acids (RNA or DNA) for multiple reasons. Unlike nucleic acids, proteins cannot be amplified, and thus rarer proteins may not be easily found. Furthermore, proteins are often denatured during experiments, as small changes in working conditions may change their folded shape (tertiary and quaternary structure). Despite all the difficulties, the importance of proteins must be underlined. Proteins are responsible for the movement, adhesion, communication, metabolism, and reproduction of each living cell.
Due to the ubiquitous nature of proteins, the discipline of proteomics has a myriad of potential applications. One possibility is to identify new disease markers that can help in prevention, diagnosis, and even treatment of cancer. If scientists can identify which proteins are expressed in certain disease states, then novel drug therapy can be developed to target only the proteins responsible for a particular disease, sparing the healthy ones. This idea is already being employed by imatinib mesylate (Gleevec, Novartis) in Chronic Myeloid Leukemia (CML). In this disease, a novel protein is overexpressed. Imatinib's mechanism of action is to inhibit this novel protein and thereby impede the disease process. Just as imatinib has revolutionized CML therapy, scientists hope to find such treatment for all cancers.
Microarrays are the technology used to examine DNA, RNA, proteins, or even whole tissue samples to identify differences between cancer cells and healthy cells. A genomic array is a technique that examines a cancer cell's genome and compares it to a healthy genome. Unlike most normal cells, which have 46 chromosomes (two copies of each gene, one paternal and one maternal), cancer cells often have abnormal chromosomes, either having a single copy of every gene or having three or more copies of a particular gene.
Expression arrays look at the RNA produced by cancer cells. As mentioned earlier, not all genes are expressed. In other words, not all DNA is transcribed into RNA. By examining the RNA as opposed to the DNA, researchers can determine which genes are actively being expressed within the cell.
Methylation arrays were developed because it was noted that not all RNA is translated into proteins. Either the RNA message is unstable or it is mutated, or the gene itself is hyper-methylated and therefore turned off. The word methylation refers to the addition of a methyl group onto a gene, thereby rendering it inactive. Hyper-methylated genes are turned off and are not transcribed. This technique allows researchers to see which genes are expressed by examining the cancer cell's genome and looking for hyper-methylation.
Protein arrays, run by gel mass spectrometry, are able to look directly at the proteins produced by the cell of interest. Tissue arrays permit investigators to look at entire tissue blocks as opposed to sub-cellular components such as nucleic acids or proteins.
Each type of array is slightly different, but the basic principles are similar in that they look for differences between normal and cancerous cells. The simplest genomic arrays are made by fixing known genetic material onto a microarray chip, also called a gene chip or biochip. These are either commercially purchased or made by the individual investigator to suit his or her special interests. Thousands of genes can be examined at once. It is important that the investigator keep track of the genetic organization on this chip. An array is an orderly arrangement of genomic samples on a small flat area. Biochips can be as small as a postage stamp.
Microarrays provide a venue for matching the known DNA (healthy cells) to the unknown DNA (cancer cells) based on the principles of DNA base pair annealing/matching. DNA is extracted from a cancer cell and labeled with a fluorescent tag, green for example. DNA from a normal cell is also collected and tagged with a different fluorescent color, red for example. DNA from both cell types is allowed to mix and competitively bind to its complementary strand found on the gene chip. If a certain gene were overexpressed in the cancer cell, then the green-labeled DNA would outnumber the red. There would therefore be a greater likelihood that the green DNA, and not the red-tagged DNA, will bind to the gene chip. This overexpressed gene would outcompete the genes of the non-cancerous cell and bind to its complement on the chip, and that area on the chip would look more green. If the genes are not overexpressed by the cancer cell, then there should be an even mix of red and green on the gene chip. Computerized fluorescent scanners are then used to read the fluorescent colors on the chip. A predominance of green, in this example, indicates cancer cell overexpression relative to the normal cell. A predominance of red indicates the opposite.
By knowing the location of each gene on the biochip, the investigator knows exactly which gene is overexpressed in the cancer cell.
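The red-versus-green comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any particular scanner or software package: the gene names, the chip layout, the intensity values, and the two-fold cutoff are all hypothetical. The sketch compares the green (cancer) and red (normal) intensity at each spot using a log2 ratio, then uses the known position of each gene on the chip to report which gene the spot represents.

```python
import math

# Hypothetical chip layout: (row, column) position -> gene fixed at that spot.
CHIP_LAYOUT = {
    (0, 0): "GENE_A",
    (0, 1): "GENE_B",
    (0, 2): "GENE_C",
}


def classify_spots(green, red, layout, threshold=1.0):
    """Return {gene: (log2_ratio, call)} for every spot in the layout.

    green and red map a spot position to the scanned intensity of the
    cancer (green) and normal (red) sample; both are assumed positive.
    threshold is an assumed cutoff in log2 units (1.0 = two-fold).
    """
    results = {}
    for position, gene in layout.items():
        ratio = math.log2(green[position] / red[position])
        if ratio >= threshold:
            call = "more green (higher in cancer)"
        elif ratio <= -threshold:
            call = "more red (higher in normal)"
        else:
            call = "even mix"
        results[gene] = (ratio, call)
    return results


# Made-up scanner intensities, for illustration only.
green = {(0, 0): 4000.0, (0, 1): 1000.0, (0, 2): 250.0}
red = {(0, 0): 1000.0, (0, 1): 1000.0, (0, 2): 1000.0}

calls = classify_spots(green, red, CHIP_LAYOUT)
for gene, (ratio, call) in calls.items():
    print(f"{gene}: log2(green/red) = {ratio:+.2f} -> {call}")
```

A log ratio is used here so that a four-fold excess of green and a four-fold excess of red are equally far from zero (+2 and -2), which makes the two directions of change easy to compare.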
The past decade has been a period of unparalleled discovery in the fields of genomics and proteomics. Researchers in virtually every malignancy are rushing to define a genetic component to the risk of disease development, disease prognosis, therapy, and, of course, disease outcome. There has been a tremendous amount of data published in the field of proteomics as it pertains to prostate cancer (as presented at the American Association for Cancer Research meeting 2003, Washington, DC), lymphoma, lung cancer, and colon cancer, as well as breast and ovarian cancers.