The Human Epigenome Project Biology

Essay added: 13-11-2017, 14:34

The Human Genome Project provided a map of the human genome. However, it did not reveal how the genome is packaged into chromatin. This packaging matters because it dictates which genes are expressed, and when, during the course of development. The study of this packaging is, in essence, the study of the epigenome. The Human Epigenome Project was therefore launched to provide a better understanding of the human epigenome. Epigenetic processes are increasingly recognised as modulators of the phenotype.

The human epigenome project sets out the relationships that exist between the major epigenetic players; these relationships are collectively called the 'epigenetic code'. It explains the concept of methylation and provides comprehensive DNA methylation maps, which are collectively called the 'methylome'. It also provides an understanding of other epigenetic mechanisms such as histone modifications.

The aim of the human epigenome project was to identify the chemical changes and relationships that exist between chromatin constituents that provide function to the DNA code. This allows us to understand the physiology of normal development, aging, abnormal gene control in cancer and other diseases as well as environmental health.


Genomic imprinting is an epigenetic phenomenon by which epigenetic chromosomal modifications drive differential gene expression according to the parent of origin. There are two alleles inherited from the parents. Usually, both these alleles are expressed. However, in the case of imprinted genes, only one allele is expressed. The expression of the gene is entirely according to the parent of origin. Expression can be due to an allele inherited from the mother (as in H19 and CDKN1C genes where the paternal allele is imprinted) or it is because of an allele inherited from the father (such as the IGF2 gene where the maternal allele is imprinted). This inheritance does not follow classical Mendelian genetics. Usually, imprinted genes are involved in a particular stage of development.
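The parent-of-origin rule described above can be sketched in a few lines of Python. This is only an illustrative model: the table `EXPRESSED_PARENT` encodes the three genes named in the text, while the function name `expressed_alleles`, the placeholder gene `SOME_GENE`, and the allele labels are hypothetical.

```python
# Which parental allele is expressed for the imprinted genes named in the text.
# (H19 and CDKN1C: maternal allele expressed; IGF2: paternal allele expressed.)
EXPRESSED_PARENT = {"H19": "maternal", "CDKN1C": "maternal", "IGF2": "paternal"}


def expressed_alleles(gene, maternal_allele, paternal_allele):
    """Return the list of alleles that are actually expressed for a gene."""
    parent = EXPRESSED_PARENT.get(gene)
    if parent is None:
        # Not imprinted in this toy table: both alleles are expressed.
        return [maternal_allele, paternal_allele]
    if parent == "maternal":
        return [maternal_allele]
    return [paternal_allele]


# Reciprocal inheritance of the same mutant IGF2 allele gives different outcomes,
# which is why imprinted genes do not follow classical Mendelian expectations.
from_mother = expressed_alleles("IGF2", maternal_allele="mutant", paternal_allele="normal")
from_father = expressed_alleles("IGF2", maternal_allele="normal", paternal_allele="mutant")
not_imprinted = expressed_alleles("SOME_GENE", "mutant", "normal")  # hypothetical gene
```

In this sketch a mutant IGF2 allele is silent when inherited from the mother but is the only expressed copy when inherited from the father, whereas a non-imprinted gene expresses both copies regardless of origin.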

Imprinting is essentially a dynamic process. The profile of the imprinted genes varies during development. One of the main mechanisms involved in the control of imprinting is DNA methylation. Histone modifications can also play a role in imprinting. Imprinted genes tend to occur in clusters, and the genes in a cluster are controlled by common regulatory elements. The regulatory elements may be noncoding RNAs or Differentially Methylated Regions (DMRs). Differentially methylated regions are segments of DNA rich in cytosine and guanine, with the cytosine nucleotides methylated on one parental copy but not on the other. The regions in which these regulatory elements are clustered are called 'Imprinting Control Regions' or ICRs. Any change in the methylation pattern of an ICR can lead to a loss of imprinting and abnormal expression of the parental gene.

<H2>Imprinted Genes and Human Genetic Diseases

Expression of imprinted genes is essentially monoallelic. Only one copy of the gene is active, and that copy is inherited from one parent. A defect in that copy therefore has no functional backup, and the resulting situation resembles that of a recessive mutation.

Prader-Willi syndrome (PWS) is a complex genetic condition. It is believed to be the most common genetically identified cause of life-threatening obesity in humans. Approximately 350,000 to 400,000 people are affected worldwide. It is seen in approximately one in 10,000 to 20,000 live births. It is seen in all races and ethnic groups but is more common in Caucasians. The clinical symptoms include:

Obesity including early childhood obesity

Short stature

Small hands and feet

Growth hormone deficiency

Mental retardation and behavioral problems

Infantile hypotonia


The patients have a characteristic facial appearance with a narrow bifrontal diameter, short upturned nose, triangular mouth, almond-shaped eyes, and oral findings (sticky saliva, enamel hypoplasia).

The origins of this complex genetic syndrome were elucidated by Butler and Palmer in 1983. They reported that the deletion in chromosome 15 was a de novo event, and that the deleted chromosome was always inherited from the father. In about 70% of cases of Prader-Willi syndrome, the 15q11-q13 deletion occurs in the chromosome inherited from the father. In about 25% of cases, the patients have maternal disomy: both copies of chromosome 15 are inherited from the mother, so the paternal chromosome 15 is completely absent. In the remaining 5% of cases, there is a defect in the imprinting centre controlling the activity of the genes on chromosome 15.

In this last 5% of cases, the defect lies in the ICR described earlier; it changes the methylation pattern of the region and leads to loss of imprinting. Several paternally expressed genes lie in this region, so it is difficult to pinpoint one gene as the cause of all the problems.

Angelman syndrome (AS) is another disorder of chromosome 15, but it has an entirely different clinical presentation. Its prevalence is not precisely known. The clinical features are as follows:

Seizures, severe mental retardation, ataxia and jerky hand movements.


Inappropriate laughter and lack of speech


Maxillary hypoplasia, a large mouth with a protruding tongue.

A prominent nose and widely spaced teeth.

In contrast to Prader-Willi syndrome, it is the maternal copy of chromosome 15 that carries the 15q11-q13 deletion in Angelman syndrome. As mentioned earlier, the exact gene involved in the pathogenesis of Prader-Willi syndrome is not known, since several paternally expressed genes lie at that locus. The gene involved in the pathogenesis of Angelman syndrome, however, is known. Angelman syndrome results from loss of the maternal copy of a single imprinted gene, UBE3A, a ubiquitin ligase gene involved in early brain development.


Some genes are constitutive genes and are expressed all the time. However, there are some genes which are expressed only at certain times. These genes can be expressed only in specific tissues (spatial expression) or they may be expressed at specific times (temporal expression). The most widely studied epigenetic modification is the cytosine methylation of DNA within the CpG dinucleotide.

The CpG dinucleotide is the sequence 5'-CG-3'. The "p" in CpG refers to the phosphodiester bond between the cytosine and the guanine. The notation indicates that the cytosine and guanine are next to each other on the same nucleotide strand, whether the DNA is single or double stranded. During evolution, the dinucleotide CpG has been progressively eliminated from the genome of higher eukaryotes and is present at only 5% to 10% of its predicted frequency. The exceptions are short stretches of DNA in which CpG remains frequent, known as CpG islands. In the genome, the size of these CpG islands varies between 0.5 and 5 kb. They occur on average once every 100 kb. CpG islands are usually found in the promoter region of genes and are involved in turning gene expression on and off. Chromatin containing CpG islands is generally heavily acetylated, lacks histone H1, and includes a nucleosome-free region. This essentially means that the chromatin is in an open state: the DNA is not tightly bound and is therefore available for transcription.
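The "5% to 10% of its predicted frequency" figure above is an observed/expected ratio. A minimal sketch of how such a ratio can be computed for a DNA sequence follows. The function name `cpg_stats`, the toy sequences, and the choice of expected-count formula (number of C times number of G, divided by sequence length, a widely used convention that is not stated in this essay) are assumptions made for illustration.

```python
def cpg_stats(seq):
    """Return length, GC fraction, CpG count and CpG observed/expected ratio."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G occurred independently (assumed convention).
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "cpg_count": cpg, "obs_exp": obs_exp}


# Two made-up 30-nucleotide sequences: one CpG-rich, one with no CpG at all.
island_like = "CGCGGCGCTACGCGCGTTCGCGACGCGCCG"
depleted = "ATTGCATTTACCATGGTTAACATGCAATTG"

island_stats = cpg_stats(island_like)
depleted_stats = cpg_stats(depleted)
```

A ratio near 1 means CpG occurs about as often as the base composition predicts; bulk vertebrate DNA falls well below that, while CpG islands stay close to it.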

Approximately half of all genes in mouse and humans (i.e., 40,000 to 50,000 genes) contain CpG islands. These are mainly housekeeping genes. Housekeeping genes are genes which are required for the basic maintenance of cell function. They are expressed in all cells of an organism under normal or pathological conditions. However, approximately 40% of genes with a tissue-restricted pattern of expression also contain CpG islands. Usually methylation is inversely correlated with the transcriptional status of the genes, i.e. if the gene is methylated, it is not expressed and vice versa.

The enzymes that transfer methyl groups to the cytosine ring are called cytosine 5-methyltransferases, or DNA methyltransferases (DNA-MTases, or DNMTs). There are currently three known catalytically active DNMTs (DNMT1, DNMT3a and DNMT3b), and each one appears to play a distinct and critical role in the cell. Three possible mechanisms have been proposed to account for transcriptional repression by DNA methylation. These mechanisms are as follows:

The first mechanism is direct interference with the binding of specific transcription factors to their recognition sites in their respective promoters. The transcription of a gene begins when a transcription factor binds to the promoter region of the gene. Several transcription factors, including AP-2, cMyc/Myn, E2F and NFκB, are likely to bind to sequences in the CpG islands, and their binding has been shown to be inhibited by methylation.

The second mechanism is the direct binding of specific transcriptional repressors to methylated DNA. Two such factors are MeCP-1 and MeCP-2 (methyl cytosine binding proteins 1 and 2). These factors bind to methylated CpG sites and cause the genes to be repressed.

The third mechanism is the alteration of chromatin structure. Experiments show that methylation inhibits transcription only after chromatin is assembled. Once chromatin has assumed its inactive state after DNA methylation, the repression cannot be counteracted even by strong transcriptional activators. Therefore, methylation stabilizes the inactive state. In addition, it also prevents activation by blocking access of transcription factors to the promoter.

It is important to realise that methylation turns off genes. Methylation of the CpG islands serves as a locking mechanism that may follow or precede other events that turn a gene on or off. Once the methylation mechanism is in place, it can prevent activation even if the nuclear environment is optimum for transcription.

<H2>DNA demethylation during development and tissue specific differentiation

After implantation, most of the genomic DNA is in the methylated state. Tissue-specific genes then undergo demethylation in the tissues in which they are expressed. This essentially means that some genes can be expressed whereas others are repressed. This allows the body to develop step by step, which accounts for the ordered structure of its tissues. If this system of methylation did not exist, tissues would develop randomly and the human body would never reach its proper form.

<H2>DNA methylation in cancer

A role for DNA methylation in oncogenesis has been hypothesized for many years. Numerous studies have suggested aberrations in DNA methyltransferase activity in tumor cells. Neoplastic cells may show hypermethylation of tumour suppressor genes or hypomethylation of oncogenes. The former represses tumour suppressor genes and the latter activates oncogenes; either can contribute to the development of cancer.

<H3>DNA hypomethylation in cancer

DNA may show hypomethylation in cancer. A decreased level of overall genomic methylation is a common finding in tumorigenesis. This decrease in global methylation appears to begin early, well before frank tumor formation. Specific oncogenes are also hypomethylated, which increases their expression and promotes the development of cancer. A good inverse correlation between methylation and gene expression was observed in the antiapoptotic bcl-2 gene in B-cell chronic lymphocytic leukemia and the k-ras proto-oncogene in lung and colon carcinomas.

<H3>Hypermethylation of tumor-suppressor genes

An additional means of inactivating tumour suppressor genes is by hypermethylation of the promoter sequences of the tumour suppressor genes in cancer. The retinoblastoma gene (Rb) was the first classic tumor-suppressor gene in which CpG island hypermethylation was detected.

<H2>Clinical and therapeutic implications of DNA methylation

In recent years, several attempts have been made to use methylation in a therapeutic scenario. The vertebrate globin genes were among the targets for clinical intervention based on drugs that affect methylation. Treatment with 5-azacytidine has been attempted. This drug is an irreversible inhibitor of DNA methyltransferase and therefore inhibits methylation. Since methylation is inhibited, genes which were previously silenced can now be expressed. Among these genes is the fetal γ globin gene. 5-azacytidine can thus increase the expression of the γ globin gene, which can correct the imbalance between the α chains and the non-α chains in thalassemias. Unfortunately, 5-azacytidine is mutagenic. Because of its mutagenicity, and because other S-phase active cytotoxic agents that do not inhibit DNA methylation were observed to induce a similar increase in γ globin gene expression, 5-azacytidine has not been widely used for this application. This points to the limitations of agents that cause global DNA demethylation.

Recent advances in the understanding of altered DNA methylation in cancer also have potential clinical implications. Because methylation of many of the genes involved may represent a process specific to neoplastic cells, it may be possible to detect micrometastases by looking for the presence of methylated genes.


Histones form the protein backbone of chromatin and are an important component of epigenetics. They act as important translators between genotype and phenotype and are known to have a dynamic function. Compared with DNA methylation, however, histones have received less study.

In eukaryotic cells, DNA and histone proteins form chromatin, and it is in this context that transcription takes place. The basic unit of chromatin is the nucleosome, which consists of an octamer of two molecules of each of the four core histones (H2A, H2B, H3 and H4), around which is wrapped 147 bp of DNA. Histones help package DNA so that it can be contained in the nucleus. In addition, they may also perform important functions in gene regulation.

The core histones are highly conserved basic proteins with globular domains. Conserved means that their sequences are similar or identical across different species. DNA is wrapped around these globular domains. The histones also contain a relatively unstructured flexible tail which protrudes from the nucleosome. These tails are subject to a variety of post-translational modifications (PTMs) such as methylation, acetylation and phosphorylation. Other changes which can take place in the tail are ubiquitination, sumoylation, ADP ribosylation and deimination, and the non-covalent proline isomerization that occurs in histone H3. Most histone PTMs are dynamic and are regulated by families of enzymes that promote or reverse the modifications.

How do histones influence transcription? The histones influence the higher order chromatin structure. They do this by affecting contacts between different histones and between histones and DNA. Specific histone modifications divide the genome into two parts: the transcriptionally silent heterochromatin and the transcriptionally active euchromatin. Thus, these histone-histone and histone-DNA interactions decide whether a gene is transcriptionally active or inactive. They regulate nuclear processes like replication, transcription, DNA repair and chromosome condensation. The most common, and perhaps the best studied, histone modifications are acetylation and methylation. After DNA methylation, they are the best characterized epigenetic markers. Methylation at certain histone residues (H3K4, H3K36 or H3K79) results in an open chromatin configuration and is, therefore, characteristic of euchromatin. Acetylation mediated by histone acetyl transferase (HAT) also results in an open chromatin pattern or euchromatin. In contrast, histone deacetylases remove the acetyl groups and cause transcriptional repression.

An analogy of the relationship between DNA and histones can be found in any 'C' grade movie. The histones are akin to the big brother and their job is to protect the DNA or the younger sister. Histones allow access to the DNA only under certain circumstances and prevent access under a different set of circumstances. Since these changes are independent of the genetic code, they come under the ambit of epigenetic changes.

Essentially, three general principles are thought to be involved in histone modifications and gene expression. These principles are:

PTMs directly affect the structure of chromatin, regulating its higher order conformation and thus acting in cis to regulate transcription. The word 'cis' means 'on the same side of'. A cis effect therefore means that the PTM regulates the activity of the DNA on the same chromosome.

PTMs disrupt the binding of proteins that associate with chromatin (trans effect). 'Trans' is the opposite of 'cis' and means that the action is on a different chromosome. Essentially, this means that the PTMs cause a change in the binding of various transcription factors to the chromatin.

PTMs attract certain effector proteins to the chromatin (trans effect). Similar to what has been elaborated earlier, the effector proteins bind to the promoter regions and this regulates transcription.


MicroRNAs (miRNAs) were discovered in the early 1990s by Victor Ambros and colleagues. They found that miRNAs act as gene regulators. Gene hunters at that time were mainly interested in long mRNA molecules, because these were the ones translated into proteins. Small RNA fragments such as the microRNAs were disregarded, since at that time it was believed that they did not have any function. This has now been proved wrong.

MicroRNAs are approximately 22 nucleotides in length. They are single stranded and they inhibit the expression of specific mRNA targets. They do this by binding to sequences usually located in the 3' untranslated regions or UTRs. The portion of miRNA which binds to the 3'UTR is called the 'seed region'. The human genome is believed to code for up to 1000 miRNAs.
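The seed-based targeting described above can be sketched as a simple string search. In this sketch the seed is taken to be nucleotides 2 to 8 of the miRNA, a commonly used definition that is not stated in this essay and should be read as an assumption. The function names `reverse_complement` and `seed_sites` and both RNA sequences are made up for illustration, and real target prediction considers far more than a perfect seed match.

```python
from typing import List

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, start: int = 2, end: int = 8) -> List[int]:
    """Return 0-based positions in the UTR that pair perfectly with the miRNA seed.

    start and end are 1-based, inclusive miRNA positions (assumed default: 2 to 8).
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)  # sequence the seed would base-pair with
    positions = []
    pos = utr.find(site)
    while pos != -1:
        positions.append(pos)
        pos = utr.find(site, pos + 1)  # continue searching, allowing overlaps
    return positions


# Hypothetical 22-nucleotide miRNA and a short hypothetical 3' UTR fragment.
toy_mirna = "UACGGAUCCAGUUGCAAGCUAA"
toy_utr = "AAGAUCCGUUUCCGAUCCGUAA"
sites = seed_sites(toy_mirna, toy_utr)
```

Here the seed of the toy miRNA is `ACGGAUC`, so the search looks for its reverse complement `GAUCCGU` in the UTR and reports every position where it occurs.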

miRNA coding sequences can be found in the introns or exons of a protein-coding gene. They can also be found in intergenic regions. Several miRNA genes can be clustered along the genome and may share the same promoter, or they can be present individually. miRNA genes are transcribed into long non-coding RNA strands called primary miRNA transcripts. The primary miRNA is then processed and exported across the nuclear membrane.
