Saturday, December 13, 2008

Haplotypes Animation

A haplotype is a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated. It is thought that these associations, together with the identification of a few alleles of a haplotype block, can unambiguously identify all other polymorphic sites in the same region. Such information is very valuable for investigating the genetics behind common diseases, and it is collected by the International HapMap Project.

An organism's genotype may not uniquely define its haplotype. For example, consider a diploid organism and two bi-allelic loci on the same chromosome, such as two SNPs. The first locus has alleles A and T, giving three possible genotypes: AA, AT, and TT. The second locus has alleles G and C, again giving three possible genotypes: GG, GC, and CC. For a given individual, there are therefore nine possible configurations of genotypes at these two loci, as shown in the Punnett square, which lists the genotypes that an individual may carry and the haplotypes that these resolve to. For individuals that are homozygous at one or both loci, it is clear what the haplotypes are; it is only when an individual is heterozygous at both loci that the phase is ambiguous.
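
The nine configurations described above can be enumerated in a few lines of Python. This is only an illustrative sketch: the function and variable names (`genotypes`, `resolutions`, `table`) are made up for this example, and the alleles are the ones used in the text.

```python
from itertools import product

# Alleles from the example in the text; all names here are hypothetical.
LOCUS1 = ("A", "T")
LOCUS2 = ("G", "C")

def genotypes(alleles):
    """Return the three unordered genotypes at one bi-allelic locus."""
    a, b = alleles
    return [(a, a), (a, b), (b, b)]

def resolutions(g1, g2):
    """Return every distinct unordered haplotype pair consistent with
    genotype g1 at the first locus and g2 at the second locus."""
    pairs = set()
    # Try both orientations of the second locus relative to the first.
    for second in (g2, g2[::-1]):
        hap_a = g1[0] + second[0]
        hap_b = g1[1] + second[1]
        pairs.add(tuple(sorted((hap_a, hap_b))))
    return sorted(pairs)

# Map each of the nine genotype configurations to its possible phasings.
table = {
    ("".join(g1), "".join(g2)): resolutions(g1, g2)
    for g1, g2 in product(genotypes(LOCUS1), genotypes(LOCUS2))
}

for (g1, g2), phases in table.items():
    label = "ambiguous" if len(phases) > 1 else "resolved"
    print(g1, g2, phases, label)
```

Only the AT/GC configuration ends up with two candidate haplotype pairs (AG with TC, or AC with TG); the other eight resolve to a single pair.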

Within the human genome, SNPs occur on average about once in every 1,000 base pairs. Researchers have shown that groups of SNPs occur in predictable patterns within sections of DNA.
These inherited sections of 68 SNPs are called haplotypes. Within one section of DNA, it is believed that there are only 3 to 5 different haplotypes throughout the entire population.
The only unequivocal method of resolving phase ambiguity is by sequencing. However, it is possible to estimate the probability of a particular haplotype when phase is ambiguous using a sample of individuals.
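
One common way to make such an estimate is an expectation-maximization (EM) procedure. The sketch below applies it to the two-locus A/T and G/C example, assuming that haplotypes pair at random (Hardy-Weinberg proportions). The function names, the fixed iteration count, and the genotype counts are all invented for illustration and are not real data.

```python
from collections import Counter

HAPLOTYPES = ("AG", "AC", "TG", "TC")

def compatible_pairs(g1, g2):
    """Unordered haplotype pairs consistent with a two-locus genotype."""
    pairs = set()
    for second in (g2, g2[::-1]):
        pairs.add(tuple(sorted((g1[0] + second[0], g1[1] + second[1]))))
    return sorted(pairs)

def em_haplotype_frequencies(genotype_counts, iterations=50):
    """Estimate haplotype frequencies from unphased genotype counts.

    Assumes random pairing of haplotypes (Hardy-Weinberg proportions).
    """
    freqs = {h: 1.0 / len(HAPLOTYPES) for h in HAPLOTYPES}
    total_chromosomes = 2 * sum(genotype_counts.values())
    for _ in range(iterations):
        expected = Counter()
        # E-step: split each genotype's count across its possible phasings
        # in proportion to their probability under the current estimate.
        for (g1, g2), n in genotype_counts.items():
            pairs = compatible_pairs(g1, g2)
            weights = [
                freqs[h1] * freqs[h2] * (1 if h1 == h2 else 2)
                for h1, h2 in pairs
            ]
            norm = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                share = n * w / norm
                expected[h1] += share
                expected[h2] += share
        # M-step: re-estimate frequencies from expected haplotype counts.
        freqs = {h: expected[h] / total_chromosomes for h in HAPLOTYPES}
    return freqs

# Made-up sample: 40 AA/GG, 40 TT/CC and 20 double heterozygotes.
sample = {("AA", "GG"): 40, ("TT", "CC"): 40, ("AT", "GC"): 20}
estimated = em_haplotype_frequencies(sample)
print(estimated)
```

With this made-up sample, the double heterozygotes are assigned almost entirely to the AG/TC phase, because those are the haplotypes carried by the homozygous individuals.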

Given the genotypes for a number of individuals, the haplotypes can be inferred by haplotype resolution or haplotype phasing techniques. These methods rely on the observation that certain haplotypes are common in certain genomic regions. Therefore, given a set of possible haplotype resolutions, these methods choose those that use fewer different haplotypes overall. The specifics of these methods vary: some are based on combinatorial approaches (e.g., parsimony), whereas others use likelihood functions based on different models and assumptions, such as the Hardy-Weinberg principle, the coalescent theory model, or perfect phylogeny. These models are combined with optimization algorithms such as the expectation-maximization (EM) algorithm or Markov chain Monte Carlo (MCMC).
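
The idea of preferring resolutions that use fewer different haplotypes can be shown with a brute-force parsimony sketch. It is a toy version for the two-locus example only: it tries every combination of phasings, which would not scale to real data, and the function names and the three sample individuals are hypothetical.

```python
from itertools import product

def compatible_pairs(g1, g2):
    """Unordered haplotype pairs consistent with a two-locus genotype."""
    pairs = set()
    for second in (g2, g2[::-1]):
        pairs.add(tuple(sorted((g1[0] + second[0], g1[1] + second[1]))))
    return sorted(pairs)

def parsimony_phase(individuals):
    """Pick one phasing per individual so that the whole sample uses as
    few distinct haplotypes as possible (exhaustive search)."""
    options = [compatible_pairs(g1, g2) for g1, g2 in individuals]
    best_choice, best_haps = None, None
    for choice in product(*options):
        used = {h for pair in choice for h in pair}
        # Keep the first combination with the smallest haplotype set.
        if best_haps is None or len(used) < len(best_haps):
            best_choice, best_haps = choice, used
    return list(best_choice), best_haps

# Made-up sample: two homozygous individuals and one double heterozygote.
individuals = [("AA", "GG"), ("TT", "CC"), ("AT", "GC")]
phasing, haplotypes_used = parsimony_phase(individuals)
print(phasing)
print(sorted(haplotypes_used))
```

Here the double heterozygote is phased as AG with TC, since that choice reuses the two haplotypes already required by the homozygous individuals instead of introducing AC and TG.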
