Monday, November 14, 2011

Some achievements of the MDL project

Looking back at the project's goals posited 6 months ago, we would like to recapitulate the most important ones in order to analyze how far they have been achieved:

1. Performing a comprehensive Plink analysis, including the estimation of runs of homozygosity (ROH; shared clusters and groups of homozygosity), possible Mendelian errors, extended LD haplotypes (based on r² values), shared IBD segments and an IBS matrix (Plink format).

Although we performed all of the described types of Plink analyses and even shared some results on the project's blog, we didn't consider these results worthy of extensive coverage. Likewise, the project's members showed no interest in those analyses.
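
For readers unfamiliar with the IBS matrix mentioned above, here is a minimal sketch of how a pairwise IBS similarity can be computed from genotype calls. It is not Plink's code: the function names, the sample labels and the toy genotypes are all made up for illustration, and genotypes are assumed to be coded as minor-allele counts (0, 1 or 2). The value returned, (IBS2 + 0.5*IBS1) / N, should correspond to what Plink reports as the proportion of alleles shared IBS.

```python
def ibs_similarity(g1, g2):
    """Proportion of alleles shared identical-by-state between two individuals.

    g1, g2: sequences of allele counts (0, 1 or 2); None marks a missing call.
    """
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs where either individual has no call
        shared += 2 - abs(a - b)  # IBS state: 2, 1 or 0 alleles shared
        compared += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both individuals")
    return shared / (2 * compared)


def ibs_matrix(genotypes):
    """Build a symmetric IBS similarity matrix from a {name: genotype list} dict."""
    names = sorted(genotypes)
    matrix = [[ibs_similarity(genotypes[a], genotypes[b]) for b in names]
              for a in names]
    return names, matrix


# Toy data (hypothetical samples, not project participants).
toy = {
    "sample_a": [0, 1, 2, 2],
    "sample_b": [0, 1, 2, 2],
    "sample_c": [2, 1, 0, None],
}
names, matrix = ibs_matrix(toy)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{v:.3f}" for v in row))
```

A matrix of this kind (or 1 minus it, as a distance) is what is typically fed into clustering or MDS.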

Experiments with relatedness
Graphoanalytical approach to visualizing relatedness
IBD sharing
IBS similarity matrix in R

2. Phasing the genotype files, i.e. establishing the haploid phase (this is a separate analysis that requires your parents' genotypes, so it will not be performed on a regular basis) (Beagle or Merlin output format).

We performed ad hoc phasing of the genotypes in our project (MDLP) and, in order to assess possible discrepancies between phased and unphased data, we performed ADMIXTURE analysis (with K=4 assumed clusters) separately for the original unphased dataset and the BEAGLE-phased dataset.
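
One way to quantify such discrepancies is to compare the two ADMIXTURE .Q outputs row by row. The sketch below is only an illustration of that idea, not the procedure we actually used: the function names and the toy numbers are invented. It assumes both runs list the same individuals in the same order, and it tries every column permutation because ADMIXTURE does not guarantee that clusters come out in the same order in two separate runs (feasible for K=4, which has only 24 permutations).

```python
from itertools import permutations


def parse_q(text):
    """Parse the contents of a .Q file: one row per individual, K proportions."""
    return [[float(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]


def align_and_compare(q_ref, q_other):
    """Match the clusters of q_other to those of q_ref and measure the differences.

    Returns (perm, diffs): perm[i] is the column of q_other matched to column i
    of q_ref; diffs holds each individual's largest absolute difference.
    """
    if len(q_ref) != len(q_other):
        raise ValueError("both runs must contain the same individuals")
    k = len(q_ref[0])
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        cost = sum(abs(r[i] - o[perm[i]])
                   for r, o in zip(q_ref, q_other) for i in range(k))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    diffs = [max(abs(r[i] - o[best_perm[i]]) for i in range(k))
             for r, o in zip(q_ref, q_other)]
    return best_perm, diffs


# Toy K=4 outputs (made-up numbers, not project results).
unphased = parse_q("0.70 0.20 0.10 0.00\n0.10 0.10 0.30 0.50\n")
phased = parse_q("0.22 0.68 0.00 0.10\n0.10 0.10 0.50 0.30\n")
perm, diffs = align_and_compare(unphased, phased)
print("cluster mapping:", perm)
print("largest difference per individual:", [round(d, 3) for d in diffs])
```

Small per-individual differences after alignment would suggest that phasing did not materially change the admixture estimates.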

Analyzing admixture in phased v.unphased dataset

3. Using AISconvert (based on HIRsearch) and Germline software to detect IBD segments.

Used only occasionally, in combination with other analyses.

Analyzing admixture in phased v.unphased dataset 
Grapho-analytical approach to the visualisation of IBD shared segments
IBD sharing


4. Using ADMIXTURE/STRUCTURE software for detecting admixture clusters and calculating allele frequencies.

We performed plenty of ADMIXTURE and STRUCTURE runs (using different a priori numbers of assumed clusters under different admixture models). Discussions of the ADMIXTURE results made up the most significant part of the MDLP blog.
The allele frequencies estimated in the K=7 ADMIXTURE run were provided for creating a custom modification of the DIYDodecad calculator (MDLP).
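
To give an idea of why fixed allele frequencies are enough to build a calculator, here is a minimal sketch of estimating one person's admixture proportions by EM when the cluster frequencies are held constant. We make no claim that DIYDodecad works this way internally; the function name, the three toy clusters (instead of K=7) and all the numbers are assumptions made purely for illustration.

```python
def estimate_ancestry(genotype, freqs, iterations=200):
    """EM estimate of admixture proportions for a single individual.

    genotype: counts (0, 1 or 2) of the reference allele per SNP; None = no call
    freqs:    freqs[j][k] = frequency of that allele at SNP j in cluster k
    """
    k = len(freqs[0])
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(iterations):
        new_q = [0.0] * k
        used = 0
        for g, row in zip(genotype, freqs):
            if g is None:
                continue
            used += 1
            # Clamp frequencies away from 0 and 1 to avoid division by zero.
            p = [min(max(x, 1e-6), 1 - 1e-6) for x in row]
            mix1 = sum(q[m] * p[m] for m in range(k))        # P(reference allele)
            mix0 = sum(q[m] * (1 - p[m]) for m in range(k))  # P(other allele)
            for c in range(k):
                # Expected number of this person's alleles drawn from cluster c.
                if g > 0:
                    new_q[c] += g * q[c] * p[c] / mix1
                if g < 2:
                    new_q[c] += (2 - g) * q[c] * (1 - p[c]) / mix0
        if used == 0:
            raise ValueError("no genotyped SNPs")
        q = [v / (2 * used) for v in new_q]  # each SNP carries two alleles
    return q


# Toy frequencies for 3 hypothetical clusters at 6 SNPs.
toy_freqs = [
    [0.9, 0.1, 0.5],
    [0.8, 0.2, 0.5],
    [0.1, 0.9, 0.5],
    [0.2, 0.8, 0.5],
    [0.9, 0.1, 0.5],
    [0.1, 0.9, 0.5],
]
toy_genotype = [2, 2, 0, 0, 2, 0]  # looks like the first cluster
q = estimate_ancestry(toy_genotype, toy_freqs)
print([round(v, 3) for v in q])
```

With real data the same update would run over many thousands of SNPs, and the proportions would rarely pile up on a single cluster as they do in this toy case.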

Analyzing admixture in phased v.unphased dataset
First results: Admixture unsupervised run
Admixture analysis: sorted after Baltic-Slavic component
Admixture results: Baltic-Slavic
Admixture analysis: the rest of groupings
The output of the PLINK and ADMIXTURE algorithms
Admixture clusters, Mclust and populations concordance
DIYDodecad calculator v2.0 for my BGA project (MDLP).
Root Means Square Comparison Excel 2007 Macro Enabled XLSM spreadsheet for the Magnus Ducatus Lithuaniae Project data

and many more...

5. Creating MDS and PCA plots
PCA plots (Eigensoft)

PCA plots for reference populations and project participants
MDS and PCA plots: for V157-V247
A close-up on "the core" of the MDL project

6. Creating RHHmapper schemes showing the location of rare heterozygous and homozygous genotypes

RHH mapper: results for V158-V165 and V201-V202
