Friday, December 3, 2010

A Note About ADMIXTURE Autosomal Components and Y-Haplogroups

It is tempting to infer a relationship between the components that pop out of ADMIXTURE and Y-haplogroups.

I have deliberately avoided associating haplogroups with components. It is a great temptation to say that such-and-such component looks like it is associated with this or that haplogroup. However, ADMIXTURE components are calculated across the genome, while Y-haplogroups trace the single father-to-son line going back for eons.

Geographically speaking, the genome is very averaged and cumbersome. Its autosomal signature depends on populations establishing a certain degree of isolation over time. The Y-haplogroup is more agile and is not subject to the need for isolation. In fact, a Y-haplogroup can appear in more than one ADMIXTURE component. Additionally, different branches of the phylogenetic tree for a Y-haplogroup may be split across different ADMIXTURE components.
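To make the point about one haplogroup showing up in several components concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the haplogroup labels ("Hg-A", "Hg-B"), the component names and all of the fractions are invented, and the helper `mean_components_by_haplogroup` is hypothetical. It is not real data and not part of ADMIXTURE.

```python
from collections import defaultdict

# Hypothetical toy data: one entry per male sample, giving his Y-haplogroup
# and his autosomal component fractions. All values are invented.
samples = [
    ("Hg-A", {"Component 1": 0.7, "Component 2": 0.3}),
    ("Hg-A", {"Component 1": 0.2, "Component 2": 0.8}),
    ("Hg-B", {"Component 3": 0.6, "Component 1": 0.4}),
    ("Hg-B", {"Component 3": 0.9, "Component 1": 0.1}),
]


def mean_components_by_haplogroup(samples):
    """Average the autosomal component fractions of the carriers of each haplogroup."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for haplogroup, fractions in samples:
        counts[haplogroup] += 1
        for component, fraction in fractions.items():
            totals[haplogroup][component] += fraction
    return {
        haplogroup: {c: total / counts[haplogroup] for c, total in comps.items()}
        for haplogroup, comps in totals.items()
    }


profile = mean_components_by_haplogroup(samples)
for haplogroup, comps in profile.items():
    print(haplogroup, {c: round(v, 2) for c, v in comps.items()})
```

In this toy set, carriers of "Hg-A" are spread across two components and "Component 1" is shared by both haplogroups, so neither label maps cleanly onto the other.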

So, without a clear map of correlation, I will hesitate to infer that y-haplogroup such and such belongs to a particular Admixture component.

The same is true with mtDNA.

Thanks.  Have a nice weekend.

6 comments:

  1. Several of you on DNA Forums (http://dna-forums.org) have asked me questions about y-DNA results and ADMIXTURE components.

    I'm copying parts of the discussion into comments in this thread, for general reference.

  2. "What are your thoughts on the Y-DNA haplogroups of these early Fertile Crescent farmers? Have you ever thought of doing a post about that?"

    From the Haak paper's ancient DNA analysis, we can be pretty sure that the West Asian component contained G2a3 and F*.

    R1b-M269 is a clear candidate for the South European component, but West Asians may also have it at low frequency. Other branches of R1b may be common in the West Asian and Southwest Asian components.

    R-V88 looks to be a likely candidate for the Southwest Asian component. I have to get the map from the Myres paper to be sure. It's hidden in an appendix that I will have to pay to get access to. I intend to have a look at the Myres R-V88 map before I post on the likelihood of R1b in the various components.

    There's a new paper on J:
    http://www.nature.com/ejhg/journal/v18/n3/full/ejhg2009166a.html

  3. A reader comments:

    "Hi marnie. Sounds interesting. If you do examine Fertile Crescent Y-DNA, one word of caution. Do not rely on Behar's Armenian frequencies. There was an error in reporting, not to mention hiccups in sampling. The Armenian DNA Project, in my opinion, is the absolute most reliable source for Armenian data. I ordinarily refer to the frequencies in the Abu-Amero Saudi paper for most populations. Y-DNA data on Assyrians has never been published. Roy King studied Assyrian haplogroup J men, in the study you referred to above. That is the extent of it. I have compiled, what I believe, is a generally reliable distribution of Assyrian Y-DNA. For Iraqi Jews and Iraqi Kurdish Muslims, the Nebel et al paper of a few years ago is a great resource. "

  4. Here's what the Chiaroni, King et al paper (http://www.nature.co...hg2009166a.html) states in the abstract:

    Haplogroup J1 is a prevalent Y-chromosome lineage within the Near East. We report the frequency and YSTR diversity data for its major sub-clade (J1e). The overall expansion time estimated from 453 chromosomes is 10 000 years. Moreover, the previously described J1 (DYS388=13) chromosomes, frequently found in the Caucasus and eastern Anatolian populations, were ancestral to J1e and displayed an expansion time of 9000 years.

    So, according to this paper, the J Y-DNA haplogroup likely originates on the eastern part of the Taurus-Zagros arc and in the area between the eastern Taurus-Zagros arc and the Caucasus. That would superimpose it on the area of the ADMIXTURE "West Asian" component.

    The paper goes on to say that J1e appears to originate on the Taurus-Zagros arc but has its highest density on the Arabian Peninsula. J1e therefore overlaps with the Southwest Asian component, which is centered on the Arabian Peninsula.

  5. A reader comments:

    "From the soon to be published Al-Zahery Marsh Arab paper. Suggesting, consistent with the bit you have quoted above, a northern Fertile Crescent origin for both J1c3 and J1* (w/ DYS388=13):

    'Interestingly, when the two M267 subclades, J1-M267* and J1e are considered, differential frequency trends emerge. The less represented J1-M267* primarily diffuses towards North East Mesopotamia and shows its maximum frequency in the northern area (Assyrian). In contrast, the most frequent J1e accounts for almost all the J1 distribution in South West Mesopotamia, reaching its highest value in the Marshes. By considering the STR variance associated to the two different subsets of J1 chromosomes, the highest variance for both J1-M267* and J1e is registered in the northern Mesopotamia area. . . . The lower variance value (0,118) registered in the Marshes Arabs is in agreement with a recent expansion event which, itself, clearly emerges from the network analysis. The presence of Y chromosomes belonging to the M267* paragroup suggests a long persistence of this haplogroup in the Mesopotamia Marsh area.'
    "

  6. A reader, from above: "If you do examine Fertile Crescent Y-DNA, one word of caution. Do not rely on Behar's Armenian frequencies. There was an error in reporting, not to mention hiccups in sampling. The Armenian DNA Project, in my opinion, is the absolute most reliable source for Armenian data."

    Marnie: OK. Thank you for the word of warning.

    A reader (from above): "I ordinarily refer to the frequencies in the Abu-Amero Saudi paper for most populations. Y-DNA data on Assyrians has never been published. Roy King studied Assyrian haplogroup J men, in the study you referred to above. That is the extent of it. I have compiled, what I believe, is a generally reliable distribution of Assyrian Y-DNA. For Iraqi Jews and Iraqi Kurdish Muslims, the Nebel et al paper of a few years ago is a great resource."

    Marnie:

    I will look at the Abu-Amero Saudi paper and the Nebel paper. To be honest, I'm hesitant to head off into the land of y-DNA interpretation.

    The autosomal DNA results are "blunt" in their ability to pinpoint origin, but are quite effective on a macro scale. The data I have to work with are the ADMIXTURE autosomal results. Because Y-DNA is not the whole story of a person's or population's genetic makeup, Y-DNA results will likely not be linear in their representation of the relationship between populations. In order for me to be able to add and subtract population components, as I did in the "Syria to Assyria" post, I need a linear relationship for the data that represents populations.

    Y-DNA results are useful as a talisman of population flow, but they are not by any means the whole story. Not only is there the female side of the genetic story, there is also the lost Y-DNA of men who had daughters instead of sons, and who therefore passed on their 22 autosomes and an X chromosome, but no Y chromosome, to future generations.

    I appreciate the above references and I will be using Y-DNA results from published papers to assist in understanding general population flow. However, autosomal ADMIXTURE results are not precise at picking out fine detail in closely spaced populations, such as the differences between Armenians, Assyrians and Assyrian Jews. That's one of the reasons I've avoided moving further with a discussion about the origin of these populations. ADMIXTURE autosomal results are not up to that job.

    Dienekes' (www.dienekes.blogspot.com) "Galore" results using MCLUST are better at creating specific clusters from autosomal results. They can carve through the data to tell you that Assyrians are different from Armenians. However, I suspect that MCLUST unevenly weights the SNPs it is working with and therefore does not produce a linear result. You have precision and can clearly separate the two populations, but you can't tell how they are related or how they differ.

    Because ADMIXTURE gives a linear autosomal result, I'm trying to focus on extracting everything I can from it. I will use Y-DNA results in a limited way, as a guide for population flow.

    By using this separate autosomal path to understanding population flow, not leaning too heavily on y-DNA results, I will test some of the migration theories that have sprung out of y-DNA analysis.

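The point in the last comment about needing a "linear relationship" in order to add and subtract population components can be sketched with toy numbers. This is only an illustration under stated assumptions: each population is treated as a list of component proportions that sum to 1, the two populations and their values are invented, and the helpers `mix` and `subtract` are hypothetical. It is not the calculation used in the "Syria to Assyria" post.

```python
# Hypothetical component proportions for two populations (invented values).
COMPONENTS = ["Component 1", "Component 2", "Component 3"]
pop_a = [0.60, 0.30, 0.10]
pop_b = [0.20, 0.50, 0.30]


def mix(p, q, weight):
    """Linear mixture of two populations: weight * p + (1 - weight) * q."""
    return [weight * x + (1 - weight) * y for x, y in zip(p, q)]


def subtract(p, q):
    """Component-by-component difference between two populations."""
    return [x - y for x, y in zip(p, q)]


# A 50/50 mixture is just the average of the two proportion vectors,
# and its proportions still sum to 1.
mixed = mix(pop_a, pop_b, 0.5)

# The difference shows which components separate the two populations;
# because both inputs sum to 1, the differences sum to 0.
diff = subtract(pop_a, pop_b)

for name, m, d in zip(COMPONENTS, mixed, diff):
    print(f"{name}: mixed={m:.2f} difference={d:+.2f}")
```

Arithmetic like this only makes sense when the proportions combine linearly. A clustering result that weights SNPs unevenly would not, in general, support it.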

Comments have temporarily been turned off. Because I currently have a heavy workload, I do not feel that I can do an acceptable job as moderator. Thanks for your understanding.
