Monday, October 15, 2007

Pre-Hispanic Philippines: Patrilineal Genetic Demography

In a comment to my previous post on my paternal genetic history, blogger Cocoy remarked:
"interesting to see how migration patterns occurred. wonder how many Filipinos have gone through what you did? Because it would be interesting to know where our general population comes from."
One way to define population distribution is via genetic markers called Haplogroups*. I did some googling and found this Map of Haplogroups of the World. The map shows the patrilineal (Y-chromosome) genetic distribution among world populations around 1500 AD, prior to European imperial expansion.**

Copyright: JD Macdonald, 2005

As shown in the above diagram, pre-Hispanic Philippines was dominated by subgroups of Haplogroup O (shown in blue, comprising more than 80%) followed by subgroups under Haplogroup C (around 4%). The diagram below highlights (in yellow) the positions of these Haplogroups in the Y-DNA Haplogroup Tree.

Source: Wikipedia entry on Human Y-chromosome DNA haplogroups

There were other haplogroups found in the Philippines at that time, like Haplogroup K (ancestor of Haplogroup O), but for this entry, I will focus on variants of the more common ones as follows:

Haplogroup O1a (M119): This genetic marker first appeared 30,000 years ago and is commonly found among the Austronesian people. As explained by Jojo Malig in his blog entry I, Austronesian, these people made their way to the islands comprising the Philippines via Taiwan. The languages found in the Philippines, including Tagalog, Cebuano, Hiligaynon, Ilokano, Kapampangan, and Tausug, belong to the Austronesian family of languages.

Haplogroup O1a (M119) Migration Pattern
[Source: Genographic Project Website]

Haplogroup O3 (M122): This genetic marker first appeared 10,000 years ago and is commonly found among Han Chinese (more than 50% frequency). As per the Wikipedia entry, about 35% of today's Filipino males possess this genetic marker. My non-expert guess is that these people tagged along with the more numerous Austronesians as the latter expanded out of the Asian Mainland. Another (non-mutually exclusive) possibility is that they arrived in the Philippines later via separate wave(s) of migration.

Haplogroup O3 (M122) Migration Pattern
[Source: Genographic Project Website]

Haplogroup C (M130): This haplogroup first appeared 50,000 years ago and is found in such seemingly diverse populations as the Australian Aborigines and the Mongols. Its carriers are the ones who crossed the land bridge that connected the Philippines to the Asian Mainland during the last Ice Age.

Haplogroup C (M130) Migration Pattern
[Source: Genographic Project Website]

Haplogroup C3 (M217): Around 20,000 years ago, a mutation appeared that gave rise to this Haplogroup, which became one of the most widely dispersed among the patrilineal genetic markers.

Haplogroup C3 (M217) Migration Pattern
[Source: Genographic Project Website]

In more recent history, this Haplogroup was propagated even more via the conquests of Genghis Khan.

More information about the above and other Haplogroups, along with their migration patterns, can be found in the Genographic Website's Atlas of the Human Journey.

Update Oct-20-2007 1:12PM: Check out Anonymous' informative comments and clarifications in the comments section of this entry.

Update April-25-2008: Based on the Y-DNA Haplogroup information in the Genographic website, I have drawn a Family Tree to work out how each haplogroup is related to the others.

Update May-23-2008: Corrected my annotation on the diagram above from the Wikipedia entry on Human Y-chromosome DNA haplogroups.

*Aside from Y-Chromosome Haplogroups, there are also Haplogroups classified on the basis of mutations in mitochondrial DNA (mtDNA), which is passed on through the maternal line. However, the discussion in this blog entry will only cover Y-DNA Haplogroups.

**The current (as of today) Wikipedia entry on the Filipino People erroneously states that Filipinos predominantly belong to Haplogroup L and Haplogroup H, which couldn't be correct. As per Anonymous' explanation in the comments section, it turns out that this entry is not an error, just some overlap in labeling.


Anonymous said...

Haplogroup O3 (M122) is also found in about half of all Amis (a group of Austronesian people in Taiwan) and about a quarter of all Polynesians when averaged among all the various islands. On some islands of Polynesia, about half the population belongs to haplogroup O3, while on other islands, very few men (only 2 or 3 percent, for example) belong to this haplogroup. Indonesians seem to belong more frequently to the typically Austro-Asiatic (e.g. Cambodian or Nicobarese) haplogroup, O2a (M95). Among all the major Austronesian countries, Filipinos and Malaysians have the highest frequency of haplogroup O3-M122.

Also, you should mention that haplogroup O1a-M119 is also quite common in southern China. Between 10% and 15% of southern Han Chinese, Zhuang, and other Tai peoples belong to this haplogroup.

cvj said...

Anonymous, thanks for your comments! I'll do a follow-up post based on what you said.

Anonymous said...

I checked the study to which the Wikipedia article on Filipino people referred, and I found that what it had labeled as "Haplogroup H" was actually equivalent to the International Society of Genetic Genealogy's Haplogroup O1a-M119, defined by the mutation M119, and what the study had labeled as "Haplogroup L" was actually the ISOGG's Haplogroup O3-M122. The study found Haplogroup O3-M122 (or "Haplogroup L") to be most common among Filipinos, followed closely by Haplogroup O1a-M119 ("Haplogroup H"), but the sample size was only 28 individuals. Other studies that have included Filipino samples have tended to find a somewhat lower frequency of haplogroup O3-M122, usually between 35% and 40%, about the same frequency of haplogroup O1a-M119, and minorities of haplogroup C-M216*, haplogroup C3-M217, and haplogroup O2a-M95. Overall, the patrilineal genepool of the Filipinos appears to be most similar to that of the Amis tribe of Taiwanese aborigines and the southern Han Chinese. Filipinos, Amis, and southern Han Chinese are next most similar to Tai-Kadai peoples, northern Han Chinese, and Koreans.

cvj said...

Anonymous, thanks very much! Your clarification on the correspondence between Haplogroups H to O3 & L to O1a is much appreciated. I downloaded the Stanford paper referenced by the Wikipedia entry on the Filipino People and I'm trying to understand 'Figure 4', which shows the 'Principal-Component' analysis. Would you be able to shed light on this, particularly the meaning of the X and Y axes and what the clusters mean? So far I have not been able to understand the explanation in the paper itself.

Anonymous said...

In that Stanford paper, "haplogroup H" corresponds with the ISOGG standard "haplogroup O1a" (defined by the M119 mutation), whereas "haplogroup L" corresponds with the ISOGG standard "haplogroup O3" (defined by the M122 mutation). You seem to have reversed them accidentally.

As for the graph of the principal component analysis, the horizontal axis represents the first principal component (PC1), which explains 46% of the total variation in haplogroup frequencies among the population samples in this study. The haplogroups that displayed the greatest correlation coefficient with the horizontal axis were haplogroup H (O1a-M119) and haplogroup C (C-RPS4Y), at 0.64 and -0.44, respectively. This means that a greater frequency of haplogroup O1a-M119 would move a population toward the right side of the graph, while a greater frequency of haplogroup C-RPS4Y would move a population toward the left side of the graph. Sure enough, we find the Paiwan, Atayal, Bunun, and Yami, four Taiwanese aboriginal populations that display an inordinately high frequency of haplogroup O1a-M119 and an almost complete absence of haplogroup C-RPS4Y, in a cluster at the right edge of the graph, while the Madang (Papua New Guinea), Irian Jaya (Indonesian New Guinea), and Atiu (Cook Islands of Polynesia) populations appear near the left edge of the graph because of their high frequencies of haplogroup C-RPS4Y and utter lack of haplogroup O1a-M119.

The second principal component (PC2), which explains 20% of the variation, is represented on the vertical axis of the graph and appears to be associated with so-called "northern influence" (actually more like Southern Chinese-, Filipino-, or Amis-related influence) upon the southern indigenous populations, and the haplogroups that were most strongly correlated with PC2 were haplogroup L (O3-M122) and haplogroup F (K-M9*(xP-92R7, M-M4, O-M175)), with correlation coefficients of -0.46 and -0.22, respectively. This means that a greater frequency of haplogroup O3-M122 or haplogroup K-M9*(xP,M,O) would tend to move a population toward the bottom of the graph. Sure enough, we find the populations of South China, Tonga (Polynesia), French Polynesia, the Philippines, and the Amis close to the bottom of the graph because these populations display high frequencies of haplogroup O3-M122.

Overall, one may see that the Taiwanese aboriginal populations (excepting the Amis) cluster in the upper right corner of the graph because of their high frequency of O1a-M119, low frequency of C-RPS4Y, and low frequency of O3-M122.

The Melanesian/Papuan populations cluster in the upper left corner of the graph because of their low frequency of O1a-M119, high frequency of C-RPS4Y, and low frequency of O3-M122.

The Polynesian populations cluster in the lower left corner of the graph because of their low frequency of O1a-M119, high frequency of C-RPS4Y, and high frequency of O3-M122.

The Indonesian populations cluster in the center of the graph because they have moderate frequencies of O1a-M119, C-RPS4Y, and O3-M122.

The three outliers are the Philippines, Amis, and South Chinese populations. These three populations could be loosely considered as a northern extension of the Indonesian population cluster, but they are displaced toward the bottom of the graph because of their inordinately high frequencies of haplogroup O3-M122.
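The quadrant descriptions above can be reproduced with a small toy calculation. This is only a rough sketch, not the paper's actual principal component analysis: it assumes the quoted correlation coefficients can stand in as axis weights, and it encodes each cluster's frequencies as coarse low/moderate/high levels taken from the wording of this comment rather than from measured data. The names `toy_position` and `describe` are made up for illustration.

```python
# Correlation coefficients quoted in the comment, used here as stand-in
# axis weights (an assumption: the real PCA loadings are not given).
PC1_WEIGHTS = {"O1a-M119": 0.64, "C-RPS4Y": -0.44}
PC2_WEIGHTS = {"O3-M122": -0.46}

# Coarse frequency levels: -1 = low, 0 = moderate, +1 = high.
# These paraphrase the cluster descriptions in the comment; they are not data.
CLUSTERS = {
    "Taiwanese aboriginal (excluding Amis)": {"O1a-M119": 1, "C-RPS4Y": -1, "O3-M122": -1},
    "Melanesian/Papuan": {"O1a-M119": -1, "C-RPS4Y": 1, "O3-M122": -1},
    "Polynesian": {"O1a-M119": -1, "C-RPS4Y": 1, "O3-M122": 1},
    "Indonesian": {"O1a-M119": 0, "C-RPS4Y": 0, "O3-M122": 0},
}


def toy_position(levels):
    """Return a (horizontal, vertical) toy coordinate for one cluster."""
    pc1 = sum(weight * levels[hg] for hg, weight in PC1_WEIGHTS.items())
    pc2 = sum(weight * levels[hg] for hg, weight in PC2_WEIGHTS.items())
    return pc1, pc2


def describe(pc1, pc2):
    """Name the region of the graph that a toy coordinate falls in."""
    vertical = "upper" if pc2 > 0 else "lower" if pc2 < 0 else ""
    horizontal = "right" if pc1 > 0 else "left" if pc1 < 0 else ""
    return " ".join(part for part in (vertical, horizontal) if part) or "center"


for name, levels in CLUSTERS.items():
    print(f"{name}: {describe(*toy_position(levels))}")
```

Under these assumptions the four clusters land in the same corners as described: a positive weight on O1a-M119 pushes a population right, a negative weight on C-RPS4Y pushes it left, and a negative weight on O3-M122 pushes it down.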

cvj said...

Anonymous, you're right, I accidentally switched them. I stand corrected. Thanks for the explanation on the Principal-Component analysis. If you don't mind, I'm consolidating your comments for my next blog entry on this topic. If you are willing to give your name, I can give you proper acknowledgement.

Anonymous said...

You don't need to credit me. I'm just helping out since I am another amateur interested in the budding field of genetic genealogy.

Anyway, I would like to alert you to two other studies that have analysed samples of Filipino Y-chromosomes.

First, there is "Reduced Y-chromosome, but Not Mitochondrial DNA, Diversity in Human Populations from West New Guinea" by Kayser et al. (2002). This study reports results of the analysis of a sample of 39 Filipino Y-chromosomes, which makes the results somewhat more robust than the Stanford study that we have discussed previously. The Filipino sample in the study by Kayser et al. included 41.0% (16/39) haplogroup O1a-M119, 35.9% (14/39) haplogroup O3-M122, 10.3% (4/39) haplogroup C-RPS4Y*(xC2-M38, C3-M217, C4-390.1del), and 2.6% (1/39) each of haplogroup O2a-M95, haplogroup O-M175*(xO2a-M95, O1a-M119, O3-M122), haplogroup R-M173, haplogroup F-M89*(xK-M9), and haplogroup K-M9*(xK5-M230, M-M4, O-M175, P-M74). This study suggests that the major components of the patrilineal ancestors of Filipinos belonged to the typically Taiwanese aboriginal haplogroup, O1a-M119, and the typically (Han) Chinese and Korean haplogroup, O3-M122. A significant minority (about 10% in this study) seem to be descended from haplogroup C*, which is an extremely rare haplogroup elsewhere in the world. Most members of haplogroup C belong to one of five subclades that have already been defined, namely haplogroup C1-M8 (Japan), C2-M38 (East Indonesia, New Guinea, Melanesia, Micronesia, and Polynesia), C3-M217 (East Asia, North Asia, Central Asia, North America), C4-390.1del (Australian aborigines), or C5 (Indian subcontinent). A few Filipinos have been found in other studies (as well as in commercial testing) to belong to the typically North Eurasian and North American haplogroup C3-M217.
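As a quick check of the arithmetic, the percentages quoted for the Kayser et al. sample can be recomputed from the counts. A minimal sketch: the counts are the ones given in this comment, the haplogroup labels are shortened for readability, and the helper name `frequencies` is made up for illustration.

```python
# Counts as reported in the comment for the Kayser et al. (2002) Filipino sample.
# Labels are shortened; the starred ones are the paragroups named in the comment.
KAYSER_COUNTS = {
    "O1a-M119": 16,
    "O3-M122": 14,
    "C-RPS4Y*": 4,
    "O2a-M95": 1,
    "O-M175*": 1,
    "R-M173": 1,
    "F-M89*": 1,
    "K-M9*": 1,
}


def frequencies(counts):
    """Convert raw haplogroup counts into percentages of the whole sample."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


total = sum(KAYSER_COUNTS.values())
for name, pct in frequencies(KAYSER_COUNTS).items():
    print(f"{name}: {pct:.1f}% ({KAYSER_COUNTS[name]}/{total})")
```

The counts add up to the stated sample size of 39, and the rounded percentages agree with the figures quoted above.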

Another study, "The Dual Origin of the Malagasy in Island Southeast Asia and East Africa: Evidence from Maternal and Paternal Lineages," by Hurles et al. (2005) also studied a small Filipino sample of only 28 individuals, which makes its results no more significant than those of the old Stanford study, but it did make use of the now standard nomenclature and also distinguished some lower level subclades of haplogroup O. This study found 3/28 (10.7%) haplogroup O3a5-M134 (AKA haplogroup O3e), 14/28 (50.0%) haplogroup O3-M122*(xO3a5-M134), 1/28 (3.6%) haplogroup O1a2-M50 (AKA haplogroup O1b), 9/28 (32.1%) haplogroup O1a-M119*(xO1a1-M101, O1a2-M50) (AKA haplogroup O1*), and 1/28 (3.6%) haplogroup K-M9*(xK1, K2, K3, L, M, N, O, P). Pooling haplogroups O3a5-M134 and O3-M122* together, and likewise pooling haplogroup O1a2-M50 with O1a-M119*, we have a total frequency of 17/28 (60.7%) haplogroup O3-M122 and 10/28 (35.7%) haplogroup O1a-M119 among this Filipino sample. This study did not find any haplogroup C* Y-chromosomes among their Filipino sample, which suggests that haplogroup C* might be restricted to particular geographical enclaves or socio-ethnic groups (perhaps, for example, Aetas) within the Philippines.
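The pooling step can be written out explicitly. This is a small sketch using only the counts quoted in this comment; the `PARENT` mapping and the helper name `pool` are illustrative assumptions, not anything taken from the Hurles et al. paper.

```python
# Counts as reported in the comment for the Hurles et al. (2005) Filipino sample.
HURLES_COUNTS = {
    "O3a5-M134": 3,
    "O3-M122*": 14,
    "O1a2-M50": 1,
    "O1a-M119*": 9,
    "K-M9*": 1,
}

# Which parent haplogroup each reported category is pooled into (illustrative).
PARENT = {
    "O3a5-M134": "O3-M122",
    "O3-M122*": "O3-M122",
    "O1a2-M50": "O1a-M119",
    "O1a-M119*": "O1a-M119",
    "K-M9*": "K-M9*",
}


def pool(counts, parent):
    """Sum subclade counts under their parent haplogroup."""
    pooled = {}
    for name, n in counts.items():
        pooled[parent[name]] = pooled.get(parent[name], 0) + n
    return pooled


total = sum(HURLES_COUNTS.values())
for name, n in pool(HURLES_COUNTS, PARENT).items():
    print(f"{name}: {n}/{total} ({100 * n / total:.1f}%)")
```

Pooling this way gives 17/28 for O3-M122 and 10/28 for O1a-M119, matching the totals stated above.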

The extremely high frequency (60.7%) of haplogroup O3-M122 in the second study is rather surprising, since this actually exceeds the frequency of this haplogroup among many samples of Han Chinese. Frequencies of haplogroup O3-M122 among Han Chinese samples generally range from approximately 45% to approximately 80%, with the mode being somewhere between 55% and 60%. However, in the case of the Han Chinese haplogroup O3 samples, the majority belong to the subclade O3a5-M134, which contained only 10.7% of the Filipino samples in this study.
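One rough way to see how much weight a figure from 28 men can bear is to put an uncertainty range around it. The sketch below uses the Wilson score interval at an assumed 95% level (z = 1.96); that choice of method is an illustration only and is not something either study reports. The inputs are the 17/28 and 14/39 O3-M122 counts quoted in this comment.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# O3-M122 counts quoted in the comments: Hurles et al. and Kayser et al.
for label, k, n in [("Hurles et al.", 17, 28), ("Kayser et al.", 14, 39)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {100 * k / n:.1f}%, "
          f"interval {100 * low:.0f}% to {100 * high:.0f}%")
```

With samples this small the two ranges are wide and overlap, so the 60.7% and 35.9% figures are not necessarily in conflict.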

Frequencies of haplogroup O3-M122 among samples of Koreans generally range from approximately 30% to approximately 50%, with the mode being about 40%. The frequency of the subclade O3a5-M134 is generally lower among Koreans (as among Filipinos), with approximately 10% to 15% of Koreans belonging to this subclade. The subclade O3a5-M134 seems to be strongly connected to Sino-Tibetan (and especially Tibeto-Burman) populations, being practically the only subclade of haplogroup O found among the Tibetans, for example. The overwhelming majority of haplogroup O3-M122 Y-chromosomes found among the Japanese, where they total about 15% to 20% of the samples, also belong to the subclade O3a5-M134, which is similar to the Sino-Tibetan populations. Most haplogroup O3 Y-chromosomes among Austronesians (including Filipinos) do not belong to the subclade (or "branch haplogroup") O3a5-M134, but rather to the "parent haplogroup," O3-M122*.

cvj said...

Hi Anonymous, an interested reader would like to know more about the occurrence of Haplogroup C3 (M217) among Filipinos. Would you be able to provide information on this Haplogroup or point us to relevant papers or commercial test results? In particular, he would like you to expound on your comment above that:

"A few Filipinos have been found in other studies (as well as in commercial testing) to belong to the typically North Eurasian and North American haplogroup C3-M217."

Thanks in advance.

Anonymous said...

For a start, there is this individual, who reports an earliest known direct patrilineal ancestor in Batangas, Philippines in the late 19th century:

N37285 Engracio Ilag, Batanggas, Phillippines 1880s C3

You can see his haplotype in the table on the "Y Results" page at the FamilyTreeDNA C/C3 Haplogroup Project website:

Some scientific studies have found small numbers of haplogroup C3-M217 individuals among samples of populations from Malaysia, Indonesia, and the Philippines, but the average frequency of haplogroup C3-M217 across this region is probably not higher than 1%. Haplogroup C3 in the Malay Archipelago region seems to be concentrated in Borneo (or perhaps also in Peninsular Malaysia), as it appears most regularly as a minority component of samples that are labeled as representing "Borneo" or "Malaysia," and the northern part of the island of Borneo is, of course, politically a part of the sovereign state of Malaysia. Samples from Borneo and Malaysia also regularly contain a few representatives of other "exotic" haplogroups, including haplogroup G and haplogroup R1a, so it should not be so surprising to find a haplogroup C3 individual here or there among them. These exotic haplogroups might have been introduced into parts of the Malay Archipelago during historical times as a result of immigration of Indo-Aryan, Dravidian (e.g. Tamil), or Chinese people.

Anonymous said...

Hi Anonymous
I was the one who was asking about Filipino C3. When you said there are "a few Filipinos" I was hoping you could point out more than one example. I read through the papers you cited. They don't show C3 in the Philippines. So if you can get more references to the claim that C3 has been found in the Philippines other than one example I would be very happy. Thanks! P

Anonymous said...

Please refer to "Table 1. Y Chromosome Lineage Frequencies of Island Melanesia and Nearby Regions" on page 19 of the online article, "Unexpected NRY chromosome variation in Northern Island Melanesia" by Laura Scheinfeldt et al. (2006), which may be found at . This table has combined the results of various studies in order to present a better overview of Y-chromosome variation in East Asia, Southeast Asia, and Oceania.

According to this table, haplogroup C3-M217 Y-chromosomes have been confirmed among the following populations:

Korea: 12% (3 out of 25 samples)
China: 6% (2/36)
Vietnam: 9% (1/11)
Taiwan Chinese: 4% (1/26)
Philippines: 1% (1/115)
Malaysia: 4% (2/50)
Southern Borneo: 3% (1/40)
Balinese: 0.2% (1/551)

In addition, haplogroup C2-M38* Y-chromosomes have been confirmed among the following populations:

Southern Borneo: 3% (1/40)
East Indonesians: 27% (15/55)
Moluccas: 15% (5/34)
Nusa Tenggara: 16% (5/31)
West New Guinea Lowlands/Coast: 9% (8/89)
Papua New Guinea Coast (previous studies): 13% (4/31)
Papua New Guinea Highlands (previous studies): 3% (1/31)
Papua New Guinea Coast (present study): 8% (2/25)
East New Britain: 3% (4/145)
West New Britain: 1% (2/245)
New Ireland: 7% (8/109)
North Bougainville: 2% (1/54)

Haplogroup C2b-M208 Y-chromosomes have been confirmed among the following populations:

West New Guinea Highlands: 25% (either 23/94 or 24/94?, but limited to the Dani and Lani ethnic groups)
Papua New Guinea Coast (previous studies): 10% (3/31)
Papua New Guinea Coast (present study): 28% (7/25)
Trobriand Islands: 9% (5/53)
Manus: 14% (1/7)
East New Britain: 1% (either 1/145 or 2/145)
West New Britain: 0.4% (1/245)
Mussau: 5% (1/20)
New Ireland: 1% (1/109)
Cook Islands: 82% (23/28)

Haplogroup C4 Y-chromosomes have been confirmed among the following populations:

Australia, Arnhem: 53% (32/60)
Australia, Desert: 69% (24/35)

Haplogroup C*(xC2,C3,C4) Y-chromosomes have been confirmed among the following populations:

Philippines: 4% (5/115)
Malaysia: 2% (1/50)
Java: 2% (1/53)
Balinese: 2% (n=551)
East Indonesians: 6% (3/55)
Moluccas: 9% (3/34)
Nusa Tenggara: 7% (2/31)
Papua New Guinea Coast (previous studies): 3% (1/31)
Australia, Arnhem: 10% (6/60)

In addition, there are the following haplogroup C Y-chromosomes that have been insufficiently tested to determine their precise haplogroup affiliation:

Taiwan, Yami: 2(.5)% (1/40)
Vanuatu: 18% (n=234)
Fiji: 3% (n=55)
Tonga: 23% (n=55)
Western Samoa: 69% (n=16)
Atiu: 84% (n=42)
French Polynesia: 53% (n=87)
(The individuals listed above could belong to C*, C2*, C2b, or C3)

Also, 42% of a sample of 54 Maori belonged to either haplogroup C2-M38* or haplogroup C2b-M208.

Anonymous said...

The URL of that article should read:

I don't know why the final "f" is being cut off.

Anonymous said...

Thanks... I checked the paper you cited, and the entry on the Philippines in Table 1 was further referenced to 3 sources, two of which I checked (Kayser and Capelli) and which are not the sources for the C3 data. The third, by Hammer et al. 2005, is not in the list of references. I assume this is Hammer MF. If you could help dig out the actual reference for the C3 it would be great! I did email the author of the paper you cited. cheers, P

Anonymous said...

So the paper citing the lone C3 in the Philippines is based on the paper of Hammer, MF: Dual origins of the Japanese: common ground for hunter-gatherer and farmer Y chromosomes. In the supplement it is listed as 1 person out of 48 samples. If I am not mistaken, about 11-12% of people in the Philippines are of Chinese lineage. So this individual is probably from a recent Chinese admixture into the Philippine population.

Anonymous said...

I got one of the NG Genome Project kits for my nephew in the Philippines (my brother's son) and his test just came back as N (M231). Our family is Ilocano on the male side for many generations. N (M231) seems like a very rare haplotype for the Philippines.

cvj said...

Anonymous, that's very interesting. According to the web site, M231 is mostly found in Siberia and Scandinavia, particularly among the Saami tribe in Finland.

Momo, juanagrafica, and the Craft Baller said...

Hello, CVJ. Long time following up, but I was reviewing the genome project map regarding M231. About 45,000 years ago, Branch P128 emerged from the Middle East and headed east for Southeast Asia, traveling through Borneo to the Philippines and Indonesia. It was later that the M231 Haplogroup branched off from P128 in Central Asia and headed to Scandinavia and Siberia. I wonder how M231 got back to the Philippines . . .

cvj said...

Hi MissM2u, thanks for sharing, that's an interesting question. Could it be that another group of individuals with the M231 marker traveled from Central Asia to the Philippines?

Momo, juanagrafica, and the Craft Baller said...

Sure. In fact the family oral history says our paternal line was founded by a foreign pirate who made his base in the Philippines in our home town. No one knows for sure where he was from.

Momo, juanagrafica, and the Craft Baller said...

But also it's not clear to me if the M231 marker first arose in Central Asia or only after the group got to northern Eurasia. If the latter, then it would probably have been a group from Siberia that went to the P.I.

Unknown said...

I read in the notes "Haplogroup C4 Y-chromosomes have been confirmed among the following populations:

Australia, Arnhem: 53% (32/60)
Australia, Desert: 69% (24/35)" Could these two different groupings indicate two separate arrivals, or could it have been a local change due to environmental factors such as the great drying out of Australia during the last ice age?
Louis C

Anonymous said...

Update on the earlier comment above about Engracio Ilag belonging to the C3 (now called C2) haplogroup. The FamilyTreeDNA site puts this individual in a class of his own, C2b1b2, bearing the marker MF2091+, which distinguishes him from other F845 carriers. Interesting!