Lai-Ping Wong, Rick Twee-Hee Ong, Wan-Ting Poh, Xuanyao Liu, Peng Chen, Roiying Li, Kevin Koi-Yau Lam, Nisha Esakimuthu Pillai, Kar-Seng Sim, Haiyan Xu, Ngak-Leng Sim, Shu-Mei Teo, Jia-Nee Foo, Linda Wei-Lin Tan, Yenly Lim, Seok-Hwee Koo, Linda Seo-Hwee Gan, Ching-Yu Cheng, Sharon Wee, Eric Peng-Huat Yap, Pauline Crystal Ng, Wei-Yen Lim, Richie Soong, Markus Rene Wenk, Tin Aung, Tien-Yin Wong, Chiea-Chuen Khor, Peter Little, Kee-Seng Chia, Yik-Ying Teo
3 January 2013
This beautiful paper makes the case that deep whole-genome sequencing is needed to detect low-frequency and rare variants and genomic hotspots. It points out that this is especially true for populations that are not well covered by the 1000 Genomes Project, and it demonstrates the point using Malays from the Singapore Sequencing Malay Project (SSMP) as the test population (Figures 2, 3 and 5).
Of particular interest is Figure 4, which shows, chromosome by chromosome, the regions of density and the regions of damaging nonsynonymous SNPs (nsSNPs) for the SSMP population.
Figure 7 shows the genomic coverage that various commercially available genome-wide genotyping arrays achieve for the SSM (Singapore Sequencing Malay) population, Europeans (CEU), East Asians (CHB + JPT) and Yoruba (YRI). Coverage performance varies most for the YRI population, especially for low-frequency and uncommon variants.
Figure 8 shows the genomic coverage for the exome arrays. Coverage varies less across populations for the commercially available exome arrays than for the genome-wide arrays.
This is a terrific paper which should set the bar for assessing the quality of coverage for populations that are not well covered by the 1000 Genomes Project. It should help us make intelligent, population- and experiment-dependent choices when selecting commercially available genotyping arrays.