Feline Genomics and Complex Disease Studies
Tufts' Canine and Feline Breeding and Genetics Conference, 2009
Leslie A. Lyons, PhD
Department of Population Health & Reproduction, School of Veterinary Medicine, University of California, Davis, Davis, CA

Objectives of the Presentation

 Status of the Cat Sequencing Project

 Unlocked Secrets of the Cat Genome

 High Density DNA Array Resources

 Complex Disease Studies and Design

Overview of the Issue

The NIH-NHGRI supported a low-resolution, 2X-coverage sequencing of the cat genome, performed by the Broad Institute and Agencourt. Last spring, Hill's Pet Food announced a low-coverage, private sequencing project and has donated a large portion of the data to the cat genetics community, along with $1,000,000 to support development of a cat SNP array. Currently, the domestic cat is one of the first species to receive more in-depth sequencing coverage using a combination of NextGen sequencing technologies, including 454 and Illumina (Solexa) sequencing, which is being conducted by the Washington University Sequencing Center. These sequencing efforts will support the development of high-density single nucleotide polymorphism (SNP) arrays for the cat to investigate feline models of inherited traits and diseases.

To date, feline studies have required large, extended families with clearly defined modes of inheritance (MOI). A plethora of diseases exist in cats that carry a significantly increased risk in certain breeds or populations, but the MOIs are not well defined. Often, the MOIs of simpler diseases are strongly suspected, but the data have not been collected in an organized manner to document these conditions and would not pass peer review in scientific journals. Accurate clinical definition of diseases is essential for family-based association studies, which makes the clinical veterinarian a key player in selecting the number and the specific cases and controls for a SNP array-based genome-wide association study (GWAS). Once the cases and controls are ascertained and DNA is prepared, a GWAS can be accomplished within months instead of years; thus, cat disease studies should move forward rapidly.

Consortiums of researchers will aid the discovery of the genetic components of complex diseases in cats. Various disease studies, such as those for diabetes, heart disease, and feline infectious peritonitis, will benefit from the sequencing of the cat genome and the development of the DNA arrays. The design of arrays and case-control studies will be presented.

Additional Detail

The NHGRI-sanctioned 2x sequencing of the domestic cat was undeniably an important asset for feline researchers. The Broad Institute and Agencourt collaborated to generate ~2-fold whole genome shotgun (WGS) coverage. The first sequence assembly is now termed the reference assembly. However, only 65% of the gene coding sequence is represented, and on average only 55.7% of the sequence of a specific gene is represented (Pontius et al 2007). The 217,790 contigs generated from the sequence assembly contain more than 660,000 gaps; hence, obtaining the entire gene sequence for larger genes is virtually impossible. A 6-fold genome sequencing project has been recommended by NHGRI for the domestic cat. The deeper sequencing effort is being led by the Washington University in St. Louis Sequencing Center. For this effort, the cat sequencing approach used what are termed next-generation technologies, which differ from the Sanger sequencing that has been the standard approach in the past. Long-read 454 technology has been used to fulfill the targeted 6x coverage for the cat, together with paired-end reads based on Illumina (Solexa) technology. The actual sequencing effort for the cat is complete, and the assembly of the sequence is now underway.
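The difference between 2-fold and 6-fold coverage can be illustrated with a minimal sketch. It assumes the idealized Lander-Waterman model, in which reads land uniformly at random, so that the expected fraction of bases sampled at least once is 1 - e^(-c) for coverage c. This model is an assumption brought in for illustration and is not taken from the sequencing projects themselves. Real assemblies fall short of it because of repeats, cloning bias, and assembly gaps, which is consistent with the lower gene representation reported for the 2x assembly.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Idealized Lander-Waterman estimate of the fraction of bases
    sampled by at least one read at a given fold coverage.

    Assumes reads are placed uniformly at random (an idealization).
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


# Compare the two coverage depths discussed for the cat genome.
for depth in (2, 6):
    frac = expected_fraction_covered(depth)
    print(f"{depth}x coverage: about {frac:.1%} of bases sampled at least once")
```

Under this idealization, the unsampled fraction shrinks exponentially with depth, which is why a modest increase in fold coverage closes a disproportionate share of the gaps.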

The availability of many single nucleotide polymorphisms (SNPs) from the sequencing effort assures that the requisite fine mapping of disease loci will speed discovery of causative alleles in domestic cat diseases and traits. For SNP detection, Illumina 50 bp paired-end sequencing was completed by the cat sequencing effort. This whole-genome method should generate a robust, randomly representative set of SNPs in various breeds of cats that can be readily mapped to the cat genome reference assembly. Four individuals (2 males and 2 females) from the following breeds have been re-sequenced for SNP identification: Maine Coon, Norwegian Forest, Birman, Egyptian Mau, Turkish Van, and Japanese Bobtail. A previous project sponsored by Hill's Pet Food included 8 individual cats: the Abyssinian of the sequencing project, a South African wildcat, a female domestic shorthair, a male Cornish Rex, a female European Burmese, a female Persian, a female Siamese, and a male Ragdoll. These combined efforts should generate hundreds of thousands of SNPs for the development of DNA arrays in the cat.

SNPs have been examined as an alternative to microsatellites for diversity estimates, for power in linkage studies (Xing et al 2005; Anderson and Garza 2006), and in many other studies that focus on linkage disequilibrium (LD) mapping and full genome scans. Because SNPs are less polymorphic, in general 2-10 times more SNPs are required to obtain the same discriminatory power as a microsatellite (Clark et al 2005; Xing et al 2005). However, SNPs occur more frequently in the genome than microsatellites and can therefore provide denser coverage. In addition, selective sweeps and LD cause SNPs to be associated with one another, forming extended haplotypes. When the extent of LD is known, less SNP coverage is required to survey a particular region of the genome, because only a few SNPs are needed to represent the haplotype for the region of interest (Clark et al 1998; Morton 2005; Du et al 2007). Domesticated breeds follow the population dynamics that increase LD (Lindblad-Toh et al 2005). LD is expected to be more extensive in closed, inbred populations, as well as in regions that have been under selection. LD estimates are required to predict the overall number of SNPs that will be needed for a feline haplotype mapping project and LD mapping. Overall, the extent of LD determines the number of SNPs, and the distance between them, required to perform full genome scans for complex traits.
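The relationship between LD and marker density can be made concrete with a short sketch. The first function computes two standard pairwise LD measures, r^2 and |D'|, from counts of the four two-SNP haplotypes. The second gives a rough marker count if one SNP is placed per LD block. The function names, the haplotype counts, the genome length of roughly 2.7 Gb, and the LD extents are all hypothetical values chosen for illustration; none are measurements from the cat projects described here.

```python
def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> tuple[float, float]:
    """Return (r_squared, abs_d_prime) for two biallelic SNPs,
    given counts of the four phased haplotypes AB, Ab, aB, ab."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at SNP 2
    d = p_AB - p_A * p_B          # raw disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    r_squared = d * d / denom
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return r_squared, abs(d) / d_max


def markers_needed(genome_bp: int, ld_extent_bp: int) -> int:
    """Rough marker count if one SNP tags each stretch of ld_extent_bp."""
    return -(-genome_bp // ld_extent_bp)  # ceiling division


# Hypothetical haplotype counts: strong but incomplete LD.
r2, d_prime = ld_stats(45, 5, 5, 45)
print(f"r^2 = {r2:.2f}, |D'| = {d_prime:.2f}")

# Hypothetical genome length (~2.7 Gb) and two assumed LD extents.
for extent in (100_000, 10_000):
    print(f"LD extent {extent:,} bp: ~{markers_needed(2_700_000_000, extent):,} SNPs")
```

The second loop shows why shorter LD calls for a denser array: a tenfold reduction in LD extent implies roughly ten times as many markers for the same genome.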

A majority of the 40-50 cat breeds have been developed over the last 50-100 years. Only 6 "breeds" were recognized at the inception of the cat fancy in the late 1800s, and only 7 breeds were recognized by the CFA in the USA by 1958 (CFA 2004; Lipinski et al 2007). More than 33 cat breeds have been developed in the past 50 years. The recent selection in breeds also suggests that LD may be relatively short for the cat. The ease with which parentage and identification markers were developed also supports short LD (Lipinski et al 2007). Most microsatellite markers show high variation in all breeds of cats, and fewer markers have breed-specific alleles than in the domestic dog. In addition, cats are selected for single-gene traits and some distinctive morphological traits. Cat breeds have not been selected for complex traits that are controlled by several genes. Thus, selective sweeps should also be less intense in cats than in dogs, since dogs have been selected for complex behavioral and performance functions. It is likely that cats will need denser marker coverage for association studies than dogs and other species. However, cat breeds are not as abundant or as genetically distinct as dog breeds. Our studies have shown that random-bred cats can be sub-structured into 5 parent populations worldwide, with Western European and Southeast Asian being the most distinct (Lipinski et al 2008; Menotti-Raymond et al 2008). In addition, not all breeds can be clearly sub-structured; hence, the analysis of ~20 breeds should represent the genetic variation found in cats for the determination of the SNPs appropriate for a high-density array.

The genome sequence and the subsequent development of SNP arrays are all a prelude to complex disease studies. A majority of the cats in the world are randomly bred and have diseases that are common to humans (Table 1). Complex diseases have both an environmental and a genetic component. Proper clinical diagnoses help to decipher and simplify the environmental components, allowing the selection of cases that are homogeneous in presentation and progression. The SNP arrays can then help to decipher the genetic component of the disease. The LUPA project in Europe (http://www.eurolupa.org/) is a large international collaboration focusing on the investigation of complex diseases in the dog. Based on dog LD and SNP array size, the LUPA project suggests 20 cases and 20 controls for recessive (AR) trait studies, 50 cases and 50 controls for dominant (AD) traits, and 100 cases and 100 controls for traits with a 5X increased relative risk in the population. Horse studies to date have been successful using slightly more individuals than the dog studies (2). The cat array will have at least twice as many SNPs as the current dog and horse arrays and thus potentially more power to detect associations. A balance between cat LD, the number of cases and controls, and array power needs to be determined, and appropriate studies are currently underway.
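The trade-off between sample size and array size can be sketched with a single-SNP allelic test. The example below is a minimal illustration, not the analysis used by any of the projects mentioned: it applies a 1-degree-of-freedom chi-square test to a 2x2 table of allele counts and compares the p-value with a Bonferroni threshold. The allele counts and the array size of 100,000 SNPs are hypothetical. With counts this small, an exact test would often be preferred in practice.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> tuple[float, float]:
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi_square, p_value). Each diploid animal contributes 2 alleles.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    if 0 in (row_case, row_ctrl, col_alt, col_ref):
        raise ValueError("every row and column total must be non-zero")
    chi2 = (n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
            / (row_case * row_ctrl * col_alt * col_ref))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(n_snps: int, alpha: float = 0.05) -> float:
    """Per-SNP significance threshold after Bonferroni correction."""
    return alpha / n_snps


# Hypothetical study: 20 cases and 20 controls, so 40 alleles per group.
chi2, p = allelic_chi_square(case_alt=30, case_ref=10, ctrl_alt=10, ctrl_ref=30)
threshold = bonferroni_threshold(n_snps=100_000)  # hypothetical array size
print(f"chi-square = {chi2:.1f}, p = {p:.2e}, threshold = {threshold:.1e}")
print("genome-wide significant" if p < threshold else "not genome-wide significant")
```

The sketch shows why the three quantities have to be balanced: a larger array lowers the per-SNP threshold, so a strong allele frequency difference in a small sample may still fail to reach genome-wide significance.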

Table 1. Common Complex Diseases and Traits in Cats

Asthma             Flat-chested Kittens          Obesity
Behavior           Fur length                    Porto-systemic Shunts
Color Variation    Hyperthyroidism               PKD Progression
Deafness           Inflammatory Bowel Disease    Tail Length
Diabetes           Lymphoma                      Urinary Tract Disease
FIP                Morphology                    Vaccine-induced Sarcomas

Summary

 The domestic cat will have one of the most complete genetic sequences for a companion animal.

 High Density SNP arrays will soon be available for researchers worldwide.

 SNP arrays will make it feasible to decipher the genetic components of complex diseases of cats.

 Clinical diagnoses need to be precise and exact for complex disease studies.

 Veterinarians will be key investigators in disease-focused research consortiums.

 

References/Suggested Reading

1.  Anderson EC, Garza JC (2006) The power of single-nucleotide polymorphisms for large-scale parentage inference. Genetics 172:2567-2582.

2.  Cat Fanciers' Association I (2008) Registration Numbers Table. Cat Fanciers' Almanac Online March:1-18.

3.  CFA (2004) Cat Fanciers' Association Registration Totals by Color and Breed - 2003, and 1/1/58 to 12/31/03. Cat Fanciers' Almanac 20:72-86.

4.  Clark AG, Weiss KM, Nickerson DA, Taylor SL, Buchanan A, Stengard J, Salomaa V, Vartiainen E, Perola M, Boerwinkle E, Sing CF (1998) Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am J Hum Genet 63:595-612.

5.  Du FX, Clutter AC, Lohuis MM (2007) Characterizing linkage disequilibrium in pig populations. Int J Biol Sci 3:166-178.

6.  Lathrop GM, Lalouel JM, Julier C, Ott J (1984) Strategies for multilocus linkage analysis in humans. Proc Nat Acad Sci USA 81:3443-3446.

7.  Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, et al. (2005) Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 438:803-819.

8.  Lipinski MJ, Amigues Y, Blasi M, Broad TE, Cherbonnel C, Cho GJ, Corley S, et al. (2007) An international parentage and identification panel for the domestic cat (Felis catus). Animal Genetics, in press.

9.  Lipinski MJ, Froenicke L, Baysac KC, Billings NC, Leutenegger CM, Levy AM, Longeri M, Niini T, Ozpinar H, Slater MR, Pedersen NC, Lyons LA (2008) The ascent of cat breeds: genetic evaluations of breeds and worldwide random-bred populations. Genomics 91:12-21.

10. Menotti-Raymond M, David VA, Pflueger SM, Lindblad-Toh K, Wade CM, O'Brien SJ, Johnson WE (2008) Patterns of molecular genetic variation among cat breeds. Genomics 91:1-11.

11. Morton NE (2005) Linkage disequilibrium maps and association mapping. J Clin Invest 115:1425-1430.

12. Pontius JU, Mullikin JC, Smith DR, Lindblad-Toh K, Gnerre S, Clamp M, Chang J, Stephens R, Neelam B, Volfovsky N, Schaffer AA, Agarwala R, Narfstrom K, Murphy WJ, Giger U, Roca AL, Antunes A, Menotti-Raymond M, Yuhki N, Pecon-Slattery J, Johnson WE, Bourque G, Tesler G, O'Brien SJ (2007) Initial sequence and comparative analysis of the cat genome. Genome Res 17:1675-1689.

13. Sutter NB, Eberle MA, Parker HG, Pullar BJ, Kirkness EF, Kruglyak L, Ostrander EA (2004) Extensive and breed-specific linkage disequilibrium in Canis familiaris. Genome Res 14:2388-2396.

14. Xing C, Schumacher FR, Xing G, Lu Q, Wang T, Elston RC (2005) Comparison of microsatellites, single-nucleotide polymorphisms (SNPs) and composite markers derived from SNPs in linkage analysis. BMC Genet 6 Suppl 1:S29.

15. Zeggini E, Rayner W, Morris AP, Hattersley AT, Walker M, Hitman GA, Deloukas P, Cardon LR, McCarthy MI (2005) An evaluation of HapMap sample size and tagging SNP performance in large-scale empirical and simulated data sets. Nat Genet 37:1320-1322.

 
