qKAT: a high-throughput qPCR method for KIR gene copy number and haplotype determination

In this paper, we describe a KIR typing method based on real-time PCR. This method is able to detect the total number of copies of each KIR locus. Clear discrimination between 0, 1, 2, 3 or even 4 copies could be obtained using this method. We extended the approach to LILR loci, demonstrating that the qKAT approach can be used to analyse other mCNV loci.
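As an illustration of how integer copy numbers can be read off real-time PCR data, the following minimal sketch applies the generic comparative Cq calculation. It is not necessarily the calculation implemented in qKAT. It assumes roughly 100 % amplification efficiency, a reference gene amplified in the same well, and a calibrator sample of known copy number. The function names, the tolerance value and the Cq values used are hypothetical.

```python
from statistics import mean


def estimate_copy_number(target_cq, reference_cq,
                         calib_target_cq, calib_reference_cq,
                         calib_copies=2, efficiency=2.0):
    """Estimate target copy number relative to a calibrator sample.

    Each *_cq argument is a list of replicate Cq values (e.g. quadruplicates).
    An empty target_cq list is treated as no amplification, i.e. 0 copies.
    Assumption: the same efficiency applies to target and reference assays.
    """
    if not target_cq:
        return 0.0
    delta_sample = mean(target_cq) - mean(reference_cq)
    delta_calib = mean(calib_target_cq) - mean(calib_reference_cq)
    # One cycle later than the calibrator means half as much template.
    ratio = efficiency ** -(delta_sample - delta_calib)
    return calib_copies * ratio


def call_copy_number(estimate, max_copies=4, tolerance=0.25):
    """Round an estimate to an integer copy number, or return None
    if it lies too far from any integer to be called with confidence."""
    nearest = round(estimate)
    if nearest < 0 or nearest > max_copies:
        return None
    if abs(estimate - nearest) > tolerance:
        return None
    return nearest


# Hypothetical quadruplicate Cq values: the target amplifies one cycle
# later than in a two-copy calibrator, suggesting a single copy.
estimate = estimate_copy_number(
    target_cq=[26.0, 26.1, 25.9, 26.0],
    reference_cq=[25.0, 25.0, 25.0, 25.0],
    calib_target_cq=[25.0, 25.0, 25.0, 25.0],
    calib_reference_cq=[25.0, 25.0, 25.0, 25.0],
)
print(call_copy_number(estimate))
```

Returning None for estimates that fall between integers is a deliberate choice in this sketch, so that borderline samples are flagged for repeat testing rather than forced into a call.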

This method is high-throughput and cost-effective. Using a Roche LightCycler 480 real-time PCR instrument with a 384-well Thermal Block Cycler we could complete our PCR assay in 65 min. A Twister II Plate Handler was used as an automation robotics system (Additional file 1: Figure S7). A MéCour Thermal Plate Stacker was used to keep the stacked plates constantly at 4 °C. With automation, around 22 plates comprising 8448 reactions can be finished within 24 h. Since our full KIR typing assay for each sample requires 40 reactions in total (including quadruplicates), this system can produce full KIR typing for around 210 samples every day.
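The throughput figures above follow from simple arithmetic, reproduced in the short sketch below. It uses only the numbers given in the text (65 min per run, 384 wells per plate, 40 reactions per sample). As a simplifying assumption, it ignores plate-handling time between runs and any wells reserved for controls, which is why the result is slightly above the "around 210" quoted. The function name is hypothetical.

```python
WELLS_PER_PLATE = 384       # 384-well block
RUN_MINUTES = 65            # duration of one PCR run
REACTIONS_PER_SAMPLE = 40   # full KIR typing, including quadruplicates


def daily_throughput(hours=24):
    """Return (plates, reactions, samples) achievable in the given time.

    Assumption: runs follow each other back to back with no handling
    overhead, and every well is used for a typing reaction.
    """
    plates = (hours * 60) // RUN_MINUTES
    reactions = plates * WELLS_PER_PLATE
    samples = reactions // REACTIONS_PER_SAMPLE
    return plates, reactions, samples


plates, reactions, samples = daily_throughput()
print(plates, reactions, samples)  # 22 8448 211
```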

Copy number information provided by quantitative PCR may be essential for accurate KIR determination. Accurate genotype data are required for population genetic studies and for studies of gene dosage effects [36]. Unlike standard genotypes, some KIR genes (e.g. KIR2DL5, KIR2DS3 and KIR2DS5) can be missing and the actual genotype cannot be resolved without copy number information (e.g. –A and AA). Furthermore, recently discovered structural variations make typing even more difficult [3, 24, 28]. For truncated haplotypes carrying a multi-locus deletion, conventional methods can only detect them in a homozygote. For extended haplotypes carrying duplicated loci, typing at the allelic level may be helpful when the multiple alleles differ from each other. Specially designed inter-gene PCRs are useful approaches [3, 24, 37] but from our data it seems there may be many more truncated and extended haplotypes [38]. Nevertheless, none of these approaches can provide a precise genotype without family data. Recently, pyrosequencing has been used for KIR typing and this can also provide copy number information although there are throughput and cost limitations [24, 39].

Accuracy is extremely important for quantification. We have shown that it is possible for real-time PCR to accurately determine the copy number from genomic DNA. Reference DNA samples were used to validate the accuracy and a large panel of families (1698 samples) to evaluate the precision. Since most CNVs follow Mendelian inheritance, family information can be used to infer copy number in each homologous chromosome after the total copy numbers are obtained from quantitative PCR. This method has been shown to enhance the accuracy of CNV detection [40]. For example, in CEPH family 1347, copy number information assisted in the deduction of gene content for all haplotypes when family data were insufficient to resolve haplotypes for all members with KIR presence/absence data. Our method could be further improved by using probabilistic models to increase confidence of chromosome-specific copy number estimates using family information [41]. This approach can be used for the future development of linkage and association tests that require chromosome-specific copy number information. However, as with any other PCR-based method, highly polymorphic sequences pose challenges for designing primers and probes. As we found with the 3DL1*056 allele in family CEPH1416, there is always the possibility that some rare alleles may be missed due to polymorphism. The primers and probes were designed to avoid known polymorphism in their annealing sites (Additional file 1: Tables S3 and S5) but as more alleles are described, care should be taken to continually review the assays and redesign the primers/probes as required. If an assay is disrupted by a rare SNP (true allele dropout) this will be identified by the loss of linkage with an adjacent gene that is known to be in high linkage disequilibrium; all KIR loci have another KIR locus in tight linkage or have an expected copy number, e.g. framework genes are almost always present in two copies.
One can, therefore, check the data against these predefined ‘standard KIR haplotype rules’ (Additional file 1: Table S13) to identify unexpected results and these samples can be further investigated. Alternatively, inconsistencies can be found using the KIR Haplotype Identifier online tool through the appearance of an unusual haplotype in the results. In rare instances when confirmation is required, a second set of assays for each gene can be used for verification (Additional file 1: Table S14).
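The use of family data to assign copy number to each homologous chromosome can be illustrated with a minimal sketch. It enumerates every way of splitting each parent's total copy number across two chromosomes and keeps the splits that can explain all children under simple Mendelian inheritance. This is a deterministic simplification, not the probabilistic model of [41]. It assumes error-free total copy numbers and no de novo rearrangement, and the function names are hypothetical.

```python
def chromosome_splits(total):
    """All unordered ways (a, b), a <= b, of placing `total` copies
    on two homologous chromosomes."""
    return [(a, total - a) for a in range(total // 2 + 1)]


def consistent_assignments(father_total, mother_total, child_totals):
    """Return the parental (father_split, mother_split) pairs under which
    every child's total equals one paternal plus one maternal chromosome.

    An empty list means the family is not Mendelian-consistent for this
    locus, which would flag the samples for further investigation.
    """
    results = []
    for f in chromosome_splits(father_total):
        for m in chromosome_splits(mother_total):
            explained = all(
                any(child == fa + ma for fa in f for ma in m)
                for child in child_totals
            )
            if explained:
                results.append((f, m))
    return results


# Hypothetical family: father has 2 copies, mother 1, children 3 and 0.
# Only a father carrying both copies on one chromosome explains this.
print(consistent_assignments(2, 1, [3, 0]))  # [((0, 2), (0, 1))]
```

In this hypothetical family, the total copy numbers alone would not reveal that the father carries a duplicated haplotype alongside a haplotype lacking the gene. The children's copy numbers exclude the alternative of one copy on each chromosome.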

There are opportunities for further development of the copy number assay. For example, triplex real-time PCR was used in this assay, but it may be possible to achieve up to heptaplex real-time PCR [42] to improve the throughput. Inclusion of an additional reference gene or multicopy reference could provide superior normalisation for DNA input and avoid potential effects of local genomic changes to the reference gene. Supporting our current choice of reference gene, in our screening with qKAT we have not yet identified a sample exhibiting altered KIR copy number across all KIR loci, including framework genes, indicative of a genomic alteration to the reference gene. Currently, there are other methods to discriminate gene copy number, e.g. DNA microarray, multiplex ligation-dependent probe amplification (MLPA), branched DNA testing, paralogue ratio test (PRT), digital PCR [43] and next generation sequencing (NGS). In addition, KIR haplotyping can be achieved through dye-terminator sequencing of KIR gene amplicons [44]. Comparing current throughput, cost and complexity of assay setup, quantitative PCR has advantages over the others for KIR copy number analysis. Highly repetitive genomic intervals with long stretches of identical sequence, as in the KIR locus, have been less amenable to NGS. The present short read lengths obtained by NGS, or the current inaccuracy of long-read single molecule sequencing, make sequence assembly and phasing (haplotype resolution) problematic for characterisation of mCNV loci, especially when more than two copies of a gene are present. As NGS methods improve and become cheaper, we anticipate that this approach will be useful, particularly at increased scale and for precise typing of KIR alleles at the nucleotide level. The two approaches will complement each other and be useful for cross-validation [45].
qKAT offers a simple solution for, as an example, initial assessment of KIR disease association at the gene level or haplotype level before investing more time in complex analysis at the allele level. Once an association has been established, allele resolution typing could be informative if a sufficient number of samples is available for statistically powered analysis. qKAT is simple, one-step and flexible, in that a single gene, or combinations of genes, can be typed alone at minimal cost or as required (e.g. KIR A/B haplotype-defining genes). To date, 21 published studies (comprising 20,000 samples in total) including investigations of KIR disease association, function, expression and imputation have utilised the method [13, 31, 34, 38, 43, 45–60]. A KIR typing service using qKAT has also been established at the Addenbrooke’s Hospital Histocompatibility and Immunogenetics (Tissue Typing) laboratory in Cambridge (UK).