Quantitative high resolution melting: two methods to determine SNP allele frequencies from pooled samples

Coral collection

Adult corals were used in all experiments. For the allele titration and SNP panel
validation experiments (peaks method), corals were collected from Trunk Reef (?18.368 S,
146.818 E) and Little Pioneer Bay, Orpheus Island (?18.604 S, 146.486 E), Great Barrier
Reef, Queensland, Australia in 2009. Samples used in the Orpheus/Magnetic population
comparison experiment (peaks method) were collected from Orpheus Island and Nelly
Bay, Magnetic Island (?19.165 S, 146.851 E). Coral samples used to validate the curves
method experiments were collected from 28 coral populations, summarized in Additional
file 1. All corals were collected under the appropriate Great Barrier Reef Marine Park Authority
permits.

DNA extraction

For the coral individuals used in the allele titration and SNP panel validation experiments
(peaks method), genomic DNA was extracted using the Qiagen DNeasy Blood and Tissue
kit (Qiagen) and was eluted into Buffer EB. Individuals sampled from Orpheus and Magnetic
Islands (peaks method) and from the 28 populations across the Great Barrier Reef (curves
method) were extracted according to the method described in Wilson et al. 26]. DNA concentrations were initially determined using the Nanodrop ND-1000 Spectrophotometer
(Thermo Scientific).

Peaks method

The peaks method was developed as a rapid means to quantify SNP allele frequencies
across several hundred SNP loci. Briefly, we first established proof of concept using
titrated allele frequencies and two SNP markers, then validated a panel of 384 SNPs
using six corals, genotyped individually and as a pool. This panel of SNPs was finally
applied to samples of two natural coral populations to detect loci with different
frequencies between the reefs. The workflow of the peaks method is illustrated in
Fig. 1.

Fig. 1. Flow chart of qHRM, including benchmarking experiments performed for each method

Pooling

In order to convert conventional HRM into a quantitative assay for pooled samples,
equal amounts of each individual’s DNA must be added to the initial asymmetric PCR
reaction. This DNA normalization step is particularly critical for pooling organisms
that may have known or unknown assemblages, infections or symbioses as the relative
amounts of target and contaminating DNA can vary among samples. To circumvent these
problems, we used quantitative PCR (qPCR) to accurately measure the quantity of coral
DNA in individual samples. Primers to amplify the SNP locus C23209S177 27] were used in a conventional qPCR reaction performed under the following conditions
using a LightCycler 480 machine (Roche): 5 ng holobiont DNA, 0.1 ?M each forward and
reverse primers, 2 mM MgCl2 and 1x High Resolution Master Mix (Roche) in a 15 ?l volume. Reactions were heated
to 95 °C for 10 minutes, then cycled 55 times as follows: 40 s at 95 °C, 40 s at 60 °C,
40 s at 72 °C, then cooled to 40 °C and held for 20s. Concentrations of target DNA
were normalized based on differences in quantification cycle (Cq) by assuming that
the fold-difference per each Cq unit difference was equal to 1/E, where E is the amplification factor per PCR cycle for the particular primer set 28]. The accuracy of this approach was verified by re-amplifying the adjusted DNA concentrations
again with the same primer pair.

Internal melting calibrators

Though within-plate variation among replicates was low for the peaks method, we noticed
that between-plate variation could shift the melting profile of all duplex DNA species
(i.e. for an analyzed heterozygote, such species include two perfect-match amplicons,
two mismatched amplicons, one perfect-match probe-SNP duplex, and one mismatch probe-SNP
duplex; Fig. 2a,b) by as much as 2 °C. To accommodate this phenomenon, which occurred only in the
peaks method, we included standard internal melting calibrator primers with each reaction
in order to provide landmarks in the melting profile. Four oligonucleotide calibrators
were designed to form two complementary DNA duplexes with known melting temperatures.
They were based on sequences from Gundry et al. 29] but were adjusted to melt at the slightly lower and higher temperatures of 47.5 °C
and 90 °C than the published sequences in order to melt well outside of the range
of the probe and amplicon duplexes. The 3? terminal ends of each of the four oligos
were modified with an inverted T to block any potential primer extension from occurring.
The forward sequence of the low-melting calibrator used is 5?- ATT TTA TAT TTA TAT
ATT TAT ATA TTT TT/3InvdT/ -3?, while the forward sequence of the high-melting calibrator
is 5?- GCG CGG CCG GCA CTG ACC CGA GAC TCT GAG CGG CTG CTG GAG GTG CGG AAG CGG AGG
GGC GGG/3InvdT/. The calibrator oligos were included in the initial asymmetric PCR
reaction at a concentration of 0.05 ?M for each of the high-melting oligos and 0.5 ?M
for the low-melting oligos; addition of the calibrators did not interfere with amplification
of the target products.

Fig. 2. Schematic diagram of qHRM. An individual heterozygous for a particular SNP locus is
shown as an example. a For both methods, the locus is amplified from genomic DNA using asymmetric PCR to
generate an excess of the reverse strand. Next, an oligonucleotide probe is added
to the reaction. b The mixture is heated to melt all duplexes apart, and subsequently cooled rapidly
to promote unbiased duplex reformation of all possible duplex DNA species. For a biallelic
SNP site, this produces six distinct types of duplexes: two perfect-match amplicon
duplexes, two mismatch amplicon duplexes, one perfect-match probe duplex, and one
mismatch probe duplex. c For the peaks method, the reaction is heated again to denature all duplexes while
the fluorescence of the reaction, which quantifies double stranded DNA, is monitored.
d Next, the negative first derivative of the decreasing fluorescence curve is calculated
to transform the curve into a peaks profile. The height of the lowest-melting, mismatched
probe duplex peak is divided by the total heights of both probe peaks to yield the
frequency of the mismatched allele in the sample. e For the curves method, the same process was repeated as described for panel c. Three
known samples that consist of each genotype (i.e. heterozygote, high-melting homozygote
and low-melting homozygote) were used as references. f Melt curves from the raw channel were normalized by averaging fluorescence value
outside a melting phase and forcing the values to be the same for each sample

Asymmetric template amplification and melt stage

In HRM, the asymmetric PCR stage serves to generate an excess of the amplicon strand
that is complementary to the probe. This excess strand can then form a duplex with
the unlabeled probe, whereas if it was not in excess it would preferentially bind
to the longer, complementary amplicon strand instead. Though some versions of HRM
use symmetric PCR, their applications typically involve whole-amplicon melting analysis.
In contrast, unlabeled probe HRM, which yields the increased sensitivity necessary
for SNP discrimination, requires asymmetric PCR in order to outcompete the amplicon.
The asymmetric PCR reaction prior to each assay’s melt stage was performed as follows
on the LightCycler 480 (Roche): 5 ng total pooled DNA, 0.3 ?M reverse primer, 0.067 ?M
forward primer, 0.05 ?M each high-melting calibrator forward and reverse oligos, 0.5 ?M
each low-melting calibrator forward and reverse oligos, 2 mM MgCl2, and 1x High Resolution Master Mix (Roche). Reactions were heated to 95 °C and held
for 10 m, then cycled as follows: 40 s at 95 °C, 40 s at 60 °C, 40 s at 72 °C for
55 cycles, which, for most SNPs, was at least 10 cycles past the beginning of the
reaction plateau. The reaction was then finally cooled to 40 °C and held for 20 s.

Following the asymmetric target amplification, the reaction plates were unsealed and
0.5 ?M of the oligo probe complementary to the excess reverse strand of the amplicon
was added. The plates were resealed and then heated to 95 °C for one minute to melt
all double-stranded DNA species apart. Subsequently, the plates were rapidly cooled
to 45 °C and held for one minute to allow reannealing of all possible probe and amplicon
duplexes. Next, the reactions were heated to 95 °C at the maximum rate allowed by
the Roche LightCycler 480 (0.02 ° C/s). During this heating period, the LightCycler
480 monitored the amount of fluorescence over time and collected over 1300 data points
over a 50 °C range, producing a high-resolution graph of the melting profile of each
species of duplex DNA in the reaction.

Analysis of denaturation profiles

We calculated the negative first derivative of the melt stage’s decreasing fluorescence
curve to produce a profile with discrete peaks centered at the melting temperature
of each dissociating duplex (Fig. 2c,d). Next, we drew a baseline connecting the linear regions bounding the melting
peaks to determine the background rate of dye disassociation. Then, the height of
the probe melting peaks above the baseline was measured using a custom [R] script
(see Additional file 2). For pooled reactions, three technical replicates were performed for each locus
and sample pair and their relative peak heights averaged. The genotype of a sample
was determined by the presence and position of the probe peaks, relative to other
samples for the same locus and to the calibrator melting peaks. The frequency of the
low-melting allele in a pooled sample was estimated by the contribution of its corresponding
melting peak to the total summed height of probe-related melting peaks. Across all
SNP assays, a minor allele was called only if it exceeded a frequency of 2 %.

DNA titrations

To demonstrate a linear relationship between known frequencies of a given allele and
the qHRM-estimated frequency estimated from a pooled sample, we first identified two
individuals homozygous for different alleles of the same SNP. We performed this experiment
for two SNP loci mined from the A. millepora transcriptome, C22162S248 and C45133S676 27]. We normalized the concentrations of DNA with qPCR as described above, then mixed
the two homozygotes in different proportions to obtain mixtures that incremented the
frequency of the low-melting allele from 0–100 % in 5 % or 10 % steps (Additional
file 3). qHRM was then performed on each mixture with five replicates for each mixture for
two independent normalization and mixing trials, for both SNPs assayed. The estimated
allele frequencies for all replicates of each mixture were then averaged and compared
to the known allele frequencies.

Analysis of 384 loci in six individuals

DNA samples from three adult coral individuals sampled from Trunk Reef and three from
Orpheus Island were genotyped individually by conventional HRM for 384 loci selected
from SNPs previously developed for a previously published coral linkage mapping project
27] (Additional file 4). Following this, the samples were normalized for coral DNA concentration as explained
above and pooled to generate a DNA sample with known allele frequencies for each SNP
locus. qHRM was performed in triplicate on the pooled sample for the same 384 SNPs.
To fully validate qHRM of each SNP in this panel, we only analyzed data from SNPs
for which all six individuals were both successfully amplified and had easily distinguishable
peaks when analyzed using conventional HRM, and which had clear qHRM peaks greater
than 2 % of the total fluorescence signal in the reaction (usually relative to the
amplicon peak’s height). At least two successful replicates were required for each
SNP in the qHRM reactions for analysis to proceed.

Analysis of 384 loci in 98 individuals collected from two locations

We used the peaks method to compare allele frequencies in two populations of corals
collected from Magnetic Island (n?=?48) and Orpheus Island (n?=?50). DNA was normalized, pooled according to reef of origin, and qHRM-genotyped
in triplicate using a panel of 384 SNPs. Nineteen of these SNPs were further validated
by individual HRM genotyping of each individual contributing to the pools. The set
of SNPs to validate was selected to represent loci where the low-melting allele was
major (8 SNPs) or minor (11 SNPs), loci with small or large minor allele frequencies
(11 SNPs with MAF??0.25; 8 SNPs with MAF??0.25), SNPs with more than two alleles
(7 SNPs with third alleles in one or both populations), and loci selected to span
the range of estimated FST values (7 SNPs with FST less than 0.01; 4 SNPs with FST between 0.01 and 0.02; 8 SNPs with FST greater than 0.02, including four of the top seven SNPs with the greatest FST).

Analysis of genetic subdivision between coral populations

SNP allele frequencies were compared between Orpheus and Magnetic populations by plotting
the averaged qHRM-estimated frequencies against each reef.

FST for each SNP was calculated according to Equ. 1, where p is the frequency of the high-melting allele, q is equal to (1-p), is the average frequency of p allele between the two populations, and is equal to .

(1)

Global FST was calculated by averaging all SNP FST values.

Curves method

The curves method was developed to extend qHRM and demonstrate functionality across
another dye chemistry and analytical approach. This method is best suited for highly
accurate determination of allele proportions across small numbers of SNPs. It requires
the usage of three reference sequences (two homozygous samples and a heterozygous
sample) for comparison with an unknown, pooled sample. We demonstrate that reactions
are extremely reproducible and robust among different markers and pooled samples.
The workflow of the curves method is illustrated in Fig. 2. Briefly, the curves method was validated through an initial proof of concept experiment
using different known proportions of alleles for a single SNP marker. This marker
was additionally validated on three DNA pools made from 43–56 individuals representing
distinct coral populations by comparing the true allelic frequency of the pool (revealed
by summing the genotypes of each individual) to the qHRM-estimated frequency. Thirdly,
reproducibility of this method was investigated by using five SNP markers for pooled
samples representing 25 coral populations.

Pooling

Each individual sample was amplified individually in an asymmetric qPCR reaction using
primers for the locus C70S236 27] under the following conditions using a Rotor-Gene Q machine (Qiagen): 0.1 ?M forward
primer, 1 ?M reverse primer, 10 ng holobiont DNA, 1× 7.5 ?l of the Accumelt HRM supermix
(Quanta Biosciences) in a 15 ?l volume. The reaction mixture was heated at 95 °C for
10 min, then cycled 40 times with the following thermoprofile: 95 °C for 40 s, 58 °C
for 40 s, and 72 °C for 40 s. DNA concentrations were normalized according to Pfaffl
28] as summarized in the Peaks – pooling section.

Asymmetric template amplification and melt stage

After normalization of individual samples, each individual was combined into a single
pooled sample. The pooled sample was then was amplified asymmetrically with the Rotor-Gene
Q machine (Qiagen) with the locus C70S236. Next, 1 ?M of probe complementary to the
excess strand was added to each reaction. Then, the reactions were heated to 95 °C
and held for 60 s, rapidly cooled to 45 °C and held for 150 s, heated again to 95 °C
at a rate of 0.1 °C/s with a 2 s hold each step, collecting 500 data points in total.

Analysis of denaturation profiles

Because many factors such as pipetting errors causing variations in quantity of fluorescent
dye and DNA can affect the relative fluorescence levels among normalized DNA samples,
it is important to standardize the fluorescence levels to exclude the noise for accurate
allele frequency measurements. Fluorescence values of the SNP-specific melt curves
were standardized by selecting regions before and after the probe melting phase where
nucleotide differences do not lead to variations in fluorescence level, then averaging
fluorescence levels of the selected regions among all samples using the Rotor-Gene
Q Series Software 2.0.2.4 (Corbett) in order to mitigate the effects of sample- and
SNP-specific variation (Fig. 2e). The exclusion of noise through this normalization process allows the direct comparison
of melting curves from different reactions.

The inclusion of three standard reference samples of known genotype (two different
homozygotes and a heterozygote) allows for even small differences in allele frequency
to be resolved. These standards set a reference fluorescence level to which unknown
samples can be compared for absolute allele frequency quantification (Fig. 2f) and also serve to reduce variations between runs.

The y-axis point at which the greatest difference in fluorescence between the two
homozygotes occurs is used to determine the relative proportion of alleles in an unknown
sample as it gives the best highest accuracy in estimating allele frequencies (Additional
file 5). This point should also be near where the inflection point of the heterozygote’s
curve falls. But, because the heterozygote can suffer from amplification biases, the
position of the inflection point can vary. We see increased gains in accuracy by calculating
the allele frequency of the unknown sample with respect to the heterozygote rather
than simply using the midpoint of the two homozygotes. For example, if an unknown
sample has a higher frequency of the high-melting allele (i.e. the unknown’s curve
falls above of the heterozygote’s curve and/or the point of the greatest distance
between homozygotes), then Equ. 2 can be used to estimate of the proportion of the high-melting allele. If the unknown
has a higher frequency of the low-melting allele (i.e. the unknown’s curve falls below
the heterozygote’s curve and/or the point of the greatest distance between homozygotes),
then Equ. 3 should be used to calculate the frequency of the high-melting allele. In both equations,
we calculate the proportion of fluorescence at the inflection point of the unknown
sample (x) relative to the heterozygote (fhet). After adjusting for empirical differences between the known heterozygote and one
of the known homozygotes (fhigh or flow), we add or subtract the value from 0.5 to determine the allele frequency of the
unknown sample relative to the low-melting homozygote.

(2)

(3)

DNA titrations and method validation

The use of three reference genotypes per SNP assay and the fluorescence standardization
step should theoretically allow for great sensitivity to detect allele frequency differences
among pooled samples. To assess this, we first identified multiple individuals of
each of the three possible genotypes for the locus C70S236 and normalized DNA concentrations
of each sample. We then combined them into one pool per genotype and mixed them in
varying proportions to generate a gradient of alleles spanning 0–100 % of the low-melting
allele by 5–10 % increments. The estimated allele frequencies for two technical replicates
of each mixture were averaged and compared to the expected allele frequencies (Additional
file 3).

Application to large pools of DNA

DNA sampled from individual corals sourced from Myrmidon (n = 43), Night (n = 56) and Wilkie (n = 49) Reefs was normalized and individually genotyped using SNP C70S236. Normalized
samples were then pooled by population and quantitatively genotyped in duplicate in
order to examine the accuracy of the curves method when used with many individuals
per pool.

qHRM reproducibility

Next, we evaluated the reproducibility of qHRM on 25 populations sampled along the
Great Barrier Reef. We pooled DNA from 35–56 individuals for each population, and
then performed curves qHRM on each pool in duplicate for five SNPs (C29226S281, C70S236,
C11461S560, C16774S791 and C20479S292), which were all individually normalized as
per above.

Consistency between the two replicates was measured by estimating the half-confidence
interval of the difference as follows:

Standard deviations (SD) and mean values were calculated using fluorescence levels
in the temperature range where the largest difference in fluorescence levels between
the two homozygotes were observed (Fig. 2f). A z value of 1.96 was used so the measure of precision represents half of a 95 %
confidence interval.

We further examined the qHRM reproducibility for the 5 markers and 25 reefs by comparing
mean errors. Errors in percentage were calculated as follows:

where f1 and f2 are two replicates of fluorescence data. Geometric means of errors were used as population-
and marker-specific errors.