We have shown our methods to be effective for robust multiplex SNP genotyping using APEX, with a 100% call rate and >99.9% accuracy.
DNA sequencing reactions were performed by the Nucleic Acid Protein Service Unit at the University of British Columbia.
We directly sequenced three SNP loci in three independent samples, as described in the main paper. In the first experiment, we genotyped 50 SNPs across the entire 270 HapMap Coriell DNA sample set. We obtained good results for 41 of the SNPs, with 99.8% genotype concordance with HapMap data at an automated call rate of 94.9%. However, for every Coriell sample, the DNA template was amplified in a total of 7 multiplex PCRs prior to genotyping. Multiplex PCR and subsequent amplicon fragmentation results, prior to the APEX reaction on the HapMap Chip. Standard multiplex PCR from a single Coriell DNA sample using optimally designed primers within seven unique multiplex groups, showing the wide range of amplicon sizes across the 50 SNP loci. Multiplex PCR amplification of all 50 SNP loci in a single reaction tube using the new PCR primer set, showing 50-plex PCR products from two Coriell DNA samples plus a negative PCR control. Purification, concentration and fragmentation of standard PCR amplicons.
Fragmentation of 50-plex PCR amplicons from aliquots of the lane 1 and lane 2 samples shown in Fig.
Lane 1 represents an aliquot of the concentrated mixture of all seven multiplex products shown in Fig.
Lane 2 shows the fragmentation result, generating single-stranded nucleic acid of 30-100 base length. The multiplex PCR was carried out in a 25 μL reaction containing 20 nM of each primer plus 20 nM of the left and right linker-only primers, 200 μM dNTPs without dTTP, 160 μM dTTP, 40 μM dUTP, 6 units of HotStar Taq DNA polymerase, and 5 mM MgCl2 in 1× PCR reaction buffer with 5 ng of genomic DNA. PCR was performed using an MJR PTC 200 ThermoCycler. All primers were computationally tested against the human genome and found to amplify a single product. The new amplicon sequences were located within the amplicon sequences from the original primer pairs. Each new PCR primer had a common linker sequence designed at its 5′ end. The 3′ ends of the primers were chosen to have noncomplementary bases with respect to each other, in order to reduce the probability of primer interactions and primer-dimer formation.
For the second experiment, in order to increase the efficiency of PCR, we designed 50 5′-linker PCR primer pairs based on a Tm of 65°C ± 7°C and performed 50-plex PCR in a single reaction per sample.
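The melting-temperature window used for primer design can be illustrated with a short sketch. The source does not say how Tm was calculated, so the Wallace rule used here (2°C per A/T, 4°C per G/C, a rough estimate for short oligonucleotides) is an assumption, and the function names `wallace_tm` and `within_tm_window` are hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm estimate in °C using the Wallace rule (an assumption here;
    the paper does not state which Tm model was used)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def within_tm_window(seq: str, target: float = 65.0, tolerance: float = 7.0) -> bool:
    """True if the estimated Tm lies within target ± tolerance."""
    return abs(wallace_tm(seq) - target) <= tolerance


# Hypothetical 20-mer candidates, not primers from the study.
candidates = ["ATGC" * 5, "AT" * 10]
accepted = [p for p in candidates if within_tm_window(p)]
```

Passing `target=62.0, tolerance=3.0` applies the narrower window described for the first experiment. A nearest-neighbour Tm model would be more accurate for real primer design.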
Subsequent inspection of the SNP Chart for this genotype showed that the ASO-APEX probe intensity signals for the C allele were somewhat lower than the T allele signals.
However, when using the original PCR primer pairs, this same sample/genotype had previously been concordant with HapMap in the initial data set. Interestingly, the LDA-called genotype that had the lowest score was the same genotype as the third MACGT-called discordant genotype. Autocalling was independently undertaken. Using this training set, MACGT autocalling of the test set with a 0.01 fit threshold resulted in a call rate of 94.04% and a concordance rate of 99.94%.
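As a minimal sketch of how a call rate and a concordance rate of this kind can be computed, the following assumes that the call rate is the fraction of attempted genotypes that received a call, and that concordance is measured only over called genotypes. These definitions, the `NN` no-call label as used here, and the function name are illustrative assumptions, not the authors' code.

```python
from typing import Dict, Optional, Tuple

NO_CALL = "NN"  # label assumed for genotypes that were not called

Key = Tuple[str, str]  # (sample ID, SNP ID)


def call_and_concordance(calls: Dict[Key, str],
                         reference: Dict[Key, str]) -> Tuple[float, Optional[float]]:
    """Return (call rate, concordance rate) against a reference genotype set.

    Only genotypes present in the reference are considered. Allele order is
    ignored, so "AG" and "GA" count as the same genotype.
    """
    attempted = [k for k in calls if k in reference]
    called = [k for k in attempted if calls[k] != NO_CALL]
    call_rate = len(called) / len(attempted) if attempted else 0.0
    if not called:
        return call_rate, None
    concordant = sum(1 for k in called
                     if sorted(calls[k]) == sorted(reference[k]))
    return call_rate, concordant / len(called)


# Hypothetical toy data, not results from the study.
calls = {("s1", "rs1"): "AG", ("s2", "rs1"): "NN",
         ("s3", "rs1"): "GA", ("s4", "rs1"): "AA"}
reference = {("s1", "rs1"): "AG", ("s2", "rs1"): "GG",
             ("s3", "rs1"): "AG", ("s4", "rs1"): "GG"}
rate, concordance = call_and_concordance(calls, reference)
```

Excluding no-calls from the concordance denominator keeps the two measures independent, which is why a method can show high concordance at a reduced call rate.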
LDA with dynamic variable selection, using a slightly reduced training set, yielded genotyping results identical to manual calling, at a 100% call rate across all 50 SNPs.
Initially, MACGT cluster plots and quality control using SNP Chart were used to allow manual selection of a limited training set of samples from the data set.
With a threshold of 65, among 16 non-calls, 14 were homozygous, with 11 cases from a single SNP, rs1891403, giving a homozygous call rate of 98.9% and a heterozygous call rate of 99.7%. The third discrepancy had a relatively poor fit confidence score. Two of these genotypes were the same as the two that had been identified in the manual calling data. Separate analysis of homozygous and heterozygous cases showed that, for a threshold of 0, homozygous cases achieved a call rate of 100% with 100% HapMap concordance, whereas heterozygous cases achieved a call rate of 100% with 99.7% HapMap concordance.
Moreover, the two discrepant genotypes, both of which were incorrectly called as homozygous, had high confidence scores, consistent with any NP to make the genotype call. In summary, we have shown that a combination of multiplex PCR, redundant and robust APEX design and assay, and statistically robust autocalling can achieve 100% completion and call rate with >99.9% accuracy, for multiple SNPs and multiple samples. We believe that this is a significant improvement over other published APEX methodologies. Redundancy in genotyping arrays is associated with higher costs per SNP, concomitant with lower numbers of SNPs able to be interrogated in a given area of the microarray.
The strength of our methodology is based not on the quality of a single measurement but on the redundancy obtained from measuring the allele intensities using multiple chemistries. For research studies, a trade-off may need to be considered, given the ever-increasing need to genotype as many SNPs as possible at minimal cost per SNP, and a recent article by Smemo and Borevitz cogently argues for a reduction in the approximately 40-fold probe redundancy currently featured on Affymetrix GeneChips, which use only hybridization for allelic signal generation. Simple scatter plots for SNP rs12466929 from the 50-plex data set. All values are in log scale. Note that for every plot the x axis represents signal values for the X allele and the y axis represents signal values for the Y allele. Magenta, green, light blue and black symbols denote the classes YY, YX, XX and NN. Although the AG and GG genotype clusters overlap somewhat for this Left APEX probe plot, the plot is able to contribute to the final call for such genotypes. Performance analyses for the different data sets, addressing the redundant probe chemistry, are described below. In the first row, all four classifiers were used to give the final genotype call, and in the fourth row, only the left classifiers were used.
The leftmost column of each table indicates the combination of the four classifiers used to build the LDA model. In the last four rows, only one classifier was used at a time to give independent genotype calls using the simple LDA model. This article is published under license to BioMed Central Ltd. The APEX reaction was performed in a total volume of 40 μL by the addition of 17 μL fragmented DNA template, 1 μL of 2 pmol/μL Npg1 positive control template oligonucleotide, 25 μM of each fluorescently labeled dideoxynucleotide triphosphate, 5 U Thermo Sequenase DNA polymerase diluted in its dilution buffer, and 2× Thermo Sequenase reaction buffer.
Given the low amount of genomic DNA required for the 50-plex PCR, we have attempted APEX genotyping using our improved methodology on DNA derived from plasma samples.
The 5′ end of every oligonucleotide probe was amino-modified during synthesis, allowing its covalent attachment to the slide's pre-applied surface chemistry.
Following the printing of the arrays, the slides were incubated overnight at room temperature at 75% relative humidity to drive the covalent coupling reaction between the probes' 5′ amino group and the CodeLink slide chemistry to completion. Briefly, the APEX and ASO-APEX probe oligonucleotides were printed following the manufacturer's recommended protocols.
Each grid consisted of five spot replicates of each of the six probes per SNP, plus multiple buffer-only spots and positive control normalization spots.
The latter comprised an oligonucleotide probe based on a plant-specific gene sequence that will extend by a single N base in the presence of an exogenous complementary template oligonucleotide in the APEX reaction mixture.
Arrays were generously printed for us at the Microarray Facility of The Prostate Centre at Vancouver General Hospital. Each spot was approximately 110 μm in diameter. This enabled a useful degree of robustness in the system, especially helpful in cases of high local background and hybridization problems. Three replicate grids were printed on every slide, enabling three samples to be genotyped per slide. We considered two different training sets, one with a small number of prototypes and the other with a minimal number of prototypes for every SNP.
For LDA analysis of the 50-plex PCR chemistry, performed on a subset of 50 HapMap samples chosen randomly from the original 287 samples, we selected prototypes to build a completely new training set using MACGT clusters, verifying the chosen cases with SNP Chart. Since both had previously been concordant with HapMap in the 7-reaction multiplex PCR data set, we were interested in further study of the two discrepant genotype cases; both showed high-quality, unambiguous SNP Charts in the 50-plex PCR data set. We also attempted 50-plex PCR using the redesigned PCR primers without the common 5′ linker sequences. Specific PCR cycling conditions were adopted from a previously published study by Wang et al. This requirement can result in individual amplicon sizes in a multiplex mix ranging from 100 to >700 bp. This approach helps reduce primer-dimer formation throughout the PCR. Because of this limitation, we were not able to optimally design the primers on the basis of a balanced melting temperature. New PCR primers were designed for the 50 HapMap SNP loci, with amplicon sizes restricted to between 100 and 200 bp. To try to compensate for this potential problem, each new PCR primer had a common linker sequence designed at its 5′ end.
Thus, our new objectives were to increase the degree of multiplexing and shorten the amplicon lengths to less than 200 bp, so that all 50 SNP loci would be simultaneously and robustly amplified in a single reaction vessel.
Sequence-context problems, especially in multiplex PCR, necessitate the design of unique primers that have balanced annealing temperatures.
We randomly selected 50 of the HapMap Coriell DNA samples from our initial study for 50-plex PCR using the pool of linker-modified primers. Because the degree of multiplexing is usually limited to between four and ten amplicons per individual multiplex PCR, for our original HapMap chip the 50 SNP loci were amplified in a total of seven separate multiplex reactions. After the first few cycles of PCR, the linker sequence becomes incorporated into the amplicon sequence and is amplified with the template sequence. The linker has a high GC content, to increase the melting temperature of the primer, and a unique sequence not found in the human DNA template. For SNP genotyping, only the immediate sequence around the SNP site is of interest.
Keeping the PCR amplicon size to a minimum ensures short extension times and minimal use of reagents.
Primer annealing in later cycles of PCR should become much more sensitive and robust, provided the primers have balanced GC content.
These linkers have two properties. We initially tested multiplex PCR using all original PCR amplicon primer pairs in a single reaction. As expected, a few experimental attempts all failed to amplify even a modest proportion of the 50 amplicons. Large amplicons are optimal neither for fast PCR nor for the subsequent APEX assay, which requires amplicons to be fragmented to ~50-100 base lengths. Aliquots of PCR products were visualized with Gel Red fluorescent nucleic acid dye staining under ultraviolet illumination on a 2% agarose gel. The 7 subgroup multiplex PCR products were pooled for each individual Coriell sample and precipitated by adding 5 volumes of ice-cold 100% ethanol and 25 volumes of 10 M ammonium acetate solution. After precipitation at -20°C overnight, the mixture was centrifuged at 20800 × g at 4°C for 20 min. The supernatant was carefully removed, and the DNA pellet was washed with 400 μL ice-cold 70% ethanol. The DNA pellet was subsequently dissolved in 15 μL pure water.
The training set for MACGT was selected by manually inspecting SNP Charts for each of the SNPs across many of the 287 samples.
The final training set for the 41 SNPs was made up of 519 genotypes.
Any SNP or sample with a high rate of NNs was subject to further inspection. Genotyping was performed by MACGT using the parameters NORMALIZEGROUPOF4 = 1, GROUPOF4MEANCUTOFF = 10, PATCHGROUPSOF4 = 1, DROPNNS = 1. We identified nine SNPs on which the PCR assay performed poorly and which MACGT could not confidently score, although manual inspection of SNP Charts did show that the assays were somewhat successful, albeit non-reproducibly.
MACGT was run on just the training data, and the clusters for each SNP were manually inspected for every genotype.
All NNs were inspected within SNP Chart and manually called if possible. All prototype data were exported from SNP Chart into a format readable by MACGT. HapMap Chip four-colour microarray images showing successful demultiplexing of 50-plex PCR from two Coriell DNA samples, plus a negative control sample, prior to image analysis and automated genotyping. For discrepant genotype case 2, we found a sequence variant 52 bp downstream of SNP rs12472674, located within the antisense PCR primer site. The evidence that we have identified two hitherto unreported SNPs provides a cautionary tale. However, elimination of such sporadic genotyping errors, as well as those due to structural variation in the genome, will not cause significant departure from Hardy-Weinberg equilibrium. Multiplex PCR amplifications were performed on the Coriell genomic DNA samples.
Each multiplex PCR group had a unique combination of the primer pairs among the 7 reactions.
PCRs were initiated by a 15 min polymerase activation step at 95°C and completed by a final 10 min extension step at 72°C. The PCR cycles were as follows.
For the first experiment, PCR primers were designed to amplify the regions across the 50 SNPs, based on a melting temperature of 62°C ± 3°C. All primers were computationally tested against the human genome and found to amplify a single product. Each PCR was performed in a total volume of 15 μL containing 5 μL 10× PCR buffer, 5 mM MgCl2, 200 μM dNTPs without dTTP, 160 μM dTTP, 40 μM dUTP, 75 U HotStar Taq DNA polymerase, 1 μL 10 μM primer mixtures, and 25 ng genomic DNA. Genomic DNA and PCR master mixture were transferred into ABI 384-well reaction plates using a Biomek FX robot. PCR reactions were performed in a GeneAmp PCR System 9700 ThermoCycler.
Incorporation of the dUTP allowed the amplified DNA to be enzymatically sheared by uracil N-glycosylase to produce a DNA size of approximately 50-100 bases, optimal for hybridization to the oligonucleotides on the microarray.
Probes were synthesized at a 25 nmol scale and aliquotted into 96-well plates by Integrated DNA Technologies.
Six oligonucleotide probes for each SNP were designed using Biodata algorithms: two APEX probes, plus four allele-specific APEX (ASO-APEX) probes which include the actual SNP site at the 3′ end of the probe. Allele-specific single base extension of these ASO-APEX probes during the reaction is contingent on the presence of the actual complementary base at the SNP site in the sample template DNA. Detailed descriptions of the algorithms used in simple linear discriminant analysis (LDA) with dynamic variable selection have previously been published by our laboratory. For variable construction, each genotype call may be based on just one of the four probe sets. LDA is a supervised learning technique which requires a valid training set in order to build the classification model for each SNP. We have further developed robust assay design, chemistry and analysis methodologies, and have sought to determine how effective APEX is in comparison to leading "gold standard" genotyping platforms.
Arrayed primer extension (APEX) is a microarray-based rapid minisequencing methodology that may have utility in personalized medicine applications that involve genetic diagnostics of single nucleotide polymorphisms (SNPs). To date there have been few reports that objectively evaluate the assay completion rate, call rate and accuracy of APEX. Previous studies from our laboratory have reported APEX genotyping accuracies ranging from 98% to 99.8%, though the call rates in these studies have always been significantly lower than 100%, and usually do not include a proportion of the originally selected SNPs that fail the assay. Given the potential utility of APEX for rapid clinical diagnostics, we have developed robust assay design, chemistry and analysis methodologies, and have sought to determine just how effective APEX is in comparison to leading gold-standard genotyping platforms, including Perlegen and Illumina. Our objective was to achieve 100% assay completion rate, call rate and genotyping accuracy rate, for multiple SNPs across multiple samples. We report significant improvements to arrayed primer extension genotyping methodology that may show utility in future point-of-care genetic diagnostic applications.
Our methods have been validated against industry-leading technologies in a blinded experiment based on Coriell DNA samples and SNP genotype data from the International HapMap Project.
This set comprised 270 DNA samples from the Coriell Institute for Medical Research plus hidden duplicates and negative controls, all of which our laboratory was blinded to.
DNA samples were obtained from McGill University and Génome Québec Innovation Centre. The underlying supposition is that, with reduced call rate, accuracy should increase successively until it reaches its maximum limit. For the improved 50-plex PCR chemistry, we were able to achieve a high concordance rate with a 100% call rate.
By applying different thresholds, we can control the call rates and, given the validated genotype set, we can also check the performance level by calculating the misclassification rates.
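The trade-off between threshold, call rate and misclassification rate can be sketched as follows. This is an illustration only: it assumes each autocalled genotype carries a numeric confidence score on which higher is better, and that a validated genotype is available for comparison. The function name, the data layout and the toy scores are hypothetical and are not taken from MACGT or the LDA implementation.

```python
from typing import List, Sequence, Tuple

# (confidence score, called genotype, validated genotype)
ScoredCall = Tuple[float, str, str]


def sweep_thresholds(scored_calls: Sequence[ScoredCall],
                     thresholds: Sequence[float]) -> List[Tuple[float, float, float]]:
    """For each threshold, return (threshold, call rate, misclassification rate).

    A genotype is kept as a call only if its score is at or above the
    threshold; otherwise it is treated as a no-call. The misclassification
    rate is computed over the kept calls only.
    """
    results = []
    total = len(scored_calls)
    for t in thresholds:
        kept = [(g, truth) for score, g, truth in scored_calls if score >= t]
        call_rate = len(kept) / total if total else 0.0
        errors = sum(1 for g, truth in kept if g != truth)
        miss_rate = errors / len(kept) if kept else 0.0
        results.append((t, call_rate, miss_rate))
    return results


# Hypothetical toy data: one low-confidence call is wrong.
toy = [(0.99, "AA", "AA"), (0.95, "AG", "AG"),
       (0.60, "GG", "AG"), (0.90, "GG", "GG")]
table = sweep_thresholds(toy, [0.0, 0.65])
```

In this toy set, raising the threshold removes the one low-confidence error at the cost of a lower call rate, which is the behaviour the supposition above describes.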