Whole genome sequencing or resequencing
Large genomes, e.g. human or mouse, are sequenced on our Illumina NextSeq 500 instruments. These can sequence one whole human genome at ~40X coverage in just over a day. Smaller genomes are sequenced on our Illumina MiSeq or Ion Torrent PGM sequencers.
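The ~40X figure can be checked with a rough back-of-the-envelope calculation. The sketch below is illustrative only and rests on assumptions not stated above: a high-output run yielding about 400 million read pairs (the high-output figure quoted further down this page, interpreted here as pairs), 2 x 150 bp reads, and a human genome size of about 3.1 Gb. The function name is hypothetical.

```python
def mean_coverage(read_pairs: float, read_length_bp: int,
                  genome_size_bp: float, paired: bool = True) -> float:
    """Estimate mean coverage as total sequenced bases / genome size."""
    reads_per_fragment = 2 if paired else 1
    total_bases = read_pairs * read_length_bp * reads_per_fragment
    return total_bases / genome_size_bp


# Assumed values (not from the facility): 400 M read pairs, 2 x 150 bp,
# and a ~3.1 Gb human genome.
human_coverage = mean_coverage(400e6, 150, 3.1e9)
print(f"Estimated mean coverage: {human_coverage:.1f}X")
```

Under these assumptions the estimate comes out just under 40X. Actual coverage depends on run yield, duplicate rate, and the fraction of reads that map.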
We usually prepare DNA fragment libraries by shearing genomic DNA using our Covaris sonicators and either Illumina TruSeq PCR-free or New England Biolabs Ultra or Ultra II DNA library preparation kits. However, enzymatic fragmentation (tagmentation) is also possible using Illumina Nextera DNA library preparation kits.
The minimum amount of input DNA required varies depending on the type of library preparation. PCR-free libraries require the most material (1-3 ug), TruSeq Nano libraries require 100-200 ng, Nextera and Nextera XT libraries require 50 ng and 1 ng, respectively, and NEB Ultra and Ultra II kits require 5 ng and 500 pg, respectively. For smaller samples, whole genome amplification using a product such as Qiagen’s multiple displacement amplification needs to be performed.
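As a quick planning aid, the input requirements listed above can be written as a small lookup. This is only a sketch: the dictionary and function names are hypothetical, the PCR-free and TruSeq Nano entries use the lower end of the quoted ranges, and the thresholds are simply the figures from the paragraph above.

```python
# Minimum input per library preparation type, in nanograms,
# taken from the figures quoted in the text above.
MIN_INPUT_NG = {
    "TruSeq PCR-free": 1000.0,  # 1-3 ug quoted; lower bound used
    "TruSeq Nano": 100.0,       # 100-200 ng quoted; lower bound used
    "Nextera": 50.0,
    "NEB Ultra": 5.0,
    "Nextera XT": 1.0,
    "NEB Ultra II": 0.5,        # 500 pg
}


def feasible_kits(available_ng: float) -> list[str]:
    """Return the kits whose minimum input is met by the available DNA."""
    return [kit for kit, min_ng in MIN_INPUT_NG.items()
            if available_ng >= min_ng]


print(feasible_kits(10))   # kits usable with 10 ng of DNA
print(feasible_kits(0.1))  # empty list: whole genome amplification needed
```

An empty result corresponds to the case described above where whole genome amplification would be needed before library preparation.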
Whole exome sequencing
Since the protein-coding portion (exons) of the human genome represents only about 1% of the genome, strategies for exome sequencing have become widely used. Our facility prepares exome libraries using Agilent SureSelect Target Enrichment. The SureSelect technology uses large pools of long (120-mer) biotinylated cRNA baits to enrich for all exons, as well as optional UTR (untranslated region) or COSMIC (Catalogue of Somatic Mutations in Cancer) regions.
A large number of different SureSelect kits are available. Our facility has used the SureSelect XT kits for whole human exome (51 Mb) and whole exome plus UTR (71 Mb) regions, which require 200 ng – 3 ug of sheared DNA as input. For applications with limited sample, SureSelect QXT kits requiring only 50 ng of unsheared genomic DNA are available, and a new product with low input requirements suitable for degraded FFPE samples is expected in 2016. Specialty SureSelect kits for the Clinical Research Exome (54 Mb with enhanced whole exome coverage) or the Focused Exome (12 Mb representing only regions previously associated with disease) are also available. For non-human exomes, there are SureSelect All Exon Mouse (49.6 Mb), All Exon Bovine (54 Mb), and All Exon Zebrafish (75 Mb) enrichment kits.
An alternative supplier of biotinylated RNA bait libraries, such as those in the SureSelect kits, is MYcroarray. Their MYbaits bait libraries can be custom designed to target regions in the genomes of a wide range of organisms. The MYbaits capture workflow first requires creation of DNA fragment libraries, as for whole genome sequencing. The MYbaits are then used to enrich for the targets of interest, and the captured libraries are eluted from the capture beads and reamplified.
ChIP-Seq. DNA samples from ChIP-seq experiments are converted into DNA fragment libraries in much the same way as genomic DNA samples. The exceptions are that ChIP-seq DNA has already been sheared to ~100-300 bp and that only a few nanograms of input are available.
Cell-free DNA (CNA). Circulating nucleic acids (CNA) or cell-free DNA samples typically contain short fragments (~150 bp) contaminated with much longer genomic DNA. We can assay the size-distribution of these samples using our TapeStation and, if desired, remove high molecular weight contaminants with a magnetic bead-based size selection. The DNA can then be converted into fragment libraries using our low-input library preparation kits or selected regions can be enriched by PCR amplification.
16S rRNA Microbiome Metagenomics. We perform PCR amplification of the V3-V4 region of microbial 16S rRNA genes using the Illumina demonstrated protocol for 16S Metagenomic Sequencing Library Preparation. This protocol requires 12.5 ng of bacterial genomic DNA as input. It has the advantage of only requiring one pair of custom synthesized amplification primers because individual indexes are added in a second PCR reaction (using commercially available primers). In the first round of PCR amplification the V3-V4 region is amplified using the following primers:
16S Amplicon PCR Forward Primer =
16S Amplicon PCR Reverse Primer =
Bold font indicates sequences required for Illumina Nextera indexing; underlined font indicates the V3-V4-specific sequence.
To amplify and sequence a different region of interest, only the underlined portions of the above forward and reverse PCR primers need to be modified.
Then, in a second round of PCR amplification, Nextera XT dual indexes are incorporated. Up to 384 different samples can be uniquely indexed with this approach. Sequencing is then performed using paired-end 2 x 300 bp sequencing on a v3 MiSeq sequencing cartridge, generating ~20-25 million reads. This is sufficient for 96 samples per run at more than 50,000-100,000 reads per sample. Up to 384 samples per run can be sequenced using all four sets of Nextera XT v2 indices, but the number of reads per sample may be at or below the minimum necessary for some samples.
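The reads-per-sample figures follow from simple division of the run output by the number of pooled samples. The sketch below assumes perfectly even pooling, which real runs do not achieve, so individual samples will fall above or below these averages. The function name is hypothetical.

```python
def reads_per_sample(total_reads: int, n_samples: int) -> int:
    """Average reads per sample, assuming perfectly even pooling."""
    return total_reads // n_samples


# MiSeq v3 output range quoted above: ~20-25 million reads per run.
for n_samples in (96, 384):
    low = reads_per_sample(20_000_000, n_samples)
    high = reads_per_sample(25_000_000, n_samples)
    print(f"{n_samples} samples: {low:,} to {high:,} reads per sample")
```

At 96 samples the average is roughly 208,000 to 260,000 reads per sample, comfortably above the 50,000-100,000 target. At 384 samples it drops to roughly 52,000 to 65,000, which is why unevenly represented samples can fall below the minimum.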
As an alternative to sequencing only part of the 16S rRNA genes, the entire population of microbial genomic DNA can be sequenced as a single shotgun DNA fragment library. If sample DNA is limiting, NEB Ultra II or Nextera XT libraries requiring only 0.5-1 ng of input can be prepared. Sequencing can be performed on either MiSeq or NextSeq instruments, depending on the number of reads required. Shotgun metagenomic projects requiring only a few million reads/sample can be done on the MiSeq (maximum of 25 million reads), and projects requiring larger sequencing coverage (e.g. 10-20 million reads/sample) can be done on mid- or high-output NextSeq cartridges (120 M and 400 M reads, respectively).
Amplicon sequencing – Ion Torrent PGM
Small amplicon sequencing projects, especially those in which only a single locus is amplified (low diversity amplicons), can be sequenced on the Ion Torrent PGM because its semiconductor-based sequencing chemistry does not use fluorescently-labeled nucleotides. Instead, each sequencing bead (ISP) is contained in an individual well, and base additions are recorded as pH changes. So, unlike Illumina sequencing, low diversity samples are not a concern. Targeted sequencing using highly multiplexed amplicons is possible using Ion Ampliseq Panels. Pools of up to 6,144 primer pairs are possible (10 ng of DNA required per pool). In addition to ready-to-use panels, custom panels can be created using the Ion Ampliseq Designer primer design tool. However, this approach is limited by the relatively small sequencing output of the Ion PGM chips.
Amplicon sequencing – Illumina MiSeq
The MiSeq sequencer is the preferred platform for sequencing low diversity amplicons because its Real Time Analysis software (RTA) does a much better job of cluster identification than the corresponding software in the HiSeq and NextSeq instruments. The most common amplicon application is 16S metagenomics (see above), but other regions of interest can also be similarly amplified and sequenced if the Forward and Reverse 5’-overhangs specific for Illumina Nextera sequencing are included in the PCR primers (fusion primers), as shown below.
Forward primer: 5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG‐[locus specific sequence]
Reverse primer: 5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG‐[locus specific sequence]
Then, up to 384 combinations of dual indices can be added for multiplexing by performing a second round of PCR using Illumina’s Nextera XT PCR primers.
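Building fusion primers amounts to prepending the Nextera overhangs shown above to the locus-specific sequences. The sketch below illustrates this; the function name is hypothetical and the locus-specific sequences in the example are arbitrary placeholders, not validated primers for any real target.

```python
# Illumina Nextera overhang sequences, as given in the text above.
FORWARD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REVERSE_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# IUPAC nucleotide codes, so degenerate bases are accepted.
IUPAC_BASES = set("ACGTRYSWKMBDHVN")


def fusion_primers(locus_forward: str, locus_reverse: str) -> tuple[str, str]:
    """Prepend the Nextera overhangs to locus-specific primer sequences
    (both written 5' to 3')."""
    primers = []
    for overhang, locus in ((FORWARD_OVERHANG, locus_forward),
                            (REVERSE_OVERHANG, locus_reverse)):
        locus = locus.strip().upper()
        if not locus or not set(locus) <= IUPAC_BASES:
            raise ValueError(f"Invalid primer sequence: {locus!r}")
        primers.append(overhang + locus)
    return primers[0], primers[1]


# Placeholder locus-specific sequences, for illustration only.
forward, reverse = fusion_primers("ACGTACGTACGTACGTAC", "TGCATGCATGCATGCATG")
print("Forward primer: 5'-" + forward)
print("Reverse primer: 5'-" + reverse)
```

Primer design itself (melting temperature, specificity, amplicon length) is outside the scope of this sketch and should be done with a dedicated primer design tool.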
Primers for custom amplicon sequencing can also be designed as TruSeq Custom Amplicons using DesignStudio. Up to 1,536 amplicons can be multiplexed in a single reaction (100 ng DNA required), and up to 96 samples can be combined.