Sanjay Singh, Eurofins
Food Genomics

How is DNA Sequenced?

By Sanjay K. Singh, Douglas Marshall, Ph.D., Gregory Siragusa, Ph.D.

A guide through the genomics language barrier.

Our objective here is to provide a brief introduction to the technologies used for next-generation sequencing (NGS). Executing a sequencing project with any of the NGS technologies involves three steps:

  1. Library preparation: Generating small pieces of DNA so that they can be read in parallel
  2. Sequencing and imaging: Determining the sequence of the bases in immobilized DNA molecules in a massively parallel manner
  3. Data analysis a.k.a. bioinformatics: Piecing together the bits and pieces of the sequence collected in the second step into one logical, massive and contiguous sequence.
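To make the third step more tangible, here is a minimal Python sketch of how overlapping reads can be merged back into one contiguous sequence. It is a toy greedy approach chosen for illustration, not the method of any particular NGS platform or assembler; the function names, the `min_len` threshold and the example reads are all hypothetical. Real assemblers also have to cope with read errors, repeats and reverse-complement strands, none of which is handled here.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences that share the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical short reads drawn from one made-up sequence
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

With these three made-up reads, each pair of neighbors shares four bases, so the sketch stitches them into a single 13-base sequence.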

Before going much further, we have constructed a table of some important terms for your reference (see Table 1).

Term | Brief Definition/Translation
Read Depth (or Sequencing Depth) | Number of times a given stretch of sequence is read for a single sample. A single read can contain errors, so multiple reads are desired for data quality.
Read Length | Length (bp) of an individual read
Coverage | A measure of the fraction of the total genome represented in the sequence data with a particular level of accuracy.
Library Preparation | The first step in the NGS workflow, which involves fragmenting the target DNA to a size compatible with the NGS platform and preparing the fragments for sequencing, i.e., by attaching adaptors.
bp, kb, Mb | Measures of read size or genome size: base pairs, kilobases (1,000 bp), megabases (1 million bp).
Read Quality | Number of incorrectly read bases in a sequence
FASTA and FASTQ files | Text files containing the sequence; FASTQ files also record a quality score for each base
DNA Extraction | Wet chemistry protocol to isolate high-quality DNA from a specimen
“Quality of DNA” | Indicators of quantity (ng/mL or ng), purity and molecular weight of DNA extracted from a sample
“Just send me your DNA’s” | Refers to mailing or bringing DNA extracted from a sample to the sequencing lab
Table 1. Important NGS terms
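The relationship between read depth, read length and coverage in the table can be made concrete with a little arithmetic. The sketch below assumes the usual idealized model in which reads land uniformly at random across the genome: mean depth is then total bases sequenced divided by genome size, and the expected fraction of the genome touched by at least one read is approximately 1 − e^(−depth). The function names and the example numbers (100,000 reads of 250 bp against a 5 Mb genome) are hypothetical, and real data will deviate from the model.

```python
import math


def mean_depth(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average number of times each base is read: total bases / genome size."""
    return num_reads * read_length_bp / genome_size_bp


def expected_fraction_covered(depth: float) -> float:
    """Expected fraction of the genome hit by at least one read.

    Assumes reads fall uniformly at random (an idealized Poisson model).
    """
    return 1.0 - math.exp(-depth)


# Hypothetical project: 100,000 reads of 250 bp against a 5 Mb genome
depth = mean_depth(100_000, 250, 5_000_000)
print(f"mean depth: {depth:.1f}x")
print(f"expected fraction covered: {expected_fraction_covered(depth):.4f}")
```

Under these assumptions the example works out to 5x depth, with a little over 99% of the genome expected to be seen at least once.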

During library preparation, genomic DNA is randomly broken into pieces, typically less than 1,000 bp long, and adaptors (synthetic double-stranded (ds) DNA fragments of known sequence) are then ligated to the ends of the sheared DNA. A common theme across the NGS technologies is that millions of these adaptor-flanked DNA templates are attached to solid supports, although the attachment method differs by platform. This spatial distribution of immobilized templates allows millions to billions of sequencing reactions to be run simultaneously. For example, the first next-gen sequencers, launched by the company 454 Life Sciences, use tiny beads that carry several DNA strands complementary to a segment of the added-on adaptor, under conditions in which one template (the piece of DNA to be sequenced) attaches to each bead. PCR is then used to generate multiple copies (millions) of that fragment on the surface of each bead.
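As a rough illustration of the fragmentation and adaptor-ligation steps just described, the following Python sketch cuts a made-up sequence into random-length pieces and flanks each piece with adaptor sequences. Everything here is an assumption for demonstration: the adaptor strings are placeholders rather than any platform's real adaptors, the fragment sizes are arbitrary, and a real library is built from many copies of the genome, so its fragments overlap rather than tiling a single copy end to end.

```python
import random

# Placeholder adaptor sequences (hypothetical, not real platform adaptors)
ADAPTOR_5 = "ACACGT"
ADAPTOR_3 = "TGCATG"


def shear(genome: str, min_len: int, max_len: int, rng: random.Random) -> list[str]:
    """Cut one copy of `genome` into consecutive pieces of random length.

    The final piece may be shorter than `min_len` if little sequence remains.
    """
    fragments = []
    pos = 0
    while pos < len(genome):
        size = rng.randint(min_len, max_len)
        fragments.append(genome[pos:pos + size])
        pos += size
    return fragments


def ligate_adaptors(fragments: list[str]) -> list[str]:
    """Flank every fragment with the known adaptor sequences."""
    return [ADAPTOR_5 + fragment + ADAPTOR_3 for fragment in fragments]


rng = random.Random(0)          # fixed seed so the run is repeatable
genome = "ACGT" * 500           # made-up 2,000 bp "genome"
library = ligate_adaptors(shear(genome, 50, 200, rng))
print(len(library), "adaptor-flanked fragments")
```

Because every fragment now begins and ends with a known sequence, the same primers can be used on all of them, which is what makes the massively parallel step possible.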

While different NGS technologies use different sequencing chemistries to determine the sequence, all NGS protocols use smaller quantities of reagent per sequencing reaction than Sanger techniques and increase the amount of sequence data collected by multiple orders of magnitude. Both of these advances help lower the cost of sequencing. Since sequencing reactions are performed using immobilized DNA fragments, the features of the recorded signal (typically fluorescence or light emitted during the extension of the primer) are on the scale of microns (i.e., smaller than the thickness of a human hair). Therefore, an image of reasonable surface area can provide information on millions of sequencing reactions being run in parallel. Picture a screen with many different colored dots appearing and disappearing in all parts of the screen, each representing a nucleotide base being detected and recorded into a sequence.
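The "colored dots" picture can be sketched in a few lines of Python. The example assumes a simplified four-color scheme in which each spot on the image produces one intensity value per base (A, C, G, T) in every cycle, and the brightest channel is taken as the base call. The channel order, function name and intensity values are hypothetical; actual platforms differ in how many colors they use and in how they correct the raw signal.

```python
CHANNELS = "ACGT"  # assumed order of the four intensity channels


def call_bases(cycles: list[tuple[float, float, float, float]]) -> str:
    """Turn per-cycle channel intensities for one spot into a base sequence.

    Each element of `cycles` holds the (A, C, G, T) intensities recorded
    for a single imaging cycle; the brightest channel wins.
    """
    bases = []
    for intensities in cycles:
        brightest = max(range(len(CHANNELS)), key=lambda k: intensities[k])
        bases.append(CHANNELS[brightest])
    return "".join(bases)


# Made-up intensities for one spot over three cycles
spot = [
    (0.90, 0.10, 0.00, 0.10),
    (0.10, 0.00, 0.20, 0.80),
    (0.00, 0.95, 0.10, 0.00),
]
print(call_bases(spot))
```

An instrument repeats this kind of decision for every spot in every image, which is how one picture yields base calls for millions of templates at once.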

In the case of the 454 Life Sciences sequencers, sequencing is conducted by a process called pyrosequencing, in which a clever use of the luciferase enzyme makes every base incorporated give off a burst of light. In a single run, the 454 instrument can obtain around 400,000 reads with lengths of 200 to 400 bp. Several other NGS platforms have since emerged and have further reduced the cost of sequencing a genome (see Table 2).

Platform | Instruments | Read Lengths (bp)
Illumina | MiniSeq, MiSeq, NextSeq, HiSeq, HiSeq X | 125–600
Ion Torrent | Proton, PGM | 200–400
Pacific Biosciences | PacBio RS, PacBio RS II | 4,600–14,000
Roche 454 | GS FLX, GS FLX+ | 400–700
SOLiD | 5500, 5500xl, 5500 W | 100
Table 2. NGS platforms
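Returning to the pyrosequencing chemistry used by the 454 instruments, the light signal can be turned into sequence with very simple logic. The sketch below assumes that nucleotides are flowed over the beads one type at a time in a fixed, repeating order and that the light recorded for each flow is roughly proportional to the number of bases incorporated, so a run of identical bases gives a proportionally brighter flash. The flow order "TACG", the function name and the signal values are assumptions for illustration; real instruments apply considerably more signal correction.

```python
def decode_flowgram(signals: list[float], flow_order: str = "TACG") -> str:
    """Convert per-flow light intensities into a base sequence.

    Each signal is rounded to the nearest whole number of incorporated
    bases; a signal near 0 means the flowed nucleotide was not incorporated.
    """
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)  # round half up (signals are non-negative)
        bases.append(base * count)
    return "".join(bases)


# Made-up light intensities for two rounds of the four nucleotide flows
signals = [1.02, 0.05, 2.10, 0.90, 0.00, 1.10, 0.10, 0.00]
print(decode_flowgram(signals))
```

The rounding step also hints at a known weakness of this approach: the longer a run of identical bases, the harder it becomes to tell, say, six incorporations from seven on brightness alone.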

In the end, all of these instruments spit out a result that is generally in the form of a file type known as FASTA or FASTQ (refer to Table 1). These files contain the sequence of A, T, C and G bases in a sample and are the starting point of the bioinformatics process, which will be covered in a forthcoming installment of this column.
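For readers who have never opened one of these files, the following Python sketch reads a FASTQ record and decodes its quality string. It assumes the common four-line record layout (an "@" header line, the sequence, a "+" separator line and a quality line) and the widely used Phred+33 encoding, in which each quality character's ASCII code minus 33 gives a score Q and the estimated error probability for that base is 10^(−Q/10). Some older files use a different offset, and the function names and the example record are made up for illustration.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (read_id, sequence, quality_scores) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        scores = [ord(ch) - 33 for ch in quality]  # assumes Phred+33 encoding
        yield header[1:], sequence, scores


def error_probability(phred_score: int) -> float:
    """Estimated chance that a base with this Phred score was called wrongly."""
    return 10 ** (-phred_score / 10)


# A made-up single-record FASTQ file held in memory
example = StringIO("@read1\nGATTACA\n+\nIIIIII#\n")
for read_id, sequence, scores in parse_fastq(example):
    print(read_id, sequence, scores)
    print("error probability of last base:", error_probability(scores[-1]))
```

In this made-up record the first six bases carry a score of 40 (about a 1-in-10,000 chance of error), while the last base scores only 2, the kind of low-confidence call that quality filtering is meant to catch.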

For the food safety professional, genomics investigations require accurate sequence information for reliable interpretation. Professionals are urged to consider certified sequencing providers that offer strong customer orientation, impeccable quality, fast service and high reliability. Poor-quality sequence information can lead to poor-quality species assignments in public databases, and faulty assignments lead to wrong bioinformatics interpretations. Recent highly sensationalized food genomics press releases showing the presence of difficult-to-believe contaminants, such as human or rat DNA in highly processed foods, may be due to analysis of poor-quality sequence information. It is also recommended that professionals consult with organizations that have expertise in food science and technology to make sure sequence-based conclusions rest on a foundation of real and sound data.

Additional Resources

  1. Smith, L. M., Sanders, J. Z., Kaiser, R. J., et al. (1986). Fluorescence detection in automated DNA sequence analysis. Nature 321, 674–679.
  2. Ewing, B., Hillier, L., Wendl, M. C. and Green, P. (1998). Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8, 175–185.

About The Author

Douglas Marshall, Ph.D., Eurofins
Douglas Marshall, Ph.D.
Chief Scientific Officer

Douglas Marshall is chief scientific officer with Eurofins Microbiology Laboratories, Inc., a division of the global life sciences company Eurofins Scientific. He also is cofounder and director of the Food Safety Institute, LLC, an integrated consulting and analytical services company affiliated with Eurofins. He currently holds adjunct professor positions with Colorado State University and Florida State College. His former positions include associate dean and professor of public health at the College of Natural and Health Sciences, University of Northern Colorado; adjunct professor with the Colorado School of Public Health; professor of food science, nutrition, and health promotion at Mississippi State University; assistant professor of food science at Louisiana State University; contributing editor for the peer-reviewed scientific journal Food Microbiology; and four consecutive terms on the editorial board of the Journal of Food Protection. He is a frequent consultant to NIH, WHO, FAO, USDA and other government agencies and private companies. His research and expertise have been featured in popular press venues such as Consumer Reports, Fine Cooking, USA Today, Fitness, Health, Men’s Health, Chemtech, Nature Science Updates, and ASM Journal Highlights. Marshall is a frequently invited speaker and a prolific book chapter writer. With more than 250 publications and more than 150 invited presentations, his scientific research and outreach interests focus on improving the microbiological quality and safety of foods. Among these publications is the four-volume Handbook of Food Science, Technology, and Engineering, which he co-edited. He has received a number of awards for his scholarly efforts, including the Mississippi Chemical Corporation Award of Excellence for Outstanding Work and the International Association for Food Protection Educator Award.
He is a fellow of the Institute of Food Technologists, where he has previously served as chair of two divisions and two regional sections, member of the board of directors, inaugural member and chair of the International Food Science Certification Commission, and founding member of the Global Traceability Center.

About The Author

Gregory Siragusa, Eurofins
Gregory Siragusa, Ph.D.
Senior Principal Scientist

Gregory Siragusa is senior principal scientist with Eurofins Microbiology Laboratories, Inc., a division of the global life sciences company Eurofins Scientific. He has held positions with Danisco/DuPont and the USDA. He has been a reviewer for the Journal of Food Protection and Applied and Environmental Microbiology.

Siragusa’s research spans fields of microbiology focusing on foods. He has been a speaker on the topics of food genomics and antibiotic alternatives. He obtained B.S. and M.S. degrees in microbiology from Louisiana State University and a Ph.D. from the University of Arkansas. He has authored more than 100 peer-reviewed papers, chapters and abstracts. His latest activities focus on applications of genomics to food microbiology.
