Building statistical foundation for next-gen forensic DNA profiling

Published 2 August 2018

DNA is often considered the most reliable form of forensic evidence, and this reputation is based on the way DNA experts use statistics. When they compare the DNA left at a crime scene with the DNA of a suspect, experts generate statistics that describe how closely those DNA samples match. A jury can then take those match statistics into account when deciding guilt or innocence.

These match statistics are reliable because they’re based on rigorous scientific research. However, that research only applies to DNA fingerprints, also called DNA profiles, that have been generated using current technology. Now, scientists at the National Institute of Standards and Technology (NIST) have laid the statistical foundation for calculating match statistics when using Next Generation Sequencing, or NGS, which produces DNA profiles that can be more useful in solving some crimes. This research, which was jointly funded by NIST and the FBI, was published in Forensic Science International: Genetics.

“If you’re working criminal cases, you need to be able to generate match statistics,” said Katherine Gettings, the NIST biologist who led the study. “The data we’ve published will make it possible for labs that use NGS to generate those statistics.”

How to create a DNA profile
NIST notes that to generate a DNA profile, forensic labs analyze sections of DNA, called genetic markers, where the genetic code repeats itself, like a word typed over and over again. Those sections are called short tandem repeats, or STRs, and the number of repeats at each marker varies from person to person. The analyst doesn’t actually read the genetic sequence inside those markers, but simply counts the number of repeats at each one. That yields a series of numbers that, like a long Social Security number, can be used to identify a person.
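The repeat-counting idea can be illustrated with a small Python sketch. It is a toy model, not the method forensic labs actually use: the marker names, repeat motifs and sequences below are made up for illustration, and the sketch keeps one number per marker to mirror the description above.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif at a time while the repeat continues.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Hypothetical markers: (repeat motif, toy DNA sequence around the repeat).
toy_markers = {
    "marker_A": ("GATA", "CCT" + "GATA" * 7 + "GGA"),
    "marker_B": ("TCTA", "AA" + "TCTA" * 11 + "TT"),
}

# The "profile" is just the repeat count at each marker.
toy_profile = {
    name: count_tandem_repeats(seq, motif)
    for name, (motif, seq) in toy_markers.items()
}
print(toy_profile)  # {'marker_A': 7, 'marker_B': 11}
```

The resulting dictionary of counts plays the role of the "series of numbers" described above. Comparing two such dictionaries marker by marker is the toy equivalent of comparing two profiles.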

STR-based profiling was developed in the 1990s, when genetic sequencing was hugely expensive. Today, NGS makes sequencing cost-effective for biomedical research and other applications. NGS can also be used to create forensic DNA profiles that, unlike traditional STR profiles, include the actual genetic sequence inside the markers. That provides a lot more data.
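A short Python sketch can suggest why reading the sequence adds information. It assumes, as a simplification, that a traditional profile records only how many motif-sized units fit in the marker, while a sequence-based profile records the letters themselves. The two allele sequences and the four-letter motif length are invented for illustration.

```python
MOTIF_LENGTH = 4  # assumed repeat-unit length for this toy marker


def length_based_call(allele_sequence: str) -> int:
    """Toy stand-in for a count-only call: number of motif-sized units."""
    return len(allele_sequence) // MOTIF_LENGTH


# Two hypothetical alleles with the same length but a one-letter difference.
allele_from_sample_1 = "TCTA" * 10
allele_from_sample_2 = "TCTA" * 6 + "TCTG" + "TCTA" * 3

same_count = length_based_call(allele_from_sample_1) == length_based_call(allele_from_sample_2)
same_sequence = allele_from_sample_1 == allele_from_sample_2

print(same_count)     # True: a count-only profile cannot tell them apart
print(same_sequence)  # False: the sequences themselves differ
```

In this toy case the count-only comparison treats the two alleles as identical, while the sequence comparison separates them, which is the kind of extra data the paragraph above refers to.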