Technical Note: With the same set of samples from the same labs you can get very different averages!

This post originated from a dialog with a Ph.D. in Molecular Genetics with whom I often discuss many aspects of microbiome analysis.

The root of the problem is how many “Reads” from a 16S sample you deem to be the threshold for reliability. A “Read”, “num_hits” or “Count” is the number of matches found in the sample to a specific pattern in a reference library. These are “best efforts” identifications and are not always correct.

Accuracy can be as low as 62% [Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories]. It is generally assumed that a single “Read” is questionable. Commercial labs and test providers will often still count single reads so that they can claim to identify more bacteria than the competition. Accuracy is rarely a marketing concept.

To this end, we processed the largest collection of samples from one lab at different read thresholds to see what happens. The higher the number of reads required for a sample to be included, the higher the resulting averages.

obs	mean	stddev	median	boxplotlow	boxplothigh	tax_name	rank	Reads
471	386.5	5301.7	30	10	70	Neisseria	genus	1
242	733.6	7387.0	50	10	110	Neisseria	genus	2
136	1275.8	9835.5	70	30	210	Neisseria	genus	3
95	1800.3	11747.6	80	20	300	Neisseria	genus	4
68	2491.2	13853.4	120	0	360	Neisseria	genus	5
55	3059.5	15375.2	160	0	380	Neisseria	genus	6
41	4071.7	17748.5	200	0	500	Neisseria	genus	7
30	5517.0	20650.4	240	0	778	Neisseria	genus	8
21	7825.7	24488.4	350	50	1532	Neisseria	genus	9
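The effect in the table can be reproduced in miniature. The following is only a sketch under stated assumptions: the `samples` list and the `summarize` helper are hypothetical, the numbers are invented for illustration, and the lab's actual processing is not described in this post. It shows how raising the minimum read threshold drops the low-count samples and pushes the mean and median up.

```python
from statistics import mean, median

# Hypothetical toy data: (reads, reported_value) per sample for one genus.
# These numbers are invented for illustration; they are not from any lab.
samples = [(1, 10), (1, 20), (2, 50), (3, 70), (4, 300), (6, 900), (9, 4000)]

def summarize(samples, min_reads):
    """Return (obs, mean, median) over samples with at least min_reads reads."""
    kept = [value for reads, value in samples if reads >= min_reads]
    return len(kept), mean(kept), median(kept)

# Compare the summary statistics at a few read thresholds.
for threshold in (1, 2, 4):
    obs, avg, med = summarize(samples, threshold)
    print(f"min reads {threshold}: obs={obs} mean={avg:.1f} median={med}")
```

As the threshold rises, the number of observations falls while the mean and median climb, which is the same pattern seen in the Neisseria rows.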

Two labs may report different reference ranges for the simple reason that one requires at least 2 reads and the other requires at least 4. This decision is often well hidden from the consumer. If the reference ranges are based on 4 reads and you apply them to 1-read samples, you will get a lot of false “too high” and “too low” results.

For the example bacteria above, a 1-read reference range would have an average of 386, while a 4-read reference range would have an average of 1800. So a sample with a value of 800 from 2 reads would be about 2x the average for one reference range and about 1/2 the average for the other.
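The arithmetic can be checked with a few lines of Python. This is a minimal sketch: the two means come from the Neisseria rows of the table above, and the variable names are my own.

```python
# Mean values for Neisseria from the table above, keyed by minimum reads.
reference_mean = {1: 386.5, 4: 1800.3}

sample_value = 800  # the example sample from the text

# Ratio of the sample to each reference average.
ratios = {reads: sample_value / avg for reads, avg in reference_mean.items()}

for min_reads, ratio in ratios.items():
    print(f"vs {min_reads}-read reference: {ratio:.2f}x the average")
```

The same sample comes out at roughly 2.07x the average against the 1-read reference and roughly 0.44x against the 4-read reference.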

This is part of the complexity of doing microbiome analysis and understanding the mechanisms involved, mechanisms that are often not understood by the labs and kit providers.

Microbiome Prescription Blog

A site exploring the microbiome, what it affects and how to manipulate it.
