The taxonomy nightmare before Christmas… Episode III

This post continues my 2019 post, The taxonomy nightmare before Christmas…, this time with real numbers.

See also: The taxonomy nightmare — Episode II

I have just updated the site using a refinement of the Kaltoft-Moldrup algorithm that became available to the site this weekend. Before getting into the nerdy details, let us recap the purpose.

In general, studies find associations between higher or lower levels of some bacteria and symptoms or conditions. These are primitive calculations with many deficiencies: they establish association only, not causality.

The common hypothesis is that a level that is too high or too low increases the risk. A common assumption for many medical conditions (when there are insufficient studies) is that the “top or bottom 5 percent of patients” may be at risk. This can also be expressed as those below the 5th percentile or above the 95th percentile.

In many branches of physical science, these cutoffs can be computed from the mean and standard deviation. That calculation requires the data to follow a normal distribution, which is not the case with microbiome data. Our purpose is to identify the values where we suspect that the risk has become significant.
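To make the normal-distribution caveat concrete, here is a minimal Python sketch. It assumes a made-up, right-skewed sample (drawn from a log-normal distribution purely for illustration; it is not data from any lab) and compares the mean and standard deviation cutoffs with empirical percentiles.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed abundances (percent); NOT real lab data.
samples = [random.lognormvariate(0.0, 1.5) for _ in range(5000)]

mean = statistics.mean(samples)
sd = statistics.stdev(samples)

# Normal-theory cutoffs: only meaningful when the data are bell-shaped.
normal_low = mean - 1.96 * sd
normal_high = mean + 1.96 * sd

# Empirical cutoffs: quantiles(n=20) returns the 5th, 10th, ..., 95th percentiles.
cuts = statistics.quantiles(samples, n=20)
p5, p95 = cuts[0], cuts[-1]

print(f"mean - 1.96 SD  = {normal_low:.4f}")
print(f"mean + 1.96 SD  = {normal_high:.4f}")
print(f"5th percentile  = {p5:.4f}")
print(f"95th percentile = {p95:.4f}")
```

On skewed data like this, the lower mean-based cutoff falls below zero, so it can never flag a sample as too low, while the 5th percentile is still a usable positive value.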

The tables in this post illustrate the nightmare in my earlier post!

Scope of Investigations

I am going to use the bacteria cited in Dr. Jason Hawrelak's recommendations to illustrate the issues. His levels came from published studies, observations and the test results of his patients (which could have been done using labs not covered in this post).

Whether his ranges apply to your sample depends on the lab that processed the sample. In some cases, many of the labs are in reasonable agreement. In other cases, there are major differences.

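| Taxonomy | Rank | Low Percentage | High Percentage |
| --- | --- | --- | --- |
| Bacteroidia | class | 0 | 35 |
| Akkermansia | genus | 1 | 3 |
| Bacteroides | genus | 0 | 20 |
| Bifidobacterium | genus | 2.5 | 5 |
| Blautia | genus | 5 | 10 |
| Desulfovibrio | genus | 0 | 0.25 |
| Eubacterium | genus | 0 | 15 |
| Lactobacillus | genus | 0.01 | 1 |
| Methanobrevibacter | genus | 0.0001 | 0.02 |
| Roseburia | genus | 5 | 10 |
| Ruminococcus | genus | 0 | 15 |
| Proteobacteria | phylum | 0 | 4 |
| Bilophila wadsworthia | species | 0 | 0.25 |
| Escherichia coli | species | 0 | 0.01 |
| Faecalibacterium prausnitzii | species | 10 | 15 |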

What are the numbers?

Unless a number is followed by %ile, it is the reported percentage.

  • Lab Ranges use a computational method that is common with medical labs.
    • Lab Low means a value computed as Mean – 1.96 × Standard Deviation. If the value is negative, it is set to zero because we cannot have a negative count.
    • Lab High means a value computed as Mean + 1.96 × Standard Deviation.
  • KM is the Kaltoft-Moldrup algorithm, which detects data that is akin or not akin to other numbers in the data set.
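The Lab Low and Lab High definitions above can be sketched in a few lines of Python. The function name `lab_range` is my own for illustration; the mean and standard deviation are the BiomeSight Bacteroidia figures from the tables later in this post.

```python
def lab_range(mean, sd):
    """Return (Lab Low, Lab High) as mean -/+ 1.96 SD, with the low end clamped at zero."""
    low = max(0.0, mean - 1.96 * sd)  # a negative count is impossible, so clamp to 0
    high = mean + 1.96 * sd
    return low, high


# BiomeSight Bacteroidia: mean 38.6903, standard deviation 18.2871 (from this post).
low, high = lab_range(38.6903, 18.2871)
print(f"Lab Low  = {low:.4f}")
print(f"Lab High = {high:.4f}")
```

The results land very close to the 2.8476 and 74.5331 reported for BiomeSight; any difference in the last digit comes from the mean and standard deviation being rounded to four decimals.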

Bacteroidia 0 – 35%

My impression is that this measurement does not matter. All of the ranges are much greater than Jason’s guidance.


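| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 25.7360 | 26 %ile | 89.6656 | 99 %ile | 0 | 82.6916 | 40.9174 | 21.3133 |
| BiomeSight | 23.9060 | 21.4 %ile | 81.0229 | 98.4 %ile | 2.8476 | 74.5331 | 38.6903 | 18.2871 |
| Ombre/Thryve | 4.0813 | 11.8 %ile | 100 | 100 %ile | 0 | 90.2312 | 41.8556 | 24.6814 |
| uBiome | 37.0967 | 25.8 %ile | 100 | 100 %ile | 9.6504 | 88.5812 | 49.1158 | 20.1354 |

Akkermansia 1 – 3%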

The consensus pattern seems to be 0 – 10%.

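| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0030 | 8.5 %ile | 11.7689 | 97 %ile | 0 | 10.4247 | 1.7598 | .944208 |
| BiomeSight | 0.0020 | 10.6 %ile | 4.4380 | 92.6 %ile | 0 | 9.6732 | 1.4131 | .342143.2 |
| Ombre/Thryve | 0.0061 | 16.6 %ile | 10.8701 | 96.9 %ile | 0 | 10.4304 | 1.7167 | .344457.7 |
| uBiome | 0 | 0 %ile | 14.9127 | 95.9 %ile | 0 | 13.3284 | 3.2611 | .951363.5 |

Bacteroides 0 – 20%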

Jason’s high level is below the average of every lab! The consensus for the high level is around 50%.
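The claim about Bacteroides can be checked with simple arithmetic. The sketch below copies the per-lab Mean and Lab High values from the Bacteroides table in this section; the variable names are mine and only for illustration.

```python
# Hawrelak's suggested upper bound for Bacteroides, in percent (from this post).
recommended_high = 20.0

# Per-lab (Mean, Lab High) for Bacteroides, copied from the table in this section.
labs = {
    "All": (22.9882, 51.5806),
    "BiomeSight": (23.7279, 52.2943),
    "Ombre/Thryve": (21.3098, 51.6341),
    "uBiome": (25.5627, 49.4790),
}

for lab, (mean, lab_high) in labs.items():
    ratio = lab_high / recommended_high
    print(f"{lab}: mean {mean:.1f}% is above {recommended_high:.0f}%: "
          f"{mean > recommended_high}; Lab High is {ratio:.1f}x the recommendation")

# True only if every lab's average exceeds the recommended upper bound.
all_means_above = all(mean > recommended_high for mean, _ in labs.values())
print("Every lab mean above the recommended high:", all_means_above)
```

Every lab's average sits above the recommended upper bound, and each Lab High is more than double it.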

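| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 8.1100 | 20.5 %ile | 58.9340 | 98.5 %ile | 0 | 51.5806 | 22.9882 | 14.5879 |
| BiomeSight | 9.8480 | 19.8 %ile | 58.5550 | 98.2 %ile | 0 | 52.2943 | 23.7279 | 14.5746 |
| Ombre/Thryve | 2.0953 | 16.7 %ile | 52.5618 | 96.8 %ile | 0 | 51.6341 | 21.3098 | 15.4715 |
| uBiome | 18.0540 | 26.8 %ile | 45.2831 | 95.4 %ile | 1.6466 | 49.4790 | 25.5627 | 12.2021 |

Bifidobacterium 2.5 – 5%

| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0096 | 9 %ile | 8.4552 | 95.2 %ile | 0 | 13.0205 | 1.8287 | 5.7100 |
| BiomeSight | 0.0110 | 11 %ile | 7.2080 | 97.4 %ile | 0 | 7.2485 | 1.0863 | 3.1439 |
| Ombre/Thryve | 0.0053 | 7.5 %ile | 10.1484 | 93.6 %ile | 0 | 20.3438 | 3.0831 | 8.8064 |
| uBiome | 0 | 0 %ile | 7.2945 | 93.9 %ile | 0 | 12.8245 | 2.2096 | 5.4157 |

Blautia 5 – 10%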

The consensus range appears to be 5-20%

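| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 4.0480 | 19.4 %ile | 21.4510 | 95.9 %ile | 0 | 21.0998 | 8.8459 | 6.2519 |
| BiomeSight | 4.5750 | 18.7 %ile | 28.5110 | 98.2 %ile | 0 | 21.1671 | 9.1079 | 6.1526 |
| Ombre/Thryve | 2.9734 | 14.9 %ile | 23.2938 | 96 %ile | 0 | 22.2365 | 8.6022 | 6.9562 |
| uBiome | 6.6596 | 25.7 %ile | 19.0282 | 94.9 %ile | 0 | 20.1519 | 9.7336 | 5.3154 |

Desulfovibrio 0 – 0.25%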

The consensus range appears to be 0 – 1.5%.

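| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0020 | 9.3 %ile | 1.9246 | 97.8 %ile | 0 | 1.5114 | 0.3311 | 0.6021 |
| BiomeSight | 0.0020 | 12.3 %ile | 1.5980 | 97.6 %ile | 0 | 1.1494 | 0.2244 | 0.4718 |
| Ombre/Thryve | 0.0024 | 6.8 %ile | 1.7445 | 96.9 %ile | 0 | 1.5812 | 0.4033 | 0.6009 |
| uBiome | 0 | 0 %ile | 2.1909 | 95.3 %ile | 0 | 2.2996 | 0.5799 | 0.8773 |

Eubacterium 0 – 15%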

The consensus range appears to be 0 – 7%

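| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0060 | 9.9 %ile | 7.1513 | 95.1 %ile | 0 | 6.7982 | 1.4560 | 2.7256 |
| BiomeSight | 0.0040 | 10.5 %ile | 1.4510 | 96 %ile | 0 | 1.5494 | 0.2440 | 0.6660 |
| Ombre/Thryve | 1.1858 | 16.1 %ile | 11.7575 | 97.1 %ile | 0 | 11.0002 | 3.9347 | 3.6048 |
| uBiome | 0 | 0 %ile | 0.2976 | 94.1 %ile | 0 | 7.053 | 0.0844 | 0.3167 |

Lactobacillus 0.01 – 1%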

The numbers are all over the place.

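| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0040 | 11.8 %ile | 1.2353 | 91.1 %ile | 0 | 7.0988 | 0.5206 | 3.3562 |
| BiomeSight | 0.0020 | 7.2 %ile | 0.1719 | 93.3 %ile | 0 | 5.6439 | 0.1551 | 2.8004 |
| Ombre/Thryve | 0.0129 | 14.9 %ile | 5.2409 | 97.2 %ile | 0 | 3.9082 | 0.8646 | 1.5528 |
| uBiome | 0 | 0 %ile | 0.3850 | 88.6 %ile | 0 | 13.1376 | 1.0906 | 6.1463 |

Methanobrevibacter 0.0001 – 0.02%

| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0040 | 17.6 %ile | 2.7228 | 96.7 %ile | 0 | 4.2485 | 0.4806 | 1.9223 |
| BiomeSight | 0 | 0 %ile | 0.4900 | 94.5 %ile | 0 | 1.1064 | 0.1451 | 0.4904 |
| Ombre/Thryve | 0.0021 | 16.5 %ile | 1.6667 | 95.2 %ile | 0 | 1.6743 | 0.3337 | 0.6839 |
| uBiome | 0 | 0 %ile | 100 | 100 %ile | 0 | 4.1132 | 1.1688 | 1.5022 |

Roseburia 5 – 10%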

Consensus range seems to be 0 – 10%

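| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.1415 | 10.8 %ile | 11.3278 | 95.5 %ile | 0 | 10.9642 | 3.3780 | 3.8705 |
| BiomeSight | 0.0650 | 6.7 %ile | 13.5910 | 97.5 %ile | 0 | 10.6002 | 3.1089 | 3.8220 |
| Ombre/Thryve | 0.1083 | 14.5 %ile | 11.1568 | 96.4 %ile | 0 | 9.6754 | 2.6043 | 3.6076 |
| uBiome | 2.3416 | 22.9 %ile | 13.2358 | 96.2 %ile | 0 | 13.1313 | 5.5699 | 3.8578 |

Ruminococcus 0 – 15%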

Note that there were not enough samples to compute KM for uBiome. In this case, the consensus is in close agreement with Jason’s target.

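| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 1.3430 | 18.9 %ile | 14.0820 | 95.3 %ile | 0 | 14.2090 | 5.0708 | 4.6623 |
| BiomeSight | 2.0310 | 13.4 %ile | 13.1570 | 92.4 %ile | 0 | 15.0357 | 5.9824 | 4.6190 |
| Ombre/Thryve | 0.1699 | 11.6 %ile | 14.1344 | 97.2 %ile | 0 | 12.2289 | 3.6302 | 4.3871 |
| uBiome | n/a | n/a | n/a | n/a | 1.5491 | 3.5828 | 2.5659 | 0.5188 |

Proteobacteria 0 – 4%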

Consensus range appears to be 1-15%

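| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 1.1678 | 19.2 %ile | 11.8997 | 93.6 %ile | 0 | 18.5949 | 4.9788 | 6.9470 |
| BiomeSight | 1.3090 | 11.5 %ile | 17.7560 | 97.4 %ile | 0 | 15.6688 | 5.1765 | 5.3532 |
| Ombre/Thryve | 0.3683 | 12.9 %ile | 9.1702 | 95.8 %ile | 0 | 17.2016 | 3.5651 | 6.9573 |
| uBiome | 0.9504 | 11.4 %ile | 15.9831 | 96.2 %ile | 0 | 14.3449 | 5.1377 | 4.6975 |

Bilophila wadsworthia 0 – 0.25%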

For Ombre, KM found no distinguishable difference between akin and not-akin values. The consensus range appears to be 0 – 1%.

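| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0080 | 10.3 %ile | 0.9600 | 93.9 %ile | 0 | 1.3219 | 0.3356 | 0.5032 |
| BiomeSight | 0.0040 | 7.5 %ile | 0.8750 | 93.2 %ile | 0 | 1.2979 | 0.3219 | 0.4979 |
| Ombre/Thryve | 0 | 0 %ile | 100 | 100 %ile | 0 | 0.9597 | 0.2778 | 0.3478 |
| uBiome | 0 | 0 %ile | 1.2214 | 94.6 %ile | 0 | 1.5297 | 0.4254 | 0.5634 |

Escherichia coli 0 – 0.01%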

This is badly measured by 16S tests. Xenogene full sequencing averages almost 4%. The consensus range appears to be 0 – 0.4%.

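| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 0.0020 | 14.8 %ile | 0.3190 | 90.8 %ile | 0 | 0.16760 | 0.1377 | 0.7848 |
| BiomeSight | 0.0020 | 15.1 %ile | 0.4690 | 96 %ile | 0 | 0.5323 | 0.0703 | 0.2357 |
| Ombre/Thryve | 0 | 0 %ile | 100 | 100 %ile | 0 | 0.5519 | 0.0967 | 0.2322 |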
uBiome 353
Faecalibacterium prausnitzii 10 – 15%

The consensus range appears to be 3 – 30%.

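| Lab | KM Low | KM Percentile Low | KM High | KM Percentile High | Lab Low | Lab High | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 2.6650 | 19.8 %ile | 26.1824 | 95.7 %ile | 0 | 26.5505 | 10.4139 | 8.2329 |
| BiomeSight | 4.0780 | 20.2 %ile | 31.2240 | 98 %ile | 0 | 27.9728 | 11.6385 | 8.3338 |
| Ombre/Thryve | 4.5133 | 25 %ile | 33.3225 | 98.2 %ile | 0 | 28.7057 | 11.5256 | 8.7653 |
| uBiome | 0.6753 | 17.5 %ile | 16.7634 | 96.2 %ile | 0 | 16.3190 | 6.0641 | 5.2320 |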

Bottom Line

The purpose of this post is to demonstrate with actual sample numbers the issues raised in The taxonomy nightmare before Christmas….

To which we need to add:

You should ask for full disclosure from any source on how the ranges are calculated. Ignoring or dismissing the differences between different lab results suggests a low bandwidth understanding of the issues involved with the microbiome.

You can see the values when you look up bacteria as shown below