Odds Ratios for Neurological-Audio: hypersensitivity to noise

I just got an email asking which bacteria are involved in hypersensitivity to noise. This post simply presents the tables derived from the methodology described in this technical post: Odds Ratios and the Microbiome

Here are some odds ratios using BiomeSight data. Odds Low is the odds ratio when the reading is below the median, and Odds High when it is above the median (of those with this symptom). We use the symptom median to get balanced (approximately equal-sized) categories. Using the average instead gives poorer results.

A 1.2 in Odds Low means that having less than the typical (median) amount increases your odds, i.e. you want to increase the amount.

At first glance, the probiotic candidates are:

  • Bifidobacterium adolescentis
  • Bifidobacterium longum
  • Lactococcus

I also note that Odds Low is really dominant, i.e. too little of many different bacteria. This hints at Prescript-Assist®/SBO Probiotic, with 22 different unusual probiotics, as a possible candidate, as well as General Biotics/Equilibrium.

| Tax_Name | Tax_Rank | Odds Low | Odds High |
|---|---|---|---|
| Collinsella tanakaei | species | 0.74 | 1.67 |
| Segatella paludivivens | species | 1.56 | 0.73 |
| Viridiplantae | kingdom | 1.51 | 0.65 |
| Peptostreptococcus stomatis | species | 1.47 | 0.69 |
| Bacteroides salyersiae | species | 1.42 | 0.74 |
| Neisseriales | order | 1.42 | 0.63 |
| Bifidobacterium | genus | 1.42 | 0.77 |
| Bifidobacterium adolescentis | species | 1.41 | 0.76 |
| Bifidobacteriales | order | 1.41 | 0.77 |
| Bifidobacteriaceae | family | 1.41 | 0.77 |
| genistoids sensu lato | clade | 1.40 | 0.72 |
| rosids | clade | 1.40 | 0.72 |
| Rothia | genus | 1.40 | 0.72 |
| core genistoids | clade | 1.40 | 0.72 |
| Crotalarieae | tribe | 1.40 | 0.72 |
| Fabaceae | family | 1.40 | 0.72 |
| Papilionoideae | subfamily | 1.40 | 0.72 |
| 50 kb inversion clade | clade | 1.40 | 0.72 |
| Fabales | order | 1.40 | 0.72 |
| fabids | clade | 1.40 | 0.72 |
| Desulfosporosinus | genus | 1.39 | 0.76 |
| Gunneridae | clade | 1.39 | 0.73 |
| Streptophytina | subphylum | 1.39 | 0.73 |
| Tracheophyta | clade | 1.39 | 0.73 |
| Embryophyta | clade | 1.39 | 0.73 |
| eudicotyledons | clade | 1.39 | 0.73 |
| Spermatophyta | clade | 1.39 | 0.73 |
| Magnoliopsida | class | 1.39 | 0.73 |
| Mesangiospermae | clade | 1.39 | 0.73 |
| Euphyllophyta | clade | 1.39 | 0.73 |
| Streptophyta | phylum | 1.39 | 0.73 |
| Pentapetalae | clade | 1.39 | 0.73 |
| Bifidobacterium choerinum | species | 1.38 | 0.77 |
| Neisseria | genus | 1.38 | 0.67 |
| Bifidobacterium adolescentis JCM 15918 | strain | 1.37 | 0.79 |
| Lysobacter | genus | 0.79 | 1.37 |
| Neisseriaceae | family | 1.37 | 0.65 |
| Rothia | genus | 1.36 | 0.75 |
| Actinomycetota | phylum | 1.36 | 0.79 |
| Bifidobacterium gallicum | species | 1.35 | 0.75 |
| Catenibacterium mitsuokai | species | 1.35 | 0.72 |
| Planococcus | genus | 1.34 | 0.57 |
| Bifidobacterium indicum | species | 1.34 | 0.75 |
| Planococcus columbae | species | 1.34 | 0.58 |
| Enterobacter | genus | 1.34 | 0.80 |
| Morganellaceae | family | 1.33 | 0.64 |
| Clostridium chartatabidum | species | 1.33 | 0.78 |
| Sutterella stercoricanis | species | 1.32 | 0.78 |
| Aeromonadales | order | 1.32 | 0.75 |
| Mesoplasma entomophilum | species | 1.31 | 0.78 |
| Rothia mucilaginosa | species | 1.30 | 0.78 |
| Entomoplasmataceae | family | 1.30 | 0.79 |
| Entomoplasmatales | order | 1.30 | 0.79 |
| Eukaryota | superkingdom | 1.30 | 0.71 |
| Mesoplasma | genus | 1.30 | 0.79 |
| Succinivibrio | genus | 1.30 | 0.76 |
| Bifidobacterium longum | species | 1.29 | 0.81 |
| Succinivibrionaceae | family | 1.29 | 0.81 |
| Ruminococcus callidus | species | 1.29 | 0.78 |
| Streptococcus cristatus | species | 1.28 | 0.58 |
| Tepidibacter | genus | 1.28 | 0.82 |
| Catenibacterium | genus | 1.28 | 0.76 |
| Atopobium fossor | species | 1.28 | 0.62 |
| Rivulariaceae | family | 1.27 | 0.82 |
| Dyadobacter | genus | 1.27 | 0.63 |
| Actinomycetes | class | 1.27 | 0.82 |
| Oribacterium | genus | 1.27 | 0.82 |
| Clostridium cadaveris | species | 1.26 | 0.83 |
| Micrococcaceae | family | 1.26 | 0.72 |
| Micromonosporaceae | family | 1.26 | 0.65 |
| Micromonosporales | order | 1.26 | 0.65 |
| Streptococcus sanguinis | species | 1.25 | 0.73 |
| Citrobacter | genus | 1.25 | 0.83 |
| Oribacterium sinus | species | 1.25 | 0.83 |
| Acinetobacter | genus | 1.25 | 0.75 |
| Salisaetaceae | family | 1.25 | 0.59 |
| Salisaeta | genus | 1.25 | 0.59 |
| Salisaeta longa | species | 1.24 | 0.59 |
| Thermosediminibacterales | order | 1.23 | 0.83 |
| Candidatus Tammella caduceiae | species | 1.23 | 0.76 |
| Lachnobacterium | genus | 1.23 | 0.84 |
| Alishewanella | genus | 1.23 | 0.62 |
| Heliorestis | genus | 1.23 | 0.84 |
| Actinocatenispora | genus | 1.22 | 0.65 |
| Azospirillum | genus | 1.22 | 0.80 |
| Candidatus Tammella | genus | 1.22 | 0.77 |
| Bifidobacterium catenulatum PV20-2 | strain | 1.22 | 0.84 |
| Lactococcus | genus | 1.22 | 0.83 |
| Bifidobacterium subtile | species | 1.22 | 0.84 |
| Negativicoccus | genus | 1.22 | 0.82 |
| Succinivibrio dextrinosolvens | species | 1.21 | 0.85 |
| Opisthokonta | clade | 1.21 | 0.71 |
| Eumetazoa | clade | 1.21 | 0.71 |
| Metazoa | kingdom | 1.21 | 0.71 |
| Caloramator indicus | species | 1.21 | 0.85 |
| Desulfurisporaceae | family | 1.21 | 0.78 |
| Desulfurispora | genus | 1.21 | 0.78 |
| Desulfurispirillum alkaliphilum | species | 1.21 | 0.81 |
| Streptococcus milleri | species | 1.21 | 0.79 |
| Coprococcus eutactus | species | 1.21 | 0.84 |
| Desulfurispora thermophila | species | 1.21 | 0.78 |
| Herbaspirillum magnetovibrio | species | 1.20 | 0.59 |
| Phocaeicola massiliensis | species | 1.20 | 0.85 |
| Prevotella dentasini | species | 1.20 | 0.79 |
| Collinsella intestinalis | species | 1.20 | 0.81 |
| Pseudomonas | genus | 1.20 | 0.85 |
| Coraliomargarita akajimensis | species | 1.20 | 0.67 |
| Coraliomargaritaceae | family | 1.20 | 0.67 |
| Coraliomargarita | genus | 1.20 | 0.67 |
| Pseudomonadaceae | family | 1.20 | 0.85 |
| Bilateria | clade | 1.20 | 0.72 |

Odds Ratios and the Microbiome

In working with Microbiome Prescription, I experimented with various prediction approaches before settling on a workaround that, in many cases, could successfully predict the top 10 symptoms for new microbiome samples, with individuals confirming about 80% of them as accurate reflections of their own symptoms. Though this solution was adequate for practical needs, it was admittedly less than ideal in theory. Recently, I recognized that a more robust and principled prediction algorithm is achievable. The aim of this post is to walk through that process, making it accessible for anyone interested in trying this more rigorous approach.

Accurate prediction identifies the key bacteria that should be altered with statistical justification.

An odds ratio (OR) is a measure of association that describes the odds of a disease, symptom, or event occurring in one group compared to another, often used in medical and epidemiological studies to estimate the strength of risk factors or the effectiveness of interventions.

Understanding Odds Ratios

  • The odds ratio is calculated by dividing the odds of the event in the exposed group by the odds in the non-exposed group.
  • OR > 1 indicates higher odds of disease with the exposure or risk factor; OR < 1 indicates reduced odds; OR = 1 means no difference in odds between groups.
  • Odds ratios are especially used in case-control studies, but also in cohort and cross-sectional studies, and they can approximate risk ratios when the disease or symptom is rare.
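The first bullet above can be written out directly. This is a minimal sketch; the function name and the example counts are made up for illustration and do not come from the BiomeSight data.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds of the event in the exposed group divided by the odds in the non-exposed group."""
    odds_exposed = exposed_cases / exposed_controls
    odds_unexposed = unexposed_cases / unexposed_controls
    return odds_exposed / odds_unexposed


# Hypothetical counts: 30 of 100 exposed people have the symptom, 15 of 100 unexposed do.
print(round(odds_ratio(30, 70, 15, 85), 2))
```

A value above 1 here would be read as higher odds with the exposure, as described in the second bullet.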

Using Multiple Odds Ratios in Disease Analysis

When you have several odds ratios related to a disease, there are several key uses:

  • Compare the magnitude of different risk factors: By looking at the odds ratios for various exposures (e.g., smoking, age group, genetic markers), you can identify which exposures are most strongly associated with the disease.
  • Synthesize evidence: Meta-analysis allows combining odds ratios from multiple studies to produce a summary effect estimate, which helps determine overall strength of association and consistency across populations.

Example Table of Interpreting Odds Ratios

| Exposure/Risk Factor | Odds Ratio | Interpretation |
|---|---|---|
| Smoking | 3.5 | Exposure increases odds |
| Physical Activity | 0.7 | Exposure decreases odds |
| High BMI | 1.2 | Exposure slightly increases odds |
| Family History | 4.0 | Strong increased odds |

These odds ratios can guide targeted interventions, identify priority risk factors, and inform clinical decision-making or public health policy.

Each odds ratio’s confidence interval should be considered to determine statistical significance: if it includes 1, the specific association may not be statistically meaningful.
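One common way to get such an interval is the log (Woolf) method, sketched below. This is an assumption about which interval to use, not a description of what Microbiome Prescription computes; the function names and counts are hypothetical, and the method breaks down when any cell is zero.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% confidence interval (log method).

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls. All four must be > 0.
    """
    or_value = (a * d) / (b * c)
    # Standard error of ln(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se)
    upper = math.exp(math.log(or_value) + z * se)
    return or_value, lower, upper


def includes_one(lower, upper):
    """True when the interval contains 1, i.e. the association may not be meaningful."""
    return lower <= 1.0 <= upper


or_value, lower, upper = odds_ratio_ci(30, 70, 15, 85)
print(round(or_value, 2), round(lower, 2), round(upper, 2), includes_one(lower, upper))
```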

Summary

The charts below used a naïve odds ratio computation (ignoring probability) comparing people who declared the symptom with those who did not. Using zero (0) as the threshold, we predicted correctly 74% of the time, both for those with the symptom and for those without it.

Odds ratios quantify the likelihood of disease or symptoms given exposures and allow comparison and synthesis of risk across different factors or populations. When handling multiple odds ratios, use them to identify, adjust for, and summarize the impact of risk factors on disease occurrence.

Applying to the Microbiome

We encounter some challenges here. Consider this constructed example:

  • Bacteria Foo has OR of 1.5 when the microbiome exceeds 5%
  • Bacteria Bar has OR of 2 when the microbiome exceeds 3%
  • Bacteria Foo and Bar are associated.

If a sample has both, the OR is not 1.5 × 2 = 3.0. Instead, we need to know how much they influence each other, i.e. the R2. We can estimate this from the Microbiome Taxa R2 Site. Suppose that R2 is 0.5, a substantial association.

The odds ratio is thus reduced from 3.0 to 2.66.

Odds Ratios and Continuous Values

Odds ratios are commonly used for binary data, such as smoker versus non-smoker or high school graduation status. Continuous data can also be categorized; for example, instead of treating smoking as simply yes/no, you might use metrics like the number of cigarettes smoked per day or packs per week. Similarly, the microbiome data can be categorized, though caution is needed to avoid over-interpreting sparse data. A rough guideline from many studies suggests a minimum of 30 cases and 30 controls are needed to calculate an odds ratio with basic reliability. For data on the lower end, it can be helpful to binarize using the median rather than the mean. This is important because bacterial abundances tend to be highly skewed—using the mean often results in about 70% of samples falling below it and 30% above, whereas the median splits the data evenly with 50% below and 50% above.
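The mean-versus-median point can be illustrated with simulated data. The sketch below assumes a log-normal distribution as a stand-in for skewed bacterial abundances; real microbiome data will differ, and the seed and parameters are arbitrary.

```python
import random
import statistics

random.seed(0)
# Simulated, right-skewed "abundances" (an assumption, not real sample data)
abundances = [random.lognormvariate(0, 1.0) for _ in range(10_000)]

mean_cut = statistics.fmean(abundances)
median_cut = statistics.median(abundances)


def share_below(values, cut):
    """Fraction of values strictly below the cut point."""
    return sum(v < cut for v in values) / len(values)


print("below mean:  ", round(share_below(abundances, mean_cut), 2))
print("below median:", round(share_below(abundances, median_cut), 2))
```

With skewed data the mean split is lopsided while the median split stays balanced, which is why the median is preferred for binarizing when the number of cases is small.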

Example: Brain Fog

Here are some odds ratios using BiomeSight data. Odds Low means when the reading is below the Median and Odds High above the Median (of those with this symptom). We use the symptom median to get balanced (same approximate size) categories.

A few quick takeaways:

  • Probiotics such as Bifidobacterium, Ligilactobacillus, Lactococcus lactis, Lactiplantibacillus
    • Bifidobacterium catenulatum subsp. kashiwanohense (OR 1.37) is the preferred one!
    • Ligilactobacillus: Ligilactobacillus salivarius is the only one available retail
    • Lactiplantibacillus: Lactiplantibacillus plantarum is the only one available retail
    • Veillonella atypica is offered as FITBIOMICS V•Nella Lactic Acid Metabolizing Probiotic …
      • Note: Brain fog is often ascribed to too much Lactic Acid.

| Tax_Name | Tax_Rank | Odds Low | Odds High |
|---|---|---|---|
| Cerasicoccus arenae | species | 1.59 | 0.71 |
| Polyangia | subclass | 1.47 | 0.72 |
| Lelliottia | genus | 1.42 | 0.75 |
| Lelliottia amnigena | species | 1.42 | 0.75 |
| Microcoleaceae | family | 0.82 | 1.41 |
| Myxococcia | class | 1.38 | 0.71 |
| Myxococcales | order | 1.38 | 0.71 |
| Myxococcota | phylum | 1.38 | 0.71 |
| Bifidobacterium catenulatum subsp. kashiwanohense | subspecies | 1.37 | 0.74 |
| Denitratisoma | genus | 0.87 | 1.37 |
| Microcoleus antarcticus | species | 0.81 | 1.36 |
| Microcoleus | genus | 0.81 | 1.36 |
| Desulfosporosinus | genus | 1.34 | 0.80 |
| Trabulsiella | genus | 1.33 | 0.80 |
| Rivulariaceae | family | 1.32 | 0.79 |
| Segatella paludivivens | species | 1.32 | 0.79 |
| Prosthecobacter | genus | 1.32 | 0.73 |
| Ligilactobacillus | genus | 1.31 | 0.77 |
| Enterobacter cloacae complex | species group | 1.30 | 0.80 |
| Peptostreptococcus stomatis | species | 1.30 | 0.80 |
| Alcanivorax | genus | 0.93 | 1.30 |
| Alcanivoracaceae | family | 0.93 | 1.30 |
| Tepidanaerobacter syntrophicus | species | 1.30 | 0.79 |
| Tepidanaerobacter | genus | 1.30 | 0.79 |
| Tepidanaerobacteraceae | family | 1.30 | 0.79 |
| Hoylesella loescheii | species | 1.29 | 0.81 |
| Thermosediminibacterales | order | 1.28 | 0.81 |
| Enterobacter hormaechei | species | 1.28 | 0.82 |
| Slackia isoflavoniconvertens | species | 0.84 | 1.27 |
| Bifidobacterium choerinum | species | 1.27 | 0.82 |
| Desulfovibrio simplex | species | 1.27 | 0.80 |
| Chromatium | genus | 0.90 | 1.27 |
| Lactococcus fujiensis | species | 1.27 | 0.67 |
| Chromatium weissei | species | 0.90 | 1.27 |
| Klebsiella | genus | 1.27 | 0.82 |
| Klebsiella/Raoultella group | no rank | 1.27 | 0.82 |
| Veillonella atypica | species | 1.26 | 0.82 |
| Isoalcanivorax | genus | 0.94 | 1.26 |
| Isoalcanivorax indicus | species | 0.94 | 1.26 |
| Schaalia turicensis | species | 1.25 | 0.72 |
| Lactococcus lactis | species | 1.25 | 0.83 |
| Bifidobacteriaceae | family | 1.24 | 0.84 |
| Bifidobacteriales | order | 1.24 | 0.84 |
| Chloroflexota | phylum | 1.24 | 0.79 |
| Salidesulfovibrio brasiliensis | species | 0.92 | 1.24 |
| Salidesulfovibrio | genus | 0.92 | 1.24 |
| Enterobacter | genus | 1.24 | 0.81 |
| Bifidobacterium | genus | 1.24 | 0.84 |
| Actinomycetota | phylum | 1.24 | 0.84 |
| Acholeplasma hippikon | species | 0.85 | 1.23 |
| Mycoplasmataceae | family | 1.23 | 0.82 |
| Mycoplasmatales | order | 1.23 | 0.82 |
| Bifidobacterium angulatum | species | 1.23 | 0.82 |
| Clostridium nitrophenolicum | species | 0.85 | 1.23 |
| Bacteroides uniformis | species | 0.85 | 1.22 |
| Lactococcus | genus | 1.22 | 0.83 |
| Lactiplantibacillus | genus | 1.22 | 0.84 |
| Mycoplasma | genus | 1.22 | 0.82 |
| Filifactor villosus | species | 0.88 | 1.22 |
| Anaerolineae | class | 1.21 | 0.85 |
| Veillonella denticariosi | species | 0.89 | 1.21 |
| Actinomycetes | class | 1.21 | 0.85 |
| Acidimicrobium | genus | 1.21 | 0.79 |
| Cerasicoccaceae | family | 1.21 | 0.79 |
| Cerasicoccus | genus | 1.21 | 0.79 |
| Mycoplasmoidales | order | 1.21 | 0.81 |
| Parabacteroides gordonii | species | 1.21 | 0.84 |
| Thioalkalivibrio jannaschii | species | 1.21 | 0.63 |
| Candidatus Blochmanniella camponoti | species | 1.21 | 0.79 |
| Thioalkalivibrio | genus | 1.21 | 0.63 |
| Acidimicrobiaceae | family | 1.21 | 0.77 |
| Bifidobacterium adolescentis | species | 1.21 | 0.85 |
| Bifidobacterium longum | species | 1.21 | 0.85 |

That’s it for the moment

Also, see the links below for by-request tables

The next step is seeing how these odds ratios perform against samples and against the old algorithm. Stay tuned.

Special note: This is not based on using averages of healthy populations, but more on the skewness of the distribution of those with the symptom. It is a different way of thinking about the issue.

caveat emptor

The table above applies only to Biomesight data. For an explanation of why, see The taxonomy nightmare before Christmas… If you use a different lab, you will need to get that lab to crunch their numbers in the same manner as detailed above.

Ghost Bacteria in 16s Reports

This morning I was troubleshooting an upload issue on Ombre CSV data; the reason was “they changed the format again!”. While triaging the issues I saw a lot of counts of “1” in the sample that I was working with. A count of 1 means that only one unit of the bacterium was detected. Most microbiologists would deem that to be unreliable: the bacterium may not actually be present, i.e. a “Ghost Bacteria Identification”.

As a result, I looked at the 16s tests that have been uploaded to compute the percentage of ghosts in samples.

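| 16s Test from | Average | Lowest Rate | Highest Rate | Bacteria Reported |
|---|---|---|---|---|
| Biomesight | 22.1% | 0% | 35.3% | 611 |
| Ombre | 28.8% | 0% | 41.1% | 694 |
| Medivere | 20.5% | 19% | 22.3% | 756 |
| BiomeSightRdp | 11% | 1.9% | 20.0% | 476 |
| CerbaLab | 13.9% | 0% | 24% | Over 600 |
| SequentiaBiotech | 1.4% | 0% | 5% | 313 |
| CosmosId | 0.01% | 0% | 0.28% | 463 |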

The numbers above suggest that reporting ghosts results in more bacteria being reported, which is a good marketing strategy. It is a questionable service to consumers.

For myself, for my offline research database, I will be excluding counts of “1”. I may also offer an option to remove them on the upload page in the future. This is not a significant issue with shotgun reports.
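A minimal sketch of what excluding counts of “1” could look like. It assumes a sample is a simple mapping from taxon name to read count, and that the ghost rate is the share of reported taxa whose count is exactly 1; the function names and the example sample are hypothetical, not the upload page's actual code.

```python
def ghost_rate(counts):
    """Share of reported taxa whose read count is exactly 1."""
    if not counts:
        return 0.0
    ghosts = sum(1 for c in counts.values() if c == 1)
    return ghosts / len(counts)


def drop_ghosts(counts, min_count=2):
    """Keep only taxa with at least min_count reads."""
    return {taxon: c for taxon, c in counts.items() if c >= min_count}


# Made-up sample for illustration
sample = {"Bifidobacterium": 1520, "Lactococcus": 1, "Rothia": 37, "Salisaeta": 1}
print(f"ghost rate: {ghost_rate(sample):.0%}")
print(drop_ghosts(sample))
```

Raising `min_count` gives a stricter filter, in line with the higher read thresholds mentioned in the quoted summary below.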

“Buyer beware,” or caveat emptor 

From Perplexity:

In 16S microbiome sequencing, counts of “1” (single read assigned to a taxon in a sample) are generally not considered reliable for determining the true presence of that organism. Here’s why:

  • Low-abundance signals (especially a single read) can easily result from sequencing errors, index hopping, cross-contamination, or misclassification in the bioinformatic pipeline.
  • Studies show that only OTUs (Operational Taxonomic Units) with higher counts (usually >10 reads, and especially >1% relative abundance) are consistently detected with high reliability and quantification accuracy.
  • Single-read taxa are much more likely to be false positives or background noise. They typically do not pass statistical filtering thresholds used in rigorous microbiome analysis.
  • Many pipelines recommend removing OTUs present in very low abundances (often <10 reads or <0.1–1% relative abundance) for reliable interpretation.

Summary:

  • Counts of “1” should be viewed as unreliable noise and not taken as meaningful evidence of that organism’s presence in your microbiome sample.
  • Reliable detection begins at much higher read counts and relative abundances, with reproducibility improving rapidly as counts increase.

Best practices:

  • Filter out taxa with extremely low counts for clinical or quantitative interpretation.
  • Use statistical and bioinformatic guidelines to set raw count and relative abundance thresholds for reporting results.

If you see a taxon with just one assigned read in your 16S data, consider it an artifact rather than true biological detection unless verified by other means.

Graphic Exploration into Significant Bacteria

Lazy versus Old School

I have observed that many data scientists tend to push data into a model and report the results of the model. I am old school and was taught to always chart the data to look for abnormalities. Doing that revealed that microbiome data is highly skewed. I covered this in Microbiologist / Data Scientist Guide to Bacterium Statistics.

I subsequently came across an odds plot where we have an appearance similar to electron shell densities and not the nice linear model that is often assumed.

The result was a clear need to review a lot more data graphically. These are the main patterns:

  • The condition line is clearly to the left of the reference line, i.e. transformed average is less
  • The condition line is clearly to the right of the reference line, i.e. transformed average is more
  • The condition line is on both sides of the reference line, i.e. a complex situation.
  • The lines are on top of each other — no association to the symptom

Lower Transformed Average

Higher Transformed Average

Mixed Case

No Association

A Video Show

I wrote a program to walk through some random bacteria and recorded the charts in the video below. Pause the video when you want to look at a specific chart in greater detail. My main conclusion is that a bacterium is often significant only when it is in a certain range.

400+ more over 20 minutes

Autism Only

Long COVID

Mast Cell Activation Syndrome and Multiple Chemical Sensitivity

A person who suffered from Multiple Chemical Sensitivity (MCS) for many years before it progressed into Mast Cell Activation Syndrome (MCAS) forwarded an article, “Chemical Intolerance and Mast Cell Activation: A Suspicious Synchronicity”, 2023. At the same time, my understanding of the complex nature of the microbiome also made a leap forward. For those interested, see these three very technical posts:

I decided to look at Mast Cell Activation Syndrome again in the hope of gaining insight into treatment possibilities.

The samples being used were donated by readers from various labs, with symptoms being self-declared. Symptoms may not agree with clinical definitions. All of the data is freely available at Citizen Science Distribution for those who are highly skilled with statistics.

First, MCS::MCAS

| | With MCAS | With MCS | With MCAS and MCS | With Any Symptoms |
|---|---|---|---|---|
| Count | 305 | 219 | 62 | 3025 |
| Percentage | 10% | 7.2% | 2% | |

  • If MCAS and MCS were independent, we would expect 10% × 7.2% = 0.72% of samples to have both. We observe 2%, almost 3 times more than expected.
  • The chi-square statistic is 19.3693. The p-value is .000011. VERY SIGNIFICANT CONNECTION.
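The independence arithmetic in the first bullet can be reproduced from the counts in the table. The sketch below also includes a plain Pearson chi-square on a 2×2 table built from those counts. How the original test was set up is not stated, so the statistic this prints is not guaranteed to match the 19.3693 quoted above; all function names are illustrative.

```python
def expected_both(n_a, n_b, n_total):
    """Expected number with both conditions if the two were independent."""
    return n_a * n_b / n_total


def chi_square_2x2(n_a, n_b, n_both, n_total):
    """Pearson chi-square (no continuity correction) for two yes/no traits."""
    observed = [
        [n_both, n_a - n_both],                    # has A: with B, without B
        [n_b - n_both, n_total - n_a - n_b + n_both],  # no A: with B, without B
    ]
    row = [sum(r) for r in observed]
    col = [observed[0][j] + observed[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Counts taken from the table above
n_mcas, n_mcs, n_both, n_total = 305, 219, 62, 3025
expected = expected_both(n_mcas, n_mcs, n_total)
print(f"expected with both: {expected:.1f}, observed: {n_both}")
print(f"observed / expected: {n_both / expected:.2f}")
print(f"chi-square: {chi_square_2x2(n_mcas, n_mcs, n_both, n_total):.2f}")
```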

At face value, this disagrees with the reported “Our outcomes confirm the previously published study where the majority of MCAS patients also have CI.” For this to be true, With MCAS and MCS would have to be > 150. Differences in methodology may be the cause of this disagreement, but regardless, we see that a person with MCAS is around three times more likely to have had MCS. I read this as suggesting that MCS is a precursor for a class of MCAS. Having MCS beforehand is not required for developing MCAS, but having MCS means the odds of getting MCAS are much increased.

Looking at Bacterium

I am going to use samples processed through Biomesight only because it is the largest sample set.

For MCS

The table below is filtered to those with P < 0.001 at the genus level with the highest first (P < 5.19132E-05).

| Name | Direction |
|---|---|
| Actinocatenispora | Low |
| Hathewaya | High |
| Thauera | Low |
| Devosia | Low |
| Thiocapsa | Low |
| Deferribacter | Low |
| Viridibacillus | Low |
| Candidatus Tammella | Low |
| Coraliomargarita | Low |
| Geothrix | Low |
| Desulfosporosinus | Low |
| Glutamicibacter | Low |
| Denitratisoma | Low |
| Catenibacterium | Low |
| Desulforamulus | Low |
| Geobacter | Low |
| Neisseria | Low |
| Nonomuraea | Low |
| Agromyces | Low |
| Anaerotruncus | High |
| Oenococcus | Low |
| Saccharopolyspora | Low |
| Lentibacillus | Low |

MCAS

The table below is filtered to those with P < 0.001 at the genus level with the highest first (P < 6.25726E-07).

| Name | Direction |
|---|---|
| Emticicia | Low |
| Pseudoramibacter | Low |
| Parascardovia | Low |
| Rickettsia | Low |
| Calothrix | Low |
| Nonomuraea | Low |
| Marinospirillum | Low |
| Azospirillum | Low |
| Neisseria | Low |
| Viridibacillus | Low |
| Helicobacter | Low |
| Peptacetobacter | Low |
| Nitrosococcus | Low |
| Avibacterium | Low |
| Schaalia | Low |
| Propionigenium | Low |
| Flammeovirga | Low |
| Oligella | Low |
| Erysipelothrix | High |
| Geobacter | Low |
| Catenibacterium | Low |
| Pontibacter | Low |
| Isoalcanivorax | Low |
| Faecalitalea | Low |
| Jonesia | Low |
| Thalassospira | Low |
| Amedibacillus | High |
| Arthrobacter | Low |
| Hathewaya | High |

MCAS and MCS

The table below is filtered to those with P < 0.001 with the highest first (P < 1.85255E-05). The sample size is much smaller, so fewer items were significant, hence all ranks are shown.

| Name | Rank | Direction |
|---|---|---|
| Chloroflexota | phylum | Low |
| Anaerolineae | class | Low |
| Eggerthella sinensis | species | Low |
| Desulfofundulus | genus | Low |

Probiotic Remedies?

Because there are simply no published studies on most of the above bacteria, I went over to the R2 site to compute candidate probiotics. Note: Some of these probiotics are still in development or available only in some countries.

MCS

I included the full list because you want to make sure NOT to take any with a negative Net. Also, the safest are those with Bad being zero (0).

| Tax_Name | Tax_Rank | Good | Bad | Net |
|---|---|---|---|---|
| Christensenella minuta | species | 194 | 29 | 165 |
| Aspergillus oryzae | species | 138 | 0 | 138 |
| Faecalibacterium prausnitzii | species | 185 | 78 | 107 |
| Anaerobutyricum hallii | species | 162 | 58 | 104 |
| Enterococcus faecium | species | 124 | 37 | 87 |
| Blautia hansenii | species | 122 | 37 | 85 |
| Lactiplantibacillus plantarum | species | 64 | 0 | 64 |
| Roseburia intestinalis | species | 118 | 60 | 58 |
| Bifidobacterium catenulatum | species | 53 | 0 | 53 |
| Priestia megaterium | species | 47 | 0 | 47 |
| Bacillus pumilus | species | 43 | 0 | 43 |
| Bacteroides thetaiotaomicron | species | 37 | 0 | 37 |
| Latilactobacillus sakei | species | 37 | 0 | 37 |
| Bifidobacterium breve | species | 32 | 0 | 32 |
| Levilactobacillus brevis | species | 31 | 0 | 31 |
| Parabacteroides distasonis | species | 31 | 0 | 31 |
| Parabacteroides goldsteinii | species | 54 | 28 | 26 |
| Pediococcus pentosaceus | species | 25 | 0 | 25 |
| Limosilactobacillus reuteri | species | 23 | 0 | 23 |
| Shouchella clausii | species | 23 | 0 | 23 |
| Lactiplantibacillus argentoratensis | species | 23 | 0 | 23 |
| Bifidobacterium longum | species | 20 | 0 | 20 |
| Bifidobacterium adolescentis | species | 39 | 21 | 18 |
| Blautia wexlerae | species | 74 | 57 | 17 |
| Lactococcus cremoris | species | 36 | 21 | 15 |
| Enterococcus faecalis | species | 14 | 0 | 14 |
| Bifidobacterium pseudocatenulatum | species | 13 | 0 | 13 |
| Limosilactobacillus vaginalis | species | 12 | 0 | 12 |
| Lactobacillus kefiranofaciens | species | 12 | 0 | 12 |
| Lactococcus lactis | species | 11 | 0 | 11 |
| Clostridium beijerinckii | species | 11 | 0 | 11 |
| Streptococcus thermophilus | species | 10 | 0 | 10 |
| Leuconostoc mesenteroides | species | 10 | 0 | 10 |
| Segatella copri | species | 37 | 29 | 8 |
| Phocaeicola coprocola | species | 27 | 21 | 6 |
| Bacillus subtilis | species | 26 | 21 | 5 |
| Lactobacillus crispatus | species | 11 | 11 | 0 |
| Lactiplantibacillus pentosus | species | 0 | 11 | -11 |
| Bacteroides uniformis | species | 20 | 32 | -12 |
| Limosilactobacillus mucosae | species | 0 | 14 | -14 |
| Lacticaseibacillus casei | species | 0 | 17 | -17 |
| Bacillus cereus | species | 33 | 54 | -21 |
| Bacillus licheniformis | species | 0 | 22 | -22 |
| Ligilactobacillus salivarius | species | 11 | 41 | -30 |
| Lactobacillus jensenii | species | 0 | 36 | -36 |
| Akkermansia muciniphila | species | 12 | 50 | -38 |
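The selection rule described before the table (avoid a negative Net, prefer Bad equal to zero) can be sketched as below. It assumes Net = Good - Bad, which is consistent with every row of the table; the function name is hypothetical and only four rows are copied in for brevity.

```python
# (name, good, bad) copied from four rows of the MCS table above
rows = [
    ("Christensenella minuta", 194, 29),
    ("Aspergillus oryzae", 138, 0),
    ("Ligilactobacillus salivarius", 11, 41),
    ("Akkermansia muciniphila", 12, 50),
]


def classify(rows):
    """Split candidates into 'safest' (Bad == 0, Net > 0) and 'avoid' (Net < 0)."""
    safest = [name for name, good, bad in rows if bad == 0 and good - bad > 0]
    avoid = [name for name, good, bad in rows if good - bad < 0]
    return safest, avoid


safest, avoid = classify(rows)
print("safest:", safest)
print("avoid: ", avoid)
```

Candidates with a positive Net but a non-zero Bad (such as Christensenella minuta here) fall into neither list and are a judgment call.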

MCAS

| Tax_Name | Tax_Rank | Good | Bad | Net |
|---|---|---|---|---|
| Christensenella minuta | species | 83 | 0 | 83 |
| Aspergillus oryzae | species | 68 | 0 | 68 |
| Enterococcus faecium | species | 58 | 0 | 58 |
| Faecalibacterium prausnitzii | species | 53 | 0 | 53 |
| Roseburia intestinalis | species | 53 | 0 | 53 |
| Anaerobutyricum hallii | species | 51 | 0 | 51 |
| Blautia wexlerae | species | 44 | 0 | 44 |
| Bacillus pumilus | species | 28 | 0 | 28 |
| Priestia megaterium | species | 27 | 0 | 27 |
| Levilactobacillus brevis | species | 25 | 0 | 25 |
| Latilactobacillus sakei | species | 25 | 0 | 25 |
| Lactiplantibacillus argentoratensis | species | 23 | 0 | 23 |
| Blautia hansenii | species | 22 | 0 | 22 |
| Limosilactobacillus fermentum | species | 21 | 0 | 21 |
| Shouchella clausii | species | 20 | 0 | 20 |
| Limosilactobacillus reuteri | species | 18 | 0 | 18 |
| Lactiplantibacillus plantarum | species | 17 | 0 | 17 |
| Bacillus subtilis | species | 16 | 0 | 16 |
| Bifidobacterium animalis | species | 15 | 0 | 15 |
| Bifidobacterium animalis subsp. lactis | subspecies | 15 | 0 | 15 |
| Lactobacillus acidophilus | species | 14 | 0 | 14 |
| Clostridium butyricum | species | 13 | 0 | 13 |
| Bifidobacterium adolescentis | species | 12 | 0 | 12 |
| Ligilactobacillus salivarius | species | 11 | 0 | 11 |
| Hafnia alvei | species | 11 | 0 | 11 |
| Bacteroides uniformis | species | 0 | 15 | -15 |
| Lacticaseibacillus rhamnosus | species | 0 | 16 | -16 |
| Bacteroides fragilis | species | 0 | 23 | -23 |
Bottom Line

The most confident approach is to work with probiotics only, with the following being strongly recommended:

  • Aspergillus oryzae
  • Enterococcus faecium
  • Bacillus pumilus
  • Bacillus subtilis
  • Lactiplantibacillus plantarum a.k.a. Lactobacillus plantarum
  • Bifidobacterium catenulatum
  • Bifidobacterium breve
  • Levilactobacillus brevis a.k.a. Lactobacillus brevis
  • Latilactobacillus sakei a.k.a. Lactobacillus sakei
  • Limosilactobacillus reuteri a.k.a. Lactobacillus reuteri
  • Shouchella clausii a.k.a. Bacillus Clausii
  • Lactobacillus acidophilus
  • Ligilactobacillus salivarius a.k.a. Lactobacillus salivarius

The top one for both is Aspergillus oryzae. This is likely unfamiliar to most people. It is also known as Shirayuri Koji. It is available on Amazon, not as a probiotic but as a cooking additive! It is in Koji Rice. It is also sold as strong wakamoto w

With tariffs, ordering from Japan can get expensive; https://www.yami.com/ ships from the US, so there are no tariff costs!

CAUTION: This is based on modelled data and not verified by clinical studies. IMHO, it is likely a superior set of suggestions than other more “conventional” approaches.

Using Mean and Standard Deviation for Bacteria is Inappropriate.

In an earlier post (Significant Bacteria and Their Thresholds – Part 1), I raised that issue, and an EU colleague, Valentina Goretzki, suggested that I take data from one thousand shotgun samples from healthy individuals to illustrate the problem.

Microbiome data distributions frequently display extreme skewness, often greater than 20. In such cases, computing mean and standard deviation is simply incorrect. My friend “Perplexity” writes that mean and standard deviation become inappropriate measures for computing significance if the distribution’s skewness is substantial, specifically when the absolute skewness exceeds 2.

The result was that about two thousand bacteria that occur at least 60 times in these samples could be plotted, as shown below.

It is clear that non-parametric methods are needed to compute “healthy ranges”. For those with just basic statistics, this may become a significant challenge.
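For readers who want to check skewness on their own data, here is a minimal sketch using the moment-based (population) skewness formula. The threshold of 2 comes from the rule of thumb quoted above; the function names and the example lists are made up for illustration.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # all values identical, no skew to speak of
    return m3 / m2ae ** 1.5 if False else m3 / m2 ** 1.5


def mean_sd_is_suspect(values, limit=2.0):
    """True when |skewness| exceeds the limit, so mean and SD are poor summaries."""
    return abs(sample_skewness(values)) > limit


symmetric = [1, 2, 3, 4, 5]
# Hypothetical taxon: absent in 99 samples, abundant in one
skewed = [0] * 99 + [100]

print(round(sample_skewness(symmetric), 2), mean_sd_is_suspect(symmetric))
print(round(sample_skewness(skewed), 2), mean_sd_is_suspect(skewed))
```

When the check fires, percentiles or other rank-based summaries are the safer way to describe a range.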

Significant Bacteria and Their Thresholds – Part 2

Conventional medical science tends to think in terms of one bacterium for one condition. These are known as Single-Bacterium Diseases.

Single-Bacterium Diseases

  • Tuberculosis — caused by Mycobacterium tuberculosis
  • Diphtheria — caused by Corynebacterium diphtheriae
  • Cholera — caused by Vibrio cholerae
  • Leprosy (Hansen’s disease) — caused by Mycobacterium leprae
  • Whooping cough (Pertussis) — caused by Bordetella pertussis
  • Tetanus — caused by Clostridium tetani
  • Typhoid fever — caused by Salmonella typhi
  • Syphilis — caused by Treponema pallidum
  • Gonorrhea — caused by Neisseria gonorrhoeae
  • Lyme disease — caused by Borrelia burgdorferi
  • Gastric ulcer — caused by Helicobacter pylori
  • Strep throat — caused by Streptococcus pyogenes
  • Urinary tract infection — most commonly caused by Escherichia coli
  • Pneumonia — can be caused by Streptococcus pneumoniae
  • Meningitis — can be caused by Neisseria meningitidis (meningococcus), or Streptococcus pneumoniae
  • Bacterial vaginosis — often caused by Gardnerella vaginalis

There are other conditions that could be caused by any one of several bacteria, but not by bacteria cooperating with each other.

When we enter the world of microbiome dysbiosis, this simplicity disappears.

Case Study of number of bacterium associated with many symptoms

We return to our collection of 4,290 unique samples, with 327 symptoms having statistically significant bacteria, discussed in Significant Bacteria and Their Thresholds – Part 1. Restricting our data to associations with P < 0.01, the graph below shows the number of bacteria associated with each symptom.

My view is that symptoms arise from the metabolites produced by a specific combination of bacteria. Examining the data from KEGG: Kyoto Encyclopedia of Genes and Genomes, we see that some metabolites can be produced by hundreds of different bacteria. Some bacteria associated with a symptom may actually reflect secondary effects (shifts caused by other species), so distinguishing causal bacteria from merely correlated ones remains difficult.

A practical working hypothesis for reducing or eliminating a symptom is therefore to normalize the bacteria associated with it. A rational approach is to start with those that have the strongest association.


The Naïve Approach

A well-educated medical professional typically follows this reasoning:

  1. Identify which bacteria are outside the normal range and linked to the patient’s symptoms.
  2. Determine whether each bacterium is elevated or reduced.
  3. Review substances known to influence these bacteria.
  4. Recommend lifestyle or dietary changes based on those substances.

In practice, certain substances may be contraindicated for other bacteria that are also out of range. This is often overlooked, as many professionals adopt a “find the first substance that addresses the bacterium shift” approach. This sometimes makes the patient worse.

Microbiome Prescription uses a manually curated database containing over 7.4 million relationships between substances and specific bacteria. Because of this depth, these potential conflicts are often identified and the risk of adverse effects is reduced.

A professional in this situation would reasonably expect to see a chart such as the one below for each of the bacteria associated with a symptom, with the chart, table or other item giving a desired range of values.

With dozens of bacteria out of range, there is no clear objective ability to rank these bacteria by importance. There are a variety of speculative punts that could be tried:

  • Rank them by the volume of bacteria
  • Rank them by the deviation from the reference range, i.e. (value - mean) / standard deviation
  • Rank them by any of many possible algorithms that could be tossed at this issue.

Turning the issue upside down

Let us take the concrete example promised in the earlier post: Long COVID. We have 538 samples with Long COVID in our population of 4,290 contributed samples. This is about 12% of the samples.

Filtering the associations to those bacteria with P < 0.0001 (our highest priority or weight), we obtain the table below. While there are many bacteria, some are closely related by lineage:

Mycoplasmatota > Mollicutes > Anaeroplasmatales > Anaeroplasmataceae > Anaeroplasma

| Taxon | tax_name | tax_rank |
|---|---|---|
| 1737405 | Tissierellales | order |
| 626933 | Odoribacter laneus | species |
| 186332 | Anaeroplasmatales | order |
| 186333 | Anaeroplasmataceae | family |
| 85630 | 16SrX (Apple proliferation group) | species group |
| 102260 | 16SrXV (Hibiscus witches’-broom group) | species group |
| 47565 | Candidatus Phytoplasma prunorum | species |
| 2086 | Anaeroplasma | genus |

Filtering the associations to those bacteria with P < 0.001, we get a second table, shown below. One of the bacteria is Lactobacillus jensenii, which is available as a probiotic (I have some in my fridge), but we do not yet know whether we want to increase or decrease this bacterium.

| Taxon | tax_name | tax_rank |
|---|---|---|
| 2330 | Halanaerobium | genus |
| 1381 | Atopobium minutum | species |
| 972 | Halanaerobiaceae | family |
| 724 | Haemophilus | genus |
| 194 | Campylobacter | genus |
| 28128 | Prevotella corporis | species |
| 72294 | Campylobacteraceae | family |
| 33037 | Anaerococcus vaginalis | species |
| 38288 | Corynebacterium genitalium | species |
| 42857 | Moorella group | no rank |
| 45254 | Dysgonomonas capnocytophagoides | species |
| 45404 | Beijerinckiaceae | family |
| 47420 | Hydrogenophaga | genus |
| 102261 | Candidatus Phytoplasma brasiliense | species |
| 109790 | Lactobacillus jensenii | species |
| 89061 | Weissella thailandensis | species |
| 215579 | Schlegelella | genus |
| 382673 | Syntrophomonas cellicola | species |
| 386414 | Hoylesella timonensis | species |
| 1963360 | Parachlamydiales | order |
| 1853231 | Odoribacteraceae | family |

We could continue onwards and look at the 40 bacterium associated with P < 0..01 and 48 bacterium with P < .05. While potentially important, because of the lesser degree of association, we will ignore them here. My preference is always to favor highest probability and thus would only look at those in the above two tables.

Question: Is Lactobacillus jensenii too high or too low?

Some medical practitioners would hear the word “Lactobacillus” and immediately say “Take it” because they have a (questionable) belief that Lactobacillus will help everything! Is this the case here?

The Data for Lactobacillus jensenii

The table below shows the data. The percentages have been transformed to percentiles for better presentation, with a count of the occurrences at each. One of the first things some people will note is that this bacterium is not reported often, but there is enough data to get P < 0.001 using a chi-squared (χ²) test. I disagree with the approach, seen in some papers, of examining only very commonly reported bacteria.

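| %-ile Range | Has Total | HasNot Total |
|---|---|---|
| Not Present | 344 | 3852 |
| 0.00 | 1 | 3 |
| 4.21 | 0 | 15 |
| 20.00 | 2 | 22 |
| 45.26 | 2 | 13 |
| 61.05 | 1 | 3 |
| 65.26 | 0 | 3 |
| 68.42 | 0 | 3 |
| 71.58 | 0 | 5 |
| 76.84 | 0 | 2 |
| 78.95 | 1 | 1 |
| 81.05 | 0 | 1 |
| 82.11 | 1 | 0 |
| 83.16 | 1 | 0 |
| 84.21 | 0 | 1 |
| 85.26 | 0 | 2 |
| 87.37 | 0 | 1 |
| 88.42 | 0 | 1 |
| 89.47 | 1 | 0 |
| 90.53 | 0 | 1 |
| 91.58 | 1 | 0 |
| 92.63 | 1 | 0 |
| 93.68 | 0 | 1 |
| 94.74 | 0 | 1 |
| 95.79 | 0 | 1 |
| 96.84 | 0 | 1 |
| 97.89 | 0 | 1 |
| 98.95 | 0 | 1 |
| 100.00 | 0 | 1 |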

Doing a little more aggregation, we get the table below. If a person has no L. jensenii, they have an 8% chance of having Long COVID. If any is present, the chance increases to 13%, and a higher amount (over the 20th percentile) pushes it up to 17%, roughly double.

Conclusion: L. jensenii is definitely a probiotic to avoid for Long COVID

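| %ile Range | Has Total | HasNot Total | Ratio |
|---|---|---|---|
| Not Present | 344 | 3852 | 0.08 |
| 0.00 | 1 | 3 | 0.25 |
| 4.21 | 0 | 15 | 0 |
| 20.00 | 2 | 22 | 0.08 |
| Over 20 | 9 | 44 | 0.17 |
| Over All | 12 | 84 | 0.13 |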
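The Ratio column can be reproduced with a few lines of Python. This is a minimal sketch that assumes the ratio is Has / (Has + HasNot) within each bin, which matches the figures in the table; the variable and function names are illustrative only.

```python
# Aggregated counts from the table: (bin label, Has Long COVID, HasNot)
rows = [
    ("Not Present", 344, 3852),
    ("0.00", 1, 3),
    ("4.21", 0, 15),
    ("20.00", 2, 22),
    ("Over 20", 9, 44),
]


def share_with_symptom(has: int, has_not: int) -> float:
    """Fraction of the samples in a bin that report the symptom."""
    total = has + has_not
    return has / total if total else 0.0


ratios = {label: share_with_symptom(h, n) for label, h, n in rows}

# "Over All" pools every bin in which the bacterium was detected.
detected = [(h, n) for label, h, n in rows if label != "Not Present"]
overall = share_with_symptom(
    sum(h for h, _ in detected), sum(n for _, n in detected)
)

for label, value in ratios.items():
    print(f"{label:>12}: {value:.3f}")
print(f"{'Over All':>12}: {overall:.3f}")
```

Pooling the detected bins gives 12 with the symptom and 84 without, which is where the 0.13 in the last row comes from.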

Danger Will Robinson: Do not over simplify

Looking at another bacterium with P < 0.0001, we see the pattern charted below. This bacterium is commonly reported.

What is evident is that the association is range sensitive, and thus so are the reference ranges:

  • Below 17%ile is out of range
  • Between 28%ile and 34%ile is out of range
  • Over 60%ile is out of range
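A range-sensitive reference like this is easy to express as a set of intervals rather than a single cutoff. The sketch below uses the three ranges from the list above; how the exact boundary values (17, 28, 34, 60) are treated is an assumption, as are the names.

```python
# Out-of-range percentile intervals taken from the list above.
# None means "unbounded" on that side. Treating the boundary values
# themselves as in range is an assumption made for this illustration.
OUT_OF_RANGE = [(None, 17.0), (28.0, 34.0), (60.0, None)]


def is_out_of_range(percentile: float) -> bool:
    """Return True when the percentile falls inside any flagged interval."""
    for low, high in OUT_OF_RANGE:
        above_low = low is None or percentile > low
        below_high = high is None or percentile < high
        if above_low and below_high:
            return True
    return False


for reading in (10, 20, 30, 50, 75):
    status = "out of range" if is_out_of_range(reading) else "in range"
    print(f"{reading}%ile: {status}")
```

A single "too high / too low" threshold cannot represent the middle interval, which is the point of the example.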

Many microbiologists would say that this does not make sense. At this point I should remind people of quorum sensing with bacteria.

Quorum sensing is a communication process that allows bacteria to sense and respond to their population density using chemical signaling molecules called autoinducers. Each bacterium produces and releases autoinducers into its environment. As the population grows, the concentration of these molecules increases. When a threshold level is reached, autoinducers bind to specific receptors, triggering changes in gene expression that coordinate group behaviors such as biofilm formation, virulence factor secretion, sporulation, and bioluminescence.

At this point, many minds may be going into ‘statistical culture shock’. For me, it makes complete sense, and the pattern is often seen across nature. Such regions are sometimes termed “islands of stability” in some sciences. In our case, “islands of symptoms” would be a more accurate name.

We may end up committing blasphemy against conventional linear, mechanistic thinking!

Examples:

Alloys With Maximum Strength at Specific Composition

  • Iron-Carbon Steel: In carbon steel, maximum strength is usually achieved at a carbon content near 0.8% (known as “eutectoid steel”), where the steel forms a very fine pearlite structure on cooling. Both lower and higher carbon percentages reduce ductility or create brittleness, decreasing usable strength for many applications.
  • Aluminum-Copper Alloy (Al-Cu): The precipitation strengthening of aluminum alloys, such as in the Al-Cu system, reaches maximum effectiveness at about 4–5% copper by weight. Below or above this optimal range, strengthening diminishes because either not enough or too much second phase is created.
  • Nickel-Iron Alloys: For instance, “Permalloy” (a nickel-iron alloy) typically reaches its desired magnetic properties with about 80% nickel and 20% iron. Changes in this ratio result in reduced magnetic strength, which can be considered a parallel with mechanical strength in many alloys.

Wildlife Systems with Optimal Mixtures

  • Animal Diversity in Food Webs: Studies show that increasing the number of animal species generally increases total animal biomass and plant consumption rates, up to a point. Beyond this, higher diversity leads to increased intraguild predation (animals eating each other), which can reduce overall community efficiency and stability.
  • Keystone Species: Some ecosystems depend on a particular balance among keystone species and others. Removing or adding too many can severely weaken the system, in analogy to alloys with optimal composition.
  • Biodiversity-Function Relationship: The stability and strength of ecosystems typically follow a nonlinear relationship with species richness: there is a “sweet spot” where ecosystem functions (like carbon sequestration or primary production) are maximized.

Nuclear Islands of Stability

  • The best-known island is around atomic numbers (Z) 114 to 126 and neutron number (N) 184, where theoretical calculations suggest nuclei could have half-lives of minutes, days, or potentially even years, instead of the microseconds typical for super heavy elements nearby.

Other Physical Sciences

  • The term may be used metaphorically to describe stable orbital configurations or regions where system dynamics are less chaotic.

Returning to Long COVID

With Tissierellales we have a complex behavior. With Lactobacillus Jensenii we have a simple “if present, reduce it” finding. I returned to the lists above and attempted to identify those with a simple finding with a P < 0.05 for the odds ratios. This is shown in the table below.

  • Reduce Beijerinckiaceae (family), with the odds being more than 2x for those with Long COVID, i.e. 0.059 vs 0.026
  • Increase Parachlamydiales (order), with the odds being about 1.5x for those not having Long COVID, i.e. 0.1 vs 0.067

Others have odds that are close to each other, for example Tissierellales (0.978 vs 0.973). In the next post, we will examine more of the ones with complex behavior.

| Correction | Tax_name | Tax_rank | Symptom Odds | No Symptom Odds |
|---|---|---|---|---|
| | Anaeroplasma | genus | 0.146 | 0.135 |
| | Candidatus Phytoplasma prunorum | species | 0.722 | 0.738 |
| | 16SrX (Apple proliferation group) | species group | 0.722 | 0.735 |
| | Anaeroplasmatales | order | 0.146 | 0.135 |
| | Anaeroplasmataceae | family | 0.146 | 0.135 |
| | Odoribacter laneus | species | 0.494 | 0.506 |
| | Tissierellales | order | 0.978 | 0.973 |
| | 16SrXV (Hibiscus witches’-broom group) | species group | 0.02 | 0.013 |
| | Candidatus Phytoplasma brasiliense | species | 0.02 | 0.014 |
| | Lactobacillus jensenii | species | 0.034 | 0.021 |
| | Odoribacteraceae | family | 0.961 | 0.941 |
| Increase | Parachlamydiales | order | 0.067 | 0.1 |
| | Schlegelella | genus | 0.062 | 0.042 |
| | Syntrophomonas cellicola | species | 0.039 | 0.019 |
| Reduce | Hoylesella timonensis | species | 0.348 | 0.295 |
| | Weissella thailandensis | species | 0.028 | 0.016 |
| | Campylobacteraceae | family | 0.419 | 0.386 |
| | Campylobacter | genus | 0.388 | 0.354 |
| | Haemophilus | genus | 0.775 | 0.745 |
| | Halanaerobiaceae | family | 0.222 | 0.184 |
| | Atopobium minutum | species | 0.042 | 0.027 |
| | Halanaerobium | genus | 0.222 | 0.184 |
| | Prevotella corporis | species | 0.497 | 0.457 |
| | Anaerococcus vaginalis | species | 0.213 | 0.215 |
| | Corynebacterium genitalium | species | 0.031 | 0.019 |
| Reduce | Moorella group | norank | 0.534 | 0.447 |
| | Dysgonomonas capnocytophagoides | species | 0.042 | 0.053 |
| Reduce | Beijerinckiaceae | family | 0.059 | 0.026 |
| Reduce | Hydrogenophaga | genus | 0.034 | 0.014 |
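To see how far apart the two columns are for a given row, one can simply divide them. The sketch below does that for four rows of the table; it only computes the ratio and does not reproduce the significance test used to assign the Correction column.

```python
# (Symptom Odds, No Symptom Odds) for a few rows of the table above
odds = {
    "Beijerinckiaceae": (0.059, 0.026),
    "Hydrogenophaga": (0.034, 0.014),
    "Parachlamydiales": (0.067, 0.1),
    "Tissierellales": (0.978, 0.973),
}


def relative_odds(symptom: float, no_symptom: float) -> float:
    """Symptom odds divided by no-symptom odds; above 1 favors the symptom group."""
    return symptom / no_symptom


relative = {name: relative_odds(s, n) for name, (s, n) in odds.items()}

for name, value in relative.items():
    print(f"{name:<18} {value:.2f}")
```

A value well above 1 lines up with a "Reduce" suggestion, a value well below 1 with "Increase", and a value near 1 (Tissierellales) signals that presence alone carries little information.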

Summary

This post explains my approach for ranking which bacteria should be targeted for change, primarily using the P value to guide priority. I compared two types of bacteria: one that is rare and one that is common. Rare bacteria are usually omitted from standard analyses because their scarcity makes conventional measures—such as calculating the mean and standard deviation—poor indicators of significance. Overlooking these bacteria is a methodological error, especially when the data skew exceeds 2, making such traditional metrics inappropriate.

It’s important to keep in mind that bacteria engage in quorum sensing, which influences the metabolites they produce and likely affects the symptoms observed. For some bacteria, the difference in their relative abundance between people with and without a particular symptom can be substantial and may be all that is needed for significance.

When I see a lab present patient data with images like the one below, I roll my eyes! “It looks like a bell curve, it smells like a bell curve, ….”

Some reports give even less information, with no ranges and a “normal”, which is computed how? Some choices are below.

  • Arithmetic Mean:
    The most common type, calculated by summing all the values and dividing by the number of values.
    $$\text{Arithmetic Mean} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
  • Geometric Mean:
    The nth root of the product of all values, commonly used for ratios or percent changes.
    $$\text{Geometric Mean} = \sqrt[n]{\prod_{i=1}^{n} x_i}$$
  • Harmonic Mean:
    The reciprocal of the arithmetic mean of the reciprocals of the data, useful for rates or ratios (e.g., average speed).
    $$\text{Harmonic Mean} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
  • Quadratic Mean (Root Mean Square, RMS):
    The square root of the average of the squares of the numbers, often used in engineering and physics.
    $$\text{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$$
  • Weighted Mean:
    The mean where each value has its own (possibly different) weight, calculated as:
    $$\text{Weighted Mean} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
  • Truncated (or Trimmed) Mean:
    The mean calculated after removing a specified percentage of the largest and smallest values to reduce the effect of outliers.
  • Median (sometimes referred to as a kind of mean):
    The middle value when the data are sorted. While technically not a “mean,” it is often referenced in the context of central tendency.
  • Mode:
    The most frequently occurring value in the set. Also not a “mean,” but sometimes grouped with measures of central tendency.
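To make the point concrete, here is a small sketch that computes each of these measures on the same skewed sample using Python's standard library. The values and weights are invented for illustration; they are not taken from any lab report.

```python
import math
import statistics

# Hypothetical skewed readings: mostly small values plus one large outlier.
values = [0.1, 0.2, 0.2, 0.5, 1.0, 2.0, 40.0]
weights = [1, 1, 1, 1, 1, 1, 0.5]  # illustrative weights, outlier down-weighted

arithmetic = statistics.mean(values)
geometric = statistics.geometric_mean(values)
harmonic = statistics.harmonic_mean(values)
rms = math.sqrt(sum(x * x for x in values) / len(values))
weighted = sum(w * x for w, x in zip(weights, values)) / sum(weights)


def trimmed_mean(data, fraction):
    """Drop `fraction` of the points from each end, then average the rest."""
    ordered = sorted(data)
    k = int(len(ordered) * fraction)
    return statistics.mean(ordered[k:len(ordered) - k])


trimmed = trimmed_mean(values, 0.15)  # removes one value from each end here
median = statistics.median(values)
mode = statistics.mode(values)

for name, value in [
    ("arithmetic", arithmetic), ("geometric", geometric),
    ("harmonic", harmonic), ("rms", rms), ("weighted", weighted),
    ("trimmed", trimmed), ("median", median), ("mode", mode),
]:
    print(f"{name:<10} {value:.3f}")
```

On data like this the candidates for "normal" differ by more than an order of magnitude, so a report that does not say which one it used tells the reader very little.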

Significant Bacteria and Their Thresholds – Part 1

Anyone who regularly reads peer-reviewed medical studies on the microbiome will notice findings reported as bacteria being “too high” or “too low,” with phrases like “trending” when statistical significance isn’t reached. Frankly, my reaction to 95% of these papers is an eye-roll, as the statistical methods used are often inappropriate for the data at hand. With multiple degrees in statistics, professional memberships, and experience, I’m acutely aware of both best practices and common pitfalls.

Microbiome data distributions frequently display extreme skewness, often greater than 20. In such cases, computing the mean and standard deviation is simply incorrect. My friend “Perplexity” writes: “Mean and standard deviation become inappropriate measures for computing significance if the distribution’s skewness is substantial, specifically when the absolute skewness exceeds 2.”
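Skewness is straightforward to check before reaching for a mean and standard deviation. The sketch below computes the moment-based sample skewness (third central moment divided by the 1.5 power of the second) on a hypothetical set of relative abundances in which most samples have almost none of a bacterium and a few have a lot.

```python
def skewness(data):
    """Moment-based skewness: m3 / m2**1.5 using population moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical relative abundances (%): illustrative values only.
abundances = [0.01] * 90 + [0.5] * 8 + [5.0, 20.0]

skew = skewness(abundances)
print(f"skewness = {skew:.1f}")
print("mean/SD appropriate" if abs(skew) <= 2 else "mean/SD not appropriate")
```

Even this small invented sample lands far above the threshold of 2 quoted above.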

Despite this, using these metrics remains standard in high school statistics and unfortunately persists in many life science studies. This “comfort zone” approach does nothing but cloud true findings in microbiome science.
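The practical consequence shows up when a reading is converted to a percentile. The sketch below compares the actual (empirical) percentile of a reading with the percentile implied by a normal curve fitted with the sample mean and standard deviation. The data are hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import statistics

# Hypothetical relative abundances (%): illustrative values only.
abundances = [0.01] * 90 + [0.5] * 8 + [5.0, 20.0]

mean = statistics.mean(abundances)
sd = statistics.pstdev(abundances)


def empirical_percentile(data, x):
    """Percent of observations less than or equal to x."""
    return 100.0 * sum(1 for v in data if v <= x) / len(data)


def normal_percentile(x):
    """Percentile implied by a normal curve with the sample mean and SD."""
    return 100.0 * statistics.NormalDist(mean, sd).cdf(x)


reading = 0.5
lower_bound = mean - 2 * sd  # the usual "mean minus 2 SD" reference limit

print(f"empirical percentile of {reading}: {empirical_percentile(abundances, reading):.0f}")
print(f"normal-curve percentile of {reading}: {normal_percentile(reading):.0f}")
print(f"mean - 2 SD = {lower_bound:.2f}")
```

In this sample the reading sits at the 98th percentile of the actual data, while the normal approximation places it near the middle, and the lower reference limit comes out negative, which is impossible for a percentage.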

My alternative methodology uses a much larger, highly annotated dataset: over 4,290 unique samples generously donated to Microbiome Prescription, most transferred from Biomesight.com. Importantly, these samples are uniformly processed and richly annotated with symptoms rather than diagnoses, yielding superior analytical clarity.

My Natural Questions to ask

Natural for a statistician that is.

  • For people with a symptom or diagnosis
    • What are the significant bacteria associated with (and likely causing) the symptom?
    • What are the threshold levels for these bacteria?
      • I use levels and not level because I have observed that the same symptom may occur with a bacterium outside of a specific range, that is, either too high or too low. I have also encountered this reported in a few studies, often hidden under a term like “altered microbiome”.
      • There is a dangerous assumption in the literature that significant bacteria must be either too high or too low. I unfortunately know Kierkegaard’s “Either/Or” well.
      • There is no universal threshold for all symptoms; each has its own.
  • For people with dysbiosis but without a symptom that has a statistical model
    • How do you determine which bacteria are significant?
    • What are the threshold levels for these bacteria?

These have been important questions for me over the last decade because they lead directly to treatment suggestions.

They are also significant in evaluating progress. At present I have a forecasting algorithm that has a high prediction rate for symptoms from a microbiome. The forecasting algorithm is also useful for evaluating progress; an example follows from a recent sample that the donor asked me to review.

Prediction

The checks indicate that the donor agrees that they have this symptom.

Monitoring

The person above followed the suggestions and the subsequent test results are shown below.

What are the most common bacteria associated with symptoms?

This is a generic question whose answer is useful for health practitioners to know, for example Kristina Mitts of Mind Mood Microbiome, with whom I frequently correspond, or Dr. Jason Hawrelak.

Using more appropriate statistical methods on our 4,292 distinct samples, we identified significant bacteria over 327 symptoms, with the following counts at each level of statistical significance.

| Significance: P < | Count |
|---|---|
| 0.05 | 13,855 |
| 0.01 | 12,411 |
| 0.001 | 7,614 |
| 0.0001 | 5,532 |
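Counts like these, and the per-taxon "Instances" used when ranking bacteria, can be tallied from a list of (symptom, taxon, P value) associations. The sketch below is one hypothetical way to do it: the records are invented, and placing each association in the most stringent level it meets (so the levels do not overlap) is an assumption about how the counts were built.

```python
from collections import Counter

THRESHOLDS = [0.0001, 0.001, 0.01, 0.05]  # most stringent first


def significance_bin(p):
    """Return the most stringent threshold that p falls under, or None."""
    for threshold in THRESHOLDS:
        if p < threshold:
            return threshold
    return None


# Hypothetical (symptom, taxon, p-value) records, for illustration only.
associations = [
    ("Symptom A", "Bacteroides uniformis", 0.00002),
    ("Symptom B", "Bacteroides uniformis", 0.004),
    ("Symptom A", "Bilophila wadsworthia", 0.0005),
    ("Symptom C", "Bilophila wadsworthia", 0.03),
    ("Symptom C", "Roseburia", 0.2),  # not significant, ignored
]

per_level = Counter()
per_taxon = Counter()
for symptom, taxon, p in associations:
    level = significance_bin(p)
    if level is None:
        continue
    per_level[level] += 1
    per_taxon[taxon] += 1

print(dict(per_level))
print(per_taxon.most_common())
```

Sorting `per_taxon` by count gives an overall ranking, and keeping one counter per level gives the level-specific rankings.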

So what are the top ones for each of these significance levels?

Overall Significance

| Taxon | name | rank | Instances |
|---|---|---|---|
| 820 | Bacteroides uniformis | species | 165 |
| 35833 | Bilophila wadsworthia | species | 142 |
| 35832 | Bilophila | genus | 139 |
| 818 | Bacteroides thetaiotaomicron | species | 137 |
| 1426 | Parageobacillus thermoglucosidasius | species | 133 |
| 118884 | Gammaproteobacteria incertae sedis | no rank | 125 |
| 871324 | Bacteroides stercorirosoris | species | 124 |
| 120580 | Symbiobacterium toebii | species | 122 |
| 53244 | Desulfonatronovibrio | genus | 122 |
| 543349 | Symbiobacteriaceae | family | 122 |
| 2733 | Symbiobacterium | genus | 122 |
| 1498 | Hathewaya histolytica | species | 122 |
| 454155 | Paraprevotella xylaniphila | species | 120 |

P < 0.05

| Taxon | name | rank | Instances |
|---|---|---|---|
| 2950010 | Salidesulfovibrio | genus | 47 |
| 221711 | Salidesulfovibrio brasiliensis | species | 46 |
| 658623 | Chelonobacter | genus | 45 |
| 69224 | Erwinia psidii | species | 44 |
| 213462 | Syntrophobacterales | order | 44 |
| 3024408 | Syntrophobacteria | class | 44 |
| 31977 | Veillonellaceae | family | 44 |
| 1843489 | Veillonellales | order | 43 |
| 550 | Enterobacter cloacae | species | 43 |
| 35832 | Bilophila | genus | 42 |
| 841 | Roseburia | genus | 41 |
| 53244 | Desulfonatronovibrio | genus | 41 |
| 871324 | Bacteroides stercorirosoris | species | 41 |
| 1260 | Finegoldia magna | species | 41 |
| 1498 | Hathewaya histolytica | species | 40 |

P < 0.01

| Taxon | name | rank | Instances |
|---|---|---|---|
| 35833 | Bilophila wadsworthia | species | 51 |
| 78448 | Bifidobacterium pullorum | species | 50 |
| 820 | Bacteroides uniformis | species | 47 |
| 841 | Roseburia | genus | 46 |
| 818 | Bacteroides thetaiotaomicron | species | 44 |
| 118884 | Gammaproteobacteria incertae sedis | no rank | 41 |
| 1769729 | Hathewaya | genus | 41 |
| 1426 | Parageobacillus thermoglucosidasius | species | 41 |
| 112902 | Propionispora | genus | 40 |
| 36853 | Desulfitobacterium | genus | 40 |
| 386414 | Hoylesella timonensis | species | 40 |
| 119065 | unclassified Burkholderiales | family | 40 |
| 1853231 | Odoribacteraceae | family | 40 |
| 400091 | Hymenobacter xinjiangensis | species | 39 |
| 209080 | Propionispora hippei | species | 39 |
| 871324 | Bacteroides stercorirosoris | species | 39 |
| 69224 | Erwinia psidii | species | 39 |
| 35832 | Bilophila | genus | 39 |

P < 0.001

| Taxon | name | rank | Instances |
|---|---|---|---|
| 820 | Bacteroides uniformis | species | 50 |
| 35833 | Bilophila wadsworthia | species | 37 |
| 35832 | Bilophila | genus | 32 |
| 118884 | Gammaproteobacteria incertae sedis | no rank | 32 |
| 658623 | Chelonobacter | genus | 31 |
| 246787 | Bacteroides cellulosilyticus | species | 31 |
| 120580 | Symbiobacterium toebii | species | 31 |
| 543349 | Symbiobacteriaceae | family | 31 |
| 253238 | Ethanoligenens | genus | 31 |
| 2733 | Symbiobacterium | genus | 31 |
| 292833 | Candidatus Rhabdochlamydia | genus | 30 |
| 324707 | Candidatus Rhabdochlamydia crassificans | species | 30 |
| 1426 | Parageobacillus thermoglucosidasius | species | 30 |
| 689704 | Candidatus Rhabdochlamydiaceae | family | 30 |
| 70190 | Chroococcus | genus | 29 |
| 402401 | Chroococcus minutus | species | 29 |
| 1890464 | Chroococcaceae | family | 29 |
| 283169 | Odoribacter denticanis | species | 28 |

P < 0.0001

| Taxon | name | rank | Instances |
|---|---|---|---|
| 820 | Bacteroides uniformis | species | 40 |
| 246787 | Bacteroides cellulosilyticus | species | 34 |
| 818 | Bacteroides thetaiotaomicron | species | 30 |
| 1963360 | Parachlamydiales | order | 30 |
| 454155 | Paraprevotella xylaniphila | species | 30 |
| 1426 | Parageobacillus thermoglucosidasius | species | 30 |
| 2733 | Symbiobacterium | genus | 29 |
| 543349 | Symbiobacteriaceae | family | 29 |
| 120580 | Symbiobacterium toebii | species | 29 |
| 35832 | Bilophila | genus | 26 |
| 191412 | Chlorobiaceae | family | 25 |
| 256319 | Chlorobaculum | genus | 25 |
| 244127 | Anaerotruncus | genus | 25 |
| 189723 | Prevotella micans | species | 25 |
| 53244 | Desulfonatronovibrio | genus | 25 |
| 324707 | Candidatus Rhabdochlamydia crassificans | species | 24 |
| 191410 | Chlorobiia | class | 24 |
| 35833 | Bilophila wadsworthia | species | 24 |

Summary

This is a high-level overview of significant bacteria. The patterns above are specific to tests done by Biomesight; because of a lack of standardization, using these identifications for other tests is unsafe (in a legal sense). Background here. IMHO, it is a moral responsibility of labs to produce similar tables.

The key findings are:

  • “Common suspects” such as bifidobacterium and lactobacillus are missing!
  • Large sample sizes with the same processing are critical. The processing must be the same as that used in a clinical setting.
  • Appropriate statistical methods must be used

Stay tuned for the next part as we drill deeper into appropriate handling of data with some specific issues like Long COVID.

The input data that I used is publicly available at: https://citizenscience.microbiomeprescription.com/

Post Script

Probiotics and the above get interesting. Take Bacteroides uniformis, which is at the top of many of these tables. If we go to my bacteria association site, we can determine the probiotics (available or pending) that will increase this bacterium (none decrease it).

Again, the “cure all” Lactobacillus and Bifidobacterium genera are absent (apart from Ligilactobacillus ruminis, which is not currently available).

Probiotics Fundamentals: Part 4 Probiotic Selection?

In the first post of this series, Probiotics Fundamentals: Part 1 Specific Strains, I cited strains that are available retail and that have been researched. The logical starting point is to search for your needs, read the studies, and then rank the probiotics in order of preference for a personal trial. You want to take one probiotic at a time, with rotation, as described in the prior post.

To search for strain-specific studies of probiotics available at retail, click here.

No Study found or issue not listed

The next step is to look at the conditions that I have abstracted/extracted studies for, listed at “U.S. Nat. Lib. Medical Conditions Studies with Microbiome Shifts“. We are shifting from strain to species level. This gives several paths, let us examine Autism. There are

Based on Published Studies of Species

Clicking on [Can You Help Improve Suggestions] will take you to a page. At the bottom you will see “Treatment Substances” which lists things that have helped in studies. Scan it for probiotic names, for example: 

A synbiotic formulation of Lactobacillus reuteri and inulin alleviates ASD-like behaviors in a mouse model: the mediating role of the gut-brain axis. Food & function (Food Funct ) Vol: 15 Issue: 1 Pages: 387-400 Pub: 2024 Jan 2 ePub: 2024 Jan 2 Authors Wang C,Chen W,Jiang Y,Xiao X,Zou Q,Liang J,Zhao Y,Wang Q,Yuan T,Guo R,Liu X,Liu Z

This suggests that L. reuteri with inulin may help. The sources are linked. Make sure you read them.

Based on Deficiencies of Probiotic Bacteria

Clicking on 🦠 Taxons will take you to a page showing all of the bacteria shifts reported for the condition.

Look for Lactobacillus, Bifidobacterium, etc. with ⬇️

These species are found at lower levels, suggesting their metabolites are also reduced. Supplementing with them as single-strain probiotics is logical. Stay at the species level (e.g., Bifidobacterium longum) rather than higher classifications such as the genus Bifidobacterium. In general, avoid probiotic mixtures, as they may include strains that are counter-indicated (e.g., Bifidobacterium catenulatum, Bifidobacterium breve) or strains for which we lack sufficient information.

Based on Modelled correction of Bacteria

Clicking on 🥣 Candidates, will send the huge bacteria list above through a fuzzy logic expert system to compute suggestions with weights given for each one.

Note that this also lists ones to avoid.

Disagreements!

We can see that levels of Lactobacillus plantarum are low, but the model is telling us to avoid Lactobacillus plantarum. So what’s going on here?

The issue comes from the fact that the model and the studies are based on multiple subgroups of people with Autism (or other conditions). The data might be accurate within each subgroup, but when you merge them together, you can end up with contradictions. So it’s not really a problem with the approach; it’s a problem with the data mix.

The best rule of thumb is to start with the things that show up as agreements across the data. For example: Bifidobacterium longum and Limosilactobacillus reuteri. Once you’ve tried probiotics that have clear agreements, then you can carefully experiment with the ones where there’s disagreement and see how your body responds.
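The rule of thumb amounts to a pair of set operations. The sketch below is purely illustrative: the three sets are hypothetical stand-ins for what the Taxons page (species reported low) and the Candidates page (suggested and to-avoid lists) might show, populated only with the names mentioned in this post.

```python
# Hypothetical inputs, using only species named in this post.
low_in_studies = {
    "Lactobacillus plantarum",
    "Bifidobacterium longum",
    "Limosilactobacillus reuteri",
}
model_take = {"Bifidobacterium longum", "Limosilactobacillus reuteri"}
model_avoid = {"Lactobacillus plantarum"}

# Agreements: reported low in studies AND suggested by the model, not flagged to avoid.
agreements = (low_in_studies & model_take) - model_avoid

# Disagreements: reported low in studies BUT the model says avoid.
disagreements = low_in_studies & model_avoid

print("Try first:", sorted(agreements))
print("Experiment carefully later:", sorted(disagreements))
```

The first list is where to start a personal trial; the second is the set to approach cautiously afterwards.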

The next level up in Probiotic Suggestions

It is pretty simple: get a microbiome test. My preferred tests are:

  • Biomesight for 16s (economical, low resolution)
  • Thorne for shotgun (more expensive but much higher detail)

You want to ideally get a test that reports on all of the common probiotic bacteria. Many common tests do not report many of these. For example: Diagnostic Solution GI-Map reports only

On the other hand both of the above tests report species.

When you select a test, you should check Microbiome Prescription to see what the detection rate is. For example for Bifidobacterium longum, we see how often this is detected in samples.

  • For the shotgun tests (Xenogene and Thorne), it is detected 96% and 100% of the time. If it shows low, you can have confidence in taking some.
  • For SequentiaBiotech, it is detected 25% of the time. If none is reported, we are left uncertain whether you actually have none or whether the absence is due to the test’s methodology.

Another example is L. reuteri, which the shotgun tests find in over 50% of samples, while some 16s tests find it in only 2% of samples.
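A rough way to quantify that uncertainty is to ask: if a test reports none, how likely is it that the bacterium is actually there? The sketch below is a back-of-the-envelope calculation under two loudly stated assumptions: the shotgun detection rate is treated as the true prevalence, and the lower-sensitivity test never reports a bacterium that is absent. The function name is hypothetical.

```python
def prob_present_given_not_detected(true_prevalence, detection_rate):
    """P(bacterium present | test reports none), assuming no false positives.

    true_prevalence: share of people who really carry the bacterium
                     (assumed here to equal the shotgun detection rate).
    detection_rate:  share of samples in which the test in question finds it.
    """
    if detection_rate > true_prevalence:
        raise ValueError("detection rate cannot exceed true prevalence here")
    missed = true_prevalence - detection_rate  # present but not detected
    absent = 1.0 - true_prevalence             # truly absent
    return missed / (missed + absent)


# Bifidobacterium longum: 96% on shotgun vs 25% on the other test.
p = prob_present_given_not_detected(0.96, 0.25)
print(f"Chance it is present despite a 'none' result: {p:.0%}")
```

Under these assumptions, a "none" result from the 25% test still leaves roughly a 95% chance that the bacterium is present, so the result says more about the test than about the person.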

Bottom Line

We’re piecing things together from lots of scattered knowledge, and there’s no single standard method—either for testing microbiomes in labs or for the studies themselves. Nothing here is clear-cut; everything’s kind of fuzzy, sometimes super fuzzy. In this post, the focus was on picking probiotics for a condition using literature (an “a priori” approach). Basically, it means trusting the data at face value, even though we know it isn’t rock-solid.

I also foreshadowed the next post: Using a detailed microbiome test to select probiotics based on the whole microbiome.

Opinion: GTDB should be blacklisted for Clinical Use

I believe it is in the consumer interest to share this email thread and to promote discussion of this issue.

Blacklisting is the action of suggesting something to be avoided or distrusted.

Request

[Customer name withheld] has forwarded me the PDF and some associated CSV files. She wishes to see what a fuzzy logic expert system, which uses over 7.4 million facts based on data from the US National Library of Medicine, will suggest.

I know that the following data is very much available and possible to provide. Other firms like Biomesight.com, Thorne, Vitract and Precision Biome have no trouble providing it:

For all taxonomical layers (From Clade to strains [when available]) just 3 numbers are sufficient.

  • NCBI Taxon Number, 
  • Percentage Amount, 
  • Percentile Ranking across a reference set of healthy individuals

Additional data is welcome, but not required:

  1. Names
  2. Your reference ranges 
  3. etc

Percentiles should be actual percentiles and NOT percentiles estimated using mean and standard deviation. Most bacteria have a SKEW exceeding 20. Using the mean and standard deviation requires a SKEW near zero.

Your customer would greatly appreciate a speedy return of an appropriate file. With that file in hand, we will add your lab to our list of over 50 labs that our free site supports. (See https://microbiomeprescription.com/Upload/Index ).

If you are unable to provide such, please tell us so we may black list your site as not supportable to spare other consumers a waste of money.

Response

Hi Ken,
Thank you for your detailed email and for sharing your perspective on the data formats and metrics you require.

We’d like to clarify that Microba uses the Genome Taxonomy Database (GTDB) for microbial classification. GTDB and NCBI classify genomes differently, our species consist of multiple genomes which may have different NCBI classifications, species level classifications cannot always be mapped to each other through name matching alone. Due to this, providing a microbiome profile in NCBI taxonomy is not practical nor would it be a correct representation of the actual microbiome profile.

Once GTDB is formally supported within your workflow, please reach out and we can discuss options for providing the data your service requires.

We appreciate your understanding and encourage you to support the more accurate, resolved, and phylogenetically consistent GTDB taxonomy.

Kind regards,
The Microba Team

Reply To Response

Many thanks for your reply. Our purpose is to provide clinical suggestions for review by medical professionals to people suffering from a wide variety of conditions using hallucination-free AI.

A simple example may be seen here. DepressionExample.

Unfortunately, the Genome Taxonomy Database (GTDB) appears to be a research tool and, IMHO, seems very inappropriate and misleading to sell to consumers. GTDB was first proposed in academic papers in August 2018. We were active in the microbiome before that, and the de facto standard in the industry was then, and still is today, the NCBI. The leading consumer microbiome testing company back then was uBiome, which provided NCBI identifiers. In the 7 years since the first release, we have seen what appears to be some 226 revisions, given the release of 10-RS226, dated April 16, 2025.

Checking the US National Library of Medicine, there appear to be fewer than 200 studies done using GTDB that are of likely clinical use. With NCBI, we found over 20,000 suitable studies. Regardless, no study done prior to 2020, likely 2022, can, in a legal sense, be safely used for clinical purposes.

I am aware of many tools to convert GTDB to NCBI, a few are:

  • TaxonKit: Command-line toolkit that supports creating NCBI-style taxdump files from GTDB and also reformatting and mapping taxonomies.
  • gtdb_to_taxdump: Python tool to convert GTDB taxonomy files into NCBI taxdump format, usable by downstream tools like Kraken2.
  • GTDB-Tk: Assigns genomes to GTDB taxonomy but includes metadata fields mapping to NCBI taxonomy, enabling conversion between formats.
  • NCBI-GTDB Map: Direct tool for mapping GTDB taxonomy to NCBI taxonomy, supporting both directions and handling rank prefixes.
  • gtdb-taxdump: A specialized toolkit for generating stable, trackable NCBI taxdump files from GTDB releases with reproducible taxon IDs.
  • NCBI-taxonomist: Python command-line utility that retrieves, handles, and allows mapping of taxonomic information, supporting cross-database operations.

I am aware that folks embracing the hottest new technology can have an attitude, especially when most recent studies decline to use it. Intransigence on this issue is not beneficial to your customers, people with challenging health issues, unless you are willing to provide a GTDB-based suggestions engine working off only GTDB studies: a hallucination-free AI that is equivalent to or better than what NCBI can provide. Until such time, I would advocate that you stop making misleading sales to consumers.

Given your response, I feel that I have no option but to blacklist you for clinical use.