The intent of this site is to assist people with health issues that are, or could be, microbiome connected. There are MANY conditions whose severity is known to be a function of microbiome dysfunction, including Autism, Alzheimer’s, Anxiety and Depression. See this list of studies from the US National Library of Medicine. Individual symptoms like brain fog, anxiety and depression have strong statistical associations with the microbiome. A few of them are listed here.
The base rule of the site is to avoid speculation and keep to facts from published studies and from statistical analysis (with the source data available for those who wish to replicate the results). Internet hearsay is avoided like the plague it is.
Conventional medical science tends to think of one bacterium for one condition. These are known as Single-Bacterium Diseases.
Single-Bacterium Diseases
Tuberculosis — caused by Mycobacterium tuberculosis
Diphtheria — caused by Corynebacterium diphtheriae
Cholera — caused by Vibrio cholerae
Leprosy (Hansen’s disease) — caused by Mycobacterium leprae
Whooping cough (Pertussis) — caused by Bordetella pertussis
Tetanus — caused by Clostridium tetani
Typhoid fever — caused by Salmonella typhi
Syphilis — caused by Treponema pallidum
Gonorrhea — caused by Neisseria gonorrhoeae
Lyme disease — caused by Borrelia burgdorferi
Gastric ulcer — caused by Helicobacter pylori
Strep throat — caused by Streptococcus pyogenes
Urinary tract infection — most commonly caused by Escherichia coli
Pneumonia — can be caused by Streptococcus pneumoniae
Meningitis — can be caused by Neisseria meningitidis (meningococcus), or Streptococcus pneumoniae
Bacterial vaginosis — often caused by Gardnerella vaginalis
There are other conditions that could be caused by any one of several bacteria, but not by bacteria cooperating with each other.
When we enter the world of microbiome dysbiosis, this simplicity disappears.
Case Study of the number of bacteria associated with many symptoms
We return to our collection of 4,290 unique samples with 327 symptoms having statistically significant bacteria, discussed in Significant Bacteria and Their Thresholds – Part 1. Restricting our data to associations with P < 0.01, the graph below shows the number of bacteria associated with each symptom.
My view is that symptoms arise from the metabolites produced by a specific combination of bacteria. Examining the data from KEGG: Kyoto Encyclopedia of Genes and Genomes, we see that some metabolites can be produced by hundreds of different bacteria. Some bacteria associated with a symptom may actually reflect secondary effects (shifts caused by other species), so distinguishing causal bacteria from merely correlated ones remains difficult.
A practical working hypothesis for reducing or eliminating a symptom is therefore to normalize the bacteria associated with it. A rational approach is to start with those that have the strongest association.
The Naïve Approach
A well-educated medical professional typically follows this reasoning:
Identify which bacteria are outside the normal range and linked to the patient’s symptoms.
Determine whether each bacterium is elevated or reduced.
Review substances known to influence these bacteria.
Recommend lifestyle or dietary changes based on those substances.
In practice, certain substances may be counter-indicated for other bacteria that are also out of range. This is often overlooked, as many professionals adopt a “find the first substances that address the bacterium shift” approach. This sometimes makes the patient worse.
Microbiome Prescription uses a manually curated database containing over 7.4 million relationships between substances and specific bacteria. Because of this depth, these potential conflicts are often identified and the risk of adverse effects is reduced.
A professional in this situation would reasonably expect to see a chart such as the one below for each of the bacteria associated with a symptom: a chart, table or other item giving a desired range of values.
With dozens of bacteria out of range, there is no clear, objective way to rank these bacteria by importance. There are a variety of speculative punts that could be tried:
Rank them by the volume of bacteria
Rank them by the deviation from the reference range, i.e. (value - mean) / standard deviation
Rank them by any of many possible algorithms that could be tossed at this issue.
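To make the second punt concrete, here is a minimal sketch of ranking by (value - mean) / standard deviation. The taxon names, reference values and patient readings are hypothetical placeholders, not data from the site. Note that this uses mean and standard deviation, which assumes roughly symmetric data.

```python
from statistics import mean, stdev

def z_score(value, reference):
    """Deviation of one reading from a reference set, in standard deviations."""
    return (value - mean(reference)) / stdev(reference)

# Hypothetical reference percentages for two taxa (illustrative only).
reference = {
    "Taxon A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Taxon B": [10.0, 12.0, 14.0, 16.0, 18.0],
}
# Hypothetical readings for one person.
patient = {"Taxon A": 7.0, "Taxon B": 15.0}

# Largest absolute deviation first.
ranked = sorted(
    patient,
    key=lambda taxon: abs(z_score(patient[taxon], reference[taxon])),
    reverse=True,
)
print(ranked)
```

The ranking is only as good as the assumption behind the z-score, which is why it is listed here as a punt rather than a recommendation.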
Turning the issue upside down
Let us take the concrete example promised in the earlier post: Long COVID. We have 538 samples with Long COVID in our population of 4,290 contributed samples. This is about 12% of the samples.
Filtering the associations to those bacteria with P < 0.0001 (our highest priority or weight), we obtain the table below. While there are many bacteria, some are tightly related according to lineage:
Filtering the associations to those bacteria with P < 0.001, we get a second table, shown below. One of the bacteria is Lactobacillus jensenii, which is available as a probiotic (I have some in my fridge), but we do not know if we want to increase or decrease this bacterium.
| Taxon | tax_name | tax_rank |
|---|---|---|
| 2330 | Halanaerobium | genus |
| 1381 | Atopobium minutum | species |
| 972 | Halanaerobiaceae | family |
| 724 | Haemophilus | genus |
| 194 | Campylobacter | genus |
| 28128 | Prevotella corporis | species |
| 72294 | Campylobacteraceae | family |
| 33037 | Anaerococcus vaginalis | species |
| 38288 | Corynebacterium genitalium | species |
| 42857 | Moorella group | norank |
| 45254 | Dysgonomonas capnocytophagoides | species |
| 45404 | Beijerinckiaceae | family |
| 47420 | Hydrogenophaga | genus |
| 102261 | Candidatus Phytoplasma brasiliense | species |
| 109790 | Lactobacillus jensenii | species |
| 89061 | Weissella thailandensis | species |
| 215579 | Schlegelella | genus |
| 382673 | Syntrophomonas cellicola | species |
| 386414 | Hoylesella timonensis | species |
| 1963360 | Parachlamydiales | order |
| 1853231 | Odoribacteraceae | family |
We could continue onwards and look at the 40 bacteria associated with P < 0.01 and the 48 bacteria with P < 0.05. While potentially important, because of the lesser degree of association, we will ignore them here. My preference is always to favor the strongest associations, and thus I would only look at those in the above two tables.
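The filtering by P value described above can be sketched as follows. The thresholds come from the text; the numeric weights, the function names and the example P values are hypothetical and only illustrate the idea of tiers.

```python
# Thresholds are the ones used in the text; the weights are an assumption.
TIERS = [(0.0001, 4), (0.001, 3), (0.01, 2), (0.05, 1)]

def priority_weight(p_value):
    """Return the weight of the strictest tier that the P value satisfies."""
    for threshold, weight in TIERS:
        if p_value < threshold:
            return weight
    return 0  # not significant at any tier

def shortlist(associations, cutoff=0.001):
    """Keep (taxon, p) pairs with P below the cutoff, strongest first."""
    kept = [(name, p) for name, p in associations if p < cutoff]
    return sorted(kept, key=lambda item: item[1])

# Hypothetical associations for one symptom (names and P values are made up).
example = [
    ("taxon_a", 0.00002),
    ("taxon_b", 0.03),
    ("taxon_c", 0.0004),
    ("taxon_d", 0.008),
]
print(shortlist(example))
```

With the default cutoff only the two strongest tiers survive, which mirrors the choice to ignore the P < 0.01 and P < 0.05 groups here.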
Question: Is Lactobacillus jensenii too high or too low?
Some medical practitioners would hear the word “Lactobacillus” and immediately say “Take it” because they have a (questionable) belief that Lactobacillus will help everything! Is this the case here?
The Data for Lactobacillus jensenii
The table below shows the data. The percentages have been transformed to percentiles for better presentation, with a count of the occurrences at each. One of the first things some people will note is that this bacterium is not reported often, but there is enough data to get P < 0.001 using a chi-squared (χ²) test. I disagree with the approach, seen in some papers, of examining only very commonly reported bacteria.
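| %-ile Range | HasTotal | HasNotTotal |
|---|---|---|
| Not Present | 344 | 3852 |
| 0.00 | 1 | 3 |
| 4.21 | 0 | 15 |
| 20.00 | 2 | 22 |
| 45.26 | 2 | 13 |
| 61.05 | 1 | 3 |
| 65.26 | 0 | 3 |
| 68.42 | 0 | 3 |
| 71.58 | 0 | 5 |
| 76.84 | 0 | 2 |
| 78.95 | 1 | 1 |
| 81.05 | 0 | 1 |
| 82.11 | 1 | 0 |
| 83.16 | 1 | 0 |
| 84.21 | 0 | 1 |
| 85.26 | 0 | 2 |
| 87.37 | 0 | 1 |
| 88.42 | 0 | 1 |
| 89.47 | 1 | 0 |
| 90.53 | 0 | 1 |
| 91.58 | 1 | 0 |
| 92.63 | 1 | 0 |
| 93.68 | 0 | 1 |
| 94.74 | 0 | 1 |
| 95.79 | 0 | 1 |
| 96.84 | 0 | 1 |
| 97.89 | 0 | 1 |
| 98.95 | 0 | 1 |
| 100.00 | 0 | 1 |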
Doing a little more aggregation, we get the table below. If a person has no L. jensenii, they have an 8% chance of having Long COVID. If any is present, the chance increases to 13%, and a higher amount (over the 20th percentile) pushes it up to 17%, roughly double.
Conclusion: L. jensenii is a probiotic to definitely avoid for Long COVID
| %ile Range | HasTotal | HasNotTotal | Ratio |
|---|---|---|---|
| Not Present | 344 | 3852 | 0.08 |
| 0.00 | 1 | 3 | 0.25 |
| 4.21 | 0 | 15 | 0 |
| 20.00 | 2 | 22 | 0.08 |
| Over 20 | 9 | 44 | 0.17 |
| Over All | 12 | 84 | 0.13 |
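The aggregation can be reproduced from the percentile counts given earlier. This sketch assumes the Ratio column is HasTotal / (HasTotal + HasNotTotal), which matches every row of the aggregated table; the variable and function names are mine.

```python
# (percentile, HasTotal, HasNotTotal) rows copied from the percentile table.
rows = [
    (0.00, 1, 3), (4.21, 0, 15), (20.00, 2, 22), (45.26, 2, 13),
    (61.05, 1, 3), (65.26, 0, 3), (68.42, 0, 3), (71.58, 0, 5),
    (76.84, 0, 2), (78.95, 1, 1), (81.05, 0, 1), (82.11, 1, 0),
    (83.16, 1, 0), (84.21, 0, 1), (85.26, 0, 2), (87.37, 0, 1),
    (88.42, 0, 1), (89.47, 1, 0), (90.53, 0, 1), (91.58, 1, 0),
    (92.63, 1, 0), (93.68, 0, 1), (94.74, 0, 1), (95.79, 0, 1),
    (96.84, 0, 1), (97.89, 0, 1), (98.95, 0, 1), (100.00, 0, 1),
]
not_present = (344, 3852)

def totals(selected):
    """Sum the HasTotal and HasNotTotal counts over the selected rows."""
    return (sum(h for _, h, _ in selected), sum(n for _, _, n in selected))

def share(has, has_not):
    """Fraction of samples in a group that report the symptom."""
    return has / (has + has_not)

over_20 = totals([r for r in rows if r[0] > 20.0])
overall = totals(rows)

print("Not present:", share(*not_present))
print("Over 20:", over_20, share(*over_20))
print("Over all:", overall, share(*overall))
```

The three shares come out near 0.08, 0.17 and 0.125, in line with the Ratio column.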
Danger Will Robinson: Do not oversimplify
Looking at another bacterium with P < 0.0001, we see the pattern charted below. This bacterium is commonly reported.
What is evident is that the association is range sensitive, and thus so are the reference ranges:
Below 17%ile is out of range
Between 28%ile and 34%ile is out of range
Over 60%ile is out of range
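A reference range of this kind is a set of bands rather than a single low/high pair. A minimal sketch follows, assuming the three bands listed above are percentiles and treating the 28 and 34 boundaries as inclusive (an assumption, since the chart is not reproduced here).

```python
def is_out_of_range(percentile):
    """True when the percentile falls in any of the three out-of-range bands."""
    below_low_band = percentile < 17.0
    in_middle_band = 28.0 <= percentile <= 34.0  # inclusive bounds are assumed
    above_high_band = percentile > 60.0
    return below_low_band or in_middle_band or above_high_band

for value in (10, 20, 30, 50, 75):
    print(value, is_out_of_range(value))
```

A single "too high" or "too low" flag cannot express this: 20 and 50 are in range while 30, which sits between them, is not.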
Many microbiologists would say that this does not make sense. At this point I should remind people of quorum sensing with bacteria.
Quorum sensing is a communication process that allows bacteria to sense and respond to their population density using chemical signaling molecules called autoinducers. Each bacterium produces and releases autoinducers into its environment. As the population grows, the concentration of these molecules increases. When a threshold level is reached, autoinducers bind to specific receptors, triggering changes in gene expression that coordinate group behaviors such as biofilm formation, virulence factor secretion, sporulation, and bioluminescence.
At this point, many minds may be going into ‘statistical culture shock’. For me, it makes complete sense and is often seen across nature. Such regions are sometimes termed “islands of stability” in some sciences. In our case “islands of symptoms” would be a more accurate name.
We may end up committing blasphemy against conventional linear, mechanistic thinking!
Examples:
Alloys With Maximum Strength at Specific Composition
Iron-Carbon Steel: In carbon steel, maximum strength is usually achieved at a carbon content near 0.8% (known as “eutectoid steel”), where the steel forms a very fine pearlite structure on cooling. Both lower and higher carbon percentages reduce ductility or create brittleness, decreasing usable strength for many applications.
Aluminum-Copper Alloy (Al-Cu): The precipitation strengthening of aluminum alloys, such as in the Al-Cu system, reaches maximum effectiveness at about 4–5% copper by weight. Below or above this optimal range, strengthening diminishes because either not enough or too much second phase is created.
Nickel-Iron Alloys: For instance, “Permalloy” (a nickel-iron alloy) typically reaches its desired magnetic properties with about 80% nickel and 20% iron. Changes in this ratio result in reduced magnetic strength, which can be considered a parallel with mechanical strength in many alloys.
Wildlife Systems with Optimal Mixtures
Animal Diversity in Food Webs: Studies show that increasing the number of animal species generally increases total animal biomass and plant consumption rates, up to a point. Beyond this, higher diversity leads to increased intraguild predation (animals eating each other), which can reduce overall community efficiency and stability.
Keystone Species: Some ecosystems depend on a particular balance among keystone species and others. Removing or adding too many can severely weaken the system, in analogy to alloys with optimal composition.
Biodiversity-Function Relationship: The stability and strength of ecosystems typically follow a nonlinear relationship with species richness: there is a “sweet spot” where ecosystem functions (like carbon sequestration or primary production) are maximized.
Nuclear Islands of Stability
The best-known island is around atomic numbers (Z) 114 to 126 and neutron number (N) 184, where theoretical calculations suggest nuclei could have half-lives of minutes, days, or potentially even years, instead of the microseconds typical for super heavy elements nearby.
Other Physical Sciences
The term may be used metaphorically to describe stable orbital configurations or regions where system dynamics are less chaotic.
Returning to Long COVID
With Tissierellales we have complex behavior. With Lactobacillus jensenii we have a simple “if present, reduce it” finding. I returned to the lists above and attempted to identify those with a simple finding, with P < 0.05 for the odds ratios. This is shown in the table below.
Reduce Beijerinckiaceae (family), with the odds being more than 2x for those with Long COVID, i.e. 0.059 vs 0.026
Increase Parachlamydiales (order), with the odds being about 1.5x for those not having Long COVID, i.e. 0.067 vs 0.1
Others have odds that are close to each other, for example Tissierellales (0.978 vs 0.973). In the next post, we will examine more of those with complex behavior.
| Correction | Tax_name | Tax_rank | Symptom Odds | No Symptom Odds |
|---|---|---|---|---|
| | Anaeroplasma | genus | 0.146 | 0.135 |
| | Candidatus Phytoplasma prunorum | species | 0.722 | 0.738 |
| | 16SrX (Apple proliferation group) | species group | 0.722 | 0.735 |
| | Anaeroplasmatales | order | 0.146 | 0.135 |
| | Anaeroplasmataceae | family | 0.146 | 0.135 |
| | Odoribacter laneus | species | 0.494 | 0.506 |
| | Tissierellales | order | 0.978 | 0.973 |
| | 16SrXV (Hibiscus witches’-broom group) | species group | 0.02 | 0.013 |
| | Candidatus Phytoplasma brasiliense | species | 0.02 | 0.014 |
| | Lactobacillus jensenii | species | 0.034 | 0.021 |
| | Odoribacteraceae | family | 0.961 | 0.941 |
| Increase | Parachlamydiales | order | 0.067 | 0.1 |
| | Schlegelella | genus | 0.062 | 0.042 |
| | Syntrophomonas cellicola | species | 0.039 | 0.019 |
| Reduce | Hoylesella timonensis | species | 0.348 | 0.295 |
| | Weissella thailandensis | species | 0.028 | 0.016 |
| | Campylobacteraceae | family | 0.419 | 0.386 |
| | Campylobacter | genus | 0.388 | 0.354 |
| | Haemophilus | genus | 0.775 | 0.745 |
| | Halanaerobiaceae | family | 0.222 | 0.184 |
| | Atopobium minutum | species | 0.042 | 0.027 |
| | Halanaerobium | genus | 0.222 | 0.184 |
| | Prevotella corporis | species | 0.497 | 0.457 |
| | Anaerococcus vaginalis | species | 0.213 | 0.215 |
| | Corynebacterium genitalium | species | 0.031 | 0.019 |
| Reduce | Moorella group | norank | 0.534 | 0.447 |
| | Dysgonomonas capnocytophagoides | species | 0.042 | 0.053 |
| Reduce | Beijerinckiaceae | family | 0.059 | 0.026 |
| Reduce | Hydrogenophaga | genus | 0.034 | 0.014 |
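To compare the two odds columns, one simple option is the ratio of symptom odds to no-symptom odds, ranked by its distance from 1 on a log scale. This is only a sketch: it does not reproduce the significance test behind the Correction column, and the 0.25 tolerance in `direction` is a hypothetical cut-off chosen for illustration.

```python
import math

# (taxon, symptom_odds, no_symptom_odds) copied from four rows of the table above.
rows = [
    ("Beijerinckiaceae", 0.059, 0.026),
    ("Parachlamydiales", 0.067, 0.1),
    ("Tissierellales", 0.978, 0.973),
    ("Hydrogenophaga", 0.034, 0.014),
]

def relative_odds(symptom, no_symptom):
    """How many times more often the taxon is seen with the symptom."""
    return symptom / no_symptom

def direction(ratio, tolerance=0.25):
    """Hypothetical rule: only flag a taxon when the log-ratio is clearly non-zero."""
    if abs(math.log(ratio)) < tolerance:
        return "unclear"
    return "reduce" if ratio > 1 else "increase"

# Strongest relative difference first.
ranked = sorted(
    rows,
    key=lambda r: abs(math.log(relative_odds(r[1], r[2]))),
    reverse=True,
)
for name, symptom, no_symptom in ranked:
    ratio = relative_odds(symptom, no_symptom)
    print(name, round(ratio, 2), direction(ratio))
```

Tissierellales lands in the "unclear" group, consistent with it needing the range-based treatment rather than a simple increase or reduce.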
Summary
This post explains my approach for ranking which bacteria should be targeted for change, primarily using the P value to guide priority. I compared two types of bacteria: one that is rare and one that is common. Rare bacteria are usually omitted from standard analyses because their scarcity makes conventional measures—such as calculating the mean and standard deviation—poor indicators of significance. Overlooking these bacteria is a methodological error, especially when the data skew exceeds 2, making such traditional metrics inappropriate.
It’s important to keep in mind that bacteria engage in quorum sensing, which influences the metabolites they produce and likely affects the symptoms observed. For some bacteria, the difference in their relative abundance between people with and without a particular symptom can be substantial and may be all that is needed for significance.
Anyone who regularly reads peer-reviewed medical studies on the microbiome will notice findings reported as bacteria being “too high” or “too low,” with phrases like “trending” when statistical significance isn’t reached. Frankly, my reaction to 95% of these papers is an eye-roll, as the statistical methods used are often inappropriate for the data at hand. With multiple degrees in statistics, professional memberships, and experience, I’m acutely aware of both best practices and common pitfalls.
Microbiome data distributions frequently display extreme skewness, often greater than 20. In such cases, computing mean and standard deviation is simply incorrect. My friend “Perplexity” writes: “Mean and standard deviation become inappropriate measures for computing significance if the distribution’s skewness is substantial, specifically when the absolute skewness exceeds 2.”
Despite this, using these metrics remains standard in high school statistics and unfortunately persists in many life science studies. This “comfort zone” approach does nothing but cloud true findings in microbiome science.
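A small illustration of the skewness problem, using made-up abundances (not site data) in which most samples are near zero and a few are high. The `skewness` helper is the plain moment-based population skewness; other definitions apply small-sample corrections.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: third central moment over the second to the power 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical relative abundances (%): 95 near-zero readings and 5 high ones.
abundances = [0.01] * 95 + [5.0] * 5

skew = skewness(abundances)
# A "mean minus 2 SD" lower limit, as a normal-theory reference range would use.
lower_bound = mean(abundances) - 2 * pstdev(abundances)

print("skewness:", skew)
print("mean - 2 SD:", lower_bound)
```

Even this mild example has skewness above 4, and the mean minus 2 SD limit is negative, which is impossible for a percentage. That is the practical sense in which mean and standard deviation stop being useful.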
My alternative methodology uses a much larger, highly annotated dataset: over 4,290 unique samples generously donated to Microbiome Prescription, most transferred from Biomesight.com. Importantly, these samples are uniformly processed and richly annotated with symptoms rather than diagnoses, yielding superior analytical clarity.
My Natural Questions to ask
Natural for a statistician that is.
For people with a symptom or diagnosis
What are the significant bacteria associated with (and likely causing) the symptom?
What are the threshold levels for these bacteria?
I use levels and not level because I have observed that the same symptom may occur with a bacterium outside of a specific range. That is, too high or too low. I have also encountered this reported in a few studies, often hidden under a term like “altered microbiome”.
There is a dangerous assumption in the literature that significant bacteria must be either too high or too low. I unfortunately know Kierkegaard’s “Either/Or” well.
There is no universal threshold for all symptoms; each has its own.
For people with dysbiosis but without a symptom that has a statistical model
How do you determine which bacteria are significant?
What are the threshold levels for these bacteria?
Over the last decade, these have been important questions because they lead directly to treatment suggestions.
They are also significant in evaluating progress. At present I have a forecasting algorithm that has a high prediction rate for symptoms from a microbiome. The forecasting algorithm is also useful for evaluating progress. An example follows, from a recent sample the donor asked me to review.
Prediction
The checks indicate that the donor agrees that they have this symptom.
Monitoring
The person above followed the suggestions and the subsequent test results are shown below.
What are the most common bacteria associated with symptoms?
Using more appropriate statistical methods on our 4,292 distinct samples, we found significant bacteria across 327 symptoms, with the following counts at each level of statistical significance.
| Significance: P < | Count |
|---|---|
| 0.05 | 13,855 |
| 0.01 | 12,411 |
| 0.001 | 7,614 |
| 0.0001 | 5,532 |
So what are the top ones at each of these significance levels?
Overall Significance
| Taxon | name | rank | Instances |
|---|---|---|---|
| 820 | Bacteroides uniformis | species | 165 |
| 35833 | Bilophila wadsworthia | species | 142 |
| 35832 | Bilophila | genus | 139 |
| 818 | Bacteroides thetaiotaomicron | species | 137 |
| 1426 | Parageobacillus thermoglucosidasius | species | 133 |
| 118884 | Gammaproteobacteria incertae sedis | no rank | 125 |
| 871324 | Bacteroides stercorirosoris | species | 124 |
| 120580 | Symbiobacterium toebii | species | 122 |
| 53244 | Desulfonatronovibrio | genus | 122 |
| 543349 | Symbiobacteriaceae | family | 122 |
| 2733 | Symbiobacterium | genus | 122 |
| 1498 | Hathewaya histolytica | species | 122 |
| 454155 | Paraprevotella xylaniphila | species | 120 |
P < 0.05
| Taxon | name | rank | Instances |
|---|---|---|---|
| 2950010 | Salidesulfovibrio | genus | 47 |
| 221711 | Salidesulfovibrio brasiliensis | species | 46 |
| 658623 | Chelonobacter | genus | 45 |
| 69224 | Erwinia psidii | species | 44 |
| 213462 | Syntrophobacterales | order | 44 |
| 3024408 | Syntrophobacteria | class | 44 |
| 31977 | Veillonellaceae | family | 44 |
| 1843489 | Veillonellales | order | 43 |
| 550 | Enterobacter cloacae | species | 43 |
| 35832 | Bilophila | genus | 42 |
| 841 | Roseburia | genus | 41 |
| 53244 | Desulfonatronovibrio | genus | 41 |
| 871324 | Bacteroides stercorirosoris | species | 41 |
| 1260 | Finegoldia magna | species | 41 |
| 1498 | Hathewaya histolytica | species | 40 |
P < 0.01
| Taxon | name | rank | Instances |
|---|---|---|---|
| 35833 | Bilophila wadsworthia | species | 51 |
| 78448 | Bifidobacterium pullorum | species | 50 |
| 820 | Bacteroides uniformis | species | 47 |
| 841 | Roseburia | genus | 46 |
| 818 | Bacteroides thetaiotaomicron | species | 44 |
| 118884 | Gammaproteobacteria incertae sedis | no rank | 41 |
| 1769729 | Hathewaya | genus | 41 |
| 1426 | Parageobacillus thermoglucosidasius | species | 41 |
| 112902 | Propionispora | genus | 40 |
| 36853 | Desulfitobacterium | genus | 40 |
| 386414 | Hoylesella timonensis | species | 40 |
| 119065 | unclassified Burkholderiales | family | 40 |
| 1853231 | Odoribacteraceae | family | 40 |
| 400091 | Hymenobacter xinjiangensis | species | 39 |
| 209080 | Propionispora hippei | species | 39 |
| 871324 | Bacteroides stercorirosoris | species | 39 |
| 69224 | Erwinia psidii | species | 39 |
| 35832 | Bilophila | genus | 39 |
P < 0.001
| Taxon | name | rank | Instances |
|---|---|---|---|
| 820 | Bacteroides uniformis | species | 50 |
| 35833 | Bilophila wadsworthia | species | 37 |
| 35832 | Bilophila | genus | 32 |
| 118884 | Gammaproteobacteria incertae sedis | no rank | 32 |
| 658623 | Chelonobacter | genus | 31 |
| 246787 | Bacteroides cellulosilyticus | species | 31 |
| 120580 | Symbiobacterium toebii | species | 31 |
| 543349 | Symbiobacteriaceae | family | 31 |
| 253238 | Ethanoligenens | genus | 31 |
| 2733 | Symbiobacterium | genus | 31 |
| 292833 | Candidatus Rhabdochlamydia | genus | 30 |
| 324707 | Candidatus Rhabdochlamydia crassificans | species | 30 |
| 1426 | Parageobacillus thermoglucosidasius | species | 30 |
| 689704 | Candidatus Rhabdochlamydiaceae | family | 30 |
| 70190 | Chroococcus | genus | 29 |
| 402401 | Chroococcus minutus | species | 29 |
| 1890464 | Chroococcaceae | family | 29 |
| 283169 | Odoribacter denticanis | species | 28 |
P < 0.0001
| Taxon | name | rank | Instances |
|---|---|---|---|
| 820 | Bacteroides uniformis | species | 40 |
| 246787 | Bacteroides cellulosilyticus | species | 34 |
| 818 | Bacteroides thetaiotaomicron | species | 30 |
| 1963360 | Parachlamydiales | order | 30 |
| 454155 | Paraprevotella xylaniphila | species | 30 |
| 1426 | Parageobacillus thermoglucosidasius | species | 30 |
| 2733 | Symbiobacterium | genus | 29 |
| 543349 | Symbiobacteriaceae | family | 29 |
| 120580 | Symbiobacterium toebii | species | 29 |
| 35832 | Bilophila | genus | 26 |
| 191412 | Chlorobiaceae | family | 25 |
| 256319 | Chlorobaculum | genus | 25 |
| 244127 | Anaerotruncus | genus | 25 |
| 189723 | Prevotella micans | species | 25 |
| 53244 | Desulfonatronovibrio | genus | 25 |
| 324707 | Candidatus Rhabdochlamydia crassificans | species | 24 |
| 191410 | Chlorobiia | class | 24 |
| 35833 | Bilophila wadsworthia | species | 24 |
Summary
This is a high-level overview of Significant Bacteria. The patterns above are specific to tests done by Biomesight; because of a lack of standardization, using these identifications for other tests is unsafe (in the legal sense). Background here. IMHO, it is a moral responsibility for labs to produce similar tables.
The key findings are:
“Common suspects” such as bifidobacterium and lactobacillus are missing!
Large sample sizes with the same processing are critical. The processing must be the same as used in a clinical setting.
Appropriate statistical methods must be used
Stay tuned for the next part as we drill deeper into appropriate handling of data, with some specific issues like Long COVID.
Probiotics and the above gets interesting. Take Bacteroides uniformis which is at the top of many of these tables. If we go to my bacteria association site,
We can determine the probiotics (available or pending) that will increase this bacterium (none decrease it)
Again, the “cure all” lactobacillus and bifidobacterium genera are absent (apart from Ligilactobacillus ruminis, which is not currently available).
In the first post of this series, Probiotics Fundamentals: Part 1 Specific Strains, I cited strains that are available retail and that have been researched. The logical starting point is to search for your needs, read the studies and then rank the probiotics in preferred order for a personal trial. You want to do one probiotic at a time, with rotation, as described in the prior post.
To search for strain-specific studies of probiotics available retail, click here.
No Study found or issue not listed
The next step is to look at the conditions that I have abstracted/extracted studies for, listed at “U.S. Nat. Lib. Medical Conditions Studies with Microbiome Shifts“. We are shifting from strain to species level. This gives several paths, let us examine Autism. There are
Based on Published Studies of Species
Clicking on [Can You Help Improve Suggestions] will take you to a page. At the bottom you will see “Treatment Substances” which lists things that have helped in studies. Scan it for probiotic names, for example:
This suggests that L. reuteri with inulin may help. The sources are linked. Make sure you read them.
Based on Deficiencies of Probiotic Bacteria
Clicking on 🦠 Taxons will take you to a page showing all of the bacteria shifts reported for the condition.
Look for Lactobacillus, Bifidobacterium, etc. with ⬇️
These species are found at lower levels, suggesting their metabolites are also reduced. Supplementing with them as single-strain probiotics is logical. Stay at the species level (e.g., Bifidobacterium longum) rather than higher classifications such as the genus Bifidobacterium. In general, avoid probiotic mixtures, as they may include strains that are counter-indicated (e.g., Bifidobacterium catenulatum, Bifidobacterium breve) or strains for which we lack sufficient information.
Based on Modelled correction of Bacteria
Clicking on 🥣 Candidates will send the huge bacteria list above through a fuzzy logic expert system to compute suggestions, with weights given for each one.
The issue comes from the fact that the model and the studies are based on multiple subgroups of people with Autism (or other conditions). The data might be accurate within each subgroup, but when you merge them together, you can end up with contradictions. So it’s not really a problem with the approach; it’s a problem with the data mix.
The best rule of thumb is to start with the things that show up as agreements across the data. For example: Bifidobacterium longum and Limosilactobacillus reuteri. Once you’ve tried probiotics that have clear agreements, then you can carefully experiment with the ones where there’s disagreement and see how your body responds.
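The "start with agreements" rule can be sketched as below. This is not how the expert system works internally; the subgroup dictionaries, the "take"/"avoid" labels and the placeholder probiotic names are all hypothetical.

```python
def split_by_agreement(subgroup_suggestions):
    """Separate items whose direction is consistent from those that conflict."""
    directions = {}
    for suggestions in subgroup_suggestions:
        for item, direction in suggestions.items():
            directions.setdefault(item, set()).add(direction)
    agreed = sorted(i for i, d in directions.items() if len(d) == 1)
    conflicting = sorted(i for i, d in directions.items() if len(d) > 1)
    return agreed, conflicting

# Hypothetical suggestions derived from three subgroups of studies.
subgroups = [
    {"probiotic_a": "take", "probiotic_b": "take", "probiotic_c": "take"},
    {"probiotic_a": "take", "probiotic_c": "avoid"},
    {"probiotic_b": "take", "probiotic_d": "avoid"},
]

agreed, conflicting = split_by_agreement(subgroups)
print("try first:", [i for i in agreed if "take" in {s.get(i) for s in subgroups}])
print("experiment carefully:", conflicting)
```

Items in the conflicting list are the ones to leave until after the agreed ones have been tried.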
The next level up in Probiotic Suggestions
It is pretty simple: get a microbiome test. My preferred tests are:
Thorne for shotgun (more expensive but much higher detail)
You want to ideally get a test that reports on all of the common probiotic bacteria. Many common tests do not report many of these. For example: Diagnostic Solution GI-Map reports only
On the other hand both of the above tests report species.
When you select a test, you should check Microbiome Prescription to see what the detection rate is. For example for Bifidobacterium longum, we see how often this is detected in samples.
For the shotgun tests (Xenogene and Thorne) we see it detected 96% and 100% of the time. If it shows low, you can have confidence in taking some.
For SequentiaBiotech we see it is detected 25% of the time. If you have none reported, we are left uncertain whether you actually have none or whether the none is because of the test’s methodology.
Another example is L. reuteri, which the shotgun tests find in over 50% of samples, while some 16s tests find it in only 2% of samples.
Bottom Line
We’re piecing things together from lots of scattered knowledge, and there’s no single standard method—either for testing microbiomes in labs or for the studies themselves. Nothing here is clear-cut; everything’s kind of fuzzy, sometimes super fuzzy. In this post, the focus was on picking probiotics for a condition using literature (an “a priori” approach). Basically, it means trusting the data at face value, even though we know it isn’t rock-solid.
I believe it is in the consumer interest to share this email thread and to promote discussion of this issue.
Blacklisting is the action of suggesting something to be avoided or distrusted.
Request
[Customer name withheld] has forwarded me the PDF and some associated CSV files. She wishes to see what recommendations a fuzzy logic expert system that uses over 7.4 million facts based on data from the US National Library of Medicine will suggest.
I know that the following data is very much available and possible to provide. Other firms like Biomesight.com, Thorne, Vitract and Precision Biome have no trouble providing it:
For all taxonomical layers (From Clade to strains [when available]) just 3 numbers are sufficient.
NCBI Taxon Number,
Percentage Amount,
Percentile Ranking across a reference set of healthy individuals
Additional data is welcome, but not required:
Names
Your reference ranges
etc
Percentiles should be actual percentiles and NOT percentiles estimated using mean and standard deviation. Most bacteria have a SKEW exceeding 20. Using the mean and standard deviation requires a SKEW of zero.
Your customer would greatly appreciate a speedy return of an appropriate file. With that file in hand, we will add your lab to our list of over 50 labs that our free site supports. (See https://microbiomeprescription.com/Upload/Index ).
If you are unable to provide such, please tell us so we may blacklist your site as not supportable, to spare other consumers a waste of money.
Response
Hi Ken, Thank you for your detailed email and for sharing your perspective on the data formats and metrics you require.
We’d like to clarify that Microba uses the Genome Taxonomy Database (GTDB) for microbial classification. GTDB and NCBI classify genomes differently, our species consist of multiple genomes which may have different NCBI classifications, species level classifications cannot always be mapped to each other through name matching alone. Due to this, providing a microbiome profile in NCBI taxonomy is not practical nor would it be a correct representation of the actual microbiome profile.
Once GTDB is formally supported within your workflow, please reach out and we can discuss options for providing the data your service requires.
We appreciate your understanding and encourage you to support the more accurate, resolved, and phylogenetically consistent GTDB taxonomy.
Kind regards, The Microba Team
Reply To Response
Many thanks for your reply. Our purpose is to provide clinical suggestions for review by medical professionals to people suffering from a wide variety of conditions using hallucination-free AI.
Unfortunately, the Genome Taxonomy Database (GTDB) appears to be a research tool and IMHO seems very inappropriate and misleading to sell to consumers. GTDB was first proposed in academic papers in August 2018. We were active in the microbiome before that, and the de facto standard in the industry was then, and still is today, the NCBI. The leading consumer microbiome testing company back then was uBiome, which provided NCBI identifiers. In the 7 years since first release, we have seen what appears to be some 226 revisions given the release of 10-RS226, dated April 16, 2025.
Checking the US National Library of Medicine, there appear to be fewer than 200 studies done using GTDB that are of likely clinical use. With NCBI, we found over 20,000 suitable studies. Regardless, no study done prior to 2020, likely 2022, can, in a legal sense, be safely used for clinical purposes.
I am aware of many tools to convert GTDB to NCBI, a few are:
TaxonKit: Command-line toolkit that supports creating NCBI-style taxdump files from GTDB and also reformatting and mapping taxonomies.
gtdb_to_taxdump: Python tool to convert GTDB taxonomy files into NCBI taxdump format, usable by downstream tools like Kraken2.
GTDB-Tk: Assigns genomes to GTDB taxonomy but includes metadata fields mapping to NCBI taxonomy, enabling conversion between formats.
NCBI-GTDB Map: Direct tool for mapping GTDB taxonomy to NCBI taxonomy, supporting both directions and handling rank prefixes.
gtdb-taxdump: A specialized toolkit for generating stable, trackable NCBI taxdump files from GTDB releases with reproducible taxon IDs.
NCBI-taxonomist: Python command-line utility that retrieves, handles, and allows mapping of taxonomic information, supporting cross-database operations.
I am aware that folks embracing the hottest new technology can have an attitude, especially when most recent studies decline to use it. Intransigence on this issue is not beneficial to your customers (people with challenging health issues) unless you are willing to provide a GTDB-based suggestions engine, working off only GTDB studies, with hallucination-free AI that is equivalent to or better than what NCBI can provide. Until such time, I would advocate that you stop making misleading sales to consumers.
Given your response, I feel that I have no option but to blacklist you for clinical use.
From years of using different probiotics, I have developed some simple rules of thumb on the use of probiotics. These rules have worked very well with probiotics from my favorite source: Maple Life Science™. Their probiotics show both manufacture date and expiry date. Typically they arrive within a month of the manufacture date, direct from the manufacturer. IMHO, that gives a high probability of them being alive at arrival.
Do one Species at a time
Maple Life Science™ probiotics are usually single species with just FOS as an additive. My usual preference is taking probiotic powder dissolved in warm water at least one hour away from any meals. Bacteria in your gut have to enter somewhere, and that location is the mouth. You may also want to alter your mouth microbiome so it is less likely to repopulate your gut with undesirable bacteria.
My typical pattern is doing one probiotic for 2 weeks and then rotate to another. See below for the rationale.
How do I know that they are different probiotics?
I could send them off for testing, but what I have observed is this:
They are often slightly different colors
They taste different
I do not know definitively if they are as claimed, but I do see that they are different.
Are there any changes within a week?
I monitor myself after starting each probiotic. I expect at least one of the following to change:
If there are no changes, then I label the bottle as “No effect” and put it at the back of the refrigerator shelf. To me, probiotics should change the microbiome in some observable way. The above are indicators of change. This does take some self-awareness of each.
Personal Example: My wife has Crohn’s disease. Whenever she starts to have a flare, she takes Mutaflor (E. coli Nissle 1917) probiotics and within 1-2 hours the flare ends. A probiotic's impact should be apparent in hours or a few days.
Dosage titration
When I try a new probiotic, I usually start with the standard dosage. If there is no apparent change in 3 days, I double the dosage (and keep doubling every 3 days for up to 14 days). In practical terms:
Day 1: 1 capsule
Day 4: 2 capsules
Day 7: 4 capsules
Day 10: 8 capsules
Day 13: 16 capsules
The logic is simple: there may be fewer viable bacteria (for some reason) and thus more capsules are needed to reach an effective dosage that induces a change.
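The doubling pattern above can be written as a short Python sketch. The function name and its parameters are illustrative assumptions of mine, not a formal protocol; it just reproduces the day and capsule counts listed.

```python
def titration_schedule(start_capsules=1, step_days=3, max_days=14):
    """Return (day, capsules) pairs, doubling the dose every `step_days` days.

    Assumption: day 1 uses the standard dose and the course stops at `max_days`.
    """
    schedule = []
    day, capsules = 1, start_capsules
    while day <= max_days:
        schedule.append((day, capsules))
        day += step_days      # wait three days for an observable change
        capsules *= 2         # no change seen, so double the dose
    return schedule


for day, capsules in titration_schedule():
    print(f"Day {day}: {capsules} capsule(s)")
```

In practice you would stop doubling as soon as a change is observed; the sketch only lists the maximum schedule.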
Probiotic Rotation
To me, the purpose of probiotics is to change the microbiome, typically, a dysbiosis. The metabolites and bacteriocins being produced by the probiotics will alter the population by either increasing metabolites that may feed (increase) other bacteria or decrease other bacteria by the bacteriocins. In other words, I view the probiotics as a course correction around a reef.
Bacteriocins are natural antibiotics. Many antibiotics are derived from bacteriocins. This means that bacteriocin resistance needs to be considered. Typically, most targeted bacterial populations include some members that are resistant to some form of antibiotic. These resistant members will prosper because their sibling competitors are no longer there. I have read several studies that found pulsed or rotated antibiotics were more effective than continuous antibiotics. My takeaway is simple: “a course of probiotics” followed by rotation. How long should the course be? I take the duration from the typical duration of prescribed antibiotics (10-14 days).
Some known bacteriocins are listed below
Nisin – produced by Lactococcus lactis.
Pediocin PA-1/AcH – produced by Pediococcus acidilactici.
Enterocin AS-48 – produced by Enterococcus faecalis.
Colicin A – produced by Escherichia coli.
Colicin E1 – produced by Escherichia coli.
Microcin J25 – produced by Escherichia coli.
Plantaricin E – produced by Lactobacillus plantarum.
Plantaricin F – produced by Lactobacillus plantarum.
Leucocin A – produced by Leuconostoc gelidum.
Helveticin I – produced by Lactobacillus helveticus.
Lactocin MXJ 32A – produced by Lactobacillus coryniformis.
Enterolysin A – produced by Enterococcus faecalis.
Salivaricin – produced by Lactobacillus salivarius.
Pyocin S2 – produced by Pseudomonas aeruginosa.
Microcin E492 – produced by Klebsiella pneumoniae.
Lactococcin G – produced by Lactococcus lactis.
Plantaricin JK – produced by Lactobacillus plantarum.
Plantaricin EF – produced by Lactobacillus plantarum.
Goadsporin – produced by Streptomyces sp.
Plantazolicin – produced by Bacillus amyloliquefaciens.
Some antibiotics obtained from bacteria:
Streptomycin – from Streptomyces griseus
Chloramphenicol – from Streptomyces venezuelae
Tetracycline – from Streptomyces rimosus and Streptomyces aureofaciens
Erythromycin – from Saccharopolyspora erythraea (formerly Streptomyces erythraeus)
Neomycin – from Streptomyces fradiae
Lincomycin – from Streptomyces lincolnensis
Rifamycin – from Amycolatopsis rifamycinica (previously classified as Streptomyces mediterranei)
Vancomycin – from Amycolatopsis orientalis
Bacitracin – from Bacillus subtilis and Bacillus licheniformis
Gramicidin – from Bacillus brevis
Polymyxin B – from Bacillus polymyxa
Teicoplanin – from Actinoplanes teichomyceticus
Fusidic acid – from Fusidium coccineum (a fungus, included due to bacterial-related antibiotic use)
Novobiocin – from Streptomyces niveus
Ristocetin – from Amycolatopsis lurida
Mupirocin – from Pseudomonas fluorescens
Tyrocidine – from Bacillus brevis
Clavulanic acid – from Streptomyces clavuligerus
Daptomycin – from Streptomyces roseosporus
Carbapenems (e.g., Imipenem) – from Streptomyces species and related bacteria
Bottom Line
The bottom line is simple: rotate and note changes. If there are no changes,
Below you see some date information shown on a few probiotic products. Many products show neither a production date nor a best-by date. They are not legally required.
I have always favored probiotics that include a manufacturing date. Two of the products above cite a three year shelf life (assuming appropriate storage). My favorite source, Maple Life Science™, ships directly to me from their factory. Usually they arrive within one month of manufacture (occasionally, the same month!)
From the moment that a probiotic leaves a factory, temperature control is usually non-existent. The trucks that transport them are likely not refrigerated, nor are the wholesale warehouses storing them. When they arrive at a health food store they are typically placed in a refrigerated cabinet for presentation to customers. In many cases, if you insist on seeing where the bottles are stored beforehand, do not be surprised to see that it is not a refrigerated area. It is possible that the probiotics may be subject to 37C (98.6F) for months before the shopkeeper places them in the refrigerated cabinet. Each number on the left scale indicates 1/10 of the number above it.
A shipment from an east coast producer direct to a west coast store takes 4-8 days. If the shipment goes to a wholesaler’s warehouse, then expect a few days more. In summer, as much as 97% of the viable bacteria that leave the factory may be killed if transit takes 15 summer days (per the above chart).
The short version is that during summer in the US, the amount of viable bacteria may be 1/100 of what the probiotic had at the factory. There are stick-on labels that will change color if storage exceeds a threshold — unfortunately, no one is using that.
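To put numbers on the die-off, here is a rough Python sketch. It assumes log-linear (first-order) die-off, which is my reading of a chart whose scale drops by 1/10 per step, and it takes the "97% killed in 15 summer days" figure from the text as its only input. The function names are hypothetical.

```python
import math


def decimal_reduction_time(fraction_killed, days):
    """Days needed for a 10-fold drop in viable count (log-linear assumption)."""
    surviving = 1.0 - fraction_killed
    return days / -math.log10(surviving)


def surviving_fraction(days, d_value):
    """Fraction of the factory count still viable after `days` in the heat."""
    return 10 ** (-days / d_value)


# Only input: 97% killed after 15 summer days (figure quoted in the text).
d_value = decimal_reduction_time(0.97, 15)
print(f"10-fold loss roughly every {d_value:.1f} days")
for days in (4, 8, 15, 20):
    print(f"{days:>2} days: {surviving_fraction(days, d_value):.1%} still viable")
```

Under this assumption, a 1/100 survival corresponds to about twice the 10-fold time, which is in the same range as a few weeks of summer transit and warehousing.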
In terms of probiotics sold by microbiome testing companies, the only one that I know of that “ships direct from probiotic manufacturer” is PrecisionBiome.Eu that has established a relationship with a German probiotic producer. Their client base is the EU, so transit time from “vat to customer” is short.
Absence of Regulations Problem
There are recommendations, such as Best Practices Voluntary Guidelines for Probiotics [2017], that provide some guidance (ignored by most producers). Producers can make claims of shelf life (best before) of 10 years without consequences. With no manufacture date, no one knows when it was made. Calls to their customer lines will usually give questionable answers. The probiotic industry has many active lobbyists to inhibit anything that may affect profits.
Finger Pointing for Probiotics being DOA
If you buy a probiotic, find it is effectively DOA (dead on arrival) and contact the manufacturer, the manufacturer will claim no responsibility once it left their premises. It is the responsibility of the trucking companies and wholesaler storage. Those folks will then point at the retail store mishandling things.
This harsh reality is why I try to buy direct from the manufacturer (no Amazon, no health food stores).
Bottom Line
“I didn’t get any benefit from probiotics, I got them refrigerated from my trusted health food store“ is a frequent complaint that I hear. IMHO, they do not work because they have been well cooked! Our habit is to order our year's supply of probiotics from Maple Life Science™ in the fall and winter. The colder it is outside, the better it is.
My “rules of thumb” on taking probiotics will be the next topic.
What is the difference between a Species and a Strain? To understand this, view Species as “dogs” and strains as specific breeds. Is picking a Chihuahua as a police dog a good choice, or is a St. Bernard suitable for someone with a disability living in a one room apartment?
The chart below shows different aspects of different strains of Lactobacillus Reuteri. When you buy a probiotic named “Lactobacillus Reuteri”, it is unlikely that the host species it was obtained from is specified on the bottle. If it was not from a human, it is very unlikely that it will reproduce or take root in your body.
Probiotic manufacturers and packagers are focused on making money. They will ask for the cheapest source for a probiotic that they expect to be able to sell for the greatest profit. Human source is not a factor, cheapness is!
I have known people that are histamine sensitive that are fine with one brand of Lactobacillus Reuteri but get sick from another brand…. Looking at the chart below, the answer is obvious: One has a histamine producer and one does not.
This morning I was asked about Bacteroides fragilis BF839 which is cited in several studies on the US National Library of Medicine. Most of the studies are from 2024 or 2025. At present, it is not for sale anywhere and I do not expect it to be for five(5) years at least because of approval processes. Given the authors’ location, I expect it will be first available in China.
Researched and Strain is for sale 🙂
Several years ago I set up a free page listing those available (somewhere in the world). I also automated a weekly scan of the US National Library of Medicine for any new studies using these strains. The page is kept up to date.
Occasionally, someone emails me about a new strain that has one or more studies associated. I add those to the list. If you find one that I missed, please email me!
The page allows searching across the studies abstracts for key words. For example, if you are interested in Autism, just enter that and click search. The page will then show the retail brands with links to the studies.
The intent of the page is to discourage random trials of probiotics that have no effect (except on bank accounts).
List of Strains with name of product or seller
At present we are at 156 different strains. These are listed below.
Akkermansia muciniphila WB-STR-0001: Pendulum Glucose Control
VSL3 / Visbiome / De Simone Formulation: Alfasigma USA, Inc.
Safest Product for Correct Identification
These strains are usually under legal protection and thus the manufacturer has a vested (financial) interest to make sure that “what is advertised is delivered”.
Why is this important? Just look at some of the literature:
64.4% were incorrectly labeled in either number of viable cells or bacterial species
51.6% exhibited resistance to at least one antimicrobial agent
26.8% had a lower number of viable cells than their label claims; no viable Lactobacillus was found in some products
57.8% comprised other species rather than those claimed on the contents
Your first choice should be the probiotic strains that are most likely to be as advertised and have been studied for the symptom or condition that you are interested in.
Mast Cell Activation Syndrome (MCAS) has been a frustrating condition to deal with. It is likely that the microbiome is a factor, but finding bacteria patterns has had poor results. Recently I did a post, One Consistent Pattern Across Different Vendors Results, where the different lab tests simply had no agreement on the bacteria involved (see this page to illustrate). Using estimates on compounds and enzymes from KEGG: Kyoto Encyclopedia of Genes and Genomes went two ways:
enzyme associations were worse than bacteria
compounds produced were far better than bacteria, 31x better
I decided to drill into these associations, restricting them to associations seen in ALL of the uBiome, Biomesight and Ombre datasets. The goal is to do a manual walkthrough of using this data to identify issues and potential solutions.
Logic Flow
1. Obtain a large collection of microbiome samples processed through different labs (in our case, 3)
2. Estimate the net amount of each compound produced from each sample
3. Using the symptom annotation, determine which compounds are significant using chi2
4. Identify the compounds that are significant for all three labs (concurrent)
5. Remove the compounds that only occur in a few labs; assume they are probable noise
6. Take a shotgun sample from an individual and compute the compounds
7. Identify where we have compounds matching step 5 (same direction)
8. Ignore/discard compounds that are not produced or used as a substrate by the bacteria in the sample
9. Create a list by reversing taxa->compound to compound->taxa for taxa that are present in the sample
10. Pass the bacteria list with the desired shift (increase/decrease) to the Fuzzy Logic Expert System to generate suggestions
11. List the suggestions
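The chi2 and "significant for all three labs" steps can be sketched in Python as follows. This is a minimal illustration, not the site's actual code: the 2x2 layout, the function names and all counts are hypothetical, and the p-value uses the closed form for one degree of freedom so that only the standard library is needed.

```python
import math


def chi2_2x2(a, b, c, d):
    """Chi-squared statistic and p-value (1 degree of freedom) for a 2x2 table.

    a: symptom,    compound low      b: symptom,    compound not low
    c: no symptom, compound low      d: no symptom, compound not low
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(stat / 2))  # chi2 survival function, df = 1
    return stat, p_value


def significant_compounds(tables, alpha=0.001):
    """Compounds whose 2x2 table is significant at `alpha` for one lab."""
    return {name for name, t in tables.items() if chi2_2x2(*t)[1] < alpha}


# Hypothetical counts, for illustration only (not the site's data).
labs = {
    "uBiome": {"compound_A": (60, 40, 200, 490), "compound_B": (30, 70, 210, 480)},
    "Biomesight": {"compound_A": (300, 200, 1200, 2904), "compound_B": (150, 350, 1300, 2804)},
    "Ombre": {"compound_A": (120, 80, 400, 942), "compound_B": (65, 135, 420, 922)},
}
per_lab = [significant_compounds(tables) for tables in labs.values()]
concurrent = set.intersection(*per_lab)  # keep only compounds significant in ALL labs
print(concurrent)
```

The intersection is the strict version of "concurrent": a compound that is significant in only one or two labs is dropped as probable noise.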
[Step 4 Above] The compounds identified are below with ALL OF THEM BEING LOW:
Bacteria produce and consume the above metabolites/compounds.
We are able to compute the impact of probiotics on the above; these are listed under Probiotics modeled on Compounds on this link. None makes it better; in fact, all are predicted to make things worse. Checking PubMed (while some probiotics may help), the number of studies is surprisingly sparse. Sparse studies for something like this suggest that there may be a large number of unpublished studies showing no effect or a negative effect. These unfavorable studies are very rarely published, especially when funded by someone owning the patent of the probiotic being tested.
Trying this approach on a Shotgun Sample of someone with MCAS [Step 5]
There are 52 compounds identified above. I have a Thorne sample. 42 of the compounds were less than 1%ile, 6 were over 50%ile. The pattern was obtained using this trio of 16s-labs (ubiome, biomesight and ombre) and the pattern appears to hold true for a sample from a shotgun analysis. We have 80% agreement when randomness would suggest < 2% agreement.
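As a rough check on how unlikely this agreement is by chance, here is a small Python sketch. It assumes (my simplification) that the 52 compounds are independent and that each has a 1% chance of falling below the 1st percentile in a random sample; real compound estimates are correlated, so treat the result as an order-of-magnitude illustration only.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_compounds, n_low = 52, 42          # counts quoted in the text
agreement = n_low / n_compounds
chance = binomial_tail(n_compounds, n_low, 0.01)

print(f"agreement: {agreement:.1%}")
print(f"chance of >= {n_low} of {n_compounds} below the 1st percentile: {chance:.3g}")
```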
Looking at the details, we see that most of the 42 had no bacteria producing these compounds. This suggests that we are dealing with sparse data and possible false identification. Restricting to those reported as > 0%ile with over 200 samples we ended up with just 11 compounds.
Working backwards from these 4, we end up with 3,953 taxa on our first pass. For many of these bacteria none were reported in the sample, hence the challenge of encouraging bacteria that are not there!!!
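Working backwards means inverting the taxa-to-compound table and keeping only bacteria that are actually in the sample. A minimal Python sketch follows, with a hypothetical mapping and hypothetical compound names (the real KEGG-derived table is far larger):

```python
from collections import defaultdict


def invert_mapping(taxa_to_compounds, present_taxa):
    """Reverse taxon -> compounds into compound -> taxa, keeping only taxa in the sample."""
    compound_to_taxa = defaultdict(set)
    for taxon, compounds in taxa_to_compounds.items():
        if taxon not in present_taxa:
            continue  # absent from the sample, so there is nothing to shift
        for compound in compounds:
            compound_to_taxa[compound].add(taxon)
    return dict(compound_to_taxa)


# Hypothetical example data, for illustration only.
taxa_to_compounds = {
    "Streptococcus anginosus": {"compound_A", "compound_B"},
    "Pseudomonas putida": {"compound_A"},
    "Cupriavidus necator": {"compound_C"},
}
present = {"Streptococcus anginosus", "Pseudomonas putida"}
print(invert_mapping(taxa_to_compounds, present))
```

Filtering on the sample first is a design choice: it keeps the candidate list short, at the cost of never suggesting bacteria that were not detected.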
Increase these bacteria (most significant is at top)
Corynebacterium diphtheriae
Streptococcus anginosus
Rhodopseudomonas palustris
Ralstonia solanacearum
Pseudomonas oryzihabitans
Sorangium cellulosum
Pseudomonas putida
Decrease these bacteria (most significant is at bottom)
Cupriavidus necator
Cupriavidus basilensis
Thermaerobacter sp. FW80
Note we are talking species here. KEGG is actually at the strain level and we extrapolated to species. Passing this data to the fuzzy logic expert system we get as suggestions [Step 11]:
Chenopodium quinoa {Quinoa}
hydromorphone – an addictive pain killer
proton-pump inhibitors (prescription)
synthetic disaccharide derivative of lactose {Lactulose}
Moringa Oleifera {Moringa}
Limosilactobacillus fermentum {L. fermentum}
Heyndrickxia coagulans {B. coagulans}
Latilactobacillus sakei {Lactobacillus sakei}
Bottom line, it is possible to get suggestions. The route is not direct. The suggestions above are typically viewed as safe (except the one crossed out), and thus I will pass it along to the person whose sample I used. At the end of multiple steps, we ended up with four (4) compounds that mapped to some 10 bacteria to shift. Those bacteria resulted in just 7 suggestions. We are travelling well outside of the mapped world.
From Perplexity AI
To increase or decrease the abundance of the listed bacteria, specific strategies should be considered, but many target species are not commonly sought for probiotic modulation due to their environmental, pathogenic, or niche-specific nature. Below, approaches to influence each are summarized when evidence is available.
Increase These Bacteria
Corynebacterium diphtheriae: There is no safe or recommended method to purposely increase C. diphtheriae, as it is a human pathogen responsible for diphtheria, a serious disease. Laboratory studies using animal models suggest that general Actinobacteria (the phylum to which Corynebacterium belongs) can be modulated using certain yeast probiotics or lactic acid bacteria in animal models, but these findings are not intended or safe for human manipulation targeting pathogenic species. Increasing this bacterium is not medically advised.
Streptococcus anginosus: This member of the Streptococcus anginosus group is a commensal found mainly in the mouth, throat, and gut but is also opportunistically pathogenic. There is currently no evidence-based supplement or intervention to selectively increase S. anginosus in humans; it can be affected by local tissue conditions, diet, and the existing microbiome, but any intervention to increase such bacteria outside of medical settings is not recommended.
Rhodopseudomonas palustris: This bacterium is environmental and photosynthetic, rarely present in the human body. In agricultural or industrial settings, it is cultured using media containing carbon (like molasses), nitrogen (like corn steep liquor), and trace minerals, kept at slightly acidic to neutral pH and moderate temperatures (around 37°C). There is no role for increasing this bacteria in humans; it is used as a soil and water probiotic for plants.
Ralstonia solanacearum, Pseudomonas oryzihabitans, Sorangium cellulosum, Pseudomonas putida: These are mainly environmental, soil, or water-dwelling microbes. There is no supplement or dietary approach to increase these in the human gut or on human mucosa, and such actions are not studied for human health.
Decrease These Bacteria
Cupriavidus necator: This environmental bacterium can be suppressed in biotechnological fermentations by nutrient limitation (especially removing nitrogen or phosphorus), but there is no guidance for targeting this in the human body, as it is not a usual component of the human microbiome. In environmental settings, its growth is limited by removing accessible carbon, nitrogen, or phosphate sources.
Cupriavidus basilensis: Similar to C. necator, mainly handled in environmental and fermentation settings; standard medical interventions if infection occurs would use specific antibiotics based on sensitivity.
Thermaerobacter sp. FW80: This is a thermophilic (heat-loving) environmental bacterium not typically present in or relevant to humans. Methods to decrease it would involve lowering environmental temperatures, not relevant to dietary or supplement choices.
Practical Notes
Most of the bacteria on your list are not normal human microbiome residents and manipulating them carries significant health risks or is not practically or ethically possible.
For environmental and plant-associated bacteria (such as R. palustris, P. putida, Ralstonia, etc.), enrichment is done in laboratory or agricultural settings with nutrient and temperature controls but is not safe or meaningful for humans.
For bacteria that are pathogens or opportunists (C. diphtheriae, S. anginosus), intentional enrichment is never recommended.
For reducing rare environmental or opportunistic bacteria in humans, general good hygiene and avoiding immunosuppression help, while in clinical cases, antibiotics may be used based on sensitivity.
In my prior post, Patterns of Microbiome Distributions Across Different Vendors, we saw that the distribution patterns were not consistent. With annotated samples from several different vendors, it was logical to see if there was consistency of bacteria identified across vendors for specific symptoms. This question was directed to me by Chidozie Ojobor, Ph.D. The results can be viewed on my Special Studies page.
This page hunts for statistical association with P < 0.001. The results shown are not only for bacteria, but estimates on compounds and enzymes using KEGG: Kyoto Encyclopedia of Genes and Genomes.
Definitions:
Singleton: only found in one of the labs
Pair: two labs reported the same
Triplet: all three labs reported the same
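These definitions amount to counting how many labs report the same association. A minimal Python sketch, using hypothetical (bacterium, symptom) pairs rather than the actual data:

```python
from collections import Counter


def classify_agreement(findings_by_lab):
    """Count associations reported by one lab (singleton), two (pair) or three (triplet)."""
    labs_reporting = Counter()
    for findings in findings_by_lab.values():
        labs_reporting.update(set(findings))  # each lab counts once per association
    names = {1: "singleton", 2: "pair", 3: "triplet"}
    return Counter(names.get(n, f"{n}-lab") for n in labs_reporting.values())


# Hypothetical findings, for illustration only.
findings = {
    "uBiome": [("Taxon A", "brain fog"), ("Taxon B", "anxiety")],
    "Thryve": [("Taxon A", "brain fog"), ("Taxon C", "anxiety")],
    "Biomesight": [("Taxon A", "brain fog"), ("Taxon B", "anxiety")],
}
summary = classify_agreement(findings)
print(summary)
```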
Given the difference of sample sizes and thus significance levels, some lack of consensus is expected.
uBiome: 790
Thryve: 1542
Biomesight: 4604
Using the P < 0.001 data, we had the following:
Looking at bacteria to symptom agreement
19558 singleton relationships
1914 pair relationships
94 triplet relationships, or 0.436%
Looking at enzymes to symptom agreement
74159 singleton relationships
1629 pair relationships
3 triplet relationships, or 0.004%
Looking at compound to symptom agreement
69147 singleton relationships
9364 pair relationships
6189 triplet relationships or 7.3%
Of special interest is that when we went to P < 0.0001 for compound to symptom, we got significantly better results:
5952 singleton relationships
45439 pair relationships
8153 triplet relationships or 13.7%
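The percentages above are the triplet count divided by all relationships (singleton + pair + triplet). A short Python sketch that recomputes them from the counts listed; the dictionary labels are mine:

```python
# (singletons, pairs, triplets) as listed above
counts = {
    "bacteria, P < 0.001": (19558, 1914, 94),
    "enzymes, P < 0.001": (74159, 1629, 3),
    "compounds, P < 0.001": (69147, 9364, 6189),
    "compounds, P < 0.0001": (5952, 45439, 8153),
}


def triplet_share(singletons, pairs, triplets):
    """Fraction of all relationships that all three labs agree on."""
    return triplets / (singletons + pairs + triplets)


for label, row in counts.items():
    print(f"{label}: {triplet_share(*row):.3%}")

ratio = triplet_share(*counts["compounds, P < 0.0001"]) / triplet_share(*counts["bacteria, P < 0.001"])
print(f"compounds vs bacteria: about {ratio:.0f}x")
```

The last line compares the best compound result with the bacteria result, which comes out at roughly 31 times the agreement.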
Bottom Line
While I am doing a naive estimate for compounds using KEGG data, the results support the model:
It is not the bacteria that causes the symptoms, it is the net amount of compounds produced by the bacteria that causes the symptoms. Surplus or deficiency of compounds can come from a vast array of different collections of bacteria. The bacteria may just be noise!!
This adds one further, and significant, level of indirection to any model that generates suggestions for a symptom.
This week I had some discussion with several CEOs of various microbiome testing companies and the question came up:
Can we take the percentile threshold from the Kaltoft-Møldrup(KM) Range computation and apply it to other tests?
If this is likely true, then we reduce the impact of the Blue “Whale in the Room”
Fortunately, I have the data to reasonably answer that (for the common good) and show charts from some of the 60+ providers that have had samples uploaded. Most people in this area are very siloed into one lab, one way and cannot see the forest, only the leaf that they are on.
Lactobacillus
[Charts: Biomesight, uBiome, Ombre, PrecisionBiome, Vitract, Thorne]
Bifidobacterium Longum
[Charts: Biomesight, uBiome, Ombre (not sufficient data, 3 samples had it), PrecisionBiome, Vitract (not sufficient data, 3 samples had it), Thorne]
Faecalibacterium prausnitzii
[Charts: Biomesight, uBiome, Ombre, PrecisionBiome, Vitract, Thorne]
Bottom Line
The first item of note: two labs will typically (99% odds!!) report no Bifidobacterium Longum! In both cases it is on their reports, but they rarely detect any. Some medical practitioners (not knowing better about the test) will strongly advocate it as a probiotic with no sound evidence to support that. This returns us to Scott’s quotation cited above, the blue whale!
While conceptually the ability to transfer percentile ranges between labs is very appealing, it appears to be unsafe. Looking at the lower bounds for Faecalibacterium prausnitzii, we see the following suggested (eyeballing; make your own estimates):
Biomesight: 18%ile
uBiome: 30%ile
Ombre: 12%ile
PrecisionBiome: 30%ile
Vitract: 12%ile
Thorne: 20%ile
To me, this settles the issue. The KM percentile estimates for ranges cannot be transferred across labs. They are lab specific and (unlike the naive mean and standard deviation approach, which ignores the high skews) require significant sample size.
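In code terms, this means a range lookup has to be keyed by lab. A minimal Python sketch using the eyeballed Faecalibacterium prausnitzii lower bounds listed above; the function name and the error handling are my own assumptions:

```python
# Eyeballed lower-bound percentiles for Faecalibacterium prausnitzii, per lab.
LOWER_BOUND_PERCENTILE = {
    "Biomesight": 18,
    "uBiome": 30,
    "Ombre": 12,
    "PrecisionBiome": 30,
    "Vitract": 12,
    "Thorne": 20,
}


def lower_bound_for(lab):
    """Return the lab's own lower-bound percentile; never borrow another lab's value."""
    if lab not in LOWER_BOUND_PERCENTILE:
        raise ValueError(f"no KM range computed for {lab!r}; needs that lab's own samples")
    return LOWER_BOUND_PERCENTILE[lab]


spread = max(LOWER_BOUND_PERCENTILE.values()) - min(LOWER_BOUND_PERCENTILE.values())
print(f"Thorne lower bound: {lower_bound_for('Thorne')}%ile")
print(f"spread across labs: {spread} percentile points")
```

Raising an error for an unknown lab, instead of falling back to a default, is deliberate: a borrowed threshold would look valid while being wrong.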