Significant Bacteria and Their Thresholds – Part 2

Conventional medical science tends to think of one bacterium to one condition. These are known as Single-Bacterium Diseases.

Single-Bacterium Diseases

  • Tuberculosis — caused by Mycobacterium tuberculosis
  • Diphtheria — caused by Corynebacterium diphtheriae
  • Cholera — caused by Vibrio cholerae
  • Leprosy (Hansen’s disease) — caused by Mycobacterium leprae
  • Whooping cough (Pertussis) — caused by Bordetella pertussis
  • Tetanus — caused by Clostridium tetani
  • Typhoid fever — caused by Salmonella typhi
  • Syphilis — caused by Treponema pallidum
  • Gonorrhea — caused by Neisseria gonorrhoeae
  • Lyme disease — caused by Borrelia burgdorferi
  • Gastric ulcer — caused by Helicobacter pylori
  • Strep throat — caused by Streptococcus pyogenes
  • Urinary tract infection — most commonly caused by Escherichia coli
  • Pneumonia — can be caused by Streptococcus pneumoniae
  • Meningitis — can be caused by Neisseria meningitidis (meningococcus), or Streptococcus pneumoniae
  • Bacterial vaginosis — often caused by Gardnerella vaginalis

There are other conditions that could be caused by any one of several bacteria, but not by bacteria cooperating with each other.

When we enter the world of microbiome dysbiosis, this simplicity disappears.

Case Study of the number of bacteria associated with many symptoms

We return to our collection of 4,290 unique samples, in which 327 symptoms have statistically significant bacteria, as discussed in Significant Bacteria and Their Thresholds – Part 1. Restricting our data to associations with P < 0.01, the graph below shows the number of bacteria associated with each symptom.

My view is that symptoms arise from the metabolites produced by a specific combination of bacteria. Examining the data from KEGG: Kyoto Encyclopedia of Genes and Genomes, we see that some metabolites can be produced by hundreds of different bacteria. Some bacteria associated with a symptom may actually reflect secondary effects (shifts caused by other species), so distinguishing causal bacteria from merely correlated ones remains difficult.

A practical working hypothesis for reducing or eliminating a symptom is therefore to normalize the bacteria associated with it. A rational approach is to start with those that have the strongest association.


The Naïve Approach

A well-educated medical professional typically follows this reasoning:

  1. Identify which bacteria are outside the normal range and linked to the patient’s symptoms.
  2. Determine whether each bacterium is elevated or reduced.
  3. Review substances known to influence these bacteria.
  4. Recommend lifestyle or dietary changes based on those substances.

In practice, certain substances may be contraindicated for other bacteria that are also out of range. This is often overlooked, as many professionals adopt a “find the first substances that address the bacterium shift” approach. This sometimes makes the patient worse.

Microbiome Prescription uses a manually curated database containing over 7.4 million relationships between substances and specific bacteria. Because of this depth, these potential conflicts are often identified and the risk of adverse effects is reduced.
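The kind of conflict described above can be illustrated with a small sketch. Every substance name, bacterium name, and effect direction below is an invented placeholder, not an entry from the Microbiome Prescription database, and the scoring rule (count helpful effects and harmful effects separately, then sort by the difference) is only an assumption for illustration, not the site's actual algorithm.

```python
# Hypothetical illustration: every name and effect below is invented.
# desired[b] is +1 if bacterium b should increase, -1 if it should decrease.
desired = {"bacterium_A": -1, "bacterium_B": +1, "bacterium_C": -1}

# effects[s][b] is +1 if substance s tends to increase b, -1 if it decreases b.
effects = {
    "substance_X": {"bacterium_A": -1, "bacterium_B": -1, "bacterium_C": +1},
    "substance_Y": {"bacterium_A": -1, "bacterium_B": +1},
}


def score(substance):
    """Return (helps, harms) counts for one substance across the shifted bacteria."""
    helps = harms = 0
    for bacterium, direction in effects[substance].items():
        if bacterium not in desired:
            continue  # this bacterium is in range, so the effect is ignored here
        if direction == desired[bacterium]:
            helps += 1
        else:
            harms += 1
    return helps, harms


def rank_substances():
    """Sort substances by net benefit (helps minus harms), best first."""
    return sorted(effects, key=lambda s: score(s)[0] - score(s)[1], reverse=True)


for name in rank_substances():
    print(name, score(name))
```

A "first match" approach would accept substance_X because it moves bacterium_A in the desired direction, while the tally shows it pushes two other out-of-range bacteria the wrong way.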

A professional in this situation would reasonably expect to see a chart such as the one below for each bacterium associated with a symptom: a chart, table, or other item giving a desired range of values.

With dozens of bacteria out of range, there is no clear, objective way to rank these bacteria by importance. There are a variety of speculative punts that could be tried:

  • Rank them by the volume of bacteria
  • Rank them by the deviation from the reference range, i.e. (value – mean) / standard deviation
  • Rank them by any of many possible algorithms that could be tossed at this issue.
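As a minimal sketch of the second option in the list above, the snippet below ranks bacteria by the absolute value of (value – mean) / standard deviation. The bacterium names, reference values, and patient values are all made up for illustration, and using the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_score(value, reference_values):
    """Deviation of one measurement from a reference population, in standard deviations."""
    return (value - mean(reference_values)) / stdev(reference_values)


# Hypothetical reference populations and patient values (percent abundance).
reference = {
    "bacterium_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bacterium_B": [10.0, 12.0, 14.0, 16.0, 18.0],
}
patient = {"bacterium_A": 9.0, "bacterium_B": 15.0}

# Largest absolute deviation first.
ranked = sorted(
    patient,
    key=lambda b: abs(z_score(patient[b], reference[b])),
    reverse=True,
)
print(ranked)
```

This ranking relies on the mean and standard deviation being meaningful for the data, which is one reason it is listed here as a speculative option.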

Turning the issue upside down

Let us take the concrete example promised in the earlier post: Long COVID. We have 538 samples with Long COVID in our population of 4,290 contributed samples. This is about 12% of the samples.

Filtering the associations to those bacteria with P < 0.0001 (our highest priority or weight), we obtain the table below. While there are many bacteria, some are tightly related according to lineage:

Mycoplasmatota > Mollicutes > Anaeroplasmatales > Anaeroplasmataceae > Anaeroplasma

Taxon | tax_name | tax_rank
1737405 | Tissierellales | order
626933 | Odoribacter laneus | species
186332 | Anaeroplasmatales | order
186333 | Anaeroplasmataceae | family
85630 | 16SrX (Apple proliferation group) | species group
102260 | 16SrXV (Hibiscus witches’-broom group) | species group
47565 | Candidatus Phytoplasma prunorum | species
2086 | Anaeroplasma | genus

Filtering the associations to those bacteria with P < 0.001, we get a second table, shown below. One of the bacteria is Lactobacillus jensenii, which is available as a probiotic (I have some in my fridge), but we do not know whether we want to increase or decrease this bacterium.

Taxon | tax_name | tax_rank
2330 | Halanaerobium | genus
1381 | Atopobium minutum | species
972 | Halanaerobiaceae | family
724 | Haemophilus | genus
194 | Campylobacter | genus
28128 | Prevotella corporis | species
72294 | Campylobacteraceae | family
33037 | Anaerococcus vaginalis | species
38288 | Corynebacterium genitalium | species
42857 | Moorella group | no rank
45254 | Dysgonomonas capnocytophagoides | species
45404 | Beijerinckiaceae | family
47420 | Hydrogenophaga | genus
102261 | Candidatus Phytoplasma brasiliense | species
109790 | Lactobacillus jensenii | species
89061 | Weissella thailandensis | species
215579 | Schlegelella | genus
382673 | Syntrophomonas cellicola | species
386414 | Hoylesella timonensis | species
1963360 | Parachlamydiales | order
1853231 | Odoribacteraceae | family

We could continue onwards and look at the 40 bacteria associated with P < 0.01 and the 48 bacteria with P < 0.05. While potentially important, because of the lesser degree of association, we will ignore them here. My preference is always to favor the highest probability, and thus I would only look at those in the above two tables.

Question: Is Lactobacillus jensenii too high or too low?

Some medical practitioners would hear the word “Lactobacillus” and immediately say “Take it” because they have a (questionable) belief that Lactobacillus will help everything! Is this the case here?

The Data for Lactobacillus jensenii

The table below shows the data. The percentages have been transformed to percentiles for better presentation, with a count of the occurrences at each. One of the first things some people will note is that this bacterium is not reported often, but there is enough data to get P < 0.001 using a chi-squared test. I disagree with the approach, seen in some papers, of examining only very commonly reported bacteria.

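%-ile Range | Has Total | HasNot Total
Not Present | 344 | 3852
0.00 | 1 | 3
4.21 | 0 | 15
20.00 | 2 | 22
45.26 | 2 | 13
61.05 | 1 | 3
65.26 | 0 | 3
68.42 | 0 | 3
71.58 | 0 | 5
76.84 | 0 | 2
78.95 | 1 | 1
81.05 | 0 | 1
82.11 | 1 | 0
83.16 | 1 | 0
84.21 | 0 | 1
85.26 | 0 | 2
87.37 | 0 | 1
88.42 | 0 | 1
89.47 | 1 | 0
90.53 | 0 | 1
91.58 | 1 | 0
92.63 | 1 | 0
93.68 | 0 | 1
94.74 | 0 | 1
95.79 | 0 | 1
96.84 | 0 | 1
97.89 | 0 | 1
98.95 | 0 | 1
100.00 | 0 | 1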

Doing a little more aggregation, we get the table below. If a person has no L. jensenii, they have an 8% chance of having Long COVID. If any is present, the chance increases to 13%, and a higher amount (over the 20th percentile) pushes it up to 17%, roughly double the rate seen when it is absent.

Conclusion: L. jensenii probiotics should definitely be avoided for Long COVID.

%-ile Range | Has Total | HasNot Total | Ratio
Not Present | 344 | 3852 | 0.08
0.00 | 1 | 3 | 0.25
4.21 | 0 | 15 | 0
20.00 | 2 | 22 | 0.08
Over 20 | 9 | 44 | 0.17
Over All | 12 | 84 | 0.13
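The Ratio column appears to be Has / (Has + HasNot); that reading is an inference from the numbers (for example, 1 / (1 + 3) = 0.25), not something stated in the table. Under that assumption, a short sketch that recomputes the column from the counts looks like this:

```python
# (has Long COVID, does not have Long COVID) counts from the aggregated table.
counts = {
    "Not Present": (344, 3852),
    "0.00": (1, 3),
    "4.21": (0, 15),
    "20.00": (2, 22),
    "Over 20": (9, 44),
    "Over All": (12, 84),
}


def proportion(has, has_not):
    """Share of the samples in one row that report the symptom."""
    return has / (has + has_not)


ratios = {label: proportion(*pair) for label, pair in counts.items()}

for label, ratio in ratios.items():
    print(f"{label:12s} {ratio:.3f}")
```

The "Over All" row here covers only the samples where the bacterium was present, which is why it can be compared directly against the "Not Present" row.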

Danger Will Robinson: Do not oversimplify

Looking at another bacterium with P < 0.0001, we see the pattern charted below. This bacterium is commonly reported.

What is evident is that the association is range sensitive, and thus so are the reference ranges:

  • Below 17%ile is out of range
  • Between 28%ile and 34%ile is out of range
  • Over 60%ile is out of range
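The three bands listed above can be written as a small check. This is a sketch only: the function and constant names are made up, and treating the 28 and 34 boundaries as inclusive is an assumption, since the list does not say.

```python
# Percentile cut-offs taken from the list above.
LOW_CUTOFF = 17.0           # below this percentile is out of range
MID_BAND = (28.0, 34.0)     # between these percentiles is out of range
HIGH_CUTOFF = 60.0          # above this percentile is out of range


def is_out_of_range(percentile):
    """True if the percentile falls in any of the three out-of-range bands."""
    in_low_band = percentile < LOW_CUTOFF
    in_mid_band = MID_BAND[0] <= percentile <= MID_BAND[1]  # inclusive by assumption
    in_high_band = percentile > HIGH_CUTOFF
    return in_low_band or in_mid_band or in_high_band


for p in (10, 20, 30, 50, 75):
    print(p, is_out_of_range(p))
```

A single "low to high" reference range cannot express this, because the in-range values form two separate islands (17 to 28 and 34 to 60).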

Many microbiologists would say that this does not make sense. At this point I should remind people of quorum sensing with bacteria.

Quorum sensing is a communication process that allows bacteria to sense and respond to their population density using chemical signaling molecules called autoinducers. Each bacterium produces and releases autoinducers into its environment. As the population grows, the concentration of these molecules increases. When a threshold level is reached, autoinducers bind to specific receptors, triggering changes in gene expression that coordinate group behaviors such as biofilm formation, virulence factor secretion, sporulation, and bioluminescence.

At this point, many minds may be going into “statistical culture shock”. For me, it makes complete sense and is often seen across nature. Such patterns are sometimes termed “islands of stability” in some sciences. In our case “islands of symptoms” would be a more accurate name.

We may end up committing blasphemy against conventional linear, mechanistic thinking!

Examples:

Alloys With Maximum Strength at Specific Composition

  • Iron-Carbon Steel: In carbon steel, maximum strength is usually achieved at a carbon content near 0.8% (known as “eutectoid steel”), where the steel forms a very fine pearlite structure on cooling. Both lower and higher carbon percentages reduce ductility or create brittleness, decreasing usable strength for many applications.
  • Aluminum-Copper Alloy (Al-Cu): The precipitation strengthening of aluminum alloys, such as in the Al-Cu system, reaches maximum effectiveness at about 4–5% copper by weight. Below or above this optimal range, strengthening diminishes because either not enough or too much second phase is created.
  • Nickel-Iron Alloys: For instance, “Permalloy” (a nickel-iron alloy) typically reaches its desired magnetic properties with about 80% nickel and 20% iron. Changes in this ratio result in reduced magnetic permeability, which can be considered a parallel with mechanical strength in many alloys.

Wildlife Systems with Optimal Mixtures

  • Animal Diversity in Food Webs: Studies show that increasing the number of animal species generally increases total animal biomass and plant consumption rates, up to a point. Beyond this, higher diversity leads to increased intraguild predation (animals eating each other), which can reduce overall community efficiency and stability.
  • Keystone Species: Some ecosystems depend on a particular balance among keystone species and others. Removing or adding too many can severely weaken the system, in analogy to alloys with optimal composition.
  • Biodiversity-Function Relationship: The stability and strength of ecosystems typically follow a nonlinear relationship with species richness: there is a “sweet spot” where ecosystem functions (like carbon sequestration or primary production) are maximized.

Nuclear Islands of Stability

  • The best-known island is around atomic numbers (Z) 114 to 126 and neutron number (N) 184, where theoretical calculations suggest nuclei could have half-lives of minutes, days, or potentially even years, instead of the microseconds typical for super heavy elements nearby.

Other Physical Sciences

  • The term may be used metaphorically to describe stable orbital configurations or regions where system dynamics are less chaotic.

Returning to Long COVID

With Tissierellales we have a complex behavior. With Lactobacillus jensenii we have a simple “if present, reduce it” finding. I returned to the lists above and attempted to identify those with a simple finding with P < 0.05 for the odds ratios. These are marked in the Correction column of the table below.

  • Reduce Beijerinckiaceae (family), with the odds being more than 2x for those with Long COVID, i.e. 0.059 vs 0.026
  • Increase Parachlamydiales (order), with the odds being about 1.5x for not having Long COVID, i.e. 0.067 vs 0.1

Others have odds that are close to each other, for example Tissierellales (0.978 vs 0.973). In the next post, we will examine more of those with complex behavior.

Correction | Tax_name | Tax_rank | Symptom Odds | No Symptom Odds
- | Anaeroplasma | genus | 0.146 | 0.135
- | Candidatus Phytoplasma prunorum | species | 0.722 | 0.738
- | 16SrX (Apple proliferation group) | species group | 0.722 | 0.735
- | Anaeroplasmatales | order | 0.146 | 0.135
- | Anaeroplasmataceae | family | 0.146 | 0.135
- | Odoribacter laneus | species | 0.494 | 0.506
- | Tissierellales | order | 0.978 | 0.973
- | 16SrXV (Hibiscus witches’-broom group) | species group | 0.02 | 0.013
- | Candidatus Phytoplasma brasiliense | species | 0.02 | 0.014
- | Lactobacillus jensenii | species | 0.034 | 0.021
- | Odoribacteraceae | family | 0.961 | 0.941
Increase | Parachlamydiales | order | 0.067 | 0.1
- | Schlegelella | genus | 0.062 | 0.042
- | Syntrophomonas cellicola | species | 0.039 | 0.019
Reduce | Hoylesella timonensis | species | 0.348 | 0.295
- | Weissella thailandensis | species | 0.028 | 0.016
- | Campylobacteraceae | family | 0.419 | 0.386
- | Campylobacter | genus | 0.388 | 0.354
- | Haemophilus | genus | 0.775 | 0.745
- | Halanaerobiaceae | family | 0.222 | 0.184
- | Atopobium minutum | species | 0.042 | 0.027
- | Halanaerobium | genus | 0.222 | 0.184
- | Prevotella corporis | species | 0.497 | 0.457
- | Anaerococcus vaginalis | species | 0.213 | 0.215
- | Corynebacterium genitalium | species | 0.031 | 0.019
Reduce | Moorella group | no rank | 0.534 | 0.447
- | Dysgonomonas capnocytophagoides | species | 0.042 | 0.053
Reduce | Beijerinckiaceae | family | 0.059 | 0.026
Reduce | Hydrogenophaga | genus | 0.034 | 0.014
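The two "Odds" columns appear to be the proportion of samples in each group where the taxon was detected; this is an inference (the Lactobacillus jensenii value of 0.034 matches 12 / (12 + 344) from the earlier aggregated table), not something the table states. Under that assumption, the sketch below computes both the simple ratio of the two proportions and the conventional odds ratio for three rows. The function names are made up, and neither number is a substitute for the P < 0.05 test used to fill the Correction column.

```python
def proportion_ratio(p_symptom, p_no_symptom):
    """Ratio of detection rates: symptom group over no-symptom group."""
    return p_symptom / p_no_symptom


def odds_ratio(p_symptom, p_no_symptom):
    """Conventional odds ratio, converting each proportion p to odds p / (1 - p)."""
    odds_symptom = p_symptom / (1 - p_symptom)
    odds_no_symptom = p_no_symptom / (1 - p_no_symptom)
    return odds_symptom / odds_no_symptom


# (Symptom Odds, No Symptom Odds) copied from three rows of the table.
rows = {
    "Beijerinckiaceae": (0.059, 0.026),
    "Parachlamydiales": (0.067, 0.1),
    "Tissierellales": (0.978, 0.973),
}

for name, (p_s, p_n) in rows.items():
    print(f"{name:18s} ratio={proportion_ratio(p_s, p_n):.2f} "
          f"odds_ratio={odds_ratio(p_s, p_n):.2f}")
```

A ratio above 1 points toward "Reduce" and a ratio below 1 toward "Increase", while values close to 1, as with Tissierellales, give no simple direction.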

Summary

This post explains my approach for ranking which bacteria should be targeted for change, primarily using the P value to guide priority. I compared two types of bacteria: one that is rare and one that is common. Rare bacteria are usually omitted from standard analyses because their scarcity makes conventional measures—such as calculating the mean and standard deviation—poor indicators of significance. Overlooking these bacteria is a methodological error, especially when the data skew exceeds 2, making such traditional metrics inappropriate.

It’s important to keep in mind that bacteria engage in quorum sensing, which influences the metabolites they produce and likely affects the symptoms observed. For some bacteria, the difference in their relative abundance between people with and without a particular symptom can be substantial and may be all that is needed for significance.
