Odds Ratios and the Microbiome

In working with Microbiome Prescription, I experimented with various prediction approaches before settling on a workaround that, in many cases, could successfully predict the top 10 symptoms for new microbiome samples, with individuals confirming about 80% of them as accurate reflections of their own symptoms. Though this solution was adequate for practical needs, it was admittedly less than ideal in theory. Recently, I recognized that a more robust and principled prediction algorithm is achievable. The aim of this post is to walk through that process, making it accessible for anyone interested in trying this more rigorous approach.

Accurate prediction identifies, with statistical justification, the key bacteria that should be altered.

An odds ratio (OR) is a measure of association that describes the odds of a disease, symptom, or event occurring in one group compared to another, often used in medical and epidemiological studies to estimate the strength of risk factors or the effectiveness of interventions.

Understanding Odds Ratios

  • The odds ratio is calculated by dividing the odds of the event in the exposed group by the odds in the non-exposed group.
  • OR > 1 indicates higher odds of disease with the exposure or risk factor; OR < 1 indicates reduced odds; OR = 1 means no difference in odds between groups.
  • Odds ratios are especially used in case-control studies, but also in cohort and cross-sectional studies, and they can approximate risk ratios when the disease or symptom is rare.
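The calculation in the first bullet can be sketched in a few lines of Python. The function name and the counts are hypothetical and chosen only for illustration; they are not drawn from any real study.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds of the event in the exposed group divided by the odds in the non-exposed group."""
    odds_exposed = exposed_cases / exposed_controls
    odds_unexposed = unexposed_cases / unexposed_controls
    return odds_exposed / odds_unexposed


# Hypothetical counts: 30 of 100 exposed people and 10 of 100 non-exposed people have the symptom.
example_or = odds_ratio(30, 70, 10, 90)
print(f"OR = {example_or:.2f}")  # a value above 1 means higher odds with the exposure
```

Note that the function divides by the control counts, so a group with zero controls would need separate handling.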

Using Multiple Odds Ratios in Disease Analysis

When you have several odds ratios related to a disease, there are several key uses:

  • Compare the magnitude of different risk factors: By looking at the odds ratios for various exposures (e.g., smoking, age group, genetic markers), you can identify which exposures are most strongly associated with the disease.
  • Synthesize evidence: Meta-analysis allows combining odds ratios from multiple studies to produce a summary effect estimate, which helps determine overall strength of association and consistency across populations.
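As a rough illustration of the synthesis step, here is a minimal sketch of fixed-effect, inverse-variance pooling on the log-odds scale, one common meta-analysis technique. It assumes each study reports an odds ratio with a 95% confidence interval; the function name and the study values are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a 95% interval


def pooled_odds_ratio(studies):
    """Fixed-effect pooling. studies: list of (odds_ratio, ci_low, ci_high) tuples."""
    weighted_sum = 0.0
    total_weight = 0.0
    for or_value, ci_low, ci_high in studies:
        # Recover the standard error of log(OR) from the width of the interval.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(or_value)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical studies, for illustration only.
estimate, low, high = pooled_odds_ratio([(1.8, 1.1, 2.9), (2.4, 1.2, 4.8), (1.5, 0.9, 2.5)])
print(f"pooled OR = {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

A fixed-effect model assumes the studies estimate the same underlying effect; when populations differ noticeably, a random-effects model is usually the safer choice.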

Example Table of Interpreting Odds Ratios

Exposure/Risk Factor | Odds Ratio | Interpretation
Smoking | 3.5 | Exposure increases odds
Physical Activity | 0.7 | Exposure decreases odds
High BMI | 1.2 | Exposure slightly increases odds
Family History | 4.0 | Strong increased odds

These odds ratios can guide targeted interventions, identify priority risk factors, and inform clinical decision-making or public health policy.

Each odds ratio’s confidence interval should be checked to determine statistical significance: if the interval includes 1, the association is not statistically significant at that confidence level.
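One standard way to get such an interval is the log (Woolf) method, sketched below. The function name and the 2x2 counts are hypothetical; a, b, c, d stand for exposed cases, exposed controls, non-exposed cases, and non-exposed controls.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.959964):
    """Return (OR, lower, upper) using the log method; z=1.959964 gives a 95% interval."""
    or_value = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log)
    upper = math.exp(math.log(or_value) + z * se_log)
    return or_value, lower, upper


# Hypothetical counts, for illustration only.
or_value, lower, upper = odds_ratio_with_ci(12, 18, 10, 20)
includes_one = lower <= 1.0 <= upper
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}, includes 1: {includes_one}")
```

This approximation needs all four cells to be non-zero and becomes unreliable when any cell is very small.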

Summary

Odds ratios quantify the likelihood of disease or symptoms given exposures and allow comparison and synthesis of risk across different factors or populations. When handling multiple odds ratios, use them to identify, adjust for, and summarize the impact of risk factors on disease occurrence.

Applying to the Microbiome

We encounter some challenges here. Consider this constructed example:

  • Bacterium Foo has an OR of 1.5 when it exceeds 5% of the microbiome
  • Bacterium Bar has an OR of 2 when it exceeds 3% of the microbiome
  • Foo and Bar are associated with each other.

If a sample has both, the combined OR is not simply 1.5 x 2 = 3.0. Instead, we need to know how much the two influence each other, i.e. the R² between them. We can estimate this from the Microbiome Taxa R2 Site. Suppose that R² is 0.5, which indicates substantial overlap between the two.

The combined odds ratio is thus reduced from 3.0 to 2.66.
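The post does not spell out the formula behind the 2.66 figure. One adjustment that happens to reproduce it, offered here purely as an assumption and not as the author's confirmed method, is to keep the stronger odds ratio in full and scale the log of the weaker one by sqrt(1 - R²). The function name is hypothetical.

```python
import math


def combined_odds_ratio(or_a, or_b, r_squared):
    """ASSUMED adjustment: shrink the weaker log odds ratio by sqrt(1 - R^2).

    R^2 = 0 gives the plain product; R^2 = 1 gives only the stronger odds ratio.
    """
    # Order the two by strength of association (distance of log(OR) from 0).
    weaker, stronger = sorted((or_a, or_b), key=lambda value: abs(math.log(value)))
    weight = math.sqrt(1.0 - r_squared)
    return math.exp(math.log(stronger) + weight * math.log(weaker))


print(round(combined_odds_ratio(1.5, 2.0, 0.5), 2))  # 2.66 under this assumed formula
```

Because 1 - R² and R² are both 0.5 in this example, other formulas could give the same number, so treat this as one plausible reading only.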

Odds Ratios and Continuous Values

Odds ratios are commonly used for binary data, such as smoker versus non-smoker or high school graduation status. Continuous data can also be categorized; for example, instead of treating smoking as simply yes/no, you might bin by the number of cigarettes smoked per day or packs per week. Microbiome data can be categorized in the same way, though caution is needed to avoid over-interpreting sparse data. A rough guideline from many studies is that at least 30 cases and 30 controls are needed to calculate an odds ratio with basic reliability. For data sets near that minimum, it can be helpful to binarize at the median rather than the mean. This matters because bacterial abundances tend to be highly skewed: splitting at the mean often leaves about 70% of samples below it and 30% above, whereas the median splits the data evenly, with 50% below and 50% above.
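The difference between the two cutoffs is easy to see on a small skewed sample. The abundance values below are made up for illustration, and the helper name is hypothetical.

```python
from statistics import mean, median


def split_counts(values, cutoff):
    """Count how many values fall strictly below and strictly above the cutoff."""
    below = sum(1 for value in values if value < cutoff)
    above = sum(1 for value in values if value > cutoff)
    return below, above


# Hypothetical, highly skewed abundances (percent of the microbiome).
abundances = [0.01, 0.02, 0.02, 0.03, 0.05, 0.08, 0.10, 0.40, 1.50, 6.00]

print("split at mean:  ", split_counts(abundances, mean(abundances)))
print("split at median:", split_counts(abundances, median(abundances)))
```

With these values the mean is pulled up by the two largest readings, so most samples land below it, while the median gives two groups of equal size.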

Example: Brain Fog

Here are some odds ratios using BiomeSight data. Odds Low is the odds ratio when the reading is below the median, and Odds High is the odds ratio when it is above the median, where the median is taken over the samples with this symptom. We use the symptom median so that the two categories are approximately the same size.

A few quick takeaways:

  • Probiotic taxa appear in the table: Bifidobacterium, Ligilactobacillus, Lactococcus lactis, Lactiplantibacillus
    • Bifidobacterium catenulatum subsp. kashiwanohense (OR 1.37) is the preferred one!
    • Ligilactobacillus: Ligilactobacillus salivarius is the only one available at retail
    • Lactiplantibacillus: Lactiplantibacillus plantarum is the only one available at retail
    • Veillonella atypica is offered as FITBIOMICS V•Nella Lactic Acid Metabolizing Probiotic …
      • Note: Brain fog is often ascribed to too much lactic acid.

Tax_Name | Tax_Rank | Odds Low | Odds High
Cerasicoccus arenae | species | 1.59 | 0.71
Polyangia | subclass | 1.47 | 0.72
Lelliottia | genus | 1.42 | 0.75
Lelliottia amnigena | species | 1.42 | 0.75
Microcoleaceae | family | 0.82 | 1.41
Myxococcia | class | 1.38 | 0.71
Myxococcales | order | 1.38 | 0.71
Myxococcota | phylum | 1.38 | 0.71
Bifidobacterium catenulatum subsp. kashiwanohense | subspecies | 1.37 | 0.74
Denitratisoma | genus | 0.87 | 1.37
Microcoleus antarcticus | species | 0.81 | 1.36
Microcoleus | genus | 0.81 | 1.36
Desulfosporosinus | genus | 1.34 | 0.80
Trabulsiella | genus | 1.33 | 0.80
Rivulariaceae | family | 1.32 | 0.79
Segatella paludivivens | species | 1.32 | 0.79
Prosthecobacter | genus | 1.32 | 0.73
Ligilactobacillus | genus | 1.31 | 0.77
Enterobacter cloacae complex | species group | 1.30 | 0.80
Peptostreptococcus stomatis | species | 1.30 | 0.80
Alcanivorax | genus | 0.93 | 1.30
Alcanivoracaceae | family | 0.93 | 1.30
Tepidanaerobacter syntrophicus | species | 1.30 | 0.79
Tepidanaerobacter | genus | 1.30 | 0.79
Tepidanaerobacteraceae | family | 1.30 | 0.79
Hoylesella loescheii | species | 1.29 | 0.81
Thermosediminibacterales | order | 1.28 | 0.81
Enterobacter hormaechei | species | 1.28 | 0.82
Slackia isoflavoniconvertens | species | 0.84 | 1.27
Bifidobacterium choerinum | species | 1.27 | 0.82
Desulfovibrio simplex | species | 1.27 | 0.80
Chromatium | genus | 0.90 | 1.27
Lactococcus fujiensis | species | 1.27 | 0.67
Chromatium weissei | species | 0.90 | 1.27
Klebsiella | genus | 1.27 | 0.82
Klebsiella/Raoultella group | no rank | 1.27 | 0.82
Veillonella atypica | species | 1.26 | 0.82
Isoalcanivorax | genus | 0.94 | 1.26
Isoalcanivorax indicus | species | 0.94 | 1.26
Schaalia turicensis | species | 1.25 | 0.72
Lactococcus lactis | species | 1.25 | 0.83
Bifidobacteriaceae | family | 1.24 | 0.84
Bifidobacteriales | order | 1.24 | 0.84
Chloroflexota | phylum | 1.24 | 0.79
Salidesulfovibrio brasiliensis | species | 0.92 | 1.24
Salidesulfovibrio | genus | 0.92 | 1.24
Enterobacter | genus | 1.24 | 0.81
Bifidobacterium | genus | 1.24 | 0.84
Actinomycetota | phylum | 1.24 | 0.84
Acholeplasma hippikon | species | 0.85 | 1.23
Mycoplasmataceae | family | 1.23 | 0.82
Mycoplasmatales | order | 1.23 | 0.82
Bifidobacterium angulatum | species | 1.23 | 0.82
Clostridium nitrophenolicum | species | 0.85 | 1.23
Bacteroides uniformis | species | 0.85 | 1.22
Lactococcus | genus | 1.22 | 0.83
Lactiplantibacillus | genus | 1.22 | 0.84
Mycoplasma | genus | 1.22 | 0.82
Filifactor villosus | species | 0.88 | 1.22
Anaerolineae | class | 1.21 | 0.85
Veillonella denticariosi | species | 0.89 | 1.21
Actinomycetes | class | 1.21 | 0.85
Acidimicrobium | genus | 1.21 | 0.79
Cerasicoccaceae | family | 1.21 | 0.79
Cerasicoccus | genus | 1.21 | 0.79
Mycoplasmoidales | order | 1.21 | 0.81
Parabacteroides gordonii | species | 1.21 | 0.84
Thioalkalivibrio jannaschii | species | 1.21 | 0.63
Candidatus Blochmanniella camponoti | species | 1.21 | 0.79
Thioalkalivibrio | genus | 1.21 | 0.63
Acidimicrobiaceae | family | 1.21 | 0.77
Bifidobacterium adolescentis | species | 1.21 | 0.85
Bifidobacterium longum | species | 1.21 | 0.85

That’s it for the moment.

Also, see the links below for tables produced by request.

The next step is to see how these odds ratios perform against samples and against the old algorithm. Stay tuned.

Caveat Emptor

The table above applies only to Biomesight data. For an explanation of why, see The taxonomy nightmare before Christmas… If you use a different lab, you will need to get that lab to crunch their numbers in the same manner as detailed above.
