Odds Ratio for the Microbiome 101

By Kenneth Lassesen, B.Sc.(Statistics), M.Sc.(Operations Research)

Odds Ratio and Chi2 are two sides of the same coin. The worth of this coin is far more than the fourrées seen with studies using averages.

The simplest case is how often a specific bacteria is reported in the control versus the study group. This is easily computed and can be placed in a table such as the one below.

| | Control (without Symptom) | Study (or with Symptom) |
|---|---|---|
| Bacteria Seen | 300 | 90 |
| Bacteria Not Seen | 600 | 700 |

Just looking at the table, it is obvious that this bacteria is less likely to be seen in a study group. We can just drop these numbers in a page like this one, and get the results.

Converting to odds ratio is simple:

  • Compute the odds (control : study) when the bacteria is seen: 300/90 ≈ 3.333.
  • Compute the odds (control : study) when the bacteria is not seen: 600/700 ≈ 0.857.
  • Odds ratio: OR = 3.333/0.857 ≈ 3.89, meaning that seeing this bacteria makes it likely that you are not in the study group.
    • Or 1/3.89 ≈ 0.257 for seeing this bacteria placing you in the study group.
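
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch rather than the tooling used for this post: the function names `odds_ratio` and `chi2_2x2` are illustrative, and the chi-square statistic is the standard uncorrected Pearson form for a 2×2 table.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    Rows: bacteria seen / not seen. Columns: control / study.
    """
    return (a / b) / (c / d)


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts from the table above.
seen_control, seen_study = 300, 90
unseen_control, unseen_study = 600, 700

ratio = odds_ratio(seen_control, seen_study, unseen_control, unseen_study)
stat = chi2_2x2(seen_control, seen_study, unseen_control, unseen_study)
print(f"odds ratio = {ratio:.2f}, inverse = {1 / ratio:.3f}, chi2 = {stat:.1f}")
```

The odds ratio and its inverse match the 3.89 and 0.257 worked out by hand, and the chi-square statistic from the same four counts is far above 10.83, the critical value for P = 0.001 with one degree of freedom.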

Second Tier: The amount

This is identical to the above, except there is a little mathematics needed to compute the best range of bacteria for odds ratio.

| At 0.04% | Control (without Symptom) | Study (or with Symptom) |
|---|---|---|
| Above or Equal | 100 | 60 |
| Below | 200 | 30 |

Again, a simple computation with great statistical significance.

And again the Odds Ratio is calculated the same as above.

  • 100/60 ≈ 1.67
  • 200/30 ≈ 6.67
  • OR = 1.67 / 6.67 ≈ 0.25 (or 4.00 for the reverse).

We have a tri-state odds ratio

  • Bacteria not seen: 0.257 of having symptom (i.e. bacteria is rarely seen with symptom)
  • Bacteria seen and above or equal to 0.04%: 3.89 * 4 = 15.56
  • Bacteria seen but below 0.04%: 3.89 * 0.25 = 0.9725, almost no effect.

In this example, we used above or below 0.04%; we could have also used in the range (0.03 to 0.07) or not in the range.
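
One way to find such a cut-point is to try every observed amount as a candidate and keep the one whose 2×2 table gives the largest chi-square statistic. The sketch below assumes that approach; it is not necessarily the search used for this post. The name `best_threshold`, the `min_cell` guard, and the sample amounts are all made up for illustration.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def best_threshold(control, study, min_cell=5):
    """Try each observed amount as an 'above or equal' cut-point.

    control, study: amounts (percent) for samples where the bacteria was seen.
    Returns (cut, chi2, (a, b, c, d)) for the strongest split, or None.
    """
    best = None
    for cut in sorted(set(control) | set(study)):
        a = sum(v >= cut for v in control)   # control, above or equal
        b = sum(v >= cut for v in study)     # study, above or equal
        c = len(control) - a                 # control, below
        d = len(study) - b                   # study, below
        if min(a, b, c, d) < min_cell:       # skip tables with a tiny cell
            continue
        stat = chi2_2x2(a, b, c, d)
        if best is None or stat > best[1]:
            best = (cut, stat, (a, b, c, d))
    return best


# Made-up amounts (percent), chosen so that one split gives 100/60 and 200/30.
control = [0.01] * 150 + [0.03] * 50 + [0.05] * 100
study = [0.01] * 20 + [0.03] * 10 + [0.05] * 60

cut, stat, table = best_threshold(control, study)
print(cut, round(stat, 1), table)
```

Searching for a range instead of a single cut-point works the same way, with pairs of candidate amounts as the lower and upper bounds.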

Key points

  • Use only bacteria with P < 0.001 or better
    • Check Present or not Present
    • There is a finite enumeration of possible ranges when a bacteria is present.
      • With today’s powerful computers, this is not a challenge
  • Check all bacteria that satisfy the minimum size constraint for the function used for the 2×2 table
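
The two filters in the key points can be sketched as a single screening function. This is an illustration under stated assumptions, not the exact rule used for this post: the minimum size constraint is taken to be the common rule of thumb that every expected cell count is at least 5, the test is the uncorrected Pearson chi-square, and the names `chi2_p_value` and `passes_screen` are hypothetical.

```python
import math


def chi2_p_value(a, b, c, d):
    """P value of the Pearson chi-square test (1 degree of freedom) for a 2x2 table."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


def passes_screen(a, b, c, d, alpha=0.001, min_expected=5):
    """Keep a bacteria only if its 2x2 table is large enough and P < alpha."""
    n = a + b + c + d
    if n == 0:
        return False
    rows, cols = (a + b, c + d), (a + c, b + d)
    # Assumed size constraint: smallest expected count under independence.
    smallest = min(r * k / n for r in rows for k in cols)
    if smallest < min_expected:
        return False
    return chi2_p_value(a, b, c, d) < alpha


print(passes_screen(300, 90, 600, 700))  # seen / not seen table
print(passes_screen(100, 60, 200, 30))   # above / below 0.04% table
print(passes_screen(3, 1, 2, 4))         # too few samples
```

Both tables from this post pass the screen, while the last table is rejected on size alone before any P value is computed.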

For some symptoms we have:

  • Over 450 bacteria with significant odds ratios
  • A highest odds ratio of over 92 for some bacteria

Performance

This data is based on self-declared symptoms from users. Often the symptoms entered are incomplete (some users had over 100 symptoms entered). While not rigorous, this appears to work for getting sample annotations entered in a citizen science context and for demonstration of the concept. There was enough consistency of data to get results.

The best news: for each of the following symptoms, with over a dozen samples each, an Odds Ratio > 1.0 agreed with the entered symptom in 100% of the samples.

| Source | Symptom Name | Ratio (% correct) |
|---|---|---|
| BiomeSight | Official Diagnosis: Mood Disorders | 100 |
| Thryve | DePaul University Fatigue Questionnaire : Frequently get words or numbers in the wrong order | 100 |
| Thryve | Autism: More Repetitive Movements | 100 |
| Thryve | Autonomic Manifestations: cardiac arrhythmias | 100 |
| Thryve | Condition: Acne | 100 |
| Thryve | DePaul University Fatigue Questionnaire : Pain in Multiple Joints without Swelling or Redness | 100 |
| Thryve | DePaul University Fatigue Questionnaire : Feeling like you have a temperature | 100 |
| Thryve | Official Diagnosis: Diabetes Type 1 | 100 |
| Thryve | Neurological: Spatial instability and disorientation | 100 |
| Thryve | Condition: Type 1 Diabetes | 100 |
| Thryve | Neuroendocrine Manifestations: abnormal appetite | 100 |
| BiomeSight | Autonomic Manifestations: delayed postural hypotension | 100 |
| Thryve | Physical: Long term antibiotics (over 6 months) | 100 |
| Thryve | Comorbid: Electromagnetic Sensitivity (EMF) | 100 |
| BiomeSight | Physical: Bad Air Quality | 100 |
| BiomeSight | Neuroendocrine Manifestations: marked diurnal fluctuation | 100 |
| Thryve | Physical: Amalgam fillings | 100 |
| BiomeSight | Comorbid: Reactive Hypoglycemia | 100 |
| Thryve | Comorbid: Sugars cause sleep or cognitive issues | 100 |
| BiomeSight | Official Diagnosis: Dermatitis (all types) | 100 |
| Thryve | Physical: Steps Per Day 2000-4000 | 100 |
| Thryve | Neuroendocrine Manifestations: Painful menstrual periods | 100 |
| Thryve | General: Anhedonia (inability to feel pleasure) | 100 |
| BiomeSight | Virus: Parvovirus positive (B19) | 100 |
| BiomeSight | Blood Type: FUT2 secretor | 100 |
| Thryve | Official Diagnosis: High Blood Pressure (Hypertension) | 100 |
| Thryve | DePaul University Fatigue Questionnaire : Poor hand to eye coordination | 100 |
| Thryve | Infection: Coxsackie | 100 |
| Thryve | Neuroendocrine Manifestations: marked diurnal fluctuation | 100 |

Looking at the biggest sets, we see very good performance for some symptoms and poor performance for items like gender. Unrefreshing Sleep is interesting:

  • Unrefreshed sleep: 88.6% accurate
  • Unrefreshing Sleep, that is waking up feeling tired: 36.7% accurate

Is the cause the fineness of the definition (and a lack of clarity for the users entering it), or some other issue?

| Source | Symptom | % Correct | Size |
|---|---|---|---|
| BiomeSight | General: Fatigue | 98.70317 | 694 |
| BiomeSight | Neurocognitive: Brain Fog | 98.18182 | 660 |
| BiomeSight | Sleep: Unrefreshed sleep | 88.57616 | 604 |
| BiomeSight | Neurocognitive: Difficulty paying attention for a long period of time | 75.54113 | 462 |
| BiomeSight | Immune Manifestations: Bloating | 90.13761 | 436 |
| BiomeSight | DePaul University Fatigue Questionnaire : Fatigue | 85.96491 | 399 |
| BiomeSight | Gender: Male | 59.79644 | 393 |
| BiomeSight | Comorbid: Histamine or Mast Cell issues | 88.0102 | 392 |
| BiomeSight | Official Diagnosis: COVID19 (Long Hauler) | 97.87798 | 377 |
| BiomeSight | DePaul University Fatigue Questionnaire : Unrefreshing Sleep, that is waking up feeling tired | 36.66667 | 360 |
| BiomeSight | Neurocognitive: Can only focus on one thing at a time | 63.76404 | 356 |
| BiomeSight | Neuroendocrine Manifestations: worsening of symptoms with stress. | 70.26239 | 343 |
| BiomeSight | Neurological-Audio: Tinnitus (ringing in ear) | 60.71429 | 336 |
| BiomeSight | Neurocognitive: Problems remembering things | 47.00599 | 334 |
| BiomeSight | Age: 30-40 | 97.14286 | 315 |
| BiomeSight | DePaul University Fatigue Questionnaire : Post-exertional malaise, feeling worse after doing activities that require either physical or mental exertion | 92.33227 | 313 |
| BiomeSight | Neurocognitive: Absent-mindedness or forgetfulness | 62.7907 | 301 |
| BiomeSight | Sleep: Daytime drowsiness | 69.33333 | 300 |
| BiomeSight | Post-exertional malaise: General | 85.95318 | 299 |
| BiomeSight | Immune Manifestations: Constipation | 83.22148 | 298 |

Lab Performance

Identification by age shows that not all labs are equal. If odds ratios from the microbiome were not statistically significant for estimating age, we would see about 14% accuracy. Every age band exceeds that, most of them by a wide margin.

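| Lab | Symptom | Accuracy (%) | Size |
|---|---|---|---|
| BiomeSight | Age: 0-10 | 86.2 | 29 |
| Ombre | Age: 0-10 | 76.3 | 59 |
| BiomeSight | Age: 10-20 | 80 | 25 |
| Ombre | Age: 10-20 | 94.7 | 19 |
| BiomeSight | Age: 20-30 | 58.5 | 135 |
| Ombre | Age: 20-30 | 64.7 | 34 |
| BiomeSight | Age: 30-40 | 97.1 | 315 |
| Ombre | Age: 30-40 | 66.3 | 104 |
| BiomeSight | Age: 40-50 | 22.2 | 203 |
| Ombre | Age: 40-50 | 71.4 | 63 |
| BiomeSight | Age: 50-60 | 29.7 | 111 |
| Ombre | Age: 50-60 | 61.7 | 47 |
| BiomeSight | Age: 60-70 | 52.5 | 59 |
| Ombre | Age: 60-70 | 18.1 | 83 |
| BiomeSight | Age: 70-80 | 90 | 20 |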

This difference between labs is also seen with other symptoms, some of which have associations reported in the literature.

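| Source | Symptom Name | Ratio (% correct) | Size |
|---|---|---|---|
| BiomeSight | General: Depression | 67.7 | 195 |
| Ombre | General: Depression | 13.9 | 108 |
| BiomeSight | General: Fatigue | 98.7 | 694 |
| Ombre | General: Fatigue | 20.8 | 149 |
| BiomeSight | General: Headaches | 71.6 | 197 |
| Ombre | General: Headaches | 15.5 | 103 |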

Summary

The use of odds ratios provides statistically significant evidence for identifying probable symptoms. While not definitive—acknowledging that few diagnostic tests achieve complete certainty—the results demonstrate that both the selected testing method and its interpretation (for example, in relation to bacterial associations) materially influence diagnostic accuracy.

In clinical contexts, reliance on odds ratios offers greater methodological rigor than studies reporting merely “higher or lower levels of certain bacteria with P<0.05.” A notable clinical strength of this approach lies in its capacity to generate a structured list of potential symptoms for further inquiry, including those that patients may not have initially disclosed.

Nota Bene: It should be noted that the observed error rate is likely attributable, at least in part, to underreporting of symptoms. Patients often disclose only the symptoms they perceive as most severe, thereby introducing reporting bias into the dataset.

The table below shows the accuracy from 4 different labs. It is not a surprise that Shotgun data is more accurate than 16s tests.

| Source | Ratio (% correct) | Size |
|---|---|---|
| BiomeSight – 16s | 60.8 | 45069 |
| Thorne – Shotgun | 80.7 | 491 |
| Ombre/Thryve – 16s | 40.8 | 17123 |
| uBiome – 16s | 47.6 | 13071 |
