I am off from my usual day job until the new year. In the new year I know that I will be very busy because my firm just landed a contract for a major software product that I am the principal for. I decided to give myself a challenge to explore on these down days:
How accurately can you predict symptoms from the presence or absence of bacteria ALONE, as reported by different 16s tests?
For those familiar with various forms of Artificial Intelligence, that approach is often used. It reduces the problem to a collection of true/false values. For most microbiologists, it is a road not even thought about, let alone travelled.
To make the challenge harder, I required every bacteria-symptom association to meet a stringent P-value of 0.001 (Chi² > 10.83), which exceeds the thresholds typical of microbiome studies.
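To make the threshold concrete, here is a minimal Python sketch of how a Chi² value and its P-value can be computed for one bacterium and one symptom. It assumes a plain Pearson chi-square on a 2x2 presence/absence table with one degree of freedom and no continuity correction; the exact test variant used for the tables in this post is not stated, and the function names and counts are hypothetical.

```python
import math

def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    a: symptom present, bacterium present
    b: symptom present, bacterium absent
    c: symptom absent,  bacterium present
    d: symptom absent,  bacterium absent
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty row or column: no association can be tested
    return n * (a * d - b * c) ** 2 / denom

def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical counts, not taken from any of the datasets in this post.
chi2 = chi_square_2x2(60, 40, 150, 250)
keep = chi2 > 10.83  # same cut-off as P < 0.001 with 1 degree of freedom
print(f"chi2={chi2:.2f}  p={p_value_1df(chi2):.6f}  keep={keep}")
```

Only associations that pass this filter would go on to the odds ratio step.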
I have four contributed and annotated datasets:
- 16s
  - uBiome: 791 samples
  - Ombre (formerly Thryve): 1,319 samples
  - Biomesight: 4,436 samples
- Shotgun
  - Thorne: 253 samples
Using these datasets, I explored the strength of the relationships using odds ratios. A subsequent post uses odds ratios based on a threshold amount of each bacterium, which yields much higher values. A high cumulative value indicates a very strong microbiome basis for the symptom, and thus for remediation.
For details, see the methodology in: New Standards for Microbiome Analysis? Taking the amount of each bacterium into consideration is shown in: Ability to Predict Symptoms with 99.9% probability using Bacteria Incidence and Amount
Relationships were quantified using Odds Ratios (OR) at consistent taxonomy levels to avoid dependence, with cumulative log(OR) indicating symptom-microbiome strength (higher values suggest robust basis for remediation).
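As a rough illustration of the cumulative log(OR) idea, the Python sketch below sums the natural log of the best odds ratio for each taxon at a single taxonomy rank. It rests on several assumptions that are mine, not confirmed details of the analysis: the odds ratio for absence is taken as the reciprocal of the odds ratio for presence, the larger of the two is the one counted, and a 0.5 correction is applied when a cell is zero. All names and counts are hypothetical.

```python
import math

def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio of the symptom given bacterium presence.

    a: symptom present, bacterium present
    b: symptom present, bacterium absent
    c: symptom absent,  bacterium present
    d: symptom absent,  bacterium absent
    Adds 0.5 to every cell when any cell is zero (assumed correction).
    """
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)

def cumulative_log_or(tables) -> float:
    """Sum of log(best OR) over taxa that all sit at one taxonomy rank."""
    total = 0.0
    for a, b, c, d in tables:
        or_present = odds_ratio(a, b, c, d)
        # When OR < 1, absence is the informative state and its OR is 1/OR.
        best = max(or_present, 1.0 / or_present)
        total += math.log(best)
    return total

# Two hypothetical species-level tables (not real data).
species_tables = [(60, 40, 150, 250), (20, 80, 160, 240)]
print(round(cumulative_log_or(species_tables), 2))
```

Keeping every table at the same rank means a genus and its own species are never both counted, which is one way to read the "avoid dependence" requirement.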
The Tax Rank indicates what is likely the most effective level to use for investigation (i.e. highest discrimination ability).
Thorne
| Symptom Name | Tax_rank | Cumulative | Cnt |
| --- | --- | --- | --- |
| General: Fatigue | species | 25.26 | 37 |
Ombre / Thryve
| Symptom Name | Tax_rank | Cumulative | Cnt |
| --- | --- | --- | --- |
| Autonomic Manifestations: Orthostatic intolerance | genus | 21.39 | 23 |
| General: Fatigue | species | 7.72 | 32 |
| General: Headaches | genus | 27.26 | 37 |
| General: Myalgia (pain) | species | 8.49 | 31 |
| Neurological: Confusion | species | 1.31 | 2 |
| Neurological: Difficulty processing information (Understanding) | species | 9.05 | 19 |
| Neurological: Disorientation | species | 1.40 | 3 |
| Neurological: emotional overload | species | 4.74 | 11 |
| Neurological: Impairment of concentration | genus | 22.71 | 32 |
| Neurological: Word-finding problems | genus | 15.24 | 15 |
| Neurological-Audio: hypersensitivity to noise | genus | 29.58 | 43 |
| Neurological-Sleep: Chaotic diurnal sleep rhythms (Erratic Sleep) | genus | 35.01 | 41 |
| Neurological-Vision: inability to focus eye/vision | genus | 42.34 | 53 |
| Neurological-Vision: photophobia (Light Sensitivity) | genus | 47.87 | 64 |
| Post-exertional malaise: Inappropriate loss of physical and mental stamina, | species | 20.20 | 45 |
| Sleep: Unrefreshed sleep | species | 21.19 | 49 |
uBiome
Although uBiome no longer exists, its numbers may still be of interest.
| Symptom Name | Tax_rank | Cumulative | Cnt |
| --- | --- | --- | --- |
| General: Fatigue | species | 5.69 | 15 |
| General: Headaches | species | 3.12 | 7 |
| General: Myalgia (pain) | species | 1.74 | 6 |
| Neurological: Confusion | species | 2.82 | 3 |
| Neurological: Difficulty processing information (Understanding) | species | 1.57 | 6 |
| Neurological: emotional overload | species | 6.08 | 17 |
| Neurological: fasciculations | strain | 1.25 | 3 |
| Neurological: Impairment of concentration | species | 14.29 | 18 |
| Neurological: Short-term memory issues | species | 0.94 | 7 |
| Neurological: Spatial instability and disorientation | species | 1.82 | 1 |
| Neurological: Word-finding problems | species | 6.38 | 11 |
| Neurological-Audio: hypersensitivity to noise | species | 7.71 | 11 |
| Neurological-Sleep: Chaotic diurnal sleep rhythms (Erratic Sleep) | species | 7.72 | 6 |
| Neurological-Vision: inability to focus eye/vision | species | 11.63 | 14 |
| Neurological-Vision: photophobia (Light Sensitivity) | species | 10.57 | 13 |
| Sleep: Unrefreshed sleep | species | 5.47 | 11 |
BiomeSight
| Symptom Name | Tax_rank | Cumulative | Cnt |
| --- | --- | --- | --- |
| Autonomic Manifestations: irritable bowel syndrome | species | 3.87 | 10 |
| Autonomic Manifestations: light-headedness | species | 8.07 | 15 |
| Autonomic Manifestations: nausea | species | 2.55 | 11 |
| Autonomic Manifestations: Neurally mediated hypotension (NMH) | species | 1.46 | 1 |
| Autonomic Manifestations: Postural orthostatic tachycardia syndrome (POTS) | species | 1.87 | 11 |
| General: Fatigue | species | 7.65 | 20 |
| General: Headaches | species | 3.94 | 15 |
| General: Myalgia (pain) | species | 1.63 | 10 |
| Neurological: Confusion | species | 4.93 | 6 |
| Neurological: Difficulty processing information (Understanding) | species | 0.63 | 9 |
| Neurological: emotional overload | species | 0.81 | 10 |
| Neurological: fasciculations | genus | 3.61 | 9 |
| Neurological: Impairment of concentration | species | 2.10 | 5 |
| Neurological: Short-term memory issues | species | 2.41 | 5 |
| Neurological: Spatial instability and disorientation | species | 2.91 | 4 |
| Neurological: Word-finding problems | species | 2.68 | 21 |
| Neurological-Audio: hypersensitivity to noise | genus | 2.53 | 6 |
| Neurological-Vision: inability to focus eye/vision | species | 2.10 | 4 |
| Neurological-Vision: photophobia (Light Sensitivity) | species | 9.46 | 28 |
| Post-exertional malaise: Inappropriate loss of physical and mental stamina, | species | 2.44 | 12 |
| Sleep: Unrefreshed sleep | species | 1.75 | 17 |
Summary
Nota bene: the values above are cumulative sums of the log odds ratios. It is assumed that, for each bacterium, the highest odds ratio is the one used (hit). A value of 0.81 means that exp(0.81) = 2.25 is the highest odds ratio possible, reached only if the sample hits the highest odds ratio of every child. A value of 8.07 corresponds to an odds ratio of about 3,200.
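The conversion from a cumulative value back to an odds ratio is a single exponentiation, sketched below in Python. The two cumulative values are taken from the BiomeSight table above; the function name is hypothetical. Small differences from figures quoted in the text can arise when the cumulative value has been rounded to two decimals before exponentiating.

```python
import math

def max_odds_ratio(cumulative_log_or: float) -> float:
    """Upper bound on the combined odds ratio implied by a cumulative log(OR)."""
    return math.exp(cumulative_log_or)

# Cumulative values as published in the BiomeSight table.
examples = [
    ("Neurological: emotional overload", 0.81),
    ("Autonomic Manifestations: light-headedness", 8.07),
]
for name, value in examples:
    print(f"{name}: cumulative {value} -> odds ratio {max_odds_ratio(value):,.2f}")
```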
Ombre, uBiome and BiomeSight are all 16s tests, which are typically viewed as accurate to the species level at best. The difference in test processing shows up strongly in the table below. For background on the lack of standardization in microbiome testing, see my post from 6 years ago: The taxonomy nightmare before Christmas…
| Symptom Name | Ombre | uBiome | BiomeSight |
| --- | --- | --- | --- |
| General: Fatigue | 7.72 | 5.69 | 7.65 |
| General: Headaches | 27.26 | 3.12 | 3.94 |
| General: Myalgia (pain) | 8.49 | 1.74 | 1.63 |
| Neurological: Confusion | 1.31 | 2.82 | 4.93 |
| Neurological: Difficulty processing information (Understanding) | 9.05 | 1.57 | 0.63 |
| Neurological: emotional overload | 4.74 | 6.08 | 0.81 |
| Neurological: Impairment of concentration | 22.71 | 14.29 | 2.10 |
| Neurological: Word-finding problems | 15.24 | 6.38 | 2.68 |
| Neurological-Audio: hypersensitivity to noise | 29.58 | 7.71 | 2.53 |
| Neurological-Vision: inability to focus eye/vision | 42.34 | 11.63 | 2.10 |
| Neurological-Vision: photophobia (Light Sensitivity) | 47.87 | 10.57 | 9.46 |
| Sleep: Unrefreshed sleep | 21.19 | 5.47 | 1.75 |
Looking at the counts:
| Symptom Name | Ombre | uBiome | BiomeSight |
| --- | --- | --- | --- |
| General: Fatigue | 32 | 15 | 20 |
| General: Headaches | 37 | 7 | 15 |
| General: Myalgia (pain) | 31 | 6 | 10 |
| Neurological: Confusion | 2 | 3 | 6 |
| Neurological: Difficulty processing information (Understanding) | 19 | 6 | 9 |
| Neurological: emotional overload | 11 | 17 | 10 |
| Neurological: Impairment of concentration | 32 | 18 | 5 |
| Neurological: Word-finding problems | 15 | 11 | 21 |
| Neurological-Audio: hypersensitivity to noise | 43 | 11 | 6 |
| Neurological-Vision: inability to focus eye/vision | 53 | 14 | 4 |
| Neurological-Vision: photophobia (Light Sensitivity) | 64 | 13 | 28 |
| Sleep: Unrefreshed sleep | 49 | 11 | 17 |
I attribute these stark differences to the fragments of RNA that each test examines in order to identify bacteria. It is those RNA fragments that matter.
All of the data used above is available for download.