Accurate Inference from Studies on the Microbiome

My exploration of microbiome modification began with reading studies archived in the U.S. National Library of Medicine. Because I have been developing expert systems since the 1990s, my instinct was to encode the findings from these studies as facts within an expert system and let logic determine the optimal course of action.

In artificial intelligence (AI), an expert system is a computer system emulating the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as if–then rules rather than through conventional procedural programming code — Wikipedia

Some Difficult Discoveries

As I built the knowledge base, some major problems quickly emerged:

  • Results from different studies often contradicted one another.
  • Some results were replicated consistently, while others produced conflicting outcomes.
  • Certain findings were reported only once and never replicated.
  • There was significant uncertainty about bacterial identification due to non-standardized testing methods (see this explanation).
  • Studies tended to report results at a single taxonomy rank—often not the rank relevant to my analysis.

To address the first issue, I incorporated fuzzy logic into the expert system, allowing it to handle ambiguity and partial truths rather than rigid yes/no classifications.

Fuzzy logic is based on the observation that people make decisions based on imprecise and non-numerical information. Fuzzy models or fuzzy sets are mathematical means of representing vagueness and imprecise information (hence the term fuzzy). These models have the capability of recognising, representing, manipulating, interpreting, and using data and information that are vague and lack certainty. — Wikipedia
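As a rough illustration of how contradictory findings can become graded values instead of a yes/no answer, here is a minimal Python sketch. It is not the code behind the site: the function name `fuzzy_degrees`, the labels "increase" and "decrease", and the four sample findings are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical labels for what a study reports about one compound and one taxon.
VALID = {"increase", "decrease"}

def fuzzy_degrees(directions):
    """Turn a list of reported directions into degrees of truth in [0, 1].

    Each degree is simply the share of studies reporting that direction,
    which is one simple (assumed) way to assign fuzzy membership values.
    """
    unknown = set(directions) - VALID
    if unknown:
        raise ValueError(f"unexpected direction labels: {sorted(unknown)}")
    total = len(directions)
    if total == 0:
        return {"increase": 0.0, "decrease": 0.0}
    counts = Counter(directions)
    return {label: counts[label] / total for label in ("increase", "decrease")}

# Made-up example: three studies report an increase, one reports a decrease.
reported = ["increase", "increase", "decrease", "increase"]
degrees = fuzzy_degrees(reported)

# A single net value in [-1, 1]: positive leans "increase", negative leans "decrease".
net = degrees["increase"] - degrees["decrease"]
print(degrees, net)
```

With a representation like this, a finding backed by three of four studies is kept as "mostly true" rather than being forced into true or false.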

The taxonomy-rank issue required a different approach. I began using bacterial association data (available here) to infer relationships between taxa. For example, if a compound influenced the genus Bifidobacterium, I could reasonably infer a similar effect for its species. The relationship also works in reverse: if you want to increase Bifidobacterium overall, the species Bifidobacterium longum, a readily available probiotic, shows the strongest positive association in the table below.

Species Name                        Estimate Percentage Inference
Bifidobacterium actinocoloniiforme  18.8
Bifidobacterium adolescentis        54.8
Bifidobacterium angulatum           26.6
Bifidobacterium animalis            14.5
Bifidobacterium asteroides          40.2
Bifidobacterium avesanii            34.2
Bifidobacterium bifidum             25.5
Bifidobacterium bohemicum           52.7
Bifidobacterium bombi               57.7
Bifidobacterium boum                64.1
Bifidobacterium breve               52.4
Bifidobacterium catenulatum         33.9
Bifidobacterium choerinum           66.6
Bifidobacterium commune             45.4
Bifidobacterium cuniculi            21.8
Bifidobacterium dentium             23.2
Bifidobacterium gallicum            30.8
Bifidobacterium indicum             52.9
Bifidobacterium lemurum             50.4
Bifidobacterium longum              73.7
Bifidobacterium magnum              62.5
Bifidobacterium minimum             27.5
Bifidobacterium mongoliense         31.9
Bifidobacterium pseudocatenulatum   31.2
Bifidobacterium pullorum            30.2
Bifidobacterium ruminantium         20.4
Bifidobacterium scardovii           16.9
Bifidobacterium subtile             38.8
Bifidobacterium thermacidophilum    44.5
Bifidobacterium thermophilum        29.8
Bifidobacterium tsurumiense         11.7
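To show how a table like this can be queried, the sketch below copies five rows from it into a Python dictionary and ranks them. The helper name `rank_species` is hypothetical, and because only five rows are included, the ranking applies to that subset rather than to the full table.

```python
# Five rows copied from the table above: species -> estimate percentage.
association = {
    "Bifidobacterium adolescentis": 54.8,
    "Bifidobacterium animalis": 14.5,
    "Bifidobacterium bifidum": 25.5,
    "Bifidobacterium breve": 52.4,
    "Bifidobacterium longum": 73.7,
}

def rank_species(table, top_n=3):
    """Return the top_n (species, percentage) pairs, strongest association first."""
    return sorted(table.items(), key=lambda item: item[1], reverse=True)[:top_n]

top = rank_species(association)
for name, pct in top:
    print(f"{name}: {pct}")
```

Within this subset, Bifidobacterium longum comes out first, matching the observation in the text.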

With fuzzy logic, study findings indicating increases or decreases could be translated into numerical values. Using bacterial association data, I could then adjust those values to create a more accurate estimate of impact.
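One way to picture that adjustment step is shown below. This is a sketch under stated assumptions, not the site's actual formula: it assumes a genus-level finding has already been turned into a value between -1 (decrease) and +1 (increase), and that scaling it by the association percentage is a reasonable way to estimate the effect on a species. The function name `infer_species_value` and the genus value of 0.5 are made up for the example; the two percentages come from the table above.

```python
def infer_species_value(genus_value, association_pct):
    """Scale a genus-level value in [-1, 1] by an association percentage (0-100).

    Assumption: the stronger the association, the more of the genus-level
    effect is carried over to the species.
    """
    if not -1.0 <= genus_value <= 1.0:
        raise ValueError("genus_value must be between -1 and 1")
    if not 0.0 <= association_pct <= 100.0:
        raise ValueError("association_pct must be between 0 and 100")
    return genus_value * association_pct / 100.0

# Hypothetical genus-level value: studies lean moderately toward "increase".
genus_value = 0.5

# Two rows taken from the table above.
species_pct = {
    "Bifidobacterium longum": 73.7,
    "Bifidobacterium bifidum": 25.5,
}

estimates = {
    name: infer_species_value(genus_value, pct)
    for name, pct in species_pct.items()
}
for name, value in estimates.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, the same genus-level finding yields a larger estimated effect for Bifidobacterium longum than for Bifidobacterium bifidum, because its association percentage is higher.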

Why Do the Mathematics?

Modern AI models, particularly Large Language Models (LLMs), operate differently. They generate responses by predicting text that plausibly follows the question, based on patterns in their training data, rather than by reasoning over an explicit set of facts and relationships. An LLM does not reliably distinguish whether a claim comes from a single study or many, nor does it systematically analyze hierarchical relationships within bacterial taxonomy. As a result, taxonomic nuances are often overlooked. For example, Lactobacillus reuteri was reclassified as Limosilactobacillus reuteri, so studies published under either name describe the same organism, and a system that ignores how bacterial naming conventions evolve may treat them as unrelated.

At the other end of the treatment spectrum is a “whole health” influencer who might recall a single study about Bifidobacterium dentium and use it to infer a complete treatment plan. My approach is simpler: I prefer actions grounded in probability—ones that have the best odds of success.

This is part of how my free site for individuals, Microbiome Prescription, works.
