Predicted Symptoms – Performance Review

At present we have many ways to estimate symptoms from a sample. In some cases, we look at what the bacteria as a group produce; in other cases, just the bacteria themselves. This post looks at how well the different methods behave. The different approaches are fishing expeditions to see if we can find more predictive analysis tools.


Today, I added the ability to see predicted symptoms against entered symptoms, as shown below.

We are going to pick the samples with the most symptoms, one per user, and see how well each method compares. Sample B had nothing from KEGG: KEGG works from species, and this sample lacked any. The numbers below are matches in the top 20 predicted symptoms.

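| Method (Top # selected) | A | B | C | D | E | F | G | H | I | J | K | L | M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Consensus (30) | 11 | 16 | 11 | 6 | 9 | 6 | 3 | 10 | 6 | 7 | 1 | 4 | 4 |
| Bacteria (20) | 9 | 16 | 9 | 12 | 11 | 12 | 12 | 2 | 9 | 14 | 4 | 11 | 12 |
| End Products (20) | 11 | 8 | 10 | 7 | 4 | 7 | 6 | 5 | 9 | 7 | 3 | 6 | 11 |
| KEGG Enzymes (20) | 8 | 0 | 8 | 5 | 8 | 3 | 3 | 8 | 5 | 7 | 1 | 3 | 4 |
| KEGG Modules (20) | 13 | 0 | 15 | 10 | 5 | 8 | 9 | 5 | 10 | 7 | 10 | 5 | 9 |
| KEGG Products (20) | 7 | 0 | 10 | 7 | 6 | 7 | 10 | 4 | 9 | 9 | 3 | 6 | 7 |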
These results come from walking a collection of samples, each from a different person. All samples had at least 80 symptoms entered.
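To make the scoring concrete, here is a minimal sketch of how a "matches in the top 20" number could be counted. It is not the site's actual code: the function name `count_top_matches`, the case-insensitive comparison, and the symptom names are all assumptions for illustration.

```python
def count_top_matches(ranked_predictions, entered_symptoms, top_n=20):
    """Count how many of the first top_n predicted symptoms were also entered.

    ranked_predictions: symptom names ordered from most to least likely
    (hypothetical input format). entered_symptoms: names the user entered.
    """
    # Compare names case-insensitively and ignore stray whitespace (an assumption).
    entered = {s.strip().lower() for s in entered_symptoms}
    top = ranked_predictions[:top_n]
    return sum(1 for s in top if s.strip().lower() in entered)


# Hypothetical data, for illustration only.
predicted = ["fatigue", "brain fog", "bloating", "insomnia", "joint pain"]
entered = ["Bloating", "Fatigue", "headache"]

matches = count_top_matches(predicted, entered, top_n=4)
print(matches)
```

With `top_n=20` and a real ranked list, the returned count corresponds to one cell of the table.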

There are challenges with the symptoms entered – namely

What we do find is that every sample had at least one method whose top 20 predictions matched the person's entered symptoms at least 50% of the time, with the highest being 80% (16 of 20), followed by 75% (15 of 20).

Second, we find that predicting from bacteria or from KEGG Modules had the best performance. In no case were bacteria end products, KEGG Products, or KEGG Enzymes the best. Consensus only once matched the performance of the other two and was often very bad.
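The comparison can be sketched in a few lines of Python. This is only an illustration, not the tool used for this post: the counts are my reading of three rows of the table for samples A, B and C, and the helper `best_method_per_sample` is a hypothetical name.

```python
TOP_N = 20  # the counts are matches within the top 20 predictions

# Match counts for samples A, B and C, read from three rows of the table.
counts = {
    "Bacteria": {"A": 9, "B": 16, "C": 9},
    "KEGG Enzymes": {"A": 8, "B": 0, "C": 8},
    "KEGG Modules": {"A": 13, "B": 0, "C": 15},
}


def best_method_per_sample(counts, top_n=TOP_N):
    """Return {sample: (best method, match rate)}; ties go to the first method listed."""
    samples = sorted({s for row in counts.values() for s in row})
    best = {}
    for sample in samples:
        method = max(counts, key=lambda m: counts[m].get(sample, 0))
        best[sample] = (method, counts[method].get(sample, 0) / top_n)
    return best


best = best_method_per_sample(counts)
for sample, (method, rate) in best.items():
    print(f"{sample}: {method} {rate:.0%}")
```

For these three samples the best match rates are 65%, 80% and 75%, the last two being the top figures quoted above.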

Bottom Line

The above shows a strong association of the bacteria (or their functions) with various symptoms. It does not prove causality, but causality is my working hypothesis (at least as a catalyst or contributor to symptoms).

This means that modifying bacteria may result in the reduction or elimination of many symptoms.
