Performance of Microbiome-Symptom Prediction Model

Preliminary evaluations have shown strong and accurate forecasting ability. I am working with a Ph.D. in Molecular Genetics and processing his microbiome sample the top forecasts were correct:

He was a male
He had no Health Issues
He was blood type O-positive
He snored
He woke up early

This was not from a blood sample, but from a stool sample. The methodology is described in my earlier posts.

The methodology is computing intensive. Unlike many contemporary artificial intelligence engines, it is easy to understand:

In the entire population(n:10,000), a bacteria like unclassified Clostridiales has 10% of samples reported this bacteria with a mean of 0.3%.
In the condition / sample population(n:1000) we find that 30% of the samples reported this bacteria with a mean of 4.8%

I used Chi² to determine significance because significance using means assumes a normal distribution which is a false assumption.

With this methodology it is good to get some indication of what sample sizes for conditions / symptoms is needed (assuming the control is at least the same size). A quick plot informs us that 200 looks like a minimum size with 300+ being desired.

One of the labs that I work with reports percentile ranking on their reports. This makes doing an exploration economical (i.e. cheap). Testing of the symptom/condition group is only needed because the control numbers are gifted you.

This also illustrates that the symptom is not with just one taxa, but the influence of many taxa working in unison to influence the symptom (or be influenced by the condition).

Samples of Symptoms / Conditions

The ones with the most predictors are below.

Neurocognitive: Absent-mindedness or forgetfulness
Neurological: Impairment of concentration
Neurocognitive: Difficulty paying attention for a long period of time
General: Fatigue
Autonomic Manifestations: irritable bowel syndrome
Official Diagnosis: COVID19 (Long Hauler)
Sleep: Problems staying asleep
Neurocognitive: Slowness of thought
Neurological-Sleep: Insomnia
Sleep: Daytime drowsiness
Immune Manifestations: Bloating
Neurocognitive: Problems remembering things

For blood types:

Only two were significant and we have the importance of sample size apparent.

Symptom Name	Sample N	Forecast Variables
Blood Type: O Positive	284	295
Blood Type: A Positive	126	3

For Age:

Symptom Name	Sample N	Forecast Variables
Age: 60-70	206	143
Age: 0-10	98	42
Age: 50-60	164	33
Age: 20-30	152	12
Age: 30-40	369	11
Age: 40-50	244	8

For Gender

Symptom Name	Sample N	Forecast Variable
Gender: Female	415	214
Gender: Male	554	200

Bottom Line

From the individual forecast variables, we can compute odds of a match. Consider a simplistic example of three factors

Factor A uses 5%ile
Factor B uses 10%ile
Factor C uses 20%ile

With A,B,C all being true has a 99% chance of being a taxa match (1 – .05 * .1 * .2). This does not mean the person has it, it indicate a significant risk of having it (depending on other factors, DNA, age, gender etc).

Of course, the more forecasting variables, the better the estimate becomes and also the more tolerant the forecast is to variations in the microbiome.

Microbiome Prescription Blog

A site exploring the microbiome, what it affects and how to manipulate it.

Performance of Microbiome-Symptom Prediction Model

Samples of Symptoms / Conditions

For blood types:

For Age:

For Gender

Bottom Line

Recent Posts

Pages

Reference Material

Recent Comments