What is your NEXT diagnosis?

This morning I chatted 90 minutes with another data scientist about the microbiome. After the video chat he sent me a link to this recent article: From IBS to ME – The dysbiotic march hypothesis [2020]

“The pathogenesis of the relationship is unknown. Intestinal dysbiosis may be a common abnormality, but based on 1100 consecutive IBS patients examined over a nine years period, we hypothesize that the development of the disease, often from IBS to ME, actually manifests a “dysbiotic march”. In analogy with “the atopic march” in allergic diseases, we suggest “a dysbiotic march” in IBS; initiated by extensive use of antibiotics during childhood, often before school age. Various abdominal complaints including IBS may develop soon thereafter, while systemic symptom like CFS, fibromyalgia and ME may appear years later.”

Related to the above:

Personally, I have seen someone progress from GERD to IBS to Chronic Fatigue Syndrome to atypical Crohn’s disease. The progression is not deterministic, with DNA being a significant factor.

A microbiome progression map has been on my to-do list for some time. I have just added it (based solely on gold-standard PubMed studies). It can be seen via https://microbiomeprescription.com/Library/PubMed


When you click the crystal ball, you will be taken to a new page. For example, IBS

Associated medical conditions to IBS
For Chronic Fatigue Syndrome
For Autism

Bottom Line

This is based on PubMed studies which are often hit and miss for depth of analysis and reporting shifts. Over time, I expect data to improve and the forecasts on this page to improve.

COVID Microbiome and implications for Long Haulers

A Periodic Review of PubMed for COVID Fecal Microbiome finally found some studies:

One paper reported very very good results!

“The optimal eight oral microbial markers (seven faecal microbial markers) were selected by fivefold cross-validation on a random forest model, and the classifier based on the optimal microbial markers was constructed and achieved an area under the curve (AUC) of 98.06% (99.74% in the faecal microbiome).”

Alterations in the human oral and gut microbiomes and lipidomics in COVID-19 [2020]
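For readers who have not met the AUC figure quoted above: an AUC of 98% means that, if you pick one patient and one healthy control at random, the classifier gives the patient the higher score about 98% of the time. The sketch below is only an illustration of that definition. The scores are made up and are not data from the paper, and the function name is my own.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly. Ties count as half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores, for illustration only.
patients = [0.9, 0.8, 0.4]
controls = [0.5, 0.3]
print(round(auc(patients, controls), 3))
```

Here one patient (0.4) scores below one control (0.5), so five of the six pairs are ranked correctly.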

“The heatmap showed that the faecal microbial community in CPRs (Confirmed Patients Recovered) was different from that in CPs (Confirmed Patients) and HCs (Healthy Controls)” [SP is Suspected Patient, SPR is Suspected Patient Recovered], which appears to confirm my hypothesis that most infections will leave a “garbage state” in the microbiome which will generally shift slowly back to the healthy state. This return to a healthy state is not certain, and when it fails to happen we have diagnoses such as long haul COVID, chronic fatigue syndrome, and post-infection syndrome.

More coming…

Alzheimer’s treatment via the microbiome.

This month (Feb 2021) there was a major article released: Structural and Functional Dysbiosis of Fecal Microbiota in Chinese Patients With Alzheimer’s Disease. I have updated my list of bacteria (with links to source studies).

This post is for caregivers who are interested in low-risk, low-cost treatments that theoretically have a high probability of success.

Short Summary of Approach

The microbiome produces some 4000+ different chemicals. For many conditions, especially “untreatable”, it appears that imbalances in these chemical mixtures result in cells, including brain cells, malfunctioning.

Some drugs help, and often those drugs were seen to alter the microbiome, correcting some of these shifts. The stupid question is this: if we know the bacteria that are involved, then why not starve or feed them to put the microbiome into better balance?

IMHO It works! In my 50’s I had a sudden onset of cognitive issues, including memory. A SPECT scan was read as Early Onset Alzheimer’s. I also had another diagnosis. That other diagnosis has a bacteria shift pattern reported in 1998 in Australia. Making changes to alter that pattern caused the cognitive issues to fade and disappear.


You need to have a microbiome sample (done by taking a little bit of stool and sending it to a lab). Then the data needs to be uploaded to the free citizen science site, Microbiome Prescription. Not all labs are supported (i.e. they do not make their data available in a suitable format); those that are supported are listed here (with discount codes).

Once the data is uploaded, there are two Quick Suggestions links that generate suggestions using Fuzzy Logic Artificial Intelligence techniques.

There is a demo login that shows all of the features: BiomeSight Example Login

There are a lot of tools there, depending on your skill set and devotion to seeking improvement.

There is a YouTube Channel showing how to use this site and discussion of issues.

Comparing Extreme 3% to Kaltoft-Moltrup Selection

A reader contacted me about a disagreement and the cause was a bug in the code for Kaltoft-Moltrup — subsequently fixed. This post looks at the bacteria selected by each for similarities and differences — so people can better understand the difference (which is a little abstract).

I am going to use one of the demo samples from BiomeSight (BiomeSight:2019-06-10 Self).

  • Extreme 3% picked 29 bacteria
  • KM picked 24 bacteria

I sorted their selections below in alphabetical order, 13 are in common (just over 50% of the KM choices).

Kaltoft-Moltrup | Extreme 3%
Actinomyces : Too High | Actinomyces : Too High
Actinomyces naturae : Too High
Anaerofilum : Too High
Actinomycetaceae : Too High
Bacillales Family X. Incertae Sedis : Too High | Bacillales Family X. Incertae Sedis : Too High
Bacteroides cellulosilyticus : Too High
Bacteroides denticanum : Too High
Bacteroides dorei : Too Low
Bacteroides intestinalis : Too Low | Bacteroides intestinalis : Too Low
Bacteroides rodentium : Too High | Bacteroides rodentium : Too High
Bacteroides sartorii : Too High | Bacteroides sartorii : Too High
Bacteroides thetaiotaomicron : Too High
Bacteroides vulgatus : Too Low | Bacteroides vulgatus : Too Low
Blautia : Too High
Blautia obeum : Too Low | Blautia obeum : Too Low
Brochothrix : Too High
Brochothrix thermosphacta : Too High
Chitinophagaceae : Too High
Clostridium paradoxum : Too High | Clostridium paradoxum : Too High
Coprobacillus : Too High
Coprococcus : Too High | Coprococcus : Too High
cunicula : Too Low
Dehalogenimonas : Too High
Desulfovibrio vietnamensis : Too Low
Johnsonella : Too High | Johnsonella : Too High
Johnsonella ignava : Too High | Johnsonella ignava : Too High
Lachnospira : Too High
Lactococcus : Too High | Lactococcus : Too High
Leuconostoc : Too High
Listeriaceae : Too High
Micrococcaceae : Too High
Oscillospira : Too High
Prevotellaceae : Too Low
Streptococcaceae : Too High
Streptococcus vestibularis : Too High | Streptococcus vestibularis : Too High
Sutterella : Too Low
Syntrophobacteraceae : Too High
Tetragenococcus : Too High
Thiothrix : Too High
Turicibacter sanguinis : Too Low

Looking at a chart of Prevotellaceae, we see that the KM low threshold is 2.25%. This sample fell between 2.25% and 3%, so it was excluded by one method and included by the other. For Listeriaceae : Too High, KM used 95.6% instead of 97%.

For Sutterella, KM uses 22% for low, hence it was included. This is reasonable because there is a distinctive drop-off around that point!
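The difference between the two selections can be sketched in a few lines. This is only an illustration, not the site's actual code. The function name is my own, the sample percentiles are hypothetical, and I am assuming Extreme 3% means "below the 3rd or above the 97th percentile". For KM, only the cut-offs mentioned in the text are real; the other side of each pair is a placeholder.

```python
def flag(percentile, low_cutoff, high_cutoff):
    """Return 'Too Low', 'Too High', or None for a percentile rank (0-100)."""
    if percentile < low_cutoff:
        return "Too Low"
    if percentile > high_cutoff:
        return "Too High"
    return None


# Extreme 3%: the same cut-offs for every bacteria (assumed 3 and 97).
EXTREME_3 = (3.0, 97.0)

# KM: cut-offs depend on the shape of each bacteria's curve.
# Values marked "assumed" are placeholders, not taken from the site.
KM_CUTOFFS = {
    "Prevotellaceae": (2.25, 97.0),  # low from the text, high assumed
    "Listeriaceae": (3.0, 95.6),     # high from the text, low assumed
    "Sutterella": (22.0, 97.0),      # low from the text, high assumed
}

# Hypothetical percentile ranks for one sample.
sample = {"Prevotellaceae": 2.6, "Listeriaceae": 96.2, "Sutterella": 15.0}

for name, pct in sample.items():
    extreme = flag(pct, *EXTREME_3)
    km = flag(pct, *KM_CUTOFFS[name])
    print(f"{name}: Extreme 3% -> {extreme}, KM -> {km}")
```

With these made-up percentiles, Prevotellaceae is flagged only by Extreme 3%, while Listeriaceae and Sutterella are flagged only by KM, which is the pattern described above.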

For Coprobacillus, it looks like I need to make some adjustments to the KM code; a chunk of unusual data caused a “step” that incorrectly triggered the high computation.

Bottom Line

We have good overlap, with the differences being due to the curves being different. With the extreme 3% approach, we are insensitive to the difference of shapes. With KM we are sensitive (and some parameters to the algorithm need a little adjustment).

A child with cerebral palsy microbiome

This is an analysis using a standard flow that I tend to use… The analysis was done using data from Thryve and uploaded to Microbiome Prescription.

Microbiome Functional Abnormalities

With the recent addition of KEGG information, my focus has shifted from a naive “this bacteria is too high or too low” to this enzyme or end-product is too high or too low. Why? Bacteria can substitute for each other.

Consider building a wall on a house. If you grew up in the U.S. Northwest, it will be 90% wood and 10% gypsum board. But walls may be built with steel framing, concrete siding, bricks, stones, logs, concrete blocks etc. Is a wall not built of 90% wood unsafe? No, the question should be how the wall functions for structural strength, insulation etc. It is the same with bacteria.

Three Functional Checks

We use the three items at the bottom of Changing Your Microbiome.

End Product Abnormalities

At this point, the reader should copy and google information about each end product.

  • “Indole may act as an interspecies signaling compound.” [src] translation: the bacteria internet is flaky
  • Biogenic amines – these are neurotransmitters
  • Bacteriocin: Lasso peptide – “It has varieties of biological activities, among which the most important one is its antibacterial efficacy.” [src] in other words, it suppresses other bacteria

So the story appears to be bad communications between bacteria, and the microbiome being biased towards those that tolerate the natural antibiotic, Bacteriocin: Lasso peptide.

Clicking thru to the bacteria involved, we find that the antibiotic effect is due to one bacteria, shown below. The Indole appears to be due to a low Alistipes level (the most common source, and it has a low value).

Source for Biogenic amines and Bacteriocin: Lasso peptide
KEGG Enzyme Outliers

I dropped the filtering to 90%, and only the enzymes show issues.

Core Supplements

The purpose of this is to identify items that may compensate for low amounts. There were none.

There are no significant lows ( < 25% of Median) and we note two high items: GABA and DAO

Predicted Symptoms

There was only one item that had strong likelihood:

 Impaired Memory & concentration|Neurological-Vision: Blurred Vision|

Pub Med Microbiome Reports

Cerebral palsy is not in my existing list of medical conditions, so I did a PubMed search and found just two studies.

Distinct Gut Microbiota Composition and Functional Category in Children With Cerebral Palsy and Epilepsy [2019]

“The increased risk of “Neurodegenerative diseases” in CPE patients was probably attributed to Streptococcus, Parabacteroides, and Bacteroides ” [2019]

I added CP to the database with the bacteria shifts reported, and then ran for matches to the reported patterns.

Values in top or bottom 9%. A good match for PubMed report

The top results for probiotics are shown below

In terms of diet additions:

As well as:

Vitamin D, Blackberries, Limes, Oranges are in the Flavonoids list

And diet avoidance

We note that some species of lactobacillus are on the to-take list and others are on the to-avoid list. Different species produce different bacteriocins (natural antibiotics against other bacteria).

Bottom Line

The above changes will likely have a positive impact and should slow neurodegeneration. As always, these suggestions should be reviewed by a knowledgeable medical professional before starting. These are machine learning suggestions that are blinkered in terms of the factors considered.

Sample Comparison Tools

This set of tools is designed for two scenarios:

  • One person who has had many samples over time. Typically that person is looking for outliers to reduce or eliminate
  • A family with samples from different members. Typically one person is challenged and since the family group has shared DNA and diet — the hope is that the bacteria grouping causing the challenge will be identified. Once identified, it may be actionable.
    • This scenario will have more tools added over the next weeks.

I have used them in these prior posts:

If you have two or more samples uploaded, you will see the top two items on the Available Samples page. These may be collapsed into one over the next few weeks.

Clicking the right button of these two will take you to a sample selection page.

Click TWO or more samples

The program will list below all items that match all samples OR all samples except 1 (but at least two).
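The matching rule above can be sketched as follows. This is my own illustration of the rule as described, not the site's code. The function name and the item names in the example are hypothetical.

```python
from collections import Counter


def shared_items(samples):
    """samples: list of sets, each holding the items flagged in one sample.

    An item is kept if it appears in all samples, or in all samples
    except one, but never in fewer than two samples.
    """
    required = max(2, len(samples) - 1)
    counts = Counter(item for flagged in samples for item in set(flagged))
    return sorted(item for item, seen in counts.items() if seen >= required)


# Hypothetical flagged items from three samples.
samples = [
    {"enzyme A low", "genus X high", "module M low"},
    {"enzyme A low", "genus X high"},
    {"enzyme A low", "module M low", "genus Y high"},
]
print(shared_items(samples))
```

With three samples an item needs to appear in at least two of them; with only two samples selected it must appear in both.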

I selected a group of 5 samples from when I was having a ME/CFS flare.

The lack of dehydrogenases above would account for high lactate (brain fog) and agrees with research [Nicotinamide adenine dinucleotide (NADH) in patients with chronic fatigue syndrome].

IMHO, it correctly identified what was wrong with me.

KEGG Modules take a bit of research to understand.

Doing some research, I found “L-tryptophan is produced in the shikimate pathway from chorismate”, which led to many ME/CFS articles on PubMed.

Again, it may take some research to understand what this is.

This informed me which bacteria were consistently high over 5 samples.

The cause of the ME/CFS flare was stress, and a quick search on PubMed found several articles where this genus increased with stress. The latest was “Gut Microbiota Are Associated With Psychological Stress-Induced Defections in Intestinal and Blood–Brain Barriers” [2020].

Bottom Line

This set of tools does not give immediate answers; it gives you leads to investigate. For myself, the findings plus the use of PubMed studies wove a story of what happened that agrees with the literature. This is very important because ME/CFS contains dozens of subsets. Often I have seen that what is helpful for one subset is harmful for another. I suspect this also applies to other conditions, such as ASD/Autism.

In this case, it identified one key family to reduce, identified enzymes that I was short on, and led to possible supplements that I should consider because of the disruption.

Thryve vs BiomeSight – the numbers

A reader asked which one to use. They can be comparable in price, especially this weekend with Black Friday specials. A cost item that should also be factored in is shipping costs to and from the lab. In the US, Thryve comes with a postage-paid return package.

The Numbers

The upload page gives raw numbers. I am also going to dive a little deeper into the numbers.

Elusive 1%

Adjusting for number of samples, they appear very similar.

Elusive 2%

These charts are those between 1% and 2% in occurrence.

Elusive 2-4%

This is the count between 2 and 4% Frequency. BiomeSight appears to have an edge.

Elusive 4-8%

As above, the BiomeSight curve appears better. A count of 100 is at the 84%ile with Thryve and at the 41%ile with BiomeSight; in other words, BiomeSight reports more in this range per sample. This is important for the AI analysis, because we need a threshold count before we can detect patterns.
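To make the percentile reading concrete, here is a minimal sketch. I am assuming "a count of 100 is at the N%ile" means that N% of that lab's samples report 100 or fewer bacteria in this range. The per-sample counts below are invented for illustration and are not real Thryve or BiomeSight data.

```python
from bisect import bisect_right


def percentile_of(value, observations):
    """Percentage of observations that are at or below the given value."""
    ordered = sorted(observations)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical counts of bacteria in the 4-8% range, one number per sample.
lab_a = [40, 60, 75, 90, 100, 130]
lab_b = [80, 95, 110, 125, 140, 160]

print("Lab A:", round(percentile_of(100, lab_a), 1))
print("Lab B:", round(percentile_of(100, lab_b), 1))
```

The lab where 100 sits at the lower percentile is the one whose samples more often report more than 100 bacteria in the range.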

Elusive 8-16%

Thryve now pulls ahead. BiomeSight has a bacteria count of 70 at the median, while Thryve has 90.

But wait! Does it report on what you are interested in?

In my last personal analysis, there were two probiotic recommendations:

I would like to see those counts on my next sample, so clicking on the above links, I see these stats:

Ouch, BiomeSight is the only one that reports either! Looking at the parent group, I see BiomeSight again reports better

Bottom Line

There is no clear better or worse; it depends on your needs.

BiomeSight offers free processing of Thryve FASTQ files, which is a big kudos to Rose at BiomeSight. Thryve offered free processing once upon a time, but it does not appear to be offered any more (or it is sufficiently hidden that I cannot find it).

The new kid on the block, nirvanabiome, which uses CosmosID.com, is 3x as expensive and does not appear to report any more bacteria types (which is surprising given their claims, I expected counts close to Xenogenes shown at the top of this post). I do not have sufficient samples via CosmosID/Nirvanabiome to do more analysis.

nirvanabiome also appears to be targeting the Autism market. I have a separate blog on Autism and the Microbiome, so I am interested in whether they will produce actual beneficial results or if this is a “the best of intentions, the worst of execution” scenario.

Australia Microba Gut Test Uploads

I have implemented an upload for Microba, an Australian firm that claims “With the most comprehensive microbiome test available”. Instructions on how to do a download and upload are in this

I have tried several times in the past to do it. One of the biggest problems is that they do not use NCBI reference numbers or names. In fact, many of the bacteria they name — you will not find a single study on PubMed with that name. In other words — valueless information.

I have a mapping of their interesting names to NCBI names online (and it will grow as samples are added and new names are added). The mapping is located here. I have repeatedly emailed them asking for a download with NCBI taxon numbers, without success.

Left side is their name, right side is NCBI name. CAG-### is very vague.
Some additional “delights”

Only Selected Layers are Reported

They report only on the Phylum, Family, Genus and Species levels. Excluded are Orders, Classes and Strains. After the mapping, we are typically left with fewer than 100 bacteria taxa versus many more from other providers. I do not know how they define “With the most comprehensive microbiome test available”. Most means better than ALL…

Number of different bacteria reported by Sample

In short:

  • the information available is far less.
  • This is made worse by the use of atypical names for bacteria. If you are high in “Peh17” and go to PubMed to see what will lower it, or what conditions are associated with it, you hit a blank page. They may provide advice, but the basis of that advice cannot be independently checked.
  • The sum of all Species/Genus/Family is 100%. This implies that they have identified every bacteria, which is impossible. They have scaled the numbers of the bacteria that they detected to 100%. A person with actually 40% of one bacteria in their gut could see a report of 45%, 65%, 85%, depending on what other bacteria are there.
  • The report is to 0.01%, that is, 100 per 1,000,000, a coarser measurement than some other tests.
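The rescaling problem in the list above is simple arithmetic. The sketch below assumes the lab divides each detected share by the total share of everything it detected; that is my reading of "scaled to 100%", not a documented Microba formula. The function name and the detected totals are hypothetical.

```python
def reported_share(actual_share, detected_total):
    """Share shown on the report after the detected bacteria are
    rescaled so that they sum to 100%.

    actual_share:   true percentage of this bacteria in the gut
    detected_total: percentage of the gut that the lab could identify
    """
    return 100.0 * actual_share / detected_total


# The same true 40% under three hypothetical detection totals.
for detected_total in (88.9, 61.5, 47.1):
    shown = reported_share(40.0, detected_total)
    print(f"identified {detected_total}% of the gut -> reported as {round(shown)}%")
```

The less of the gut the lab can identify, the more the reported number is inflated, even though the amount of that bacteria has not changed.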

Bottom Line

For those of you who have already tested with Microba, you can upload and MicrobiomePrescription will do as much as it can with that information. If you decide to do a retest, I would not recommend using Microba for the reasons cited above. I have heard that the UK firm BiomeSight is making it easier for Australians to use their service. I have heard that duties and shipping costs make Thryve Inside more expensive than BiomeSight.

The upload page is at: https://microbiomeprescription.com/Upload/Microba

Expect it to be a few days before 100% of your sample is ready, because any new odd-ball names have to be researched and entered into the mapping table. At upload, you will likely be 80+% processed immediately.

One Stool, Two Samples, One Lab — What the shit!

A reader sent me the message below and gave permission to use his sample. About a year ago, I wrote The taxonomy nightmare before Christmas…, which looks at the differences between lab results using the same sample (as represented by a FASTQ digital file). We now try one more variation.

Last september I did (again) test my microbiome with Thryve. Because I had some general doubts about the validity of stool samples, I ordered two tests and took two different samples of the same stool and send them in under two different names.
…the results confirmed my doubts as I got different bacteria levels of the ten strains Thryve shows in their overview. 

STRAINS | % sample 1 | % sample 2

So I do not doubt the reliability of each sample, but see that the validity of the sample is the problem. The results of a sample seem to be more or less random and not representative of the microbiome in general.
…so I think that any advice given, based on the results of one sample is arbitrary. If we are to take the importance of the microbiome seriously, we will have to consider a new way of getting a representative sample to have a solid base for interventions concerning our health.

Sampling Statistics

The typical sample seems to contain around 100,000 bacteria and is usually reported out of a million (scaled up). “Bacteria in faeces have been extensively studied. It’s estimated there are nearly 100 billion bacteria per gram of wet stool. ” [src] The sample that you sent in was likely no more than one milligram.

To use the “if I was a Martian” model… It is like a spaceship abducting a boatload of people in the Mediterranean…. If the boat is a cruise ship full of fat diabetic elderly Americans you will get one result. If the boat is full of starving Nigerian children trying to become refugees in Europe, a very different result. That is a disturbing concept when your mind is fixed on a deterministic, precise, definitive result. It’s a sample, folks! For most industrial processes, dozens (or hundreds) of samples are required to get quality assurance. For the nerds, some readings: [2015] [Wikipedia]

Example: Two employees working for the same company at the same job earning the same amount and living in the same community. You stop each of them and take a sample of how much money they have in their wallet.
Would you expect them to have the same amount? Would they have the same number of pennies? dimes? quarters? Credit Cards?
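For the nerds, the wallet example can be put in numbers. This is a rough sketch under a strong assumption: every one of the roughly 100,000 bacteria counted is an independent random draw from a perfectly mixed stool (a binomial model). Real stool is unlikely to be perfectly mixed, so real differences between two samples may well be larger. The function name and the example proportions are my own.

```python
import math


def expected_count_and_spread(proportion, reads=100_000):
    """Expected count and standard deviation for a bacteria that makes up
    `proportion` of the stool, when `reads` bacteria are counted at random."""
    mean = reads * proportion
    sd = math.sqrt(reads * proportion * (1.0 - proportion))
    return mean, sd


# Hypothetical bacteria: common, uncommon and rare.
for proportion in (0.20, 0.01, 0.0001):
    mean, sd = expected_count_and_spread(proportion)
    relative = 100.0 * sd / mean
    print(f"{proportion:.2%} of stool: expect {mean:.0f} +/- {sd:.1f} "
          f"(about {relative:.1f}% relative spread)")
```

The rarer the bacteria, the larger the relative spread, so two samples from the same stool should be expected to disagree most on the rare bacteria.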

A Sampling Analogy

Reviewing these two samples

Fortunately, I have sample comparison tools already on https://microbiomeprescription.com/.

Diversity By Taxonomy Rank

I would expect differences in samples to increase as you move down the rank. It is similar to asking at one level [European, African, Asian] on the abducted ship above. At the next level [Swede, Dane, Italian, etc], the counts between samples will diverge as you do more detailed classification.

This illustrates why I use fuzzy logic for predicting symptoms, which readers report works well. Readers have reported that using studies from PubMed produces poor results.

When the two samples are used to predict symptoms, we have a strong convergence. While the actors may be different, their impacts are similar.

Adjusting for Natural Variation

Using counts without context is a good way to get upset without justification. I use percentiles to provide context and have a comparison page (which I need to revise). At the phylum level we see general agreement between the samples. One rare phylum was lacking in one sample (not found in 30% of Thryve Samples but only 6% of BiomeSight – hint: download the FASTQ files and process them thru BiomeSight [for free!]).

Medical Condition Matches

Going over to PubMed Medical condition matches, we see a striking similarity between the samples as shown below. So for detecting medical conditions, they are almost identical to each other (despite the differences in bacteria).

End Products Predictions

Again, we have strong agreement between the samples using 3 buckets.

  • Both below 12%ile (i.e. Low)
  • Both above 82%ile (i.e. High)
  • Both in normal range

This means for this type of diagnostic evaluation — they appear to be the same.
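The bucket comparison can be sketched like this. It is only an illustration: the 12%ile and 82%ile cut-offs are taken from the list above, with values above the 82%ile treated as High. The function names, end product names and percentiles in the example are hypothetical.

```python
def bucket(percentile, low=12.0, high=82.0):
    """Place a percentile into one of three buckets."""
    if percentile < low:
        return "Low"
    if percentile > high:
        return "High"
    return "Normal"


def agreement(sample_a, sample_b):
    """Fraction of shared end products that land in the same bucket.

    Each sample is a dict of end product name -> percentile.
    """
    shared = sample_a.keys() & sample_b.keys()
    if not shared:
        raise ValueError("the two samples have no end products in common")
    same = sum(1 for name in shared
               if bucket(sample_a[name]) == bucket(sample_b[name]))
    return same / len(shared)


# Hypothetical end product percentiles for two samples of the same stool.
first = {"product A": 5.0, "product B": 90.0, "product C": 40.0, "product D": 85.0}
second = {"product A": 9.0, "product B": 95.0, "product C": 60.0, "product D": 70.0}
print(agreement(first, second))
```

Note that the percentiles themselves differ in every row, yet three of the four end products fall in the same bucket. That is the sense in which two samples with different numbers can still agree diagnostically.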

Bottom Line

There are several questions that need to be asked (and an answer to one):

  • To the folks at Thryve (and Biomesight.com), why are the numbers so different?
  • For users of my analysis site: https://microbiomeprescription.com/, for diagnostic purposes there are few differences! We have general agreement for:
    • End Product Production
    • Medical Studies Matches
    • Symptom Matches
    • Detecting high or low levels by percentile

The critical difference between the information from lab providers and my site is interpretation sophistication.

So, to answer the reader’s question “The numbers are in major disagreement, but the diagnostic significance of the whole sample is in strong agreement”. Doing the lab analysis is worth it — just ignore the lab’s “value added” suggestions/information.


I have just completed a series of charts showing timelines (over 1000 new charts for most users). Timelines are important because they show how things are changing. Showing a timeline can be complicated because numbers from different lab software are not comparable (see Taxonomy Nightmare post) even when the same analysis data file (FastQ) is used!

To address the issue of different numbers, different symbols are used for different lab software. It is strongly recommended that you obtain the FastQ files from the lab that did the sample and process them thru:

The result should be 3 or more sets of readings for the single sample.

There is no magic, most important number for all people. Nor can a single test provide a solution. Your microbiome changes over time, and as you attempt to change things, there will be unexpected shifts. If you are dealing with health issues, one test every 2-3 months is strongly recommended.

The data is recomputed at least once a week (there are a lot of numbers to recompute to keep the data current!).

The timelines are divided into 5 collections, as shown below

Expected Benefit

My own experience has been that I was able to improve (and in some cases, eliminate) medical concerns by using regular microbiome tests and altering diet, supplements etc to manipulate the microbiome. The challenge is identifying what is off, and then how to correct it.

This suite of timeline charts is intended to make that easier (although you may have many charts to look thru).

Bacteria Time Line

This collects the distributions for all bacteria types seen in all of your samples and allows you to inspect changes by time (and lab software!).

Showing using Log(Bacteria Count) – Default
Showing actual values —
Shown as Percentage of highest value of any sample
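The three display styles are just three ways of transforming the same counts. A minimal sketch, assuming a base-10 logarithm (the site does not say which base it uses). The function name and style labels are my own, and the counts are hypothetical.

```python
import math


def display_values(counts, style="log"):
    """Transform raw bacteria counts for charting.

    style: "log" (base 10, assumed), "value", or "percent_of_highest".
    A count of zero has no logarithm, so it is returned as None.
    """
    if style == "log":
        return [math.log10(c) if c > 0 else None for c in counts]
    if style == "value":
        return list(counts)
    if style == "percent_of_highest":
        top = max(counts)
        return [100.0 * c / top if top else 0.0 for c in counts]
    raise ValueError(f"unknown style: {style}")


# Hypothetical counts for one bacteria across four samples.
counts = [120, 15000, 0, 900]
for style in ("log", "value", "percent_of_highest"):
    print(style, display_values(counts, style))
```

The log style keeps small counts visible next to very large ones, which is probably why it makes a sensible default.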

The bacteria rank goes down to Strains, when that is reported by the lab software.

End Product Timelines

End products being produced by the bacteria are estimated from available data. Since each lab reports different bacteria counts, there will be some variability. Again we have 3 styles of display: Log, Value, % of highest value.

There are about 150 choices to explore. If a chart is blank, then there were no bacteria matches. This may not be an item of concern because we have only partial knowledge of what is produced by which bacteria.

Medical Conditions Timelines

This uses studies from PubMed which report ‘higher’ or ‘lower’ levels. We use quartiles (highest and lowest) to compute values. In general, readers have reported that these numbers are less accurate than symptoms (see section below). We have 3 styles of display: Log, Value, % of highest value.

There are about 150 choices to explore

Symptoms Timeline

This is from this site’s Citizen Science Artificial Intelligence algorithms. Only symptoms that have at least 5 highly statistically significant bacteria are shown. The number of items may increase or decrease with time. At present, it’s around 100.

For myself, when I was above the 90%ile, I was having regular night sweats.

Sample Profile Timeline

This displays a variety of general characteristics as shown below:

A brief summary

  • Bacteria Count: Number of bacteria identified. More diversity is usually good, but excessive diversity, especially of rare bacteria, usually indicates issues
    • Rarest 1%: Seen in less than 1% of uploaded samples
    • Rare 2%: Seen in less than 2% of uploaded samples
    • Rare 4%: Seen in less than 4% of uploaded samples
    • Unusual 8%: Seen in less than 8% of uploaded samples
    • Infrequent 16%: Seen in less than 16% of uploaded samples
  • Firmicutes-to-Bacteroidetes Ratio: Some people deem it significant for some conditions
  • Prevotella-to-Bacteroides Ratio: Some people deem it significant for some conditions
  • Overall Symptom Health: This is a measure summing all bacteria matches for all symptoms
  • Overall Medical Condition Health: This is a measure summing all bacteria matches for all medical conditions
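Two of the profile items above are easy to sketch: the rarity counts and the ratios. This is an illustration only, not the site's code. The cut-offs follow the bucket names (1, 2, 4, 8 and 16%); the function names, bacteria names and numbers are hypothetical.

```python
def rarity_counts(sample_taxa, prevalence, cutoffs=(1, 2, 4, 8, 16)):
    """For each cut-off N, count the bacteria in this sample that are seen
    in less than N% of uploaded samples.

    prevalence: dict of bacteria name -> % of uploaded samples containing it.
    Bacteria with unknown prevalence are skipped.
    """
    known = [t for t in sample_taxa if t in prevalence]
    return {n: sum(1 for t in known if prevalence[t] < n) for n in cutoffs}


def ratio(abundance, numerator, denominator):
    """Ratio of two bacteria groups, or None if the denominator is absent."""
    bottom = abundance.get(denominator, 0.0)
    return abundance.get(numerator, 0.0) / bottom if bottom else None


# Hypothetical data.
prevalence = {"taxon A": 0.5, "taxon B": 3.0, "taxon C": 12.0, "taxon D": 95.0}
sample_taxa = ["taxon A", "taxon B", "taxon C", "taxon D"]
abundance = {"Firmicutes": 60.0, "Bacteroidetes": 30.0}

print(rarity_counts(sample_taxa, prevalence))
print(ratio(abundance, "Firmicutes", "Bacteroidetes"))
```

The counts here are cumulative, so a bacteria counted under Rarest 1% is also counted under every wider bucket.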

Bottom Line

This suite of charts gives a lot of analysis information. Most of these charts allow you to drill down immediately.

Which takes you to a page like this:

Enjoy, learn, share.