Microbiome data follows neither a straight line nor a bell curve. Finding associations is challenging, and the results are often poor. I know from years working as a statistician that finding a “magical data transformation” is often the key to finding associations. However, an ongoing issue is over-fitting the data when people try formulas at random. I have tried a variety of machine learning methods, with generally poor results.
I put my lateral thinking cap on. Instead of using an explicitly defined formula, I used an intrinsic transformation: the percentile of the readings. This approach needs a large sample size; fortunately I have that, with over 1500 pairs of data points being common. A similar approach was discussed in Percentile Regression: A Parametric Approach (1978, Journal of the American Statistical Association), but it never gained popularity.
This post walks through the process, which is being run on the 14,374,869 possible associations that we have (excluding symptoms and conditions).
Example
I picked one of my initial good results and will walk through charts showing how the picture changes with each approach. First, the raw numbers plotted:
We see a relationship that looks weak (flat) if you do not do the R² calculations.
Then we chart the log of the raw numbers (the log of the values worked well for determining the Kaltoft-Moldrup normal ranges; KM is based on different moments of the resulting curves):
The pattern is stronger (20% higher R²).
The new way is shown below, using the intrinsic transformation to percentile:
Plotting percentile against percentile (52% higher R² than the original).
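The three views above (raw, log, percentile) can be compared in a few lines of code. What follows is only a minimal sketch on synthetic skewed data, not the site's actual data or code; the names `percentile_ranks` and `r_squared` and the way the synthetic values are generated are assumptions made for illustration.

```python
import math
import random


def percentile_ranks(values):
    """Return each value's percentile (0-100) within its own sample; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank for a run of tied values
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    n = len(values)
    return [100.0 * r / n for r in ranks]


def r_squared(xs, ys):
    """Squared Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Synthetic, heavily skewed data for illustration only (an assumption, not real readings).
rng = random.Random(0)
taxon = [math.exp(rng.gauss(0, 1.5)) for _ in range(1500)]
compound = [t ** 0.5 * math.exp(rng.gauss(0, 1.0)) for t in taxon]

r2_raw = r_squared(taxon, compound)
r2_log = r_squared([math.log(t) for t in taxon], [math.log(c) for c in compound])
r2_pct = r_squared(percentile_ranks(taxon), percentile_ranks(compound))
print(f"raw R2={r2_raw:.3f}  log R2={r2_log:.3f}  percentile R2={r2_pct:.3f}")
```

The percentile transform needs no formula to be chosen: each reading is simply replaced by its position within its own sample, which is why it is less prone to the "try formulas at random" over-fitting problem.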
Bottom Line
Finding associations as illustrated above means we can tease information from our data. For example, for B12 levels, we have a strong association to Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate. This means that the bacteria associated with that module are likely associated with B12 production. Below are a few of some 2000+ strains associated with this module:
Faecalibacterium prausnitzii
Bacteroides vulgatus
Bacteroides uniformis
Parabacteroides distasonis
Bacteroides caccae
Bacteroides dorei
Bacteroides thetaiotaomicron
Bacteroides ovatus
Roseburia intestinalis
Flavonifractor plautii
Bacteroides fragilis
Odoribacter splanchnicus
Alistipes finegoldii
Eggerthella lenta
Additionally, where there is a relationship between two bacteria and we know nothing about how to modify one of them but something about how to modify the other, we can propose suggestions by association. This will be coming soon to Microbiome Prescription, the citizen science site.
Hey do you think microbiota dysbiosis could cause circadian disturbance? Most articles go in the opposite direction and say it's lifestyle causing circadian disturbance… But my disturbance is resistant to lifestyle… I just have a primary circadian problem that might even be my worst symptom… Most resistant and almost lifelong.
Asked by a Reader
In keeping with the “gold standard” of information instead of bloggers’ urban myths and ideologies, I headed over to the National Library of Medicine studies.
“gut microbial metabolites influence central and hepatic clock gene expression and sleep duration in the host and regulate body composition through circadian transcription factors”[2020]
“Findings have suggested that gut microbiota play a major role in regulating brain functions through the gut-brain axis. A unique bidirectional communication between gut microbiota and maintenance of brain health could play a pivotal role in regulating incidences of neurodegenerative diseases. ” [2021]
First, a more precise definition of circadian rhythms from the above study.
A fundamental part of eukaryotic life, circadian rhythms are endogenous, entrainable biological processes that oscillate in a 24-hour period in concert with the circadian environment of the earth. Circadian rhythms can be found at an intracellular level and have the ability to impact all aspects of metabolism (11). The mammalian circadian rhythm is orchestrated by a master clock, located in the suprachiasmatic nucleus (SCN) of the hypothalamus (12). The master clock follows the 24-hour light-dark cycle (the diurnal cycle) and coordinates the release of neurotransmitters such as serotonin and norepinephrine. Serotonin and norepinephrine are present at higher levels during wakefulness, while melatonin peaks during the night, regardless of the diurnal or nocturnal sleep cycles across species… The peripheral circadian clock is a system of organs within the body which collect environmental and internal signals in order to direct the expression of circadian clock genes
And then we read:
“food intake can disassociate peripheral clock periodicity from the master clock; when this happens, greater immune system activation and metabolic dysfunction occur”
“Dysbiosis and metabolic consequences resulting from circadian clock disruption may be due to increased permeability of the intestinal epithelial barrier “
“gut microbial metabolites such as the short-chain fatty acids butyrate and acetate may influence clock gene expression“
“Leone et al. found that a lack of gut microbiota, and consequently a deficit of microbial metabolites, resulted in markedly impaired central and hepatic circadian clock gene expression (40), suggesting the possibility that gut microbiota play a role in propagating circadian rhythm at the molecular level”
“Serotonin deficiency elicits the loss of the circadian sleep-wake rhythm”
“The microbes of the gastrointestinal tract exhibit circadian rhythm, and their composition oscillates in response to the daily feeding/fasting schedule.
The characteristics of the gastrointestinal microbiome and metabolism are related to the host’s sleep and circadian rhythm. Moreover, emotion and physiological stress can also affect the composition of the gut microorganisms. The gut microbiome and inflammation may be linked to sleep loss, circadian misalignment, affective disorders, and metabolic disease.
On the other hand, peripheral clocks are found in the nucleus of almost every single cell (eg, enterocyte, hepatocyte, myocyte, adipocyte), and they show circadian rhythms and oscillations that are dependent and independent of the circadian rhythms from the master clock. While the master clock responds mainly to light/dark cycle, peripheral clocks respond to other zeitgebers (eg, temperature, diet, timing, and content of food intake), which indirectly regulate the central clock … However, Parabacteroides, Lachnospira, and Bulleida were specific to the human GI tract. Lachnospira was unique in that it was the dominant species that were affected by time and behavior (energy consumption early during the day) [114]. However, it is not fully understood why some species increase with clock time throughout the day. One of the theories is that some species are bile resistant, so they increase during the day as the food is ingested, and bile is secreted (eg, Oscillospira and )
“We found that up to 20% of all commensal species in mice and humans undergo diurnal fluctuations in their relative abundance, resulting in rhythmic changes of the entire bacterial community over the period of one day. For instance, the common mouse and human commensal genus Lactobacillus increases in relative abundance during the resting phase (the light phase in a mouse) and declines during the active phase.”
Bottom Line
Time of day, time of year, eating time and diet impact the intra-day microbiome population and thus the metabolites being produced. Some of these metabolites have been shown in recent studies to impact the circadian cycle. A few bacteria pulled from the studies cited above include:
Fusobacterium
Porphyromonas
Prevotella
Bacteroides acidifaciens
Lactobacillus reuteri
Peptococcaceae
Eggerthella
Anaerotruncus
Desulfovibrio
Roseburia
Ruminococcus
Time of year impacts the following (and may be a factor in Seasonal Affective Disorder, SAD):
Helicobacter
Bacillus
Stenotrophomonas
Proteobacteria
Lactobacillus
Romboutsia
I was unable to find any 16s clinical studies on SAD.
Advice for taking samples
For stool samples, record the day of the week, the time of day and, if female, where you are in your cycle. For best consistency (i.e. seeing what actually changed between samples), make sure all follow-up samples control for these factors as much as possible.
By same data, I mean the same FASTQ files: a detailed file of the parts of your sample returned by a 16s machine. These files are then processed through software to infer the bacteria. The result is two different reports. If you pass the same files to other providers, you will likely get even more different reports. For why, see this post from 2019, The taxonomy nightmare before Christmas…
This post is going to look at an actual example.
What a FASTQ file looks like… the letters C, G, A and T mean cytosine (C), guanine (G), adenine (A) and thymine (T), parts of DNA.
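For readers who have never opened one: a FASTQ file is plain text in which each read takes four lines (an identifier starting with `@`, the DNA letters, a `+` separator line, and a quality string of the same length as the sequence). The snippet below is a minimal sketch that reads such text; the two reads are made up for illustration and the helper name `parse_fastq` is an assumption, not part of any lab's software.

```python
from collections import Counter

# A made-up two-read FASTQ snippet for illustration (not from a real sample).
FASTQ_TEXT = """@read1
ACGTACGTGG
+
IIIIIIIIII
@read2
TTGACCA
+
FFFFFFF
"""


def parse_fastq(text):
    """Yield (read_id, sequence, quality) from FASTQ text with four lines per read."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per read")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


reads = list(parse_fastq(FASTQ_TEXT))
base_counts = Counter("".join(seq for _, seq, _ in reads))
print(len(reads), "reads;", dict(base_counts))
```

Nothing in the file names a bacterium. The names only appear after software compares these letter strings against a reference database, which is where two providers can diverge.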
Krona View
At this level, they look similar – but there is often a 25% difference between the numbers of a species.
Thryve / BiomeSight
Comparing Samples
At the class level you can see some dramatic changes in counts and percentiles. At present, I am using percentiles from aggregations of all lab sources.
When I hit 1000 samples from a specific lab, I will start doing lab-specific percentiles. The current counts are below that, so we are using an aggregate percentile across all labs.
From https://microbiomeprescription.com/Upload/Statistics
For items of concern, you can drill down manually on the bacteria, for example on Bacilli above.
You can also get the percentile that is lab specific by going to https://microbiomeprescription.com/Library/Statistics?taxon=91061 with no sample and then changing to the lab as shown below.
We find that we are at the 20%ile for BiomeSight-specific samples and the 2.4%ile for Thryve-specific samples. For explanations, you will need to put the questions to the lab; Microbiome Prescription just presents the data.
The bottom line is that you always want to use the same lab software when comparing samples, and ideally the same lab for the physical processing. Comparing the same sample processed by two different pieces of software results in interpretation challenges.
To give a more human context: take a book and ask two people to retell it aloud, one from rural Scotland (with a thick Scottish accent) and the other from Mumbai, India (with a thick Marathi accent), with a third person (a native of Bermuda) trying to recall what they heard…. There are different choices of words in the retelling, with different intonations. That is the human reality, which also applies to labs.
When this site was started, there was one dominant retail provider: uBiome. In June 2018, the first ThryveInside sample was uploaded. A year later, in May 2019, came the first American Gut sample. A year after that, in July 2020, BiomeSight samples started rolling in in significant numbers; for 10 months, BiomeSight was the most frequent upload type every month. At present, I support 8 upload types and provide an API for any lab that wishes to do a direct transfer. BiomeSight led the way here. Statistics are here for those interested.
The first three labs, uBiome, Thryve and American Gut, all used the NCBI Bacteria Taxonomy system. Its identifiers are numbers and thus easy to store in the database and economical to analyze. This is a critical foundation. There are problems with using names, because names change over time. One bacterium has 237 different names. As illustrated below, the same bacterium was discovered by many different people. Each person gave it a name and published papers using that name. In time (especially with DNA techniques) it was realized that they were all the same!!
The NCBI number is a unique identifier, just as a Social Security number is for Americans. Unfortunately, Canadians have SIN numbers, and other nations have person numbers. The same thing has happened with lab equipment. The problem is matching identities. Some non-Americans in the US are issued TIN numbers (and thus we are good for US identity); others do not have TIN numbers. A bacterium is like a person.
Case Studies With Microba and BiomeSight
Microba does not use NCBI numbers. Microba uses the Genome Taxonomy Database (GTDB, https://gtdb.ecogenomic.org/) for taxonomic classification. The question arises: who attempts the mapping of the GTDB identifiers to NCBI? Microba, Microbiome Prescription or no one?
With cooperation from them (namely, they provided a reasonably complete list of the GTDB identifiers that they used), I was able to create a mapping table between those names and NCBI numbers that was not 100% complete, but sufficient to give meaningful results.
BiomeSight.com added the NCBI numbers to their database. I always prefer the lab to take ownership of the mapping, since there can be many nuances specific to the lab equipment that they are using.
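A mapping table of this kind can be pictured as a simple lookup from a lab's own names to NCBI numbers, with anything unmatched set aside. The sketch below is illustrative only and is not the site's code: the function name `map_upload` is hypothetical, merging counts for aliases is an assumption, and every name and number in the table is a placeholder except 91061, which this post's own statistics URL uses for Bacilli.

```python
# Hypothetical mapping table: lab-specific name -> NCBI taxon number.
# Only 91061 (Bacilli) comes from the post's own URL; the rest are placeholders.
NAME_TO_NCBI = {
    "Bacilli": 91061,
    "Example_genus_A": 900001,        # placeholder, not a real NCBI number
    "Example_genus_A_alias": 900001,  # two names can point at the same taxon
}


def map_upload(rows, name_to_ncbi):
    """Split (name, count) rows into counts keyed by NCBI number and a list of unmapped names."""
    mapped, unmapped = {}, []
    for name, count in rows:
        taxon = name_to_ncbi.get(name)
        if taxon is None:
            unmapped.append(name)  # kept for review rather than silently dropped
        else:
            # Assumption: counts reported under aliases of one taxon are added together.
            mapped[taxon] = mapped.get(taxon, 0) + count
    return mapped, unmapped


upload = [
    ("Bacilli", 1200),
    ("Example_genus_A", 300),
    ("Example_genus_A_alias", 50),
    ("Unknown_taxon_X", 10),
]
mapped, unmapped = map_upload(upload, NAME_TO_NCBI)
print(mapped, unmapped)
```

The unmapped list is the "not 100%" part: it shows exactly which names still need a decision from the lab or the site.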
Popular Medical Tests that cannot be added to the data
There are two main reasons that these cannot be added:
They only measure selected bacteria (see below)
Their unit of measure is different. One counts the number of hex nuts in a mixture of 1000 nuts; the other counts the number of packages of hex nuts (with a different number of nuts per package) in a carton of nuts. They are simply too different.
Lab Name | Bacteria Reported
Bioscreen (cfu/gm) | 17
Biovis Microbiome Plus (cfu/g) | 40
DayTwo | 76
Diagnostic Solution GI-Map (cfu/gm) | 34
GanzImmun Diagnostic A6 (cfu/gm) | 72
GanzImmun Diagnostics AG Befundbericht | 25
Genova Gi Effects (cfu/g) | 28
Genova Parasitology (cfu/g) | 7
InVitaLab (cfu/gm) | 23
Kyber Kompakt (cfu/g) | 11
Medivere: Darm Mikrobiom Stuhltest (16s Limited) | 16
Medivere: Darm Magen Diagnostik (16s Limited) | 16
Medivere: Gesundheitscheck Darm (16s Limited) | 17
Metagenomics Stool (De Meirleir) (16s Limited) | 53
Smart Gut (ubiome 16s – Limited Taxonomy) | 23
Verisana (cfu/ml) aka (kbe/ml) | 11
Viome (No objective measures) | 29
For these tests, users must transcribe whether the test indicated too high (↑) or too low (↓) levels. I give the ability to indicate how much…
How the labs represent results varies greatly. Their units are not compatible.
Suggestions are based on these rough values and use the same logic. A key limitation is that their normal ranges are likely computed assuming a bell curve rather than Kaltoft-Moldrup ranges. You may be acting on items that are actually within the typical ranges seen.
Issue of Missing Hierarchical Layers
If you look at “My Biome View” on Microbiome Prescription, you will see the hierarchy (per NCBI). Most labs do not give the full hierarchy in their reports. Often they will skip layers. The clearest example is Microba. They provide information in only 4 files.
But when this upload is viewed, you see all of the levels!
My Biome View
A more extreme example is CosmosID’s PDF files, where they only list the species and strains!
The user who submitted this would see the following My Biome View…
Microbiome Prescription “completes” the data by summing up each level into the level above if it is missing. So I sum the counts of all of the species in a genus to get the genus count if it was missing from the upload. There is an unfortunate gotcha: you may have 8000 in a genus while the sum of the species is 6000. If the lab provided the genus count, then we are good, with no need to create a record with 6000. If we must create this level, then we are missing 2000 and the higher levels are underreporting!
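The completion step can be sketched as follows, using the 8000 versus 6000 numbers from the paragraph above. This is an illustration under stated assumptions, not the site's actual code: the taxon names, the `parent_of` dictionary and the function names are all hypothetical.

```python
def depth(taxon, parent_of):
    """Number of ancestors above a taxon in the hierarchy."""
    d = 0
    while taxon in parent_of:
        taxon = parent_of[taxon]
        d += 1
    return d


def complete_missing_levels(counts, parent_of):
    """Create a record for each missing ancestor as the sum of its children; keep lab-supplied values."""
    completed = dict(counts)
    ancestors = set()
    for taxon in counts:
        while taxon in parent_of:
            taxon = parent_of[taxon]
            ancestors.add(taxon)
    # Deepest levels first, so a created genus can feed a created family.
    for node in sorted(ancestors, key=lambda t: depth(t, parent_of), reverse=True):
        if node in counts:
            continue  # the lab supplied this level; its number is kept as-is
        completed[node] = sum(
            c for child, c in completed.items() if parent_of.get(child) == node
        )
    return completed


# Hypothetical three-level hierarchy: two species -> one genus -> one family.
parent_of = {"species_1": "genus_G", "species_2": "genus_G", "genus_G": "family_F"}

without_genus = complete_missing_levels({"species_1": 4000, "species_2": 2000}, parent_of)
with_genus = complete_missing_levels(
    {"species_1": 4000, "species_2": 2000, "genus_G": 8000}, parent_of
)
print(without_genus["genus_G"], with_genus["genus_G"])
```

In the first call the genus has to be created and comes out at 6000, so the family above it is also 6000; in the second the lab's 8000 is kept and flows upward. The 2000 difference is the underreporting described above.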
This issue is also seen in some lab results. They scale the numbers so that the species that they report add up to the count for the genus. What they do not report on is dropped from all of the parent levels.
When you use the Krona Chart, if there is no “unknown” section, then ignoring the not-identified bacteria is a possible issue with the lab results. You can also check this on the My Biome View by comparing the number for the parent to the sum of the children; if they always match, then assume that the not-identified bacteria are ignored.
Illustrates when the not identified is shown on a Krona Chart
Inconsistent Numbers
Above we have the case of the genus count being more than the sum of its species. This is a good state, because the numbers are more accurate: the unidentified bacteria are identified at least at the genus level.
I have also found cases where the sum of the species exceeds the genus. This can legitimately happen when alternative hierarchies are used. It becomes a problem when we attempt to keep everything in one hierarchy (“There can only be one!”)
From TV Series Highlander.
As a result, if the sum of the species (using the NCBI hierarchy) exceeds the genus, then we update the genus number for consistency (if we do not do that, then Krona charts can look bizarre, which a user emailed about).
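The consistency rule just described (never let a parent be smaller than the sum of its children) can be sketched like this. It is a hedged illustration, not the site's code: the names are hypothetical, and carrying the correction further up the hierarchy is an assumption, since the post only mentions the genus level.

```python
def depth(taxon, parent_of):
    """Number of ancestors above a taxon in the hierarchy."""
    d = 0
    while taxon in parent_of:
        taxon = parent_of[taxon]
        d += 1
    return d


def raise_parents_to_child_sum(counts, parent_of):
    """Return a copy of counts where any parent smaller than the sum of its children is raised to that sum."""
    fixed = dict(counts)
    parents = {parent_of[t] for t in fixed if t in parent_of and parent_of[t] in fixed}
    # Deepest parents first, so a corrected genus is seen by its family (an assumption).
    for node in sorted(parents, key=lambda t: depth(t, parent_of), reverse=True):
        child_sum = sum(c for child, c in fixed.items() if parent_of.get(child) == node)
        if child_sum > fixed[node]:
            fixed[node] = child_sum
    return fixed


# Hypothetical hierarchy and counts for illustration.
parent_of = {"species_1": "genus_G", "species_2": "genus_G", "genus_G": "family_F"}
inconsistent = {"family_F": 5500, "genus_G": 5000, "species_1": 4000, "species_2": 2000}
fixed = raise_parents_to_child_sum(inconsistent, parent_of)
print(fixed)
```

A genus that is already larger than the sum of its species is left alone, which matches the "good state" described earlier.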
Bottom Line
“Different strokes for different folks” is the problem. In accepting data from 9 different sources, I need to harmonize. The key that I play in is NCBI. This is a huge benefit because it is used with KEGG: Kyoto Encyclopedia of Genes and Genomes, which really enhances analysis.
Right Solution
It is simple: the labs should add to their websites pages equivalent to those seen on Microbiome Prescription, but using only their own lab results. If their staff lacks the skills, I am a professional developer and can be contracted to do a lot of the backend coding (at my usual commercial rates).
If you wish to be proactive:
Verify that every bacteria shown on My Biome View is shown on the lab results page. If it is not, they are skipping elements of the hierarchy.
Verify that the counts agree; if not, look at what is added up.
Contact the provider and ask for automatic transfer to be implemented. Code-wise it is very simple, a few hours of work at most for most developers. What is needed is documented here, including a test site!
I cannot fix the root issue — inconsistent data. You are their customers and by being vocal, you can make a difference. If the upload is correct and complete — I make no modifications, it is only for problematic uploads.
In an earlier post, I illustrated the problem of whether L. reuteri produces histamine. The answer is “Not sufficient information to answer”; why is shown below. It depends on which strain you have! The source (human/not human) is not sufficient. In general, the probiotic species is insufficient to answer the histamine question.
Individual strains of Lactobacillus paracasei differentially inhibit human basophil and mouse mast cell activation [2016] “Thus, L. paracasei CNCM I‐1518 could not only inhibit mouse mast cell and human basophil activation [19], but also protect mice from Salmonella Typhimurium infection [31], induce regulatory T cells in skin inflammation model [32], and improve allergic rhinitis in children [33]. These studies, which focused on one or a few strains of bacteria, did not permit an accurate comparison of the effects of different bacterial strains.“
Lactobacillus Casei and Paracasei
In many studies, these are reported to reduce hay fever and allergies. If you check the web, you will find that they are also reported as histamine producers. How can this be true, since increased histamine would make allergies worse? The answer is simple: BOTH ARE CORRECT when you look at the fine print… (and you need the fine print, which may be missing on the probiotic label).
“histamine production were found in … Lactobacillus casei 18, isolated from cider)” [2013]
“According to the results, Lactobacillus casei CCDM 198 exhibited the best degradation abilities…. significantly (P < 0.05) reduced BAs (putrescine, histamine, tyramine, cadaverine), up to 25% decline in 48 h.” [2020]
” Lb. paracasei subsp. paracasei CB9CT and another strain (CACIO6CT) of the same species that was able to degrade all the BAs were singly used as adjunct starters for decreasing the concentration of histamine ” [2016]
“Seventeen isolates were found that were able to degrade tyramine and histamine in broth culture. All 17 isolates were identified by 16S rRNA sequencing as belonging to Lactobacillus casei.” [2012]
For Lb. casei and Lb. paracasei, most of the studies suggest that they degrade histamine.
Worked Example
We use L. casei and L. paracasei from Custom Probiotics for two main reasons: they are the cheapest per BCFU, and they have no fillers, prebiotics, etc., so we do not have to deal with the counter-indicated formulations that often occur in commercial probiotic blends (which often use a marketing-driven formulation).
Looking up the strains, I see Lc-11 and Lpc-37. Time to search for information on these:
Obesity and gut microbiome: review of potential role of probiotics [2021] “In obese women, the administration of a probiotic mix composed of L acidophilus LA-14, L casei LC-11, L lactis LL-23, and B bifidum BB–06, improved BW composition after an 8-week supplementation period. A”
Know every strain in your probiotics (not just species!)
You need to be able to locate studies using that strain (Lb. casei Snakeoil may be just a marketing name to hide the fact that the manufacturer/packager does not know the strain)
I have seen some product literature claiming benefits from a different strain because they were the same species: FALSE LOGIC.
Ideally, you will find some relevant studies using these strains, ideally on humans!
If you are using antibiotics, you may wish to search for the probiotic’s antibiogram. Ideally, the manufacturer/seller would provide all of that information in response to a simple email requesting it.
This is an area that I became aware of a decade ago, and I used this knowledge to discard some suggestions and take others, based on the physical characteristics of the brain. This has major implications for brain fog, autism, ME/CFS, depression, Alzheimer’s, Long Haul Covid and many, many more conditions.
The literature
“Research on disease-modifying treatments for central nervous system [CNS] diseases have generated a cemetery of failed drugs, rejected in part because of their incapacity to cross the BBB” [3,4,5,6,7].
The Molecular Weight [MW] threshold of BBB drug transport of small molecules has been demonstrated previously in studies of drug penetration into the brain [33, 34]. Blood–brain barrier permeation decreases 100-fold when the size of the drug is increased from an MW of 300 Da, which corresponds to a surface area of 50 square angstroms, to an MW of 450 Da, which corresponds to a surface area of 100 square angstroms [35].
I remember that for my treatment of ME/CFS back in 2000, Low Molecular Weight Heparin was what was advocated. It was found to be far more effective than normal heparin (and cost 30+ times more!!)
For some items (lacking the specific chemical formula) we may not get that data.
Bottom Line
The concept of “needing to take anti-inflammatories for brain inflammation” is correct. The gotcha is that many of the items you may take will never reach the brain because they are too big! Go through your lists (and suggestions from others) and trim them down to the low molecular weight ones. You will likely get much better success. I did!
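Trimming a list by molecular weight is a simple filter once you have looked up each item's weight. The sketch below is illustrative only: the item names and weights are placeholders (not real compounds), the function name is hypothetical, and the 450 Da default is merely the upper figure from the study quoted above, used here as an assumed cutoff and not a medical threshold.

```python
def split_by_molecular_weight(items, cutoff_da=450.0):
    """Split (name, molecular weight in Da) pairs into those at or below the cutoff and those above it."""
    keep = sorted(((n, mw) for n, mw in items if mw <= cutoff_da), key=lambda x: x[1])
    drop = sorted(((n, mw) for n, mw in items if mw > cutoff_da), key=lambda x: x[1])
    return keep, drop


# Placeholder names and weights for illustration; look up real values for your own list.
candidates = [
    ("item_a", 180.2),
    ("item_b", 420.0),
    ("item_c", 610.7),
    ("item_d", 1355.4),
]

keep, drop = split_by_molecular_weight(candidates)
print("keep:", [name for name, _ in keep])
print("drop:", [name for name, _ in drop])
```

Items without a known chemical formula simply cannot be placed in either list, which is the limitation noted above.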
I’ve recently added computations for Methane and Hydrogen using KEGG data to Microbiome Prescription. Checking the contributed symptoms, I had over 120 samples with SIBO indicated. So it was time to test the hypothesis.
Charts
15% was 90%ile or over, 10% was expected
Poorer match: 5% was over 90%ile, 10% expected
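Whether "15% at or over the 90%ile when 10% is expected" is more than chance can be checked with a one-sided binomial tail. This is a minimal sketch under stated assumptions: the sample size of 120 is an assumption (the post only says "over 120"), and treating each sample as an independent draw with a 10% chance is a simplification.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_samples = 120                      # assumption: the post only says "over 120"
observed = round(0.15 * n_samples)   # 15% of the samples at or over the 90%ile
p_value = binomial_upper_tail(observed, n_samples, 0.10)
print(f"{observed} of {n_samples} at or over the 90%ile; chance probability = {p_value:.3f}")
```

A small probability would support a real shift; a large one means the chart is compatible with noise. With many charts being examined, some will look interesting by chance alone, so a single result should be read cautiously.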
What about the old Methane?
The old computation was based on associations gathered ad hoc from the literature. It also did not show any patterns.
Bottom Line
With the obvious path being unsuccessful, it is time to examine where we did find associations.
ProductName | Pattern
(4R,5S)-4,5,6-trihydroxy-2,3-dioxohexanoate | Between 33%ile to 66%ile
cob(II)yrinate a,c diamide | Between 66%ile to 100%ile
D-tagatose | Between 0%ile to 33%ile
undecaprenyl phosphate | Between 0%ile to 33%ile

EndProduct | Pattern
Vitamin B9 (Folic Acid/Folate) | Between 66%ile to 100%ile
Lactic acid | Between 0%ile to 33%ile
2-Butanone | Between 66%ile to 100%ile
Hydrogen cyanide | Between 66%ile to 100%ile
Methyl thiocyanide | Between 66%ile to 100%ile
Propionate | Between 66%ile to 100%ile
Vitamin K | Between 66%ile to 100%ile
Vitamin B7 (biotin) | Between 66%ile to 100%ile
Sialic acid | Between 66%ile to 100%ile
Norepinephrine | Between 66%ile to 100%ile

EnzymeName | Pattern
succinyl-CoA:acetate CoA-transferase | Between 33%ile to 66%ile
phosphoenolpyruvate carboxykinase (GTP) | Between 33%ile to 66%ile
Enzymes are a wash: all are focused on typical values.
We have some agreement: Bacteroidia is HIGH above and Firmicutes is low.
Looking at the “usual suspect” for SIBO, Methanobrevibacter smithii, a methane producer, we found only 23 samples with any. All of the labs associated with these samples report this bacteria, so over 100 (80%) of the people reporting SIBO had none appearing.
There is no clear association of this bacteria to SIBO in our samples.
Tentative conclusion: SIBO does not leave any clear tracks in a 16s sample.
It is that season again, and some areas are reporting much higher levels than usual (with predictions of it getting worse). Some people will load themselves up daily on antihistamines, for example Diphenhydramine HCl, which impacts over 800 different bacteria. We do have a profile of the bacteria shifts seen on Microbiome Prescription.
Supplements
The bad news is that we have lots of studies, but no good studies — all of them have problems…
” A total of 57 062 articles were derived from searching seven online databases and evidence from 48 RCTs and 10 observational studies were reviewed for methodological quality and risk of bias. No qualitative studies meeting the inclusion criteria could be found, therefore only a quantitative review was performed. ”
“This MR study found no evidence supporting a causal association between serum 25(OH)D levels and risk of AR, AS and NAR in European-ancestry population. ” [2021]
“Probiotics may be beneficial in improving symptoms and quality of life in patients with allergic rhinitis; however, current evidence remains limited due to study heterogeneity and variable outcome measures. Additional high-quality studies are needed to establish appropriate recommendations.”
Probiotic Potential of Lactobacillus Species in Allergic Rhinitis [2021 – full text] is a recent review, with two species appearing most likely to be effective (to some extent): Lactobacillus casei and the closely related Lactobacillus paracasei, with dosages up to 30 billion CFU/day. These happen to be less than the suggested dosages from Custom Probiotics.
Bifidobacterium longum produces a rich collection of end products (1,380), the absence of which may account for hay fever.
Bottom Line
I suspect that placebo effect and poor study construction have resulted in the fuzziness for supplements and lactobacillus probiotics. The National Library of Medicine profiles point to some specific bacteria that are low, and the available studies appear to suggest that taking those bacteria as probiotics will significantly improve hay fever. The list is:
Adequate Vitamin D and E supplementation may also help. I use the word adequate because the dosages suggested in some studies are often insufficient to alter blood levels by any reasonable amount in a month (see this post for a formula), hence “no effect”.
There is one more path to consider: getting suggestions explicitly for the shifts reported.
We see L. casei, L. paracasei and Clostridium butyricum on the recommended list, in agreement with the above. Further down, we see Selenium (cited above) also listed.
The above is evidence based on the microbiome shifts seen with hay fever.
This morning I chatted 90 minutes with another data scientist about the microbiome. After the video chat he sent me a link to this recent article: From IBS to ME – The dysbiotic march hypothesis [2020]
” The pathogenesis of the relationship is unknown. Intestinal dysbiosis may be a common abnormality, but based on 1100 consecutive IBS patients examined over a nine years period, we hypothesize that the development of the disease, often from IBS to ME, actually manifests a “dysbiotic march”. In analogy with “the atopic march” in allergic diseases, we suggest “a dysbiotic march” in IBS; initiated by extensive use of antibiotics during childhood, often before school age. Various abdominal complaints including IBS may develop soon thereafter, while systemic symptom like CFS, fibromyalgia and ME may appear years later.”
Microbial transitions from health to disease [2021] “dysbiotic microbial populations may be important in the development of approaches to prevent the progression of disease and to restore health in diseased individuals.”
Personally, I have seen someone progress from GERD to IBS to Chronic Fatigue Syndrome to atypical Crohn’s disease. The progression is not deterministic, with DNA being a significant factor.
Creating a microbiome progression map has been on my to-do list. I have just added it (based solely on gold-standard PubMed studies). It can be seen via https://microbiomeprescription.com/Library/PubMed
When you click the crystal ball, you will be taken to a new page. For example, IBS:
Associated medical conditions to IBS
For Chronic Fatigue Syndrome
For Autism
Bottom Line
This is based on PubMed studies which are often hit and miss for depth of analysis and reporting shifts. Over time, I expect data to improve and the forecasts on this page to improve.