Technical Note: Metabolite estimates from the Microbiome

A colleague gave me a list of bacteria that studies report as producing various metabolites. The question arises: how accurate are estimates made from such a list? I am going to focus on one metabolite: methane. The reason is simple: the microbiome data set that I have access to includes self-reported SIBO, which is typically associated with an excess of methane.

I intend to do a three-way analysis:

  • Using the list of well-documented bacteria he provided
  • Using data from KEGG (Kyoto Encyclopedia of Genes and Genomes)
  • Using self-reported data

This is a rough, back-of-the-napkin analysis to explore the issue.

The methane list from studies is very short:

  • Methanobrevibacter smithii
  • Methanosphaera stadtmanae

From KEGG, we use compound C01438 (CH4, methane), which gives a list of 622 taxa capable of producing it. A few examples are:

  • Serratia sp. AS12
  • Serratia sp. AS13
  • Sinorhizobium sp. CCBAU 05631
  • Bradyrhizobium arachidis
  • Cytobacillus kochii
  • Cupriavidus sp. USMAA2-4
  • Halarcobacter anaerophilus

We have 88 samples annotated with SIBO out of a total population of 2461 samples. In the table below, counts are in cells per million detected.

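| Method            | Average | Std Dev | Incidence (% with some) |
|-------------------|---------|---------|-------------------------|
| Studies           | 1378    | 4416    | 24%                     |
| KEGG              | 3432    | 10264   | 98%                     |
| Studies with SIBO | 709     | 1753    | 29%                     |
| KEGG with SIBO    | 6901    | 36649   | 98%                     |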
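
Figures like those in the table above could be derived roughly as follows. This is a minimal sketch, not the code used for this post: the sample layout (a dict of taxon name to count per million), the function names, the toy numbers, and the choice of population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def producer_count(sample, producers):
    """Sum the per-million counts of every taxon in `sample` on the producer list."""
    return sum(count for taxon, count in sample.items() if taxon in producers)


def summarize(samples, producers):
    """Return (average, std dev, incidence %) of producer counts across samples."""
    totals = [producer_count(s, producers) for s in samples]
    # Incidence: share of samples in which at least one producer was detected.
    incidence = 100.0 * sum(1 for t in totals if t > 0) / len(totals)
    # Assumption: population standard deviation; the post does not say which was used.
    return mean(totals), pstdev(totals), incidence


# The two methane producers from the studies list.
STUDY_LIST = {"Methanobrevibacter smithii", "Methanosphaera stadtmanae"}

# Toy samples invented for illustration only (counts per million).
toy_samples = [
    {"Methanobrevibacter smithii": 1200, "Bacteroides uniformis": 50000},
    {"Bacteroides uniformis": 80000},
    {"Methanosphaera stadtmanae": 300, "Methanobrevibacter smithii": 500},
    {"Faecalibacterium prausnitzii": 90000},
]

avg, sd, inc = summarize(toy_samples, STUDY_LIST)
print(f"average={avg:.0f} std_dev={sd:.0f} incidence={inc:.0f}%")
```

Running the same function once with the studies list and once with the KEGG list, first over all samples and then over the SIBO-annotated subset, would give the four rows of the table.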

The obvious conclusion is that KEGG is clearly superior here: its average count roughly doubles for SIBO samples (3432 to 6901), while with the studies approach the average count is roughly halved (1378 to 709). Incidence of detection was unchanged with KEGG (98%). With studies it increased from 24% to 29%, but that still means only 29% of people with SIBO will have any methane estimated.
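
The doubling and halving can be checked directly from the table averages. The snippet below is just that arithmetic; the dictionary layout and the `sibo_ratio` helper are names made up for this sketch.

```python
# Averages copied from the methane table (cells per million).
methane = {
    "Studies": {"all": 1378, "sibo": 709},
    "KEGG": {"all": 3432, "sibo": 6901},
}


def sibo_ratio(row):
    """Ratio of the SIBO-sample average to the all-sample average."""
    return row["sibo"] / row["all"]


for method, row in methane.items():
    print(f"{method}: SIBO average is {sibo_ratio(row):.2f}x the overall average")
```

This gives about 0.51x for studies and about 2.01x for KEGG.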

A second gas associated with SIBO is hydrogen sulfide (C00283, H2S). Applying the same process gives:

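| Method            | Average | Std Dev | Incidence (% with some) |
|-------------------|---------|---------|-------------------------|
| Studies           | 3635    | 5533    | 92%                     |
| KEGG              | 253720  | 155882  | 100%                    |
| Studies with SIBO | 3689    | 4461    | 89%                     |
| KEGG with SIBO    | 289389  | 186070  | 100%                    |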

The results are not as dramatic as with methane. The average count was essentially unchanged with studies (3635 versus 3689) and rose 14% with KEGG. The incidence rate went down slightly with studies (92% to 89%).
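
The hydrogen sulfide comparison can be reproduced from the table values in a few lines. Again, this is only the arithmetic behind the paragraph above; the dictionary keys and the `percent_change` helper are illustrative names, not part of any existing tool.

```python
# Values copied from the hydrogen sulfide table.
h2s = {
    "Studies": {"avg_all": 3635, "avg_sibo": 3689, "inc_all": 92, "inc_sibo": 89},
    "KEGG": {"avg_all": 253720, "avg_sibo": 289389, "inc_all": 100, "inc_sibo": 100},
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


for method, row in h2s.items():
    change = percent_change(row["avg_all"], row["avg_sibo"])
    points = row["inc_sibo"] - row["inc_all"]  # change in percentage points
    print(f"{method}: average {change:+.1f}%, incidence {points:+d} points")
```

The studies average moves by about 1.5% with incidence down 3 points, while the KEGG average rises by about 14.1% with incidence unchanged.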

Bottom Line

This drill-down suggests that I made the right decision to deprecate the computation of metabolites from study data and shift to using KEGG data.

The question raised by this napkin computation is which process is really better. To identify that definitively, the above analysis should be repeated against actual measurements of methane and hydrogen sulfide in samples.

This is one of a continuing sequence of ad hoc analyses that try to raise questions about current processes. See Technical Notes on Microbiome Analysis.

The microbiome data is available at https://citizenscience.microbiomeprescription.com/. The lab source used for this post was “Biomesight”.