An early user of the Odds Ratio approach (see New Suggestions Approach and Ranges for Healthy and Unhealthy Bacteria) raised concerns about some of the bacteria identified.
For example, Odoribacter denticanis was present at only 0.002%, Clostridium akagii at 0.002%, and Symbiobacterium at 0.004%. The natural question is whether bacteria present at such tiny levels could have any meaningful impact. Interestingly, these values are not extreme outliers; it appears to be fairly common for these bacteria to be found at such levels.
There are several ways to approach microbiome adjustment. One is to focus on bacteria that dominate the microbiome. Another is to target bacteria with extreme values. A third is to focus on bacteria whose mechanisms of impact are known, such as those that produce metabolites linked to leaky gut.
My own approach is based on strong statistical associations. Association does not prove causation, and in microbiome research, the causal details are often not well established. My working assumption is that bacteria strongly associated with a condition are likely influencing it, perhaps through metabolites they produce or consume. If so, reducing those bacteria should reduce the metabolic effect.
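As a rough illustration of what a statistical association of this kind looks like, here is a minimal sketch of an odds ratio computed from a 2x2 table. The function name, the argument names, and all of the counts are hypothetical and made up for the example; this is the textbook formula and not necessarily the exact calculation used on this site.

```python
def odds_ratio(cases_with, cases_without, controls_with, controls_without):
    """Textbook odds ratio from a 2x2 table.

    "with" means the bacterium was found in the range of interest.
    If any cell is zero, 0.5 is added to every cell (Haldane-Anscombe
    correction) so the ratio stays defined.
    """
    cells = [cases_with, cases_without, controls_with, controls_without]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


# Hypothetical counts: 30 of 100 people with the condition have the
# bacterium in range, versus 10 of 100 people without the condition.
print(round(odds_ratio(30, 70, 10, 90), 2))
```

A value well above 1 means the bacterium shows up more often alongside the condition; it says nothing by itself about which one causes the other.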
The low-abundance dilemma
The microbiome can be thought of as a population, much like a country’s human population. If there were 989 billionaires[Forbes] in the United States, that would still be only about 0.0003% of the population. Yet few people would conclude from that alone that billionaires have little influence on the country. In practice, a very small number of highly influential actors can still shape outcomes in major ways.
The same logic applies to microbiome analysis. Low abundance does not necessarily mean low impact.
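The arithmetic behind the billionaire comparison can be checked in a couple of lines. The population figure below is an assumption (a round number of about 340 million), not a value taken from the Forbes source.

```python
US_POPULATION = 340_000_000  # assumption: rough, rounded US population
BILLIONAIRES = 989           # figure quoted in the text

share_pct = BILLIONAIRES / US_POPULATION * 100
print(f"{share_pct:.4f}% of the population")
```

That share is roughly ten times smaller than the 0.002% reported for Odoribacter denticanis above.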
The odds ratios used here are not based on an ideal dataset, but on the best data currently available. The choice is not between perfect evidence and flawed evidence; it is between using the best evidence now or waiting indefinitely for perfect data. In that sense, this is a best-effort approach grounded in the data we have rather than silence in the face of incomplete evidence.
Reader Response
I think the question is whether such low values represent an actual organism or noise. I remember in the days of uBiome, a reading of 0.001% meant only a single organism was found. uBiome actually discarded it if there was only one found. They only reported if there were two or more. Thryve, on the other hand, reported everything, which is one of the reasons they found more than uBiome.
Of course, some of these results are more than one organism, but the question still remains as to whether this is a real organism or noise. It just seems that there are a lot of variables in this analysis with big error margins, and you are compounding them by bundling them all together. The error margin in the final result is likely huge.
Resolution
This issue is handled by requiring a raw count of at least 5, which should reduce noise to an acceptable level. The dilemma of identification differences between tests remains (see this post for details), although aggregating across different tests should reduce it.
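The raw-count rule can be sketched as a simple filter. This is a minimal illustration only: the function name, the data layout (a dictionary of taxon name to raw read count), and the sample numbers are all hypothetical, and only the threshold of 5 comes from the text above.

```python
MIN_RAW_COUNT = 5  # threshold stated in the text


def filter_low_counts(raw_counts, min_count=MIN_RAW_COUNT):
    """Keep taxa whose raw count is at least min_count.

    Percentages are computed against the total of ALL reads in the
    sample, including the reads of taxa that get dropped.
    """
    total = sum(raw_counts.values())
    return {
        taxon: {"count": count, "percent": 100 * count / total}
        for taxon, count in raw_counts.items()
        if count >= min_count
    }


# Hypothetical sample with 100,000 reads in total.
sample = {
    "Taxon A": 60_000,
    "Taxon B": 39_992,
    "Taxon C": 6,  # 0.006%: tiny, but above the threshold
    "Taxon D": 2,  # 0.002%: dropped as likely noise
}

kept = filter_low_counts(sample)
for taxon, row in kept.items():
    print(f"{taxon}: {row['count']} reads, {row['percent']:.3f}%")
```

Note that the filter works on raw counts rather than percentages, so the same percentage can pass in a deeply sequenced sample and fail in a shallow one.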