"I would be interested to see how the three separate consensus suggestions compare (i.e. not doing the uber consensus). Do the top takes & avoids match across the different labs, or are they different? Because if they are different then the algorithm is not robust to changes in lab."
Request from a reader.
This is part of a series of posts:
- Comparing Microbiomes from Three Different Providers – Part 1.
- Ombre Suggestions Analysis – Failing Grade – 2A
- Biomesight Suggestions Analysis – Good Results – 2B
- Thorne Suggestions Analysis – INCOMPLETE / FAILED – 2C
- Microbiome Prescription Uber-Consensus Analysis – Excellent – 2D
Using the same data, I join each pair of labs' suggestion lists on the substance and check whether every substance that appears in both lists gets the same recommendation (take or avoid) or a different one. In pseudo SQL:
Select Percent(A.Take=B.Take) from Suggestions1 A Join Suggestions2 B on A.substance=B.substance
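The pseudo SQL above can be turned into something runnable. What follows is only a minimal sketch, not the code actually used for this post: it assumes each lab's suggestions are available as `(substance, take)` pairs with `take` stored as 1 (take) or 0 (avoid), and the function name `agreement_percent` and the sample rows are made up for illustration. `Percent(...)` is not a real SQL function, so the sketch averages the 0/1 result of the comparison instead, which SQLite allows.

```python
import sqlite3


def agreement_percent(rows_a, rows_b):
    """Return (shared_items, percent_agreement) for two labs' suggestions.

    rows_a, rows_b: lists of (substance, take) tuples, where take is
    1 for "take" and 0 for "avoid" (an assumed encoding).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Suggestions1 (substance TEXT PRIMARY KEY, take INTEGER)")
    con.execute("CREATE TABLE Suggestions2 (substance TEXT PRIMARY KEY, take INTEGER)")
    con.executemany("INSERT INTO Suggestions1 VALUES (?, ?)", rows_a)
    con.executemany("INSERT INTO Suggestions2 VALUES (?, ?)", rows_b)
    # The inner join keeps only substances suggested by both labs.
    # (A.take = B.take) evaluates to 1 or 0 in SQLite, so AVG gives the
    # fraction of shared substances with the same recommendation.
    items, pct = con.execute(
        """
        SELECT COUNT(*), 100.0 * AVG(A.take = B.take)
        FROM Suggestions1 A
        JOIN Suggestions2 B ON A.substance = B.substance
        """
    ).fetchone()
    con.close()
    return items, pct


# Made-up rows, purely for illustration (not data from this post).
lab_a = [("soy", 1), ("sugar", 0), ("barley", 1)]
lab_b = [("soy", 1), ("sugar", 1), ("oats", 0)]
print(agreement_percent(lab_a, lab_b))
```

Only substances present in both lists are counted, which is why the item counts differ slightly from one lab pair to the next.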
The results actually surprised me!
| Lab Comparison | Items | Agreement | Avg Difference |
|---|---|---|---|
| Ombre vs Biomesight | 1705 | 100% | 52 |
| Ombre vs Thorne | 1706 | 100% | 100 |
| Biomesight vs Thorne | 1694 | 100% | 54 |
My expectation was somewhere between 80% and 90%, the same range that I got doing cross-validation. The priority and weight differ from lab to lab, but the take-or-avoid decision is the same. The average difference between these pseudo values was also calculated and added to the table above. For example, Magic Soy may be 430 on Ombre, 330 on Thorne, and 530 on Biomesight.
Conclusion: the algorithm is more robust than I expected!
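To make the "Avg Difference" column concrete, here is a small sketch of how such a number could be computed. It assumes the column is the mean absolute difference in weight over the substances both labs share; the post does not spell out the exact formula, so treat that as an assumption. The function name `average_difference` is hypothetical, and the only values used are the illustrative Magic Soy numbers from the paragraph above.

```python
def average_difference(weights_a, weights_b):
    """Mean absolute weight difference over substances present in both labs.

    weights_a, weights_b: dicts mapping substance name -> weight.
    Returns None when the two labs share no substances.
    Assumption: "difference" means the absolute difference of the weights.
    """
    shared = weights_a.keys() & weights_b.keys()
    if not shared:
        return None
    total = sum(abs(weights_a[s] - weights_b[s]) for s in shared)
    return total / len(shared)


# Illustrative values taken from the Magic Soy example in the text.
ombre = {"Magic Soy": 430}
thorne = {"Magic Soy": 330}
biomesight = {"Magic Soy": 530}

print(average_difference(ombre, thorne))
print(average_difference(ombre, biomesight))
print(average_difference(biomesight, thorne))
```

With a single shared item the average is simply that item's difference; over the real lists it would be averaged across all shared substances.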
Caveat: This was done using the “Just give me suggestions” collection of algorithms on each lab’s data. Disagreements are definitely expected when the bacteria selection is “over-focused” and does not include the holistic picture of the microbiome.