This post started out seeking to confirm or debunk the claim located here.
The method is simple because we have a continuous stream of samples: from before COVID, from before the COVID vaccines, and from after most people uploading samples would have been vaccinated. If the claimed massive change were happening, the pre-COVID bifidobacterium counts (by lab) would be much higher than the post-vaccination counts.
My results: there was no statistically significant difference between the averages:
- Pre 2020-01-01: Average Count 20380 on 118 samples, Std Dev 98300
- Post 2022-06-01: Average Count 26111 on 406 samples, Std Dev 72700
That is a 28% increase, where the talk above predicted a decrease.
My data is open, so you can pull it and check the calculations:
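For readers who want to check the significance claim from the summary numbers alone, here is a minimal sketch. It assumes Welch's unequal-variance t-test is an appropriate comparison and uses a normal approximation for the p-value (reasonable at these sample sizes, though the counts are heavily skewed). This is not necessarily the test I ran, and the function names are made up for illustration.

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary stats."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1's mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2's mean
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def two_sided_p_normal(t):
    """Two-sided p-value using a normal approximation (an assumption; fine for large df)."""
    return math.erfc(abs(t) / math.sqrt(2))

# Summary statistics quoted above
pre = {"mean": 20380, "sd": 98300, "n": 118}   # Pre 2020-01-01
post = {"mean": 26111, "sd": 72700, "n": 406}  # Post 2022-06-01

t, df = welch_t_from_summary(pre["mean"], pre["sd"], pre["n"],
                             post["mean"], post["sd"], post["n"])
p = two_sided_p_normal(t)
pct_change = (post["mean"] - pre["mean"]) / pre["mean"] * 100

print(f"t = {t:.2f}, df ~ {df:.0f}, approx p = {p:.2f}, change = {pct_change:.0f}%")
```

The t statistic comes out well below the usual 1.96 cutoff, which matches the "no significant difference" result despite the 28% gap between the averages.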
Volatility of Numbers
I was also curious whether there was any apparent month-by-month pattern, so I pulled the statistics for bifidobacterium, shown below. The result is illuminating to a statistician like me, but perhaps confusing or concerning to people with a poor understanding of statistics (who would expect the numbers to be similar from month to month).
|Year||Month||Average||Std Dev||Obs||Average||Std Dev||Obs|
My conclusion is that you need three things to get good results:
- All of the samples should be processed by the same lab at the same time. Different batches of reagents may cause different results.
- You need good sample sizes, at least 100
- You need to be very careful not to cherry-pick data (example below)
An example from the Thryve/Ombre data above: with a sample size of 30, the average was 101600. Later, a sample size of 21 gave just 3186. A cherry-picked conclusion: going back to school caused family bifidobacterium to tank!
On the sample size of 100 issue:
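To see why averages from 21 or 30 samples swing so wildly, here is a rough sketch of the standard error of the mean at different sample sizes. It borrows the post-2022 standard deviation (72700) from the overall numbers as a stand-in, which is an assumption. The actual counts are heavily skewed, so the "95% range" is only a ballpark.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)

# Assumption: reuse the post-2022 standard deviation as a typical spread.
sd = 72700

for n in (21, 30, 100, 406):
    se = standard_error(sd, n)
    # 1.96 * SE is the usual normal-approximation 95% half-width
    print(f"n={n:>3}: standard error ~ {se:,.0f}, rough 95% range ~ +/- {1.96 * se:,.0f}")
```

With only 21 to 30 samples, the uncertainty on a monthly average is of the same order as the average itself, so a jump from 101600 to 3186 is not surprising.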