For the last of this weekend's four installments, I will look at computing statistics. I have found that non-parametric analysis gives more interesting results, but classic core statistics are also helpful for understanding the nature of each taxonomic layer.
Data Schema Update
We are dealing with this section of the schema, which has another table added.

For those who are not statisticians, terms like kurtosis may sound like a medical condition rather than a statistical measure. I have enclosed links to Wikipedia articles on each of these items below.
The code sample does just one LabTest and one quantile. For a production implementation, I would iterate over
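To make these terms a little less abstract, here is a minimal Python sketch of the classic measures. It is not the code from the repository; the function name core_statistics is hypothetical, and it assumes population moments (dividing by n), which may differ from the convention used in the actual implementation.

```python
import math


def core_statistics(values):
    """Return count, mean, standard deviation, skewness, and excess kurtosis.

    Assumption: population (divide-by-n) moments are used throughout.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical, so skewness and kurtosis are undefined")
    return {
        "count": n,
        "mean": mean,
        "std_dev": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        # Excess kurtosis: 0 for a normal distribution.
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }
```

Skewness describes how lopsided the distribution is, and excess kurtosis describes how heavy its tails are compared with a normal distribution.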
Select LabTestId from LabTests
And quantiles = {3,4,5,6..}
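As a rough illustration of that loop, here is a Python sketch rather than the repository's code. It assumes the counts for each LabTestId have already been loaded into a dictionary; the names quantile_cut_points and counts_by_lab_test are hypothetical, and the default quantile sizes only cover the values listed above.

```python
from statistics import quantiles


def quantile_cut_points(counts_by_lab_test, quantile_sizes=(3, 4, 5, 6)):
    """Compute cut points for every (LabTestId, quantile size) pair.

    counts_by_lab_test: dict mapping a LabTestId to a list of counts
    (hypothetical in-memory stand-in for the database query).
    """
    results = {}
    for lab_test_id, counts in counts_by_lab_test.items():
        # Skip lab tests with too few samples to form cut points.
        if len(counts) < 2:
            continue
        for n in quantile_sizes:
            # quantiles() returns n - 1 cut points dividing the data into n groups.
            results[(lab_test_id, n)] = quantiles(counts, n=n)
    return results
```

Each entry holds the boundaries between groups, so a quantile size of 4 yields three cut points.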
The code is at:
 https://github.com/Lassesen/Microbiome2/tree/master/TaxonomyStatistics 
Remember to update the database schema too.
Bottom Line
That’s it for this week's installments. The homework on this one is to decide whether you should include zero (none-found) values or not. This is actually a complex question, and the answer may depend on what percentage of samples have a specific taxonomic unit.
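One way to start on the homework is to compute the same statistic both ways and see how far apart the answers are. The Python sketch below is only an illustration; the function name compare_zero_handling is hypothetical, and it assumes that a count of zero means the taxonomic unit was not found in that sample.

```python
from statistics import mean


def compare_zero_handling(counts):
    """Compare the mean with and without zero (none-found) values."""
    if not counts:
        raise ValueError("need at least one sample")
    non_zero = [c for c in counts if c > 0]
    return {
        # Fraction of samples in which the taxonomic unit was found.
        "prevalence": len(non_zero) / len(counts),
        "mean_with_zeros": mean(counts),
        # None when the taxonomic unit was never found.
        "mean_without_zeros": mean(non_zero) if non_zero else None,
    }
```

When the prevalence is low, the two means diverge sharply, which is why the choice matters more for rare taxonomic units.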