Computing Statistics

For the last of this weekend's four installments, I will look at computing statistics. I have found that non-parametric analysis gives more interesting results, but classic core statistics also help in understanding the nature of each taxonomic layer.

Data Schema Update

We are dealing with this section of the schema, which has one more table added.

For those who are not statisticians, a term like kurtosis may sound like a medical condition rather than a statistical measure. I have included links to Wikipedia articles on each of these items below.
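To make the terms a little less mysterious, here is a minimal Python sketch of the classic core statistics computed from central moments. It is independent of the repository code; the function name core_statistics and the sample numbers are made up for illustration, and it uses the population (divide by n) formulas with excess kurtosis, which may not match the conventions your own tooling uses.

```python
import math


def core_statistics(values):
    """Return count, mean, standard deviation, skewness and excess kurtosis.

    Hypothetical helper for illustration only. Uses population moments
    (divide by n), not the bias-corrected sample versions.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Central moments: average of (value - mean) raised to the 2nd, 3rd, 4th power.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; skewness and kurtosis are undefined")
    return {
        "count": n,
        "mean": mean,
        "std_dev": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,        # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2 - 3.0,    # excess kurtosis: 0 for a normal distribution
    }


# Made-up values, symmetric around 3, so the skewness comes out as zero.
stats = core_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)
```

Skewness tells you whether the values lean to one side of the mean, and kurtosis tells you how heavy the tails are compared to a normal distribution.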

The code sample handles just one LabTest and one quantile. For a production implementation, I would iterate over the results of

SELECT LabTestId FROM LabTests

and over quantiles = {3,4,5,6..}
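The nested iteration described above could be sketched in Python roughly as follows. This is not the code from the repository: the function quantile_cut_points, the load_values callback, the in-memory SQLite table, and the sample values are all assumptions for illustration, and the default of 3 through 6 covers only the quantile counts explicitly listed above.

```python
import sqlite3
import statistics


def quantile_cut_points(conn, load_values, quantile_counts=range(3, 7)):
    """Compute cut points for every (LabTestId, quantile count) pair.

    load_values is a hypothetical callback that returns the list of
    numeric values for one LabTestId; how those values are stored is
    not shown here.
    """
    results = {}
    lab_test_ids = [row[0] for row in conn.execute("SELECT LabTestId FROM LabTests")]
    for lab_test_id in lab_test_ids:
        values = load_values(lab_test_id)
        if len(values) < 2:
            continue  # too few values to split into quantiles
        for q in quantile_counts:
            # statistics.quantiles returns the q - 1 cut points between groups.
            results[(lab_test_id, q)] = statistics.quantiles(values, n=q)
    return results


# Demo with an in-memory database and made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LabTests (LabTestId INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO LabTests VALUES (?)", [(1,), (2,)])
fake_values = {
    1: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    2: [10.0, 20.0, 30.0, 40.0],
}
cuts = quantile_cut_points(conn, lambda lab_test_id: fake_values[lab_test_id])
for key in sorted(cuts):
    print(key, cuts[key])
```

Passing the value loader in as a callback keeps the loop independent of how the values are actually stored, so the same loop works whichever tables you read from.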

The code is at:
https://github.com/Lassesen/Microbiome2/tree/master/TaxonomyStatistics

Remember to update your database schema too.

Bottom Line

That’s it for this week's installments. The homework for this one is deciding whether you should include zero (none-found) values or not. This is actually a complex question, and the answer may depend on what percentage of samples have a specific taxonomic unit.
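To get a feel for why the zero question matters, here is a small Python sketch that summarizes the same taxon with and without the zeros. The counts are invented, and the summarize helper is hypothetical; it is only meant to show how far the numbers can move when a taxon is found in a minority of samples.

```python
import statistics


def summarize(values, include_zeros):
    """Mean and median of the values, optionally dropping none-found (zero) entries."""
    kept = list(values) if include_zeros else [v for v in values if v > 0]
    if not kept:
        raise ValueError("no values left to summarize")
    return {
        "n": len(kept),
        "mean": statistics.mean(kept),
        "median": statistics.median(kept),
    }


# Invented counts for one taxon across ten samples; six samples did not have it.
counts = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 120.0, 300.0, 450.0, 900.0]

# Fraction of samples in which the taxon was found at all.
prevalence = sum(1 for v in counts if v > 0) / len(counts)
with_zeros = summarize(counts, include_zeros=True)
without_zeros = summarize(counts, include_zeros=False)

print("prevalence:", prevalence)
print("with zeros:", with_zeros)
print("without zeros:", without_zeros)
```

With the zeros included, the median here collapses to zero and says nothing about the samples that do have the taxon. With the zeros excluded, the statistics describe only the minority of samples where it was found.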