Readers have expressed interest in some of my work being open sourced. The current site is best described as an “evolved beta”. Rather than subject people to its quirks and kludges, I am proceeding with a redesign as a V.2 product. If you are interested, please FOLLOW (top left) to get updates as they happen.
The Repository is at:
The first item that I want to put up for discussion is the set of core database tables, for review and comments. The database diagram is shown below.
A few quick notes:
- Statistics are stored in a separate table instead of the typical additional columns, because trying multiple quantiles is seen as the way to go for non-parametric analysis. The set of statistics becomes open-ended, with items like “Q2_18” (quantile 2 of an 18-way quantization) being possible. With that type of breakdown, we want to know if we are dealing with stale data, so we need to record the computation date.
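
To make the separate-table idea concrete, here is a minimal sketch in Python with SQLite. It is not the actual schema from the diagram: the table name `TaxonStatistic`, its columns, the helper functions, and the sample values are all assumptions made up for illustration. Each statistic is one row, named in the “Q2_18” style, and carries the date it was computed.

```python
import sqlite3
import statistics
from datetime import date


def quantile_rows(taxon_id, values, n, computed_on):
    """Build one row per cut point of an n-way quantization.

    Row layout (hypothetical): (TaxonId, StatName, StatValue, ComputedOn).
    """
    cuts = statistics.quantiles(values, n=n)  # yields n - 1 cut points
    return [
        (taxon_id, f"Q{i}_{n}", cut, computed_on.isoformat())
        for i, cut in enumerate(cuts, start=1)
    ]


def stale_stats(conn, cutoff):
    """Return names of statistics computed before the cutoff date."""
    cur = conn.execute(
        "SELECT StatName FROM TaxonStatistic "
        "WHERE ComputedOn < ? ORDER BY StatName",
        (cutoff.isoformat(),),  # ISO dates sort correctly as text
    )
    return [name for (name,) in cur]


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE TaxonStatistic (      -- hypothetical table name
        TaxonId    INTEGER NOT NULL,
        StatName   TEXT    NOT NULL,   -- e.g. 'Q2_18'
        StatValue  REAL    NOT NULL,
        ComputedOn TEXT    NOT NULL,   -- ISO date of the computation
        PRIMARY KEY (TaxonId, StatName)
    )
    """
)

taxon_id = 1  # arbitrary illustrative id
values = [float(v) for v in range(1, 101)]  # made-up sample data

# Quartiles computed earlier, the 18-way breakdown computed later.
rows = quantile_rows(taxon_id, values, 4, date(2020, 1, 1))
rows += quantile_rows(taxon_id, values, 18, date(2020, 6, 1))
conn.executemany("INSERT INTO TaxonStatistic VALUES (?, ?, ?, ?)", rows)

print(stale_stats(conn, date(2020, 3, 1)))
```

Adding a new breakdown, say a 10-way quantization, only adds rows. With the additional-columns approach it would mean altering the table each time, and a single date column could not say which of the statistics were stale.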
The next post will deal with populating TaxonHierarchy and TaxonNames from NCBI downloads.