Requirements for any Microbiome Provider to make their data accessible.

Web Site to assist with becoming a provider

Directly sending the file

The transfer should be initiated only by the client, from the provider's own site. This simplifies the transfer for the client and usually earns the provider goodwill as well as repeat business. Many people have switched to supported providers for subsequent tests because they wish to use the features on this site.

The simplest format is a plain text file, uploaded with an HTTP POST, consisting of nothing more than:

Line 1: Email Address of user
Line 2-N: NCBI Taxon Number, Percentage

That is it! We have a test page available for people who wish to try it (data is not saved).
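
As a rough illustration of the format described above, the following Python sketch builds the text body and posts it. The endpoint URL, the function names, and the choice to send the text as the raw request body are all assumptions made for the example; the real upload address and the way the provider key is supplied come from the site operator and are not shown here.

```python
import urllib.request

# Hypothetical endpoint, used only for illustration.
UPLOAD_URL = "https://example.com/upload"


def build_upload_body(email, taxa):
    """Return the file text: email on line 1, then 'taxon, percentage' lines."""
    lines = [email]
    for taxon_id, percentage in taxa:
        lines.append(f"{int(taxon_id)}, {percentage}")
    return "\n".join(lines) + "\n"


def post_upload(body, url=UPLOAD_URL):
    """POST the text as a plain-text body and return the HTTP status code.

    Assumption: the server accepts a raw text/plain body.
    """
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Build (but do not send) a two-line sample.
body = build_upload_body("user@example.com", [(356, 0.004), (468, 0.002)])
print(body)
```

Only `build_upload_body` runs here; `post_upload` would be called once the real URL is known.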

We will provide a key to you so that we know where the data is coming from. See statistics on uploads here.

This also clarifies to users whether they should or could be seeing a specific species in their report.

Lactobacillus reuteri: NCBI 1598

Giving the client a file to download and then upload

Line 1-N: NCBI Taxon Number, Percentage

356, 0.004
468, 0.002
469, 0.002
506, 0.373
543, 0.338
551, 0.002
561, 0.311
562, 0.025
613, 0.019
712, 0.05
713, 0.002
724, 0.0429
729, 0.0429
766, 0.003
775, 0.003
780, 0.003
815, 35.804
816, 35.804
817, 0.002
818, 4.764
820, 4.165
821, 0.011
823, 3.028
830, 0.002
836, 0.015
838, 0.032
841, 2.608
853, 11.933
872, 0.039
904, 0.025
905, 0.025
906, 0.008
970, 0.045
976, 46.664
1030, 0.018
1046, 0.002
1047, 0.002
1051, 0.002
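
A provider may want to check a download file like the one above before handing it to a client. The sketch below is one possible way to parse and validate the lines in Python; the function name and the error message are illustrative, not part of the site.

```python
def parse_taxon_lines(text):
    """Parse 'taxon, percentage' lines into a list of (int, float) tuples."""
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        taxon_text, _, percentage_text = line.partition(",")
        try:
            rows.append((int(taxon_text), float(percentage_text)))
        except ValueError:
            raise ValueError(
                f"line {line_number}: expected 'taxon, percentage', got {line!r}"
            )
    return rows


sample = "356, 0.004\n468, 0.002\n815, 35.804\n"
print(parse_taxon_lines(sample))
```

A line with a missing comma or a non-numeric value raises `ValueError` naming the offending line.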

Why NCBI Numbers?

https://www.ncbi.nlm.nih.gov/

The reasons are simple:

  • Various bacteria have dozens of names, and on occasion a name has been deprecated and reassigned to a different bacterium. The NCBI number assures us that we have the right one.
  • We use the KEGG: Kyoto Encyclopedia of Genes and Genomes, and their data is all keyed to NCBI Taxon numbers
  • It allows data to be stored in a more compact fashion (up to 60x smaller database) and allows faster processing of data (saving operating costs).

Please note that open source tools are available to assist with finding the correct NCBI numbers; see https://youtu.be/VMi0dOeNQFA and https://youtu.be/B0zOSF8f0mo for an illustration.

We request all taxonomic ranks to be specified

Our experience is that percentages become distorted (underreported) when we roll up from species to genus to family, and so on. Typically, there will be some undetermined species in each genus, family, etc. These are often referred to as “unclassified Lactobacillus” and the like.
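
The arithmetic behind this can be shown with a small Python sketch. The numbers are made up purely for illustration and the function name is hypothetical: a genus is measured at 10%, but only two of its species are identified.

```python
def unclassified_share(parent_percentage, child_percentages):
    """Return the part of the parent rank not covered by its identified children."""
    return parent_percentage - sum(child_percentages)


# Hypothetical values: the genus as measured, and the species identified within it.
reported_genus = 10.0
identified_species = [4.0, 3.5]

rolled_up = sum(identified_species)
missing = unclassified_share(reported_genus, identified_species)
print(f"rolled up from species: {rolled_up}, measured genus: {reported_genus}, unclassified: {missing}")
```

If only the species rows were supplied, the genus would be rebuilt from the species total and would come out lower than the measured value. Supplying every rank avoids that loss.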


Example of a complete file:

Ken@lassesen.com
356, 0.004
468, 0.002
469, 0.002
506, 0.373
543, 0.338
551, 0.002
561, 0.311
562, 0.025
613, 0.019
712, 0.05
713, 0.002
724, 0.0429
729, 0.0429
766, 0.003
775, 0.003
780, 0.003
815, 35.804

A download will start

The file will show the name of each taxon, so you can verify that the numbers are correct.