Opinion: GTDB should be blacklisted for Clinical Use

I believe it is in the consumer interest to share this email thread and to promote discussion of this issue.

Blacklisting is the action of suggesting something to be avoided or distrusted.

Request

[Customer name withheld] has forwarded me the PDF and some associated CSV files. She wishes to see what a fuzzy logic expert system, which uses over 7.4 million facts based on data from the US National Library of Medicine, will recommend.

I know that the following data is readily available and possible to provide. Other firms like Biomesight.com, Thorne, Vitract and Precision Biome have no trouble providing it:

For all taxonomic levels (from clade to strain, when available), just three numbers are sufficient:

  • NCBI Taxon Number
  • Percentage Amount
  • Percentile Ranking across a reference set of healthy individuals

Additional data is welcome, but not required:

  1. Names
  2. Your reference ranges 
  3. etc

Percentiles should be actual percentiles and NOT percentiles estimated using mean and standard deviation. Most bacteria have a skew exceeding 20. Using the mean and standard deviation assumes a skew of zero.
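
To illustrate the difference, here is a minimal Python sketch that compares an actual (empirical) percentile with one estimated from the mean and standard deviation. The reference values are made up for illustration and are not real microbiome data; the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def empirical_percentile(reference, value):
    """Percent of reference values at or below `value` (an actual percentile)."""
    at_or_below = sum(1 for r in reference if r <= value)
    return 100.0 * at_or_below / len(reference)

def normal_percentile(reference, value):
    """Percentile estimated from mean and standard deviation.

    This assumes the reference values follow a normal (zero-skew) distribution.
    """
    dist = NormalDist(mu=mean(reference), sigma=stdev(reference))
    return 100.0 * dist.cdf(value)

# Made-up, heavily right-skewed reference set: most people have almost none
# of the taxon, a few have some, and one person has a great deal.
reference = [0.01] * 90 + [5.0] * 9 + [50.0]

sample_value = 0.5  # percentage amount reported for one person

actual = empirical_percentile(reference, sample_value)
estimated = normal_percentile(reference, sample_value)

print(f"actual percentile:    {actual:.1f}")
print(f"estimated percentile: {estimated:.1f}")
```

With this skewed reference set, the sample sits above 90 of the 100 reference values, yet the mean-and-standard-deviation estimate places it below the 50th percentile, because the single large value inflates both the mean and the standard deviation.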

Your customer would greatly appreciate a speedy return of an appropriate file. With that file in hand, we will add your lab to our list of over 50 labs that our free site supports. (See https://microbiomeprescription.com/Upload/Index ).

If you are unable to provide such a file, please tell us so that we may blacklist your site as not supportable, to spare other consumers a waste of money.

Response

Hi Ken,
Thank you for your detailed email and for sharing your perspective on the data formats and metrics you require.

We’d like to clarify that Microba uses the Genome Taxonomy Database (GTDB) for microbial classification. GTDB and NCBI classify genomes differently: our species consist of multiple genomes which may have different NCBI classifications, and species-level classifications cannot always be mapped to each other through name matching alone. Due to this, providing a microbiome profile in NCBI taxonomy is not practical, nor would it be a correct representation of the actual microbiome profile.

Once GTDB is formally supported within your workflow, please reach out and we can discuss options for providing the data your service requires.

We appreciate your understanding and encourage you to support the more accurate, resolved, and phylogenetically consistent GTDB taxonomy.

Kind regards,
The Microba Team

Reply To Response

Many thanks for your reply. Our purpose is to use hallucination-free AI to provide clinical suggestions, for review by medical professionals, to people suffering from a wide variety of conditions.

A simple example may be seen here: DepressionExample.

Unfortunately, the Genome Taxonomy Database (GTDB) appears to be a research tool and, IMHO, seems very inappropriate and misleading to sell to consumers. GTDB was first proposed in academic papers in August 2018. We were active in the microbiome field before that, and the de facto standard in the industry was then, and still is today, NCBI. The leading consumer microbiome testing company back then was uBiome, which provided NCBI identifiers. In the 7 years since the first release, GTDB has been revised repeatedly; the current release is 10-RS226, dated April 16, 2025.

Checking the US National Library of Medicine, there appear to be fewer than 200 studies done using GTDB that are of likely clinical use. With NCBI, we found over 20,000 suitable studies. Regardless, no study done prior to 2020, likely 2022, can, in a legal sense, be safely used for clinical purposes.

I am aware of many tools to convert GTDB to NCBI; a few are:

  • TaxonKit: Command-line toolkit that supports creating NCBI-style taxdump files from GTDB and also reformatting and mapping taxonomies.
  • gtdb_to_taxdump: Python tool to convert GTDB taxonomy files into NCBI taxdump format, usable by downstream tools like Kraken2.
  • GTDB-Tk: Assigns genomes to GTDB taxonomy but includes metadata fields mapping to NCBI taxonomy, enabling conversion between formats.
  • NCBI-GTDB Map: Direct tool for mapping GTDB taxonomy to NCBI taxonomy, supporting both directions and handling rank prefixes.
  • gtdb-taxdump: A specialized toolkit for generating stable, trackable NCBI taxdump files from GTDB releases with reproducible taxon IDs.
  • NCBI-taxonomist: Python command-line utility that retrieves, handles, and allows mapping of taxonomic information, supporting cross-database operations.
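
As a rough illustration of the kind of mapping involved, the Python sketch below assigns each GTDB species the NCBI taxon ID carried by the majority of its member genomes and reports how strong that majority is. It is a simplified sketch and not how any of the tools listed above is implemented; the records, species names, taxon IDs and the `min_fraction` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (genome accession, GTDB species, NCBI taxon ID).
# Real data would come from a GTDB release's genome metadata.
genomes = [
    ("G1", "s__Example_A", 1001),
    ("G2", "s__Example_A", 1001),
    ("G3", "s__Example_A", 1002),  # same GTDB species, different NCBI taxon
    ("G4", "s__Example_B", 2001),
]

def map_gtdb_to_ncbi(records, min_fraction=0.5):
    """Map each GTDB species to its majority NCBI taxon ID.

    Returns {gtdb_species: (ncbi_taxid or None, fraction_of_genomes)}.
    The taxon ID is None when no single NCBI taxon exceeds `min_fraction`.
    """
    votes = defaultdict(Counter)
    for _accession, gtdb_species, ncbi_taxid in records:
        votes[gtdb_species][ncbi_taxid] += 1

    mapping = {}
    for species, counter in votes.items():
        taxid, count = counter.most_common(1)[0]
        fraction = count / sum(counter.values())
        mapping[species] = (taxid if fraction > min_fraction else None, fraction)
    return mapping

mapping = map_gtdb_to_ncbi(genomes)
for species, (taxid, fraction) in mapping.items():
    print(species, taxid, f"{fraction:.2f}")
```

Reporting the fraction alongside the taxon ID keeps ambiguous mappings visible, so the recipient can decide whether to accept them or to fall back to a higher taxonomic level.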

I am aware that folks embracing the hottest new technology can have an attitude, especially when most recent studies decline to use it. Intransigence on this issue is not beneficial to your customers, people with challenging health issues, unless you are willing to provide a GTDB-based suggestions engine, working off only GTDB studies with hallucination-free AI, that is equivalent to or better than what NCBI can provide. Until such time, I would advocate that you stop making misleading sales to consumers.

Given your response, I feel that I have no option but to blacklist you for clinical use.