Metastats – Detect Differentially Abundant Features in Metagenomic Data



Metastats employs the false discovery rate to improve specificity in high-complexity environments and separately handles sparsely sampled features using Fisher’s exact test. Under a variety of simulations, we show that Metastats performs well compared to previously used methods and significantly outperforms other methods for features with sparse counts. We demonstrate the utility of our method on several datasets, including a 16S rRNA survey of obese and lean human gut microbiomes, COG functional profiles of infant and mature gut microbiomes, and bacterial and viral metabolic subsystem data inferred from random sequencing of 85 metagenomes. Applying our method to the obesity dataset reveals differences between obese and lean subjects that were not reported in the original study. For the COG and subsystem datasets, we provide the first statistically rigorous assessment of the differences between these populations.
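
The two ingredients named above, Fisher's exact test for sparsely sampled features and false discovery rate control, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Metastats implementation: the 2x2 table layout (pooled counts of one feature versus all other counts in each population), the function names, and the example counts are all hypothetical, and the Benjamini-Hochberg adjustment is used here only as a stand-in for whichever false discovery rate procedure Metastats itself applies.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties)
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical pooled counts per feature: (population 1, population 2)
    features = {"taxon_A": (9, 1), "taxon_B": (2, 3), "taxon_C": (0, 7)}
    total1, total2 = 500, 480  # hypothetical total counts per population

    names = list(features)
    pvals = [
        # Assumed table: feature counts vs. all remaining counts
        fisher_exact_two_sided(f1, f2, total1 - f1, total2 - f2)
        for f1, f2 in features.values()
    ]
    for name, p, q in zip(names, pvals, benjamini_hochberg(pvals)):
        print(f"{name}: p={p:.4g}, adjusted={q:.4g}")
```

Adjusting the p-values across all features together, rather than thresholding each one separately, is what keeps the expected share of false positives bounded when many features are tested at once.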


James White






White JR, Nagarajan N, Pop M.
Statistical methods for detecting differentially abundant features in clinical metagenomic samples.
PLoS Comput Biol. 2009 Apr;5(4):e1000352. Epub 2009 Apr 10.