RATS (Resource Aware Taxon Selection)
The phylogenetic diversity (PD) of a set of taxonomic units (e.g. genes, individuals, populations, species) is the total length of the evolutionary tree connecting them. This measure is relevant for choosing taxa in a variety of applications. In comparative genomics, the statistical power for testing evolutionary hypotheses (e.g. low substitution rates) and for finding interesting genomic features (e.g. protein-coding genes, noncoding conserved elements) is strongly correlated with the total PD of the sequences being compared. Sequencing projects, at both the genome and the gene level, should therefore target taxa with a high total PD. In biodiversity conservation, when not all species or populations in a geographical area can be protected, it is reasonable to concentrate conservation efforts on a subset with maximum PD.
Given the growing interest in PD, Fabio Pardi has worked on a hierarchy of optimisation problems in which the aim is to select, from a collection of candidate taxa, a subset with maximum total PD. Depending on the constraints on this subset, the problems have varying computational complexity and call for different algorithmic solutions. When the aim is to select a fixed number of taxa, a simple greedy algorithm can be shown to produce optimal solutions. When different taxa require different amounts of resources (money, time, etc.) for their selection (sequencing or conservation) and the total amount of resources available is limited, the problem is more difficult. For this case we have devised a novel dynamic programming algorithm that can compute the optimal solution efficiently.
This algorithm is implemented by the program rats.
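The two selection problems described above can be illustrated on a toy example. The Python sketch below is only an illustration under stated assumptions: the tree, branch lengths, costs, and all function names are hypothetical, and none of it reflects the interface or internals of the rats program. It computes PD on a small tree, applies a greedy rule for a fixed number of taxa (start from the most distant pair, then repeatedly add the taxon that increases PD the most), and uses plain exhaustive search, not the dynamic programming algorithm of the paper, as a reference for the budget-constrained case.

```python
from itertools import combinations

# Hypothetical toy tree ((A:1,B:2)X:3,(C:4,D:5)Y:1)R, stored as child -> parent.
# LENGTH[node] is the length of the branch directly above that node.
PARENT = {"A": "X", "B": "X", "C": "Y", "D": "Y", "X": "R", "Y": "R"}
LENGTH = {"A": 1, "B": 2, "C": 4, "D": 5, "X": 3, "Y": 1}
TAXA = ["A", "B", "C", "D"]
# Assumed selection costs per taxon (e.g. sequencing cost), for illustration only.
COST = {"A": 1, "B": 2, "C": 2, "D": 4}


def pd(selected):
    """Total length of the branches on the subtree connecting `selected`."""
    selected = set(selected)
    below = {}  # number of selected taxa at or below each node
    for taxon in selected:
        node = taxon
        while node is not None:
            below[node] = below.get(node, 0) + 1
            node = PARENT.get(node)
    # A branch is counted when it separates some selected taxa from the others.
    return float(sum(LENGTH[n] for n, c in below.items()
                     if n in LENGTH and c < len(selected)))


def greedy_select(k):
    """Greedy choice of k taxa (assumes 2 <= k <= len(TAXA))."""
    chosen = set(max(combinations(TAXA, 2), key=pd))
    while len(chosen) < k:
        candidates = [t for t in TAXA if t not in chosen]
        chosen.add(max(candidates, key=lambda t: pd(chosen | {t})))
    return chosen


def best_within_budget(cost, budget):
    """Exhaustive reference search: max-PD subset with total cost <= budget."""
    best, best_pd = set(), 0.0
    for r in range(len(cost) + 1):
        for subset in combinations(sorted(cost), r):
            if sum(cost[t] for t in subset) <= budget:
                value = pd(subset)
                if value > best_pd:
                    best, best_pd = set(subset), value
    return best, best_pd


if __name__ == "__main__":
    print(sorted(greedy_select(3)), pd(greedy_select(3)))
    print(best_within_budget(COST, budget=5))
```

On this toy tree the greedy rule selects B, C and D (PD 15), while with the assumed costs and a budget of 5 the best affordable subset is A, B and C (PD 11). The exhaustive search enumerates every subset, so it is only practical for a handful of taxa.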
- Linux / Mac OS X
:: MORE INFORMATION
Syst Biol. 2007 Jun;56(3):431-44.
Resource-aware taxon selection for maximizing phylogenetic diversity.
Pardi F, Goldman N.