Aug 27 2012



TGICL is a software system for fast clustering of large EST datasets. The package automates the clustering and assembly of a large EST/mRNA dataset. Clustering is performed by a slightly modified version of NCBI’s megablast, and the resulting clusters are then assembled using the CAP3 assembly program. TGICL starts with a large multi-FASTA file (and an optional peer quality values file) and outputs the assembly files as produced by CAP3. Both the clustering and the assembly phases can be parallelized by distributing the searches and the assembly jobs across multiple CPUs, since TGICL can take advantage of either SMP machines or PVM (Parallel Virtual Machine) clusters. The two full precompiled packages below were built on Linux and SunOS, respectively. Each includes CAP3, mgblast (the modified megablast) and all the other binaries for its platform (except, of course, base Unix utilities such as ‘sed’ and ‘sort’). Please note that only the Linux version was thoroughly tested at DFCI.
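To make the idea of distributing search jobs across CPUs more concrete, here is a minimal Python sketch. It is not TGICL's own code, which is written in Perl. It only shows one plausible way to deal the records of a multi-FASTA file into slices and hand each slice to a worker. The function names (`parse_fasta`, `split_into_slices`, `search_slice`, `run_jobs`) are hypothetical, and `search_slice` is a stand-in that merely counts bases where a real pipeline would launch a similarity search.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from multi-FASTA text."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_into_slices(records, n_slices):
    """Deal records round-robin into at most n_slices non-empty slices."""
    slices = [records[i::n_slices] for i in range(n_slices)]
    return [s for s in slices if s]


def search_slice(records):
    """Hypothetical stand-in for one search job: total bases in the slice."""
    return sum(len(seq) for _, seq in records)


def run_jobs(records, n_workers):
    """Run one stand-in job per slice and return results in slice order."""
    slices = split_into_slices(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(search_slice, slices))


if __name__ == "__main__":
    demo = ">est1\nACGT\nAC\n>est2\nGGG\n>est3\nTTTTT\n"
    print(run_jobs(parse_fasta(demo), n_workers=2))
```

A thread pool is used here only to keep the sketch short and portable. A job that shells out to an external search binary would spend its time in that child process, so the same structure would still spread work over several CPUs on an SMP machine.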


The Gene Index Project


  • Windows / Mac OS X / Linux
  • Perl





Bioinformatics. 2003 Mar 22;19(5):651-2.
TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets.
Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, Tsai J, Quackenbush J.


