HAC – Hierarchical Agglomerative Clustering for large-scale Network data

HAC 1.2.1

:: DESCRIPTION

HAC is developed for fast clustering of heterogeneous interaction networks.

:: DEVELOPER

Joel Bader lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 HAC

:: MORE INFORMATION

Citation

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1:S44. doi: 10.1186/1471-2105-12-S1-S44.
Resolving the structure of interactomes with hierarchical agglomerative clustering.
Park Y, Bader JS.

MR-MSPOLYGRAPH / MSPolygraph 1.61 – MapReduce Implementation of a Hybrid Spectral Library-Database Search Method for Large-scale Peptide Identification

MR-MSPOLYGRAPH / MSPolygraph 1.61

:: DESCRIPTION

MR-MSPOLYGRAPH is a MapReduce-based implementation for parallelizing peptide identification from mass spectrometry data.

MSPolygraph is an open-source hybrid database and spectral library MS/MS search engine that runs in either serial or parallel mode. It examines the ions in MS/MS spectra to identify likely matching candidate peptides from a reference database of protein sequences (a FASTA file). MSPolygraph also supports searching a spectral library (a collection of annotated MS/MS spectra).

:: DEVELOPER

Ananth Kalyanaraman

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • C Compiler

:: DOWNLOAD

 MR-MSPOLYGRAPH / MSPolygraph

:: MORE INFORMATION

Citation

A. Kalyanaraman, W.R. Cannon, B. Latt, D.J. Baxter (2011).
MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification.
Bioinformatics (2011) 27 (21): 3072-3073. doi: 10.1093/bioinformatics/btr523.

Cannon W.R., K.H. Jarman, B.M. Webb-Robertson, D.J. Baxter, C.S. Oehmen, K.D. Jarman, A. Heredia-Langner, G.A. Anderson, and K.J. Auberry. 2005.
Comparison of probability and likelihood models for peptide identification from tandem mass spectrometry data.
J. Proteome Research, 4(5): 1687-169

CloudNMF – MapReduce implementation of NMF for large-scale Biological datasets

CloudNMF

:: DESCRIPTION

CloudNMF is a distributed open-source implementation of NMF (nonnegative matrix factorization) on a MapReduce framework.
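For readers unfamiliar with NMF, the factorization itself can be sketched on a single machine. The example below is a minimal illustration that assumes the standard multiplicative update rules for the squared-error objective; it does not show the MapReduce distribution, and the function names (`nmf`, `frobenius_error`) are hypothetical and not taken from CloudNMF.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor nonnegative V (n x m) into W (n x rank) and H (rank x m).

    Assumption: multiplicative updates for the squared-error objective,
    used here purely for illustration.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialisation keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h
```

Calling `nmf(v, rank)` on a small nonnegative matrix returns two nonnegative factors whose product approximates `v`. The matrix products inside each update are the part a distributed implementation would need to spread across workers.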

:: DEVELOPER

CloudNMF team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

:: DOWNLOAD

 CloudNMF

:: MORE INFORMATION

Citation

Genomics Proteomics Bioinformatics. 2014 Feb;12(1):48-51. doi: 10.1016/j.gpb.2013.06.001. Epub 2013 Aug 8.
CloudNMF: a MapReduce implementation of nonnegative matrix factorization for large-scale biological datasets.
Liao R, Zhang Y, Guan J, Zhou S

MegaDot 0.9 – Large Scale Dot Plotter

MegaDot 0.9

:: DESCRIPTION

MegaDot is a large-scale dot plotter capable of producing dot density plots of tens of megabases of DNA sequence on a modest-sized desktop machine, and of entire mammalian chromosomes on server-scale computers.
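As a rough illustration of what a dot density plot computes, the sketch below finds exact k-mer matches between two sequences and bins them into a coarse grid of counts. It is a toy example under stated assumptions: the function names (`dot_plot_matches`, `density_grid`), the exact-match criterion, and the word size are hypothetical and do not describe MegaDot's actual algorithm.

```python
from collections import defaultdict


def dot_plot_matches(seq_a, seq_b, k=3):
    """Return sorted (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k]."""
    # Index every k-mer of seq_b by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)
    # Look up each k-mer of seq_a in that index.
    matches = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], ()):
            matches.append((i, j))
    return sorted(matches)


def density_grid(matches, len_a, len_b, bins=4):
    """Bin match positions into a bins x bins grid of counts."""
    grid = [[0] * bins for _ in range(bins)]
    for i, j in matches:
        row = min(i * bins // len_a, bins - 1)
        col = min(j * bins // len_b, bins - 1)
        grid[row][col] += 1
    return grid
```

Each grid cell holds the number of matching words that fall inside it, so a long diagonal run of high counts indicates a region of similarity between the two sequences.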

:: DEVELOPER

K. James Durbin (Atlas Genome Tools Group)

:: SCREENSHOTS

megadot

:: REQUIREMENTS

  • Windows / Linux / Mac OS X
  • Java

:: DOWNLOAD

 MegaDot

:: MORE INFORMATION

pGraph / psgraph 3.0 – Parallel Construction of large-scale Protein Sequence Homology Graphs

pGraph 3.0

:: DESCRIPTION

pGraph / psgraph (parallel sequence graph construction) is a novel hybrid of the hierarchical multiple-master/worker model and the producer-consumer model, designed to overcome the irregularities imposed by alignment computation and work generation.

:: DEVELOPER

Ananth Kalyanaraman

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • C Compiler

:: DOWNLOAD

 pGraph / psgraph

:: MORE INFORMATION

Citation

C. Wu, A. Kalyanaraman, W. Cannon.
pGraph: Efficient parallel construction of large-scale protein sequence homology graphs.
IEEE Transactions on Parallel and Distributed Systems, 23(10): 1923-1933.

DupTree – Large Scale Gene Tree Parsimony Analysis

DupTree

:: DESCRIPTION

DupTree is a program for large-scale phylogenetic analyses using gene tree parsimony. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses.

:: DEVELOPER

Computational Biology Laboratory, Department of Computer Science, Iowa State University

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / Mac OS X

:: DOWNLOAD

  DupTree

:: MORE INFORMATION

Citation

DupTree: A program for large-scale phylogenetic analyses using gene tree parsimony
Andre Wehe, Mukul S. Bansal, J. G. Burleigh, Oliver Eulenstein.
Bioinformatics, 24(13): 1540-1541, 2008.

phenomeImpute 1.0 – R package to Impute Missing values in large-scale High-dimensional Phenome data

phenomeImpute 1.0

:: DESCRIPTION

phenomeImpute is an R package to impute missing values in large-scale high-dimensional phenome data. It includes several variations of KNN, random forest and MICE methods.
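To illustrate the KNN family of methods mentioned above, here is a minimal sketch of nearest-neighbour imputation. It is written in Python rather than R, is not the phenomeImpute API, and makes several assumptions: missing values are represented as `None`, distance is the root mean squared difference over jointly observed columns, and the names `row_distance` and `knn_impute` are hypothetical.

```python
import math


def row_distance(a, b):
    """Root mean squared difference over columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no overlap, so the rows cannot be compared
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=2):
    """Fill each None with the mean of that column over the k nearest rows."""
    imputed = [list(r) for r in rows]
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours must have the target column observed.
            candidates = sorted(
                (row_distance(row, other), idx)
                for idx, other in enumerate(rows)
                if idx != r and other[c] is not None
            )
            neighbours = [idx for dist, idx in candidates[:k] if dist < math.inf]
            if neighbours:
                imputed[r][c] = sum(rows[idx][c] for idx in neighbours) / len(neighbours)
    return imputed
```

Distances are always computed from the original data, so values filled in earlier do not influence later imputations. Entries with no comparable neighbour are left as `None`.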

:: DEVELOPER

George C. Tseng 

:: SCREENSHOTS

N/A

:: REQUIREMENTS

:: DOWNLOAD

 phenomeImpute

:: MORE INFORMATION

Citation

Serena G. Liao, Yan Lin, Dongwan Kang, Naftali Kaminski, Frank C. Sciurba, George C. Tseng. (2014)
Missing value imputation in high-dimensional phenomic data: Imputable or not? And how?
In preparation.

LSGSP – Large-Scale Genome Sequence Processing

LSGSP

:: DESCRIPTION

LSGSP is Java code related to the book Large-Scale Genome Sequence Processing, published by Imperial College Press. The code provided on the site is not part of the book itself.

:: DEVELOPER

Masahiro Kasahara & Shinichi Morishita

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / Mac OS X
  • Java

:: DOWNLOAD

 LSGSP

:: MORE INFORMATION

Citation

Large-scale Genome Sequence Processing
Masahiro Kasahara and Shinichi Morishita (Jul 7, 2006)

Optimus Primer – Primer Design for large-scale Resequencing by Second Generation Sequencing

Optimus Primer

:: DESCRIPTION

Optimus Primer is a PCR enrichment primer design program for next-generation sequencing of human exonic regions.

:: DEVELOPER

Laboratory of Guillaume Lettre

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Web Browser

:: DOWNLOAD

 NO

:: MORE INFORMATION

Citation

BMC Res Notes. 2010 Jul 7;3:185. doi: 10.1186/1756-0500-3-185.
Optimus Primer: A PCR enrichment primer design program for next-generation sequencing of human exonic regions.
Brown AM, Lo KS, Guelpa P, Beaudoin M, Rioux JD, Tardif JC, Phillips MS, Lettre G.

RESQUE 1.0 – REduction-based scheme using Semi-Markov scores for network QUErying

RESQUE 1.0

:: DESCRIPTION

RESQUE is an efficient algorithm for querying large-scale biological networks.

:: DEVELOPER

Mohammad Ebrahim Sahraeian and Byung-Jun Yoon

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / Windows / Mac OS X
  • Python / Matlab

:: DOWNLOAD

 RESQUE

:: MORE INFORMATION

Citation

RESQUE: Network reduction using semi-Markov random walk scores for efficient querying of biological networks
Sayed Mohammad Ebrahim Sahraeian and Byung-Jun Yoon,
Bioinformatics, 28 (16): 2129-2136, 2012. doi:10.1093/bioinformatics/bts341