Debian Med Project
Help us to see Debian used by medical practitioners and researchers! Join us on the Alioth page.
Biology
Debian-Med micro-biology packages

This meta package will install Debian packages related to molecular biology, structural biology and bioinformatics for use in life sciences.

The list to the right includes various software projects which are of some interest to the Debian-Med Project. Currently, only a few of them are available as Debian packages. It is our goal, however, to include all software in Debian-Med which can sensibly add to a high quality Custom Debian Distribution.

For a better overview of the project's availability as a Debian package, each head row has a color code according to this scheme:

If you discover a project which looks like a good candidate for Debian-Med, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian-Med mailing list.


Official Debian packages

Adun.app
Molecular Simulator for GNUstep
http://diana.imim.es/Adun
Maintainer: Debian-Med Packaging Team
Version: 0.8.2
License: DFSG free
Official Debian package
This is a new extendible molecular simulation program that also includes data management and analysis capabilities.
Amap-align
Protein multiple alignment by sequence annealing
http://bio.math.berkeley.edu/amap/
Maintainer: Debian-Med Packaging Team
Version: 2.2
License: DFSG free
Official Debian package
AMAP is a command line tool to perform multiple alignment of peptidic sequences. It utilizes posterior decoding and a sequence-annealing alignment instead of the traditional progressive alignment method. It is the only alignment program that allows control of the sensitivity / specificity tradeoff. It is based on the ProbCons source code, but uses alignment metric accuracy and eliminates the consistency transformation.
The java visualisation tool of AMAP 2.2 is not yet packaged in Debian.
Arb
Integrated package for data handling and analysis
http://www.arb-home.de/
Maintainer: Debian-Med Packaging Team
Version: 0.0.20071207.1
License: non-free
Official Debian package
The ARB software is a graphically oriented package comprising various tools for sequence database handling and data analysis. A central database of processed (aligned) sequences and any type of additional data linked to the respective sequence entries is structured according to phylogeny or other user defined criteria.
The ARB project (latin, "arbor"=tree) is a joint initiative of the Lehrstuhl fuer Mikrobiologie http://www.mikro.biologie.tu-muenchen.de/ and the Lehrstuhl fuer Rechnertechnik und Rechnerorganisation http://wwwbode.informatik.tu-muenchen.de/ of the Technical University of Munich.
Autodock
analysis of ligand binding to protein structure
http://autodock.scripps.edu/
Maintainer: Debian-Med Packaging Team
Version: 4.0.1
License: DFSG free
Official Debian package
AutoDock is a prime representative of the programs addressing the simulation of the docking of fairly small chemical ligands to rather big protein receptors. Earlier versions had all flexibility in the ligands while the protein was kept rather rigid. This latest version 4 also allows for flexibility of selected sidechains of surface residues, i.e., takes the rotamers into account.
The AutoDock program performs the docking of the ligand to a set of grids describing the target protein. AutoGrid pre-calculates these grids.
Autogrid
pre-calculate binding of ligands to their receptor
http://autodock.scripps.edu/
Maintainer: Debian-Med Packaging Team
Version: 4.0.1
License: DFSG free
Official Debian package
The AutoDockSuite addresses the molecular analysis of the docking of smaller chemical compounds to their receptors of known three-dimensional structure.
The AutoGrid program performs pre-calculations for the docking of a ligand to a set of grids that describe the effect that the protein has on point charges. The effect of these forces on the ligand is then analysed by the AutoDock program.
Biococoa.app
Sequence file format conversion for GNUstep
http://bioinformatics.org/biococoa/
Maintainer: Debian-Med Packaging Team
Version: 1.6.0
License: DFSG free
Official Debian package
The BioCocoa framework provides developers with the opportunity to add support for reading and writing BEAST, Clustal, EMBL, Fasta, GCG-MSF, GDE, Hennig86, NCBI, NEXUS, NONA, PDB, Phylip, PIR, Plain/Raw, Swiss-Prot and TNT files by writing only three lines of code. The framework is written in Cocoa (Objective-C).
Version 1.6 is the last upstream version that works with GNUstep. If newer versions are needed to work under Linux, try to convince upstream to support GNUstep.
Biosquid
utilities for biological sequence analysis
http://selab.wustl.edu/cgi-bin/selab.pl?mode=software
Maintainer: Debian-Med Packaging Team
Version: 1.9g+cvs20050121
License: DFSG free
Official Debian package
SQUID is a library of C code functions for sequence analysis. It also includes a number of small utility programs to convert, show statistics, manipulate and do other functions on sequence files.
The original name of the package is "squid", but since there is already a squid on the archive (a proxy cache), it was renamed to "biosquid".
Blast2
Basic Local Alignment Search Tool
http://www.ncbi.nih.gov/BLAST/
Maintainer: Aaron M. Ucko
Version: 1:2.2.18.20080302
License: DFSG free
Official Debian package
The famous sequence alignment program. This is the "official" NCBI version, #2. The blastall executable allows you to give a nucleotide or protein sequence to the program. It is compared against databases and a summary of matches is returned to the user.
Note that databases are not included in Debian; they must be retrieved manually.
Boxshade
Pretty-printing of multiple sequence alignments
http://www.ch.embnet.org/software/BOX_form.html
Maintainer: Debian-Med Packaging Team
Version: 3.3.1
License: DFSG free
Official Debian package
Boxshade is a program for creating good looking printouts from multiple-aligned protein or DNA sequences. The program does not perform the alignment by itself and requires as input a file that was created by a multiple alignment program or manually edited with respective tools.
Boxshade reads multiple-aligned sequences from PILEUP-MSF, CLUSTAL-ALN, MALIGNED-data and ESEE-save files (limited to a maximum of 150 sequences with up to 10000 elements each). Various kinds of shading can be applied to identical/similar residues. Output is written to screen or to a file in the following formats: ANSI/VT100, PS/EPS, RTF, HPGL, ReGIS, LJ250-printer, ASCII, xFIG, PICT, HTML.
Clustalw
global multiple nucleotide or peptide sequence alignment
http://www.clustal.org/
Maintainer: Debian-Med Packaging Team
Version: 2.0.5
License: non-free
Official Debian package
This program performs an alignment of multiple nucleotide or amino acid sequences. It recognizes the format of input sequences and whether the sequences are nucleic acid (DNA/RNA) or amino acid (proteins). The output format may be selected from various formats for multiple alignments such as Phylip or FASTA. Clustal W is very well accepted.
The output of Clustal W can be edited manually, preferably with an alignment editor like SeaView or within its companion Clustal X. A model built from your alignment can be applied for improved database searches; the Debian package hmmer creates such a model in the form of an HMM.
For details and citation purposes see paper "Clustal W and Clustal X version 2.0", Larkin M., et al. Bioinformatics 2007 23(21):2947-2948
Clustalw-mpi
MPI-distributed global sequence alignment with ClustalW
http://kmlvli.com/kuobin/clustalw-mpi/
Maintainer: Debian-Med Packaging Team
Version: 0.14
License: non-free
Official Debian package
ClustalW is a popular tool for multiple sequence alignment. The alignment is achieved via three steps: pairwise alignment, guide-tree generation and progressive alignment. ClustalW-MPI is an MPI implementation of ClustalW. Based on version 1.82 of the original ClustalW, both the pairwise and progressive alignments are parallelized with MPI, a popular message passing programming standard. The pairwise alignments can be easily parallelized since the many alignments are independent of each other. However, the progressive alignments are essentially not parallelizable because of the time dependencies between each alignment.
Here the recursive parallelism paradigm is applied to the linear space profile-profile alignment algorithm. This approach is more time efficient on computers with distributed memory architecture. The traditional approach that relies on precomputing the profile-profile score matrix has also been implemented. Results show that the latter is indeed more appropriate for shared memory multiprocessor computers.
ClustalX is suggested for its support for local realignments; seaview is a versatile editor of alignments.
The original ClustalW/ClustalX can be found at URL: http://www.clustal.org/download/pre-2/
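The independence of the pairwise alignments described above can be illustrated with a minimal Python sketch. It is not taken from ClustalW-MPI: a thread pool stands in for MPI processes, and the hypothetical `identity_distance` function is a toy mismatch count standing in for a real pairwise alignment score.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def identity_distance(pair):
    """Toy distance (assumption): fraction of mismatches over the shorter length."""
    a, b = pair
    n = min(len(a), len(b))
    if n == 0:
        return 1.0
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / n


def pairwise_distances(seqs, workers=4):
    """Score every sequence pair; each pair is an independent task."""
    index_pairs = list(combinations(range(len(seqs)), 2))
    seq_pairs = [(seqs[i], seqs[j]) for i, j in index_pairs]
    # No task reads another task's result, so the work can be split freely.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(identity_distance, seq_pairs))
    return dict(zip(index_pairs, scores))
```

Calling `pairwise_distances(["ACGT", "ACGA", "TCGA"])` returns a dictionary keyed by index pairs such as `(0, 1)`. The progressive stage has no such simple decomposition, because each step depends on the result of earlier steps.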
Clustalx
GUI for Clustal W
ftp://ftp.ebi.ac.uk/pub/software/unix/clustalx/
Maintainer: Debian-Med Packaging Team
Version: 1.83
License: non-free
Official Debian package
This package offers a GUI interface for the Clustal W multiple sequence alignment program. It provides an integrated environment for performing multiple sequence- and profile-alignments to analyse the results. The sequence alignment is displayed in a window on the screen. A versatile coloring scheme has been incorporated to highlight conserved features in the alignment. For professional presentations, one should use the texshade LaTeX package or boxshade.
The pull-down menus at the top of the window allow you to select all the options required for traditional multiple sequence and profile alignment. You can cut-and-paste sequences to change the order of the alignment; you can select a subset of sequences to be aligned; you can select a sub-range of the alignment to be realigned and inserted back into the original alignment.
An alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted.
Dialign
Segment-based multiple sequence alignment
http://dialign.gobics.de/
Maintainer: Debian-Med Packaging Team
Version: 2.2.1
License: DFSG free
Official Debian package
DIALIGN2 is a command line tool to perform multiple alignment of protein or DNA sequences. It constructs alignments from gapfree pairs of similar segments of the sequences. This scoring scheme for alignments is the basic difference between DIALIGN and other global or local alignment methods. Note that DIALIGN does not employ any kind of gap penalty. It has been published by Morgenstern B. in Bioinformatics. 1999 Mar;15(3):211-8.
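The idea of building alignments from gap-free segment pairs can be sketched in Python as follows. This is a simplified illustration, not DIALIGN's actual method: it assumes that a "similar segment" means a run of identical residues on one diagonal, whereas DIALIGN scores similarity with its own weighting scheme. The function name and the `min_len` parameter are hypothetical.

```python
def gap_free_segments(a, b, min_len=3):
    """Return (start_a, start_b, length) for maximal identical gap-free runs."""
    segments = []
    # A diagonal is the set of position pairs (i, j) with constant offset d = j - i.
    for d in range(-(len(a) - 1), len(b)):
        i = max(0, -d)
        j = i + d
        run = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                run += 1
            else:
                if run >= min_len:
                    segments.append((i - run, j - run, run))
                run = 0
            i += 1
            j += 1
        # Close a run that reaches the end of the diagonal.
        if run >= min_len:
            segments.append((i - run, j - run, run))
    return segments
```

Because every segment lies on a single diagonal, no gaps occur inside a segment; gaps only arise implicitly between the segments that are chosen for the final alignment.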
Dialign-t
Segment-based multiple sequence alignment
http://dialign-t.gobics.de/
Maintainer: Debian-Med Packaging Team
Version: 0.2.2.dfsg
License: DFSG free
Official Debian package
DIALIGN-T is a command line tool to perform multiple alignment of protein or DNA sequences. It is a complete reimplementation of the segment-based approach including several new improvements and heuristics that significantly enhance the quality of the output alignments compared to DIALIGN 2.2. For pairwise alignment, DIALIGN-T uses a fragment-chaining algorithm that favours chains of low-scoring local alignments over isolated high-scoring fragments. For multiple alignment, DIALIGN-T uses an improved greedy procedure that is less sensitive to spurious local sequence similarities.
DIALIGN-T has been published in Amarendran R. Subramanian, Jan Weyer-Menkhoff, Michael Kaufmann, Burkhard Morgenstern: DIALIGN-T: An improved algorithm for segment-based multiple sequence alignment. BMC Bioinformatics 2005, 6:66.
Emboss
The European Molecular Biology Open Software Suite
http://www.emboss.org
Maintainer: Debian-Med Packaging Team
Version: 5.0.0
License: DFSG free
Official Debian package
EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole. EMBOSS breaks the historical trend towards commercial software packages.
Reference for EMBOSS: Rice,P. Longden,I. and Bleasby,A. "EMBOSS: The European Molecular Biology Open Software Suite" Trends in Genetics June 2000, vol 16, No 6. pp.276-277
Exonerate
tool for comparison of long biological sequences
http://www.ebi.ac.uk/~guy/exonerate/
Maintainer: Debian-Med Packaging Team
Version: 1.4.0
License: DFSG free
Official Debian package
Much of the functionality of the Wise dynamic programming suite was reimplemented in C for better efficiency. Exonerate is an intrinsic component of the building of the Ensembl genome databases, providing similarity scores between RNA and DNA sequences and thus determining splice variants and coding sequences in general.
Fastdnaml
[Biology] Tool for construction of phylogenetic trees of DNA sequences
http://web.archive.org/web/20061017161001/http://geta.life.uiuc.edu/~gary/programs/fastDNAml.html
Maintainer: Debian-Med Packaging Team
Version: 1.2.2
License: DFSG free
Official Debian package
fastDNAml is a program derived from Joseph Felsenstein's version 3.3 DNAML (part of his PHYLIP package). Users should consult the documentation for DNAML before using this program.
fastDNAml is an attempt to solve the same problem as DNAML, but to do so faster and using less memory, so that larger trees and/or more bootstrap replicates become tractable. Much of fastDNAml is merely a recoding of the PHYLIP 3.3 DNAML program from PASCAL to C.
Fastlink
A faster version of pedigree programs of Linkage
http://www.ncbi.nlm.nih.gov/CBBResearch/Schaffer/fastlink.html
Maintainer: Debian-Med Packaging Team
Version: 4.1P
License: DFSG free
Official Debian package
Fastlink is much faster than the original Linkage but does not implement all the programs.
Garlic
A visualization program for biomolecules
http://www.zucic.org/garlic/
Maintainer: Debichem Team
Version: 1.6
License: DFSG free
Official Debian package
Garlic is written for the investigation of membrane proteins. It may be used to visualize other proteins, as well as some geometric objects. This version of garlic recognizes PDB format version 2.1. Garlic may also be used to analyze protein sequences.
It only depends on the X libraries, no other libraries are needed.
Features include:
 - The slab position and thickness are visible in a small window. 
 - Atomic bonds as well as atoms are treated as independent drawable 
   objects. 
 - The atomic and bond colors depend on position. Five mapping modes 
   are available (as for slab). 
 - Can display stereo images. 
 - Can display other geometric objects, such as a membrane. 
 - Atomic information is available for the atom under the mouse 
   pointer. No click required, just move the mouse pointer over the 
   structure! 
 - Can load more than one structure. 
 - Can draw Ramachandran plots, helical wheels, Venn diagrams, 
   averaged hydrophobicity and hydrophobic moment plots. 
 - The command prompt is available at the bottom of the main window. 
   It is able to display one error message and one command string. 
Gdpc
visualiser of molecular dynamic simulations
http://www.frantz.fi/software/gdpc.php
Maintainer: Debian-Med Packaging Team
Version: 2.2.4
License: DFSG free
Official Debian package
gdpc is a graphical program for visualising output data from molecular dynamics simulations. It reads input in the standard xyz format, as well as other custom formats, and can output pictures of each frame in JPG or PNG format.
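For orientation, a single frame in the xyz format can be read with a few lines of Python. This sketch assumes the common xyz layout (atom count on the first line, a comment line, then one "element x y z" line per atom); the custom formats the program also accepts may differ, and the function name is hypothetical.

```python
def parse_xyz_frame(text):
    """Parse one xyz frame into (comment, [(element, x, y, z), ...])."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    comment = lines[1].strip() if len(lines) > 1 else ""
    atoms = []
    for line in lines[2:2 + n_atoms]:
        element, x, y, z = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("frame ends before the declared number of atoms")
    return comment, atoms
```

A multi-frame trajectory would repeat this block once per time step, which is how a viewer can step through a simulation frame by frame.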
Gff2aplot
pair-wise alignment-plots for genomic sequences in PostScript
http://genome.imim.es/software/gfftools/GFF2APLOT.html
Maintainer: Debian-Med Packaging Team
Version: 2.0
License: DFSG free
Official Debian package
A program to visualize the alignment of two genomic sequences together with their annotations. From GFF-format input files it produces PostScript figures for that alignment. The following menu lists many features of gff2aplot:
 * Comprehensive alignment plots for any GFF-feature. Attributes are defined 
   separately so you can modify only selected attributes for a given file or 
   share the same customization across different data-sets. 
 * All parameters are set by default within the program, but it can be also 
   fully configured via gff2ps-like flexible customization files. Program can 
   handle several of such files, summarizing all the settings before producing 
   the corresponding figure. Moreover, all customization parameters can be set 
   via command-line switches, which allows users to play with those parameters 
   before adding any to a customization file. 
 * Source order is taken from input files; if you swap file order you can 
   visualize the alignment and its annotation with the new input arrangement. 
 * All alignment scores can be visualized in a PiP box below gff2aplot area, 
   using grey-color scale, user-defined color scale or score-dependent 
   gradients. 
 * Scalable fonts, which can also be chosen among the basic PostScript default 
   fonts. Feature and group labels can be rotated to improve readability in 
   both annotation axes. 
 * The program is still defined as a Unix filter so it can handle data from 
   files, redirections and pipes, writing output to standard-output and 
   warnings to standard error. 
 * gff2aplot is able to manage many physical page formats (from A0 to A10, and 
   more -see available page sizes in its manual-), including user-defined ones. 
   This allows, for instance, the generation of poster size genomic maps, or 
   the use of a continuous-paper supporting plotting device, either in portrait 
   or landscape. 
 * You can draw different alignments on same alignment plot and distinguish 
   them by using different colors for each. 
 * Shape dictionary has been expanded, so that further feature shapes are now 
   available (see manual). 
 * Annotation projections through alignment plots (so called ribbons) emulate 
   transparencies via complementary color fill patterns. This feature allows 
   to show color pseudo-blending when horizontal and vertical ribbons overlap. 
Gff2ps
produces PostScript graphical output from GFF-files
http://genome.imim.es/software/gfftools/GFF2PS.html
Maintainer: Debian-Med Packaging Team
Version: 0.98d
License: DFSG free
Official Debian package
gff2ps is a script program developed with the aim of converting gff-formatted records into high quality one-dimensional plots in PostScript. Such plots may be useful for comparing genomic structures and for visualizing outputs from genome annotation programs. It can be used in a very simple way, because it assumes that the GFF file itself carries enough formatting information, but it also allows, through a number of options and/or a configuration file, for a great degree of customization.
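To show what a gff-formatted record carries, here is a small Python sketch that splits one record into named fields. It assumes the widely used nine-column, tab-separated GFF layout (sequence name, source, feature, start, end, score, strand, frame, group); the class and function names are hypothetical and nothing here is taken from gff2ps itself.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: str
    group: str


def parse_gff_line(line):
    """Return a GffRecord, or None for blank lines and '#' comment lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated fields")
    # A '.' in the score column means that no score is given.
    score = None if fields[5] == "." else float(fields[5])
    group = fields[8] if len(fields) > 8 else ""
    return GffRecord(fields[0], fields[1], fields[2], int(fields[3]),
                     int(fields[4]), score, fields[6], fields[7], group)
```

The start and end coordinates are what a plotting tool maps onto the sequence axis, while the feature and group columns decide which shape and label a record receives.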
Ghemical
A GNOME molecular modelling environment
http://bioinformatics.org/ghemical/ghemical/
Maintainer: Debichem Team
Version: 2.95
License: DFSG free
Official Debian package
Ghemical is a computational chemistry software package written in C++. It has a graphical user interface and it supports both quantum-mechanics (semi-empirical) models and molecular mechanics models. Geometry optimization, molecular dynamics and a large set of visualization tools using OpenGL are currently available.
Ghemical relies on external code to provide the quantum-mechanical calculations. Semi-empirical methods MNDO, MINDO/3, AM1 and PM3 come from the MOPAC7 package (Public Domain), and are included in the package. The MPQC package is used to provide ab initio methods: the methods based on Hartree-Fock theory are currently supported with basis sets ranging from STO-3G to 6-31G**.
Glam2
gapped protein motifs from unaligned sequences
http://bioinformatics.org.au/glam2/
Maintainer: Debian-Med Packaging Team
Version: 1058
License: DFSG free
Official Debian package
GLAM2 is a software package for finding motifs in sequences, typically amino-acid or nucleotide sequences. A motif is a re-occurring sequence pattern: typical examples are the TATA box and the CAAX prenylation motif. The main innovation of GLAM2 is that it allows insertions and deletions in motifs.
The package includes these programs: glam2 — for discovering motifs shared by a set of sequences; glam2scan — for finding matches, in a sequence database, to a motif discovered by glam2; glam2format — for converting glam2 motifs to standard alignment formats; glam2mask — for masking glam2 motifs out of sequences, so that weaker motifs can be found; glam2-purge — for removing highly similar members of a set of sequences.
In this package, the fast Fourier transform (FFT) algorithm was enabled for glam2.
If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey (2008) Discovering sequence motifs with arbitrary insertions and deletions, PLoS Computational Biology (in press).
Gromacs
Molecular dynamics simulator, with building and analysis tools
http://www.gromacs.org/
Maintainer: Debichem Team
Version: 3.3.3
License: DFSG free
Official Debian package
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
GROMACS offers entirely too many features for a brief description to do it justice. A more complete listing is available at <http://www.gromacs.org/content/view/12/176/>.
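What "simulating the Newtonian equations of motion" means can be illustrated with a minimal Python sketch of the velocity Verlet scheme, one common way to integrate such equations. This is an illustration only and is not taken from GROMACS; the one-dimensional harmonic force and all names are assumptions chosen to keep the example short.

```python
def harmonic_force(x, k=1.0):
    """Force of a spring with stiffness k (toy system, assumption)."""
    return -k * x


def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate one particle in one dimension; returns a list of (x, v)."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(steps):
        # Advance the position using the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Recompute the force at the new position, then advance the velocity
        # with the average of the old and new accelerations.
        a_new = force(x) / mass
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((x, v))
    return trajectory
```

For example, `velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)` follows the oscillator for ten time units. A real simulation repeats the same update for every particle in three dimensions, with the force evaluation dominating the cost.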
Hmmer
profile hidden Markov models for protein sequence analysis
http://hmmer.janelia.org/
Maintainer: Debian-Med Packaging Team
Version: 2.3.2
License: DFSG free
Official Debian package
HMMER is an implementation of profile hidden Markov model methods for sensitive searches of biological sequence databases using multiple sequence alignments as queries.
Given a multiple sequence alignment as input, HMMER builds a statistical model called a "hidden Markov model" which can then be used as a query into a sequence database to find (and/or align) additional homologues of the sequence family.
Kalign
Global and progressive multiple sequence alignment
http://msa.cgb.ki.se
Maintainer: Debian-Med Packaging Team
Version: 2.03
License: DFSG free
Official Debian package
Kalign is a command line tool to perform multiple alignment of biological sequences. It employs the Wu-Manber string-matching algorithm to improve both the accuracy and speed of the alignment. It uses a global, progressive alignment approach, enriched by employing an approximate string-matching algorithm to calculate sequence distances and by incorporating local matches into the otherwise global alignment. In comparisons made by its authors, Kalign was about 10 times faster than ClustalW and, depending on the alignment size, up to 50 times faster than popular iterative methods. It has been published in Lassmann and Sonnhammer, BMC Bioinformatics 2005, 6:298.
Loki
MCMC linkage analysis on general pedigrees
http://web.archive.org/web/20051220050931/http://loki.homeunix.net/
Maintainer: Debian-Med Packaging Team
Version: 2.4.7.4
License: DFSG free
Official Debian package
Performs Markov chain Monte Carlo multipoint linkage analysis on large, complex pedigrees. The current package supports analyses on quantitative traits only, although this restriction will be lifted in later versions. Joint estimation of QTL number, position and effects uses Reversible Jump MCMC. It is also possible to perform affected only IBD sharing analyses.
The homepage of this project used to be at http://loki.homeunix.net but the project is dead now and the homepage vanished. The Homepage field above points to the web archive.
Melting
computing the melting temperature of nucleic acid duplex
http://www.ebi.ac.uk/~lenov/meltinghome.html
Maintainer: Debian-Med Packaging Team
Version: 4.2h
License: DFSG free
Official Debian package
This program computes, for a nucleic acid duplex, the enthalpy, the entropy and the melting temperature of the helix-coil transitions. Three types of hybridisation are possible: DNA/DNA, DNA/RNA, and RNA/RNA. The program first computes the hybridisation enthalpy and entropy from the elementary parameters of each Crick's pair by the nearest-neighbor method. Then the melting temperature is computed. The set of thermodynamic parameters can be easily changed, for instance following an experimental breakthrough. Melting was published in Le Novère N. (2001), Bioinformatics, 17: 1226-1227.
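The two-step calculation described above (sum nearest-neighbor contributions, then derive the melting temperature) can be sketched in Python. The sketch assumes a non-self-complementary duplex with equal strand concentrations and ignores initiation and salt corrections. The entries in `TOY_PARAMS` are made-up placeholder values, not published thermodynamic parameters, and all names are hypothetical.

```python
import math

GAS_CONSTANT = 1.987  # cal / (mol K)

# Placeholder (enthalpy cal/mol, entropy cal/(mol K)) per dinucleotide step.
# These numbers are invented for illustration only.
TOY_PARAMS = {"AA": (-8000.0, -22.0)}


def duplex_thermo(seq, params):
    """Sum enthalpy and entropy over all overlapping dinucleotide steps."""
    dh = 0.0
    ds = 0.0
    for i in range(len(seq) - 1):
        step_dh, step_ds = params[seq[i:i + 2]]
        dh += step_dh
        ds += step_ds
    return dh, ds


def melting_temperature_kelvin(dh, ds, strand_conc):
    """Tm = dH / (dS + R ln(C/4)) for total strand concentration C in mol/l."""
    return dh / (ds + GAS_CONSTANT * math.log(strand_conc / 4.0))
```

Swapping in a different parameter table changes the result without touching the code, which mirrors the point made above that the set of thermodynamic parameters can be easily changed.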
Mipe
Tools to store PCR-derived data
http://mipe.sourceforge.net
Maintainer: Debian-Med Packaging Team
Version: 1.1
License: DFSG free
Official Debian package
MIPE provides a standard format for the exchange and/or storage of all information associated with PCR experiments using a flat text file. This will:
 * allow for exchange of PCR data between researchers/laboratories 
 * enable traceability of the data 
 * prevent problems when submitting data to dbSTS or dbSNP 
 * enable the writing of standard scripts to extract data (e.g. a 
   list of PCR primers, SNP positions or haplotypes for different animals) 

Although this tool can be used for data storage, its primary focus should be data exchange. For larger repositories, relational databases are more appropriate for storage of these data. The MIPE format could then be used as a standard format to import into and/or export from these databases.
Molphy
Program Package for MOLecular PHYlogenetics
http://www.ism.ac.jp/ismlib/softother.e.html
Maintainer: Debian-Med Packaging Team
Version: 2.3b3
License: non-free
Official Debian package
ProtML is a main program in MOLPHY for inferring evolutionary trees from PROTein (amino acid) sequences by using the Maximum Likelihood method.
Other programs (C language)
 NucML:  Maximum Likelihood Inference of Nucleic Acid Phylogeny 
 ProtST: Basic Statistics of Protein Sequences 
 NucST:  Basic Statistics of Nucleic Acid Sequences 
 NJdist: Neighbor Joining Phylogeny from Distance Matrix 
Utilities (Perl)
 mollist:  get identifiers list        molrev:   reverse DNA sequences 
 molcat:   concatenate sequences       molcut:   get partial sequences 
 molmerge: merge sequences             nuc2ptn:  DNA -> Amino acid 
 rminsdel: remove INS/DEL sites        molcodon: get specified codon sites 
 molinfo:  get varied sites            mol2mol:  MOLPHY format beautifier 
 inl2mol:  Interleaved -> MOLPHY       mol2inl:  MOLPHY -> Interleaved 
 mol2phy:  MOLPHY -> Sequential        phy2mol:  Sequential -> MOLPHY 
 must2mol: MUST -> MOLPHY              etc. 
Mozilla-biofox
extension of bioinformatics tools to Iceape and Iceweasel browsers
http://schematron.unl.edu/biofox/
Maintainer: Debian-Med Packaging Team
Version: 1.1.4
License: DFSG free
Official Debian package
Code bioFOX aims at implementing various bioinformatics tools as an extension on the Iceape and Iceweasel browsers. Analysis of your favorite gene(s) usually require(s) retrieving it from a database like NCBI or Swiss-Prot and then performing one or more tasks including but not limited to:
 * Translation of a nucleotide sequence; 
 * Blast search (eg. blastn, blastp etc.) of the desired nucleotide/protein 
   sequence; 
 * Calculation of properties (like PI, charge, molecular weight, AT/GC content 
   etc.) of a protein/nucleotide sequence; 
 * Conversion between formats (Genbank, Fasta, Swiss-Prot etc.); 
 * Prediction of sequence for sub-cellular localization (PREDOTAR, TargetP, 
   pSORT etc). 
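Two of the tasks listed above, translation and GC content, are simple enough to sketch in Python. The sketch assumes the standard genetic code and reading frame 1 only. It is unrelated to the extension's own implementation, and the function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate frame 1; unknown codons become 'X', '*' marks a stop."""
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def gc_content(seq):
    """Fraction of G and C among all characters of the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)
```

Trailing bases that do not fill a complete codon are ignored by `translate`.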
Mummer
Efficient sequence alignment of full genomes
http://mummer.sourceforge.net/
Maintainer: Debian-Med Packaging Team
Version: 3.20
License: DFSG free
Official Debian package
MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. For example, MUMmer 3.0 can find all 20-basepair or longer exact matches between a pair of 5-megabase genomes in 13.7 seconds, using 78 MB of memory, on a 2.4 GHz Linux desktop computer. MUMmer can also align incomplete genomes; it handles the 100s or 1000s of contigs from a shotgun sequencing project with ease, and will align them to another set of contigs or a genome using the NUCmer program included with the system. If the species are too divergent for DNA sequence alignment to detect similarity, then the PROmer program can generate alignments based upon the six-frame translations of both input sequences.
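The task of finding all exact matches above a minimum length can be illustrated with a naive Python sketch. It uses a hash index of fixed-length words and extends each hit to the right, which is a swapped-in simplification for small inputs and not the suffix-tree approach that makes MUMmer fast. The function name is hypothetical.

```python
from collections import defaultdict


def exact_matches(ref, query, min_len=20):
    """Return sorted (ref_pos, query_pos, length) of maximal exact matches."""
    index = defaultdict(list)
    for i in range(len(ref) - min_len + 1):
        index[ref[i:i + min_len]].append(i)
    matches = []
    for j in range(len(query) - min_len + 1):
        for i in index.get(query[j:j + min_len], ()):
            # Only start from the left end of a match, so that each
            # maximal match is reported exactly once.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            length = min_len
            while (i + length < len(ref) and j + length < len(query)
                   and ref[i + length] == query[j + length]):
                length += 1
            matches.append((i, j, length))
    return sorted(matches)
```

The index grows with the reference length, so this sketch is only meant to convey the idea; it handles a single strand and makes no attempt at the memory efficiency quoted above.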
Muscle
Multiple alignment program of protein sequences
http://www.drive5.com/muscle/
Maintainer: Debian-Med Packaging Team
Version: 3.70+fix1
License: DFSG free
Official Debian package
MUSCLE is a multiple alignment program for protein sequences. MUSCLE stands for multiple sequence comparison by log-expectation. In the authors' tests, MUSCLE achieved the highest scores of all tested programs on several alignment accuracy benchmarks, and is also one of the fastest programs out there.
Ncbi-epcr
Tool to test a DNA sequence for the presence of sequence tagged sites
http://www.ncbi.nlm.nih.gov/sutils/e-pcr/
Maintainer: Debian-Med Packaging Team
Version: 2.3.9
License: DFSG free
Official Debian package
Electronic PCR (e-PCR) is a computational procedure that is used to identify sequence tagged sites (STSs) within DNA sequences. e-PCR looks for potential STSs in DNA sequences by searching for subsequences that closely match the PCR primers and have the correct order, orientation, and spacing that could represent the PCR primers used to generate known STSs.
The new version of e-PCR implements a fuzzy matching strategy. To reduce the likelihood that a true STS will be missed due to mismatches, multiple discontiguous words may be used instead of a single exact word. Each of these words has groups of significant positions separated by 'wildcard' positions that are not required to match. In addition, it is also possible to allow gaps in the primer alignments.
The main motivation for implementing reverse searching (called Reverse e-PCR) was to make it feasible to search the human genome sequence and other large genomes. The new version of e-PCR provides a search mode using a query sequence against a sequence database.
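The basic criterion of order, orientation and spacing can be sketched in Python. This sketch uses exact primer matching on one strand only, which is a deliberate simplification of the fuzzy matching described above. It accepts only the letters A, C, G and T in the reverse primer, and all function names and parameters are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_all(text, pattern):
    """All start positions of pattern in text, overlapping hits included."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def find_sts(template, forward, reverse, min_size, max_size):
    """Return (start, product_size) for primer pairs in the right layout."""
    template = template.upper()
    forward = forward.upper()
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its site appears as the reverse complement.
    right_site = reverse_complement(reverse.upper())
    hits = []
    for left in find_all(template, forward):
        for right in find_all(template, right_site):
            size = right + len(right_site) - left
            if right >= left + len(forward) and min_size <= size <= max_size:
                hits.append((left, size))
    return hits
```

The size window plays the role of the expected spacing: a primer pair that matches but yields a product outside the window is not reported.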
Ncbi-tools-bin
NCBI libraries for biology applications (text-based utilities)
http://www.ncbi.nlm.nih.gov/IEB/ToolBox/
Maintainer: Aaron M. Ucko
Version: 6.1.20080302
License: DFSG free
Official Debian package
This package includes various utilities distributed with the NCBI C SDK. None of the programs in this package require X; you can find the X-based utilities in the ncbi-tools-x11 package. BLAST and related tools are in a separate package (blast2).
Ncbi-tools-x11
NCBI libraries for biology applications (X-based utilities)
http://www.ncbi.nlm.nih.gov/IEB/ToolBox/
Maintainer: Aaron M. Ucko
Version: 6.1.20080302
License: DFSG free
Official Debian package
This package includes some X-based utilities distributed with the NCBI C SDK: Cn3D, Network Entrez, Sequin, ddv, and udv. These programs are not part of ncbi-tools-bin because they depend on several additional library packages.
Njplot
A tree drawing program
http://pbil.univ-lyon1.fr/software/njplot.html
Maintainer: Debian-Med Packaging Team
Version: 2.2
License: DFSG free
Official Debian package
NJplot is able to draw any tree expressed in the standard phylogenetic tree format (e.g., the format used by the Phylip package). NJplot is especially convenient for rooting the unrooted trees obtained from parsimony, distance or maximum likelihood tree-building methods.
Perlprimer
Graphical design of primers for PCR
http://perlprimer.sourceforge.net
Maintainer: Debian-Med Packaging Team
Version: 1.1.14
License: DFSG free
Official Debian package
PerlPrimer is a free, open-source GUI application written in Perl that designs primers for standard Polymerase Chain Reaction (PCR), bisulphite PCR, real-time PCR (QPCR) and sequencing. It aims to automate and simplify the process of primer design.
When operated online, the tool communicates with the Ensembl project to retrieve the gene structure, so that the location of exons and introns can be taken into account when designing primers. The sequences themselves can be retrieved, too.
Phylip
[Biology] A package of programs for inferring phylogenies
http://evolution.genetics.washington.edu/phylip.html
Maintainer: Debian-Med Packaging Team
Version: 1:3.67
License: non-free
Official Debian package
The PHYLogeny Inference Package is a package of programs for inferring phylogenies (evolutionary trees) from sequences. Methods that are available in the package include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees. Data types that can be handled include molecular sequences, gene frequencies, restriction sites, distance matrices, and 0/1 discrete characters.
Plasmidomics
draw plasmids and vector maps with PostScript graphics export
http://www.bioprocess.org/plasmid
Maintainer: Debian-Med Packaging Team
Version: 0.2.0
License: DFSG free
Official Debian package
Plasmidomics is written for easy drawing of plasmids and vector maps to use them in theses, presentations or other forms of publications. It natively supports PostScript as output format.
Poa
Partial Order Alignment for multiple sequence alignment
http://www.bioinformatics.ucla.edu/poa
Maintainer: Debian-Med Packaging Team
Version: 2.0+20060928
License: DFSG free
Official Debian package
POA is Partial Order Alignment, a fast program for multiple sequence alignment (MSA) in bioinformatics. Its advantages are speed, scalability, sensitivity, and the superior ability to handle branching / indels in the alignment. Partial order alignment is an approach to MSA which can be combined with existing methods such as progressive alignment. POA optimally aligns a pair of MSAs and can therefore be applied directly to progressive alignment methods such as CLUSTAL. For large alignments, Progressive POA is 10-30 times faster than CLUSTALW. POA is published in Bioinformatics. 2004 Jul 10;20(10):1546-56.
Primer3
Tool to design flanking oligonucleotides for DNA amplification
http://primer3.sourceforge.net
Maintainer: Debian-Med Packaging Team
Version: 1.1.3
License: DFSG free
Official Debian package
Primer3 picks primers for Polymerase Chain Reactions (PCRs), considering as criteria oligonucleotide melting temperature, size, GC content and primer-dimer possibilities, PCR product size, positional constraints within the source sequence, and miscellaneous other constraints. All of these criteria are user-specifiable as constraints, and some are specifiable as terms in an objective function that characterizes an optimal primer pair.
It has been published in Rozen S and Skaletsky H, "Primer3 on the WWW for general users and for biologist programmers.", Methods Mol Biol. 2000;132:365-86.
The Whitehead Institute for Biomedical Research provides a web-based front end to Primer3.
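Two of the criteria listed above, GC content and melting temperature, can be illustrated with a short Python sketch. It is not Primer3's implementation: the melting temperature uses the simple Wallace rule of thumb (2 degrees per A/T, 4 degrees per G/C), and the function names, thresholds, and example primer are assumptions chosen for this illustration.

```python
def gc_content(oligo):
    """Percentage of G and C bases in the oligonucleotide."""
    oligo = oligo.upper()
    return 100.0 * sum(base in "GC" for base in oligo) / len(oligo)


def wallace_tm(oligo):
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    oligo = oligo.upper()
    at = sum(base in "AT" for base in oligo)
    gc = sum(base in "GC" for base in oligo)
    return 2 * at + 4 * gc


def passes_constraints(oligo, min_len=18, max_len=25,
                       min_gc=40.0, max_gc=60.0, min_tm=50, max_tm=65):
    """Check a candidate primer against hypothetical user-specified limits."""
    return (min_len <= len(oligo) <= max_len
            and min_gc <= gc_content(oligo) <= max_gc
            and min_tm <= wallace_tm(oligo) <= max_tm)


candidate = "AGCTTGACCTGAAGTCCATG"   # hypothetical 20-mer
candidate_gc = gc_content(candidate)
candidate_tm = wallace_tm(candidate)
candidate_ok = passes_constraints(candidate)
```

Treating each criterion as a hard limit mirrors the "constraints" part of the description; turning some of them into penalty terms of an objective function would be the natural next step.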
Probcons
PROBabilistic CONSistency-based multiple sequence alignment
http://probcons.stanford.edu/
Maintainer: Debian-Med Packaging Team
Version: 1.12
License: DFSG free
Official Debian package
Tool for generating multiple alignments of protein sequences. Using a combination of probabilistic modeling and consistency-based alignment techniques, PROBCONS has achieved the highest accuracies of all alignment methods to date. On the BAliBASE benchmark alignment database, alignments produced by PROBCONS show statistically significant improvement over current programs, containing an average of 7% more correctly aligned columns than those of T-Coffee, 11% more correctly aligned columns than those of CLUSTAL W, and 14% more correctly aligned columns than those of DIALIGN. Probcons is published in Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S. 2005. Genome Research 15: 330-340.
Proda
multiple alignment of protein sequences
http://proda.stanford.edu/
Maintainer: Debian-Med Packaging Team
Version: 1.0
License: DFSG free
Official Debian package
ProDA is a system for automated detection and alignment of homologous regions in collections of proteins with arbitrary domain architectures. Given an input set of unaligned sequences, ProDA identifies all homologous regions appearing in one or more sequences, and returns a collection of local multiple alignments for these regions.
ProDA is published in: Phuong T.M., Do C.B., Edgar R.C., and Batzoglou S. Multiple alignment of protein sequences with repeats and rearrangements. Nucleic Acids Research 2006 34(20), 5932-5942.
Pymol
An OpenGL Molecular Graphics System written in Python
http://pymol.sourceforge.net
Maintainer: Debichem Team
Version: 1.1~beta3
License: DFSG free
Official Debian package
PyMOL is a molecular graphics system with an embedded Python interpreter designed for real-time visualization and rapid generation of high-quality molecular graphics images and animations. It is fully extensible and available free to everyone via the "Python" license. Although a newcomer to the field, PyMOL can already be used to generate stunning images and animations with unprecedented ease. It can also perform many other valuable tasks (such as editing PDB files) to assist you in your research.
R-cran-qtl
GNU R package for genetic marker linkage analysis
http://www.rqtl.org
Maintainer: Steffen Moeller
Version: 1.08
License: DFSG free
Official Debian package
R/qtl is an extensible, interactive environment for mapping quantitative trait loci (QTLs) in experimental crosses. It is implemented as an add-on package for the freely available and widely used statistical language/software R (see http://www.r-project.org).
Developing this software as an add-on to R makes it possible to take advantage of the basic mathematical and statistical functions, and the powerful graphics capabilities, that are provided with R. Further, the user will benefit from the seamless integration of the QTL mapping software into a general statistical analysis program. The goal is to make complex QTL mapping methods widely accessible and allow users to focus on modeling rather than computing.
A key component of computational methods for QTL mapping is the hidden Markov model (HMM) technology for dealing with missing genotype data. We have implemented the main HMM algorithms, with allowance for the presence of genotyping errors, for backcrosses, intercrosses, and phase-known four-way crosses.
The current version of R/qtl includes facilities for estimating genetic maps, identifying genotyping errors, and performing single-QTL genome scans and two-QTL, two-dimensional genome scans, by interval mapping (with the EM algorithm), Haley-Knott regression, and multiple imputation. All of this may be done in the presence of covariates (such as sex, age or treatment). One may also fit higher-order QTL models by multiple imputation.
Rasmol
Visualize biological macromolecules
http://rasmol.org
Maintainer: Teemu Ikonen
Version: 2.7.3.1
License: DFSG free
Official Debian package
RasMol is a molecular graphics program intended for the visualisation of proteins, nucleic acids and small molecules. The program is aimed at display, teaching and generation of publication quality images.
The program reads in a molecule coordinate file and interactively displays the molecule on the screen in a variety of colour schemes and molecule representations. Currently available representations include depth-cued wireframes, 'Dreiding' sticks, spacefilling (CPK) spheres, ball and stick, solid and strand biomolecular ribbons, atom labels and dot surfaces.
Supported input file formats include Protein Data Bank (PDB), Tripos Associates' Alchemy and Sybyl Mol2 formats, Molecular Design Limited's (MDL) Mol file format, Minnesota Supercomputer Center's (MSC) XYZ (XMol) format, CHARMm format, CIF format and mmCIF format files.
This package installs two versions of RasMol: rasmol-gtk has a modern GTK-based user interface, and rasmol-classic is the version with the old Xlib GUI.
 Homepage: http://rasmol.org 
Readseq
[Biology] Conversion between sequence formats
http://iubio.bio.indiana.edu/soft/molbio/readseq/
Maintainer: Debian-Med Packaging Team
Version: 1
License: DFSG free
Official Debian package
Reads and writes nucleic/protein sequences in various formats. Data files may have multiple sequences. Readseq is particularly useful as it automatically detects many sequence formats, and converts between them.
Seaview
Multiple sequence alignment editor
http://pbil.univ-lyon1.fr/software/seaview.html
Maintainer: Debian-Med Packaging Team
Version: 1:2.4
License: DFSG free
Official Debian package
SeaView is a graphical multiple sequence alignment editor developed by Manolo Gouy. Multiple alignment formats (NEXUS, MSF, CLUSTAL, FASTA, PHYLIP, MASE) are supported for reading and writing. Alignments can be manually edited. The user is further supported by an integration of external programs, e.g., to run DOT-PLOT or MUSCLE to locally improve the alignment.
When using SeaView for investigations that lead to a publication, please cite the following reference:
Galtier, N., Gouy, M. and Gautier, C. (1996) "SeaView and Phylo_win, two graphic tools for sequence alignment and molecular phylogeny." Comput. Applic. Biosci. 12:543-548.
Sibsim4
align expressed RNA sequences on a DNA template
http://sibsim4.sourceforge.net/
Maintainer: Debian-Med Packaging Team
Version: 0.16
License: DFSG free
Official Debian package
The SIBsim4 project is based on sim4, which is a program designed to align an expressed DNA sequence with a genomic sequence, allowing for introns. SIBsim4 is a fairly extensive rewrite of the original code with the following goals:
 * speed improvement; 
 * allow large, chromosome scale, DNA sequences to be used; 
 * provide more detailed output about splice types; 
 * provide more detailed output about polyA sites; 
 * misc code cleanups and fixes. 
Sigma-align
Simple greedy multiple alignment of non-coding DNA sequences
http://www.imsc.res.in/~rsidd/sigma/
Maintainer: Debian-Med Packaging Team
Version: 1.1.1
License: DFSG free
Official Debian package
Sigma ("Simple greedy multiple alignment") is an alignment program with a new algorithm and scoring scheme designed specifically for non-coding DNA sequence. It uses a strategy of seeking the best possible gapless local alignments, at each step making the best possible alignment consistent with existing alignments, and scores the significance of the alignment based on the lengths of the aligned fragments and a background model which may be supplied or estimated from an auxiliary file of intergenic DNA.
Sigma has been published in BMC Bioinformatics. 2006 Mar 16;7:143.
Sim4
tool for aligning cDNA and genomic DNA
http://www.bx.psu.edu/miller_lab/
Maintainer: Debian-Med Packaging Team
Version: 0.0.20030921
License: DFSG free
Official Debian package
sim4 is a similarity-based tool for aligning an expressed DNA sequence (EST, cDNA, mRNA) with a genomic sequence for the gene. It also detects end matches when the two input sequences overlap at one end (i.e., the start of one sequence overlaps the end of the other).
sim4 employs a blast-based technique to first determine the basic matching blocks representing the "exon cores". In this first stage, it detects all possible exact matches of W-mers (i.e., DNA words of size W) between the two sequences and extends them to maximal scoring gap-free segments. In the second stage, the exon cores are extended into the adjacent as-yet-unmatched fragments using greedy alignment algorithms, and heuristics are used to favor configurations that conform to the splice-site recognition signals (GT-AG, CT-AC). If necessary, the process is repeated with less stringent parameters on the unmatched fragments.
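The first stage described above (exact W-mer matches extended to gap-free segments) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not sim4's code: the function names, the match and mismatch scores, and the example sequences are hypothetical, and the second stage and splice-site heuristics are not covered.

```python
def exact_wmer_seeds(cdna, genomic, w):
    """Return (i, j) pairs where cdna[i:i+w] == genomic[j:j+w]."""
    index = {}
    for j in range(len(genomic) - w + 1):
        index.setdefault(genomic[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(cdna) - w + 1)
            for j in index.get(cdna[i:i + w], [])]


def extend_gap_free(cdna, genomic, i, j, w, match=1, mismatch=-5):
    """Extend a seed in both directions without gaps and keep the
    best-scoring extent. Returns (cdna_start, genomic_start, length, score)."""
    def best_extension(pairs):
        score = best = best_len = 0
        for k, (x, y) in enumerate(pairs, start=1):
            score += match if x == y else mismatch
            if score > best:
                best, best_len = score, k
        return best, best_len

    right = zip(cdna[i + w:], genomic[j + w:])
    left = zip(reversed(cdna[:i]), reversed(genomic[:j]))
    right_score, right_len = best_extension(right)
    left_score, left_len = best_extension(left)
    return (i - left_len, j - left_len,
            left_len + w + right_len,
            w * match + left_score + right_score)


cdna = "ACGTGCA"          # hypothetical expressed sequence
genomic = "TTACGTGCATT"   # hypothetical genomic sequence
seeds = exact_wmer_seeds(cdna, genomic, 4)
segments = {extend_gap_free(cdna, genomic, i, j, 4) for i, j in seeds}
```

In this example all four seeds lie on the same diagonal, so they collapse into a single extended segment, which is the kind of block the text calls an "exon core".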
T-coffee
Multiple Sequence Alignment
http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html
Maintainer: Debian-Med Packaging Team
Version: 5.31
License: DFSG free
Official Debian package
T-Coffee is a multiple sequence alignment package. Given a set of sequences (Proteins or DNA), T-Coffee generates a multiple sequence alignment. Version 2.00 and higher can mix sequences and structures.
T-Coffee allows the combination of a collection of multiple/pairwise, global or local alignments into a single model. It also makes it possible to estimate the level of consistency of each position within the new alignment with the rest of the alignments. See the pre-print for more information.
Tigr-glimmer
Gene detection in archaea and bacteria
http://www.cbcb.umd.edu/software/glimmer
Maintainer: Debian-Med Packaging Team
Version: 3.02
License: DFSG free
Official Debian package
Developed by the TIGR institute, this software detects coding sequences in bacteria and archaea.
Glimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria and archaea. Glimmer (Gene Locator and Interpolated Markov Modeler) uses interpolated Markov models (IMMs) to identify the coding regions and distinguish them from noncoding DNA.
Tree-ppuzzle
Parallelized reconstruction of phylogenetic trees by maximum likelihood
http://www.tree-puzzle.de
Maintainer: Debian-Med Packaging Team
Version: 5.2
License: DFSG free
Official Debian package
TREE-PUZZLE (the new name for PUZZLE) is an interactive console program that implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimations of support to each internal branch. TREE-PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths for user specified trees. Branch lengths can also be calculated under the clock-assumption. In addition, TREE-PUZZLE offers a novel method, likelihood mapping, to investigate the support of a hypothesized internal branch without computing an overall tree and to visualize the phylogenetic content of a sequence alignment.
This is the parallelized version of tree-puzzle.
Tree-puzzle
Reconstruction of phylogenetic trees by maximum likelihood
http://www.tree-puzzle.de
Maintainer: Debian-Med Packaging Team
Version: 5.2
License: DFSG free
Official Debian package
TREE-PUZZLE (the new name for PUZZLE) is an interactive console program that implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimations of support to each internal branch. TREE-PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths for user specified trees. Branch lengths can also be calculated under the clock-assumption. In addition, TREE-PUZZLE offers a novel method, likelihood mapping, to investigate the support of a hypothesized internal branch without computing an overall tree and to visualize the phylogenetic content of a sequence alignment.
Treetool
interactive tool for displaying phylogenetic trees
http://iubio.bio.indiana.edu/soft/molbio/unix/treetool/
Maintainer: Debian-Med Packaging Team
Version: 2.0.2a
License: non-free
Official Debian package
Treetool is an interactive tool for displaying, editing, and printing phylogenetic trees. The tree is displayed visually on screen, in various formats, and the user is able to modify the format, structure, and characteristics of the tree. Trees may be viewed, compared, formatted for printing, constructed from smaller trees, etc.
Development of this software stopped in 1995.
Treeviewx
Displays and prints phylogenetic trees
http://darwin.zoology.gla.ac.uk/~rpage/treeviewx/
Maintainer: Debian-Med Packaging Team
Version: 0.5.1
License: DFSG free
Official Debian package
TreeView X is an open source and multi-platform program to display phylogenetic trees. It can read and display NEXUS and Newick format tree files (such as those output by PAUP*, ClustalX, TREE-PUZZLE, and other programs). It can reorder the branches of the trees and export the trees in SVG format.
The program was written by Rod Page (r.page@bio.gla.ac.uk) using the wxWidgets C++ library. It was published in Computer Applications in the Biosciences. 1996 12: 357-358.
Wise
comparison of biopolymers, commonly DNA and protein sequences
http://www.ebi.ac.uk/~birney/wise2/
Maintainer: Debian-Med Packaging Team
Version: 2.4.1
License: DFSG free
Official Debian package
Wise2 is a package focused on comparisons of biopolymers, commonly DNA and protein sequences. There are many other packages which do this, probably the best known being the BLAST package (from NCBI) and the Fasta package (from Bill Pearson). There are other packages, such as the HMMER package (Sean Eddy) or the SAM package (UC Santa Cruz), focused on hidden Markov models (HMMs) of biopolymers.
Wise2's particular forte is the comparison of DNA sequence at the level of its protein translation. This comparison allows the simultaneous prediction of, say, gene structure with homology based alignment.
Wise2 also contains other algorithms, such as the venerable Smith-Waterman algorithm, or more modern ones such as Stephen Altschul's generalised gap penalties, or even experimental ones developed in house, such as dba. The development of these algorithms is due to the ease of developing such algorithms in the environment used by Wise2.
Wise2 has also been written with an eye for reuse and maintainability. Although it is a pure C package you can access its functionality directly in Perl. Parts of the package (or the entire package) can be used by other C or C++ programs without namespace clashes as all externally linked variables have the unique identifier Wise2 prepended.
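For readers unfamiliar with the Smith-Waterman algorithm mentioned above, here is a minimal Python sketch of its scoring recurrence. It is a textbook version with a linear gap penalty, not Wise2's implementation; the function name and the scoring values are assumptions made for this example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)   # dynamic programming row for a[:i-1]
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cell = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


identical = smith_waterman_score("ACGT", "ACGT")
embedded = smith_waterman_score("GGACGTGG", "CCACGTCC")
unrelated = smith_waterman_score("AAAA", "TTTT")
```

Only two rows of the matrix are kept because the sketch reports the score alone; recovering the alignment itself would require storing the full matrix and tracing back from the best cell.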

Experimental or unofficial Debian packages, projects with packaging stuff in SVN

Altree (WNPP)
program to perform phylogeny based analyses
http://claire.bardel.free.fr/software.html
Responsible: Charles Plessy
Version: N/A
License: GPL
Unofficial Debian package
This software was designed to perform phylogeny based analysis: first, it allows the detection of an association between a candidate gene and a disease, and second, it makes it possible to formulate hypotheses about the susceptibility loci.
Ballview
free molecular modeling and molecular graphics tool
http://www.ballview.org
Responsible: Andreas Moll
Version: N/A
License: LGPL
Unofficial Debian package
BALLView provides fast OpenGL-based visualization of molecular structures, molecular mechanics methods (minimization, MD simulation using the AMBER, CHARMM, and MMFF94 force fields), calculation and visualization of electrostatic properties (FDPB) and molecular editing features.
BALLView is based on BALL (Biochemical Algorithms Library), which is currently being developed in the groups of Hans-Peter Lenhof (Saarland University, Saarbruecken, Germany) and Oliver Kohlbacher (University of Tuebingen, Germany). BALL is an application framework in C++ that has been specifically designed for rapid software development in Molecular Modeling and Computational Molecular Biology. It provides an extensive set of data structures as well as classes for Molecular Mechanics, advanced solvation methods, comparison and analysis of protein structures, file import/export, and visualization.
Cluster3 (WNPP)
find clustering solutions for genome data
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#ctv
Responsible: Steffen Moeller
Version: N/A
License: non-free
Unofficial Debian package
Cluster 3.0 is an enhanced version of Cluster, which was originally developed by Michael Eisen while at Stanford University. The main improvement consists of the k-means algorithm, which now includes multiple trials to find the best clustering solution. This is crucial for the k-means algorithm to be reliable. The routine for self-organizing maps was extended to include 2D rectangular geometries. The Euclidean distance and the city-block distance were added to the available measures of similarity.
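The reason multiple trials matter for k-means is that each run starts from random centers and may end in a different local optimum. The Python sketch below illustrates the idea; it is not Cluster 3.0's code, and the function names, the fixed random seed, and the example points are assumptions for this illustration. The two distance measures named above are shown as plain functions, with the Euclidean one used for the clustering.

```python
import random


def euclidean(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5


def city_block(p, q):
    return sum(abs(x - y) for x, y in zip(p, q))


def kmeans_once(points, k, rng, iterations=100):
    """One k-means run from random initial centers; returns (cost, centers)."""
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: euclidean(p, centers[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    cost = sum(min(euclidean(p, c) for c in centers) ** 2 for p in points)
    return cost, centers


def kmeans_best_of(points, k, trials=10, seed=0):
    """Repeat k-means several times and keep the lowest-cost solution."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(trials)),
               key=lambda result: result[0])


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best_cost, best_centers = kmeans_best_of(points, 2)
```

The cost used to compare trials is the within-cluster sum of squared distances, a common choice that is assumed here rather than taken from the Cluster 3.0 documentation.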
Dazzle
Java-based DAS server
http://www.biojava.org/dazzle
Responsible: Steffen Moeller
Version: N/A
License: LGPL
Unofficial Debian package
Dazzle is a general purpose server for the Distributed Annotation System (DAS) protocol. It is implemented as a Java servlet, using the BioJava APIs. Dazzle is a modular system which uses small "datasource" plugins to provide access to a range of databases. Several general-purpose plugins are included in the package, and it is straightforward to develop new plugins to connect to your own databases.
Information on DAS is available from http://www.biodas.org/
Ecell (WNPP)
Concept and environment for constructing virtual cells on computers
http://www.e-cell.org/
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
The E-Cell Project is an international research project aiming at developing necessary theoretical supports, technologies and software platforms to allow precise whole cell simulation.
The E-Cell System is an object-oriented software suite for modeling, simulation, and analysis of large scale complex systems such as biological cells, architected by Kouichi Takahashi and written by a team of developers.
The core part of the system, E-Cell Simulation Environment version 3, allows many components driven by multiple algorithms with different timescales to coexist.
E-Cell System consists of the following three major parts:
 * E-Cell Simulation Environment (or E-Cell SE) 
 * E-Cell Modeling Environment (or E-Cell ME) 
 * E-Cell Analysis Toolkit. 
Gamgi (WNPP)
general atomistic modelling graphic interface
http://www.gamgi.org/
Responsible: Steffen Moeller
Version: N/A
License: Free
Unofficial Debian package
GAMGI provides a graphical user interface for the handling of molecular structures.
Gbrowse (WNPP)
The Generic Genome Browser from GMOD
http://www.gmod.org/wiki/index.php/GBrowse
Responsible: Charles Plessy
Version: N/A
License: Perl Artistic License, plus additional clauses
Unofficial Debian package
The Generic Genome Browser is a combination of database and interactive Web page for manipulating and displaying annotations on genomes. Some of its features:
 * Simultaneous bird's eye and detailed views of the genome. 
 * Scroll, zoom, center. 
 * Attach arbitrary URLs to any annotation. 
 * Order and appearance of tracks are customizable by administrator and end-user. 
 * Search by annotation ID, name, or comment. 
 * Supports third party annotation using GFF formats. 
 * Settings persist across sessions. 
 * DNA and GFF dumps. 
 * Connectivity to different databases, including BioSQL and Chado. 
 * Multi-language support. 
 * Third-party feature loading. 
 * Customizable plug-in architecture (e.g. run BLAST, dump & import many formats, 
   find oligonucleotides, design primers, create restriction maps, edit features) 
Haploview (WNPP)
Analysis and visualization of LD and haplotype maps
http://www.broad.mit.edu/mpg/haploview/
Responsible: Steffen Moeller
Version: N/A
License: DFSG free
Unofficial Debian package
This tool assists in the analysis of the nucleotide variation in a population. Such investigations are performed to determine genes and genetic pathways that are associated with diseases. This is an early stage in the quest for new drugs.
Infernal (WNPP)
RNA sequence comparison
http://infernal.janelia.org/
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence.
The tool is an integral component of the Rfam database.
Users of this package should cite: "Query-Dependent Banding (QDB) for Faster RNA Similarity Searches."
 E. P. Nawrocki, S. R. Eddy. PLoS Comput. Biol., 3:e56, 2007. 
Jmol
molecule viewer
http://jmol.sourceforge.net/
Responsible: Daniel Leidert
Version: N/A
License: LGPL
Unofficial Debian package
Jmol is a free, open source molecule viewer for students, educators, and researchers in chemistry and biochemistry.
 * The JmolApplet is a web browser applet that can be integrated into web pages. 
 * The Jmol application is a standalone Java application that runs on the 
   desktop. 
 * The JmolViewer is a development tool kit that can be integrated into other 
   Java applications. 

For more detailed information about packaging status please see http://lists.debian.org/debian-med/2008/03/msg00097.html
Jtreeview (WNPP)
Java re-implementation of Michael Eisen's TreeView
http://jtreeview.sourceforge.net/
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
TreeView creates a matrix-like display of expression data, known as Eisen clustering. The original implementation was a Windows program named TreeView by Michael Eisen. This TreeView package, sometimes also referred to as jTreeView, was rewritten in Java under a free license. The original implementation also comes with source code but restricts commercial distribution, and it did not run on Unix.
Java TreeView is an extensible viewer for microarray data in PCL or CDT format.
Mage2tab (WNPP)
MAGE-MLv1 converter and visualiser
https://www.cbil.upenn.edu/magewiki/index.php/mage2tab
Responsible: Charles Plessy
Version: N/A
License: CBIL Software and Data License (Apache-like)
Unofficial Debian package
This tool-kit is part of MR_T, a framework for importing or exporting various MAGE (MicroArray Gene Expression) documents (MAGE-MLv1, MAGE-TAB, SOFT, MINiML) from or into databases like GUS (the Genomics Unified Schema, www.gusdb.org).
Martj
distributed data integration system for biological data
http://www.ebi.ac.uk/biomart/
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
BioMart is a simple, distributed data integration system with powerful query capabilities. The BioMart data model has been applied to the following data sources: UniProt Proteomes, Macromolecular Structure Database (MSD), Ensembl, Vega, and dbSNP.
Mauvealigner
multiple genome alignment algorithms
http://asap.ahabs.wisc.edu/mauve/
Responsible: Andreas Tille
Version: N/A
License: GPL
Unofficial Debian package
The mauveAligner and progressiveMauve alignment algorithms have been implemented as command-line programs included with the downloadable Mauve software. When run from the command-line, these programs provide options not yet available in the graphical interface.
Mauve is a system for efficiently constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion. Multiple genome alignment provides a basis for research into comparative genomics and the study of evolutionary dynamics. Aligning whole genomes is a fundamentally different problem than aligning short sequences.
Mauve has been developed with the idea that a multiple genome aligner should require only modest computational resources. It employs algorithmic techniques that scale well in the amount of sequence being aligned. For example, a pair of Y. pestis genomes can be aligned in under a minute, while a group of 9 divergent Enterobacterial genomes can be aligned in a few hours.
Mauve computes and interactively visualizes genome sequence comparisons. Using FastA or GenBank sequence data, Mauve constructs multiple genome alignments that identify large-scale rearrangement, gene gain, gene loss, indels, and nucleotide substitution.
Mauve is developed at the University of Wisconsin.
Meme
motif discovery and search
http://meme.nbcr.net/meme/
Responsible: Steffen Moeller
Version: N/A
License: non-free for commercial purpose (http://meme.nbcr.net/meme/COPYRIGHT.html)
Unofficial Debian package
MEME is a tool for discovering motifs in a group of related DNA or protein sequences. A motif is a sequence pattern that occurs repeatedly in a group of related protein or DNA sequences. MEME represents motifs as position-dependent letter-probability matrices which describe the probability of each possible letter at each position in the pattern. Individual MEME motifs do not contain gaps. Patterns with variable-length gaps are split by MEME into two or more separate motifs.
MEME takes as input a group of DNA or protein sequences (the training set) and outputs as many motifs as requested. MEME uses statistical modeling techniques to automatically choose the best width, number of occurrences, and description for each motif.
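A position-dependent letter-probability matrix is easy to picture with a small Python sketch. The example below only builds and applies such a matrix from sites that are already aligned; it does not reproduce MEME's statistical search for motifs in unaligned sequences. The function names, the pseudocount parameter, and the example sites are assumptions for this illustration.

```python
def letter_probability_matrix(sites, alphabet="ACGT", pseudocount=0.0):
    """Build one probability column per motif position from gapless,
    equal-length sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(alphabet)
    matrix = []
    for position in range(width):
        column = [site[position] for site in sites]
        matrix.append({letter: (column.count(letter) + pseudocount) / total
                       for letter in alphabet})
    return matrix


def site_probability(matrix, site):
    """Probability of a candidate site under the motif model."""
    probability = 1.0
    for column, letter in zip(matrix, site):
        probability *= column[letter]
    return probability


sites = ["TATA", "TATA", "TACA", "TATT"]   # hypothetical aligned occurrences
motif = letter_probability_matrix(sites)
consensus_probability = site_probability(motif, "TATA")
```

With a pseudocount of zero, any letter never seen at a position gets probability 0, so a nonzero pseudocount is usually preferable when scoring new sequences.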
Mgltools (WNPP)
preparation of proteins and ligands to investigate their binding
http://mgltools.scripps.edu
Responsible: Steffen Moeller
Version: N/A
License: various custom non-free, mostly academia-only
Unofficial Debian package
The package comprises AutoDockTools, Python Molecular Viewer, Vision and many smaller supporting Python libraries, all of which are intended to become available as Debian packages if their licenses permit. The essential mslib library, for instance, comes with binaries which should not go into Debian. The analysis itself is performed with AutoDock, which was recently made available under the GPL.
Upstream is very supportive, but their binary-only libraries come from a third party. This set of tools is well known across biochemistry and computational/structural biology.
Mira (WNPP)
Whole Genome Shotgun and EST Sequence Assembler
http://chevreux.org/projects_mira.html
Responsible: Charles Plessy
Version: N/A
License: GPL
Unofficial Debian package
The mira genome fragment assembler is a specialised assembler for sequencing projects classified as 'hard' due to a high number of similar repeats. For expressed sequence tag (EST) transcripts, miraEST is specialised in reconstructing pristine mRNA transcripts while detecting and classifying single nucleotide polymorphisms (SNPs) occurring in different variations thereof.
The assembler is routinely used for such various tasks as mutation detection in different cell types, similarity analysis of transcripts between organisms, and pristine assembly of sequences from various sources for oligo design in clinical microarray experiments.
Murasaki
homology detection tool across multiple large genomes
http://murasaki.dna.bio.keio.ac.jp/
Responsible: no one
Version: N/A
License: GPL
Unofficial Debian package
Murasaki is a scalable and fast, language theory-based homology detection tool across multiple large genomes. It enables whole-genome scale multiple genome global alignments. It supports unlimited length gapped-seed patterns and unique TF-IDF based filtering.
Murasaki is an anchor alignment software, which is
 * extremely fast (17 CPU hours for whole Human x Mouse genome (with 
   40 nodes: 52 wall minutes)) 
 * scalable (Arbitrarily parallelizable across multiple nodes using MPI. 
   Even a single node with 16GB of ram can handle over 1Gbp of sequence.) 
 * unlimited pattern length 
 * repeat tolerant 
 * intelligent noise reduction 
Ncoils (WNPP)
coiled coil secondary structure prediction
http://www.russell.embl.de/cgi-bin/coils-svr.pl
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
The program predicts coiled-coil secondary structure from protein sequences. The algorithm was published in Lupas, van Dyke & Stock, "Predicting coiled coils from protein sequences", Science, 252, 1162-1164, 1991.
Phylographer (WNPP)
Graph Visualization Tool
http://www.atgc.org/PhyloGrapher/PhyloGrapher_Welcome.html
Responsible: Charles Plessy
Version: N/A
License: GPL
Unofficial Debian package
PhyloGrapher is a program designed to visualize and study evolutionary relationships within families of homologous genes or proteins (elements). PhyloGrapher is a drawing tool that generates custom graphs for a given set of elements. In general, it is possible to use PhyloGrapher to visualize any type of relations between elements. Used in conjunction with tcl_blast_parser, PhyloGrapher can represent the results of a BLAST search as a graph.
PhyloGrapher and tcl_blast_parser are useful tools to analyse BLAST biological sequence alignment reports (BLAST is provided by Debian's blast2 package).
Plink
whole-genome association analysis toolset
http://pngu.mgh.harvard.edu/~purcell/plink/
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
plink expects as input the data from SNP (single nucleotide polymorphism) chips of many individuals and their phenotypical description of a disease. It finds associations of single or pairs of DNA variations with a phenotype and can retrieve SNP annotation from an online source.
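To give an idea of what a single-SNP association test looks like, the Python sketch below computes the standard chi-square statistic for a 2x2 table of allele counts in cases and controls. It is a generic textbook calculation, not plink's code; the function name and the counts are hypothetical.

```python
def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Chi-square statistic (1 degree of freedom) for a 2x2 table of
    allele counts: rows are cases/controls, columns are the two alleles."""
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    denominator = row_case * row_control * col_a * col_b
    if denominator == 0:
        raise ValueError("every row and column must contain observations")
    return n * (case_a * control_b - case_b * control_a) ** 2 / denominator


# Hypothetical allele counts for one SNP.
associated = allelic_chi_square(60, 40, 40, 60)
no_signal = allelic_chi_square(50, 50, 50, 50)
```

A genome-wide scan repeats such a test for every SNP, so the resulting statistics have to be interpreted with a correction for multiple testing.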
Smile (WNPP)
infer motifs in a set of sequences
http://www-igm.univ-mlv.fr/~marsan/smile_english.html
Responsible: Steffen Moeller
Version: N/A
License: GPL
Unofficial Debian package
SMILE is a tool that infers motifs in a set of sequences, according to some criteria. It was first made to infer exceptional sites such as binding sites in DNA sequences. Since version 1.4, it can infer motifs written on any alphabet (even degenerate) in any kind of sequences.
The distinguishing feature of SMILE is that it can deal with what we call structured motifs, which are motifs associated by some distance constraints.
Tacg (WNPP)
command line program for finding patterns in nucleic acids
http://sourceforge.net/projects/tacg
Responsible: Charles Plessy
Version: N/A
License: GPL and others
Unofficial Debian package