Summary
Biology
Debian Med micro-biology packages
This meta package will install Debian packages related to molecular biology,
structural biology and bioinformatics for use in life sciences.
The list to the right includes various software projects which are of some interest to the Debian Med Project. Currently, only a few of them are available as Debian packages. It is our goal, however, to include all software in Debian Med which can sensibly add to a high quality Debian Integrated Solution.
For a better overview of the project's availability as a Debian package, each head row has a color code according to this scheme:
If you discover a project which looks like a good candidate for Debian Med,
or if you have prepared an unofficial Debian package, please do not hesitate to
send a description of that project to the Debian Med mailing list.
|
Debian Med Biology packages
Official Debian packages
Adun.app
Molecular Simulator for GNUstep
|
Version: 0.8.2
License: DFSG free
Official Debian package
-
|
|
This is a new extendible molecular simulation program that also
includes data management and analysis capabilities.
|
Altree
program to perform phylogeny based analyses
|
Version: 1.0.1
License: DFSG free
Official Debian package
-
|
|
ALTree was designed to perform phylogeny based analysis: first,
it allows the detection of an association between a candidate gene and
a disease, and second, it allows hypotheses to be made about the
susceptibility loci.
|
Amap-align
Protein multiple alignment by sequence annealing
|
Version: 2.2
License: DFSG free
Official Debian package
-
|
AMAP is a command line tool to perform multiple alignment of peptide
sequences. It utilizes posterior decoding, and a sequence-annealing
alignment, instead of the traditional progressive alignment method. It
is the only alignment program that allows control of the sensitivity /
specificity tradeoff. It is based on the ProbCons source code, but
uses alignment metric accuracy and eliminates the consistency
transformation.
The Java visualisation tool of AMAP 2.2 is not yet packaged in Debian.
|
Arb
Integrated package for sequence database handling and analysis
|
Version: 0.0.20071207.1
License: non-free
Debian package in non-free
-
|
The ARB software is a graphically oriented package comprising various tools
for sequence database handling and data analysis. A central database of
processed (aligned) sequences and any type of additional data linked to the
respective sequence entries is structured according to phylogeny or other
user defined criteria.
The ARB project (Latin, "arbor"=tree) is a joint initiative of the Lehrstuhl
fuer Mikrobiologie http://www.mikro.biologie.tu-muenchen.de/ and the
Lehrstuhl fuer Rechnertechnik und Rechnerorganisation
http://wwwbode.informatik.tu-muenchen.de/ of the Technical University
of Munich.
|
Autodock
analysis of ligand binding to protein structure
|
Version: 4.0.1
License: DFSG free
Official Debian package
-
|
AutoDock is a prime representative of the programs addressing the
simulation of the docking of fairly small chemical ligands to rather big
protein receptors. Earlier versions had all flexibility in the ligands
while the protein was kept rather rigid. This latest version 4 also
allows for a flexibility of selected sidechains of surface residues,
i.e., takes the rotamers into account.
The AutoDock program performs the docking of the ligand to a set of
grids describing the target protein. AutoGrid pre-calculates these grids.
|
Autogrid
pre-calculate binding of ligands to their receptor
|
Version: 4.0.1
License: DFSG free
Official Debian package
-
|
The AutoDockSuite addresses the molecular analysis of the docking of
smaller chemical compounds to their receptors of known three-dimensional
structure.
The AutoGrid program performs pre-calculations for the docking of a
ligand to a set of grids that describe the effect that the protein has
on point charges. The effect of these forces on the ligand is then
analysed by the AutoDock program.
|
Biococoa.app
biological sequence file format conversion applet for GNUstep
|
Version: 1.6.0
License: DFSG free
Official Debian package
-
|
Demo application to demonstrate the possibilities of the BioCocoa framework.
This package contains a GNUstep applet to convert between sequence file
formats. The BioCocoa framework provides developers with the opportunity to
add support for reading and writing BEAST, Clustal, EMBL, Fasta, GCG-MSF, GDE,
Hennig86, NCBI, NEXUS, NONA, PDB, Phylip, PIR, Plain/Raw, Swiss-Prot and TNT
files by writing only three lines of code. The framework is written in Cocoa
(Objective-C).
Version 1.6 is the last upstream version that works with GNUstep. If
newer versions are needed to work under Linux try to convince upstream to
support GNUstep.
|
Biosquid
utilities for biological sequence analysis
|
Version: 1.9g+cvs20050121
License: DFSG free
Official Debian package
-
|
SQUID is a library of C code functions for sequence analysis. It also
includes a number of small utility programs to convert, show statistics,
manipulate and do other functions on sequence files.
The original name of the package is "squid", but since there is already
a squid on the archive (a proxy cache), it was renamed to "biosquid".
|
Blast2
Basic Local Alignment Search Tool
|
Version: 1:2.2.18.20080302
License: DFSG free
Official Debian package
-
|
The famous sequence alignment program. This is the "official" NCBI version,
#2. The blastall executable allows you to give a nucleotide or protein
sequence to the program. It is compared against databases and a summary of
matches is returned to the user.
Note that databases are not included in Debian; they must be retrieved
manually.
|
Boxshade
Pretty-printing of multiple sequence alignments
|
Version: 3.3.1
License: DFSG free
Official Debian package
-
|
Boxshade is a program for creating good looking printouts from
multiple-aligned protein or DNA sequences. The program does not perform
the alignment by itself and requires as input a file that was created
by a multiple alignment program or manually edited with respective tools.
Boxshade reads multiple-aligned sequences from either PILEUP-MSF,
CLUSTAL-ALN, MALIGNED-data and ESEE-save files (limited to a maximum
of 150 sequences with up to 10000 elements each). Various kinds of
shading can be applied to identical/similar residues. Output is written
to screen or to a file in the following formats: ANSI/VT100, PS/EPS,
RTF, HPGL, ReGIS, LJ250-printer, ASCII, xFIG, PICT, HTML
|
Clustalw
global multiple nucleotide or peptide sequence alignment
|
Version: 2.0.9
License: non-free
Debian package in non-free
-
|
This program performs an alignment of multiple nucleotide or amino acid
sequences. It recognizes the format of input sequences and whether the
sequences are nucleic acid (DNA/RNA) or amino acid (proteins). The output
format may be selected from various formats for multiple alignments such as
Phylip or FASTA. Clustal W is very well accepted.
The output of Clustal W can be edited manually but preferably with an
alignment editor like SeaView or within its companion Clustal X. When building
a model from your alignment, this can be applied for improved database
searches. The Debian package hmmer creates such a model in the form of an HMM.
For details and citation purposes see paper "Clustal W and Clustal X version
2.0", Larkin M., et al. Bioinformatics 2007 23(21):2947-2948
|
Clustalw-mpi
MPI-distributed global sequence alignment with ClustalW
|
Version: 0.15
License: non-free
Debian package in non-free
-
|
ClustalW is a popular tool for multiple sequence alignment. The
alignment is achieved via three steps: pairwise alignment,
guide-tree generation and progressive alignment. ClustalW-MPI is an
MPI implementation of ClustalW. Based on
version 1.82 of the original ClustalW, both the pairwise
and progressive alignments are parallelized with MPI, a
popular message passing programming standard. The
pairwise alignments can be easily parallelized since the many
alignments are independent of each other. However,
the progressive alignments are essentially not parallelizable
because of the time dependencies between each alignment.
Here the recursive parallelism paradigm is applied to the linear space
profile-profile alignment algorithm. This approach is more time
efficient on computers with distributed memory architecture.
The traditional approach that relies on precomputing the profile-profile
score matrix has also been implemented. Results show that the latter is indeed
more appropriate for shared memory multiprocessor computers.
ClustalX is suggested for its support for local realignments; seaview
is a versatile editor of alignments.
The original ClustalW/ClustalX can be found at
URL: http://www.clustal.org/download/pre-2/
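The point made above, that the pairwise alignments are independent of each other and therefore easy to parallelize, can be illustrated with a minimal Python sketch. It is not ClustalW-MPI code: it uses a thread pool instead of MPI, and `mismatch_fraction` is a hypothetical toy score standing in for a real pairwise alignment.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def mismatch_fraction(pair):
    """Toy stand-in for a pairwise alignment score: fraction of differing
    positions over the shorter sequence (gaps are not considered)."""
    a, b = pair
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / n


def pairwise_distances(seqs, workers=4):
    """Score every unordered pair of sequences concurrently.

    Each pair is independent of all the others, so the tasks can be
    handed to a worker pool in any order.
    """
    index_pairs = list(combinations(range(len(seqs)), 2))
    seq_pairs = [(seqs[i], seqs[j]) for i, j in index_pairs]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(mismatch_fraction, seq_pairs))
    return dict(zip(index_pairs, scores))


if __name__ == "__main__":
    demo = ["ACGT", "ACGA", "TCGA"]
    for (i, j), d in pairwise_distances(demo).items():
        print(i, j, d)
```

The progressive stage has no such simple decomposition, because each merge depends on the results of earlier merges along the guide tree.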
|
Clustalx
GUI for Clustal W
|
Version: 1.83
License: non-free
Debian package in non-free
-
|
This package offers a GUI interface for the Clustal W multiple sequence
alignment program. It provides an integrated environment for performing
multiple sequence and profile alignments and analysing the results.
The sequence alignment is displayed in a window on the screen.
A versatile coloring scheme has been incorporated to highlight conserved
features in the alignment. For professional presentations, one should
use the texshade LaTeX package or boxshade.
The pull-down menus at the top of the window allow you to select all the
options required for traditional multiple sequence and profile alignment.
You can cut-and-paste sequences to change the order of the alignment; you can
select a subset of sequences to be aligned; you can select a sub-range of the
alignment to be realigned and inserted back into the original alignment.
An alignment quality analysis can be performed and low-scoring segments or
exceptional residues can be highlighted.
|
Dialign
Segment-based multiple sequence alignment
|
Version: 2.2.1
License: DFSG free
Official Debian package
-
|
|
DIALIGN2 is a command line tool to perform multiple alignment of
protein or DNA sequences. It constructs alignments from gap-free pairs
of similar segments of the sequences. This scoring scheme for
alignments is the basic difference between DIALIGN and other global or
local alignment methods. Note that DIALIGN does not employ any kind of
gap penalty. It has been published by Morgenstern B. in
Bioinformatics. 1999 Mar;15(3):211-8.
|
Dialign-tx
Segment-based multiple sequence alignment
|
Version: 1.0.1
License: DFSG free
Official Debian package
-
|
DIALIGN-TX is a command line tool to perform multiple alignment of protein or
DNA sequences. It is a complete reimplementation of the segment-based approach
including several new improvements and heuristics that significantly enhance
the quality of the output alignments compared to DIALIGN 2.2 and DIALIGN-T.
For pairwise alignment, DIALIGN-TX uses a fragment-chaining algorithm that
favours chains of low-scoring local alignments over isolated high-scoring
fragments. For multiple alignment, DIALIGN-TX uses an improved greedy
procedure that is less sensitive to spurious local sequence similarities.
DIALIGN-TX has been published in Amarendran R. Subramanian, Michael Kaufmann,
Burkhard Morgenstern: Improvement of the segment-based approach for multiple
sequence alignment by combining greedy and progressive alignment strategies,
Algorithms for Molecular Biology 3:6, 2008
|
Emboss
The European Molecular Biology Open Software Suite
|
Version: 5.0.0
License: DFSG free
Official Debian package
-
|
EMBOSS is a free Open Source software analysis package specially developed for
the needs of the molecular biology (e.g. EMBnet) user community. The software
automatically copes with data in a variety of formats and even allows
transparent retrieval of sequence data from the web. Also, as extensive
libraries are provided with the package, it is a platform to allow other
scientists to develop and release software in true open source spirit. EMBOSS
also integrates a range of currently available packages and tools for sequence
analysis into a seamless whole. EMBOSS breaks the historical trend towards
commercial software packages.
Reference for EMBOSS: Rice,P. Longden,I. and Bleasby,A.
"EMBOSS: The European Molecular Biology Open Software Suite"
Trends in Genetics June 2000, vol 16, No 6. pp.276-277
|
Exonerate
generic tool for pairwise sequence comparison
|
Version: 2.1.0
License: DFSG free
Official Debian package
-
|
Exonerate allows you to align sequences using many alignment models, using
either exhaustive dynamic programming, or a variety of heuristics. Much of
the functionality of the Wise dynamic programming suite was reimplemented in C
for better efficiency. Exonerate is an intrinsic component of the building of
the Ensembl genome databases, providing similarity scores between RNA and DNA
sequences and thus determining splice variants and coding sequences in
general.
An In-silico PCR Experiment Simulation System (see the ipcress man page) is
packaged with exonerate.
This package also comes with a selection of utilities for performing
simple manipulations quickly on FASTA files larger than 2 GB.
|
Fastdnaml
Tool for construction of phylogenetic trees of DNA sequences
|
Version: 1.2.2
License: DFSG free
Official Debian package
-
|
fastDNAml is a program derived from Joseph Felsenstein's version 3.3 DNAML
(part of his PHYLIP package). Users should consult the documentation for
DNAML before using this program.
fastDNAml is an attempt to solve the same problem as DNAML, but to do so
faster and using less memory, so that larger trees and/or more bootstrap
replicates become tractable. Much of fastDNAml is merely a recoding of the
PHYLIP 3.3 DNAML program from PASCAL to C.
|
Fastlink
A faster version of pedigree programs of Linkage
|
Version: 4.1P
License: DFSG free
Official Debian package
-
|
|
Fastlink is much faster than the original Linkage but does not
implement all the programs.
|
Garlic
A visualization program for biomolecules
|
Version: 1.6
License: DFSG free
Official Debian package
-
|
Garlic is written for the investigation of membrane proteins. It may be
used to visualize other proteins, as well as some geometric objects.
This version of garlic recognizes PDB format version 2.1. Garlic may
also be used to analyze protein sequences.
It only depends on the X libraries, no other libraries are needed.
Features include:
- The slab position and thickness are visible in a small window.
- Atomic bonds as well as atoms are treated as independent drawable
objects.
- The atomic and bond colors depend on position. Five mapping modes
are available (as for slab).
- Can display stereo images.
- Can display other geometric objects, such as a membrane.
- Atomic information is available for the atom covered by the mouse
pointer. No click required, just move the mouse pointer over the
structure!
- Can load more than one structure.
- Can draw Ramachandran plots, helical wheels, Venn diagrams,
averaged hydrophobicity and hydrophobic moment plots.
- The command prompt is available at the bottom of the main window.
It is able to display one error message and one command string.
|
Gdpc
visualiser of molecular dynamic simulations
|
Version: 2.2.4
License: DFSG free
Official Debian package
-
|
|
gdpc is a graphical program for visualising output data from
molecular dynamics simulations. It reads input in the standard xyz
format, as well as other custom formats, and can output pictures of
each frame in JPG or PNG format.
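For readers unfamiliar with the xyz format mentioned above, the following Python sketch reads a multi-frame file. It assumes the common layout (an atom-count line, a comment line, then one "element x y z" line per atom); which variants gdpc itself accepts has not been checked here, and `read_xyz_frames` is a hypothetical helper, not part of gdpc.

```python
def read_xyz_frames(text):
    """Parse concatenated XYZ frames from a string.

    Each frame is: a line with the atom count, a free-form comment line,
    then one "element x y z" line per atom. Returns a list of frames,
    each a list of (element, x, y, z) tuples.
    """
    lines = text.splitlines()
    frames = []
    pos = 0
    while pos < len(lines):
        if not lines[pos].strip():
            pos += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[pos])
        atoms = []
        # Skip the comment line at pos + 1, then read n_atoms coordinate lines.
        for line in lines[pos + 2:pos + 2 + n_atoms]:
            element, x, y, z = line.split()[:4]
            atoms.append((element, float(x), float(y), float(z)))
        frames.append(atoms)
        pos += 2 + n_atoms
    return frames


if __name__ == "__main__":
    sample = "2\nframe 0\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    print(read_xyz_frames(sample))
```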
|
Gff2aplot
pair-wise alignment-plots for genomic sequences in PostScript
|
Version: 2.0
License: DFSG free
Official Debian package
-
|
A program to visualize the alignment of two genomic sequences together with
their annotations. From GFF-format input files it produces PostScript figures
for that alignment.
The following menu lists many features of gff2aplot:
* Comprehensive alignment plots for any GFF-feature. Attributes are defined
separately so you can modify only selected attributes for a given file or
share the same customization across different data-sets.
* All parameters are set by default within the program, but it can be also
fully configured via gff2ps-like flexible customization files. The program can
handle several such files, summarizing all the settings before producing
the corresponding figure. Moreover, all customization parameters can be set
via command-line switches, which allows users to play with those parameters
before adding any to a customization file.
* Source order is taken from input files; if you swap the file order you can
visualize alignment and its annotation with the new input arrangement.
* All alignment scores can be visualized in a PiP box below gff2aplot area,
using grey-color scale, user-defined color scale or score-dependent
gradients.
* Scalable fonts, which can also be chosen among the basic PostScript default
fonts. Feature and group labels can be rotated to improve readability in
both annotation axes.
* The program is still defined as a Unix filter so it can handle data from
files, redirections and pipes, writing output to standard-output and
warnings to standard error.
* gff2aplot is able to manage many physical page formats (from A0 to A10, and
more -see available page sizes in its manual-), including user-defined ones.
This allows, for instance, the generation of poster size genomic maps, or
the use of a continuous-paper supporting plotting device, either in portrait
or landscape.
* You can draw different alignments on the same alignment plot and distinguish
them by using different colors for each.
* Shape dictionary has been expanded, so that further feature shapes are now
available (see manual).
* Annotation projections through alignment plots (so called ribbons) emulate
transparencies via complementary color fill patterns. This feature shows
color pseudo-blending when horizontal and vertical ribbons overlap.
|
Gff2ps
produces PostScript graphical output from GFF-files
|
Version: 0.98d
License: DFSG free
Official Debian package
-
|
|
gff2ps is a script program developed with the aim of converting gff-formatted
records into high quality one-dimensional plots in PostScript. Such plots
may be useful for comparing genomic structures and for visualizing outputs from
genome annotation programs.
It can be used in a very simple way, because it assumes that the GFF file
itself carries enough formatting information, but it also allows, through a
number of options and/or a configuration file, a great degree of
customization.
|
Ghemical
A GNOME molecular modelling environment
|
Version: 2.95
License: DFSG free
Official Debian package
-
|
Ghemical is a computational chemistry software package written in C++.
It has a graphical user interface and it supports both quantum-
mechanics (semi-empirical) models and molecular mechanics models.
Geometry optimization, molecular dynamics and a large set of
visualization tools using OpenGL are currently available.
Ghemical relies on external code to provide the quantum-mechanical
calculations. Semi-empirical methods MNDO, MINDO/3, AM1 and PM3 come
from the MOPAC7 package (Public Domain), and are included in the
package. The MPQC package is used to provide ab initio methods: the
methods based on Hartree-Fock theory are currently supported with
basis sets ranging from STO-3G to 6-31G**.
|
Glam2
gapped protein motifs from unaligned sequences
|
Version: 1064
License: DFSG free
Official Debian package
-
|
GLAM2 is a software package for finding motifs in sequences, typically
amino-acid or nucleotide sequences. A motif is a re-occurring sequence
pattern: typical examples are the TATA box and the CAAX prenylation motif. The
main innovation of GLAM2 is that it allows insertions and deletions in motifs.
The package includes these programs:
glam2: discovering motifs shared by a set of sequences;
glam2scan: finding matches, in a sequence database, to a motif discovered
by glam2;
glam2format: converting glam2 motifs to standard alignment formats;
glam2mask: masking glam2 motifs out of sequences, so that weaker motifs
can be found;
glam2-purge: removing highly similar members of a set of sequences.
In this package, the fast Fourier transform (FFT) algorithm was enabled for glam2.
If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey
(2008) Discovering sequence motifs with arbitrary insertions and deletions,
PLoS Computational Biology (in press).
|
Gromacs
Molecular dynamics simulator, with building and analysis tools
|
Version: 3.3.3
License: DFSG free
Official Debian package
-
|
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate
the Newtonian equations of motion for systems with hundreds to millions of
particles.
It is primarily designed for biochemical molecules like proteins and lipids
that have a lot of complicated bonded interactions, but since GROMACS is
extremely fast at calculating the nonbonded interactions (that usually
dominate simulations) many groups are also using it for research on non-
biological systems, e.g. polymers.
GROMACS offers entirely too many features for a brief description to do it
justice. A more complete listing is available at
<http://www.gromacs.org/content/view/12/176/>.
|
Hmmer
profile hidden Markov models for protein sequence analysis
|
Version: 2.3.2
License: DFSG free
Official Debian package
-
|
HMMER is an implementation of profile hidden Markov model methods for
sensitive searches of biological sequence databases using multiple sequence
alignments as queries.
Given a multiple sequence alignment as input, HMMER builds a statistical
model called a "hidden Markov model" which can then be used as a query into
a sequence database to find (and/or align) additional homologues of the
sequence family.
|
Kalign
Global and progressive multiple sequence alignment
|
Version: 2.03
License: DFSG free
Official Debian package
-
|
|
Kalign is a command line tool to perform multiple alignment of
biological sequences. It employs the Wu-Manber string-matching
algorithm, to improve both the accuracy and speed of the alignment.
It uses global, progressive alignment approach, enriched by employing
an approximate string-matching algorithm to calculate sequence
distances and by incorporating local matches into the otherwise global
alignment. In comparisons made by its authors, Kalign was about 10
times faster than ClustalW and, depending on the alignment size, up to
50 times faster than popular iterative methods. It has been published
in Lassmann and Sonnhammer, BMC Bioinformatics 2005, 6:298.
|
Loki
MCMC linkage analysis on general pedigrees
|
Version: 2.4.7.4
License: DFSG free
Official Debian package
-
|
Performs Markov chain Monte Carlo multipoint linkage analysis on large,
complex pedigrees. The current package supports analyses on quantitative
traits only, although this restriction will be lifted in later versions.
Joint estimation of QTL number, position and effects uses Reversible Jump
MCMC. It is also possible to perform affected only IBD sharing analyses.
The homepage of this project used to be at http://loki.homeunix.net
but the project is dead now and the homepage vanished. The Homepage
field above points to the web archive.
|
Maq
maps short fixed-length polymorphic DNA sequence reads to reference sequences
|
Version: 0.6.7
License: GPL
Official Debian package
-
|
Maq (short for Mapping and Assembly with Quality) builds mapping assemblies
from short reads generated by the next-generation sequencing machines. It is
particularly designed for Illumina-Solexa 1G Genetic Analyzer, and has a
preliminary functionality to handle ABI SOLiD data. Maq was previously known as
mapass2.
With Maq you can:
- Fast align Illumina/SOLiD reads to the reference genome. With the
default options, one million pairs of reads can be mapped to the
human genome in about 10 CPU hours with less than 1G memory.
- Accurately measure the error probability of the alignment of each
individual read.
- Call the consensus genotypes, including homozygous and heterozygous
polymorphisms, with a Phred probabilistic quality assigned to each base.
- Find short indels with paired end reads.
- Accurately find large scale genomic deletions and translocations with
paired end reads.
- Discover potential CNVs by checking read depth.
- Evaluate the accuracy of raw base qualities from sequencers and help
to check the systematic errors.
However, Maq can NOT:
- Do de novo assembly. (Maq can only call the consensus by mapping reads
to a known reference.)
- Map short reads against themselves. (Maq can only find complete overlap
between reads.)
- Align capillary reads or 454 reads to the reference. (Maq cannot align
reads longer than 63bp.)
This package is likely to be useful for users working with genetics
or genomic studies in biology who need to assemble DNA sequences from
fixed-length sequencers.
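The "Phred probabilistic quality" mentioned in the feature list follows the standard Phred definition, Q = -10 * log10(P), where P is the probability that the call is wrong. The short Python sketch below shows the conversion in both directions; the function names are illustrative and are not part of Maq.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of an error."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(q, phred_to_error_probability(q))
```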
|
Melting
computing the melting temperature of nucleic acid duplex
|
Version: 4.2h
License: DFSG free
Official Debian package
-
|
|
This program computes, for a nucleic acid duplex, the enthalpy, the
entropy and the melting temperature of the helix-coil
transitions. Three types of hybridisation are possible: DNA/DNA,
DNA/RNA, and RNA/RNA. The program first computes the hybridisation
enthalpy and entropy from the elementary parameters of each Crick's
pair by the nearest-neighbor method. Then the melting temperature is
computed. The set of thermodynamic parameters can be easily changed,
for instance following an experimental breakthrough. Melting was
published in Le Novère N. (2001), Bioinformatics, 17: 1226-1227.
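The two steps described above (summing nearest-neighbor contributions, then deriving the melting temperature) can be sketched in Python as follows. This is a simplified illustration, not the Melting program's implementation: it assumes a two-state transition, omits initiation and end corrections, and expects the caller to supply the parameter table. The function names and the `factor` argument are assumptions made for this sketch.

```python
import math

R_CAL = 1.987  # gas constant in cal/(K*mol)


def sum_nearest_neighbors(seq, table):
    """Add up (dH, dS) over every overlapping dinucleotide of seq.

    table maps a dinucleotide such as "AT" to a (dH, dS) tuple supplied
    by the caller. Initiation and end corrections are deliberately left out.
    """
    total_dh = 0.0
    total_ds = 0.0
    for i in range(len(seq) - 1):
        dh, ds = table[seq[i:i + 2]]
        total_dh += dh
        total_ds += ds
    return total_dh, total_ds


def melting_temperature_celsius(dh, ds, strand_conc, factor=4.0):
    """Two-state melting temperature in degrees Celsius.

    dh in cal/mol, ds in cal/(K*mol), strand_conc is the total strand
    concentration in mol/L. factor=4 applies to non-self-complementary
    duplexes with both strands at equal concentration; use factor=1 for
    self-complementary sequences.
    """
    return dh / (ds + R_CAL * math.log(strand_conc / factor)) - 273.15
```

Because the parameter table is an argument rather than a constant, a different set of thermodynamic parameters can be swapped in without touching the calculation.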
|
Mipe
Tools to store PCR-derived data
|
Version: 1.1
License: DFSG free
Official Debian package
-
|
MIPE provides a standard format for the exchange and/or storage of all
information associated with PCR experiments using a flat text file. This will:
* allow for exchange of PCR data between researchers/laboratories
* enable traceability of the data
* prevent problems when submitting data to dbSTS or dbSNP
* enable the writing of standard scripts to extract data (e.g. a
list of PCR primers, SNP positions or haplotypes for different animals)
Although this tool can be used for data storage, its primary focus
should be data exchange. For larger repositories, relational databases
are more appropriate for storage of these data. The MIPE format could
then be used as a standard format to import into and/or export from
these databases.
MIPE was published in: Aerts J & Veenendaal T. MIPE - a XML-format to
facilitate the storage and exchange of PCR-related data. Online Journal of
Bioinformatics 6(2): 114-120 (2005).
|
Molphy
Program Package for MOLecular PHYlogenetics
|
Version: 2.3b3
License: non-free
Debian package in non-free
-
|
ProtML is a main program in MOLPHY for inferring evolutionary trees from
PROTein (amino acid) sequences by using the Maximum Likelihood method.
Other programs (C language)
NucML: Maximum Likelihood Inference of Nucleic Acid Phylogeny
ProtST: Basic Statistics of Protein Sequences
NucST: Basic Statistics of Nucleic Acid Sequences
NJdist: Neighbor Joining Phylogeny from Distance Matrix
Utilities (Perl)
mollist: get identifiers list
molrev: reverse DNA sequences
molcat: concatenate sequences
molcut: get partial sequences
molmerge: merge sequences
nuc2ptn: DNA -> Amino acid
rminsdel: remove INS/DEL sites
molcodon: get specified codon sites
molinfo: get varied sites
mol2mol: MOLPHY format beautifier
inl2mol: Interleaved -> MOLPHY
mol2inl: MOLPHY -> Interleaved
mol2phy: MOLPHY -> Sequential
phy2mol: Sequential -> MOLPHY
must2mol: MUST -> MOLPHY
etc.
|
Mozilla-biofox
extension of bioinformatics tools to Iceape and Iceweasel browsers
|
Version: 1.1.4
License: DFSG free
Official Debian package
-
|
Code bioFOX aims at implementing various bioinformatics tools as an extension
on the Iceape and Iceweasel browsers. Analysis of your favorite gene(s)
usually require(s) retrieving it from a database like NCBI or Swiss-Prot and
then performing one or more tasks including but not limited to:
* Translation of a nucleotide sequence;
* Blast search (eg. blastn, blastp etc.) of the desired nucleotide/protein
sequence;
* Calculation of properties (like PI, charge, molecular weight, AT/GC content
etc.) of a protein/nucleotide sequence;
* Conversion between formats (Genbank, Fasta, Swiss-Prot etc.);
* Prediction of sequence for sub-cellular localization (PREDOTAR, TargetP,
pSORT etc).
|
Mummer
Efficient sequence alignment of full genomes
|
Version: 3.20
License: DFSG free
Official Debian package
-
|
|
MUMmer is a system for rapidly aligning entire genomes, whether
in complete or draft form. For example, MUMmer 3.0 can find all
20-basepair or longer exact matches between a pair of 5-megabase genomes
in 13.7 seconds, using 78 MB of memory, on a 2.4 GHz Linux desktop
computer. MUMmer can also align incomplete genomes; it handles the 100s
or 1000s of contigs from a shotgun sequencing project with ease, and
will align them to another set of contigs or a genome using the NUCmer
program included with the system. If the species are too divergent for
DNA sequence alignment to detect similarity, then the PROmer program
can generate alignments based upon the six-frame translations of both
input sequences.
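To make the notion of "all 20-basepair or longer exact matches" concrete, here is a small Python sketch that finds maximal exact matches between two sequences. It uses a plain k-mer index with extension, which is a swapped-in technique chosen for brevity and does not reproduce MUMmer's own data structures or performance. The function name is hypothetical, and only the forward strand is searched.

```python
from collections import defaultdict


def maximal_exact_matches(ref, query, min_len=20):
    """Return sorted (ref_start, query_start, length) tuples for every
    maximal exact match of at least min_len characters."""
    # Index every min_len-mer of the reference by its start position.
    index = defaultdict(list)
    for i in range(len(ref) - min_len + 1):
        index[ref[i:i + min_len]].append(i)

    matches = []
    for j in range(len(query) - min_len + 1):
        for i in index.get(query[j:j + min_len], ()):
            # Report each match once, from its left-most seed: skip any
            # seed that could still be extended to the left.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            length = min_len
            while (i + length < len(ref) and j + length < len(query)
                   and ref[i + length] == query[j + length]):
                length += 1
            matches.append((i, j, length))
    return sorted(matches)


if __name__ == "__main__":
    print(maximal_exact_matches("TTACGTACGGA", "GGACGTACGTT", min_len=4))
```

Memory use grows with the number of k-mers in the reference, so this approach is only suitable for small inputs.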
|
Muscle
Multiple alignment program of protein sequences
|
Version: 3.70+fix1
License: DFSG free
Official Debian package
-
|
|
MUSCLE is a multiple alignment program for protein sequences. MUSCLE
stands for multiple sequence comparison by log-expectation. In the
authors' tests, MUSCLE achieved the highest scores of all tested
programs on several alignment accuracy benchmarks, and is also one of
the fastest programs out there.
|
Ncbi-epcr
Tool to test a DNA sequence for the presence of sequence tagged sites
|
Version: 2.3.10
License: DFSG free
Official Debian package
-
|
Electronic PCR (e-PCR) is a computational procedure that is used to identify
sequence tagged sites (STSs) within DNA sequences. e-PCR looks for potential
STSs in DNA sequences by searching for subsequences that closely match the
PCR primers and have the correct order, orientation, and spacing that could
represent the PCR primers used to generate known STSs.
The new version of e-PCR implements a fuzzy matching strategy. To reduce
the likelihood that a true STS will be missed due to mismatches, multiple
discontiguous words may be used instead of a single exact word. Each of these
words has groups of significant positions separated by 'wildcard' positions
that are not required to match. In addition, it is also possible to allow
gaps in the primer alignments.
The main motivation for implementing reverse searching (called Reverse e-PCR)
was to make it feasible to search the human genome sequence and other large
genomes. The new version of e-PCR provides a search mode using a query
sequence against a sequence database.
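The basic idea described above (primer sites in the correct order, orientation and spacing) can be sketched in a few lines of Python. This is a deliberately naive illustration with exact matching on one strand only; it implements neither the fuzzy word matching nor the reverse search of the real e-PCR, and all function and parameter names are hypothetical. Input is assumed to be uppercase A/C/G/T.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def find_sts_hits(template, fwd_primer, rev_primer, min_size, max_size):
    """Return (start, product_size) for each place where fwd_primer is
    followed, on the same strand, by the reverse complement of rev_primer
    with a total product size between min_size and max_size inclusive."""
    rev_site = reverse_complement(rev_primer)
    hits = []
    start = template.find(fwd_primer)
    while start != -1:
        end = template.find(rev_site, start + len(fwd_primer))
        while end != -1:
            size = end + len(rev_site) - start
            if size > max_size:
                break  # later sites only give larger products
            if size >= min_size:
                hits.append((start, size))
            end = template.find(rev_site, end + 1)
        start = template.find(fwd_primer, start + 1)
    return hits


if __name__ == "__main__":
    template = "GGACGTACTTTTTTTTGTGCAACC"
    print(find_sts_hits(template, "ACGTAC", "TTGCAC", 15, 25))
```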
|
Ncbi-tools-bin
NCBI libraries for biology applications (text-based utilities)
|
Version: 6.1.20080302
License: DFSG free
Official Debian package
-
|
|
This package includes various utilities distributed with the NCBI C SDK.
None of the programs in this package require X; you can find the X-based
utilities in the ncbi-tools-x11 package. BLAST and related tools are
in a separate package (blast2).
|
Ncbi-tools-x11
NCBI libraries for biology applications (X-based utilities)
|
Version: 6.1.20080302
License: DFSG free
Official Debian package
-
|
|
This package includes some X-based utilities distributed with the
NCBI C SDK: Cn3D, Network Entrez, Sequin, ddv, and udv. These
programs are not part of ncbi-tools-bin because they depend on
several additional library packages.
|
Njplot
A tree drawing program
|
Version: 2.2
License: DFSG free
Official Debian package
-
|
|
NJplot is able to draw any tree expressed in the standard
phylogenetic tree format (e.g., the format used by the Phylip package).
NJplot is especially convenient for rooting the unrooted trees
obtained from parsimony, distance or maximum likelihood tree-building
methods.
|
Perlprimer
Graphical design of primers for PCR
|
Version: 1.1.14
License: DFSG free
Official Debian package
-
|
PerlPrimer is a free, open-source GUI application written in Perl that designs
primers for standard Polymerase Chain Reaction (PCR), bisulphite PCR,
real-time PCR (QPCR) and sequencing. It aims to automate and simplify the
process of primer design.
If operated online, the tool communicates with the Ensembl project for
further insights into the gene structure, so that the location of exons
and introns can be taken into account in the design of the primers. The
sequences themselves can be retrieved, too.
|
Phylip
[Biology] A package of programs for inferring phylogenies
|
Version: 1:3.68
License: non-free
Debian package in non-free
-
|
|
The PHYLogeny Inference Package is a package of programs for inferring
phylogenies (evolutionary trees) from sequences.
Methods that are available in the package include parsimony, distance
matrix, and likelihood methods, including bootstrapping and consensus
trees. Data types that can be handled include molecular sequences, gene
frequencies, restriction sites, distance matrices, and 0/1 discrete
characters.
|
Plasmidomics
draw plasmids and vector maps with PostScript graphics export
|
Version: 0.2.0
License: DFSG free
Official Debian package
-
|
|
Plasmidomics is written for easy drawing of plasmids and vector maps
to use them in theses, presentations or other forms of publications. It
natively supports PostScript as output format.
|
Poa
Partial Order Alignment for multiple sequence alignment
|
Version: 2.0+20060928
License: DFSG free
Official Debian package
-
|
|
POA is Partial Order Alignment, a fast program for multiple sequence
alignment (MSA) in bioinformatics. Its advantages are speed,
scalability, sensitivity, and the superior ability to handle branching
/ indels in the alignment. Partial order alignment is an approach to
MSA, which can be combined with existing methods such as progressive
alignment. POA optimally aligns a pair of MSAs and can therefore
be applied directly to progressive alignment methods such as CLUSTAL.
For large alignments, Progressive POA is 10-30 times faster than
CLUSTALW. POA is published in Bioinformatics. 2004 Jul
10;20(10):1546-56.
|
Primer3
Tool to design flanking oligo nucleotides for DNA amplification
|
Version: 1.1.4
License: DFSG free
Official Debian package
-
|
Primer3 picks primers for Polymerase Chain Reactions (PCRs), considering as
criteria oligonucleotide melting temperature, size, GC content and
primer-dimer possibilities, PCR product size, positional constraints within
the source sequence, and miscellaneous other constraints. All of these
criteria are user-specifiable as constraints, and some are specifiable as
terms in an objective function that characterizes an optimal primer pair.
It has been published in Rozen S and Skaletsky H, "Primer3 on the WWW for
general users and for biologist programmers.", Methods Mol Biol.
2000;132:365-86.
The Whitehead Institute for Biomedical Research provides a web-based
front end to Primer3.
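The distinction between constraints and objective-function terms can be
sketched as follows. This is not Primer3's code or its parameter set: the
function names, ranges and optimum values are hypothetical, and the simple
Wallace rule stands in for Primer3's more elaborate melting-temperature
calculation.

```python
def gc_content(oligo):
    """Fraction of G and C bases in the oligo (0.0 to 1.0)."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo):
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def satisfies_constraints(oligo, size=(18, 27), gc=(0.40, 0.60), tm=(50, 65)):
    """Hard constraints: every criterion must fall inside its allowed range."""
    return (size[0] <= len(oligo) <= size[1]
            and gc[0] <= gc_content(oligo) <= gc[1]
            and tm[0] <= wallace_tm(oligo) <= tm[1])


def penalty(oligo, opt_size=20, opt_tm=60):
    """Objective-function term: distance from a hypothetical optimum (lower is better)."""
    return abs(len(oligo) - opt_size) + abs(wallace_tm(oligo) - opt_tm)


candidates = [
    "AGCTTGACCGATGCATCGAA",   # 20 bases, balanced composition
    "ATATATATATATATATATAT",   # no G or C at all
    "AGCTTGACCGATGCATCG",     # 18 bases
]
accepted = [c for c in candidates if satisfies_constraints(c)]
best = min(accepted, key=penalty)
print(accepted)
print(best)
```

Constraints discard candidates outright, while the penalty only ranks the
candidates that remain.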
|
Probcons
PROBabilistic CONSistency-based multiple sequence alignment
|
Version: 1.12
License: DFSG free
Official Debian package
-
|
|
Tool for generating multiple alignments of protein sequences. Using a
combination of probabilistic modeling and consistency-based alignment
techniques, PROBCONS has achieved the highest accuracies of all alignment
methods to date. On the BAliBASE benchmark alignment database, alignments
produced by PROBCONS show statistically significant improvement over current
programs, containing an average of 7% more correctly aligned columns than
those of T-Coffee, 11% more correctly aligned columns than those of CLUSTAL W,
and 14% more correctly aligned columns than those of DIALIGN. Probcons is
published in Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S.
2005. Genome Research 15: 330-340.
|
Proda
multiple alignment of protein sequences
|
Version: 1.0
License: DFSG free
Official Debian package
-
|
ProDA is a system for automated detection and alignment of homologous
regions in collections of proteins with arbitrary domain architectures.
Given an input set of unaligned sequences, ProDA identifies all
homologous regions appearing in one or more sequences, and returns a
collection of local multiple alignments for these regions.
ProDA is published in: Phuong T.M., Do C.B., Edgar R.C., and Batzoglou
S. Multiple alignment of protein sequences with repeats and
rearrangements. Nucleic Acids Research 2006 34(20), 5932-5942.
|
Pymol
Molecular Graphics System
|
Version: 1.1
License: DFSG free
Official Debian package
-
|
PyMOL is a molecular graphics system targeted at medium to large
biomolecules like proteins. It can generate high-quality publication-ready
molecular graphics images and animations.
Features include:
* Visualization of molecules, molecular trajectories and surfaces
of crystallography data or orbitals
* Molecular builder and sculptor
* Internal raytracer and movie generator
* Fully extensible and scriptable via a python interface
File formats PyMOL can read include PDB, XYZ, CIF, MDL Molfile, ChemDraw,
CCP4 maps, XPLOR maps and Gaussian cube maps.
|
R-cran-qtl
GNU R package for genetic marker linkage analysis
|
Version: 1.09
License: DFSG free
Official Debian package
-
|
R/qtl is an extensible, interactive environment for mapping quantitative
trait loci (QTLs) in experimental crosses. It is implemented as an
add-on-package for the freely available and widely used statistical
language/software R (see http://www.r-project.org).
Developing this software as an add-on to R allows it to take
advantage of the basic mathematical and statistical functions, and
powerful graphics capabilities, that are provided with R. Further,
the user will benefit from the seamless integration of the QTL mapping
software into a general statistical analysis program. The goal is to
make complex QTL mapping methods widely accessible and allow users to
focus on modeling rather than computing.
A key component of computational methods for QTL mapping is the hidden
Markov model (HMM) technology for dealing with missing genotype data. We
have implemented the main HMM algorithms, with allowance for the presence
of genotyping errors, for backcrosses, intercrosses, and phase-known
four-way crosses.
The current version of R/qtl includes facilities for estimating
genetic maps, identifying genotyping errors, and performing single-QTL
genome scans and two-QTL, two-dimensional genome scans, by interval
mapping (with the EM algorithm), Haley-Knott regression, and multiple
imputation. All of this may be done in the presence of covariates (such
as sex, age or treatment). One may also fit higher-order QTL models by
multiple imputation.
|
R-other-bio3d
GNU R package for biological structure analysis
|
Version: 1.0
License: DFSG free
Official Debian package
-
|
|
The bio3d package contains utilities to process, organize and explore
protein structure, sequence and dynamics data. Features include the
ability to read and write structure, sequence and dynamic trajectory
data, perform atom summaries, atom selection, re-orientation,
superposition, rigid core identification, clustering, torsion analysis,
distance matrix analysis, structure and sequence conservation analysis,
and principal component analysis (PCA). In addition, various utility
functions are provided to enable the statistical and graphical power of
the R environment to work with biological sequence and structural data.
|
Rasmol
Visualize biological macromolecules
|
Version: 2.7.4.2
License: DFSG free
Official Debian package
-
|
RasMol is a molecular graphics program intended for the visualisation of
proteins, nucleic acids and small molecules. The program is aimed at
display, teaching and generation of publication quality images.
The program reads in a molecule coordinate file and interactively displays
the molecule on the screen in a variety of colour schemes and molecule
representations. Currently available representations include depth-cued
wireframes, 'Dreiding' sticks, spacefilling (CPK) spheres, ball and stick,
solid and strand biomolecular ribbons, atom labels and dot surfaces.
Supported input file formats include Protein Data Bank (PDB), Tripos
Associates' Alchemy and Sybyl Mol2 formats, Molecular Design Limited's
(MDL) Mol file format, Minnesota Supercomputer Center's (MSC) XYZ (XMol)
format, CHARMm format, CIF format and mmCIF format files.
This package installs two versions of RasMol: rasmol-gtk has a modern
GTK-based user interface, and rasmol-classic is the version with the old
Xlib GUI.
|
Readseq
[Biology] Conversion between sequence formats
|
Version: 1
License: DFSG free
Official Debian package
-
|
|
Reads and writes nucleic/protein sequences in various
formats. Data files may have multiple sequences.
Readseq is particularly useful as it automatically detects many
sequence formats, and converts between them.
|
Seaview
Multiple sequence alignment editor
|
Version: 1:2.4
License: DFSG free
Official Debian package
-
|
SeaView is a graphical multiple sequence alignment editor developed by Manolo
Gouy. Multiple alignment formats (NEXUS, MSF, CLUSTAL, FASTA, PHYLIP, MASE)
are supported for reading and writing. Alignments can be manually edited.
The user is further supported by an integration of external programs,
i.e., to run DOT-PLOT or MUSCLE, to locally improve the alignment.
When using SeaView for investigations that lead to a publication, please
cite the following reference:
Galtier, N., Gouy, M. and Gautier, C. (1996) "SeaView and
Phylo_win, two graphic tools for sequence alignment and molecular
phylogeny." Comput. Applic. Biosci. 12:543-548.
|
Sibsim4
align expressed RNA sequences on a DNA template
|
Version: 0.17
License: DFSG free
Official Debian package
-
|
The SIBsim4 project is based on sim4, which is a program designed to align
an expressed DNA sequence with a genomic sequence, allowing for introns.
SIBsim4 is a fairly extensive rewrite of the original code with the following
goals:
* speed improvement;
* allow large, chromosome scale, DNA sequences to be used;
* provide more detailed output about splice types;
* provide more detailed output about polyA sites;
* misc code cleanups and fixes.
|
Sigma-align
Simple greedy multiple alignment of non-coding DNA sequences
|
Version: 1.1.1
License: DFSG free
Official Debian package
-
|
Sigma ("Simple greedy multiple alignment") is an alignment program with a new
algorithm and scoring scheme designed specifically for non-coding DNA
sequence. It uses a strategy of seeking the best possible gapless local
alignments, at each step making the best possible alignment consistent with
existing alignments, and scores the significance of the alignment based on the
lengths of the aligned fragments and a background model which may be supplied
or estimated from an auxiliary file of intergenic DNA.
Sigma has been published in BMC Bioinformatics. 2006 Mar 16;7:143.
|
Sim4
tool for aligning cDNA and genomic DNA
|
Version: 0.0.20030921
License: DFSG free
Official Debian package
-
|
sim4 is a similarity-based tool for aligning an expressed DNA sequence
(EST, cDNA, mRNA) with a genomic sequence for the gene. It also detects end
matches when the two input sequences overlap at one end (i.e., the start of
one sequence overlaps the end of the other).
sim4 employs a blast-based technique to first determine the basic matching
blocks representing the "exon cores". In this first stage, it detects all
possible exact matches of W-mers (i.e., DNA words of size W) between the two
sequences and extends them to maximal scoring gap-free segments. In the
second stage, the exon cores are extended into the adjacent as-yet-unmatched
fragments using greedy alignment algorithms, and heuristics are used to favor
configurations that conform to the splice-site recognition signals (GT-AG,
CT-AC). If necessary, the process is repeated with less stringent parameters
on the unmatched fragments.
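The first stage described above can be sketched roughly as follows. This is an
illustration only, not sim4's code: the function names and the toy sequences
are invented, and the extension step stops at the first mismatch instead of
computing maximal scoring segments.

```python
from collections import defaultdict


def index_wmers(seq, w):
    """Map every word of length w in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - w + 1):
        index[seq[i:i + w]].append(i)
    return index


def extend_exact(cdna, genome, i, j, w):
    """Grow an exact W-mer match in both directions while the bases agree (no gaps)."""
    start_i, start_j = i, j
    while start_i > 0 and start_j > 0 and cdna[start_i - 1] == genome[start_j - 1]:
        start_i -= 1
        start_j -= 1
    end_i, end_j = i + w, j + w
    while end_i < len(cdna) and end_j < len(genome) and cdna[end_i] == genome[end_j]:
        end_i += 1
        end_j += 1
    return start_i, start_j, end_i - start_i


def exon_cores(cdna, genome, w):
    """Return sorted (cdna_start, genome_start, length) gap-free segments."""
    index = index_wmers(genome, w)
    cores = set()
    for i in range(len(cdna) - w + 1):
        for j in index.get(cdna[i:i + w], ()):
            cores.add(extend_exact(cdna, genome, i, j, w))
    return sorted(cores)


# Toy data: two exons separated by an intron that starts with GT and ends with AG.
exon1, intron, exon2 = "ATGGCCAAGT", "GTAAGTTTTTCCCCAG", "CTGATCGGAT"
genome = exon1 + intron + exon2
cdna = exon1 + exon2

cores = exon_cores(cdna, genome, 6)
print(cores)
```

Each resulting triple marks one candidate exon core; the gap between
consecutive cores on the genomic sequence is where an intron would be sought
in the second stage.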
|
T-coffee
Multiple Sequence Alignment
|
Version: 5.72
License: DFSG free
Official Debian package
-
|
T-Coffee is a multiple sequence alignment package. Given a set of
sequences (Proteins or DNA), T-Coffee generates a multiple sequence
alignment. Version 2.00 and higher can mix sequences and structures.
T-Coffee allows the combination of a collection of multiple/pairwise,
global or local alignments into a single model. It also allows estimating
the level of consistency of each position within the new
alignment with the rest of the alignments. See the pre-print for more
information.
T-Coffee has a special mode called M-Coffee that makes it possible to combine the
output of many multiple sequence alignment packages. In its published version,
it uses MUSCLE, PROBCONS, POA, DiAlign-TS, MAFFT, Clustal W, PCMA and
T-Coffee. A special version has been made for Debian, DM-Coffee, that uses
only free software by replacing Clustal W with Kalign. Using all eight methods of
M-Coffee can sometimes be a bit heavy. You can use a subset of your favorite
methods if you prefer.
|
Tigr-glimmer
Gene detection in archaea and bacteria
|
Version: 3.02
License: DFSG free
Official Debian package
-
|
Developed by the TIGR institute, this software detects coding sequences in
bacteria and archaea.
Glimmer is a system for finding genes in microbial DNA, especially the
genomes of bacteria and archaea. Glimmer (Gene Locator and Interpolated
Markov Modeler) uses interpolated Markov models (IMMs) to identify the
coding regions and distinguish them from noncoding DNA.
|
Tree-ppuzzle
Parallelized reconstruction of phylogenetic trees by maximum likelihood
|
Version: 5.2
License: DFSG free
Official Debian package
-
|
TREE-PUZZLE (the new name for PUZZLE) is an interactive console program that
implements a fast tree search algorithm, quartet puzzling, that allows
analysis of large data sets and automatically assigns estimations of support
to each internal branch. TREE-PUZZLE also computes pairwise maximum
likelihood distances as well as branch lengths for user specified trees.
Branch lengths can also be calculated under the clock assumption. In
addition, TREE-PUZZLE offers a novel method, likelihood mapping, to
investigate the support of a hypothesized internal branch without
computing an overall tree and to visualize the phylogenetic content of
a sequence alignment.
This is the parallelized version of tree-puzzle.
|
Tree-puzzle
Reconstruction of phylogenetic trees by maximum likelihood
|
Version: 5.2
License: DFSG free
Official Debian package
-
|
|
TREE-PUZZLE (the new name for PUZZLE) is an interactive console program that
implements a fast tree search algorithm, quartet puzzling, that allows
analysis of large data sets and automatically assigns estimations of support
to each internal branch. TREE-PUZZLE also computes pairwise maximum
likelihood distances as well as branch lengths for user specified trees.
Branch lengths can also be calculated under the clock assumption. In
addition, TREE-PUZZLE offers a novel method, likelihood mapping, to
investigate the support of a hypothesized internal branch without
computing an overall tree and to visualize the phylogenetic content of
a sequence alignment.
|
Treetool
interactive tool for displaying phylogenetic trees
|
Version: 2.0.2a
License: non-free
Debian package in non-free
-
|
Treetool is an interactive tool for displaying, editing, and printing
phylogenetic trees. The tree is displayed visually on screen, in
various formats, and the user is able to modify the format, structure,
and characteristics of the tree. Trees may be viewed, compared,
formatted for printing, constructed from smaller trees, etc.
Development of this software stopped in 1995.
|
Treeviewx
Displays and prints phylogenetic trees
|
Version: 0.5.1
License: DFSG free
Official Debian package
-
|
TreeView X is an open source and multi-platform program to display
phylogenetic trees. It can read and display NEXUS and Newick format tree files
(such as those output by PAUP*, ClustalX, TREE-PUZZLE, and other programs). It
allows ordering the branches of the trees and exporting the trees in SVG
format.
The program was written by Rod Page (r.page@bio.gla.ac.uk) using the wxWidgets
C++ library. It was published in Computer Applications in the Biosciences. 1996
12: 357-358.
|
Wise
comparison of biopolymers, commonly DNA and protein sequences
|
Version: 2.4.1
| |