SNAP Combine is a command-line tool that merges the contents
of multiple single-locus DNA sequence files into a single
multi-locus output file. Several input and output file
formats are supported. The files can be merged as either the
union or the intersection of the individuals across all the
input loci. Combine also tracks the start and end positions
of each file, allowing the user to exclude variable sites or
taxa, which is important when creating input files for
multilocus analyses.
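The union and intersection merges described above can be sketched
in a few lines of Python. This is only an illustration of the idea,
not Combine's actual code: the function name combine(), the
dictionary-based input, and the choice to report each locus as a
1-based (start, end) column span in the output are all assumptions
made for the example.

```python
def combine(loci, mode="union", pad="?"):
    """Concatenate per-locus alignments, individual by individual.

    loci: list of dicts, one per input file, mapping an individual's
          name to its aligned sequence for that locus.
    Returns (merged, spans). merged maps each kept individual to its
    concatenated sequence; spans holds one 1-based (start, end) column
    pair per locus (an assumed convention, for illustration only).
    """
    if not loci:
        return {}, []

    if mode == "union":
        # Keep every individual seen in any locus, in first-seen order.
        taxa = []
        for locus in loci:
            for name in locus:
                if name not in taxa:
                    taxa.append(name)
    elif mode == "intersection":
        # Keep only individuals present in every locus.
        taxa = [n for n in loci[0] if all(n in locus for locus in loci)]
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    merged = {name: "" for name in taxa}
    spans = []
    start = 1
    for locus in loci:
        width = max((len(seq) for seq in locus.values()), default=0)
        for name in taxa:
            # Individuals missing from this locus are padded with '?'.
            merged[name] += locus.get(name, pad * width)
        spans.append((start, start + width - 1))
        start += width
    return merged, spans
```

With two hypothetical loci, {"A": "ACGT", "B": "ACGA"} and
{"A": "TT", "C": "TG"}, union mode keeps A, B and C and pads the
gaps, while intersection mode keeps only A.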
TO INSTALL:
Download the file combine.zip
and extract the contents.
SNAP Combine is distributed as a single Java jar file,
Combine.jar. SNAP Combine was designed and tested on Mac
OS X 10.3/10.4 with Java 1.4/1.5, but it should be compatible
with other operating systems and more recent versions of
Java.
COMMAND LINE
java -jar Combine.jar [-i | -u] [-rc {column range}]
[-rr {row range}] [-I] OUTFILE [INFILE [... INFILE]]
EXAMPLE
java -jar Combine.jar combine_union.phy -i combine_intersect.phy
NC.phy ORFA.phy ORFB.phy
Option              Description

-u                  Union mode: includes every individual found in
                    any input file, even if it is not represented
                    in every input file. Missing regions are padded
                    with '?' characters. Note: this mode is enabled
                    by default and is mutually exclusive with the
                    intersection mode described below.

-i                  Intersection mode: excludes any individual that
                    is not represented in every input locus.

-rc [x-y[,…,x-y]]   Remove column: takes a hyphen-delimited range of
                    columns, or a comma-delimited list of ranges,
                    and removes them from the final output file.
                    Note: there must be no spaces around the
                    hyphens or commas.

-rr [x-y[,…,x-y]]   Remove row: takes a hyphen-delimited range of
                    rows, or a comma-delimited list of ranges, and
                    removes them from the final output file.
                    Note: there must be no spaces around the
                    hyphens or commas.

-I                  Interleave output: writes the sequences in
                    interleaved rather than sequential (the default)
                    layout for the output type given by the output
                    file's extension.
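To make the range syntax used by -rc and -rr concrete, here is a
minimal Python sketch of how such a list could be parsed and applied
to one sequence. It is not taken from Combine: the helper names
parse_ranges() and remove_columns() are hypothetical, and the sketch
assumes positions are 1-based and that both ends of a range are
inclusive, which this document does not state.

```python
def parse_ranges(spec):
    """Parse 'x-y[,x-y...]' into a list of (x, y) integer pairs.

    Spaces are rejected, matching the note that the list must not
    contain spaces around hyphens or commas.
    """
    ranges = []
    for part in spec.split(","):
        lo, sep, hi = part.partition("-")
        if not sep or not lo.isdigit() or not hi.isdigit():
            raise ValueError(f"bad range {part!r} in {spec!r}")
        lo, hi = int(lo), int(hi)
        if lo < 1 or hi < lo:
            raise ValueError(f"bad range {part!r} in {spec!r}")
        ranges.append((lo, hi))
    return ranges


def remove_columns(sequence, ranges):
    """Drop the listed columns (assumed 1-based, inclusive) from a sequence."""
    drop = set()
    for lo, hi in ranges:
        drop.update(range(lo, hi + 1))
    return "".join(
        ch for pos, ch in enumerate(sequence, start=1) if pos not in drop
    )
```

Under these assumptions, the specification 2-3,6-6 applied to the
sequence ACGTAC leaves ATA. Row removal would work the same way,
with the positions indexing individuals instead of sites.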
SUPPORTED FORMATS
Combine determines the output format from the file name
extension of the output file. Each supported format has both
a sequential and an interleaved sub-format; sequential is the
default, and interleaved is selected with the -I flag. The
supported file extensions are listed below:
Format      Extension

NEXUS       nxs
CLUSTAL     aln
FASTA       fas
PHYLIP      phy
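The extension-to-format rule can be pictured as a simple lookup.
The Python sketch below is illustrative only: output_format() is a
hypothetical helper, and treating the extension as case-insensitive
and rejecting unknown extensions with an error are assumptions, not
documented behaviour of Combine.

```python
import os

# Extensions listed in the table above.
FORMATS = {"nxs": "NEXUS", "aln": "CLUSTAL", "fas": "FASTA", "phy": "PHYLIP"}


def output_format(filename, interleaved=False):
    """Return (format, layout) for an output file name.

    interleaved mirrors the -I flag; sequential is the default layout.
    """
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in FORMATS:
        raise ValueError(f"unsupported output extension: {ext!r}")
    layout = "interleaved" if interleaved else "sequential"
    return FORMATS[ext], layout
```

For example, combine_union.phy would map to sequential PHYLIP, and
the same name with interleaved=True to interleaved PHYLIP.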
Each of these file types is also supported as an input
file format. Input file types in Combine are determined
internally by the actual content of the file, not its
filename extension. Therefore input files do not necessarily
have to follow the same filename extension conventions
used to determine output file format.
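How Combine inspects file content is not described here. As a rough
illustration of content-based detection in general, the Python sketch
below guesses the format from the first non-blank line, using the
conventional openings of each format: a #NEXUS token, a CLUSTAL
header line, a FASTA '>' record marker, and a PHYLIP header holding
two integers. The function sniff_format() is hypothetical and much
simpler than a real parser would need to be.

```python
def sniff_format(text):
    """Guess an alignment format from the first non-blank line of text.

    Returns "NEXUS", "CLUSTAL", "FASTA", "PHYLIP", or None if the
    first non-blank line matches none of the expected openings.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip leading blank lines
        upper = line.upper()
        if upper.startswith("#NEXUS"):
            return "NEXUS"
        if upper.startswith("CLUSTAL"):
            return "CLUSTAL"
        if line.startswith(">"):
            return "FASTA"
        fields = line.split()
        # PHYLIP header: number of sequences, then number of sites.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            return "PHYLIP"
        return None
    return None
```

Because the decision depends only on the content, a PHYLIP file
named NC.txt would still be recognised under this scheme.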
Also included in this distribution is MLCombine, a Java
program that combines multiple single-locus MIGRATE infiles
generated using SNAP Map into a single multilocus MIGRATE
infile.
COMMAND LINE
java -jar MLCombine.jar
OUTFILE [MIGRATE INFILE [... MIGRATE INFILE]]