This page is intended for users who are already comfortable using a command line shell. (Another way to run ErmineJ from a command line is to use the R support.)
In addition to providing a scriptable interface to the software, the ErmineJ CLI provides access to a few less-used features that are not accessible through the graphical user interface (GUI). The CLI can also be used to start the GUI.
To access the CLI, you need to have installed the generic bundle. See the instructions.
Your environment has to define the JAVA_HOME variable (where Java is installed, e.g. /usr/lib/java) and also ERMINEJ_HOME (pointing to the installation directory). You may want to put $ERMINEJ_HOME/bin in your path so you can run the scripts with less typing.
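As a rough illustration of the setup described above, the variables could be set in a shell startup file along these lines. The Java location is the example path mentioned above, and the ErmineJ installation directory ($HOME/ermineJ) is purely an assumption; substitute the locations on your own system.

```shell
# Where Java is installed (example path from the text above; adjust as needed).
export JAVA_HOME=/usr/lib/java

# Where the ErmineJ generic bundle was unpacked (hypothetical location).
export ERMINEJ_HOME="$HOME/ermineJ"

# Optional: put the ErmineJ scripts on the search path to save typing.
export PATH="$ERMINEJ_HOME/bin:$PATH"
```

Placing these lines in a file such as ~/.profile makes them apply to every new shell session.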
Once you have set up the package, you should be able to access ermineJ by running ermineJ.bat (Windows) or ermineJ.sh. The rest of the instructions assume you are using a *nix platform (e.g. Linux, Mac OS X), but using the included ermineJ.bat file would be analogous.
See further down the page for an example.
The following options are supported:
-a,--annots <file>                Annotation file to be used [required unless using GUI]
-aspects <selections>             GO aspects to include: B, C, M; for example for Biological Process only use B; to add Cellular Component use BC (Default: BCM = all)
-b                                Sets 'big is better' option for gene scores to true [default = false]
-batch <scoreFileList>            Batch process score files from a list, one per line. Incompatible with -o, -s, -G
-C,--config <config file>         Configuration file to use (saves typing); additional options given on the command line override those in the file. If you don't use this option, no configuration file will be used.
-c,--classFile <file>             Gene set ('class') file, e.g. GO file [XML or OBO; required unless using GUI]
-d <directory>                    Data directory; default is your ermineJ.data directory
-e,--scoreCol <integer>           Column index for scores in input file (default: 2)
-f <directory>                    Directory where custom gene sets are located
-G,--gui                          Launch the GUI.
-g,--reps <BEST|MEAN>             What to do when genes have multiple scores in input file (due to multiple elements per gene): BEST = best of replicates; MEAN = mean of replicates; default=MEAN
-h,--help                         Print this message
-i,--iters <iterations>           Number of iterations (GSR and CORR methods only)
-j,--genesOut                     Output should include gene symbols for all gene sets (default=don't include symbols)
-l,--logTrans                     Log transform the scores (and change sign; recommended for p-values), default=don't transform
-M,--mtc <value>                  Multiple test correction method: FWE = Bonferroni FWE, FDR = Benjamini-Hochberg FDR [default]
-m,--stats <option>               Method for computing raw class statistics (used for test=GSR only): MEAN (mean), QUANTILE (quantile), or MEAN_ABOVE_QUANTILE (mean above quantile), or PRECISIONRECALL (area under the precision-recall curve); default=MEAN
-n,--test <value>                 Method for computing gene set significance: ORA (ORA), GSR (resampling of gene scores; use with -m to choose algorithm), CORR (profile correlation), ROC (ROC)
-o,--output <output file>         Output file name; if omitted, results are written to standard out
-q,--quantile <quantile>          Quantile to use, only used for 'MEAN_ABOVE_QUANTILE', default=50 (median)
-r,--rawData <data file>          Raw data file, only needed for profile correlation analysis
-S,--saveconfig <file>            Save preferences in the specified file
-s,--scoreFile <score file>       Score file, required for all but profile correlation method
-seed <value>                     Seed for random number generation (integer)
-t,--threshold <threshold>        Score threshold, only used for ORA; default = 0.001
-x,--maxClassSize <maxClassSize>  Sets the maximum class size; default = 200
-y,--minClassSize <minClassSize>  Sets the minimum class size; default = 20
A minimal command line, using defaults except for the three key input files, the choice of method (ORA), and the threshold for score selection (0.0001):
ermineJ.sh -s geneScores.txt -c ~/ermineJ.data/go_daily-termdb.rdf-xml.gz \
    -a ~/ermineJ.data/Generic_human_noParents.an.txt.gz -n ORA -t 0.0001 \
    > results.txt
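Because the CLI is scriptable, the same ORA analysis can be repeated over several score files. The following is only a sketch under stated assumptions: the score file names, the run_ora helper, and the ERMINEJ_CMD variable are hypothetical and not part of ErmineJ. By default the script only prints the commands it would run; set ERMINEJ_CMD=ermineJ.sh to execute them.

```shell
# Dry run by default: each command is printed rather than executed.
# Set ERMINEJ_CMD=ermineJ.sh to run the analyses for real.
ERMINEJ_CMD="${ERMINEJ_CMD:-echo ermineJ.sh}"

# Input files taken from the minimal example above.
GO_FILE="$HOME/ermineJ.data/go_daily-termdb.rdf-xml.gz"
ANNOT_FILE="$HOME/ermineJ.data/Generic_human_noParents.an.txt.gz"

# Hypothetical helper: run ORA on one score file ($1) and write the
# results to <score file>.results.txt using the -o option.
run_ora() {
    $ERMINEJ_CMD -s "$1" -c "$GO_FILE" -a "$ANNOT_FILE" \
        -n ORA -t 0.0001 -o "$1.results.txt"
}

# Hypothetical score file names; replace with your own.
for scores in geneScoresA.txt geneScoresB.txt; do
    run_ora "$scores"
done
```

Using -o instead of shell redirection keeps each result in its own named file. The -batch option listed above is an alternative when all score files share the same settings.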
Configuration files and the command line
For GUI users, the configuration file refers to the “settings” file, normally called “erminej.properties” and stored in the user’s home directory. The CLI permits the use of a configuration file (identified with the -C option) instead of setting parameters as arguments to the shell command. This section describes how the CLI interprets this option.
Note that this behavior has changed in recent versions of ermineJ. Also, when running ermineJ from Webstart or the Windows installed version, changes made in the GUI are immediately reflected in the default configuration file stored in your home directory.
If you don’t specify a configuration file, all the options must be supplied on the command line. In previous versions, the configuration file would be read in by default.
If you do specify a configuration file, options can be overridden on the command line, but they will not be written into the config file.
If you don’t specify a configuration file and use -G to start the GUI, the default configuration file will be written and used as usual: it will be modified by other options you pass in or change in the GUI.
If you do specify a configuration file and use -G to start the GUI, the specified config file will NOT be modified.
This allows a consistent reuse of the customized config files, if so desired.
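The rules above might be exercised as in the following sketch. The configuration file path, the two helper functions, and the ERMINEJ_CMD variable are hypothetical; only the -C, -G, -t and -o options come from the option list. As written, the script prints the commands instead of running them; set ERMINEJ_CMD=ermineJ.sh to execute them.

```shell
# Dry run by default: each command is printed rather than executed.
# Set ERMINEJ_CMD=ermineJ.sh to run for real.
ERMINEJ_CMD="${ERMINEJ_CMD:-echo ermineJ.sh}"

# Hypothetical path to a customized configuration file.
CONFIG_FILE="$HOME/myAnalysis.properties"

# Take all settings from CONFIG_FILE but override the ORA threshold.
# The override applies to this run only and is not written to the file.
run_with_config() {
    $ERMINEJ_CMD -C "$CONFIG_FILE" -t 0.0001 -o results.txt
}

# Start the GUI with the same file; the specified file is not modified.
launch_gui_with_config() {
    $ERMINEJ_CMD -G -C "$CONFIG_FILE"
}

run_with_config
launch_gui_with_config
```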