CoNet on command line

Using CoNet on command line - step by step tutorial

When a high number of iterations is requested (especially with renormalization enabled), CoNet is best run on the command line. To make this easier, the Cytoscape plugin can generate the command line call from the current settings.

Prerequisites

This tutorial demonstrates how to run CoNet on command line. It assumes that you have downloaded and unzipped the CoNet.zip file. It also assumes that you have Java Runtime Environment 1.6 or higher installed on your machine (if Cytoscape 2.8 runs on your computer, you don't need to worry about this).

Optional: Set the Classpath

Optionally, you can add CoNet.jar located in the lib folder of CoNet to your class path. Java will find the jar by looking up this class path variable. Check this tutorial on how to do this on different systems. Example for MacOS and UNIX-based systems (on command line):
export LIB=/Users/me/Documents/CoNet/lib
export CLASSPATH=${CLASSPATH}:${LIB}/CoNet.jar
Example for Windows (in command prompt window):
set CLASSPATH=%CLASSPATH%;C:\Users\me\Documents\CoNet\lib\CoNet.jar;
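
On MacOS/UNIX, the export line above produces a leading colon when CLASSPATH is empty or unset, which Java may treat as an extra entry for the current directory. The following is a minimal sketch of a guarded alternative. The function name add_to_classpath is hypothetical and not part of CoNet.

```shell
# Hypothetical helper (not part of CoNet): append a jar to CLASSPATH
# only if the file exists and is not already listed.
add_to_classpath() {
  jar="$1"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  case ":${CLASSPATH:-}:" in
    *":$jar:"*) ;;  # already on the class path, nothing to do
    *) CLASSPATH="${CLASSPATH:+$CLASSPATH:}$jar"
       export CLASSPATH ;;
  esac
}
```

It could be called as add_to_classpath /Users/me/Documents/CoNet/lib/CoNet.jar, with the path replaced by the actual location of your CoNet.jar.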

Tutorial steps

  1. Start Cytoscape and open the CoNet plugin. Load the demo settings (the file is located in the demo folder of the CoNet directory).
  2. Open CoNet's Data menu and select "Costello_2009_oral.txt" located in the demo folder as input matrix.
  3. Click the "Generate command line call" button in CoNet's main menu. This will open another window with some text in it.
  4. Copy the line(s) of the text starting with "java". These lines represent the commands you want to carry out. They call CoNet's command line version with the parameters you have selected or loaded in the Cytoscape plugin.
  5. If you did not set the class path, paste the command into an editor and replace "java" by "java -cp C:\Users\me\Documents\CoNet\lib\CoNet.jar" in Windows and by "java -cp /Users/me/Documents/CoNet/lib/CoNet.jar" in MacOS/UNIX. For either OS, please do not copy the example paths, but give the path in which your CoNet.jar is located. Also note that the -cp option is placed after "java" but before the call to the CoNet main class, like this: "java -cp PATH_TO_MY_CONET_JAR be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser"
    Copy the modified command.
  6. Open a command shell. In Windows, you can do so by clicking "Start", then "Run...". A window will open into which you write "cmd". This will open the command prompt window. Alternatively, you can also open the "Command prompt" program located in "Accessories". In MacOS, you can open the terminal application located in /Applications/Utilities.
  7. In the command shell, go to the demo folder of the CoNet directory using the command "cd". Example for MacOS:
    cd /Users/me/Documents/CoNet/demo
    Example for Windows:
    cd C:\Users\me\Documents\CoNet\demo
  8. Paste the command. On MacOS/UNIX, you can add an ampersand (&) at the end of the command to send it to the background.
  9. Push Enter to start the execution of the command.
  10. After a few seconds, a network file starting with "cooccurrence" and ending with ".gdl" should have been generated in the demo folder.
  11. You can load this network into Cytoscape by selecting it using the "Load" button in CoNet's main menu and then pushing "GO".
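
On MacOS/UNIX, the replacement described in step 5 can also be done in the shell instead of an editor. The sketch below assumes the copied command is a single line starting with "java". The helper name insert_classpath is hypothetical, and a jar path containing spaces would need additional quoting.

```shell
# Hypothetical helper: turn "java <main class> <options>" into
# "java -cp <jar> <main class> <options>".
insert_classpath() {
  jar="$1"
  cmd="$2"
  case "$cmd" in
    "java "*) printf 'java -cp %s %s\n' "$jar" "${cmd#java }" ;;
    *) echo "command does not start with 'java'" >&2
       return 1 ;;
  esac
}

# Example with the help call; use the path to your own CoNet.jar.
insert_classpath /Users/me/Documents/CoNet/lib/CoNet.jar \
  "java be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser -h"
```

The printed line places -cp directly after "java" and before the CoNet main class, as required in step 5.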

Command line tips and tricks

Command line help

You can get a short and a long version of the command line help. For the short version, please type:
java be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser -h
For the long version, type:
java be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser -H

Runtime memory

You can increase java runtime memory with option -Xmx. Example:
java -Xmx2000M be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser -h

Alias

In MacOS/UNIX, you can set an alias to shorten the CoNet call on command line. For this, add the line below to your bash shell configuration file:
alias conet="java be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser"
Then you can call the program as:
conet -h
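
If you prefer not to set the class path globally, a shell function can be used in place of the alias and can combine the -cp and -Xmx options. This is only a sketch: CONET_JAR and CONET_MEM are hypothetical variable names chosen for this example, not CoNet settings, and the default jar path is the example path used in this tutorial.

```shell
# Sketch of a shell function as an alternative to the alias.
# CONET_JAR and CONET_MEM are hypothetical names, not CoNet settings.
conet() {
  java -Xmx"${CONET_MEM:-2000M}" \
    -cp "${CONET_JAR:-/Users/me/Documents/CoNet/lib/CoNet.jar}" \
    be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser "$@"
}
```

With this function defined in your shell configuration file (instead of the alias), conet -h shows the short help, and all other arguments are passed through unchanged.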

Running CoNet on command line with a configuration file

Some options of the CoNet Cytoscape plugin are passed to CoNet via a configuration file. This concerns the following options (the corresponding command line option is given in brackets): In addition, a number of other variables can be set via the configuration file (check the command line help for more details).
Here's an example for a configuration file:
############ CONET CONFIG #############
# rserve config
rserve_host=127.0.0.1
rserve_port=6311
# phylogenetic lineage
lineage_separator=--
# mutual information computation
mi_implementation=minet
minet_r_batch=true
# access to R
no_rserve=false
no_r=false
# p-value computation
poolvar=false
# speed-up disabled
disable_speedup=true
This configuration file indicates that CoNet makes use of Rserve at the specified host and port. It then specifies the lineage separator string (which is needed for taxon metadata specifying phylogenetic lineages). Furthermore, it enables mutual information computation with minet and, since R is needed for minet usage, allows calls to R via command line (no_r=false) and via Rserve (no_rserve=false). In addition, the configuration switches off the CoNet speed-up (disable_speedup=true), so that CoNet uses the previous (slower) implementation of various similarity measures.
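
Since minet needs R, a configuration that requests minet while disabling both routes to R is probably a mistake. The following sketch checks for that combination before a long run is started. It assumes the plain key=value format shown above. The helper names conf_get and check_minet_config are hypothetical, and the check does not reflect how CoNet itself reads or validates the file.

```shell
# Hypothetical helper: print the value assigned to a key in a
# key=value configuration file; lines starting with '#' never match.
conf_get() {
  key="$1"
  file="$2"
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Hypothetical check: fail if minet is requested although both
# no_r and no_rserve are set to true.
check_minet_config() {
  file="$1"
  if [ "$(conf_get mi_implementation "$file")" = minet ] &&
     [ "$(conf_get no_r "$file")" = true ] &&
     [ "$(conf_get no_rserve "$file")" = true ]; then
    echo "minet requested, but no_r and no_rserve are both true" >&2
    return 1
  fi
}
```

For the example file above, conf_get rserve_port CoNetConfig.txt would print 6311 and check_minet_config CoNetConfig.txt would succeed.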

Example for CoNet with configuration file

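Here is an example of running CoNet on command line with a configuration file. The example assumes that Rserve is running and minet is installed in R. The command is wrapped over several lines here for readability; please copy it into one line, joining wrapped values such as the --ensemblemethods and --ensembleparams lists without inserting spaces.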
java be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser 
-Z CoNetConfig.txt --method ensemble 
--input /CoNet/demo/Costello_2009_oral.txt --matrixtype abundance 
--ensemblemethods correl_pearson/dist_bray/
dist_kullbackleibler/sim_mutInfo --minetdisc equalfreq 
--format gdl --nantreatment pairwise_omit 
--nantreatmentparam 5 --networkmergestrategy union 
--stand col_norm --multigraph 
--ensembleparams correl_pearson~upperThreshold=0.72/
correl_pearson~lowerThreshold=-0.6/dist_bray~upperThreshold=0.89/
dist_bray~lowerThreshold=0.25/dist_kullbackleibler~upperThreshold=7.16/
dist_kullbackleibler~lowerThreshold=0.45/sim_mutInfo~lowerThreshold=0.6 
--filter row_minocc --filterparameter 5.0 
--output cooccurrenceNetworkDemo.gdl
where CoNetConfig.txt is a file located in the directory where the command is carried out. CoNetConfig.txt has the following content:
############ CONET CONFIG #############
# rserve config
no_rserve=false
rserve_host=127.0.0.1
rserve_port=6311
# mutual information computation with minet
mi_implementation=minet
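
The slash-separated values of --ensemblemethods and --ensembleparams in the command above are easy to break when the wrapped lines are joined by hand. On MacOS/UNIX, they can instead be assembled in the shell, as in the sketch below. The helper name join_slash and the two variable names are hypothetical; the values are those of the example command.

```shell
# Hypothetical helper: join all arguments with "/" as separator.
join_slash() {
  joined=""
  for part in "$@"; do
    joined="${joined:+$joined/}$part"
  done
  printf '%s\n' "$joined"
}

ENSEMBLE_METHODS=$(join_slash correl_pearson dist_bray \
  dist_kullbackleibler sim_mutInfo)
ENSEMBLE_PARAMS=$(join_slash \
  correl_pearson~upperThreshold=0.72 \
  correl_pearson~lowerThreshold=-0.6 \
  dist_bray~upperThreshold=0.89 \
  dist_bray~lowerThreshold=0.25 \
  dist_kullbackleibler~upperThreshold=7.16 \
  dist_kullbackleibler~lowerThreshold=0.45 \
  sim_mutInfo~lowerThreshold=0.6)
```

The two variables can then be passed in the CoNet call as --ensemblemethods "$ENSEMBLE_METHODS" and --ensembleparams "$ENSEMBLE_PARAMS".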

Advanced users: Submitting CoNet jobs to an SGE cluster (MacOS/UNIX)

CoNet can send jobs to a Sun Grid Engine (SGE) cluster (command line option -b). This feature is needed when thousands of permutation iterations need to be carried out. A number of options in the configuration file control the cluster submission capabilities of CoNet (see list below). The most important is the job_num option, which defines how iterations are split into jobs. For example, if 1,000 permutations need to be carried out, a job_num of 100 will result in the submission of 100 jobs, each run with 10 iterations. As a consequence, 100 temporary score files will be created (with 10 lines each), which will be merged into a final randomization score file after completion of all jobs. When submitting jobs to an SGE cluster, the lib_dir, jar_file, tmp_dir, queue, job_num and memory options should all be set, and it is advisable to set both keep_tmpscores and no_rserve to true. Here's an example configuration:
############ CONET CONFIG #############
no_rserve=true
no_r=true
####### cluster configuration 
# location of the CoNet jar
lib_dir=/path/to/my/conet/lib/folder/
# name of the CoNet jar
jar_file=CoNet.jar
# temporary score files are stored here
tmp_dir=/path/to/my/temp/folder
# SGE queue name
queue=all.q
# memory allocated to java (via option -Xmx)
memory=4000
# number of jobs into which iterations are split
job_num=100
# keep the temporary score files
keep_tmpscores=true
# dry run: test run that does not submit jobs to the cluster
dry_run=false
# do not keep launcher scripts
keep_scripts=false
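
To choose a job_num, it can help to work out the number of iterations per job beforehand, as in the example of 1,000 permutations and a job_num of 100 given above. The sketch below only illustrates this arithmetic. The function name iterations_per_job is hypothetical, and how CoNet handles a total that is not a multiple of job_num is not covered here.

```shell
# Illustrative arithmetic only; how CoNet distributes a remainder
# when the total is not divisible by job_num is not covered here.
iterations_per_job() {
  total="$1"
  job_num="$2"
  if [ $((total % job_num)) -ne 0 ]; then
    echo "note: $total is not a multiple of $job_num" >&2
  fi
  echo $((total / job_num))
}

iterations_per_job 1000 100   # prints 10
```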

When the CoNet command exits before the jobs have completed, the temporary score files can be concatenated afterwards using the command line option --restorefromscorefolder.

List of cluster options

Here is the list of all cluster-related options, to be set via the configuration file.

Advanced users: Parallelizing CoNet (MacOS/UNIX)

CoNet randomization can be split into several jobs to speed it up. CoNet supports this parallelization in two ways: SGE cluster submission (see above) and user-made wrappers. If you would like to run several CoNet jobs in parallel using your own wrapper, the following options are of interest: You can specify -f with the value "randscore". This causes CoNet to write random scores to the output instead of a network. The original network can be provided via option -I. If the file given via -I exists (and -f is set to randscore), the original network is not recomputed but read from that file. If the file does not exist yet, the original network is exported to it. Thus, you can set up parallelization as follows. The first step is to compute the original network scores:
java -Xmx2000m -cp CoNet.jar 
be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser 
-i input.txt -f randscore -E correl_spearman/dist_bray --method ensemble 
--ensembleparamfile thresholds.txt --multigraph --pvaluemerge brown 
-F rand --iterations 1 -g 0.05 --resamplemethod shuffle_rows 
-I oriscores.txt -K edgeScores --scoreexport 
--output randomScores.0 > oriscores.log &
Next, you can create N jobs (where i is the job index going from 1 to N), by launching the following command N times:
java -Xmx2000m -cp CoNet.jar 
be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser 
-i input.txt -f randscore -E correl_spearman/dist_bray --method ensemble 
--ensembleparamfile thresholds.txt --multigraph --pvaluemerge brown 
-F rand --iterations 10 -g 0.05 --resamplemethod shuffle_rows 
-I oriscores.txt -K edgeScores --scoreexport 
--output randomScores.i > randscores_i.log &
Finally, you can merge all separately generated random score files (randomScores.i with i from 1 to N) by appending them to the original score file (oriscores.txt), thus creating your final permutation or bootstrap score file. If all your random score files are in one directory and if their names start with "randomScores.", you can let CoNet do the final merge. For this, point to the random score directory by setting tmp_dir in the configuration file to this directory and add option --restorefromscorefolder on command line. An example command could look like this:
java -Xmx2000m -cp CoNet.jar 
be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser 
-i input.txt -f gdl -E correl_spearman/dist_bray --method ensemble 
--ensembleparamfile thresholds.txt --multigraph --pvaluemerge brown 
-F rand --iterations 100 -g 0.05 --resamplemethod shuffle_rows 
-I oriscores.txt -K edgeScores --restorefromscorefolder 
-Z CoNetConfig.txt --output network.gdl > restore.log &
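
The second step (launching the same command N times with a changing job index) is the part a user-made wrapper typically automates. Below is a minimal sketch of such a wrapper that reuses the options of the example command unchanged. The function name launch_random_jobs and the DRY_RUN variable are hypothetical. By default the commands are only printed, so that they can be inspected before anything is run.

```shell
# Hypothetical wrapper: launch (or, by default, only print) N CoNet
# random score jobs with job index i running from 1 to N.
launch_random_jobs() {
  n="$1"
  i=1
  while [ "$i" -le "$n" ]; do
    set -- java -Xmx2000m -cp CoNet.jar \
      be.ac.vub.bsb.cooccurrence.cmd.CooccurrenceAnalyser \
      -i input.txt -f randscore -E correl_spearman/dist_bray \
      --method ensemble --ensembleparamfile thresholds.txt \
      --multigraph --pvaluemerge brown -F rand --iterations 10 \
      -g 0.05 --resamplemethod shuffle_rows -I oriscores.txt \
      -K edgeScores --scoreexport --output "randomScores.$i"
    if [ "${DRY_RUN:-true}" = true ]; then
      printf '%s\n' "$*"                  # dry run: print only
    else
      "$@" > "randscores_$i.log" &        # run job in the background
    fi
    i=$((i + 1))
  done
  wait   # returns once all background jobs have finished
}
```

For example, launch_random_jobs 10 prints ten commands. Setting DRY_RUN=false beforehand starts them all at once, so N should not exceed what the machine can handle in terms of cores and memory.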

Advanced users: Ensemble Pipeline Bash Script (MacOS/UNIX)

In the cmd folder of the CoNet distribution, there is an example for a bash script that runs the steps of the ensemble part of the pipeline published in PLoS Computational Biology 8, e1002606. The example bash script uses input data from the third CoNet tutorial. It assumes that it is located in a folder that has two sub-folders "Input" and "Output", where input files (input matrix and metadata) are located in the input folder and the output files will be written to the output folder. Don't forget to give the script execution permission (chmod 755 cooc.sh).
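
Before the first run, the expected folder layout can be verified with a few lines of shell. This is only a sketch based on the description above: it assumes that the script is named cooc.sh and sits next to the "Input" and "Output" sub-folders, and the function name check_pipeline_layout is hypothetical.

```shell
# Hypothetical pre-flight check for the folder layout described above.
check_pipeline_layout() {
  dir="$1"
  status=0
  for sub in Input Output; do
    if [ ! -d "$dir/$sub" ]; then
      echo "missing folder: $dir/$sub" >&2
      status=1
    fi
  done
  if [ ! -x "$dir/cooc.sh" ]; then
    echo "cooc.sh missing or not executable (try: chmod 755 cooc.sh)" >&2
    status=1
  fi
  return "$status"
}
```

Calling check_pipeline_layout . from the script's folder returns successfully only if both sub-folders exist and cooc.sh is executable.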
The bash script runs all required network construction steps in one go, that is, computation of initial thresholds, generation of renormalized permutation and bootstrap scores, and final network construction. The renormalized permutation score computation is the longest step; it takes around 5 minutes on a machine with 8 GB of RAM.
You can run a first test by setting PERMUT, BOOT and RESTORE to false and enabling COMPUTE_THRESHOLDS and TEST instead.
If you want to enable CLUSTER, make sure you have the SGE cluster management system installed. Then, add required cluster options to the CoNetConfig.txt and CoNetConfigBoot.txt configuration files. Both configuration files should specify the location of the CoNet jar file (lib_dir and jar_file) as well as the requested job number (job_num) and memory (memory). The CoNetConfig.txt configuration file should in addition point to the location of the directory where temporary permutation score files will be saved (tmp_dir). Likewise, the CoNetConfigBoot.txt configuration file should point to a different location, where temporary bootstrap score files will be saved. Note that in case you keep temporary score files (keep_tmpscores=true), you can restore random score distributions from these files using option --restorefromscorefolder later on.