Select measures and specify their thresholds

One or more methods for inferring relationships between items can be selected. To select a measure of interest, activate its check box.

Correlations

All correlation measures have a range of [-1,+1], with -1 for the strongest negative relationship (anti-correlation), 0 for the neutral case and +1 for the strongest positive relationship. Briefly stated, Pearson assumes a linear relationship between the data, whereas Spearman and Kendall are rank-based and thus do not assume linearity. Note that Pearson, Spearman and Kendall are not defined when the standard deviation is zero. Thus, taxa with constant abundances will not be linked by these correlation measures. For large matrices, the computation of Kendall can take a prohibitive amount of time and is better carried out on the command line. For the formulas and in-depth discussions, we refer to the respective Wiki pages: Pearson, Spearman, Kendall.
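The differences between the three measures can be illustrated with a minimal pure-Python sketch. It is not the tool's own implementation: the function names are hypothetical, Kendall is computed as tau-b (an assumption, chosen because it is undefined for constant vectors as described above), and an undefined score is represented by None. The double loop in kendall also shows why the computation becomes slow for large matrices.

```python
import math

def pearson(x, y):
    """Pearson correlation; None when either vector has zero standard deviation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # undefined for a constant vector
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall(x, y):
    """Kendall tau-b via all pairwise comparisons (quadratic in vector length)."""
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        return None  # undefined for a constant vector
    return (concordant - discordant) / denom

# Monotonic but non-linear relationship: y = x squared.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
constant = [3, 3, 3, 3, 3]

print("pearson :", pearson(x, y))
print("spearman:", spearman(x, y))
print("kendall :", kendall(x, y))
print("constant:", pearson(constant, y), spearman(constant, y), kendall(constant, y))
```

On this example the rank-based measures reach +1 because the relationship is perfectly monotonic, while Pearson stays below +1 because it is not linear. All three return None for the constant vector.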
Note that all three correlations are sensitive to the so-called double-zero problem (Legendre & Legendre): a vector pair with many matched zeros will receive a higher score than the same vector pair without them. Since the interpretation of zeros is often ambiguous, this is a serious drawback when dealing with sparse data.
In addition, correlation results may be strongly biased when applied to normalized data (see Aitchison 2003). Renormalization (see the randomization menu) can reduce this bias.
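To make the double-zero problem mentioned above concrete, the following sketch computes Pearson for a small vector pair and then for the same pair padded with ten matched zeros. The data and the helper name pearson are invented for illustration and are not taken from the tool.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation for two equally long, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Two taxa that are weakly anti-correlated in the samples where both occur.
x = [1, 5, 2, 4, 3]
y = [2, 1, 5, 3, 4]

# The same taxa, plus ten samples in which both are absent (matched zeros).
x_padded = x + [0] * 10
y_padded = y + [0] * 10

r_plain = pearson(x, y)
r_padded = pearson(x_padded, y_padded)
print("without matched zeros:", round(r_plain, 3))
print("with matched zeros   :", round(r_padded, 3))
```

The score moves from about -0.4 to about 0.65 although no information about the co-occurring samples has changed; only the shared absences were added.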

Similarities

Similarities range from 0 to either 1 or infinity, depending on the measure. The higher the score, the more similar two objects are.

Dissimilarities

Dissimilarities range from 0 to either 1 or infinity, depending on the measure. The higher the score, the more dissimilar two objects are. A distance (or metric) has to fulfill the following criteria: 1) It should never be negative, 2) It should be zero only if objects are identical, 3) It should be symmetric, i.e. d(x,y)=d(y,x) and 4) It should fulfill the triangle inequality (d(x,z) <= d(x,y)+d(y,z)). Dissimilarities, unlike distances, do not meet the fourth criterion.
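The fourth criterion can be checked numerically. The sketch below uses the Bray-Curtis dissimilarity and the Euclidean distance purely as well-known examples; whether either is among the measures offered in this menu is not stated here, and the function names are hypothetical.

```python
import math
from itertools import permutations

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum of absolute differences over total sum."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))

def euclidean(x, y):
    """Euclidean distance, a true metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def satisfies_triangle_inequality(d, points, tol=1e-12):
    """True if d(x,z) <= d(x,y) + d(y,z) for every ordered triple of points."""
    return all(
        d(x, z) <= d(x, y) + d(y, z) + tol
        for x, y, z in permutations(points, 3)
    )

points = [(1, 0), (0, 1), (1, 1)]

print("Bray-Curtis:", satisfies_triangle_inequality(bray_curtis, points))
print("Euclidean  :", satisfies_triangle_inequality(euclidean, points))
```

For these three points, Bray-Curtis gives d((1,0),(0,1)) = 1, whereas the detour via (1,1) costs only 1/3 + 1/3, so the triangle inequality is violated. The Euclidean distance passes the same check.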

Incidence methods

If the matrix is of type "incidence" (i.e. it contains only presence/absence values), or a conversion to the incidence type has been specified in the preprocessing menu, a number of methods specific to incidence matrices are available.
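As an illustration of an incidence-based score, the sketch below converts two abundance rows to presence/absence values and computes a hypergeometric upper-tail probability for their co-occurrence. The hypergeometric distribution is named in the closing note of this page, but how the tool computes its score is not described here, so the function names, the "abundance > 0 means present" rule and the upper-tail convention are all assumptions.

```python
from math import comb

def to_incidence(row):
    """Convert abundances to presence/absence (assumes any value > 0 is a presence)."""
    return [1 if v > 0 else 0 for v in row]

def hypergeom_upper_tail(k, n_samples, n_a, n_b):
    """P(X >= k) for X ~ Hypergeometric(n_samples, n_a, n_b).

    k         : samples in which both taxa are present
    n_samples : total number of samples
    n_a, n_b  : samples in which taxon A (resp. B) is present
    """
    total = comb(n_samples, n_b)
    hits = sum(
        comb(n_a, i) * comb(n_samples - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return hits / total

# Two hypothetical taxa measured in eight samples.
taxon_a = [5, 0, 3, 0, 7, 0, 2, 0]
taxon_b = [1, 0, 4, 0, 2, 0, 9, 0]

inc_a, inc_b = to_incidence(taxon_a), to_incidence(taxon_b)
both = sum(a and b for a, b in zip(inc_a, inc_b))

p_value = hypergeom_upper_tail(both, len(inc_a), sum(inc_a), sum(inc_b))
print("co-occurrences:", both)
print("P(X >= k):", p_value)
```

Here both taxa occur in the same four of eight samples, so only one of the comb(8, 4) = 70 equally likely placements reproduces the observed overlap.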

Network inference with Minet

Minet is an R package that implements three popular algorithms for the inference of genetic regulatory networks, namely CLR (Faith et al.), ARACNE (Margolin et al.) and MRNET (Meyer et al. 2007). These algorithms operate on a similarity matrix, where the similarity is usually mutual information. Note that using a correlation measure instead requires the data to be normally distributed. Using minet requires Rserve to be enabled. Minet's strategies for mutual information estimation and discretization (the latter is required for abundance or normalized matrices) can be set in the configuration menu. Note that by default, mutual information is computed not in minet but using ARACNE's algorithm. This default can be changed in the configuration menu.
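Why discretization is needed before mutual information can be estimated from abundance data can be seen in the following sketch. It is not minet's or ARACNE's code: it uses equal-width binning and the plain empirical (plug-in) estimator as one simple combination, and the function names and the choice of two bins are assumptions made for illustration.

```python
import math
from collections import Counter

def equal_width_bins(values, n_bins):
    """Map continuous values to bin indices 0..n_bins-1 of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant vector: a single bin
    width = (hi - lo) / n_bins
    # The maximum value would fall into bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

# Hypothetical abundances of two taxa that exclude each other.
x = [0.1, 0.2, 0.3, 5.0, 5.1, 5.2]
y = [9.0, 9.1, 9.2, 1.0, 1.1, 1.2]

xb = equal_width_bins(x, 2)
yb = equal_width_bins(y, 2)
mi = mutual_information(xb, yb)
print("bins x:", xb)
print("bins y:", yb)
print("mutual information:", mi)
```

Mutual information is unsigned: the mutually exclusive pair above receives the maximal score for two equally filled bins (log 2), just as a perfectly co-occurring pair would.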

Threshold setting

In order to set method-specific thresholds, two options are available:

Visualize score distributions

Score distributions of selected measures can be visualized by selecting an export folder and activating the "Export distribution" check box in the Threshold setting menu. To select an export folder, click the "Select" button, click on a folder, then click "Choose". The score distributions will be exported into a single PDF file called "score_distributions.pdf" in the selected folder. When this option is activated, pushing the "GO" button in the main menu does not produce a network; instead, the user is asked whether to open the PDF file with the distribution plots. PDF files are displayed using the Adobe Acrobat Viewer. If the user confirms opening the PDF file, Acrobat Viewer asks the user to accept the License Agreement and, upon acceptance, displays the file. Note that previously set thresholds are lost when score visualization is enabled.

Note that threshold guessing and score visualization are not supported for minet, hypergeometric distribution and association rule mining.