Authors & Copyright: Claire Rioualen, Ghislain Bidaut. This document has been placed under the CeCILL Licence.

Introduction

HTS-Net, a network-based analysis program, was developed to identify gene regulatory modules impacted in high-throughput screenings by integrating regulatory data (regulome), protein-protein interactions (interactome) and RNAi screening z-scores.

HTS-Net works by discovering subnetworks in parallel on an interactome and a regulome network. RNAi-based gene scores are superimposed on the interaction and the regulation maps separately. Each map is then searched for high-scoring areas. The identified regions of interest, called subnetworks, are extracted and reported. The resulting modules are then merged to form meta-subnetworks that incorporate both regulation information and PPIs.
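To make the idea of searching a map for high-scoring areas more concrete, here is a minimal sketch of a greedy expansion. It is an illustration only and not the HTS-Net implementation: the function names, the use of the mean gene score as the subnetwork score, and the stopping rule based on a minimum improvement value are all assumptions made for this example.

```python
def subnetwork_score(nodes, scores):
    """Hypothetical scoring rule: mean of the gene scores in the subnetwork."""
    return sum(scores.get(n, 0.0) for n in nodes) / len(nodes)


def grow_subnetwork(seed, neighbors, scores, min_improvement=0.05):
    """Greedily add the neighbor that raises the score the most.

    Stops when no neighbor improves the score by at least min_improvement.
    """
    members = {seed}
    current = subnetwork_score(members, scores)
    while True:
        # Nodes adjacent to the current subnetwork but not yet part of it.
        frontier = {n for m in members for n in neighbors.get(m, ())} - members
        best_node, best_score = None, current
        for node in sorted(frontier):
            candidate = subnetwork_score(members | {node}, scores)
            if candidate > best_score:
                best_node, best_score = node, candidate
        if best_node is None or best_score - current < min_improvement:
            return members, current
        members.add(best_node)
        current = best_score


# Toy network with made-up node names and scores.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
scores = {"A": 1.0, "B": 2.0, "C": 0.2, "D": 3.0}
members, score = grow_subnetwork("A", neighbors, scores)
```

In this toy case the search starts at A, adds B and then D, and stops because adding C would lower the mean score.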

The program takes normalized data as input, in the form of a list of genes with associated score values. Two organisms are supported so far (Homo sapiens and Mus musculus). The data is superimposed on two networks, an interactome and a regulome.

These networks were aggregated from the following databases:

After detection and statistical validation by data shuffling, subnetworks detected in the interactome and in the regulome are integrated by measuring their identity (number of common nodes).
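The integration step can be sketched as follows. This is a minimal illustration under stated assumptions, not the HTS-Net code: the function name `pair_subnetworks`, the dictionary layout and the `min_common` parameter are hypothetical, and the gene identifiers are used only as sample values. The sketch pairs every interactome subnetwork with every regulome subnetwork that shares at least a given number of nodes.

```python
from itertools import product


def pair_subnetworks(interactome_subnets, regulome_subnets, min_common=1):
    """Return (interactome name, regulome name, shared nodes) for each
    pair of subnetworks sharing at least min_common nodes."""
    pairs = []
    for (i_name, i_nodes), (r_name, r_nodes) in product(
        interactome_subnets.items(), regulome_subnets.items()
    ):
        shared = set(i_nodes) & set(r_nodes)
        if len(shared) >= min_common:
            pairs.append((i_name, r_name, sorted(shared)))
    return pairs


# Toy subnetworks keyed by a made-up name; nodes are Entrez Gene IDs as strings.
interactome = {"I1": {"7157", "4193", "1026"}, "I2": {"672", "675"}}
regulome = {"R1": {"7157", "1026", "581"}, "R2": {"2099"}}
pairs = pair_subnetworks(interactome, regulome, min_common=1)
```

Here only I1 and R1 are paired, because they are the only two subnetworks with nodes in common.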

Note that non-free data (TRANSFAC) is not available through this implementation. It is also possible to use HTS-Net from the command line by downloading the source code available at http://htsnet.marseille.inserm.fr/distribution.html

Usage

The HTS-Net pipeline is accessible through our Mobyle portal. Mobyle prompts for the input data, verifies the integrity of the data and arguments, and tracks your run. It will also send you an email when the HTS-Net run is done.

Data Formatting

The input file containing the dataset (screening scores) must follow the format accepted by HTS-Net:

  • 2-column tab-delimited file
  • first column contains EntrezGene identifiers
  • second column contains the associated normalized score measured from the screening

If you need to convert gene IDs from EntrezGene identifiers to Gene Symbols or the reverse, you can use the convertGID_GS utility from our Mobyle portal: http://mobylehome.marseille.inserm.fr/cgi-bin/portal.py#forms::convertGID_GS

This is especially important as HTS-Net works internally with Entrez Gene identifiers and does not accept any other type of identifier.
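Before uploading, it can be useful to check a file against the format described above. The following is a minimal sketch, assuming a file without a header line and purely numeric Entrez Gene identifiers; the function name `read_scores` is hypothetical and is not part of HTS-Net.

```python
import csv
import io


def read_scores(handle):
    """Parse a 2-column tab-delimited file: Entrez Gene ID, normalized score."""
    scores = {}
    reader = csv.reader(handle, delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated columns, got {len(row)}"
            )
        gene_id, raw_score = row[0].strip(), row[1].strip()
        if not gene_id.isdigit():
            raise ValueError(
                f"line {line_no}: {gene_id!r} is not a numeric Entrez Gene ID"
            )
        try:
            score = float(raw_score)
        except ValueError:
            raise ValueError(
                f"line {line_no}: {raw_score!r} is not a number"
            ) from None
        scores[gene_id] = score
    return scores


# Two sample lines; the scores are made up for illustration.
example = io.StringIO("7157\t-2.31\n672\t1.05\n")
scores = read_scores(example)
```

To check a real file, pass an open file handle instead, for example `read_scores(open("my_screen.txt", newline=""))`, where the file name is a placeholder. A gene symbol such as TP53 in the first column raises a `ValueError`, which signals that the identifiers need to be converted first.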

Access to the system

HTS-Net is accessible from our Mobyle portal at the following address: http://htsnet.marseille.inserm.fr/index.html by clicking on the “HTS-Net RUN page on Mobyle portal” link.

You do not need an account on our Mobyle portal and can use it directly as a guest user. Once you access the system, you are presented with a parameter form. This form is used to customize the HTS-Net analysis. Help can be obtained for any parameter by clicking on the ‘?’. All parameters marked with an asterisk (*) are mandatory.

HTS-Net welcome screen

List of parameters to set up

  • Screening score file: Upload your formatted data file (or paste the content). We suggest using the Chia-Fav dataset provided on the HTS-Net Web page: http://htsnet.marseille.inserm.fr/data-examples.html.

  • Seed list: This limits the recursive exploration of the regulome and interactome to a set of genes of specific interest. This is optional, and we won’t use it in the current example.

  • Taxon: Here, we specify Homo sapiens.

  • Minimum score threshold: The minimum score required to keep a subnetwork before statistical validation. Here, we specify 0.3.

  • Minimum improvement value: This is the threshold th of minimal score improvement used for subnetwork aggregation. Here, we specify 0.05.

  • Absolute value: This specifies whether we detect subnetworks with both positive and negative gene z-scores. We will use “No”.

  • Use Anticorrelation: This specifies that we wish to detect genes with highly negative z-scores. This is set to ‘Yes’ for our example.

  • Fitting model: This specifies the model to use for null distribution of score. This is set to ‘Gamma’.

  • P-values Interactome and Regulome: These are the p-values used for the comparison of selected subnetworks and random subnetworks. They are left at their default (Auto) for the present tutorial. The program will automatically choose a p-value separately for interactome and regulome subnetworks in order to obtain a manageable number of subnetworks.

  • Overlap: If the overlap between two subnetworks (the proportion of proteins/genes they have in common) exceeds this rate, only the subnetwork with the best score is kept. We keep the default of 0.8 (80%).

  • Interactome-Regulome subnetworks connectivity threshold: This specifies the number of common nodes between interactome and regulome subnetworks for the construction of meta-subnetworks. Set to 1.

  • Project name: Please specify a name for your project in this box (e.g. “project_screening”).

  • Analysis name: Please specify a name for your analysis in this box (e.g. “chia_analysis”). This is used throughout the final report.

  • Other optional parameters are left to their default. Note that non-default values are displayed in yellow windows.
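The effect of the Overlap parameter can be illustrated with a short sketch. It is not the HTS-Net implementation and rests on two assumptions: the overlap rate is computed as the number of shared nodes divided by the size of the smaller subnetwork, and a higher score counts as better. The function names are hypothetical.

```python
def overlap_rate(a, b):
    """Assumed definition: shared nodes divided by the smaller subnetwork size."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def filter_overlapping(subnets, max_overlap=0.8):
    """Keep subnetworks from best to worst score, dropping any subnetwork
    whose overlap with an already kept one exceeds max_overlap.

    subnets is a list of (score, nodes) tuples.
    """
    kept = []
    for score, nodes in sorted(subnets, key=lambda s: s[0], reverse=True):
        if all(overlap_rate(nodes, k_nodes) <= max_overlap for _, k_nodes in kept):
            kept.append((score, set(nodes)))
    return kept


# Toy subnetworks with made-up node names and scores.
subnets = [
    (0.7, {"g1", "g2", "g3", "g4", "g5", "g6"}),
    (0.9, {"g1", "g2", "g3", "g4", "g5"}),
    (0.5, {"g8", "g9"}),
]
kept = filter_overlapping(subnets, max_overlap=0.8)
```

In this toy case the subnetwork scoring 0.7 fully contains the one scoring 0.9, so it is discarded, while the unrelated subnetwork scoring 0.5 is kept.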

Once done, your configuration should look as follows.