Tutorial: High-Throughput Screening Network analysis with the HTS-Net Pipeline

Authors & Copyright: Claire Rioualen, Ghislain Bidaut. This document has been placed under the CeCILL Licence.

Introduction

HTS-Net, a network-based analysis program, was developed to identify gene regulatory modules impacted in high-throughput screenings by integrating regulatory data (regulome), protein-protein interactions (interactome) and RNAi screening z-scores.

HTS-Net works by discovering subnetworks in parallel on an interactome and a regulome network. RNAi-based gene scores are superimposed on the interaction and the regulation maps separately. Each map is then searched for high-scoring areas. The identified regions of interest, called subnetworks, are extracted and reported. The resulting modules are then merged to form meta-subnetworks that incorporate both regulation information and PPIs.
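To give an intuition of what searching a map for high-scoring areas can look like, here is a minimal sketch of a greedy expansion from a seed gene. It is not the HTS-Net implementation: the scoring function (sum of z-scores divided by the square root of the subnetwork size), the function names and the toy gene labels are all assumptions made for illustration. The only idea borrowed from the tutorial is that a neighbour is added only when it improves the score by at least a minimum improvement value.

```python
import math


def subnetwork_score(members, z):
    # ASSUMPTION: aggregate score = sum of member z-scores / sqrt(size).
    # The scoring function actually used by HTS-Net is not described here.
    return sum(z[g] for g in members) / math.sqrt(len(members))


def grow_subnetwork(seed, neighbors, z, min_improvement=0.05):
    """Greedily expand a subnetwork from `seed` (hypothetical helper)."""
    members = {seed}
    score = subnetwork_score(members, z)
    while True:
        # Scored genes directly connected to the current subnetwork.
        frontier = {n for m in members for n in neighbors.get(m, ()) if n in z}
        frontier -= members
        best_node, best_score = None, score
        for node in sorted(frontier):
            candidate = subnetwork_score(members | {node}, z)
            if candidate > best_score:
                best_node, best_score = node, candidate
        # Stop when no neighbour improves the score by at least the threshold.
        if best_node is None or best_score - score < min_improvement:
            return members, score
        members.add(best_node)
        score = best_score


# Toy network and z-scores (made-up labels and values).
neighbors = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"], "D": ["A"]}
z = {"A": 2.0, "B": 1.8, "C": 0.1, "D": -1.0}

members, score = grow_subnetwork("A", neighbors, z)
print(sorted(members), round(score, 3))
```

In this toy case the expansion stops after adding B, because neither C nor D raises the score any further.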

The program takes normalized data as input, in the form of a list of genes with associated score values. Two organisms are supported so far (Homo sapiens and Mus musculus). The data is superimposed on two networks:

A protein-protein interaction network (interactome)
A transcription factor-target gene network (regulome)

These networks were aggregated from the following databases:

Interactome: DIP, HPRD, I2D, IntAct, MINT, Proteinpedia, Mysickova et al., Ravasi et al., Yu et al.
Regulome: ITFP, TRED, PAZAR, ORegAnno

After detection and statistical validation by data shuffling, subnetworks detected in the interactome and in the regulome are integrated by measuring their identity (number of common nodes).
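The integration step can be pictured as a comparison of node sets. The sketch below is only an illustration under assumptions: the function name, the subnetwork names and the gene identifiers are hypothetical, and HTS-Net may build its meta-subnetworks differently. It pairs every interactome subnetwork with every regulome subnetwork that shares at least a given number of nodes.

```python
def pair_subnetworks(interactome_snws, regulome_snws, min_common=1):
    """Pair subnetworks sharing at least `min_common` nodes (hypothetical helper)."""
    pairs = []
    for i_name, i_nodes in sorted(interactome_snws.items()):
        for r_name, r_nodes in sorted(regulome_snws.items()):
            common = set(i_nodes) & set(r_nodes)
            if len(common) >= min_common:
                pairs.append((i_name, r_name, sorted(common)))
    return pairs


# Made-up subnetworks; node labels stand in for Entrez Gene identifiers.
interactome_snws = {
    "int-snw-1": {"7157", "4193", "1026"},
    "int-snw-2": {"672", "675"},
}
regulome_snws = {
    "reg-snw-1": {"7157", "1026", "581"},
    "reg-snw-2": {"2099"},
}

pairs = pair_subnetworks(interactome_snws, regulome_snws, min_common=1)
print(pairs)
```

Here only int-snw-1 and reg-snw-1 are paired, since they are the only two subnetworks with nodes in common.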

Note that non-free data (TRANSFAC) is not available through this implementation. It is also possible to use HTS-Net from the command line by downloading the source code available at http://htsnet.marseille.inserm.fr/distribution.html

Usage

The HTS-Net pipeline is accessible through our Mobyle portal. Mobyle lets you query the input data, verify the integrity of the data and arguments, and trace your run. It will also send you an email when the HTS-Net run is done.

Data Formatting

The input file containing the dataset (screening scores) must follow the format accepted by HTS-Net:

2-column tab-delimited file
first column contains EntrezGene identifiers
second column contains the associated normalized score measured from the screening
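Before uploading, it can be useful to check the file locally. The following sketch parses a file in the format described above and reports the first problem it finds. The function name and the example values are hypothetical, and the sketch assumes the file has no header line and that EntrezGene identifiers are purely numeric.

```python
import csv
import io


def parse_scores(handle):
    """Read a 2-column tab-delimited score file into {entrez_id: score}."""
    scores = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # ignore empty lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, found {len(row)}")
        gene_id, score = row[0].strip(), row[1].strip()
        # ASSUMPTION: EntrezGene identifiers are plain integers, no header line.
        if not gene_id.isdigit():
            raise ValueError(f"line {line_no}: {gene_id!r} is not an EntrezGene identifier")
        scores[gene_id] = float(score)  # raises ValueError if not a number
    return scores


# Made-up example content; replace with open("your_file.txt", newline="").
example = io.StringIO("7157\t-2.31\n4193\t1.05\n672\t-0.47\n")
scores = parse_scores(example)
print(len(scores), "genes read")
```

A file that uses gene symbols instead of EntrezGene identifiers is rejected by this check, which is the case where the identifier conversion utility becomes necessary.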

If you need to convert gene IDs from EntrezGene identifiers to Gene Symbols or the reverse, you can use the convertGID_GS utility from our Mobyle portal: http://mobylehome.marseille.inserm.fr/cgi-bin/portal.py#forms::convertGID_GS

This is especially important as HTS-Net works internally with Entrez Gene identifiers and does not accept any other type of identifier.

Access to the system

HTS-Net is accessible from our Mobyle portal at the following address: http://htsnet.marseille.inserm.fr/index.html by clicking on the “HTS-Net RUN page on Mobyle portal” link.

You don’t need to possess an account on our Mobyle portal and can directly use it as a guest user. Once you access the system, you are welcomed with a parameter form. This form allows the customization of the HTS-Net analysis. Help can be obtained for all parameters by clicking on the ‘?’. All parameters marked with an asterisk (*) are mandatory.

HTS-Net welcome screen

List of parameters to set up

Screening score file: Upload your formatted data file (or paste the content). We suggest using the Chia-Fav dataset provided on the HTS-Net Web page: http://htsnet.marseille.inserm.fr/data-examples.html.
Seed list: This limits the recursive exploration of the regulome and interactome to a set of genes of specific interest. This is optional, and we won’t use it in the current example.
Taxon: Here, we specify Homo sapiens.
Minimum score threshold: The minimum score required to keep a subnetwork before statistical validation. Here, we specify 0.3.
Minimum improvement value: This is the threshold th of minimal score improvement used for subnetwork aggregation. Here, we specify 0.05.
Absolute value: This specifies whether we detect subnetworks with both positive and negative gene z-scores. We will use “No”.
Use Anticorrelation: This specifies that we wish to detect genes with highly negative z-scores. This is set to ‘Yes’ for our example.
Fitting model: This specifies the model to use for the null distribution of scores. This is set to ‘Gamma’.
P-values Interactome and Regulome: These are the p-values used for the comparison of selected subnetworks and random subnetworks. They are left at their default (Auto) for the present tutorial. The program will automatically choose a separate p-value for interactome and regulome subnetworks in order to obtain a manageable number of subnetworks.
Overlap: If the overlap between two subnetworks (the proportion of proteins/genes they have in common) exceeds this rate, only the subnetwork with the best score is kept. We keep the default of 0.8 (80%).
Interactome-Regulome subnetworks connectivity threshold: This specifies the number of common nodes between interactome and regulome subnetworks for the construction of meta-subnetworks. Set to 1.
Project name: Please specify a name for your project in this box (e.g. “project_screening”).
Analysis name: Please specify a name for your analysis in this box (e.g. “chia_analysis”). This is used throughout the final report.
Other optional parameters are left at their default values. Note that non-default values are displayed in yellow windows.
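To make the effect of the Overlap parameter more concrete, here is a minimal sketch of such a filter. It is written under assumptions and is not taken from the HTS-Net source: the function name is hypothetical, subnetworks are assumed to be non-empty, and the overlap is computed relative to the smaller of the two subnetworks, which is only one possible reading of "proportion of proteins/genes in common".

```python
def filter_overlapping(subnetworks, max_overlap=0.8):
    """Keep the best-scoring subnetwork among those that overlap too much.

    `subnetworks` is a list of (name, score, nodes) tuples (hypothetical layout).
    """
    kept = []
    # Visit subnetworks from best to worst score, so the best one always wins.
    for name, score, nodes in sorted(subnetworks, key=lambda s: s[1], reverse=True):
        nodes = set(nodes)
        # ASSUMPTION: overlap = shared nodes / size of the smaller subnetwork.
        redundant = any(
            len(nodes & kept_nodes) / min(len(nodes), len(kept_nodes)) > max_overlap
            for _, _, kept_nodes in kept
        )
        if not redundant:
            kept.append((name, score, nodes))
    return kept


# Made-up subnetworks and scores.
candidates = [
    ("snw-a", 0.9, {1, 2, 3, 4, 5}),
    ("snw-b", 0.6, {1, 2, 3, 4, 5, 6}),  # contains all of snw-a: overlap 1.0
    ("snw-c", 0.5, {5, 7, 8}),           # shares 1 node of 3 with snw-a
]
kept_names = [name for name, _, _ in filter_overlapping(candidates, max_overlap=0.8)]
print(kept_names)
```

With the default rate of 0.8, snw-b is discarded in favour of the better-scoring snw-a, while snw-c is kept because it shares only one of its three nodes with snw-a.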

Once done, your configuration should look as follows.

Configuration of HTS-Net

Start HTS-Net run

To start the analysis, click on the Run button at the top of the form.

The system prompts for an email address. Please provide your email in the box.

E-mail prompt

Then, there is a Captcha confirmation. Please fill in the text appropriately.

Captcha

After this operation, a tab containing the HTS-Net run appears. You can verify the parameters from this display.

Summary of the parameters

You can close your browser if necessary; the run will continue on our server.

HTS-Net results

After a couple of hours, you will receive a notification at the email address provided above (be careful: this email may be caught by your spam filter). Results are provided in the form of a URL that links back to a page showing the following form.

Results

Three files are available from this page:

HTS-Net analysis report.

This file contains the analysis report (list of subnetworks detected, list of genes, and annotations) in HTML format.

Standard output and Standard error files.

These files contain the messages produced by HTS-Net in the standard output and standard error. This is mostly useful for developers and advanced users.

You can download the HTML report by clicking on the save button, unzip it on your desktop and open the index.html file.

Results 2

Three tabs are provided: Interactome, Regulome, Integrated Analysis. Each of these gives access to a list of subnetworks, a list of genes contained in these subnetworks, and a list of enriched GO terms. Please refer to the publication for details on the analysis provided.

By clicking on the Integrated Analysis link and then on the meta-reg-snw-54920 link, you can display the following subnetwork as an example.

Results 3