jClust: A tool for clustering analysis

Installation
This program is available only for 32-bit systems. Java 6 or higher must be installed for the application to run.
To run it from the command line, place all of the files in the same folder and type:
java -Xmx350M -jar 'jar filename'
The -Xmx parameter sets the maximum amount of memory available to Java, in this example 350 MB of RAM. More memory is better for the application, especially when running jClust on large-scale networks. Please download and unzip the application, and modify the .bat file if necessary.
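When the memory setting needs to change from run to run, the launch command above can be assembled programmatically. The following is a minimal sketch only: the helper name `build_jclust_command` and the file name `jClust.jar` are hypothetical placeholders for whatever jar file ships with your download, and the only option used is the standard `-Xmx` flag of the `java` launcher.

```python
import shlex


def build_jclust_command(jar_path, heap_mb=350):
    """Return the argument list that launches a jar with a given maximum heap size."""
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    # -Xmx<N>M sets the maximum Java heap size in megabytes.
    return ["java", f"-Xmx{heap_mb}M", "-jar", jar_path]


# "jClust.jar" is a placeholder name (assumption), not the confirmed file name.
command = build_jclust_command("jClust.jar", heap_mb=1024)
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run` unchanged if the application should be started from a script.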
To run the system on 64-bit systems, recompile the C source code.

INPUT FILE
The input file is usually a list of weighted or unweighted connections. An example is given below.
cat	hat	1
hat	bat	1
bat	cat	1
bit	fit	1
fit	hit	1
hit	bit	1
or
RPL7A	IPI1	1
EFT2	ACS2	1
ZUO1	ASC1	1
NAB3	NAB6	1
MED7	YAP1	1
NOC4	IMP3	1
MRPL3	MRP7	1
MRP7	MRPL7	1
NOP13	ERB1	1
FUR1	RVB2	1
TPS1	IML1	1
IML1	CTR9	1
RPN2	SAM1	1
RPL13A	BRX1	1
SEC27	RVB2	1
RSC8	HHF2	1
RPL2A	RPL25	1
YMR31	CDC19	1
SSN2	MED6	1
RAD50	RPT3	1
APL6	HMO1	1
RIX7	RRP1	1
ERG13	RPS25A	1
ERG13	RRN3	1
WTM1	ARO3	1
The file must be tab-delimited. Sample data files are downloaded with the application. These data refer to protein-protein interactions. The related articles for these data are:
Ito dataset - 4038 edges and 3279 vertices
Ito T, ..., Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proceedings of the National Academy of Sciences 2001, 98(8):4569-4574.
Tong dataset - 7430 edges and 2262 vertices
Tong AH, ..., Chang M et al: Global mapping of the yeast genetic interaction network. Science 2004, 303(5659):808-813.
Krogan dataset - 7088 edges and 2675 vertices
Krogan NJ, ..., Tikuisis AP et al: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 2006, 440(7084):637-643.
Gavin_2002 dataset - 3210 edges and 1352 vertices
Gavin AC, ..., Cruciat CM et al: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415(6868):141-147.
Gavin_2006 dataset - 26531 edges and 1430 vertices
Gavin AC, ..., Dumpelfeld B et al: Proteome survey reveals modularity of the yeast cell machinery. Nature 2006, 440(7084):631-636.
DIP dataset - 17491 edges and 4934 vertices
Xenarios I, ..., Eisenberg D: DIP: the database of interacting proteins. Nucleic Acids Res 2000, 28(1):289-291.
All of these datasets are downloaded with the jClust application.
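To check that a file follows the tab-delimited format described above before loading it, the edge list can be parsed and its edges and vertices counted, which also makes it easy to compare a sample file against the figures listed for each dataset. This is a minimal sketch, not jClust's own parser: the function names are illustrative, and treating a missing third column as weight 1 is an assumption about how unweighted files are written.

```python
import csv
import io


def read_edge_list(handle):
    """Parse tab-delimited lines of the form: node_a <TAB> node_b [<TAB> weight]."""
    edges = []
    for row in csv.reader(handle, delimiter="\t"):
        fields = [field.strip() for field in row]
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue  # skip blank or incomplete lines
        # Assumption: a missing third column means an unweighted edge (weight 1).
        weight = float(fields[2]) if len(fields) > 2 and fields[2] else 1.0
        edges.append((fields[0], fields[1], weight))
    return edges


def count_vertices(edges):
    """Count the distinct node names that appear at either end of any edge."""
    return len({node for a, b, _ in edges for node in (a, b)})


# The first example from this page, written as tab-delimited text.
sample = "cat\that\t1\nhat\tbat\t1\nbat\tcat\t1\nbit\tfit\t1\nfit\thit\t1\nhit\tbit\t1\n"
edges = read_edge_list(io.StringIO(sample))
print(len(edges), "edges,", count_vertices(edges), "vertices")
```

For a file on disk, pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object.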

Interface and intermediate files
The main interface is shown below. The user needs to load the file containing the interactions and set the parameters of the algorithms as described below.

This panel shows the contents of the loaded file, in this case the protein-protein interactions (PPIs) it contains.

The panel below shows the clustering results produced by the initial clustering methods.

The panel below shows the intermediate results produced by the clustering methods.
These results later serve as the input for the secondary filtering methods. They contain information about the clusters formed and the interactions of the nodes within the clusters. These interactions were found in the initial input file.
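The idea of pairing each cluster with the interactions among its own members can be sketched as follows. This is an illustration under stated assumptions, not jClust's implementation: the function name is hypothetical, clusters are assumed not to overlap, and the `cat`-`bit` edge is added here only to show an interaction that crosses clusters and is therefore left out.

```python
def intra_cluster_edges(clusters, edges):
    """Group input edges by cluster, keeping only edges whose two ends share a cluster."""
    # Map every node to the index of its cluster (assumes non-overlapping clusters).
    membership = {node: idx for idx, cluster in enumerate(clusters) for node in cluster}
    grouped = {idx: [] for idx in range(len(clusters))}
    for a, b, weight in edges:
        idx = membership.get(a)
        if idx is not None and idx == membership.get(b):
            grouped[idx].append((a, b, weight))
    return grouped


edges = [
    ("cat", "hat", 1), ("hat", "bat", 1), ("bat", "cat", 1),
    ("bit", "fit", 1), ("fit", "hit", 1), ("hit", "bit", 1),
    ("cat", "bit", 1),  # hypothetical edge between the two clusters
]
clusters = [{"cat", "hat", "bat"}, {"bit", "fit", "hit"}]

for idx, kept in intra_cluster_edges(clusters, edges).items():
    print(f"cluster {idx}: {len(kept)} internal edges")
```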
This file, like every file, is stored locally on the hard disk. For a big dataset, the user can then load this file directly and skip the time-consuming MCL run. This was done mainly for time efficiency.
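The reasoning behind storing intermediate results can be illustrated with a small load-or-compute pattern: if the stored file exists it is read back, otherwise the expensive step runs once and its output is saved. The sketch below is only an analogy. The JSON format, the file name `clusters.json`, and the function names are assumptions, and `slow_clustering` is a stand-in that returns fixed clusters rather than running MCL.

```python
import json
import os
import tempfile


def load_or_compute(cache_path, compute):
    """Return stored clusters if the file exists; otherwise compute and store them."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as handle:
            return json.load(handle)
    clusters = compute()
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(clusters, handle)
    return clusters


calls = []


def slow_clustering():
    # Stand-in for an expensive clustering run (not MCL itself).
    calls.append(1)
    return [["cat", "hat", "bat"], ["bit", "fit", "hit"]]


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "clusters.json")  # hypothetical file name
    first = load_or_compute(path, slow_clustering)   # computes and saves
    second = load_or_compute(path, slow_clustering)  # reads the saved file

print(first == second, len(calls))
```

The second call never invokes the clustering function, which is the time saving described above.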

This panel, the most important one, shows the final results of the workflow after the clustering and filtering computation.
Here we can see the number of clusters together with their elements. Results are also stored locally on the hard disk.

This is the help panel, where the user can see the meaning of the various algorithm parameters.

This panel holds information about Medusa's input file. Medusa can be called from within the jClust application, and the file can be loaded separately. The option to visualize the predefined clusters forces Medusa to visualize distinct clusters. An example is given below.

Below we see what the initial interaction network looks like. The bottom-left image shows how layout algorithms help to isolate the connections of specific nodes, and the bottom-right image shows what the clusters look like. Medusa also comes with new interactivity and richer functionality, which makes exploring these networks easier.
