I am contacting you regarding the OrthoClust program that your group has on GitHub; I have a couple of questions about how to apply the program to new datasets. First, how was the co-appearance matrix calculated from the OrthoClust output? Second, is it necessary to modify the initial number of spin states (q) or the coupling constant (k) used in the 2014 Genome Biology paper? I cannot find these options in the current release and am wondering where these values can be changed in the code.
The current implementation on GitHub is based on a heuristic rather than the simulated annealing method used in the 2014 Genome Biology paper. The initial number of spin states q is no longer a parameter you have to supply; it is set to the total number of nodes in the system. As explained in the README, the coupling constant k is supplied in one of the input files (the coupling information file), as the 3rd column of that file. In my example (the ortho_info file in the data folder), the third column is all 1, meaning k=1.
For the co-appearance matrix, note that the output file is a tab-delimited file with three columns. The 1st and 2nd columns are the species id and the gene id given by the input files. The 3rd column is a module id. If there are N1 genes in species 1 and N2 genes in species 2, the co-appearance matrix has dimensions (N1+N2) by (N1+N2). One should build a map from the genes in the individual species to indices running from 1 to (N1+N2). If there are n genes in module 1, then the matrix elements corresponding to all pairwise combinations of these n genes should be set to 1, and likewise for every other module.
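To make the recipe concrete, here is a minimal Python sketch, assuming only what is described above: a tab-delimited file with species id, gene id, and module id in columns 1 to 3. The function names (read_modules, coappearance_matrix) and the example genes are hypothetical and not part of OrthoClust. Indices here run from 0 rather than 1, and the diagonal is set to 1 as well, which is my own choice.

```python
import csv
import io
from collections import defaultdict


def read_modules(handle):
    """Read (species_id, gene_id, module_id) rows from a tab-delimited handle."""
    rows = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        rows.append((row[0], row[1], row[2]))
    return rows


def coappearance_matrix(rows):
    """Return (index, matrix); matrix[i][j] is 1 if genes i and j share a module."""
    # Map each (species, gene) pair to one row/column index (0-based).
    index = {}
    for species, gene, _ in rows:
        index.setdefault((species, gene), len(index))

    # Collect the indices of the genes belonging to each module.
    members = defaultdict(list)
    for species, gene, module in rows:
        members[module].append(index[(species, gene)])

    n = len(index)
    matrix = [[0] * n for _ in range(n)]
    for genes in members.values():
        for i in genes:
            for j in genes:
                matrix[i][j] = 1  # assumption: the diagonal is marked too
    return index, matrix


# Made-up example: two species, five genes, two modules.
example = "1\tgeneA\t1\n1\tgeneB\t2\n2\tgeneX\t1\n2\tgeneY\t2\n2\tgeneZ\t1\n"
rows = read_modules(io.StringIO(example))
index, matrix = coappearance_matrix(rows)
```

For a real run, open the OrthoClust output file and pass the file handle to read_modules in place of the io.StringIO object. For large gene sets a dense list of lists becomes memory-hungry, so a sparse representation may be preferable.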
A single output file yields a co-appearance matrix containing only 0s and 1s. If you have multiple output files from multiple runs of the algorithm, adding the per-run matrices together gives the final co-appearance matrix shown in the Genome Biology paper. Of course, to make a plot like the heat map shown in the paper, one has to further cluster the matrix to arrange the rows and columns.
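The summation over runs could look roughly like the sketch below. It assumes each run has already been parsed into a dictionary from (species id, gene id) to module id; the function name sum_coappearance and the example data are hypothetical. Module ids need not agree between runs, since only co-membership within a run is counted.

```python
def sum_coappearance(runs):
    """Sum per-run co-appearance matrices.

    runs: list of dicts mapping (species_id, gene_id) -> module_id, one per run.
    Returns (genes, counts), where counts[i][j] is the number of runs in which
    genes[i] and genes[j] were assigned to the same module.
    """
    # Fix one shared gene ordering so that indices agree across runs.
    genes = sorted({gene for run in runs for gene in run})
    index = {gene: i for i, gene in enumerate(genes)}
    n = len(genes)
    counts = [[0] * n for _ in range(n)]
    for run in runs:
        for a in run:
            for b in run:
                if run[a] == run[b]:
                    counts[index[a]][index[b]] += 1
    return genes, counts


# Made-up example: the same three genes in two runs with different module ids.
run1 = {("1", "geneA"): "1", ("1", "geneB"): "2", ("2", "geneX"): "1"}
run2 = {("1", "geneA"): "5", ("1", "geneB"): "5", ("2", "geneX"): "7"}
genes, counts = sum_coappearance([run1, run2])
```

The double loop over genes is quadratic per run, which is fine for a sketch but slow for whole genomes. To order the rows and columns for a heat map, one option (an assumption on my part, not necessarily what was done for the paper) is hierarchical clustering, for example with linkage and leaves_list from scipy.cluster.hierarchy.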
If you use Julia, I may be able to send you a little script.