Compilation error when building the 3v program

Q:
I downloaded Voss Volume Voxelator from the website
http://3vee.molmovdb.org/sourcecode.php. When I opened the 3v source and
ran "make" to compile it, the following error occurred:

####################################################################
cpuflags: cpuflags : unknown Intel Pentium3 model "42"
g++ -Wall -O3 -fomit-frame-pointer -ffast-math -funroll-loops -march=i686
-fopenmp -c -o utils-main.o utils-main.cpp
utils-main.cpp:1:0: error: CPU you selected does not support x86-64
instruction set
make: *** [utils-main.o] Error 1
####################################################################

I ran "uname -a", which reports the following:
**********************************************************************************
Linux localhost.localdomain 2.6.35.6-45.fc14.x86_64
#1 SMP Mon Oct 18 23:57:44 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
**********************************************************************************

What is going wrong with the compilation of the 3v program? Could you give me
some suggestions?

A:
Could you download the latest version from Subversion? It should fix your problem:

svn checkout http://vossvolvox.svn.sourceforge.net/svnroot/vossvolvox/ vossvolvox/
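
For context, the error in the log arises because -march=i686 names a 32-bit-only CPU, while g++ on an x86_64 system targets 64-bit code by default. The sketch below illustrates the idea of picking the flag from the machine type reported by "uname -m". The function name and the mapping are assumptions made for illustration; this is not necessarily what the updated Makefile or the cpuflags script does.

```python
def arch_flag(machine):
    """Return a GCC -march flag that matches the reported machine type.

    `machine` is the hardware name printed by `uname -m`, for example
    "x86_64" or "i686". The mapping is illustrative only.
    """
    machine = machine.strip().lower()
    if machine in ("x86_64", "amd64"):
        # A 64-bit target needs a CPU name that supports x86-64.
        return "-march=x86-64"
    if machine in ("i386", "i486", "i586", "i686"):
        # 32-bit x86: the machine name doubles as a valid -march value.
        return "-march=" + machine
    # Unknown hardware: emit no -march flag and let the compiler decide.
    return ""


print(arch_flag("x86_64"))  # -march=x86-64
print(arch_flag("i686"))    # -march=i686
```

Removing the -march option from the FLAGS line altogether is another way to avoid the mismatch.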

Voss Volume Voxelator crashes in Mac OS X shell with “Segmentation fault”

Q:

I ran into a problem using your 3v tool for the shell. I am using Mac OS X 10.6 with Xcode installed. During compiling I get the following warnings:

cpuflags: CPU "core" not supported by compiler "gcc-2335.9" on Darwin
g++ -Wall -O3 -fomit-frame-pointer -ffast-math -funroll-loops -fopenmp -o AllChannel.exe utils-main.o utils-output.o utils-mrc.o allchannel.cpp
allchannel.cpp: In function ‘int main(int, char**)’:
allchannel.cpp:57: warning: format not a string literal and no format arguments
allchannel.cpp:67: warning: format not a string literal and no format arguments
allchannel.cpp:73: warning: format not a string literal and no format arguments
chmod 755 AllChannel.exe
mv AllChannel.exe ../bin

These warnings appear for all tools in the package. I tried the volume tool and compared it with your web service. Both generated the same result without any error message. Then I moved on and tried screening for channels. This time, however, the program crashes with the message "segmentation fault". The command used was:

../bin/AllChannel.exe -b 12.000 -s 1.400 -t 1.500 -g 0.500 -v 700 -i 2RH2_WT67_model1-noions.xyzr 2> output.log

I compared it with the console output from the web service. Until the segmentation fault, both outputs (local and web) are exactly the same. Do you have any idea what might cause the problem?

A:
I compiled it on a Mac once, but it has been a while. My guess is that you do not have OpenMP installed on your Mac (a simple library that allows multiprocessor support).

Go into the Makefile (it is a text file) and change the FLAGS line at the top to just:

FLAGS = -Wall

then try it again. Let me know if it works or not.

From what I can tell, you are not using OpenMP (the library is named libgomp). If you run ‘top -s3 -ocpu’, can you tell whether it is using more than 100% CPU?

OK, back to the seg fault. I have no idea what causes it; as you said, it is the exact same command. I can see that it dies somewhere between lines 180 and 300, but it does not run lines 202-266.

Can you go into utils.h and change the DEBUG flag to 1:

#define DEBUG 1

then compile and run it again.

Also – Did you do a ‘make clean’ before compiling after changing the DEBUG flag? Because there should be a lot more output.

Another thing to try is slightly adjusting the grid size (go from 0.5 to 0.51), to see if it is trying to access memory outside the bounds of the array.

Changing parameters (especially the # of residues per site) in the scheme for identifying surface-critical residues

Q:

I would like to ask for advice on your recently published STRESS tool. I would like to use it to identify residues that might be involved in allostery. However, in the case of surface-critical residues, it currently reports only up to ten residues per binding pocket. Is there a way to "hack" the identify_high_confidence_BL_sites.py script (which writes this file) to write all such residues, according to some probability cutoff? (I’m a Perl programmer, with relatively limited Python experience.) Thank you in advance.

A:
In order to change the number of residues reported for each site, you may download and modify two of the C source files (available on the STRESS GitHub page).

For example, to modify the scripts such that the new limit is set to 15 residues (you can change this cutoff number to be any arbitrary value that you wish), do the following:

1) In the script bindingSiteMeasures.c, replace instances of "10" with "15" in lines 77, 79, 217, 219, 240, 242.

2) In the script surfaceProbe.c, replace instances of "10" with "15" in lines 1227, 1229, 1235, 1237.

3) Recompile the source code and re-run the calculations.
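
If you prefer not to edit the lines by hand, steps 1 and 2 can be sketched as a small Python helper. This is only a sketch: the function name is made up for illustration, and the line numbers above refer to a particular version of the files, so please check each listed line before overwriting anything.

```python
import re


def replace_on_lines(text, line_numbers, old="10", new="15"):
    """Replace the standalone number `old` with `new` on the given 1-based lines.

    Lines that are not listed are left untouched. Digits that are part of
    a longer number or identifier (e.g. "100" or "x10") are not changed.
    """
    pattern = re.compile(r"(?<![\w.])" + re.escape(old) + r"(?![\w.])")
    targets = set(line_numbers)
    lines = text.splitlines(keepends=True)
    for number, line in enumerate(lines, start=1):
        if number in targets:
            lines[number - 1] = pattern.sub(new, line)
    return "".join(lines)


# Example on a made-up three-line snippet (not the real STRESS source):
sample = "int top[10];\nint big = 100;\nfor (i = 0; i < 10; i++)\n"
print(replace_on_lines(sample, [1, 2]))
```

To apply it to a real file, read the file into a string, call the function with the line numbers from the relevant step, and write the result to a new file so that the original is preserved.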

However, with respect to the parameters in general, note that the parameters of the surface-site identification scheme were established empirically, using a known set of allosteric residues, to best capture known allosteric sites. The details can be found in the Supplementary Materials of the paper, specifically in Supplementary section 3.1-a-iii, "Defining & Applying Thresholds to Select High-Confidence Surface-Critical Sites". Because the parameters were empirically optimized, we would advise against changing them.

Batch submissions to the STRESS server

Q:
I have a large number of structures that I would like to submit to the STRESS server. Does the server offer an option for batch submissions?

A:
The STRESS server itself does not currently provide an option for batch submissions. However, we encourage users to try implementing such jobs by running the source code available on our GitHub page. This may be accessed through github.com/gersteinlab/STRESS
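
As a rough sketch of what such a local batch job could look like, the Python helper below builds one command per structure file in a directory. The actual command for running STRESS locally is not given here and must be taken from the repository's documentation; the prefix "./run_stress_locally.sh" in the comment is a hypothetical placeholder, as are the function and directory names.

```python
from pathlib import Path


def build_batch_commands(structure_dir, command_prefix):
    """Return one command (as an argument list) per .pdb file, in sorted order.

    `command_prefix` is the program and any fixed options; the path of
    each structure file is appended as the last argument.
    """
    pdb_files = sorted(Path(structure_dir).glob("*.pdb"))
    return [list(command_prefix) + [str(path)] for path in pdb_files]


# Usage (HYPOTHETICAL command name; replace with the real one):
#
#   import subprocess
#   for cmd in build_batch_commands("structures", ["./run_stress_locally.sh"]):
#       subprocess.run(cmd, check=False)
```

Building the commands separately from running them makes it easy to print and inspect the list before starting a long job.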

Dealing with system boundaries in Voronoi calculations & assigning radii to pseudo-water

Q:
I’m analyzing protein structures (specifically, I’m performing a Voronoi-based analysis) using the tools linked on the geometry page:

http://geometry.molmovdb.org/NucProt/

I understand that the bisection of distances between atoms means that the radius does not matter. However, what happens at the boundary of the system?

Also, if you add ‘pseudo-water’ to the system, then do the water atoms need to have a particular radius? If not, then is there a distance cutoff?

A:
With respect to your first question (regarding the boundary of the system, i.e., the protein surface): the Voronoi volumes become large and potentially infinite. That’s why you need to introduce solvent. A course lecture may help to explain this further:
http://www.gersteinlab.org/courses/452/09-spring/pdf/structure2.pdf

With respect to your second question (regarding the radius assigned to water): Yes, the water atoms most definitely need to have a particular radius. Distance cutoffs won’t work. You can probably use the normal water radius here.
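
The boundary effect is easy to see in a one-dimensional toy version, sketched below in Python. It is not part of the geometry package; the coordinates are arbitrary illustrative numbers, and the sketch uses plain bisection, so it says nothing about which water radius to choose. Each point's cell runs between the midpoints to its neighbours, so the outermost points have unbounded cells until extra "solvent" points are placed outside them.

```python
import math


def voronoi_cell_lengths_1d(positions):
    """Length of each point's Voronoi cell on a line, in input order.

    Cell walls sit at the midpoints between neighbouring points, so the
    two outermost points get cells that extend to infinity.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    lengths = [0.0] * len(positions)
    for rank, i in enumerate(order):
        if rank == 0:
            left = -math.inf
        else:
            left = (positions[order[rank - 1]] + positions[i]) / 2
        if rank == len(order) - 1:
            right = math.inf
        else:
            right = (positions[i] + positions[order[rank + 1]]) / 2
        lengths[i] = right - left
    return lengths


protein = [0.0, 1.5, 3.0]
print(voronoi_cell_lengths_1d(protein))         # [inf, 1.5, inf]

# Add one pseudo-water on each side and look at the protein atoms only.
solvated = [-3.0] + protein + [6.0]
print(voronoi_cell_lengths_1d(solvated)[1:-1])  # [2.25, 1.5, 2.25]
```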

Change in contact areas as the radii grow larger (but remain in proportion)

Q:

I’m analyzing protein structures (specifically, I’m performing a Voronoi-based analysis) using the tools linked on the geometry page:

http://geometry.molmovdb.org/NucProt/

Is there any work you know of showing how the contact areas change as the radii grow larger but in proportion? … I managed to read in any file definition of atom radii, but this has no effect on the area of polygon faces. I also tried to multiply the atom_vdw[ii], but this too had no effect. The main routine I use is "full-dump-polyhedra.main.c".

A:
I think there’s an easy answer. See the DumpAFace routine in the code linked here:

http://geometry.molmovdb.org/files/libproteingeometry/src-prog/full-dump-polyhedra.c

This prints out :
"– Face between atom %3d and neighbour %3d"
& then
"Face-Area= %9.4f Pyramid-Volume=%9.4f\n",area,FaceVolume

If you vary the radii used to parameterize the program, you can see how the contact area changes, perhaps by tabulating the value of the area variable. The effect is, as you guess, rather small but is related to the way optimal radii sets were selected in the past.
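
One way to tabulate the areas is to parse the program's text output, as in the Python sketch below. It assumes that the output contains lines matching the two format strings quoted above; the exact spacing, and whether the two pieces appear on one line or on two, are assumptions, so the patterns are kept tolerant. The sample text is made up for illustration.

```python
import re

PAIR_RE = re.compile(r"Face between atom\s*(\d+)\s+and neighbour\s*(\d+)")
AREA_RE = re.compile(
    r"Face-Area=\s*(-?\d+(?:\.\d+)?)\s+Pyramid-Volume=\s*(-?\d+(?:\.\d+)?)"
)


def tabulate_faces(output_text):
    """Return (atom, neighbour, area, pyramid_volume) tuples from the output."""
    rows = []
    pair = None
    for line in output_text.splitlines():
        match = PAIR_RE.search(line)
        if match:
            # Remember which pair of atoms the next area belongs to.
            pair = (int(match.group(1)), int(match.group(2)))
        match = AREA_RE.search(line)
        if match and pair is not None:
            rows.append((pair[0], pair[1],
                         float(match.group(1)), float(match.group(2))))
            pair = None
    return rows


# Made-up sample in the quoted format (not real program output):
sample = (
    "-- Face between atom   1 and neighbour   2\n"
    "Face-Area=   12.3456 Pyramid-Volume=   7.8900\n"
)
print(tabulate_faces(sample))  # [(1, 2, 12.3456, 7.89)]
```

Running this on the output of two runs with different radii gives two tables that can be compared face by face.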

With respect to the second part of the question, i.e.:
I also managed to read in any file definition of atom radii but this
has no effect on the area of polygon faces.
I also tried to multiply the atom_vdw[ii] but this too had no effect.
The main routine I use is "full-dump-polyhedra.main.c".

There are a number of possible reasons why this may be happening.

(1) You only have one atom type (i.e., just CA).

In this case, radii are irrelevant and you’re effectively just using bisection. The radii are only relevant when two atoms of different types come into contact, and one has to apportion the space between them.

(2) You have differently typed atoms, but you’re using the normal Voronoi bisection method and not the alternate plane positioning methods using radii.

You need to tell the program explicitly not to use the normal bisection approach via the "-method" argument. See the documentation for calc-volume in the readme file linked here:

http://geometry.molmovdb.org/files/libproteingeometry/src-prog/README

I think this argument works properly for full-dump-polyhedra:

http://geometry.molmovdb.org/files/libproteingeometry/src-prog/full-dump-polyhedra.main.c

See the code for the main() routine to see it being invoked.

(3) You have differently typed atoms & are specifying a non-bisection plane positioning method, but you’re not reading the radii properly.

Here you can verify the atoms are correctly typed by using the show-2rad-refV program, viz:

http://geometry.molmovdb.org/files/libproteingeometry/src-prog/show-2rad-refV.main.c
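
Points (1) and (2) can be illustrated with a small Python sketch of how the dividing plane between two atoms might be positioned. The three formulas are generic textbook-style schemes (bisection, a radius-ratio split, and the radical plane); the method labels are names chosen for this sketch, and which schemes the program actually offers through the "-method" argument should be checked in the README.

```python
def plane_distance(d, r1, r2, method="bisection"):
    """Distance from atom 1 to the dividing plane, for two atoms d apart.

    r1 and r2 are the radii of atom 1 and atom 2.
    """
    if method == "bisection":
        return d / 2.0                      # radii are ignored
    if method == "ratio":
        return d * r1 / (r1 + r2)           # split d in proportion to the radii
    if method == "radical":
        return (d * d + r1 * r1 - r2 * r2) / (2.0 * d)
    raise ValueError("unknown method: " + method)


d = 4.0
# Equal radii (a single atom type): every scheme reduces to bisection.
print([plane_distance(d, 1.5, 1.5, m) for m in ("bisection", "ratio", "radical")])
# [2.0, 2.0, 2.0]

# Different radii: the plane moves toward the smaller atom.
print(plane_distance(d, 1.5, 0.5, "ratio"), plane_distance(d, 1.5, 0.5, "radical"))
# 3.0 2.25

# Doubling both radii leaves the ratio plane where it was.
print(plane_distance(d, 3.0, 1.0, "ratio"), plane_distance(d, 3.0, 1.0, "radical"))
# 3.0 3.0
```

In the ratio formula, only the ratio of the radii enters, so scaling all radii by a common factor does not move the plane at all; in the radical-plane formula it does. This is one more reason why multiplying every radius by the same number may show little or no change in the face areas.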

Re-parameterizing radii when performing Voronoi-based analysis on structures with only the alpha carbon atoms

Q:

I’m analyzing protein structures (specifically, I’m performing a Voronoi-based analysis) using the tools linked on the geometry page:

http://geometry.molmovdb.org/NucProt/

I’m hoping to run the calculations using just the CA atoms, and so I must change the radii accordingly. Where should I start in terms of figuring out the new radii that should be used?

A:
The most recent re-parameterization of the radii is from Neil Voss’s work about ten years ago. See

http://papers.gersteinlab.org/papers/nucprot/

The logic in this paper could be easily extended to derive a set of CA radii.

Voronoi-based analyses of very large structures using tools in NucProt

Q:

I’m analyzing protein structures (specifically, I’m performing a Voronoi-based analysis) using the tools linked on the geometry page:

http://geometry.molmovdb.org/NucProt/

I’m using this on huge systems with more than 99,999 atoms. Is this possible?

A:
I don’t think there’s any hard-coded limit on the number of atoms. Look at the read_pdb_file routine in this source file:

http://geometry.molmovdb.org/files/libproteingeometry/src-lib/readpdb.c

This "mallocs" up space on demand, so in theory, if you have enough memory, I think you can accommodate >100K atoms. However, the PDB format itself is an issue here, since its atom serial number field is only five characters wide. You can modify the PDB reading routines to accept a different format by changing the read_record routine in the same file. However, I don’t know if doing this in multiple "models" will work.
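
To make the format issue concrete: in a PDB ATOM record the serial number occupies columns 7-11, so it cannot hold a value above 99999. The Python sketch below (an illustration only, not the C read_record routine) reads the coordinates by column position and numbers the atoms by their order in the file, so an overflowed serial field does no harm. Whether the C code depends on the serial field would have to be checked in readpdb.c.

```python
def parse_atom_record(line):
    """Extract a few fields from a PDB ATOM/HETATM line by fixed columns."""
    return {
        "serial_text": line[6:11],       # columns 7-11, at most five characters
        "name": line[12:16].strip(),     # columns 13-16
        "res_name": line[17:20].strip(), # columns 18-20
        "x": float(line[30:38]),         # columns 31-38
        "y": float(line[38:46]),         # columns 39-46
        "z": float(line[46:54]),         # columns 47-54
    }


def read_atoms(lines):
    """Number atoms by position in the file instead of trusting the serial field."""
    atoms = []
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")):
            atom = parse_atom_record(line)
            atom["index"] = len(atoms) + 1
            atoms.append(atom)
    return atoms


# Two made-up records; the second has a non-numeric, overflowed serial field.
template = "ATOM  %5s %-4s %3s %1s%4d    %8.3f%8.3f%8.3f"
records = [
    template % ("99999", " CA ", "ALA", "A", 1, 1.0, 2.0, 3.0),
    template % ("*****", " CA ", "GLY", "A", 2, 4.0, 5.0, 6.0),
]
for atom in read_atoms(records):
    print(atom["index"], atom["name"], atom["res_name"], atom["x"], atom["y"], atom["z"])
```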

Distinction Between Surface- and Interior-Critical Residues

Q:
What is the main difference between surface- and interior-critical residues?

A:
Allosteric surface residues play regulatory roles that are fundamentally distinct from those of allosteric residues within the interior. While surface residues may often constitute the sources or sinks of allosteric signals, interior residues act to transmit such signals. Thus, different approaches are needed for identifying these two classes of residues. Surface-critical residues are identified by finding pockets such that the occlusion of such pockets is likely to interfere with large-scale protein motions (see Documentation for details; see also Ming and Wall, 2005; Mitternacht and Berezovsky, 2011). Interior-critical residues are identified by finding information-flow bottlenecks within the protein structure (see Documentation and main paper for details; see also del Sol et al, 2006; Ghosh and Vishveshwara, 2008; Rousseau and Schymkowitz, 2005).

del Sol, A., Fujihashi, H., Amoros, D., and Nussinov, R. (2006). Residues crucial for maintaining short paths in network communication mediate signaling in proteins. Mol. Syst. Biol. 2(1).

Ghosh, A., and Vishveshwara, S. (2008). Variations in Clique and Community Patterns in Protein Structures during Allosteric Communication: Investigation of Dynamically Equilibrated Structures of Methionyl tRNA Synthetase Complexes. Biochemistry. 47, 11398-11407.

Ming, D. and Wall, M.E. (2005). Quantifying allosteric effects in proteins. Proteins: Structure, Function, and Bioinformatics. 59, 697-707.

Mitternacht, S. and Berezovsky, I.N. (2011). Binding leverage as a molecular basis for allosteric regulation. PLoS Comput. Biol. 7, e1002148.

Rousseau, F. and Schymkowitz, J. (2005). A systems biology perspective on protein structural dynamics and signal transduction. Curr. Opin. Struct. Biol. 15, 23–30.

Van der Waals Radii

Q:
I am writing about your article, The Packing Density in Proteins: Standard Radii and Volumes, published in JMB in 1999. In the article, in particular in Table 2, you list a series of radii associated with each atom according to the number of hydrogens it has attached and a number you call the “valence”. However, the valences of carbon are 2 and 4, and the list shows a valence 3 carbon; likewise, the valences of nitrogen are 3 and 5, and the table shows a valence 4 one. Could you please explain what you mean by the term “valence” exactly? In particular, I am interested in knowing the type of heavy atoms you can find in glutamine and alanine residues, and their radii.

A:
Here, the term “valence” is perhaps best described in Table 1 (instead of Table 2). What is meant by the “n-term” (here, used synonymously with valence) is usually a geometric descriptor designating the orientation of other atomic species around that atom (for example, n=4 usually means that the atom builds a tetrahedron, whereas n=3 usually means that the atom is trigonal planar). Strictly speaking, and perhaps more accurately, n just designates the total number of atoms bound to a central atom. So, in your example of carbon’s n=3 in Table 2, these are carbon atoms which are connected to 3 other atoms (an example of C3H0 may be the carbonyl carbon in a protein backbone, and C3H1 may be a carbon atom in a phenyl group of PHE). In your example of nitrogen’s n=4, the N4H3 may represent the epsilon-amino group in LYS, since it is bound to 4 other atoms (one carbon and 3 hydrogen atoms).
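
The naming scheme can be written down as a short Python sketch: n is the number of bonded atoms and the trailing number counts how many of them are hydrogens. The function name and the neighbour lists are illustrative; the three examples are the ones mentioned above.

```python
def atom_type_label(element, bonded_elements):
    """Build a label such as "C3H1" from an atom and its bonded neighbours.

    n is the total number of atoms bound to the central atom; the count
    after "H" is how many of those neighbours are hydrogens.
    """
    n = len(bonded_elements)
    hydrogens = sum(1 for e in bonded_elements if e.upper() == "H")
    return f"{element.upper()}{n}H{hydrogens}"


print(atom_type_label("C", ["C", "O", "N"]))       # backbone carbonyl carbon: C3H0
print(atom_type_label("C", ["C", "C", "H"]))       # PHE ring carbon: C3H1
print(atom_type_label("N", ["C", "H", "H", "H"]))  # LYS amino nitrogen: N4H3
```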