Request for a SI document

The file in the url:
(SI of the paper "Architecture of the human regulatory network derived from
ENCODE data") is damaged and cannot be read.

Can you please send me a copy?

Thank you for your prompt reply.

The nature11245__ALL.pdf file you generated works fine (using Adobe XI on Windows 7), but as I mentioned, the file downloaded from Nature’s website is damaged (on both Windows and Linux machines).

FYI, here is the Linux stderr (using Okular):

Error: PDF file is damaged – attempting to reconstruct xref table…

Error: Couldn’t find trailer dictionary

Error: Couldn’t read xref table

Connecting to deprecated signal QDBusConnectionInterface::serviceOwnerChanged(QString,QString,QString)
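
The errors above (no trailer dictionary, unreadable xref table) are consistent with a truncated or corrupted download. A minimal sketch of a quick structural check on a downloaded copy is given below; the function names are hypothetical, and passing the check does not guarantee that a viewer can open the file.

```python
from pathlib import Path


def quick_pdf_check(data: bytes) -> dict:
    """Report whether the basic PDF structural markers are present."""
    tail = data[-2048:]  # startxref and %%EOF normally sit near the end of the file
    return {
        "has_header": b"%PDF-" in data[:1024],
        "has_startxref": b"startxref" in tail,
        "has_eof_marker": b"%%EOF" in tail,
    }


def check_file(path: str) -> dict:
    """Run the same check on a file on disk."""
    return quick_pdf_check(Path(path).read_bytes())
```

A file that reports a header but no startxref or %%EOF marker was most likely cut short during upload or download.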

Thank you for letting us know. This is actually a more serious problem than what I had been expecting. We may need to contact them about this very soon, as other users will experience the same problem. Thanks again.

It might be a problem with fonts that Mac OS has but Windows/Linux don’t. You might want to try producing the PDF on Windows and then opening it on Mac OS and Linux; if that works, just substitute the file on the Nature site (which is not a trivial task, I guess).

no output files from the multichain server

One of my friends recommended MolMovDB to me for calculating morph conformations. As a test, I successfully generated a morph from the single chain server.

Then I tried to use the "multichain server" to generate the morphing conformations of my target proteins last Friday. Everything went well. However, as of today I have not received any emails about the progress of the job, so I am wondering whether I should wait a few more days.

Just in case, I submitted the same job again today. The job No. is b337528-18153. Could you do me a favor and check the progress?

Thank you very much for your query, and for your interest in our server. We have been having problems with our server not sending emails, and we have a notice on our website notifying users about these email issues.

However, the good news is that you should still be able to access morphs. The best approach would be to just append your job ID to the standard URL. Thus, in your case, this would be:

If you are still unable to see your morphs, you may email me the structures, and I will try to generate them for you through our server.

Query about QTL calling from Wang et al PsychENCODE paper


I have a quick query about the Wang et al paper from the PsychENCODE study.
Were the QTLs identified from all the samples or the control samples only?
I’ve checked the paper, online resources and the supplementary methods but can’t seem to work this out.

The QTLs were identified from both control and disease samples. You can find the sample information in Table S11. Summary of dataset.

accessing Database of Macromolecular Movements


I am developing software for assessing similarity among flexible proteins. I
would like to test it on the Database of Macromolecular Movements;
however, I found no means to download multiple files. I
was wondering whether it is possible to get a data set with files containing
protein motions without separately accessing each and every entry in your

What I would like to have, if it is possible of course, is the curated
files of the conformational changes and the corresponding PDB IDs. Let
me know whether it is possible or not.


This may be doable, but it depends on what exactly you need. Do you want the frame-by-frame morph files, the video files, or the PDB IDs, or some other form of the data? Also, we actually have two databases: one is a manually curated set of about 200 conformational changes, and the other database is user-submitted. If you tell me more about the kinds of things you may need, I can likely send you the compressed files.

At the first URL below, you will find the curated set of motions. The second URL is an outbox that’s prepared for you, which contains these morphs (frame-by-frame) from the curated motions:

Please let us know if you have trouble getting access for any reason.

Temp issues w/Packing-Eff


I read your paper on Packing-Eff. I found it very informative and resourceful. I would like to try it on some protein models that I built using comparative modelling (MODELLER).

However, when I try to access the website, Packing-Eff Online, it appears to be down. Can you help me with this problem?

I am studying the packing of residues in proteins and tried the online version
of "Packing-Eff". Unfortunately, I could not find any relation between the output
amino acid numbering or total number of residues and the input PDB file (I
used PDB: 451C). Is this normal?

I would be grateful if you could help me solve this problem.

A1 & A2:
Sorry about this. We had briefly experienced a systems failure, but have since recovered. Try again now.

Alternative to StoneHinge?

I am a research student working on protein structures using computational methods. I have used the tool StoneHinge to determine the hinge region residue and %protein rigidity for a protein. To confirm and report the significance of the putative hinge residue, I introduced single-residue mutations and noted the changes in the %packing rigidity. I was unable to find any significant changes. Is there any other tool I could use to confirm the importance of the putative hinge residue and the effects of mutation on the hinge movement/rotation?

Did you see our "related resources" page?

Some of the items there may be of some help.

SIN database, request detailed format


I am interested in the evolution of protein-protein interaction networks, and
recently became an enthusiastic user of your Structural Interaction Network
(SIN) database.

While downloading the data from the SIN website, I noticed that more detailed
formats are available upon request for SIN versions 0.9, 1.0 and 2.0.
In particular, I am interested in which Pfam domains are involved in each
interaction, and which yeast crystal structures (hopefully with PDB
identifiers) the interactions are based on.

Would it be possible to obtain this information? I would really appreciate
that. I hope to be able to use it to survey physical properties of the
interactions throughout the network, and connect it to the evolutionary
simulations I’m working on at the lab.

I have a few questions about DynaSIN. Sorry for this long email; I tried to be as clear as possible. It would be really great if you could help me answer these questions!

Questions (1) and (2) concern the ‘Interaction Data’ section, file ‘interface_final2.txt’:

(1) What is the significance of the order in which protein A and protein B (second and third columns, respectively) are presented? In other words, if proteins A and B are swapped, should the other entries (PDB IDs and surface residues) be calculated in a different way? I thought that swapping protein A and B should give the same result, but I noticed that for interactions 566 and 508, swapping protein A and B results in different PDB IDs, and in different surface residues for the PDB IDs they have in common:

566 HFE_HUMAN TFR1_HUMAN Permanent 1A6Z_A;1A6Z_B;26,30,49,97,122,202,204,236,243,;54,55,53,31,60,99,11,10, 1A6Z_A;1A6Z_D;; 1A6Z_C;1A6Z_B;; 1A6Z_C;1A6Z_D;26,30,49,97,122,204,236,243,;54,55,53,31,60,11,99,10, 1DE4_A;1DE4_B;30,49,121,122,204,233,236,243,;55,53,1,60,99,11,8,10, 1DE4_A;1DE4_E;; 1DE4_A;1DE4_H;; 1DE4_D;1DE4_B;; 1DE4_D;1DE4_E;30,49,97,120,122,202,204,206,207,233,236,239,243,;55,53,60,3,98,99,11,12,13,8,10, 1DE4_D;1DE4_H;; 1DE4_G;1DE4_B;; 1DE4_G;1DE4_E;; 1DE4_G;1DE4_H;30,49,97,120,121,122,202,204,233,236,;55,53,62,31,1,60,98,99,11,8,10,

508 TFR1_HUMAN HFE_HUMAN Permanent 1DE4_C;1DE4_A;629,640,;85,146, 1DE4_C;1DE4_D;; 1DE4_C;1DE4_G;; 1DE4_F;1DE4_A;; 1DE4_F;1DE4_D;629,658,;146,64, 1DE4_F;1DE4_G;; 1DE4_I;1DE4_A;; 1DE4_I;1DE4_D;; 1DE4_I;1DE4_G;629,640,;85,146,

(2) Do the surface residue numbers (column 5 and subsequent columns) correspond to their position in the full protein sequence as defined in UniProt, or to the residue ID in the PDB file? I assume the latter (but still wanted to make sure) because sometimes the surface residue numbers exceed the protein length. For example, in interaction 554, first PDB description:

554 CDC42_HUMAN RHG01_HUMAN Transient 1AM4_D;1AM4_A;532,561,563,564,;189,191,198,126,197,220, …

For the PDB ID 1AM4, chain D (protein CDC42) is 191 amino acids long, yet the surface residues are 532, 561, 563 and 564.

And (3), a more general question regarding the definition of ‘transient’ and ‘permanent’ interactions. In the Bhardwaj et al (2011) paper it was mentioned that:

"It should be noted here that the term ‘‘permanent’’ does not indicate that the relevant protein interacts with its partner in a strictly permanent fashion (i.e., it does not remain bound to the partner for the duration of its life time). This term (along with ‘‘transient’’ interaction) is based on the convention previously adopted by Kim et al".

I searched the Kim et al (Science 2006) paper for a definition, but I couldn’t find it in the main text or supporting information. Could you please let me know what the definition is, or point out where it is given? That would be very helpful.

you might want to look at

Unfortunately, the E. coli set does not include the same level of detail which
we provide for the human set on our website. Indeed, the E. coli set, though
part of our study, was not the main focus of the study that motivated the
creation of DynaSIN [ref provided below].

Having said that, however, it should be possible to parse through our E. coli
set and to download the appropriate data from biomart by searching for gene-PDB
mappings. Again, thank you for your interest in this work.

Bhardwaj et al (2011) Integration of protein motions with molecular networks
reveals different mechanisms for permanent and transient interactions. Protein
Science 20:1745-1754.

1) This is indeed a strange observation in the file. It should not be the
case unless there’s an implicit convention of which I’m unaware. The analysis
and file compilation were performed by a previous member of our group. Since I
cannot explain what you’ve observed for interactions 508 and 566, I’ll have to
defer your question to the post-doc who managed these files. I will cc you on
the email I am sending to him now.

2) You are correct: the surface residues are numbered according to their
numbering in the actual PDB files, and not according to their respective
UniProt residue indices.

3) You’re correct that, in the Kim et al 2006 paper, the terms "transient" and
"permanent" are never given explicit definitions. Rather, the definitions are
implied in that paper. These definitions and the reasoning behind them are as
follows:

A "transient" interaction is one in which multiple distinct pairs of proteins
interact by using a shared interface on either protein. For instance, let’s say
that interface "a" on protein "A" interacts with interface "b" on protein "B".
Let’s also say that it’s possible for interface "a" on protein "A" to interact
with a completely different protein (say, protein "C"). Since both "C" and "B"
need to use surface "a" on "A", it is not possible for both C and B to interact
with A at the same time. That is to say, such interactions are mutually
exclusive. Assuming that both interactions are, at some point in time,
essential for biological processes, these interactions must be transient in
nature, thereby enabling B and C to interact with A at different times.

A "permanent" interaction, on the other hand, is one in which there are no
other competing pairs. The analogy here would be if "a" on "A" is inferred to
interact ONLY with "b" on "B". In theory, the interaction between "A" and "B"
may be permanent, since no other proteins need to interact with "a" on "A".
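
To make the reasoning concrete, here is a minimal sketch of that labelling rule. It only illustrates the implied definitions and is not the procedure used by Kim et al or in DynaSIN; the input format (pairs of protein and interface labels) and the function name are hypothetical.

```python
from collections import defaultdict


def classify_interactions(interactions):
    """Label each interaction 'transient' or 'permanent'.

    interactions: list of ((protein_a, interface_a), (protein_b, interface_b)).
    An interaction is called transient if the interface used on either
    protein is also used to bind some other partner.
    """
    partners = defaultdict(set)
    for (prot_a, face_a), (prot_b, face_b) in interactions:
        partners[(prot_a, face_a)].add(prot_b)
        partners[(prot_b, face_b)].add(prot_a)

    labels = {}
    for side_a, side_b in interactions:
        shared = len(partners[side_a]) > 1 or len(partners[side_b]) > 1
        labels[(side_a[0], side_b[0])] = "transient" if shared else "permanent"
    return labels


# The example from the text: B and C compete for interface "a" on A,
# while D and E have no competitors (D and E are added for illustration).
example = [
    (("A", "a"), ("B", "b")),
    (("A", "a"), ("C", "c")),
    (("D", "d"), ("E", "e")),
]
labels = classify_interactions(example)
```

Here A-B and A-C come out as transient because they share interface "a", and D-E comes out as permanent.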

We’ll wait to hear back from one of the other authors of the DynaSIN paper,
but if anything I said above is unclear, or if you have any other queries,
please don’t hesitate to let us know.

Thanks for bringing it up; it’s been a while since I had a look at the code behind DynaSIN (I have moved from the Gerstein Lab). Anyway, ideally, the order of the proteins should not make a difference; swapping protein A and B should not change the contact residues. How many such cases do you see where the order of the proteins made a difference?
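
One way to count such cases is sketched below. It assumes the row layout seen in the two rows quoted earlier: whitespace-separated columns (ID, protein A, protein B, type), followed by entries of the form chainA;chainB;residuesA;residuesB. The function names are hypothetical, and the two sample rows are abridged to two entries each.

```python
def parse_row(line):
    """Parse one row of interface_final2.txt (layout assumed from the quoted rows)."""
    fields = line.split()
    row_id, prot_a, prot_b, kind = fields[:4]
    contacts = {}
    for entry in fields[4:]:
        chain_a, chain_b, res_a, res_b = entry.split(";")
        contacts[(chain_a, chain_b)] = (
            [int(r) for r in res_a.split(",") if r],  # drop the empty item left by the trailing comma
            [int(r) for r in res_b.split(",") if r],
        )
    return {"id": row_id, "a": prot_a, "b": prot_b, "type": kind, "contacts": contacts}


def pdb_ids(row):
    """PDB IDs referenced by a row, e.g. '1A6Z' from the chain label '1A6Z_A'."""
    return {chain.split("_")[0] for pair in row["contacts"] for chain in pair}


def order_dependent_pairs(rows):
    """IDs of rows that list the same two proteins but reference different PDB IDs."""
    by_pair = {}
    for row in rows:
        by_pair.setdefault(frozenset((row["a"], row["b"])), []).append(row)
    return [
        tuple(r["id"] for r in group)
        for group in by_pair.values()
        if len(group) > 1 and len({frozenset(pdb_ids(r)) for r in group}) > 1
    ]


ROW_566 = ("566 HFE_HUMAN TFR1_HUMAN Permanent "
           "1A6Z_A;1A6Z_B;26,30,49,97,122,202,204,236,243,;54,55,53,31,60,99,11,10, "
           "1DE4_A;1DE4_E;;")
ROW_508 = ("508 TFR1_HUMAN HFE_HUMAN Permanent "
           "1DE4_C;1DE4_A;629,640,;85,146, 1DE4_C;1DE4_D;;")

rows = [parse_row(ROW_566), parse_row(ROW_508)]
mismatches = order_dependent_pairs(rows)
```

Running order_dependent_pairs over every parsed row of the file would give the number of protein pairs for which the order matters.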

The good thing is that these contact residues were not used for deriving the main results of the paper; they were only provided as an additional piece of data. Plus, if you think that the list of contact residues has some issues, it’s very easy to extract the interface residues yourself. That also gives you the freedom to change the distance cutoff.
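
A minimal sketch of such an extraction is shown here, assuming a standard fixed-column PDB file and a plain atom-to-atom distance criterion. The 4.5 Å default cutoff and the function names are assumptions for illustration, not the values or code used for DynaSIN.

```python
import math


def read_atoms(pdb_text):
    """Collect (chain, residue number, xyz) from ATOM records (HETATM is ignored)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            chain = line[21]                # column 22: chain identifier
            res_num = int(line[22:26])      # columns 23-26: residue sequence number
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((chain, res_num, xyz))
    return atoms


def interface_residues(atoms, chain_a, chain_b, cutoff=4.5):
    """Residues of each chain with at least one atom within `cutoff` Å of the other chain."""
    side_a = [a for a in atoms if a[0] == chain_a]
    side_b = [a for a in atoms if a[0] == chain_b]
    res_a, res_b = set(), set()
    for _, num_a, xyz_a in side_a:
        for _, num_b, xyz_b in side_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                res_a.add(num_a)
                res_b.add(num_b)
    return sorted(res_a), sorted(res_b)
```

The residue numbers returned follow the PDB file numbering, the same convention as the contact lists in the file. The all-pairs loop is quadratic in the number of atoms, which is acceptable for a single chain pair but slow for bulk processing.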

List of all PDB structures + chain IDs for motions database


I would like to have the list of PDB structures (with chain IDs) for the structure pairs used in the motions database. I saw that the zipped list file of IDs on the website does not include the chain IDs. I was wondering whether you already have this information compiled.

I was a bit confused by the format of the file available for download on the MolMovDB website (List.txt.gz). Do you have a compiled list of just the manually created motion pairs for which PDB structures are available? I am also specifically looking for the corresponding chain ID for each of the PDB structures. Any help would be appreciated!

Fairly recently, we virtualized MolMovDB, and this process
may have made it difficult to obtain some of the data files for which you’re
searching. This may also have played a role in the issue you bring up
about the

Cavity.exe within 3V


I am using your program 3V to compute the volume of the active site
of a protein, which is clearly external. I am using the Cavity.exe program of
3V. However, I do not know how to interpret the results of the run. It
would be nice if you could help me in this regard.

If the site is external then you are probably looking for a channel
rather than a cavity. Cavities are completely enclosed by the structure.
My suggestion would be to use the AllChannel.exe program and play with
the probe sizes.

The output can be in several forms, but I typically use the MRC output
(-m volume.mrc) and view the results in UCSF Chimera, since it is
a free program.

Thanks for your reply. I followed your suggestion and was able to visualize the
channel. But I am wondering whether I could also measure the channel. Is that
possible with your program?

More specifically, given a structure, can I compute the volume of any
active site using your program? Do you have any other program or webserver
to do that?

I know the latest subversion source code outputs both the volume and
surface area of the channels. The website shows it as well.

It is up to you to get the parameters right, so that you are only
looking at the channel and not any extra pockets.

3V online – problem with results download


I like your paper about the 3V tool and am trying to apply it to my proteins. However, there seems to be a problem with the online version: I cannot get to the results through the link provided, e.g.

I tried both modules, for cavities and for channels, with similar failures.

Could you solve it for me please?

I’ve set up a newer server that provides more feedback; I keep the old one around since people still prefer it.

All I can tell from looking at the server is that the file you are looking to download does not exist. So, maybe the program crashed. Try the new site and it might be more informative.

I would try Channel Finder instead of Channel Extraction. Channel Extraction is very particular about finding the exact coordinate of the channel; in fact, I may change the algorithm to accommodate this.