Q:
I am trying to construct a "gene co-phenotype" background network using the Bayesian approach described in your paper "A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data" (Science, 17 October 2003).
After reading the supplementary methods for this paper, I have a question about how to set the size of the training data set.
In this paper, 8250 positive / 2691903 negative training gene pairs are used. When using a naive Bayesian method, it is recommended that the balance of positives and negatives in the training data set reflect the true situation. Could you give me some instructions on how you set the sizes of the positive and negative training data sets? I would be very glad to hear from you.
A:
The best I can do here is point you to:
http://papers.gersteinlab.org/papers/funcpred-goldstd