Experimental design trade-offs for gene regulatory network inference: an in silico study of the yeast Saccharomyces cerevisiae cell cycle
Abstract
Time series of high-throughput gene sequencing data intended for gene regulatory network (GRN) inference are often short because sampling cell systems is costly. Moreover, experimentalists lack quantitative guidelines that prescribe the minimal number of samples required to infer a reliable GRN model. To help overcome this deficit, we study how the temporal resolution of the data affects the quality of GRN inference. The evolution of a Markovian jump process model for the Ras/cAMP/PKA pathway of proteins and metabolites in the G1 phase of the Saccharomyces cerevisiae cell cycle is sampled at a number of different rates. For each time series we infer a linear regression model of the GRN using the LASSO method. The inferred network topology is evaluated in terms of the area under the precision-recall curve (AUPR). By plotting the AUPR against the number of samples, we show that the trade-off has a roughly sigmoid shape. An optimal number of samples corresponds to values on the ridge of the sigmoid.
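The pipeline described in the abstract (simulate, sample, fit a LASSO regression, score the inferred topology by AUPR) can be illustrated with a minimal sketch. Everything specific here is an assumption and not the authors' setup: the dynamics are a toy linear autoregressive system rather than the Markovian jump process of the Ras/cAMP/PKA pathway, the number of samples is varied by truncating the series rather than by changing the sampling rate, edges are ranked by absolute LASSO coefficient at a single fixed penalty, and the helper names (`random_network`, `simulate`, `lasso_cd`, `aupr`, `aupr_vs_samples`) are hypothetical.

```python
import numpy as np

def random_network(p, density, rng):
    """Sparse random interaction matrix, rescaled so the toy dynamics are stable."""
    mask = rng.random((p, p)) < density
    signs = rng.choice([-1.0, 1.0], size=(p, p))
    A = mask * rng.uniform(0.5, 1.0, size=(p, p)) * signs
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    scale = 0.8 / radius if radius > 1e-9 else 1.0
    return scale * A

def simulate(A, n_steps, noise, rng):
    """Toy stand-in for the cell model: x[t] = A x[t-1] + Gaussian noise."""
    p = A.shape[0]
    X = np.zeros((n_steps, p))
    X[0] = rng.normal(size=p)
    for t in range(1, n_steps):
        X[t] = A @ X[t - 1] + noise * rng.normal(size=p)
    return X

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise (1/2n)||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()            # equals y - Xw for w = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * w[j]           # remove feature j's contribution
            rho = X[:, j] @ resid / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            resid -= X[:, j] * w[j]           # add the updated contribution back
    return w

def aupr(scores, labels):
    """Area under the precision-recall curve, computed as average precision."""
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    precision = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision * hits).sum() / hits.sum())

def aupr_vs_samples(sample_counts, p=10, alpha=0.05, seed=0):
    """AUPR of the LASSO-inferred topology for each number of samples."""
    rng = np.random.default_rng(seed)
    A = random_network(p, 0.2, rng)
    X = simulate(A, max(sample_counts) + 1, 0.5, rng)
    labels = (A != 0).astype(float).ravel()   # true edges, row = target gene
    results = []
    for n in sample_counts:
        past, future = X[:n], X[1:n + 1]
        # One regression per target gene: row i holds the inferred regulators of gene i.
        W = np.vstack([lasso_cd(past, future[:, i], alpha) for i in range(p)])
        results.append(aupr(np.abs(W).ravel(), labels))
    return results

if __name__ == "__main__":
    for n, score in zip([5, 10, 20, 40, 80], aupr_vs_samples([5, 10, 20, 40, 80])):
        print(n, round(score, 3))
```

Plotting the returned scores against the sample counts gives a toy version of the trade-off curve; whether it reproduces the sigmoid shape reported for the yeast model depends on the assumed dynamics, noise level, and penalty, none of which are taken from the paper.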
- Publication: arXiv e-prints
- Pub Date: December 2017
- DOI: 10.48550/arXiv.1712.05453
- arXiv: arXiv:1712.05453
- Bibcode: 2017arXiv171205453M
- Keywords: Quantitative Biology - Quantitative Methods; Mathematics - Optimization and Control