Efficient Inference of Gaussian Process Modulated Renewal Processes with Application to Medical Event Data
The episodic, irregular and asynchronous nature of medical data render them difficult substrates for standard machine learning algorithms. We would like to abstract away this difficulty for the class of time-stamped categorical variables (or events) by modeling them as a renewal process and inferring a probability density over continuous, longitudinal, nonparametric intensity functions modulating that process. Several methods exist for inferring such a density over intensity functions, but either their constraints and assumptions prevent their use with our potentially bursty event streams, or their time complexity renders their use intractable on our long-duration observations of high-resolution events, or both. In this paper we present a new and efficient method for inferring a distribution over intensity functions that uses direct numeric integration and smooth interpolation over Gaussian processes. We demonstrate that our direct method is up to twice as accurate and two orders of magnitude more efficient than the best existing method (thinning). Importantly, the direct method can infer intensity functions over the full range of bursty to memoryless to regular events, which thinning and many other methods cannot. Finally, we apply the method to clinical event data and demonstrate the face-validity of the abstraction, which is now amenable to standard learning algorithms.