Adaptive Robot-Assisted Feeding: An Online Learning Framework for Acquiring Previously Unseen Food Items
A successful robot-assisted feeding system requires bite acquisition of a wide variety of food items. It must adapt to changing user food preferences under uncertain visual and physical environments. Different food items in different environmental conditions require different manipulation strategies for successful bite acquisition. Therefore, a key challenge is how to handle previously unseen food items with very different success rate distributions over strategy. Combining low-level controllers and planners into discrete action trajectories, we show that the problem can be represented using a linear contextual bandit setting. We construct a simulated environment using a doubly robust loss estimate from previously seen food items, which we use to tune the parameters of off-the-shelf contextual bandit algorithms. Finally, we demonstrate empirically on a robot-assisted feeding system that, even starting with a model trained on thousands of skewering attempts on dissimilar previously seen food items, $\epsilon$-greedy and LinUCB algorithms can quickly converge to the most successful manipulation strategy.