### Current browse context:

cs.LG

### Change to browse by:

### References & Citations

# Computer Science > Machine Learning

# Title: Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

(Submitted on 4 May 2021)

Abstract: We propose two algorithms for episodic stochastic shortest path problems with linear function approximation. The first is computationally expensive but provably obtains $\tilde{O} (\sqrt{B_\star^3 d^3 K/c_{min}} )$ regret, where $B_\star$ is a (known) upper bound on the optimal cost-to-go function, $d$ is the feature dimension, $K$ is the number of episodes, and $c_{min}$ is the minimal cost of non-goal state-action pairs (assumed to be positive). The second is computationally efficient in practice, and we conjecture that it obtains the same regret bound. Both algorithms are based on an optimistic least-squares version of value iteration analogous to the finite-horizon backward induction approach from Jin et al. 2020. To the best of our knowledge, these are the first regret bounds for stochastic shortest path that are independent of the size of the state and action spaces.

Link back to: arXiv, form interface, contact.