Limit Synchronization in Markov Decision Processes

Doyen, Laurent; Massart, Thierry; Shirmohammadi, Mahsa

(Submitted on 10 Oct 2013 (v1), last revised 31 Oct 2013 (this version, v2))

Abstract: Markov decision processes (MDPs) are finite-state systems with both strategic and probabilistic choices. After fixing a strategy, an MDP produces a sequence of probability distributions over states. The sequence is eventually synchronizing if the probability mass accumulates in a single state, possibly in the limit. Precisely, for 0 <= p <= 1, the sequence is p-synchronizing if a probability distribution in the sequence assigns probability at least p to some state, and we distinguish three synchronization modes: (i) sure winning if there exists a strategy that produces a 1-synchronizing sequence; (ii) almost-sure winning if there exists a strategy that produces a sequence that is, for all epsilon > 0, a (1-epsilon)-synchronizing sequence; (iii) limit-sure winning if for all epsilon > 0, there exists a strategy that produces a (1-epsilon)-synchronizing sequence.
We consider the problem of deciding whether an MDP is sure, almost-sure, or limit-sure winning, and we establish the decidability and optimal complexity for all modes, as well as the memory requirements for winning strategies. Our main contributions are as follows: (a) for each winning mode we present characterizations that give a PSPACE complexity for the decision problems, and we establish matching PSPACE lower bounds; (b) we show that for sure winning strategies, exponential memory is sufficient and may be necessary, and that in general infinite memory is necessary for almost-sure winning, and unbounded memory is necessary for limit-sure winning; (c) along with our results, we establish new complexity results for alternating finite automata over a one-letter alphabet.

Subjects:	Computer Science and Game Theory (cs.GT)
Cite as:	arXiv:1310.2935 [cs.GT]
	(or arXiv:1310.2935v2 [cs.GT] for this version)

Submission history

From: Mahsa Shirmohammadi
[v1] Thu, 10 Oct 2013 16:30:45 GMT (70kb)
[v2] Thu, 31 Oct 2013 15:05:13 GMT (70kb)
