Show that the lower bound is non-decreasing in n and the


(The Odoni bound) Let k∗ be the optimal stationary policy for a Markov decision problem and let g∗ and π ∗ be the corresponding gain and steady-state probability respectively. Let v∗(n, u) be the optimal dynamic expected reward for starting in state i at stage n with final reward vector u.

(a) Show that mini[v∗(n, u) - v∗(n - 1, u)] ≤ g∗ ≤ maxi[v∗(n, u) - v∗(n - 1, u)] ; n ≥ 1. Hint: Consider premultiplying v∗(n, u) - v∗(n - 1, u) by π ∗ or π where k is the optimal dynamic policy at stage n.

(b) Show that the lower bound is non-decreasing in n and the upper bound is non- increasing in n.

Text Book: Stochastic Processes: Theory for Applications By Robert G. Gallager.

Request for Solution File

Ask an Expert for Answer!!
Advanced Statistics: Show that the lower bound is non-decreasing in n and the
Reference No:- TGS01207808

Expected delivery within 24 Hours