Learning-based model predictive control for Markov decision processes

R.R. Negenborn*, B. De Schutter*, M.A. Wiering+, and J. Hellendoorn*

* Delft Center for Systems and Control, Delft University of Technology
+ Institute of Information and Computing Sciences, Utrecht University

Proceedings of the 16th IFAC World Congress, Prague, Czech Republic, July 2005. Paper 2106 / We-M16-TO/2.

(Download technical report as .pdf)

We propose the use of Model Predictive Control (MPC) for
controlling systems described by Markov decision processes.
First, we consider a straightforward MPC algorithm for Markov
decision processes. Then, we propose the use of value functions
as a means to deal with issues arising in conventional MPC, e.g.,
computational requirements and sub-optimality of actions. We
use reinforcement learning to let an MPC agent learn a value
function incrementally. The agent incorporates experience from
its interaction with the system into its decision making. Our
approach initially relies on pure MPC. Over time, as
experience increases, the learned value function is taken more
and more into account. This speeds up decision making,
allows decisions to be made over an infinite instead of a
finite horizon, and provides adequate control actions, even if
the system and the desired performance vary slowly over time.
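
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical five-state chain MDP, a unit cost per step, a discount factor, a fixed lookahead horizon, and a simple temporal-difference-style update; all names and parameter values are illustrative. The controller performs a finite-horizon lookahead (the MPC part), uses the learned value function as the terminal cost at the end of the horizon, and updates that value function from the states it actually visits.

```python
import random

# Hypothetical toy MDP (an assumption, not taken from the paper):
# a chain of 5 states; action 0 moves left, action 1 moves right.
# A move succeeds with probability 0.9, otherwise the state is unchanged.
N_STATES = 5
ACTIONS = (0, 1)
GOAL = N_STATES - 1
GAMMA = 0.95  # discount factor (assumed)


def transitions(state, action):
    """Return a list of (probability, next_state) pairs."""
    if state == GOAL:
        return [(1.0, state)]  # the goal state is absorbing
    step = 1 if action == 1 else -1
    target = min(max(state + step, 0), N_STATES - 1)
    return [(0.9, target), (0.1, state)]


def cost(state, action):
    """Unit cost per step until the goal is reached."""
    return 0.0 if state == GOAL else 1.0


def lookahead(state, horizon, value):
    """Finite-horizon expected-cost lookahead with value[] as terminal cost.

    Returns (best expected cost, best first action). With horizon 0 the
    terminal cost value[state] is returned and the action is None.
    """
    if horizon == 0:
        return value[state], None
    best_cost, best_action = float("inf"), None
    for a in ACTIONS:
        q = cost(state, a)
        for p, s_next in transitions(state, a):
            q += GAMMA * p * lookahead(s_next, horizon - 1, value)[0]
        if q < best_cost:
            best_cost, best_action = q, a
    return best_cost, best_action


def run(episodes=200, horizon=2, alpha=0.5, seed=0):
    """Control the chain with lookahead and learn the value function."""
    rng = random.Random(seed)
    value = [0.0] * N_STATES  # no experience yet: terminal cost is zero
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on episode length
            if state == GOAL:
                break
            target, action = lookahead(state, horizon, value)
            # Move the stored value of the visited state toward the
            # lookahead estimate (a temporal-difference-style update).
            value[state] += alpha * (target - value[state])
            outcomes = transitions(state, action)
            state = rng.choices(
                [s for _, s in outcomes],
                weights=[p for p, _ in outcomes],
            )[0]
    return value


if __name__ == "__main__":
    learned = run()
    print("learned value function:", [round(v, 3) for v in learned])
    print("one-step greedy actions:",
          [lookahead(s, 1, learned)[1] for s in range(GOAL)])
```

In this sketch the horizon stays fixed and the value function only enters as a terminal cost. A scheme that relies more and more on the learned value function, as the abstract describes, could for instance shorten the horizon as the estimates become reliable; how the paper does this is not stated here.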