Kévin Colin, Mina Ferizbegovic, H. Hjalmarsson
3 2022.

Regret Minimization for Linear Quadratic Adaptive Controllers Using Fisher Feedback Exploration

In this letter, we study the trade-off between exploration and exploitation for linear quadratic adaptive control. This trade-off can be expressed as a function of the exploration and exploitation costs, called the cumulative regret. It has been shown over the years that the optimal asymptotic rate of the cumulative regret is in many instances $\mathcal{O}(\sqrt{T})$. In particular, this rate can be obtained by adding a white-noise external excitation with a variance decaying as $\mathcal{O}(1/\sqrt{T})$. As the amount of excitation is pre-determined, such approaches can be viewed as open-loop control of the external excitation. In this contribution, we approach the problem of designing the external excitation from a feedback perspective, leveraging the well-known benefits of feedback control for decreasing sensitivity to external disturbances and system-model mismatch, as compared to open-loop strategies. We base the feedback on the Fisher information matrix, which is a measure of the accuracy of the model. Specifically, the amplitude of the exploration signal is seen as the control input, while the minimum eigenvalue of the Fisher matrix is the variable to be controlled. We call such exploration strategies Fisher Feedback Exploration (F2E). We propose one explicit F2E design, called Inverse Fisher Feedback Exploration (IF2E), and argue that this design guarantees the optimal asymptotic rate for the cumulative regret. We provide theoretical support for IF2E, and in a numerical example we illustrate the benefits of IF2E and compare it with the open-loop approach as well as a method based on Thompson sampling.
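The contrast between a pre-determined excitation schedule and one driven by the Fisher matrix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design from the letter: the scalar system `x[t+1] = a*x[t] + b*u[t] + w[t]`, the fixed stabilizing gain `K` (no adaptation), the unit noise variance, the identity initialization of the Fisher matrix, and the rule `var = k / lambda_min` are all hypothetical choices made here. The inverse-type rule is only one plausible way to close the loop on the minimum eigenvalue.

```python
import numpy as np


def min_eig(fisher):
    """Smallest eigenvalue of a symmetric matrix (eigvalsh sorts ascending)."""
    return float(np.linalg.eigvalsh(fisher)[0])


def simulate(T, rule, a=0.9, b=1.0, K=0.5, k=1.0, seed=0):
    """Run T steps and return (exploration variances, final min eigenvalue).

    rule = "open_loop": variance follows the fixed schedule k / sqrt(t).
    rule = "feedback":  variance is k / lambda_min(Fisher), a hypothetical
                        inverse-type feedback rule assumed for illustration.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    fisher = np.eye(2)  # assumed initialization so lambda_min starts positive
    variances = np.empty(T)
    for t in range(1, T + 1):
        if rule == "open_loop":
            var = k / np.sqrt(t)
        else:
            var = k / min_eig(fisher)
        e = np.sqrt(var) * rng.standard_normal()  # white-noise excitation
        u = -K * x + e                            # fixed feedback plus exploration
        phi = np.array([x, u])                    # regressor for (a, b)
        fisher += np.outer(phi, phi)              # information with unit noise variance
        x = a * x + b * u + rng.standard_normal()
        variances[t - 1] = var
    return variances, min_eig(fisher)


if __name__ == "__main__":
    for rule in ("open_loop", "feedback"):
        variances, lam = simulate(2000, rule)
        print(rule, "final variance:", variances[-1], "lambda_min:", lam)
```

In the feedback variant the excitation shrinks only as information about the model actually accumulates, whereas the open-loop schedule decays regardless of what the data have revealed.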

