In this letter, we study the trade-off between exploration and exploitation for linear quadratic adaptive control. This trade-off can be expressed as a function of the exploration and exploitation costs, called cumulative regret. It has been shown over the years that the optimal asymptotic rate of the cumulative regret is in many instances $\mathcal{O}(\sqrt{T})$. In particular, this rate can be obtained by adding a white noise external excitation, with a variance decaying as $\mathcal{O}(1/\sqrt{T})$. As the amount of excitation is pre-determined, such approaches can be viewed as open loop control of the external excitation. In this contribution, we approach the problem of designing the external excitation from a feedback perspective, leveraging the well-known benefits of feedback control for decreasing sensitivity to external disturbances and system-model mismatch, as compared to open loop strategies. We base the feedback on the Fisher information matrix, which is a measure of the accuracy of the model. Specifically, the amplitude of the exploration signal is seen as the control input, while the minimum eigenvalue of the Fisher matrix is the variable to be controlled. We call such exploration strategies Fisher Feedback Exploration (F2E). We propose one explicit F2E design, called Inverse Fisher Feedback Exploration (IF2E), and argue that this design guarantees the optimal asymptotic rate for the cumulative regret. We provide theoretical support for IF2E, and in a numerical example we illustrate the benefits of IF2E and compare it with the open loop approach as well as a method based on Thompson sampling.
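As a rough illustration of why a decaying excitation is compatible with the $\mathcal{O}(\sqrt{T})$ rate, consider the following back-of-the-envelope bound. It assumes (this is not stated in the abstract) that the excitation variance at time $t$ is $\sigma_t^2 = c/\sqrt{t}$ for some hypothetical constant $c > 0$, and that the exploration cost is proportional to the accumulated excitation variance.

```latex
\sum_{t=1}^{T} \sigma_t^2
  = c \sum_{t=1}^{T} \frac{1}{\sqrt{t}}
  \le c \left( 1 + \int_{1}^{T} \frac{\mathrm{d}t}{\sqrt{t}} \right)
  = c \left( 2\sqrt{T} - 1 \right)
  \le 2c\sqrt{T}.
```

Under these assumptions the total excitation energy grows no faster than $\sqrt{T}$, which matches the regret rate quoted above. The sketch covers only the exploration cost, not the exploitation cost due to model error.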
In this paper, we propose variations of Willems’ fundamental lemma that utilize second-order moments such as correlation functions in the time domain and power spectra in the frequency domain. We believe that using a formulation with estimated correlation coefficients is suitable for data compression, and possibly can reduce noise. Also, the formulations in the frequency domain can enable modeling of a system in a frequency region of interest.
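For background, the standard (deterministic) form of Willems' fundamental lemma can be checked numerically with a minimal sketch like the one below. It does not implement the moment-based variations proposed in the paper. The system matrices, the data length, and the helper names `hankel` and `simulate` are hypothetical choices made only for illustration. For a controllable and observable single-input single-output system of order `n`, driven by a sufficiently exciting input, the stacked input-output Hankel matrix of depth `L` is expected to have rank `L + n`.

```python
import numpy as np

def hankel(signal, depth):
    """Stack all length-`depth` windows of a 1-D signal as columns."""
    cols = len(signal) - depth + 1
    return np.column_stack([signal[k:k + depth] for k in range(cols)])

def simulate(A, B, C, u):
    """Simulate x+ = A x + B u, y = C x from a zero initial state."""
    x = np.zeros(A.shape[0])
    y = np.empty(len(u))
    for t, u_t in enumerate(u):
        y[t] = C @ x
        x = A @ x + B * u_t
    return y

# Hypothetical second-order example system (controllable and observable).
A = np.array([[0.5, 0.2], [0.0, 0.3]])
B = np.array([1.0, 1.0])
C = np.array([1.0, 0.0])

n, L, T = 2, 4, 50                 # system order, window depth, data length
rng = np.random.default_rng(0)
u = rng.standard_normal(T)         # random input, generically exciting
y = simulate(A, B, C, u)

# Stacked input-output Hankel matrix: 2L rows, T - L + 1 columns.
H = np.vstack([hankel(u, L), hankel(y, L)])
rank = np.linalg.matrix_rank(H, tol=1e-8)
print(rank, L + n)
```

Every length-`L` input-output trajectory of the system then lies in the column space of `H`, which is what allows the data matrix to stand in for a parametric model.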
In this paper, we consider data-driven control of Hammerstein systems. For such systems, a common control structure is a transfer function followed by a static output nonlinearity that tries to cancel the input nonlinearity of the system, which is modeled as a polynomial or piecewise linear function. The linear part of the controller is used to achieve the desired disturbance rejection and tracking properties. To design the linear part of the controller, we propose a weighted average risk criterion, with the risk being the average of the squared L2 tracking error. Here the average is with respect to the observations used in the controller, and the weighting is with respect to how important it is to have good control for different impulse responses. This criterion corresponds to the average risk criterion leading to the Bayes estimator, and we therefore call this approach Bayes control. By parametrizing the weighting function and estimating the corresponding hyperparameters, we tune the weighting function to the information regarding the true impulse response contained in the data set available to the user for the control design. The numerical results show that the proposed methods result in stable controllers with performance comparable to the optimal controller, designed using the true input nonlinearity and true plant.
This letter concerns the problem of learning robust LQ-controllers, when the dynamics of the linear system are unknown. First, we propose a robust control synthesis method to minimize the worst-case LQ cost, with probability $1-\delta$, given empirical observations of the system. Next, we propose an approximate dual controller that simultaneously regulates the system and reduces model uncertainty. The objective of the dual controller is to minimize the worst-case cost attained by a new robust controller, synthesized with the reduced model uncertainty. The dual controller is subject to an exploration budget in the sense that it has constraints on its worst-case cost with respect to the current model uncertainty. In our numerical experiments, we observe better performance of the proposed robust LQ regulator over the existing methods. Moreover, the dual control strategy gives promising results in comparison with the common greedy random exploration strategies.
Identification of dynamic networks has been a flourishing area in recent years. However, there are few contributions addressing the problem of simultaneously identifying all modules in a network of given structure. In principle, the prediction error method (PEM) can handle such problems, but this method suffers from well-known issues with local minima and with how to find initial parameter values. Weighted Null-Space Fitting (WNSF) is a multi-step least-squares method, and in this contribution we extend this method to rational linear dynamic networks of arbitrary topology with modules subject to white noise disturbances. We show that WNSF reaches the performance of PEM initialized at the true parameter values for a fairly complex network, suggesting consistency and asymptotic efficiency of the proposed method.
This paper concerns the problem of learning control policies for an unknown linear dynamical system to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task robustly: i.e., we minimize the worst-case cost, accounting for system uncertainty given the observed data. The method balances exploitation and exploration, exciting the system in such a way as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.
Identification of a complete dynamic network affected by sensor noise using the prediction error method is often too complex. One of the reasons for this complexity is the requirement to minimize a non-convex cost function, which becomes more difficult with more complex networks. In this paper, we consider serial cascade networks affected by sensor noise. Recently, the Weighted Null-Space Fitting method has been shown to be appropriate for this setting, providing asymptotically efficient estimates without suffering from non-convexity; however, applicability of the method was subject to some conditions on the locations of sensors and excitation signals. In this paper, we drop such conditions, proposing an extension of the method that is applicable to general serial cascade networks. We formulate an algorithm that describes application of the method in a general setting, and perform a simulation study to illustrate the performance of the method, which suggests that this extension is still asymptotically efficient.
For identification of systems embedded in dynamic networks, applying the prediction error method (PEM) to a correct tailor-made parametrization of the complete network provides asymptotically efficient estimates. However, the network complexity often hinders a successful application of PEM, which requires minimizing a non-convex cost function that in general becomes more difficult for more complex networks. For this reason, identification in dynamic networks often focuses on obtaining consistent estimates of particular network modules of interest. A downside of such approaches is that splitting the network into several modules for identification often costs asymptotic efficiency. In this paper, we consider the particular case of a dynamic network with the individual systems connected in a serial cascaded manner, with measurements affected by sensor noise. We propose an algorithm that estimates all the modules in the network simultaneously without requiring the minimization of a non-convex cost function. This algorithm is an extension of Weighted Null-Space Fitting (WNSF), a weighted least-squares method that provides asymptotically efficient estimates for single-input single-output systems. We illustrate the performance of the algorithm with simulation studies, which suggest that a network WNSF may also provide asymptotically efficient estimates when applied to cascade networks, and discuss the possibility of extension to more general networks affected by sensor noise.
In system identification, many structures and approaches have been proposed to deal with systems with non-linear behavior. When applicable, the prediction error method, analogously to the linear case, requires minimizing a cost function that is non-convex in general. The issue with non-convexity is more problematic for non-linear models, not only due to the increased complexity of the model, but also because methods to provide consistent initialization points may not be available for many model structures. In this paper, we consider a non-linear rational finite impulse response model. We observe how the prediction error method requires minimizing a non-convex cost function, and propose a three-step least-squares algorithm as an alternative procedure. This procedure is an extension of the Model Order Reduction Steiglitz-McBride method, which is asymptotically efficient in open loop for linear models. We perform a simulation study to illustrate the applicability and performance of the method, which suggests that it is asymptotically efficient.