Optimal Attitude Control of Three-Axis Satellite Based on Approximate Dynamic Programming

Journal of Aerospace Science and Technology
Vol.05 No.01(2017), Article ID:20096,10 pages
DOI: 10.12677/JAST.2017.51004


Mingze Wang1, Xinsheng Ge2

1College of Automation Engineering, Beijing Information Science and Technology University, BISTU, Beijing

2College of Applied Science, Beijing Information Science and Technology University, BISTU, Beijing

Received: Mar. 10th, 2017; accepted: Mar. 28th, 2017; published: Mar. 31st, 2017

ABSTRACT

This paper discusses optimal attitude trajectory planning for a three-axis satellite using the approximate dynamic programming (ADP) method. First, the dynamic and kinematic equations of the three-axis satellite are introduced, and, for given initial and final attitudes, the performance index to be minimized is chosen as the energy of a rest-to-rest maneuver. Based on the adaptive dynamic programming structure, a critic network and an action network approximate the performance index function and the control variables, respectively, and the Runge-Kutta method is used to integrate the state variables. In addition, a concrete expression of the utility function suited to this kind of problem is given. The simulation results show that the proposed algorithm satisfies the constraints well and, owing to its small computational load and low computational complexity, can be used on-line.

Keywords: Attitude Control, Approximate Dynamic Programming, Three-Axis Satellite, Optimal Control, Neural Network


Copyright © 2017 by authors and Hans Publishers Inc.

1. Introduction

2. Optimal Control Problem of the Three-Axis Satellite Attitude Maneuver

2.1. Satellite Dynamic and Kinematic Equations

(1)

(2)
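As a hedged illustration of the kind of model this subsection refers to, the sketch below integrates a rigid-body attitude model with the classical fourth-order Runge-Kutta method, which the abstract names for propagating the state. The principal-axis inertia, the 3-2-1 Euler-angle kinematics, the neglect of orbital rate, and the names `attitude_derivative` and `rk4_step` are all assumptions made for illustration and may differ from Equations (1) and (2).

```python
import math


def attitude_derivative(state, torque, inertia):
    """Time derivative of [phi, theta, psi, wx, wy, wz] for a rigid body.

    Assumptions (not taken from the paper): principal-axis inertia
    (Ix, Iy, Iz) and 3-2-1 (yaw-pitch-roll) Euler-angle kinematics.
    """
    phi, theta, _psi, wx, wy, wz = state
    ix, iy, iz = inertia
    tx, ty, tz = torque
    # Kinematics: Euler-angle rates from body rates (singular at theta = +-90 deg).
    phi_dot = wx + math.tan(theta) * (wy * math.sin(phi) + wz * math.cos(phi))
    theta_dot = wy * math.cos(phi) - wz * math.sin(phi)
    psi_dot = (wy * math.sin(phi) + wz * math.cos(phi)) / math.cos(theta)
    # Dynamics: Euler's rotational equations, I * w_dot + w x (I * w) = torque.
    wx_dot = (tx - (iz - iy) * wy * wz) / ix
    wy_dot = (ty - (ix - iz) * wz * wx) / iy
    wz_dot = (tz - (iy - ix) * wx * wy) / iz
    return [phi_dot, theta_dot, psi_dot, wx_dot, wy_dot, wz_dot]


def rk4_step(state, torque, inertia, dt):
    """One classical Runge-Kutta step with the torque held constant over dt."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]

    k1 = attitude_derivative(state, torque, inertia)
    k2 = attitude_derivative(shifted(state, k1, dt / 2), torque, inertia)
    k3 = attitude_derivative(shifted(state, k2, dt / 2), torque, inertia)
    k4 = attitude_derivative(shifted(state, k3, dt), torque, inertia)
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


# Usage with made-up numbers: constant roll torque applied from rest for 1 s.
state = [0.0] * 6
for _ in range(10):
    state = rk4_step(state, (1.0, 0.0, 0.0), (10.0, 12.0, 8.0), 0.1)
print(state)
```

A quaternion parametrization would avoid the Euler-angle singularity; Euler angles are used here only because the figures report attitude angles.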

2.2. General Form of the Optimal Control Problem

(3)

(4)

(5)
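For orientation, a textbook statement of a general optimal control problem (Bolza form) is sketched below. The symbols $f$, $\psi$, $\Phi$ and $L$ are assumptions chosen for illustration and are not necessarily those of Equations (3) to (5).

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), u(t), t\bigr), \qquad x(t_0) = x_0, \\
\psi\bigl(x(t_f), t_f\bigr) &= 0, \\
J &= \Phi\bigl(x(t_f), t_f\bigr) + \int_{t_0}^{t_f} L\bigl(x(t), u(t), t\bigr)\,\mathrm{d}t .
\end{aligned}
```

Here $x$ is the state, $u$ the control, $\psi$ the terminal constraint, and $J$ the performance index to be minimized over admissible controls.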

2.3. Mathematical Description of the Energy-Optimal Attitude Maneuver Control Problem

(6)

(7)

(8)
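A minimal sketch of how an energy-type index and the rest-to-rest end condition might be evaluated numerically is given below. It assumes the common form J = 1/2 ∫ uᵀu dt for the maneuver energy; this form, the state layout, and the helper names `maneuver_energy` and `boundary_error` are illustrative assumptions, not the paper's Equations (6) to (8).

```python
def maneuver_energy(torques, dt):
    """Trapezoidal estimate of J = 0.5 * integral of u(t)^T u(t) dt.

    torques: list of [ux, uy, uz] samples taken every dt seconds.
    """
    power = [sum(c * c for c in u) for u in torques]
    integral = sum(0.5 * (a + b) * dt for a, b in zip(power, power[1:]))
    return 0.5 * integral


def boundary_error(state, target_angles):
    """Largest violation of the rest-to-rest end condition.

    state: [phi, theta, psi, wx, wy, wz]; at the final time the angles
    should equal target_angles and the angular rates should be zero.
    """
    angle_err = [abs(a - t) for a, t in zip(state[:3], target_angles)]
    rate_err = [abs(w) for w in state[3:]]
    return max(angle_err + rate_err)


# Usage with made-up numbers: a torque history about one axis.
history = [[0.2, 0.0, 0.0]] * 5 + [[-0.2, 0.0, 0.0]] * 5
print(maneuver_energy(history, dt=1.0))
print(boundary_error([0.1, 0.0, 0.0, 0.0, 0.0, 0.02], [0.1, 0.0, 0.0]))
```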

3. Approximate Dynamic Programming Method

3.1. Overview of the Approximate Dynamic Programming Method

3.2. Structure of the Approximate Dynamic Programming Method

(9)

(10)

(11)

(12)

(13)

(14)

(15)

(16)

(17)

(18)

(19)
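For orientation, the relations commonly used in action-dependent heuristic dynamic programming (ADHDP) are sketched below. The notation ($\gamma$ for the discount factor, $U$ for the utility, $\hat{J}$ for the critic output, $w_c$, $w_a$ for the critic and action weights, $l_c$, $l_a$ for the learning rates) is assumed for illustration and may differ from Equations (9) to (19).

```latex
\begin{aligned}
J(k) &= \sum_{i=k}^{\infty} \gamma^{\,i-k}\, U\bigl(x(i), u(i)\bigr), \qquad 0 < \gamma \le 1, \\
J^{*}(k) &= \min_{u(k)} \bigl\{ U\bigl(x(k), u(k)\bigr) + \gamma J^{*}(k+1) \bigr\}, \\
e_c(k) &= \hat{J}(k) - \bigl[ U(k) + \gamma \hat{J}(k+1) \bigr], \qquad E_c(k) = \tfrac{1}{2} e_c^{2}(k), \\
w_c &\leftarrow w_c - l_c \frac{\partial E_c(k)}{\partial w_c}, \qquad
w_a \leftarrow w_a - l_a \frac{\partial \hat{J}(k)}{\partial u(k)} \frac{\partial u(k)}{\partial w_a}.
\end{aligned}
```

The critic is trained to make its output satisfy the Bellman recursion, while the action network is adjusted in the direction that lowers the critic output.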

3.3. Steps of the Approximate Dynamic Programming Algorithm

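1) Initialize the weights and thresholds of the action network and the critic network, as well as the neural network learning rates, and set the value of the discount factor.

2) Read the state of the three-axis satellite system, feed it to the action network, and compute the output control action.

3) Feed the current state and the control action output by the action network to the controlled plant, and obtain the state at the next stage from the plant's state equation.

4) Feed the state and the control to the critic network to obtain the critic network output.

5) Feed the state at the next time step to the action network to obtain the control action at the next time step.

6) Feed the new state and control to the critic network to obtain the critic network output at the next time step.

9) Repeat steps 2) to 8) until the maximum number of training iterations is reached or the error condition is satisfied, at which point the simulation ends.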
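The loop above can be sketched as follows for a single-axis plant. This is a minimal illustration under stated assumptions, not the paper's implementation: the multilayer action and critic networks are replaced by a saturated linear feedback and a critic that is linear in quadratic features; the weight updates between steps 6) and 9) follow the standard ADHDP gradient form (with a normalized critic step for robustness); and every numerical value (`GAMMA`, `DT`, `INERTIA`, `U_MAX`, utility weights, learning rates) is hypothetical.

```python
import math

GAMMA = 0.95              # discount factor (assumed value)
DT, INERTIA = 0.1, 1.0    # time step [s] and normalised inertia (assumed)
U_MAX = 1.0               # torque limit (assumed)


def plant_step(x, u):
    """Single-axis rigid body; exact when the torque is constant over DT."""
    acc = u / INERTIA
    return [x[0] + DT * x[1] + 0.5 * DT * DT * acc, x[1] + DT * acc]


def utility(x, u):
    """Quadratic utility; the weights are illustrative assumptions."""
    return x[0] * x[0] + x[1] * x[1] + 0.1 * u * u


def action(k, x):
    """Stand-in for the action network: saturated linear state feedback."""
    return U_MAX * math.tanh(-(k[0] * x[0] + k[1] * x[1]) / U_MAX)


def features(x, u):
    """Quadratic features of (state, control) used by the critic."""
    return [x[0] * x[0], x[0] * x[1], x[1] * x[1], x[0] * u, x[1] * u, u * u]


def critic(w, x, u):
    """Stand-in for the critic network: linear in the quadratic features."""
    return sum(wi * fi for wi, fi in zip(w, features(x, u)))


def critic_update(w, phi, td_error, lr):
    """Normalised gradient step on 0.5 * td_error**2 (target held fixed)."""
    step = lr / (1.0 + sum(f * f for f in phi))
    return [wi - step * td_error * fi for wi, fi in zip(w, phi)]


def actor_update(k, w, x, lr):
    """Gradient step on the gains that lowers the critic output."""
    u = action(k, x)
    dq_du = w[3] * x[0] + w[4] * x[1] + 2.0 * w[5] * u
    du_ds = 1.0 - (u / U_MAX) ** 2   # derivative of the tanh saturation
    # u = U_MAX * tanh(s), s = -(k . x) / U_MAX, so du/dk_i = -du_ds * x_i
    return [ki + lr * dq_du * du_ds * xi for ki, xi in zip(k, x)]


def train(episodes=20, steps=100, lr_critic=0.05, lr_actor=0.01):
    w = [0.0] * 6                # step 1: initialise critic weights
    k = [1.0, 1.0]               #         and action gains (arbitrary start)
    costs = []
    for _ in range(episodes):    # step 9: repeat up to a maximum count
        x, cost = [1.0, 0.0], 0.0            # at rest with a 1 rad error
        for _ in range(steps):
            u = action(k, x)                      # step 2
            x_next = plant_step(x, u)             # step 3
            q = critic(w, x, u)                   # step 4
            u_next = action(k, x_next)            # step 5
            q_next = critic(w, x_next, u_next)    # step 6
            # Assumed updates: critic toward U + gamma * Q', then the actor.
            td_error = q - (utility(x, u) + GAMMA * q_next)
            w = critic_update(w, features(x, u), td_error, lr_critic)
            k = actor_update(k, w, x, lr_actor)
            cost += utility(x, u)
            x = x_next
        costs.append(cost)
    return w, k, costs


if __name__ == "__main__":
    weights, gains, costs = train()
    print("action gains:", gains)
    print("first and last episode cost:", costs[0], costs[-1])
```

The returned `costs` list holds the accumulated utility per episode. No convergence or optimality is claimed for these untuned settings; the sketch only shows how data flows between the plant, the action approximator, and the critic.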

4. Simulation Examples

Table 1. Single-axis maneuver simulation parameters [18]


Figure 1. (a) Attitude angle; (b) Angular velocity; (c) Control torque

Table 2. Three-axis maneuver simulation parameters [18]


Figure 2. (a) Attitude angle; (b) Angular velocity; (c) Control torque calculated by GA; (d) Control torque calculated by ADHDP

5. Conclusion

Optimal Attitude Control of Three-Axis Satellite Based on Approximate Dynamic Programming [J]. Journal of Aerospace Science and Technology, 2017, 05(01): 27-36. http://dx.doi.org/10.12677/JAST.2017.51004

References

1. Zhang, H.G., Zhang, X., Luo, Y.H., et al. (2013) An Overview of Adaptive Dynamic Programming. Acta Automatica Sinica, 39, 303-311. (In Chinese)

2. Lin, X.F., Zhang, H., Song, S.J., et al. (2011) Adaptive Dynamic Programming with ε-Error Bound for Nonlinear Discrete-Time Systems. Control and Decision, 26, 1586-1590. (In Chinese)

3. Al-Tamimi, A., Vrabie, D., Abu-Khalaf, M., et al. (2007) Model-Free Approximate Dynamic Programming Schemes for Linear Systems. International Joint Conference on Neural Networks, Orlando, 12-17 August 2007, 371-378.

4. Jiang, Y. and Jiang, Z.P. (2013) Global Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems. IEEE Transactions on Automatic Control, 19, 1-13.

5. Lee, J.Y., Jin, B.P. and Choi, Y.H. (2009) Model-Free Approximate Dynamic Programming for Continuous-Time Linear Systems. IEEE Conference on Decision and Control, Shanghai, 15-18 December 2009, 5009-5014.

6. Tang, K.W. and Srikant, G. (1997) Reinforcement Control via Action Dependent Heuristic Dynamic Programming. International Conference on Neural Networks, Vol. 3, Houston, 12 June 1997, 1766-1770.

7. Murray, J.J., Cox, C.J., Lendaris, G.G., et al. (2002) Adaptive Dynamic Programming. IEEE Transactions on Systems Man & Cybernetics Part C Applications & Reviews, 32, 140-153. https://doi.org/10.1109/TSMCC.2002.801727

8. Ding, W., Liu, D. and Wei, Q. (2011) Adaptive Dynamic Programming for Finite-Horizon Optimal Tracking Control of a Class of Nonlinear Systems. 30th Chinese Control Conference, Yantai, 22-24 July 2011, 2450-2455.

9. Liu, D., Wang, D. and Yang, X. (2013) An Iterative Adaptive Dynamic Programming Algorithm for Optimal Control of Unknown Discrete-Time Nonlinear Systems with Constrained Inputs. Information Sciences, 220, 331-342. https://doi.org/10.1016/j.ins.2012.07.006

10. Liu, D. (2005) Approximate Dynamic Programming for Self-Learning Control. Automatica, 31, 13-18.

11. Jiang, Y. and Jiang, Z.P. (2012) Computational Adaptive Optimal Control for Continuous-Time Linear Systems with Completely Unknown Dynamics. Automatica, 48, 2699-2704. https://doi.org/10.1016/j.automatica.2012.06.096

12. Wang, D. and Liu, D. (2013) Neuro-Optimal Control for a Class of Unknown Nonlinear Dynamic Systems Using SN-DHP Technique. Neurocomputing, 121, 218-225. https://doi.org/10.1016/j.neucom.2013.04.006

13. Zhu, Y., Zhao, D. and Liu, D. (2015) Convergence Analysis and Application of Fuzzy-HDP for Nonlinear Discrete-Time HJB Systems. Neurocomputing, 149, 124-131. https://doi.org/10.1016/j.neucom.2013.11.055

14. Si, J. and Wang, Y.T. (2000) On-Line Learning Control by Association and Reinforcement. IEEE Transactions on Neural Networks, 12, 264-276. https://doi.org/10.1109/72.914523

15. Wei, Q., Song, R. and Sun, Q. (2015) Nonlinear Neuro-Optimal Tracking Control via Stable Iterative Q-Learning Algorithm. Neurocomputing, 168, 520-528. https://doi.org/10.1016/j.neucom.2015.05.075

16. Zhang, R.W. (1998) Satellite Orbit and Attitude Dynamics and Control. Beijing University of Aeronautics and Astronautics Press, Beijing. (In Chinese)

17. Huang, J. (2010) Research on Optimal Attitude Control Methods for Three-Axis Stabilized Spacecraft. Master's Thesis, Harbin Institute of Technology, Harbin. (In Chinese)

18. Guo, J.L. (2013) Time-Optimal Control of Attitude Maneuvers for Three-Axis Stabilized Satellites. Master's Thesis, Harbin Institute of Technology, Harbin. (In Chinese)

19. An, X.F. (2016) Research on Intelligent Adaptive Control of Satellite Relative Attitude and Distributed Simulation Technology. Master's Thesis, Beijing Institute of Technology, Beijing. (In Chinese)