Robo Reflections: Using the Optimizer

With the actuator saturations acquired, I was able to put together the following model and optimizer code.

Model

A constant reference of 0 is compared with the gain-weighted plant state (the output of the controller gains) to determine an actuation signal representing an acceleration to be applied at the base. This command signal is passed through the actuators subsystem block to apply the actuator dynamics. The potentially saturated signal is then passed into the plant state-space model block. The state variables are read from the plant, sent to the workspace, and passed to the controller gains. While this model runs, the evolutionary optimizer modifies the controller gains and tests the state output.

Let's take a closer look at the actuators subsystem.

The actuators subsystem block takes in a raw acceleration signal and saturates it. The saturated signal is integrated to find the approximate velocity, which is saturated as well. If the speed becomes saturated, the acceleration value that is output is forced to 0 via If and If Action blocks. Otherwise, the saturated acceleration value is output. The saturated velocity is sent to a Terminator block to prevent a compiler warning and to allow the signal to be attached to a scope.
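The same logic can be sketched outside Simulink as a discrete-time update. This is only an illustration under a few assumptions: the function name actuator_step, the time step dt, and the forward-Euler integration are mine, and the sketch carries the clamped velocity forward as its state, which may differ from how the integrator block behaves in the actual model.

```matlab
% Sketch of the actuator limits as one discrete-time step (forward Euler).
% actuator_step, dt, and clamp are illustrative names, not part of the model.
function [a_out, v_next] = actuator_step(a_cmd, v_prev, dt, accel_top, velo_top)
    clamp = @(x, limit) max(-limit, min(limit, x));

    a_sat  = clamp(a_cmd, accel_top);   % saturate the raw acceleration
    v_raw  = v_prev + a_sat * dt;       % integrate to approximate the velocity
    v_next = clamp(v_raw, velo_top);    % saturate the velocity as well

    if v_next ~= v_raw
        a_out = 0;                      % speed is saturated: force acceleration to 0
    else
        a_out = a_sat;                  % otherwise output the saturated acceleration
    end
end
```

Calling it repeatedly with a large a_cmd ramps the velocity at accel_top until velo_top is reached, after which the output acceleration drops to 0.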

Optimizer

This model is operated by the particle swarm optimizer to find optimal gains for the controller. The evolutionary algorithm is split over three files. The first, simplebalance_env.m, initializes the environment to run the other files.

%
% SimpleBalance - Environment
%
% This file prepares the environment for the simplebalance model files
%

% clear the environment
clear all

% initialize constants
g = 9.81;        % m/s^2, acceleration due to gravity
l = .11125;      % m, length of moment arm
m = 652.039032; % g, mass of point
M = 28.3495231; % g, mass of base

% calculate the state-space matrices
A = [0 1; m*g*l/(2*m*l^2) 0];
B = [0; M*l/(2*m*l^2)];
C = [1 0; 0 1];
D = [0;0];

% calculate a conversion factor from degrees of wheel rotation to meters.
deg2m = 1/360    * 81.6*pi * 1/1000;
%       rot/deg    mm/rot    m/mm

% calculate the saturation of the motors
velo_top = 900 * deg2m; % m/s, maximum motor velocity at max battery
accel_top = 200 * deg2m; % m/s^2, maximum motor acceleration at max battery

% set an initial state for the model
x_0 = [.0001;0];

% set an initial gain for the model (so we don't get compiler warnings
% while we're working with it)
K = [0 0];

The inline comments explain most of what's going on. Note that the state equations and constants were determined from my earlier work on this hardware in 2009. The 81.6 used in the degrees-to-meters calculation comes from the wheel specifications. As mentioned earlier, the 900 degrees per second velocity and the 200 degrees per second squared acceleration were determined last week. The initial state is set so that a small error away from balance is generated, which tasks the controller with correcting it.
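As a quick sanity check on those numbers, the short script below recomputes the saturation limits in metric units and looks at the open-loop poles of A. It reuses the constants from the environment file. The variable name poles is my own, and the simplification of A(2,1) to g/(2*l) follows from m cancelling in m*g*l/(2*m*l^2).

```matlab
% Recompute the saturation limits and inspect the open-loop poles.
g = 9.81;  l = 0.11125;              % same constants as simplebalance_env.m
deg2m = 1/360 * 81.6*pi * 1/1000;    % meters of travel per degree of wheel rotation

velo_top  = 900 * deg2m;             % about 0.64 m/s
accel_top = 200 * deg2m;             % about 0.14 m/s^2

A = [0 1; g/(2*l) 0];                % m cancels out of m*g*l/(2*m*l^2)
poles = eig(A);                      % one pole on each side of the imaginary axis

fprintf('velo_top = %.4f m/s, accel_top = %.4f m/s^2\n', velo_top, accel_top);
fprintf('largest open-loop pole: %.2f rad/s\n', max(real(poles)));
```

The positive real pole (roughly 6.64 rad/s) is why even the tiny initial angle in x_0 grows unless the controller acts on it.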

After the environment has been set, the optimizer can begin to run. The algorithm is too long to post here, but it is accessible via simplebalance_psoa.m. Please open that file if you'd like to follow along. The algorithm is documented, and I'll refrain from reviewing the operation of the optimizer here since I provided materials on it in an earlier post, but here are a few points of interest.

The parameters for the optimizer were pulled from M. E. H. Pedersen's publication Good Parameters for Particle Swarm Optimization. In it, Pedersen summarizes a number of tuning values that can serve as good starting points for a number of different configurations. Choosing some basic values from it has proved to be sufficient.

The globalBest on lines 75 and 118, and the it on line 80, are intentionally left without a semicolon so that the script prints a progress report to the command window. Lines 93 and 102 can be uncommented for a similar effect.

I've noticed much better performance, with this project and with others, without the bounding that the original algorithm outlines. This could be due to a number of reasons, not the least of which is search bounds that are too small. However, I have found that the optimizer with Pedersen's recommended parameters performs much better without bounds in general, as compared to other tuning values. I'll speak more to this in the final paper.
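For readers who don't want to open the full file, here is a minimal sketch of a global-best particle swarm without position bounds. It is not the code in simplebalance_psoa.m: toy_cost is a stand-in quadratic rather than simplebalance_cost, and the swarm size, iteration count, and the omega, phi_p, and phi_g values are common textbook choices that I picked for illustration, not the Pedersen values used in the project.

```matlab
% Minimal global-best particle swarm with no bounds on particle positions.
% toy_cost and all tuning values below are illustrative assumptions.
toy_cost = @(p) sum((p - [3 -2]).^2);          % stand-in for simplebalance_cost

nParticles = 30;  nDims = 2;  nIters = 200;
omega = 0.7298;  phi_p = 1.4962;  phi_g = 1.4962;

pos = 10 * (rand(nParticles, nDims) - 0.5);    % initial spread only, never enforced
vel = zeros(nParticles, nDims);
bestPos  = pos;                                % best position seen by each particle
bestCost = zeros(nParticles, 1);
for i = 1:nParticles
    bestCost(i) = toy_cost(pos(i, :));
end
[globalBestCost, idx] = min(bestCost);
globalBest = bestPos(idx, :);

for it = 1:nIters
    for i = 1:nParticles
        rp = rand(1, nDims);  rg = rand(1, nDims);
        vel(i, :) = omega * vel(i, :) ...
                  + phi_p * rp .* (bestPos(i, :) - pos(i, :)) ...
                  + phi_g * rg .* (globalBest - pos(i, :));
        pos(i, :) = pos(i, :) + vel(i, :);     % no clamping to a search box
        c = toy_cost(pos(i, :));
        if c < bestCost(i)
            bestCost(i) = c;  bestPos(i, :) = pos(i, :);
            if c < globalBestCost
                globalBestCost = c;  globalBest = pos(i, :);
            end
        end
    end
end
globalBest    % no semicolon, so the result prints to the command window
```

Swapping toy_cost for a handle to the real cost function is the only change needed to point the sketch at the Simulink model, although each evaluation then runs a full simulation.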

Finally, this implementation of the optimizer uses the following cost function in simplebalance_cost.m.

%
% SimpleBalance - PSOA Cost Function
%
% This file defines simplebalance_cost, the cost function used by the
% particle swarm optimizer.
%
% This file is intended to be called from simplebalance_psoa.m.

function cost = simplebalance_cost(position)

% assign the workspace variable K with the position of the particle
assignin('base', 'K', position);

% run the simulation against the gain values
sim('simplebalance2');

% if the simulation is a step response, the cost returned can be the
% settling time.
%info = stepinfo(simout.Data(:,1), simout.Time(:,1));
%cost = info.SettlingTime;

% the cost returned is the absolute value of the last simulated point
cost = abs(simout.Data(501,1));

As with the other files, this one is documented, so I'll stick to a few remarks. My original cost function (commented out) was designed for a step response to the reference value. The cost was mapped from the settling time: the shorter the settling time, the lower the cost. This worked fine before the modifications to the actuator subsystem model were included. Now the model settles from an unstable initial condition, and the cost is the absolute value of the final data point after 10 seconds of simulation. Although we are not theoretically guaranteed that the model is stable under this condition, my finding has been that it is. In fact, with the configuration shown here, the optimizer actually over-tunes the gain values.

The evolutionary optimization strategy identifies controller gains of about 19800 and 630. After 10 seconds of simulation, the cost is smaller than what can be stored in program memory, which makes it effectively zero.
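One rough way to see why the final value collapses toward zero is to check the closed-loop poles of the linear model with these gains. The sketch below assumes plain negative state feedback, u = -K*x, and ignores the actuator subsystem entirely, so it says nothing about the saturated behavior. The rounded gains and the name A_cl are my own.

```matlab
% Closed-loop poles of the unsaturated linear model with the reported gains.
% Assumes u = -K*x and ignores the actuator saturation.
g = 9.81;  l = 0.11125;
m = 652.039032;  M = 28.3495231;

A = [0 1; m*g*l/(2*m*l^2) 0];
B = [0; M*l/(2*m*l^2)];
K = [19800 630];                 % approximate gains reported above

A_cl = A - B*K;                  % closed-loop state matrix
poles = eig(A_cl)                % both real parts should be negative
```

Both poles land well inside the left half-plane, so in this idealized linear picture a small initial angle decays very quickly.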

Implementing the Optimized Controller

Using these gains, I set out to program the controller into the hardware and created this model. The main loop is trivial...

The plant is much more interesting...

This plant is based on my experience with the NXC version. The actuation from the control is sent into a gain to convert the meters per second squared acceleration value into units using degrees. The resulting value is summed with the average speed of the motors, as read by the encoders, to find the new speed. This speed is passed through an additional gain and set to the motors. Meanwhile, the current gyro sensor value is integrated and sent along with its current value to the output. At the moment, no consideration is given to gyro drift.

First, a few considerations. The gain of 1/9 is required to scale the potential maximum speed of 900 degrees per second to a reasonable motor input signal between -100 and 100. In reality, this gain would change over time based on the state of the battery. In a more professional control, there could be another control loop relating motor references, speeds, positions, and battery voltages. For this simple model, we ignore those factors. Also, the gain is negated because the motors are mounted backwards relative to their reference.
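The scaling itself is small enough to write out. In the sketch below, the function name motor_command and the final clamp to the motor input range are assumptions added for illustration. The description above only specifies the negated 1/9 gain.

```matlab
% Sketch of the speed-to-motor-command scaling.
% motor_command and the clamp to [-100, 100] are illustrative assumptions.
function cmd = motor_command(speed_dps)
    gain = -1/9;                        % 900 deg/s maps to 100, negated for the reversed mounting
    cmd  = gain * speed_dps;
    cmd  = max(-100, min(100, cmd));    % keep the command within the motor input range
end
```

With this gain, a requested speed of 900 degrees per second produces a command of magnitude 100, the edge of the motor input range.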

Running the model on the hardware leads to jerky attempts to balance. The gyroscopic drift comes back into play: over time the robot believes the balance angle is off from what it should be, and this is the main source of disturbance. I will continue to investigate this after the project is complete, but I believe the basic tasks set by the proposal to be complete and will focus my attention on the paper.

Monday, April 23, 2012