Giannone Nao Learning

  • 7/30/2019 Giannone Nao Learning

    Overview: environment

    Robotic agent: NAO, a humanoid robot produced by Aldebaran
    Application: robotic soccer
    SDK
    Simulator

    Overview: (sub)tasks

      • Vision Module: process raw data from the environment
      • Modelling Module: elaborate raw data to obtain more reliable information
      • Behaviour Control Module: decide the best behaviour to accomplish the agent's goal
      • Motion Control Module: actuate the robot motors accordingly

    Environment

    At first!

    Make Nao walk: how?

    Nao is equipped with a set of motion utilities, including a walk implementation that can be called through an interface (NaoQi Motion Proxy) and partially customized by tuning some parameters.

    Main advantage: ready to use (to be tuned)

    Drawback: based on an unknown walk model. No flexibility at all!

    For these reasons we decided to develop our own walk model and to tune it using machine learning techniques.

    A simple walking RAgent for Nao

    Simple Behaviour Module: switches between two states, walk and stand

    [Architecture diagram. Components: Motion Control Module, NaoQi Adaptor, Smemy, SPQR Walking Library, NAO (NaoQi), Webots Client, TCP channel, WEBOTS. Relation label: uses.]
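
The two-state behaviour mentioned on this slide can be pictured as a tiny state machine. The sketch below is illustrative only: the names `State`, `SimpleBehaviour`, `update` and `should_walk` are hypothetical and are not taken from the RAgent code.

```python
from enum import Enum


class State(Enum):
    STAND = "stand"
    WALK = "walk"


class SimpleBehaviour:
    """Hypothetical two-state behaviour: the robot either stands or walks."""

    def __init__(self):
        self.state = State.STAND  # assumed initial state

    def update(self, should_walk: bool) -> State:
        # Switch state according to an external request (assumed input).
        self.state = State.WALK if should_walk else State.STAND
        return self.state


behaviour = SimpleBehaviour()
trace = [behaviour.update(cmd).value for cmd in (True, True, False)]
```

In a real agent the request would presumably come from a higher-level decision, and the returned state would select which motion command is sent on.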

    SPQR Walking Engine Model

    NAO model characteristics:
      • 21 degrees of freedom
      • No actuated trunk
      • No dynamic model available

    Velocity commands (v, ω): v is the linear velocity, ω is the angular velocity

    We follow the Static Walking Pattern: use an a-priori definition of the desired trajectories, defined by these steps:
      • Choose a set of output variables: 3D coordinates of selected points of the robot
      • Choose and parametrize the desired trajectories for these variables at each phase of the gait

    SPQR walking subtasks and parameters

    SPQR walk subtasks and their parameters:
      • Foot trajectories in the xz plane: Xtot, Xsw0, Xds, Zst, Zsw
      • Center of mass trajectory in the lateral direction: Yft, Yss, Yds, Kr
      • Hip yaw/pitch control (turn): Hyp
      • Arm control: Ks

    Biped walking: double support phase, swing phase, SS%
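
To make the idea of a parametrized foot trajectory in the xz plane concrete, here is a minimal sketch. The slides do not define the parameters, so the roles given to a step length and a swing height (loosely suggested by names such as Xtot and Zsw), the half-sine height profile, and the function name `swing_foot_xz` are all assumptions.

```python
import math


def swing_foot_xz(phase: float, step_length: float, swing_height: float):
    """Return (x, z) of the swing foot for a phase in [0, 1].

    Assumed shape: x advances linearly by step_length, z follows a
    half-sine arch peaking at swing_height. Not the SPQR model itself.
    """
    if not 0.0 <= phase <= 1.0:
        raise ValueError("phase must be in [0, 1]")
    x = step_length * phase
    z = swing_height * math.sin(math.pi * phase)
    return x, z


# Sample the trajectory at five phases (values in metres, illustrative only).
samples = [swing_foot_xz(p / 4, step_length=0.04, swing_height=0.02) for p in range(5)]
```

Tuning then amounts to searching over values such as `step_length` and `swing_height` rather than editing the trajectory code.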

    Walk tuning: main issues

    Possible choices:
      • By hand
      • By using machine learning techniques

    Machine learning seems the best solution:
      • Less human interaction
      • Explores the search space in a more systematic way

    but some aspects need care:
      • You need to define an effective fitness function
      • You need to choose the right algorithm to explore the parameter space
      • Only a limited number of experiments can be done on a real robot
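
As an illustration of what a fitness function for walk tuning might look like, the sketch below rewards forward displacement per unit time, subtracts lateral drift, and returns zero when the robot falls. This definition is an assumption made for illustration; the slides do not state the fitness function that was actually used.

```python
def walk_fitness(start_xy, end_xy, duration_s, fell=False):
    """Hypothetical fitness: net forward speed, 0 if the robot fell.

    start_xy and end_xy are (x, y) positions in metres, for example from a
    position sensor. Lateral drift is subtracted as a penalty (assumed).
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    if fell:
        return 0.0
    forward = end_xy[0] - start_xy[0]
    drift = abs(end_xy[1] - start_xy[1])
    return max(0.0, (forward - drift) / duration_s)


good = walk_fitness((0.0, 0.0), (1.0, 0.1), duration_s=10.0)
fallen = walk_fitness((0.0, 0.0), (0.5, 0.0), duration_s=10.0, fell=True)
```

A walk that covers 1.0 m forward with 0.1 m of drift in 10 s scores about 0.09 here, while any run that ends in a fall scores 0.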

    SPQR Learning System Architecture

      • Learner: uses the Learning library
      • RAgent: uses the Walking library
      • Platforms: real Nao, Webots
      • Exchanged data: iteration experiments, data to evaluate the fitness (GPS), fitness

    SPQR Learner

    First iteration?
      • Yes: return the initial iteration and the iteration information
      • No: apply the chosen algorithm (strategy), then return the next iteration and the iteration information

    Available strategies:
      • Policy Gradient (e.g., PGPR)
      • Nelder-Mead Simplex Method
      • Genetic Algorithm
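
The learner flow on this slide (return the initial iteration first, otherwise apply the chosen strategy) can be sketched as a small class with a pluggable strategy. Everything here is hypothetical: the names `Learner`, `next_iteration` and `keep_best` are not from the SPQR code, and `keep_best` is a deliberately trivial placeholder, not Policy Gradient, Nelder-Mead or a Genetic Algorithm.

```python
from typing import Callable, List, Optional, Sequence

Policy = List[float]
# A strategy maps the evaluated policies and their fitness to the next policies.
Strategy = Callable[[Sequence[Policy], Sequence[float]], List[Policy]]


class Learner:
    def __init__(self, initial: List[Policy], strategy: Strategy):
        self.current = [list(p) for p in initial]
        self.strategy = strategy
        self.iteration = 0

    def next_iteration(self, fitness: Optional[Sequence[float]] = None) -> List[Policy]:
        if self.iteration == 0:
            # First iteration: return the initial policies unchanged.
            self.iteration = 1
            return self.current
        if fitness is None or len(fitness) != len(self.current):
            raise ValueError("one fitness value per evaluated policy is required")
        # Later iterations: let the chosen strategy propose the next policies.
        self.current = self.strategy(self.current, fitness)
        self.iteration += 1
        return self.current


def keep_best(policies, fitness):
    """Placeholder strategy: repeat the best policy found so far."""
    best = policies[max(range(len(policies)), key=lambda i: fitness[i])]
    return [list(best) for _ in policies]


learner = Learner([[0.0, 1.0], [0.5, 0.5]], keep_best)
first = learner.next_iteration()
second = learner.next_iteration([0.2, 0.7])
```

Keeping the strategy behind a single callable is one way to swap optimisers without touching the experiment loop.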

    Enhancing PG: PGPR

    At each iteration i, the gradient estimate can be used to obtain a metric for measuring the relevance of the parameters.

    Given the relevance and a threshold T, PGPR prunes the less relevant parameters in the next iterations.

    forgetting factor
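
The relevance formula itself did not survive on this slide, so the sketch below shows only one plausible way to combine gradient estimates, a forgetting factor and a threshold T. The exponential-averaging rule, the normalisation and all names (`update_relevance`, `active_parameters`, `forgetting`) are assumptions, not the published PGPR definition.

```python
def update_relevance(relevance, gradient, forgetting=0.5):
    """Blend the previous relevance with the normalised |gradient| (assumed rule)."""
    total = sum(abs(g) for g in gradient)
    if total == 0:
        share = [0.0] * len(gradient)
    else:
        share = [abs(g) / total for g in gradient]
    return [forgetting * r + (1.0 - forgetting) * s for r, s in zip(relevance, share)]


def active_parameters(relevance, threshold):
    """Indices of parameters whose relevance reaches the threshold T."""
    return [i for i, r in enumerate(relevance) if r >= threshold]


# Three parameters start equally relevant; the third never shows up in the gradient.
relevance = [1 / 3, 1 / 3, 1 / 3]
for grad in ([0.9, 0.1, 0.0], [0.8, 0.2, 0.0]):
    relevance = update_relevance(relevance, grad)
kept = active_parameters(relevance, threshold=0.1)
```

In this toy run the third parameter drops below the threshold after two iterations and would be pruned, leaving the first two to be optimised.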

    Simulators in learning tasks

    Advantages

    You can test the gait model and the learning algorithm without being biased by noise

    Limits

    The results of experiments on the simulator can be ported to the real robot, but solutions specialized for the simulated model may be less effective on the real robot (e.g., the simulator does not take asymmetries into account, and the models are not very accurate)

    Results (1)

    Five sessions of PG, 20 iterations each, all starting from the same initial configuration

    SS%, Ks, Yft have been set to hand-tuned values

    16 policies for each iteration

    Fitness increases in a regular way

    Low variance among the five simulations
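
For readers unfamiliar with policy gradient in this setting, the sketch below shows one common finite-difference formulation (in the style of Kohl and Stone), evaluating 16 perturbed policies per iteration for 20 iterations. It may differ from the variant the authors used; the function name `pg_iteration`, the toy fitness, and the step and perturbation sizes are all assumptions.

```python
import random


def pg_iteration(params, fitness_fn, epsilons, step_size, n_policies=16, rng=None):
    """One iteration of a finite-difference policy gradient (illustrative sketch)."""
    rng = rng or random.Random()
    # Each policy perturbs every parameter by -eps, 0 or +eps.
    perturbations = [[rng.choice((-1, 0, 1)) for _ in params] for _ in range(n_policies)]
    scores = [
        fitness_fn([p + d * e for p, d, e in zip(params, deltas, epsilons)])
        for deltas in perturbations
    ]
    gradient = []
    for j in range(len(params)):
        groups = {-1: [], 0: [], 1: []}
        for deltas, score in zip(perturbations, scores):
            groups[deltas[j]].append(score)
        avg = {k: (sum(v) / len(v) if v else None) for k, v in groups.items()}
        if avg[-1] is None or avg[1] is None:
            gradient.append(0.0)  # not enough samples for this parameter
        elif avg[0] is not None and avg[0] > avg[1] and avg[0] > avg[-1]:
            gradient.append(0.0)  # leaving the parameter unchanged looked best
        else:
            gradient.append(avg[1] - avg[-1])
    norm = sum(g * g for g in gradient) ** 0.5
    if norm == 0:
        return list(params), gradient
    # Move a fixed distance along the normalised gradient estimate.
    return [p + step_size * g / norm for p, g in zip(params, gradient)], gradient


def toy_fitness(p):
    # Toy stand-in for a walking experiment: best at (1.0, -1.0).
    return -((p[0] - 1.0) ** 2 + (p[1] + 1.0) ** 2)


rng = random.Random(0)
params = [0.0, 0.0]
initial_fitness = toy_fitness(params)
for _ in range(20):
    params, _grad = pg_iteration(params, toy_fitness, [0.05, 0.05], 0.1, rng=rng)
final_fitness = toy_fitness(params)
```

On a real robot each call to the fitness function would be a walking experiment, which is why the number of policies and iterations has to stay small.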

    Results (2)

    Five runs of PGPR

    [Plots. Panel labels: Zsw, Xs, Kr, Xsw0]

    Final parameter sets for the five PG runs

    Bibliography

    A. Cherubini, F. Giannone, L. Iocchi, M. Lombardo, G. Oriolo. Policy Gradient Learning for a Humanoid Soccer Robot. Accepted for Journal of Robotics and Autonomous Systems.

    A. Cherubini, F. Giannone, L. Iocchi, P. F. Palamara. An extended policy gradient algorithm for robot task learning. Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007.

    A. Cherubini, F. Giannone, L. Iocchi. Layered learning for a soccer legged robot helped with a 3D simulator. Proc. of 11th International RoboCup Symposium, 2007.

    http://openrdk.sourceforge.net
    http://www.aldebaran-robotics.com/
    http://spqr.dis.uniroma1.it
    Any questions?