
Bachelor's Thesis

Bachelor's Degree in Electronics, Robotics and Mechatronics Engineering

DECENTRALIZED CONSENSUS-BASED

SELF-CALIBRATION IN SENSOR NETWORKS

Author: Julio José López Paneque

Supervisor: José Ramiro Martínez de Dios

Dep. of Systems Engineering and Automation

Escuela Técnica Superior de Ingeniería

Universidad de Sevilla

Seville, 2016


Bachelor's Thesis

Bachelor's Degree in Electronics, Robotics and Mechatronics Engineering

DECENTRALIZED CONSENSUS-BASED

SELF-CALIBRATION IN SENSOR NETWORKS

Author:

Julio José López Paneque

Supervisor:

José Ramiro Martínez de Dios

Associate Professor

Dep. of Systems Engineering and Automation

Escuela Técnica Superior de Ingeniería

Universidad de Sevilla

Seville, 2016


Final Degree Project: DECENTRALIZED CONSENSUS-BASED SELF-CALIBRATION IN SENSOR NETWORKS

Author: Julio José López Paneque

Supervisor: José Ramiro Martínez de Dios

The examination committee appointed to evaluate the project indicated above, composed of the following members:

President:

Members:

Secretary:

agrees to award it the grade of:

Seville, 2016

The Secretary of the Committee

To my family

To my project supervisor

To my professors and classmates


Acknowledgements

The development of this work, and the road that led here, has not been easy at all, and my life would surely have become much more complicated were it not for the people who have supported me for as long as I can remember.

To begin with, I would like to mention a number of teachers who, at a difficult time in my life, gave me the education and advice that led me to become who I am today. Thank you, Manolo, Antonio, Fernando, Jesús and Antonio.

Education and learning are among the things I value most, and because of that I feel indebted to almost all the professors I have had during my degree. For making me work hard, for the concern they show for their students, and for their patience, I thank them.

I would especially like to mention Ramiro, my supervisor for this project, with whom I have been working for more than a year in different fields and who has shown extraordinary composure throughout. For his help and his effort, thank you.

I also want to express my gratitude to all my friends, both the lifelong ones and those I have made during my degree and in the laboratory. For helping me release tension and supporting me whenever it was needed, thank you.

Finally, I want to thank my parents for passing on to me their passion for science and for being role models, my brother for warning me not to get stuck in my studies, and Bárbara for being my greatest pillar.

Julio José López Paneque

Seville, 2016


Abstract

The aim of this Thesis is to develop decentralized techniques for autonomous self-calibration of sensor nodes in industrial sensor networks. These techniques must be consensus-based, meaning that they do not require any reference (pattern) sensor node, but rather the collaboration of all the sensor nodes in the network. The solution also needs to be robust (suitable for different calibration situations) and should scale properly with the number of sensor nodes in the network.

In particular, the solution will be tested in a sensor network where each sensor node takes temperature measurements at a different location of the same pipe. These nodes may have calibration errors in both the slope and the mean (offset), and the sensor network should be able to correct these errors.

The solution implemented in this Thesis is composed of a Decentralized RANSAC algorithm, a series of Bayesian Networks and several specific calculations designed for this solution.

This Thesis is composed of 6 chapters. Chapter 1 introduces the context of the Thesis and its main objectives, as well as its structure. Chapter 2 explains the state of the art of this project, including techniques that aim to recalibrate sensor nodes using approaches different from the one of this Thesis. Chapter 3 explains the main algorithms and mathematical models used in this Thesis: RANSAC, Decentralized RANSAC and Bayesian Networks. Chapter 4 explains how to implement the two versions of the proposed solution. Chapter 5 shows a series of experiments intended to understand the behavior of the system and its limits. Lastly, Chapter 6 provides a series of conclusions and proposals for future improvements of the solution.


Summary

Motivation and Structure of the Project

Nowadays, a large number of oil and gas plants are installed all over the world. They are composed of complex systems of pipes and structures, and have a large number of sensor nodes installed to monitor the behavior of the plant. This number of sensor nodes entails a high economic cost, reaching an annual worldwide expenditure of 56,000 million euros on their supervision and maintenance alone. Sensors deteriorate and suffer unexpected failures, which negatively affects the overall behavior of the plant where they are installed.

Because many processes depend on the measurements of these nodes, they must be recalibrated when they do not provide correct measurements. This is normally done with a reference sensor that allows each sensor node to be adjusted manually. This adjustment is generally a hazardous task and involves buying high-cost equipment so that the operator can reach the different areas of the plant.

This set of problems, together with the importance of correctly calibrating industrial sensors, has made the search for a solution a matter of global importance in recent years. Several projects propose solutions for this recalibration. In particular, this Thesis is framed in the AEROARMS project (AErial RObotic system integrating multiple ARMS, www.aeroarms-project.eu), which aims to develop the first robotic system with multiple arms and advanced manipulation capabilities to be applied in industrial inspection and maintenance activities.

In this project, the aerial robots are able to deploy permanent sensor nodes in different parts of an industrial complex, as well as to repair or replace the existing ones. For this to be useful, it is necessary to know where and when an inspection task must be carried out (there is no point in repairing or replacing a sensor node that is working properly). This condition will be determined by the set of sensors installed in each system of the plant. For example, in a long pipe where the temperature of a fluid decreases along its length, the sensor nodes installed on it can obtain a mathematical model of how this temperature evolves along the whole path. Using this model, each sensor could deduce whether it is correctly calibrated and, if it is not, either adjust its parameters by software or call the robot to replace it when the failure is irreparable due to a hardware error.

The objective of this Thesis is to develop this sensor calibration system. The implemented techniques must be based on the consensus of the network (which implies that no reference sensor is required for recalibration, but rather the collaboration of all the sensors already installed). The solution must also be robust and must scale properly with the number of sensor nodes in the network.

Each sensor node must be autonomous and will be unattended at all times (no operator will approach the network to perform maintenance on it). In any case, an aerial robot may replace a sensor node when necessary, but this must be done strictly only when the sensor node cannot be repaired autonomously.

The implemented solution will be tested in a sensor network where each node measures the temperature at a different point of a pipe. These sensor nodes may have calibration errors in both slope and offset.

This Thesis is composed of 6 chapters, organized as follows. Chapter 1 presents the motivation of the work, as well as its objectives and a detailed explanation of its structure. Chapter 2 explains the state of the art of this project. Chapter 3 describes the problem and the implemented solution, explaining its components step by step. Chapter 4 explains how to implement this solution using the two approaches developed in the project (centralized and decentralized). Chapter 5 presents several explanatory experiments to check the correct operation of the solution, as well as a series of large-scale experiments. Finally, Chapter 6 presents the conclusions drawn from the results of the project, as well as the possible developments that could improve them.

State of the Art

Sensor Networks

A sensor network is composed of several low-cost sensor nodes that have communication and processing capabilities. Each one can perform simple computations and can measure one or several magnitudes. After measuring them, the sensor nodes can communicate with each other to fit more complex models of the system, generally distributing the computations across the network to avoid possible bottlenecks.

Normally, a single sensor node cannot perform extensive computations. For example, if several nodes measured their distance to a robot and a filter were implemented using that measurement, the filter computations could be too heavy for a single sensor node, causing it to delay the rest of the system. Distributing the computations generally yields a faster result.

RANSAC

RANSAC (RANdom SAmple Consensus) is an algorithm that iteratively estimates the parameters of a mathematical model from a data set that contains outliers (measurements that do not fit the global model well, even when their noise is taken into account).

The interest of this algorithm lies in its ability to obtain a consensus from a set of noisy samples, with the particular property that this consensus is not affected by the presence of outliers in the measurements. In the problem addressed in this Thesis, the sensor network may contain a certain number of outliers due to calibration errors in the sensors.
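As an illustration of the idea, the following is a minimal sketch of a basic (centralized) RANSAC loop, assuming the logarithmic temperature model x = a + b log(d) used later in this summary. The function names, the number of iterations and the sample data are hypothetical and are not taken from the implementation of this Thesis.

```python
import math
import random


def fit_two_points(p, q):
    """Exact fit of x = a + b*log(d) through two (d, x) samples."""
    (d1, x1), (d2, x2) = p, q
    b = (x2 - x1) / (math.log(d2) - math.log(d1))
    a = x1 - b * math.log(d1)
    return a, b


def ransac_log_model(samples, threshold, iterations=200, seed=0):
    """Return (a, b, inliers) for the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best = (0.0, 0.0, [])
    for _ in range(iterations):
        p, q = rng.sample(samples, 2)
        if p[0] == q[0]:
            continue  # same distance: the two-point fit is undefined
        a, b = fit_two_points(p, q)
        inliers = [(d, x) for d, x in samples
                   if abs(x - (a + b * math.log(d))) <= threshold]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best


# Hypothetical data: true model a=60, b=-5, one grossly wrong reading.
samples = [(d, 60.0 - 5.0 * math.log(d)) for d in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)]
samples[2] = (3.0, samples[2][1] + 15.0)  # outlier caused by a bad calibration
a, b, inliers = ransac_log_model(samples, threshold=3.0)
```

In this toy data set the outlier never gathers more support than the clean samples, so the selected hypothesis ignores it. A practical implementation would usually refine the winning hypothesis with a least-squares fit over its inliers.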


Bayesian Networks

Based on Bayes' Theorem, these networks are a graphical representation of the relationships between different random or unknown variables. The graphs used in this type of network are directed and acyclic. This statistical approach makes it easy to perform probabilistic computations on a set of variables and to obtain formulas and conclusions from a drawing that represents them.

In this Thesis, Bayesian Networks are used to determine individually whether each sensor is miscalibrated or not, from its measurements and the consensus of the system.
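The kind of reasoning involved can be illustrated with the simplest possible case: a single application of Bayes' Theorem to a binary variable ("miscalibrated" or "calibrated"). This is only a sketch; the prior and the likelihoods below are hypothetical numbers, and the actual network structure used in this Thesis is the one described in Section 3.4.

```python
def posterior_miscalibrated(prior, lik_if_miscal, lik_if_ok):
    """Bayes' Theorem for a binary hypothesis.

    prior          P(miscalibrated) before looking at the evidence
    lik_if_miscal  P(evidence | miscalibrated)
    lik_if_ok      P(evidence | calibrated)
    """
    evidence = prior * lik_if_miscal + (1.0 - prior) * lik_if_ok
    return prior * lik_if_miscal / evidence


# Hypothetical values: the evidence is "the readings lie far from the consensus".
p = posterior_miscalibrated(prior=0.1, lik_if_miscal=0.9, lik_if_ok=0.05)
```

With these assumed numbers the posterior is 0.09 / (0.09 + 0.045) = 2/3, so evidence that is much more likely under miscalibration raises a 10% prior to about 67%.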

ROS

ROS (the Robot Operating System) is a Linux-based framework designed to ease the rapid implementation of robotics software. It consists of a large number of tools and libraries that make it easier to create programs of the complexity required by a robotic system.

A first important feature of ROS is the collaborative community of users behind it. Each person, group or center can share its implementations quickly and extensibly through "packages", which are the standard project type in ROS.

The other advantage of ROS is that it makes it easy to write programs that run in parallel. It divides the software into "nodes", which communicate with each other through "topics" and "services". This makes ROS very useful for implementing complex software applications that need several or many programs running in parallel.

Existing Methods for Autonomous Sensor Node Calibration

Owing to the great interest in developing self-calibration techniques for sensor nodes, several solutions have been proposed to solve this problem.

Some solutions focus on performing fast calibrations with "bags" of real and measured values, while others seek highly optimized algorithms to correct non-linearity errors. There is also an approach that calibrates a single sensor on the assumption that the rest of the sensor nodes in the network are calibrated. This can be useful for an initial calibration when a sensor is installed, but apart from using techniques different from those of this Thesis (which makes it very vulnerable to outliers), it did not allow a later recalibration.

There are methods that seek robustness against occasional anomalies, using artificial intelligence techniques for that purpose. However, they are not suitable for rejecting frequent outliers in the system.

Methods such as RANSAC are randomized, and therefore do not guarantee finding the optimal solution of the system. To address this, optimal RANSAC algorithms have been proposed that work even with more than 50% of outliers. The problem with this method is that it is centralized.

It can be seen that, although these methods work well for certain types of applications, they would not be suitable for the problem formulated in this Thesis.

Description of the System

Problem Formulation

The problem for which this Thesis is intended involves an industrial system with a magnitude (temperature, pressure, flow rate...) that evolves, for various reasons, along the extent of the system. This evolution is measured by a set of sensor nodes and is described by a model function. The model has a set of parameters {a, b, c...} that define its behavior.

To give a concrete example, this general problem has been particularized to a case where temperature is the magnitude to be measured. Suppose that the following equation governs its behavior along a pipe with a hair dryer running at its inlet:

$$x = a + b \log(d), \tag{1}$$

where $a$ and $b$ take fixed values during the time needed to take all the measurements. $x$ represents the global state of the system, which is the evolution of the temperature along the pipe.

Suppose that there are $n$ sensor nodes, $\{S_1, S_2, \ldots, S_n\}$. Each one is located at a distance $d_i$ from the hair dryer, so that, ideally, it would measure

$$x_i = a + b \log(d_i), \tag{2}$$

where $x_i$ is the state of the system at position $i$.

Suppose that the sensor measurement is affected by the following error sources:

$$z_i = A (x_i + \eta) + B, \tag{3}$$

where $\eta$ gathers the random noise that appears in the measurement process, $A$ represents the slope error and $B$ represents the offset error, both due to a wrong calibration of the sensor. Ideally, $A = 1$ and $B = 0$.

Once the measurement $z_i$ is available, it is multiplied by a value $\alpha$ and then a value $\beta$ is added to it. These values are intended to correct possible calibration errors by software. Initially, $\alpha$ is 1 and $\beta$ is 0.

$$\hat{x}_i = \alpha z_i + \beta \tag{4}$$

Therefore, $\hat{x}_i$ is the estimate that the system has of the magnitude. With this, the objective of this Thesis is to find

$$\alpha, \beta = \arg\min_{\alpha, \beta} \left( x_i - (\alpha z_i + \beta) \right), \tag{5}$$

for each sensor node $i$.
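The measurement and correction models in Eqs. (2) to (4) can be sketched in a few lines. If the noise is ignored, substituting Eq. (3) into Eq. (4) gives αA x_i + αB + β, which equals x_i for every x_i only when α = 1/A and β = −B/A. The code below checks this numerically. All function names and numeric values are hypothetical; in the real problem A and B are unknown, and the adjustment actually used in this Thesis is the one given in Section 3.5.

```python
import math
import random


def ideal_temperature(d, a, b):
    """Eq. (2): temperature at distance d from the heat source."""
    return a + b * math.log(d)


def measure(x, A, B, sigma, rng):
    """Eq. (3): raw reading with slope error A, offset error B and noise eta."""
    eta = rng.gauss(0.0, sigma)
    return A * (x + eta) + B


def estimate(z, alpha, beta):
    """Eq. (4): software-corrected estimate of the magnitude."""
    return alpha * z + beta


def ideal_correction(A, B):
    """(alpha, beta) that would undo a known slope/offset error."""
    return 1.0 / A, -B / A


# Hypothetical numbers, chosen only for illustration.
rng = random.Random(0)
x = ideal_temperature(2.0, a=60.0, b=-5.0)
z = measure(x, A=1.2, B=3.0, sigma=0.0, rng=rng)  # noise switched off
alpha, beta = ideal_correction(1.2, 3.0)
x_hat = estimate(z, alpha, beta)                   # recovers x up to rounding
```

With noise enabled, the same correction yields x_i + η, so the random measurement noise remains in the estimate and only the calibration error is removed.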

Steps Used by the Solution

To correct the faults of sensor nodes with erroneous measurements, a model agreed on by the network is required. A RANSAC algorithm is used for this purpose. Since the implementation has been carried out for both cases (decentralized and centralized), both types of algorithm are used.

For both RANSAC and De-RANSAC, the threshold is three times the standard deviation of each sensor, which is 1 °C. With this value, 99% of the samples are included as inliers if the sensor node is calibrated. In addition, measurements are included node by node, so that the measurements of a node are not included if more than 50% of them are outliers.
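The two inclusion rules above can be sketched as follows. This is a minimal illustration that assumes Gaussian noise and takes a residual to be the difference between a reading and the consensus model. The names are hypothetical and not taken from the implementation.

```python
import math

SIGMA = 1.0            # sensor standard deviation in degrees Celsius (from the text)
THRESHOLD = 3 * SIGMA  # inlier threshold used by RANSAC and De-RANSAC


def is_inlier(residual, threshold=THRESHOLD):
    """A sample is an inlier if it lies within the threshold of the model."""
    return abs(residual) <= threshold


def node_included(residuals, threshold=THRESHOLD):
    """A node's measurements are kept unless more than 50% are outliers."""
    outliers = sum(1 for r in residuals if not is_inlier(r, threshold))
    return outliers <= 0.5 * len(residuals)


# Fraction of Gaussian samples expected within 3 standard deviations.
coverage = math.erf(3 / math.sqrt(2))

kept = node_included([0.5, -2.9, 3.5, 0.1])      # 1 outlier out of 4
rejected = node_included([4.0, 5.0, -3.5, 0.2])  # 3 outliers out of 4
```

Under the Gaussian assumption the three-sigma band covers about 99.7% of the samples of a calibrated sensor, which is consistent with the 99% figure quoted above.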

Once the consensus has been found, its sample variance and the variance of that sample variance are computed. This allows subsequent statistical computations to be made from these values and the consensus.
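As a rough sketch of these two quantities, and assuming that the residuals of the inliers around the consensus are independent and Gaussian, the sample variance s² of n residuals has variance 2σ⁴/(n − 1). Replacing the unknown σ² with s² gives a plug-in estimate. This is a textbook result used here for illustration only; the exact expressions used in this Thesis are not given in this summary, and the function name is hypothetical.

```python
import statistics


def sample_variance_and_its_variance(residuals):
    """Return (s2, var_s2) for a list of residuals around the consensus.

    s2 is the unbiased sample variance. var_s2 is the plug-in estimate
    2 * s2**2 / (n - 1), valid under the assumption of Gaussian residuals.
    """
    n = len(residuals)
    s2 = statistics.variance(residuals)
    var_s2 = 2.0 * s2 ** 2 / (n - 1)
    return s2, var_s2


# Hypothetical residuals in degrees Celsius.
s2, var_s2 = sample_variance_and_its_variance([1.0, -1.0, 1.0, -1.0, 0.0])
```

For these five residuals the mean is 0 and the sum of squares is 4, so s² = 4/4 = 1 and the estimated variance of s² is 2/4 = 0.5.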

When each sensor node has all the required values, it uses a Bayesian Network to determine whether its offset is correctly calibrated. If it is, another network is used to determine whether the slope is also correctly calibrated. An extensive explanation of how these networks work can be found in Section 3.4.

If either Bayesian Network determines that recalibration is needed, the sensor uses its own parameters and those of the consensus to adjust the new values of α and β. The formulas for this adjustment can be found in Section 3.5.

Implementation

Centralized

In the centralized case of this Thesis, there is a main ROS node in charge of running the centralized RANSAC algorithm and of receiving the measurements from all the ROS sensor nodes of the virtual network.

Communication between nodes is carried out through a set of ROS topics, one to receive the sensor measurements and another to send the parameters computed by the RANSAC algorithm. In addition, the node is a client of a ROS service that plots the results, although this is irrelevant to the operation of the system.

Once the RANSAC parameters have been computed, each ROS sensor node receives them and runs the corresponding Bayesian Networks with these data. Note that the network that calibrates the slope only works if the offset has been corrected beforehand. This requirement is easily met by running it only when the probability that β is miscalibrated is below 55%.

If either Bayesian Network determines that the sensor needs recalibration, the corresponding computations are carried out.

Decentralized

This case, which is the most interesting one for a physical implementation, has no central node. It therefore depends on correct communication between all the ROS sensor nodes and on the algorithms being built in a distributed way.

After each sensor node takes measurements and shares them with its adjacent nodes (this communication is given by a connectivity graph, assuming no packet loss), the sensor node with the lowest Id generates a set of hypotheses of the model of the monitored system and shares them with the rest. Each sensor node then adds its own hypotheses. The number of hypotheses shared is given by a set of computations that depend on the estimated structure of the network.

When all the sensors have the same measurement vector, they start a voting algorithm: vote vectors are generated and shared with the rest of the sensors, and these steps are repeated until the votes converge. For this step it is important that the network is connected most of the time, since otherwise the votes will not converge in the expected time.
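One simple way to picture the "share and repeat until convergence" step is flooding over the connectivity graph: each node keeps a vector with the votes it knows about and repeatedly merges it with those of its neighbours. The sketch below is an assumption made for illustration, not the De-RANSAC update rule itself, which is not detailed in this summary. The graph, the votes and the function name are hypothetical.

```python
from collections import Counter


def flood_votes(own_votes, neighbours, max_rounds=100):
    """Spread votes over a graph until every node's vector stops changing.

    own_votes[i]   index of the hypothesis chosen by node i
    neighbours[i]  list of nodes adjacent to node i
    Unknown votes are stored as -1, so merging is an element-wise max.
    Returns (known, rounds), where rounds counts the exchanges that
    changed at least one vector.
    """
    n = len(own_votes)
    known = {i: [own_votes[i] if j == i else -1 for j in range(n)]
             for i in range(n)}
    for rounds in range(max_rounds):
        updated = {
            i: [max([known[i][j]] + [known[k][j] for k in neighbours[i]])
                for j in range(n)]
            for i in range(n)
        }
        if updated == known:
            return known, rounds
        known = updated
    return known, max_rounds


# Hypothetical line network 0 - 1 - 2 - 3 voting between hypotheses 0 and 1.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
known, rounds = flood_votes([1, 1, 0, 1], neighbours)
winner = Counter(known[0]).most_common(1)[0][0]
```

On a connected graph every node ends up with the same vote vector after a number of exchanges equal to the graph diameter (three for this line of four nodes). If the graph is disconnected, some entries stay at -1, which mirrors the remark above that connectivity is needed for convergence.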

Once a most-voted hypothesis is available, a distributed least-squares algorithm fits this solution to its inliers. After that, a distributed algorithm computes the sample variance (and the variance of that sample variance).

With all the parameters computed in a distributed way, the sensor computes the Bayesian Networks and recalibrates itself in the same way as in the centralized case (since this step requires no further communication).


Experiments

This section presents a series of tests of the decentralized algorithm, both to show the reader how the algorithm behaves until it calibrates the required values and to highlight the limits of its current implementation.

The first experiment, a proof of concept, explains in detail a case where the β of a sensor is corrected. Step by step, it shows how the algorithm computes the consensus, runs least squares to refine it, executes the corresponding Bayesian Network and finally recalibrates the sensor. The result of this experiment was satisfactory.

The second and third experiments (cases 1 and 2) show more complex cases, where several magnitudes of one or more sensors are incorrect. The result of both is satisfactory, and the third in particular helps to understand how the values of α are calibrated after those of β.

The last two experiments discussed, the fourth and the fifth (cases 3 and 4), show the limitations of the algorithm. The fourth shows that the system is unable to calibrate sensors that are only slightly miscalibrated (values below the RANSAC threshold), while the fifth shows the drawback of using a sensor with a different variance from the rest (the system treats it as an error and modifies its α, thereby miscalibrating the sensor). These two cases raise new goals to be addressed after this Thesis, which are detailed in the last chapter.

Conclusions and Future Implementations

From the experiments carried out and the hypotheses formulated, it has been concluded that:

• The solution works satisfactorily as long as the requirements imposed on the system are met. Most of these already hold in industrial environments, which makes the solution sufficiently suitable a priori for such environments.

• When some sensors are slightly miscalibrated, they affect the consensus and are never corrected, which degrades the quality of the final measurement. When this happens, the probability that the sensor is miscalibrated is considered low, although the distance of its measurements to the consensus is larger than it should be, which opens the way to new solutions for this case. The lower the variance of the sensors, the less likely this phenomenon is.

• When a sensor has a different variance from the rest, its slope is modified even if it was already calibrated. As a result, the sensor no longer reacts correctly when the magnitude changes again. This problem is a challenge to be solved by including individual characteristics of each sensor.

Taking these conclusions into account, the following developments have been proposed to continue the research carried out in this Thesis:

• To correct sensors with a low calibration error (once a certain distance to the consensus has been identified, even with a low probability of miscalibration), a more complex calibration would be carried out through a step-by-step scheme in which the sensor nodes do the following: one by one, they remove their measurements from the consensus while the rest do not; then they compute the probability of being miscalibrated in that case (without their measurements included); and finally the sensor node with the highest probability performs a recalibration.

• To properly handle the case where a sensor node has a different variance from the rest, specific data from the datasheet of each node would be needed, so that each one has different metrics for its Bayesian Network. However, it is common in industry to buy sensors of the same type, so this problem should not generally arise.

• To implement this solution in real sensor networks, multiple safety measures must be added (tolerance to sensor shutdown, correct implementation of the distributed algorithm...). Likewise, a protocol must be added so that each sensor node has a unique Id at all times. Finally, it must be possible to extract the estimated magnitude from the system for use in industrial processes, so at least one of the nodes should have enough capability to communicate with the network to transmit this magnitude.

Contents

1 Introduction
    1.1 Motivation
    1.2 Objectives
    1.3 Structure of the Thesis

2 State of the Art
    2.1 Introduction
    2.2 Sensor Networks
    2.3 RANSAC
        Definition and Purpose
        De-RANSAC
    2.4 Bayesian Networks
        Notation
        Bayes' Theorem
        Bayesian Networks
    2.5 ROS
    2.6 Existing Methods for Sensor Node Self-calibration
    2.7 Conclusions

3 Description of the System
    3.1 Introduction
    3.2 Problem Formulation
    3.3 Notes about the RANSAC Algorithms
    3.4 The Bayesian Network
        Network Structure Description and Mean Calibration
        Variance Calibration
    3.5 The Recalibration Step
    3.6 Conclusions

4 Implementation
    4.1 Introduction
    4.2 Centralized Implementation
        RANSAC ROS Node
        Sensor Node
    4.3 Decentralized Implementation
        Sensor Node
    4.4 ROS Architecture
        Centralized System
        Decentralized System
    4.5 Conclusions

5 Experiments
    5.1 Introduction
    5.2 Proof of Concept
    5.3 Case 1: One Sensor Node has an Incorrect A, Another has an Incorrect B
    5.4 Case 2: A Sensor Node has both A and B Incorrectly set, Another has an Incorrect B
    5.5 Case 3: Nodes with Low Calibration Error that Alters the Final Result
    5.6 Case 4: A node has different σ_x^2 than the others
    5.7 Extensive Simulations
    5.8 Conclusions

6 Conclusions and Future Implementations
    6.1 Conclusions
        About the Overall Behavior of the System
        About the Presence of Low Calibration Error
        About the Presence of Sensors with a Different σ_x^2
    6.2 Future Implementations
        About the Presence of Low Calibration Error
        About the Presence of Sensors with a Different σ_x^2
        About the Implementation over a Real Sensor Network

7 Bibliography

1. Introduction

1.1 Motivation

Nowadays, huge oil and gas plants are installed all over the world. They are composed of complex systems of pipes and structures, and have a large number of sensor nodes installed to monitor the behavior of the plant.

This quantity of sensor nodes implies high maintenance costs for the industry, which spends approximately €56,000 million a year in this area. The deterioration of the sensor nodes, as well as unexpected failures in them, has a high impact on the overall behavior of the plant, because a significant number of processes require these measurements to function properly.

To repair possible errors in the sensor nodes, an operator usually has to calibrate or replace them manually, which is a very hazardous task and involves high-cost equipment to allow the worker to reach certain areas of the plant.

Due to the costs caused by maintenance tasks, interest in solutions for them has grown considerably in recent years. A proper solution would imply important economic savings for these industries.

This Thesis is framed in the AEROARMS project (AErial RObotic system integrating multiple ARMS, www.aeroarms-project.eu), which has the objective of developing the first collaborative aerial robotic system with multiple arms and advanced manipulation capabilities to be applied in inspection and maintenance activities in industrial plants.

In this project, the aerial robots are able to deploy permanent sensor nodes in different facilities of an industrial complex. For that purpose, the robots need to know where and when these sensor nodes need to be installed. To solve this, the current sensor nodes in the network should be able to communicate with each other in order to monitor the evolution of the measured magnitude (for instance, the temperature along the pipe they are installed in), and determine whether they are able to correct possible calibration errors that some sensor nodes may have. If they are not able to do so, the robots can be called to replace the miscalibrated nodes, or to install others in particular places, as exemplified in Fig. 1.1.

Figure 1.1: Concept of an aerial robot approaching a pipe for maintenance.

The AEROARMS project is a continuation of the work carried out in the ARCAS project (Aerial Robotics Cooperative Assembly System, www.arcas-project.eu), an FP7 European project in which the first aerial robots with arms of 6 and 7 degrees of freedom were developed.

The research carried out in this Bachelor's Thesis will serve as a part of the overall solution developed in the AEROARMS project.

1.2 Objectives

The aim of this Thesis is to develop decentralized techniques for autonomousself-calibration of sensor nodes in industrial sensor networks. These tech-niques must be consensus-based, meaning that they do not require any pat-tern sensor node, but the collaboration of all the sensor nodes in the network.

CHAPTER 1. INTRODUCTION 3

The solution needs also to be robust (suitable for di�erent calibration situ-ations) and should scale properly with the amount of senor nodes in thenetwork.

The sensor nodes will be autonomous and will be left unattended, mean-ing that no operator will approach the network to perform maintenance tasks.It is possible that an aerial robot comes to replace broken sensor nodes withnew ones, but this circumstance should be reduced as much as possible andshould only happen if a sensor can not be calibrated by the network (becauseits hardware is broken and there is no software solution for it).

In particular, the solution will be tested in a sensor network where each sensor node takes temperature measurements at a different location of the same pipe. These nodes may have calibration errors in both the slope and the mean, and the sensor network should be able to correct these errors.

1.3 Structure of the Thesis

This Bachelor's Thesis is structured as follows. First, Chapter 2 describes the State of the Art of this project, explaining the most relevant concepts and detailing how recent solutions address the problem of self-calibrating sensor nodes. Secondly, Chapter 3 gives an extensive explanation of both the aim of this Thesis and the different steps involved in it. Then, the two solutions used are explained in Chapter 4, giving programming details and implementation aspects that need to be taken into account.

After the solution has been explained, several experiments are conducted and described in Chapter 5, both to explain the behavior of the system and to show its limitations. Finally, some conclusions are provided in Chapter 6, together with the future work that has been proposed to overcome the unresolved problems that appear in Chapter 5.

Each chapter starts with an introduction, briefly explaining the different sections in it, and ends with some conclusions that summarize the most important facts learned.

2. State of the Art

2.1 Introduction

The research conducted in this Thesis is closely related to several topics, ranging from statistics and algorithms to sensor networks and methods designed for them. Thus, this chapter offers a brief introduction to the most important concepts underlying the system developed.

As the system is designed to be implemented in a Sensor Network, this concept is explained first, in Section 2.2, giving some examples of how the idea of distributing computations over different sensor nodes provides a good approach for many applications. Then, the RANSAC algorithms are explained: the most basic RANSAC, and also a decentralized version of it, are presented in Section 2.3. These algorithms are used to obtain a consensus of the measurements in a specific part of the overall solution. After that, in Section 2.4, a statistical tool (Bayesian Networks) is explained. This concept is used in the solution in order to increase the robustness of the calibration step.

Moreover, the ROS framework is described in Section 2.5. This tool is used to simulate a sensor network in order to easily implement the algorithms used in this Bachelor's Thesis. After that, some existing methods in the field of sensor networking and calibration are presented in Section 2.6.

Finally, a brief conclusion is given.

2.2 Sensor Networks

A Sensor Network is composed of different low-cost sensor nodes that have communication and processing capabilities. Each sensor node has a processor intended to perform simple calculations, and is usually able to measure various magnitudes. After taking the measurements, the sensor nodes communicate with each other to fit more complex models of the system, usually distributing the computation over the entire network so that no bottlenecks appear.

Usually, a single sensor node is not able to perform complex calculations. For instance, if several sensor nodes measured the distance between them and a robot, and then passed the distances to a central node that implements a filter, this node might have difficulties computing the required filters and algorithms, not only because there may be delays and failures in some of the connections, but also because the calculations may be too complex for a low-cost sensor node by itself. However, when the sensor nodes collaborate to implement these filters, the calculations are distributed and the network load is usually reduced, which makes it possible to use a cheaper system than the one needed in a centralized network (the cost of $N$ sensor nodes with $K/N$ computational power each is usually less than the cost of one sensor node with $K$ computational power).

A common example of a sensor network would be an environmental prediction system. Having only a couple of sensors in one particular place would not be enough to predict how the weather will evolve during the following days. However, when multiple stations are combined, and some subsequent calculations are made with the measurements, the system becomes able to detect patterns and predict the coming weather.

A more recent example would be a system of robots, where each robot has a certain amount of information about its own environment, but this may not be enough to accomplish certain complex tasks, such as transporting materials to an area with varying demands. Thus, each robot is required to communicate with the others in order to share relevant details that might otherwise not be accessible to all of them.

In this Bachelor's Thesis, maintaining the properties of a distributed sensor network is crucial, as the cost of the computations and the network load would otherwise be too high. A decentralized implementation of the algorithms used in this Thesis will be explained in Chapter 4, and the importance of this distributed approach will be underlined several times throughout the Thesis.


2.3 RANSAC

Definition and Purpose

First published by Fischler and Bolles [1], RANSAC (RANdom SAmple Consensus) is an algorithm that iteratively estimates the parameters of a mathematical model from a set of data that contains outliers (an outlier is an observation that does not fit the global model well, either due to an extreme noise value or because of a mistake during its sampling). An example of a fitting problem solved through RANSAC is presented in Fig. 2.1.

Figure 2.1: RANSAC-fitted line. The red dots are outliers and the blue ones are inliers. The fitted line is independent of the outliers' positions.

The interest of this algorithm lies in its ability to determine a consensus from a series of noisy samples while discarding highly erratic measurements. In a sensor network, one sensor node may be giving a value that is very far from the ideal value it should give. Moreover, the other sensor nodes have a certain amount of noise that may make it harder to determine which model best fits all the data. Through RANSAC, a proper model (with a high probability of being the best one) can be calculated, and all the measurements that are not close to it (those whose distance to the model is not less than a threshold τ) are ignored, which allows a better fit of all the inlying samples to be calculated.

It has to be noted that RANSAC is a randomized algorithm that iterates a fixed number of times in order to find the best possible answer among k starting hypotheses. Consequently, there is no guarantee that RANSAC will find the best solution. A further explanation of this can be found in [1].

RANSAC does not fit the model by itself: it needs to be complemented with a model-fitting algorithm, such as least squares fitting (LSQF) or Levenberg-Marquardt, and it also requires the structure of the model to be fixed beforehand (e.g. a logarithm, a line, a quadratic function, etc.).

Algorithm 2.1 shows a typical implementation of RANSAC. It is based on [1], with some modifications to improve understandability.

Algorithm 2.1: RANSAC Implementation

function bestfit = RANSAC(data, model, n, k, t, d)
% n is the minimal amount of data required to fit a model
% k is the number of RANSAC iterations
% t is the threshold that divides inliers and outliers
% d is the minimum number of close data values for a good model
% Let length(something) be the amount of points in the set something
bestfit = NULL (empty structure of parameters)
besterror = inf (or something really large)
for iterations = 1 to k
{
    maybeinliers = n values selected randomly from data
    maybemodel = model parameters resulting from fitting
                 maybeinliers (with least squares or similar)
    alsoinliers = NULL
    for every point in data and not in maybeinliers
    {
        if distance of point to model maybemodel < t
            add point to alsoinliers
    }
    if length(alsoinliers) > d
    {
        bettermodel = model parameters resulting from fitting
                      alsoinliers and maybeinliers (with least squares or
                      similar)
        thiserror = a measurement of how well the model fits these
                    points (e.g. the mean squared residual)
        if thiserror < besterror
        {
            bestfit = bettermodel
            besterror = thiserror
        }
    }
}
end
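As an illustration, the steps of Algorithm 2.1 can be sketched in Python for the particular case of fitting a straight line. This is only a minimal sketch under stated assumptions: the helper names (`fit_line`, `residuals`, `ransac`), the use of NumPy, the choice of the mean squared residual as error measure and the synthetic data are all illustrative and are not taken from [1] or from the implementation of this Thesis.

```python
import numpy as np


def fit_line(points):
    """Least-squares fit of y = m*x + c to an (N, 2) array; returns (m, c)."""
    x, y = points[:, 0], points[:, 1]
    design = np.column_stack([x, np.ones_like(x)])
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params


def residuals(params, points):
    """Absolute vertical distance of each point to the line."""
    m, c = params
    return np.abs(points[:, 1] - (m * points[:, 0] + c))


def ransac(data, n, k, t, d, rng=None):
    """n, k, t and d have the same meaning as in Algorithm 2.1."""
    rng = np.random.default_rng() if rng is None else rng
    best_fit, best_error = None, np.inf
    for _ in range(k):
        # Draw n random samples and fit a tentative model to them.
        idx = rng.choice(len(data), size=n, replace=False)
        maybe_model = fit_line(data[idx])
        # Points outside the sample that lie closer than t to the model.
        rest = np.ones(len(data), dtype=bool)
        rest[idx] = False
        also_inliers = rest & (residuals(maybe_model, data) < t)
        if also_inliers.sum() > d:
            # Refit with the sampled points plus the additional inliers.
            consensus = data[also_inliers | ~rest]
            better_model = fit_line(consensus)
            this_error = np.mean(residuals(better_model, consensus) ** 2)
            if this_error < best_error:
                best_fit, best_error = better_model, this_error
    return best_fit


# Synthetic example (assumed values): a line y = 2x + 1 with five gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, x.size)
y[::10] += 20.0
data = np.column_stack([x, y])
model = ransac(data, n=2, k=100, t=0.5, d=30, rng=rng)
```

In this sketch the outliers never enter an accepted consensus set, so the returned slope and intercept stay close to the values used to generate the inliers.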

De-RANSAC

Because of its implementation, RANSAC is a centralized method in which all data is collected beforehand and a central processing node then performs the calculations of the algorithm. This is an issue for sensor networks, which, as stated in Section 2.2, may be composed of several sensor nodes, each with little computational capacity. There is no guarantee that the central node (if it exists) receives information from all the sensors, and the transmission delays (e.g. through flooding [2]) grow as the number of sensor nodes increases.

A new approach is therefore required, in which the calculations are distributed over the entire network and the measurements do not all need to be gathered before the algorithm starts.

Introduced in [3], De-RANSAC (DEcentralized RANSAC or DeRANSAC) is a set of algorithms and protocols used to implement RANSAC over a distributed system. It requires synchronization between the available sensor nodes during each step, but allows them to perform RANSAC directly without first having to communicate with all the others. At the end, a voting system makes all the sensor nodes converge to a particular solution; however, this solution may not be exactly the same in every node, as the available connections have an effect on the final result.

Each sensor node creates a certain number of De-RANSAC hypotheses, and this number increases with the number of sensor nodes that this node can communicate with (if sensor node A has B, C and D in range, it will produce more hypotheses than if it only had B and C in range).

If every sensor has access (through flooding) to all the measurements across the network, it is proven that the De-RANSAC solutions are the same as those of the classical RANSAC implementation.

The De-RANSAC flowchart is depicted in Fig. 2.2.

Figure 2.2: De-RANSAC Algorithm as stated in [3].


2.4 Bayesian Networks

Notation

It is important to understand that $P(A) \equiv P(A = a)$, which means that the probability of observing the event A is defined as the probability of A taking a previously known value a. For instance, when throwing a die, the probability $P(B)$ means the probability of the die taking a specific value, which is usually in the set {1, 2, 3, 4, 5, 6}. This makes it easier to generalize over all the possible hypotheses. If one needed to particularize the case $B = 2$, it would be enough to write $P(B = 2)$.

Bayes' Theorem

Although simple, Bayes' Theorem (or Bayes' Rule) is one of the most commonly used theorems in probability theory. It states that:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{2.1}$$

Also, from the Kolmogorov definition, it follows that:

$$P(B \mid A)\,P(A) = P(B \cap A) \tag{2.2}$$

In other words, this theorem affirms that the probability of the event A being true, given that the event B has been observed and confirmed, is the probability of both events being true at the same time, divided by the probability of B being true.
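A small worked example may help to fix ideas. The following Python sketch applies (2.1) and (2.2) to the throw of a die, assuming a fair six-sided die, with A being "the die shows 2" and B being "the die shows an even value". The function names are purely illustrative.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so every outcome has probability 1/6.
outcomes = range(1, 7)
p = {o: Fraction(1, 6) for o in outcomes}


def prob(event):
    """Probability of the set of outcomes for which event(o) is true."""
    return sum(p[o] for o in outcomes if event(o))


def is_two(o):      # event A
    return o == 2


def is_even(o):     # event B
    return o % 2 == 0


p_a = prob(is_two)
p_b = prob(is_even)
p_a_and_b = prob(lambda o: is_two(o) and is_even(o))

p_b_given_a = p_a_and_b / p_a              # Kolmogorov definition, (2.2)
p_a_given_b = p_b_given_a * p_a / p_b      # Bayes' Theorem, (2.1)
```

Here `p_a_given_b` evaluates to 1/3: once the value is known to be even, only three outcomes remain possible and A is one of them.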

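Neither Bayes' Theorem nor the Kolmogorov definition requires A and B to be independent; they only require the conditioning event to have a nonzero probability. In the special case where A and B are independent, the value of A does not alter in any form the value of B, and vice versa, so that $P(A \mid B) = P(A)$.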

Bayesian Networks

Based on Bayes' Theorem, these networks are a graphical approach to defining relationships between several random (or unknown) variables that might, or might not, be conditionally dependent on others. The graph of a Bayesian Network must be directed and acyclic. An example of a Bayesian Network can be found in Fig. 2.3.


Figure 2.3: An example of a Bayesian Network concerning academic matters. It is based on examples shown in [4].

In the above example, a Bayesian Network concerning the abilities and results of a student is shown. On the one hand, the intelligence of the student conditions (but does not fully determine) the mark obtained in the SAT exam (a common exam in the USA). On the other hand, regarding a particular subject, the student's intelligence combined with the difficulty of the course conditions the final grade achieved. Depending on this grade, the teacher of the subject would be more or less likely to write a recommendation letter for the student, if asked.

Even though no relationship in this graph is deterministic, the results are strongly conditioned by the variables they depend on. For instance, if a student is intelligent and the course is easy, the student is very likely to score high. However, there are many factors that remain unknown, such as the attitude of the student towards this particular subject, or the family situation of the student during that semester. Thus, the relationship between conditions and results remains statistical.

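Using Bayes' Theorem, it is possible to determine the probability of observing a particular set of events in the network:

$$P(D, I, G, S, L) = P(D)\,P(I)\,P(G \mid I, D)\,P(S \mid I)\,P(L \mid G), \tag{2.3}$$

where D is the difficulty of the course, I the intelligence of the student, G the grade, S the SAT mark and L the recommendation letter.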

This probability chain, combined with calculations derived from the independence relationships between variables, can lead to expressions that are applicable to a specific problem, where some events are known (for instance, the grade and intelligence of the student) and others need to be determined (the difficulty of the course).
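The following Python sketch shows how the factorization (2.3) can be used to infer the difficulty of the course from a known grade and intelligence. All the variables are assumed to be binary and every probability value in the tables is hypothetical, chosen only for illustration; none of them comes from [4] or from this Thesis.

```python
from itertools import product

# Hypothetical conditional probability tables (all variables binary).
p_d = {0: 0.6, 1: 0.4}                 # P(D): 1 = difficult course
p_i = {0: 0.7, 1: 0.3}                 # P(I): 1 = intelligent student
p_g1 = {(0, 0): 0.5, (0, 1): 0.2,      # P(G = 1 | I, D), keyed by (i, d)
        (1, 0): 0.9, (1, 1): 0.6}      # 1 = high grade
p_s1 = {0: 0.05, 1: 0.8}               # P(S = 1 | I): 1 = high SAT mark
p_l1 = {0: 0.1, 1: 0.9}                # P(L = 1 | G): 1 = letter written


def bern(p_one, value):
    """Probability of a binary value given the probability of value 1."""
    return p_one if value == 1 else 1.0 - p_one


def joint(d, i, g, s, letter):
    """Joint probability following the factorization in (2.3)."""
    return (p_d[d] * p_i[i] * bern(p_g1[(i, d)], g)
            * bern(p_s1[i], s) * bern(p_l1[g], letter))


def posterior_difficulty(g, i):
    """P(D | G = g, I = i), marginalizing over the unobserved S and L."""
    unnorm = {d: sum(joint(d, i, g, s, letter)
                     for s, letter in product((0, 1), repeat=2))
              for d in (0, 1)}
    z = unnorm[0] + unnorm[1]
    return {d: unnorm[d] / z for d in (0, 1)}


total = sum(joint(*values) for values in product((0, 1), repeat=5))
post = posterior_difficulty(g=1, i=1)
```

With these assumed tables, observing a high grade from an intelligent student lowers the probability of the course being difficult from 0.4 to roughly 0.31.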

A more developed example, designed for this Thesis, will be shown in the next chapter.

2.5 ROS

ROS (the Robot Operating System), as described in [5], is a software framework designed for the fast implementation of robotics software. It has several tools and libraries that make it easier to create complex programs such as those needed in robotic systems. The community behind ROS has created a wide variety of applications and specific programs, which can be used to avoid tedious implementation work needed in many projects. Furthermore, it is through the ROS ecosystem that many engineers and hobbyists share state-of-the-art algorithms, so that one can add them to a project in a simple, extensible way. These contributions are distributed as packages, which are the standard unit of organization of ROS projects.

Apart from the community, an appeal of ROS is that it makes implementing parallel programs easy and fast. It divides the software into "nodes". Each ROS node has its own name and ID, and is capable of communicating with the others through "topics" or "services". Further information on this can be found in [5].

When a ROS node subscribes to a particular ROS topic (subscriber node), it receives all the messages that are published there by other ROS nodes (publisher nodes). These can be handled immediately or at a later moment: messages are queued until the callback of the ROS node is invoked. The length of the queue is defined by the user to fit the particular implementation.
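The queueing behavior described above can be illustrated with a toy model written in plain Python. This is not the ROS API: the classes `ToyTopic` and `ToySubscriber` are hypothetical, and the sketch assumes that, when the queue is full, the oldest message is the one discarded.

```python
from collections import deque


class ToySubscriber:
    """Toy stand-in for a subscriber with a bounded message queue."""

    def __init__(self, callback, queue_size):
        self.callback = callback
        # A deque with maxlen silently drops the oldest entry when full.
        self.queue = deque(maxlen=queue_size)

    def spin_once(self):
        """Invoke the callback on every message currently queued."""
        while self.queue:
            self.callback(self.queue.popleft())


class ToyTopic:
    """Toy stand-in for a topic with any number of subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback, queue_size):
        subscriber = ToySubscriber(callback, queue_size)
        self.subscribers.append(subscriber)
        return subscriber

    def publish(self, message):
        # Every subscriber receives its own copy of the message.
        for subscriber in self.subscribers:
            subscriber.queue.append(message)


received = []
topic = ToyTopic()
sub = topic.subscribe(received.append, queue_size=2)
for value in (20.1, 20.4, 20.9):   # three messages arrive before the callback runs
    topic.publish(value)
sub.spin_once()
```

With a queue of length 2, only the last two of the three published values reach the callback, which shows why the queue length has to be chosen according to how often the callback can run.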

As for ROS services, they are functions that one ROS node (the client) calls on another (the server). They return some parameters if the service is programmed that way; otherwise they return nothing.

A simple example of a ROS implementation is shown in Fig. 2.4. It represents the different ROS nodes of a ground robot. There are four in this case: a sensing ROS node that takes measurements of the position of the robot, an odometry ROS node that implements filters and makes decisions with the sensed measurements, a controller ROS node that drives the different motors of the robot, and a recorder ROS node that saves all the information of the sensors in a file.

The sensing ROS node publishes the measured distances to the /measurements topic. All the ROS nodes subscribed to this topic (the odometry ROS node and the recorder ROS node) receive all the values obtained by the sensing ROS node. Whereas the recorder ROS node only saves these values and does nothing more, the odometry ROS node estimates the position of the robot and makes requests as a client to the controller ROS node in order to change some values, such as the speed reference of the motors. In this case, the server at the controller ROS node gives no reply; however, it could have been programmed to send an OK response if the request was completed successfully.

Figure 2.4: Diagram representing different ROS nodes and the communications between them.

It is important to note that one ROS topic may have multiple publishers and subscribers, whereas each ROS service is provided by a single server, which can receive multiple requests.


This software is used in this Bachelor's Thesis to simulate the sensor network and its connections easily.

2.6 Existing Methods for Sensor Node Self-calibration

An ideal sensor node would have a linear relationship between the analog or digital value it measures and the real magnitude (e.g. temperature or pressure). This relationship would be determined simply by the sensor's sensitivity. However, the sensing process has many sources of error, the two most common being offset errors (constant values added to the ideal measurement) and gain errors (values that change linearly with the ideal measurement). Others include non-linearity errors and frequency errors.

To correct these errors, many algorithms and methods have been developed. Some focus on the fast calibration of sensors from a known set of measurements and real values; for instance, [6] describes a highly optimized algorithm to correct non-linearity errors. Others, such as [7], focus on sensor nodes that interact with each other to calibrate a specific one so that it fits the model determined by the others.

The problem with [7] is that it does not take possible outlying sensor nodes into account, so the final estimation can be highly affected by them.

If some isolated unusual measurements are caused not by calibration issues but by anomalous failures or attacks on the network, other methods may be applied, such as [8], which employs a previously trained Restricted Boltzmann Machine to detect anomalies in a network.

However, even though these methods are proven to work well for some specific applications, a different approach is required when outliers are likely to be present in the network. Moreover, these methods only aim to calibrate the sensor, and do not provide the probability of the sensor being uncalibrated, which is fundamental in this Thesis.

Methods like RANSAC may fail to calculate an optimal consensus model if the measurements are too contaminated with outliers. Moreover, the number of hypotheses is calculated statistically, and thus the algorithm is not likely to calculate the same model if executed twice. To overcome this, an optimal RANSAC algorithm is proposed in [9], which is stated to work properly even if the proportion of inliers is below 50%. However, this method is centralized and thus not suitable for implementation in a sensor network.

2.7 Conclusions

Having explained all the basic concepts, as well as the state-of-the-art techniques, the following conclusions are reached:

• Even with a handful of self-calibration methods available, none of them makes it possible to determine the probability of a sensor being calibrated. Thus, a statistical approach should be taken to solve this issue.

• The presence of outliers can lead many non-robust algorithms, such as least squares fitting, to unsuccessful solutions due to the high error of these outliers.

• It is possible to distribute the computation of the RANSAC algorithm. That makes it more appealing to combine it with the Bayesian Network implementation for the system developed in this Thesis, as its calculations are also decentralized.

3. Description of the System

3.1 Introduction

In this chapter, the theoretical design of the system is explained, as well as several facts that need to be settled before the practical implementation.

First, the problem is formulated in Section 3.2, explaining how the measured magnitude behaves and which steps are taken from the sensing of the magnitude to the calibration of the sensor nodes. Then, some notes about the behavior of the implemented RANSAC and De-RANSAC algorithms and their parameters are given in Section 3.3. However, that section does not cover the implementation of these algorithms, as they are described in Chapter 4, in specific sections for each approach (centralized and distributed). Section 3.3 also provides some calculations that make it possible to quantify the randomness of the consensus model, as this will be needed in Section 3.4.

Later, the Bayesian Networks implemented for both α and β calibrations are presented in Section 3.4. This section explains the calculations used to theoretically determine the final expression derived from the Bayesian Networks and how to solve it for each case. After that, Section 3.5 describes the recalibration step and its arithmetic.

Lastly, some conclusions on this design are provided, pointing out the most noticeable facts.


3.2 Problem Formulation

The problem addressed in this Thesis involves an industrial system with a magnitude (temperature, pressure, flow...) that evolves (due to different causes) along the extension of that system. This evolution is measured by a series of sensors and is described by a model function. This model has a series of parameters {a, b, c...} that define its behavior.

To make things clearer, this general problem is particularized to a case where temperature is the measured magnitude.

Assume there is a pipe with a blow dryer emitting hot air at its beginning. Let the following equation govern the evolution of the temperature along that pipe:

$$x = a + b \log(d), \tag{3.1}$$

where a and b take fixed values during the time the algorithm is executed (which means that the global model changes much more slowly than the time needed to take all the measurements and perform recalibration using it), and d is the distance from the blow dryer. x represents the overall state of the system, which is the evolution of the temperature along the pipe.

Let there be n sensor nodes, $\{S_1, S_2, \ldots, S_n\}$. Each one is positioned at a distance $d_i$ from the blow dryer, so ideally its measurement would be:

$$x_i = a + b \log(d_i), \tag{3.2}$$

where $x_i$ is the state of the system at the position $d_i$.

Assume the ideal measurement is affected by different sources of error (and no others than the ones mentioned here), resulting in the following equality:

$$z_i = A(x_i + \eta) + B, \tag{3.3}$$

where:

• $x_i$ represents the ideal measurement of the sensor node i, that is, the one that would be obtained if there were no sources of error at all. This is the value ideally obtained during transduction.

• η is a random variable that summarizes the errors that occur during the sensing step (inside the transducer), and it is modeled as a Gaussian with mean 0 and variance $\sigma_x^2$ (as can be deduced from the Central Limit Theorem). All the sensor nodes are assumed to have the same $\sigma_x^2$ (this simplifies the calculations, as the datasheets of the sensor nodes need not be taken into account).

• A represents the error in the slope, resulting from an improper calibration. Its value would ideally be 1. It affects both $x_i$ and η. This value appears during the signal conditioning and processing step.

• B represents the error in the offset, also caused by an improper calibration, and its value would ideally be 0. This value appears during the signal conditioning and processing step.

With A and B correctly set, the measurements of the sensor nodes should behave similarly to those in Fig. 3.1.

Figure 3.1: Example of the formulated problem, where a = 52 and b = -12.

If the A or B value of a sensor node differed from its proper value, the resulting measurements would not fit the evolution of the magnitude. For instance, Fig. 3.2 shows a case where sensor node 2 has a different A value and sensor node 4 has a different B value.

Figure 3.2: Example of the formulated problem, where a = 52 and b = -12.
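The measurement model in (3.2) and (3.3) can be simulated with a few lines of Python. The sketch below is only illustrative: the function names, the sensor positions and the values chosen for A, B and $\sigma_x$ are assumptions, and the logarithm is taken to be the natural one, since its base is not specified above. The values a = 52 and b = -12 are those of Figs. 3.1 and 3.2.

```python
import numpy as np


def ideal_temperature(d, a, b):
    """Ideal state of the system at distance d, following (3.2)."""
    return a + b * np.log(d)       # natural logarithm (assumption)


def measure(x, A, B, sigma_x, rng):
    """Measurement affected by slope, offset and sensing noise, following (3.3)."""
    eta = rng.normal(0.0, sigma_x, size=np.shape(x))
    return A * (x + eta) + B


rng = np.random.default_rng(1)
distances = np.array([1.0, 2.0, 4.0, 8.0])    # hypothetical sensor positions
x = ideal_temperature(distances, a=52.0, b=-12.0)

# Well-calibrated sensors: A = 1 and B = 0, only the sensing noise remains.
z_good = measure(x, A=1.0, B=0.0, sigma_x=1.0, rng=rng)

# Miscalibrated sensors (noise switched off to isolate the effect of A and B).
z_bad = measure(x, A=1.2, B=3.0, sigma_x=0.0, rng=rng)
```

Plotting `z_good` and `z_bad` against `distances` gives curves of the kind sketched in Figs. 3.1 and 3.2, respectively.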

Once the measurement $z_i$ is obtained, it is multiplied by a value α, and a constant β is then added to it; both values are set by the user. They make it possible to recalibrate the sensor node without altering the sensing step. Prior to calibration, α = 1 and β = 0. The implementation of this step would be:

$$x = z_i\,\alpha + \beta \tag{3.4}$$

Given all these assumptions, the objective of this Bachelor's Thesis is to find:

$$\alpha, \beta = \underset{\alpha,\beta}{\arg\min}\; \left( x_i - (z_i\,\alpha + \beta) \right), \tag{3.5}$$

for each sensor node i.
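To see what the sought values look like, it may help to consider the idealized noise-free case ($\eta = 0$), which is an assumption made here only for illustration. Inverting (3.3) and comparing the result with (3.4) gives:

```latex
\begin{align*}
z_i &= A\,x_i + B
  && \text{(3.3) with } \eta = 0 \\
x_i &= \frac{1}{A}\,z_i - \frac{B}{A}
  && \text{solving for } x_i \\
\alpha &= \frac{1}{A}, \qquad \beta = -\frac{B}{A}
  && \text{comparing with (3.4)}
\end{align*}
```

For instance, a sensor node with A = 1.25 and B = 2 would be corrected with α = 0.8 and β = -1.6. In practice A, B and $x_i$ are unknown, which is why the consensus and the recalibration step described in the following sections are needed.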

A representation of the implemented solution and its steps is given in Fig. 3.3.


Figure 3.3: Flowchart representing the sensing, processing and calibration steps, plus all the intermediate ones.


To be able to find α and β, all the nodes first calculate a consensus model through a RANSAC algorithm (in the sensor network it is the decentralized one; in the simulations both cases are implemented), which yields the parameters a and b that describe the evolution of the temperature along the pipe. The general assumptions and characteristics that both RANSAC implementations share are explained in Section 3.3.

After computing RANSAC, and thus having the values a and b, each node has to decide whether it should recalibrate its parameters α and β. To discriminate between being properly calibrated or not, the sensor node first computes a Bayesian Network that shows whether there is a high probability that β is incorrect, and then (once β has the right value) computes a second Bayesian Network that tells whether α is probably incorrect. This step and its calculations are described in Section 3.4.

If the networks determine that α or β needs to be recalibrated, their new values are computed from the parameters resulting from the RANSAC step and the sensor's measurements. The calculations are explained in Section 3.5.

3.3 Notes about the RANSAC Algorithms

This section explains general facts that apply to both the RANSAC and De-RANSAC algorithms. Their implementation (with further information on each step) will be explained in Sections 4.2 and 4.3.

The consensus system will attempt to fit a logarithmic function to the measurements obtained (the one in (3.1)). As the model is linear in its parameters (there is no parameter inside the logarithm), it can be fitted using least squares. The threshold τ used to determine whether a sample is an inlier is 3σ, taking the typical σ of the sensor nodes to be 1 °C. This threshold is chosen because, for Gaussian distributions, the interval (µ ± 3σ) contains about 99.7% of the measurements.

The inclusion of inliers is sensor-wise: if a sensor node has more outlying measurements than inlying ones, all of its measurements are taken as outliers. This prevents sensor nodes with a medium calibration error (inside the interval (µ ± 3σ), but still uncalibrated) from affecting the overall consensus. A better approach to this issue is described in the future work section (Chapter 6).
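The two ideas above, the least squares fit of the model in (3.1) and the sensor-wise inlier rule with τ = 3σ, can be sketched in Python as follows. The function names, array layout and test values are assumptions made for illustration, the logarithm is assumed to be natural, and the sketch does not reproduce the implementations of Chapter 4.

```python
import numpy as np


def fit_log_model(d, z):
    """Least-squares estimate of (a, b) in z ~ a + b*log(d)."""
    # The model is linear in a and b, so a linear solver is enough.
    design = np.column_stack([np.ones_like(d), np.log(d)])
    (a, b), *_ = np.linalg.lstsq(design, z, rcond=None)
    return a, b


def sensorwise_inliers(d, z, a, b, tau):
    """Boolean inlier mask for z of shape (n_sensors, n_samples).

    A sample is an inlier if it lies closer than tau to the model. If a
    sensor has more outlying than inlying samples, all of them are discarded.
    """
    residual = np.abs(z - (a + b * np.log(d))[:, None])
    mask = residual < tau
    majority_out = mask.sum(axis=1) < (~mask).sum(axis=1)
    mask[majority_out] = False
    return mask


rng = np.random.default_rng(2)
d = np.array([1.0, 2.0, 4.0, 8.0])            # hypothetical sensor positions
clean = 52.0 - 12.0 * np.log(d)
a, b = fit_log_model(d, clean)                # recovers a = 52 and b = -12

# Twenty noisy samples per sensor (sigma = 1); sensor 2 has an offset error.
z = clean[:, None] + rng.normal(0.0, 1.0, size=(4, 20))
z[2] += 10.0
mask = sensorwise_inliers(d, z, a, b, tau=3.0)
```

In this example every sample of the sensor with the offset error is rejected, while almost all the samples of the remaining sensors are kept as inliers.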


To use the consensus model in the Bayesian Network calculations, an estimator of its randomness is needed. The way to compute the sample variance of the RANSAC model is explained in [10]: using the consensus and the measurements of each sensor node, it can be estimated as follows:

$$S^2 := \sigma^2 \approx \frac{1}{n-2} \sum_{i=1}^{n} (y_i - y_{fit})^2, \tag{3.6}$$

where n is the number of measurements that are inliers, and n − 2 is the number of degrees of freedom (number of data points minus number of parameters estimated using LSQF).

For the second Bayesian Network, the variance of this sample variance is needed (it is a measure of how well the variance captures the randomness of the consensus). For that purpose, the approach in [11] is combined with [10] to reach an estimation of this magnitude:

$$\mathrm{Var}(S^2) := \sigma_4 = \frac{1}{n}\left(\mu_4 - \frac{n-4}{n-2}\,\mu_2^{2}\right), \tag{3.7}$$

where:

$$\mu_k = \frac{\sum_{i=1}^{n} (y_i - y_{fit})^k}{n}, \tag{3.8}$$

with n being the number of inliers.

These two terms ($S^2$ and $\mathrm{Var}(S^2)$) also need to be computed for the measurements of each sensor node alone (without the consensus). To do this, the value $(y_i - y_{fit})$ is replaced by $(y_i - \mathrm{mean}(y))$, n becomes the number of measurements taken by the sensor node, and the values (n − 2) and (n − 4) become (n − 1) and (n − 3), respectively.
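The estimators (3.6) to (3.8), together with the substitutions just described for a single sensor node, could be computed as in the following minimal Python sketch. The function name `variance_stats` and its `n_params` argument are hypothetical: `n_params` is 2 when the residuals are taken about the fitted model and 1 when they are taken about the sensor's own mean.

```python
import numpy as np


def variance_stats(residuals, n_params):
    """Return the sample variance and the variance of the sample variance.

    residuals: values (y_i - y_fit), or (y_i - mean(y)) for a single sensor.
    n_params:  2 for the consensus model, 1 for the sensor's own mean.
    """
    r = np.asarray(residuals, dtype=float)
    n = r.size
    dof = n - n_params                    # n - 2, or n - 1 for a single sensor
    s2 = np.sum(r ** 2) / dof             # (3.6)
    mu2 = np.mean(r ** 2)                 # (3.8) with k = 2
    mu4 = np.mean(r ** 4)                 # (3.8) with k = 4
    # (dof - 2) / dof equals (n - 4)/(n - 2), or (n - 3)/(n - 1) for a sensor.
    var_s2 = (mu4 - (dof - 2) / dof * mu2 ** 2) / n    # (3.7)
    return s2, var_s2


# Toy residuals with zero mean, so they serve for both cases.
toy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
s2_fit, var_fit = variance_stats(toy, n_params=2)
s2_own, var_own = variance_stats(toy, n_params=1)
```

For these toy residuals the sample variance is 1.5 about the model and 1.2 about the sensor's own mean, the only difference being the number of degrees of freedom.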

3.4 The Bayesian Network

Network Structure Description and Mean Calibration

Description of the used Bayesian Network

The Bayesian Network implemented for both mean and variance calibration is structured as in Fig. 3.4.


Figure 3.4: Structure of the adopted Bayesian Network.

It is composed of a series of independent variables and dependent variables derived from them. The following notation will be used for mean calibration (the meanings of the variables change when calibrating variance):

• Let $S_{i,t}$ be the state of the sensor node i. This independent unknown variable is modeled as a random variable with two possible states:

    • $P(S_{i,t} = 1)$ models the probability of the sensor node being correctly calibrated (depending on the case, this means calibration in mean or in variance, as will be seen in the next section).

    • $P(S_{i,t} = 2)$ models the probability of the sensor node being incorrectly calibrated (again, this may refer to mean calibration or variance calibration).

• Let $X_t$ be the state of the system. If the network is determining a parametrized function h (which is the case in this Thesis), the value $X_t$ represents the parameters and the randomness of that function, so that h(k) is the value of the consensus at the position of the sensor node k.

• Lastly, let $Z_{i,t}$ be the value measured by the sensor node i.

Having defined all the corresponding variables, the Bayesian Network has the following probability distribution:

$$P(S_{1,t}, S_{2,t}, \ldots, S_{n,t}, Z_{1,t}, Z_{2,t}, \ldots, Z_{n,t}, X_t) = P(X_t) \prod_{i=1}^{n} P(S_{i,t})\,P(Z_{i,t} \mid S_{i,t}, X_t) \tag{3.9}$$

One can then marginalize over every sensor node that is not k, thus leading to the expression:

$$P(S_{k,t}, Z_{k,t}, X_t) = P(X_t)\,P(S_{k,t})\,P(Z_{k,t} \mid S_{k,t}, X_t), \tag{3.10}$$

which will be used in a following calculation.
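The marginalization step can be made explicit as follows. This is a sketch that assumes the states $S_{j,t}$ are discrete and the measurements $Z_{j,t}$ are continuous, so that sums are used for the former and integrals for the latter:

```latex
\begin{align*}
P(S_{k,t}, Z_{k,t}, X_t)
  &= P(X_t)\,P(S_{k,t})\,P(Z_{k,t} \mid S_{k,t}, X_t)
     \prod_{j \neq k} \left[ \sum_{S_{j,t}} P(S_{j,t})
     \int P(Z_{j,t} \mid S_{j,t}, X_t)\, dZ_{j,t} \right] \\
  &= P(X_t)\,P(S_{k,t})\,P(Z_{k,t} \mid S_{k,t}, X_t)
\end{align*}
```

Each bracket equals 1, because the integral of a conditional density over all its values is 1 and the probabilities of the two states of $S_{j,t}$ add up to 1, so the factors of every sensor node other than k drop out.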

Given that the aim of this network is to calculate the probability of the different states of the sensor k from the obtained measurement $Z_{k,t}$ and the consensus $X_t$, an equation is proposed using Bayes' Theorem:

$$P(S_{k,t} \mid Z_{k,t}, X_t) = \frac{P(S_{k,t})\,P(Z_{k,t}, X_t \mid S_{k,t})}{P(Z_{k,t}, X_t)}, \tag{3.11}$$

but the result of this distribution cannot yet be determined. Using (3.10), and marginalizing over $S_{k,t}$, the following expression is obtained:

$$P(Z_{k,t}, X_t) = P(X_t)\,P(Z_{k,t} \mid X_t), \tag{3.12}$$

where $P(X_t)$ still remains unresolvable.

Regarding the numerator in (3.11), one can use the Kolmogorov definition (note: $P(A \cap B) \equiv P(A, B)$) to reach the following expression:

$$P(Z_{k,t}, X_t \mid S_{k,t}) = P(X_t)\,P(Z_{k,t} \mid S_{k,t}, X_t) \tag{3.13}$$

Then, using (3.13) and (3.12) in the numerator and denominator of (3.11), respectively, the following result is obtained:

$$P(S_{k,t} \mid Z_{k,t}, X_t) = \frac{P(S_{k,t})\,P(Z_{k,t} \mid S_{k,t}, X_t)}{P(Z_{k,t} \mid X_t)} \tag{3.14}$$


With this equation, one can calculate the probability of every state of the sensor. The different probability distributions involved are explained next.

Calculations in the implementation

The values $P(S_{k,t})$, $P(Z_{k,t} \mid S_{k,t}, X_t)$ and $P(Z_{k,t} \mid X_t)$ are needed in order to obtain the final result.

The first and last elements are simple to calculate. For $P(Z_{k,t} \mid X_t)$, the expression $P(Z_{k,t} \mid S_{k,t}, X_t)$ is marginalized over $S_{k,t}$, which means summing $P(Z_{k,t} \mid S_{k,t} = 1, X_t)$ and $P(Z_{k,t} \mid S_{k,t} = 2, X_t)$. As for $P(S_{k,t})$, its value needs to be estimated by the designer of the system through an analysis of the installed sensors. If one in every ten sensors is usually uncalibrated, $P(S_{k,t} = 1) = 0.9$ (this is the value used in this Thesis, for simplicity). Alternatively, one could use a time-variant value for this probability, or even a sensor-wise constant, if different sensors are used or varying test conditions affect the network.

The probability $P(Z_{k,t} \mid S_{k,t}, X_t)$ requires further analysis:

• The value $P(Z_{k,t} \mid S_{k,t} = 1, X_t)$ corresponds to the probability of obtaining the measurement $Z_{k,t}$ provided that sensor k is well calibrated and given the consensus of the system (and its variance). The probability density function of obtaining a measurement enclosed in an interval is modeled as a Gaussian with $\mu = h(x)$ and $\sigma^2 = \sigma_h^2$. Thus, the probability of obtaining the single measurement $Z_{k,t}$ (which is enclosed in an infinitesimal interval) is $\rho_{h\_distrib}(Z)\,dZ$.

• The value $P(Z_{k,t} \mid S_{k,t} = 2, X_t)$ covers the second case. When the sensor is not properly calibrated, it will not give a measurement that corresponds to the consensus; instead, the measurement fits the sensor's own probability density function, with $\mu$ and $\sigma$ calculated from the history of measured values. Thus, the probability of obtaining the single measurement $Z_{k,t}$ is $\rho_{own\_distrib}(Z)\,dZ$.

Once those values are available, they are combined through (3.14) to calculate the final solution. For instance, the case $P(S_{k,t} = 1 \mid Z_{k,t}, X_t)$ results in:

$$P(S_{k,t} = 1 \mid Z_{k,t}, X_t) = \frac{0.9\,\rho_{h\_distrib}(Z)\,\cancel{dZ}}{\left(\rho_{h\_distrib}(Z) + \rho_{own\_distrib}(Z)\right)\cancel{dZ}} \tag{3.15}$$

And for $P(S_{k,t} = 2 \mid Z_{k,t}, X_t)$:

$$P(S_{k,t} = 2 \mid Z_{k,t}, X_t) = \frac{0.1\,\rho_{own\_distrib}(Z)\,\cancel{dZ}}{\left(\rho_{h\_distrib}(Z) + \rho_{own\_distrib}(Z)\right)\cancel{dZ}} \tag{3.16}$$

Note: Even though a Gaussian distribution is used in this design to determine the probability density functions, one could use any other type of distribution, as long as it correctly represents the random behavior of the sensors and of the overall consensus.
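The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, both densities are taken as Gaussian, and $P(Z_{k,t} \mid X_t)$ is obtained through the law of total probability, weighting each density by its prior so that the two state probabilities sum to one. Note that (3.15) and (3.16) as printed use the unweighted sum of the two densities in the denominator instead.

```python
import math


def gaussian_pdf(z, mu, var):
    """Density of a Gaussian with mean mu and variance var, evaluated at z."""
    return math.exp(-((z - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def state_probabilities(z, mu_h, var_h, mu_own, var_own, prior_calibrated=0.9):
    """Return (P(S=1 | Z, X), P(S=2 | Z, X)) for a single measurement z.

    mu_h, var_h:     mean and variance of the consensus model at the sensor
    mu_own, var_own: mean and variance of the sensor's own measurement history
    Assumption: the denominator weights each density by its prior.
    """
    rho_h = gaussian_pdf(z, mu_h, var_h)        # density if well calibrated
    rho_own = gaussian_pdf(z, mu_own, var_own)  # density if uncalibrated
    weighted_h = prior_calibrated * rho_h
    weighted_own = (1.0 - prior_calibrated) * rho_own
    evidence = weighted_h + weighted_own
    if evidence == 0.0:
        raise ValueError("measurement is numerically impossible under both models")
    return weighted_h / evidence, weighted_own / evidence
```

When the sensor's own distribution coincides with the consensus, the result reduces to the prior (0.9 and 0.1); when the sensor's history lies far from the consensus, the probability of the uncalibrated state approaches one.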

Variance Calibration

Using the same network as with the mean calibration, the variance calibration step changes the meaning of the variables, so that the new ones can help determine whether a sensor is incorrectly calibrated in the α parameter:

• $X_t$ represents the variance of the consensus of the means (which was used in the previous calculation of $P(Z_{k,t} \mid S_{k,t} = 1, X_t)$), and also the variance of this sample variance. These were calculated in section 2.3.2.

• $Z_{i,t}$ represents the sample variance of sensor i and also the variance of that sample variance.

• $S_{i,t}$ still represents whether sensor i is calibrated or not, but in this case it refers only to the α calibration.

The new parameters of the Gaussian distributions in (3.14) are:

• For calculating $P(Z_{k,t} \mid S_{k,t} = 1, X_t)$, a Gaussian is used with $\mu = \sigma_h$ and $\sigma^2 = \sigma_h^4$.

• For calculating $P(Z_{k,t} \mid S_{k,t} = 2, X_t)$, a Gaussian is used with $\mu = \sigma_{own}$ and $\sigma^2 = \sigma_{own}^4$.

Usually, the sensors are more likely to estimate that they are improperly calibrated in slope than in offset, due to the lower randomness in the variance than in the mean. To solve that issue, a confidence interval is added to the network. Each execution of the Bayesian Network acts like a Bernoulli trial, so the chosen interval is the normal approximation interval (first described in [12]), with an 80% confidence level.
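As an illustration of the interval mentioned above, the following sketch computes the normal approximation (Wald) interval for a sequence of Bernoulli trials at an 80% confidence level. The decision rule in `slope_needs_calibration` and its `threshold` are assumptions made for this example only; the text does not state how the interval is compared against the network output.

```python
import math

# Two-sided 80% confidence level: quantile 0.90 of the standard normal.
Z_80 = 1.2816


def normal_approx_interval(successes, trials, z=Z_80):
    """Normal approximation interval for the success rate of Bernoulli trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / trials)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def slope_needs_calibration(successes, trials, threshold=0.5):
    """Hypothetical rule: act only if the whole interval lies above threshold.

    'successes' counts the executions in which the variance network voted
    for the uncalibrated state.
    """
    lower, _ = normal_approx_interval(successes, trials)
    return lower > threshold
```

With 8 votes out of 10 the interval is roughly (0.64, 0.96), so the hypothetical rule would trigger a slope calibration; with 5 out of 10 it would not.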

With these two networks, it is possible to predict calibration failures in both α and β.


Note: If a distribution other than a Gaussian were used for the mean calibration, and the sample variance were not an estimator of its spread, or did not itself follow a Gaussian distribution, another estimator of the randomness of the model's randomness would have to be calculated.

3.5 The Recalibration Step

Once the Bayesian network has determined whether the mean and slope need to be fixed, a series of calculations is made to compute the new values of α and β. For that purpose, the previous values of both parameters are needed, together with a series of parameters of the measurements and consensus of sensor k, which use the following notation:

• Let $\mu_c$ be the model value at the position of sensor k, obtained through a consensus (RANSAC) algorithm.

• Let $\mu_s$ be the mean of the values measured by sensor k.

• Let $\sigma_c^2$ be the variance of the model obtained through the RANSAC algorithm.

• Let $\sigma_s^2$ be the variance of the measurements obtained by sensor k.

• The suffix $t-1$ denotes a value obtained before the present calibration step.

In order to calibrate the mean (which is done if $P(S_{k,t} = 2 \mid Z_{k,t}, X_t) >$ threshold, for the mean Bayesian Network), the following equality is used:

$$\beta_{new} = \beta_{old} + (\mu_c - \mu_s), \tag{3.17}$$

which simply adjusts the parameter β so that the mean of the sensor node coincides with that of the consensus.

When both means coincide, the second Bayesian network is executed. If the node has an error in the slope (this is detected through the confidence interval described in Section 3.4), the following equalities are used to correct it.

The first one adjusts the value α so that the variances of the consensus and of the sensor node coincide:

$$\alpha_{new} = \alpha_{old}\,\sqrt{\frac{\sigma_c^2}{\sigma_s^2}} \tag{3.18}$$

After that, the value of β becomes incorrect. This is because the slope error was affecting the mean of the sensor node, which was corrected using (3.17). After calibrating α, this error no longer appears, and β is no longer needed to correct it.

In order to solve this, the value γ is calculated in (3.19). It represents the mean that the sensor node would have had before ever calibrating β, provided that the slope (α) had been properly calibrated.

$$\gamma = \left(\mu_{s,t-1} - \beta_{old}\right)\frac{\alpha_{new}}{\alpha_{old}} \tag{3.19}$$

Once γ is computed, the new β is calculated as the difference between the mean of the sensor node (which used the former α and β values) and γ. In this way, one finds (and corrects) the error that the mean would actually have had if there had been no error in the slope.

$$\beta_{new} = \mu_{s,t-1} - \gamma \tag{3.20}$$
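The recalibration equations (3.17) to (3.20) translate directly into code. The sketch below is a minimal illustration; the function and argument names are hypothetical, and the caller is assumed to have already decided (through the Bayesian Networks) which correction applies.

```python
import math


def recalibrate_offset(beta_old, mu_c, mu_s):
    """Equation (3.17): shift beta so the sensor mean matches the consensus."""
    return beta_old + (mu_c - mu_s)


def recalibrate_slope(alpha_old, beta_old, var_c, var_s, mu_s_prev):
    """Equations (3.18)-(3.20). Returns (alpha_new, beta_new).

    var_c:     variance of the consensus model
    var_s:     variance of the sensor measurements
    mu_s_prev: sensor mean obtained with the former alpha and beta
    """
    alpha_new = alpha_old * math.sqrt(var_c / var_s)           # (3.18)
    gamma = (mu_s_prev - beta_old) * alpha_new / alpha_old     # (3.19)
    beta_new = mu_s_prev - gamma                               # (3.20)
    return alpha_new, beta_new
```

For example, `recalibrate_offset(0.0, 27.22, 32.03)` gives approximately -4.81, and `recalibrate_slope(1.0, -10.0, 1.0, 4.0, 30.0)` halves the slope and returns a new offset of 10.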

3.6 Conclusions

The following summarizes the most relevant concepts explained in this chapter:

• Errors in the mean must be corrected before calibrating the slope.

• The parameters used in the RANSAC estimation and in the Bayesian Network calculation are chosen by the designer from previous observation of the system, and they directly affect the behavior of the resulting algorithm.

• Using a Bayesian Network to estimate the state of the sensors is a flexible approach, and it could lead to different solutions if further relations and dependencies had to be taken into account. This was not necessary for the formulated problem.

• The probability density function used in the Bayesian Network for the mean can follow a distribution other than a Gaussian, as long as the variance of its randomness is properly determined through another model that is consistent with the model of the mean.

4. Implementation

4.1 Introduction

The aim of this chapter is to present how both the centralized and decentralized algorithms are simulated, and to provide notes about their particularities and how they may be adapted to a real implementation over a sensor network.

First, the centralized case is detailed, describing its features and how and when each step is performed. Then, the decentralized case is explained, including possible problems caused by the parallel computations of all the sensors. Throughout these two sections, several implementation aspects are compared, showing how both approaches can be implemented over a sensor network.

The simulations have been programmed using the ROS framework. Thus, a brief description of the required nodes and communications (for both cases) is given. Lastly, some conclusions on both algorithms are drawn, although the final demonstrations and comparisons between both approaches are presented in the next chapter.

4.2 Centralized Implementation

RANSAC ROS Node

The aim of this ROS node is two-fold: to sort the measurements taken by the ROS sensor nodes, and to compute the RANSAC algorithm every time it is called. It also handles the calls to the ROS plotting server, making it draw both the measurements of the ROS sensor nodes and the consensus model reached by the algorithm. The flowchart designed for this ROS node is shown in Fig. 4.1.


Figure 4.1: Flowchart implemented in the central ROS node.

To serve these purposes, this ROS node has the following characteristics:

• It is subscribed to the ROS topic "Temp_Sensors_Readings", where the measurements of the ROS sensor nodes are published. When that happens, the "newReadings" function is called. In a sensor network, these measurements would be shared through a flooding algorithm, choosing as the central sensor node (the one that computes RANSAC) the one that minimizes the number of messages passed through the network. It can also be chosen arbitrarily.

• It has a timer that calls the "Ransac_Evaluate" function every three seconds. In a sensor network, the central sensor node would have its own timer to determine when to perform the RANSAC algorithm. Performing RANSAC every 3 seconds may be unnecessary, as in an industrial setting the sample time may be slower than in the simulation, where this period was used for faster tests.

• It is a client of the plotting service, which is used to show the results of the algorithms in real time. The implementation of this service is not relevant here and will not be explained.

It should be noted that, for this implementation, the position of the nodes has been assumed to be fixed. There is a preexisting table where the Id of each sensor node is related to its position. In an industrial sensor network, the nodes are usually fixed in place, so this approach is feasible. If the sensors might be moved, their position should be included in the messages. This thesis is not intended to calibrate the position estimation of the sensors, and thus it is taken as ideal. One can modify the code to add the position error to the overall error propagation of the algorithm; however, the Bayesian Network is not designed to calibrate both the position and the magnitude at the same time. By modifying the network, it would be possible to calibrate the position sensors, but then the possibility of calibrating the parameters of the magnitude sensors would have to be discarded, or another source of information would have to be found.

Regarding the "newReadings" function, every time a new measurement is published, the ROS node stores it in a list, in the vector that corresponds to its sensor. Each sensor uses the last M measurements taken. This number needs to be chosen so that it captures the static model of the magnitude. The higher the value, the better the RANSAC algorithm and the Bayesian network perform. However, it also implies a higher computational cost.
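A per-sensor window holding the last M measurements can be sketched as follows. This is only an illustration of the storage scheme described above: the class name, its methods and the value of `M` are hypothetical, and the ROS subscription that would call `new_reading` is omitted.

```python
from collections import defaultdict, deque

M = 50  # hypothetical window length; the text leaves this choice to the designer


class ReadingsBuffer:
    """Keeps the last `window` measurements of every sensor, keyed by sensor Id."""

    def __init__(self, window=M):
        self._by_sensor = defaultdict(lambda: deque(maxlen=window))

    def new_reading(self, sensor_id, value):
        # A deque with maxlen silently drops the oldest entry when it is full.
        self._by_sensor[sensor_id].append(value)

    def readings(self, sensor_id):
        return list(self._by_sensor[sensor_id])
```

Using a bounded deque keeps the memory use and the cost of each RANSAC evaluation constant, which is the trade-off mentioned above when choosing M.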

The function "Ransac_Evaluate" is called every 3 seconds in the simulation. However, it should only be called once the model of the magnitude has reached a steady state (meaning that there are no temporal variations or perturbations that are not considered in the mathematical model fitted by RANSAC) and all the measurements in the list properly reflect that state. One may include temporal variations in the model, but this problem is far more complex and needs to be studied carefully to avoid possible errors.

Once called, the function does the following:

• Compute the RANSAC algorithm to calculate the hypothesis that best fits the available data.

• Adjust the final hypothesis to all of its inliers (perform LSQF using all the inliers). In fact, in this particular implementation the adjustment is performed during each RANSAC iteration (for all the hypotheses). The aim of this was to handle the possible case where two hypotheses have the same number of inliers and one has a lower estimation error than the other. However, this case is unlikely, and it is only necessary to perform this step for the hypothesis with the highest number of inliers, in order to avoid excessive computational costs.

• Once the final hypothesis is reached and corrected with all the inliers, compute the sample variance of the model and also the variance of the sample variance (as stated in (3.6) and (3.7)).

• Publish the resulting parameters and their statistical properties.

• Lastly, draw the points and the obtained consensus.
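The first two steps of the list above can be sketched in plain Python. The example is a minimal illustration, not the thesis code: it assumes a model of the form x = a + b·log10(d) (the base of the logarithm is an assumption), an inlier threshold `tau`, and a single least-squares refit of the best hypothesis, as recommended above. All names are hypothetical.

```python
import math
import random


def fit_line(points):
    """Least-squares fit of x = a + b*u over a list of (u, x) pairs."""
    n = len(points)
    su = sum(u for u, _ in points)
    sx = sum(x for _, x in points)
    suu = sum(u * u for u, _ in points)
    sux = sum(u * x for u, x in points)
    denom = n * suu - su * su
    if denom == 0:
        raise ValueError("all points share the same abscissa")
    b = (n * sux - su * sx) / denom
    a = (sx - b * su) / n
    return a, b


def ransac_log_model(samples, tau, iterations=200, rng=None):
    """samples: list of (distance, measurement). Returns (a, b, inliers)."""
    rng = rng or random.Random()
    points = [(math.log10(d), x) for d, x in samples]
    best = []
    for _ in range(iterations):
        (u1, x1), (u2, x2) = rng.sample(points, 2)
        if u1 == u2:
            continue  # two readings of the same sensor cannot define a line
        b = (x2 - x1) / (u2 - u1)
        a = x1 - b * u1
        inliers = [(u, x) for u, x in points if abs(x - (a + b * u)) <= tau]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("no valid hypothesis found")
    a, b = fit_line(best)  # LSQF using all the inliers of the best hypothesis
    return a, b, best
```

Refitting only the winning hypothesis keeps the cost at one least-squares solve per RANSAC call instead of one per iteration.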

Sensor Node

Each sensor of the network is implemented in ROS as an independent ROS node. The measurements are published on a shared channel (this works as if all the sensors were in range of each other, and it was programmed this way to avoid implementing a flooding algorithm). However, the sensors make no use of the measurements of the others (only the RANSAC ROS node collects the measurements). Every published message contains both the measured magnitude and the Id (if the measurements involve more data, such as the time or place at which they were taken, these should also be included in the message).

The function "Data_measures" is called periodically and generates a measurement. Two possible cases are supported:

• The measurement can be obtained through a random Gaussian generator that simulates a sensor node (including its uncalibrated parameters). This is the solution used in the implemented simulation.

• The measurement can be obtained from a real sensor node that communicates wirelessly with the ROS framework. This option was implemented to simplify possible tests with real sensors (it is intended to work with the TelosB sensors).
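The first case can be illustrated with a small simulated sensor. The measurement model used here, A·(true value + Gaussian noise) + B, is an assumption chosen so that A scales both the mean and the spread while B shifts the mean; the text does not give the exact expression used in the simulation. The class and field names are hypothetical.

```python
import random


class SimulatedSensor:
    """Simulated sensor node with slope error A and offset error B (assumed model)."""

    def __init__(self, sensor_id, true_value, A=1.0, B=0.0, sigma=1.0, rng=None):
        self.sensor_id = sensor_id
        self.true_value = true_value  # magnitude at the position of the sensor
        self.A = A
        self.B = B
        self.sigma = sigma
        self.rng = rng or random.Random()

    def measure(self):
        """Return a message with the sensor Id and one simulated measurement."""
        noisy = self.rng.gauss(self.true_value, self.sigma)
        return {"id": self.sensor_id, "value": self.A * noisy + self.B}
```

With `A=1.0`, `B=5.0` and a true value of 27, the simulated readings are centered on 32, which is the kind of offset error the mean calibration is meant to detect.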

Every time the RANSAC algorithm concludes, the function "newReadings" executes the two Bayesian Networks to calibrate both the mean and the variance. As the variance is corrected after the mean, the condition stated in Algorithm 2.1 is always satisfied. The flowchart of this ROS node is shown in Fig. 4.2.

Figure 4.2: Flowchart implemented in the measuring nodes.

4.3 Decentralized Implementation

Sensor Node

In this implementation there is only one type of ROS node, and thus its functionality must include every method and calculation needed to implement RANSAC. Notice that each ROS sensor node should have a unique Id that identifies it. Fig. 4.3 shows the theoretical design of this system for each ROS sensor node.


Figure 4.3: Flowchart implemented in each ROS sensor node (this diagram only shows the consensus flowchart, not the measurements and the communications to share them).

First, as described in [3], every ROS sensor node has to calculate several parameters, of which N and $x_i$ are the most important:

• N is the total number of sensors in the network, and has to be the exact number of sensors that are collaborating in the current distributed RANSAC algorithm. In a sensor network, this should be done by sharing a vector among all the nodes, with each sensor node adding its Id to it. When the vector converges, the calculation is considered finished.

• $x_i$ is the total number of hypotheses that sensor node i has to generate in order to obtain $k_i$ hypotheses that are different from the ones created by other sensors, with a probability P.

These parameters should be calculated every time RANSAC is called, oronly once if the network is static.
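The calculation of N by sharing a vector of Ids can be sketched as a synchronous flooding loop. This is an illustrative stand-in, assuming bidirectional links and a connected network; the function name and the `neighbors` structure are hypothetical and do not come from [3].

```python
def count_nodes_by_flooding(neighbors):
    """neighbors: dict mapping each node Id to an iterable of neighbor Ids.

    Every node starts knowing only its own Id and, in each round, merges the
    Id sets its neighbors held at the start of that round. The loop stops when
    no set changes, which is the convergence condition described above.
    Returns a dict mapping each node Id to its estimate of N.
    """
    known = {node: {node} for node in neighbors}
    changed = True
    while changed:
        changed = False
        snapshot = {node: set(ids) for node, ids in known.items()}
        for node, adjacent in neighbors.items():
            for other in adjacent:
                merged = known[node] | snapshot[other]
                if merged != known[node]:
                    known[node] = merged
                    changed = True
    return {node: len(ids) for node, ids in known.items()}
```

On a line of five nodes, every node ends with N = 5 after a number of rounds bounded by the diameter of the network.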

After determining the parameters, each ROS sensor calculates $x_i$ hypotheses and then shares them with the others. It is important to sort the hypotheses in the same way in each sensor node. This can be done by using the node Ids to determine an order. The sensor node with the lowest Id should start these communications. In the simulation, all the sensors refresh their vectors automatically every time a new hypothesis is added. However, in a sensor network the vector should first be calculated and then shared with the whole network through a flooding algorithm.

Once all sensors have obtained the hypotheses, the distributed voting starts. Using the method in [3], each sensor node calculates the vectors γ and ν, and shares them with all its neighbors. Once it has received the votes from its neighbors, the sensor node updates the values of γ and ν and shares them again. Here it is crucial that all the sensors in the network are synchronized. Otherwise, the method will not work properly. This step is executed iteratively until the network converges to a solution, which is proven in [13] to happen after a time $t_0$.

Having finished voting, a distributed LSQF algorithm (as described in [14]) is executed to perform a better fit of the inliers.

Once the better fit is calculated, the network estimates a distributed sample variance ($\sigma^2$) and a distributed variance of the sample variance ($\sigma^4$) of the consensus, as these values are needed by each sensor node in order to compute the Bayesian Network.
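One possible way to obtain a network-wide sample variance without a central node is linear average consensus. The sketch below is a stand-in technique, not necessarily the one used in this work: each node is assumed to hold the sum of squared residuals of its own measurements and their count, the links are assumed bidirectional, `step` is assumed smaller than the inverse of the maximum node degree, and the variance is assumed to be normalized by the total number of measurements minus one. All names are hypothetical.

```python
def average_consensus(values, neighbors, step=0.3, rounds=200):
    """Synchronous linear average consensus; returns each node's final estimate."""
    x = dict(values)
    for _ in range(rounds):
        x = {i: x[i] + step * sum(x[j] - x[i] for j in neighbors[i]) for i in x}
    return x


def distributed_sample_variance(sq_residual_sums, counts, neighbors):
    """Each node recovers the global sample variance from two consensus runs.

    sq_residual_sums[i]: sum of squared residuals of node i w.r.t. the model
    counts[i]:           number of measurements held by node i
    """
    n_nodes = len(counts)
    avg_sq = average_consensus(sq_residual_sums, neighbors)
    avg_n = average_consensus(counts, neighbors)
    # Multiplying the network averages by N recovers the global totals.
    return {i: (n_nodes * avg_sq[i]) / (n_nodes * avg_n[i] - 1) for i in counts}
```

Every node ends with the same value up to the consensus tolerance, so each one can evaluate its Bayesian Network locally.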

Regarding the Bayesian Network, there is no difference between this implementation and the centralized one, as these calculations are performed by each sensor node.


4.4 ROS Architecture

Centralized System

The structure mentioned in Section 4.2 is shown in Fig. 4.4. It shows the three types of nodes (Sensors, Ransac and Plot); the communications between them occur through the topics described previously.

Figure 4.4: ROS nodes and communications between them in the centralized case.

Decentralized System

The diagram in Fig. 4.5 shows the nodes and communications described in Section 4.3. It illustrates a series of ROS sensor nodes that communicate with each other through different topics (such as /RANSAC_Voting). Notice that there are no topics related to the distributed LSQF or the distributed variance computations. However, the topics /Hypothese_Sharing and /Node_Prepared can be used for these purposes.

Figure 4.5: ROS nodes and communications between them in the decentralized case.

4.5 Conclusions

Having explained both implementations and their particularities, the following conclusions are reached:

• Whereas the centralized approach implies more network load (i.e., there has to be a main sensor node, communications are needed to send all the measurements to that node, etc.), the decentralized one implies more complex solutions, such as requiring synchronization and performing several steps and distributed algorithms.

• If one needed to implement plotting functions, the decentralized implementation presents the problem of collecting the data available on each sensor node. If one trusts the result of the calibration (which would be natural in an industrial implementation), one could take the consensus reached by the sensor node closest to the SCARA and treat it as ideal.

• In the distributed approach, provided the network is static, the calculations are simplified, and one could collect information about its architecture in order to reduce the network load and optimize the consensus algorithms.

5. Experiments

5.1 Introduction

In this chapter different experiments are performed, both to explain how the implemented solution calibrates all the required parameters and to show the limits of the assumptions made.

First, a detailed proof of concept is given in Section 5.2, using a sensor node that has an incorrect B parameter, to show in detail how each step of the software acts to solve the main problem. After that, Section 5.3 presents a case where two sensor nodes have incorrect parameters (one has a wrongly set value of A, and the other of B). Then, a more complicated example, where one node has incorrect A and B parameters and another has an incorrect B parameter, is given in Section 5.4.

These experiments show the behavior of the solution. The last two are intended to give an idea of when it could fail: the first one, in Section 5.5, gives an example where some sensor nodes have a value of B that differs only slightly from the correct one; the second, in Section 5.6, presents a case where a node has a different variance from the others but no error in the slope.

An extensive statistical analysis of the behavior of the system is summarized in Section 5.7. Lastly, some conclusions on the experiments are presented.

5.2 Proof of Concept

In this case and the others in this chapter, there are a total of 5 sensor nodes in the network. Each one has a unique Id that identifies it, from Id=0 to Id=4. The sensor node in the starting position is called "node Id=0" or "Node 0", and the one in the furthest position is called "node Id=4" or "Node 4", so that the Id increases from the closest to the furthest. An example of this nomenclature can be found in Fig. 5.1.

During this experiment, node Id=2 has B=5 and A=1. The others have B=0 and A=1. The model of the magnitude is:

$$x = 52 - 12 \log(d), \tag{5.1}$$

which means that a=52 and b=-12. Figure 5.1 shows the magnitude and the measurements of the different sensor nodes.

Figure 5.1: Measurements and magnitude of the experiment.

Notice that the mean of node Id=2 is uncalibrated. It should be 27, but it is 32. This is due to having B=5 instead of B=0. The proposed solution will be used to correct the value of B.

RANSAC and LSQF:

Once the sensor nodes have shared their measurements, the decentralized execution of the solution starts (as in Section 4.3). First of all, a De-RANSAC algorithm is executed to find the best consensus model. After that, a distributed LSQF algorithm adjusts that solution to properly fit all of its inliers. Table 5.1 shows the result after the execution of both algorithms:

Step              a       b
Ideal             52      -12
De-RANSAC         52.01   -11.80
LSQF Adjustment   51.99   -11.91

Table 5.1: Values of a and b obtained during the consensus algorithm.

As can be seen, the values obtained by the De-RANSAC algorithm are almost the ideal ones. This happens because most of the sensor nodes are properly calibrated. However, these values (which were hypotheses generated from two sensor nodes) become more accurate when the rest of the inlying points are taken into account. Thus, the values calculated using LSQF with the inliers are more accurate. The consensus and measurements look as in Fig. 5.2. This figure looks like Fig. 5.1 because the consensus is almost the same as the ideal magnitude.

Figure 5.2: Measurements and magnitude of the experiment.

Once the parameters a and b are computed, a distributed algorithm calculates the sample variance of the consensus model (the variance of the sample variance is also computed, but it is not needed for this case). This variance has a value of 1.05 in this example. Each sensor node also computes its own sample variance, using the measurements it has taken. For node Id=2, this variance is 1.02 (the others have a similar one).


Bayesian Network:

Once all these parameters are calculated, a Bayesian Network is executed by each sensor node to estimate whether it is calibrated or not. Table 5.2 shows the probability of each sensor node being uncalibrated.

Node                                   Id=0   Id=1   Id=2   Id=3   Id=4
Probability of bad calibration (p.u.)  0.00   0.00   0.95   0.00   0.00

Table 5.2: Probability of each node being uncalibrated.

Recalibration:

The only sensor node that determines it is uncalibrated is node Id=2 (the others have a probability lower than 10e-4 p.u.), which is the one with the wrong value of B. Thus, it performs a recalibration using the formula in (3.17), where $\beta_{old} = 0$, $\mu_c = 27.22$ and $\mu_s = 32.03$. This results in:

$$\beta_{new} = -4.816 \tag{5.2}$$

This new value of β is not exactly -5, but it is close enough to the ideal β, being only 3.8% smaller in magnitude. This value would be closer to the ideal if the standard deviation of the sensor nodes were lower.

RANSAC and LSQF:

Once node Id=2 has adjusted its measurements, they are shared with the rest of the sensor nodes and the De-RANSAC algorithm is executed again. Table 5.3 shows the results obtained this time:

Step              a       b
Ideal             52      -12
De-RANSAC         51.90   -11.88
LSQF Adjustment   52.02   -11.95

Table 5.3: Values of a and b obtained during the consensus algorithm.

The new consensus model is closer to the ideal in this case. However, this value has not changed because of the correction of node Id=2 (the measurements of this sensor node were not included in the first consensus), but because of the randomness of the measurements. Simply, during this iteration the measurements were closer to the ideal, and thus the consensus becomes more accurate. The result of this consensus is shown in Fig. 5.3.

Figure 5.3: Measurements and consensus when node Id=2 is calibrated.

Bayesian Network:

When the consensus and its variance have been calculated, the sensor nodes compute the Bayesian Network once again in order to estimate whether they are uncalibrated or not. Table 5.4 shows these probabilities.

Node                                   Id=0   Id=1   Id=2   Id=3   Id=4
Probability of bad calibration (p.u.)  0.00   0.00   0.00   0.00   0.00

Table 5.4: Probability of each node being uncalibrated.

As can be seen, node Id=2 now estimates that it is well calibrated. The solution has corrected its measurements so that they fit the consensus model.

In this experiment, as there was no error in the slope, the Bayesian Network of the variance calibration always estimates that all the sensors are correctly calibrated.


5.3 Case 1: One Sensor Node has an Incorrect A, Another has an Incorrect B

For this case, node Id=1 has A=$\sqrt{2.3}$, and node Id=3 has B=7. The other values of A are 1, and those of B are 0.

RANSAC and LSQF:

The first time that the consensus model is computed, the obtained values are those in Table 5.5.

Step              a       b
Ideal             52      -12
First consensus   51.91   -11.89

Table 5.5: Values of a and b obtained in the consensus algorithm.

The measurements and the consensus look as in Fig. 5.4.

Figure 5.4: Measurements and consensus initially.

Bayesian Network and recalibration:

As node Id=1 and node Id=3 are uncalibrated, their value of β is corrected in this iteration. Table 5.6 shows the new values of β.

Step                 Node 0   Node 1    Node 2   Node 3   Node 4
Initially            0        0         0        0        0
Offset calibration   0        -19.907   0        -6.98    0

Table 5.6: Values of β.

RANSAC and LSQF:

Once β is calibrated for both nodes, the consensus algorithm is executed again. Then, the resulting parameters of the consensus are those in Table 5.7.

Step                       a       b
Ideal                      52      -12
First consensus            51.91   -11.89
Consensus with correct β   51.91   -11.97

Figure 5.5 shows the resulting measurements and consensus at this iteration.

Figure 5.5: Measurements and consensus after correcting β.



The value of α is still not corrected for node Id=1. Using the variance of the sample variance, and the other parameters of the consensus and measurements of node Id=1, the Bayesian Network determines that A needs to be corrected, and the recalibration algorithm does so. After this value is calibrated, the new values of α and β are those in Table 5.8 and Table 5.9.

Step                 Node 0   Node 1   Node 2   Node 3   Node 4
Initially            1        1        1        1        1
Offset calibration   1        1        1        1        1
Slope calibration    1        0.69     1        1        1

Table 5.8: Values of α.

Step                 Node 0   Node 1    Node 2   Node 3   Node 4
Initially            0        0         0        0        0
Offset calibration   0        -19.907   0        -6.98    0
Slope calibration    0        -1.44     0        -6.98    0

It can be seen that the value B of node Id=3 has been corrected ($7 - 6.98 \approx 0$), and also the value A of node Id=1 ($\sqrt{2.3} = 1.51 \approx \frac{1}{0.69} = 1.44$). Figure 5.6 shows how the new measurements and consensus behave.

Figure 5.6: Measurements and consensus after correcting α.


5.4 Case 2: A Sensor Node has both A and B Incorrectly Set, Another has an Incorrect B

For this case, node Id=1 has A=$\sqrt{2}$ and B=10. Additionally, node Id=3 has B=-14.36. The others have A=1 and B=0.

RANSAC and LSQF:


Step a b



The measurements and the consensus look as in Fig. 5.7.



As node Id=1 and node Id=3 are uncalibrated, their value of β is corrected in this iteration. Table 5.11 shows the new values of β.

Step                 Node 0   Node 1   Node 2   Node 3   Node 4
Initially            0        0        0        0        0
Offset calibration   0        -25.82   0        14.50    0

Once β is calibrated for both nodes, the consensus algorithm is executed again. Then, the resulting parameters of the consensus are those in Table 5.12.

Step                       a       b
Ideal                      52      -12
First consensus            51.91   -11.89
Consensus with correct β   51.94   -11.91

The resulting measurements and consensus look as in Fig. 5.8.

Figure 5.8: Measurements and consensus after correcting β.


The value of α is still not corrected for node Id=1. After this value is calibrated, the new values of α and β are those in Table 5.13 and Table 5.14.

Step                 Node 0   Node 1   Node 2   Node 3   Node 4
Initially            0        0        0        0        0
Offset calibration   0        -25.82   0        14.50    0
Slope calibration    0        -4.99    0        14.50    0

Step                 Node 0   Node 1   Node 2   Node 3   Node 4
Initially            1        1        1        1        1
Offset calibration   1        1        1        1        1
Slope calibration    1        0.68     1        1        1

It can be seen that the value B of node Id=3 has been corrected ($-14.36 + 14.50 \approx 0$), and also the value A of node Id=1 ($\sqrt{2} = 1.41 \approx \frac{1}{0.68} = 1.47$) and its value B ($\frac{10}{2} - 4.99 \approx 0$). Figure 5.9 shows how the new measurements and consensus behave.


5.5 Case 3: Nodes with Low Calibration Error that Alters the Final Result

For this case, node Id=1 has B=2. Additionally, node Id=4 has B=2 (it is not relevant that this value is the same as for node Id=1). The others have A=1 and B=0.

RANSAC and LSQF:

In this case, no calibration step is triggered, and the measurements and the consensus always remain as in Fig. 5.10.

Figure 5.10: Measurements and consensus during the execution of the system.

The following table shows the values that a and b have during the example (they may vary, though these deviations are small):

Step        a       b
Ideal       52      -12
Consensus   52.24   -11.72

In this case, the sensor nodes estimate that they have a low probability of being uncalibrated (which is logical, because their measurements are included in the consensus). However, their distance to the consensus is higher than in the other examples. This could be used to take further actions to improve the final result, as will be explained in Chapter 6.


5.6 Case 4: A node has a different $\sigma_x^2$ than the others

For this case, node Id=2 has a different $\sigma_x^2$ than the others ($\sigma_x^2 = 2.5$); however, A=1, as there is no error in the slope. All the sensor nodes have A=1 and B=0.

RANSAC and LSQF:


Step a b



The measurements and the consensus look as in Fig. 5.11.

Then, as node Id=2 had a different variance, its α is calibrated to make it the same as the others. After this, the new values of α and β are those in Table 5.16 and Table 5.17.

Step                Node 0   Node 1   Node 2   Node 3   Node 4
Initially           0        0        0        0        0
Slope calibration   0        0        8.49     0        0

Step                Node 0   Node 1   Node 2   Node 3   Node 4
Initially           1        1        1        1        1
Slope calibration   1        0.69     1        1        1


RANSAC and LSQF:

The new measurements and consensus look as in Fig. 5.12.

With these values corrected, the system remains stable for a long time. However, after that time, the temperature of the system increases, causing the model to take other values of a and b. Assume that enough time has passed for the system to stop varying, and that the new values of a and b are those in Table 5.18.

Step                          a       b
Ideal                         52      -12
First consensus               52.22   -12.07
Consensus with calibrated α   52.2    -12.06
New Temp
Ideal                         80      -14
First consensus               79.91   -13.99


RANSAC and LSQF:

The measurements and the consensus look as in Fig. 5.13.

Figure 5.13: Measurements and consensus with the new temperature.

Now the measurements of node Id=2 do not fit the new consensus. Thus, β is recalibrated to fix this issue. Table 5.19 shows its new value.

Step                 Node 0   Node 1   Node 2   Node 3   Node 4
Initially            0        0        0        0        0
Slope calibration    0        0        8.49     0        0
New Temp
Initially            0        0        8.49     0        0
Offset calibration   0        0        15.68    0        0

Figure 5.14 shows the new result of this calibration.

Figure 5.14: Measurements and consensus after correcting α again.

As can be deduced, every time the temperature evolves, the value of β of node Id=2 has to be changed for the node to be able to follow the consensus. This shows that having different variances may present a problem if it is not treated correctly. This will be explained in the next chapter.

5.7 Extensive Simulations

The experiments performed have varying values of a, b, A and B for themodel an the sensor nodes. There are a total amount of 8 sensor nodes inthe network. The value of τ was 3, and σ=1. The values mentioned werein di�erent ranges for each of the �ve cases performed. Each case has beenexecuted 30 times with random parameters:


• In Case 1, one sensor node has a parameter B between 5τ/3 and 8τ/3, either with a positive sign or a negative one.

• In Case 2, one sensor node has a parameter A between 1.5σ and 4σ.

• In Case 3, one sensor node has a parameter B between 2.5τ/3 and 3.5τ/3, either with a positive sign or a negative one.

• In Case 4, one sensor node has a parameter A between σ and 2.5σ.

• In Case 5, there is one sensor node that has a parameter B between 2.5τ/3 and 4.5τ/3, either with a positive sign or a negative one. It also has A between 1.5σ and 3.2σ. There is another sensor node with a B between 2.5τ/3 and 4.5τ/3, either with a positive sign or a negative one.
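The parameter ranges of the five cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says the parameters are random within each range but does not name the distribution, so a uniform draw is assumed here, and the names `signed_uniform`, `sample_case`, `node_1` and `node_2` are hypothetical labels that do not come from the original experiments.

```python
import random

TAU = 3.0    # τ as reported in the text
SIGMA = 1.0  # σ as reported in the text


def signed_uniform(rng, low, high):
    """Draw a magnitude uniformly in [low, high] and give it a random sign."""
    magnitude = rng.uniform(low, high)
    return magnitude if rng.random() < 0.5 else -magnitude


def sample_case(case, rng):
    """Return the perturbed parameters of the faulty nodes for one run.

    Keys are hypothetical labels for the faulty nodes; the text does not
    say which of the 8 nodes are perturbed. The uniform distribution is
    an assumption.
    """
    if case == 1:
        return {"node_1": {"B": signed_uniform(rng, 5 * TAU / 3, 8 * TAU / 3)}}
    if case == 2:
        return {"node_1": {"A": rng.uniform(1.5 * SIGMA, 4 * SIGMA)}}
    if case == 3:
        return {"node_1": {"B": signed_uniform(rng, 2.5 * TAU / 3, 3.5 * TAU / 3)}}
    if case == 4:
        return {"node_1": {"A": rng.uniform(SIGMA, 2.5 * SIGMA)}}
    if case == 5:
        return {
            "node_1": {
                "B": signed_uniform(rng, 2.5 * TAU / 3, 4.5 * TAU / 3),
                "A": rng.uniform(1.5 * SIGMA, 3.2 * SIGMA),
            },
            "node_2": {"B": signed_uniform(rng, 2.5 * TAU / 3, 4.5 * TAU / 3)},
        }
    raise ValueError(f"unknown case: {case}")


rng = random.Random(0)  # fixed seed so the 30 runs per case are reproducible
runs = {case: [sample_case(case, rng) for _ in range(30)] for case in range(1, 6)}
```

With τ = 3 and σ = 1, Case 1 corresponds to |B| between 5 and 8, and Case 3 to |B| between 2.5 and 3.5.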

Tables 5.20 and 5.21 summarize the results obtained in these experiments. The parameters in them are:

• The True Positive Rate (TPR) and False Positive Rate (FPR) of both β correction and α correction over all 30 iterations.

• The mean, maximum and minimum normalized error between the ideal β and α and the resulting ones after the solution has finished.

• The mean normalized error between the ideal a and b and the resulting ones after the solution has finished.
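As a hedged illustration of how such metrics could be computed, the sketch below uses the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The text does not define the normalized error, so it is assumed here to be the absolute difference divided by the magnitude of the ideal value. The function names and the `None` convention for undefined rates are assumptions of this sketch.

```python
def rates(corrected, faulty):
    """Return (TPR, FPR) for one kind of correction (β or α).

    corrected[i] is True if node i had its parameter corrected;
    faulty[i] is True if node i really was miscalibrated.
    A rate is None when its denominator is zero (no faulty nodes for TPR,
    no healthy nodes for FPR).
    """
    true_pos = sum(c and f for c, f in zip(corrected, faulty))
    false_pos = sum(c and not f for c, f in zip(corrected, faulty))
    positives = sum(faulty)
    negatives = len(faulty) - positives
    tpr = true_pos / positives if positives else None
    fpr = false_pos / negatives if negatives else None
    return tpr, fpr


def normalized_error(estimated, ideal):
    """Assumed definition: |estimated - ideal| / |ideal|."""
    return abs(estimated - ideal) / abs(ideal)


# Hypothetical run with 8 nodes in which only node 2 is faulty and only
# node 2 is corrected.
faulty = [False, False, True, False, False, False, False, False]
corrected = [False, False, True, False, False, False, False, False]
tpr, fpr = rates(corrected, faulty)

# Using values from Table 5.18: ideal a = 52, first consensus a = 52.22.
err_a = normalized_error(52.22, 52)
```

Under these assumptions, the hypothetical run gives a TPR of 1 and an FPR of 0, and the first consensus of Table 5.18 has a normalized error on a of roughly 0.004.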


Case                  TPRβ   FPRβ     TPRα   FPRα

High B error          1      0        -      0
High A error          -      0        1      0
Low B error           0      0.0113   -      0.0521
Low A error           -      0.0113   1      0.0631
Two incorrect nodes   1      0        1      0

Table 5.20: Results of the conducted experiments.

Case                  βmean    βmax     βmin     αmean    αmax     αmin     amean    bmean

High B error          0.0196   0.1012   0.0018   -        -        -        0.0022   0.0051
High A error          -        -        -        0.0925   0.2575   0.0095   0.0022   0.0051
Low B error           -        -        -        -        -        -        0.0086   0.0024
Low A error           -        -        -        0.4410   0.661    0.2621   0.0103   0.0025
Two incorrect nodes   1.1833   3.1158   0.3142   0.0805   0.2640   0.0140   0.0045   0.0025

Table 5.21: Results of the conducted experiments.


5.8 Conclusions

The following conclusions are drawn from the experiments performed in this chapter:

• The system proves to work correctly if the error is taken out of the consensus model.

• If the errors are included in the consensus, an inaccurate solution will be obtained, and a finer method will be needed to correctly calibrate the sensor nodes.

• If a sensor node has a different variance than the others, its parameters will not be calibrated correctly, and thus it will fail to follow the evolution of the magnitude once it changes. A further approach would be needed to overcome this issue.

6. Conclusions and Future Implementations

6.1 Conclusions

About the Overall Behavior of the System

• Provided the assumptions are met, the system has proven to work successfully.

• Most of the assumptions are satisfied in an industrial system, thus making the algorithms suitable for these applications.

About the Presence of Low Calibration Error

• When some of the sensor nodes have incorrect values of A or B, but these values differ too little from the real ones to be considered outliers, they affect the overall consensus and will not be recalibrated, since the belief that they are already properly calibrated remains plausible.

• If this case happens, the probability of each sensor node being uncalibrated is low, but their distance to the consensus model becomes larger than if all the nodes were properly set.

• The lower the variance of the sensor nodes is, the harder it is for this case to happen.

About the Presence of Sensors with a Different σ_x²

• When a node has a different σ_x² from the others, it is treated as if its slope was not properly calibrated. Thus, the system will fix this, changing the parameters α and β so that the node's measures fit properly in the present consensus.


• When the sensor node has been calibrated to fit the consensus, and then the magnitude changes, the new measures of the sensor node will not correctly follow the change in the magnitude. Even if the sensor node is calibrated once again, this problem will persist indefinitely.

6.2 Future Implementations

About the Presence of Low Calibration Error

• It has been shown that, in this case, even if the estimated probability of the sensor nodes being calibrated is very high, the distance to the consensus is larger than in the ideal case. Thus, one could use this distance as a threshold to trigger a finer implementation of the system, following these steps:

  - One by one, each sensor node removes its measures from the RANSAC step while the others do not.

  - Then, the node computes the Bayesian Networks, and compares its probability of being calibrated with the one calculated when its measures were included.

  - After all nodes have done this, the one with the highest decrement of probability calibrates its parameters.

• This solution may be feasible only if most of the nodes give correct values. Otherwise, this method may not result in a proper solution. For that reason, users need to take into account the requirements of their network and choose sensor nodes that, for instance, have a low probability of being uncalibrated.
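The selection step of the procedure above can be sketched as follows. This is a minimal, centralized illustration of a step that would be carried out in a decentralized way on the network. The probabilities are assumed to have been computed beforehand by each node, the function name and the numeric values are hypothetical, and returning `None` when no node shows a positive decrement is a design choice of this sketch rather than something stated in the text.

```python
def select_node_to_calibrate(p_included, p_excluded):
    """Pick the node whose probability of being calibrated drops the most.

    p_included[node_id]: probability of being calibrated when the node's
        measures take part in the RANSAC step.
    p_excluded[node_id]: the same probability when the node's measures are
        removed from the RANSAC step.
    Returns the Id with the largest decrement, or None if there are no
    nodes or no node shows a positive decrement.
    """
    decrements = {
        node_id: p_included[node_id] - p_excluded[node_id]
        for node_id in p_included
    }
    if not decrements:
        return None
    best = max(decrements, key=decrements.get)
    return best if decrements[best] > 0 else None


# Hypothetical probabilities for three nodes; node 2 loses the most
# probability once its own measures stop pulling the consensus towards it.
p_included = {0: 0.97, 1: 0.95, 2: 0.96}
p_excluded = {0: 0.96, 1: 0.94, 2: 0.60}
chosen = select_node_to_calibrate(p_included, p_excluded)
```

In this made-up example node 2 is selected, since its decrement is far larger than that of the other two nodes.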

About the Presence of Sensors with a Different σ_x²

• In order to overcome this issue, a new way to estimate the certainty of the slope of the sensor node will be needed (the variance of the sample variance should be computed in some way other than directly using the consensus model). To implement this, a closer look would have to be taken at the datasheet parameters of each sensor node, in order to reach further conclusions on how their measures can be related.

• As the sensor nodes used in an application are usually all of the same model, this implementation may be skipped in many cases, but it is a crucial step to take if one is willing to combine this algorithm with sensor fusion techniques.

About the Implementation over a Real Sensor Network

• Apart from implementing the algorithm, many security measures need to be taken into account when managing sensor networks:

  - As sensor nodes may shut down suddenly, the algorithms need to be robust to this kind of failure (however, the network should still be connected most of the time).

  - Due to the low computational power of each node, the algorithms should be made feasible for their capacities. To do so, it is important to distribute all the computations correctly and to give simple synchronization and communication rules that allow information to be exchanged without saturating the nodes.

• All the nodes should have a way to determine their own unique Ids, which makes it possible to implement some of the algorithms that need a structured order (e.g., sharing the hypotheses).

• If the user wants to receive the data and measures of the network, one of the nodes should be equipped with higher communication capabilities, in order not to slow down the behavior of the entire network.

7. Bibliography

[1] M. A. Fischler and R. C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with,” Communications of the ACM, vol. 24, pp. 381–395, 1981.

[2] A. Tanenbaum and Wetherall, Computer Networks, 5th Edition, 2010.

[3] E. Montijano, S. Martínez, and C. Sagues, “De-RANSAC: Decentralized RANSAC for sensor networks,” 2009.

[4] Stanford University, “Probabilistic Graphical Models.” [Online]. Available: http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=ProbabilisticGraphicalModels

[5] M. Q. Ng, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y., “ROS: an open-source Robot Operating System,” ICRA Workshop on Open Source Software, 2009.

[6] J. Rivera, M. Carrillo, M. Chacón, G. Herrera, and G. Bojorquez, “Self-Calibration and Optimal Response in Intelligent Sensors Design Based on Artificial Neural Networks,” Sensors, vol. 7, no. 8, pp. 1509–1529, 2007.

[7] L. Balzano and R. Nowak, “Blind Calibration of Sensor Networks,” Proceedings of the 6th International Conference on Information Processing in Sensor Networks - IPSN '07, pp. 79–88, 2007.

[8] C. Liu, S. Ghosal, Z. Jiang, and S. Sarkar, “An unsupervised spatiotemporal graphical modeling approach to anomaly detection in distributed CPS,” 2015.

[9] A. Hast, J. Nysjö, and A. Marchetti, “Optimal RANSAC - Towards a repeatable algorithm for finding the optimal set,” Journal of WSCG, vol. 21, no. 1, pp. 21–30, 2013.


[10] K. K. Gan, “Lecture 7 Some Advanced Topics using Propagation of Errors and Least Squares Fitting Error on the mean ( review from Lecture 4 ) More on Least Squares Fit ( LSQF ),” pp. 1–8.

[11] E. Cho and M. J. Cho, “Variance of the With-Replacement Sample Variance,” Section on Survey Research Methods - Joint Statistical Meetings, vol. 2, pp. 1291–1293, 2008.

[12] P. S. Laplace, “Normal Confidence Interval,” in Théorie analytique des probabilités, 1812, p. 283.

[13] F. Bullo, J. Cortés, and S. Martinez, “Distributed Control of Robotic Networks,” p. 238, 2009.

[14] A. H. Sayed and C. G. Lopes, “Distributed recursive least-squares strategies over adaptive networks,” Conference Record - Asilomar Conference on Signals, Systems and Computers, no. 1, pp. 233–237, 2006.
