

Project Report on

Storage, retrieval and process of continuous streaming data in a Wide Area Frequency Measurement System

by

Nitesh Pandit
Kedar Khandeparkar

under the guidance of

Prof. A.M. Kulkarni

Department of Electrical Engineering
Indian Institute of Technology, Bombay

Mumbai-400076

July 2010


Contents

0.1 Problem Statement
    0.1.1 Introduction
0.2 Current Setup
    0.2.1 Packet Format
    0.2.2 Need of Project
0.3 Data Reception
    0.3.1 Requirement
    0.3.2 Current Approach
    0.3.3 Our Approach
0.4 Data Storage
    0.4.1 Requirement
    0.4.2 Current Approaches
    0.4.3 Possible Approaches
    0.4.4 Our Approach
    0.4.5 Pseudocode for Insertion & Updation in Database
0.5 Data Display
    0.5.1 Requirement
    0.5.2 Current Approach
    0.5.3 Possible Approaches
    0.5.4 Our Approach
0.6 Handling database issues
    0.6.1 Running the program
0.7 Literature Survey
    0.7.1 MYSQL
    0.7.2 PHP
0.8 Future Scope
    0.8.1 Interpolation of Frequencies
    0.8.2 Daily triggers
    0.8.3 Threading
0.9 References


0.1 Problem Statement

Storage, retrieval and processing of continuous streaming data in a Wide Area Frequency Measurement System, and further analysis to generate flags and alerts.

0.1.1 Introduction

A power system network is continuously subjected to disturbances in the form of sudden load and generation changes. These disturbances give rise to oscillations in the rotor angle, which are also seen in the frequency. To study these oscillations, a wide area frequency measurement setup is implemented in this project. The measurement units are strategically placed at five different locations in India and are time-synchronized via the Network Time Protocol (NTP). The frequency is measured every 20 ms, time-stamped and sent over the internet to a server at IITB.

The server program continuously receives the packets, checks the correctness of the packet information, inserts the data into the database according to its time stamp and displays it on the web.


0.2 Current Setup

In the ‘Wide area frequency measurement system’, frequencies measured at different places in India are sent to an IITB server. Sensors measure the frequency every 20 ms. Each locally measured frequency is stamped with the time and place and sent to the IITB server.

The IITB server continuously listens for packets on UDP port 6000. When a packet is received, the server extracts the data and writes it to a file. A separate file is created for each sensor, and when any file reaches its maximum size a new file is created for each sensor with an incremented serial number. Plotting of this data is done offline.

0.2.1 Packet Format

Where,
Source Port = 6000
Destination Port = 6000
Checksum = Default
Data = “( Sensor name, year-month-day hours minutes seconds.milliseconds, Frequency value )”
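As a rough illustration of the Data field, the sketch below builds one payload string and sends it as a UDP datagram. It is written in Python for brevity and is not the sensors' actual code: the function names, the three-decimal frequency formatting and the loopback address are assumptions, and the layout follows the example packet quoted later in the report ("surat, 2010-06-10 12 39 23.545, 49.654").

```python
import socket
from datetime import datetime


def format_payload(sensor: str, when: datetime, frequency: float) -> str:
    """Build 'sensor, YYYY-MM-DD HH MM SS.mmm, frequency' (layout assumed from the report's example)."""
    millis = when.microsecond // 1000
    stamp = when.strftime("%Y-%m-%d %H %M %S") + f".{millis:03d}"
    return f"{sensor}, {stamp}, {frequency:.3f}"


def send_reading(payload: str, host: str = "127.0.0.1", port: int = 6000) -> None:
    """Send one payload as a UDP datagram; the host is a placeholder, not the IITB server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))
```

For example, `format_payload("surat", datetime(2010, 6, 10, 12, 39, 23, 545000), 49.654)` yields the example string quoted above.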


0.2.2 Need of Project

In the existing scenario, data from each packet is stored in files at the IITB server. Analysing the data in these files is difficult and is done later, offline. This project deals with online display of the incoming frequency data. It also performs constraint checks on each packet, such as the authenticity of the sensor location and the validity of the frequency range. If a packet passes these validation checks, it is stored in the database for further analysis.

Hardware Requirements
1) Intel P4 processor
2) RAM (512 MB)

Software Requirements
1) C (gcc compiler)
2) Perl (libperl-dev, libdbd-mysql-perl)
3) MySQL (mysql-server, mysql-client, libmysqlclient15-dev)
4) PHP (php5, php5-gd, php5-mysql, php-cli)
5) Apache (apache2)
6) Bzip2 (with perl libraries)


0.3 Data Reception

0.3.1 Requirement

Data from multiple sensors, arriving continuously at very short time intervals, needs to be stored in an organized manner for further analysis.

0.3.2 Current Approach

Data is dumped into flat files as it arrives, with a separate file for each sensor. When any file reaches its maximum size, new files are created for each sensor with an incremented serial number as part of the file name. There is no way to display such data in soft real time. It is also very difficult to gather data from all the sensors for a particular time range and compare them. In addition, the current approach does not perform constraint checking.

0.3.3 Our Approach

A UDP server is implemented in C. It continuously receives packets from multiple sensors.

The server program comprises three functions:

1. main():

Continuously listens for packets on UDP port 6000 and, on reception, creates a thread by calling the function check_constraint_hash_calculate(). The received message is passed to this function as an argument.

2. check_constraint_hash_calculate():

Splits the received message, checks the validity of each field and performs the hash calculation. It then calls insert_packets().

3. insert_packets():

Inserts the values of the newly arrived packet into the database table (see Section 0.4).
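The receive loop of main() can be sketched roughly as follows. This is a Python stand-in, not the report's C server: serve(), open_server_socket() and the max_packets limit are hypothetical names introduced so the loop can be run and stopped in a test, whereas the real server would loop indefinitely on port 6000.

```python
import socket


def open_server_socket(host: str = "0.0.0.0", port: int = 6000) -> socket.socket:
    """Create a UDP socket bound to the listening port (6000 in the report)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def serve(sock: socket.socket, handle, max_packets: int) -> int:
    """Receive up to max_packets datagrams and pass each decoded message to handle()."""
    received = 0
    while received < max_packets:
        data, _addr = sock.recvfrom(1024)  # one datagram carries one reading
        handle(data.decode("ascii", errors="replace"))
        received += 1
    return received
```

Here handle plays the role of check_constraint_hash_calculate(): it is called once per received message with the message as its argument.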


Logic for ‘check_constraint_hash_calculate()’

I. Each packet has the form "Sensor location name, Date Time, Frequency", for example "surat, 2010-06-10 12 39 23.545, 49.654". We split the message by comma and place the values in the columns array, for example columns[0]='surat', columns[1]='2010-06-10 12 39 23.545', columns[2]=49.654.

II. We then perform the following validity checks:

a) Check that columns[0] contains a valid sensor name. If not, an entry is made in the Log_Error table and no further processing is done for the packet.

b) Check that columns[1] contains a non-null value. If not, an entry is made in the Log_Error table and no further processing is done for the packet.

c) Check that columns[2] contains a frequency value that lies between 47 and 52. If not, an entry is made in the Log_Error table and no further processing is done for the packet.

d) If the packet passes all the validation checks, we process it further:

d.i) Split the Date Time value in the packet by space ' ', e.g. columns[1]=2009-09-07 10 12 32.743 is split as datetime[0]=2009-09-07, datetime[1]=10, datetime[2]=12, datetime[3]=32.743

d.ii) Split the date by '-', e.g. datetime[0]=2009-09-07 is split as date[0]=2009, date[1]=09, date[2]=07

d.iii) Split the seconds value by '.', e.g. datetime[3]=32.743 is split as time[0]=32, time[1]=743

III. We now calculate the hash value of the packet. It is the date-time value in the packet expressed as a number of milliseconds, i.e. 2009-09-07 10 12 32.743 => (x milliseconds).

IV. After the hash calculation we call the function insert_packets() and pass the calculated hash (i.e. x from above) and columns[] as arguments.
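The splitting, validation and hash steps can be sketched as below. This is an illustrative Python version, not the report's C code, and it rests on several assumptions: the valid sensor names are taken from the column prefixes of the sensor_data table, the 47 to 52 range is treated as inclusive, and the hash is taken to be milliseconds since the Unix epoch in UTC. The last assumption is at least consistent with the value quoted in the Future Scope section (1278843152320 for 2010-07-11 10 12 32.320). Rejected packets simply return None here instead of being written to Log_Error.

```python
import calendar

# Assumed from the sensor_data column names; the real list may differ.
VALID_SENSORS = {"surat", "sangli", "ahm", "mum", "pune", "bhu"}


def hash_ms(date_time: str) -> int:
    """Convert 'YYYY-MM-DD HH MM SS.mmm' to milliseconds since the Unix epoch (UTC assumed)."""
    date_part, hours, minutes, sec_part = date_time.split(" ")
    year, month, day = (int(x) for x in date_part.split("-"))
    seconds, millis = sec_part.split(".")
    epoch_s = calendar.timegm((year, month, day, int(hours), int(minutes), int(seconds)))
    return epoch_s * 1000 + int(millis)


def check_packet(message: str):
    """Return (hash, columns) for a valid packet, or None if any check fails."""
    columns = [field.strip() for field in message.split(",")]
    if len(columns) != 3:
        return None
    sensor, date_time, freq_text = columns
    if sensor not in VALID_SENSORS or not date_time:
        return None
    try:
        frequency = float(freq_text)
        packet_hash = hash_ms(date_time)
    except ValueError:  # malformed number, date or time field
        return None
    if not 47 <= frequency <= 52:
        return None
    return packet_hash, columns
```

In the report's design, the returned hash and columns would then be handed to insert_packets().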


0.4 Data Storage

0.4.1 Requirement

Storage and retrieval of data should be efficient.

0.4.2 Current Approaches

Data is stored in files, but retrieving and analysing this data is very difficult.

0.4.3 Possible Approaches

Two ways of data storage:
a) Store as flat files, as is done in the previous approach
b) Use a database

0.4.4 Our Approach

We have used a MySQL database for data storage. The logic for inserting packets into the database table is part of the main server program: insert_packets(), called from check_constraint_hash_calculate(), contains the logic to insert the values of a newly arrived packet into the database table.


0.4.5 Pseudocode for Insertion & Updation in Database

Algorithm 0.4.1: InsertPackets(newHash, datetime, frequency)

comment: Insert the packet with datetime and frequency at the correct position in the table

global currentHash, minHash, nextHash
local tdiff, tmod, k, i, t2, t3, tempHash

if currentHash != 0
  then
    if newHash >= currentHash
      then
        tdiff ← newHash - currentHash
        if tdiff <= 10
          then
            UPDATE TABLE(currentHash, datetime, frequency)
          else
            tmod ← tdiff % 20
            if tmod <= 10
              then
                t2 ← tdiff - tmod
                t3 ← (integer)(t2 / 20)
              else
                t2 ← tdiff - tmod + 20
                t3 ← (integer)(t2 / 20)
            for k ← 1 to t3 - 1
              do
                currentHash ← nextHash
                nextHash ← currentHash + 20
                INSERT INTO TABLE(currentHash, NULL, NULL)
            currentHash ← nextHash
            nextHash ← currentHash + 20
            INSERT INTO TABLE(currentHash, datetime, frequency)
      else
        tdiff ← currentHash - newHash
        tmod ← tdiff % 20
        if newHash >= minHash
          then
            if tmod <= 10
              then
                t2 ← tdiff - tmod
                t3 ← currentHash - t2
                UPDATE TABLE(t3, datetime, frequency)
              else
                t2 ← tdiff - tmod + 20
                t3 ← currentHash - t2
                UPDATE TABLE(t3, datetime, frequency)
          else
            if tmod <= 10
              then
                t2 ← tdiff - tmod
                t3 ← (integer)(t2 / 20)
              else
                t2 ← tdiff - tmod + 20
                t3 ← (integer)(t2 / 20)
            tempHash ← currentHash
            for k ← 1 to t3 - 1
              do
                tempHash ← tempHash - 20
                INSERT INTO TABLE(tempHash, NULL, NULL)
            tempHash ← tempHash - 20
            INSERT INTO TABLE(tempHash, datetime, frequency)
            minHash ← tempHash
  else
    currentHash ← minHash ← newHash
    nextHash ← currentHash + 20
    INSERT INTO TABLE(currentHash, datetime, frequency)
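The core idea of the algorithm, snapping each packet to the nearest 20 ms slot and creating empty rows for skipped slots, can be illustrated with the simplified Python sketch below. It is not a line-by-line translation of the algorithm: it keeps one sensor's rows in an in-memory dictionary instead of a MySQL table, and nearest_slot() and SlotTable are hypothetical names used only for this illustration.

```python
SLOT_MS = 20  # one table row every 20 ms


def nearest_slot(anchor: int, new_hash: int) -> int:
    """Round new_hash to the nearest slot on the 20 ms grid through anchor.

    A remainder of 10 ms or less rounds towards the anchor and a larger
    remainder rounds away from it, mirroring the tmod <= 10 test.
    """
    diff = abs(new_hash - anchor)
    rem = diff % SLOT_MS
    steps = (diff - rem) // SLOT_MS + (1 if rem > SLOT_MS // 2 else 0)
    return anchor + steps * SLOT_MS if new_hash >= anchor else anchor - steps * SLOT_MS


class SlotTable:
    """In-memory stand-in for one sensor's columns of the sensor_data table."""

    def __init__(self):
        self.rows = {}            # slot hash -> (datetime, frequency), or None for a NULL row
        self.min_hash = None
        self.current_hash = None

    def insert_packet(self, new_hash, date_time, frequency):
        if self.current_hash is None:      # the first packet defines the grid
            self.min_hash = self.current_hash = new_hash
            self.rows[new_hash] = (date_time, frequency)
            return new_hash
        slot = nearest_slot(self.current_hash, new_hash)
        # Create NULL rows for slots skipped beyond the newest or the oldest row.
        for h in range(self.current_hash + SLOT_MS, slot, SLOT_MS):
            self.rows.setdefault(h, None)
        for h in range(self.min_hash - SLOT_MS, slot, -SLOT_MS):
            self.rows.setdefault(h, None)
        self.rows[slot] = (date_time, frequency)   # insert into, or update, the slot
        self.current_hash = max(self.current_hash, slot)
        self.min_hash = min(self.min_hash, slot)
        return slot
```

A late packet that falls between min_hash and current_hash fills the existing row for its slot, which corresponds to the UPDATE branches of the pseudocode.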


0.5 Data Display

0.5.1 Requirement

Graphical display of frequency values on the web.

0.5.2 Current Approach

The frequencies stored in the files are plotted offline in Matlab. The graph below was generated using Matlab in this way.

0.5.3 Possible Approaches

1. Java Applets
2. JPGraph
3. PHP
4. Asp.net

0.5.4 Our Approach

We have chosen PHP and the GD library to plot graphs of the frequency data on the web, as it is open source and allows dynamic graphs to be built. It also has less overhead compared to other web designing techniques.

Server Requirements
a) Apache server
b) PHP (with GD)
c) MySQL


Steps involved in Display:

1. In the first step, a fixed-size graph (image) is created for the web browser using the following functions.

ImageCreate (int $width, int $height)
Returns an image identifier representing a blank image of the specified size. In addition to ImageCreate(), which creates the graph, other functions are used to fill in the graph, such as drawing the X-axis and Y-axis lines, writing the axis names and defining the color values.

ImageLine (resource $image, int $x1, int $y1, int $x2, int $y2, int $color)
Draws a line between the points ($x1,$y1) and ($x2,$y2) in the color specified by $color.

ImageString (resource $image, int $font, int $x, int $y, string $string, int $color)
Writes text at the given coordinates of the graph. Here ($x,$y) is the point on the graph where $string is written.

ImageColorAllocate (resource $image, int $red, int $green, int $blue)
Returns a color identifier representing the color composed of the given RGB components. Colors are required to plot a line or write a string on the graph; ImageColorAllocate() is used to associate a color with these objects.

2. After the first step, a blank graph has been created and the axes drawn. A PHP-MySQL database connection is then established. We use $con = mysql_connect('localhost', 'username', 'password'); mysql_select_db('database name', $con);

3. In the third step we run MySQL queries to fetch the records from the table sensor_data. A view named display is created that fetches the last 10 seconds of records from the table sensor_data. There are also queries to obtain the minimum and maximum frequency values of the records obtained from the view. These are required to decide the limits of the Y-axis on each page refresh.


4. The fourth step is the actual plotting of the frequency values, with time in seconds on the X-axis and frequency on the Y-axis. On each page refresh, which happens every second, a new record set is fetched from the sensor_data table. The fetched data is plotted dynamically by adjusting the X and Y axes. There are two files, WAFMES.html and graph.php. The WAFMES.html page calls graph.php every second on page refresh. Each new second's values are plotted from right to left. If the frequencies change suddenly, this is seen on the display as spikes.

5. In the fifth step we handle exceptions and errors. If any data source fails and its data does not arrive at the server, we raise a flag. This is done by displaying the message “X Currently Off”, where X is the name of the data source. Null values between rows are also handled at display time. These null values in the database table may be due to UDP packet loss, transmission delay or problems in packet generation at the sensor side. Plotted as they are, these null values appear as sudden spikes on the graphs. To overcome this problem, we substitute the previous frequency value for a null at display time only; the actual entry in the database table is left intact. However, if there are continuous null values for a period of 10 seconds, we consider the sensor to be down and raise the flag message.
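The null handling described in step 5 can be sketched as follows. This is a Python illustration rather than the report's graph.php: the function names are hypothetical, the values are assumed to be in chronological order with None standing for a database NULL, and 10 seconds is taken to mean 500 consecutive rows at one row per 20 ms.

```python
ROWS_PER_SECOND = 50                    # one row every 20 ms
OFF_AFTER_ROWS = 10 * ROWS_PER_SECOND   # 10 s of continuous NULLs


def fill_for_display(values):
    """Replace each None with the previous non-None value (for display only)."""
    filled, previous = [], None
    for value in values:
        if value is not None:
            previous = value
        filled.append(previous)   # a leading None stays None: nothing to copy yet
    return filled


def sensor_is_off(values, limit: int = OFF_AFTER_ROWS) -> bool:
    """True if the series contains at least `limit` consecutive None entries."""
    run = 0
    for value in values:
        run = run + 1 if value is None else 0
        if run >= limit:
            return True
    return False


def status_message(name, values):
    """Return the flag text for a sensor that is considered down, else None."""
    return f"{name} Currently Off" if sensor_is_off(values) else None
```

Because fill_for_display() returns a new list, the stored values are left untouched, in line with the requirement that the database entries remain intact.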

0.6 Handling database issues

Since approximately 4.5 million rows are inserted into the ‘sensor_data’ table every day and its maximum limit is 5.0 billion records, we need to regularly move records out of the table to make space for new data. For this, the ‘Backup.pl’ script is run daily. The job is scheduled as a cron job. It runs at 1 am every day, moves the previous day's entries from the table to a file, compresses the file with bzip2 and stores it in the folder sensor_data_backup. The system date is part of the filename. After the day's backup has been taken, those rows are deleted from the table sensor_data.
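A small sketch of the backup step is given below. It is a Python stand-in for Backup.pl, not the script itself: the function names, the CSV layout and the exact file-name pattern are assumptions, and the deletion of the backed-up rows from the table is not shown. The first function also checks the row rate: at one row per 20 ms a day yields 4,320,000 rows, which is in line with the figure of approximately 4.5 million quoted above.

```python
import bz2
import csv
import io
from datetime import date
from pathlib import Path


def rows_per_day(interval_ms: int = 20) -> int:
    """Rows created per day when one row is stored per interval."""
    return 24 * 60 * 60 * 1000 // interval_ms


def backup_rows(rows, day: date, folder: Path) -> Path:
    """Write rows as CSV, compress with bzip2 and name the file after the date."""
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"sensor_data_{day.isoformat()}.csv.bz2"  # hypothetical naming scheme
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    target.write_bytes(bz2.compress(buffer.getvalue().encode("utf-8")))
    return target
```

A call such as `backup_rows(rows, date(2010, 7, 11), Path("sensor_data_backup"))` would leave one compressed file per day in the backup folder.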


0.6.1 Running the program

Step 1

A shell script is provided so that, when a new sensor needs to be added, only its name has to be added to a text file (the names must be in the same sequence as in the PHP file, and the PHP file must also be updated to reflect the change).

First run ./setup.sh to create the database, then ./run.sh to run the server.

Otherwise, make the database ready manually. This needs to be done only the first time.

CREATE DATABASE WAFMES;
USE WAFMES;

CREATE TABLE sensor_data (
    hash decimal(14,0) NOT NULL,
    surat_time varchar(25),
    surat_value float,
    sangli_time varchar(25),
    sangli_value float,
    ahm_time varchar(25),
    ahm_value float,
    mum_time varchar(25),
    mum_value float,
    pune_time varchar(25),
    pune_value float,
    bhu_time varchar(25),
    bhu_value float,
    PRIMARY KEY(hash)
);

CREATE TABLE Log_Error (
    sensor_name varchar(10),
    time varchar(30),
    freq varchar(10)
);

CREATE VIEW display AS
    SELECT * FROM sensor_data
    WHERE hash >= ((SELECT max(hash) FROM sensor_data) - 12000)
      AND hash <= ((SELECT max(hash) FROM sensor_data) - 2000)
    ORDER BY hash DESC;
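The display view selects a 10-second window (12000 ms down to 2000 ms behind the newest hash), so the most recent 2 seconds are left out of the plot. The sketch below reproduces that window with Python's built-in sqlite3 module purely as a stand-in for MySQL; the single-sensor table and the helper names are assumptions made for the illustration.

```python
import sqlite3


def build_demo_db(latest_hash: int, rows: int) -> sqlite3.Connection:
    """Create an in-memory table with `rows` consecutive 20 ms slots and the display view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sensor_data (hash INTEGER PRIMARY KEY, surat_value REAL)")
    conn.executemany(
        "INSERT INTO sensor_data VALUES (?, ?)",
        [(latest_hash - 20 * i, 50.0) for i in range(rows)],
    )
    conn.execute(
        "CREATE VIEW display AS "
        "SELECT * FROM sensor_data "
        "WHERE hash >= (SELECT max(hash) FROM sensor_data) - 12000 "
        "AND hash <= (SELECT max(hash) FROM sensor_data) - 2000 "
        "ORDER BY hash DESC"
    )
    return conn


def recent_window(conn: sqlite3.Connection):
    """Return the hashes selected by the view, newest first."""
    return [row[0] for row in conn.execute("SELECT hash FROM display")]
```

With 20 seconds of data in the table, the view returns 501 rows: every 20 ms slot from 12 seconds to 2 seconds behind the newest hash, both ends included.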


Make an entry in the crontab so that Backup.pl is run daily. The script fetches the last day's entries (approx. 4.5 million) from the database table (sensor_data), writes them to a file and compresses it, using the date as the file name.

Step 2

When a new sensor is added, we need to alter the table and create columns for the new sensor. For example, if x is a newly established sensor location, make the following change to the table.

ALTER TABLE sensor_data ADD COLUMN x_time varchar(25), ADD COLUMN x_value float;

The change also needs to be reflected in server.c and graph.php.

Step 3

Compile the server program as ./compile.sh

Step 4

Run the server program as ./run.sh


0.7 Literature Survey

0.7.1 MYSQL

The MySQL software delivers a very fast, multi-threaded, multi-user and robust SQL (Structured Query Language) database server. The MySQL software is dual licensed. Users can choose to use the MySQL software as an Open Source product under the terms of the GNU General Public License (http://www.fsf.org/licenses/) or can purchase a standard commercial license from Oracle. See http://www.mysql.com/company/legal/licensing/ for more information on the MySQL licensing policies.

It is one of the most scalable (up to 5 billion entries) and fastest databases available, and it meets our requirements.

0.7.2 PHP

PHP (PHP: Hypertext Preprocessor) is a language for dynamic web development. PHP is an interpreted language and is executed on the server side (just like CGI or ASP scripts), in contrast to scripts executed on the client side (a Javascript or a Java applet executes on your computer). It is usually associated with Apache and the MySQL database. We use PHP with the GD library to create images and graphs for the web browser.


0.8 Future Scope

0.8.1 Interpolation of Frequencies

We are currently storing the frequency values from multiple sensors in 20-millisecond time slots. For example, suppose a packet arrives with the data 'surat,2010-07-11 10 12 32.320,47.8'. Its calculated hash is 1278843152320. It is stored in the database table against the hash (which is the primary key) 1278843152330. Instead of storing the received frequency against the hash 1278843152330 (2010-07-11 10 12 32.330), we can interpolate between the current frequency value and the previous frequency value of the sensor and store the result against 1278843152330. For that we need to keep the last frequency value of each sensor.
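One possible form of this interpolation is a straight line through the previous and the current reading, evaluated at the slot time. The Python sketch below shows that calculation; the function name and the sample values are hypothetical, and the report does not specify which interpolation method would be used.

```python
def interpolate_to_slot(prev_hash, prev_freq, new_hash, new_freq, slot_hash):
    """Linearly interpolate the frequency at slot_hash between two readings.

    Hashes are times in milliseconds. If slot_hash lies outside the two
    readings, the same formula extrapolates along the line.
    """
    if new_hash == prev_hash:
        return new_freq
    fraction = (slot_hash - prev_hash) / (new_hash - prev_hash)
    return prev_freq + fraction * (new_freq - prev_freq)
```

For instance, with a previous reading of 50.00 at 1000 ms and a new reading of 50.12 at 1024 ms, the value stored for the slot at 1020 ms would be about 50.10.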

0.8.2 Daily triggers

Alarms and triggers need to be generated for variations in the frequencies between the sensors over a particular time interval.

0.8.3 Threading

The current program runs sequentially, so packets may be lost when the number of sensors increases. To avoid such problems, threading can be used to process packets in parallel. We can create a thread for each incoming packet and perform the constraint check and database insertion in that thread.
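The thread-per-packet idea can be sketched as below. The example uses Python's threading module as a stand-in for the pthreads library cited in the references, and process_packets() and handle are hypothetical names. In a real server, any shared insertion state would need similar locking to the results list shown here.

```python
import threading


def process_packets(messages, handle):
    """Run handle(message) in its own thread for every message and wait for all of them."""
    results = []
    lock = threading.Lock()

    def worker(message):
        outcome = handle(message)
        with lock:                 # protect the shared results list
            results.append(outcome)

    threads = [threading.Thread(target=worker, args=(m,)) for m in messages]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The order of the results is not guaranteed, since the threads may finish in any order.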


0.9 References

1. For MySQL: http://dev.mysql.com/doc/refman/5.1/en/index.html
2. For PHP: http://php.net/manual/en/book.image.php
3. For threads: https://computing.llnl.gov/tutorials/pthreads
