ASSIGNMENT TOP SHEET
Faculty of Creative Arts, Technologies & Science
Department of Computer Science & Technology
Student Ref. No: 1209310    Unit Code: CIS017-3
Unit Name: Undergraduate Project
Deadline for Submission(s): Tuesday 8th May 2015
Student's Surname: HAGERTY
Student's Forename: JAMIE
Unit Leader's Name: Enjie Liu
Supervisor: Dr Ingo Frommholz
Assignment Details:
Assessment 2: Final Report
Instructions to Student: Please note: Work presented in an assessment must be your own. Plagiarism is where a student copies work from another source, published or unpublished (including the work of another student) and fails to acknowledge the influence of another's work or to attribute quotes to the author. Plagiarism is an academic offence and the penalty can be serious. The University's policies relating to Plagiarism can be found in the regulations at http://www.beds.ac.uk/aboutus/quality/regulations. To detect possible plagiarism we may submit your work to the national plagiarism detection facility. This searches the Internet and an extensive database of reference material, including other students' work, to identify matches. Once your work has been submitted to the detection service it will be stored electronically in a database and compared against work submitted from this and other universities. It will therefore be necessary to take electronic copies of your materials for transmission, storage and comparison purposes and for the operational back-up process. This material will be stored in this manner indefinitely.
I have read the above information and I confirm that this work is my own and that it may be processed and stored in the manner described.
Signature (Print Name): ........................................................... Date: .........................................
Extension deadline CAAS agrees that the assignment may be submitted ____ days after the deadline and should be marked without penalty.
CAAS confirmation...................................................................................................................
Please leave sufficient time to meet this deadline and do not leave the handing-in of assignments to the last minute. You need to allow time for any system problems or other issues.
Jamie Hagerty
1209310
Tweet sentiment analysis using dataflow acceleration
BSc (Hons) Computer Science
Undergraduate Thesis Report
Department of Computer Science and Technology
University of Bedfordshire
Dr Ingo Frommholz
2014/2015
Abstract
While ‘big data’ continues to grow ever larger and processing times and costs
become problematic in the computing world, FPGA technology appears to
provide a solution that can reduce processing times and costs while also
reducing environmental impact.
Social media platforms such as Twitter are sources of vast amounts of data
relating to how users throughout the world may feel about certain topics, ideas,
or people. This paper investigates how Maxeler’s FPGA dataflow solutions may
help in achieving reduced processing times with regards to sentiment analysis of
the ‘Twittersphere’.
To approach this task, an artefact to compare execution times of a sentiment
analysis algorithm upon a large tweet dataset on CPU and FPGA
implementations is proposed to produce benchmarks between the technologies.
Results show that the FPGA implementation can process the selected
sentiment analysis algorithm approximately 3.07 times faster than the
standalone CPU implementation.
In conclusion, FPGA clearly shows its benefits in terms of processing speed
in the area of sentiment analysis. In all, it demonstrates that FPGA technology
could provide significant improvements in real-world applications in which
reduced processing times give large advantages.
Acknowledgements
I would like to acknowledge and express my deepest appreciation to Dr. Ingo
Frommholz who introduced me to the concept of FPGA computing and offered
me support, guidance and encouragement throughout the development of my
final year project. Without him this project would not have been undertaken.
Furthermore I would like to acknowledge with much appreciation Oliver Brown
and the staff at Maxeler that helped in making the necessary equipment required
available to me to undertake my project.
Dedication
I dedicate my dissertation work to my family and the many friends I have met
throughout my university experience who have supported me along the way.
Keywords
FPGA
DFE
MAXELER
DATAFLOW
SENTIMENT ANALYSIS
TWEETS
Table of Figures
Figure 4.1 Computing With Dataflow Core.........................................................14
Figure 4.2 Computing With A Control Flow Core..............................................15
Figure 4.3 Tweet Class Diagram..........................................................................18
Figure 4.4 ANEW Class Diagram........................................................................18
Figure 4.5 Program Flow Diagram......................................................................19
Figure 4.6 Russell’s Emotional Circumplex........................................................24
Figure 4.7 CPU Tweet Analyse Loop......................................................25
Figure 4.8 Building DFE Stream.........................................................26
Figure 4.9 Execution Order & Data Dependence for Loop-Tiled Row-Sum,
With C of 4............................................................................28
Figure 5.1 Benchmark Results...........................................................30
List of Tables
Table 3.1 Main Requirements..............................................................................11
Table 3.2 Results Requirements...........................................................................12
Table 4.1 ANEW Word Table..............................................................................22
Table 4.2 Stage 1: Decomposition For VALENCE.............................................22
Table 4.3 Stage 1: Decomposition For AROUSAL.............................................22
Table 4.4 Stage 2: Combining..............................................................................23
Table 4.5 Stage 3: Combined Mean & Standard Deviation.................................23
Table 5.1 Data Set Statistics.................................................................................29
Table 5.2 Magnitude of Acceleration DFE Implementation Provides.................31
Table of Contents
Abstract..................................................................................................................i
Acknowledgements...............................................................................................ii
Dedication.............................................................................................................ii
Keywords..............................................................................................................ii
Table of Figures...................................................................................................iii
List of Tables.......................................................................................................iii
1 Introduction..................................................................................................1
1.1 Inspiration.......................................................................................................1
1.2 Project and Artefact.......................................................................................1
1.3 Report Structure.............................................................................................2
1.3.1 Literature Review.........................................................................................2
1.3.2 Requirements................................................................................................2
1.3.3 Methods & Design.......................................................................................3
1.3.4 Results..........................................................................................................3
1.3.5 Conclusion....................................................................................................3
2 Literature review..........................................................................................4
2.1 Introducing Sentiment Analysis: Why?.......................................................4
2.2 Approaching Sentiment Analysis..................................................................5
2.3 Sentiment Dictionary Oriented Approaches................................................6
2.4 Introducing FPGA: Benefits & Potential.....................................................7
2.5 FPGA In Information Filtering.....................................................................7
2.6 Conclusion.......................................................................................................8
3 Requirements..............................................................................................10
3.1 Introduction..................................................................................................10
3.2 Prerequisites..................................................................................................10
3.2.1 Twitter Data................................................................................................10
3.2.2 Sentiment Dictionary..................................................................................10
3.3 Implementation.............................................................................................11
3.4 Results & Evaluation....................................................................................12
4 Methods & Design......................................................................................13
4.1 Introduction..................................................................................................13
4.2 Maxeler Dataflow Computing.....................................................................13
4.3 Development..................................................................................................16
4.3.1 Tools, Languages & Libraries....................................................................16
4.3.2 Development Model...................................................................................16
4.3.3 Testing........................................................................................................16
4.3.4 Data Structures & Dataflow.......................................................................18
4.3.5 Sentiment Analysis Algorithm...................................................................20
4.3.6 Interpreting Valence & Arousal Scores.....................................................23
4.3.7 DFE Algorithm Implementation Differences.............................................25
5 Results.........................................................................................................29
5.1 Measuring Results........................................................................................29
5.2 Data Sets........................................................................................................29
5.3 Benchmarks...................................................................................................30
6 Conclusion...................................................................................................32
7 References...................................................................................................34
1 Introduction
1.1 Inspiration
Social media platforms such as Twitter are sources of vast amounts of data
relating to how users throughout the world may feel about certain topics, ideas or
people. Throughout the history of Twitter the website has gone from having 500
tweets per day in 2007 to 500 million per day in 2013 (Twitter Blog, 2013).
Given the scale and availability of data in the form of tweets and their
associated metadata, the ‘Twittersphere’ (postings made on the social media
website Twitter considered collectively) has become the subject of many forms
of data processing (i.e. sentiment analysis) to connect and analyse tweets to
generate new information relating to specific searches. As this data is ever
growing, technologies such as FPGA (Field-Programmable Gate Array) could be
applied to process algorithms within faster time scales at a fraction of the cost.
1.2 Project and Artefact
My project aims to re-implement an existing sentiment analysis algorithm using
FPGA-based DFEs (Data Flow Engines) to accelerate sentiment analysis over a
large collection of tweets, and to benchmark this against the same task without
dataflow acceleration to show the potential benefits this technology may provide.
The primary objective of my project is to show how DFEs can be applied to this area of study and to produce results I can analyse to compare the implementations:
a) The magnitude of acceleration in algorithm processing times
b) How the sample size of tweets affects a)
The potential impact of a) will be analysed to show how using DFEs within
this field may allow for advancements in real-world applications that make use
of sentiment analysis.
To do this and realize my artefact I will:
a) Implement code to execute a sentiment analysis algorithm on a collection
of tweets and perform analysis on processing speeds when run using
solely conventional x86 CPUs.
b) Using the same implementation, adjust the algorithm component of the
code to make use of DFE technology and perform the same analysis to
benchmark the two sets of data against each other.
c) Produce a report detailing how the investigation was carried out and
provide statistical data to show results from the benchmarks and provide
visualizations in the form of graphs.
1.3 Report Structure
1.3.1 Literature Review
The purpose of the literature review is to further investigate what research and
work has already been done that relates to topics within my project. This review
details further background information and history on sentiment analysis and
applications of FPGA in information search and classification. Furthermore, this
review concludes with my assessment of the current state of the topics reviewed.
1.3.2 Requirements
The requirements section briefly details prerequisites and other requirements
needed to materialize my artefact and achieve my project aims. Other
requirements detail implementation, result gathering and evaluation.
1.3.3 Methods & Design
The methods and design section details how I set out to achieve my artefact
objectives. It explains the Maxeler dataflow computing concept used to produce
a DFE accelerated program, tools and languages used, data structures and the
flow of data throughout implementations, and it explains how the benchmarked
sentiment analysis algorithm is performed and interpreted.
One key area of this section details differences between CPU and DFE
implementations.
1.3.4 Results
The results section explains how implementation execution times were recorded
and displays the benchmarks between implementations, including the magnitude
of acceleration across datasets.
1.3.5 Conclusion
The conclusion summarises my findings and how they respond to the brief
explained in the introduction. In addition, it covers my personal development
throughout the project and what I feel the project has contributed to its respective
area, as well as what I would have done differently with more time and the
potential future the project may hold.
2 Literature review
2.1 Introducing Sentiment Analysis: Why?
There has always been a demand for opinion when it comes to any form of
decision-making. User generated content from websites such as twitter can carry
valuable information in relation to products, services, topics, ideas and people.
Twitter is a form of ‘big data’ and organisations, businesses and researchers are
keen to tackle the automation of opinion mining upon this platform to exploit this
extensive source of data. Using sentiment analysis, another layer of information
can be added to each tweet for analysis – its ‘mood’.
In the year 2013/14, the UK Office for National Statistics published a report
on the personal well-being of the population of the United Kingdom
(ons.gov.uk, 2013); this report was based on a survey of 165,000 people, with 4
questions on the topic of well-being answered on a scale of 1-10.
The requirement for opinionated data analysis is of clear importance to the UK
government and can involve large data samples. The government is willing to
invest time and money into information gathering using surveys on this scale for
their information needs.
Thus, how would the results compare if reliable sentiment analysis was run
upon one year’s worth of tweets with geo-location based within the UK?
Providing a reliable outcome could be found, how much time would it take to
perform? And how much would it cost? The potential of real-world applications
of reliable sentiment analysis could expedite processes, reduce costs and change
how businesses and other entities may function.
2.2 Approaching Sentiment Analysis
Research by Arora and Srinivasa (2014) aims to provide a schematic framework
for researchers to understand the landscape of sentiment mining by proposing
faceted classification and addressing issues faced within the topic. So far, efforts
on research within sentiment analysis and mining are described as “fragmented
and disparate” (Arora and Srinivasa, 2014).
Four types of opinion classifications exist within this schematic framework
and sentiments are described as being either positive, negative, neutral or
constructive. Approaches to brand opinions use the understanding of domain-
specific opinion words, their respective polarity and language specific opinion-
rules; opinions are challenging to extract due to the “latent context in which an
opinion is expressed” (Arora and Srinivasa, 2014).
Attempts at classification through the realm of sentiment analysis have had
mixed degrees of success and this is due to issues such as cross-domain
classification, sarcasm, spam and “the need for an annotated training dataset”
(Arora and Srinivasa, 2014).
Insights mentioned that may help solve some of the issues described above
within sentiment analysis include:
- Identifying the purpose or background motivation of an opinion, to provide
insight into the credibility of any sentiments expressed and help avoid
accounting for opinion spamming.
- Further classifying opinions as direct, indirect or comparative – indirect
opinions are implied as idioms or can express sarcasm; identifying these and
processing them accordingly would help reduce incorrect sentiment analysis.
Current methods discussed to extract opinion structure include machine
learning, rules-based approaches and statistical algorithms with a lexicon
(language specific dictionary of words with predefined polarity). My research
aims to use a statistical algorithm along with a lexicon to run sentiment analysis.
However, addressing problems described by this approach are stated and are still
open to solution:
- Lexicons are language specific.
- Word polarity is domain specific.
- Polarity can also be context specific.
2.3 Sentiment Dictionary Oriented Approaches
Twitter sentiment analysis is not a new idea, and implementations already exist.
Ramaswamy (2011) has produced a thesis on visualizing Twitter sentiment
based on keyword search. I plan to implement the same method of sentiment
analysis used in his work.
Multiple computational methods such as Bayesian networks and support
vector machines can be used to perform concept-level analysis of natural
language text. Traditional approaches like these require “sufficient high-quality
text to allow for accurate natural language evaluations” (Healey and
Ramaswamy, 2011) and in arguing that this level of requirement of text is not
necessarily available in short text snippets like tweets, an alternative method is
proposed: using dictionaries that pre-define the sentiment of a collection of
words along a set of emotional dimensions.
The sentiment dictionary used within Ramaswamy’s (2011) thesis is the
ANEW (Affective Norms for English Words) dictionary; it provides measures of
valence, arousal and dominance for 1034 English words which previous research
identified as good candidates to convey emotion. The ANEW dictionary rates
words on a 9-point scale (1-9) and was constructed by asking
volunteers to read a body of text and provide a rating along each dimension for
each occurrence of an ANEW related word. ANEW words within the text are
combined to form an overall mean rating and standard deviation of ratings.
Emotional models have been proposed within the field of psychology to
define and compare emotional states using valence and arousal. James Russell
(1980) proposed a model which maps valence and arousal to build an 'emotional
circumplex of affect', with 28 emotional states positioned accordingly; valence
runs across the horizontal axis and arousal on the vertical axis. Ramaswamy
(2011) uses this model to plot measurements of a tweet's valence and arousal
against Russell’s emotional circumplex of affect.
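To make this mapping concrete, the sketch below places a tweet's scores on the circumplex. It is a sketch under my own assumptions, not code from Ramaswamy's thesis: I assume scores stay on the 1-9 ANEW scale, so the neutral centre of the circumplex is taken to be (5, 5), and the four quadrant labels are illustrative stand-ins for Russell's 28 plotted states.

```cpp
#include <cmath>
#include <string>

// Angle of a (valence, arousal) pair on the circumplex, with the neutral
// point of the 1-9 ANEW scale (5, 5) taken as the origin (assumption).
double circumplexAngleDeg(double valence, double arousal) {
    const double kPi = 3.14159265358979323846;
    double deg = std::atan2(arousal - 5.0, valence - 5.0) * 180.0 / kPi;
    return deg < 0.0 ? deg + 360.0 : deg;  // normalise to [0, 360)
}

// Coarse quadrant summaries only: Russell's model positions 28 distinct
// emotional states around the circle; these labels are purely illustrative.
std::string quadrantLabel(double valence, double arousal) {
    bool pleasant = valence >= 5.0;
    bool activated = arousal >= 5.0;
    if (pleasant && activated)  return "excited/happy";
    if (!pleasant && activated) return "distressed/angry";
    if (!pleasant)              return "depressed/bored";
    return "relaxed/calm";
}
```

A tweet scoring (6, 6), for instance, sits at 45 degrees, in the pleasant/activated quadrant.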
Implementation of the sentiment algorithm on tweets is relatively
straightforward. For each tweet, it first involves capturing the mean valence
and arousal values for each ANEW word, along with their respective standard
deviations. Tweets with fewer than 2 ANEW words are regarded as having
insufficient data for sentiment analysis. Each word's mean is weighted by the
probability of the word's rating falling exactly at the mean value, using the
probability density function of a normal distribution with the word's
respective standard deviation. To obtain overall valence and arousal values
for the tweet, these weighted means are then averaged. (Ramaswamy, 2011)
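The computation described above can be sketched as follows. This is my own reading of Ramaswamy's description, not his code: the struct and function names are mine, and I take "averaged" to mean a weighted average (sum of weighted means divided by the sum of weights).

```cpp
#include <cmath>
#include <vector>

// Mean rating and standard deviation of one ANEW word along a single
// dimension (valence or arousal), as given by the dictionary.
struct AnewScore {
    double mean;
    double sd;
};

// Height of the normal probability density function at its own mean,
// 1 / (sd * sqrt(2*pi)): words whose raters agreed closely (small sd)
// receive a larger weight.
double weightFromSd(double sd) {
    const double kPi = 3.14159265358979323846;
    return 1.0 / (sd * std::sqrt(2.0 * kPi));
}

// Combine the ANEW words found in one tweet into an overall score for
// one dimension via a weighted average. Returns false for tweets with
// fewer than two ANEW words, which are skipped as insufficient data.
bool tweetScore(const std::vector<AnewScore>& words, double& out) {
    if (words.size() < 2) return false;
    double weightedSum = 0.0;
    double weightTotal = 0.0;
    for (const AnewScore& w : words) {
        double weight = weightFromSd(w.sd);
        weightedSum += weight * w.mean;
        weightTotal += weight;
    }
    out = weightedSum / weightTotal;
    return true;
}
```

The same routine is run twice per tweet, once over the valence values and once over the arousal values.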
2.4 Introducing FPGA: Benefits & Potential
FPGA solutions are becoming ever more prevalent for a myriad of tasks relating
to intense data processing and for good reason: the benefits of using such
technology include massive process acceleration and lower computational power
requirement costs.
In 2012, the bank JPMorgan adopted this technology and applied it to its
risk-measuring algorithms, allowing simulations that once took hours to
complete to finish in just a few minutes (Groenfeldt,
2012). It's clear that FPGA has the potential to massively accelerate the speed of
large and intense data analysis. Carl Claunch (2011), vice president and analyst
at Gartner states that the true value of this comes from “enabling new levels of
performance, changing the user’s competitive dynamics or unlocking new market
opportunities”.
2.5 FPGA In Information Filtering
Researchers have evaluated the performance of using FPGAs within information
filtering over a series of experiments comparing FPGA implementations against
an optimised reference implementation (Vanderbauwhede, W. Azzopardi, L. and
Moadeli, M., 2009). In this paper, the researchers used a collection of document
datasets which each differed in numbers of documents they contained and the
average document length. Upon running various profile filters on each dataset
their results indicated that their FPGA implementation was between 8.3 and
20.8 times faster than the reference implementation. Additionally, the
standard implementation became slower as profile size increased while the
FPGA implementation remained relatively constant due to pipelining profile
scoring, keeping latency constant.
The conclusion of this paper further details benefits to use of FPGAs; they
offer this processing speed at a small fraction of the power a CPU-only solution
would use. Power consumption within data centres is a growing issue due to the
cost of cooling and the power consumption of computation. The paper suggests that
FPGAs could tackle the challenge of developing environmentally friendly
systems.
More recent research shows that use of hybrid CPU-FPGA systems for
streaming document classification can achieve throughputs of 10Gb/s in real time
and that by moving the document parser from CPU to FPGA researchers aim to
achieve speeds of 100Gb/s. (Vanderbauwhede, W., Frolov, A., Chalamalasetti, S.
R., and Margala, M., 2014).
2.6 Conclusion
This review covers much of the ground I will have to explore in my own
research into using FPGAs alongside sentiment analysis upon large datasets. I
feel the points made
throughout this review show that the topic of sentiment analysis and mining is of
great interest to many people and entities as it has the potential to unravel new
ways of interpreting data from search and have implications of a positive nature
to change how said entities may function with regards to analysis.
Disparity between approaches to achieving reliable sentiment analysis on
different styles of text-based information, and issues within certain
methodologies remaining unanswered, appear to suggest that this area of
research is still in its early stages. However, research in the area seems to
be building towards a framework that tackles all aspects of sentiment analysis
which need to be considered.
FPGA implementations make clear their advantage when it comes to
accelerating algorithms and performing information filtering. FPGA also
displays a new route to achieving ‘green systems’, as FPGAs require
comparatively little power and cooling. Research in this area appears to be focused
on what FPGA can be applied to and measuring how effective it can be when
applied to various computational problems. It’s shown that FPGAs have an
additional advantage in the fact that they can process large streams of data in real
time, widening applications of the technology.
My own research should provide additional insight into helping make real-
world applications of sentiment analysis upon large sets of data relating to social
media reasonable in terms of processing speed and efficiency. In demonstrating
this, FPGA in reliable sentiment analysis could remove limitations in terms of
time scales and costs, and enable proposals which rely on speed or real-time
analysis.
3 Requirements
3.1 Introduction
As with any development process, I first considered the tasks required to achieve
a working implementation and compiled these into a requirements list. This list
would function as an overview of the fundamental components for building the
artefact.
3.2 Prerequisites
Fundamentally, my project will perform sentiment analysis upon millions of
tweets using different approaches towards computation. I would need to source
bulk Twitter data and a sentiment dictionary for this to be possible.
3.2.1 Twitter Data
Twitter data is the subject of analysis. To perform a meaningful and reliable
comparison between implementations, a large dataset of tweets is required. This
dataset was obtained from the Tweets2011 corpus, part of the TREC microblog
track, and contained just under 12 million tweets compiled over a period of
2 weeks.
3.2.2 Sentiment Dictionary
The sentiment dictionary provides the method of calculation. To achieve my
route of sentiment analysis a sentiment dictionary (lexicon) was required. I chose
to use the same sentiment dictionary as Ramaswamy (2011), the ‘Affective
Norms for English Words’ (ANEW). This dictionary provides arousal and
valence data for its 1034 unique words.
3.3 Implementation
Implementation requirements span the essential components of program flow to
generate results for analysis. I have divided implementation requirements up into
four sections, which will be further elaborated on in the design section of this
report.
Table 3.1 Main Requirements

ID  Requirement         Description
1   Data IO             The implementation must be able to read from JSON
                        formatted files to access tweets and ANEW sentiment
                        dictionary data.
2   Tweet Processing    Tweets must be tokenized and stemmed so that they can
                        be searched for in the ANEW dictionary.
3   Sentiment Analysis  Processed tweets will be checked against the ANEW
                        dictionary. A mathematical formula is required to
                        compile ANEW word data for each tweet into an overall
                        score.
4   Benchmarking        Running times of each implementation need to be
                        measured and recorded (Linux time command).
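The tweet-processing requirement (tokenizing before dictionary lookup) can be sketched as below. This is an illustrative simplification of mine: the artefact additionally stems each token with the Oleander stemming library, which is omitted here.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split a tweet into lowercase alphabetic tokens so they can be looked up
// in the ANEW dictionary. Punctuation, digits and whitespace act as token
// separators. The real artefact also stems each token (Oleander library);
// stemming is left out of this sketch.
std::vector<std::string> tokenize(const std::string& tweet) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : tweet) {
        if (std::isalpha(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

For example, tokenizing "Happy days, happy NIGHTS!" yields the four tokens happy, days, happy, nights, ready for dictionary lookup.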
3.4 Results & Evaluation
Results specifically refer to the running times for each data set run on the tweet
sentiment analysis implementation. Results must adhere to specific requirements
so that a meaningful evaluation can be made.
Table 3.2 Results Requirements

ID  Requirement               Description
1   Generating results to     Benchmark results must be generated using both
    compare                   computational methods (conventional CPU and
                              dataflow acceleration) to allow for a comparison.
2   Multiple tweet data       Data sets of different sizes should be used to
    sets for analysis         generate a range of results and to see how
                              computational times scale.
3   Confidence of results     Generating results for each data set should be
                              repeated to provide a degree of confidence in
                              the results.
4   Evaluation of results     Provide a comparison between computational
                              methods for each set of results generated. Show
                              statistics and provide data representations
                              (graphs, tables).
4 Methods & Design
4.1 Introduction
In this section I will detail the design of my artefact and the methods used to
approach it. Additionally, concepts used within my design will be
explained. This section is written so that my work could be replicated.
4.2 Maxeler Dataflow Computing
Maxeler provides solutions aimed at tackling big data problems by exploiting the
concept of dataflow computing on FPGAs as opposed to using traditional CPU
‘control flow’ computing.
This concept allows for optimizing the movement of data in an application
and utilizing massive parallelism between thousands of ‘dataflow cores’ to
provide benefits in performance, space and power consumption. (Maxeler, 2015)
As seen in figure 4.1, ‘computing with dataflow cores’, data is streamed
from memory into the dataflow engine where each dataflow core acts as one
computational unit to perform operations and forward data to the next core or the
off-chip memory only once the chain of processing is complete. Instructions for
each program are described by the configuration file which maps the operations,
layout and connections of the dataflow engine.
In comparison, figure 4.2, ‘computing with control flow cores’, shows how
data and instructions are continuously passed between memory and processor
core as operations are performed.
This model is sequential and performance is limited by the latency of data
movement. (Maxeler, 2015)
In relation to my project artefact, I will implement a design which uses
only a control flow core architecture (traditional CPU) and a design which
utilizes both, with algorithm calculations being carried out on the dataflow core
architecture.
My project goal to compare processing times of sentiment analysis will
fundamentally be measuring the timing differences of the algorithm with and
without this dataflow architecture.
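To make the kernel side of this concrete, the sketch below shows the general shape of a MaxCompiler (maxJava) kernel, following the style of Maxeler's published tutorial examples. It is illustrative only, not the project's actual kernel: a kernel describes a dataflow graph in which input streams flow through arithmetic nodes (the dataflow cores of figure 4.1) to an output stream.

```java
// Illustrative MaxCompiler kernel sketch (not this project's actual code).
// Multiplies two input streams element by element; the multiply becomes a
// pipelined dataflow core on the chip, and data streams through it rather
// than being fetched instruction by instruction.
class WeightedMeanKernel extends Kernel {
    WeightedMeanKernel(KernelParameters parameters) {
        super(parameters);
        DFEVar mean   = io.input("mean",   dfeFloat(8, 24));  // single precision
        DFEVar weight = io.input("weight", dfeFloat(8, 24));
        io.output("weighted", mean * weight, dfeFloat(8, 24));
    }
}
```

The host (CPU) side is then responsible only for streaming data into the named inputs and collecting the output, which is the division of labour measured in this project.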
Figure 4.1 Computing With a Dataflow Core (Maxeler, 2015)
Figure 4.2 Computing With a Control Flow Core (Maxeler, 2015)
4.3 Development
4.3.1 Tools, Languages & Libraries
For development of my work I have used a multitude of tools, languages and
libraries, listed below, alongside their purpose.
MaxelerOS – System used to run Maxeler programs.
MaxCompiler – Developer environment for Maxeler programs.
C/C++ 11 – Used to develop artefact host code.
Java/maxJava – Used to program dataflow engines.
rapidJSON – JSON parser to read tweets and ANEW data.
Oleander stemming library – used to perform word stemming upon tweets.
4.3.2 Development Model
Throughout project development I opted to use the waterfall model primarily due
to its simplicity – each stage is specific and is easy to manage as these stages
have direct outputs. In addition, due to the simplified, small number of
requirements in this project, not much could be overlooked.
One downside of using this model was that I had no prior experience of
developing an FPGA application using Maxeler technologies. This led to
problems in the system design stage for Maxeler kernel development: going
back to change incorrect designs was often troublesome, as much of the
design had to be refactored, compile times could take upwards of 30 minutes,
and simulations of code would not always reproduce perfectly on the FPGA
hardware.
4.3.3 Testing
During implementation, code was developed in ‘blocks’ of functionality, each of
which I would test before moving on to the next. To verify the desired
functionality, I wrote in-line unit tests after each section, comparing function
results against expected output. For example, to ensure that the sentiment
analysis algorithm was working correctly, a small number of ‘development
tweets’, for which I had previously calculated sentiment analysis values, were
run on each code execution to check for discrepancies.
Debugging on FPGA hardware could sometimes prove difficult because a
running stream is hard to inspect, but MaxDebug provided a means of
inspecting stream statuses in running hardware executions.
4.3.4 Data Structures & Dataflow
Two of the main entities that needed to be represented in the artefact were tweets
and ANEW word entries. The artefact uses a Tweet and an ANEW class to manage
storage, manipulation and retrieval of entity data. Designing these structures in
an object-oriented fashion makes transfer and manipulation of entity data
simple; encapsulating entity data and methods keeps all information inside the
object.
Using this style of design, the main body of code need only focus upon
building, using and keeping track of objects. Figures 4.3 and 4.4 detail these
entities as class diagrams.
Figure 4.3 Tweet Class Diagram
Figure 4.4 ANEW Class Diagram
Figure 4.5 details a sea-level representation of
program flow from start to finish; each stage is
elaborated on below.
Stage 1: Building ANEW hashmap.
Each ANEW word found in the dictionary JSON is
built as an ANEW object. Each ANEW object is
inserted into a hashmap, with a key that matches the
stemmed dictionary word it represents.
Stage 2: Load & parse tweet JSON into memory.
rapidJSON parses and validates the tweet JSON
dump. It stores the JSON structure into memory for
later access.
Stage 3: Build tweet objects from JSON data.
Build all tweet objects from parsed JSON.
Stage 4: Perform word stemming & record
ANEW words.
Tweet text is tokenized by word and each word
token is stemmed and searched for in the ANEW
hashmap; if the word is found, it’s recorded into a
vector.
Stage 5: Run algorithm calculation.
Tweet objects with more than 2 ANEW words are
run through the algorithm to determine the tweet’s
overall valence and arousal ratings and standard
deviations. This stage is implemented differently for
the CPU and DFE builds.
Figure 4.5 Program
Flow Diagram
4.3.5 Sentiment Analysis Algorithm
To perform sentiment analysis upon a viable tweet (>2 ANEW words), arousal
and valence ratings for every ANEW word must be taken into account. This is
the part of the solution implemented on both the CPU and the DFE for benchmarking.
For every ANEW word found in a tweet considered viable for analysis,
valence and arousal means, standard deviations, and word frequencies (sample
sizes) are collected. These separate values are decomposed and combined, and
the overall mean and standard deviation are calculated. The mathematical
formulae used to do this consist of three stages, with each stage’s values being
used in the next:
Stage 1: Decomposition of mean and standard deviation for each ANEW
word.
Stage 2: Combining decomposed data.
Stage 3: Calculating the mean and standard deviation with the combined data.
4.3.5.1 Stage 1: Decomposition for Each ANEW Word Labelled as i
Step 1a:
Decomposing the mean to find sum of x values for word i
Step 1b:
Decomposing standard deviation to find sum of x2 values for word i
4.3.5.2 Stage 2: Combining Values for Each ANEW Word
Step 2a:
Combine n values (word frequencies) for all words
Step 2b:
Combine sum of x values for all words
Step 2c:
Combine the sum of x2 values for all words
4.3.5.3 Stage 3: Calculate Combined Mean & Standard Deviation Using
Combined Values
Step 3a:
Formula for finding the combined mean
Step 3b:
Formula for finding the combined standard deviation
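The formulae themselves appear as images in the original report. Reconstructed from the worked example in Tables 4.3–4.7 and the kernel code in Appendix F, for ANEW word i with mean μ_i, standard deviation σ_i and frequency n_i, they are:

```latex
% Stage 1: decomposition for each ANEW word i
\sum x_i = \mu_i n_i \qquad \text{(Step 1a)}
\qquad
\sum x_i^2 = \sigma_i^2 (n_i - 1) + \frac{\left(\sum x_i\right)^2}{n_i} \qquad \text{(Step 1b)}

% Stage 2: combining across all words found in the tweet
N = \sum_i n_i, \qquad
S = \sum_i \Big(\sum x_i\Big), \qquad
Q = \sum_i \Big(\sum x_i^2\Big) \qquad \text{(Steps 2a--2c)}

% Stage 3: combined mean and standard deviation
\mu = \frac{S}{N} \qquad \text{(Step 3a)}
\qquad
\sigma = \sqrt{\frac{Q - S^2 / N}{N - 1}} \qquad \text{(Step 3b)}
```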
An example where the bolded words have been identified as part of the ANEW
sentiment dictionary:
“Finally saw the movie #Tron … have to say that I
quite enjoyed it! … especially loved the
motorcycles…”
Table 4.3 ANEW Word Table

| ANEW word | Valence mean (μ) | Valence standard deviation (σ) | Arousal mean (μ) | Arousal standard deviation (σ) | Word freq (n) |
|-----------|------------------|--------------------------------|------------------|--------------------------------|---------------|
| Movie     | 6.86             | 1.81                           | 4.93             | 2.54                           | 29            |
| Enjoyed   | 7.8              | 1.2                            | 5.2              | 2.72                           | 21            |
| Loved     | 8.64             | 0.71                           | 6.38             | 2.68                           | 56            |
Table 4.4 Stage 1: Decomposition For VALENCE

| ANEW word | Step 1a | Step 1b |
|-----------|---------|---------|
| Movie     | 198.94  | 1456.46 |
| Enjoyed   | 163.8   | 1306.44 |
| Loved     | 483.84  | 4208.1  |
Table 4.5 Stage 1: Decomposition For AROUSAL

| ANEW word | Step 1a | Step 1b |
|-----------|---------|---------|
| Movie     | 142.97  | 885.49  |
| Enjoyed   | 109.2   | 715.81  |
| Loved     | 357.28  | 2674.48 |
Table 4.6 Stage 2: Combining

| Rating type | Step 2a | Step 2b | Step 2c |
|-------------|---------|---------|---------|
| Valence     | 106     | 846.58  | 6971    |
| Arousal     | 106     | 609.45  | 4275.77 |
Table 4.7 Stage 3: Combined Mean & Standard Deviation

| Rating type | Step 3a | Step 3b |
|-------------|---------|---------|
| Valence     | 7.99    | 1.41    |
| Arousal     | 5.75    | 2.71    |
Table 4.3 shows the initial data for each of the ANEW words found in this tweet;
these are combined to give one overall score for the valence and arousal ratings
in terms of mean and standard deviation.
Tables 4.4 and 4.5 (stage 1) respectively show the decomposition of each
word’s valence and arousal ratings from table 4.3. These values are combined
in table 4.6 (stage 2). Finally, the combined mean and standard deviation are
calculated using the combined values, as shown in table 4.7 (stage 3).
4.3.6 Interpreting Valence & Arousal Scores
As discussed in section 2.3, Russell’s emotional circumplex is used to give
meaning to these valence and arousal scores. Arousal is mapped against the x
axis and valence against the y axis. The surrounding emotions in this
circumplex give an indication of the emotional state of each tweet.
Figure 4.6 shows two tweets mapped against this emotional circumplex to
represent the emotional state of each. The blue dot is the tweet whose scores
were calculated through tables 4.3 to 4.7; it lies within the active/pleasant
quadrant of the circumplex. Another tweet, marked by the pink dot, lies within
the unpleasant/subdued quadrant.
These marked tweets are plotted according to the mean values of valence and
arousal. When considering the standard deviation values of valence and arousal,
the spread of values about the mean can be determined. The higher the standard
deviation, the larger the average spread of data from the mean.
Figure 4.6 Russell’s Emotional Circumplex with Mapped Tweets
4.3.7 DFE Algorithm Implementation Differences
For the most part, both program implementations share much of the same code.
Differences between implementations appear after the CPU code has finished
identifying ANEW words within a tweet.
In the CPU code, looping through a vector of tweets to perform analysis
would take the format of figure 4.7.
// Tweet analyse loop
for (SizeType i = 0; i < tweets.Size(); i++)
{
    // get tweet data
    string tweet_text = tweets[i]["text"].GetString();
    string tweet_id = tweets[i]["id_str"].GetString();

    // make tweet object
    Tweet tweet(tweet_id, tweet_text);

    // stem + analyze
    tweet.clean_stem();
    tweet.analyze(anew_map);

    [..] <Print out current tweet / track tweet pointer in vector/array>
}
Figure 4.7 CPU Tweet Analyse Loop
In the DFE implementation, arrays of inputs for tweet analysis need to be
constructed so they may be passed as a ‘stream’ into a dataflow engine. Figure
4.8 demonstrates adding tweet input data to five different arrays (streams).
// for each tweet
for (uint32_t x = 0; x < anew_count; x++)
{
    element = …

    // get anew values from hashmap for anew word x
    [..]
    query = anew_map.find(p_currtweet->get_anew_words()[x]);

    // build arrays
    tweet_ar_mean[element] = query->second.arousal_mean;
    tweet_ar_sd[element] = query->second.arousal_sd;
    tweet_val_mean[element] = query->second.valence_mean;
    tweet_val_sd[element] = query->second.valence_sd;
    tweet_n[element] = query->second.n;
}
Figure 4.8 Building DFE Stream
To build a stream the FPGA hardware will accept, one main condition is
required:
Each ‘in stream’ is required to be a multiple of 16 bytes
A stream into FPGA hardware can effectively take the form of a matrix. In my
design I built this stream with a width of 8, essentially creating an 8 × Y matrix
of float types, where Y is the number of tweets to be analysed. This approach
was needed to satisfy the stream condition:
sizeof(float) = 4 bytes, so each row is 8 × 4 bytes = 32 bytes
Now every row Y would be a multiple of 16 bytes and be able to carry 8 ANEW
floating point values. Dummy data (padding) is used to fill up any empty
elements in each row of the matrix.
The matrix acts in the following way: each row acts as one tweet, carrying up
to 8 ANEW values. This limitation does not matter, as no tweet in the corpus
contains more than 8 ANEW words.
Furthermore, due to the nature of pipelining within the FPGA design and the
use of floating point adders, to gain a throughput of one computation per tick
with a design that depends on previously output values requires a technique
known as loop tiling or C-slow retiming.
Inputs are required to arrive in transposed blocks of C rows, where C is equal
to 16 in my design. This now also means the matrix height needs to be made a
multiple of 16, with the extra rows filled in with dummy data.
Figure 4.9 shows how loop tiling works, with a C value of 4. The execution
order is displayed by red arrows running from top to bottom in blocks of C.
Because of this execution order, when the design offsets a value at (0, 0) into a
source-less stream by −4, that value spends 4 ticks in the pipeline before it is
needed again at (1, 0); since the execution order returns to element (1, 0) after
exactly 4 ticks, the offset value from (0, 0) is available by then.
Figure 4.9 Execution Order & Data Dependence For Loop-Tiled Row-Sum,
With C Of 4. (Maxeler, 2014)
5 Results
5.1 Measuring Results
Each implementation uses a command line interface and was tested on
MaxelerOS, a variant of CentOS. Running tests in this environment allows for
use of the Linux time command to measure the running times of each test. The
command reports three timings: real, user and system.
User time is the amount of CPU time spent outside the kernel within the
process being run; system time is the amount of CPU time spent inside the
kernel on behalf of the process. To gauge an accurate timing of how much CPU
time was used per test, user and system times are combined.
5.2 Data Sets
For benchmarking I used three data sets of different sizes, to determine how
acceleration scales with data size. Table 5.1 lists statistics about each data set.
Table 5.1 Data Set Statistics
| Data set ID | Total tweets | Total tweets viable for sentiment analysis |
|-------------|--------------|--------------------------------------------|
| 1           | 11,963,477   | 854,241                                    |
| 2           | 8,008,664    | 573,955                                    |
| 3           | 4,007,614    | 284,465                                    |
5.3 Benchmarks
For each data set, tests were run a total of three times and the average timings
rounded to the nearest second. Figure 5.1 shows the benchmarks from the tests
executed. Table 5.2 shows the magnitude of acceleration the DFE
implementation provides relative to the CPU-only implementation for each
data set size.
[Bar chart: runtime in seconds (y axis) against data set ID (x axis, 1–3), comparing CPU runtime (rounded) with DFE accelerated runtime (rounded).]
Figure 5.1 Benchmark Results
Table 5.2 Magnitude of Acceleration the DFE Implementation Provides
| Data set ID | Magnitude of acceleration |
|-------------|---------------------------|
| 1           | 3.08                      |
| 2           | 3.06                      |
| 3           | 3.08                      |
It is clear that the DFE accelerated implementation runs significantly faster
than the CPU implementation. As data set size increases, the magnitude of
acceleration in my DFE design remains relatively constant; no significant
difference outside the region of error is noticeable. Averaged across the data
sets, the DFE implementation runs approximately 3.07 times faster.
6 Conclusion
My initial brief set out to explore how much faster FPGAs could accelerate a
sentiment analysis algorithm ran upon a large corpus of tweets. To do this I have
produced two program implementations, one purely CPU based, and one which
makes use of FPGA for sentiment analysis algorithm calculations. My objective
to benchmark the run times of these and see how implementations scale with data
set size has been shown – the DFE accelerated implementation performs
approximately 3.07 orders of magnitude faster than the CPU implementation and
seems to be consistent as data set sizes increase.
Unfortunately, complications with the hardware setup left me only 8 days to
test and debug my design on hardware, so I was unable to research and
experiment further with DFE optimizations or a multi-threaded CPU
implementation. Additionally, I would have liked to experiment with much
larger sample sizes of tweets (50–100 million+). Given more time, I would
certainly explore these areas of the project.
Reflecting on this project, I have improved many skills and learned new ones.
Prior to the project I had not heard of FPGAs or Maxeler, but I now have a
good general overview of the technology and may pursue further research on
this path. I feel fluent in the Linux ecosystem – with emphasis on the
terminal – and I am now familiar with the C/C++ programming languages,
which I did not know before undertaking this project.
Although my results were predictable in the sense that the FPGA
implementation would be faster, I have achieved quantifiable results and
demonstrated them in an area that is of interest to many others. I feel FPGAs
have the potential to give massive advantages in areas where fast computations
and simulations can help predict trends, and my work reinforces that idea.
In the future, I would like to see this project optimized further, and to
experiment with multiple types of analysis algorithm, including testing
different sentiment dictionaries within the current project. Ultimately, if the
program could analyse live streams of tweets, real-time applications of this
program could quickly be realized.
7 References
Ramaswamy, S. (2011) Visualization of the Sentiment of the Tweets. A thesis
submitted to the Graduate Faculty of North Carolina State University in partial
fulfillment of the requirements for the Degree of Master of Science. Raleigh:
North Carolina State University.
Vanderbauwhede, W. Azzopardi, L. and Moadeli, M. (2009) FPGA-accelerated
Information Retrieval: High-efficiency document filtering. International
conference on Field Programmable Logic and Applications, 2009. FPL 2009.
IEEE (pp. 417-422).
Azzopardi, L. Vanderbauwhede, W. and Moadeli, M. (2009, July). Developing
energy efficient filtering systems. In Proceedings of the 32nd international ACM
SIGIR conference on Research and development in information retrieval (pp.
664-665). ACM.
Vanderbauwhede, W., Frolov, A., Chalamalasetti, S. R., & Margala, M. (2014).
A hybrid CPU-FPGA system for high throughput (10Gb/s) streaming document
classification. ACM SIGARCH Computer Architecture News, 41(5), 53-58.
Arora, R. and Srinivasa, S. (2014). A faceted characterization of the opinion
mining landscape. In COMSNETS 2014 (pp. 1-6).
Russell, J. A. (1980). A circumplex model of affect. Journal of personality and
social psychology, 39(6), 1161.
Healey, C. and Ramaswamy, S. (2011) Twitter Sentiment Visualization. [Online]
Available from: http://www.csc.ncsu.edu/faculty/healey/tweet_viz/ [Accessed:
07 January 2015]
Internetlivestats.com. Twitter usage statistics. [Online] Available from:
http://www.internetlivestats.com/twitter-statistics/ [Accessed: 07 January 2015]
Office for National Statistics. (2013) Personal Well-being across the UK,
2012/13. [Online] Available from:
http://www.ons.gov.uk/ons/dcp171778_328486.pdf
[Accessed: 07 January 2015]
Groenfeldt, T. (2012). Supercomputer manages fixed income risk at
JPMorgan. [Online] Available from:
http://www.forbes.com/sites/tomgroenfeldt/2012/03/20/supercomputer-manages-
fixed-income-risk-at-jpmorgan/ [Accessed: 07 January 2015]
Krikorian, R. (2013) New Tweets per second record, and how! [Online]
Available from: https://blog.twitter.com/2013/new-tweets-per-second-record-
and-how [Accessed: 07 January 2015]
Text REtrieval Conference (2011) Tweets2011 Twitter Collection [Online]
Available from: http://trec.nist.gov/data/tweets/ [Accessed: 05 May 2015]
Maxeler, Dataflow Computing | Maxeler Technologies [Online]
Available from: https://www.maxeler.com/technology/dataflow-computing/
[Accessed: 05 May 2015]
Appendix A Tweet JSON Sample

{
  "text": "accident Chef salad is calling my name, I\u0027m so hungry!",
  "id_str": "28965131362770944",
  "id": 28965131362770944,
  "created_at": "Sun Jan 23 24:00:00 +0000 2011",
  "retweeted": true,
  "retweet_count": 1,
  "favorited": true,
  "user": {
    "id_str": "27144739",
    "id": 27144739,
    "screen_name": "LovelyThang80",
    "name": "One of A Kind"
  },
  "requested_id": 28965131362770944
}
Appendix B Tweet Kernel Design
Appendix C Tweet Kernel Manager
Appendix D DFE Terminal Runs
Appendix E Partial DFE run log
Appendix F Tweet Calculator Kernel
class TweetCalculatorKernel extends Kernel
{
    private static final DFEType floatType = dfeFloat(8, 24);

    protected TweetCalculatorKernel(KernelParameters parameters, int X, int C)
    {
        super(parameters);

        // inputs
        DFEVar in_mean = io.input("in_mean", floatType);
        DFEVar in_sd = io.input("in_sd", floatType);
        DFEVar in_n = io.input("in_n", floatType);

        // counters
        CounterChain chain = control.count.makeCounterChain();
        DFEVar x = chain.addCounter(X, 1);
        chain.addCounter(C, 1);

        DFEVar xSum = in_mean * in_n;

        // source-less stream
        DFEVar carriedMean = floatType.newInstance(this);

        // head
        optimization.pushPipeliningFactor(0);
        DFEVar wCarriedMean = x.eq(0) ? 0.0 : carriedMean;
        optimization.popPipeliningFactor();

        // body
        DFEVar nwCarriedMean = xSum + wCarriedMean;

        // foot
        carriedMean <== stream.offset(nwCarriedMean, -C);

        // source-less stream
        DFEVar carriedSD = floatType.newInstance(this);

        // head
        optimization.pushPipeliningFactor(0);
        DFEVar wCarriedSD = x.eq(0) ? 0.0 : carriedSD;
        optimization.popPipeliningFactor();

        // body
        DFEVar nwCarriedSD = (in_mean !== 0)
            ? ((in_sd * in_sd) * (in_n - 1)) + ((xSum * xSum) / in_n) + wCarriedSD
            : wCarriedSD;
        nwCarriedSD = optimization.pipeline(nwCarriedSD);

        // foot
        carriedSD <== stream.offset(nwCarriedSD, -C);

        // source-less stream
        DFEVar carriedN = floatType.newInstance(this);

        // head
        optimization.pushPipeliningFactor(0);
        DFEVar wCarriedN = x.eq(0) ? 0.0 : carriedN;
        optimization.popPipeliningFactor();

        // body
        DFEVar nwCarriedN = in_n + wCarriedN;

        // foot
        carriedN <== stream.offset(nwCarriedN, -C);

        // Check if last word in stream for final calculation
        DFEVar conditionFinal = (x === (X - 1));

        DFEVar calculationScore = conditionFinal ? nwCarriedMean / nwCarriedN : 0.0;

        DFEVar calculationScoreSD = conditionFinal
            ? KernelMath.sqrt((nwCarriedSD - (nwCarriedMean * nwCarriedMean) / nwCarriedN) / (nwCarriedN - 1))
            : 0;

        // Output - only output on last word in stream.
        io.output("calculationScore", calculationScore, calculationScore.getType(), conditionFinal);
        io.output("calculationScoreSD", calculationScoreSD, calculationScoreSD.getType(), conditionFinal);
    }
}
Appendix G Tweet Calculator Manager
public class TweetCalculatorManager
{
    private static final int X = 8;
    private static final int C = 16;

    public static void main(String[] args)
    {
        Manager manager = new Manager(new EngineParameters(args));
        BuildConfig bc = new BuildConfig(Level.FULL_BUILD);
        KernelConfiguration currKConf = manager.getCurrentKernelConfig();

        //currKConf.optimization.setUseGlobalClockBuffer(true);
        //currKConf.optimization.setCEPipelining(5);
        //currKConf.optimization.setCEReplicationNumPartitions(2);

        bc.setBuildEffort(BuildConfig.Effort.VERY_HIGH);
        bc.setMPPRCostTableSearchRange(1, 4);
        //bc.setMPPRParallelism(2);
        manager.setBuildConfig(bc);

        // Instantiate the kernel
        Kernel kernel = new TweetCalculatorKernel(manager.makeKernelParameters(), X, C);

        //manager.setEnableStreamStatusBlocks(true);
        manager.setKernel(kernel);
        manager.setIO(IOType.ALL_CPU); // Connect all kernel ports to the CPU
        manager.createSLiCinterface(interfaceDefault());

        manager.addMaxFileConstant("X", X);
        manager.addMaxFileConstant("CFactor", C);

        manager.build();
    }

    private static EngineInterface interfaceDefault()
    {
        EngineInterface ei = new EngineInterface();

        InterfaceParam Y = ei.addParam("Y", CPUTypes.UINT32);
        InterfaceParam YInBytes = Y * X * CPUTypes.FLOAT.sizeInBytes();

        ei.setTicks("TweetCalculatorKernel", Y * X);

        ei.setStream("in_mean", CPUTypes.FLOAT, YInBytes);
        ei.setStream("in_sd", CPUTypes.FLOAT, YInBytes);
        ei.setStream("in_n", CPUTypes.FLOAT, YInBytes);
        ei.setStream("calculationScore", CPUTypes.FLOAT, YInBytes / X);
        ei.setStream("calculationScoreSD", CPUTypes.FLOAT, YInBytes / X);

        return ei;
    }
}
Appendix H Poster