ASSIGNMENT TOP SHEET
Faculty of Creative Arts, Technologies & Science
Department of Computer Science & Technology
Student Ref. No: 1209310    Unit Code: CIS017-3
Unit Name: Undergraduate Project
Deadline for Submission(s): Tuesday 8th May 2015
Student's Surname: HAGERTY
Student's Forename: JAMIE
Unit Leader's Name: Enjie Liu
Supervisor: Dr Ingo Frommholz
Assignment Details:
Assessment 2: Final Report
Instructions to Student: Please note: Work presented in an assessment must be your own. Plagiarism is where a student copies work from another source, published or unpublished (including the work of another student) and fails to acknowledge the influence of another's work or to attribute quotes to the author. Plagiarism is an academic offence and the penalty can be serious. The University's policies relating to Plagiarism can be found in the regulations at http://www.beds.ac.uk/aboutus/quality/regulations. To detect possible plagiarism we may submit your work to the national plagiarism detection facility. This searches the Internet and an extensive database of reference material, including other students' work, to identify matches. Once your work has been submitted to the detection service it will be stored electronically in a database and compared against work submitted from this and other universities. It will therefore be necessary to take electronic copies of your materials for transmission, storage and comparison purposes and for the operational back-up process. This material will be stored in this manner indefinitely.
I have read the above information and I confirm that this work is my own and that it may be processed and stored in the manner described.
Signature (Print Name): ........................................................... Date: .........................................
Extension deadline CAAS agrees that the assignment may be submitted ____ days after the deadline and should be marked without penalty.
CAAS confirmation...................................................................................................................
Please leave sufficient time to meet this deadline and do not leave the handing-in of assignments to the last minute. You need to allow time for any system problems or other issues.
Jamie Hagerty
1209310
Tweet sentiment analysis using dataflow acceleration
BSc (Hons) Computer Science
Undergraduate Thesis Report
Department of Computer Science and Technology
University of Bedfordshire
Dr Ingo Frommholz
2014/2015
Abstract
While ‘big data’ continues to grow ever larger and processing times and costs
become problematic in the computing world, FPGA technology appears to
provide a solution that can reduce processing times and costs while also
reducing environmental impact.
Social media platforms such as Twitter are sources of vast amounts of data
relating to how users throughout the world may feel about certain topics, ideas,
or people. This paper investigates how Maxeler’s FPGA dataflow solutions may
help in achieving reduced processing times with regards to sentiment analysis of
the ‘Twittersphere’.
To approach this task, an artefact to compare execution times of a sentiment
analysis algorithm upon a large tweet dataset on CPU and FPGA
implementations is proposed to produce benchmarks between the technologies.
Results show that the FPGA implementation can process the selected
sentiment analysis algorithm approximately 3.07 times faster than the
standalone CPU implementation.
In conclusion, FPGA clearly shows its benefits in terms of processing speed
in the area of sentiment analysis. In all, it demonstrates that FPGA technology
could provide significant improvements in real-world applications in which
reduced processing times give large advantages.
Acknowledgements
I would like to acknowledge and express my deepest appreciation to Dr. Ingo
Frommholz who introduced me to the concept of FPGA computing and offered
me support, guidance and encouragement throughout the development of my
final year project. Without him this project would not have been undertaken.
Furthermore I would like to acknowledge with much appreciation Oliver Brown
and the staff at Maxeler that helped in making the necessary equipment required
available to me to undertake my project.
Dedication
I dedicate my dissertation work to my family and the many friends I have met
throughout my university experience who have supported me along the way.
Keywords
FPGA
DFE
MAXELER
DATAFLOW
SENTIMENT ANALYSIS
TWEETS
Table of Figures
Figure 4.1 Computing With Dataflow Core.........................................................14
Figure 4.2 Computing With A Control Flow Core..............................................15
Figure 4.3 Tweet Class Diagram..........................................................................18
Figure 4.4 ANEW Class Diagram........................................................................18
Figure 4.5 Program Flow Diagram......................................................................19
Figure 4.6 Russell’s Emotional Circumplex........................................................24
Figure 4.7 CPU Tweet Analyse Loop......................................................25
Figure 4.8 Building DFE Stream.........................................................26
Figure 4.9 Execution Order & Data Dependence for Loop-Tiled Row-Sum,
With C of 4............................................................................28
Figure 5.1 Benchmark Results...........................................................30
List of Tables
Table 3.1 Main Requirements..............................................................................11
Table 3.2 Results Requirements...........................................................................12
Table 4.1 ANEW Word Table..............................................................................22
Table 4.2 Stage 1: Decomposition For VALENCE.............................................22
Table 4.3 Stage 1: Decomposition For AROUSAL.............................................22
Table 4.4 Stage 2: Combining..............................................................................23
Table 4.5 Stage 3: Combined Mean & Standard Deviation.................................23
Table 5.1 Data Set Statistics.................................................................................29
Table 5.2 Magnitude of Acceleration DFE Implementation Provides.................31
Table of Contents
Abstract..................................................................................................................i
Acknowledgements...............................................................................................ii
Dedication.............................................................................................................ii
Keywords..............................................................................................................ii
Table of Figures...................................................................................................iii
List of Tables.......................................................................................................iii
1 Introduction..................................................................................................1
1.1 Inspiration.......................................................................................................1
1.2 Project and Artefact.......................................................................................1
1.3 Report Structure.............................................................................................2
1.3.1 Literature Review.........................................................................................2
1.3.2 Requirements................................................................................................2
1.3.3 Methods & Design.......................................................................................3
1.3.4 Results..........................................................................................................3
1.3.5 Conclusion....................................................................................................3
2 Literature review..........................................................................................4
2.1 Introducing Sentiment Analysis: Why?.......................................................4
2.2 Approaching Sentiment Analysis..................................................................5
2.3 Sentiment Dictionary Oriented Approaches................................................6
2.4 Introducing FPGA: Benefits & Potential.....................................................7
2.5 FPGA In Information Filtering.....................................................................7
2.6 Conclusion.......................................................................................................8
3 Requirements..............................................................................................10
3.1 Introduction..................................................................................................10
3.2 Prerequisites..................................................................................................10
3.2.1 Twitter Data................................................................................................10
3.2.2 Sentiment Dictionary..................................................................................10
3.3 Implementation.............................................................................................11
3.4 Results & Evaluation....................................................................................12
4 Methods & Design......................................................................................13
4.1 Introduction..................................................................................................13
4.2 Maxeler Dataflow Computing.....................................................................13
4.3 Development..................................................................................................16
4.3.1 Tools, Languages & Libraries....................................................................16
4.3.2 Development Model...................................................................................16
4.3.3 Testing........................................................................................................16
4.3.4 Data Structures & Dataflow.......................................................................18
4.3.5 Sentiment Analysis Algorithm...................................................................20
4.3.6 Interpreting Valence & Arousal Scores.....................................................23
4.3.7 DFE Algorithm Implementation Differences.............................................25
5 Results.........................................................................................................29
5.1 Measuring Results........................................................................................29
5.2 Data Sets........................................................................................................29
5.3 Benchmarks...................................................................................................30
6 Conclusion...................................................................................................32
7 References...................................................................................................34
1 Introduction
1.1 Inspiration
Social media platforms such as Twitter are sources of vast amounts of data
relating to how users throughout the world may feel about certain topics, ideas or
people. Throughout the history of Twitter the website has gone from having 500
tweets per day in 2007 to 500 million per day in 2013 (Twitter Blog, 2013).
Given the scale and availability of data in the form of tweets and their
associated metadata, the ‘Twittersphere’ (postings made on the social media
website Twitter considered collectively) has become the subject of many forms
of data processing (i.e. sentiment analysis) to connect and analyse tweets to
generate new information relating to specific searches. As this data is ever
growing, technologies such as FPGA (Field-Programmable Gate Array) could be
applied to process algorithms within faster time scales at a fraction of the cost.
1.2 Project and Artefact
My project aims to re-implement an existing sentiment analysis algorithm using
FPGA-based DFEs (Data Flow Engines) to accelerate sentiment analysis over a
large collection of tweets, and to benchmark this against the same task without
dataflow acceleration to show the potential benefits this technology may provide.
The primary objective of my project is to show how DFEs can be applied to this area of study and to produce results I can analyse to compare the implementations:
a) The magnitude of acceleration in algorithm processing times
b) How the sample size of tweets affects a)
The potential impact of a) will be analysed to show how using DFEs within
this field may allow for advancements in real-world applications that make use
of sentiment analysis.
To do this and realize my artefact I will:
a) Implement code to execute a sentiment analysis algorithm on a collection
of tweets and perform analysis on processing speeds when run using
solely conventional x86 CPUs.
b) Using the same implementation, adjust the algorithm component of the
code to make use of DFE technology and perform the same analysis to
benchmark the two sets of data against each other.
c) Produce a report detailing how the investigation was carried out and
provide statistical data to show results from the benchmarks and provide
visualizations in the form of graphs.
1.3 Report Structure
1.3.1 Literature Review
The purpose of the literature review is to further investigate what research and
work has already been done that relates to topics within my project. This review
details further background information and history on sentiment analysis and
applications of FPGA in information search and classification. Furthermore, this
review concludes with my assessment of the current state of the topics reviewed.
1.3.2 Requirements
The requirements section briefly details prerequisites and other requirements
needed to materialize my artefact and achieve my project aims. Other
requirements detail implementation, result gathering and evaluation.
1.3.3 Methods & Design
The methods and design section details how I set out to achieve my artefact
objectives. It explains the Maxeler dataflow computing concept used to produce
a DFE accelerated program, tools and languages used, data structures and the
flow of data throughout implementations, and it explains how the benchmarked
sentiment analysis algorithm is performed and interpreted.
One key area of this section details differences between CPU and DFE
implementations.
1.3.4 Results
The results section explains how implementation execution times were recorded
and displays the benchmarks between implementations, including the magnitude
of acceleration across datasets.
1.3.5 Conclusion
The conclusion summarises my findings and how they respond to the brief
explained in the introduction. In addition, it covers my personal development
throughout the project and what I feel the project has contributed to its respective
area, as well as what I would have done differently with more time and the
potential future the project may hold.
2 Literature review
2.1 Introducing Sentiment Analysis: Why?
There has always been a demand for opinion when it comes to any form of
decision-making. User generated content from websites such as twitter can carry
valuable information in relation to products, services, topics, ideas and people.
Twitter is a form of ‘big data’ and organisations, businesses and researchers are
keen to tackle the automation of opinion mining upon this platform to exploit this
extensive source of data. Using sentiment analysis, another layer of information
can be added to each tweet for analysis – its ‘mood’.
In the year 2013/14, the UK Office for National Statistics published a report
on the personal well-being of the population of the United Kingdom
(ons.gov.uk, 2013); this report was based on a survey of 165,000 people, with 4
questions on the topic of well-being answered on a scale of 1-10.
The requirement for opinionated data analysis is of clear importance to the UK
government and can involve large data samples. The government is willing to
invest time and money into information gathering using surveys on this scale for
their information needs.
Thus, how would the results compare if reliable sentiment analysis was run
upon one year’s worth of tweets with geo-location based within the UK?
Providing a reliable outcome could be found, how much time would it take to
perform? And how much would it cost? The potential of real-world applications
of reliable sentiment analysis could expedite processes, reduce costs and change
how businesses and other entities may function.
2.2 Approaching Sentiment Analysis
Research by Arora and Srinivasa (2014) aims to provide a schematic framework
for researchers to understand the landscape of sentiment mining by proposing
faceted classification and addressing issues faced within the topic. So far, efforts
on research within sentiment analysis and mining are described as “fragmented
and disparate” (Arora and Srinivasa, 2014).
Four types of opinion classifications exist within this schematic framework
and sentiments are described as being either positive, negative, neutral or
constructive. Approaches to brand opinions use the understanding of domain-
specific opinion words, their respective polarity and language specific opinion-
rules; opinions are challenging to extract due to the “latent context in which an
opinion is expressed” (Arora and Srinivasa, 2014).
Attempts at classification through the realm of sentiment analysis have had
mixed degrees of success and this is due to issues such as cross-domain
classification, sarcasm, spam and “the need for an annotated training dataset”
(Arora and Srinivasa, 2014).
Insights mentioned that may help solve some of the issues described above
within sentiment analysis include:
- Identifying the purpose or background motivation of an opinion, to provide
insight into the credibility of any sentiments expressed and help avoid
accounting for opinion spamming.
- Further classifying opinions as direct, indirect or comparative – indirect
opinions are implied as idioms or can express sarcasm; identifying these and
processing them accordingly would help reduce incorrect sentiment analysis.
Current methods discussed to extract opinion structure include machine
learning, rules-based approaches and statistical algorithms with a lexicon
(language specific dictionary of words with predefined polarity). My research
aims to use a statistical algorithm along with a lexicon to run sentiment analysis.
However, addressing problems described by this approach are stated and are still
open to solution:
- Lexicons are language specific.
- Word polarity is domain specific.
- Polarity can also be context specific.
2.3 Sentiment Dictionary Oriented Approaches
Twitter sentiment analysis is not a new idea, and implementations already exist.
Ramaswamy (2011) has produced a thesis on visualizing Twitter sentiment
based on keyword search. I plan to implement the same method of sentiment
analysis used in his work.
Multiple computational methods such as Bayesian networks and support
vector machines can be used to perform concept-level analysis of natural
language text. Traditional approaches like these require “sufficient high-quality
text to allow for accurate natural language evaluations” (Healey and
Ramaswamy, 2011) and in arguing that this level of requirement of text is not
necessarily available in short text snippets like tweets, an alternative method is
proposed: using dictionaries that pre-define the sentiment of a collection of
words along a set of emotional dimensions.
The sentiment dictionary used within Ramaswamy’s (2011) thesis is the
ANEW (Affective Norms for English Words) dictionary; it provides measures of
valence, arousal and dominance for 1034 English words which previous research
identified as good candidates to convey emotion. The ANEW dictionary rates
words on a 9-point scale (1-9) and was constructed by asking
volunteers to read a body of text and provide a rating along each dimension for
each occurrence of an ANEW related word. ANEW words within the text are
combined to form an overall mean rating and standard deviation of ratings.
Emotional models have been proposed within the field of psychology to
define and compare emotional states using valence and arousal. James Russell
(1980) proposed a model which maps valence and arousal to build an 'emotional
circumplex of affect', with 28 emotional states positioned accordingly; valence
runs across the horizontal axis and arousal on the vertical axis. Ramaswamy
(2011) uses this model to plot measurements of a tweet's valence and arousal
against Russell’s emotional circumplex of affect.
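To make this mapping concrete, the sketch below places a tweet's scores on the circumplex. It is a sketch under my own assumptions, not code from Ramaswamy's thesis: I assume scores stay on the 1-9 ANEW scale, so the neutral centre of the circumplex is taken to be (5, 5), and the four quadrant labels are illustrative stand-ins for Russell's 28 plotted states.

```cpp
#include <cmath>
#include <string>

// Angle of a (valence, arousal) pair on the circumplex, with the neutral
// point of the 1-9 ANEW scale (5, 5) taken as the origin (assumption).
double circumplexAngleDeg(double valence, double arousal) {
    const double kPi = 3.14159265358979323846;
    double deg = std::atan2(arousal - 5.0, valence - 5.0) * 180.0 / kPi;
    return deg < 0.0 ? deg + 360.0 : deg;  // normalise to [0, 360)
}

// Coarse quadrant summaries only: Russell's model positions 28 distinct
// emotional states around the circle; these labels are purely illustrative.
std::string quadrantLabel(double valence, double arousal) {
    bool pleasant = valence >= 5.0;
    bool activated = arousal >= 5.0;
    if (pleasant && activated)  return "excited/happy";
    if (!pleasant && activated) return "distressed/angry";
    if (!pleasant)              return "depressed/bored";
    return "relaxed/calm";
}
```

A tweet scoring (6, 6), for instance, sits at 45 degrees, in the pleasant/activated quadrant.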
Implementation of the sentiment algorithm on tweets is relatively
straightforward. For each tweet, it first involves capturing the mean valence
and arousal values for each ANEW word, along with their respective standard
deviations. Tweets with fewer than 2 ANEW words are regarded as having
insufficient data for sentiment analysis. Each word's mean is weighted by the
probability of the word's rating falling exactly at the mean value, using the
probability density function of a normal distribution with the word's
respective standard deviation. To obtain overall valence and arousal values
for the tweet, these weighted means are then averaged. (Ramaswamy, 2011)
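The computation described above can be sketched as follows. This is my own reading of Ramaswamy's description, not his code: the struct and function names are mine, and I take "averaged" to mean a weighted average (sum of weighted means divided by the sum of weights).

```cpp
#include <cmath>
#include <vector>

// Mean rating and standard deviation of one ANEW word along a single
// dimension (valence or arousal), as given by the dictionary.
struct AnewScore {
    double mean;
    double sd;
};

// Height of the normal probability density function at its own mean,
// 1 / (sd * sqrt(2*pi)): words whose raters agreed closely (small sd)
// receive a larger weight.
double weightFromSd(double sd) {
    const double kPi = 3.14159265358979323846;
    return 1.0 / (sd * std::sqrt(2.0 * kPi));
}

// Combine the ANEW words found in one tweet into an overall score for
// one dimension via a weighted average. Returns false for tweets with
// fewer than two ANEW words, which are skipped as insufficient data.
bool tweetScore(const std::vector<AnewScore>& words, double& out) {
    if (words.size() < 2) return false;
    double weightedSum = 0.0;
    double weightTotal = 0.0;
    for (const AnewScore& w : words) {
        double weight = weightFromSd(w.sd);
        weightedSum += weight * w.mean;
        weightTotal += weight;
    }
    out = weightedSum / weightTotal;
    return true;
}
```

The same routine is run twice per tweet, once over the valence values and once over the arousal values.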
2.4 Introducing FPGA: Benefits & Potential
FPGA solutions are becoming ever more prevalent for a myriad of tasks relating
to intense data processing and for good reason: the benefits of using such
technology include massive process acceleration and lower computational power
requirement costs.
In 2012, the bank JPMorgan adopted this technology and applied it to its
risk-measuring algorithms, allowing simulations that once took hours to
complete to finish in just a few minutes (Groenfeldt,
2012). It's clear that FPGA has the potential to massively accelerate the speed of
large and intense data analysis. Carl Claunch (2011), vice president and analyst
at Gartner states that the true value of this comes from “enabling new levels of
performance, changing the user’s competitive dynamics or unlocking new market
opportunities”.
2.5 FPGA In Information Filtering
Researchers have evaluated the performance of using FPGAs within information
filtering over a series of experiments comparing FPGA implementations against
an optimised reference implementation (Vanderbauwhede, W. Azzopardi, L. and
Moadeli, M., 2009). In this paper, the researchers used a collection of document
datasets which each differed in numbers of documents they contained and the
average document length. Upon running various profile filters on each dataset
their results indicated that their FPGA implementation was between 8.3 and
20.8 times faster than the reference implementation. Additionally, the
standard implementation became slower as profile size increased while the
FPGA implementation remained relatively constant due to pipelining profile
scoring, keeping latency constant.
The conclusion of this paper further details benefits to use of FPGAs; they
offer this processing speed at a small fraction of the power a CPU-only solution
would use. Power consumption within data centres is a growing issue due to the
cost of cooling and the power consumption of computation. The paper suggests that
FPGAs could tackle the challenge of developing environmentally friendly
systems.
More recent research shows that use of hybrid CPU-FPGA systems for
streaming document classification can achieve throughputs of 10Gb/s in real time
and that by moving the document parser from CPU to FPGA researchers aim to
achieve speeds of 100Gb/s. (Vanderbauwhede, W., Frolov, A., Chalamalasetti, S.
R., and Margala, M., 2014).
2.6 Conclusion
This review covers much of the ground I will have to explore in my own
research into using FPGAs alongside sentiment analysis upon large datasets. I
feel the points made
throughout this review show that the topic of sentiment analysis and mining is of
great interest to many people and entities as it has the potential to unravel new
ways of interpreting data from search and have implications of a positive nature
to change how said entities may function with regards to analysis.
Disparity between approaches to achieving reliable sentiment analysis on
different styles of text-based information, and issues within certain
methodologies remaining unanswered, appear to suggest that this area of
research is still in its early stages. However, research in the area seems to
be building towards a framework that tackles all aspects of sentiment analysis
which need to be considered.
FPGA implementations make clear their advantage when it comes to
accelerating algorithms and performing information filtering. FPGA also
displays a new route to achieving ‘green systems’, as FPGAs require
comparatively little power and cooling. Research in this area appears to be focused
on what FPGA can be applied to and measuring how effective it can be when
applied to various computational problems. It’s shown that FPGAs have an
additional advantage in the fact that they can process large streams of data in real
time, widening applications of the technology.
My own research should provide additional insight into helping make real-
world applications of sentiment analysis upon large sets of data relating to social
media reasonable in terms of processing speed and efficiency. In demonstrating
this, FPGA in reliable sentiment analysis could remove limitations in terms of
time scales and costs, and enable proposals which rely on speed or real-time
analysis.
3 Requirements
3.1 Introduction
As with any development process, I first considered the tasks required to achieve
a working implementation and compiled these into a requirements list. This list
would function as an overview of the fundamental components for building the
artefact.
3.2 Prerequisites
Fundamentally, my project will perform sentiment analysis upon millions of
tweets using different approaches towards computation. I would need to source
bulk Twitter data and a sentiment dictionary for this to be possible.
3.2.1 Twitter Data
Twitter data is the subject of analysis. To perform a meaningful and reliable
comparison between implementations, a large dataset of tweets is required. This
dataset was obtained from the Tweets2011 corpus, part of the TREC microblog
track, and contained just under 12 million tweets compiled over a period of
2 weeks.
3.2.2 Sentiment Dictionary
The sentiment dictionary provides the method of calculation. To achieve my
route of sentiment analysis a sentiment dictionary (lexicon) was required. I chose
to use the same sentiment dictionary as Ramaswamy (2011), the ‘Affective
Norms for English Words’ (ANEW). This dictionary provides arousal and
valence data for its 1034 unique words.
3.3 Implementation
Implementation requirements span the essential components of program flow to
generate results for analysis. I have divided implementation requirements up into
four sections, which will be further elaborated on in the design section of this
report.
Table 3.1 Main Requirements

ID  Requirement         Description
1   Data IO             The implementation must be able to read from JSON
                        formatted files to access tweets and ANEW sentiment
                        dictionary data.
2   Tweet Processing    Tweets must be tokenized and stemmed so that they can
                        be searched for in the ANEW dictionary.
3   Sentiment Analysis  Processed tweets will be checked against the ANEW
                        dictionary. A mathematical formula is required to
                        compile ANEW word data for each tweet into an overall
                        score.
4   Benchmarking        Running times of each implementation need to be
                        measured and recorded (Linux time command).
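The tweet-processing requirement (tokenizing before dictionary lookup) can be sketched as below. This is an illustrative simplification of mine: the artefact additionally stems each token with the Oleander stemming library, which is omitted here.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split a tweet into lowercase alphabetic tokens so they can be looked up
// in the ANEW dictionary. Punctuation, digits and whitespace act as token
// separators. The real artefact also stems each token (Oleander library);
// stemming is left out of this sketch.
std::vector<std::string> tokenize(const std::string& tweet) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : tweet) {
        if (std::isalpha(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

For example, tokenizing "Happy days, happy NIGHTS!" yields the four tokens happy, days, happy, nights, ready for dictionary lookup.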
3.4 Results & Evaluation
Results specifically refer to the running times for each data set run on the tweet
sentiment analysis implementation. Results must adhere to specific requirements
so that a meaningful evaluation can be made.
Table 3.2 Results Requirements

ID  Requirement               Description
1   Generating results to     Benchmark results must be generated using both
    compare                   computational methods (conventional CPU and
                              dataflow acceleration) to allow for a comparison.
2   Multiple tweet data       Data sets of different sizes should be used to
    sets for analysis         generate a range of results and to see how
                              computational times scale.
3   Confidence of results     Generating results for each data set should be
                              repeated to provide a degree of confidence in
                              the results.
4   Evaluation of results     Provide a comparison between computational
                              methods for each set of results generated. Show
                              statistics and provide data representations
                              (graphs, tables).
4 Methods & Design
4.1 Introduction
In this section I will detail the design of my artefact and the methods used to
approach it. Additionally, concepts used within my design will be
explained. This section is written so that my work could be replicated.
4.2 Maxeler Dataflow Computing
Maxeler provides solutions aimed at tackling big data problems by exploiting the
concept of dataflow computing on FPGAs as opposed to using traditional CPU
‘control flow’ computing.
This concept allows for optimizing the movement of data in an application
and utilizing massive parallelism between thousands of ‘dataflow cores’ to
provide benefits in performance, space and power consumption. (Maxeler, 2015)
As seen in figure 4.1, ‘computing with dataflow cores’, data is streamed
from memory into the dataflow engine where each dataflow core acts as one
computational unit to perform operations and forward data to the next core or the
off-chip memory only once the chain of processing is complete. Instructions for
each program are described by the configuration file which maps the operations,
layout and connections of the dataflow engine.
In comparison, figure 4.2, ‘computing with control flow cores’, shows how
data and instructions are continuously passed between memory and processor
core as operations are performed.
This model is sequential and performance is limited by the latency of data
movement. (Maxeler, 2015)
In relation to my project artefact, I will implement a design which uses
only a control flow core architecture (traditional CPU) and a design which
utilizes both, with algorithm calculations being carried out on the dataflow core
architecture.
My project goal to compare processing times of sentiment analysis will
fundamentally be measuring the timing differences of the algorithm with and
without this dataflow architecture.
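To make the kernel side of this concrete, the sketch below shows the general shape of a MaxCompiler (maxJava) kernel, following the style of Maxeler's published tutorial examples. It is illustrative only, not the project's actual kernel: a kernel describes a dataflow graph in which input streams flow through arithmetic nodes (the dataflow cores of figure 4.1) to an output stream.

```java
// Illustrative MaxCompiler kernel sketch (not this project's actual code).
// Multiplies two input streams element by element; the multiply becomes a
// pipelined dataflow core on the chip, and data streams through it rather
// than being fetched instruction by instruction.
class WeightedMeanKernel extends Kernel {
    WeightedMeanKernel(KernelParameters parameters) {
        super(parameters);
        DFEVar mean   = io.input("mean",   dfeFloat(8, 24));  // single precision
        DFEVar weight = io.input("weight", dfeFloat(8, 24));
        io.output("weighted", mean * weight, dfeFloat(8, 24));
    }
}
```

The host (CPU) side is then responsible only for streaming data into the named inputs and collecting the output, which is the division of labour measured in this project.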
Figure 4.1 Computing With a Dataflow Core (Maxeler, 2015)
Figure 4.2 Computing With a Control Flow Core (Maxeler, 2015)
4.3 Development
4.3.1 Tools, Languages & Libraries
For development of my work I have used a multitude of tools, languages and
libraries, listed below, alongside their purpose.
MaxelerOS – System used to run Maxeler programs.
MaxCompiler – Developer environment for Maxeler programs.
C/C++ 11 – Used to develop artefact host code.
Java/maxJava – Used to program dataflow engines.
rapidJSON – JSON parser to read tweets and ANEW data.
Oleander stemming library – used to perform word stemming upon tweets.
4.3.2 Development Model
Throughout project development I opted to use the waterfall model primarily due
to its simplicity – each stage is specific and is easy to manage as these stages
have direct outputs. In addition, due to the simplified, small number of
requirements in this project, not much could be overlooked.
One downside of using this model was that I had no prior experience of
developing an FPGA application using Maxeler technologies. This led to
problems in the system design stage for Maxeler kernel development: going
back to change incorrect designs was often troublesome, as much of the
design had to be refactored, compile times could take upwards of 30 minutes,
and simulations of code would not always reproduce perfectly on the FPGA
hardware.
4.3.3 Testing
During implementation, code was developed in ‘blocks’ of functionality, each of
which I would test before moving on to the next. To verify the desired
functionality, I wrote in-line unit tests after each section, comparing function
results against expected output. For example, to ensure that the sentiment
analysis algorithm was working correctly, a small number of ‘development
tweets’, for which I had previously calculated sentiment analysis values, were
run on each code execution to check for discrepancies.
Debugging on FPGA hardware could sometimes prove difficult because a
running stream is hard to inspect, but MaxDebug provided a means of
inspecting stream statuses in running hardware executions.
4.3.4 Data Structures & Dataflow
Two of the main entities that needed to be represented in the artefact were tweets
and ANEW word entries. The artefact uses a Tweet and an ANEW class to manage
storage, manipulation and retrieval of entity data. Designing these structures in
an object-oriented fashion makes transfer and manipulation of entity data
simple; encapsulating entity data and methods keeps all information inside the
object.
Using this style of design, the main body of code need only focus upon
building, using and keeping track of objects. Figures 4.3 and 4.4 detail these
entities as class diagrams.
Figure 4.3 Tweet Class Diagram
Figure 4.4 ANEW Class Diagram
Figure 4.5 details a sea-level representation of
program flow from start to finish; each stage is
elaborated on below.
Stage 1: Building ANEW hashmap.
Each ANEW word found in the dictionary JSON is
built as an ANEW object. Each ANEW object is
inserted into a hashmap, with a key that matches the
stemmed dictionary word it represents.
Stage 2: Load & parse tweet JSON into memory.
rapidJSON parses and validates the tweet JSON
dump. It stores the JSON structure into memory for
later access.
Stage 3: Build tweet objects from JSON data.
Build all tweet objects from parsed JSON.
Stage 4: Perform word stemming & record
ANEW words.
Tweet text is tokenized by word and each word
token is stemmed and searched for in the ANEW
hashmap; if the word is found, it’s recorded into a
vector.
Stage 5: Run algorithm calculation.
Tweet objects with more than 2 ANEW words are
run through the algorithm to determine the tweet’s
overall valence and arousal ratings and standard
deviations. This stage is implemented differently for
the CPU and DFE builds.
Figure 4.5 Program
Flow Diagram
4.3.5 Sentiment Analysis Algorithm
To perform sentiment analysis upon a viable tweet (>2 ANEW words), arousal
and valence ratings for every ANEW word must be taken into account. This is
the part of the solution implemented on both the CPU and the DFE for benchmarking.
For every ANEW word found in a tweet considered viable for analysis,
valence and arousal means, standard deviations, and word frequencies (sample
sizes) are collected. These separate values are decomposed and combined, and
the overall mean and standard deviation are calculated. The mathematical
formulae used to do this consist of three stages, with each stage’s values being
used in the next:
Stage 1: Decomposition of mean and standard deviation for each ANEW
word.
Stage 2: Combining decomposed data.
Stage 3: Calculating the mean and standard deviation with the combined data.
4.3.5.1 Stage 1: Decomposition for Each ANEW Word Labelled as i
Step 1a:
Decomposing the mean to find sum of x values for word i
Step 1b:
Decomposing standard deviation to find sum of x2 values for word i
4.3.5.2 Stage 2: Combining Values for Each ANEW Word
Step 2a:
Combine n values (word frequencies) for all words
Step 2b:
Combine sum of x values for all words
Step 2c:
Combine the sum of x2 values for all words
4.3.5.3 Stage 3: Calculate Combined Mean & Standard Deviation Using
Combined Values
Step 3a:
Formula for finding the combined mean
Step 3b:
Formula for finding the combined standard deviation
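The formulae themselves appear as images in the original report. Reconstructed from the worked example in Tables 4.3–4.7 and the kernel code in Appendix F, for ANEW word i with mean μ_i, standard deviation σ_i and frequency n_i, they are:

```latex
% Stage 1: decomposition for each ANEW word i
\sum x_i = \mu_i n_i \qquad \text{(Step 1a)}
\qquad
\sum x_i^2 = \sigma_i^2 (n_i - 1) + \frac{\left(\sum x_i\right)^2}{n_i} \qquad \text{(Step 1b)}

% Stage 2: combining across all words found in the tweet
N = \sum_i n_i, \qquad
S = \sum_i \Big(\sum x_i\Big), \qquad
Q = \sum_i \Big(\sum x_i^2\Big) \qquad \text{(Steps 2a--2c)}

% Stage 3: combined mean and standard deviation
\mu = \frac{S}{N} \qquad \text{(Step 3a)}
\qquad
\sigma = \sqrt{\frac{Q - S^2 / N}{N - 1}} \qquad \text{(Step 3b)}
```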
An example where the bolded words have been identified as part of the ANEW
sentiment dictionary:
“Finally saw the movie #Tron … have to say that I
quite enjoyed it! … especially loved the
motorcycles…”
Table 4.3 ANEW Word Table

| ANEW word | Valence mean (μ) | Valence standard deviation (σ) | Arousal mean (μ) | Arousal standard deviation (σ) | Word freq (n) |
|-----------|------------------|--------------------------------|------------------|--------------------------------|---------------|
| Movie     | 6.86             | 1.81                           | 4.93             | 2.54                           | 29            |
| Enjoyed   | 7.8              | 1.2                            | 5.2              | 2.72                           | 21            |
| Loved     | 8.64             | 0.71                           | 6.38             | 2.68                           | 56            |
Table 4.4 Stage 1: Decomposition For VALENCE

| ANEW word | Step 1a | Step 1b |
|-----------|---------|---------|
| Movie     | 198.94  | 1456.46 |
| Enjoyed   | 163.8   | 1306.44 |
| Loved     | 483.84  | 4208.1  |
Table 4.5 Stage 1: Decomposition For AROUSAL

| ANEW word | Step 1a | Step 1b |
|-----------|---------|---------|
| Movie     | 142.97  | 885.49  |
| Enjoyed   | 109.2   | 715.81  |
| Loved     | 357.28  | 2674.48 |
Table 4.6 Stage 2: Combining

| Rating type | Step 2a | Step 2b | Step 2c |
|-------------|---------|---------|---------|
| Valence     | 106     | 846.58  | 6971    |
| Arousal     | 106     | 609.45  | 4275.77 |
Table 4.7 Stage 3: Combined Mean & Standard Deviation

| Rating type | Step 3a | Step 3b |
|-------------|---------|---------|
| Valence     | 7.99    | 1.41    |
| Arousal     | 5.75    | 2.71    |
Table 4.3 shows the initial data for each of the ANEW words found in this tweet;
these are combined to give one overall score for the valence and arousal ratings
in terms of mean and standard deviation.
Tables 4.4 and 4.5 (stage 1) respectively show the decomposition of each
word’s valence and arousal ratings from table 4.3. These values are combined
in table 4.6 (stage 2). Finally, the combined mean and standard deviation are
calculated using the combined values, as shown in table 4.7 (stage 3).
4.3.6 Interpreting Valence & Arousal Scores
As discussed in section 2.3, Russell’s emotional circumplex is used to give
meaning to these valence and arousal scores. Arousal is mapped against the x
axis and valence against the y axis. The surrounding emotions in this
circumplex give an indication of the emotional state of each tweet.
Figure 4.6 shows two tweets mapped against this emotional circumplex to
represent the emotional state of each. The blue dot is the tweet whose scores
were calculated through tables 4.3 to 4.7; it lies within the active/pleasant
quadrant of the circumplex. Another tweet, marked by the pink dot, lies within
the unpleasant/subdued quadrant.
These marked tweets are plotted according to the mean values of valence and
arousal. When considering the standard deviation values of valence and arousal,
the spread of values about the mean can be determined. The higher the standard
deviation, the larger the average spread of data from the mean.
Figure 4.6 Russell’s Emotional Circumplex with Mapped Tweets
4.3.7 DFE Algorithm Implementation Differences
For the most part, both program implementations share much of the same code.
Differences between implementations appear after the CPU code has finished
identifying ANEW words within a tweet.
In the CPU code, looping through a vector of tweets to perform analysis
would take the format of figure 4.7.
// Tweet analyse loop
for (SizeType i = 0; i < tweets.Size(); i++)
{
    // get tweet data
    string tweet_text = tweets[i]["text"].GetString();
    string tweet_id = tweets[i]["id_str"].GetString();

    // make tweet object
    Tweet tweet(tweet_id, tweet_text);

    // stem + analyze
    tweet.clean_stem();
    tweet.analyze(anew_map);

    [..] <Print out current tweet / track tweet pointer in vector/array>
}
Figure 4.7 CPU Tweet Analyse Loop
In the DFE implementation, arrays of inputs for tweet analysis need to be
constructed so they may be passed as a ‘stream’ into a dataflow engine. Figure
4.8 demonstrates adding tweet input data to five different arrays (streams).
// for each tweet
for (uint32_t x = 0; x < anew_count; x++)
{
    element = …

    // get anew values from hashmap for anew word x
    [..]
    query = anew_map.find(p_currtweet->get_anew_words()[x]);

    // build arrays
    tweet_ar_mean[element] = query->second.arousal_mean;
    tweet_ar_sd[element] = query->second.arousal_sd;
    tweet_val_mean[element] = query->second.valence_mean;
    tweet_val_sd[element] = query->second.valence_sd;
    tweet_n[element] = query->second.n;
}
Figure 4.8 Building DFE Stream
To build a stream the FPGA hardware will accept, one main condition is
required:
Each ‘in stream’ is required to be a multiple of 16 bytes
A stream into FPGA hardware can effectively take the form of a matrix. In my
design I built this stream with a width of 8, essentially creating an 8 × Y matrix
of float types, where Y is the number of tweets to be analysed. This approach
was needed to satisfy the stream condition:
sizeof(float) = 4 bytes, so each row is 8 × 4 bytes = 32 bytes
Now every row Y would be a multiple of 16 bytes and be able to carry 8 ANEW
floating point values. Dummy data (padding) is used to fill up any empty
elements in each row of the matrix.
The matrix acts in the following way: each row acts as one tweet, carrying up
to 8 ANEW values. This limitation does not matter, as no tweet in the corpus
contains more than 8 ANEW words.
Furthermore, due to the nature of pipelining within the FPGA design and the
use of floating point adders, to gain a throughput of one computation per tick
with a design that depends on previously output values requires a technique
known as loop tiling or C-slow retiming.
Inputs are required to arrive in transposed blocks of C rows, where C is equal
to 16 in my design. This now also means the matrix height needs to be made a
multiple of 16, with the extra rows filled in with dummy data.
Figure 4.9 shows how loop tiling works, with a C value of 4. The execution
order is displayed by red arrows running from top to bottom in blocks of C.
Because of this execution order, when the design offsets a value at (0, 0) into a
source-less stream by −4, that value spends 4 ticks in the pipeline before it is
needed again at (1, 0); since the execution order returns to element (1, 0) after
exactly 4 ticks, the offset value from (0, 0) is available by then.
Figure 4.9 Execution Order & Data Dependence For Loop-Tiled Row-Sum,
With C Of 4. (Maxeler, 2014)
5 Results
5.1 Measuring Results
Each implementation uses a command line interface and was tested on
MaxelerOS, a variant of CentOS. Running tests in this environment allows for
use of the Linux time command to measure the running times of each test. The
command reports three timings: real, user and system.
User time is the amount of CPU time spent outside the kernel within the
process being run; system time is the amount of CPU time spent inside the
kernel on behalf of the process. To gauge an accurate timing of how much CPU
time was used per test, user and system times are combined.
5.2 Data Sets
For benchmarking I used three data sets of different sizes, to determine how
acceleration scales with data size. Table 5.1 lists statistics about each data set.
Table 5.1 Data Set Statistics
| Data set ID | Total tweets | Total tweets viable for sentiment analysis |
|-------------|--------------|--------------------------------------------|
| 1           | 11,963,477   | 854,241                                    |
| 2           | 8,008,664    | 573,955                                    |
| 3           | 4,007,614    | 284,465                                    |
5.3 Benchmarks
For each data set, tests were run a total of three times and the average timings
rounded to the nearest second. Figure 5.1 shows the benchmarks from the tests
executed. Table 5.2 shows the magnitude of acceleration the DFE
implementation provides relative to the CPU-only implementation for each
data set size.
[Bar chart: runtime in seconds (y axis) against data set ID (x axis, 1–3), comparing CPU runtime (rounded) with DFE accelerated runtime (rounded).]
Figure 5.1 Benchmark Results
Table 5.2 Magnitude of Acceleration the DFE Implementation Provides
| Data set ID | Magnitude of acceleration |
|-------------|---------------------------|
| 1           | 3.08                      |
| 2           | 3.06                      |
| 3           | 3.08                      |
It is clear that the DFE accelerated implementation runs significantly faster
than the CPU implementation. As data set size increases, the magnitude of
acceleration in my DFE design remains relatively constant; no significant
difference outside the region of error is noticeable. Averaged across the data
sets, the DFE implementation runs approximately 3.07 times faster.
6 Conclusion
My initial brief set out to explore how much faster FPGAs could accelerate a
sentiment analysis algorithm ran upon a large corpus of tweets. To do this I have
produced two program implementations, one purely CPU based, and one which
makes use of FPGA for sentiment analysis algorithm calculations. My objective
to benchmark the run times of these and see how implementations scale with data
set size has been shown – the DFE accelerated implementation performs
approximately 3.07 orders of magnitude faster than the CPU implementation and
seems to be consistent as data set sizes increase.
Unfortunately, complications with the hardware setup left me only 8 days to
test and debug my design on hardware, so I was unable to research and
experiment further with DFE optimizations or a multi-threaded CPU
implementation. Additionally, I would have liked to experiment with much
larger sample sizes of tweets (50–100 million+). Given more time, I would
certainly explore these areas of the project.
Reflecting on this project, I have improved many skills and learned new ones.
Prior to the project I had not heard of FPGAs or Maxeler, but I now have a
good general overview of the technology and may pursue further research on
this path. I feel fluent in the Linux ecosystem – with emphasis on the
terminal – and I am now familiar with the C/C++ programming languages,
which I did not know before undertaking this project.
Although my results were predictable in the sense that the FPGA
implementation would be faster, I have achieved quantifiable results and
demonstrated them in an area that is of interest to many others. I feel FPGAs
have the potential to give massive advantages in areas where fast computations
and simulations can help predict trends, and my work reinforces that idea.
In the future, I would like to see this project optimized further, and to
experiment with multiple types of analysis algorithm, including testing
different sentiment dictionaries within the current project. Ultimately, if the
program could analyse live streams of tweets, real-time applications of this
program could quickly be realized.
7 References
Ramaswamy, S. (2011) Visualization of the Sentiment of the Tweets. A thesis
submitted to the Graduate Faculty of North Carolina State University in partial
fulfillment of the requirements for the Degree of Master of Science. Raleigh:
North Carolina State University.
Vanderbauwhede, W. Azzopardi, L. and Moadeli, M. (2009) FPGA-accelerated
Information Retrieval: High-efficiency document filtering. International
conference on Field Programmable Logic and Applications, 2009. FPL 2009.
IEEE (pp. 417-422).
Azzopardi, L. Vanderbauwhede, W. and Moadeli, M. (2009, July). Developing
energy efficient filtering systems. In Proceedings of the 32nd international ACM
SIGIR conference on Research and development in information retrieval (pp.
664-665). ACM.
Vanderbauwhede, W., Frolov, A., Chalamalasetti, S. R., & Margala, M. (2014).
A hybrid CPU-FPGA system for high throughput (10Gb/s) streaming document
classification. ACM SIGARCH Computer Architecture News, 41(5), 53-58.
Arora, R. and Srinivasa, S. (2014). A faceted characterization of the opinion
mining landscape. In COMSNETS 2014 (pp. 1-6).
Russell, J. A. (1980). A circumplex model of affect. Journal of personality and
social psychology, 39(6), 1161.
Healey, C. and Ramaswamy, S. (2011) Twitter Sentiment Visualization. [Online]
Available from: http://www.csc.ncsu.edu/faculty/healey/tweet_viz/ [Accessed:
07 January 2015]
Internetlivestats.com. Twitter usage statistics. [Online] Available from:
http://www.internetlivestats.com/twitter-statistics/ [Accessed: 07 January 2015]
Office for National Statistics. (2013) Personal Well-being across the UK,
2012/13. [Online] Available from:
http://www.ons.gov.uk/ons/dcp171778_328486.pdf
[Accessed: 07 January 2015]
Groenfeldt, T. (2012). Supercomputer manages fixed income risk at
JPMorgan. [Online] Available from:
http://www.forbes.com/sites/tomgroenfeldt/2012/03/20/supercomputer-manages-
fixed-income-risk-at-jpmorgan/ [Accessed: 07 January 2015]
Krikorian, R. (2013) New Tweets per second record, and how! [Online]
Available from: https://blog.twitter.com/2013/new-tweets-per-second-record-
and-how [Accessed: 07 January 2015]
Text REtrieval Conference (2011) Tweets2011 Twitter Collection [Online]
Available from: http://trec.nist.gov/data/tweets/ [Accessed: 05 May 2015]
Maxeler, Dataflow Computing | Maxeler Technologies [Online]
Available from: https://www.maxeler.com/technology/dataflow-computing/
[Accessed: 05 May 2015]
Appendix A Tweet JSON Sample

{
  "text": "accident Chef salad is calling my name, I\u0027m so hungry!",
  "id_str": "28965131362770944",
  "id": 28965131362770944,
  "created_at": "Sun Jan 23 24:00:00 +0000 2011",
  "retweeted": true,
  "retweet_count": 1,
  "favorited": true,
  "user": {
    "id_str": "27144739",
    "id": 27144739,
    "screen_name": "LovelyThang80",
    "name": "One of A Kind"
  },
  "requested_id": 28965131362770944
}
Appendix B Tweet Kernel Design
Appendix C Tweet Kernel Manager
Appendix D DFE Terminal Runs
Appendix E Partial DFE run log
Appendix F Tweet Calculator Kernel
class TweetCalculatorKernel extends Kernel
{
    private static final DFEType floatType = dfeFloat(8, 24);

    protected TweetCalculatorKernel(KernelParameters parameters, int X, int C)
    {
        super(parameters);

        // inputs
        DFEVar in_mean = io.input("in_mean", floatType);
        DFEVar in_sd = io.input("in_sd", floatType);
        DFEVar in_n = io.input("in_n", floatType);

        // counters
        CounterChain chain = control.count.makeCounterChain();
        DFEVar x = chain.addCounter(X, 1);
        chain.addCounter(C, 1);

        DFEVar xSum = in_mean * in_n;

        // source-less stream
        DFEVar carriedMean = floatType.newInstance(this);

        // head
        optimization.pushPipeliningFactor(0);
        DFEVar wCarriedMean = x.eq(0) ? 0.0 : carriedMean;
        optimization.popPipeliningFactor();

        // body
        DFEVar nwCarriedMean = xSum + wCarriedMean;

        // foot
        carriedMean <== stream.offset(nwCarriedMean, -C);

        // source-less stream
        DFEVar carriedSD = floatType.newInstance(this);

        // head
        optimization.pushPipeliningFactor(0);
        DFEVar wCarriedSD = x.eq(0) ? 0.0 : carriedSD;
        optimization.popPipeliningFactor();

        // body
        DFEVar nwCarriedSD = (in_mean !== 0)
            ? ((in_sd * in_sd) * (in_n - 1)) + ((xSum * xSum) / in_n) + wCarriedSD
            : wCarriedSD;
        nwCarriedSD = optimization.pipeline(nwCarriedSD);

        // foot
        carriedSD <== stream.offset(nwCarriedSD, -C);

        // source-less stream
        DFEVar carriedN = floatType.newInstance(this);

        // head
        optimization.pushPipeliningFactor(0);
        DFEVar wCarriedN = x.eq(0) ? 0.0 : carriedN;
        optimization.popPipeliningFactor();

        // body
        DFEVar nwCarriedN = in_n + wCarriedN;

        // foot
        carriedN <== stream.offset(nwCarriedN, -C);

        // Check if last word in stream for final calculation
        DFEVar conditionFinal = (x === (X - 1));

        DFEVar calculationScore = conditionFinal ? nwCarriedMean / nwCarriedN : 0.0;

        DFEVar calculationScoreSD = conditionFinal
            ? KernelMath.sqrt((nwCarriedSD - (nwCarriedMean * nwCarriedMean) / nwCarriedN) / (nwCarriedN - 1))
            : 0;

        // Output - only output on last word in stream.
        io.output("calculationScore", calculationScore, calculationScore.getType(), conditionFinal);
        io.output("calculationScoreSD", calculationScoreSD, calculationScoreSD.getType(), conditionFinal);
    }
}
Appendix G Tweet Calculator Manager
public class TweetCalculatorManager
{
    private static final int X = 8;
    private static final int C = 16;

    public static void main(String[] args)
    {
        Manager manager = new Manager(new EngineParameters(args));
        BuildConfig bc = new BuildConfig(Level.FULL_BUILD);
        KernelConfiguration currKConf = manager.getCurrentKernelConfig();

        //currKConf.optimization.setUseGlobalClockBuffer(true);
        //currKConf.optimization.setCEPipelining(5);
        //currKConf.optimization.setCEReplicationNumPartitions(2);

        bc.setBuildEffort(BuildConfig.Effort.VERY_HIGH);
        bc.setMPPRCostTableSearchRange(1, 4);
        //bc.setMPPRParallelism(2);
        manager.setBuildConfig(bc);

        // Instantiate the kernel
        Kernel kernel = new TweetCalculatorKernel(manager.makeKernelParameters(), X, C);

        //manager.setEnableStreamStatusBlocks(true);
        manager.setKernel(kernel);
        manager.setIO(IOType.ALL_CPU); // Connect all kernel ports to the CPU
        manager.createSLiCinterface(interfaceDefault());

        manager.addMaxFileConstant("X", X);
        manager.addMaxFileConstant("CFactor", C);

        manager.build();
    }

    private static EngineInterface interfaceDefault()
    {
        EngineInterface ei = new EngineInterface();

        InterfaceParam Y = ei.addParam("Y", CPUTypes.UINT32);
        InterfaceParam YInBytes = Y * X * CPUTypes.FLOAT.sizeInBytes();

        ei.setTicks("TweetCalculatorKernel", Y * X);

        ei.setStream("in_mean", CPUTypes.FLOAT, YInBytes);
        ei.setStream("in_sd", CPUTypes.FLOAT, YInBytes);
        ei.setStream("in_n", CPUTypes.FLOAT, YInBytes);
        ei.setStream("calculationScore", CPUTypes.FLOAT, YInBytes / X);
        ei.setStream("calculationScoreSD", CPUTypes.FLOAT, YInBytes / X);

        return ei;
    }
}
Appendix H Poster