Mining Social Media Data to Investigate Patient Perceptions
Regarding DMARD Pharmacotherapy for Rheumatoid Arthritis.
Dr Chanakya Sharma MBBS FRACP
This thesis is presented in partial fulfilment of the requirements for the Master
of Clinical Research degree at the University of Western Australia.
School: Graduate Research School
Year of submission: 2020
Contact details: [email protected]
Thesis declaration
I, Chanakya Sharma, certify that this thesis is my work, it has been completed during the
course of this degree, and does not breach any ethical rules with regard to the conduct of the
research.
Dr Chanakya Sharma MBBS (UWA) FRACP
Table of Contents
Thesis declaration
List of Abbreviations
Abstract
List of Tables
List of Figures
Acknowledgement
Authorship Declaration
Chapter 1: Introduction
Chapter 2: Background
2.1 Rheumatoid arthritis
2.1.1 Aetiology
2.1.2 Pathogenesis
2.1.3 Clinical Features
2.1.4 Management
2.1.5 DMARDs
2.2 Social Media and Sentiment Analysis
2.2.1 Big Data
2.2.2 Social Media Analytics
2.2.3 Data Capture
2.2.4 Preprocessing
2.2.5 Sentiment Analysis
2.3 Conclusion
Chapter 3: Scoping Review - Can sentiment analysis be conducted on social media platforms to understand public sentiment held towards pharmacotherapy?
3.1 Abstract
3.2 Methods
3.3 Results
3.3.1 Sentiment analysis techniques and accuracy
3.3.2 Sentiment analysis use
3.4 Discussion
3.5 Conclusion
Chapter 4: Mining social media data to investigate patient perceptions regarding DMARD therapy
4.1 Abstract
4.2 Methods
4.2.1 Statistics
4.3 Ethics
4.4 Results
4.4.1 B/tsDMARDs
4.4.2 CsDMARDs
4.4.3 B/tsDMARDs vs csDMARDs
4.5 Discussion
Chapter 5: Conclusion
5.1 Research Contribution
5.2 Future Directions
References
Appendix
Ethics approval
List of Abbreviations
ACR – American College of Rheumatology
API – Application Programming Interface
ARPA – Advanced Research Projects Agency
bDMARDs – Biological Disease Modifying Antirheumatic Drugs
csDMARDs – Conventional Synthetic Disease Modifying Antirheumatic Drugs
EULAR – European League Against Rheumatism
HCQ – Hydroxychloroquine
LB – Lexicon Based
LEF – Leflunomide
ML – Machine Learning
MTX – Methotrexate
QA – Quality Assessment
RA – Rheumatoid Arthritis
SA – Sentiment Analysis
SM – Social Media
SZS – Sulfasalazine
tsDMARDs – Targeted Synthetic Disease Modifying Antirheumatic Drugs
USA – United States of America
Abstract
Objectives: The hypothesis of this study is that patients have a positive sentiment regarding
b/tsDMARDs and a negative sentiment towards csDMARDs. A scoping review was conducted
to map the literature as it pertains to the use of sentiment analysis as a tool to extract
meaningful data on social media discussion on pharmacotherapy. Sophisticated sentiment
analysis algorithms were then used to analyse discussions on social media platforms regarding
DMARDs to understand the collective sentiment expressed towards these medications.
Methods: For the scoping review, a keyword search strategy was applied to several databases
and 10 studies were included. These revealed various uses of sentiment analysis, most
commonly to extract sentiment regarding a particular medication. Treato analytics were then
used to download all available social media posts about cs/b/tsDMARDs in the context
of rheumatoid arthritis. Strict filters ensured that only user-generated content was downloaded.
The sentiment (positive or negative) expressed in these posts was analysed for each DMARD
using Sentiment Analysis. An analysis was also conducted on the reason(s) for this sentiment
for each DMARD, looking specifically at efficacy and side effects.
Results: Computer algorithms analysed millions of social media posts and included 28261
posts on b/tsDMARDs and 26841 posts on csDMARDs. This revealed that all classes had an
overall positive sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs
(1.210) than for csDMARDs (1.048). Efficacy was the most commonly mentioned reason in
posts with a positive sentiment and lack of efficacy was the most commonly mentioned
reason for a negative sentiment. These were followed by the presence/absence of side effects
in negative or positive posts respectively.
Conclusion: Public opinion on social media is generally positive about DMARDs, regardless of
class. Lack of efficacy followed by side effects were the most common themes in posts with a
negative sentiment. There are clear reasons why a DMARD generates a positive or negative
sentiment, and as the sentiment analysis technology becomes more refined, targeted studies
can be done to further analyse these reasons, and allow clinicians to tailor DMARDs to match
patient needs.
List of Tables
• Table 1: Summary of studies
• Table 2: Aggregate sentiment
• Table 3: Social media platforms
• Table 4: b/tsDMARD positive and negative sentiment for efficacy and side effects
• Table 5: Positive/Negative sentiment csDMARDs reasons
• Table 6: Concerns: percentage of posts with a negative sentiment
• Table 7: Comparison of proportion of positive sentiment for efficacy amongst
b/tsDMARDs
List of Figures
• Figure 1: ACR/EULAR 2010 Rheumatoid Arthritis Classification Criteria
• Figure 2: 2019 update of the EULAR rheumatoid arthritis management
recommendations in the form of an algorithm
• Figure 3: The 5 Vs of big data
• Figure 4: Steps involved in big data analysis
• Figure 5: Types of Sentiment Analysis
• Figure 6: Study flow diagram
Acknowledgement
I would like to acknowledge the following people without whom I would not have been able
to complete this thesis.
My supervisors, Dr Helen Keen, Dr Samuel Whittle, Dr Pari Delir Haghighi and Dr Frada
Burstein. It was their continued support and guidance that allowed me to embark on this
project and see it through to completion. I would like to thank Arthritis Australia for their
generous research grant that allowed us to collect the data required for this project. I would
like to thank my family, including my parents, my wife and my children, Suhani and Sohum,
who have been ever so patient and allowed me to have the time I need to complete this
degree. Lastly, I would like to thank the staff at the University of Western Australia who
have taught me the value of research and have made this journey more enjoyable than
what I had thought it would be.
Authorship Declaration
This thesis contains work that has been published.
Details of the published papers:
• Sharma C, Whittle S, Haghighi PD, Burstein F, Keen H. Sentiment analysis of social
media posts on pharmacotherapy: A scoping review. Pharmacology Research &
Perspectives. 2020 Oct;8(5):e00640.
o Located in thesis: Chapter 3
• Sharma C, Whittle S, Haghighi PD, Burstein F, Sa'adon R, Keen H. Mining social media
data to investigate patient perceptions regarding DMARD pharmacotherapy for
rheumatoid arthritis. Annals of the Rheumatic Diseases. 2020 Sep 3.
o Located in thesis: Chapter 4
Chapter 1: Introduction
Patients with rheumatoid arthritis (RA) face debilitating and occasionally life-threatening
consequences of untreated disease. The treatment however does require using potent
immunosuppressive/immunomodulatory agents which often have several undesirable side
effects. The intimidating nature of the physician’s office, which has led to the development
of syndromes such as “white coat hypertension”, can stifle the voice of the patients. The
nature of healthcare however has rapidly changed over the last few years. The hitherto
didactic transaction from a doctor to patient has turned into a more open discussion, which
has largely been enabled by the rise of the Internet and social media. These have been key
agents in disrupting the informational imbalance that has formed the basis of the power
differential between a physician and a patient. By providing easy and rapid access to vast
reservoirs of information about health and medications, the internet has allowed patients to
better understand their conditions and their management. In addition, social media has
provided avenues for discussions to take place amongst the suffering silent majority of
patients who might have otherwise not been able to do so. The ripples of this rise in both
knowledge and discussions online are increasingly being felt in clinical practice by physicians.
Patients are now striding into appointments aware of the latest research and discussions that
are relevant to their health. This has had several positive consequences, with patients
taking increased interest in and ownership of their health. However, access to the wrong
information or the wrong discussions can be disastrous, pushing patients in a
direction that can lead to worse outcomes.
The stigma of a chronic illness can be isolating, not only physically through pain and deformity,
but also emotionally and socially. Finding a cohort of likeminded and like-suffering individuals
can be a powerful driver of emotion and sentiment. We as clinicians have a responsibility to
be aware of the discourse that occurs on these social media platforms, and to be aware of
the sentiments that patients are expressing about the condition and the treatment that they
are being exposed to, as this can have a tremendous impact on the patient’s beliefs and
consequently their actions regarding their health.
The purpose of this thesis is to understand how patients feel about the various treatment
options that are available for RA by analysing the discussions that they are having on various
social media platforms online. This has been explored over the last two years, and the findings
are summarised over the next three chapters. Chapter two explains the three aspects of this
study: RA, social media and sentiment analysis. Chapter three is a scoping review that
explores whether sentiment analysis has been used to understand pharmacotherapy. This
was published in Pharmacology Research & Perspectives in October 2020. Chapter four
describes the study, which analysed social media in its entirety, looking specifically at discussions
on the various DMARDs and the sentiment that was being expressed towards
these medications. This was published in Annals of the Rheumatic Diseases in September
2020. The final chapter presents the conclusion of our findings and our thoughts on the future
of such analytics.
Chapter 2: Background
Information technology has, over the last few years, rapidly ingrained itself in every aspect of
human life. Health care has traditionally been wary of change and slow to incorporate
new technology. However, major advances in the fields of big data analytics and artificial
intelligence have unmasked the potential for revolutionising patient care, something that
healthcare professionals can ill afford to ignore. This chapter provides an introduction to both
the information technology (social media and sentiment analysis) and healthcare (RA and
DMARDs) aspects of our study.
2.1 Rheumatoid arthritis
Rheumatoid arthritis (RA) is a chronic, inflammatory disorder that if untreated will lead to
irreversible destruction of the joints. It has a global prevalence of 0.24% (1). While the earliest
known mentions of RA date back to the Ebers Papyrus from 1500 BC, it wasn’t until 1800 that
a young medical resident by the name of Augustin Jacob Landré-Beauvais first described RA,
although he felt it was a type of gout, labelling it “Goutte Asthénique Primitive” (2). It would
take another 90 years however before the constellation of symptoms would formally be
labelled as RA (3). Even though RA spent its first three millennia in relative obscurity, the pace
of developments in the last 130 years has been nothing short of astounding.
2.1.1 Aetiology
Most of the developments in RA in the early to mid-20th century were in the fields of
pathophysiology, with it being established as an autoimmune disease with both genetic and
environmental components (4). The exact aetiology, however, remains unknown. There is a
genetic component, with twin studies showing heritability of up to 60% (5). Genetic
studies have strongly suggested a polygenic cause of RA, with population-based studies
identifying over 100 loci associated with the development of RA (6,7). The strongest genetic
association thus far has been found with HLA-DRB1, with PTPN22 and DPP4 being amongst
several non-HLA loci that have been associated with increased RA susceptibility (8).
While genetics do play a role, they are not enough on their own to cause disease, and there
is likely to be a “second hit”, usually an environmental risk factor, that triggers the onset of
RA, although this has yet to be proven. Cigarette smoking remains one of the strongest
environmental risk factors for RA, with the risk being proportional to the number and duration
of cigarettes smoked (9). Various infectious agents have also been postulated to play a role in
the development of RA. Although the exact mechanism by which they induce disease has not
been elucidated, molecular mimicry likely plays a role (10). More recently, alterations in the
gut microbiome have also been implicated in enhancing the susceptibility to RA (11). Exposure
to these environmental risk factors can result in the development of autoantibodies
associated with RA, which have been found in up to 80% of RA patients. The two most
commonly found antibodies are rheumatoid factor (RF) and anti-cyclic citrullinated peptide
antibodies (anti-CCP). While these autoantibodies have been shown to play a part in the
pathogenesis of RA (likely through immune complex formation and complement activation),
their exact role remains unclear (12). Thus, while the exact aetiology of RA remains unknown,
it is clearly a complicated, multifaceted autoimmune process that involves the interplay of
both genetics and the environment.
2.1.2 Pathogenesis
The pathologic hallmark of RA is inflammation of the synovial tissue that lines the joints,
known as synovitis. Uncontrolled synovitis leads to destruction of cartilage and bone in the
joint, resulting in the clinical manifestations of RA. The inflamed synovium comprises a
variety of immune cells including T cells, B cells, plasma cells, natural killer cells, macrophages
and neutrophils (13). The migration and influx of these immune cells in the synovium occurs
due to the upregulation of cytokines and chemokines. The final impact of this symphony of
inflammatory and immune cells is osteoclast generation and chondrocyte stimulation, which
results in bone and cartilage destruction respectively (14).
2.1.3 Clinical Features
Rheumatoid arthritis is predominantly a disease of the small joints of the hands and feet. It
typically results in symmetric, synovial inflammation and tenderness across various joints,
with patients describing swelling and morning stiffness lasting over an hour. Blood tests can
show raised inflammatory markers, reflected by a high C-reactive protein (CRP) level and/or
erythrocyte sedimentation rate (ESR). Positivity for one or both of the autoantibodies
(rheumatoid factor and/or CCP antibodies) can also be seen in most patients. These
findings are reflected in the 2010 American College of Rheumatology/European League
Against Rheumatism classification criteria for RA (Figure 1). These criteria were designed to assess
patients for suitability for inclusion in research and not for making a clinical diagnosis;
however, they are often used in clinical settings to support a diagnosis of RA (15). In addition
to the joint involvement, patients with RA can have so-called extra-articular manifestations
ranging from vasculitis to interstitial lung disease. Long-term consequences of untreated RA
can be equally severe with development of amyloidosis or lymphoma (14).
2010 American College of Rheumatology/European League Against Rheumatism classification criteria for rheumatoid arthritis
Criterion: Score
Joint involvement
• 1 large joint: 0
• 2–10 large joints: 1
• 1–3 small joints: 2
• 4–10 small joints: 3
• >10 joints (at least 1 small joint): 5
Serology
• Negative RF and negative ACPA: 0
• Low-positive RF or low-positive ACPA: 2
• High-positive RF or high-positive ACPA: 3
Acute-phase reactants
• Normal CRP and normal ESR: 0
• Abnormal CRP or abnormal ESR: 1
Duration of symptoms
• <6 weeks: 0
• ≥6 weeks: 1
A total score of ≥6 is needed to classify a patient as having definite RA.
Figure 1: ACR/EULAR 2010 Rheumatoid Arthritis Classification Criteria
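The additive scoring in Figure 1 can be illustrated with a short sketch. This is not part of the published criteria: the function names, the serology labels and the rule that the highest applicable joint category is used are assumptions made for illustration, and the point values are taken directly from the figure. As noted above, the criteria are intended for research classification rather than clinical diagnosis.

```python
# Illustrative sketch only: point values are copied from Figure 1.
# Function names and serology labels are hypothetical.
SEROLOGY_POINTS = {"negative": 0, "low_positive": 2, "high_positive": 3}


def joint_points(large_joints: int, small_joints: int) -> int:
    """Points for joint involvement.

    Assumption: when several categories apply, the highest one is used.
    """
    if large_joints + small_joints > 10 and small_joints >= 1:
        return 5
    if small_joints >= 4:
        return 3
    if small_joints >= 1:
        return 2
    if large_joints >= 2:
        return 1
    return 0


def classification_score(large_joints: int, small_joints: int, serology: str,
                         abnormal_acute_phase: bool, symptom_weeks: float) -> int:
    """Sum the four domains of the classification criteria."""
    return (
        joint_points(large_joints, small_joints)
        + SEROLOGY_POINTS[serology]
        + (1 if abnormal_acute_phase else 0)
        + (1 if symptom_weeks >= 6 else 0)
    )


def meets_classification(score: int) -> bool:
    """A total score of 6 or more classifies a patient as having definite RA."""
    return score >= 6


if __name__ == "__main__":
    # Hypothetical patient: 5 small joints, high-positive serology,
    # normal acute-phase reactants, 8 weeks of symptoms.
    score = classification_score(0, 5, "high_positive", False, 8)
    print(score, meets_classification(score))
```

In this hypothetical example the joint, serology and duration domains contribute 3, 3 and 1 points respectively, so the threshold of 6 is met without any contribution from the acute-phase reactants.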
2.1.4 Management
The aim of the management of RA is to suppress inflammation, both systemic and synovial,
in order to prevent long-term joint damage, thus reducing morbidity and mortality. This
suppression of inflammatory activity is achieved by the use of a heterogeneous group of drugs
collectively known as Disease Modifying Antirheumatic Drugs (DMARDs). While these drugs
exert their effect by different mechanisms, the common thread that joins them together is
the fact that they all suppress disease activity and joint damage. The goal of using DMARD
treatment is to achieve remission, which is done by a process known as treat-to-target (16).
A treat-to-target approach identifies a target (usually remission) which needs to be achieved
by tailoring treatment at every individual consultation. Three broad approaches have been
used to tailor DMARD therapy: step-up combination therapy, initial combination
therapy and sequential monotherapy (17). Sequential monotherapy has largely been
abandoned in favour of the former two approaches, either of which can be used depending on
the severity of the patient’s symptoms. There are several clinical measures of disease activity
that can be used to assess whether a patient is in remission or has low, moderate or high
disease activity. The six measures of disease activity that have been endorsed by the
American College of Rheumatology include: Patient Activity Scale (PAS) or PASII (range 0–10),
Routine Assessment of Patient Index Data 3 (RAPID3) (range 0–10), Clinical Disease Activity
Index (CDAI) (range 0–76.0), Disease Activity Score (DAS) 28 erythrocyte sedimentation rate
(ESR) (range 0–9.4) and Simplified Disease Activity Index (SDAI) (range 0–86.0). According to
the 2015 ACR/EULAR task force, remission is defined “as a tender joint count, swollen joint
count, C-reactive protein level (mg/dl), and patient global assessment of ≤1 each or a
Simplified DAS of ≤3.3” (18). If this target has not been achieved by the patient then DMARD
therapy should typically be escalated so as to achieve remission.
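The remission definition quoted above can be expressed as a simple check, which also illustrates the treat-to-target decision at a single consultation. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the patient global assessment is assumed to be on a 0–10 scale, and the simplified index score is taken as an already-calculated input rather than computed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    """One consultation's measurements (hypothetical structure)."""
    tender_joint_count: int
    swollen_joint_count: int
    crp_mg_dl: float
    patient_global: float          # assumed 0-10 scale
    simplified_index: Optional[float] = None  # pre-calculated score, if available


def in_remission(a: Assessment) -> bool:
    """Remission per the quoted definition: each component <= 1, or index <= 3.3."""
    boolean_definition = all(
        value <= 1
        for value in (a.tender_joint_count, a.swollen_joint_count,
                      a.crp_mg_dl, a.patient_global)
    )
    index_definition = a.simplified_index is not None and a.simplified_index <= 3.3
    return boolean_definition or index_definition


def treat_to_target_action(a: Assessment) -> str:
    """If the target has not been reached, therapy is typically escalated."""
    return "continue current therapy" if in_remission(a) else "consider escalating DMARD therapy"


if __name__ == "__main__":
    visit = Assessment(tender_joint_count=3, swollen_joint_count=2,
                       crp_mg_dl=1.5, patient_global=4.0, simplified_index=12.0)
    print(treat_to_target_action(visit))
```

The output is a prompt for clinical review only; in practice the decision to escalate also depends on factors that this sketch does not model.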
2.1.5 DMARDs
There are three broad categories of DMARDs: conventional synthetic DMARDs
(csDMARDs), biological DMARDs (bDMARDs) and targeted synthetic DMARDs (tsDMARDs).
Conventional synthetic DMARDs are the oldest of the three and are universally accepted
as first-line agents in treating newly diagnosed patients with RA (18, 19). In patients in
whom adequate disease control or remission is not achieved with (combination) csDMARDs,
either a bDMARD or a tsDMARD will need to be added to the treatment regimen (Figure 2).
Figure 2: 2019 update of the EULAR RA management recommendations in the form of an
algorithm
While the number of csDMARDs has remained static for a number of years, the number of
b/tsDMARDs available for RA has rapidly increased over the past few years. This has been
mirrored by the rise in total healthcare costs associated with DMARD therapy. By 2014 the
actual cost of bDMARDs to the Pharmaceutical Benefits Scheme was more than double
what had originally been estimated (20). Despite there being an improvement in outcomes
for RA patients, medication adherence rates, especially with csDMARDs, have been poor with
some studies showing full adherence in as few as 30% of patients (21,22). Evidence is
emerging that some patients are progressing to b/tsDMARDs without using csDMARDs as
prior or co-therapy, in contrast to guidelines and typical regulatory rules (23). Anecdotal
evidence suggests that one of the factors driving this is patient perception, which
appears to be strongly positive for the b/tsDMARDs as compared to the csDMARDs.
Patient concordance with medications is associated with improved outcomes in RA (24,25).
One of the biggest factors affecting concordance is the patient’s personal belief about the
disease and medications (26). Studies have shown that in order to improve adherence with
DMARDs, clinicians should focus less on provision of medical information and be more aware
of patients’ beliefs (27). Understanding patient beliefs however is difficult and often relies on
qualitative studies. While these are excellent at providing an in-depth thematic analysis of a
specific issue, they are traditionally conducted on a small scale and might not be
representative of a diverse population. A novel method of obtaining vast amounts of
patient-originated content is by analysing comments made on social media. As more and
more industries are turning to analysing crowd sourced data generated on social media to
better understand their customer base, we looked at the possibility of using this data to
understand patient sentiment towards DMARDs. Our hypothesis was that patient sentiment
is positively skewed in favour of the b/tsDMARDs and negatively so towards the csDMARDs.
This hypothesis was generated from collective clinical observations of the investigating
cohort, and further corroborated by discussions with other experienced rheumatologists who
agreed that the hypothesis reflected their clinical interactions. While it is possible that
manifestations of RA itself could be a cause of varied sentiment amongst patients, much of
the variability in RA patients lies in the treatment response and tolerability of medications.
This, together with the demographic heterogeneity of patients with RA across the world, is, I
believe, a far greater driver of the varied perceptions than disease-specific characteristics.
2.2 Social Media and Sentiment Analysis
On 4th October 1957 the Soviet Union launched Sputnik 1, the world’s first artificial satellite,
into space. This started the so-called “space race” and prompted the United States of America
(USA) to develop the U.S. Department of Defense’s Advanced Research Projects Agency
(ARPA). A consequence of this project was the creation of a network of computers with
remote logins. Little did the original creators of this network know that they were laying the
foundations of what would later become known as “the internet” (28). The internet’s original
iteration (so-called Web 1.0 or Semantic web) was designed primarily to be a read-only,
unidirectional source of information from a few to the many. The bursting of the “dot-com
bubble”, amongst other factors in the early 2000s, however led to a re-evaluation and the
emergence of the next generation of the world wide web, so-called “Web 2.0” or “Social Web”.
Web 2.0 is an umbrella term for new easy-to-use services, such as blogs and social media, that
began to be provided on the internet in the mid-2000s and allowed users to generate their
own content (29).
The development of Web 2.0 has allowed the internet to become a more interactive platform
for its users, thus allowing social media to flourish (30). Social media is defined as “a group of
Internet-based applications that build on the ideological and technological foundations of
Web 2.0, and that allow the creation and exchange of user generated content” (31). By 2007
six percent of the internet’s population was on social media, a number that would nearly
double by 2011 to 11% (1.2 billion) (32). As of 2015, 76% of all American adults use social
media (33). Some of the most popular social media platforms include Facebook, YouTube,
WhatsApp, Twitter, TikTok and Reddit. The growth of social media has been
unprecedented, with no evidence of it slowing down. In parallel with its reach has been a
tremendous rise in the power of social media to shape public opinion and, in extreme cases,
result in collective action. This has been demonstrated on several occasions in the past few
years, from the 2011 Arab Spring, which spread across twenty countries, the Occupy
movements and the 2013 Brazil protests to Brexit and the American presidential election of 2016
(34-36). The powerful impact social media can have on users and their friends and family is
being explored by more and more industries to gain insights into their user base and
consequently drive change (37,38).
2.2.1 Big Data
While social connectivity was the primary objective of social media, one of its most lucrative
by-products has been user-generated data. This has indeed become the cash
cow for the majority of the social media empires such as Facebook, Twitter and Google. This
vast repository of data, which collectively is known as “Big Data”, is not just limited to textual
content but can also include videos, movies and sounds amongst other types. Broadly
speaking however it falls into two categories, structured or unstructured data. While the
former describes data that is allocated to predefined fields, the latter has no recognisable
order to it (39).
While there is no universal definition of Big Data, it is generally accepted as data that is too
big to be handled and analysed by traditional database protocols (40). It was initially defined
by having three characteristics (called the 3 Vs) of volume, velocity and variety (41). There is
no consensus on the volume at which data becomes big data. It has been reported that
Facebook and Twitter alone are generating 50 gigabytes of data per day, a value that triples
every 3-5 years (42). Variety refers to the heterogeneity of the data, which, as stated above,
is either structured (~5% of all big data) or unstructured. Velocity refers to the rate of data
generation and analyses (43). As the dimensions of big data have become clearer, more Vs
have been added to the list, with the two most commonly recognised ones being “veracity”
which highlights the large amount of noise that often gets collected with big data, and
“value”, which is perhaps the most important aspect of big data and refers to the usefulness
of the data being obtained (44).
Figure 3: The 5 Vs of Big Data (volume, velocity, variety, veracity and value)
Big data is useless unless it can be used to drive meaningful change. This need to understand
big data in order to draw meaningful conclusions has resulted in the development of Big Data
Analytics. Big Data Analytics deals with complex data that is often unstructured and uses a
variety of tools, ranging from artificial intelligence and machine learning to statistical
modelling, to detect patterns within this unstructured data which can then be used to gain
insights and change practice (45). While there are minor variations depending on the source
and the need for the analysis, the process of deriving meaningful insights from Big Data
broadly occurs over five steps, which are expressed in the diagram below (46):
Figure 4: Steps involved in big data analysis (in order: data acquisition; information
extraction and cleaning; data integration, aggregation and representation; data modelling
and analysis; interpretation)
2.2.2 Social Media Analytics
When big data that has been generated by social media is analysed, this process is termed
Social Media Analytics. It is defined as “an emerging interdisciplinary research field that aims
on combining, extending, and adapting methods for analysis of social media data” (47).
Broadly speaking there are two types of social media analytics: content based analytics,
which focuses on the unstructured content posted by users on social media to derive insights,
and structure-based analytics, which looks at the structure of a social network and extracts
information based on the relationships between the users (43). Very few studies have sought
to develop a standardised approach towards social media analytics, and no “gold standard”
approach has been established (48). A widely recognised approach was proposed by Fan et al
and is known as the “CUP” framework. This stands for “capture”, “understand” and
“present”. As the names suggest, the capture stage refers to the acquisition of data and also
involves processing this data so it is easily readable by the algorithms. The understand stage
refers to the actual analytic stage, which could involve various methodologies to conduct
classification or predictive modelling. The present stage deals with the presentation,
interpretation and application of the analysed data (49).
2.2.3 Data Capture
Data acquisition is the first step in social media analytics and is concerned with collecting
data. Researchers can either narrow the source of the data to particular social media
platforms or, thanks to the open access nature of most social media, obtain all publicly
available social media content across various platforms. This has become increasingly
possible due to the growth of public Application Programming Interfaces (APIs). An API “is a
way for two computer applications to talk to each other over a network (predominantly the
Internet) using a common language that they both understand” (50). Historically, APIs
predate the onset of Web 2.0 and have been present as “private APIs” across various
technology companies (51). However, the development of Web 2.0 led to the parallel
development of so-called “open APIs”, which are available for public use. Open APIs for
companies such as Twitter and Google have resulted in public access to massive amounts of
data. This information is collated into a corpus or dataset which can then be cleaned and
analysed. While APIs are the most common method used to obtain large quantities of data
online, another popular method involves the use of web crawlers. A web crawler “is a system
for the bulk downloading of web pages” (52). A crawl typically starts with a list of web
addresses which are “crawled” by the program: any information found is stored in a
pre-specified repository, and any new web addresses detected during the crawl are
subsequently visited and the process of data finding and storage is repeated. Each system
has its own merits, and which one gets used will depend on various factors including the
reason for data collection and the resources (including APIs) available.
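The crawl loop described above can be sketched in a few lines of Python. This is an illustration only and not the method used in this thesis: the in-memory FAKE_WEB dictionary, the "site/..." addresses and the crawl function are hypothetical stand-ins, and a real crawler would download pages over the network and respect each site's access rules.

```python
import re
from collections import deque

# Hypothetical in-memory "web": address -> page text. A real crawler would
# download each page over HTTP instead of reading from this dictionary.
FAKE_WEB = {
    "site/a": "Methotrexate thread. See site/b and site/c",
    "site/b": "Biologics thread. See site/a",
    "site/c": "Side effects thread. See site/d",
    "site/d": "No further links here",
}

LINK_PATTERN = re.compile(r"site/\w+")


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl: visit the seeds, store pages, follow new links."""
    repository = {}              # pre-specified store for crawled content
    queue = deque(seed_urls)     # addresses waiting to be visited
    seen = set(seed_urls)        # addresses already queued, to avoid repeats
    while queue and len(repository) < max_pages:
        url = queue.popleft()
        page = fetch(url)
        if page is None:         # unreachable address: skip it
            continue
        repository[url] = page
        for link in LINK_PATTERN.findall(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return repository


pages = crawl(["site/a"], FAKE_WEB.get)
print(sorted(pages))
```

Starting from a single seed address, the sketch reaches every page that is linked directly or indirectly from it, which is the behaviour the definition of a crawler implies.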
2.2.4 Preprocessing
The data captured from social media will typically have both structured and unstructured
components. Structured data, as stated above, will have pre-assigned categories (e.g. user
information, demographics etc), whereas unstructured data will be devoid of most such
identifiable categories. The majority of the data available on the internet is likely to be
unstructured. This data will then need to undergo a process of text preprocessing. The aim of
preprocessing is to make the data more readable for the analytical software without
impacting on the information that it provides. This is done in several steps including cleaning,
normalisation, transformation and data reduction, which then yield a “cleaned” dataset that
can be analysed (53). Some of the most common types of text preprocessing are tokenisation,
stop-word removal, lowercase conversion, and stemming (54). This is not always a linear
process, with different steps being done at different times, depending on the type of data
being analysed.
Tokenisation is one of the most common forms of text preprocessing. It involves dividing the
corpus of text into smaller units, which could be words, phrases or other meaningful
elements, and which are called tokens (55). Usual practice is to combine consecutive words
into “n-grams”, where n represents the number of words being combined, thus resulting in
unigrams, bigrams, trigrams etc. This has been shown to improve text classification (56).
Tokenisation of social media data is considerably more difficult due to the widespread use of
slang, abbreviations and emoticons (61). Stop words are words that are used frequently in a
language yet carry no inherent meaning (such as pronouns and prepositions). Exclusion of
these stop words, by a text preprocessing technique called “stop-word removal”, prior to
data analysis has been shown to reduce problems encountered in classifying the text by
machine learning algorithms, and has not been shown to impact the accuracy of text analysis
(62). While there is no universally accepted list of stop words, most text preprocessing
software has a predefined list of terms deemed to be stop words. Lowercase conversion, as
its name suggests, is the changing of all characters to lowercase, as generally there is no
difference in the meaning of a word when it is changed from upper to lowercase. This change
has nevertheless been shown to improve the accuracy of the text analysis (54). Stemming
refers to the process of reducing each word to its root or stem in order to reduce the
grammatical variations of the word (59). This typically involves removing the suffixes of
words that share the same “stem”. A more sophisticated version of this is called
lemmatisation, which involves determining which words have the same root despite their
structural differences (60). The usefulness of these methods (and other less common
methods of text preprocessing) has been analysed by various studies, with most of them
concluding that there is no universal fit and that it is more important to choose the right
technique based on the platform and language than to adopt a one-size-fits-all approach
(61-64).
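The four preprocessing steps described above can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: the stop-word list is a tiny hypothetical one, and the suffix-stripping function is a deliberately naive stand-in for an established stemmer, so it is not the pipeline used in any of the cited studies.

```python
import re

# Illustrative stop-word list; real toolkits ship much longer lists.
STOP_WORDS = {"i", "the", "a", "an", "is", "was", "were", "and", "my",
              "on", "of", "to", "it"}

# Naive suffix stripping, standing in for an established stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into word tokens (unigrams)."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens, n):
    """Join each run of n consecutive tokens into one n-gram."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


post = "The injections were hurting and my joints stopped aching"
tokens = [stem(t) for t in remove_stop_words(tokenize(post))]
print(tokens)
print(ngrams(tokens, 2))
```

The crude stems this produces for words such as "stopped" and "aching" show why stemming is only an approximation, and why lemmatisation is described as the more sophisticated alternative.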
2.2.5 Sentiment Analysis
Once the data corpus has been preprocessed and cleaned, it is ready for the analysis. There
are several different methods of conducting social media content analysis, depending on the
questions being asked. One of the most common types of social media content analysis is to
try and detect the aggregate opinion held towards a particular product, or as is the case in
this study, pharmacotherapy. This is done via a technique known as Sentiment Analysis (SA);
also termed “opinion mining” (65). Sentiment Analysis involves assigning an integer value to
the corpus of text, depending on the sentiment being expressed in that text. Words with
negative sentiment get negative scores and vice versa (66). For example, the term “painful”
might receive a negative score, whereas “beautiful” will usually receive a positive score.
Sentiment Analysis typically occurs in two steps. The first step is known as “subjectivity
classification”, which assesses whether the sentence is subjective or objective. If the
sentence is objective then no further action will be taken, but if it is subjective then the
second part of the analysis occurs, known as “polarity classification”. This is the step that
analyses and assigns a sentiment to the text (67). This can be done at various levels,
including the level of the document, the sentence, phrases or words (65).
Figure 5: Types of Sentiment Analysis (lexicon based, machine learning [supervised or
unsupervised], and hybrid)
There are three ways by which sentiment analysis can be done: lexicon based (also known as
knowledge based), machine learning (also known as statistical) or a hybrid of the two (68).
The lexicon based method requires the use or development of a lexicon, a collection of words
or phrases with their sentiment polarity mapped and scored. These words are then searched
for in the target document and their scores are aggregated to obtain an overall sentiment
score for the document (69). These lexica can be created manually or via automated means.
There are several pre-existing lexica that are commonly used in conducting SA, such as
Subjectivity Lexicon, General Inquirer and SentiWordNet, to name a few (70-72). As
sentiment analysis is highly domain specific, it is important to ensure that the right lexicon is
being used when analysing a particular corpus of data (73). This has been demonstrated in
studies that have tested the accuracy of various lexica across domains and shown that the
accuracy depends more on selecting an appropriate lexicon for the domain than on the
lexicon itself (74). For example, the phrase “…a hair raising journey with unexpected twists
and turns” might result in a positive sentiment if being analysed by a lexicon designed to
assess movie reviews, however if the same phrase was assessed by a lexicon designed to
assess public transport, it would result in a negative sentiment.
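A minimal Python sketch of lexicon based scoring is given below to make the aggregation step and the domain effect concrete. The three lexica and their scores are invented for illustration; they are not drawn from the Subjectivity Lexicon, SentiWordNet or any other published resource.

```python
import re

# Hypothetical toy lexica: word -> polarity score. Published lexica are far larger.
GENERAL_LEXICON = {"painful": -2, "beautiful": 2, "relief": 2, "worse": -1}
# Assumed domain-specific entries for an imagined film-review lexicon...
MOVIE_LEXICON = {**GENERAL_LEXICON, "unexpected": 1, "twists": 1}
# ...and for an imagined public-transport lexicon.
TRANSPORT_LEXICON = {**GENERAL_LEXICON, "unexpected": -1, "twists": -1}


def sentiment_score(text, lexicon):
    """Sum the lexicon scores of every word found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def polarity(score):
    """Map an aggregate score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


phrase = "a hair raising journey with unexpected twists and turns"
for name, lexicon in [("movie", MOVIE_LEXICON), ("transport", TRANSPORT_LEXICON)]:
    print(name, polarity(sentiment_score(phrase, lexicon)))
```

The same phrase is labelled positive under the film-review lexicon and negative under the transport lexicon, purely because the two lexica score the same words differently.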
Machine learning is the process of “programming computers to optimise a performance
criterion using example data” (75). There are several different types of machine learning
algorithms, but they are usually divided into supervised and unsupervised learning algorithms
(76). The basic difference between the two is that while in the former the algorithm “learns”
on a labelled dataset and is then applied to the actual dataset, the latter is run on an
unlabelled dataset which it tries to make sense of. While unsupervised approaches such as
Probabilistic Latent Semantic Analysis and Latent Dirichlet Allocation have been used in
sentiment analysis, their results are often incoherent as the topics being detected do not
always correlate with human judgements. However, more recent developments have shown
promise in handling large, unstructured datasets. These use layers of simple interconnected
processing units, stacked in a way that loosely resembles the human brain, which is where
they get their name from: “Artificial Neural Networks”. Some of these networks can be
hundreds of layers ‘deep’, hence the name “deep learning”. This is a new and exciting area of
machine learning, in which there is reduced dependency on labelled/structured data, with
the algorithm itself being able to “learn” what is relevant from the unstructured data (77).
Supervised machine learning algorithms (such as Naïve Bayes or Support Vector Machines)
are well suited to sentiment analysis (78). While these supervised learning models can reach
very high levels of accuracy, they are quite domain specific, and using the same model on a
different dataset (one it was not trained on) can result in a dramatic drop in accuracy (79).
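To illustrate what “learning” on a labelled dataset means in practice, the following Python sketch trains a very small Naïve Bayes classifier. It is a toy under stated assumptions: the four training posts are made up, words are split on whitespace only, and add-one smoothing is used; it is not the classifier of any study cited here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_posts):
    """Count words per class from (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word frequencies
    class_counts = Counter()             # label -> number of posts
    vocabulary = set()
    for text, label in labelled_posts:
        words = text.lower().split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocabulary.update(words)
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior probability."""
    word_counts, class_counts, vocabulary = model
    total_posts = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_posts)   # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue   # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labelled training posts, for illustration only.
training_set = [
    ("this drug gave me great relief", "positive"),
    ("feeling great since starting treatment", "positive"),
    ("terrible nausea and awful fatigue", "negative"),
    ("awful side effects made me stop", "negative"),
]
model = train_naive_bayes(training_set)
print(classify("great relief", model))
print(classify("awful nausea", model))
```

Because the model only knows the words it saw during training, a post written in the vocabulary of a different domain would contribute little or nothing to the score, which is one way to picture the drop in accuracy described above.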
One of the first steps of machine learning models is feature selection. A “feature”, in the
context of data analytics, is simply an individual, measurable aspect of the data being
analysed (80). It is important to choose the right features to segregate the data as this allows
for more effective and accurate analysis. If too many features are selected then the task can
be too computationally intensive and difficult, whereas if too few features are selected then
the results might not be accurate. Optimum feature selection needs to meet two basic
requirements: it needs to result in high learning accuracy while at the same time keeping
computational overhead low. There are several ways by which feature selection can be done,
however it needs to be individualised for the project at hand to ensure accuracy while limiting
costs (81, 82).
Once the features have been selected, these are then used on a subset of the data, called the
training set, to train the algorithm or “classifier”. There are several different types of classifier
algorithms used in machine learning, but they broadly fall into four categories, namely linear
classifiers (such as Naïve Bayes), support vector machines, decision trees and neural networks
(83). The purpose of this classifier is to analyse the labelled training data and learn to predict
the class of the output variable with sufficient accuracy.
The accuracy of the classifier can be tested using a variety of metrics, but the most common
ones are Recall, Precision and their harmonic mean, the F-score. Recall is “the number of
retrieved relevant items as a proportion of all relevant items”, whereas Precision is “the
number of retrieved relevant items as a proportion of the total number of retrieved items”
(84). The F-score combines these two values into a single measure of accuracy, ranging from
0 (worst) to 1 (best) (85).
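These three metrics can be computed directly from the definitions quoted above. The Python sketch below is illustrative only: the function name and the eight made-up labels are assumptions, and the F-score is calculated as the harmonic mean of Precision and Recall (often written F1).

```python
def precision_recall_f1(true_labels, predicted_labels, positive="positive"):
    """Compute precision, recall and F1 for one class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)

    retrieved = true_pos + false_pos   # items the classifier labelled positive
    relevant = true_pos + false_neg    # items that truly are positive
    precision = true_pos / retrieved if retrieved else 0.0
    recall = true_pos / relevant if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return precision, recall, f1


# Made-up labels for eight posts, for illustration only.
truth = ["positive", "positive", "positive", "positive",
         "negative", "negative", "negative", "negative"]
predicted = ["positive", "positive", "positive", "negative",
             "positive", "negative", "negative", "negative"]
print(precision_recall_f1(truth, predicted))
```

In this example three of the four posts labelled positive by the classifier are truly positive, and three of the four truly positive posts are retrieved, so Precision, Recall and the F-score all equal 0.75.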
Once the classifier has demonstrated acceptable accuracy, it can then be used on the dataset
to conduct sentiment analysis. This sentiment analysis algorithm will then be run on the
specific social media platform to understand the overall sentiment being expressed towards
a particular topic, in this case, DMARDs in the context of RA.
2.3 Conclusion
Achievement of remission is the target of treating RA, which is usually done using DMARDs.
While csDMARDs are very cheap and effective, b/tsDMARDs are more likely to reduce
disease activity, albeit at considerably higher cost. Anecdotally there has been a
significant rise in the number of patients who are demanding to be placed on the newer
agents (b/tsDMARDs), instead of the csDMARDs. A key driver of this may be positive
discussions being held on social media about the b/tsDMARDs, and the negative ones on
csDMARDs. The research question posed was, “What is the aggregate sentiment being
expressed on social media towards the csDMARDs and the b/tsDMARDs?” Prior to
answering this question however, it is important to review the literature to see if sentiment
analysis technology has been used to analyse social media discussions on pharmacotherapy.
A scoping review examining this issue is presented in the next chapter.
Chapter 3: Scoping Review - Can sentiment analysis be conducted on social media platforms to understand public sentiment held
towards pharmacotherapy?
Publication
• Sharma C, Whittle S, Haghighi PD, Burstein F, Keen H. Sentiment analysis of social
media posts on pharmacotherapy: A scoping review. Pharmacology Research &
Perspectives. 2020 Oct;8(5):e00640.
3.1 Abstract
Social media is playing an increasingly central role in patients' decision-making processes.
Advances in technology have enabled meaningful interpretation of discussions on social
media. A scoping review was conducted to assess whether Sentiment Analysis, a big data
analytic tool, could be used to extract meaningful themes from social media discussions on
pharmacotherapy. A keyword search strategy was used on the following databases:
OneSearch, PubMed, Medline, EMBASE, and Cochrane. One hundred and ninety-four titles
were identified of which 10 studies were included. Themes were then extracted about the
uses and implications of sentiment analysis of social media discussions on pharmacotherapy.
Twitter was the most frequently analysed platform. Assessment of public sentiment about a
particular medication was the most common use of sentiment analysis followed by detection
of adverse drug reactions. Studies also revealed a significant impact of news media on public
sentiment. Implications for real world practice include identifying reasons for a negative
sentiment, detecting adverse drug reactions and using the impact of news media on social
media sentiment to drive public health initiatives. The lack of a consistent approach to
sentiment analysis between the studies reflects the lack of a gold standard for the technology
and consequently the need for future research. Sentiment Analysis is a promising technology
that can allow us to better understand patient opinion regarding pharmacotherapy. This
knowledge can be used to improve patient safety and patient-physician interaction, and also
to enhance the delivery of public health measures.
3.2 Methods
Due to the novelty of the topic a scoping review methodology was used to summarise all
available information from a variety of sources. The framework outlined by Arksey and
O’Malley was followed (86).
The research question was identified as “Can sentiment analysis be conducted on social media
platforms to understand public sentiment held towards pharmacotherapy?”
Social media is defined as “a group of Internet-based applications that build on the ideological
and technological foundations of Web 2.0, and that allow the creation and exchange of user
generated content” (87). Pharmacotherapy was defined as the use of pharmaceutical drugs
to treat or prevent medical conditions.
Literature published between 2002 (inception of Web 2.0) and 2019 was collected from
OneSearch, PubMed, Medline, EMBASE and Cochrane. A keyword search strategy was
employed using the words (Sentiment Analysis OR Opinion mining) AND (Social Media OR
Medication OR Pharmacotherapy OR Drugs OR Pharmaceutical OR Medicine OR Facebook OR
Twitter).
Articles were eligible for inclusion in this review if their primary aim was to conduct sentiment
analysis of social media posts regarding pharmacotherapy. Only articles published in English
were included in this study. Articles that did not contain original data (e.g. letters to the
editor, opinion pieces) were excluded. Reviews and meta-analyses were excluded but
manually searched for potential studies.
From all the included studies, information was collected on the following aspects on a
predesigned template: authorship, year and journal published, social media platform(s)
mined, medical condition(s), pharmacotherapy, type of sentiment analysis used, outcomes
generated and potential use in clinical settings as described in the study.
3.3 Results
Our search strategy identified 194 articles, 95 of which were excluded after title and abstract
review for not meeting inclusion criteria. Of the remaining 99, 89 were excluded as they did
not analyse at least one of the required topics of pharmacotherapy, medicine or social
media. A total of 10 studies were finally included (Figure 6) (90-99).
Figure 6: Study flow diagram
194 articles identified from literature search
95 articles excluded after title and abstract review
99 articles included for full text review
Excluded: 45 – Not on pharmacotherapy; 25 – Not studies; 11 – Not medical; 6 – Not on SA; 1 – Not on SM; 1 – Under embargo
10 studies included in final review
SA – Sentiment Analysis; SM – Social Media
All the studies found were published in or after 2013. Eight of the ten included studies
performed data mining on a single forum. Twitter was the most common platform mined
(50%). While the majority of the studies aimed to understand the sentiment being expressed
towards a particular treatment, some of them also used this to explore other avenues such as
adverse drug reaction detection, the role of news media in influencing social media sentiment
and the sentiment dynamics on social media forums (Table 1).
Table 1: Summary of studies
Each study is listed with the following fields: Title, journal and year; Data source and quality assessment (QA); Type of sentiment analysis and data pre-processing; Outcome of interest; Result; Significance.

Ramagopalan et al (93)
Title, journal and year: Using Twitter to investigate opinions about multiple sclerosis treatments: a descriptive, exploratory study. F1000Research. 2014
Data source and QA: QA not stated
Type of sentiment analysis and data pre-processing: LB - Hu & Liu's opinion lexicon. Data pre-processing - Yes
Outcome of interest: The Sentiment Score (mean and summed) for each treatment
Result: Overall positive sentiment scores for all drugs apart from Novantrone and Tysabri.
Significance: Oral treatments had the highest mean summed scores, showing that patients prefer oral medications as opposed to injections.

Portier et al (95)
Title, journal and year: Understanding Topics and Sentiment in an Online Cancer Survivor Community. Journal of the National Cancer Institute Monographs. 2013
Data source and QA: Cancer survivors network. QA not stated
Type of sentiment analysis and data pre-processing: ML using Adaboost classifier. Data pre-processing – Not explicitly stated
Outcome of interest: Does the sentiment of the person making a post change with regards to responses received for that post?
Result: Thread about treatment side effects had the lowest initial sentiment score, but also the greatest shift in sentiment (towards positive).
Significance: Treatment and side effect related posts are usually highly negative but are associated with the most shift in sentiment polarity, thus showing the positive support that is provided in the community.

Roccetti et al (92)
Title, journal and year: Attitudes of Crohn’s Disease Patients: Infodemiology Case Study and Sentiment Analysis of Facebook and Twitter Posts. Journal of Medical Internet Research Public Health and Surveillance. 2017
Data source and QA: Facebook and Twitter. QA: Used a “Honeypot” approach to identify social spammers and to ensure that data being gathered is from patients.
Type of sentiment analysis and data pre-processing: LB using OpinionFinder. Data pre-processing – Not explicitly stated
Outcome of interest: What topic within Crohn’s disease generates the strongest sentiment from patients? Correlation between SA and human scores
Result: Infliximab (an antibody used to treat Crohn’s disease) was the most sentiment related term for both positive and negative sentiment. High degree of correlation between positive and negative scores, less so for neutral score.
Significance: This study showed that a data mining approach provided material of simple interpretation, regardless of the analysts’ scientific and professional background. This shows that the analysis of such data can be completely automated with significant accuracy.

Du et al (94)
Title, journal and year: Leveraging machine learning-based approaches to assess human papillomavirus vaccination sentiment trends with Twitter data. BioMed Central Medical Informatics and Decision Making. 2017
Data source and QA: QA not stated
Type of sentiment analysis and data pre-processing: ML using SVM. Data pre-processing - Yes
Outcome of interest: Sentiment towards HPV vaccination. Also looked at the impact of news media on sentiment and change in sentiment as it relates to the day of the week.
Result: 35.8% were “Positive”; 32.1% were “Neutral”; and 32.0% tweets were “Negative”. Safety was the biggest factor in negative tweets. They also found that mainstream media can have a significant influence on public opinion with 66.21% positive rate on the day a favourable news article was published compared to the previous positive rate of 35.8%.
Significance: This study revealed the significant impact of news media articles on public sentiment, a fact that can be used to promote public health.

Cobb et al (96)
Title, journal and year: Sentiment Analysis to Determine the Impact of Online Messages on Smokers’ Choices to Use Varenicline. Journal of the National Cancer Institute Monographs. 2013
Data source and QA: QuitNet. QA not stated
Type of sentiment analysis and data pre-processing: LB (Salience Engine 4.1). Data pre-processing - No
Outcome of interest: Whether exposure to positive messages re: varenicline resulted in more people switching to it and sticking with it.
Result: Registrants who started or continued with varenicline were exposed to a statistically significantly greater number of positive-sentiment varenicline messages than negative-sentiment messages.
Significance: While the authors could not draw conclusions about causality, emotional content of online communications about health behaviour intervention was found to be associated with decision making around pharmaceutical choices.

Korkontzelos et al (91)
Title, journal and year: Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. Journal of Biomedical Informatics. 2016
Data source and QA: DailyStrength forum and Twitter. QA not stated
Type of sentiment analysis and data pre-processing: LB, 5 lexica used - the Hu&Liu Lexicon of Opinion Words (H&L), the Subjectivity Lexicon (SL), the NRC Word-Emotion Association Lexicon (NRC), the NRC Hashtag Sentiment Lexicon (NRC#), and the Sentiment 140 Lexicon (S140). Data pre-processing - Yes
Outcome of interest: Whether the addition of sentiment analysis feature to ADRMine (a software already designed to pick up ADR mentions) would increase accuracy of picking up ADRs
Result: There was an increase in pick up rate of ADRs for posts taken from twitter but not for posts from dailystrength. Of all the lexica used, Sentiment140 performed the best (lexica generated from twitter).
Significance: This study showed that sentiment analysis can be used to augment ADR detection rate.

Ebrahimi et al (90)
Title, journal and year: Recognition of side effects as implicit-opinion words in drug reviews. Emerald Insight. 2016
Data source and QA: www.drugratingz.com. QA not stated
Type of sentiment analysis and data pre-processing: ML using SVM and a Rule based version of lexicon based. Data pre-processing - Yes
Outcome of interest: To evaluate if implicit sentiment can be used to identify drug side effects from disease symptom. These were tested against the manual annotation of the same drug reviews by a pharmacist
Result: Experimental results show that ML outperforms the rule-based algorithm significantly for both disease symptom and especially side effect detection where it was almost two-fold better.
Significance: The main finding was that drug review side effect recognition can be handled by using the ML algorithm, which significantly outperforms the regular expression-based algorithm.

Liu et al (98)
Title, journal and year: Adverse drug reaction related post detection using sentiment features
Data source and QA: Webmd.com. Manual annotation of posts done
Type of sentiment analysis and data pre-processing: LB - SentiWordNet. Data pre-processing - Not stated
Outcome of interest: To use sentiment features to detect and identify if a post was related to an ADR. They compared the accuracy of detecting ADRs using three approaches: 1. Using N-gram and domain features; 2. Adding sentiment to the above; 3. Using CHI statistic to select posts with high correlation between sentiment, n-gram and domain features.
Result: This method was very efficient in picking up ADR related posts. Compared to similar studies (which had used some of the methods but not all three) it had the highest F-measure (81.4%).
Significance: The addition of sentiment analysis to detect ADRs from social media forums results in greater accuracy than seen in previous methods.

Cabling et al (97)
Title, journal and year: Sentiment Analysis of an Online Breast Cancer Support Group: Communicating about Tamoxifen.
Data source and QA: Breastcancer.org. QA not stated
Type of sentiment analysis and data pre-processing: LB; Liu’s dictionary. Data pre-processing – Yes
Outcome of interest: What is the sentiment expressed towards Tamoxifen
Result: Most active users were 80% more positive than least active users, while the least active users were 48% more negative than the most active ones.
Significance: Online support groups allow for stronger ties to be created around a specific sentiment, with less connection from those with dissimilar sentiments to the dominant group.

Zhang et al (99)
Title, journal and year: Utilizing twitter data for analysis of chemotherapy
Data source and QA: QA not stated
Type of sentiment analysis and data pre-processing: LB – using TextBlob. Data pre-processing – Not explicitly stated
Outcome of interest: To assess and compare perceptions about chemotherapy of patients and health-care providers through analysis of chemo-related tweets.
Result: Individuals are more likely to post emotional tweets about side effects than organisations
Significance: Twitter data can be used to understand behavioural patterns associated with treatments for cancer and for understanding how individuals and organisations communicate about health care concerns and discovering cancer patients’ needs, which could aid in developing personalised therapy.

Abbreviations: SA – Sentiment Analysis; ML – Machine Learning; LB – Lexicon Based; QA – Quality Assessment
3.3.1 Sentiment analysis techniques and accuracy
Seven of the studies used a lexicon based approach, two used machine learning and one used
both methods. Most of the studies used a different lexicon for their analysis, with none of
them being specifically geared for medical terminology. The studies that used machine
learning algorithms also utilised different algorithms, namely AdaBoost Classifier in one and
Support Vector Machine in the other two. Both these are types of machine learning
algorithms that allow stratification of data into different categories. While AdaBoost does this
by sequentially weighting the results of weak classifiers to form a strong classifier, Support
Vector Machine finds the ideal margin to separate the dataset into desired categories (88,
89).
The study by Ebrahimi et al was the only one that compared machine learning techniques to
lexicon based and also against manually classified sentiment. They used Support Vector
Machine to create a machine learning based algorithm and compared that to a lexicon based
algorithm. The machine learning algorithm outperformed the lexicon based algorithm on
both the primary (identifying forum posts mentioning drug side effects) and secondary
objectives (identifying posts mentioning disease symptoms) (90).
Data pre-processing was employed by five of the studies (90,91,93,94,97). The methods used
by the studies varied, with tokenisation (breaking sentences into small word groups or
phrases that are more easily read by a program) being the most common. The other studies
did not explicitly state whether they conducted data pre-processing, and if so then what
techniques were used.
The study by Roccetti et al compared the performance of its lexical sentiment analysis
technique to that of manual (human) coding of sentiment and found that there was a high
degree of correlation for the extremes of sentiment (positive and negative), and less so for
the neutral sentiments (92). Du et al conducted a manual analysis of a small corpus of tweets
classified by their machine learning algorithm and found the overall accuracy to be acceptable
(94).
3.3.2 Sentiment analysis use
The most common application of sentiment analysis (seven studies) was to analyse opinion
regarding a particular medication (92-94,96,97,99). Six of these used lexicon based
approaches and one used machine learning. While the majority of these studies directly
analysed the cumulative polarity of the posts for each medication, the study by Roccetti et al
reversed the process to analyse which therapy generated the strongest sentiment (positive or
negative).
The next most common application of sentiment analysis (three studies) was to identify
adverse drug reactions (ADR) from social media chatter (90,91,98). The studies differed in
both the platforms that they mined and the approach to sentiment analysis. Ebrahimi et al
mined an online forum (www.drugratingz.com) using both machine learning and lexicon
based algorithms to assess whether sentiment expressed in forum posts can be used to
identify drug side effects from disease symptoms. Korkontzelos et al mined forums and
tweets using five different lexicon based methods to assess whether the addition of a
sentiment analysis feature to a pre-existing adverse drug reaction detection algorithm would
improve its efficacy. Liu et al mined www.webmd.com, specifically reviewing diabetic
medication forums. Their aim was to see if the addition of sentiment analysis to pre-existing
ADR detection algorithms would enhance detection. All three studies provided evidence that
sentiment analysis can be used to detect ADR mentions from social media posts.
One study also explored the interaction between news media and social media through the
lens of sentiment. Du et al analysed sentiment towards Human Papilloma Virus (HPV)
vaccination, as expressed in tweets, before and after publication of a positive New York Times
article (94). While the average number of tweets (positive, negative and neutral) pertaining
to the topic was 1245 per day, the immediate period after publication of the article saw this
number jump to 16,000, with the proportion of positive sentiment tweets rising from 35% to
66%. This was a remarkable demonstration of the impact of real-world events on social media
sentiment.
Three studies analysed the sentiment dynamics in cancer forums (95-97). The study by Portier
et al looked at how the sentiment expressed by users in each thread influences the sentiment
of the person who started the thread. They were able to show that discussions especially
about pain and chemotherapy side effects typically started with a negative sentiment but
gradually underwent a positive sentiment shift, reflecting the power of community support
in improving sentiment (95). The study by Cabling et al looked at the sentiment of the posters
in a breast cancer forum on tamoxifen and found that the most active posters were more
likely to have a positive sentiment than those who posted less frequently (97). The study by
Cobb et al was the only one to assess the direct impact of sentiment on compliance. After
adjusting for variables they found that as the exposure to positive messages about varenicline
increased, so did the odds (odds ratio = 2.05, 95% confidence interval = 1.66 to 2.54) of the user
starting and continuing with the medication in an attempt to quit smoking (96).
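The reported interval can be sanity-checked with a few lines of Python. The sketch below assumes that the confidence interval was constructed symmetrically on the log-odds scale, which is the usual convention for logistic regression output but is not stated in the source; the function name is illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_summary(ci_lower, ci_upper):
    """Recover the point estimate and standard error implied by a 95% CI
    that is assumed to be symmetric on the log scale."""
    log_lower, log_upper = math.log(ci_lower), math.log(ci_upper)
    implied_estimate = math.exp((log_lower + log_upper) / 2)
    standard_error = (log_upper - log_lower) / (2 * Z_95)
    return implied_estimate, standard_error


estimate, se = log_scale_summary(1.66, 2.54)
print(f"implied odds ratio {estimate:.2f}, log-scale SE {se:.3f}")
```

Under this assumption the implied point estimate lies close to the reported odds ratio of 2.05, which is consistent with the interval having been built on the log scale.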
3.4 Discussion
This scoping review shows that sentiment analysis can be used to gauge public perceptions
regarding pharmacotherapy as expressed on social media. The most common application that
emerged was of using sentiment analysis to assess patient opinion regarding
pharmacotherapy. While there was some consistency with regard to the platform being
mined (Twitter being the most common), there was no consistent “gold standard” approach
used by the authors to conduct SA. This likely reflects the fact that sentiment analysis is still
in its early stages of development, with various methods currently being explored in order to
establish a standard (100).
Lexicon based approaches were more popular than machine learning based approaches,
especially when the aim was to detect sentiment towards a particular treatment, with all of
them being successful in detecting the sentiment expressed. However, the accuracy of the
assigned sentiment was infrequently checked by manual review (92,94). Roccetti et al
conducted a manual analysis of a small corpus of tweets to judge the accuracy of their SA.
This analysis was conducted by a medical specialist and a software engineer who individually
reviewed the posts and assigned a sentiment to each one. It was interesting to note that while
the agreement between the two manual observers was good (kappa 0.647) it was not perfect,
thus showing that even amongst human reviewers there can be disagreement about the
underlying sentiment of the text being analysed. While their algorithm had adequate accuracy
in detecting positive and negative sentiment, it was more likely to classify those posts with
less obvious sentiment as neutral. It appears that sentiment analysis might be unable to
detect the polarity of posts with subtle sentiment and tends to classify them as neutral. This
is a reassuring finding for two reasons. Firstly, it would be better to classify a post with subtle
positive or negative emotion as neutral than as the opposite category (as was seen with the
human reviewers, where the computer scientist assigned more posts as either positive or
negative than the gastroenterologist), thus highlighting that sentiment analysis can negate
some of the inherent experiential biases that come with human sentiment coding. Secondly,
posts that describe significant adverse drug reactions are unlikely to have subtle emotion and
are thus more likely to be picked up by SA.
Three studies applied sentiment analysis to improve the detection of adverse drug reactions,
an important cause of morbidity and mortality (101). While some adverse drug reactions are
detected during clinical trials, a large number only become obvious during the post marketing
surveillance phase (102). There were significant differences between the studies in terms of
both the platforms being mined (DailyStrength forum and Twitter, www.drugratingz.com and
webmd.com) and the technique used (lexicon based by two and both machine learning and
lexicon based by the other). The study by Korkontzelos et al added different types of lexicon-
based sentiment analysis to an existing adverse drug reaction detection program (ADRMine
– an algorithm-based software designed to detect adverse drug reaction mentions in social
media posts) to assess whether identification of negative sentiment would increase the
detection rate. While ADRMine is designed to be highly sensitive, the addition of sentiment
analysis slightly improved the rate of detection of ADRs. The most successful lexica employed
in this analysis were developed from Twitter, reinforcing the knowledge that sentiment
analysis is highly domain specific (103). A similar study was conducted by Liu et al who added
sentiment analysis to pre-existing adverse drug reactions detection processes such as N-gram
and domain features and demonstrated that this resulted in increased detection of adverse
drug reactions. In contrast, the study by Ebrahimi et al applied both lexicon based and
machine learning sentiment analysis directly to the mined data and successfully detected
adverse drug reactions from the forum posts. This was the only study that compared machine
learning to lexicon based algorithms, using manual review of the adverse drug reactions
identified. While machine learning based approaches were superior at picking up adverse
drug reaction mentions and detection of disease effects, the authors concluded that both
approaches were promising and that in future perhaps a hybrid of the two could be used for
even more accuracy (90).
Another potential application of sentiment analysis is understanding the interaction between
news media and social media through the sentiment expressed. The study by Du et al showed
the remarkable (positive) impact a (positive) news media publication can have on social media
sentiment, thus demonstrating its potential use in public health. This is an exciting area
deserving of further analysis as the relationship between news media and social media would
provide a powerful tool to help promote and assess the efficacy of public health initiatives,
especially relevant in the current pandemic.
Perhaps more important is the potential impact of social media sentiment on real-world
behaviour. This has already been demonstrated in other fields such as the film industry and
stock market, with positive sentiment resulting in positive box-office and market returns
(104,105). Thus, the question arises whether social media sentiment might influence
individual decisions related to pharmacotherapy. This concept was evaluated by Cobb et al
who used sentiment analysis to evaluate the impact of online messages on a smoker’s
decision to use a particular medication (varenicline) to help them quit smoking (96). They
analysed smokers who posted information about their pharmacotherapy use on QuitNet, a
forum for smokers. The analysis showed that smokers who were exposed to a greater number of positive
sentiment posts about varenicline were more likely to start and continue to use varenicline
in an effort to quit smoking. While the authors refrained from drawing concrete conclusions
on causality of sentiment on medication preference and compliance, the results certainly
warrant further scrutiny with targeted studies. Cabling et al also looked at the sentiment
dynamics on medical forums (specifically tamoxifen-related posts on Breastcancer.org) and
found that the most active posters were much more likely to express positive sentiment, thus
perhaps explaining the positive sentiment that persistent users in the study by Cobb et al were
exposed to.
The specifics of negative sentiment associated with certain medications and side effects
suggest that sentiment analysis could be used to identify specific issues which could be addressed
by individual clinicians with their patients, to allay their fears and improve adherence. This
was demonstrated in the study by Ramagopalan et al on multiple sclerosis medications, which
revealed that patients preferred oral medications to injections and were more
concerned about some side effects (e.g. infections) than others. Similarly, the study by Zhang
et al was able to demonstrate user sentiment towards specific side effects of
chemotherapy, showing that some side effects (“nausea”, “hair loss”) generated much less
negative sentiment than others (“fatigue”, “neuropathy”).
This knowledge can be used by clinicians and pharmacists to better target
medication related counselling, thus potentially improving adherence.
While this review does provide preliminary evidence that sentiment analysis can be used to
understand mass opinion about pharmacotherapy, several questions remain about the
overall process and the technique of sentiment analysis. There was significant heterogeneity
between the studies at several stages of the analytic process, especially at the key stage of
conducting the analysis but also at the earlier stage of data pre-processing and the
subsequent stage of accuracy analysis. These different approaches are however not specific
to sentiment analysis of medical texts and reflect the ongoing development and evolution of
the technology itself (106). There is presently no universally accepted gold standard
approach. Current evidence suggests that the choice of method may be domain-specific
(depend on the condition/therapy being analysed, the platform being mined and the outcome
that is sought). The few studies that have compared the different approaches have generally
failed to establish a gold standard, with each approach having its own set of advantages and
disadvantages (107,108).
As the technology is further refined, standardisation of methodology and the establishment
of healthcare specific sentiment analysis methods (either machine learning algorithms or a
medical-sentiment lexicon) may facilitate the development of further validity regarding the
application of this technology to the health care sector (109,110).
This review has a few limitations. Sentiment analysis is dependent on the domain or topic
being studied; the lack of validated lexica or machine learning algorithms for conducting
sentiment analysis specific to the field of healthcare meant that there was significant
heterogeneity in the studies, which limited comparison and the drawing of concrete conclusions.
Our inclusion criteria were intentionally specific, thereby limiting the focus of sentiment
analysis to the realm of social media and pharmacotherapy. However, there are other
applications of sentiment analysis in the field of healthcare including (but not limited to)
mining opinions regarding healthcare received, determining clinical outcomes and
understanding emotions of being unwell (110).
3.5 Conclusion
This scoping review provides an overview of current evidence on the multifaceted
applicability of sentiment analysis. While the most obvious utilisation is in the assessment of
public sentiment about particular medications, the fact that sentiment analysis is also being
used for other tasks such as adverse drug reaction detection is a promising glimpse into the
hitherto untapped potential of this technology. The heterogeneity of approach to sentiment
analysis across the studies reflects the rapid pace at which this technology continues to
evolve. While it has already found use in the fields of commerce and marketing, its current
state of clinical equipoise may be resolved if a universally agreed standardised approach is
established. This will have far reaching consequences across various domains of healthcare,
including but not limited to patient safety and public health initiatives.
Chapter 4: Mining social media data to investigate patient perceptions regarding DMARD therapy
Publication:
• Sharma C, Whittle S, Haghighi PD, Burstein F, Sa'adon R, Keen HI. Mining social
media data to investigate patient perceptions regarding DMARD pharmacotherapy
for rheumatoid arthritis. Annals of the Rheumatic Diseases. 2020 Sep 3.
4.1 Abstract
Objectives The hypothesis of this study is that patients have a positive sentiment regarding
b/tsDMARDs and a negative sentiment towards csDMARDs. To investigate this, all available
discussions on social media platforms regarding DMARDs in the context of rheumatoid
arthritis were analysed to understand the collective sentiment expressed towards these
medications.
Methods Treato analytics were used to download all available posts on social media about
DMARDs in the context of RA. Strict filters ensured that user generated content was
downloaded. The sentiment (positive or negative) expressed in these posts was analysed for
each DMARD using sentiment analysis. The reason(s) for this sentiment for each DMARD was
also analysed, looking specifically at efficacy and side effects.
Results Computer algorithms analysed millions of social media posts and included 28261
posts on b/tsDMARDs and 26841 posts on csDMARDs. Both classes had an overall positive
sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs (1.210) than for
csDMARDs (1.048). Efficacy was the most commonly mentioned reason in posts with a
positive sentiment and lack of efficacy was the most commonly mentioned reason for a
negative sentiment. These were followed by the presence/absence of side effects in negative
or positive posts, respectively.
Conclusions Public opinion on social media is generally positive about DMARDs. Lack of
efficacy followed by side effects were the most common themes in posts with a negative
sentiment. There are clear reasons why a DMARD generates a positive or negative sentiment.
As sentiment analysis technology becomes more refined, targeted studies could be done
to analyse these reasons and allow clinicians to tailor DMARDs to match patient needs.
4.2 Methods
The services of the web analytics firm Treato were utilised to collect the data. The Treato
platform automatically identifies, collects and analyses publicly available user-generated
content on health-related topics from over 10,000 sources. These sources include the publicly
available data on social networks such as Facebook and Twitter, discussion forums and blogs.
Over three billion posts were analysed from these sources. The data are then analysed using
a patented algorithm that applies natural language processing to this content to identify
medical concepts mentioned in text, and extract patients’ self-reported descriptions of their
experiences with various health conditions and medications. These medical experiences were
then mapped on to formal concepts in a medical ontology. Treato’s algorithms combine
various medical ontologies including those used by the Food and Drug Administration for
coding. This process includes resolving conceptual synonyms of medical terms (e.g., ‘fatigue’
and ‘tired’ were assigned the same concept code); resolution of patient-specific phrases (e.g.,
“pain in my joints” and “my joints hurt”) to medical terms; word-sense disambiguation
algorithms (e.g., “BP” could refer to bi-polar disorder, blood pressure, or a bisphosphonate
medication); and medication synonyms (e.g., generic and brand names for the same
medication).
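The synonym-resolution step can be pictured as a lookup from surface phrases to shared concept codes. Treato's algorithm is proprietary, so the following Python sketch is only a hypothetical illustration: the concept codes, the phrase table and the function name are invented, and word-sense disambiguation (the “BP” example) is not attempted.

```python
# Hypothetical phrase-to-concept table; the codes are invented for illustration.
CONCEPTS = {
    "fatigue": "C_FATIGUE",
    "tired": "C_FATIGUE",
    "pain in my joints": "C_JOINT_PAIN",
    "my joints hurt": "C_JOINT_PAIN",
    "humira": "C_ADALIMUMAB",      # brand name
    "adalimumab": "C_ADALIMUMAB",  # generic name
}


def extract_concepts(post):
    """Return the set of concept codes whose phrases occur in the post."""
    text = post.lower()
    return {code for phrase, code in CONCEPTS.items() if phrase in text}


print(extract_concepts("My joints hurt and I feel tired since starting Humira"))
```

Plain substring matching is used here for brevity; it would wrongly match “tired” inside “retired”, which is one reason a production system needs tokenisation and disambiguation.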
The data used in this study was limited to posts written in the English language. The unit of
analysis for this study was an individual post. In order for a post to be included in the final
analysis it needed to be user generated content mentioning at least one of the thirteen
current DMARDs (methotrexate, leflunomide, sulfasalazine, hydroxychloroquine,
adalimumab, etanercept, certolizumab, golimumab, tocilizumab, tofacitinib, rituximab,
abatacept and infliximab) in the context of RA.
Included posts were then subject to Treato’s sentiment analysis algorithms for further
categorisation into posts with positive or negative sentiment. The two most common reasons
for a positive post were DMARD efficacy and lack of side effects. Conversely, the most
common reasons for a negative post were lack of efficacy and side effects. The
positive and negative tagging was not mutually exclusive, since a post may contain both positive
and negative experiences about the same medication. Treato also compiled data on the
concerns most frequently listed by patients on various DMARDs. These data
were then provided to us for interpretation.
The overall sentiment for each DMARD was expressed as the ratio of the positive to negative
posts for that DMARD. A ratio greater than one indicated an overall positive sentiment.
Demographic information was collected where available.
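The ratio described above is simple to compute. The following Python sketch applies it to the post counts reported for etanercept and methotrexate in Table 2; the function name is illustrative.

```python
def sentiment_ratio(positive_posts, negative_posts):
    """Ratio of positive to negative posts; values above 1 indicate
    an overall positive sentiment."""
    if negative_posts == 0:
        raise ValueError("ratio is undefined without negative posts")
    return positive_posts / negative_posts


# Post counts as reported for etanercept and methotrexate in Table 2.
for drug, pos, neg in [("etanercept", 5210, 3852), ("methotrexate", 9058, 9103)]:
    ratio = sentiment_ratio(pos, neg)
    label = "positive" if ratio > 1 else "negative"
    print(f"{drug}: {ratio:.3f} ({label})")
```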
While the algorithms were able to assign sentiment and extract information regarding efficacy
and side effects for all the DMARDs, the final numbers were not available for
hydroxychloroquine and abatacept, which were then manually extracted. In order to ensure
that the results were valid for hydroxychloroquine and abatacept, this process of manual
extraction was repeated for all the other DMARDs. There were negligible differences (0-3%)
between the algorithm and manual extraction across the categories of the DMARDs which
likely reflect the difference in dates when the data was provided by Treato’s algorithms and
when it was manually extracted (additional posts on social media). This difference was not
felt to be large enough to have a significant impact on the overall interpretation of the results.
A comparison of proportions analysis was conducted to assess whether there was a difference
in the proportion of positive sentiment posts between the various b/tsDMARDs (111).
4.2.1 Statistics
Cohen's kappa coefficient was used to assess inter-rater agreement between Treato and
manual assessment of sentiment. A chi squared test was conducted to compare positive
sentiment for efficacy across b/tsDMARDs and concerns raised by patients on both csDMARDs
and b/tsDMARDs. Statistical significance was assumed at p < 0.05.
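For readers unfamiliar with Cohen's kappa, the following Python sketch shows the standard calculation for two raters: observed agreement corrected for the agreement expected by chance. The labels below are invented and are not study data; the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented labels for four posts (not study data).
algorithm = ["positive", "positive", "negative", "negative"]
reviewer = ["positive", "negative", "negative", "negative"]
print(cohens_kappa(algorithm, reviewer))
```

In this toy example the raters agree on three of four posts (observed agreement 0.75) while chance agreement is 0.5, giving a kappa of 0.5.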
4.3 Ethics
Ethics approval was obtained from the Human Research Ethics Committees at Monash University
and the University of Western Australia.
4.4 Results
Treato collected data prospectively from July 2017 until October 2018, and also analysed
available data retrospectively. A total of 28261 posts on b/tsDMARDs and 26841 posts on
csDMARDs were collected, with some overlap. The individual breakdown of the DMARDs and
the positive and negative posts is shown in Table 2. Treato’s algorithms identified the majority
(89.6% and 88.8% respectively) of the posts on b/tsDMARDs and csDMARDs as being written
by patients. As a validation exercise, 200 posts were manually assessed and assigned a
sentiment. This was compared with the sentiment assigned by Treato’s algorithms for these
posts. Agreement between sentiment assessed by machine and human was moderate
(csDMARDs κ = 0.49 and b/tsDMARDs κ = 0.52) (112). We considered kappa values of 0.49 and
0.52 as ‘moderate agreement’ based on well-established parameters, which are widely used
in clinical research (113). While a high kappa is desirable in clinical studies, the same standards
cannot be applied when gauging online sentiment. Indeed, the kappa in such studies is
moderate at best even between human reviewers; achieving a moderate kappa between
human and algorithm is therefore a strength of the study (114).
Table 2: Aggregate sentiment
b/tsDMARD               Number of posts   Percent   Ratio P/N
Etanercept positive     5210              18.4      1.35
Etanercept negative     3852              13.6
Infliximab positive     2636              9.3       1.1
Infliximab negative     2405              8.5
Adalimumab positive     4419              15.6      1.08
Adalimumab negative     4107              14.5
Certolizumab positive   461               1.6       1.11
Certolizumab negative   415               1.5
Golimumab positive      306               1.1       1.26
Golimumab negative      243               0.9
Tocilizumab positive    384               1.4       1.40
Tocilizumab negative    274               1.0
Abatacept positive      774               2.7       1.16
Abatacept negative      694               2.5
Tofacitinib positive    346               1.2       1.71
Tofacitinib negative    202               0.7
Rituximab positive      918               3.2       1.49
Rituximab negative      615               2.2

csDMARD                 Number of posts   Percent   Ratio P/N
MTX positive            9058              33.7      0.995
MTX negative            9103              33.9
HCQ positive            3026              11.3      1.26
HCQ negative            2398              8.9
SZS positive            803               3.0       0.97
SZS negative            827               3.1
LEF positive            849               3.2       1.09
LEF negative            777               2.9
4.4.1 B/tsDMARDs
Content about b/tsDMARDs was collected from 497 publicly available forums. The greatest
proportion (7,969/28261 posts) were obtained from Facebook. The 10 most popular social
media platforms used to publish these posts are shown in Table 3. Geolocation data was
available on 1837 posts which identified users from 34 countries. The majority of the posts
(95.4%) were from USA (1349), UK (162), Canada (155), Australia (55) and Mexico (15).
Table 3: Social media platforms
b/tsDMARDs                    Number of posts   Percent   Cumulative Percent
facebook.com                  7969              28.2      28.2
inspire.com                   3032              10.7      38.9
healingwell.com               1738              6.1       45.1
dailystrength.org             1735              6.1       51.2
community.arthritis.org       1551              5.5       56.7
reddit.com                    1297              4.6       61.3
healthunlocked.com            1057              3.7       65.0
remedyspot.com                902               3.2       68.2
crohnsforum.com               795               2.8       71.0
arthritiscareforum.org.uk     540               1.9       72.9

csDMARDs                      Number of posts   Percent   Cumulative Percent
facebook.com                  6638              24.7      24.7
healthunlocked.com            2184              8.1       32.9
dailystrength.org             1956              7.3       40.2
inspire.com                   1689              6.3       46.4
community.arthritis.org       1318              4.9       51.4
reddit.com                    1100              4.1       55.5
remedyspot.com                1088              4.1       59.5
arthritiscareforum.org.uk     1003              3.7       63.2
healingwell.com               879               3.3       66.5
psoriasis-help.org.uk         648               2.4       68.9
The ratio of total positive to negative posts was 1.21, thus indicating an overall positive
sentiment. Each of the b/tsDMARDs had a greater number of positive than negative posts.
Efficacy was the most common theme identified within posts assigned a positive sentiment
(>80% of positive posts), followed by lack of side effects (13% of positive posts) (Table 4).
Comparing the b/tsDMARDs with each other in terms of the proportion of patients who posted a
positive post due to efficacy revealed etanercept to be the most popular, with a
significantly higher proportion than three other b/tsDMARDs (rituximab,
infliximab and tofacitinib) (Table 5). It could be argued that those bDMARDs with
a higher number of posts mentioning lack of side effects would be deemed less
efficacious than those that received more posts for efficacy. However, because the analysis
looked at the proportion of positive posts for efficacy, the impact of posts related to lack of
side effects was negated. Additionally, if a bDMARD is generating substantial positive sentiment simply for
lack of side effects as compared to efficacy, then its lower efficacy percentage is likely
justified.
While lack of efficacy was also the most common theme in posts with a negative sentiment,
side effect concerns were a more prominent cause of negative sentiment posts than lack of
side effects were for positive sentiment posts (Table 4).
Table 4: b/tsDMARD positive and negative sentiment for efficacy and side effects
b/tsDMARD      Efficacy posts   Total positive posts   Percentage   Posts stating "no side effects"   Percentage
Infliximab     2239             2636                   84.94        308                               11.68
Abatacept      666              774                    86.04        109                               14.08
Adalimumab     3769             4419                   85.29        616                               13.94
Certolizumab   383              461                    83.08        76                                16.49
Golimumab      264              306                    86.27        48                                15.69
Rituximab      777              918                    84.64        143                               15.58
Tocilizumab    335              384                    87.24        51                                13.28
Tofacitinib    281              346                    81.21        70                                20.23
Etanercept     4536             5210                   87.06        610                               11.71

b/tsDMARD      Lack of efficacy posts   Total negative posts   Percentage   Side effects posts   Percentage
Infliximab     1265                     2405                   52.60        983                  40.87
Abatacept      437                      694                    62.97        259                  37.32
Adalimumab     2500                     4107                   60.87        1429                 34.79
Certolizumab   249                      415                    60           163                  39.28
Golimumab      187                      243                    76.95        53                   21.81
Rituximab      347                      615                    56.42        243                  39.51
Tocilizumab    143                      274                    52.19        132                  48.18
Tofacitinib    102                      202                    50.5         102                  50.50
Etanercept     2344                     3852                   60.85        1387                 36.00
Table 5: Comparison of proportion of positive sentiment for efficacy amongst b/tsDMARDs
                      Abatacept     Adalimumab    Certolizumab  Etanercept    Golimumab     Infliximab    Rituximab     Tocilizumab   Tofacitinib
                      86.04%        85.29%        83.08%        87.06%        86.27%        84.94%        84.64%        87.24%        81.21%
Abatacept 86.04%      NA            DUP           DUP           DUP           DUP           DUP           DUP           DUP           DUP
Adalimumab 85.29%     0.75%         NA            DUP           DUP           DUP           DUP           DUP           DUP           DUP
Certolizumab 83.08%   2.96%         2.21%         NA            DUP           DUP           DUP           DUP           DUP           DUP
Etanercept 87.06%     1.02%         *1.77%        *3.98%        NA            DUP           DUP           DUP           DUP           DUP
Golimumab 86.27%      0.23%         0.98%         3.19%         0.79%         NA            DUP           DUP           DUP           DUP
Infliximab 84.94%     1.10%         0.35%         1.86%         *2.12%        1.34%         NA            DUP           DUP           DUP
Rituximab 84.64%      1.40%         0.65%         1.56%         *2.42%        1.63%         0.30%         NA            DUP           DUP
Tocilizumab 87.24%    1.20%         1.95%         4.16%         0.18%         0.97%         2.30%         2.60%         NA            DUP
Tofacitinib 81.21%    *4.83%        *4.08%        1.87%         *5.85%        5.06%         3.73%         3.43%         *6.03%        NA
NA – Not Applicable; DUP – duplicate value; * - p < 0.05
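One way to test a single pairwise difference from Table 5 is a pooled two-proportion z-test, which for a 2×2 table is equivalent to a chi-squared test without continuity correction. The Python sketch below is an assumption about how such a comparison could be run, not a reproduction of the method cited in the thesis (111); the counts come from Table 4 and the function name is illustrative.

```python
import math


def two_proportion_z_test(successes_1, total_1, successes_2, total_2):
    """Pooled two-proportion z-test; returns (difference, two-sided p value)."""
    p1, p2 = successes_1 / total_1, successes_2 / total_2
    pooled = (successes_1 + successes_2) / (total_1 + total_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_1 + 1 / total_2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return p1 - p2, p_value


# Efficacy posts / total positive posts (Table 4): etanercept vs tofacitinib.
difference, p_value = two_proportion_z_test(4536, 5210, 281, 346)
print(f"difference {difference:.2%}, p = {p_value:.4f}")
```

Under these assumptions the etanercept versus tofacitinib difference of about 5.85% falls below the 0.05 threshold, in line with the asterisk in Table 5.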
The most common concerns raised by patients who wrote a negative post on b/tsDMARDs
are depicted in Table 7. Joint pain was the most common but the next three reasons for a
negative sentiment were due to side effects (“Rash”, “Nausea” and “Itching”). Infections were
also a prominent reason for a negative sentiment, with four of the top 20 reasons being
occupied by infectious causes (“fever”, “pneumonia”, “common cold” and “sinus infections”).
4.4.2 CsDMARDs
Posts about csDMARDs were collected from 515 social media sites. Ten websites contributed
69% (18,503) of all the posts (Table 3). Geolocation was only available for 5% (1441) of the
posts. Among these, however, 36 countries were represented. The majority of the posts
(93.3%) came from USA (904), UK (174), Canada (142), Australia (90) and New Zealand (35).
The ratio of total positive to negative posts was 1.048, indicating an overall positive
sentiment. The individual ratios revealed a negative sentiment for sulfasalazine (0.97) and
methotrexate (0.995), and positive for leflunomide (1.09) and hydroxychloroquine (1.26)
(Table 2).
Efficacy was the most common theme in posts with a positive sentiment for all the csDMARDs
(Table 6). While lack of efficacy was the most common theme in posts with a negative
sentiment, its overall share was lower than what was seen in posts with a positive sentiment.
Approximately half of the negative posts regarding methotrexate discussed either lack of
efficacy (50.08%) or side effects (44.94%). For hydroxychloroquine and sulfasalazine, a higher
proportion of negative posts discussed lack of efficacy (56.42% and 53.81% respectively)
versus side effects (40.28% and 31.68% respectively). Leflunomide saw a slightly larger share
of negative sentiment posts discussing side effects (18.15%), with discussions on lack of
efficacy accounting for 16.86% of the negative sentiment posts. The lower percentage of
positive and negative posts for leflunomide was raised with the Treato engineers. They stated
that it was likely that the discussions that were being had for leflunomide were not specific
enough for either side effects or lack of efficacy for them to be appropriately categorised by
the algorithms. It is possible that the remaining discussions were still on these two topics, but
the way in which they were worded resulted in them not being placed in these two categories by
the algorithm. Of the patients who gave methotrexate an overall negative sentiment, 7.18%
still felt that it was effective; these numbers were lower for sulfasalazine (4.96%) and
leflunomide (3.2%) (Table 6).
Table 6: Positive/Negative sentiment csDMARDs reasons
Positive sentiment csDMARDs reasons
csDMARDs             Efficacy   Total posts   Percentage   Lack of side effects   Total posts   Percentage
Methotrexate         7364       9058          81.30        1762                   9058          19.45
Hydroxychloroquine   2621       3026          86.62        439                    3026          14.5
Leflunomide          215        849           25.32        63                     849           7.42
Sulfasalazine        611        803           76.10        135                    803           16.81

Negative sentiment csDMARDs reasons
csDMARDs             Lack of efficacy   Total posts   Percentage   Side effects   Total posts   Percentage
Methotrexate         4559               9103          50.08        4091           9103          44.94
Hydroxychloroquine   1353               2398          56.42        966            2398          40.28
Leflunomide          131                777           16.86        141            777           18.15
Sulfasalazine        445                827           53.81        262            827           31.68
The most common concerns associated with a negative sentiment are shown in Table 7.
“Nausea” was the most common, closely followed by “Joint pain”. The remainder of the list
was strongly populated with side effect mentions including “Hair loss”, “allergy”, “Rash” and
“stomach problems”.
Table 7: Concerns: percentage of posts with a negative sentiment
Concern       b/tsDMARD (%)  csDMARD (%)  Difference  95% CI lower limit  95% CI upper limit  p value
Joint Pain    13.86          10.96        2.90        2.10                3.71                0.0001
Itching       2.73           1.75         0.98        0.62                1.35                0.0001
Rash          3.29           2.60         0.69        0.27                1.10                0.0011
Cancer        2.44           1.80         0.64        0.29                0.99                0.0003
Weight Gain   2.45           2.10         0.35        -0.01               0.73                0.05
Common Cold   1.31           1.12         0.19        0.08                0.46                0.15
Migraines     1.42           1.31         0.11        0.17                0.40                0.4421
Muscle Pain   1.13           1.06         0.066       0.18                0.32                0.6122
Fever         1.51           1.46         0.05        0.24                0.35                0.7302
Weight Loss   1.07           1.52         -0.45       0.17                0.73                0.0014
Hair Loss     2.27           6.85         -4.58       4.08                5.09                0.0001
Nausea        2.99           11.25        -8.26       7.64                8.88                0.0001
4.4.3 B/tsDMARDs vs csDMARDs
Patients on b/tsDMARDs were significantly more likely to post positively due to efficacy
(85.74 %) as compared to csDMARDs (78.71 %), difference of 7.03 % (95% CI 6.15 % to 7.91
%; p < 0.0001). However, patients on csDMARDs were significantly more likely to post a
positive comment due to lack of side effects (17.47 %) as opposed to those on b/tsDMARDs
(13.14 %), difference of 4.33 % (95% CI 3.5 % to 5.16 %; p < 0.0001).
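The efficacy interval quoted above can be approximated with a standard Wald confidence interval for a difference in proportions. The Python sketch below assumes that method (the thesis does not state which one was used) and takes its counts from summing the efficacy and total positive post columns of Table 4 (b/tsDMARDs) and Table 6 (csDMARDs); the function name is illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def difference_with_wald_ci(successes_1, total_1, successes_2, total_2, z=Z_95):
    """Difference in two proportions with an unpooled (Wald) confidence interval."""
    p1, p2 = successes_1 / total_1, successes_2 / total_2
    se = math.sqrt(p1 * (1 - p1) / total_1 + p2 * (1 - p2) / total_2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se


# Efficacy posts / positive posts, summed over Table 4 (b/tsDMARDs) and Table 6 (csDMARDs).
diff, lower, upper = difference_with_wald_ci(13250, 15454, 10811, 13736)
print(f"{diff:.2%} (95% CI {lower:.2%} to {upper:.2%})")
```

Under these assumptions the result should agree closely with the reported difference of 7.03% and its interval of 6.15% to 7.91%.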
Concerns about medications were broadly similar in posts about either csDMARDs or
b/tsDMARDs (Table 7). However, posts about b/tsDMARDs were significantly more likely to
contain descriptions of joint pain, drug reactions (rash and itching) and cancer, whereas posts
about csDMARDs contained more descriptions of weight loss, hair loss and nausea. Posts on
csDMARDs were more likely to be on gastrointestinal issues such as “stomach problems”,
“diarrhoea” and “vomiting”. Allergic reactions to the medications were also a common reason
for negative sentiment with csDMARDs, particularly sulfasalazine (10.1% of all negative posts,
vs 3.66% for all other csDMARDs). Infections were mentioned more frequently in posts on
b/tsDMARDs (10.54% vs 5.76%; p < 0.0001). Among the b/tsDMARDs, shingles was more
frequently mentioned in association with tofacitinib than the other b/tsDMARDs combined
(5.4% vs 0.7% of negative posts; p < 0.0001).
4.5 Discussion
Our study supports our hypothesis that the collective sentiment was skewed positively in
favour of the b/tsDMARDs over the csDMARDs. While all the b/tsDMARDs had a positive
sentiment, this was only true for hydroxychloroquine and leflunomide amongst the
csDMARDs.
Efficacy and side effects were found to be the most commonly discussed topics in posts with
positive and negative sentiment. These findings mirror those of a recent study that
investigated the reasons for bDMARD discontinuation in RA patients and found that lack of
efficacy followed by side effects were the two biggest factors (115). The ratio of positive to
negative posts for b/tsDMARDs ranged from 1.71 for tofacitinib to 1.08 for adalimumab.
Tofacitinib had 81.21% of its positive posts discussing efficacy, which was lower than the other
b/tsDMARDs and methotrexate. However, tofacitinib also had the highest percentage of
positive posts discussing lack of side effects (20.23%), which contributed to its overall high
ratio of positive to negative posts. Side effects were nevertheless also the most common theme
in posts with a negative sentiment towards tofacitinib, with 50% of negative posts describing
side effects, the highest across both the categories of DMARDs. The literature regarding side
effects with tofacitinib, however, does not reveal any unexpected findings (116-118).
Tofacitinib had the fewest posts (548) across both categories of DMARDs, which
likely played a role in the occurrence of such varied results.
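The effect of a small post count on the stability of a percentage can be sketched with a confidence interval. The Wilson score interval used below is a standard choice but is an assumption here, not a method described in this thesis. The counts are hypothetical: they only illustrate how the same observed proportion (50%) is far less certain when it is based on a few hundred posts than when it is based on several thousand.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts (NOT taken from the thesis): negative posts that
# describe side effects, for a rarely discussed and a widely discussed drug.
small_low, small_high = wilson_interval(successes=100, n=200)
large_low, large_high = wilson_interval(successes=2500, n=5000)

print(f"n=200:  {small_low:.3f} to {small_high:.3f}")
print(f"n=5000: {large_low:.3f} to {large_high:.3f}")
```

The interval for the smaller sample is several times wider, which is one way to quantify why results for the least discussed drug should be interpreted cautiously.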
All the b/tsDMARDs had at least 80% of their positive posts discussing efficacy. While
etanercept had a significantly higher proportion of positive posts citing efficacy than some of
the other b/tsDMARDs, the absolute difference in proportions was small and unlikely to be
clinically meaningful. It is interesting to note that the three b/tsDMARDs that had a lower
proportion of efficacy posts than etanercept (rituximab, tofacitinib and infliximab) all have a
different mechanism of action to one another and a different mode of administration. This
comparison also highlights a powerful potential use of sentiment analysis technology. Despite
the ever-increasing number of b/tsDMARDs, there are few head-to-head trials that directly
compare these agents. Sentiment analysis provides a large-scale, real-world
summary measure of effectiveness and tolerability that can serve as an indirect comparison.
While methotrexate did have over 80% of its positive posts discussing efficacy, only marginally
below the b/tsDMARDs, it still generated an overall negative sentiment ratio due to the high
incidence of posts mentioning side effects. Almost half of the negative posts against
methotrexate discussed side effects, which was one of the highest across both the categories
of DMARDs. Our study demonstrates that the majority of patients find methotrexate to be
efficacious yet have assigned it a negative sentiment, primarily due to gastrointestinal side
effects. While clinical trial data have shown that less than 10% of patients stop methotrexate
due to side effects, longer-term studies have demonstrated that over a third of the
patients who take methotrexate for more than two years will discontinue the medication
(119,120). Sulfasalazine also had a high percentage of patients posting about side effects, with
allergic reactions being frequently mentioned; however, the percentage of its positive posts
discussing efficacy was lower than that of methotrexate or the b/tsDMARDs. It was a
combination of poor (perceived) efficacy along with side effect concerns that generated the
overall negative sentiment for sulfasalazine. Trials that have previously compared
sulfasalazine to methotrexate have demonstrated comparable efficacy and side effects
(121,122).
One of the most common concerns raised by patients on b/tsDMARDs was injection site
reactions. Studies have shown that patients have a strong preference for orally administered
medications over injectables, and this likely contributed towards the reduced
side-effect-related sentiment (123). Frequency of administration might also explain the relatively fewer
negative posts due to side effects for golimumab, which has a monthly dosing interval.
Evidence suggests that patients with RA prefer a monthly frequency of drug administration,
and while other drugs such as infliximab, tocilizumab and rituximab have similar or longer
dosing intervals, their intravenous route of administration is known to be less
desired by patients (124).
The most common concerns raised by patients on csDMARDs were hair loss, gastrointestinal
issues and allergic reactions. Shingles was a more frequent cause of negative sentiment in patients on
tofacitinib than in those on the other b/tsDMARDs, which mirrors findings from clinical studies (125).
More patients posted a positive comment regarding efficacy for b/tsDMARDs than for
csDMARDs. This is consistent with a network meta-analysis, which showed that 16% more
patients on biologic/DMARD combination achieved an American College of Rheumatology 50
(ACR50) response than those on csDMARDs (126).
The most important limitations of this study reflect the nascent state of the
technology. The first is the quality of the data. This study is unlike typical qualitative
analysis studies, which obtain responses from patients by direct questioning. Our study
downloaded free-flowing conversations across the entirety of the internet on specific topics.
While this allowed us to capture patient sentiment in a purer form, unbiased by the
confines of surveys and questionnaires, it comes at the cost of accuracy. Despite the use of strict
filters, without a manual analysis of the 3 billion posts it is impossible to know
how relevant the information contained within each post is to the topic being studied.
Secondly, sentiment analysis itself is evolving with no current gold standard approach. There
are various methods by which sentiment analysis can be conducted, with each having certain
advantages and disadvantages and none providing an absolute guarantee of accuracy. Due to
these issues, it would not be surprising to have similar studies produce different results based
on the platforms being analysed (as some allow patients to post large amounts of information
and others, like Twitter, only allow small amounts, thus influencing the accuracy of the
algorithms) and the technique used to conduct sentiment analysis. Posts made in languages
other than English were also excluded as sentiment analysis is not as well developed for other
languages. Therefore, the results of this study might not be applicable to countries where
English is not the primary language.
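To make the point about method dependence concrete, the following is a minimal sketch of one family of techniques, a lexicon-based classifier with simple negation handling. It is a toy illustration and not the method used in this study. The word lists, the function names `lexicon_score` and `label_post`, and the example posts are all hypothetical.

```python
import re

# Tiny hypothetical lexicons; real lexicons contain thousands of scored terms.
POSITIVE_WORDS = {"better", "helped", "relief", "improved", "great"}
NEGATIVE_WORDS = {"nausea", "worse", "rash", "vomiting", "awful"}
NEGATORS = {"no", "not", "never", "without"}


def lexicon_score(post):
    """Sum word polarities, flipping a word's polarity if a negator precedes it."""
    tokens = re.findall(r"[a-z']+", post.lower())
    score = 0
    for index, token in enumerate(tokens):
        polarity = int(token in POSITIVE_WORDS) - int(token in NEGATIVE_WORDS)
        if polarity != 0 and index > 0 and tokens[index - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label_post(post):
    """Map the numeric score to a sentiment label."""
    score = lexicon_score(post)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for example in [
    "This drug helped, my joints feel better",
    "Constant nausea and vomiting",
    "No nausea so far",
    "Started it yesterday",
]:
    print(f"{label_post(example):8s} | {example}")
```

A short post gives such a classifier very few words to work with, so a single unmatched or misread term can change the label. This is one reason why platforms with strict length limits may yield different accuracy from those that allow longer posts.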
Chapter 5: Conclusion
5.1 Research Contribution
Rheumatoid arthritis is a chronic, incurable and debilitating condition that requires strict
adherence to pharmacotherapy in order to achieve remission and improve quality of life.
Patient concordance with prescribed medications has historically been poor. While
high-quality randomised controlled trials are useful for detecting efficacy, they are typically not
designed to understand patient emotions and individual experiences, which often play a
bigger role in determining long-term concordance with medications. Qualitative analyses are
better suited to tackling such questions but are typically done on a much smaller scale and thus
might not be able to capture the wide range of reasons and emotions expressed by patients.
The availability of social media as a vast repository of unfiltered, free-flowing patient
discussions that are unencumbered by the confines of rigid questionnaires, along with the
rapid improvement in big data analytic technology, has given us a unique opportunity to bring
these two together to better understand this important yet elusive question of patient
sentiment and beliefs about their medications.
This study is the first of its kind to conduct a sentiment analysis of all available social media
posts generated by RA patients for all available DMARDs (as of the commencement of the
study). Our study has been able to capture unprompted sentiment as directly expressed by
the patient. The sentiment was positive for all the b/tsDMARDs with efficacy being the
primary driver of this, followed by lack of side effects. Methotrexate and sulfasalazine had an
overall negative sentiment, and descriptions of side effects were particularly common for
methotrexate. csDMARDs are typically first-line agents, and the majority of patients with RA
will be in remission on these agents alone. Thus, it is counterintuitive to some extent for the
sentiment towards these agents to be less positive than that towards the b/tsDMARDs, which are typically
reserved for refractory cases of RA. Future studies could analyse the position each DMARD holds in
the hierarchy of RA management and its impact on patient sentiment.
We identified efficacy and side effects as the major points of discussion on various social
media platforms, and thus chose to focus the discussion on them. However, it is certainly possible that there
were other aspects of DMARDs that were being discussed (such as affordability, availability
and restrictions imposed by local regulatory requirements to name a few) and future studies
could certainly assess these to obtain a broader understanding of patient perceptions
regarding these medications.
5.2 Future Directions
While the field of data analytics has certainly improved over the last decade, the progress has
been overshadowed by the meteoric rise in the quantity of data being produced on a daily
basis. The ability to analyse this data is further hampered by the global variation in linguistics
and semantics; in the absence of machine-readable labelling (a human-intensive task), such
analysis is unlikely to yield meaningful results. However, rapid advances are occurring in the field of big
data analysis, especially with the development of artificial neural networks and deep learning,
which allow computer programs to self-learn and improve with each iteration without the
need for human input. While this technology has enormous potential to rapidly capture and interpret
broad-spectrum patient sentiment towards medications, a universally
accepted “gold standard” approach will need to be established prior to wider acceptance of
this technology in the medical field.
Once established, we foresee sentiment analysis as being a valuable addition to existing
qualitative methods, which allow for a more nuanced assessment than is currently possible
with sentiment analysis. This complementary approach will generate novel insights and
improve various aspects of patient-physician interaction, from shared decision-making
regarding DMARD selection, to patient adherence, thus improving patient care.
References
1. Cross M, Smith E, Hoy D, Carmona L, Wolfe F, Vos T, et al. The global burden of
rheumatoid arthritis: estimates from the global burden of disease 2010 study. Ann
Rheum Dis. 2014;73(7):1316-22.
2. Garrod AE. A treatise on rheumatism and rheumatoid arthritis: Griffin; 1890.
3. Landré-Beauvais AJ. The first description of rheumatoid arthritis. Unabridged text of
the doctoral dissertation presented in 1800. Joint Bone Spine. 2001;68(2):130-43.
4. Tan EM, Smolen JS. Historical observations contributing insights on etiopathogenesis
of rheumatoid arthritis and role of rheumatoid factor. J Exp Med.
2016;213(10):1937-50.
5. Yarwood A, Huizinga TW, Worthington J. The genetics of rheumatoid arthritis: risk
and protection in different stages of the evolution of RA. Rheumatology. 2016 Feb
1;55(2):199-209.
6. Stahl EA, Wegmann D, Trynka G, Gutierrez-Achury J, Do R, Voight BF, Kraft P, Chen R,
Kallberg HJ, Kurreeman FA, Kathiresan S. Bayesian inference analyses of the
polygenic architecture of rheumatoid arthritis. Nature genetics. 2012 May;44(5):483-
9.
7. Okada Y, Wu D, Trynka G, Raj T, Terao C, Ikari K, Kochi Y, Ohmura K, Suzuki A,
Yoshida S, Graham RR. Genetics of rheumatoid arthritis contributes to biology and
drug discovery. Nature. 2014 Feb;506(7488):376-81.
8. Viatte S, Barton A. Genetics of rheumatoid arthritis susceptibility, severity, and
treatment response. In Seminars in immunopathology 2017 Jun 1 (Vol. 39, No. 4, pp.
395-408). Springer Berlin Heidelberg.
9. Firestein GS, McInnes IB. Immunopathogenesis of Rheumatoid Arthritis. Immunity
(Cambridge, Mass). 2017;46(2):183-96.
10. McInnes IB, Schett G. The pathogenesis of rheumatoid arthritis. New England Journal
of Medicine. 2011 Dec 8;365(23):2205-19.
11. Scher JU, Littman DR, Abramson SB. Microbiome in inflammatory arthritis and
human rheumatic diseases. Arthritis & rheumatology (Hoboken, NJ). 2016
Jan;68(1):35.
12. Derksen VF, Huizinga TW, van der Woude D. The role of autoantibodies in the
pathophysiology of rheumatoid arthritis. In Seminars in immunopathology 2017 Jun 1
(Vol. 39, No. 4, pp. 437-446). Springer Berlin Heidelberg.
13. Gravallese EM, Monach PA. Pathogenesis and pathology of rheumatoid arthritis.
Seventh ed. 2019. p. 811-31.
14. Smolen JS, Aletaha D, McInnes IB. Rheumatoid arthritis. The Lancet. 2016.
15. Aletaha D, Neogi T, Silman AJ, Funovits J, Felson DT, Bingham III CO, Birnbaum NS,
Burmester GR, Bykerk VP, Cohen MD, Combe B. 2010 rheumatoid arthritis
classification criteria: an American College of Rheumatology/European League
Against Rheumatism collaborative initiative. Arthritis & rheumatism. 2010
Sep;62(9):2569-81.
16. Grigor C, Capell H, Stirling A, McMahon AD, Lock P, Vallance R, Porter D, Kincaid W.
Effect of a treatment strategy of tight control for rheumatoid arthritis (the TICORA
study): a single-blind randomised controlled trial. The Lancet. 2004 Jul
17;364(9430):263-9.
17. Goekoop-Ruiterman YD, de Vries-Bouwstra JK, Allaart CF, Van Zeben D, Kerstens PJ,
Hazes JM, Zwinderman AH, Ronday HK, Han KH, Westedt ML, Gerards AH. Clinical
and radiographic outcomes of four different treatment strategies in patients with
early rheumatoid arthritis (the BeSt study): a randomized, controlled trial. Arthritis &
Rheumatism. 2005 Nov;52(11):3381-90.
18. Singh JA, Saag KG, Bridges Jr SL, Akl EA, Bannuru RR, Sullivan MC, Vaysbrot E,
McNaughton C, Osani M, Shmerling RH, Curtis JR. 2015 American College of
Rheumatology guideline for the treatment of rheumatoid arthritis. Arthritis &
rheumatology. 2016 Jan;68(1):1-26.
19. Smolen JS, Landewé R, Bijlsma JW, Burmester GR, Dougados M, Kerschbaumer A,
McInnes IB, Sepriano A, Van Vollenhoven RF, De Wit M, Aletaha D. EULAR
recommendations for the management of rheumatoid arthritis with synthetic and
biological disease-modifying antirheumatic drugs: 2019 update. Annals of the
rheumatic diseases. 2020 Jan 22.
20. Hopkins AM, Proudman SM, Vitry AI, Sorich MJ, Cleland LG, Wiese MD. Ten years of
publicly funded biological disease-modifying antirheumatic drugs in Australia. The
Medical journal of Australia. 2016 Feb;204(2):64-8.
21. Salt E, Frazier SK. Adherence to disease-modifying antirheumatic drugs in patients
with rheumatoid arthritis: a narrative review of the literature. Orthopedic nursing.
2010;29(4):260-75.
22. Blum MA, Koo D, Doshi JA. Measurement and rates of persistence with and adherence
to biologics for rheumatoid arthritis: a systematic review. Clinical therapeutics.
2011;33(7):901-13.
23. https://www.pbs.gov.au/pbs/industry/listing/participants/public-release-docs/2016-
02/bdmards-for-psoriatic-arthritis-2016-02
24. DiMatteo MR, Giordani PJ, Lepper HS, Croghan TW. Patient adherence and medical
treatment outcomes: a meta-analysis. Medical care. 2002;40(9):794-811.
25. Wabe N, Lee A, Wechalekar M, McWilliams L, Proudman S, Wiese M. Adherence to
combination DMARD therapy and treatment outcomes in rheumatoid arthritis: a
longitudinal study of new and existing DMARD users. Rheumatology international.
2017;37(6):897-904.
26. Gagnon MD, Waltermaurer E, Martin A, Friedenson C, Gayle E, Hauser DL. Patient
Beliefs Have a Greater Impact Than Barriers on Medication Adherence in a Community
Health Center. Journal of the American Board of Family Medicine : JABFM.
2017;30(3):331-6.
27. Wong PK. Medication adherence in patients with rheumatoid arthritis: why do
patients not take what we prescribe? Rheumatology international. 2016;36(11):1535-
42.
28. The Internet and American Business. Aspray W, Ceruzzi PE, editors: The MIT Press;
2008.
29. Anderson P. Web 2.0 and Beyond: Principles and Technologies: Chapman &
Hall/CRC; 2012.
30. Murugesan S. Understanding Web 2.0. IT Professional Magazine. 2007;9(4):34-41.
31. Kaplan AM, Haenlein M. Users of the world, unite! The challenges and opportunities
of Social Media. Business Horizons. 2010;53(1):59-68.
32. Kaun A. Jose van Dijck: Culture of Connectivity: A Critical History of Social Media.
Oxford: Oxford University Press. 2013. MedieKultur: Journal of media and
communication research. 2014;30:3.
33. Andrew Perrin. “Social Networking Usage: 2005-2015.” Pew Research Center. October
2015.
34. Wilson MI, Corey KE. The role of ICT in Arab spring movements. Netcom. 2014;2012-
2(26):343-56.
35. Hall W, Tinati R, Jennings W. From Brexit to Trump: Social Media’s Role in
Democracy. Computer. 2018;51(1):18-27.
36. Margetts H, John P, Hale S, Yasseri T. Political Turbulence, How Social Media Shape
Collective Action: Princeton University Press; 2016.
37. Bond RM, Fariss CJ, Jones JJ, Kramer AD, Marlow C, Settle JE, et al. A 61-million-person
experiment in social influence and political mobilization. Nature. 2012;489(7415):295-
8.
38. Kurniawati K, Shanks GG, Bekmamedova N, editors. The Business Impact Of Social
Media Analytics. ECIS; 2013.
39. Baars H, Kemper H-G. Management Support with Structured and Unstructured Data -
An Integrated Business Intelligence Framework. Information Systems Management.
2008;25(2):132-48. doi:10.1080/10580530801941058.
40. Mateosian R. Ethics of Big Data. IEEE Micro. 2013;33(2):60-1.
41. Doug Laney, “3D Data Management: Controlling Data Volume, Velocity, and Variety”,
Gartner, file No. 949. 6 February 2001,
http://blogs.gartner.com/douglaney/files/2012/01/ad949-3D-Data-Management-
ControllingData-Volume-Velocity-and-Variety.pdf.
42. Kaisler S, Armour F, Espinosa JA, Money W. Big Data: Issues and Challenges Moving
Forward. IEEE; 2013. p. 995-1004.
43. Gandomi A, Haider M. Beyond the hype: Big data concepts, methods, and analytics.
International journal of information management. 2015;35(2):137-44.
44. Erl T, Khattak W, Buhler P. Big Data Fundamentals: Concepts, Drivers &
Techniques: Prentice Hall Press; 2016. p. 41.
45. Ghavami P. Big Data Analytics Methods: Analytics Techniques in Data Mining, Deep
Learning and Natural Language Processing. 2019.
46. Parimala K, Rajkumar G, Ruba A, Vijayalakshmi S. Challenges and Opportunities with
Big Data. International Journal of Scientific Research in Computer Science and
Engineering. 2017;5:16-20.
47. Zeng DD, Chen H-c, Lusch R, Li S-H. Social Media Analytics and Intelligence. Intelligent
Systems, IEEE. 2011;25:13-6.
48. Stieglitz S, Mirbabaie M, Ross B, Neuberger C. Social media analytics – Challenges in
topic discovery, data collection, and data preparation. International Journal of
Information Management. 2018;39:156-68.
49. Fan W, Gordon M. The power of social media analytics. Communications of the ACM.
2014;57(6):74-81.
50. Jacobson D, Brail G, Woods D. APIs: A Strategy Guide: O'Reilly Media, Inc.; 2011.
51. Qiu Y. The openness of Open Application Programming Interfaces. Information,
Communication & Society. 2017;20(11):1720-36.
52. Olston C, Najork M. Web Crawling. Found Trends Inf Retr. 2010;4(3):175–246.
53. García S, Luengo J, Herrera F. Tutorial on Practical Tips of the Most Influential Data
Preprocessing Algorithms in Data Mining. Knowledge-Based Systems. 2015;98.
54. Uysal AK, Gunal S. The impact of preprocessing on text classification. Information
Processing & Management. 2014;50:104-12.
55. Gurusamy V, Kannan S. Preprocessing Techniques for Text Mining. 2014.
56. Raskutti B, Ferrá H, Kowalczyk A, editors. Second Order Features for Maximising Text
Classification Performance. Machine Learning: ECML 2001; 2001; Berlin,
Heidelberg: Springer Berlin Heidelberg.
57. Kolchyna O, Souza TTP, Treleaven P, Aste T. Twitter Sentiment Analysis: Lexicon
Method, Machine Learning Method and Their Combination. 2015.
58. Sriyanong W, Moungmingsuk N, Khamphakdee N. A Text Preprocessing Framework
for Text Mining on Big Data Infrastructure. 2018. p. 169-73.
59. Jivani A. A Comparative Study of Stemming Algorithms. Int J Comp Tech Appl.
2011;2:1930-8.
60. Jurafsky D, Martin J. Speech and Language Processing: An Introduction to Natural
Language Processing, Computational Linguistics, and Speech Recognition. 2008.
61. Méndez, J. R., Iglesias, E. L., Fdez-Riverola, F., Díaz, F., & Corchado, J. M. (2006).
Tokenising, stemming and stopword removal on anti-spam filtering domain.
Proceedings of the 11th spanish association conference on current topics in artificial
intelligence. Springer-Verlag: Santiago de Compostela, Spain
62. Pomikálek, J., & Rehurek, R. (2007). The Influence of preprocessing parameters on text
categorization. International Journal of Applied Science, Engineering and Technology,
4, 430–434.
63. Song, F. X., Liu, S. H., & Yang, J. Y. (2005). A comparative study on text representation
schemes in text categorization. Pattern Analysis and Applications, 8, 199–209.
64. Toman, M., Tesar, R., & Jezek, K. (2006). Influence of word normalization on text
classification. In Proceedings of the 1st international conference on multidisciplinary
information sciences & technologies (Vol. 2, pp. 354–358). Merida, Spain.
65. Pang B, Lee L. Opinion Mining and Sentiment Analysis. Foundations and Trends® in
Information Retrieval. 2008;2(1–2):1-135.
66. Bali R. Learning Social Media Analytics with R. Sarkar D, Sharma T, editors.
Birmingham: Packt Publishing; 2017.
67. Pozzi F, Fersini E, Messina V, Liu B. Sentiment Analysis in Social Networks. 2016. Chapter
1.
68. Cambria E, Das D, Bandyopadhyay S, Feraco A. A Practical Guide to Sentiment Analysis.
2017. p. 1-10.
69. Taboada M, Brooke J, Tofiloski M, Voll K, Stede M. Lexicon-Based Methods for
Sentiment Analysis. Computational Linguistics. 2011;37(2):267-307.
70. Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in
phrase-level sentiment analysis. In Proceedings of the Conference on Human
Language Technology and Empirical Methods in Natural Language Processing, HLT ’05,
pages 347–354, Stroudsburg, PA, USA, 2005. Association for Computational
Linguistics.
71. Philip Stone. General inquirer. http://www.wjh.harvard.edu/~inquirer/, last accessed:
25/08/2020.
72. Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. Sentiwordnet 3.0: An
enhanced lexical resource for sentiment analysis and opinion mining. In Nicoletta
Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan
Odijk, Stelios Piperidis, Mike Rosner, and Daniel Tapias, editors, Proceedings of the
Seventh International Conference on Language Resources and Evaluation (LREC’10),
Valletta, Malta, may 2010. European Language Resources Association (ELRA).
73. K. Denecke. Are sentiwordnet scores suited for multi-domain sentiment classification?
In Digital Information Management, 2009. ICDIM 2009. Fourth International
Conference on, pages 1–6, Nov 2009.
74. B. Ohana, B. Tierney, and S. Delany. Domain independent sentiment classification with
many lexicons. In Advanced Information Networking and Applications (WAINA), 2011
IEEE Workshops of International Conference on, pages 632–637, March 2011.
75. Alpaydin E, Bach F. Introduction to Machine Learning. 3rd ed. Cambridge: MIT
Press; 2014.
76. Ayodele TO. Types of machine learning algorithms. New advances in machine learning.
2010 Feb 1;3:19-48.
77. Schmidhuber J. Deep learning in neural networks: An overview. Neural networks. 2015
Jan 1;61:85-117.
78. Madhoushi Z, Hamdan AR, Zainudin S. Sentiment analysis techniques in recent works.
In 2015 Science and Information Conference (SAI) 2015 Jul 28 (pp. 288-291). IEEE.
79. Aue, Anthony and Michael Gamon. 2005. Customizing sentiment classifiers to new
domains: A case study. In Proceedings of the International Conference on Recent
Advances in Natural Language Processing, Borovets, Bulgaria.
80. Bishop C. Pattern Recognition and Machine Learning. 162006. p. 140-55.
81. Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: A new perspective.
Neurocomputing. 2018 Jul 26;300:70-9.
82. Deng X, Li Y, Weng J, Zhang J. Feature selection for text classification: A review.
Multimedia Tools and Applications. 2019 Feb 1;78(3):3797-816.
83. Tang J, Alelyani S, Liu H. Feature selection for classification: A review. Data
classification: Algorithms and applications. 2014:37.
84. Buckland M, Gey F. The Relationship between Recall and Precision. Journal of the
American Society for Information Science. 1994;45(1):12.
85. Guns R, Lioma C, Larsen B. The tipping point: F-score as a function of the number of
retrieved items. Information Processing and Management. 2012;48(6):1171-80.
86. Arksey H, O’Malley L. Scoping studies: towards a methodological framework.
International Journal of Social Research Methodology. 2005 Feb 1;8(1):19–32.
87. Kaplan AM, Haenlein M. Users of the world, unite! The challenges and opportunities
of Social Media. Business Horizons. 2010;53(1):59-68.
88. Freund Y, Schapire R. A decision-theoretic generalization of online learning and an
application to boosting. Journal of Computer Sys. Sci. 1997;vol. 55:119–139
89. Devroye, L, Györfi, L. & Lugosi, G. in A Probabilistic Theory of Pattern Recognition.
Stochastic Modelling and Applied Probability 187–213 (Springer, New York, NY, 1996).
90. Ebrahimi M, Yazdavar AH, Salim N, Eltyeb S. Recognition of side effects as implicit-
opinion words in drug reviews. Online Information Review. 2016 Nov 4;40(7):1018–
32.
91. Korkontzelos I, Nikfarjam A, Shardlow M, Sarker A, Ananiadou S, Gonzalez GH. Analysis
of the effect of sentiment analysis on extracting adverse drug reactions from tweets
and forum posts. J Biomed Inform. 2016 Aug;62:148–58.
92. Roccetti M, Marfia G, Salomoni P, Prandi C, Zagari RM, Gningaye Kengni FL, et al.
Attitudes of Crohn's Disease Patients: Infodemiology Case Study and Sentiment
Analysis of Facebook and Twitter Posts. JMIR public health and surveillance.
2017;3(3):e51.
93. Ramagopalan S, Wasiak R, Cox AP. Using Twitter to investigate opinions about multiple
sclerosis treatments: a descriptive, exploratory study. F1000Res. 2014;3:216.
94. Du J, Xu J, Song H-Y, Tao C. Leveraging machine learning-based approaches to assess
human papillomavirus vaccination sentiment trends with Twitter data. BMC medical
informatics and decision making. 2017 Jul 5;17(Suppl 2):69–69.
95. Portier K, Greer GE, Rokach L, Ofek N, Wang Y, Biyani P, et al. Understanding topics
and sentiment in an online cancer survivor community. J Natl Cancer Inst Monogr.
2013 Dec;2013(47):195–8.
96. Cobb NK, Mays D, Graham AL. Sentiment analysis to determine the impact of online
messages on smokers’ choices to use varenicline. J Natl Cancer Inst Monogr. 2013
Dec;2013(47):224–30.
97. Cabling ML, Turner JW, Hurtado-de-Mendoza A, Zhang Y, Jiang X, Drago F, et al.
Sentiment Analysis of an Online Breast Cancer Support Group: Communicating about
Tamoxifen. Health communication. 2018;33(9):1158-65.
98. Liu J, Jiang X, Chen Q, Song M, Li J. Adverse Drug Reaction Related Post Detecting Using
Sentiment Feature. Iranian journal of public health. 2018;47(6):861-7.
99. Zhang L, Hall M, Bastola D. Utilizing Twitter data for analysis of chemotherapy.
International journal of medical informatics. 2018;120:92-100.
100. Seerat B, Azam F. Opinion mining: Issues and challenges (a survey).
International Journal of Computer Applications. 2012;49(9).
101. Sloane R, Osanlou O, Lewis D, Bollegala D, Maskell S, Pirmohamed M. Social
media and pharmacovigilance: A review of the opportunities and challenges. Br J Clin
Pharmacol. 2015;80(4):910-20.
102. Sultana J, Cutroneo P, Trifiro G. Clinical and economic burden of adverse drug
reactions. J Pharmacol Pharmacother. 2013 Dec;4(Suppl 1):S73-77.
103. Aue A, Gamon M. Customizing Sentiment Classifiers to New Domains: a Case
Study. In: Submitted to RANLP-05, the International Conference on Recent Advances
in Natural Language Processing [Internet]. Borovets, BG; 2005.
104. Yu Y, Duan W, Cao Q. The impact of social and conventional media on firm
equity value: A sentiment analysis approach. Decision Support Systems.
2013;55(4):919-26.
105. Rui H, Liu Y, Whinston A. Whose and what chatter matters? The effect of
tweets on movie sales. Decision Support Systems. 2013;55(4):863-70.
106. Collomb A, Costea C, Joyeux D, Hasan O, Brunie L. A study and comparison of
sentiment analysis methods for reputation evaluation. Rapport de recherche RR-LIRIS-
2014-002. 2014.
107. Devika MD, Sunitha C, Ganesh A. Sentiment Analysis: A Comparative Study on
Different Approaches. Procedia Computer Science. 2016;87:44–9.
108. Gonçalves P, Araújo M, Benevenuto F, Cha M. Comparing and Combining
Sentiment Analysis Methods. 2014.
109. Balahur A, Jacquet G. Sentiment analysis meets social media – Challenges and
solutions of the field in view of the current information sharing context. Information
Processing and Management. 2015;51(4):428–32.
110. Denecke K, Deng Y. Sentiment analysis in medical settings: New opportunities
and challenges. Artificial Intelligence In Medicine. 2015;64(1):17–27.
111. The Analysis Of Efficacy Data. In: Cleophas TJ, Zwinderman AH, Cleophas TF,
Cleophas EP, editors. Statistics Applied to Clinical Trials. Dordrecht: Springer
Netherlands; 2009. p. 17-43.
112. Cohen J. A coefficient of agreement for nominal scales. Educational and
Psychological Measurement. 1960;20:37-46.
113. Ebina K, Hashimoto M, Yamamoto W, Hirano T, Hara R, Katayama M, et al. Drug
tolerability and reasons for discontinuation of seven biologics in elderly patients with
rheumatoid arthritis -The ANSWER cohort study. PloS one. 2019;14(5):e0216624.
114. Landis JR, Koch GG. The measurement of observer agreement for categorical
data. Biometrics. 1977;33:159–74. doi:10.2307/2529310. PMID: 843571.
115. Provoost S, Ruwaard J, van Breda W, et al. Validating automated sentiment
analysis of online cognitive behavioral therapy patient texts: an exploratory study.
Front Psychol. 2019;10:1065. doi:10.3389/fpsyg.2019.01065. PMID: 31156504.
116. Charles-Schoeman C, Burmester G, Nash P, Zerbini CA, Soma K, Kwok K, et al.
Efficacy and safety of tofacitinib following inadequate response to conventional
synthetic or biological disease-modifying antirheumatic drugs. Annals of the
rheumatic diseases. 2016;75(7):1293-301.
117. Cohen SB, Tanaka Y, Mariette X, Curtis JR, Lee EB, Nash P, et al. Long-term
safety of tofacitinib for the treatment of rheumatoid arthritis up to 8.5 years:
integrated analysis of data from the global clinical trials. Annals of the rheumatic
diseases. 2017;76(7):1253-62.
118. Kivitz AJ, Cohen S, Keystone E, van Vollenhoven RF, Haraoui B, Kaine J, et al. A
pooled analysis of the safety of tofacitinib as monotherapy or in combination with
background conventional synthetic disease-modifying antirheumatic drugs in a Phase
3 rheumatoid arthritis population. Seminars in arthritis and rheumatism.
2018;48(3):406-15.
119. Katchamart W, Trudeau J, Phumethum V, Bombardier C. Methotrexate
monotherapy versus methotrexate combination therapy with non-biologic disease
modifying anti-rheumatic drugs for rheumatoid arthritis. The Cochrane database of
systematic reviews. 2010(4):Cd008495.
120. Salliot C, van der Heijde D. Long-term safety of methotrexate monotherapy in
patients with rheumatoid arthritis: a systematic literature research. Annals of the
rheumatic diseases. 2009;68(7):1100-4.
121. Dougados M, Combe B, Cantagrel A, Goupille P, Olive P, Schattenkirchner M,
et al. Combination therapy in early rheumatoid arthritis: a randomised, controlled,
double blind 52 week clinical trial of sulphasalazine and methotrexate compared with
the single components. Annals of the rheumatic diseases. 1999;58(4):220-5.
122. Felson DT, Anderson JJ, Meenan RF. Use of short-term efficacy/toxicity
tradeoffs to select second-line drugs in rheumatoid arthritis. A metaanalysis of
published clinical trials. Arthritis and rheumatism. 1992;35(10):1117-25.
123. Stewart KD, Johnston JA, Matza LS, Curtis SE, Havel HA, Sweetana SA, et al.
Preference for pharmaceutical formulation and treatment process attributes. Patient
preference and adherence. 2016;10:1385-99.
124. Alten R, Kruger K, Rellecke J, Schiffner-Rohe J, Behmer O, Schiffhorst G, et al.
Examining patient preferences in the treatment of rheumatoid arthritis using a
discrete-choice approach. Patient preference and adherence. 2016;10:2217-28.
125. Winthrop KL, Yamanaka H, Valdez H, Mortensen E, Chew R, Krishnaswami S, et
al. Herpes zoster and tofacitinib therapy in patients with rheumatoid arthritis. Arthritis
& rheumatology (Hoboken, NJ). 2014;66(10):2675-84.
126. Singh JA, Hossain A, Mudano AS, Tanjong Ghogomu E, Suarez-Almazor ME,
Buchbinder R, et al. Biologics or tofacitinib for people with rheumatoid arthritis naive
to methotrexate: a systematic review and network meta-analysis. The Cochrane
database of systematic reviews. 2017;5:Cd012657.
Appendix
Ethics approval