Mining Social Media Data to Investigate Patient Perceptions … · ARPA - Advanced Research Projects Agency bDMARDs – Biological Disease Modifying Antirheumatic Drugs csDMARDs

1

Mining Social Media Data to Investigate Patient Perceptions

Regarding DMARD Pharmacotherapy for Rheumatoid Arthritis.

Dr Chanakya Sharma MBBS FRACP

This thesis is presented in partial fulfilment of the requirements for the Master

of Clinical Research degree at the University of Western Australia.

School: Graduate Research School

Year of submission: 2020

Contact details: [email protected]

2

Thesis declaration I, Chanakya Sharma, certify that this thesis is my work, it has been completed during the

course of this degree, and does not breach any ethical rules with regard to the conduct of the

research.

Dr Chanakya Sharma MBBS (UWA) FRACP

3

Table of Contents

Thesis declaration .................................................................................................................................. 2

List of Abbreviations .............................................................................................................................. 5

Abstract ................................................................................................................................................. 7

List of Tables .......................................................................................................................................... 9

List of Figures ....................................................................................................................................... 10

Acknowledgement ............................................................................................................................... 11

Authorship Declaration ........................................................................................................................ 12

Chapter 1: Introduction ....................................................................................................................... 13

Chapter 2: Background ........................................................................................................................ 15

2.1 Rheumatoid arthritis .................................................................................................................. 15

2.1.1 Aetiology ............................................................................................................................. 15

2.1.2 Pathogenesis ....................................................................................................................... 16

2.1.3 Clinical Features .................................................................................................................. 17

2.1.4 Management ...................................................................................................................... 19

2.1.5 DMARDs .............................................................................................................................. 20

2.2 Social Media and Sentiment Analysis ........................................................................................ 22

2.2.1 Big Data ............................................................................................................................... 23

2.2.2 Social Media Analytics ........................................................................................................ 26

2.2.3 Data Capture ....................................................................................................................... 26

2.2.4 Preprocessing ...................................................................................................................... 27

2.2.5 Sentiment Analysis .............................................................................................................. 29

2.3 Conclusion ................................................................................................................................. 33

Chapter 3: Scoping Review - Can sentiment analysis be conducted on social media platforms to

understand public sentiment held towards pharmacotherapy? ......................................................... 34

3.1 Abstract ..................................................................................................................................... 34

3.2 Methods ..................................................................................................................................... 35

3.3 Results ....................................................................................................................................... 36

3.3.1 Sentiment analysis techniques and accuracy ..................................................................... 50

3.3.2 Sentiment analysis use ....................................................................................................... 51

3.4 Discussion .................................................................................................................................. 52

3.5 Conclusion ................................................................................................................................. 57

Chapter 4: Mining social media data to investigate patient perceptions regarding DMARD therapy 59

4

4.1 Abstract ..................................................................................................................................... 59

4.2 Methods ..................................................................................................................................... 60

4.2.1 Statistics .............................................................................................................................. 62

4.3 Ethics .......................................................................................................................................... 62

4.4 Results ....................................................................................................................................... 62

4.4.1 B/tsDMARDs ....................................................................................................................... 65

4.4.2 CsDMARDs .......................................................................................................................... 73

4.4.3 B/tsDMARDs vs csDMARDs ................................................................................................. 76

4.5 Discussion .................................................................................................................................. 77

Chapter 5: Conclusion .......................................................................................................................... 81

5.1 Research Contribution ............................................................................................................... 81

5.2 Future Directions ....................................................................................................................... 82

References ........................................................................................................................................... 84

Appendix .............................................................................................................................................. 99

Ethics approval ................................................................................................................................ 99

5

List of Abbreviations

ACR – American College of Rheumatology

API - Application Programming Interfaces

ARPA - Advanced Research Projects Agency

bDMARDs – Biological Disease Modifying Antirheumatic Drugs

csDMARDs – Conventional Synthetic Disease Modifying Antirheumatic Drugs

EULAR – European League Against Rheumatism

HCQ-hydroxychloroquine

LB-Lexicon based

LEF-Leflunomide

ML - Machine Learning

MTX – Methotrexate

QA – Quality Assessment

RA – Rheumatoid Arthritis

SA - Sentiment Analysis

SM - Social Media

6

SZS – Sulfasalazine

tsDMARDs – Targeted Synthetic Disease Modifying Antirheumatic Drugs

USA – United States of America

7

Abstract Objectives: The hypothesise of study is that patients have a positive sentiment regarding

b/tsDMARDs and a negative sentiment towards csDMARDs. A scoping review was conducted

to map the literature as it pertains to the use of sentiment analysis as a tool to extract

meaningful data on social media discussion on pharmacotherapy. Sophisticated sentiment

analysis algorithms were then used to analyse discussions on social media platforms regarding

DMARDs to understand the collective sentiment expressed towards these medications.

Methods: For the scoping review a keyword search strategy was used on several databases

and 10 studies were included which revealed various uses of sentiment analysis, but most

commonly to extract sentiment regarding a particular medication. Treato analytics were then

utilised to download all available posts on social media about cs/b/tsDMARDs in the context

of rheumatoid arthritis. Strict filters ensured that user generated content was downloaded.

The sentiment (positive or negative) expressed in these posts was analysed for each DMARD

using Sentiment Analysis. An analysis was also conducted on the reason(s) for this sentiment

for each DMARD, looking specifically at efficacy and side effects.

Results: Computer algorithms analysed millions of social media posts and included 28261

posts on b/tsDMARDs and 26841 posts on csDMARDs. This revealed that all classes had an

overall positive sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs

(1.210) than for csDMARDs (1.048). Efficacy was the most commonly mentioned reason in

posts with a positive sentiment and lack of efficacy was the most commonly mentioned

reason for a negative sentiment. These were followed by the presence/absence of side effects

in negative or positive posts respectively.

8

Conclusion: Public opinion on social media is generally positive about DMARDs, regardless of

class. Lack of efficacy followed by side effects were the most common themes in posts with a

negative sentiment. There are clear reasons why a DMARD generates a positive or negative

sentiment, and as the sentiment analysis technology becomes more refined, targeted studies

can be done to further analyse these reasons, and allow clinicians to tailor DMARDs to match

patient needs.

9

List of Tables • Table 1: Summary of studies

• Table 2: Aggregate sentiment

• Table 3: Social media platforms

• Table 4: b/tsDMARD positive and negative sentiment for efficacy and side effects

• Table 5: Positive/Negative sentiment csDMARDs reasons

• Table 6: Concerns: percentage of posts with a negative sentiment

• Table 7: Comparison of proportion of positive sentiment for efficacy amongst

b/tsDMARDs

10

List of Figures • Figure 1: ACR/EULAR 2010 Rheumatoid Arthritis Classification Criteria

• Figure 2 - 2019 update of the EULAR rheumatoid arthritis management

recommendations in the form of an algorithm

• Figure 3: The 5 Vs of big data

• Figure 4: Steps involved in big data analysis

• Figure 5: Types of Sentiment Analysis

• Figure 6 – Study flow diagram

11

Acknowledgement I would like to acknowledge the following people without whom I would not have been able

to complete this thesis.

My supervisors, Dr Helen Keen, Dr Samuel Whittle, Dr Pari Delir Haghighi and Dr Frada

Burstein. It was their continued support and guidance that allowed me to embark on this

project and see it through to completion. I would like to thank Arthritis Australia for their

generous research grant that allowed us to collect the data required for this project. I would

like to thank my family, including my parents, my wife and my children, Suhani and Sohum,

who have been ever so patient and allowed me to have the time I need to complete this

degree. Lastly, I would like to thank the staff at the University of Western Australia who

have taught me the value of research and have made this journey more enjoyable than

what I had thought it would be.

12

Authorship Declaration This thesis contains work that has been published.

Details of the published papers:

• Sharma C, Whittle S, Haghighi PD, Burstein F, Keen H. Sentiment analysis of social

media posts on pharmacotherapy: A scoping review. Pharmacology Research &

Perspectives. 2020 Oct;8(5):e00640.

o Located in thesis: Chapter 3

• Sharma C, Whittle S, Haghighi PD, Burstein F, Sa'adon R, Keen H. Mining social media

data to investigate patient perceptions regarding DMARD pharmacotherapy for

rheumatoid arthritis. Annals of the Rheumatic Diseases. 2020 Sep 3.

o Located in thesis: Chapter 4

13

Chapter 1: Introduction Patients with rheumatoid arthritis (RA) face debilitating and occasionally life-threatening

consequences of untreated disease. The treatment however does require using potent

immunosuppressive/immunomodulatory agents which often have several undesirable side

effects. The intimidating nature of the physician’s office, which has led to the development

of syndromes such as “white coat hypertension”, can stifle the voice of the patients. The

nature of healthcare however has rapidly changed over the last few years. The hitherto

didactic transaction from a doctor to patient has turned into a more open discussion, which

has largely been enabled by the rise of the Internet and social media. These have been key

agents in disrupting the informational imbalance that has formed the basis of the power

differential between a physician and a patient. By providing easy and rapid access to vast

reservoirs of information about health and medications, the internet has allowed patients to

better understand their conditions and its management. In addition, social media has

provided avenues for discussions to take place amongst the suffering silent majority of

patients who might have otherwise not been able to do so. The ripples of this rise in both

knowledge and discussions online are increasingly being felt in clinical practice by physicians.

Patients are now striding into appointments aware of the latest research and discussions that

are relevant to their health. While this has had several positive consequences with patients

taking increased interest and ownership in their health, however, access to the wrong

information or the wrong discussions can often be equally disastrous by pushing patients in a

direction that can lead to worse outcomes.

The stigma of a chronic illness can be isolating, not only physically through pain and deformity,

but also emotionally and socially. Finding a cohort of likeminded and like-suffering individuals

can be a powerful driver of emotion and sentiment. We as clinicians have a responsibility to

14

be aware of the discourse that occurs on these social media platforms, and to be aware of

the sentiments that patients are expressing about the condition and the treatment that they

are being exposed to, as this can have a tremendous impact on the patient’s beliefs and

consequently their actions regarding their health.

The purpose of this thesis is to understand how patients feel about the various treatment

options that are available for RA by analysing the discussions that they are having on various

social media platforms online. This has been explored this over the last two years and findings

summarised over the next three chapters. Chapter two explains the three aspects of this

study, RA, social media and sentiment analysis. Chapter three is a scoping review that

explores whether sentiment analysis has been used to understand pharmacotherapy. This

was published in Pharmacology Research & Perspectives in October 2020. Chapter four

describes the study; analysing social media in its entirety, specifically looking at discussions

on the various DMARDs, and understanding the sentiment that was being expressed towards

these medications. This was published in The Annals of Rheumatic Disease in September

2020. The final chapter presents the conclusion of our findings and our thoughts on the future

of such analytics.

15

Chapter 2: Background

Information technology has, over the last few years rapidly ingrained itself in every aspect of

human life. While health care has traditionally been wary of changes and slow to incorporate

new technology, however quantum advances in the fields of big data analytics and artificial

intelligence have unmasked the potential for revolutionising patient care, something that

healthcare professionals can ill afford to ignore. This chapter provides an introduction to both

the information technology (social media and sentiment analysis) and healthcare (RA and

DMARDs) aspects of our study.

2.1 Rheumatoid arthritis

Rheumatoid arthritis (RA) is a chronic, inflammatory disorder that if untreated will lead to

irreversible destruction of the joints. It has a global prevalence of 0.24% (1). While the earliest

known mentions of RA date back to the Ebers Papyrus from 1500 BC, it wasn’t until 1800 that

a young medical resident by the name of Augustin Jacob Landré-Beauvais first described RA,

although he felt it was a type of gout, labelling it “Goutte Asthénique Primitive” (2). It would

take another 90 years however before the constellation of symptoms would formally be

labelled as RA (3). Even though RA spent its first three millennia in relative obscurity, the pace

of developments in the last 130 years has been nothing short of astounding.

2.1.1 Aetiology

Most of the developments in RA in the early to mid-20th century were in the fields of

pathophysiology, with it being established as an autoimmune disease with both genetic and

environmental components (4). The exact aetiology however is thus far unknown. There is a

genetic component with studies showing heritability up to 60% in twin studies (5). Genetic

16

studies have strongly suggested a polygenic cause of RA, with population-based studies

identifying over 100 loci associated with the development of RA (6,7). The strongest genetic

association thus far has been found with HLA-DRB1, and PTPN22 and DPP4 being amongst

several non-HLA loci that have been associated with increased RA susceptibility (8).

While genetics do play a role, they are not enough on their own to cause disease, and there

is likely to be a “second hit”, usually an environmental risk factor, that triggers the onset of

RA, although this has yet to be proven. Cigarette smoking remains one of the strongest

environmental risk factors for RA, with the risk being proportional to the number and duration

of cigarettes smoked (9). Various infectious agents have also been postulated to play a role in

the development of RA, although the exact mechanism by which they induce disease has not

been elucidated, molecular mimicry likely plays a role (10). More recently, alterations in the

gut microbiome have also been implicated in enhancing the susceptibility to RA (11). Exposure

to these environmental risk factors can result in the development of autoantibodies

associated with RA, which have been found in up to 80% of RA patients. The two most

commonly found antibodies are Rheumatoid Factor (RF) and Anti–cyclic citrullinated protein

antibodies (Anti-CCP). While these autoantibodies have been shown to play a part in the

pathogenesis of RA (likely through immune complex formation and complement activation),

their exact role remains unclear (12). Thus, while the exact aetiology of RA remains unknown,

it is clearly a complicated, multifaceted autoimmune process that involves the interplay of

both genetics and the environment.

2.1.2 Pathogenesis

The pathologic hallmark of RA is inflammation of the synovial tissue that lines the joints, this

is known as synovitis. Untamed synovitis leads to destruction of cartilage and bone in the

17

joint, resulting in the clinical manifestations of RA. The inflamed synovium comprises of a

variety of immune cells including T cells, B cells, plasma cells, natural killer cells, macrophages

and neutrophils (13). The migration and influx of these immune cells in the synovium occurs

due to the upregulation of cytokines and chemokines. The final impact of this symphony of

inflammatory and immune cells is osteoclast generation and chondrocyte stimulation, which

results in bone and cartilage destruction respectively (14).

2.1.3 Clinical Features

Rheumatoid arthritis is predominantly a disease of the small joints of the hands and feet. It

typically results in symmetric, synovial inflammation and tenderness across various joints,

with patients describing swelling and morning stiffness lasting over an hour. Blood tests can

show a rise in the levels of inflammation, reflected by high C-reactive protein (CRP) and/or an

erythrocyte sedimentation rate (ESR). Positivity to one or both the autoantibodies

(rheumatoid factor and/or CCP antibodies) can also be seen in most of the patients. These

findings are reflected in the 2010 American College of Rheumatology/European League

Against Rheumatism classification criteria for RA. These criteria were designed to assess

patients for suitability for inclusion in research and not for making a clinical diagnosis,

however they are often used in clinical settings to support a diagnosis of RA (15). In addition

to the joint involvement, patients with RA can have so-called extra-articular manifestations

ranging from vasculitis to interstitial lung disease. Long-term consequences of untreated RA

can be equally severe with development of amyloidosis or lymphoma (14).

18

2010 American College of Rheumatology/European League Against

Rheumatism classification criteria for rheumatoid arthritis

Score

Joint involvement

• 1 large joint

• 2–10 large joints

• 1–3 small joints

• 4–10 small joints

• >10 joints (at least 1 small joint)

0

1

2

3

5

Serology

• Negative RF and negative ACPA

• Low-positive RF or low-positive ACPA

• High-positive RF or high-positive ACPA

0

2

3

Acute-phase reactants

• Normal CRP and normal ESR

• Abnormal CRP or abnormal ESR

0

1

Duration of symptoms

• <6 weeks

• ≥6 weeks

0

1

• A total score of >=6 is needed to classify a patient as having definite RA

Figure 1: ACR/EULAR 2010 Rheumatoid Arthritis Classification Criteria

19

2.1.4 Management

The aim of the management of RA is to suppress inflammation, both systemic and synovial,

in order to prevent long-term joint damage, thus reducing morbidity and mortality. This

suppression of inflammatory activity is achieved by the use of a heterogeneous group of drugs

collectively known as Disease Modifying Antirheumatic Drugs (DMARDs). While these drugs

exert their effect by different mechanisms, the common thread that joins them together is

the fact that they all suppress disease activity and joint damage. The goal of using DMARD

treatment is to achieve remission, which is done by a process known as treat-to-target (16).

A treat-to-target approach identifies a target (usually remission) which needs to be achieved

by tailoring treatment at every individual consultation. There have been 3 broad approaches

used to tailor DMARD therapy namely: Step up combination therapy, initial combination

therapy and sequential monotherapy (17). Sequential monotherapy has largely been

abandoned in favour of the former 2 approaches, either of which can be used depending on

the severity of the patient’s symptoms. There are several clinical measures of disease activity

which can be used to assess whether patient is in remission or has low, moderate or high

disease activity. The six measures of disease activity that have been endorsed by the

American College of Rheumatology include: Patient Activity Scale (PAS) or PASII (range 0–10),

Routine Assessment of Patient Index Data 3 (RAPID3) (range 0–10), Clinical Disease Activity

Index (CDAI) (range 0–76.0), Disease Activity Score (DAS) 28 erythrocyte sedimentation rate

(ESR) (range 0–9.4) and Simplified Disease Activity Index (SDAI) (range 0–86.0). According to

the 2015 ACR/EULAR task force, remission is defined “as a tender joint count, swollen joint

count, C-reactive protein level (mg/dl), and patient global assessment of ≤1 each or a

Simplified DAS of ≤3.3” (18). If this target has not been achieved by the patient then DMARD

therapy should typically be escalated so as to achieve remission.

20

2.1.5 DMARDs

There are 3 broad categories of DMARDs. These are conventional synthetic DMARDs

(csDMARDs), biological DMARDs (bDMARDs) and targeted synthetic DMARDs (tsDMARD).

Conventional synthetic DMARDs are the oldest of the above 3 and are universally accepted

as first-line agents in treating newly diagnosed patients with RA (18, 19). In those patients in

whom adequate disease control or remission is not achieved with (combination) csDMARDs

then either bDMARD or tsDMARD will need to be added to the treatment regimen (Figure 1).

Figure 2 - 2019 update of the EULAR RA management recommendations in form of an

algorithm

21

While the number of csDMARDs has remained static for a number of years, the number of

b/tsDMARDs available for RA has rapidly increased over the past few years. This has been

mirrored by the rise in total healthcare costs associated with DMARD therapy. By 2014 the

actual costs of bDMARDs to the pharmaceutical benefits scheme were more than double that

of what had been originally estimated (20). Despite there being an improvement in outcomes

for RA patients, medication adherence rates, especially with csDMARDs, have been poor with

some studies showing full adherence in as few as 30% of patients (21,22). Evidence is

emerging that some patients are progressing to b/tsDMARDs without using csDMARDs as

prior or co-therapy, in contrast to guidelines and typical regulatory rules (23). Anecdotal

evidence suggests that one of the factors driving this are the patient perceptions which

appear to be strongly positive for the b/tsDMARDs as compared to the csDMARDs.

Patient concordance with medications is associated with improved outcomes in RA (24,25).

One of the biggest factors affecting concordance is the patient’s personal belief about the

disease and medications (26). Studies have shown that in order to improve adherence with

DMARDs, clinicians should focus less on provision of medical information and be more aware

of patients’ beliefs (27). Understanding patient beliefs however is difficult and often relies on

qualitative studies. While these are excellent at providing an in-depth thematic analysis of a

specific issue, but they are traditionally conducted on a small scale and might not be

representative of a diverse population set. A novel method of obtaining vast amounts of

patient originated content is by analysing comments made on social media. As more and

more industries are turning to analysing crowd sourced data generated on social media to

better understand their customer base, we looked at the possibility of using this data to

understand patient sentiment towards DMARDs. Our hypothesis was that patient sentiment

22

is positively skewed in favour of the b/tsDMARDs and negatively so towards the csDMARDs.

This hypothesis was generated from collective clinical observations of the investigating

cohort, and further corroborated by discussions with other experienced rheumatologists who

agreed that the hypothesis reflected their clinical interactions. While it is possible that

manifestations of RA itself could be a cause of varied sentiment amongst patients, much of

the variability in RA patients lies in the treatment response and tolerability of medications.

This, along with the demographic heterogeneity of the patients across the world with RA, I

believe, are a far greater driver of the varied perceptions than disease specific characteristics.

2.2 Social Media and Sentiment Analysis

On 4th October 1957 the Soviet Union launched Sputnik 1, the world’s first artificial satellite,

into space. This started the so-called “space race” and prompted the United States of America

(USA) to develop the U.S. Department of Defense’s Advanced Research Projects Agency

(ARPA). A consequence of this project was the creation of a network of computers with

remote logins. Little did the original creators of this network know that they were laying the

foundations of what would later become known as “the internet” (28). The internet’s original

iteration (so called Web 1.0 or Semantic web) was designed primarily to be a read-only,

unidirectional source of information from a few to the many. The bursting of the “dot-com

bubble” amongst other factors in the early 2000s however lead to a re-evaluation and

emergence of the next generation of the world wide web, so called “Web 2.0” or “Social Web.

Web 2.0 is an umbrella term for new easy to use services, such as blogs and social media, that

began to be provided on the internet in the mid-2000s and allowed users to generate their

own content (29).

23

The development of Web 2.0 has allowed the internet to become a more interactive platform

for its users, thus allowing social media to flourish (30). Social media is defined as “a group of

Internet-based applications that build on the ideological and technological foundations of

Web 2.0, and that allow the creation and exchange of user generated content” (31). By 2007

six percent of the internet’s population was on social media, a number that would nearly

double by 2011 to 11% (1.2 billion) (32). As of 2015, 76% of all American adults use social

media (33). Some of the most popular social media platforms include Facebook, YouTube,

WhatsApp, Twitter, Tik Tok and Reddit to name a few. The growth of social media has been

unprecedented with no evidence of it slowing down. In parallel with its reach has been a

tremendous rise in the power of social media to shape public opinion and, in extreme cases,

result in collective actions. This has been demonstrated on several occasions in the past few

years from the 2011 Arab Spring which spread across twenty countries, the Occupy

movements, the 2013 Brazil protests, Brexit and the American presidential election of 2016

(34-36). The powerful impact social media can have on users and their friends and family is

being explored by more and more industries to gain insights into their user base and

consequently drive change (37,38).

2.2.1 Big Data

While social connectivity was the primary objective of social media, one of the most lucrative

by products of social media has been user generated data. This has indeed become the cash

cow for the majority of the social media empires such as Facebook, Twitter and Google. This

vast repository of data, which collectively is known as “Big Data”, is not just limited to textual

content but can also include videos, movies and sounds amongst other types. Broadly

speaking however it falls into two categories, structured or unstructured data. While the

24

former describes data that is allocated to predefined fields, the latter has no recognisable

order to it (39).

While there is no universal definition of Big Data, it is generally accepted as data that is too

big to be handled and analysed by traditional database protocols (40). It was initially defined

by having three characteristics (called the 3 Vs) of volume, velocity and variety (41). There is

no consensus on the volume at which data becomes big data. It has been reported that

Facebook and Twitter alone are generating 50 gigabytes of data per day, a value that triples

every 3-5 years (42). Variety refers to the heterogeneity of the data, which, as stated above,

is either structured (~5% of all big data) or unstructured. Velocity refers to the rate of data

generation and analyses (43). As the dimensions of big data have become clearer, more Vs

have been added to the list, with the two most commonly recognised ones being “veracity”

which highlights the large amount of noise that often gets collected with big data, and

“value”, which is perhaps the most important aspect of big data and refers to the usefulness

of the data being obtained (44).

25

Figure 3: The 5 Vs of Big Data

Big data is useless unless it can be used to drive meaningful change. This need to understand

big data in order to draw meaningful conclusions has resulted in the development of Big Data

Analytics. Big Data Analytics deals with complex data that is often unstructured and uses a

variety of tools ranging from, artificial intelligence, machine learning and statistical modelling

to detect patterns within this unstructured data which can then be used to gain insights and

change practice (45). While there are minor variations depending on the source and the need

for the analysis, broadly speaking, the process of deriving meaningful insights from Big Data

occurs over five broad steps, which are expressed in the diagram below(46):

Figure 4: Steps involved in big data analysis

BIG DATA

VERACITY VARIETY

VALUE VOLUME

VELOCITY

Data acquisition

Information

extraction and

cleaning

Data integration,

aggregation and

representation

Data modelling

and analysis

Interpretation

26

2.2.2 Social Media Analytics

When big data that has been generated by social media is analysed, this process is termed

Social Media Analytics. It is defined as “an emerging interdisciplinary research field that aims

on combining, extending, and adapting methods for analysis of social media data” (47).

Broadly speaking there are two types of social media analytics: content based analytics –

which focus on the unstructured content posted by users on social media to derive insights,

and structure-based analytics which looks at the structure of a social network and extracts

information based on the relationship between the users (43). Very few studies have been

done into developing a standardised approach towards social media analytics, with no “gold

standard” approach having been established (48). A widely recognised approach was

proposed by Fan et al, and is known as the “CUP” framework. This stands for “capture,”

“understand,” and “present”. As the names suggest, the capture stage refers to the

acquisition of data and also involves processing this data so it is easily readable by the

algorithms. Understand refers to the actual analytic stage which could involve various

methodologies to conduct classification or predictive modelling. The present stage deals with

the presentation, interpretation and application of the analysed data (49).

2.2.3 Data Capture

Data acquisition is the first step in social media analytics and is concerned with

acquiring/collecting data. Researchers can either narrow the source of the data to particular

social media platforms, or, thanks to the open access nature of most social media, it is also

possible to obtain all publicly available social media content across various platforms. This has

become increasingly possible due to the growth of public Application Programming Interfaces

(APIs). An API “is a way for two computer applications to talk to each other over a network

27

(predominantly the Internet) using a common language that they both understand” (50).

Historically APIs predate the onset of Web 2.0 and have been present as “private APIs” across

various technological companies (51). However, the development of Web 2.0 lead to the

parallel development of so called “open APIs” which are available for public use. Open APIs

for companies such as Twitter and Google have resulted in public access to massive amounts

of data, which can then be collated for analysis. The information is then collated into a corpus

or dataset which can then be cleaned and analysed. While APIs are the most common method

used to obtain large quantities of data online, another popular method involves the use of

web crawlers. A web crawler is “is a system for the bulk downloading of web pages” (52). This

typically starts with a list of web addresses which are “crawled” by the program, any

information found is stored in a pre-specified repository, and any new web addresses

detected during this crawl are subsequently visited and the process of data finding and

storage is repeated. Each system has its own merits, and which one gets used will depend on

various factors including the reason for data collection and the resources (including APIs)

available.

2.2.4 Preprocessing

The data captured from social media will typically have both structured and unstructured

components. Structured data, as stated above, will have pre assigned categories (e.g. user

information, demographics etc), unstructured data however will be devoid of most such

identifiable categories. The majority of the data available on the internet is likely to be

unstructured. This data will then need to undergo a process of text preprocessing. The aim of

preprocessing is to make the data more readable for the analytical software without

28

impacting on the information that it provides. This is done in several steps including cleaning,

normalisation, transformation and data reduction which then yield a “cleaned” dataset that

can be analysed (53). Some of the most common types of text preprocessing are tokenization,

stop-word removal, lowercase conversion, and stemming (54). This is not always a linear

process, with different steps being done at different times, depending on the type of data

being analysed.

Tokenization is one of the most common forms of text preprocessing. It involves dividing the

corpus of text into subcategories, which could be words, phrases or other meaningful

elements, which are called tokens (55). Usual practice is to combine words together, this is

called “n-grams”, where n represents the number of words being combined together, thus

resulting in unigrams, bigrams, trigrams etc. This has been shown to improve text

classification (56). Tokenisation of social media data is considerably more difficult due to the

widespread use of slang, abbreviations and emoticons (61). Stop words are words that are

used frequently in a language yet carry no inherent meaning (such as pronouns and

prepositions). Exclusion of these stop words, by a text preprocessing technique called “stop

word removal”, prior to data analysis has shown to reduce problems encountered in

classifying the text by machine learning algorithms, and not shown to impact the accuracy of

text analysis (62). While there is no universally accepted list of stop-words, most text

preprocessing software usually have a predefined list of terms deemed to be stop-words.

Lowercase conversion, as its name suggests, merely represents the changing of all characters

to lowercase, as generally there is no difference in the meaning of the word, when it is

changed from upper to lowercase. This change however has been shown to improve the

accuracy of the text analysis (54). Stemming refers to the process of getting to the root or

29

stem of each word and to reduce the grammatical variations of the word (59). This typically

involves removing the suffix of various words that share the same “stem”. A more

sophisticated version of this is called lemmatization, which involves determining which words

have the same root despite their structural differences. (60). The usefulness of these methods

(and other less common methods of text preprocessing) have been analysed by various

studies, with most of them concluding that there is no universal fit and that it is more

important to choose the right technique based on the platform and language than to adopt a

one size fits all approach (61-64).

2.2.5 Sentiment Analysis

Once the data corpus has been preprocessed and cleaned, it is ready for the analysis. There

are several different methods of conducting social media content analysis, depending on the

questions being asked. One of the most common types of social media content analysis is to

try and detect the aggregate opinion held towards a particular product, or as is the case in

this study, pharmacotherapy. This is done via a technique known as Sentiment Analysis (SA);

also termed “opinion mining” (65). Sentiment Analysis involves assigning an integer value to

the corpus of text, depending on the sentiment being expressed in that text. Words with

negative sentiment get negative scores and vice versa (66). For example, the term “painful”

might receive a negative score, whereas “beautiful” will usually receive a positive score.

Sentiment Analysis typically occurs in two steps. The first step is known as “subjectivity

classification” which assesses where the sentence is subjective or objective. If the sentence is

objective then no further action will be taken, but if it is subjective then the second part of

the analysis occurs, known as “polarity classification”. This is the step that analyses and

30

assigns a sentiment to the text (67). This can be done at various levels, including the level of

the document, the sentence, phrases or words (65).

Figure 5: Types of Sentiment Analysis

There are three ways by which sentiment analysis can be done, lexicon based (also known as

knowledge based), machine learning (also statistical) or a hybrid of the two (68). The lexicon

based method requires the use or development of a lexicon or collection of words or phrases

with their sentiment polarity mapped and scored. These words are then searched for in the

target document and their scores are aggregated to obtain an overall sentiment score for the

document (69). These lexica can be created manually or via automated means. There are

several pre-existing lexica that are commonly used in conducting SA, such as Subjectivity

Lexicon, General Enquirer and SentiWordNet to name a few (70-72). As sentiment analysis is

highly domain specific, it is important to ensure that the right lexicon is being used when

analysing a particular corpus of data (73). This has been demonstrated in studies that have

tested the accuracy of various lexica across domains and shown that the accuracy depends

Sentiment Analysis

Lexicon Based Machine Learning Hybrid

Supervised Unsupervised

31

more on the appropriateness of selecting the right lexica for the domain, than the lexica itself

(74). For example, the phrase “…a hair raising journey with unexpected twists and turns”

might result in a positive sentiment if being analysed by a lexicon designed to assess movie

reviews, however if the same phrase was assessed by a lexicon designed to assess public

transport, it would result in a negative sentiment.

Machine learning is the process of “programming computers to optimise a performance

criterion using example data” (75). There are several different types of machine learning

algorithms, but they are usually divided into supervised and unsupervised learning algorithms

(76). The basic difference between the two is the while in the former the algorithm “learns”

on a labelled dataset and is then applied to the actual dataset, the latter is run on an

unlabelled dataset which it tries to make sense of. While unsupervised approaches such as

Probabilistic Latent Semantic Analysis and Latent Dirichlet Allocation have been used in

sentiment analysis, their results are often incoherent as the functions of the topics being

detected do not always correlate with human judgements. However, more recent

developments have shown promise in handling large, unstructured datasets. These use

powerful processors which are stacked to resemble the human brain, which is where they get

their name from, “Artificial Neural Networks”. Some of these networks can be hundreds of

layers ‘deep’, hence the name “deep learning”. This is a new and exciting area of machine

learning, in which there is reduced dependency for the need of labelled/structured data, with

the algorithm itself being able to “learn” what is relevant from the unstructured data (77).

Supervised machine learning algorithms (such as Naïve Bayes or Support Vector Machines)

are well suited to sentiment analysis (78). While algorithms built for these supervised learning

models can reach very high levels of accuracy, these are quite domain specific, and using the

32

same algorithm on a different data set (one they were not trained on) can result in a dramatic

drop in this accuracy (79).

One of the first steps of machine learning models is feature selection. A “feature” in the

context of data analytics, is simply an individual, measurable, aspect of the data being

analysed (80). It is important to choose the right features to segregate the data as this allows

for more effective and accurate analysis. If too many features are selected then the task can

be too computationally intensive and difficult, whereas if too few features are selected then

the results might not be accurate. Optimum feature selection needs to meet two basic

qualities, it needs to result in high learning accuracy while at the same time have less

computational overhead. There are several ways by which feature selection can be done,

however it needs to be individualised for the project at hand to ensure accuracy while limiting

costs (81, 82).

Once the features have been selected, these are then used on a subset of the data, called the

training set, to train the algorithm or “classifier”. There are several different types of classifier

algorithms used in machine learning, but they broadly fall into four categories namely linear

classifiers (such as Naïve Bayes), support vector machines, decision trees and Neural networks

(83). The purpose of this classifier is to analyse the labelled training data and find the class of

the output variable with sufficient accuracy.

The accuracy of the classifier can be tested using a variety of metrics, but the most common

ones involve the use of Recall, Precision and their harmonic mean, the F-score. Recall is “the

number of retrieved relevant items as a proportion of all relevant items”, whereas Precision

is “the number of retrieved relevant items as a proportion of the total number of retrieved

33

items” (84). The F-score is a composite average (thus a measure of accuracy) of these two

values, with a range between 0 (worst) to 1 (best) (85).

Once the classifier has demonstrated acceptable accuracy, it can then be used on the dataset

to conduct sentiment analysis. This sentiment analysis algorithm will then be run on the

specific social media platform to understand the overall sentiment being expressed towards

a particular topic, in this case, DMARDs in the context of RA.

2.3 Conclusion

Achievement of remission is the target of treating RA, which is usually done using DMARDs.

While csDMARDs are very cheap and effective, b/tsDMARDs are more likely to reduce

disease activity, however at considerably higher costs. Anecdotally there has been a

significant rise in the number of patients who are demanding to be placed on the newer

agents (b/tsDMARDs), instead of the csDMARDs. A key driver of this may be positive

discussions being held on social media about the b/tsDMARDs, and the negative ones on

csDMARDs. The research question posed was, “What is the aggregate sentiment being

expressed on social media towards the csDMARDs and the b/tsDMARDs?” Prior to

answering this question however, it is important to review the literature to see if sentiment

analysis technology has been used to analyse social media discussions on pharmacotherapy.

A scoping review examining this issue is presented in the next chapter.

34

Chapter 3: Scoping Review - Can sentiment analysis be conducted on social media platforms to understand public sentiment held

towards pharmacotherapy?

Publication

• Sharma C, Whittle S, Haghighi PD, Burstein F, Keen H. Sentiment analysis of social

media posts on pharmacotherapy: A scoping review. Pharmacology Research &

Perspectives. 2020 Oct;8(5):e00640.

3.1 Abstract

Social media is playing an increasingly central role in patient's decision-making process.

Advances in technology have enabled meaningful interpretation of discussions on social

media. A scoping review was conducted to assess whether Sentiment Analysis, a big data

analytic tool, could be used to extract meaningful themes from social media discussions on

pharmacotherapy. A keyword search strategy was used on the following databases:

OneSearch, PubMed, Medline, EMBASE, and Cochrane. One hundred and ninety-four titles

were identified of which 10 studies were included. Themes were then extracted about the

uses and implications of sentiment analysis of social media discussions on pharmacotherapy.

Twitter was the most frequently analysed platform. Assessment of public sentiment about a

particular medication was the most common use of sentiment analysis followed by detection

of adverse drug reactions. Studies also revealed a significant impact of news media on public

sentiment. Implications for real world practice include identifying reasons for a negative

sentiment, detecting adverse drug reactions and using the impact of news media on social

media sentiment to drive public health initiatives. The lack of a consistent approach to

35

sentiment analysis between the studies reflects the lack of a gold standard for the technology

and consequently the need for future research. Sentiment Analysis is a promising technology

that can allow us to better understand patient opinion regarding pharmacotherapy. This

knowledge can be used to improve patient safety, patient- physician interaction, and also

enhance the delivery of public health measures.

3.2 Methods

Due to the novelty of the topic a scoping review methodology was used to summarise all

available information from a variety of sources. The framework outlined by Arksey and

O’Malley was followed (86).

The research question was identified as “Can sentiment analysis be conducted on social media

platforms to understand public sentiment held towards pharmacotherapy?”

Social media is defined as “a group of Internet-based applications that build on the ideological

and technological foundations of Web 2.0, and that allow the creation and exchange of user

generated content” (87). Pharmacotherapy was defined as the use of pharmaceutical drugs

to treat or prevent medical conditions.

Literature published between 2002 (inception of web 2.0) and 2019 was collected from

OneSearch, PubMed, Medline, EMBASE and Cochrane. A keyword search strategy was

employed using the words (Sentiment Analysis OR Opinion mining) AND (Social Media OR

Medication OR Pharmacotherapy OR Drugs OR Pharmaceutical OR Medicine OR Facebook OR

Twitter)`.

Articles were eligible for inclusion in this review if their primary aim was to conduct sentiment

analysis of social media posts regarding pharmacotherapy. Only articles published in English

36

were included in this study. Articles that did not contain original data (e.g. letters to editor,

opinion pieces) were also excluded. Reviews and Meta-analyses were excluded but manually

searched for potential studies.

From all the included studies, information was collected on the following aspects on a

predesigned template: authorship, year and journal published, social media platform(s)

mined, medical condition(s), pharmacotherapy, type of sentiment analysis used, outcomes

generated and potential use in clinical settings as described in the study.

3.3 Results

Our search strategy revealed 194 articles, 95 of which were excluded after title and abstract

review for not meeting inclusion criteria. Of the remaining 99, 89 were excluded as they were

not analysing at least one of the required topics of pharmacotherapy, medicine or social

media. A total of 10 studies were finally included (Figure 1) (90-99).

37

*SA – Sentiment Analysis; *SM – Social Media

Figure 6 – Study flow diagram

All the studies found were published after 2013. Eight of the ten included studies performed

data mining on a single forum. Twitter was the most common platform mined (50%). The

majority of the studies aimed to understand the sentiment being expressed towards a

particular treatment, some of them also used this to explore other avenues such as adverse

drug reaction detection, the role of new media in influencing social media sentiment and the

sentiment dynamics on social media forums (Table 1).

194 articles identified from literature search

10 studies included in final review

99 articles included for full text review

95 articles excluded after title and abstract review

Excluded: 45 – Not on pharmacotherapy 25 – Not studies 11 – Not medical 6 – Not on SA 1 – Not on SM 1 – Under embargo

38

Table 1: Summary of studies

Authors Title, Journal

and year

Data Source

And Quality

Assessment

(QA)

Type of

sentiment

analysis And

Data pre-

processing

Outcome of

interest

Result Significance

Ramagopalan et

al93

Using Twitter to

investigate

opinions about

multiple

sclerosis

treatments: a

descriptive,

exploratory

study

Twitter

QA not stated

LB - Hu & Liu's

opinion lexicon

Data pre-

processing - Yes

The Sentiment

Score (mean and

summed) for

each treatment

Overall positive

sentiment

scores for all

drugs apart from

Novantrone and

Tysabri.

Oral treatments had

the highest mean

summed scores which

showing that patients

prefer oral

medications as

opposed to injections.

39

F1000Research.

2014

Portier et al95 Understanding

Topics and

Sentiment in an

Online Cancer

Survivor

Community

Journal of the

National Cancer

Institute

Monographs.

2013

Cancer survivors

network

QA not stated

ML using

Adaboost

classifier

Data pre-

processing – Not

explicitly stated

Does the

sentiment of the

person making a

post change

with regards to

responses

received for that

post?

Thread about

treatment side

effects had the

lowest initial

sentiment score,

but also the

greatest shift in

sentiment

(towards

positive).

Treatment and side

effect related posts

are usually highly

negative but are

associated with the

most shift in

sentiment polarity,

thus showing the

positive support that

is provided in the

community.

Roccetti et al92 Attitudes of

Crohn’s Disease

Facebook and

twitter

LB using

OpinionFinder

What topic

within Crohn’s

Infliximab (an

antibody used to

This study showed

that a data mining

40

Patients:

Infodemiology

Case Study and

Sentiment

Analysis of

Facebook and

Twitter Posts

Journal of

Medical Internet

Research Public

Health and

Surveillance.

2017

QA:’ Used a

“Honeypot”

approach to

identify social

spammers and

to ensure that

data being

gathered is from

patients.

Data pre-

processing –

Not explicitly

stated

disease

generates that

strongest

sentiment from

patients?

Correlation

between SA and

human scores

treat Crohn’s

disease) was the

most sentiment

related term for

both positive

and negative

sentiment.

High degree of

correlation

between

positive and

negative scores,

less so for

neutral score.

approach provided

material of simple

interpretation,

regardless of the

analysts’ scientific and

professional

background. This

shows that the

analysis of such data

can be completely

automated with

significant accuracy.

41

Du et al94 Leveraging

machine

learning-based

approaches to

assess human

papillomavirus

vaccination

sentiment trends

with Twitter

data

BioMed Central

Medical

Informatics and

Decision

Making. 2017

Twitter

QA not stated

ML using SVM

Data pre-

processing - Yes

Sentiment

towards HPV

vaccination. Also

looked at the

impact of new

media on

sentiment and

change in

sentiment as it

relates to the

day of the week.

35.8% were

“Positive”;

32.1% were

“Neutral”; and

32.0% tweets

were

“Negative”.

Safety was the

biggest factor in

negative tweets.

They also found

that mainstream

media can have

a significant

influence on

This study revealed

the significant impact

of news media articles

on public sentiment, a

fact that can be used

to promote public

health.

42

public opinion

with 66.21%

positive rate on

the day a

favourable news

article was

published

compared to the

previous

positive rate of

35.8%.

Cobb et al96 Sentiment

Analysis to

Determine the

Impact of Online

QuitNet

QA not stated

LB (Salience

Engine 4.1)

Whether

exposure to

positive

messages re:

Registrants who

started or

continued with

varenicline were

While the authors

could not draw

conclusions about

causality, emotional

43

Messages on

Smokers’

Choices to Use

Varenicline

Journal of the

National Cancer

Institute

Monographs.

2013

Data pre-

processing - No

varenicline

resulted in more

people switching

to it and sticking

with it.

exposed to a

statistically

significantly

greater number

of positive-

sentiment

varenicline

messages than

negative-

sentiment

messages.

content of online

communications

about health

behaviour

intervention was

found to be

associated with

decision making

around

pharmaceutical

choices.

Korkontzelos et

al91

Analysis of the

effect of

sentiment

analysis on

DailyStrength

forum and

Twitter

QA not stated

LB, 5 lexica used

- the Hu&Liu

Lexicon of

Opinion Words

Whether the

addition of

sentiment

analysis feature

There was an

increase in pick

up rate of ADRs

for posts taken

This study showed

that sentiment

analysis can be used

44

extracting

adverse drug

reactions from

tweets and

forum posts

Journal of

Biomedical

informatics.

2016

(H&L), the

Subjectivity

Lexicon (SL), the

NRC

Word-Emotion

Association

Lexicon (NRC),

the NRC Hashtag

Sentiment

Lexicon (NRC#),

and the

Sentiment 140

Lexicon (S140)

to ADRMine (a

software already

designed to pick

up ADR

mentions)

would increase

accuracy of

picking up ADRs

from twitter but

not for posts

from

dailystrength.

Of all the lexica

used,

Sentiment140

performed the

best (lexica

generated from

twitter).

to augment ADR

detection rate.

45

Data pre-

processing - Yes

Ebrahimi et al90 Recognition of

side effects as

implicit-opinion

words in drug

reviews

Emerald Insight.

2016

www.drugrating

z.com

QA Not stated

ML using SVM

and a Rule

based version of

lexicon based

Data pre-

processing - Yes

To evaluate if

implicit

sentiment can

be used to

identify drug

side effects from

disease

symptom. These

were tested

against the

manual

annotation of

the same drug

Experimental

results show

that ML

outperforms the

rule-based

algorithm

significantly for

both disease

symptom and

especially side

effect detection

where it was

The main finding was

that drug review side

effect recognition can

be handled by using

the ML algorithm,

which significantly

outperforms the

regular expression-

based algorithm.

46

reviews by a

pharmacist

almost two-fold

better.

Liu et al98

Adverse drug

reaction related

post detection

using sentiment

features

Webmd.com;

Manual

annotation of

posts done

LB -

SentiWordNet

Data pre-

processing - Not

stated

To use

sentiment

features to

detect and

identify if a post

was related to

an ADR. They

compared the

accuracy of

detecting ADRs

using three

approaches; 1.

Using N-gram

This method

was very

efficient in

picking up ADR

related posts.

Compared to

similar studies

(which had use

some of the

methods but not

all three) it had

the highest F-

The addition of

sentiment analysis to

detect ADRs from

social media forums

results in greater

accuracy than seen in

previous methods.

47

and domain

features 2.

Adding

sentiment to the

above, 3. Using

CHI statistic to

select posts with

high correlation

between

sentiment, n-

gram and

domain

features.

measure

(81.4%).

Cabling et al97 Sentiment

Analysis of an

Breastcancer.or

g

LB; Liu’s

dictionary

What is the

sentiment

Most active

users were 80%

Online support groups

allow for stronger ties

48

Online Breast

Cancer Support

Group:

Communicating

about

Tamoxifen.

QA not stated

Date pre-

processing – yes

expressed

towards

Tamoxifen

more positive

than least active

users, while the

least active

users were 48%

more negative

than the most

active ones.

to be created around

a specific sentiment,

with less connection

from those with

dissimilar sentiments

to the dominant

group.

Zhang et al99 Utilizing twitter

data for analysis

of

chemotherapy

Twitter

QA not stated

LB – using

TextBlob

Data pre-

processing – Not

explicitly stated

To assess and

compare

perceptions

about

chemotherapy

of patients and

health-care

Individuals are

more likely to

post emotional

tweets about

side effects than

organisations

Twitter data can be

used to understand

behavioural patterns

associated with

treatments for cancer

and for understanding

how individuals and

49

providers

through analysis

of chemo-

related tweets.

organisations

communicate about

health care concerns

and discovering

cancer patients’

needs, which could

aid in developing

personalised therapy.

Abbreviations: SA – Sentiment Analysis; ML – Machine Learning; LB – Lexicon Based; QA – Quality Assessment

50

3.3.1 Sentiment analysis techniques and accuracy

Seven of the studies used a lexicon based approach, two used machine learning and one used

both methods. Most of the studies used a different lexicon for their analysis, with none of

them being specifically geared for medical terminology. The studies that used machine

learning algorithms also utilised different algorithms, namely AdaBoost Classifier in one and

Support Vector Machine in the other two. Both these are types of machine learning

algorithms that allow stratification of data into different categories. While AdaBoost does this

by sequentially weighting the results of weak classifiers to form a strong classifier, Support

Vector Machine finds the ideal margin to separate the dataset into desired categories (88,

89).

The study by Ebrahimi et al was the only one that compared machine learning techniques to

lexicon based and also against manually classified sentiment. They used Support Vector

Machine to create a machine learning based algorithm and compared that to a lexicon based

algorithm. The machine learning algorithm outperformed the lexicon based algorithm on

both the primary (identifying forum posts mentioning drug side effects) and secondary

objectives (identifying posts mentioning disease symptoms) (90).

Data pre-processing was employed by five of the studies (90,91,93,94,97). The methods used

by the studies varied, with tokenisation (breaking sentences into small word groups or

phrases that are more easily read by a program) being the most common. The other studies

did not explicitly state whether they conducted data pre-processing, and if so then what

techniques were used.

The study by Roccetti et al compared the performance of its lexical sentiment analysis

technique to that of manual (human) coding of sentiment and found that there was a high

51

degree of correlation for the extremes of sentiment (positive and negative), and less so for

the neutral sentiments (92). Du et al conducted a manual analysis of a small corpus of tweets

classified by their machine learning algorithm and found the overall accuracy to be acceptable

(94).

3.3.2 Sentiment analysis use

The most common application of sentiment analysis (seven studies) was to analyse opinion

regarding a particular medication (92-94,96,97,99). Six of these used lexicon based

approaches and one used machine learning. While majority of these studies directly analysed

the cumulative polarity of the posts for each medication, the study by Roccetti et al reversed

the process to analyse which therapy generated the strongest sentiment (positive or

negative).

The next most common application of sentiment analysis (three studies) was to identify

adverse drug reactions (ADR) from social media chatter (90,91,98). The studies differed in

both the platforms that they mined and the approach to sentiment analysis. Ebrahimi et al

mined an online forum (www.drugratingz.com) using both machine learning and lexicon

based algorithms to assess whether sentiment expressed in forum posts can be used to

identify drug side effects from disease symptoms. Korkontzelos et al mined forums and

tweets using five different lexicon based methods to assess whether the addition of a

sentiment analysis feature to a pre-existing adverse drug reaction detection algorithm would

improve its efficacy. Liu et al mined www.webmd.com, specifically reviewing diabetic

medication forums. Their aim was to see if the addition of sentiment analysis to pre-existing

ADR detection algorithms would enhance detection. All three studies provided evidence that

sentiment analysis can be used to detect ADR mentions from social media posts.

52

One study also explored the interaction between news media and social media through the

lens of sentiment. Du et al analysed the impact of sentiment towards Human Papilloma Virus

vaccination, as expressed by tweets, before and after publication of a positive New York Times

article (94). While the average number of tweets (positive, negative and neutral) pertaining

to the topic was 1245 per day, the immediate period after publication of a New York Times

article on HPV saw this number jump to 16,000 with the proportion of positive sentiment

tweets rising from 35% to 66%. This was a remarkable demonstration of the impact of real-

world events on social media sentiment.

Three studies analysed the sentiment dynamics in cancer forums (95-97). The study by Portier

et al looked at how the sentiment expressed by users in each thread influences the sentiment

of the person who started the thread. They were able to show that discussions especially

about pain and chemotherapy side effects typically started with a negative sentiment but

gradually underwent a positive sentiment shift, reflecting the power of community support

in improving sentiment (95). The study by Cabling et al looked at the sentiment of the posters

in a breast cancer forum on tamoxifen and found that the most active posters were more

likely to have a positive sentiment than those who posted less frequently (97). The study by

Cobb et al was the only one to assess the direct impact of sentiment on compliance. After

adjusting for variables they found that as the exposure to positive messages about varenicline

increase, so did the odds (odds ratio = 2.05, 95% confidence interval = 1.66 to 2.54) of the use

starting and continuing with the medication in an attempt to quit smoking (96).

3.4 Discussion

This scoping review shows that sentiment analysis can be used to gauge public perceptions

regarding pharmacotherapy as expressed on social media. The most common application that

53

emerged was of using sentiment analysis to assess patient opinion regarding

pharmacotherapy. While there was some consistency with regards to the platform being

mined (Twitter being the most common), there was no consistent “gold standard” approach

used by the authors to conduct SA. This likely reflects the fact that sentiment analysis is still

in its early stages of development, with various methods currently being explored in order to

establish a standard (100).

Lexicon based approaches were more popular than machine learning based approaches,

especially when the aim was to detect sentiment towards a particular treatment, with all of

them being successful in detecting the sentiment expressed. The accuracy of this sentiment,

as judged by a manual review, however, was infrequently done (92,94). Roccetti et al

conducted a manual analysis of a small corpus of tweets to judge the accuracy of their SA.

This analysis was conducted by medical specialist and a software engineer who individually

reviewed the posts and assigned a sentiment to each one. It was interesting to note that while

the agreement between the two manual observers was good (kappa 0.647) it was not perfect,

thus showing that even amongst human reviewers there can be disagreement about the

underlying sentiment of the text being analysed. While their algorithm had adequate accuracy

in detecting positive and negative sentiment, it was more likely to classify those posts with

less obvious sentiment as neutral. It appears that sentiment analysis might be unable to

detect the polarity of posts with subtle sentiment and tends to classify them as neutral. This

is a reassuring finding for two reasons, firstly, it would be better to classify a post with subtle

positive or negative emotion as neutral than the opposite category (as was seen with the

human reviewers where the computer scientist assigned more posts as either positive or

negative than the gastroenterologist), thus highlighting that sentiment analysis can negate

54

some of the inherent experiential biases that come with human sentiment coding. Secondly,

posts that describe significant adverse drug reactions are unlikely to have subtle emotion,

thus more likely to be picked up by SA.

Three studies applied sentiment analysis to improve the detection of adverse drug reactions,

an important cause of morbidity and mortality (101). While some adverse drug reactions are

detected during clinical trials, a large number only become obvious during the post marketing

surveillance phase (102). There were significant differences between the studies in terms of

both the platforms being mined (DailyStrength forum and Twitter, www.drugratingz.com and

webmd.com) and the technique used (lexicon based by two and both machine learning and

lexicon based by the other). The study by Korkontzelos et al added different types of lexicon-

based sentiment analysis to an existing adverse drug reaction detection program (ADRMine

– an algorithm-based software designed to detect adverse drug reaction mentions in social

media posts) to assess whether identification of negative sentiment would increase the

detection rate. While ADRMine is designed to be highly sensitive, the addition of sentiment

analysis slightly improved the rate of detection of ADRs. The most successful lexica employed

in this analysis were developed from Twitter, reinforcing the knowledge that sentiment

analysis is highly domain specific (103). A similar study was conducted by Liu et al who added

sentiment analysis to pre-existing adverse drug reactions detection processes such as N-gram

and domain features and demonstrated that this resulted in increased detection of adverse

drug reactions. In contrast, the study by Ebrahimi et al applied both lexicon based and

machine learning sentiment analysis directly to the mined data and successfully detected

adverse drug reactions from the forum posts. This was the only study that compared machine

learning to lexicon based algorithms, using manual review of the adverse drug reactions

55

identified. While machine learning based approaches were superior at picking up adverse

drug reaction mentions and detection of disease effects, the authors concluded that both

approaches were promising and that in future perhaps a hybrid of the two could be used for

even more accuracy (90).

Another potential application of sentiment analysis is understanding the interaction between

news media and social media through the sentiment expressed. The study by Du et al showed

the remarkable (positive) impact a (positive) news media publication can have on social media

sentiment, thus demonstrating its potential use in public health. This is an exciting area

deserving of further analysis as the relationship between News media and social media would

provide a powerful tool to help promote and assess the efficacy of public health initiatives,

especially relevant in the current pandemic.

Perhaps more important is the potential impact of social media sentiment on real-world

behaviour. This has already been demonstrated in other fields such as the film industry and

stock market, with positive sentiment resulting in positive box-office and market returns

(104,105). Thus, the question arises whether social media sentiment might influence

individual decisions related to pharmacotherapy. This concept was evaluated by Cobb et al

who used sentiment analysis to evaluate the impact of online messages on a smoker’s

decision to use a particular medication (varenicline) to help them quit smoking (96). They

analysed smokers who posted information about their pharmacotherapy use on QuitNet, a

forum for smokers. It showed that smokers who were exposed to greater amount of positive

sentiment posts about varenicline were more likely to start and continue to use varenicline

in an effort to quit smoking. While the authors refrained from drawing concrete conclusions

on causality of sentiment on medication preference and compliance, the results certainly

56

warrant further scrutiny with targeted studies. Cabling et al also looked at the sentiment

dynamics on medical forums (specifically Tamoxifen related posts on Breastcancer.org) and

found that the most active posters were much more likely to express positive sentiment, thus

perhaps explaining the positive sentiment that persistent users from Cobb et al study were

exposed to.

The specifics of negative sentiment associated with certain medications and side effects

suggests sentiment analysis could be used to identify specific issues which could be addressed

by individual clinicians with their patients, to allay their fears and improve adherence. This

was demonstrated in the study by Ramagopalan et al on Multiple Sclerosis medications. This

study revealed that patients preferred oral medications to injections and were more

concerned about some side effects (e.g. infections) than others. Similarly, the study by Zhang

et al was also able to demonstrate user sentiment towards specific side effects of

chemotherapy, showing some side effects generate less negative sentiment (“nausea”, “hair

loss”) as opposed to others (“Fatigue”, “neuropathy”), which generated much more negative

sentiment. This knowledge can be used by clinicians and pharmacists to better target

medication related counselling, thus potentially improving adherence.

While this review does provide preliminary evidence that sentiment analysis can be used to

understand mass opinion about pharmacotherapy, several questions remain about the

overall process and the technique of sentiment analysis. There was significant heterogeneity

between the studies at several stages of the analytic process, especially at the key stage of

conducting the analysis but also at the earlier stage of data pre-processing and the

subsequent stage of accuracy analysis. These different approaches are however not specific

to sentiment analysis of medical texts and reflect the ongoing development and evolution of

57

the technology itself (106). There is presently no universally accepted gold standard

approach. Current evidence suggests that the choice of method may be domain-specific

(depend on the condition/therapy being analysed, the platform being mined and the outcome

that is sought). The few studies that have compared the different approaches have generally

failed to establish a gold standard, with each approach having its own set of advantages and

disadvantages (107,108).

As the technology is further refined, standardisation of methodology and the establishment

of healthcare specific sentiment analysis methods (either machine learning algorithms or a

medical-sentiment lexicon) may facilitate the development of further validity regarding the

application of this technology to the health care sector (109,110).

This review has a few limitations. Sentiment analysis is dependent on the domain or topic

being studied, thus the lack of validated lexica or machine learning algorithms of conducting

sentiment analysis specific to the field of healthcare meant that there was significant

heterogeneity in the studies, which limited comparison and developing concrete conclusions.

Our inclusion criteria were intentionally specific, thereby limiting the focus of sentiment

analysis just to the realm of social media and pharmacotherapy, however there are other

applications of sentiment analysis in the field of healthcare including (but not limited to)

mining opinions regarding healthcare received, determining clinical outcomes and

understanding emotions of being unwell (110).

3.5 Conclusion

This scoping review provides an overview of current evidence on the multifaceted

applicability of sentiment analysis. While the most obvious utilisation is in the assessment of

public sentiment about particular medications, the fact that sentiment analysis is also being

58

used for other tasks such as adverse drug reaction detection is a promising glimpse into the

hitherto untapped potential of this technology. The heterogeneity of approach to sentiment

analysis across the studies reflects the rapid pace at which this technology continues to

evolve. While it has already found use in the fields of commerce and marketing, its current

state of clinical equipoise may be resolved if a universally agreed standardised approach is

established. This will have far reaching consequences across various domains of healthcare,

including but not limited to patient safety and public health initiatives.

59

Chapter 4: Mining social media data to investigate patient perceptions regarding DMARD therapy

Publication:

• Sharma C, Whittle S, Haghighi PD, Burstein F, Sa'adon R, Keen HI. Mining social

media data to investigate patient perceptions regarding DMARD pharmacotherapy

for rheumatoid arthritis. Annals of the Rheumatic Diseases. 2020 Sep 3.

4.1 Abstract

Objectives The hypothesis of this study is that patients have a positive sentiment regarding

b/tsDMARDs and a negative sentiment towards csDMARDs. To investigate this, all available

discussions on social media platforms regarding DMARDs in the context of rheumatoid

arthritis were analysed to understand the collective sentiment expressed towards these

medications.

Methods Treato analytics were used to download all available posts on social media about

DMARDs in the context of RA. Strict filters ensured that user generated content was

downloaded. The sentiment (positive or negative) expressed in these posts was analysed for

each DMARD using sentiment analysis. The reason(s) for this sentiment for each DMARD was

also analysed, looking specifically at efficacy and side effects.

Results Computer algorithms analysed millions of social media posts and included 28261

posts on b/tsDMARDs and 26841 posts on csDMARDs. Both classes had an overall positive

sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs (1.210) than for

csDMARDs (1.048). Efficacy was the most commonly mentioned reason in posts with a

positive sentiment and lack of efficacy was the most commonly mentioned reason for a

60

negative sentiment. These were followed by the presence/absence of side effects in negative

or positive posts, respectively.

Conclusions Public opinion on social media is generally positive about DMARDs. Lack of

efficacy followed by side effects were the most common themes in posts with a negative

sentiment. There are clear reasons why a DMARD generates a positive or negative sentiment,

as the sentiment analysis technology becomes more refined, targeted studies could be done

to analyse these reasons and allow clinicians to tailor DMARDs to match patient needs.

4.2 Methods

The services of the web analytics firm Treato were utilised to collect the data. The Treato

platform automatically identifies, collects and analyses publicly available user-generated

content on health-related topics from over 10,000 sources. These sources include the publicly

available data on social networks such as Facebook and Twitter, discussion forums and blogs.

Over three billion posts were analysed from these sources. The data are then analysed using

a patented algorithm that applies natural language processing to this content to identify

medical concepts mentioned in text, and extract patients’ self-reported descriptions of their

experiences with various health conditions and medications. These medical experiences were

then mapped on to formal concepts in a medical ontology. Treato’s algorithms combine

various medical ontologies including those used by the Food and Drug Administration for

coding. This process includes resolving conceptual synonyms of medical terms (e.g., ‘fatigue’

and ‘tired’ were assigned the same concept code); resolution of patient-specific phrases (e.g.,

“pain in my joints” and “my joints hurt”) to medical terms; word-sense disambiguation

algorithms (e.g., “BP” could refer to bi-polar disorder, blood pressure, or a bisphosphonate

61

medication); and medication synonyms (e.g., generic and brand names for the same

medication).

The data used in this study was limited to posts written in the English language. The unit of

analysis for this study was an individual post. In order for a post to be included in the final

analysis it needed to be user generated content mentioning at least one of the thirteen

current DMARDs (methotrexate, leflunomide, sulfasalazine, hydroxychloroquine,

adalimumab, etanercept, certolizumab, golimumab, tocilizumab, tofacitinib, rituximab,

abatacept and infliximab) in the context of RA.

Included posts were then subject to Treato’s sentiment analysis algorithms for further

categorisation into posts with positive or negative sentiment. The two most common reasons

for a positive post were DMARD efficacy and lack of side effects. Conversely, the most

common reasons for a negative post were lack of efficacy and side effects. Therefore, the

positive and negative tagging is not mutually exclusive since a post may contain both positive

and negative experiences about the same medication. Treato also compiled data on the most

common concerns that were frequently listed by patients on various DMARDs. These data

were then provided to us for interpretation.

The overall sentiment for each DMARD was expressed as the ratio of the positive to negative

posts for that DMARD. A ratio greater than one indicated an overall positive sentiment.

Demographic information was collected where available.

While the algorithms were able to assign sentiment and extract information regarding efficacy

and side effects for all the DMARDs, the final numbers were not available for

hydroxychloroquine and abatacept, which were then manually extracted. In order to ensure

62

that the results were valid for hydroxychloroquine and abatacept, this process of manual

extraction was repeated for all the other DMARDs. There were negligible differences (0-3%)

between the algorithm and manual extraction across the categories of the DMARDs which

likely reflect the difference in dates when the data was provided by Treato’s algorithms and

when it was manually extracted (additional posts on social media). This difference was not

felt to be large enough to have a significant impact on the overall interpretation of the results.

A comparison of proportions analysis was conducted to assess whether there was a difference

in the proportion of positive sentiment posts between the various b/ts DMARDs (111)

4.2.1 Statistics

Cohen's kappa coefficient was used to assess inter-rater agreement between Treato and

manual assessment of sentiment. A chi squared test was conducted to compare positive

sentiment for efficacy across b/tsDMARDs and concerns raised by patients on both csDMARDs

and b/tsDMARDs. Statistical significance was assumed at p < 0.05.

4.3 Ethics

Ethics approval was obtained from Human Research Ethics Committee at Monash University

and the University of Western Australia.

4.4 Results

Treato collected data prospectively from July 2017 till October 2018, and also analysed

available data retrospectively. A total of 28261 posts on b/tsDMARDs and 26841 posts on

csDMARDs were collected, with some overlap. The individual breakdown of the DMARDs and

the positive and negative posts is shown in Table 2. Treato’s algorithms identified majority

(89.6% and 88.8% respectively) of the posts on b/tsDMARDs and csDMARDs as being written

by patients. As a validation exercise, 200 posts were manually assessed and assigned a

63

sentiment. This was compared with the sentiment assigned by Treato’s algorithms for these

posts. Agreement between sentiment assessed by machine and human was moderate

(csDMARDs ƙ= 0.49 and b/tsDMARDs ƙ= 0.52) (112). We considered kappa values of 0.49 and

0.52 as ‘moderate agreement’ based on well-established parameters, which are widely used

in clinical research (113). While a high kappa is desirable in clinical studies, the same standards

cannot be applied when gauging online sentiment. Indeed, the kappa in such studies is

moderate at best even between human reviewers, thus achieving a moderate kappa between

human and algorithm is a strength of the study (114).

Table 2: Aggregate sentiment

b/tsDMARD Number of posts Percent Ratio P/N

Etanercept positive 5210 18.4 1.35

Etanercept negative 3852 13.6

Infliximab positive 2636 9.3 1.1

Infliximab negative 2405 8.5

Adalimumab positive 4419 15.6 1.08

Adalimumab negative 4107 14.5

Certolizumab positive 461 1.6 1.11

Certolizumab negative 415 1.5

Golimumab positive 306 1.1 1.26

64

Golimumab negative 243 .9

Tocilizumab positive 384 1.4 1.40

Tocilizumab negative 274 1.0

Abatacept positive 774 2.7 1.16

Abatacept negative 694 2.5

Tofacitinib positive 346 1.2 1.71

Tofacitinib negative 202 .7

Rituximab positive 918 3.2 1.49

Rituximab negative 615 2.2

csDMARD Number of posts Percent Ratio P/N

MTX positive 9058 33.7 0.995

MTX Negative 9103 33.9

HCQ Positive 3026 11.3 1.26

HCQ Negative 2398 8.9

SZS Positive 803 3.0 0.97

SZS Negative 827 3.1

LEF Positive 849 3.2 1.09

LEF Negative 777 2.9

65

4.4.1 B/tsDMARDs

Content about b/tsDMARDs was collected from 497 publicly available forums. The greatest

proportion (7,969/28261 posts) were obtained from Facebook. The 10 most popular social

media platforms used to publish these posts are shown in Table 3. Geolocation data was

available on 1837 posts which identified users from 34 countries. Majority of the posts

(95.4%) were from USA (1349), UK (162), Canada (155), Australia (55) and Mexico (15).

Table 3 Social media platforms

b/tsDMARDs Number of posts Percent Cumulative Percent

facebook.com 7969 28.2 28.2

inspire.com 3032 10.7 38.9

healingwell.com 1738 6.1 45.1

dailystrength.org 1735 6.1 51.2

community.arthritis.org 1551 5.5 56.7

reddit.com 1297 4.6 61.3

healthunlocked.com 1057 3.7 65.0

remedyspot.com 902 3.2 68.2

crohnsforum.com 795 2.8 71.0

arthritiscareforum.org.uk 540 1.9 72.9

66

csDMARDs Number of posts Percent Cumulative Percent

facebook.com 6638 24.7 24.7

healthunlocked.com 2184 8.1 32.9

dailystrength.org 1956 7.3 40.2

inspire.com 1689 6.3 46.4

community.arthritis.org 1318 4.9 51.4

reddit.com 1100 4.1 55.5

remedyspot.com 1088 4.1 59.5

arthritiscareforum.org.uk 1003 3.7 63.2

healingwell.com 879 3.3 66.5

psoriasis-help.org.uk 648 2.4 68.9

The ratio of total positive to negative posts was 1.21, thus indicating an overall positive

sentiment. Each of the b/tsDMARDs had a greater number of positive than negative posts.

Efficacy was the most common theme identified within posts assigned a positive sentiment

(>80% of positive posts), followed by lack of side effects (13% of positive posts) (Table 4).

Comparing b/tsDMARDs to each other in terms of the proportion of patients who posted a

positive post due to efficacy, revealed etanercept as being the most popular by having a

significantly superior difference in proportion to three other b/tsDMARDs (rituximab,

infliximab and tofacitinib), (Table 5). While it could be argued that those bDMARDs that had

a higher number of posts mentioning lack of side effects would be deemed to be less

67

efficacious than those that received more posts for efficacy, as the analysis looked at

proportion of positive posts for efficacy, the the impact of lack of side effects related posts

was negated. Additionally if a bDMARD is generating substantial positive sentiment simply for

lack of side effects as compared to efficacy, then its lower efficacy percentage is likely

justified.

While lack of efficacy was also the most common theme in posts with a negative sentiment,

side effect concerns were a more prominent cause of negative sentiment posts than lack of

side effects were for positive sentiment posts (Table 4).

Table 4: b/tsDMARD positive and negative sentiment for efficacy and side effects

b/tsDMARD Efficacy

posts

Total

positive

posts

Percentage Posts

stating

"no side

effects"

Percentage

Infliximab 2239 2636 84.94 308 11.68

Abatacept 666 774 86.04 109 14.08

Adalimumab 3769 4419 85.29 616 13.94

Certolizumab 383 461 83.08 76 16.49

Golimumab 264 306 86.27 48 15.69

Rituximab 777 918 84.64 143 15.58

Tocilizumab 335 384 87.24 51 13.28

68

Tofacitinib 281 346 81.21 70 20.23

Etanercept 4536 5210 87.06 610 11.71

b/tsDMARD Lack of

efficacy

posts

Total

negative

posts

Percentage Side

effects

posts

Percentage

Infliximab 1265 2405 52.60 983 40.87

Abatacept 437 694 62.97 259 37.32

Adalimumab 2500 4107 60.87 1429 34.79

Certolizumab 249 415 60 163 39.28

Golimumab 187 243 76.95 53 21.81

Rituximab 347 615 56.42 243 39.51

Tocilizumab 143 274 52.19 132 48.18

Tofacitinib 102 202 50.5 102 50.50

Etanercept 2344 3852 60.85 1387 36.00

69

70

Table 5: Comparison of proportion of positive sentiment for efficacy amongst b/tsDMARDs

Abatacept

86.04%

Adalimuma

b 85.29%

Certolizuma

b 83.08%

Etanercept

87.06%

Golimumab

86.27%

Infliximab

84.94%

Rituximab

84.64%

Tocilizumab

87.24%

Tofacitinib

81.21%

Abatacept

86.04%

NA DUP DUP DUP DUP DUP DUP DUP DUP

Adalimuma

b 85.29%

0.75% NA DUP DUP DUP DUP DUP DUP DUP

Certolizuma

b 83.08%

2.96% 2.21% NA DUP DUP DUP DUP DUP DUP

71

Etanercept

87.06%

1.02% *1.77% *3.98% NA DUP DUP DUP DUP DUP

Golimumab

86.27%

0.23% 0.98% 3.19% 0.79% NA DUP DUP DUP DUP

Infliximab

84.94%

1.10% 0.35% 1.86% *2.12% 1.34% NA DUP DUP DUP

Rituximab

84.64%

1.40% 0.65% 1.56% *2.42% 1.63% 0.30% NA DUP DUP

72

Tocilizumab

87.24%

1.20% 1.95% 4.16% 0.18% 0.97% 2.30% 2.60% NA DUP

Tofacitinib

81.21%

*4.83% *4.08% 1.87% *5.85% 5.06% 3.73% 3.43% *6.03% NA

NA – Not Applicable; DUP – duplicate value; * - p < 0.05

73

The most common concerns raised by patients who wrote a negative post on b/tsDMARDs

are depicted in Table 7. Joint pain was the most common but the next three reasons for a

negative sentiment were due to side effects (“Rash”, ‘Nausea” and “Itching”). Infections were

also a prominent reason for a negative sentiment, with four of the top 20 reasons being

occupied by infectious causes (“fever”, “pneumonia”, “common cold” and “sinus infections”).

4.4.2 CsDMARDs

Posts about csDMARDs were collected from 515 social media sites. Ten websites contributed

69% (18,503) of all the posts (table 3). Geolocation was only available for 5% (1441) of the

posts. Among these, however, 36 countries were represented. The majority of the posts

(93.3%) came from USA (904), UK (174), Canada (142), Australia (90) and New Zealand (35).

The ratio of total positive to negative posts was 1.048, indicating an overall positive

sentiment. The individual ratios revealed a negative sentiment for sulfasalazine (0.97) and

methotrexate (0.995), and positive for leflunomide (1.09) and hydroxychloroquine (1.26)

(Table 2).

Efficacy was the most common theme in posts with a positive sentiment for all the csDMARDs

(Table 6). While lack of efficacy was the most common theme in posts with a negative

sentiment, its overall share was lower than what was seen in posts with a positive sentiment.

Approximately half of the negative posts regarding methotrexate discussed either lack of

efficacy (50.08%) or side effects (44.94%). For hydroxychloroquine and sulfasalazine, a higher

proportion of negative posts discussed lack of efficacy (56.42% and 53.81% respectively)

versus side effects (40.28% and 31.68% respectively). Leflunomide saw a slightly larger share

of negative sentiment posts discussing side effects (18.15%), with discussions on lack of

efficacy accounting for 16.86% of the negative sentiment posts. The lower percentage of

74

positive and negative posts for leflunomide was raised with the Treato engineers. They stated

that it was likely that the discussions that were being had for leflunomide were not specific

enough for either side effects or lack of efficacy for them to be appropriately categorised by

the algorithms. It is possible that the remaining discussions were still on these two topics, but

the way in which it was worded resulted in them not being placed in these two categories by

the algorithm. Of the patients who gave methotrexate an overall negative sentiment, 7.18%

still felt that it was effective, these numbers were lower for sulfasalazine (4.96%) and

leflunomide (3.2%). (Table 6)

Table 6: Positive/Negative sentiment csDMARDs reasons

Positive sentiment csDMARDs reasons

csDMARDs Efficacy

Total

posts Percentage

Lack of

side

effects

Total

posts Percentage

Methotrexate 7364 9058 81.30 1762 9058 19.45

Hydroxychloroquine 2621 3026 86.62 439 3026 14.5

Leflunomide 215 849 25.32 63 849 7.42

Sulfasalazine 611 803 76.10 135 803 16.81

Negative sentiment for csDMARD

csDMARDs

Lack of

efficacy

Total

posts Percentage

Side

effects

Total

posts Percentage

75

Methotrexate 4559 9103 50.08 4091 9103 44.94

Hydroxychloroquine 1353 2398 56.42 966 2398 40.28

Leflunomide 131 777 16.86 141 777 18.15

Sulfasalazine 445 827 53.81 262 827 31.68

The most common concerns associated with a negative sentiment are shown Table 7.

“Nausea” was the most common, closely followed by “Joint pain”. The remainder of the list

was strongly populated with side effect mentions including “Hair loss” “allergy” “Rash” and

“stomach problems”.

Table 7: Concerns: percentage of posts with a negative sentiment

Concern b/tsdmard

(%)

Csdmard

(%)

difference 95% CI

lower

limit

95% CI

upper

limit

p value

Joint Pain 13.86 10.96 2.90 2.10 3.71 0.0001

Itching 2.73 1.75 0.98 0.62 1.35 0.0001

Rash 3.29 2.60 0.69 0.27 1.10 0.0011

Cancer 2.44 1.80 0.64 0.29 0.99 0.0003

Weight

Gain

2.45 2.10 0.35 -0.01 0.73 0.05

76

Common

Cold

1.31 1.12 0.19 0.08 0.46 0.15

Migraines 1.42 1.31 0.11 0.17 0.40 0.4421

Muscle Pain 1.13 1.06 0.066 0.18 0.32 0.6122

Fever 1.51 1.46 0.05 0.24 0.35 0.7302

Weight Loss 1.07 1.52 -0.45 0.17 0.73 0.0014

Hair Loss 2.27 6.85 -4.58 4.08 5.09 0.0001

Nausea 2.99 11.25 -8.26 7.64 8.88 0.0001

4.4.3 B/tsDMARDs vs csDMARDs

More patients on b/tsDMARDs were significantly more likely to positively post due to efficacy

(85.74 %) as compared to csDMARDs (78.71 %), difference of 7.03 % (95% CI 6.15 % to 7.91

%; p < 0.0001). However, patients on csDMARDs were significantly more likely to post a

positive comment due to lack of side effects (17.47 %) as opposed to those on b/tsDMARDs

(13.14 %), difference of 4.33 % (95% CI 3.5 % to 5.16 %; p < 0.0001).

Concerns about medications were broadly similar in posts about either csDMARDs or

b/tsDMARDs (Table 7). However, posts about b/tsDMARDs were significantly more likely to

contain descriptions of joint pain, drug reactions (rash and itching) and cancer, whereas posts

about csDMARDs contained more descriptions of weight loss, hair loss and nausea. Posts on

csDMARDs were more likely to be on gastrointestinal issues such as “stomach problems”,

“diarrhoea” and “vomiting”. Allergic reactions to the medications were also a common reason

77

for negative sentiment with csDMARDs, particularly sulfasalazine (10.1% of all negative posts,

vs 3.66% for all other csDMARDs). Infections were mentioned more frequently in posts on

b/tsDMARDs (10.54% vs 5.76%; p < 0.0001). Among the b/tsDMARDs, shingles was more

frequently mentioned in association with tofacitinib than the other b/tsDMARDs combined

(5.4% vs 0.7% of negative posts; p < 0.0001).

4.5 Discussion

Our study supports our hypothesis that the collective sentiment was skewed positively in

favour of the b/tsDMARDs over the csDMARDs. While all the b/tsDMARDs had a positive

sentiment, this was only true for hydroxychloroquine and leflunomide amongst the

csDMARDs.

Efficacy and side effects were found to be the most commonly discussed topics in posts with

positive and negative sentiment. These findings mirror those of a recent study that

investigated the reasons for bDMARD discontinuation in RA patients and found that lack of

efficacy followed by side effects as the two biggest factors (115). The ratio of positive to

negative posts for b/tsDMARDs ranged from 1.71 for tofacitinib to 1.08 for adalimumab.

Tofacitinib had 81.21% of its positive posts discussing efficacy, this was lower than the other

b/tsDMARDs and methotrexate. However, tofacitinib also had the highest percentage of

positive posts discussing lack of side effects (20.23%) which contributed to its overall high

ratio of positive to negative posts. However, side effects were also the most common theme

in posts with a negative sentiment towards tofacitinib with 50% of negative posts describing

side effects, the highest across both the categories of DMARDs. The literature regarding side

effects with tofacitinib however does not reveal any unexpected findings (116-118).

78

Tofacitinib had the least number of posts (548) across both categories of DMARDs, which

likely played a role in the occurrence of such diverse results.

All the b/tsDMARDs had at least 80% of their positive posts discussing efficacy. While

etanercept had significantly higher posts commenting positively due to efficacy than some of

the other b/tsDMARDs, the absolute difference in proportions was small and unlikely to be

clinically meaningful. It is interesting to note that the three b/tsDMARDs that had a lower

proportion of efficacy posts than etanercept (rituximab, tofacitinib and infliximab) all had a

different mechanism of action to one another and a different mode of administration. This

comparison also highlights a powerful potential use of sentiment analysis technology. Despite

the ever-increasing number of b/tsDMARDs, there are few head to head trials that directly

compare these agents. The use of sentiment analysis provides us with a large scale, real-world

summary measure of effectiveness and tolerability that acts as an (in)direct comparison.

While methotrexate did have over 80% of its positive posts discussing efficacy, only marginally

below the b/tsDMARDs, it still generated an overall negative sentiment ratio due to the high

incidence of posts mentioning side effects. Almost half of the negative posts against

methotrexate discussed side effects, which was one of the highest across both the categories

of DMARDs. Our study demonstrates that majority of patients find methotrexate to be

efficacious yet have assigned it a negative sentiment primarily due to gastrointestinal side

effects. While clinical trial data have shown that less than 10% of patients stop methotrexate

due to side effects, longer term studies however have demonstrated that over a third of the

patients who take methotrexate for more than two years will discontinue the medication

(119,120). Sulfasalazine also had a high percentage of patients posting about side effects, with

allergic reactions being the frequently mentioned, however the percentage of positive posts

79

discussing efficacy were lower than that of methotrexate or the b/tsDMARDs. It was a

combination of poor (perceived) efficacy along with side effect concerns that generated the

overall negative sentiment for sulfasalazine. Trials that have previously compared

sulfasalazine to methotrexate have demonstrated comparable efficacy and side effects

(121,122).

One of the most common concerns raised by patients on b/tsDMARDs were injection site

reactions. Studies have shown that patients have a strong preference for orally administered

medications over injectables and this likely contributed towards the reduced side effect

related sentiment (123). Frequency of administration might also explain the relatively fewer

negative posts due to side effects for golimumab which has a monthly dosing interval.

Evidence suggests that patients with RA prefer a monthly frequency of drug administration

and while other drugs such as infliximab, tocilizumab and rituximab have similar or longer

frequency of administration, their intravenous route of administration is known to be less

desired by patients (124).

The most common concerns raised by patients on csDMARDs were hair loss, gastrointestinal

issues and allergic reactions. Shingles was a higher cause of negativity in patients on

tofacitinib than on the other b/tsDMARDs, which mirrors the findings in the studies (125).

More patients posted a positive comment for b/tsDMARDs regarding efficacy than for

csDMARDs, this was demonstrated in a network meta-analysis, which showed that 16% more

patients on biologic/DMARD combination achieved an American College of Rheumatology 50

(ACR50) response than those on csDMARDs (126).

80

The most important limitations of this study are reflective of the nascent state of the

technology. The first being the quality of the data. This study is unlike the typical qualitative

analysis studies which obtain responses from patients by direct questioning. Our study

downloaded free flowing conversations across the entirety of the internet on specific topics.

While this allowed us to capture patient sentiment in its more pure form, unbiased by the

confines of surveys and questionnaires, it comes at the cost of accuracy. Despite using strict

filters, without conducting a manual analysis of the 3 billion posts it is impossible to know

how relevant the information contained within the post is to the topic being studied.

Secondly, sentiment analysis itself is evolving with no current gold standard approach. There

are various methods by which sentiment analysis can be conducted, with each having certain

advantages and disadvantages and none providing an absolute guarantee of accuracy. Due to

these issues, it would not be surprising to have similar studies produce different results based

on the platforms being analysed (as some allow patients to post large amounts of information

and others, like Twitter, only allow small amounts, thus influencing the accuracy of the

algorithms) and the technique used to conduct sentiment analysis. Posts made in languages

other than English were also excluded as sentiment analysis is not as well developed for other

languages. Therefore, the results of this study might not be applicable to countries where

English is not the primary language.

81

Chapter 5: Conclusion

5.1 Research Contribution

Rheumatoid arthritis is a chronic, incurable and debilitating condition that requires strict

adherence to pharmacotherapy in order to achieve remission and improve quality of life.

Patient concordance with prescribed medications has historically been poor. While high

quality randomised control trials are useful for detecting efficacy, they are typically not

designed to understand patient emotions and individual experiences, which often play a

bigger role in determining long-term concordance with medications. Qualitative analysis are

better suited to tackling such questions but are typically done on a much smaller scale thus

might not be able to capture the wide range of reasons and emotions expressed by patients.

The availability of social media as a vast repository of unfiltered, free-flowing patient

discussions that are unencumbered by the confines of rigid questionnaires, along with the

rapid improvement in big data analytic technology, has given us a unique opportunity to bring

these two together to better understand this important yet elusive question of patient

sentiment and beliefs about their medications.

This study is the first of its kind to conduct a sentiment analysis of all available social media

posts generated by RA patients for all available DMARDs (as of the commencement of the

study). Our study has been able to capture unprompted sentiment as directly expressed by

the patient. The sentiment was positive for all the b/tsDMARDs with efficacy being the

primary driver of this, followed by lack of side effects. Methotrexate and sulfasalazine had an

overall negative sentiment, and descriptions of side effects were particularly common for

methotrexate. While csDMARDs are typically first line agents, majority of patients with RA

will be in remission on these agents alone. Thus, it is counter intuitive to some extent for the

82

sentiment towards these agents to be less positive than the b/tsDMARDs, which are typically

reserved for refractory cases of RA. Analysing the role that the position each DMARD holds in

the hierarchy of RA management, and its impact on patient sentiment can be done in future

studies.

We identified efficacy and side effects as the major points of discussion on various social

media, thus chose to focus the discussion on them. However, it is certainly possible that there

were other aspects of DMARDs that were being discussed (such as affordability, availability

and restrictions imposed by local regulatory requirements to name a few) and future studies

could certainly assess these to obtain a broader understanding of patient perceptions

regarding these medications.

5.2 Future Directions

While the field of data analytics has certainly improved over the last decade, the progress has

been overshadowed by the meteoric rise in the quantity of data being produced on a daily

basis. The ability to analyse this data is further hampered by the global variation in linguistics

and semantics, which in the absence of machine readable labelling (a human intensive task),

will unlikely yield meaningful results. However rapid advances are occurring in the field of big

data analysis, especially with the development of artificial neural networks and deep learning,

which allow the computer programs to self-learn and improve with each iteration without the

need for human input. While the potential for this technology to rapidly capture and interpret

broad-spectrum patient sentiment towards medications is unimaginable, a universally

accepted “gold standard” approach will need to be established prior to wider acceptance of

this technology in the medical field.

83

Once established, we foresee sentiment analysis as being a valuable addition to existing

qualitative methods, which allow for a more nuanced assessment than is currently possible

with sentiment analysis. This complementary approach will generate novel insights and

improve various aspects of patient-physician interaction, from shared decision-making

regarding DMARD selection, to patient adherence, thus improving patient care.

84

References 1. Cross M, Smith E, Hoy D, Carmona L, Wolfe F, Vos T, et al. The global burden of

rheumatoid arthritis: estimates from the global burden of disease 2010 study. Ann

Rheum Dis. 2014;73(7):1316-22.

2. Garrod AE. A treatise on rheumatism and rheumatoid arthritis: Griffin; 1890.

3. Landré-Beauvais AJ. The first description of rheumatoid arthritis. Unabridged text of

the doctoral dissertation presented in 1800. Joint Bone Spine. 2001;68(2):130-43.

4. Tan EM, Smolen JS. Historical observations contributing insights on etiopathogenesis

of rheumatoid arthritis and role of rheumatoid factor. J Exp Med.

2016;213(10):1937-50.

5. Yarwood A, Huizinga TW, Worthington J. The genetics of rheumatoid arthritis: risk

and protection in different stages of the evolution of RA. Rheumatology. 2016 Feb

1;55(2):199-209.

6. Stahl EA, Wegmann D, Trynka G, Gutierrez-Achury J, Do R, Voight BF, Kraft P, Chen R,

Kallberg HJ, Kurreeman FA, Kathiresan S. Bayesian inference analyses of the

polygenic architecture of rheumatoid arthritis. Nature genetics. 2012 May;44(5):483-

9.

7. Okada Y, Wu D, Trynka G, Raj T, Terao C, Ikari K, Kochi Y, Ohmura K, Suzuki A,

Yoshida S, Graham RR. Genetics of rheumatoid arthritis contributes to biology and

drug discovery. Nature. 2014 Feb;506(7488):376-81.

8. Viatte S, Barton A. Genetics of rheumatoid arthritis susceptibility, severity, and

treatment response. InSeminars in immunopathology 2017 Jun 1 (Vol. 39, No. 4, pp.

395-408). Springer Berlin Heidelberg.

85

9. Firestein GS, McInnes IB. Immunopathogenesis of Rheumatoid Arthritis. Immunity

(Cambridge, Mass). 2017;46(2):183-96. McInnes IB, Schett G. The pathogenesis of

rheumatoid arthritis. New England Journal of Medicine. 2011 Dec 8;365(23):2205

10. McInnes IB, Schett G. The pathogenesis of rheumatoid arthritis. New England Journal

of Medicine. 2011 Dec 8;365(23):2205-19.

11. Scher JU, Littman DR, Abramson SB. Microbiome in inflammatory arthritis and

human rheumatic diseases. Arthritis & rheumatology (Hoboken, NJ). 2016

Jan;68(1):35.

12. Derksen VF, Huizinga TW, van der Woude D. The role of autoantibodies in the

pathophysiology of rheumatoid arthritis. InSeminars in immunopathology 2017 Jun 1

(Vol. 39, No. 4, pp. 437-446). Springer Berlin Heidelberg.

13. Gravallese EM, Monach PA. Pathogenesis and pathology of rheumatoid arthritis.

Seventh Edition ed2019. p. 811-31.

14. Smolen JSP, Aletaha DMD, McInnes IBP. Rheumatoid arthritis. Lancet, The. 2016.

15. Aletaha D, Neogi T, Silman AJ, Funovits J, Felson DT, Bingham III CO, Birnbaum NS,

Burmester GR, Bykerk VP, Cohen MD, Combe B. 2010 rheumatoid arthritis

classification criteria: an American College of Rheumatology/European League

Against Rheumatism collaborative initiative. Arthritis & rheumatism. 2010

Sep;62(9):2569-81.

16. Grigor C, Capell H, Stirling A, McMahon AD, Lock P, Vallance R, Porter D, Kincaid W.

Effect of a treatment strategy of tight control for rheumatoid arthritis (the TICORA

study): a single-blind randomised controlled trial. The Lancet. 2004 Jul

17;364(9430):263-9.

86

17. Goekoop-Ruiterman YD, de Vries-Bouwstra JK, Allaart CF, Van Zeben D, Kerstens PJ,

Hazes JM, Zwinderman AH, Ronday HK, Han KH, Westedt ML, Gerards AH. Clinical

and radiographic outcomes of four different treatment strategies in patients with

early rheumatoid arthritis (the BeSt study): a randomized, controlled trial. Arthritis &

Rheumatism. 2005 Nov;52(11):3381-90.

18. Singh JA, Saag KG, Bridges Jr SL, Akl EA, Bannuru RR, Sullivan MC, Vaysbrot E,

McNaughton C, Osani M, Shmerling RH, Curtis JR. 2015 American College of

Rheumatology guideline for the treatment of rheumatoid arthritis. Arthritis &

rheumatology. 2016 Jan;68(1):1-26.

19. Smolen JS, Landewé R, Bijlsma JW, Burmester GR, Dougados M, Kerschbaumer A,

McInnes IB, Sepriano A, Van Vollenhoven RF, De Wit M, Aletaha D. EULAR

recommendations for the management of rheumatoid arthritis with synthetic and

biological disease-modifying antirheumatic drugs: 2019 update. Annals of the

rheumatic diseases. 2020 Jan 22.

20. Hopkins AM, Proudman SM, Vitry AI, Sorich MJ, Cleland LG, Wiese MD. Ten years of

publicly funded biological disease-modifying antirheumatic drugs in Australia. The

Medical journal of Australia. 2016 Feb;204(2):64-8.

21. Salt E, Frazier SK. Adherence to disease-modifying antirheumatic drugs in patients

with rheumatoid arthritis: a narrative review of the literature. Orthopedic nursing.

2010;29(4):260-75.

22. Blum MA, Koo D, Doshi JA. Measurement and rates of persistence with and adherence

to biologics for rheumatoid arthritis: a systematic review. Clinical therapeutics.

2011;33(7):901-13.

87

23. https://www.pbs.gov.au/pbs/industry/listing/participants/public-release-docs/2016-

02/bdmards-for-psoriatic-arthritis-2016-02

24. DiMatteo MR, Giordani PJ, Lepper HS, Croghan TW. Patient adherence and medical

treatment outcomes: a meta-analysis. Medical care. 2002;40(9):794-811.

25. Wabe N, Lee A, Wechalekar M, McWilliams L, Proudman S, Wiese M. Adherence to

combination DMARD therapy and treatment outcomes in rheumatoid arthritis: a

longitudinal study of new and existing DMARD users. Rheumatology international.

2017;37(6):897-904.

26. Gagnon MD, Waltermaurer E, Martin A, Friedenson C, Gayle E, Hauser DL. Patient

Beliefs Have a Greater Impact Than Barriers on Medication Adherence in a Community

Health Center. Journal of the American Board of Family Medicine : JABFM.

2017;30(3):331-6.

27. Wong PK. Medication adherence in patients with rheumatoid arthritis: why do

patients not take what we prescribe? Rheumatology international. 2016;36(11):1535-

42.

28. The Internet and American Business. Aspray W, Ceruzzi PE, editors: The MIT Press;

2008.

29. Anderson P. Web 2.0 and Beyond: Principles and Technologies: Chapman &

Hall/CRC; 2012.

30. Murugesan S. Understanding Web 2.0. IT Professional Magazine. 2007;9(4):34-41.

31. Kaplan AM, Haenlein M. Users of the world, unite! The challenges and opportunities

of Social Media. Business Horizons. 2010;53(1):59-68.

88

32. Kaun A. Jose van Dijck: Culture of Connectivity: A Critical History of Social Media.

Oxford: Oxford University Press. 2013. MedieKultur: Journal of media and

communication research. 2014;30:3.

33. Andrew Perrin. “Social Networking Usage: 2005-2015.” Pew Research Center. October

2015.

34. Wilson MI, Corey KE. The role of ICT in Arab spring movements. Netcom. 2014;2012-

2(26):343-56.

35. Hall W, Tinati R, Jennings W. From Brexit to Trump: Social Media’s Role in

Democracy. Computer. 2018;51(1):18-27.

36. Margetts H, John P, Hale S, Yasseri T. Political Turbulence, How Social Media Shape

Collective Action: Princeton University Press; 2016.

37. Bond RM, Fariss CJ, Jones JJ, Kramer AD, Marlow C, Settle JE, et al. A 61-million-person

experiment in social influence and political mobilization. Nature. 2012;489(7415):295-

8.

38. Kurniawati K, Shanks GG, Bekmamedova N, editors. The Business Impact Of Social

Media Analytics. ECIS; 2013.

39. Baars H, Kemper H-G. Management Support with Structured and Unstructured Data -

An Integrated Business Intelligence Framework. Information Systems Management

25(2):132-148.DOI: 10.1080/10580530801941058. IS Management. 2008;25:132-48.

40. Mateosian R. Ethics of Big Data. IEEE Micro. 2013;33(2):60-1.

41. Doug Laney, “3D Data Management: Controlling Data Volume, Velocity, and Variety”,

Gartner, file No. 949. 6 February 2001,

89

http://blogs.gartner.com/douglaney/files/2012/01/ad949-3D-Data-Management-

ControllingData-Volume-Velocity-and-Variety.pdf.

42. Kaisler S, Armour F, Espinosa JA, Money W. Big Data: Issues and Challenges Moving

Forward. IEEE; 2013. p. 995-1004.

43. Gandomi A, Haider M. Beyond the hype: Big data concepts, methods, and analytics.

International journal of information management. 2015;35(2):137-44.

44. Erl T, Khattak W, Buhler P. Big Data Fundamentals: Concepts, Drivers &

Techniques: Prentice Hall Press; 2016. PP 41

45. Ghavami P. Big Data Analytics Methods: Analytics Techniques in Data Mining, Deep

Learning and Natural Language Processing2019.

46. Parimala K, Rajkumar G, Ruba A, Vijayalakshmi S. Challenges and Opportunities with

Big Data. International Journal of Scientific Research in Computer Science and

Engineering. 2017;5:16-20.

47. Zeng DD, Chen H-c, Lusch R, Li S-H. Social Media Analytics and Intelligence. Intelligent

Systems, IEEE. 2011;25:13-6.

48. Stieglitz S, Mirbabaie M, Ross B, Neuberger C. Social media analytics – Challenges in

topic discovery, data collection, and data preparation. International Journal of

Information Management. 2018;39:156-68.

49. Fan W, Gordon M. The power of social media analytics. Communications of the ACM.

2014;57(6):74-81.

50. Jacobson D, Brail G, Woods D. APIs: A Strategy Guide: O'Reilly Media, Inc.; 2011.

51. Qiu Y. The openness of Open Application Programming Interfaces. Information,

Communication & Society. 2017;20(11):1720-36.

52. Olston C, Najork M. Web Crawling. Found Trends Inf Retr. 2010;4(3):175–246.

90

53. García S, Luengo J, Herrera F. Tutorial on Practical Tips of the Most Influential Data

Preprocessing Algorithms in Data Mining. Knowledge-Based Systems. 2015;98.

54. Uysal AK, Gunal S. The impact of preprocessing on text classification. Information

Processing & Management. 2014;50:104-12.

55. Gurusamy V, Kannan S. Preprocessing Techniques for Text Mining2014.

56. Raskutti B, Ferrá H, Kowalczyk A, editors. Second Order Features for Maximising Text

Classification Performance. Machine Learning: ECML 2001; 2001 2001//; Berlin,

Heidelberg: Springer Berlin Heidelberg.

57. Kolchyna O, Souza TTP, Treleaven P, Aste T. Twitter Sentiment Analysis: Lexicon

Method, Machine Learning Method and Their Combination. 2015.

58. Sriyanong W, Moungmingsuk N, Khamphakdee N. A Text Preprocessing Framework

for Text Mining on Big Data Infrastructure2018. 169-73 p.

59. Jivani A. A Comparative Study of Stemming Algorithms. Int J Comp Tech Appl.

2011;2:1930-8.

60. Jurafsky D, Martin J. Speech and Language Processing: An Introduction to Natural

Language Processing, Computational Linguistics, and Speech Recognition2008.

61. Méndez, J. R., Iglesias, E. L., Fdez-Riverola, F., Díaz, F., & Corchado, J. M. (2006).

Tokenising, stemming and stopword removal on anti-spam filtering domain.

Proceedings of the 11th spanish association conference on current topics in artificial

intelligence. Springer-Verlag: Santiago de Compostela, Spain

62. Pomikálek, J., & Rehurek, R. (2007). The Influence of preprocessing parameters on text

categorization. International Journal of Applied Science, Engineering and Technology,

4, 430–434.

91

63. Song, F. X., Liu, S. H., & Yang, J. Y. (2005). A comparative study on text representation

schemes in text categorization. Pattern Analysis and Applications, 8, 199–209.

64. Toman, M., Tesar, R., & Jezek, K. (2006). Influence of word normalization on text

classification. In Proceedings of the 1st international conference on multidisciplinary

information sciences & technologies (Vol. 2, pp. 354–358). Merida, Spain.

65. Pang B, Lee L. Opinion Mining and Sentiment Analysis. Foundations and Trends® in

Information Retrieval. 2008;2(1–2):1-135.

66. Bali R. Learning Social Media Analytics with R. Sarkar D, Sharma T, editors.

Birmingham: Birmingham : Packt Publishing; 2017.

67. Pozzi F, Fersini E, Messina V, liu b. Sentiment Analysis in Social Networks2016. Chapter

1.

68. Cambria E, Das D, Bandyopadhyay S, Feraco A. A Practical Guide to Sentiment Analysis

2017. PP 1-10

69. Taboada M, Brooke J, Tofiloski M, Voll K, Stede M. Lexicon-Based Methods for

Sentiment Analysis. Computational Linguistics. 2011;37(2):267-307.

70. Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in

phrase-level sentiment analysis. In Proceedings of the Conference on Human

Language Technology and Empirical Methods in Natural Language Processing, HLT ’05,

pages 347–354, Stroudsburg, PA, USA, 2005. Association for Computational

Linguistics.

71. Philip Stone. General inquirer. http://www.wjh.harvard.edu/~inquirer/, last accessed:

25/08/2020.

92

72. Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. Sentiwordnet 3.0: An

enhanced lexical resource for sentiment analysis and opinion mining. In Nicoletta

Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan

Odijk, Stelios Piperidis, Mike Rosner, and Daniel Tapias, editors, Proceedings of the

Seventh International Conference on Language Resources and Evaluation (LREC’10),

Valletta, Malta, may 2010. European Language Resources Association (ELRA).

73. K. Denecke. Are sentiwordnet scores suited for multi-domain sentiment classification?

In Digital Information Management, 2009. ICDIM 2009. Fourth International

Conference on, pages 1–6, Nov 2009.

74. B. Ohana, B. Tierney, and S. Delany. Domain independent sentiment classification with

many lexicons. In Advanced Information Networking and Applications (WAINA), 2011

IEEE Workshops of International Conference on, pages 632–637, March 2011.

75. Alpaydin E, Bach F. Introduction to Machine Learning. 3rd ed. ed. Cambridge: MIT

Press; 2014.

76. Ayodele TO. Types of machine learning algorithms. New advances in machine learning.

2010 Feb 1;3:19-48.

77. Schmidhuber J. Deep learning in neural networks: An overview. Neural networks. 2015

Jan 1;61:85-117.

78. Madhoushi Z, Hamdan AR, Zainudin S. Sentiment analysis techniques in recent works.

In2015 Science and Information Conference (SAI) 2015 Jul 28 (pp. 288-291). IEEE.

79. Aue, Anthony and Michael Gamon. 2005. Customizing sentiment classifiers to new

domains: A case study. In Proceedings of the International Conference on Recent

Advances in Natural Language Processing, Borovets, Bulgaria.

80. Bishop C. Pattern Recognition and Machine Learning. 162006. p. 140-55.

93

81. Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: A new perspective.

Neurocomputing. 2018 Jul 26;300:70-9.

82. Deng X, Li Y, Weng J, Zhang J. Feature selection for text classification: A review.

Multimedia Tools and Applications. 2019 Feb 1;78(3):3797-816.

83. Tang J, Alelyani S, Liu H. Feature selection for classification: A review. Data

classification: Algorithms and applications. 2014:37.

84. Buckland M, Gey F. The Relationship between Recall and Precision. Journal of the

American Society for Information Science. 1994;45(1):12.

85. Guns R, Lioma C, Larsen B. The tipping point: F-score as a function of the number of

retrieved items. Information Processing and Management. 2012;48(6):1171-80.

86. Arksey H, O’Malley L. Scoping studies: towards a methodological framework.

International Journal of Social Research Methodology. 2005 Feb 1;8(1):19–32.

87. Kaplan AM, Haenlein M. Users of the world, unite! The challenges and opportunities

of Social Media. Business Horizons. 2010;53(1):59-68.

88. Freund Y, Schapire R. A decision-theoretic generalization of online learning and an

application to boosting. Journal of Computer Sys. Sci. 1997;vol. 55:119–139

89. Devroye, L, Györfi, L. & Lugosi, G. in A Probabilistic Theory of Pattern Recognition.

Stochastic Modelling and Applied Probability 187–213 (Springer, New York, NY, 1996).

90. Ebrahimi M, Yazdavar AH, Salim N, Eltyeb S. Recognition of side effects as implicit-

opinion words in drug reviews. Online Information Review. 2016 Nov 4;40(7):1018–

32.

91. Korkontzelos I, Nikfarjam A, Shardlow M, Sarker A, Ananiadou S, Gonzalez GH. Analysis

of the effect of sentiment analysis on extracting adverse drug reactions from tweets

and forum posts. J Biomed Inform. 2016 Aug;62:148–58.

94

92. Roccetti M, Marfia G, Salomoni P, Prandi C, Zagari RM, Gningaye Kengni FL, et al.

Attitudes of Crohn's Disease Patients: Infodemiology Case Study and Sentiment

Analysis of Facebook and Twitter Posts. JMIR public health and surveillance.

2017;3(3):e51.

93. Ramagopalan S, Wasiak R, Cox AP. Using Twitter to investigate opinions about multiple

sclerosis treatments: a descriptive, exploratory study. F1000Res. 2014;3:216.

94. Du J, Xu J, Song H-Y, Tao C. Leveraging machine learning-based approaches to assess

human papillomavirus vaccination sentiment trends with Twitter data. BMC medical

informatics and decision making. 2017 Jul 5;17(Suppl 2):69–69.

95. Portier K, Greer GE, Rokach L, Ofek N, Wang Y, Biyani P, et al. Understanding topics

and sentiment in an online cancer survivor community. J Natl Cancer Inst Monogr.

2013 Dec;2013(47):195–8.

96. Cobb NK, Mays D, Graham AL. Sentiment analysis to determine the impact of online

messages on smokers’ choices to use varenicline. J Natl Cancer Inst Monogr. 2013

Dec;2013(47):224–30.

97. Cabling ML, Turner JW, Hurtado-de-Mendoza A, Zhang Y, Jiang X, Drago F, et al.

Sentiment Analysis of an Online Breast Cancer Support Group: Communicating about

Tamoxifen. Health communication. 2018;33(9):1158-65.

98. Liu J, Jiang X, Chen Q, Song M, Li J. Adverse Drug Reaction Related Post Detecting Using

Sentiment Feature. Iranian journal of public health. 2018;47(6):861-7.

99. Zhang L, Hall M, Bastola D. Utilizing Twitter data for analysis of chemotherapy.

International journal of medical informatics. 2018;120:92-100.

100. Seerat B, Azam F. Opinion mining: Issues and challenges (a survey).

International Journal of Computer Applications. 2012;49(9).

95

101. Sloane R, Osanlou O, Lewis D, Bollegala D, Maskell S, Pirmohamed M. Social

media and pharmacovigilance: A review of the opportunities and challenges. Br J Clin

Pharmacol. 2015;80(4):910-20.

102. Sultana J, Cutroneo P, Trifiro G. Clinical and economic burden of adverse drug

reactions. J Pharmacol Pharmacother. 2013 Dec;4(Suppl 1):S73-77.

103. Aue A, Gamon M. Customizing Sentiment Classifiers to New Domains: a Case

Study. In: Submitted to RANLP-05, the International Conference on Recent Advances

in Natural Language Processing [Internet]. Borovets, BG; 2005.

104. Yu Y, Duan W, Cao Q. The impact of social and conventional media on firm

equity value: A sentiment analysis approach. Decision Support Systems.

2013;55(4):919-26.

105. Rui H, Liu Y, Whinston A. Whose and what chatter matters? The effect of

tweets on movie sales. Decision Support Systems. 2013;55(4):863-70.

106. Collomb A, Costea C, Joyeux D, Hasan O, Brunie L. A study and comparison of

sentiment analysis methods for reputation evaluation. Rapport de recherche RR-LIRIS-

2014-002. 2014.

107. Devika MD, Sunitha C, Ganesh A. Sentiment Analysis: A Comparative Study on

Different Approaches. Procedia Computer Science. 2016;87:44–9.

108. Gonçalves P, Araújo M, Benevenuto F, Cha M. Comparing and Combining

Sentiment Analysis Methods. 2014.

109. Balahur A, Jacquet G. Sentiment analysis meets social media – Challenges and

solutions of the field in view of the current information sharing context. Information

Processing and Management. 2015;51(4):428–32.

96

110. Denecke K, Deng Y. Sentiment analysis in medical settings: New opportunities

and challenges. Artificial Intelligence In Medicine. 2015;64(1):17–27.

111. The Analysis Of Efficacy Data. In: Cleophas TJ, Zwinderman AH, Cleophas TF,

Cleophas EP, editors. Statistics Applied to Clinical Trials. Dordrecht: Springer

Netherlands; 2009. p. 17-43.

112. Cohen J. A coefficient of agreement for nominal scales. Educational and

Psychological Measurement. 1960;20:37-46.

113. Ebina K, Hashimoto M, Yamamoto W, Hirano T, Hara R, Katayama M, et al. Drug

tolerability and reasons for discontinuation of seven biologics in elderly patients with

rheumatoid arthritis -The ANSWER cohort study. PloS one. 2019;14(5):e0216624.

114. Landis JR, Koch GG . The measurement of observer agreement for categorical

data. Biometrics 1977;33:159– 74.doi:10.2307/2529310pmid:

http://www.ncbi.nlm.nih.gov/pubmed/843571 CrossRefPubMedWeb of

ScienceGoogle Scholar

115. Provoost S, Ruwaard J, van Breda W, et al Validating automated sentiment

analysis of online cognitive behavioral therapy patient Texts: an exploratory study.

Front Psychol 2019;10:1065. doi:10.3389/fpsyg.2019.01065pmid:

http://www.ncbi.nlm.nih.gov/pubmed/31156504

116. Charles-Schoeman C, Burmester G, Nash P, Zerbini CA, Soma K, Kwok K, et al.

Efficacy and safety of tofacitinib following inadequate response to conventional

synthetic or biological disease-modifying antirheumatic drugs. Annals of the

rheumatic diseases. 2016;75(7):1293-301.

117. Cohen SB, Tanaka Y, Mariette X, Curtis JR, Lee EB, Nash P, et al. Long-term

safety of tofacitinib for the treatment of rheumatoid arthritis up to 8.5 years:

97

integrated analysis of data from the global clinical trials. Annals of the rheumatic

diseases. 2017;76(7):1253-62.

118. Kivitz AJ, Cohen S, Keystone E, van Vollenhoven RF, Haraoui B, Kaine J, et al. A

pooled analysis of the safety of tofacitinib as monotherapy or in combination with

background conventional synthetic disease-modifying antirheumatic drugs in a Phase

3 rheumatoid arthritis population. Seminars in arthritis and rheumatism.

2018;48(3):406-15.

119. Katchamart W, Trudeau J, Phumethum V, Bombardier C. Methotrexate

monotherapy versus methotrexate combination therapy with non-biologic disease

modifying anti-rheumatic drugs for rheumatoid arthritis. The Cochrane database of

systematic reviews. 2010(4):Cd008495.

120. Salliot C, van der Heijde D. Long-term safety of methotrexate monotherapy in

patients with rheumatoid arthritis: a systematic literature research. Annals of the

rheumatic diseases. 2009;68(7):1100-4.

121. Dougados M, Combe B, Cantagrel A, Goupille P, Olive P, Schattenkirchner M,

et al. Combination therapy in early rheumatoid arthritis: a randomised, controlled,

double blind 52 week clinical trial of sulphasalazine and methotrexate compared with

the single components. Annals of the rheumatic diseases. 1999;58(4):220-5.

122. Felson DT, Anderson JJ, Meenan RF. Use of short-term efficacy/toxicity

tradeoffs to select second-line drugs in rheumatoid arthritis. A metaanalysis of

published clinical trials. Arthritis and rheumatism. 1992;35(10):1117-25.

123. Stewart KD, Johnston JA, Matza LS, Curtis SE, Havel HA, Sweetana SA, et al.

Preference for pharmaceutical formulation and treatment process attributes. Patient

preference and adherence. 2016;10:1385-99.

98

124. Alten R, Kruger K, Rellecke J, Schiffner-Rohe J, Behmer O, Schiffhorst G, et al.

Examining patient preferences in the treatment of rheumatoid arthritis using a

discrete-choice approach. Patient preference and adherence. 2016;10:2217-28.

125. Winthrop KL, Yamanaka H, Valdez H, Mortensen E, Chew R, Krishnaswami S, et

al. Herpes zoster and tofacitinib therapy in patients with rheumatoid arthritis. Arthritis

& rheumatology (Hoboken, NJ). 2014;66(10):2675-84.

126. Singh JA, Hossain A, Mudano AS, Tanjong Ghogomu E, Suarez-Almazor ME,

Buchbinder R, et al. Biologics or tofacitinib for people with rheumatoid arthritis naive

to methotrexate: a systematic review and network meta-analysis. The Cochrane

database of systematic reviews. 2017;5:Cd012657.

99

Appendix

Ethics approval

100

Documents

Mining Social Media Data to Investigate Patient Perceptions … · ARPA - Advanced Research Projects Agency bDMARDs – Biological Disease Modifying Antirheumatic Drugs csDMARDs