Application of Social Network Analysis in the Estimation of BankFinancial Strength During the Financial CrisisMichelle Morales1, David Guy Brizan2, Hussein Ghaly1,
Thomas Hauner3, Min Ma2, Syed Reza2, Andrew Rosenberg21Department of Linguistics, CUNY Graduate Center, USA
2Department of Computer Science, CUNY Graduate Center, USA3Department of Economics, CUNY Graduate Center, USA
The Financial Crisis of 2007-2008 was avery complex and impactful global event.The goal of this research is to explorethe possibility of using a banks socialrelations to estimate a banks financialstrength. We apply Natural Language Pro-cessing techniques to a corpus of financialdata released by the NLP Unshared Taskin PoliInformatics in 2014 in order to ex-plore and better understand this possibil-ity. Our work begins with the extractionof named entities from the corpus to es-tablish names of people involved in thecrisis. We then aggregate the social his-tories of these individuals from an onlinecollaborative knowledge base: Freebase.Accordingly, we use the social histories ofentities to establish social connections be-tween them. We end with a visualizationof the connections we found: a presenta-tion of a social financial crisis network.
The financial crisis represents the near collapse ofthe financial system in the United States, wherefinancial institutions were under threat of col-lapse, banks were bailed out by national govern-ments, and stock markets plummeted around theworld. It has been shown that various paticipantsshared responsibility for creating or enabling thecrisis. Mortgage lenders, borrowers, regulators,investors, rating agencies, and many others havebeen blamed (Bolton, 2009). At the center of thecrisis were the financial institutions, and we focuson these key players. Specifically, we aim to dis-cover how the strength of these players affectedthe outcome of the crisis. Whereas traditional fi-nancial models consider a banks credit relationsand financial strength using balance sheets, we ex-
plore using a banks social relations to estimate abanks financial strength in the aftermath of the fi-nancial crisis. Specifically, we measure the finan-cial strength of a bank by the amount of bailoutmoney and emergency lending to which it had ac-cess.
This work is part of the NLP Unshared Taskin PoliInformatics 2014. The corpus includes re-ports, hearings, bills, and other transcripts relatedto the crisis and was provided by the organiz-ers of task, a detailed breakdown of the data canbe viewed at https://sites.google.com/site/unsharedtask2014. Our research is motivated by2 questions (1) who was the financial crisis? (2)and, what was the financial crisis? We approachthis task from the perspective of social networkanalysis (SNA) because it allows us to effectivelyanalyze the interdependence and flows of influ-ence among individuals, groups, and institutions.Social Network Analysis involves the investiga-tion of networks, which are made of social enti-ties linked by specific types of interdependencies.This analysis takes social connections as the pri-mary building blocks of the social world (Pinheiro,2011). Besides individual attributes, SNA consid-ers all information about the relationships, repre-sented as links, among the network members, rep-resented as nodes. The information about the re-lations among the individuals within a social net-work is usually more relevant than the individualattributes of the individuals. Identifying nodes andlinks is key to whether a social network analysiswill prove informative.
While there is a rich history of literature consid-ering geographic networks and topological space,in economic analysis (Anselin 1998 ,Kelejian1998, and Fingleton 2014) only a few applicationsconsider a social network or social space indepen-dent of geographydespite the routine acknowl-edgement of the possibility (Hsieh, 2012). All rely
on the first law of geography from Tobler (1970):Everything is related to everything else, but nearthings are more related than distant things. How-ever, social interactions also have an economiceffect independent of spatial effects. Acemogluet al. (2014) consider the scheduled interactions,social connections, and geographic proximity ofbanks in relation to Timothy Geithner, hypothesiz-ing that banks socially closer to Geithner experi-enced higher stock returns in response to his nom-ination as Secretary of the Treasury. In the daysafter Geithners nomination, they estimate a cumu-lative abnormal return of 15% for those banks con-nected to him. In the same vein, we also considersocial connections in our analysis of the financialcrisis. We begin in section 2 with a discussion ofrelated work, section 3 details our data, section 4explains our methods, section 5 describes our datavisualization, in section 6 we sumamrize conclu-sions and results, and we end in section 7 with fu-ture work.
2 Related Work
Social network analysis has useful and practi-cal interdisciplinary applications. When appliedto telecommunications, SNA can help companiesrecognize the behavior of their customers and thenpredict the strength of links between customersand the possible impact of events among them. A SNA perspective has also been employedto understand political, economic and social or-ganizations and individuals (Christopoulos 2008).For example, it has been a useful approach forthe study of terrorism and related fields. In thestudy of political violence SNA helped provideimportant information about the characteristics ofa terrorist group structure, showing how a socialstructure influences members motives, behaviors,and the outcomes of their actions. Using SNA,Perliger (2011) provided insight into recruitmentprocesses, evolution, and the division of politicaland social power among the members of these ter-rorist groups. The network perspective has veryproductive sub-fields within historical anthropol-ogy, development studies, epidemiology and otherfields.
We constrcut our social network around the Fi-nancial Crisis event of 2007-2008. Each mem-ber (node) in our network represents an individ-
ual or orgnaization involved in the crisis. Eachconnection (link) between members is establishedbased on a social connection. Our data was pro-vided to us as part of the NLP Unshared Taskin PoliInformatics 2014, organized by the PoliIn-formatics Research Coordination Network. Thedata set includes Federal Open Market Commit-tee (FOMC) meeting minutes and transcripts, Fed-eral Crisis Inquiry Commission reports and tran-scripts,and Congressional bills, reports, and hear-ings. We also include data aggregated from Free-base, an online knowledge database (http://www.freebase.com/). We imported 2267 entities, in-cluding 954 people and 1207 organizations, mostof which (77.5%) are financial institutions in-volved in the financial crisis.
We rely on two public journalistic sources fordata on bank bailouts. For data on the TARPbailout we utilize the ProPublica Bailout Tracker(Kiel and Nguyen, 2014), an updated database ofoutgoing and incoming bailout funds (we excludebailouts of the housing agencies Fannie Mae andFreddie Mac as they represent government spon-sored entities). Data on the peak amount of bankdebt through the Federal Reserves myriad secretemergency lending facilities is available throughBloomberg (Friedman et. al 2014). Their report-ing is based on the source documents of the Gov-ernment Accountability Offices onetime audit ofthe Federal Reserve as legislated by DoddFrankWall Street Reform and Consumer Protection Act.
4.1 Named entity recognition
Named entities (NEs), especially person names(PER), location names (LOC), and organizationnames (ORG), deliver essential context and mean-ing in human languages (Chen et al. 2013).To investigate the behaviors of the NEs involvedin the financial crisis and the relations betweenthem, the first step is to extract the NEs from thedatasets. We split all the transcripts in the Con-gressional Hearings and the Congressional Re-ports into seperate files by speaker to gain a bet-ter understanding of the personal concerns of NEs,one file per speaker. A named entity tagger (Li etal., 2012) based on the conditional random field(CRF) model is employed to highlight all the NEsin each file.
4.2 Relation extraction
We then aim to establish relations between twoentities using the unstructured text data. To ex-tract these relations we treat each sentence as anevent, and target specifically the action in theevent. Therefore, we focus on the verbs as theyconvey the action in the sentence. We explore thepair relationships between two NEs recognized inthe previous stage, which enables us to determinethe entities mentioned within the text of each state-ment by each speaker, as in the following exam-ple, NEs are marked by brackets: [Mr. Lockhart]:The goal of these dual conservatorships is to helprestore confidence in [Fannie Mae] and [FreedieMac].
Using simple regular expression functions, weare able to extract the intervening text betweeneach pair of entities. The basic assumption here isthat this intervening text would tell us somethingabout the relationship between the two entities, asis the case:
Table 1: Example PhraseEntity1 Intervening Text Entity2
Fannie Mae and Freddie Mac
This indicates some degree of association be-tween the two entities Fannie Mae and FreddieMac, being mentioned together in many contextsand separated by and. However, for larger in-tervening text chunks, it would be difficult to infersuch relationships, as in the following:
Table 2: Example SentenceEntity1 Intervening Text Entity2
The goal ofthese dual
conservatorshipsis to help
restore confidence in
We then use Part of Speech Tagging (as imple-mented in the NLTK POS tagger), in order to iden-tify the verbs within the sentence and the interven-ing text. We also consider verb frequency and ig-nore extremely frequent verbs, such as, is andhas. For the case of negation in text, we addnot before the verb. We identify negation bysearching for the trigger words not or never.Our goal is to we reduce the intervening text to a
single verb that indicates the relationship betweenthe two entities.
Table 3: Example RelationEntity1 Intervening Text Entity2
Mr. Lockhart restore Fannie Mae
Although we find many good examples as isshown in Table 3, due to time constraints this taskneeds further development, before the events ex-tracted from text can be included into our socialnetwork.
4.3 Aggregation of Freebase data
Given a list of extracted individuals from the tasksdata, we query the Freebase database (March,2014) for a list of attributes. We focus on the at-tributes that would provide insight into the socialrelations of that person. Table 4 demonstrates theproperties we consider for a person. For certainattributes we consider temporal information, suchas the time frame employed by a company. Thoserelations are marked with an asterisk (*).
Table 4: Freebase Relationsprofession
government positions held*political party
For NEs that are organizations we only queryFreebase for a list of employees, specifically boardmembers. For this query we also consider the timeframe these employees served as board members,noting board members appointed before or duringthe crisis as more relevant. All extracted data wasentered into our SQL database.
5 Data Visualization
Lastly, we derive a visual presentation of our so-cial network. Each entity from our database rep-resents one node while links represent social rela-tions between entities. The figure shows one sub-network pulled for the larger network. Our onlinedemo demonstrates the interactive version of ournetwork where labels can be queried and observed.
Figure 1: Social Network of Citigroup
The Citigroup Social Network demonstratesone small subnetwork. Citigroup represents themost central and largest node. The first connec-tions we consider are board members of the or-ganization (second largest nodes) appointed be-fore/during the time of the crisis. Next, we es-tablish social relations between board membersand people from the task data (smallest nodes)by examining the social histories of the individ-uals, considering events like employment history,schools attended, etc. For each individual weconsider the list of Freebase attributes extracted.For each individual, each attribute gets checkedagainst the rest of the individuals, when a match isfound, a social connection is made. In this smallsubnetwork we remove intervening institutions tosimplify. However, in the larger we can see everyindividual, every financial institution, and everyintervening financial institution (e.g. schools, em-ployers,etc.) This network represents some of ourfindings. A detailed description and interactive vi-sualization of both the subnetwork, as well as, ourlargest and most detailed network can be seen inour web demo: http://lynx.cs.qc.cuny.edu/cunyfinancialcrisis/ Our preliminary resultsshow that most social relations were establishedbetween individuals based on their educational in-stitution. Due to the size of these institutions, andthe unlikelihood of a relation forming based on at-tendance, moving forward we will fine tune whichrelations are considered in our network. Most im-portantly, we will continue to work on extractingrelations from our unstructured data to incorporateinto our network.
Although our work is still in its developing stages,it shows promise. Thus far, we have extracted rel-evant entities involved in the financial crisis from
our corpus. We have also aggregated structureddata from Freebase. From the collection of thesedata, we have then created social histories of allthe individuals involved in the crisis. By analyz-ing each individuals history we then establishedsocial relations between entities and consequentlyconstructed a visualization of a financial crisis so-cial network. We also began exploring the useof unstructured data to establish relations betweenentities,and we will continue along this avenue aswe believe it will help to establish more relevantrelations between individuals.
7 Future Work
Moving forward, we will continue to focus on re-fining our analysis of unstructured data, using de-pendency parsing and semantic role labeling, toeventually incorporate these relations into our net-work. Future work will also include a compre-hensive analysis of our network using a spatialeconometric model. The underlying premise ofa spatial econometric model assumes spatial de-pendence between observations. In our applica-tion, we would then consider spatial dependencethrough social topology, as derived from our socialnetwork analysis. Social interactions would thenenter our economic model through a social spatialweights matrix, which would take into account thestr...