4
Natural Language Processing and Big Data for Psychology and Computational Linguistics __________________________________________________________________________________________________________________ Overview Over the past decade, there has been a remarkable cross-fertilization between the fields of psycholinguistics and natural language processing with insights from each field enhancing the development of theory, methodology, and resources in the other. Developments in statistical NLP have lead to a much larger availability of textual resources for psycholinguistic research and have made it possible for psycholinguists to quickly develop specific resources such as corpora and word-frequency lists. As a result, there has been an explosive growth in behavioral and linguistic data that can be leveraged for psycholinguistic research. Our increasingly networked and technological society has also spurred development of techniques such as natural language understanding and machine translation. This has led to a re-birth of artificial intelligence known as deep learning, which can be traced back to learning theory developed in psychology in (Rescorla & Wagner, 1977) and can again be applied to research on human language processing (e.g., Mandera, Keuleers, & Brysbaert, 2017). These developments are quickly changing psycholinguistic research. From a field that for decades has been dominated by small-scale one-time controlled laboratory experiments it is becoming a dynamic research enterprise relying on reusable and distributed data generation processes leveraging crowd participation (Keuleers & Balota, 2015). Knowledge of these developments and techniques is becoming indispensable for researchers working in any area of cognitive science that deals directly or indirectly with language or linguistic resources. Objectives

Natural Language Processing and Big Data for …...Rescorla-Wagner model, the perceptron, recurrent neural networks, deep learning. Day 2 (11th December) Lecture 3 (EK, 1.5 hrs) Discriminative

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Natural Language Processing and Big Data for …...Rescorla-Wagner model, the perceptron, recurrent neural networks, deep learning. Day 2 (11th December) Lecture 3 (EK, 1.5 hrs) Discriminative

NaturalLanguageProcessingandBigDataforPsychologyandComputationalLinguistics__________________________________________________________________________________________________________________

OverviewOver the past decade, there has been a remarkable cross-fertilization between the fields ofpsycholinguisticsandnatural languageprocessingwith insights fromeachfieldenhancingthedevelopmentoftheory,methodology,andresourcesintheother.

DevelopmentsinstatisticalNLPhaveleadtoamuchlargeravailabilityoftextualresourcesforpsycholinguistic research and have made it possible for psycholinguists to quickly developspecific resources such as corpora and word-frequency lists. As a result, there has been anexplosive growth in behavioral and linguistic data that can be leveraged for psycholinguisticresearch.

Our increasingly networked and technological society has also spurred development oftechniquessuchasnatural languageunderstandingandmachinetranslation.Thishasledtoare-birthofartificial intelligenceknownasdeeplearning,whichcanbetracedbacktolearningtheory developed in psychology in (Rescorla &Wagner, 1977) and can again be applied toresearchonhumanlanguageprocessing(e.g.,Mandera,Keuleers,&Brysbaert,2017).

These developments are quickly changing psycholinguistic research. From a field that fordecades has beendominatedby small-scale one-time controlled laboratory experiments it isbecomingadynamic researchenterprise relyingon reusable anddistributeddata generationprocessesleveragingcrowdparticipation(Keuleers&Balota,2015).

Knowledgeof thesedevelopments and techniques is becoming indispensable for researchersworking in any area of cognitive science that deals directly or indirectly with language orlinguisticresources.

Objectives

Page 2: Natural Language Processing and Big Data for …...Rescorla-Wagner model, the perceptron, recurrent neural networks, deep learning. Day 2 (11th December) Lecture 3 (EK, 1.5 hrs) Discriminative

Theprimaryobjectivesforthiscourseare:

1) Toexplore themutual connectionsbetween the fieldsofnatural languageprocessingandlanguagepsychology,fromahistoricalperspective.

2) To teach participants natural language processing techniques that can be applied tobehavioralresearch.

3)To instill inparticipantsthemindsettodocreativepsycholinguistic researchthat takes fulladvantageofNLPtechniquesandbigdata.

Modules Dates:December10th,2018–December15th,2018.

Day1(10thDecember)ModuleA:HumanandMachineLearningofLanguageLecture1(EK,1.5hrs)Language,behaviorism,andcognitive-science.Lecture 2 (EK, 1.5 hrs)Classical conditioning, operant conditioning, theRescorla-Wagnermodel,theperceptron,recurrentneuralnetworks,deeplearning.Day2(11thDecember)Lecture 3 (EK, 1.5 hrs)Discriminative and associative learning of humanlanguage,distributionalsemanticsModuleB:NaturalLanguageProcessingTutorial1(EK,2hrs)Basicpython,regularexpressions,tokenizing.Day3(12thDecember)Tutorial2(EK,2hrs)Lemmatizing,named-entityrecognition.Tutorial3(EK,2hrs)Counting,wordfrequencydistributions.Day4(13thDecember)Tutorial 4 (EK, 2 hrs)TFxIDF, Feature Extraction, Classification, MachineLearningalgorithms

Page 3: Natural Language Processing and Big Data for …...Rescorla-Wagner model, the perceptron, recurrent neural networks, deep learning. Day 2 (11th December) Lecture 3 (EK, 1.5 hrs) Discriminative

Tutorial5(EK,2hrs)Vector-SpacemodelsDay5(14thDecember)ModuleC:CrowdsourcingandBigDataLecture4(EK,1.5hrs)SocialIntelligence,SocialMediaAnalytics2Lecture5(EK,1.5hrs)CrowdsourcingofbehavioraldataDay6(15thDecember)ModuleD:TheoreticalapplicationsLecture6(EK,1.5hrs)Frequency,visualwordrecognition,grammaticalityjudgments.Lecture7(EK,1.5hrs)Aging,multilingualism,transactionrecords.

Whocanattend? Thecoursecanbeattendedbystudentsfromdiversefieldsascomputerscience, linguistics, cognitive psychology, cognitive sciences,psycholinguistics,NaturalLanguageProcessing,DataMining,BigDataandsoon.StudentsatalllevelsBTech/MTech,BSc./MSc.andfromearlylevelPhDs across these disciplines and also interested young faculty fromreputeduniversitiesandtechnicalinstitutionsarewelcometoattendthecourse.

Fees Theparticipationfeesfortakingthecourseisasfollows:Participantsfromabroad:US$200Industry/ResearchOrganizations:15000INRAcademicInstitutions:5000INRStudents:2000INRThe above fee include all instructional materials, computer use fortutorialsandassignments,laboratoryequipmentusagecharges,24hrfreeinternetfacility.Theparticipantswillbeprovidedwithaccommodationonpaymentbasis.

Page 4: Natural Language Processing and Big Data for …...Rescorla-Wagner model, the perceptron, recurrent neural networks, deep learning. Day 2 (11th December) Lecture 3 (EK, 1.5 hrs) Discriminative

TheFaculty

Dr.EmmanuelKeuleers isanAssistantProfessor inDepartment of Cognitive Science and ArtificialIntelligenceatTilburgUniversity(TheNetherlands).Hisresearchissituatedattheintersectionofnaturallanguage processing, machine learning, theoreticallinguistics, and psychology. He is the author ofnumerous well-cited publications on newtechniques, datasets, and theoretical findings inthese fields.His researchhas been instrumental inestablishing the use of techniques from naturallanguage processing and computational linguisticsin psycholinguistic investigation.Hewas the editorofarecentspecial issueoftheQuarterlyJournalofExperimental Psychology on "Megastudies,crowdsourcing, and big datasets inpsycholinguistics". He is a consulting editor forBehavior Research Methods, a member of theeditorialboardoftheMentalLexiconandaregularcontributortotopjournals inhisfieldsofresearch.His current work focuses on the computationalmodeling of human language processing followingdesign principles from information theory,discriminationlearning,andnetworkscience.

CourseCoordinatorDr.ArkVerma,AssistantProfessorofPsychology,IITKanpur.Phone:+91-512-6847(O),+91-9506304191E-mail:[email protected]...........................................................................