
Learning Markov Logic Networks with Many Descriptive Attributes


Hassan Khosravi
Oliver Schulte
Tong Man
Xiaoyuan Xu
Bahareh Bina

School of Computing Science, Simon Fraser University, Vancouver, Canada

Outline

- Markov Logic Networks (MLNs) and motivation for this research
- Parameterized Bayes nets (PBNs)
- From PBN to MLN
- Learn-and-Join Algorithm for PBNs
- Empirical Evaluation

Markov Logic Networks (Richardson and Domingos, Machine Learning 2006)

A logical KB is a set of hard constraints on the set of possible worlds.

MLNs make them soft constraints: when a world violates a formula, it becomes less probable, not impossible.

A Markov Logic Network (MLN) is a set of pairs (F, w), where:
- F is a formula in first-order logic
- w is a real number

Weight   English
2.3      If a student is intelligent then he is well ranked
0.7      If a student has an intelligent friend then he is intelligent
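The semantics of weighted formulas can be made concrete: an MLN defines P(world) proportional to exp(sum of w_i * n_i(world)), where n_i counts the true groundings of formula i. The sketch below is my own illustration using the slide's first rule and weight, not Alchemy code; with a single student, the rule has one grounding.

```python
import itertools
import math

# One weighted formula: intelligent => well_ranked, weight 2.3 (from the slide).
formulas = [
    (2.3, lambda w: (not w["intelligent"]) or w["well_ranked"]),
]

def unnormalized(world):
    # exp of the sum of weights of satisfied formulas
    return math.exp(sum(wt for wt, f in formulas if f(world)))

atoms = ["intelligent", "well_ranked"]
worlds = [dict(zip(atoms, vals))
          for vals in itertools.product([False, True], repeat=len(atoms))]
Z = sum(unnormalized(w) for w in worlds)  # partition function

# A world that violates the soft rule is less probable, not impossible:
p_violate = unnormalized({"intelligent": True, "well_ranked": False}) / Z
p_satisfy = unnormalized({"intelligent": True, "well_ranked": True}) / Z
```

Note how `p_violate` stays strictly positive: the formula is a soft, not a hard, constraint.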

Why MLNs Are Important

MLNs are proposed as a unifying framework for statistical relational learning; their authors show how other approaches are special cases of MLNs. They are popular, as this sample of follow-up work shows:

- Learning the structure of Markov logic networks
- Discriminative training of Markov logic networks
- Efficient weight learning for Markov logic networks
- Bottom-up learning of Markov logic network structure
- Mapping and revising Markov logic networks for transfer learning
- Discriminative structure and parameter learning for Markov logic networks
- Entity resolution with Markov logic
- Event modeling and recognition using Markov logic networks
- Hybrid Markov logic networks
- Learning Markov logic network structure via hypergraph lifting
- Improving the accuracy and efficiency of MAP inference for Markov logic

Limitation

Structure learning in MLNs is mostly ILP (Inductive Logic Programming) based. The complexity of the search space is exponential in the number of predicates.

For datasets with many descriptive attributes, current MLN learning algorithms are infeasible: they either never terminate or are very slow.

Parametrized Bayes Nets (Poole, UAI 2003)

A functor is a function symbol with typed variables: f(X), g(X,Y), R(X,Y). A PBN is a Bayes net whose nodes are functors. We use PBNs with variables only, e.g. intelligence(S), not intelligence(Jack).

Example nodes: ranking(S), intelligence(S), diff(C), rating(C), registered(S,C), grade(S,C), satisfaction(S,C), popularity(C), teach-ability(C).

Overview

Dataset -> (relational Bayes net learner) -> Bayes net -> (turn BN into formulas) -> first-order logic formulas -> (MLN parameter learning) -> MLN

From PBN to FOL Formulas

Parameterized Bayes Nets (PBNs) can easily be converted to a set of first-order formulas: moralize the PBN (marry parents, drop arrows).

For every CP-table value in the PBN, add a formula (note that MLNs use predicate rather than functional notation, so ranking(S) = R1 becomes ranking(S1, R1)):

- ranking(S1, R1) ^ intelligence(S1, I1)
- ranking(S1, R1) ^ intelligence(S1, I2)
- ranking(S1, R2) ^ intelligence(S1, I1)
- ranking(S1, R2) ^ intelligence(S1, I2)

CP-table for ranking(S) given intelligence(S):

            Int = I1   Int = I2
Rank = R1   P1         P2
Rank = R2   P3         P4
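As a rough illustration of the CP-table-to-clause step (the probability values for P1..P4 are placeholders I invented; using log(p) as an initial weight is a common heuristic, and in the paper's pipeline the weights are re-learned afterwards by MLN parameter learning):

```python
import math

# P(ranking = r | intelligence = i); values are illustrative placeholders.
cpt = {
    ("R1", "I1"): 0.7,   # P1
    ("R1", "I2"): 0.4,   # P2
    ("R2", "I1"): 0.3,   # P3
    ("R2", "I2"): 0.6,   # P4
}

def cpt_to_formulas(cpt):
    """One conjunctive clause per CP-table cell, in predicate notation."""
    return [(f"ranking(S1, {r}) ^ intelligence(S1, {i})", math.log(p))
            for (r, i), p in cpt.items()]

formulas = cpt_to_formulas(cpt)
```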

Learning PBN Structure

Required: a single-table BN learner L. It takes as input:
- a single data table
- a set of required edges
- a set of forbidden edges

Nodes: descriptive attributes (e.g., intelligence(S)) and Boolean relationship nodes (e.g., Registered(S,C)).

Edges: learned correlations between attributes.

Phase 1: Entity tables

Apply the BN learner L to each entity table separately.

Students:
Name  intelligence  ranking
Jack  3             1
Kim   2             1
Paul  1             2

Course:
Number  rating  difficulty  Prof-popularity  Prof-teachability
101     3       1           1                2
102     2       2           2                2
103     3       2           1                1

Nodes learned: ranking(S), intelligence(S); diff(C), rating(C), popularity(p(C)), teach-ability(p(C)). Writing the variables explicitly (intelligence(S) rather than intelligence) keeps the notation consistent with Poole, and the variables are needed to represent autocorrelations.

Phase 2: Relationship tables

Join the Registration table with the Student and Course entity tables, then apply L to the joined table:

S.Name  C.number  grade  satisfaction  intelligence  ranking  rating  difficulty  Popularity  Teach-ability
Jack    101       A      1             3             1        3       1           1           2
...

Nodes: ranking(S), intelligence(S), diff(C), rating(C), grade(S,C), satisfaction(S,C), popularity(p(C)), teach-ability(p(C)).

Phase 3: Add Boolean relationship indicator variables

Add the indicator node registered(S,C) to the network.

Datasets

- University
- Movielens
- Mutagenesis
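The Phase 2 join above can be sketched in plain Python (table contents are the toy values from the slide; a real implementation would use SQL joins):

```python
# Each Registration tuple is extended with the descriptive attributes of
# its student and its course; the joined rows form the single table that
# is handed to the BN learner L.
students = {"Jack": {"intelligence": 3, "ranking": 1},
            "Kim":  {"intelligence": 2, "ranking": 1},
            "Paul": {"intelligence": 1, "ranking": 2}}
courses = {101: {"rating": 3, "difficulty": 1, "popularity": 1, "teachability": 2},
           102: {"rating": 2, "difficulty": 2, "popularity": 2, "teachability": 2},
           103: {"rating": 3, "difficulty": 2, "popularity": 1, "teachability": 1}}
registrations = [{"student": "Jack", "course": 101, "grade": "A", "satisfaction": 1}]

joined = [{**r, **students[r["student"]], **courses[r["course"]]}
          for r in registrations]
```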

Systems

MBN (Moralized Bayes Net): a Parameterized Bayes net (PBN) converted into a Markov logic network.

LHL: the current implementation of structure learning in Alchemy.

Const_LHL: LHL run after the data reformatting used by Kok and Domingos (2007), e.g.:
Salary(Student, salary_value) =>
- Salary_high(student)
- Salary_low(student)
- Salary_medium(student)
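A minimal sketch of this reformatting (my own illustration, not Kok and Domingos' code): each attribute value of a functor becomes its own unary predicate.

```python
# "Constant" reformatting: a functor with an attribute value, e.g.
# Salary(Student, salary_value), is rewritten as one unary predicate per
# value.  Function and argument names here are my own.
def to_constant_format(facts):
    """facts: iterable of (predicate, entity, value) triples."""
    return [f"{pred}_{value}({entity})" for pred, entity, value in facts]

ground = to_constant_format([("Salary", "student1", "high"),
                             ("Salary", "student2", "low")])
```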

Evaluation Plan

- Dataset in table format -> PBN learning -> MBN
- Dataset as ground facts -> LHL structure learning -> LHL
- Dataset in constant format -> LHL structure learning -> LHL_const

All three systems then use MLN parameter learning and Alchemy default inference.

Evaluation Metrics (Default)

- Running time

- Accuracy: how accurate our predictions are

- Conditional Log Likelihood (CLL): how confident we are in the predictions

- Area Under Curve (AUC): avoids false negatives

Running Time

Time in minutes. NT = did not terminate. X+Y = PBN structure learning time + MLN parameter learning time.
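As an illustration of the CLL metric above: for Boolean ground atoms it is the average log probability the model assigns to each atom's true value (the example probabilities are made up).

```python
import math

def cll(predicted_probs, truths):
    """Mean log probability of the true value of each ground atom;
    closer to 0 is better."""
    return sum(math.log(p if t else 1.0 - p)
               for p, t in zip(predicted_probs, truths)) / len(truths)

score = cll([0.9, 0.8, 0.3], [True, True, False])
```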

Accuracy


Conditional Log Likelihood


Area Under Curve


Future Work: Parameter Learning

Extend the evaluation pipeline with PBN parameter learning alongside PBN structure learning, compared against LHL and LHL_const as before.

Summary

- Key idea: learn a directed model, then convert it to an undirected model to avoid cycles.
- New efficient structure learning algorithm for Parametrized Bayes Nets.
- Fast and scalable (e.g., 5 min vs. 21 hr).
- Substantial improvements in all default evaluation metrics.

Thank you!

Any questions?

Learning PBN Structure (details)

Required: a single-table BN learner L. It takes as input (T, RE, FE):
- a single data table T
- a set of edge constraints (forbidden/required edges)

Nodes: descriptive attributes (e.g., intelligence(S)) and Boolean relationship nodes (e.g., Registered(S,C)).

Algorithm:
1. RequiredEdges, ForbiddenEdges := empty set.
2. For each entity table Ei:
   - Apply L to Ei to obtain BN Gi.
   - For any two attributes X, Y from Ei:
     if X -> Y is in Gi, then RequiredEdges += X -> Y;
     if X -> Y is not in Gi, then ForbiddenEdges += X -> Y.
3. For each relationship table join of size s = 1, ..., k:
   - Compute the relationship-table join, joined with the entity tables, := Ji.
   - Apply L to (Ji, RE, FE) to obtain BN Gi.
   - Derive additional edge constraints from Gi.
4. Add relationship indicators: if edge X -> Y was added when analyzing the join R1 join R2 join ... join Rm, add edges Ri -> Y.

To deal with autocorrelation, entity tables are duplicated. The key point is that we recursively add more and more edge constraints and feed them back into later stages of the search. Which data tables to use is determined by the relational schema; without a relational schema, it may be possible to derive this from the functor structure alone.

Restrictions

- Learn-and-Join learns dependencies among attributes, not dependencies among relationships.
- The structure is limited to certain patterns.
- It only works with many descriptive attributes.
- Parameter learning is still a bottleneck.
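The edge-constraint bookkeeping of the Learn-and-Join loop can be sketched as follows (the single-table BN learner L is abstracted away; this is an illustration, not the authors' implementation):

```python
def learn_and_join(entity_tables, relationship_joins, learn_bn):
    """learn_bn(table, required, forbidden) -> (edges found, edges absent)."""
    required, forbidden = set(), set()
    for table in entity_tables:            # Phase 1: each entity table
        edges, non_edges = learn_bn(table, required, forbidden)
        required |= edges                  # edges found become required
        forbidden |= non_edges             # absent edges become forbidden
    for join in relationship_joins:        # later phases: growing joins
        edges, non_edges = learn_bn(join, required, forbidden)
        required |= edges
        forbidden |= non_edges
    return required, forbidden

def _toy_learner(table, required, forbidden):
    # Stand-in for L: pretends to find one edge and rule out one edge.
    return {(table, "found")}, {(table, "absent")}

required, forbidden = learn_and_join(["Student", "Course"],
                                     ["Registration"], _toy_learner)
```

The design point is that constraints accumulated on small tables are passed into the learner on every larger join, shrinking its search space.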

Inheriting Edge Constraints from Subtables

- Intuition: evaluate dependencies on the smallest possible join.
- Statistical motivation: statistics can change between tables; e.g., the distribution of students' ages may differ between the Student table and the Registration table.
- Computational motivation: as larger join tables are formed, many edges need not be considered, giving fast learning.

Chart data recovered from the embedded spreadsheets (series labels as in the source charts; the first chart's legend lists both MBN and JBN). "n/a" marks values missing in the source.

Accuracy:
Dataset                  JBN    MLN    CMLN
University               0.85   0.37   0.51
Movielens Subsample1     0.67   0.43   0.43
Movielens Subsample2     0.65   0.42   0.42
Mutagenesis Subsample1   0.81   0.36   0.55
Mutagenesis Subsample2   0.81   0.35   n/a

Conditional Log Likelihood:
Dataset                  JBN     MLN     CMLN
University               -0.40   -5.79   -3.24
Movielens Subsample1     -0.75   -4.09   -2.83
Movielens Subsample2     -1.00   -3.55   -3.94
Mutagenesis Subsample1   -0.60   -4.70   -3.38
Mutagenesis Subsample2   -0.60   -4.65   n/a

Area Under Curve:
Dataset                  JBN    MLN    CMLN
University               0.88   0.45   0.68
Movielens Subsample1     0.70   0.46   0.53
Movielens Subsample2     0.69   0.49   0.51
Mutagenesis Subsample1   0.90   0.56   0.60
Mutagenesis Subsample2   0.90   0.56   n/a