12
Montague Semantics for Lambek Pregroups Giovanni de Felice , Konstantinos Meichanetzidis , Alexis Toumi Department of Computer Science, University of Oxford. Cambridge Quantum Computing Ltd. May 17, 2019 Lambek pregroups are algebraic structures modelling natural lan- guage syntax, while Montague captures semantics as a homomor- phism from syntax to rst-order logic. We introduce the variant RelCoCat of the categorical compositional distributional (DisCoCat) models of Coecke et al. [1] where linear maps are replaced by rela- tions. Our main result is a factorisation of RelCoCat models through free cartesian bicategories, as a corollary we get a logspace reduction from the semantics problem to the evaluation of conjunctive queries. Finally, we dene question answering as an NP - complete problem. Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): the fragment of rst-order logic generated by relational symbols, equality (=), truth (), conjunction () and existential quantication (). RL formulae play a foundational role in the theory of relational databases, where they correspond to conjunctive queries: the select-project-join fragment of relational algebra [2] and select-where-from in SQL [3]. Chandra and Merlin [4] showed that conjunctive query evaluation and containment are logspace equivalent to graph homomorphism: they are NP - complete. In parallel to the foundations of database theory in terms of graph homo- morphisms, the 1970s saw the development of Montague semantics [5, 6, 7], where the meaning of grammatical sentences is given by a homomorphism from their syntax trees into rst-order logic through lambda calculus. In essence, Montague semantics (originally called “Fregean interpretations”) is the application of Frege’s principle of compositionality to Chomsky’s theory of syntactic structures [8]. Our main contribution is the construction of a fragment, 1

Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

Montague Semantics for Lambek PregroupsGiovanni de Felice†, Konstantinos Meichanetzidis†?, Alexis Toumi†

† Department of Computer Science, University of Oxford. ? Cambridge Quantum Computing Ltd.

May 17, 2019

Lambek pregroups are algebraic structures modelling natural lan-guage syntax, while Montague captures semantics as a homomor-phism from syntax to �rst-order logic. We introduce the variantRelCoCat of the categorical compositional distributional (DisCoCat)models of Coecke et al. [1] where linear maps are replaced by rela-tions. Our main result is a factorisation of RelCoCat models throughfree cartesian bicategories, as a corollary we get a logspace reductionfrom the semantics problem to the evaluation of conjunctive queries.Finally, we de�ne question answering as an NP− complete problem.

Introduction

In this paper, we give a semantics to pregroup grammar in regular logic (RL):the fragment of �rst-order logic generated by relational symbols, equality (=),truth (>), conjunction (∧) and existential quanti�cation (∃). RL formulae play afoundational role in the theory of relational databases, where they correspondto conjunctive queries: the select-project-join fragment of relational algebra[2] and select-where-from in SQL [3]. Chandra and Merlin [4] showed thatconjunctive query evaluation and containment are logspace equivalent to graphhomomorphism: they are NP− complete.In parallel to the foundations of database theory in terms of graph homo-

morphisms, the 1970s saw the development of Montague semantics [5, 6, 7],where the meaning of grammatical sentences is given by a homomorphismfrom their syntax trees into �rst-order logic through lambda calculus. Inessence, Montague semantics (originally called “Fregean interpretations”) isthe application of Frege’s principle of compositionality to Chomsky’s theory ofsyntactic structures [8]. Our main contribution is the construction of a fragment,

1

Page 2: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

i.e. a subset of natural language together with an interpretation into �rst-order logic. In the spirit of Lawvere’s functorial semantics [9], we constructthis fragment as a functor from the syntax given by Lambek pregroups [10, 11].In section 1 we de�ne the grammar category G, which captures the output

of the Parsing problem for pregroups. In section 2 we recall the formulationof conjunctive queries of Bonchi et al. [12] as arrows in the free cartesianbicategory CB(Σ); functors K : CB(Σ) → Rel into the category of sets andrelations are precisely relational databases. In section 3, we de�ne RelCoCat(categorical compositonal relational) models as monoidal functors F : G→ Rel

and show that they have a universal factorisation F = G L−→ CB(Σ) K−→ Rel.Computing the value of these functors — the Semantics problem — allows tomodel question answering (QA). Building on previous work [13], we show thatQA is logspace equivalent to conjunctive query evaluation: it is NP− complete.

1 Lambek Pregroups and Free Rigid Categories

Pregroup grammar (PG) is a mathematical model of natural language grammarintroduced by Lambek [10]. PG has the same expressive power as context-freegrammar [14], replacing syntax trees by arrows in a monoidal category. Inthis section, we give a de�nition of the Parsing problem for PG in terms of thegrammar category generated by a dictionnary.Given a natural number n ∈ N, we abuse notation and let n = { i ∈ N | i < n }.

Given a set X, we de�ne the free monoid List(X) = ⋃n∈NX

n and treat listsv ∈ Xn as functions v : n → X, i.e. v = v(0) . . . v(n − 1). We denote the length ofv ∈ Xn by |v| = n, concatenation of (u, v) ∈ Xm×Xn by uv ∈ Xm+n and the emptylist by ε ∈ X0. We use implicit isomorphisms between the sum and the disjointunion m+ n, as well as between the product and the cartesian product m× n.Definition 1.1. A preorder is a set P equipped with a graph (≤) ⊆ P 2 such that for alla, b, c ∈ P we have:

• a = b =⇒ (a ≤ b ∧ b ≤ a) (reflexivity)

• (a ≤ b ∧ b ≤ c) =⇒ a ≤ c (transitivity)

A pro-monoid (for preordered monoid) is a preorder P equipped with a product denotedby concatenation and a unit ε ∈ P such that for all a, b, c, d ∈ P :

• a ε = a = ε a ∧ (a b) c = a b c = a (b c) (monoidality)

2

Page 3: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

• (a ≤ b ∧ c ≤ d) =⇒ (a c ≤ b d) (precongruence)

Proposition 1.2. Given a free monoid P and a graph R ⊆ P 2, the least pro-monoidinduced by R is the reflexive transitive closure of the substitution relation Sub(R) ={ (abc, adc) | a, b, c, d ∈ P ∧ (b, d) ∈ R }.

Definition 1.3. A pregroup is a pro-monoid P where every type t ∈ P has a pair of leftand right adjoints ?t and t? such that:

• t(?t) ≤ ε ≤ (?t)t (left adjunction)

• (t?)t ≤ ε ≤ t(t?) (right adjunction)

We say that a pregroup is strict when for all s, t ∈ P we have:

• ?(t?) = t = (?t)? (cancellation)

• ?(st) = (?t)(?s) ∧ (st)? = (t?)(s?) (monoid anti-hom.)

Given a graph I ⊆ B2, let P (B) = List(B × Z) be the least pro-monoid induced by:

• (a, b) ∈ I =⇒ (a, 0) ≤ (b, 0) (induced step)

• (b, z)(b, z − 1) ≤ ε (contraction)

• ε ≤ (b, z)(b, z + 1) (expansion)

Note that we identify (b, 0) ∈ P (B) with b ∈ B, called a basic type.

Theorem 1.4. P (B) is the free strict pregroup generated by I ⊆ B2, with adjoints givenby ?(b, z) = (b, z − 1), (b, z)? = (b, z + 1) for (b, z) ∈ B × Z and ?t = ?t(n − 1) . . . ?t(0),t? = t(n− 1)? . . . t(0)? for arbitrary types t ∈ (B × Z)n ⊆ P (B).

Proof. See [10] for the original proof from Lambek, where pregroups are defined as partialorders and anti-symmetry implies that pregroups are necessarily strict. The definitionbased on preorders is borrowed from [14], one may recover the partial order case by takingthe quotient of P (B) over (∼) = { (s, t) | (s ≤ t) ∧ (t ≤ s) }.

Definition 1.5. A pregroup grammar is a pair G = (∆, I) where ∆ ⊆ V ×P (B) is a relationcalled the dictionnary (or lexicon) and I ⊆ B2 the graph of basic types and induced steps.

We require the set ∆(w) = { t ∈ P (B) | (w, t) ∈ ∆ } to be finite for every word w ∈ V inthe vocabulary and write ∆(u) = { t(0) . . . t(n− 1) | t ∈ P (B)n ∧ ∀ i < n · t(i) ∈ ∆(u(i)) }for u ∈ V n. Given a sentence type s ∈ P (B), the language generated by G is given by:

L(G, s) = {u ∈ List(V ) | ∃ t ∈ ∆(u) · t ≤ s ∈ P (B) }

3

Page 4: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

Definition 1.6. Grammaticality

Input: G = (∆, I), u ∈ List(V ), s ∈ P (B)Output: is u ∈ L(G, s)?

Example 1.7. Take basic types B = { s, q, n, d, i, o } (sentence, question, noun, determi-nant, subject and object) with induced steps I = { (n, i), (n, o) } and dictionnary:

∆(Spinoza) = ∆(Leibniz) = ∆(calc) = {n } ∆(the) = { d } ∆(phil) = { ?d n }

∆(read) = ∆(infl) = ∆(disc) = { ?i s o? } ∆(who) = { ?nn(s?)i, q(s?)i }

The following reduction in P (B) gives a grammaticality proof for “Leibniz read Spinoza”.

n(?iso?)n ≤ i(?iso?)o = (i?i)s(o?o) ≤ s

Lemma 1.8 (Switching lemma [10]). For all t ≤ s ∈ P (B) there is some t′ ∈ P (B) and apair of reductions t ≤ t′ and t′ ≤ s with no expansions and no contractions respectively.

Corollary 1.9 ([14]). Grammaticality ∈ P

Proof. The proof goes by translating pregroup grammar to context-free grammar.

Pregroups allow to de�ne Grammaticality as a decision problem, howeverpreorders cannot distinguish between distinct reductions for the same type. Inorder to de�ne the function problem Parsing, we introduce grammar categoriesas the proof-relevant counterpart of pregroup grammar.Definition 1.10. Given ∆ ⊆ V × P (B) and I ⊆ B2, the grammar category G is the strictmonoidal category with objects P (B) and generated by the following arrows:

(b, z) (b, z − 1)

b∈Bz∈Z

+

(b, z) (b, z + 1)

b∈Bz∈Z

+

(b, 0)

(a, 0)

(a,b)∈I

+

w

t

(w,t)∈∆

.

called cups, caps, induced steps and typed words respectively, and the following equations: (b, z) (b, z − 1) (b, z) = (b, z) = (b, z) (b, z + 1) (b, z)

b∈Bz∈Z

Given a list of words u ∈ V n and t ∈ ∆(u), we write u = u(0)⊗ · · · ⊗ u(n− 1) : ε→ t.

Proposition 1.11 ([15]). The subcategory R(B) ↪−→ G generated only by cups, caps andinduced steps is the free rigid category generated by the graph I ⊆ B2.

4

Page 5: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

Remark 1.12. We use the terminology from Selinger’s survey of graphical languages [16].Rigid categories are called compact 2-categories in [15] and autonomous categories in [17].Note that a rigid category that is also symmetric monoidal is called compact-closed.

Definition 1.13. Parsing

Input: G = (∆, I), u ∈ List(V ), s ∈ P (B)Output: (ε u−→ t

r−→ s) ∈ G(ε, s), where t ∈ ∆(u), r : t→ s ∈ R(B)

Example 1.14. Take ∆ as in example 1.7, the following diagram (where we omit types forreadability) is a parsing for “Who influenced the philosopher who discovered calculus?”

Who infl the phil who disc calc

Proposition 1.15 ([18]). Parsing is poly-time computable in the size of the basic types B,the dictionnary ∆ ⊆ V × P (B) and the length of the inputs (u, s) ∈ List(V )× P (B).

Proof. The parsing algorithm has time complexity n3 in general, restricted cases of interestin linguistic applications may be parsed in linear time, see [19].

2 Conjunctive Queries and Free Cartesian Bicategories

A relational signature is a set of symbols Σ equipped with a function ar : Σ→ N.Given a �nite set U called the universe, we de�ne the set of Σ-models MΣ(U) ={K ⊆

∏R∈Σ U

ar(R) }, i.e. a Σ-model K gives an interpretation K(R) ⊆ Uar(R) forevery symbol R ∈ Σ. Given two Σ-modelsK ∈MΣ(U),K ′ ∈MΣ(U ′), a homomor-phism f : K → K ′ is a function f : U → U ′ such that for all R ∈ Σ and ~x ∈ Uar(R)

we have ~x ∈ K(R) =⇒ f(~x) ∈ K ′(R). Let MΣ = {K ∈MΣ(U) | U is �nite }.Definition 2.1. Homomorphism

Input: Σ, K,K ′ ∈MΣ

Output: f : K → K ′

Proposition 2.2. [20] Homomorphism is NP− complete.

Proof. Membership may be shown to follow from Fagin’s theorem: homomorphisms aredefined by an existential second-order logic formula. Hardness follows by reduction fromgraph homomorphism: take Σ = { • } and ar(•) = 2 then a Σ-model is a graph.

5

Page 6: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

Fix a countable set of variables X , the set of regular logic formulae is de�nedby the following context-free grammar:

ϕ ::= > | x = x′ | ϕ ∧ ϕ | ∃ x · ϕ | R(~x)

where x, x′ ∈ X , R ∈ Σ and ~x ∈ X ar(R). We de�ne the set of conjunctive queriesQΣ = { (A,B,C) | A+B ⊆ X , C ⊆

∏R∈Σ(A+B)ar(R) } where the triple (A,B,C) ∈

QΣ encodes the following regular logic formula:ϕ = ∃ B(0) · · · ∃ B(k − 1)

∧i<n

Ri(~xi)

for sets of free variables A ⊆ X , bound variables B ∈ X k and the conjunctionof C = { (R0, ~x0), . . . , (Rn−1, ~xn−1) }. For A ⊆ X , let QΣ(A) = {ϕ ∈ QΣ | fv(ϕ) = A }

for fv(ϕ) ⊆ X the free variables of ϕ. Given a model K ∈MΣ(U) and a valuationv : fv(ϕ)→ U , satisfaction (K, v) � ϕ is de�ned in the usual way.Definition 2.3. Evaluation

Input: Σ, ϕ ∈ QΣ, K ∈MΣ(U)Output: ϕ(K) = { v ∈ Ufv(ϕ) | (K, v) � ϕ }

Definition 2.4. Containment

Input: Σ, ϕ, ϕ′ ∈ QΣ(A)Output: ϕ ⊆ ϕ′ ≡ ∀ K ∈MΣ · ϕ(K) ⊆ ϕ′(K)

Definition 2.5. Given a query (A,B,C) ∈ QΣ, the canonical Σ-model CM(ϕ) ∈MΣ(A+B)is defined by CM(ϕ)(R) = { ~x ∈ (A+B)ar(R) | (R, ~x) ∈ C }.

Theorem 2.6 (Chandra-Merlin [4]). Evaluation and Containment are logspace equivalentto Homomorphism, hence NP− complete.

Proof. Given a query ϕ ∈ QΣ(A) and a model K ∈ MΣ(U), query evaluation ϕ(K) isgiven by the set of homomorphisms CM(ϕ)→ K. Given ϕ′ ∈MΣ(A), we have ϕ ⊆ ϕ′ iffthere is a homomorphism f : CM(ϕ) → CM(ϕ′). For the other direction, given a modelK ∈MΣ(U) we construct the corresponding query (∅, U,K) ∈ QΣ(∅).

Bonchi, Seeber and Sobocinski [12] introduced graphical conjunctive queries(GCQ), a graphical calculus where query containment is captured by the axiomsof the free cartesian bicategory CB(Σ) generated by the relational signature Σ.Definition 2.7 (Carboni-Walters [21]). A cartesian bicategory is a symmetric monoidalcategory enriched in partial orders such that:

6

Page 7: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

1. every object is equipped with a special commutative Frobenius algebra,

2. the monoid and comonoid structure of each Frobenius algebra are adjoint,

3. every arrow is a lax comonoid homomorphism.

A morphism of cartesian bicategories is a strong monoidal functor preserving the partialorder, the monoid and the comonoid structure.

Example 2.8. The category Rel of sets and relations with cartesian product as tensor,singleton as unit, the diagonal and its transpose as monoid and comonoid, subset aspartial order, is a cartesian bicategory. Given a set U , the subcategory Rel|U ↪−→ Rel withnatural numbers m,n ∈ N as objects and relations R ⊆ Um+n as arrows is also a cartesianbicategory. It is furthermore a PROP, i.e. a symmetric monoidal category with additionof natural numbers as tensor on objects.

Theorem 2.9. ([12, prop. 9,10]) Let CB(Σ) be the free cartesian bicategory generatedby {R : 0→ ar(R) }R∈Σ, see [12, def. 21]. There is a two-way semantics-preservingtranslation Θ : QΣ → CB(Σ), Λ : CB(Σ) → QΣ, i.e. for all ϕ,ϕ′ ∈ QΣ we haveϕ ⊆ ϕ′ ⇐⇒ Θ(ϕ) ≤ Θ(ϕ′), and for all arrows d, d′ ∈ CB(Σ), d ≤ d′ ⇐⇒ Λ(d) ⊆ Λ(d′).

Proof. The translation is defined by induction from the syntax of regular logic to thatof graphical conjunctive queries (GCQ) and back. Note that Θ(ϕ) ∈ CB(Σ)(0, n) wheren = |fv(ϕ)|. Similarly for the other direction we have Λ : CB(Σ)(n,m) → QΣ(n + m),i.e. open wires in CB(Σ) correspond to free variables.

Proposition 2.10. ([12, prop. 23]) Σ-models with universe U are in bijective correspondencewith morphisms of cartesian bicategoriesK : CB(Σ)→ Rel|U that are identity on objects.

Proof. By the universal property, a morphism K from the free cartesian bicategory isfully determined by its image on 1 (which we require to be 1) and its image on generatorsK(R) ⊆ Uar(R) for R ∈ Σ: this is precisely the data for a Σ-model.

Corollary 2.11. There are bijective correspondences:

CB(Σ)(0, 0) ∼−→ QΣ(∅) ∼−→MΣ∼−→ [CB(Σ),Rel]

where [CB(Σ),Rel] denotes the set of morphisms of cartesian bicategories.

Proof. The first bijection follows by 2.9, as there are no free variables Λ and Θ are inversesof each other. The second bijection is given by CM : QΣ(∅)→MΣ with inverse given inthe proof of 2.6. The third bijection is Proposition 2.10.

7

Page 8: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

3 Functorial Semantics for Question Answering

Given a cartesian bicategory C, we de�ne a Montague functor L : G → C as astrong monoidal functor which respects the rigid structure, i.e. it sends cupsand caps in G to those induced by the Frobenius structure in C.Definition 3.1. A RelCoCat model is a Montague functor F : G → Rel, we say a modelis one-sorted whenever its image lies in Rel|U for some finite universe U .

Definition 3.2. Semantics

Input: F : G→ Rel|U , d ∈ Parsing(G, u, s)Output: F (d) ⊆ UF (s)

Theorem 3.3. Given a one-sorted RelCoCat model F : G → Rel|U , let ∆ + I be therelational signature where ar(w, t) = F (t) and ar(e) = F (a) + F (b) for (w, t) ∈ ∆ ande = (a, b) ∈ I. There exists a (∆ + I)-model K with universe U and a Montague functorL : G→ CB(∆ + I) with:

F = G L−→ CB(∆ + I) K−→ Rel|U

This factorisation is universal in the following sense: for any cartesian bicategory C withnatural numbers as objects (i.e. C is a PROP), any Montague functor L′ : G → C

and identity-on-objects morphism of cartesian bicategories K ′ : C → Rel|U such thatF = K ′ ◦L′, there exists a unique morphism of cartesian bicategories (!) : CB(∆+I)→ C

making the following diagram commutes:

G Rel|U

CB(∆ + I)

C

F

L

L′

K

K′

!

Proof. The functor L is defined by L(t) = F (t) on objects and on arrows by mappingtyped words ∆ and induced steps I to their corresponding relational symbols in ∆ + I.We define a (∆ + I)-model K as follows: K(w, t) = F (ε w−→ t) ⊆ UF (t) and K(e) =F (e) ⊆ UF (a)+F (b). By proposition 2.10 this defines a morphism of cartesian bicategoriesK : CB(∆ + I)→ Rel|U , where by construction we have K ◦ L = F .

Note that both K and K ′ are identity on objects, this implies that (!) must be identityon objects. Existence and uniqueness of (!) : CB(∆ + I) → C then follow from theuniversal property of the free cartesian bicategory CB(∆ + I).

8

Page 9: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

Corollary 3.4. Semantics ∈ NP.

Proof. We reduce to Evaluation: given (F, d), constructing the functors K ◦ L = F andthe translation Λ of theorem 2.9 can be done in logspace, this yields a conjunctive queryϕ = Λ(L(d)) ∈ Q(∆+I) such that F (d) = ϕ(K).

We conjecture that the constraint language induced by a pregroup grammarmeets the tractability condition for the CSP dichotomy theorem [22].Conjecture 3.5. Fix G = (∆, I), then Semantics is poly-time computable in the size of(u, s) ∈ List(V )× P (B) and F : G→ Rel|U , i.e. in the size of the universe U .

Given a Montague functor L : G→ C, we can use the 2-arrows inC to modelnatural language entailment, i.e. we can equip G with a preorder enrichmentgiven by f ≤ g ∈ G ⇐⇒ L(f) ≤ L(g) ∈ C.Example 3.6. We take the dictionnary from example 1.7 and define the cartesian bicategoryC generated by Σ = {Spinoza, Leibniz, read, infl, disc, calc, phil } where:

ar = {Spinoza, Leibniz, phil, calc 7→ 1, read, infl, disc 7→ 2 }

and the following set of generating 2-arrows, called a TBox in description logics:

philcalcdisc ≤Leib ≤inflread ≤

Then we define the Montague functor L : G→ C where the image of the relative pronoun“who” is given by the Frobenius algebra, see [23]. We choose to model the question word“Who” with the compact-closed structure of C. We send the determinant “the” to the unitand the common noun “philosopher” to the symbol phil ∈ Σ composed with the comonoidof the Frobenius algebra. Finally, we map the induced steps to the identity. Thus, we getthe following 2-arrow in G, i.e. the entailment of the following two sentences:

Leibniz read Spinoza ≤ Spinoza influenced Leibniz

We can now answer our question d ∈ G(ε, s) in 1.14 as the evaluation of the query:

Λ(L(d)) = ∃ x1 ∃ x2 · infl(x0, x1) ∧ phil(x1) ∧ disc(x1, x2) ∧ calc(x2)

Who infl phil disc calc

9

Page 10: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

So far, we have shown how to translate questions into conjunctive queries.We now show how to translate a corpus, i.e. a set of lists of sentences calleddiscourses, into a relational database so that question answering reduces toquery evaluation. If we apply our Montague functor independently to eachsentence, the resulting queries have disjoint sets of variables. In order to obtainthe desired database, variables need to be assigned to some designated entities;a standard natural language processing task called named entity linking (NEL).Fix a pregroup grammar G = (∆ ⊆ V × P (B), I ⊆ B2), a sentence type

s ∈ P (B) and a Montague functor L : G → CB(Σ) with L(s) = 0. We de�ne acorpus as a set of discourses C ⊆ List(G(ε, s)), where we de�ne

ϕ(C ) = (∅, B, C) = Λ(L(⊗d∈C

d)) ∈ QΣ

for the conjunctive query given by corollary 2.11. A named entity linking for C isa function µ : B → E for some arbitrary set E of entities. We de�ne:

ϕ(C , µ) =(∅, E, { (R,µ(~x)) | (R, ~x) ∈ C }

)∈ QΣ

Definition 3.7. QA

Input: C ⊆ List(G(ε, s)), q ∈ QΣ, µ : U → E

Output: q(K) ⊆ Efv(ϕ) where K = CM(ϕ(C , µ))

Theorem 3.8. QA is NP− complete

Proof. Membership follows immediately by reduction to Evaluation. Hardness followsby reduction from graph homomorphism, we only give a sketch of proof and refer to [13]where named entity linking is called matching. Any graph can be encoded in a corpusgiven by a set of subject-verb-object sentences, where the named entity linking maps thenouns to their corresponding node.

We conclude with related work and potential directions for future work:• many-sorted RelCoCat models with graphical regular logic [24],• from Boolean semantics to generalised relations in arbitrairy topoi [25],• semantics of “How many?” questions and counting problems [26],• quantum speedup for question answering via closest vector problem [27],• comonadic semantics for bounded short-term memory [28],• from regular logic to description logics in bicategories of relations [29].

10

Page 11: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

Acknowledgments

The authors would like to thank Bob Coecke, Dan Marsden, Antonin Delpeuch,Vincent Wang, Jacob Leygonie, Dusko Pavlovic and Samson Abramsky for in-spiring discussions in the process of writing this article.

References

[1] Stephen Clark, Bob Coecke, and Mehrnoosh Sadrzadeh. Mathematical foundations for acompositional distributional model of meaning. In J. van Benthem, M. Moortgat, andW. Buszkowski, editors, A Festschrift for Jim Lambek, volume 36 of Linguistic Analysis, pages345–384. 2010.

[2] E F Codd. A Relational Model of Data for Large Shared Data Banks. 13(6):11, 1970.[3] Donald D. Chamberlin and Raymond F. Boyce. SEQUEL: A structured English query lan-

guage. In Proceedings of the 1976 ACM SIGFIDET (Now SIGMOD) Workshop on Data Description,Access and Control - FIDET ’76, pages 249–264, Not Known, 1976. ACM Press.

[4] Ashok K. Chandra and Philip M. Merlin. Optimal implementation of conjunctive queries inrelational data bases. In Proceedings of the Ninth Annual ACM Symposium on Theory of Com-puting - STOC ’77, pages 77–90, Boulder, Colorado, United States, 1977. ACM Press.

[5] Richard Montague. Universal grammar. Theoria, 36(3):373–398, 1970.[6] Richard Montague. English as a Formal Language. In Bruno Visentini, editor, Linguaggi

Nella Societa e Nella Tecnica, pages 188–221. Edizioni di Communita, 1970.[7] Richard Montague. The Proper Treatment of Quanti�cation in Ordinary English. Approaches

to Natural Language, pages 221–242, 1973.[8] Noam Chomsky. Syntactic Structures. Mouton and Co., The Hague, 1957.[9] F. William Lawvere. Functorial Semantics of Algebraic Theories. Proceedings of the National

Academy of Sciences of the United States of America, 50(5):869–872, 1963.[10] Joachim Lambek. Type Grammar Revisited. In Alain Lecomte, François Lamarche, and Guy

Perrier, editors, Logical Aspects of Computational Linguistics, pages 1–27, Berlin, Heidelberg,1999. Springer Berlin Heidelberg.

[11] Joachim Lambek. From Word to Sentence: A Computational Algebraic Approach to Grammar.Open Access Publications. Polimetrica, 2008.

[12] Filippo Bonchi, Jens Seeber, and Pawel Sobocinski. Graphical Conjunctive Queries.arXiv:1804.07626 [cs], April 2018.

[13] Bob Coecke, Giovanni de Felice, Dan Marsden, and Alexis Toumi. Towards Composi-tional Distributional Discourse Analysis. Electronic Proceedings in Theoretical Computer Sci-ence, 283:1–12, November 2018.

[14] Wojciech Buszkowski and Katarzyna Moroz. Pregroup Grammars and Context-free Gram-mars. 2007.

[15] Anne Preller and Joachim Lambek. Free compact 2-categories. Mathematical Structures inComputer Science, 17(2):309–340, 2007.

11

Page 12: Montague Semantics for Lambek Pregroups...Introduction In this paper, we give a semantics to pregroup grammar in regular logic (RL): ... morphisms, the 1970s saw the development of

[16] P. Selinger. A Survey of Graphical Languages for Monoidal Categories. New Structures forPhysics, pages 289–355, 2010.

[17] Antonin Delpeuch. Autonomization of Monoidal Categories. arXiv:1411.3827 [cs, math],November 2014.

[18] R Oehrle. A parsing algorithm for pregroup grammars. Proceedings of Categorial Grammars2004, pages 59–75, January 2004.

[19] Anne Preller. Linear Processing with Pregroups. Studia Logica: An International Journal forSymbolic Logic, 87(2/3):171–197, 2007.

[20] Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide to the Theory ofNP-Completeness. W. H. Freeman & Co., New York, NY, USA, 1990.

[21] A. Carboni and R. F. C. Walters. Cartesian bicategories I. Journal of Pure and Applied Algebra,49(1):11–32, November 1987.

[22] Andrei A. Bulatov. A dichotomy theorem for nonuniform CSPs. arXiv:1703.03021 [cs], March2017.

[23] Mehrnoosh Sadrzadeh, Stephen Clark, and Bob Coecke. The Frobenius anatomy of wordmeanings I: Subject and object relative pronouns. Journal of Logic and Computation, 23:1293–1317, 2013.

[24] Brendan Fong and David I. Spivak. Graphical Regular Logic. arXiv:1812.05765 [cs, math],December 2018.

[25] Bob Coecke, Fabrizio Genovese, Martha Lewis, Dan Marsden, and Alex Toumi. Generalizedrelations in linguistics & cognition. Theoretical Computer Science, 2018.

[26] Giorgio Stefanoni, Boris Motik, and Egor V. Kostylev. Estimating the Cardinality of Con-junctive Queries over RDF Data Using Graph Summarisation. In Proceedings of the 2018WorldWideWebConference onWorldWideWeb,WWW2018, Lyon, France, April 23-27, 2018, pages1043–1052, 2018.

[27] William Zeng and Bob Coecke. Quantum Algorithms for Compositional Natural LanguageProcessing. Electronic Proceedings in Theoretical Computer Science, 221:67–75, August 2016.

[28] Samson Abramsky and Nihil Shah. Relating Structure and Power: Comonadic Semanticsfor Computational Resources. arXiv:1806.09031 [cs], June 2018.

[29] Evan Patterson. Knowledge Representation in Bicategories of Relations. arXiv:1706.00526[cs, math], June 2017.

12