Presupposition
Consider these three sentences:
(1) John stopped smoking.
(2) If John used to smoke, then John stopped smoking.
(3) Either John didn’t use to smoke, or he stopped smoking.
Sentence (1) presupposes that John used to smoke, but neither sentence (2) nor sentence (3) does.1 This is an instance of the pattern of presupposition projection: the way complex expressions inherit or, as in this case, fail to inherit the presuppositions of their parts. Ideally, there would be something relatively simple we could say about why (2) and (3) don’t give rise to the presupposition that (1) does. Saying it has turned out to be surprisingly difficult. In what follows (based on Rothschild 2015), I explain Heim's (1983) treatment of presuppositions in dynamic semantics.
Common grounds and projection rules
One of the marks of linguistic presuppositions is that when a sentence presupposes a proposition, an assertion of the sentence seems to take the proposition for granted. We might describe presuppositions by saying that a sentence, \(S\), presupposes a proposition, \(p\), when an assertion of \(S\) is only felicitous in a context in which the mutual assumptions of the conversational participants include \(p\). This definition, due to Stalnaker (1974) and Karttunen (1974), takes linguistic presupposition to give rise to acceptability conditions on the common ground, the collection of mutually accepted assumptions among conversational participants.

Here is a more careful description of the framework: in a conversation any utterance is made against the common ground, which we model as the set of worlds not ruled out by the mutual assumptions of the conversational participants. When one asserts a proposition, \(a\), the normal effect, if the audience accepts the assertion, is the removal of the worlds where \(a\) is false from the common ground. One way of working presuppositions into this framework is to assume that certain sentences are only felicitously asserted in certain common grounds. In particular, we say that if a sentence \(A\) presupposes \(\underline{a}\), then \(A\) is only felicitously assertable in a common ground \(c\) if \(c\) entails \(\underline{a}\), i.e., \(\underline{a}\) is true in every world in \(c\) (which we write as \(c \models \underline a\)). When it is felicitous, the effect of an assertion of \(A\) is to remove certain worlds from the common ground.
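To make the framework concrete, here is a minimal sketch in Python (my own illustration; the representation of worlds and the function names are not from the text). The common ground is a collection of worlds, assertion removes the worlds where the asserted content is false, and the presuppositional acceptability condition is the check that \(c \models \underline a\) before updating.

```python
from itertools import product

# Worlds settle two questions: did John use to smoke, and does he smoke now?
worlds = [
    {"used_to_smoke": u, "smokes_now": s}
    for u, s in product([True, False], repeat=2)
]

def entails(context, proposition):
    """c |= p iff p is true at every world in c."""
    return all(proposition(w) for w in context)

def assert_in(context, content, presupposition=None):
    """Stalnakerian assertion: check any presupposition against the common
    ground, then remove the worlds where the asserted content is false."""
    if presupposition is not None and not entails(context, presupposition):
        raise ValueError("presupposition failure: the assertion is infelicitous here")
    return [w for w in context if content(w)]

# 'John stopped smoking': presupposes he used to smoke, asserts he doesn't now.
used_to_smoke = lambda w: w["used_to_smoke"]
no_longer_smokes = lambda w: not w["smokes_now"]

c = [w for w in worlds if w["used_to_smoke"]]   # a common ground entailing the presupposition
print(assert_in(c, no_longer_smokes, presupposition=used_to_smoke))
```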
In this framework, due to Stalnaker and Karttunen, the projection problem is the problem of defining what conditions complex sentences put on the common ground in terms of what conditions their parts do. Below are some sample rules we could use to describe the projection behavior in this framework:
(4) \(b \land A\) is acceptable in \(c\) iff \(c \models b \to \underline{a}\)
(5) \(b \lor A\) is acceptable in \(c\) iff \(c \models \lnot b \to \underline{a}\)
(6) \(b \to A\) is acceptable in \(c\) iff \(c \models b \to \underline{a}\)
We can apply these rules to examples such as the following:
(7) John used to smoke and he’s stopped.
(8) John didn’t use to smoke, or he’s stopped.
(9) If John used to smoke, then he’s stopped.
According to rules (4) to (6), the presuppositions of sentences (7) to (9) are trivial. For instance, by rule (4), the presupposition of (7) is that if John used to smoke, then he used to smoke. Since this is trivially true, the entire sentence is correctly predicted not to presuppose anything.
These rules can be elaborated into general rules that predict the presupposition of any complex sentence, given the presuppositions of its parts. Such a set of rules would essentially be the filtering rules developed by Karttunen (1973). There is some debate over the empirical merits of these rules, but I want to put this aside here.2
Suppose rules along the lines of (4) to (6) suffice to describe the pattern of presupposition projection. Merely stating these rules fails entirely to explain why the pattern of presupposition projection can be so described. Heim (1983) was a landmark paper partly because it gave a semantics of presuppositional expressions (and complexes formed out of these) from which these rules of presupposition projection follow. I will outline her account and discuss a major criticism of it, due to Scott Soames and Mats Rooth. They argued that Heim’s semantics has features which effectively amount to stipulations of presupposition projection properties (Soames 1982; Heim 1990; Schlenker 2008).
Dynamic semantics
I am going to present Heim’s propositional dynamic semantics in a non-standard way, as this will facilitate some of the later discussion.3 While the basic ideas may be familiar to some readers, it is worth skimming through this section to get a sense of the notation.
The major change in Heim’s dynamic semantics, from the Stalnakerian framework discussed above, is that the meanings of sentences are no longer propositions, sets of possible worlds, but instead ways of changing the common ground. Thus, a sentence has as its semantic value a function from sets of possible worlds to sets of possible worlds (i.e. a function with domain \(\mathcal{P}(W)\) and range \(\mathcal{P}(W)\)).4
Using this kind of semantic value we can reproduce the Stalnakerian treatment of assertion. Instead of having a sentence \(S\) denote a set of possible worlds \(p\) we have the sentence denote the function that goes from a common ground \(c\) to the intersection of \(p\) and \(c\). In other words, a sentence denotes a function that captures what it is to update any common ground with the sentence. Heim named this sort of function a context change potential, or CCP for short.
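As a small illustration (my own encoding, not Heim's notation), the CCP of a non-presuppositional sentence can be written as the function that intersects any common ground with the proposition the sentence classically expresses:

```python
def ccp_of(proposition):
    """Lift a classical proposition (a set of worlds) to its context change
    potential: the function taking a common ground c to c ∩ p."""
    def update(context):
        return context & proposition
    return update

# Pretend worlds are just the labels 0-7.
p = {0, 1, 2, 3}            # the proposition the sentence expresses
S = ccp_of(p)
print(S({1, 3, 5, 7}))      # {1, 3}: the Stalnakerian update of that common ground
```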
Much of the allure of dynamic semantics, in particular the treatment of donkey anaphora, comes from its treatment of variables which a propositional fragment cannot capture. However, most of Heim’s treatment of presupposition projection can be expressed in a propositional fragment. For now I will only discuss the propositional case, and introduce variables and quantifiers later in §[quantification].
Presuppositional meanings are encoded by partial functions from contexts to contexts. Consider a sentence like John stopped smoking. In a classical semantics we would assign this sentence as its meaning the set of possible worlds where John used to smoke and doesn’t any more. However, in a partial, dynamic semantics we assign this sentence a partial function, \(f: \mathcal{P}(W)\to \mathcal{P}(W)\), such that:
\(f(x)\) is defined iff John used to smoke in all worlds in \(x\)
where defined, \(f(x) = \{ w \in x: \) John no longer smokes in \(w \}\).
Since John stopped smoking is not defined when the context does not entail that John used to smoke, it is infelicitous in such a context. Thus, the partiality of the CCPs captures their presuppositional behavior.
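Here is that partial CCP as code (again my own encoding: worlds are pairs recording whether John used to smoke and whether he smokes now, and undefinedness is signalled by returning None):

```python
from itertools import product

# Worlds: (John used to smoke?, John smokes now?)
W = set(product([True, False], repeat=2))

def stopped_smoking(context):
    """Partial CCP for 'John stopped smoking'."""
    if not all(used for (used, _now) in context):
        return None                                    # undefined: context doesn't entail he used to smoke
    return {(u, n) for (u, n) in context if not n}     # keep the worlds where he no longer smokes

entailing_c = {w for w in W if w[0]}          # this common ground entails that John used to smoke
open_c = W                                    # this one leaves the question open
print(stopped_smoking(entailing_c))           # {(True, False)}
print(stopped_smoking(open_c))                # None: the update is undefined, i.e. infelicitous
```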
It is helpful to note that Heim’s treatment of presuppositions as partially defined CCPs is technically similar to the older tradition of modeling presuppositions with a trivalent semantics. On a trivalent semantics each sentence can be true in some worlds, false in others, and undefined in others. Typically, we say that if a sentence \(S\) has a presupposition failure, then it is neither true nor false. So John stopped smoking has the following truth-condition:
John stopped smoking is true iff John used to smoke and he doesn’t any longer.
John stopped smoking is false iff John used to smoke and he still smokes.
John stopped smoking is neither true nor false iff John didn’t use to smoke.
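The same truth conditions can be written out directly, with None standing in for "neither true nor false" (the encoding is mine, not the text's):

```python
def stopped_smoking_truth_value(used_to_smoke, smokes_now):
    """Trivalent truth value of 'John stopped smoking' at a world."""
    if not used_to_smoke:
        return None            # presupposition failure: neither true nor false
    return not smokes_now      # true iff he used to smoke and no longer does

print(stopped_smoking_truth_value(True, False))   # True
print(stopped_smoking_truth_value(True, True))    # False
print(stopped_smoking_truth_value(False, False))  # None
```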
Stalnaker (1973) proposed that the following pragmatic rule should govern the assertion of such sentences:5 assert \(S\) in a context \(c\) only if \(S\) is either true or false at every world in \(c\). If we followed this rule, then \(S\) could only result in a felicitous update of \(c\) if \(c\) entails that \(S\) is either true or false. Thus, a trivalent semantics does the same basic thing a CCP semantics does: it formally encodes the presuppositions of sentences in terms of definedness conditions.
So far, dynamic semantics looks like a different technical framework for expressing what a trivalent semantics does. The interest comes when we introduce the compositional rules for complex sentences. Before we do that, however, we need to state the details of the semantics in a more precise way. We could simply give a semantics for sentences which assigns as values not propositions, but CCPs. However, to facilitate the later discussion, I will introduce a language that includes not just CCPs but also formulas representing contexts or common grounds. So this language will include two parts: 1) the context part for formulas representing common grounds and 2) the CCP part for sentences expressing context change potentials. Properly speaking, then, the only part of the formal language that corresponds to the actual spoken (or written) language is the CCP-part. However, this is just a notational convenience, not itself a substantive assumption.
Syntax
- lower-case letters \(a, b, c\ldots\) are atomic sentences (these will be used to model contexts)
- upper-case letters \(A, B, C\ldots\) are atomic CCPs (these represent sentences in human language)
- the set of CCPs is defined as follows:
    - any atomic CCP is a CCP
    - if \(\phi\) and \(\psi\) are CCPs then so are \(\lnot \phi\), \(\phi \land \psi\), \(\phi \lor \psi\), and \(\phi \to \psi\)
- the set of sentences is defined as follows:
    - any atomic sentence is a sentence
    - if \(\alpha\) and \(\beta\) are sentences then so are \(\alpha \land \beta\), \(\alpha \lor \beta\), and \(\alpha \backslash \beta\)6
    - if \(\alpha\) is a sentence and \(\phi\) is a CCP then \(\alpha [\phi]\) is a sentence
As noted, the complex sentences in this language represent contexts, including contexts that have been combined or updated by CCPs in various ways. So, for instance, \(c[A \land B]\) represents the update of the context \(c\) by the complex CCP \(A \land B\). It may seem that syntactic rules for combining contexts to get such formulas as \(a \land b\) are pointless, since contexts themselves are not syntactic objects, but they will come in handy for giving the semantics of complex CCPs.
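For readers who find it helpful, here is one way to encode the two-sorted syntax as data (the class names and representation are mine; only the grammar itself comes from the text):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SentAtom:
    """Lower-case atomic sentences a, b, c, ... (used to model contexts)."""
    name: str

@dataclass(frozen=True)
class CCPAtom:
    """Upper-case atomic CCPs A, B, C, ... (representing natural-language sentences)."""
    name: str

@dataclass(frozen=True)
class CCPOp:
    """Complex CCPs: ¬φ, φ ∧ ψ, φ ∨ ψ, φ → ψ."""
    op: str                      # 'not', 'and', 'or', or 'if'
    left: object
    right: Optional[object] = None

@dataclass(frozen=True)
class SentOp:
    """Complex sentences: α ∧ β, α ∨ β, α \\ β."""
    op: str                      # 'and', 'or', or 'minus'
    left: object
    right: object

@dataclass(frozen=True)
class Update:
    """α[φ]: a sentence (context) updated by a CCP."""
    context: object
    ccp: object

# For instance, the formula c[A ∧ B]:
example = Update(SentAtom('c'), CCPOp('and', CCPAtom('A'), CCPAtom('B')))
```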
Partial semantics
I will use the denotation brackets, “\({[\hspace{-.02in}[]\hspace{-.02in}]}\)”, to designate the semantic value of a sentence or a CCP. An interpretation \(I\) sets the semantic values for both the atomic sentences and the atomic CCPs, while the semantic values of complex formulas are given by recursive semantic rules. For every atomic sentence \(\alpha\), \({[\hspace{-.02in}[\alpha]\hspace{-.02in}]}_I\) is a set of possible worlds (i.e. a subset of \(W\)). For every atomic CCP \(\alpha\), \({[\hspace{-.02in}[\alpha]\hspace{-.02in}]}_I\) is a (possibly partial) function from sets of possible worlds to sets of possible worlds (i.e. from \(\mathcal{P}(W)\) to \(\mathcal{P}(W)\)).7
The semantic values of all complex sentences, as we will see, are sets of possible worlds. A sentence \(\alpha\) entails \(\beta\) (which we write \(\alpha \models \beta\)) iff on every interpretation, \(I\), \({[\hspace{-.02in}[\alpha]\hspace{-.02in}]}_I \subseteq {[\hspace{-.02in}[\beta]\hspace{-.02in}]}_I\). The semantic values of all complex CCPs, which we will define later, are partial functions from \(\mathcal{P}(W)\) to \(\mathcal{P}(W)\); we will not need entailment relations for these.
Let us first discuss the semantic value of sentences of the form \(\alpha [\psi]\) where \(\alpha\) is a sentence and \(\psi\) is an atomic CCP. Our semantic rule for CCP application is functional application:
- \({[\hspace{-.02in}[\alpha [\psi]]\hspace{-.02in}]}_I\) = \({[\hspace{-.02in}[\psi]\hspace{-.02in}]}_I({[\hspace{-.02in}[\alpha]\hspace{-.02in}]}_I)\)
We assume (naturally) that \({[\hspace{-.02in}[\alpha [\psi]]\hspace{-.02in}]}_I\) is defined iff both \({[\hspace{-.02in}[\psi]\hspace{-.02in}]}_I\) and \({[\hspace{-.02in}[\alpha]\hspace{-.02in}]}_I\) are defined and the latter is in the domain of the former. (Since presuppositions arise because of undefinedness, such assumptions matter.)
We will also give recursive semantic rules for the connectives when they apply to sentences.8 These are as expected:
- \({[\hspace{-.02in}[\alpha \land \beta]\hspace{-.02in}]}_I\) is \(\{ w: w \in {[\hspace{-.02in}[\alpha]\hspace{-.02in}]}\) and \(w \in {[\hspace{-.02in}[\beta]\hspace{-.02in}]} \}\)
- \({[\hspace{-.02in}[\alpha \lor \beta]\hspace{-.02in}]}_I\) is \(\{ w: w \in {[\hspace{-.02in}[\alpha]\hspace{-.02in}]}\) or \(w \in {[\hspace{-.02in}[\beta]\hspace{-.02in}]} \}\)
- \({[\hspace{-.02in}[\alpha \backslash \beta]\hspace{-.02in}]}_I\) is \(\{ w: w \in {[\hspace{-.02in}[\alpha]\hspace{-.02in}]}\) and \(w \not \in {[\hspace{-.02in}[\beta]\hspace{-.02in}]} \}\)
We assume here that for an arbitrary binary connective \(*\), \({[\hspace{-.02in}[\alpha * \beta]\hspace{-.02in}]}\) is defined iff both \({[\hspace{-.02in}[\alpha]\hspace{-.02in}]}\) and \({[\hspace{-.02in}[\beta]\hspace{-.02in}]}\) are defined. (This is the standard weak Kleene treatment of combinations of partially defined formulas, something that is only implicit in Heim’s original papers.)
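In code, with None again standing in for "undefined", the sentence-level clauses and the weak Kleene convention look like this (my own encoding):

```python
def conj(a, b):
    """⟦α ∧ β⟧: intersection; undefined if either part is undefined."""
    return None if a is None or b is None else a & b

def disj(a, b):
    """⟦α ∨ β⟧: union; undefined if either part is undefined."""
    return None if a is None or b is None else a | b

def minus(a, b):
    """⟦α \\ β⟧: set difference; undefined if either part is undefined."""
    return None if a is None or b is None else a - b

def apply_ccp(a, f):
    """⟦α[ψ]⟧ = ⟦ψ⟧(⟦α⟧); undefined if ⟦α⟧ is undefined or ⟦ψ⟧ is undefined
    at ⟦α⟧ (a partial CCP is assumed to return None where it is undefined)."""
    return None if a is None else f(a)
```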
This semantics does not cover the entire language since we have only given a semantics for sentences formed without the use of the recursive syntax for CCPs: we have no way of handling any formula that includes a complex CCP such as \(a[A \land B]\) or \(a[\lnot A]\). Complex CCPs are where the action is: I turn now to Heim’s treatment.
Semantics of complex CCPs
On my formulation of Heim’s semantics for complex CCPs, which is very close to her original treatment, their meaning is defined recursively in terms of the semantics of the language already given.
- If \(\alpha\) is a sentence and \(\phi\) and \(\psi\) are CCPs then:
    - \(\alpha[\lnot \phi] = \alpha \backslash \alpha[\phi]\)
    - \(\alpha[\phi \land \psi] = (\alpha[\phi])[\psi]\)[hconj]
    - \(\alpha[\phi \lor \psi] = \alpha[\phi] \lor (\alpha[\lnot \phi])[\psi]\)
    - \(\alpha[\phi \to \psi] = \alpha[\lnot \phi] \lor (\alpha[\phi])[\psi]\)
Note: the equals sign designates equality of semantic value, not syntactic identity; semantic evaluation brackets are suppressed here and later for readability.
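To see what the recursive clauses amount to, here is a sketch of them as code (my encoding, continuing the earlier conventions: a CCP is a function from a set of worlds to a set of worlds, returning None where it is undefined; the combinator names are mine):

```python
def NOT(phi):
    """α[¬φ] = α \\ α[φ]"""
    def ccp(a):
        u = phi(a)
        return None if u is None else a - u
    return ccp

def AND(phi, psi):
    """α[φ ∧ ψ] = (α[φ])[ψ]"""
    def ccp(a):
        u = phi(a)
        return None if u is None else psi(u)
    return ccp

def OR(phi, psi):
    """α[φ ∨ ψ] = α[φ] ∨ (α[¬φ])[ψ]"""
    def ccp(a):
        left = phi(a)
        negated = NOT(phi)(a)
        right = None if negated is None else psi(negated)
        return None if left is None or right is None else left | right
    return ccp

def IF(phi, psi):
    """α[φ → ψ] = α[¬φ] ∨ (α[φ])[ψ]"""
    def ccp(a):
        left = NOT(phi)(a)
        updated = phi(a)
        right = None if updated is None else psi(updated)
        return None if left is None or right is None else left | right
    return ccp
```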
Assessment
These rules complete the semantics for the language as repeated applications of them will yield an interpretation of any formula. Two main properties recommend this semantics for complex CCPs:
1. It gets the truth conditions of complex sentences correct.
2. The rules of presupposition projection fall out of it.
Seeing how the proposal gets the truth conditions right requires looking at an example of a complex CCP. Consider the CCPs we should assign to It stopped raining, \(R\), and John is tall, \(J\):
\({[\hspace{-.02in}[R]\hspace{-.02in}]}\) = function \(f\) s.t. \(f(p)\) is defined if and only if \(p\) is a set of possible worlds in all of which it used to rain, and when defined it returns \(\{ w \in p:\) it doesn’t rain now in \(w\}\)
\({[\hspace{-.02in}[J]\hspace{-.02in}]}\) = function \(g\) that takes any set of possible worlds \(p\) and returns \(\{ w \in p:\) John is tall in \(w\}\)
Now we can ask what happens when we update a context \(c\) with the complex CCP \(R \land J\). Applying ([hconj]) we get: \(c[R \land J] =(c[R])[J] = g(f(c))\). When defined \(g(f(c))\) = \(\{ w \in c:\) it doesn’t rain now in \(w\) and John is tall in \(w\}\). When defined, this is exactly the context you would get when you update it with the propositions that it stopped raining and that John is tall. So the rule for conjunction allows complex CCPs to mimic the effect on the common ground of adding complex sentences in a classical semantics. Similar remarks apply to the other definitions above (as long as we understand the conditional as the material conditional).
With regard to point (2) above, the question is what the definedness conditions of complex CCPs are in terms of the definedness conditions of their parts. Each of Heim’s semantic rules above uniquely determines a definedness condition. Using her rule for conjunction, \(\alpha[\phi \land \psi]\) = \((\alpha[\phi])[\psi]\), we can see that \(c[R \land J]\) is defined iff \((c[R])[J]\) is defined, which it is iff \(g(f(c))\) is defined. Given that \(f\) is a partial function and \(g\) is a total function, the only way this can fail to be defined is if \(f(c)\) is not defined. By definition \(f(c)\) is defined iff \(c\) only includes worlds where it used to rain. This matches the predictions of standard accounts: for the sentence not to have a presupposition failure, \(c\) must include the information that it used to rain. If we switch the order of \(R\) and \(J\) we get the standard Karttunen prediction that \(c[J \land R]\) is defined if and only if in every world in \(c\) in which John is tall it used to rain.
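The same computation can be run in code (my own encoding of the worlds; the conjunction combinator is the one sketched above). The last two lines illustrate the order asymmetry just described:

```python
from itertools import product

# Worlds: (it used to rain?, it rains now?, John is tall?)
W = set(product([True, False], repeat=3))

def R(context):
    """'It stopped raining': defined only if it used to rain throughout the context."""
    if not all(w[0] for w in context):
        return None
    return {w for w in context if not w[1]}

def J(context):
    """'John is tall': a total CCP."""
    return {w for w in context if w[2]}

def AND(phi, psi):
    """alpha[phi ∧ psi] = (alpha[phi])[psi], as in the conjunction clause above."""
    def ccp(a):
        u = phi(a)
        return None if u is None else psi(u)
    return ccp

c_good = {w for w in W if w[0]}      # common ground entails it used to rain
print(AND(R, J)(c_good))             # worlds where it used to rain, doesn't now, and John is tall
print(AND(R, J)(W))                  # None: W doesn't entail R's presupposition

# Order asymmetry: every 'John is tall' world below is a 'used to rain' world,
# so J ∧ R is defined even though R ∧ J is not.
c_mixed = {w for w in W if (not w[2]) or w[0]}
print(AND(J, R)(c_mixed))            # defined: the first conjunct filters the context so R's presupposition is met
print(AND(R, J)(c_mixed))            # None
```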
If we work through the predictions for all the connectives we get standard predictions, ones that capture the generalizations in §[psrules]. I will call the generalizations about presupposition projection that follow from Heim’s definedness conditions the Karttunen/Heim projection rules.9
Bibliography
Beaver, David. 2001. Presupposition and Assertion in Dynamic Semantics. CSLI. https://webspace.utexas.edu/dib97/silli.pdf.
Beaver, David, and Bart Geurts. 2011. “Presupposition.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Summer 2011. http://plato.stanford.edu/entries/presupposition/.
Geurts, Bart. 1996. “Local Satisfaction Guaranteed: A Presupposition Theory and Its Problems.” Linguistics and Philosophy 19: 259–94. doi:10.1007/BF00628201.
Heim, Irene. 1983. “On the Projection Problem for Presuppositions.” West Coast Conference on Formal Linguistics 2: 114–25.
———. 1990. “Presupposition Projection.” In Reader for the Nijmegen Workshop on Presupposition, Lexical Meaning, and Discourse Processes, edited by R. van der Sandt. University of Nijmegen. http://semanticsarchive.net/Archive/GFiMGNjN/.
Karttunen, Lauri. 1973. “Presuppositions of Compound Sentences.” Linguistic Inquiry 4 (2): 169–93. http://www.jstor.org/stable/4177763.
———. 1974. “Presupposition and Linguistic Context.” Theoretical Linguistics 1 (1-3): 181–93. doi:10.1515/thli.1974.1.1-3.181.
Rothschild, Daniel. 2015. “Explaining Presupposition Projection with Dynamic Semantics.” Semantics and Pragmatics 4 (3): 1–43.
Schlenker, Philippe. 2008. “Be Articulate: A Pragmatic Theory of Presupposition Projection.” Theoretical Linguistics 34 (3): 157–212. doi:10.1515/THLI.2008.013.
Soames, Scott. 1982. “How Presuppositions Are Inherited: A Solution to the Projection Problem.” Linguistic Inquiry 13 (3): 483–545. http://www.jstor.org/stable/4178288.
———. 1989. “Presuppositions.” In Handbook of Philosophical Logic, edited by D. Gabbay and F. Guenther, IV:553–616. Dordrecht.
Stalnaker, Robert. 1973. “Presuppositions.” Journal of Philosophical Logic 2 (4): 447–57. doi:10.1007/BF00262951.
———. 1974. “Pragmatic Presuppositions.” In Semantics and Philosophy, edited by Milton K. Munitz and Peter K. Unger. NYU.
I am assuming a basic familiarity with the notion of presupposition as currently used within the semantics community. See e.g. Soames (1989), Beaver (2001), and Beaver and Geurts (2011).↩
There is a long tradition that argues against these conditional presuppositions (most notably Geurts 1996).↩
There are many changes from Heim’s original paper (1983). Most significantly: the notation is more in line with contemporary usage, presuppositions are modeled explicitly as partially defined functions, and letters representing contexts are brought into the object language.↩
Notation: \(W\) denotes the set of all possible worlds, and for any set \(X\), \(\mathcal{P}(X)\) denotes the set of all subsets of \(X\), i.e. the powerset of \(X\).↩
See Soames (1989) for an interesting criticism of this pragmatic rule. Soames’s most important point is that if we use trivalence to capture vagueness as well as presupposition failure, this rule predicts that a vague sentence has non-trivial presuppositions. Soames argues convincingly that this is a bad prediction.↩
Throughout, I assume standard parenthetical notation to mark order of operations and suppress parentheses when they are unnecessary.↩
Hence \(I\) can be specified by a triplet, \(\langle W, S, C \rangle\), where \(W\) is a set of possible worlds, \(S\) is a function from atomic sentences to subsets of \(W\), and \(C\) is a function from atomic CCPs to partial functions from \(\mathcal{P}(W)\) to \(\mathcal{P}(W)\). I will often suppress mention of \(I\).↩
Despite using the same symbols for connectives joining sentences and connectives joining CCPs, these are different connectives with different semantics.↩
This label is not very accurate for disjunction: Karttunen, in fact, made different predictions, while Heim (1983) does not discuss the case of disjunctions. I discuss disjunction further in §[disdisc].↩