Phrase-structure rules

So sentences are not just strings of words, but are composed of a series of constituents which are themselves made up of words. It turns out that these constituents generally correspond to phrases built around specific types of word, what we call lexical categories or, more traditionally, parts of speech.

The phrases that are built around different lexical categories are different in a lot of ways, but in many others they are very similar. We will look at the basics here.

For example, a noun is a single word (or compound).

 N | dog

A noun phrase is a constituent of a sentence that is built around a noun (which is what we call the head of the phrase), with other elements such as determiners and adjectives, or a relative clause or other modifier.

NP = Det + N (the dog; note that many different words could fit this basic template)

NP = Det + Adj + N (the big dog, etc.)

We use tree diagrams like these to encode constituency, to represent the interrelations of the individual words. They are created -- in the speaker's mind, or in a computer model of the speaker's knowledge -- by means of general principles called phrase structure rules. These determine what kinds of sentence structures are possible in a language.

The phrase structure rules that permit these configurations are as follows:

NP ---> Det N

NP ---> Det Adj N

We can imagine many other variations on these noun phrases as well, of course.

the dog

the big dog

a big brown dog

big angry brown dogs

A way to unify these different types of noun phrase is to write a more complex phrase structure rule.

NP ---> Det Adj* N

The asterisk means "zero or more": an NP consists of a determiner, any number of adjectives (including none), and a noun.
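One way to see what the starred rule licenses is to treat it as a pattern over category labels. The following is only a sketch: the tiny lexicon and the function name `is_np` are invented for this illustration, and the rule is encoded as a regular expression over sequences of category labels.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch): word -> lexical category.
LEXICON = {
    "the": "Det", "a": "Det",
    "big": "Adj", "brown": "Adj", "angry": "Adj",
    "dog": "N", "dogs": "N",
}

# NP ---> Det Adj* N, written as a pattern over space-separated category labels.
NP_RULE = re.compile(r"Det( Adj)* N")

def is_np(phrase):
    """Return True if the phrase's category sequence matches Det Adj* N."""
    words = phrase.split()
    if any(w not in LEXICON for w in words):
        return False
    tags = " ".join(LEXICON[w] for w in words)
    return NP_RULE.fullmatch(tags) is not None

print(is_np("the dog"))          # category sequence: Det N
print(is_np("a big brown dog"))  # category sequence: Det Adj Adj N
print(is_np("big the dog"))      # category sequence: Adj Det N
```

As written, the rule requires a determiner, so this check rejects a phrase like big angry brown dogs; covering it would take a further adjustment to the rule.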

To account for phrases like the following, we refer to an Adjective Phrase rather than just an adjective:

[ a ]Det [ very big ]AdjP [ slightly brownish ]AdjP [ dog ]N

So, correcting the rule again:

NP ---> Det AdjP* N

Now, sometimes a noun phrase will be modified by another kind of constituent, a prepositional phrase, as in the dog in the yard.

The phrase structure rule for this structure is rather simple:

NP ---> NP PP

Simple as it is, such a rule demonstrates one of the key properties of language, which is that a single category can appear on both sides of the arrow of a phrase structure rule. This is called recursion, and is the main way that a finite grammar can derive an infinite number of structures.

We also need to "spell out" the prepositional phrase:

PP ---> P NP

Because we already have a rule for NP, we can apply the rules in succession.



NP
---> NP PP
---> NP P NP
---> Det AdjP* N P Det AdjP* N

Together these rules will produce the structures for complex NPs such as:

[ dogs ] [ on tables ]

[ the young brown dog ] [ under the big green tree ]
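To make the recursion concrete, here is a minimal sketch of a generator that applies NP ---> NP PP and PP ---> P NP at random. The word lists, the function name `expand`, and the `depth` budget are all assumptions made for this illustration; the budget exists only so that the program stops, since the grammar itself places no limit on how many times the recursive rule can apply.

```python
import random

# Toy grammar in the spirit of the rules above (adjectives omitted for brevity).
RULES = {
    "NP": [["Det", "N"], ["NP", "PP"]],  # the second option is the recursive rule
    "PP": [["P", "NP"]],
}
# Invented word lists, purely for illustration.
WORDS = {
    "Det": ["the", "a"],
    "N": ["dog", "tree", "yard"],
    "P": ["in", "under", "near"],
}

def expand(symbol, depth, rng):
    """Expand `symbol` into a list of words, using NP ---> NP PP at most
    `depth` times along any one branch of the tree."""
    if symbol in WORDS:
        return [rng.choice(WORDS[symbol])]
    options = RULES[symbol]
    if symbol == "NP" and depth == 0:
        options = [["Det", "N"]]  # recursion budget used up: no more NP PP
    choice = rng.choice(options)
    next_depth = depth - 1 if choice == ["NP", "PP"] else depth
    words = []
    for child in choice:
        words.extend(expand(child, next_depth, rng))
    return words

rng = random.Random(0)
for _ in range(3):
    print(" ".join(expand("NP", 3, rng)))
```

Raising `depth` allows longer and longer noun phrases from the same two rules, which is the sense in which a finite grammar yields an unbounded set of structures.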

We can also apply the rules successively to produce more than one PP.





This yields a tree with the following shape, given here in bracket notation:

[ [ NP [ P NP ]PP ]NP [ P NP ]PP ]NP

Follow the top node (NP) down to its dependents (NP and PP) and you see the effect of the first rule; keep going down to see the effect of the remaining rules. The tree structure is built by the rules.

We would keep going (applying the NP rule discussed above) to get to the individual words that make up the NPs at the bottom of this tree. For example:

[ the dog ] [ under the tree ] [ with brown fur ]

Here with brown fur modifies the dog just like under the tree does.

But suppose we apply the rules in a different way:





Now we get a different tree, given here in bracket notation:

[ NP [ P [ NP [ P NP ]PP ]NP ]PP ]NP

A phrase matching this structure is the following:

[ the dog ] [ under [ the tree with dead leaves ] ]

Dog buried in a very old grave over which a tree has grown up. It is, possibly, winter.

Notice that here, with dead leaves modifies the tree -- the NP lower in the structure, to which it is adjoined -- and the PP under the tree with dead leaves modifies the dog, because it's adjoined as a single unit.

These two phrases have the same linear order of categories (only the last adjective and noun differ), but different constituency. The possibility of two constituencies for the same string of words is what leads to structural ambiguities.
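The claim that one string can have two constituencies can be checked mechanically. The sketch below enumerates every way of bracketing a sequence using just the two rules NP ---> NP PP and PP ---> P NP. It assumes the basic NPs have already been grouped, and the function name `parses` and the output format are choices made for this illustration, not part of any standard tool.

```python
from functools import lru_cache

# Binary rules from the text: NP ---> NP PP and PP ---> P NP,
# indexed by their right-hand sides.
RULES = {("NP", "PP"): "NP", ("P", "NP"): "PP"}

def parses(items):
    """Return every bracketing of `items`, a sequence of (category, text)
    pairs, that forms a single NP under RULES."""
    items = tuple(items)

    @lru_cache(maxsize=None)
    def span(i, j):
        # All (category, bracketed text) analyses of items[i:j].
        if j - i == 1:
            return (items[i],)
        found = []
        for k in range(i + 1, j):  # try every split point
            for lcat, ltext in span(i, k):
                for rcat, rtext in span(k, j):
                    parent = RULES.get((lcat, rcat))
                    if parent:
                        found.append((parent, f"[ {ltext} {rtext} ]{parent}"))
        return tuple(found)

    return [text for cat, text in span(0, len(items)) if cat == "NP"]

phrase = [("NP", "the dog"), ("P", "under"), ("NP", "the tree"),
          ("P", "with"), ("NP", "dead leaves")]
for tree in parses(phrase):
    print(tree)
```

For a sequence of the form NP P NP P NP the function finds exactly two bracketings: one where the second PP attaches to the whole NP, and one where it attaches to the NP inside the first PP.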

A famous example of structural ambiguity is a joke told by Groucho Marx in Animal Crackers.

One morning I shot an elephant in my pajamas.

How he got into my pajamas I dunno.

The ambiguity here centers on the prepositional phrase in my pajamas: does it modify the noun elephant, or the entire verb phrase? The simple linear order is consistent with either:

 I          shot    an elephant    in my pajamas
 subject    verb    noun phrase    prep phrase

The more reasonable meaning is "I shot an elephant while (I was) in my pajamas." This is parallel to a sentence like I [bought a book [with my credit card]].

 I              shot    an elephant    in my pajamas
 subject        verb    noun phrase    prep phrase
 noun phrase    [ ---------- verb phrase --------- ]

Or, as trees, given here in bracket notation:

A. I [ shot an elephant [ in my pajamas ] ]

B. I shot [ an elephant [ in my pajamas ] ]

Note that in Example A the hunter is in his pajamas, while in Example B the elephant is wearing the hunter's jammies. It's the little bit of extra time required to pull up the two readings that makes us laugh (which is also why young kids will often be mystified and fail to get the joke without an accompanying cartoon).

Also possible, though, is "I shot an elephant that was in my pajamas." This is parallel to a sentence like I bought [a book [with a red cover]].

 I          shot    an elephant    in my pajamas
 subject    verb    det + noun     prep phrase
                    [ ------ noun phrase ----- ]
            [ ---------- verb phrase --------- ]

This corresponds to the tree in Example B above, where we see that the phrase in my pajamas is more closely connected to an elephant than it was in the preceding structure. This closer structural connection is what encodes the idea that it modifies elephant rather than the act of shooting.

So the ambiguity results from which order we decide to apply the phrase structure rules in. Again, a grammar that consisted of simple word strings would be incapable of capturing this.
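The point that a flat word string cannot capture the difference can be illustrated with a small sketch. Trees are written here as nested tuples, which is just one convenient encoding chosen for this example; the names `TREE_A`, `TREE_B`, `leaves`, and `pp_parent` are likewise invented. Reading A assumes a rule of the form VP ---> V NP PP, and reading B uses NP ---> NP PP.

```python
# Trees as (label, child, child, ...) tuples; leaves are plain strings.
TREE_A = ("S", ("NP", "I"),
          ("VP", ("V", "shot"), ("NP", "an elephant"), ("PP", "in my pajamas")))
TREE_B = ("S", ("NP", "I"),
          ("VP", ("V", "shot"),
           ("NP", ("NP", "an elephant"), ("PP", "in my pajamas"))))

def leaves(tree):
    """Flatten a tree into its string of words."""
    if isinstance(tree, str):
        return tree
    return " ".join(leaves(child) for child in tree[1:])

def pp_parent(tree):
    """Return the label of the node directly above the first PP found, or None."""
    if isinstance(tree, str):
        return None
    label, *children = tree
    for child in children:
        if not isinstance(child, str) and child[0] == "PP":
            return label
    for child in children:
        found = pp_parent(child)
        if found:
            return found
    return None

print(leaves(TREE_A) == leaves(TREE_B))      # same word string for both trees
print(pp_parent(TREE_A), pp_parent(TREE_B))  # where the PP attaches in each
```

Both trees flatten to the same sentence; only the node that dominates the PP differs (VP in reading A, NP in reading B), and that difference is exactly what the word string alone leaves out.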

Now, the phrases discussed here illustrate the importance of recursion, hierarchical structure, and the generality of the phrase structure rules -- the same spelling out of NP occurs whether the NP is part of a PP or not.

In principle, any combination of correctly formulated phrase structure rules will yield a possible sentence structure in the language.

A verb phrase is another type of constituent which includes the verb and its complements, such as a direct object, indirect object, and even a sentence. (A complement is what Pinker calls a "role-player" as distinct from a modifier.)

they [ saw me ]

she [ gave the book to me ]

you [ said that you would arrive on time ]

There's a fundamental division in a sentence between the subject (an NP) and the verb phrase (VP). In more philosophical contexts, these are often referred to as subject and predicate (the thing discussed, and what's said about it).

Because of this, the first rule in our phrase structure grammar will look something like this:

S ---> NP VP

That is, a sentence consists of a subject NP, and a verb phrase.

The internal structure of the verb phrase depends on the nature of the verb.

intransitive: verb does not have an object

laugh, frown, die, wait, fall

VP ---> V



transitive: verb has an object (= an NP complement)

see, want, like, find, make

VP ---> V NP



ditransitive: verb has two objects

give, tell, buy, sell, send

VP ---> V NP NP



Another spell-out is VP ---> V NP PP, which yields I gave the book to you, synonymous with I gave you the book.

Often the same verb can belong to more than one class.

I already [ ate ]
I already [ ate the apple ]

She [ told ] (as in "I'm gonna tell!")
She [ told a story ]
She [ told me a story ]
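One simple way to model verb classes is a lexicon that lists, for each verb, the complement sequences it allows. The sketch below is an assumption-laden illustration: the table `FRAMES` and the function `vp_ok` are invented here, and the entries cover only a handful of the verbs and patterns mentioned above.

```python
# Hypothetical frame lexicon: each verb lists the complement sequences it allows.
# An empty list [] is the intransitive frame (VP ---> V).
FRAMES = {
    "laugh": [[]],
    "see":   [["NP"]],
    "give":  [["NP", "NP"], ["NP", "PP"]],
    "eat":   [[], ["NP"]],
    "tell":  [[], ["NP"], ["NP", "NP"]],
}

def vp_ok(verb, complements):
    """Return True if `verb` is listed with exactly this sequence of complements."""
    return list(complements) in FRAMES.get(verb, [])

print(vp_ok("eat", []))              # I already [ ate ]
print(vp_ok("eat", ["NP"]))          # I already [ ate the apple ]
print(vp_ok("tell", ["NP", "NP"]))   # She [ told me a story ]
print(vp_ok("laugh", ["NP"]))        # no transitive frame listed for laugh
```

A verb that belongs to more than one class simply has more than one frame in its entry.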

It's also possible for a verb to take complements that aren't NPs, at least not in any simple sense.

We [ told him the truth ]
We [ told him that we were leaving ]

They [ want the book ]
They [ want to leave ]
They [ want you to leave ]

There are a number of other complications and details with phrase structure that you can learn about if you take a course in syntax, like Ling 150, Introduction to Syntax.


Another interesting property of human language syntax is the phenomenon sometimes referred to as displacement, which refers to situations where an element (a word or constituent) appears in some position other than where we would expect it. We have already seen some examples of this, like topicalization sentences of the following type:

Him I don't like.

Normally, we would expect him to show up after the verb, because it is the direct object, and English direct objects generally follow the verb.

Another example is the passive construction, where the object of the verb becomes the subject, and the subject either disappears or shows up in a prepositional phrase with by:

Col. Mustard killed the Butler.

The Butler was killed (by Col. Mustard).

We can say that the second sentence is the passive of the first. Here again, we have a constituent, in this case the Butler, showing up in a different position than we might expect. It is the subject here, occurring before the verb, but as far as the meaning is concerned, it is still the object, the thing that is being acted upon, not the actor.

There are many ways we could try to account for this state of affairs, but Chomsky has argued that the best way to do it is not to just add extra phrase structure rules. Instead, he introduced the notion of a transformation (which is an adaptation of a somewhat different idea proposed by his teacher, Zellig Harris).

The idea is basically this. We use the phrase-structure rules to derive a basic sentence. Thus, given a simplified version of the rules above:




We can have the following derivation:




NP              V         Det    N
Col. Mustard    killed    the    Butler

Following the work of the phrase structure rules, another portion of the syntax goes to work, which can move elements around. In this instance, we apply the passive transformation, which takes a sentence, demotes the subject, and promotes the object to subject position, yielding the passive sentence The Butler was killed by Col. Mustard.
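As a rough sketch of the idea, a transformation can be thought of as a function from one tree to another. The code below is not a formal statement of the passive transformation from any particular theory: the tuple encoding, the names `passivize` and `leaves`, the one-entry participle table, and the fixed auxiliary was are all simplifying assumptions for illustration.

```python
# Hypothetical lookup from the verb form in the active sentence to its participle.
PARTICIPLES = {"killed": "killed"}

# Active sentence as a (label, child, ...) tuple tree; leaves are strings.
ACTIVE = ("S", ("NP", "Col. Mustard"),
          ("VP", ("V", "killed"), ("NP", "the Butler")))

def leaves(tree):
    """Flatten a tree into its string of words."""
    if isinstance(tree, str):
        return tree
    return " ".join(leaves(child) for child in tree[1:])

def passivize(sentence, keep_agent=True):
    """Promote the object to subject position and demote the subject
    to an optional by-phrase."""
    _, subject, (_, verb, obj) = sentence
    new_vp = ["VP", ("Aux", "was"), ("V", PARTICIPLES[verb[1]])]
    if keep_agent:
        new_vp.append(("PP", ("P", "by"), subject))
    return ("S", obj, tuple(new_vp))

print(leaves(ACTIVE))
print(leaves(passivize(ACTIVE)))
print(leaves(passivize(ACTIVE, keep_agent=False)))
```

The function operates on the structure built by the phrase structure rules rather than on the word string, which is the division of labor described above.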

Now, in order for a theory of this type to be interesting, transformations must be highly constrained. It should not be possible to move anything one likes, because then we would predict all sorts of sentences to be grammatical that in fact are not.

Syntactic theory for the past several decades has been working on figuring out exactly what sort of transformations should be possible, as well as determining which phenomena should be explained by means of transformations and which are better explained through the use of phrase structure rules.

For example, in recent years it has come to be accepted that all transformations can be formulated in terms of single constituents moving around within the tree, not in terms of, say, phrases swapping places or being added randomly.

Another restriction that has been proposed is that constituents only move if they have to in order to satisfy some grammatical principle. One such principle, at least in English, seems to be that all sentences must have a subject. This is demonstrated by the following examples:

a) It seems that all my friends are sick.

b) *Seems that all my friends are sick.

c) My friends seem to all __ be sick.

Sentence a) has two clauses (mini-sentences): the main clause It seems, and the subordinate clause containing that and everything after it. The subject of the subordinate clause is my friends; they are the ones who are sick.

The subject of the main clause is the word it, which actually has no meaning. It does not refer to anything in the world, not even an abstract idea. Apparently, it just shows up here because the sentence needs a subject. If we leave out it, as in sentence b), the sentence is ungrammatical (we mark ungrammatical sentences with an asterisk).

But there is a different way to say essentially the same thing here. We can put the subject of the subordinate clause, my friends, in the subject position of the main clause. This actually constitutes very good evidence that it in sentence a) has no meaning, because we've left it off here without really affecting the meaning of the sentence.

Now, can my friends have started out as the subject of the main clause? Probably not, because as far as the meaning goes, it's still the subject of the subordinate clause. They are still the ones who are sick. So we have another situation where an element shows up in one place when we expect it to be in another. This is therefore standardly analyzed as another instance of a movement transformation.

Note that here we actually have some extra evidence that my friends started out in the subordinate clause and moved up to the main clause: the position of the word all. This word modifies my friends in sentence c) just as it does in sentence a), yet it is separated from that phrase. It thus looks like this sentence started out looking something like sentence b), but then my friends moved up to the main clause, leaving its modifier behind.

Now we have a very attractive possibility for explaining why this movement should have taken place. If it did not, we would get b), which is ungrammatical because the main clause has no subject. The movement is thus forced by the lack of a main clause subject.

In sentence a), on the other hand, movement is unnecessary (and impossible), because the subject requirement in the main clause has been satisfied by the meaningless element it.

The full analysis of sentences like this involves a number of further complications, and in all fairness I should note that recent versions of Chomsky's syntactic theory look rather different from what we've seen here in the details, but the basic ideas are still the same.