Minimalist program

In linguistics, the minimalist program (MP) is a major line of inquiry that has been developing inside generative grammar since the early 1990s, starting with a 1993 paper by Noam Chomsky.[1]

Chomsky presents MP as a program, not as a theory, following Imre Lakatos's distinction.[2] The MP seeks to be a mode of inquiry characterized by the flexibility of the multiple directions that its minimalism enables. Ultimately, the MP provides a conceptual framework used to guide the development of linguistic theory. In minimalism, Chomsky attempts to approach universal grammar from below—that is, proposing the question "what would be the optimal answer to what the theory of I-Language should be?"

For Chomsky, there are minimalist questions, but the answers can be framed in any theory. Of all these questions, the two that play the most crucial role are:[3]

(1) What is language?
(2) Why does it have the properties it has?

Theoretical goals


The MP appeals to the idea that the language ability in humans shows signs of being incorporated under an optimal design with exquisite organization, which seems to suggest that the inner workings conform to a very simple computational law or a particular mental organ. In other words, the MP works on the assumption that universal grammar constitutes a perfect design in the sense that it contains only what is necessary to meet humans' conceptual and physical (phonological) needs.[4]

From a theoretical standpoint, and in the context of generative grammar, the MP draws on the minimalist approach of the principles and parameters program, considered to be the ultimate standard theoretical model that generative linguistics has developed since the 1980s. What this approach suggests is the existence of a fixed set of principles valid for all languages, which, when combined with settings for a finite set of binary switches (parameters), may describe the specific properties that characterize the language system a child eventually comes to attain.[5]

The MP aims to determine how much of the principles and parameters model can be taken as a result of this hypothetical optimal and computationally efficient design of the human language faculty. In turn, more developed versions of the principles and parameters approach provide technical principles from which the MP can be seen to follow.[6]


The MP aims at the further development of ideas involving economy of derivation and economy of representation, which had started to become significant in the early 1990s, but were still peripheral aspects of transformational grammar.[7]

  • Economy of derivation is a principle stating that movements (i.e., transformations) only occur in order to match interpretable features with uninterpretable features. An example of an interpretable feature is the plural inflection on regular English nouns, e.g., dogs. The word dogs can only be used to refer to several dogs, not a single dog, and so this inflection contributes to meaning, making it interpretable. English verbs are inflected according to the number of their subject (e.g., "Dogs bite" vs. "A dog bites"), but this information is only interpretable once a relationship is formed between the subject and the verb, so movement of the subject is required.
  • Economy of representation is the principle that grammatical structures must exist for a purpose, i.e., the structure of a sentence should be no larger or more complex than required to satisfy constraints on grammaticality, which are equivalent to constraints on the mapping between the conceptual/intentional and sensori-motor interfaces in the optimal system that minimalism seeks to explore.

Technical innovations

The exploration of minimalist questions has led to several radical changes in the technical apparatus of transformational generative grammatical theory. Some of the most important are:[8]

  • The generalization of X-bar theory into bare phrase structure (see below).
  • The simplification of representational levels in the grammatical model, eliminating the distinction between deep structure and surface structure in favor of a more explicitly derivational approach.
  • The elimination of the notion of government.
  • The inclusion of two new points of interaction, namely, a "spell-out" point between syntax and the interface with the phonological form, and an additional point of interaction with the logical form.
  • The idea that syntactic derivations proceed by clearly delineated stages called "phases" (see below).

Bare Phrase Structure

A major development of the Minimalist Program inquiry is bare phrase structure (BPS), a theory of phrase structure (structure building operations) developed by Noam Chomsky in 1994.[9] According to the Concise Oxford Dictionary of Linguistics, BPS is a representation of the structure of phrases in which syntactic units are not explicitly assigned to categories.[10] The introduction of BPS has moved the Chomskyan tradition toward dependency grammar, which operates with significantly less structure than most phrase structure grammars.[11] With BPS, Chomsky proposed eliminating the previously used theory of phrase structure, X-bar theory. Bare phrase structure attempts to eliminate unnecessary elements, yielding trees that are simpler and that can also account for variation across languages.[12]


Beginning of Bare Phrase Structure

There are certain fundamental properties that govern the structure of human language:[13]

  1. Hierarchical Structure
    • expressing the properties of phrase structures in languages with regard to linear order[14]
  2. Unboundedness & Discrete Infinity
    • a sentence can contain an unbounded number of words[14]
  3. Endocentricity & Headedness
    • phrases can be constructed based on a head of a phrase[14]
  4. The Duality of Semantics
    • phrases can refer to some earlier step in their phrasal structure[14]

The first fundamental property relates to one of the earliest analyses in structural linguistics, Immediate Constituent (IC) analysis, a procedural approach developed in 1947.[13] IC analysis itself cannot be applied to generative grammar, which opposes the procedural approach, but one insightful concept from it, 'ordered rewriting rules', is incorporated into phrase structure grammar theory. This theory is developed with a modification to the concept of 'vocabulary' (the contrast between non-terminal and terminal symbols) and contains phrase structure rules (PS rules) of the following form:[13]

  • ABC → ADC
    • B = single symbol
    • A, C, D = string of symbols (D = non-null, A and C can be null)
    • A and C are non-null when the environment in which B needs to be re-written as D is specified.

This happens with the insertion of a lexical item into a specific terminal position of a P-marker. 'Lexical insertion' of this kind was later abandoned, with the subcategorization features of the lexicon, developed in 1965, taking its place.[13] Separating phrase structure grammar (PSG) from the lexicon simplifies PS rules from context-sensitive rules (ABC → ADC) to context-free rules (B → D).[13]

  • B → D
    • B = single non-terminal symbol
    • D = string of non-terminal symbols (can be non-null) where lexical items can be inserted with their subcategorization features.[13]

Therefore, a context-free grammar plus a lexicon can express the phrase structure properties of languages in accordance with fundamental property 1.[14]
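The difference between the two rule formats can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the rule encoding, the function name apply_rule, and the toy rules (including the context-sensitive insertion of "drink" before an NP) are hypothetical and do not come from the cited sources.

```python
from typing import List, Tuple

# A rule (A, B, D, C) encodes the rewriting A B C -> A D C.
# A and C are (possibly empty) context strings; B is a single symbol; D is non-empty.
Rule = Tuple[Tuple[str, ...], str, Tuple[str, ...], Tuple[str, ...]]


def apply_rule(string: List[str], rule: Rule) -> List[str]:
    """Rewrite the leftmost B as D if it is flanked by contexts A and C."""
    left, target, replacement, right = rule
    for i, symbol in enumerate(string):
        if symbol != target:
            continue
        before = tuple(string[max(0, i - len(left)):i])
        after = tuple(string[i + 1:i + 1 + len(right)])
        if before == left and after == right:
            return string[:i] + list(replacement) + string[i + 1:]
    return list(string)  # rule not applicable: the string is returned unchanged


# Context-free rules: A and C are null, so the rule is simply B -> D.
S_RULE: Rule = ((), "S", ("NP", "VP"), ())
VP_RULE: Rule = ((), "VP", ("V", "NP"), ())
# Context-sensitive rule (hypothetical): insert "drink" for V only before an NP.
V_RULE: Rule = ((), "V", ("drink",), ("NP",))

derivation = ["S"]
for rule in (S_RULE, VP_RULE, V_RULE):
    derivation = apply_rule(derivation, rule)
print(derivation)  # ['NP', 'drink', 'NP']
```

In this encoding a context-free rule is just the special case in which both context strings are empty, which mirrors the simplification from ABC → ADC to B → D.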

Phrase Structure identifies three major expressions:[14]

  1. Dominance: The hierarchy of a constituent
  2. Labeling: The type of a constituent
  3. Precedence: The linear order of a constituent

Therefore, the labelled hierarchical structure along with the string of elements linearly ordered are conveyed by phrase structure grammar that generates a set of P(hrase)-markers.

The second fundamental property is seen as the most rudimentary property of the shared language capacity. Language is not continuous but discrete: linguistic expressions are built from distinct units, so a sentence contains x words, or x+1 or x-1 words, but never a fractional number of words such as x.1 or x.2. Additionally, language is not bounded in size but infinite: for any sentence of x words, a longer sentence can be constructed. The Standard Theory, created in 1960, included 'generalized transformations' to handle discrete infinity; these embed a structure into another structure of the same type. Generalized transformations were later removed from the Standard Theory, and their role was transferred to PSG with a minor modification: self-embedding arises from recursive symbols on either the left or the right side of PS rules, which allows for non-local recursion.[13]
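Discrete infinity can be illustrated with a single recursive rule. The following is a minimal sketch, assuming a toy grammar in which S rewrites either as "Mary said S" (self-embedding on the right) or as "John left"; the function name embed and the example sentences are hypothetical.

```python
from typing import List


def embed(depth: int) -> List[str]:
    """Apply the recursive rule S -> 'Mary' 'said' S `depth` times,
    then close the derivation with S -> 'John' 'left'."""
    if depth == 0:
        return ["John", "left"]
    return ["Mary", "said"] + embed(depth - 1)


# Sentence length is always a whole number (discreteness) and grows
# without an upper bound as the rule is reapplied (infinity).
lengths = [len(embed(d)) for d in range(4)]
print(lengths)  # [2, 4, 6, 8]
```

Each additional application of the recursive rule yields a longer sentence, so no sentence is the longest one.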

The third fundamental property emerged from the realization that transformations and PS rules are not enough to capture crucial generalizations about headedness and endocentricity in human language structure. Generally, a phrase in human language is endocentric: it is built around one central element, called the 'head', which determines the essential properties of the phrase. Other elements revolve around the head and allow the structure to expand. In 1968, it was observed that this fundamental property cannot be captured by PSG, because PS rules overgenerate: they produce structures that are not actually allowed in human languages because they lack a head ('exocentric' structures). Therefore, another mechanism, X-bar theory, was introduced to capture this property.[13]

The fourth fundamental property concerns the fact that predicate-argument structure is realized in the vicinity of the predicate itself (within a clause), whereas other semantic properties such as scope and discourse are realized at the edge of an expression (most often a sentence). This combination requires a device that can relate two non-sister nodes in a sentence by referring to an earlier step in the derivation of a phrasal structure. However, the history of a derivation cannot be expressed using context-free PSG. Hence a new device, 'grammatical transformation', was introduced to handle the duality of semantics.[13]

The preliminary version of X-bar theory, proposed by Chomsky in 1970, had the following basic ideas:[13]

  1. Each phrase has a head (it is endocentric), and the head projects to a larger phrase[13]
  2. Heads are feature complexes that consist of primitive features[13]
  3. UG postulates a general X-bar schema of the following form, which governs head projection[13]
    • X' → X...
    • X″ → [Spec, X'] X'

Claims 1 and 2 have survived almost entirely in their original forms throughout the development of grammatical theory, unlike claim 3. Claim 1 would later be eliminated in favour of projection-less nodes.[13]

In 1980, the Principles and Parameters (P&P) approach emerged. It marked a move away from rule-based grammars: rules were replaced with multiple modules of UG, such as X-bar theory, Case theory, etc. During this time, PS rules disappeared because they proved to be redundant, since they recapitulate what is in the lexicon. Transformational rules survived, with a few amendments to how they are expressed. Complex traditional rules no longer need to be defined individually; they can be reduced to a general schema called Move-α, which means that anything can be moved anywhere. The only mechanisms that stood the test of time within P&P are X-bar theory and Move-α. Of the fundamental properties mentioned above, X-bar theory accounts for 1 and 2, while Move-α accounts for 3 and 4. A few years later, an effort was made to merge both theories by suggesting that structures are built from the bottom up (using either adjunction or substitution, depending on the target structure):[13]

  1. Features are discharged as soon as heads project.[13]
    • based on the idea that heads are central and phrases are formed around them, with heads projecting their essential features
  2. Iteration is possible at a single bar level.[13]
    • based on the idea that PS composition is infinite
  3. Adjunction is responsible for movement and for building structure.[13]
    • based on the idea that transformations allow for fundamental operations
  4. There is no X-bar schema (no maximal projection at bar levels).[13]
    • based on the claim in 1
  5. Projections are closed by agreement.[13]
    • based on the idea that in some languages (e.g., Japanese) phrases do not 'close' and elements can be added to keep expanding them, which is not the case in English

By 1995, X-bar theory had been eliminated in favour of Chomsky's 'bare phrase structure theory'. As part of the minimalist program, BPS must satisfy the principles of UG (Universal Grammar) by appeal to, at minimum, the two interfaces (the 'conceptual-intentional and sensorimotor systems') or to a third condition that is not specific to language but still satisfies the conditions imposed by the interfaces.[13] The fundamental operation of BPS is 'Merge', which is discussed in the following sections.

Impact of Bare Phrase Structure

As mentioned previously, the syntactic theory of bare phrase structure was proposed by Noam Chomsky in 1994.[9] Since its publication, other linguists have continued to build on this theory. In 2002, Chris Collins continued research on Chomsky's proposal to eliminate labels, backing up Chomsky's suggestion of a simpler theory of phrase structure.[15] Collins proposed that economy features, such as Minimality, govern derivations and lead to a simplified representation. In addition, minimal phrase structure has been formulated as an extension of bare phrase structure and X-bar theory; however, it does not take Minimalist assumptions into account.[12] On this view, a phrase structure theory should include the following seven features in order to be successful:[12]

  1. Avoid non-branching dominance by using as much structure as possible to model constituency
  2. Avoid assumptions of the large optionality in PSRs
  3. Avoid redundancy in labelling to ensure phrases share the category of their heads
  4. Avoid creating a theory that is distinct from X-bar theory
  5. Create distinctions between Xmax and XP as well as between the highest projection and Xmax
  6. Create a theory that accounts for exocentricity
  7. Create a theory that accounts for non-projecting categories

Although bare phrase structure satisfies many of these features, it does not include all of them; other theories have therefore attempted to incorporate all of these features in order to present a successful phrase structure theory.

Comparison of Bare Phrase Structure with X-bar Theory

This theory contrasts with X-bar theory, which preceded it, in the following ways:

  1. BPS is explicitly derivational. That is, it is built from the bottom up, bit by bit. In contrast, X-bar theory is representational: a structure for a given construction is built in one fell swoop, and lexical items are inserted into the structure.
  2. BPS does not have a preconceived phrasal structure, while in X-bar theory every phrase has a specifier, a head, and a complement.
  3. BPS permits only binary branching, while X-bar theory permits both binary and unary branching.
  4. BPS does not distinguish between a "head" and a "terminal", while some versions of X-bar theory require such a distinction.
  5. BPS incorporates features into its structures, such as Xmax and Xmin, while X-bar theory contains levels, such as XP, X', and X.
  6. BPS accounts for cross-linguistic variation, as maximal projections can be perceived at an XP level or an X' level, whereas X-bar theory does not include a maximal projection.

In the chapter "Phrase Structure" of The Handbook of Contemporary Syntactic Theory, Naoki Fukui identified three kinds of syntactic relationships (mentioned previously): (1) Dominance: the hierarchical categorization of the lexical items and constituents of the structure; (2) Labeling: the syntactic category of each constituent; and (3) Linear order (or Precedence): the left-to-right order of the constituents (essentially the existence of the X-bar schemata). Whereas X-bar theory encoded all three relationships, bare phrase structure encodes only the first two.[16]

The main reasoning behind the transition from X-bar theory to BPS is the following:

  1. Eliminating the notion of non-branching domination
  2. Eliminating the necessity of bar-level projections

The examples below show the progression of syntax structure from X-bar theory (the theory preceding BPS), to specifier-less structure.

This tree is drawn according to the principles of X-bar theory, the theory that precedes BPS.
This is a tree of the same sentence as the X-bar theory syntax tree right above, however, this one uses BPS along with selection features.

Dependency Grammar

There is a trend in minimalism that shifts from constituency-based to dependency-based structures. Bare phrase structure, label-less trees, and specifier-less syntax all fall under the dependency grammar umbrella.[17]

To simplify phrase structure, Noam Chomsky proposed what are now known as label-less trees. He argues that category labels are unnecessary and therefore do not need to be included. In the phrase structure tree, the lexical item that is classified as the head becomes its own label, in lieu of a specific category in the projection.[18]

In addition to these label-less trees proposed by Chomsky, non-branching projections in bare phrase structure are recognized through the notions of "minimal" and "maximal" projections. A "minimal" projection is a category that is not a projection; in other words, it does not dominate other lexical items or categories. A "maximal" projection, on the other hand, is one that is unable to project any higher, hence the term.[18]

The introduction of Abney's (1987) "DP hypothesis" gave rise to the development of specifier-less syntax. Under this approach, each lexical item that was initially labelled as a specifier becomes its own phrase, which then introduces a complement.[18]

To complete the development of the dependency grammar, merge is introduced.

Merge and Move

BPS incorporates two basic operations: "merge" and "move". Although there is active debate on exactly how "move" should be formulated, the differences between the current proposals are relatively minute. The following description follows Chomsky's original proposal.


Merge is a function that takes two objects (α and β) and merges them into an unordered set with a label (either α or β, in this case α). The label identifies the properties of the phrase. Merge will always occur between two syntactic objects: a head and a non-head. [14]

Merge (α, β) → {α, {α, β} }

For example, "merge" can operate on the lexical items "drink" and "water" to give "drink water". Note that sometimes people mistakenly claim that the phrase "drink water" behaves more like the verb drink than like the noun water. That is, wherever the verb drink can be put, so too can the phrase "drink water":

(1a) I like to drink.
(1b) I like to drink water.
(2a) Drinking is fun.
(2b) Drinking water is fun.

Furthermore, the phrase "drink water" cannot typically be put in the same places as the noun water:

It can be said, "There's some water on the table", but not "There's some drink water on the table".

However, "drink water" cannot replace drink in the infinitely many sentences in which drink already has a direct object, for instance:

(3a) I like to drink milk.

but not

(3b) *I like to drink water milk.

Despite the existence of such counterexamples, people tend to use the principle of "distributional identity" to explain which of two words will serve as the "head" or "label" of the word combination.

In the Minimalist Program, the phrase is identified with a label. In the case of "drink water", the label is drink since the phrase acts as a verb. For simplicity, this phrase is called a verb phrase (VP). If "cold" and "water" were merged to get "cold water", this would be a noun phrase (NP) with the label "water"; it follows that the phrase "cold water" can appear in the same environments as the noun water in the three test sentences above. So, for drink water, there is the following:

Merge (drink, water) → {drink, {drink, water} }
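The set notation above can be mirrored with immutable sets. This is a minimal sketch, assuming lexical items are represented as plain strings; the function name merge is an illustrative choice, not an established implementation.

```python
def merge(alpha: str, beta: str) -> frozenset:
    """Merge(α, β) -> {α, {α, β}}: an unordered set labelled by α."""
    return frozenset([alpha, frozenset([alpha, beta])])


vp = merge("drink", "water")

# The inner set is unordered, so no linear order of "drink" and "water" is encoded.
print(vp == frozenset(["drink", frozenset(["water", "drink"])]))  # True
# The label "drink" is the member of the outer set that is a lexical item.
print("drink" in vp)  # True
```

Using sets rather than lists reflects the point that Merge encodes dominance and labelling but not precedence.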

This can be represented in a typical syntax tree as follows:


or, with more technical terms, as:


Merge can also operate on structures already built. If it could not, then such a system would predict only two-word utterances to be grammatical. If a new head is merged with a previously formed object (a phrase), the function has the form

Merge (γ, {α, {α, β}}) → {γ, {γ, {α, {α, β}}}}

Here, γ is the label, so that γ "projects" from the label of the head. This corresponds to the following tree structure:


Merge operates blindly, projecting labels in all possible combinations. The subcategorization features of the head then license certain label projections and eliminate all derivations with alternate projections.
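Blind projection followed by licensing can be sketched as a generate-and-filter procedure. The code below is an illustration only: the toy lexicon (CATEGORY, SELECTS) and the function names are hypothetical, and real subcategorization is considerably richer than a lookup table.

```python
from typing import Dict, FrozenSet, List, Set, Union

SO = Union[str, FrozenSet]  # syntactic object: lexical item or {label, {a, b}}

# Hypothetical toy lexicon: the category of each item and the categories it selects.
CATEGORY: Dict[str, str] = {"will": "T", "drink": "V", "water": "N"}
SELECTS: Dict[str, Set[str]] = {"will": {"V"}, "drink": {"N"}, "water": set()}


def label(obj: SO) -> str:
    """A lexical item labels itself; a phrase {label, {a, b}} carries its label."""
    if isinstance(obj, str):
        return obj
    return next(m for m in obj if isinstance(m, str))


def merge_blindly(a: SO, b: SO) -> List[FrozenSet]:
    """Build both candidates: {label(a), {a, b}} and {label(b), {a, b}}."""
    pair = frozenset([a, b])
    return [frozenset([label(x), pair]) for x in (a, b)]


def licensed(candidate: FrozenSet, a: SO, b: SO) -> bool:
    """A projection survives only if its head selects the category of the other object."""
    head = label(candidate)
    other = b if label(a) == head else a
    return CATEGORY[label(other)] in SELECTS[head]


def merge(a: SO, b: SO) -> FrozenSet:
    """Return the unique licensed candidate (raises if none or several survive)."""
    (result,) = [c for c in merge_blindly(a, b) if licensed(c, a, b)]
    return result


vp = merge("drink", "water")   # {drink, {drink, water}}
tp = merge("will", vp)         # {will, {will, {drink, {drink, water}}}}
print(label(vp), label(tp))    # drink will
```

Here the candidate labelled "water" is discarded because "water" selects nothing in the toy lexicon, while "drink" selects a nominal complement and "will" selects a verbal one.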


Chomsky (1993) presents three theories of strong features (Phonetic Form (PF) crash theory, Logical Form (LF) crash theory, Virus theory), which are unique conditions driving an earlier operation in order for overt movement to take place.[19][20]

Moving forward, three minimalist approaches to overt movement were proposed:

  1. Pseudogapping and sluicing, the two ellipsis components, verify the relevance of strong features provided by Chomsky (1993).
  2. The usage of a strong feature enables movement or ellipsis to save a derivation.
    • Raising that is prompted by strong features generates ellipsis.
    • Strong features include the Extended Projection Principle and the split VP hypothesis.
  3. PF crash theory is pertinent through direct or indirect mechanisms.
    • Strong features prompt overt movement. Under the PF crash theory, all of the constituents, not just formal features, must move by means of pied-piping.

Initially, the interaction of Last Resort (LR) and the Uniformity Condition (UC) served as the indicator of the structures provided by bare phrase structure that contain labels and are constructed by Move, as well as of the impact of the Structure Preservation Hypothesis.[21]

The Uniformity Condition requires that a chain remain unchanged with respect to its phrase structure status. It has three functions that contribute to the idea of movement:[21]

  1. It obstructs the movement of a minimal non maximal projection to a specifier.
  2. It obstructs the covert movement of formal features (FF) to a specifier.
  3. It averts a moved non minimal projection from projecting further subsequent to merging with its target.

In order to maintain the desired simplicity of the Minimalist Program, c-command came to be recognized as more optimal and therefore replaced the UC, for the following reasons:

  1. Methodologically, the majority of existing relations have c-command as their foundation.
  2. The condition of c-command on chain links posits a restriction regarding the movement of intermediate projections, unlike the UC.

Last Resort, as mentioned above, is a checking relation that is utilized in the process of movement. According to this property, a feature may move to its target only if the feature, which is moving, enters a checking relation with a feature in the head it is moving to.[21]

For example, D may move to SPEC C, only if the maximal projection and its subsequent projection select for D. [22]

There is a second important property in Move, the Minimal Link Condition, MLC. According to the MLC, a feature can only move to its target if there is no prior feature that is able to move under the Last Resort property and is closer to the target than the original feature. (e.g. C cannot move to target A if B obeys the LR property and is closer to target A than C). [22]
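The interaction of Last Resort and the Minimal Link Condition can be sketched as a filter followed by a minimum. This is a simplified illustration: the Candidate record, the integer distance standing in for structural closeness, and the function name select_mover are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional, Set


class Candidate(NamedTuple):
    name: str
    features: Set[str]  # features this element could check on a target
    distance: int       # hypothetical structural distance to the target (smaller = closer)


def select_mover(target_feature: str, candidates: List[Candidate]) -> Optional[Candidate]:
    """Last Resort: only candidates that can check the target's feature may move.
    Minimal Link Condition: among those, only the closest one may move."""
    eligible = [c for c in candidates if target_feature in c.features]
    return min(eligible, key=lambda c: c.distance) if eligible else None


john = Candidate("John", {"D"}, distance=0)
who = Candidate("who", {"wh"}, distance=1)
what = Candidate("what", {"wh"}, distance=2)

# "John" is closest but cannot check [wh]; "who" blocks the more distant "what".
print(select_mover("wh", [john, who, what]).name)  # who
```

If no candidate can enter a checking relation with the target, the function returns None, which corresponds to no movement taking place.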

Features in Bare Phrase Structure

Bare Phrase Structure contains the following features:

  • X+max: maximal projection
    • lexical category that cannot project to any further point in the tree[21]
  • X+min: minimal projection
    • lexical item that does not make any projections[21]
  • X-max,-min: intermediate projection (in between minimal and maximal projection)
    • complement acting as a sister to a minimal projection[21]
    • specifier acting as a sister to an intermediate projection[21]
  • EPP feature: extended projection principle[23]
  • Subcategorization features: lexical items that represent locality of selection, such as {D, C, T, etc.}[23]
  • Co-indexation markers, such as {k, m, o, etc.}[23]
  • Case markers, such as {NOM = nominative, ACC = accusative, GEN = genitive, DAT = dative}[23]
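Because projection status in BPS is read off the tree rather than stipulated, it can be computed from the structure. The sketch below assumes a tuple encoding of trees (label, daughter, daughter) and treats a node as maximal when its mother bears a different label; the encoding, the function names, and the possibility of a node being both maximal and minimal are assumptions of this example.

```python
from typing import List, Tuple, Union

Node = Union[str, Tuple[str, "Node", "Node"]]  # leaf, or (label, daughter, daughter)


def label_of(node: Node) -> str:
    return node if isinstance(node, str) else node[0]


def classify(node: Node, parent_label: str = "") -> List[Tuple[str, str]]:
    """List (label, status) pairs, top-down. Simplification: two distinct
    occurrences of the same word would be confused by the label comparison."""
    is_min = isinstance(node, str)           # not a projection of anything
    is_max = label_of(node) != parent_label  # does not project any further
    status = {
        (True, True): "+max,+min",
        (False, True): "+max",
        (True, False): "+min",
        (False, False): "-max,-min",
    }[(is_min, is_max)]
    result = [(label_of(node), status)]
    if not is_min:
        for daughter in node[1:]:
            result += classify(daughter, node[0])
    return result


# Toy structure: "she" is a specifier, "water" a complement of the head "drink".
tree: Node = ("drink", "she", ("drink", "drink", "water"))
for entry in classify(tree):
    print(entry)
```

In this toy tree the root is maximal, the inner node labelled "drink" is intermediate, the head "drink" is minimal, and "she" and "water" count as both maximal and minimal because they neither project nor are projections.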

With all of these features, we end up with a complete tree :

Bare Phrase Structure Tree; complete

Attachment in Bare Phrase Structure

In The Minimalist Program (1995), Chomsky outlines two methods of forming structure: adjunction and substitution. The standard properties of segments, categories, adjuncts, and specifiers are easily constructed.[20]

In the general form of a structured tree for adjunction and substitution, α is an adjunct to X, and α is substituted into SPEC, X position. α can raise to aim for the Xmax position, and it builds a new position that can either be adjoined to [Y-X] or is SPEC, X, in which it is termed the 'target'. At the bottom of the tree, the minimal domain includes SPEC Y and Z along with a new position formed by the raising of α which is either contained within Z, or is Z.[20]



Adjuncts are argued to exhibit a different, perhaps simpler, structure in the minimalist program. Before the introduction of Noam Chomsky's BPS, adjunction was traditionally understood to conserve bar-level information, category information, and the headedness of the target (located in the adjoined structure).[24] Based on Chomsky's work, adjunction forms a two-segment object/category. Its label is constructed from the head of S, H(S), but is distinct from it: the label is the ordered pair <H(S), H(S)>, so that L = {<H(S), H(S)>, {α, S}}. This label is not considered a term of the structure that is formed, because it is not identical to the head of S, although it is constructed from it in a trivial way.[20] If α is adjoined to S and S projects, then the resulting structure is L = {<H(S), H(S)>, {α, S}}, where the entire structure replaces S in whatever structure contained it. The head is what projects, so it either is the label or determines the label trivially.[20] Under the BPS account, there is a shift from the traditional definition: the properties of the head are no longer preserved in adjunction structures, and the attachment of an adjunct to a particular XP following adjunction is considered non-maximal. It is also worth noting that such an account is applicable to XPs that involve multiple adjunction.[25]

Below is an example of adjunction in bare phrase structure and in X-bar theory :

Adjunction in bare phrase structure
Adjunction in X-bar theory


Substitution forms a new category, as opposed to adjunction, which forms a two-segment object/category. The new category consists of a head, which serves as the label, and the element being projected: L = {H(S), {α, S}}, where H(S) is the head/label and S is the projected element.[20] Some ambiguities may arise if the raised features, in this case α, contain the entire head and the head is also Xmax.[20]
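The two labelling schemes in this section, {H(S), {α, S}} for substitution and {<H(S), H(S)>, {α, S}} for adjunction, differ only in the form of the label. The following sketch makes that contrast explicit; the string and frozenset encoding and the function names are assumptions of the example.

```python
def head(obj):
    """H(S): a lexical item is its own head; otherwise read the head off the label."""
    if isinstance(obj, str):
        return obj
    lab = next(m for m in obj if not isinstance(m, frozenset))
    return lab[0] if isinstance(lab, tuple) else lab


def substitute(alpha, s):
    """Substitution: L = {H(S), {α, S}}, a new category labelled by the head of S."""
    return frozenset([head(s), frozenset([alpha, s])])


def adjoin(alpha, s):
    """Adjunction: L = {<H(S), H(S)>, {α, S}}, a two-segment category whose
    label is an ordered pair built from the head of S."""
    return frozenset([(head(s), head(s)), frozenset([alpha, s])])


vp = substitute("water", "drink")  # {drink, {water, drink}}
adjoined = adjoin("often", vp)     # {<drink, drink>, {often, {drink, {water, drink}}}}

print(head(vp), head(adjoined))    # drink drink
print(("drink", "drink") in adjoined, "drink" in adjoined)  # True False
```

The last line shows that the pair label of the adjunction structure is built from the head but is not identical to it, whereas substitution uses the head itself as the label.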

Examples of Bare Phrase Structure

  • Here is an example of a bare phrase structure theory syntax tree in English :
Bare phrase structure; English Sentence
  • Here is an example of a bare phrase structure theory syntax tree in French :
Bare phrase structure tree; French Sentence


A "phase" is a syntactic domain first hypothesized by Noam Chomsky in 1998.[26] It is a domain where all derivational processes operate and where all features are checked.[27] A phase consists of a phase head and a phase domain. Once any derivation reaches a phase and all the features are checked, the phase domain is sent to transfer and becomes invisible to further computations.[27]

Exemplification of CP and vP phases

A simple sentence is often decomposed into two phases, CP and vP (see X-bar theory). Chomsky considers vP and CP to be strong phases because they show strong phase effects, relating to the domain for movement, domain for propositional scope, and the domain for reconstruction.[28]

Syntax Tree of simple sentence

Evidence for strong phase: movement

vP and CP can be the focus of pseudo-cleft movement, therefore showing that vP and CP both form a syntactic unit.[29]

The CP in sentence (1), 'that John is bringing the dessert' , can be the focus of pseudo-cleft movement, which is shown in sentence (2).

(1) Mary said [CP that John is bringing the dessert].

(2) What Mary said was [CP that John is bringing the dessert].

The vP in sentence (3), 'arrive tomorrow' , can be the focus of pseudo-cleft movement, which is shown in sentence (4).

(3) Alice will [vP arrive tomorrow].

(4) What Alice will do is [vP arrive tomorrow].

Evidence for strong phase: propositional content

vP is considered a propositional unit because all the theta roles are assigned in vP.[29] A theta role is a term used to identify the relation between a constituent and the predicate that selects it.[23] For example, sentence (1) shows that the verb 'ate' in the vP phase introduces the DP Agent and the DP Theme.

(1) Mary [vP ate the cake].

CP is considered a propositional unit because it is a full clause that has tense and force.[29] For example, sentence (2) shows that the complementizer 'that' in the CP phase introduces the tense and the force of the sentence.

(2) John said [CP that Mary will eat the cake].

Evidence for strong phase: reconstruction

Reconstruction refers to contexts in which a moved constituent must be interpreted in its original position, as if movement had not occurred, in order for binding principles to be satisfied.[30] The edges of the vP and CP phases provide potential reconstruction sites where the binding principles are satisfied. This is evidence that the moved phrase stops at the edges of vP and CP phases.[31]

In sentence (1), according to binding principles, the reflexive himself must be c-commanded by John or Fred, depending on the co-referential relationship. The sentence-initial position in which himself surfaces does not satisfy these requirements, yet the sentence is still grammatical. Therefore, the reflexive must have passed through a reconstruction site, and this is where it is interpreted. The edge of the lower CP phase is the position where these binding requirements are satisfied. The wh-phrase must have stopped at the edge of the CP phase first, and the reflexive himself is interpreted as if it had not moved on to the beginning of the sentence.[29]

(1) [which picture of himself_k/j] did John_k think Fred_j liked __ ?

In sentence (2), the pronoun he must be bound by every student, and the R-expression Mary must not be bound by her. The sentence-initial position in which the phrase 'which of the papers that he gave Mary' surfaces does not satisfy all the binding requirements. The phrase must have passed through a reconstruction site, and this is where it is interpreted. The edge of the vP phase is the only position where the binding requirements are satisfied. The observed binding possibility shows that the wh-phrase has to stop at the edge of the vP phase, which is where the phrase is interpreted.[31]

(2) [which of the papers that he_k gave Mary_j] did every student_k ask her_j to read X carefully?

Exemplification of phase impenetrability condition (PIC)

Chomsky theorized that syntactic operations must obey the Phase Impenetrability Condition (PIC), which has been formulated in various ways in the literature. Under the PIC, movement of a constituent out of a phase is (in the general case) only permitted if the constituent has first moved to the left edge of the phase (XP). The edge of a head X is defined as the residue outside of X', that is, the specifiers of X and the adjuncts to XP.[32] The Extended Projection Principle (EPP) feature on the heads of phases triggers the intermediate movement steps to phase edges.[29]

Wh-movement in English:[29]

Looking at wh-movement in English, we can observe successive cyclic movement that obeys the PIC.

(1) [CP Who did you [vP see who]]?

Sentence (1) has two phases (vP and CP). To generate this sentence, ‘who’ has to move from the vP phase (the lower phase) to the CP phase (the higher phase). This must occur in two steps: ‘who’ starts off inside the complement of the vP phase head and therefore cannot move out of the phase directly under the PIC. It must first move to the edge of the vP phase.

  • Step 1: First, ‘who’ must move from the complement position of VP to the edge of vP. The EPP feature of the verb motivates the movement of ‘who’ to the edge of vP.
Step 1: wh-phrase moves to edge of vP
  • Step 2: Now that ‘who’ is at the left edge of the vP phase, it can move out of the lower phase and into the specifier of the CP phase.
Step 2: wh-phrase moves into the CP phase
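
The two steps above can be sketched as a small check on movement steps. This is only an illustrative toy, not a formalization from the literature: the names PHASES, Position and can_move are hypothetical, and the check looks only at phase boundaries, ignoring everything else that constrains movement.

```python
from dataclasses import dataclass

# Phases of sentence (1), ordered from lowest to highest (hypothetical encoding).
PHASES = ["vP", "CP"]

@dataclass(frozen=True)
class Position:
    phase: str     # innermost phase containing the element
    at_edge: bool  # True if the element is at that phase's edge

def can_move(src: Position, dst: Position) -> bool:
    """Check one movement step against the PIC as described above."""
    lo, hi = PHASES.index(src.phase), PHASES.index(dst.phase)
    if hi == lo:
        return True          # phase-internal movement, e.g. complement to edge
    if hi == lo + 1:
        return src.at_edge   # only the edge is visible from the next phase up
    return False             # downward movement or skipping a phase

base = Position("vP", at_edge=False)     # 'who' in its base position
vp_edge = Position("vP", at_edge=True)   # after Step 1
spec_cp = Position("CP", at_edge=True)   # after Step 2

print(can_move(base, spec_cp))     # False: one-step movement is blocked
print(can_move(base, vp_edge))     # True: Step 1
print(can_move(vp_edge, spec_cp))  # True: Step 2
```

Under these assumptions, the only licit route from the base position to the specifier of CP is the two-step route through the edge of vP.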

Wh-movement in Medumba:

Another example of the PIC can be observed in A'-agreement in Medumba. A'-agreement is a term used for the morphological reflex of A'-movement of an XP.[30] In Medumba, when the moved phrase reaches a phase edge, a high-low tonal melody is added to the head of the complement of the phase head. Since A'-agreement in Medumba requires movement, the presence of agreement on the heads of the complements of phase heads shows that the wh-word moves to the edges of phases and obeys the PIC.[30]


Sentence (2a) has a high-low tone on the tense marker nɔ́ʔ and the verb ʤʉ̀n, and is therefore grammatical.

(2a) [CP á wʉ́ Wàtɛ̀t nɔ́ɔ̀ʔ [vP ⁿ-ʤʉ́ʉ̀n á?]]

‘Who did Watat see?’

Sentence (2b) does not have a high-low tone on the tense marker nɔ́ʔ and the verb ʤʉ̀n, and is therefore ungrammatical.

(2b) *[CP á wʉ́ Wàtɛ̀t nɔ́ʔ [vP ⁿ-ʤʉ́n á?]]

*‘Who did Watat see?’

To generate the grammatical sentence (2a), the wh-phrase á wʉ́ moves from the vP phase to the CP phase. To obey the PIC, this movement must take two steps, since the wh-phrase needs to move to the edge of the vP phase before it can move out of the lower phase.

  • Step 1: First, the wh-phrase moves from the complement of VP to the edge of the vP phase to avoid violating the PIC. In this position, agreement is expressed on the verb ʤʉ̀n, which is the head of the complement of the v phase head, and surfaces as a high-low (HL) tone melody (ⁿ-ʤʉ́ʉ̀n).
Step 1: wh-phrase moves to edge of vP
  • Step 2: Now that it is at the edge of the vP phase, the wh-phrase is able to leave the vP phase and move to the Spec-C position of the CP phase. Agreement is expressed on the tense marker nɔ́ʔ, which is the head of the complement of the C phase head, and surfaces as a high-low tone melody (nɔ́ɔ̀ʔ).
Step 2: wh-phrase moves from vP phase to CP phase
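
The link between phase edges and agreement described in Steps 1 and 2 can be summarized in a short sketch. The mapping COMPLEMENT_HEAD and the function agreeing_heads are hypothetical names used only for illustration. The sketch records which heads are predicted to carry the high-low melody and does not model the tonal forms themselves.

```python
# Head of the complement of each phase head, as stated in Steps 1 and 2:
# V heads the complement of v, and T (tense) heads the complement of C.
COMPLEMENT_HEAD = {"vP": "V", "CP": "T"}

def agreeing_heads(edges_visited):
    """Return the heads predicted to carry the high-low (HL) melody,
    given the phase edges the wh-phrase has moved through."""
    return [COMPLEMENT_HEAD[phase] for phase in edges_visited]

print(agreeing_heads(["vP", "CP"]))  # movement through both edges, as in (2a)
print(agreeing_heads([]))            # no movement: no agreement predicted
```

With movement through both phase edges, both V and T are predicted to agree. With no movement, the list is empty and no melody is expected.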

We can confirm that A'-agreement only occurs with movement by examining sentences where the wh-phrase does not move. In sentence (2c) below, there is no high-low tone melody on the tense marker nɔ́ʔ or on the verb, since the wh-word does not move to the edges of the vP and CP phases.[30]

(2c) [m-ɛ́n nɔ́ʔ bɔ̀ á wʉ́ á]

'The child gave the bag to who?'

What can (not) be a phase

In the literature, there are three different trends in what is generally considered to be a phase. The first trend is that only vP and CP are phases. Chomsky[28] originally proposed that CP, and vP with transitive and unergative verbs, constitute phases. This proposal was based on the strong phase effects these phrases show, as discussed in the section on the exemplification of CP and vP phases above.

The second trend is that a specific set of phrases can be phases. It has been argued that, in addition to transitive and unergative vP, unaccusative and passive vP are also phases.[31] This was proposed because passive and unaccusative vP were found to have the same reconstruction-site characteristics as intermediate phase edges.[31] It has also been proposed that TP belongs to the set of possible phases, depending on the language.[33] It has further been suggested that the set of phases includes DP, since there are many parallels between DP and CP.[34]

The last trend discussed in the literature is that every phrase is a phase. This is the idea that successive cyclic movement occurs through all intermediate phrase edges.[33]
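
One way to see how the three trends differ is to compare the intermediate landing sites each predicts for a phrase moving up a simple clause. The following is a minimal sketch under stated assumptions: the spine VP, vP, TP, CP, the dictionary PHASE_SETS and the function landing_sites are hypothetical, and the second entry picks TP only as one illustrative member of the extended set.

```python
# Projections between the base position and the final landing site, lowest first.
SPINE = ["VP", "vP", "TP", "CP"]

# Hypothetical encodings of the three trends.
PHASE_SETS = {
    "vP and CP only": {"vP", "CP"},
    "extended set (here with TP)": {"vP", "TP", "CP"},
    "every phrase": set(SPINE),
}

def landing_sites(spine, phases):
    """Return the phrases whose edge a moving element must pass through."""
    return [xp for xp in spine if xp in phases]

for name, phases in PHASE_SETS.items():
    print(name, "->", landing_sites(SPINE, phases))
```

The larger the set of phases, the more intermediate stops successive cyclic movement has to make.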


In the last decade, a substantial body of literature in the Minimalist tradition has focused on how a phrase receives its proper label. A label indicates the kind of phrase that is built via Merge. This work eschews the bare phrase structure formulation of Merge in favor of a simpler Merge(a,b) = {a,b}.[35] It further departs from older schools of generative grammar, in which the label of a phrase is determined endocentrically. In a series of articles, Chomsky has proposed that labels are determined by a labeling algorithm which operates after syntactic structures have been built.
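
The simpler formulation Merge(a,b) = {a,b} can be illustrated with plain set formation. This is a minimal sketch, assuming lexical items are represented as strings. The function name merge is hypothetical, and the sketch deliberately does not implement any labeling algorithm.

```python
def merge(a, b):
    """Merge(a, b) = {a, b}: an unordered set with no label attached."""
    return frozenset({a, b})

dp = merge("the", "cake")   # {the, cake}
vp = merge("eat", dp)       # {eat, {the, cake}}

print(merge("the", "cake") == merge("cake", "the"))  # True: the set is unordered
print(dp in vp)                                      # True: Merge applies to its own output
```

Nothing in the resulting object says whether it is a nominal or a verbal phrase, which is why a separate procedure is needed to determine the label.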

Strong Minimalist Thesis (SMT)

In their 2016 book Why Only Us, Berwick and Chomsky define both the Minimalist Program and the Strong Minimalist Thesis. According to Berwick and Chomsky, the Strong Minimalist Thesis states that "The optimal situation would be that UG reduces to the simplest computational principles which operate in accord with conditions of computational efficiency. This conjecture is ... called the Strong Minimalist Thesis (SMT)."[36]


In the late 1990s, David E. Johnson and Shalom Lappin published the first detailed critiques of Chomsky's minimalist program.[37] This technical work was followed by a lively debate with proponents of minimalism on the scientific status of the program.[38][39][40] The original article provoked several replies[41][42][43][44][45] and two further rounds of replies and counter-replies in subsequent issues of the same journal.

Lappin et al. argue that the minimalist program is a radical departure from earlier Chomskyan linguistic practice that is not motivated by any new empirical discoveries, but rather by a general appeal to perfection, which is both empirically unmotivated and so vague as to be unfalsifiable. They compare the adoption of this paradigm by linguistic researchers to other historical paradigm shifts in the natural sciences and conclude that the adoption of the minimalist program has been an "unscientific revolution", driven primarily by Chomsky's authority in linguistics. The several replies to the article in Natural Language and Linguistic Theory Volume 18 number 4 (2000) make a number of different defenses of the minimalist program. Some claim that it is not in fact revolutionary or not in fact widely adopted, while others agree with Lappin and Johnson on these points, but defend the vagueness of its formulation as not problematic in light of its status as a research program rather than a theory (see above).

Prakash Mondal has published a book-length critique of the Minimalist model of grammar, showing a number of contradictions, inconsistencies and paradoxes within the formal structure of the system. In particular, his critique interrogates closely the consequences of adopting some rather innocuous and widespread assumptions or axioms about the nature of language as adopted in the Minimalist model of the language faculty.[46]

Further reading

Much research has been devoted to the study of the consequences that arise when minimalist questions are formulated. This list is not exhaustive.

Works by Noam Chomsky

  • Chomsky, Noam. 1993. "A minimalist program for linguistic theory". In Hale, Kenneth L. and S. Jay Keyser, eds. The view from Building 20: Essays in linguistics in honor of Sylvain Bromberger. Cambridge, Massachusetts: MIT Press. 1–52
  • Chomsky, Noam. 1995. The Minimalist Program. Cambridge, Massachusetts: The MIT Press.
  • Chomsky, Noam. 2000. Minimalist inquiries: the framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, eds. Roger Martin, David Michaels and Juan Uriagereka, 89–155. Cambridge, Massachusetts: MIT Press.
  • Chomsky, Noam. 2000. New horizons in the study of language and mind. Cambridge, UK ; New York: Cambridge University Press.
  • Chomsky, Noam. 2001. Derivation by Phase. In Ken Hale: A Life in Language, ed. Michael Kenstowicz, 1–52. Cambridge, Massachusetts: MIT Press.
  • Chomsky, Noam. 2004. Beyond Explanatory Adequacy. In Structures and Beyond. The Cartography of Syntactic Structures, ed. Adriana Belletti, 104–131. Oxford: Oxford University Press.
  • Chomsky, Noam. 2005. Three Factors in Language Design. Linguistic Inquiry 36: 1–22.
  • Chomsky, Noam. 2007. Approaching UG From Below. In Interfaces + Recursion = Language?, eds. Uli Sauerland and Hans Martin Gärtner, 1–29. New York: Mouton de Gruyter.
  • Chomsky, Noam. 2008. On Phases. In Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud, eds. Robert Freidin, Carlos Peregrín Otero and Maria Luisa Zubizarreta, 133–166. Cambridge, Massachusetts: MIT Press.
  • Chomsky, Noam. 2013. Problems of Projection. Lingua 130: 33–49.

Linguistic textbooks on minimalism

  • Adger, David. 2003. Core Syntax. A Minimalist Approach. Oxford: Oxford University Press
  • Boeckx, Cedric. 2006. Linguistic Minimalism. Origins, Concepts, Methods and Aims. Oxford: Oxford University Press.
  • Bošković, Željko and Howard Lasnik (eds). 2006. Minimalist Syntax: The Essential Readings. Malden, MA: Blackwell.
  • Cook, Vivian J. and Newson, Mark. 2007. Chomsky's Universal Grammar: An Introduction. Third Edition. Malden, MA: Blackwell.
  • Hornstein, Norbert, Jairo Nunes and Kleanthes K. Grohmann. 2005. Understanding Minimalism. Cambridge: Cambridge University Press
  • Lasnik, Howard, Juan Uriagereka, Cedric Boeckx. 2005. A Course in Minimalist Syntax. Malden, MA: Blackwell
  • Radford, Andrew. 2004. Minimalist Syntax: Exploring the Structure of English. Cambridge: Cambridge University Press.
  • Uriagereka, Juan. 1998. Rhyme and Reason. An Introduction to Minimalist Syntax. Cambridge, Massachusetts: MIT Press.
  • Webelhuth, Gert (ed.). 1995. Government and Binding Theory and the Minimalist Program: Principles and Parameters in Syntactic Theory. Wiley-Blackwell

Works on the main theoretical notions and their applications

  • Boeckx, Cedric (ed). 2006. Minimalist Essays. Amsterdam: John Benjamins.
  • Bošković, Željko. 1997. The Syntax of Nonfinite Complementation. An Economy Approach. Cambridge, Massachusetts: MIT Press.
  • Brody, Michael. 1995. Lexico-Logical Form: a Radically Minimalist Theory. Cambridge, Massachusetts: MIT Press.
  • Collins, Chris. 1997. Local Economy. Cambridge, Massachusetts: MIT Press.
  • Epstein, Samuel David, and Hornstein, Norbert (eds). 1999. Working Minimalism. Cambridge, Massachusetts: MIT Press.
  • Epstein, Samuel David, and Seely, T. Daniel (eds). 2002. Derivation and Explanation in the Minimalist Program. Malden, MA: Blackwell.
  • Fox, Danny. 1999. Economy and Semantic Interpretation. Cambridge, Massachusetts: MIT Press.
  • Martin, Roger, David Michaels and Juan Uriagereka (eds). 2000. Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. Cambridge, Massachusetts: MIT Press.
  • Pesetsky, David. 2001. Phrasal Movement and its Kin. Cambridge, Massachusetts: MIT Press.
  • Richards, Norvin. 2001. Movement in Language. Oxford: Oxford University Press.
  • Stroik, Thomas. 2009. Locality in Minimalist Syntax. Cambridge, Massachusetts: MIT Press.


  1. ^ Chomsky, Noam. 1993. A minimalist program for linguistic theory. MIT occasional papers in linguistics no. 1. Cambridge, Massachusetts: Distributed by MIT Working Papers in Linguistics.
  2. ^ For a thorough discussion of this distinction in the context of linguistics, see Boeckx, Cedric. 2006. Linguistic Minimalism: Origins, Concepts, Methods, and Aims. Oxford: Oxford University Press.
  3. ^ Boeckx, Cedric Linguistic Minimalism. Origins, Concepts, Methods and Aims, pp. 84 and 115.
  4. ^ Boeckx, Cedric. 2006. Linguistic Minimalism. Origins, Concepts, Methods and Aims. Oxford: Oxford University Press.
  5. ^ There are many introductions to Principle and Parameters. Two that align PP in such a way that make the transition to MP smooth are Carnie, Andrew. 2006. Syntax: A Generative Introduction, 2nd Edition. Malden, MA: Blackwell, and Cook, Vivian J. and Newson, Mark. 2007. Chomsky's Universal Grammar: An Introduction. Third Edition. Malden, MA: Blackwell.
  6. ^ For a detailed introductory discussion between the transition of the technicalities from PP to MP see, among others, Gert Webelhuth. 1995. Government and Binding Theory and the Minimalist Program: Principles and Parameters in Syntactic Theory. Wiley-Blackwell; Uriagereka, Juan. 1998. Rhyme and Reason. An Introduction to Minimalist Syntax. Cambridge, Massachusetts: MIT Press; Hornstein, Norbert, Jairo Nunes and Kleanthes K. Grohmann. 2005. Understanding Minimalism. Cambridge: Cambridge University Press; and Boeckx, Cedric. 2006. Linguistic Minimalism. Origins, Concepts, Methods and Aims. Oxford: Oxford University Press.
  7. ^ For a full description of the checking mechanism see Adger, David. 2003. Core Syntax. A Minimalist Approach. Oxford: Oxford University Press; and also Carnie, Andrew. 2006. Syntax: A Generative Introduction, 2nd Edition. Blackwell Publishers
  8. ^ For some conceptual and empirical advantages of the MP over the traditional view see: Bošković, Željko. 1994. D-Structure, Θ-Criterion, and Movement into Θ-Positions. Linguistic Analysis 24: 247–286, and for more detailed discussions Bošković, Željko and Howard Lasnik (eds). 2006. Minimalist Syntax: The Essential Readings. Malden, MA: Blackwell.
  9. ^ a b See Chomsky, Noam. 1995. Bare Phrase Structure. In Evolution and Revolution in Linguistic Theory. Essays in honor of Carlos Otero., eds. Hector Campos and Paula Kempchinsky, 51–109.
  10. ^ Matthews, P. H. (2014). The Concise Oxford Dictionary of Linguistics (3 ed.). Oxford University Press.
  11. ^ Osborne, Timothy, Michael Putnam, and Thomas Gross 2011. Bare phrase structure, label-less structures, and specifier-less syntax: Is Minimalism becoming a dependency grammar? The Linguistic Review 28: 315–364
  12. ^ a b c Lowe, John; Lovestrand, Joseph (2020-06-29). "Minimal phrase structure: a new formalized theory of phrase structure". Journal of Language Modelling. 8 (1): 1. doi:10.15398/jlm.v8i1.247. ISSN 2299-8470.
  13. ^ a b c d e f g h i j k l m n o p q r s t u Fukui, Naoki (2011). The Oxford Handbook of Linguistic Minimalism. Merge and Bare Phrase Structure. Oxford University Press. pp. 1–24.
  14. ^ a b c d e f g Fukui, Naoki (2017-04-21). Merge in the Mind-Brain: Essays on Theoretical Linguistics and the Neuroscience of Language (1 ed.). Routledge Leading Linguists 23. New York: Routledge. doi:10.4324/9781315442808-2. ISBN 978-1-315-44280-8.
  15. ^ Collins, Chris (2002). Derivation and Explanation in the Minimalist Program; Eliminating Labels. Blackwell Publishers.
  16. ^ Fukui, Naoki (2001). "Phrase Structure". The Handbook of Contemporary Syntactic Theory. Oxford, UK: Blackwell Publishers. pp. 374–408. doi:10.1002/9780470756416.ch12. ISBN 978-0-470-75641-6.
  17. ^ Osborne, Timothy; Putnam, Michael; Gross, Thomas M. (2011). "Bare phrase structure, label-less trees, and specifier-less syntax. Is Minimalism becoming a dependency grammar?". The Linguistic Review. 28 (3). doi:10.1515/tlir.2011.009. S2CID 170269106.
  18. ^ a b c Jayaseelan, K.A. (2008). "Bare Phrase Structure and Specifier-less Syntax". Biolinguistics. 2 (1): 087–106.
  19. ^ Lasnik, Howard (1999). "On Feature Strength: Three Minimalist Approaches to Overt Movement". Linguistic Inquiry. 30 (2): 197–217. doi:10.1162/002438999554039. JSTOR 4179059. S2CID 57570833 – via JSTOR.
  20. ^ a b c d e f g Chomsky, Noam (1995). The Minimalist Program. Cambridge MA: MIT Press.
  21. ^ a b c d e f g Nunes, Jairo (1998). "Bare X-Bar Theory and Structures Formed by Movement". Linguistic Inquiry. 29 (1): 160–168. doi:10.1162/002438998553707. JSTOR 4179012. S2CID 57569962 – via JSTOR.
  22. ^ a b Manzini, Rita (1995). "From Merge and Move to Form Dependency".
  23. ^ a b c d e Sportiche, Dominique. (23 September 2013). An introduction to syntactic analysis and theory. Koopman, Hilda Judith., Stabler, Edward P. Hoboken. ISBN 978-1-118-47048-0. OCLC 861536792.
  24. ^ Hornstein, Norbert; Nunes, Jairo (2008). "Adjunction, Labeling, and Bare Phrase Structure". Biolinguistics. 2 (1).
  25. ^ Hornstein, Norbert; Nunes, Jairo (2008). "Adjunction, Labeling, and Bare Phrase Structure". Biolinguistics. 2 (1).
  26. ^ Chomsky, Noam (1998). "Minimalist Inquiries: The Framework" MIT Occasional Papers in Linguistics 15. Republished in 2000 in R. Martin, D. Michaels, & J. Uriagereka (eds.). Step By Step: Essays In Syntax in Honor of Howard Lasnik. 89–155. MIT Press.
  27. ^ a b "Derivation by Phase", Ken Hale, The MIT Press, 2001, doi:10.7551/mitpress/4056.003.0004, ISBN 978-0-262-31612-5, retrieved 2020-12-04
  28. ^ a b Chomsky, Noam (2000). Minimalist Inquiries: The Framework. In Roger Martin, David Michaels, and Juan Uriagereka, eds., Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. Cambridge, Mass.: MIT Press. pp. 89–156. ISBN 026213361X.
  29. ^ a b c d e f Obata, Miki (2006-01-01). "Phase and Convergence". University of Pennsylvania Working Papers in Linguistics. 12 (1).
  30. ^ a b c d e Keupdjio, H. S. (2020). "The syntax of A' -dependencies in Bamileke Medumba (T)". University of British Columbia – via
  31. ^ a b c d See, among others, Legate, Julie Anne. 2003. Some Interface Properties of the Phase. Linguistic Inquiry 34: 506–516 and Chomsky, Noam. 2008. On Phases. In Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud. eds. Robert Freidin, Carlos Peregrín Otero and Maria Luisa Zubizarreta, 133–166. Cambridge, Massachusetts: MIT Press
  32. ^ Chomsky, Noam (1999). Derivation by Phase. MIT, Department of Linguistics.
  33. ^ a b See Assmann et al. (2015) Ergatives Move Too Early: On an Instance of Opacity in Syntax. Syntax 18:4 pp. 343–387
  34. ^ Svenonius, Peter (2004), Adger, David; De Cat, Cécile; Tsoulas, George (eds.), "On the Edge", Peripheries, Studies in Natural Language and Linguistic Theory, Dordrecht: Springer Netherlands, 59, pp. 259–287, doi:10.1007/1-4020-1910-6_11, ISBN 978-1-4020-1908-1, retrieved 2020-12-05
  35. ^ Epstein, Samuel David; Seely, T. Daniel, eds. (2002). Derivation and Explanation in the Minimalist Program (1 ed.). John Wiley & Sons, Ltd. doi:10.1002/9780470755662. ISBN 9780470755662.
  36. ^ Chomsky and Berwick. Why Only Us?. MIT Press. 2016. Page 94.
  37. ^ Johnson, David E. and Shalom Lappin (1997), "A Critique of the Minimalist Program" in Linguistics and Philosophy 20, 273–333, and Johnson, David E. and Shalom Lappin (1999). Local Constraints vs Economy. Stanford: CSLI
  38. ^ Lappin, Shalom, Robert Levine and David E. Johnson (2000a). "The Structure of Unscientific Revolutions." Natural Language and Linguistic Theory 18, 665–771
  39. ^ Lappin, Shalom, Robert Levine and David E. Johnson (2000b). "The Revolution Confused: A Reply to our Critics." Natural Language and Linguistic Theory 18, 873–890
  40. ^ Lappin, Shalom, Robert Levine and David E. Johnson (2001). "The Revolution Maximally Confused." Natural Language and Linguistic Theory 19, 901–919
  41. ^ Holmberg, Anders (2000). "Am I Unscientific? A Reply to Lappin, Levine, and Johnson". Natural Language & Linguistic Theory. 18 (4): 837–842. doi:10.1023/A:1006425604798. S2CID 169909919.
  42. ^ Reuland, Eric (2000). "Revolution, Discovery, and an Elementary Principle of Logic". Natural Language & Linguistic Theory. 18 (4): 843–848. doi:10.1023/A:1006404305706. S2CID 169181486.
  43. ^ Roberts, Ian (2000). "Caricaturing Dissent". Natural Language & Linguistic Theory. 18 (4): 849–857. doi:10.1023/A:1006408422545. S2CID 189900101.
  44. ^ Piattelli-Palmarini, Massimo (2000). "The Metric of Open-Mindedness". Natural Language & Linguistic Theory. 18 (4): 859–862. doi:10.1023/A:1006460406615. S2CID 169864677.
  45. ^ Uriagereka, Juan (2000). "On the Emptiness of 'Design' Polemics". Natural Language & Linguistic Theory. 18 (4): 863–871. doi:10.1023/A:1006412507524. S2CID 170071816.
  46. ^ Mondal, Prakash (2014). Language, Mind and Computation. London/New York: Palgrave Macmillan.