A phraseme, also called a set of thoughts, set phrase, idiomatic phrase, multi-word expression (in computational linguistics), or idiom,[1][2][3] is a multi-word or multi-morphemic utterance at least one of whose components is selectionally constrained or restricted by linguistic convention such that it is not freely chosen.[4] In the most extreme cases, there are expressions such as X kicks the bucket ≈ ‘person X dies of natural causes, the speaker being flippant about X’s demise’ where the unit is selected as a whole to express a meaning that bears little or no relation to the meanings of its parts. All of the words in this expression are chosen restrictedly, as part of a chunk. At the other extreme, there are collocations such as stark naked, hearty laugh, or infinite patience where one of the words is chosen freely (naked, laugh, and patience, respectively) based on the meaning the speaker wishes to express while the choice of the other (intensifying) word (stark, hearty, infinite) is constrained by the conventions of the English language (hence, *hearty naked, *infinite laugh, *stark patience). Both kinds of expression are phrasemes, and can be contrasted with “free phrases”, expressions where all of the members (barring grammatical elements whose choice is forced by the morphosyntax of the language) are chosen freely, based exclusively on their meaning and the message that the speaker wishes to communicate.


Major types of phraseme

Phrasemes can be broken down into groups based on their compositionality (whether or not the meaning they express is the sum of the meaning of their parts) and the type of selectional restrictions that are placed on their non-freely chosen members.[5][page needed] Non-compositional phrasemes are what are commonly known as idioms, while compositional phrasemes can be further divided into collocations, clichés, and pragmatemes.

Non-compositional phrasemes: Idioms

A phraseme is an idiom if its meaning is not the predictable sum of the meanings of its components, that is, if it is non-compositional. Generally speaking, an idiom is not intelligible to people hearing it for the first time unless they have already learned it. Consider the following examples (an idiom is indicated by elevated half-brackets: ˹ … ˺):

˹rock and roll˺ ‘a Western music genre characterised by a strong beat with sounds generated by guitar, piano, and vocalists’
˹cheek by jowl˺ ‘in close association’
˹the game is up˺ ‘your deceit is exposed’
˹[X] comes to [N_X’s] senses˺ ‘X becomes conscious again’
˹put [N_Y] on the map˺ ‘make the place Y well-known’
˹bull session˺ ‘long informal talk on a subject by a group of people’

In none of these cases are the meanings of any of the component parts of the idiom included in the meaning of the expression as a whole.

An idiom can be further characterized by its transparency, the degree to which its meaning includes the meanings of its components. Three types of idioms can be distinguished in this way—full idioms, semi-idioms, and quasi-idioms.[6]

Full idioms

An idiom AB (that is, composed of the elements A ‘A’ and B ‘B’) is a full idiom if its meaning does not include the meaning of any of its lexical components: ‘AB’ ⊅ ‘A’ and ‘AB’ ⊅ ‘B’.

˹put [N_Y] through its paces˺ ‘to test Y thoroughly’
˹go ballistic˺ ‘suddenly become very angry’
˹by heart˺ ‘remembering verbatim’
˹bone of contention˺ ‘reason for quarrels or fights’


Semi-idioms

An idiom AB is a semi-idiom if its meaning

1) includes the meaning of one of its lexical components, but not as its semantic pivot,
2) does not include the meaning of the other component, and
3) includes an additional meaning ‘C’ as its semantic pivot:
‘AB’ ⊃ ‘A’, and ‘AB’ ⊅ ‘B’, and ‘AB’ ⊃ ‘C’.
˹private eye˺ ‘private detective’
˹sea anemone˺ ‘predatory polyp dwelling in the sea’
Rus. ˹mozolit´ glaza˺ ‘be in Y’s sight too often or for too long’ (lit. ‘make corns on Y’s eyes’)

The semantic pivot of an idiom is, roughly speaking, the part of the meaning that defines what sort of referent the idiom has (person, place, thing, event, etc.) and is shown in the examples in bold. More precisely, the semantic pivot is defined, for an expression AB meaning ‘S’, as that part ‘S1’ of AB’s meaning ‘S’, such that ‘S’ [= ‘S1’ ⊕ ‘S2’] can be represented as a predicate ‘S2’ bearing on ‘S1’—i.e., ‘S’ = ‘S2’(‘S1’) (Mel’čuk 2006: 277).[7]
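As a worked illustration of this definition, the semi-idiom ˹private eye˺ ‘private detective’ can be decomposed as follows. This is an illustrative reading of the definition applied to the example above, not a decomposition quoted from the cited source; the symbols follow the definition, with the pivot ‘S1’ naming the kind of referent (a person) and ‘S2’ the predicate bearing on it.

```latex
\[
\begin{aligned}
S   &= \text{`private detective'} \\
S_1 &= \text{`detective'} \quad (\text{semantic pivot}) \\
S_2 &= \text{`private'} \\
S   &= S_2(S_1) = \text{`private'}(\text{`detective'})
\end{aligned}
\]
```

On this reading the pivot ‘detective’ is the additional meaning ‘C’ of the semi-idiom definition: it is contributed by neither private nor eye taken separately.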

Quasi-idiom or weak idiom

An idiom AB is a quasi-idiom, or weak idiom, if its meaning

1) includes the meanings of both of its lexical components, neither as the semantic pivot, and
2) includes an additional meaning ‘C’ as its semantic pivot:
‘AB’ ⊃ ‘A’, and ‘AB’ ⊃ ‘B’, and ‘AB’ ⊃ ‘C’.
Fr. ˹donner le sein à Y˺ ‘feed the baby Y by putting one teat into the mouth of Y’
˹start a family˺ ‘conceive a first child with one’s spouse, starting a family’
˹barbed wire˺ ‘[artifact designed to make obstacles with and constituted by] wire with barbs [fixed on it in small regular intervals]’
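
The three definitions above can be read as a small decision procedure. The following is a minimal sketch under a strong simplifying assumption: a meaning is modelled as a set of labelled semantic components. The function name classify_idiom and the component labels are hypothetical and chosen only for illustration; they do not come from the cited sources.

```python
def classify_idiom(meaning, a, b, pivot):
    """Classify an idiom AB by transparency.

    meaning: set of semantic-component labels standing in for 'AB' (an assumption)
    a, b:    labels for the meanings of the lexical components A and B
    pivot:   label of the semantic pivot of 'AB'
    """
    included = [c for c in (a, b) if c in meaning]
    if not included:
        # 'AB' includes neither 'A' nor 'B'
        return "full idiom"
    if pivot in included:
        # A component serves as the pivot: outside the three definitions
        return "unclassified"
    if len(included) == 1:
        # Exactly one component meaning is included, and it is not the pivot
        return "semi-idiom"
    # Both component meanings are included, neither as the pivot
    return "quasi-idiom"


results = {
    "go ballistic": classify_idiom(
        {"suddenly", "become", "angry"}, "go", "ballistic", "become"),
    "private eye": classify_idiom(
        {"private", "detective"}, "private", "eye", "detective"),
    "barbed wire": classify_idiom(
        {"artifact", "barbed", "wire"}, "barbed", "wire", "artifact"),
}
print(results)
```

The "unclassified" branch covers expressions in which one of the components is itself the pivot, a case the three definitions do not address.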

Compositional phrasemes

A phraseme AB is said to be compositional if the meaning ‘AB’ = ‘A’ ⊕ ‘B’ and the form /AB/ = /A/ ⊕ /B/ (“⊕” here means ‘combined in accordance with the rules of the language’). Compositional phrasemes can be broken down into two groups—collocations and clichés.


Collocations

A collocation consists of a base (shown in capitals), a lexical unit chosen freely by the speaker, and a collocate, a lexical unit chosen as a function of the base.[8][9][10]

heavy ACCENT ‘strong accent’
sound ASLEEP ‘asleep such that one is hard to awaken’
ARMED to the teeth ‘armed with many or with powerful weapons’
leap YEAR ‘year in which February has 29 days’

In American English, you make a decision, and in British English, you can also take it. For the same thing, French says prendre [= ‘take’] une décision; German, eine Entscheidung treffen/fällen [= ‘meet/fell’]; Russian, prinjat´ [= ‘accept’] rešenie; Turkish, karar vermek [= ‘give’]; Polish, podjąć [= ‘take up’] decyzję; Serbian, doneti [= ‘bring’] odluku; and Korean, gyeoljeongeul hada 〈naerida〉 [= ‘do 〈take/put down〉’]. This clearly shows that the verb is selected as a function of the noun meaning ‘decision’. If instead of DÉCISION a French speaker uses CHOIX ‘choice’ (Jean a pris la décision de rester ‘Jean has taken the decision to stay’ ≅ Jean a … le choix de rester ‘Jean has … the choice to stay’), he has to say FAIRE ‘make’ rather than PRENDRE ‘take’: Jean a fait 〈*a pris〉 le choix de rester ‘Jean has made the choice to stay’.
A collocation is semantically compositional since its meaning is divisible into two parts such that the first one corresponds to the base and the second to the collocate. This is not to say that a collocate, when used outside the collocation, must have the meaning it expresses within the collocation. For instance, in the collocation sit for an exam ‘undergo an exam’, the verb SIT expresses the meaning ‘undergo’; but in an English dictionary, the verb SIT does not appear with this meaning: ‘undergo’ is not its inherent meaning, but rather is a context-imposed meaning.
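
The dependence of the collocate on the base can be pictured as a lookup keyed by language and base. The sketch below is illustrative only: the table SUPPORT_VERBS and the helper names are hypothetical, and the entries are limited to examples given in the text above.

```python
# Conventional collocate verbs, keyed by (language, base noun).
# Entries are restricted to the examples cited in the text.
SUPPORT_VERBS = {
    ("American English", "decision"): ["make"],
    ("British English", "decision"): ["make", "take"],
    ("French", "décision"): ["prendre"],
    ("French", "choix"): ["faire"],
    ("German", "Entscheidung"): ["treffen", "fällen"],
    ("Turkish", "karar"): ["vermek"],
}


def collocates(language, base):
    """Return the listed collocates for a base, or an empty list if none is listed."""
    return SUPPORT_VERBS.get((language, base), [])


def is_conventional(language, verb, base):
    """True if the verb is a listed collocate of the base in that language."""
    return verb in collocates(language, base)


print(is_conventional("French", "prendre", "décision"))  # conventional pairing
print(is_conventional("French", "prendre", "choix"))     # the starred *a pris le choix
```

The speaker's free choice is the key (the base); the value (the collocate) is fixed by convention, which is why the near-synonyms décision and choix need separate entries.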


Clichés

A cliché is a phraseme where none of the components is selected freely and the restrictions are imposed by conventional linguistic usage, as in the following examples:

in the wrong place at the wrong time
you’ve seen one, you’ve seen ’em all!
no matter what
we all make mistakes
one thing after another

Clichés are compositional in the sense that the meaning of the expression is exactly the sum of the meanings of its parts, and clichés (unlike idioms) would be completely intelligible to someone hearing them for the first time without having learned the expression beforehand. They are not completely free expressions, however, because they are the conventionalized means of expressing the desired meanings in the language. Thus, in English one asks What is your name? and answers My name is [N] or I am [N]; in Spanish one asks ¿Cómo se llama? (lit. ‘How are you called?’) and one answers Me llamo [N] ‘I am called [N]’. The sentences ¿Cómo es su nombre? and Soy [N], the literal renderings of the English expressions, are fully understandable and grammatical, but not standard, in just the same way as the literal translations of the Spanish expressions would sound odd in English.

A subtype of cliché is the pragmateme, a cliché where the restrictions are imposed by the situation of utterance:

Eng. Will you marry me? [when making a marriage proposal]
Rus. Bud´(te) moej ženoj! (lit. ‘Be my wife!’) [when making a marriage proposal]
Eng. Best before… [on a container of packaged food]
Rus. Srok godnosti – … (lit. ‘Deadline of fitness is …’) [on a container of packaged food]
Fr. À consommer avant … (lit. ‘To consume before …’) [on a container of packaged food]
Ger. Mindestens haltbar bis … (lit. ‘At least keepable until …’) [on a container of packaged food]

As with clichés, the conventions of the languages in question dictate a particular pragmateme for a particular situation—alternate expressions would be understandable, but would not be perceived as normal.
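
The typology of lexical phrasemes described in this section can be summarized as a decision procedure. The following is a rough sketch, assuming each phraseme has already been annotated by hand with three yes/no properties; the function classify_phraseme and its parameter names are hypothetical labels for the criteria in the text, not terms from the cited sources.

```python
def classify_phraseme(*, compositional, has_free_base, situation_bound):
    """Assign a lexical phraseme to one of the major types described in the text.

    compositional:   the meaning is the regular sum of the meanings of the parts
    has_free_base:   one component (the base) is chosen freely by the speaker
    situation_bound: the restrictions are imposed by the situation of utterance
    """
    if not compositional:
        return "idiom"
    if has_free_base:
        return "collocation"
    if situation_bound:
        return "pragmateme"  # a subtype of cliché
    return "cliché"


annotated = {
    "kick the bucket": classify_phraseme(
        compositional=False, has_free_base=False, situation_bound=False),
    "heavy accent": classify_phraseme(
        compositional=True, has_free_base=True, situation_bound=False),
    "we all make mistakes": classify_phraseme(
        compositional=True, has_free_base=False, situation_bound=False),
    "Best before…": classify_phraseme(
        compositional=True, has_free_base=False, situation_bound=True),
}
print(annotated)
```

The order of the tests mirrors the text: compositionality is checked first, then the kind of selectional restriction.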

Phrasemes in morphology

Although the discussion of phrasemes centres largely on multi-word expressions such as those illustrated above, phrasemes are known to exist on the morphological level as well. Morphological phrasemes are conventionalized combinations of morphemes such that at least one of their components is selectionally restricted.[11][12] Just as with lexical phrasemes, morphological phrasemes can be either compositional or non-compositional.

Non-compositional morphological phrasemes

Non-compositional morphological phrasemes,[13] also known as morphological idioms,[14] are actually familiar to most linguists, although the term “idiom” is rarely applied to them—instead, they are usually referred to as “lexicalized” or “conventionalized” forms.[15] Good examples are English compounds such as harvestman ‘arachnid belonging to the order Opiliones’ (≠ ‘harvest’ ⊕ ‘man’) and bookworm (≠ ‘book’ ⊕ ‘worm’); derivational idioms can also be found: airliner ‘large vehicle for flying passengers by air’ (≠ airline ‘company that transports people by air’ ⊕ -er ‘person or thing that performs an action’). Morphological idioms are also found in inflection, as shown by these examples from the irrealis mood paradigm in Upper Necaxa Totonac:[16]

ḭš-tḭ-tachalá̰x-lḭ [past irrealis]
‘it could have shattered earlier (but didn't)’
ḭš-tachalá̰x-lḭ [present irrealis]
‘it could have shattered now (but hasn’t)’
ka-tḭ-tachalá̰x-lḭ [future irrealis]
‘it could shatter (but won't now)’

The irrealis mood has no unique marker of its own, but is expressed in conjunction with tense by combinations of affixes “borrowed” from other paradigms—ḭš- ‘past tense’, tḭ- ‘potential mood’, ka- ‘optative mood’, -lḭ ‘perfective aspect’. None of the resulting meanings is a compositional combination of the meanings of its constituent parts (‘present irrealis’ ≠ ‘past’ ⊕ ‘perfective’, etc.).

Compositional morphological phrasemes

Morphological collocations are expressions such that not all of their component morphemes are chosen freely: instead, one or more of the morphemes is chosen as a function of another morphological component of the expression, its base. This type of situation is quite familiar in derivation, where selectional restrictions placed by radicals on (near-)synonymous derivational affixes are common. Two examples from English are the nominalizers used with particular verbal bases (e.g., establishment, *establishation; infestation, *infestment; etc.), and the inhabitant suffixes required for particular place names (Winnipeger, *Winnipegian; Calgarian, *Calgarier; etc.); in both cases, the choice of derivational affix is restricted by the base, but the derivation is compositional.
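
The English nominalizer case can be sketched as follows: the derived form is built compositionally (base plus suffix), but the suffix has to be listed for each base. The table NOMINALIZER and the function nominalize are hypothetical names used only for illustration, and the table contains only the two verbs mentioned above.

```python
# Nominalizing suffix selected by each verbal base (the two examples from the text).
NOMINALIZER = {
    "establish": "ment",
    "infest": "ation",
}


def nominalize(verb):
    """Build the noun from a listed verb; raises KeyError for unlisted verbs,
    since the suffix is a lexical fact about the base and is not guessed here."""
    return verb + NOMINALIZER[verb]


print(nominalize("establish"))
print(nominalize("infest"))
```

Swapping the suffixes would yield the starred forms *establishation and *infestment: the meaning would still be recoverable, but the combination is not the conventional one.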
An example of an inflectional morphological collocation is the plural form of nouns in Burushaski:[17]

Meaning         Singular    Plural               Meaning       Singular    Plural
‘king’          thám        thám-u               ‘flower’      asqór       asqór-iŋ
‘bread’         páqu        páqu-mu              ‘plow’        hárč        harč̣-óŋ
‘dragon’        aiždahár    aiždahár-išu         ‘wind’        tíṣ̌         tiṣ̌-míŋ
‘branch’        táγ         taγ-ášku, taγ-šku    ‘minister’    wazíir      wazíir-ting
‘pigeon’        tál         tál-Ǯu               ‘woman’       gús         guš-íngants
‘stone’         dán         dan-Ǯó               ‘[a] mute’    gót         goṭ-ó
‘enemy’         dušmán      dušmá-yu             ‘body’        ḍím         ḍím-a
‘rock [noun]’   čár         čar-kó               ‘horn’        túr         tur-iáŋ
‘dog’           húk         huk-á, -ái           ‘saber’       gaté+nč̣     gaté-h
‘wolf’          úrk         urk-á, urk-ás        ‘walnut’      tilí        tilí
‘man’           hír         hur-í                ‘demon’       díu         diw-anc

Burushaski has about 70 plural suffixal morphemes. The plurals are semantically compositional, consisting of a stem expressing the lexical meaning and a suffix expressing PLURAL, but for each individual noun, the appropriate plural suffix has to be learned.
Unlike compositional lexical phrasemes, compositional morphological phrasemes seem only to exist as collocations: morphological clichés and morphological pragmatemes have yet to be observed in natural language.[12]

