An Online Journal of Modern Philology ISSN 1214-5505

Idioms: Production, Storage and Comprehension


Marek Havrila


The present paper offers a brief overview of theories of the production, storage and comprehension of idiomatic expressions and emphasizes the relative position of these expressions at the interface of grammar and lexicon.

Idiomatic expressions, which by definition are semantically non-compositional (cf. e.g. Katz & Postal 1963), present a great challenge to traditional theories of language storage and comprehension based on the principle of compositionality. However, the existing non-compositional approaches cannot definitively confirm that the meanings of idiomatic expressions are arbitrarily stipulated either. The idea of exclusively arbitrary meaning stipulation was already challenged by Cacciari and Tabossi (1988), Gibbs et al. (1989) and Nayak and Gibbs (1990), who delivered evidence that meaningful relations can be inferred between the literal senses of the individual parts of an idiom and its idiomatic meaning. Another problematic question is the production of idioms. Although idiom storage and comprehension have been discussed since the 1970s (e.g. Bobrow and Bell 1973), the question of production, which precedes the processes of storage and comprehension, was paradoxically not addressed until the verge of the new millennium.

For a clearer view of the interrelations between the individual theories, the paper is arranged chronologically and by increasing complexity of the approaches rather than in the logical order from production to comprehension.

1. Storage

Models of idiom storage are represented by two theories. According to the Separate List Model (Bobrow and Bell 1973), idioms are stored in a separate list that is independent of the list of ‘single’ literal words of which idiomatic phrases are made up. According to the Single Lexical Item Model (Swinney and Cutler 1979), idioms are stored in the lexicon as single lexical items in the same way as ‘single’ words.

2. Production and Comprehension

Idiom comprehension theories can be divided into three groups according to their approach to the compositionality of idioms. Non-compositional approaches assume that idioms are stored in the lexicon and retrieved from it as whole ‘long words’ (Bobrow and Bell 1973, Swinney and Cutler 1979, Gibbs 1980). Compositional models oppose the theory of ‘long words’ and assume that individual idiom components contribute to the overall sense of an idiom (Cacciari and Tabossi 1988, Gibbs et al. 1989). Hybrid models treat idiom processing as a combination of compositional and unitary features of syntactic and lexical-conceptual nodes in the mental lexicon (Cutting and Bock 1997, Giora and Fein 1999, Sprenger et al. 2006). Cutting and Bock (1997) and Sprenger et al. (2006) not only present hybrid theories of idiom comprehension but also attempt to explain the process of idiom production.

2.1 Non-Compositional Perspectives

2.1.1 Idiom List Hypothesis (Literal-First-Hypothesis)

Bobrow and Bell (1973) suggest that fixed expressions are stored as a separate list, an “idiom word” dictionary of long complex words (Bobrow and Bell 1973:343), in a special idiom lexicon where these expressions are stored and accessed as single lexical items. The idiom list hypothesis, often referred to as the Literal-First-Hypothesis of idiom comprehension (Vega-Moreno 2001), suggests that literal meaning is activated prior to the activation of figurative meaning. In other words, any type of expression is by default processed literally first. If the literal meaning does not match the context, the idiom mode of processing is activated, and the expression is checked for an appropriate figurative (idiomatic) meaning by accessing one’s idiom word dictionary. This model of idiom comprehension implies that idioms would take longer to process than literal word combinations; however, Swinney & Cutler (1979) showed that the comprehension of idiomatic expressions is not more time-consuming than that of non-idiomatic ones (cf. also Ortony et al. 1978).

2.1.2 Lexical Representation Theory (Simultaneous Processing Hypothesis)

Like Bobrow and Bell (1973), Swinney and Cutler (1979) suggest that idiomatic expressions are stored and mentally processed as long ambiguous single lexical units (long words), all of whose potential meanings are accessed when such a ‘long word’ is encountered. The difference between the Idiom List Hypothesis and the Lexical Representation Theory is that the latter suggests that these long words are stored in the general lexicon. Swinney and Cutler (1979) argue against the priority of literal interpretation, and in their model, also referred to as the Simultaneous Processing Hypothesis (Vega-Moreno 2001), they reject any special idiom processing mode and propose parallel access instead. The reason for departing from the Idiom List Hypothesis lies in experimental findings showing that understanding idioms (e.g. kick the bucket) does not take longer than understanding literal strings (e.g. strike the pail) (Ortony et al. 1978; Swinney & Cutler 1979). When the listener encounters the first constituent of a fixed expression, processing of both potential meanings is triggered, but the figurative one is preferred as soon as the idiomatic features are identified. The assumption that the human mind has simultaneous access to the literal as well as the figurative semantics of ‘long words’ is considered to explain why figurative and literal comprehension of language items take roughly equally long.

2.1.3 Direct Access Hypothesis (Figurative-First-Hypothesis)

Gibbs (1980) presents the Direct Access Hypothesis, also referred to as the Figurative-First-Hypothesis (Vega-Moreno 2001), which departs even more radically from Bobrow and Bell’s assumptions. Gibbs suggests that the literal meaning of idioms is of less importance in comprehension because idioms have a strong conventional figurative meaning. The Figurative-First-Hypothesis proposes that idioms are lexical items whose idiomatic meaning is retrieved directly from the mental lexicon as soon as such a string is encountered in an utterance (cf. Gibbs 1980, 1982, 2002). Hence an idiom is accessed figuratively first, and only if this meaning is inappropriate to the context is it then interpreted literally. Gibbs also challenges Swinney & Cutler’s (1979) account of idiom comprehension by suggesting that “the finding that idioms (e.g. kick the bucket) are processed faster than literal strings (e.g. ‘strike the pail’) does not necessarily imply that literal processing must take place at all” (Vega-Moreno 2001:76). According to this account, the literal reading not only does not precede the idiomatic one but can also be bypassed completely. “The direct access view simply claims that listeners need not automatically analyze the complete literal meanings of linguistic expressions before accessing pragmatic knowledge to figure out what speakers mean to communicate” (Gibbs 2002: 460).
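The three non-compositional accounts differ mainly in the order in which the literal and the figurative reading are tried. The following is a minimal sketch of that difference, not an implementation of any of the cited models: the toy lexicon, the function name comprehend, the fits_context callback and the use of a step count as a crude proxy for processing time are all assumptions made for illustration.

```python
# Toy idiom store; the entries and glosses are illustrative assumptions.
IDIOM_LEXICON = {
    "kick the bucket": "die",
    "spill the beans": "reveal the secret",
}


def comprehend(phrase, model, fits_context):
    """Return (reading, steps): the accepted reading and the number of
    sequential attempts needed under the given (hypothetical) model name."""
    literal = f"literal({phrase})"          # stand-in for compositional analysis
    figurative = IDIOM_LEXICON.get(phrase)  # None if the phrase is not an idiom
    readings = [literal] if figurative is None else [literal, figurative]

    if model == "simultaneous":
        # Swinney and Cutler (1979): both readings are accessed in parallel,
        # so selection costs a single step whichever reading fits.
        for reading in readings:
            if fits_context(reading):
                return reading, 1
        return None, 1

    if model == "figurative_first":
        # Gibbs (1980): the stored idiomatic meaning is tried first.
        readings.reverse()
    elif model != "literal_first":
        # "literal_first" (Bobrow and Bell 1973) keeps the default order.
        raise ValueError(f"unknown model: {model}")

    for step, reading in enumerate(readings, start=1):
        if fits_context(reading):
            return reading, step
    return None, len(readings)
```

With a context that accepts only the reading "die", the literal-first ordering needs two steps for kick the bucket where the other two orderings need one. This is the processing-time difference that the reaction-time studies cited above did not find.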

2.2 Compositional Perspectives

2.2.1 Configuration Hypothesis

The configuration hypothesis, proposed by Cacciari & Tabossi (1988) and later developed by Cacciari & Glucksberg (1991), supports the simultaneous processing hypothesis without, however, committing to the idea that idioms are stored as lexical items. The model suggests that idioms are grouped together with other memorised strings such as parts of poems, titles of songs, lyrics or any other sequence of words represented and distributed in the lexicon. The hypothesis emphasizes the compositional nature of idioms: idioms are treated not as long words but rather as configurations of words.

Cacciari & Tabossi (1988) assume that a word combination (a potential idiomatic expression) is initially processed literally until a configuration ‘key’ is recognized and the idiomatic meaning is activated. Subsequently, literal and figurative (idiomatic) processing run in parallel until the literal sense is definitively rejected and the idiomatic one is accepted as the intended interpretation. The ‘key’, usually a word, is the point at which the hearer decides to reject the literal meaning in favour of the idiomatic one. Since the recognition of the idiomatic sense of an expression is principally context-dependent, recipients are usually able to recognise the ‘idiom key’ in a configuration as early as after the first or second word in the string.

The upgraded variant of the Configuration Hypothesis proposed by Glucksberg (In: Vega-Moreno 2001) [1], namely the Phrase-Induced-Polysemy Model, assumes a polysemous character of the words in the string. According to this hypothesis, in the idiomatic string spill the beans, for example, the lexical form spill conveys an extra sense of REVEAL, and the lexical form beans carries an extra sense of SECRET. Following the model, understanding the string spill the beans as reveal the secret is just a matter of appropriately recognizing the (sub-)senses in the configuration.
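The idea that words acquire extra senses only inside a particular configuration can be sketched as a lookup keyed on the whole string. This is a minimal illustration under stated assumptions: the table PHRASE_INDUCED_SENSES and the function idiomatic_gloss are hypothetical names, and the only entry is the spill the beans example given in the text.

```python
# Extra senses that words carry only inside a given configuration.
# The single entry mirrors the example in the text; the data structure
# itself is an assumption made for illustration.
PHRASE_INDUCED_SENSES = {
    ("spill", "the", "beans"): {"spill": "REVEAL", "beans": "SECRET"},
}


def idiomatic_gloss(words):
    """Replace each word by its phrase-induced sense, if the word string
    is a known configuration; return None for unknown strings."""
    senses = PHRASE_INDUCED_SENSES.get(tuple(words))
    if senses is None:
        return None
    # Words without an extra sense (here: "the") are passed through unchanged.
    return [senses.get(word, word) for word in words]
```

For the string spill the beans the function yields REVEAL the SECRET, whereas spill the milk, which is not a stored configuration, receives no idiomatic gloss: the extra senses are licensed by the phrase and not by the individual words.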

2.2.2 Idiom Decomposition Hypothesis (Conceptual Metaphor Model)

Gibbs et al. (1989) argue that idioms are not just dead metaphors whose meaning can be matched with a simple ‘single word’ literal paraphrase. The authors do not reject the role of the meaning stipulated for an idiom in the mental lexicon; however, according to the Idiom Decomposition theory, individual words in an idiomatic expression contribute to the overall figurative meaning of the idiom through the metaphoric potential that they convey.

Gibbs’ explanation of how individual words contribute to the overall meaning of the idiom is based on the work of Lakoff and Johnson (1980), who suggest that language items are motivated by pre-existing conceptual metaphorical mappings in our long-term memory that reflect our life experience. For example, understanding an idiom such as spill the beans is a matter of mapping the two metaphorical concepts that motivate the idiom: MIND IS A CONTAINER and IDEAS ARE PHYSICAL ENTITIES, which can get spilled out of the container and thus be revealed to others.

Of course, not all idioms are decomposable to the same extent on the basis of conceptual metaphor. Gibbs et al. (1989) define three degrees of idiom decomposability (analyzability), which they relate to the syntactic flexibility of idiomatic expressions. Their results suggest a direct relation between the semantic analyzability of idioms and their syntactic productivity. On the one hand, some idiomatic phrases maintain their figurative meaning in different syntactic alternations, for example John laid down the law (John enforced the rules). On the other hand, the idiomatic sentence John kicked the bucket (John died) cannot undergo passive transformation without disruption of its idiomatic interpretation. The degree of grammatical flexibility depends on the possibility of assigning particular meanings to the individual words comprising the idiom and of subsequently defining clear relations between them. Whereas the former idiom can be passivized into The law was laid down by John without any loss of figurative meaning, the latter cannot be passivized into *The bucket was kicked by John, since the idiom is semantically non-decomposable and no particular senses capable of acquiring grammatical roles can be ascribed to its individual words.

Between normally decomposable and non-decomposable idioms lies a special group of abnormally decomposable idioms that display restricted syntactic flexibility. The individual constituents of abnormally decomposable idioms do not by themselves refer directly (literally) to some component of the idiomatic referent but only create a metaphorical relation between the individual parts and the referent. An example of an abnormally decomposable idiom is carry a torch for somebody (to have warm feelings for someone). For instance, John carried a torch for Sally can be passivized into A torch for Sally was carried by John because a metaphorical relation between warm feelings and a torch implying fire and warmth can be established (Gibbs et al. 1989:578). On the basis of their subconscious sensitivity to this decomposability-flexibility relation, language users can reliably distinguish frozen and flexible idioms.

2.3 Hybrid Perspectives

2.3.1 The Graded Salience Hypothesis (Comprehension)

The Graded Salience Hypothesis, also referred to as the familiarity model, “posits the priority of salient (coded, context-independent, prominent) meanings” (Giora 2002:490). It disregards the question of compositionality or analysability of idiom meaning, and it rejects any competition between literal and figurative meanings in idiom comprehension. The hypothesis does not address the syntax-lexicon interface, but it is hybrid in the sense that it suggests direct access to the meaning of a language item without positing a default priority of the figurative interpretation. The model presumes automatic access to the most familiar (salient) interpretation of a linguistic expression. ‘Most salient’ may apply to either a literal or an idiomatic sense of the expression. This approach relativizes the absolute role of the context in recognizing the appropriate meaning of an expression and assigns more responsibility for assessing the right sense to the inherent semantic content of the expression. The decisive factor for a successful and fast interpretation is the salience and familiarity of the individual items present in the expression. In other words, “more salient meanings - coded foremost on our mind due to conventionality, frequency, familiarity or prototypicality - are accessed faster and reach sufficient levels of activation before less salient ones” (Laurent et al. 2006:151). It is important to note that, according to the model, the difference between the literal and the figurative (idiomatic) meaning of a language item is irrelevant for access to its salient meaning.

Put differently, the Graded Salience Hypothesis rejects any processing difference between literal and figurative language items and implies that the more familiar a language item is, the more prominent is its position in the mental lexicon and, consequently, the less decisive is the role played by contextual clues in meaning recognition.

The guiding role of context in appropriate meaning recognition applies only in cases in which more than one approximately equally salient meaning can be assigned to one language item. On the one hand, the model supports the assumptions of the direct access model in the case of highly salient expressions; on the other, it presupposes a sequential and context-guided access to the meaning of less familiar language items.

These conclusions arise from Giora and Fein’s series of context-conditioned comprehension tests (Giora and Fein 1999), in which access to the meanings of literal and idiomatic expressions with different levels of familiarity was assessed. For instance, the comprehension of highly familiar idioms in an idiomatically biasing context activated their salient idiomatic meanings, whereas the less salient literal meanings were hardly accessed. The same idioms set in a literally biasing context activated both the literal and the idiomatic meaning of the idioms. On the other hand, the comprehension of less familiar idioms with approximately equally salient idiomatic and literal interpretations set in an idiomatically biasing context resulted in the activation of both the literal and the idiomatic meaning of the expression. Finally, in the comprehension test with less familiar idioms set in a literally biasing context, the literal meaning of the expression was highly salient, whereas the idiomatic meaning was activated only marginally.
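The division of labour between salience and context described in this section can be sketched as a simple selection rule. The sketch below is an assumption-laden illustration and not Giora and Fein’s procedure: the integer salience scores, the margin parameter that defines ‘approximately equally salient’ and the function name access_meaning are all invented for the example, and the sketch models only which meaning is accessed first, not graded levels of activation.

```python
def access_meaning(salience, context_bias, margin=1):
    """Return the meaning that is accessed first.

    salience     -- dict mapping a meaning label to an integer salience score
                    (hypothetical values, not experimental data)
    context_bias -- the meaning label favoured by the surrounding context
    margin       -- how close two scores must be to count as
                    'approximately equally salient' (an assumed parameter)
    """
    ranked = sorted(salience, key=salience.get, reverse=True)
    top = ranked[0]
    # Meanings whose salience lies within `margin` of the most salient one.
    rivals = [m for m in ranked if salience[top] - salience[m] <= margin]
    # Context decides only among approximately equally salient meanings.
    if len(rivals) > 1 and context_bias in rivals:
        return context_bias
    return top
```

For a highly familiar idiom with scores such as {"idiomatic": 9, "literal": 3}, the idiomatic meaning is returned even in a literally biasing context; for a less familiar idiom with scores such as {"idiomatic": 5, "literal": 5}, the context determines the outcome.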

2.3.2a Syntactic-Conceptual Interface Model (Comprehension and Production)

Cutting and Bock (1997) propose a hybrid model of idiom processing that takes a twofold, simultaneously unitary and compositional, perspective on idiom representation. Their model is based on the speech production model of Levelt (1989), who suggests that idiomatic expressions have their own entries as lexical concepts. Cutting and Bock’s syntactic-conceptual interface model (Cutting and Bock 1997) assumes that idiom representation is grounded in the interplay of syntax and lexicon. From the viewpoint of syntax, every potential idiom representation consists of a set of rules forming a structural frame with terminal nodes of grammatically categorized empty slots. These slots are filled with units derived from the lexicon, namely with nodes for semantic concepts, words, morphemes and phonemes that are hierarchically interconnected. The node representation in the lexicon must comply grammatically with the specific requirements of the syntactic slot. In other words, idioms are not just lexicalised, structurally void long words but rather phrases with internal syntactic and semantic structure. Idioms are represented in the lexicon as wholes (nodes) located between the levels of lexical and conceptual nodes, hereafter referred to as lexical-concept nodes. The lexical representation of an idiom node is associated with a phrasal node (e.g. verb phrase) rather than with a single grammatical category (e.g. verb). An idiom retains structural information in its lexical representation, so that kick the bucket, for example, is represented as a phrasal node (verb phrase) in the syntactic part of the system. In the lexical part, the lexical-concept node of the idiom is associated with the individual lexical nodes (lemmas) ‘kick’, ‘the’ and ‘bucket’ that form the idiom.

The above assumptions about the unitary character of idioms rest on the observation that “idiom blends occur too rarely in spontaneous speech” (Cutting and Bock 1997:59). This observation led Cutting and Bock (1997) to design a speech error test with idiom blends under time pressure, in which idiom pairs with the same syntactic structure proved to be more prone to error substitutions than pairs with different syntactic structures (e.g. kick the bucket and meet your maker produced kick the maker). Furthermore, 93% of the substitutions were words of the same grammatical class as the original words they replaced (Cutting and Bock 1997:63). The grammatical class consistency of the word substitutions, on the other hand, supports the assumption that idioms are syntactically structured and refutes the perception of idioms as large single words without internal structure.

2.3.2b The Superlemma Theory of Idiom Production (Comprehension and Production)

The Superlemma Theory (Sprenger et al. 2006) [2], like the model of Cutting and Bock (1997), is based on Levelt’s theory of lexical access in speech production (Levelt 1989). The model supports the assumption of a hybrid, unitary and compositional, representation of idioms (Cutting and Bock 1997). An idiom has a unitary idiomatic concept that activates the individual lemmas of which it is composed, but the lemmas are not exclusively bound to one idiomatic meaning. The idiomatic expression is represented in the lexicon by a superlemma, which is activated by a specific lexical concept and in turn activates the single lemmas comprising the superlemma (Sprenger et al. 2006). For example, the concept of ‘dying’ will activate the superlemma kick the bucket, which subsequently activates the individual lemmas kick, the and bucket. The concept of ‘dying’ can also activate other superlemmas such as to bite the dust or to breathe one’s last, and these will compete for production in actual speech in the same way as simple lemmas do in the case of non-idiomatic speech (Levelt and Meyer 2000). In terms of grammatical behaviour, the specific “syntactic constraints associated with the idiom become available to the production system at the point of definite selection of the superlemma” (Kuiper et al. 2007:323).

The extra step in idiom production, namely superlemma activation, explains why it takes more time to get from the conceptualization (preverbal message) of idiomatic speech to the onset of its audible production (overt speech) than is the case with non-idiomatic speech. On the other hand, according to Levelt and Meyer (2000), the superlemma also explains the higher fluency of idiomatic speech in comparison with non-idiomatic speech. In non-idiomatic speech, a speaker can choose to start articulating either at the point when the first element of the intended string is grammatically and phonologically encoded or only when the whole intended string is encoded and ready for overt production. If the speaker opts for the first method, he/she will start overt speech production sooner but at a higher risk of disfluencies such as hesitations, pauses, or grammatical errors. With the second method, the preparation of the message prior to the onset of overt production takes longer, but the articulation proper is then more fluent (without hesitations, pauses, and ungrammaticalities) and generally faster overall. In this respect, the superlemma theory is consistent with the results of Cutting and Bock (1997), who suggest that idioms are less prone to errors than literal (novel) expressions in spontaneous speech.
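The activation chain from lexical concept to superlemma to simple lemmas, including the competition between superlemmas, can be sketched as follows. This is only an illustration under stated assumptions: the mapping CONCEPT_TO_SUPERLEMMAS, the function produce and the integer activation values are hypothetical, and splitting the superlemma string on spaces is a simplification standing in for the links between a superlemma and its lemmas.

```python
# Superlemmas linked to a lexical concept; the three idioms are the
# examples given in the text, the concept label is an assumed name.
CONCEPT_TO_SUPERLEMMAS = {
    "DYING": ["kick the bucket", "bite the dust", "breathe one's last"],
}


def produce(concept, activation):
    """Select the most active superlemma for `concept` and return it together
    with the simple lemmas it activates.

    activation -- dict mapping a superlemma to an assumed activation level;
                  superlemmas missing from the dict count as 0.
    """
    candidates = CONCEPT_TO_SUPERLEMMAS.get(concept, [])
    if not candidates:
        return None, []
    # Competition: the candidate with the highest activation is selected.
    winner = max(candidates, key=lambda s: activation.get(s, 0))
    # Selection of the superlemma then activates its individual lemmas.
    return winner, winner.split()
```

Calling produce("DYING", {"kick the bucket": 3, "bite the dust": 1}) selects kick the bucket and activates the lemmas kick, the and bucket. The selection of the superlemma is the extra step that precedes lemma activation in idiom production.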

2.3.2c Cutting and Bock vs. Superlemma Theory

The difference between Cutting and Bock’s model and the Superlemma Theory lies in the approach to the syntactic representation of idioms. Cutting and Bock assume that “idiomatic concepts activate phrasal frames that are not bound to specific lemma representations. […] They provide a phrase structure with open slots that can be filled with simple lemmas that are activated by the idiom’s lexical concept node” (Kuiper et al. 2007:324). The problem is that Cutting and Bock’s phrasal frame is an abstract syntactic structure that cannot recognize the relationship between concepts and the individual active lemmas, so that the production system may be unable to identify the speaker’s intention. This is particularly likely in cases in which a potential idiomatic expression contains open slots for more than one NP or VP, which leaves it unclear how the system decides where the particular NPs and VPs are to be inserted. For instance, in the idiom to be a wolf in sheep’s clothing, the nouns sheep and wolf could be inserted into either of the open NP slots, which would make a wolf in sheep’s clothing and a sheep in wolf’s clothing equally probable (Sprenger et al. 2006:177 qtd. in Kuiper et al. 2007:325) [3]. The superlemma theory suggests that “[t]he syntactic relationships and idiosyncratic constraints that characterize an idiom are directly applied to the lemmas involved; no additional operation is required” (Kuiper et al. 2007:325). This model of idiom processing unites the features of idiom production and comprehension, and, in comparison with Cutting and Bock’s theory, it avoids troublesome explanations for the syntactic idiosyncrasies present in many idioms.
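The slot-assignment problem raised against the phrasal-frame account can be made concrete with a short sketch. It is an illustration only: the frame notation with the placeholder {N}, the function fill_unbound and the list SUPERLEMMA_BINDING are assumptions made for the example and do not reproduce the formalism of either model.

```python
from itertools import permutations

# Abstract phrasal frame with two open noun slots, marked "{N}" (assumed notation).
FRAME = ["a", "{N}", "in", "{N}'s", "clothing"]

# Superlemma-style representation: the constraints are applied directly to
# the lemmas, so each noun is already bound to its position.
SUPERLEMMA_BINDING = ["a", "wolf", "in", "sheep's", "clothing"]


def fill_unbound(frame, nouns):
    """Return every string obtainable when the active nouns are free to
    occupy any open noun slot of the frame."""
    results = []
    for order in permutations(nouns):
        remaining = iter(order)
        words = []
        for token in frame:
            if "{N}" in token:
                # Fill the next open slot with the next noun in this ordering.
                token = token.replace("{N}", next(remaining))
            words.append(token)
        results.append(" ".join(words))
    return results
```

With the nouns wolf and sheep, fill_unbound(FRAME, ["wolf", "sheep"]) produces both a wolf in sheep's clothing and a sheep in wolf's clothing, and nothing in the frame favours one over the other. Joining SUPERLEMMA_BINDING yields only the intended string, because no separate slot-assignment step is needed.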


The above overview of theories of the production, storage and comprehension of idioms underlines the relative position of these expressions at the interface of grammar and lexicon. The assumption that a superlemma is involved in idiom production is well supported by the higher fluency of idiomatic speech and its minimal rate of overt hesitations and errors.

The question remains whether idioms are units stored with their meanings stipulated a priori and retrieved from memory as wholes, or whether they are living compositional and conceptual entities. The fact that many idioms have both figurative and literal readings, as well as the evidence confirming the contextual sensitivity of idioms, offers interesting hypotheses regarding this question but is, paradoxically, also the major source of conflict between the processual and factual assumptions of idiom comprehension.


[1] Glucksberg, S. (1993). Idiom meaning and allusional content. In: Cacciari, C. & P. Tabossi (eds.). Idioms: Processing, Structure, and Interpretation, 3-26. Hillsdale, New Jersey: Lawrence Erlbaum.

[2] Sprenger, S., Levelt, W.J.M., Kempen, G. (2006). Lexical access during the production of idiomatic phrases. Journal of Memory and Language 54, 161-184. In: Kuiper, K., van Egmond, M.E., Kempen, G., Sprenger, S. (2007). Slipping on superlemmas: Multi-word lexical items in speech production. The Mental Lexicon 2:3, 313-357.

[3] Sprenger, S., Levelt, W.J.M., Kempen, G. (2006). Lexical access during the production of idiomatic phrases. Journal of Memory and Language 54, 161-184. In: Kuiper, K., van Egmond, M.E., Kempen, G., Sprenger, S. (2007). Slipping on superlemmas: Multi-word lexical items in speech production. The Mental Lexicon 2:3, 313-357.


Bobrow, S. A., Bell, S. M. (1973). On catching on to idiomatic expressions. Memory and Cognition 1, 343-346.

Cacciari, C., Glucksberg, S. (1991). Understanding idiomatic expressions: The contribution of word meanings. In: Understanding word and sentence, ed. Greg B. Simpson, 217-240. North-Holland: Elsevier.

Cacciari, C., Tabossi, P. (1988). The comprehension of idioms. Journal of Memory and Language 27, 668-683.

Cutting, J.C., Bock, K. (1997). That’s the way the cookie bounces: Syntactic and semantic components of experimentally elicited idiom blends. Memory and Cognition 25(1), 57-71.

Gibbs, R. W. (1980). Spilling the beans on understanding and memory for idioms in context. Memory and Cognition 8, 149-156.

Gibbs, R. W. (1992). What do idioms really mean? Journal of Memory and Language 31, 485-506.

Gibbs, R. W. (2002). A new look at literal meaning in understanding what is said and implicated. Journal of Pragmatics 34, 457-486.

Gibbs, R. W., Nayak, N. P., Cutting, C. (1989). How to kick the bucket and not decompose: Analyzability and idiom processing. Journal of Memory and Language 28, 576-593.

Giora, R. (2002). Literal vs. figurative meaning: Different or equal? Journal of Pragmatics 34, 487-506.

Giora, R., Fein, O. (1999). On understanding familiar and less-familiar figurative language. Journal of Pragmatics 31, 1601-1618.

Katz, J.J., Postal, P.M. (1963). Semantic interpretation of idioms and sentences containing them. In: M.I.T. Research Laboratory of Electronics, Quarterly Progress Report 70, 275-282.

Kuiper, K., van Egmond, M.E., Kempen, G., Sprenger, S. (2007). Slipping on superlemmas: Multi-word lexical items in speech production. The Mental Lexicon 2:3, 313-357.

Lakoff, G., Johnson, M. (1980). Metaphors We Live By. Chicago, London: The University of Chicago Press.

Laurent, J.P., Denhières, G., Passerieux, Ch., Iakimova, G., Hardy-Baylé, M.C. (2006). On understanding idiomatic language: The salience hypothesis assessed by ERPs. Brain Research 1068, 151-160.

Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge, Massachusetts: The MIT Press.

Levelt, W.J.M., Meyer, A.S. (2000). Word for word: Multiple lexical access in speech production. European Journal of Cognitive Psychology 12(4), 433-452.

Nayak, N.P., Gibbs, R.W. (1990). Conceptual knowledge in the interpretation of idioms. Journal of Experimental Psychology: General 119(3), 315-330.

Ortony, A., Schallert, D. L., Reynolds, R. E., Antos, S. J. (1978). Interpreting metaphors and idioms: some effects of context on comprehension. Journal of Verbal Learning and Verbal Behavior 17, 465-477.

Swinney, D. A., Cutler, A. (1979). The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior 18, 523-534.

Vega-Moreno, R. (2001). Representing and processing idioms. UCL Working Papers in Linguistics 13, 73-107.

[Viewed on 2017-06-26] is published by
Albis - Giorgio Cadorini
(From 2004 to 2016 the journal was published by
The Vilém Mathesius Society,
Opava, Czech Republic)
Copyright © 2003-2017,