Mokken scale

The Mokken scale is a psychometric method of data reduction. A Mokken scale is a unidimensional scale that consists of hierarchically ordered items that measure the same underlying latent concept. The method is named after the political scientist Rob Mokken, who proposed it in 1971.[1]

Mokken scales have been used in psychology,[2] education,[3][4] political science,[1][5] public opinion,[6] medicine[7] and nursing.[8][9]


Figures: an example of an item response function; item response functions that differ in their difficulty; item response functions that differ in their discrimination.

Mokken scaling belongs to item response theory. In essence, a Mokken scale is a non-parametric, probabilistic version of the Guttman scale. Both Guttman and Mokken scaling can be used to assess whether a number of items measure the same underlying concept, and both assume that the items are hierarchically ordered, that is, ordered by degree of "difficulty". Difficulty here refers to the percentage of respondents who answer the question affirmatively: the lower that percentage, the more difficult the item. The hierarchical order means that a respondent who answers a difficult question correctly is assumed to answer an easier question correctly as well.[10] The key difference between the two is that Mokken scaling is probabilistic. A Guttman scale assumes that every respondent who answers a difficult question affirmatively will also answer an easier question affirmatively; violations of this pattern are called Guttman errors. A Mokken scale assumes only that respondents who answer a difficult question affirmatively are more likely to answer an easier question affirmatively. The scalability of the scale is measured by Loevinger's coefficient H, which compares the observed number of Guttman errors with the number expected if the items were unrelated.[10]
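As an illustration, the comparison of observed and expected Guttman errors can be sketched for dichotomous (0/1) items as H = 1 - F/E, where F is the observed number of Guttman errors summed over all item pairs and E is the number expected if the items were independent. The function name `loevinger_h`, the variable names and the list-of-rows data layout are assumptions made for this sketch; it is not the implementation used by any particular software package.

```python
from itertools import combinations


def loevinger_h(data):
    """Scale coefficient H for dichotomous items.

    data: list of respondents, each a list of 0/1 item scores.
    Assumes every item has at least one 0 and one 1; otherwise the
    expected error count is zero and H is undefined.
    """
    n = len(data)
    k = len(data[0])
    # Proportion of affirmative answers per item (higher = easier item).
    p = [sum(row[i] for row in data) / n for i in range(k)]

    observed = 0.0
    expected = 0.0
    for i, j in combinations(range(k), 2):
        easy, hard = (i, j) if p[i] >= p[j] else (j, i)
        # Guttman error: affirmative on the harder item, negative on the easier one.
        observed += sum(1 for row in data if row[hard] == 1 and row[easy] == 0)
        # Expected number of such patterns if the two items were independent.
        expected += n * p[hard] * (1 - p[easy])
    return 1 - observed / expected
```

Under this sketch, a perfect Guttman pattern gives H = 1 because no errors are observed, and items that are statistically unrelated give H = 0 because the observed and expected error counts coincide.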

The probability that a respondent answers an item correctly is described by an item response function. Mokken scales are similar to Rasch scales in that both adapt Guttman scales to a probabilistic model. However, Mokken scaling is described as 'non-parametric' because it makes no assumptions about the precise shape of the item response function, only that it is monotonically non-decreasing. The key difference between Mokken scales and Rasch scales is that the latter assume that the item response functions of all items have the same shape. In Mokken scaling, the item response functions may differ between items.[5]
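In standard item response theory notation, the contrast between the two models can be sketched as follows. The symbols are assumptions introduced here for illustration and do not come from the cited sources: θ is the latent trait, δ_i the difficulty of item i, and P_i(θ) the item response function of item i.

```latex
% Rasch model: every item has the same logistic shape, shifted by its difficulty \delta_i
P_i(\theta) = \frac{\exp(\theta - \delta_i)}{1 + \exp(\theta - \delta_i)}

% Mokken scaling: no functional form is assumed, only monotonicity
\theta_a \le \theta_b \;\Longrightarrow\; P_i(\theta_a) \le P_i(\theta_b)
```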

Mokken scales come in two forms. The first is the Double Monotonicity model, in which the items can differ in their difficulty; it is essentially an ordinal version of the Rasch scale. The second is the Monotone Homogeneity model, in which items differ in their discrimination, meaning that the relationship with the latent variable can be weaker for some items than for others.[5] Double Monotonicity models are used most often.

Monotone Homogeneity

Monotone Homogeneity models are based on three assumptions.[5]

  1. There is a unidimensional latent trait on which subjects and items can be ordered.
  2. The item response function is monotonically nondecreasing. This means that as one moves from lower to higher values of the latent variable, the chance of giving a positive response should never decrease.
  3. The items are locally stochastically independent: responses to any two items by the same respondent should not be a function of any aspect of the respondent or the item other than the respondent's position on the latent trait.[5]
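Because the latent trait cannot be observed directly, the monotonicity assumption is commonly examined through an observable stand-in. The sketch below uses the rest score (the total score on all other items) for dichotomous data and checks that the proportion of positive responses does not decrease as the rest score rises. The function names and the `tolerance` parameter are hypothetical; practical software typically also merges small rest-score groups and tests whether a decrease is statistically significant, which this sketch does not do.

```python
from collections import defaultdict


def restscore_proportions(data, item):
    """Proportion of positive responses to `item` per rest-score group.

    data: list of respondents, each a list of 0/1 item scores.
    Returns (rest_score, proportion) pairs sorted by rest score.
    """
    groups = defaultdict(list)
    for row in data:
        rest = sum(row) - row[item]  # total score on all other items
        groups[rest].append(row[item])
    return [(rest, sum(v) / len(v)) for rest, v in sorted(groups.items())]


def monotonicity_violations(data, item, tolerance=0.0):
    """Adjacent rest-score groups where the proportion drops by more than `tolerance`."""
    props = restscore_proportions(data, item)
    return [
        (r_low, r_high)
        for (r_low, p_low), (r_high, p_high) in zip(props, props[1:])
        if p_high < p_low - tolerance
    ]
```

An empty list means that no decrease was found between adjacent rest-score groups for that item in the sample at hand.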

Double monotonicity and invariant item ordering

The Double Monotonicity model adds a fourth assumption, namely non-intersecting item response functions, so that the rank ordering of the items remains invariant.[11] There has been some confusion in Mokken scaling between the concepts of the Double Monotonicity model and invariant item ordering.[12] The latter implies that the items have the same order of difficulty for all respondents, across the whole range of the latent trait. For dichotomously scored items, the Double Monotonicity model implies invariant item ordering; however, for polytomously scored items this does not necessarily hold.[13] For invariant item ordering to hold, not only must the item response functions not intersect, but the item step response functions between one level and the next within each item must not intersect either.[14]
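The non-intersection condition can be illustrated with a small sketch that evaluates a set of item response functions on a grid of latent trait values and reports the pairs whose order reverses somewhere on the grid. The logistic curves are used purely as convenient example functions; Mokken scaling itself does not assume a logistic shape, and the names `logistic_irf` and `intersecting_pairs` are hypothetical.

```python
import math
from itertools import combinations


def logistic_irf(difficulty, discrimination=1.0):
    """Example item response function (illustrative only)."""
    def irf(theta):
        return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return irf


def intersecting_pairs(irfs, grid, eps=1e-12):
    """Item pairs whose response functions change order somewhere on the grid."""
    crossing = []
    for i, j in combinations(range(len(irfs)), 2):
        diffs = [irfs[i](theta) - irfs[j](theta) for theta in grid]
        # A strictly positive and a strictly negative difference means the curves cross.
        if any(d > eps for d in diffs) and any(d < -eps for d in diffs):
            crossing.append((i, j))
    return crossing
```

Curves that differ only in difficulty never cross, so the function returns an empty list for them, whereas two curves with the same difficulty but different discrimination cross at that difficulty and are reported as an intersecting pair.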

Sample size

The issue of sample size for Mokken scaling is largely unresolved. Work using simulated samples and varying the item quality in the scales (Loevinger's coefficient and the correlation between scales) suggests that sample sizes in the region of 250–500 are sufficient where the quality of the items is high, compared with sample sizes of 1250–1750 where the item quality is low.[3] An analysis of real data from the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS)[15] suggests that the required sample size depends on the Mokken scaling parameters of interest, as they do not all respond in the same way to varying sample size.[16]


While Mokken scaling analysis was originally developed to measure the extent to which individual dichotomous items form a scale, it has since been extended to polytomous items.[5] Moreover, while Mokken scaling analysis is a confirmatory method, meant to test whether a number of items form a coherent scale (like confirmatory factor analysis), an Automatic Item Selection Procedure has been developed to explore which latent dimensions structure responses on a number of observable items (like exploratory factor analysis).[17]
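The general idea of a bottom-up item selection can be sketched as a greedy search over dichotomous items. This is a deliberately simplified illustration and not the published procedure: it builds a single scale only, omits all significance tests, and adds the candidate with the highest item coefficient. The function names and the lower bound of 0.3 are assumptions made for this sketch.

```python
from itertools import combinations


def errors_and_expected(data, i, j):
    """Observed and expected Guttman errors for one pair of 0/1 items."""
    n = len(data)
    p_i = sum(row[i] for row in data) / n
    p_j = sum(row[j] for row in data) / n
    easy, hard = (i, j) if p_i >= p_j else (j, i)
    observed = sum(1 for row in data if row[hard] == 1 and row[easy] == 0)
    expected = n * min(p_i, p_j) * (1 - max(p_i, p_j))
    return observed, expected


def h_pair(data, i, j):
    observed, expected = errors_and_expected(data, i, j)
    return 1 - observed / expected


def h_item(data, item, others):
    """Coefficient of `item` with respect to the items already selected."""
    pairs = [errors_and_expected(data, item, other) for other in others]
    return 1 - sum(f for f, _ in pairs) / sum(e for _, e in pairs)


def select_scale(data, lower_bound=0.3):
    """Greedy, simplified item selection; returns sorted item indices."""
    k = len(data[0])
    start = max(combinations(range(k), 2), key=lambda pair: h_pair(data, *pair))
    if h_pair(data, *start) < lower_bound:
        return []
    scale = list(start)
    while True:
        candidates = []
        for item in range(k):
            if item in scale:
                continue
            # Skip items that are not positively related to every selected item.
            if any(h_pair(data, item, s) <= 0 for s in scale):
                continue
            h = h_item(data, item, scale)
            if h >= lower_bound:
                candidates.append((h, item))
        if not candidates:
            return sorted(scale)
        scale.append(max(candidates)[1])
```

Raising `lower_bound` in this sketch yields shorter scales whose items are more strongly related, while lowering it admits weaker items.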


Mokken scaling software is available for the free statistical software R and for the data analysis and statistical software Stata. MSP5 for Windows, a program for personal computers, is no longer compatible with current versions of Microsoft Windows. Within R, unusual response patterns in Mokken scales can also be checked using the package PerFit.[18] Two guides on how to conduct a Mokken scale analysis have been published.[19][20]


  1. ^ a b Mokken, Rob (1971). A theory and procedure of scale analysis: With applications in political research. Walter de Gruyter.
  2. ^ Bedford, A.; Watson, R.; Lyne, J.; Tibbles, J.; Davies, F.; Deary, I.J. (2009). "Mokken scaling and principal components analyses of the CORE-OM in a large clinical sample". Clinical Psychology and Psychotherapy. 17 (1): 51–62. doi:10.1002/cpp.649. PMID 19728291. S2CID 10445195.
  3. ^ a b Straat, J.H.; van der Ark, L.A.; Sijtsma, K. (2014). "Minimum Sample Size Requirements for Mokken Scale Analysis". Educational and Psychological Measurement. 74 (5): 809–822.
  4. ^ Palmgren, P.J.; Brodin, U.; Nilsson, G.H.; Watson, R.; Stenfors, T. (2018). "Investigating psychometric properties and dimensional structure of an educational environment measure (DREEM) using Mokken scale analysis – a pragmatic approach". BMC Medical Education. 18 (1): 235. doi:10.1186/s12909-018-1334-8.
  5. ^ a b c d e f van Schuur, Wijbrandt (2003). "Mokken scale analysis: Between the Guttman scale and parametric item response theory". Political Analysis. 11 (2): 139–163. doi:10.1093/pan/mpg002.
  6. ^ Gillespie, M.; Tenvergert, E.M.; Kingma, J. (1987). "Using Mokken scale analysis to develop unidimensional scales". Quality and Quantity. 21 (4): 393–408. doi:10.1007/BF00172565. S2CID 118280333.
  7. ^ Stochl, J.; Jones, P.B.; Croudace, T.J. (2012). "Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers". BMC Medical Research Methodology. 12: 74. doi:10.1186/1471-2288-12-74. PMC 3464599. PMID 22686586.
  8. ^ Cook, N.F.; McCance, T.; McCormack, B.; Barr, O.; Slater, P. (2018). "Perceived caring attributes and priorities of pre‐registration nursing students throughout a nursing curriculum underpinned by person‐centredness". Journal of Clinical Nursing. doi:10.1111/jocn.14341.
  9. ^ Aleo, G.; Bagnasco, A.; Watson, R.; Dyson, J.; Cowdell, F.; Catania, G.; Zanini, M.P.; Cozani, E.; Parodi, A.; Saso, L. (2019). "Comparing questionnaires across cultures: Using Mokken scaling to compare the Italian and English versions of the MOLES index". Nursing Open. doi:10.1002/nop2.297.
  10. ^ a b Crichton, N. (1999). "Mokken Scale Analysis". Journal of Clinical Nursing. 8: 388.
  11. ^ "Introduction to Nonparametric Item Response Theory - SAGE Research Methods". Retrieved 2019-11-06.
  12. ^ Meijer, R.R. (2010). "A comment on Watson, Deary, and Austin (2007) and Watson, Roberts, Gow, and Deary (2008): How to investigate whether personality items form a hierarchical scale?". Personality and Individual Differences. doi:10.1016/j.paid.2009.11.004.
  13. ^ Ligtvoet, R.; van der Ark, L.A.; te Marvelde, J.M.; Sijtsma, K. (2010). "Investigating an Invariant Item Ordering for Polytomously Scored Items". Educational and Psychological Measurement. 70 (4): 578–595.
  14. ^ Sijtsma, K.; Meijer, R.R.; van der Ark, L.A. (2011). "Mokken scale analysis as time goes by: An update for scaling practitioners". Personality and Individual Differences. 50: 31–37.
  15. ^
  16. ^ Watson, Roger; Egberink, Iris JL; Kirke, Lisa; Tendeiro, Jorge N.; Doyle, Frank (2018). "What are the minimal sample size requirements for Mokken scaling? An empirical example with the Warwick- Edinburgh Mental Well-Being Scale". Health Psychology and Behavioral Medicine. 6 (1): 203–213. doi:10.1080/21642850.2018.1505520. PMC 8114397. PMID 34040828.
  17. ^ van der Ark, L.A. (Andries) (2012). "New Developments in Mokken Scale Analysis in R". Journal of Statistical Software. 48 (5). doi:10.18637/jss.v048.i05.
  18. ^ Meijer, R.R.; Niessen, A.S.M.; Tendeiro, J.N. (2015). "A practical guide to check the consistency of item response patterns in clinical research through person-fit statistics: examples and a computer programme". Assessment. 23: 56–62.
  19. ^ Sijtsma, K; van der Ark, A (2016). "A tutorial on how to do a Mokken scale analysis on your test and questionnaire data". British Journal of Mathematical and Statistical Psychology. 70 (1): 137–185. doi:10.1111/bmsp.12078. hdl:11245.1/459fd643-a539-445a-a67a-b62b88c5a262. PMID 27958642.
  20. ^ Wind, Stefanie A. (2017). "An Instructional Module on Mokken Scale Analysis". Educational Measurement: Issues and Practice. 36 (2): 50–66. doi:10.1111/emip.12153.