Talk:BERT (language model)

Title

This page should probably be moved to BERT (language representation model) rather than BERT (language model). "Language model" has a specific meaning: a model of the joint probability distribution over sequences of words. BERT doesn't do that. Although it can predict a masked word, it can't give you the joint probability distribution over a whole sequence.

This would also be consistent with Wikipedia's own definition of a language model.
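
To make the distinction above concrete, here is a rough sketch in standard notation. The symbols (w_1, ..., w_n for the words of a sequence, i for the masked position) are illustrative assumptions, not taken from the article. A joint distribution over a sequence can always be factored with the chain rule, whereas masked-word prediction yields a conditional for a single position given the words on both sides:

```latex
% Joint probability of a word sequence, factored by the chain rule
P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})

% Masked-word prediction: a conditional for one masked position i,
% given the surrounding words on both sides
P(w_i \mid w_1, \dots, w_{i-1}, w_{i+1}, \dots, w_n)
```

The second expression alone does not directly give the first, which seems to be the point being made about the title.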

I agree. Let us move it unless we see substantial protests. Trondtr (talk) 14:03, 1 October 2021 (UTC)