For example, in the sentence "Mary said she would help me", "she" and "Mary" most likely refer to the same person, in which case they are coreferent. Similarly, in "I saw Scott yesterday. He was fishing by the lake," "Scott" and "he" are most likely coreferent.
The pattern of these examples is typical: when first introducing a person or other topic for discussion, an author or speaker will use a relatively long or detailed expression, such as a definite description. Later mentions are briefer, and once they are reduced to mere pronouns, references are frequently ambiguous. In the "Mary said she would help me" example, although the most likely reading is that "she" refers to Mary, "she" could instead refer to someone else (most likely someone introduced earlier in the dialog).
In computational linguistics, coreference resolution is a well-studied problem in discourse. In order to derive the correct interpretation of text, or even to estimate the relative importance of various mentioned subjects, pronouns and other referring expressions need to be connected to the right individuals.
Algorithms intended to resolve coreference commonly look first for the nearest preceding individual that is compatible with the referring expression. For example, "she" might attach to a preceding expression such as "the woman" or "Anne", but not to "Bill". Reflexive pronouns such as "himself" have much stricter constraints: they typically require an antecedent within the same clause. Algorithms for resolving coreference tend to have an accuracy of roughly 75% (as with many linguistic tasks, there is a tradeoff between precision and recall).
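The nearest-compatible-antecedent heuristic described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Mention` class, the toy `PRONOUNS` lexicon, and the gender and number labels are hypothetical, and real systems use far richer features than simple agreement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    """A candidate antecedent (hypothetical structure for this sketch)."""
    text: str
    gender: str  # "f", "m", "n", or "?" when unknown
    number: str  # "sg" or "pl"


# Toy lexicon (an assumption, not a complete list): pronoun -> (gender, number).
PRONOUNS = {
    "she": ("f", "sg"),
    "her": ("f", "sg"),
    "he": ("m", "sg"),
    "him": ("m", "sg"),
    "it": ("n", "sg"),
    "they": ("?", "pl"),
}


def compatible(pronoun: str, candidate: Mention) -> bool:
    """Check gender and number agreement between a pronoun and a candidate."""
    gender, number = PRONOUNS[pronoun.lower()]
    gender_ok = "?" in (gender, candidate.gender) or gender == candidate.gender
    return gender_ok and number == candidate.number


def resolve(pronoun: str, preceding: List[Mention]) -> Optional[Mention]:
    """Return the nearest preceding mention that agrees with the pronoun.

    `preceding` is ordered as the mentions appear in the text, so the
    search walks it backwards, starting from the most recent mention.
    """
    for candidate in reversed(preceding):
        if compatible(pronoun, candidate):
            return candidate
    return None


mentions = [Mention("Anne", "f", "sg"), Mention("Bill", "m", "sg")]
antecedent = resolve("she", mentions)
print(antecedent.text if antecedent else "no antecedent found")
```

Here "Bill" is the nearest preceding mention, but it fails the gender check, so the search continues back to "Anne". When no candidate agrees, the function returns `None` rather than guessing.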
A classic problem for coreference resolution in English is the pronoun "it", which has many uses. "It" can refer much like "he" and "she", except that it generally refers to inanimate objects (the rules are actually more complex: animals may be any of "it", "he", or "she"; ships have traditionally been "she"; hurricanes are not usually referred to as "she" or "he" despite having gendered names). "It" can also refer to abstractions rather than beings: "He was paid minimum wage, but didn't seem to mind it." Finally, "it" also has pleonastic uses, which do not refer in anything like the same way as "he" and "she" do:
- It's raining.
- It's really a shame.
- It takes a lot of work to be a success.
- Sometimes it's those who are loudest who have the most influence.
Pleonastic uses are not considered referential, and so are not part of coreference. Li et al. (2009) have demonstrated very high accuracy in sorting out pleonastic "it", and this success promises to improve the accuracy of coreference resolution overall.
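As a rough illustration of what filtering pleonastic "it" involves, the following Python sketch flags the four example sentences above using a handful of hand-written regular expressions. This is not the web-based method of Li et al. (2009); it is a toy pattern heuristic, and the pattern list and the function name `is_pleonastic` are assumptions chosen for this example. Patterns this simple will both miss and over-match real text.

```python
import re

# Forms of "be" that may follow "it": "it's", "it is", "it was".
_BE = r"(?:'s|\s+is|\s+was)"

# Hand-written patterns (illustrative only, far from exhaustive).
PLEONASTIC_PATTERNS = [
    # Weather predicates: "It's raining."
    re.compile(r"\bit" + _BE + r"\s+(?:raining|snowing|hailing)\b", re.I),
    # Evaluative nouns with an optional adverb: "It's really a shame."
    re.compile(r"\bit" + _BE + r"\s+(?:\w+\s+)?a\s+(?:shame|pity)\b", re.I),
    # "It takes ... to ...": "It takes a lot of work to be a success."
    re.compile(r"\bit\s+takes\b.+\bto\b", re.I),
    # Cleft-like constructions: "it's those who ..."
    re.compile(r"\bit" + _BE + r"\s+(?:\w+\s+){1,3}(?:who|that)\b", re.I),
]


def is_pleonastic(sentence: str) -> bool:
    """Return True if any pattern suggests a non-referential "it"."""
    return any(p.search(sentence) for p in PLEONASTIC_PATTERNS)


examples = [
    "It's raining.",
    "It's really a shame.",
    "It takes a lot of work to be a success.",
    "Sometimes it's those who are loudest who have the most influence.",
    "He was paid minimum wage, but didn't seem to mind it.",
]
for sentence in examples:
    print(is_pleonastic(sentence), sentence)
```

A coreference resolver could run such a filter first and skip any "it" that is flagged, so that only referential uses are passed on to antecedent search.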
- Illinois Coreference Package: a coreference resolution package implemented in Java.
- Yifan Li, Petr Musilek, Marek Reformat, and Loren Wyard-Scott. "Identification of Pleonastic It Using the Web." Journal of Artificial Intelligence Research 34 (2009): 339-389. http://www.jair.org/media/2622/live-2622-4362-jair.ps