Anne O'Tate is a free, web-based application[1] that analyses sets of records retrieved from PubMed, the bibliographic database of articles from over 5,500 biomedical journals worldwide. While PubMed offers a wide range of search options for identifying records relevant to a researcher's query, it lacks the ability to analyse these sets of records further, a process for which the terms "text mining" and "drill down" have been used. Anne O'Tate can perform such analysis and can process sets of up to 25,000 PubMed records.[1]


Once a set of articles has been identified through Anne O'Tate's PubMed-like interface and search syntax, the set can be analysed, and the words and concepts mentioned in specific 'fields' (sections) of the PubMed records can be displayed in order of frequency.[2] The 'fields' that Anne O'Tate can display in this manner are:

Topics (MeSH)

This option may help to identify possible Medical Subject Headings (known as MeSH terms, but called 'Topics' by Anne O'Tate) for a subject for which no corresponding subject heading or 'entry term' (a cross-reference to the preferred MeSH term) exists, or where PubMed's automatic mapping process (which identifies a MeSH term and includes it in the search formulation) fails.

Searching, for instance, for articles on "Knowledge Transfer" (for which no corresponding MeSH or entry term exists) retrieves a set of some 530 studies in PubMed (as of August 2011). Anne O'Tate's analysis suggests that MeSH terms such as "Diffusion of Innovation" or "Information Dissemination" may be suitable additional concepts for retrieving a more 'sensitive' (comprehensive) set of references. This method of identifying possible MeSH terms is not available on PubMed.
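The ranking of field values by frequency can be illustrated with a minimal sketch. It is not Anne O'Tate's implementation: the record structure, the field name `mesh`, and the function `rank_field_values` are assumptions made for illustration, and the sample records are invented rather than taken from PubMed.

```python
from collections import Counter


def rank_field_values(records, field):
    """Count in how many records each value of `field` occurs, most frequent first."""
    counts = Counter()
    for record in records:
        # Use a set so a value repeated within one record is counted only once.
        counts.update(set(record.get(field, [])))
    # Sort by descending count, then alphabetically so ties have a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented sample records (hypothetical, not real PubMed data).
sample_records = [
    {"mesh": ["Diffusion of Innovation", "Information Dissemination"]},
    {"mesh": ["Information Dissemination", "Humans"]},
    {"mesh": ["Information Dissemination", "Diffusion of Innovation", "Humans"]},
    {"mesh": ["Humans"]},
]

ranking = rank_field_values(sample_records, "mesh")
for term, count in ranking:
    print(f"{count:3d}  {term}")
```

The same function could be applied to any other list-valued field, such as authors or journals, assuming the records are prepared in the same hypothetical structure.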


Analysing the author field may help to identify authors who have written frequently about a given subject, and so to identify possible experts or peer reviewers.


Identifying journals which publish papers on the subject under investigation may assist with selecting suitable journals to consider for manuscripts or for detailed scanning for relevant articles ('hand searching'[3]) not found by the search on PubMed.

Other fields

Author affiliations (addresses) and the years of publication can also be analysed. 'Important words' from titles and abstracts, which may "[...] have more frequent occurrences in the result subset than in the MEDLINE as a whole, thus they distinguish the result subset from the rest of MEDLINE",[4] can be identified and may help to refine a search on PubMed further.[5][6][7]
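The idea of words that occur more often in a result set than in the collection as a whole can be sketched as a comparison of two shares: the share of result-set texts containing a word, and the share of background texts containing it. The simple ratio below is an assumption chosen for illustration and is not the statistic used by Anne O'Tate; the function names, the `min_ratio` threshold, and the sample titles are all hypothetical.

```python
import re


def document_frequency(texts):
    """Map each lowercase word to the number of texts that contain it."""
    freq = {}
    for text in texts:
        for word in set(re.findall(r"[a-z]+", text.lower())):
            freq[word] = freq.get(word, 0) + 1
    return freq


def characteristic_words(subset, background, min_ratio=2.0):
    """Return (word, ratio) pairs for words over-represented in `subset`.

    ratio = (share of subset texts with the word) / (share of background texts with it).
    """
    sub_freq = document_frequency(subset)
    bg_freq = document_frequency(background)
    results = []
    for word, count in sub_freq.items():
        bg_count = bg_freq.get(word, 0)
        if bg_count == 0:
            # Skip words absent from the background; the ratio is undefined.
            continue
        ratio = (count * len(background)) / (bg_count * len(subset))
        if ratio >= min_ratio:
            results.append((word, ratio))
    # Highest ratio first, alphabetical order for ties.
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Invented titles (hypothetical); the background includes the result subset.
subset = ["Knowledge transfer in hospitals", "Knowledge transfer and policy"]
others = [
    "Surgery in hospitals",
    "Policy and surgery outcomes",
    "Cancer treatment outcomes",
    "Cancer and diet",
]
background = subset + others

important = characteristic_words(subset, background)
print(important)
```

In this toy data, common words such as "and" or "in" fall below the threshold because they are spread across the whole background, while words concentrated in the result set are retained.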


Anne O'Tate (a pun on the word 'annotate') was developed by Neil R. Smalheiser and a team of researchers from the University of Illinois at Chicago. It is part of the Arrowsmith Project, which developed tools such as "Arrowsmith" proper, a text-comparison application,[8] "ADAM", a database of medical abbreviations,[9] "Author-ity", an author-disambiguation tool,[10] "Compendium", a list of biomedical text-mining tools, and Anne O'Tate itself. The Project is based on research led by Don R. Swanson at the University of Chicago,[11] which hosted the original tool.[12] Further research was led by Neil R. Smalheiser at the University of Illinois at Chicago, with funding from the National Institutes of Health.[13]

Other PubMed text-mining applications

A wide range of text-mining applications for PubMed has been developed,[4] including GoPubMed, ClusterMed, and PubReMiner, each using its own interface. Only Anne O'Tate uses PubMed's standard interface, search syntax, and some of its functionality.


  1. Smalheiser, N. R.; Zhou, W.; Torvik, V. I. (2008). "Anne O'Tate: A tool to support user-driven summarization, drill-down and browsing of PubMed search results". Journal of Biomedical Discovery and Collaboration. 3: 2. doi:10.1186/1747-5333-3-2. PMC 2276193. PMID 18279519.
  2. Palidwor, G. A.; Andrade-Navarro, M. A. (2010). "MLTrends: Graphing MEDLINE term usage over time". Journal of Biomedical Discovery and Collaboration. 5: 1–6. PMC 2990277. PMID 20333611.
  3. Langham, J.; Thompson, E.; Rowan, K. (1999). "Identification of randomized controlled trials from the emergency medicine literature: Comparison of hand searching versus MEDLINE searching". Annals of Emergency Medicine. 34 (1): 25–34. doi:10.1016/s0196-0644(99)70268-4. PMID 10381991.
  4. Lu, Z. (2011). "PubMed and beyond: A survey of web tools for searching biomedical literature". Database. 2011: baq036. doi:10.1093/database/baq036. PMC 3025693. PMID 21245076.
  5. Wilczynski, N. L.; Walker, C. J.; McKibbon, K. A.; Haynes, R. B. (1995). "Reasons for the loss of sensitivity and specificity of methodologic MeSH terms and textwords in MEDLINE". Proceedings. Symposium on Computer Applications in Medical Care: 436–440. PMC 2579130. PMID 8563319.
  6. Greenhalgh, T. (1997). "How to read a paper. The Medline database". BMJ (Clinical Research Ed.). 315 (7101): 180–183. doi:10.1136/bmj.315.7101.180. PMC 2127107. PMID 9251552.
  7. Smalheiser, N. R.; Zhou, W.; Torvik, V. I. (2011). "Distribution of 'Characteristic' Terms in MEDLINE Literatures". Information. 2 (4): 266–276. doi:10.3390/info2020266.
  8. Smalheiser, N. R.; Torvik, V. I.; Zhou, W. (2009). "Arrowsmith two-node search interface: A tutorial on finding meaningful links between two disparate sets of articles in MEDLINE". Computer Methods and Programs in Biomedicine. 94 (2): 190–197. doi:10.1016/j.cmpb.2008.12.006. PMC 2693227. PMID 19185946.
  9. Zhou, W.; Torvik, V. I.; Smalheiser, N. R. (2006). "ADAM: Another database of abbreviations in MEDLINE". Bioinformatics. 22 (22): 2813–2818. doi:10.1093/bioinformatics/btl480. PMID 16982707.
  10. Torvik, V. I.; Smalheiser, N. R. (2009). "Author Name Disambiguation in MEDLINE". ACM Transactions on Knowledge Discovery from Data. 3 (3): 1–29. doi:10.1145/1552303.1552304. PMC 2805000. PMID 20072710.
  11. Swanson, D. R.; Smalheiser, N. R. (Summer 1999). "Implicit Text Linkages between Medline Records: Using Arrowsmith as an Aid to Scientific Discovery" (PDF). Library Trends. 48 (1): 48–59. Retrieved July 4, 2011.
  12. "Arrowsmith-2 on Linux". The University of Chicago. Archived from the original on June 18, 2009. Retrieved July 4, 2011.
  13. Smalheiser, N. R. (October 2005). "The Arrowsmith Project: 2005 Status Report". Discovery Science. 8th international conference on discovery science. Lecture Notes in Computer Science. Vol. 3735. pp. 26–43. doi:10.1007/11563983_5. ISBN 978-3-540-29230-2.

External links