An example diagram of Swanson linking, using the ABC paradigm

Literature-based discovery (LBD), also called literature-related discovery (LRD), is a form of knowledge extraction and automated hypothesis generation that uses papers and other academic publications (the "literature") to find new relationships between existing knowledge (the "discovery"). It aims to discover new knowledge by connecting information that has been explicitly stated in the literature in order to deduce connections that have not been explicitly stated. [1]

LBD can help researchers to quickly discover and explore hypotheses as well as gain information on relevant advances inside and outside of their niches and increase interdisciplinary information sharing. [1]

The most basic and widespread type of LBD is called the ABC paradigm because it centers around three concepts called A, B and C. [2][3][4] It states that if there is a connection between A and B and one between B and C, then there is one between A and C which, if not explicitly stated, is yet to be explored. [1]

History

The LBD technique was pioneered by Don R. Swanson in the 1980s. [5] He hypothesized that the combination of two separately published results, one indicating an A-B relationship and the other a B-C relationship, is evidence of an A-C relationship that is unknown or unexplored. He used this to propose fish oil as a treatment for Raynaud syndrome due to their shared relationship with blood viscosity. [6] This hypothesis was later shown to have merit in a prospective study, [7] and he went on to propose other discoveries using similar methods. [8][9][10][1]

Swanson linking

Swanson linking is a term proposed in 2003[11] that refers to connecting two pieces of knowledge previously thought to be unrelated.[12] For example, it may be known that illness A is caused by chemical B, and that drug C is known to reduce the amount of chemical B in the body. However, because the respective articles were published separately from one another (called "disjoint data"), the relationship between illness A and drug C may be unknown. Swanson linking aims to find these relationships and report them.

Although the ABC paradigm is widely used, critics have argued that much of science is not captured by simple assertions and is instead built from analogies and images at a higher level of abstraction. [13]

Systems

LBD comes generally in two flavours: open and closed discovery. In open discovery, only A is given. The approach finds Bs and uses them to return possibly interesting Cs to the user, thus generating hypotheses from A. With closed discovery, the A and C are given to the approach which seeks to find the Bs which can link the two, thus testing a hypothesis about A and C.[1]
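The two modes can be illustrated with a minimal Python sketch. It assumes that a "connection" simply means two concepts being mentioned in the same paper; the toy corpus and the function names (`build_links`, `open_discovery`, `closed_discovery`) are hypothetical and are not taken from any published LBD system.

```python
from collections import defaultdict

# Hypothetical toy "literature": each set holds the concepts co-mentioned in one paper.
PAPERS = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud syndrome"},
    {"platelet aggregation", "Raynaud syndrome"},
    {"magnesium", "vascular tone"},
    {"vascular tone", "migraine"},
]


def build_links(papers):
    """Map each concept to the set of concepts it co-occurs with."""
    links = defaultdict(set)
    for concepts in papers:
        for concept in concepts:
            links[concept] |= concepts - {concept}
    return links


def open_discovery(links, a):
    """Given only A, return {C: linking Bs} for every C not directly linked to A."""
    candidates = defaultdict(set)
    for b in links[a]:
        for c in links[b]:
            if c != a and c not in links[a]:
                candidates[c].add(b)
    return dict(candidates)


def closed_discovery(links, a, c):
    """Given A and C, return the Bs that are linked to both."""
    return links[a] & links[c]


links = build_links(PAPERS)
print(open_discovery(links, "fish oil"))
print(closed_discovery(links, "magnesium", "migraine"))
```

Open discovery starts from one concept and fans out two steps, while closed discovery intersects the neighbourhoods of two given concepts. Real systems would additionally rank and filter the candidates rather than return every indirect link.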

A number of systems to perform literature-based discovery have been developed over the years, extending the original idea of Don Swanson, and the evaluation of the quality of such systems is an active area of research. [14] Some systems include web versions for increased user-friendliness. [15] An approach common to many systems is the use of MeSH terms to represent scientific articles. This is used by the systems Manjal, BITOLA and LitLinker. [16]

One well-known system within the field is called Arrowsmith and is tailored to find connections between two disjoint sets of articles, an approach labeled "two-node" search. [17][18]

Another well-known system, LION LBD, [19] uses PubTator [20] for annotating PubMed scientific articles with concepts such as chemicals, genes/proteins, mutations, diseases and species; as well as sentence-level annotation of cancer hallmarks that describe fundamental cancer processes and behaviour[21]. It uses co-occurrence metrics to rank relations between concepts and performs both open and closed discovery.[1]

While some LBD systems are based on traditional statistical methods, [16] others leverage more sophisticated machine learning methods, such as neural networks.[1] Some LBD systems represent the connections between concepts as a knowledge graph, and thus employ techniques from graph theory. [22] The graph-based representation is also the foundation for LBD systems that employ graph databases like Neo4j, enabling discovery via graph query languages such as Cypher. [23]

Graph-based LBD systems represent the relations between concepts using different relation types, such as those in the UMLS Semantic Network. [24] Some approaches go further and try to apply contextualized relations, [25] an approach also used by the Gene Ontology for its Causal Activity Modeling (GO-CAM). [26]

Use of databases

Besides extracting information from the body of scientific articles, LBD systems often employ structured knowledge from biocurated biological resources, like the Online Mendelian Inheritance in Man (OMIM).[27]

List of systems

 
The Anni 2.0 literature-based discovery system, employing a workflow similar to other LBD systems. [28]

These are the published LBD systems, ordered by date of publication: [29]

  • 1986 - Arrowsmith [30]
  • 2000 - BITOLA V1 [31]
  • 2001 - DAD [32]
  • 2003 - LitLinker [33]
  • 2004 - ACS [34]
  • 2004 - Manjal [35]
  • 2004 - IRIDESCENT [36]
  • 2005 - BITOLA V2 [37]
  • 2006 - LitLinker V2 [38]
  • 2007 - Arrowsmith V2 [39]
  • 2008 - Anni 2.0 [28]
  • 2008 - CoPub Discovery [40]
  • 2009 - RaJoLink [41]
  • 2010 - Sem-BT [42]
  • 2015 - Obvio [43]
  • 2016 - Spark [44]
  • 2017 - Mine the gap [45]
  • 2019 - LION LBD [46]

Semantic typing

A common task in literature-based discovery is assigning words or concepts to semantic types. A concept might be classified under one type or under several. For example, in the Unified Medical Language System (UMLS) the term migraine is classified under the type disease or syndrome, while the term magnesium falls under two types: biologically active substance and element, ion, or isotope. [16] Typing concepts makes it possible to restrict the discovery of connections to particular classes of concepts, e.g. diseases-genes or diseases-drugs. [16]
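A minimal Python sketch of type-based filtering is shown here. The semantic types for migraine and magnesium follow the example above; the third entry, the `SEMANTIC_TYPES` table and the `filter_by_type` function are hypothetical illustrations and do not reflect the actual UMLS data or API.

```python
# Concept -> set of semantic types. The first two entries follow the example in
# the text; "example gene" is a hypothetical placeholder entry.
SEMANTIC_TYPES = {
    "migraine": {"Disease or Syndrome"},
    "magnesium": {"Biologically Active Substance", "Element, Ion, or Isotope"},
    "example gene": {"Gene or Genome"},
}


def filter_by_type(candidates, allowed_types, type_map=SEMANTIC_TYPES):
    """Keep only candidates that carry at least one of the allowed semantic types.

    Candidates missing from the type map are dropped, since their type is unknown.
    """
    return [c for c in candidates if type_map.get(c, set()) & allowed_types]


candidates = ["migraine", "magnesium", "example gene", "untyped term"]
print(filter_by_type(candidates, {"Element, Ion, or Isotope"}))
```

Because a concept can hold several types, the filter uses set intersection rather than equality, so magnesium matches a query for either of its two types.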

System evaluation

The evaluation of literature-based discoveries is challenging and includes both experimental and in silico methods.[47] Evaluation methods try to quantify the knowledge generated by a system, which should be provided in an amount and richness that is useful to scientists. [48]

Evaluation is difficult in LBD for several reasons: disagreement about the role of LBD systems in research and thus what makes a successful one; difficulty in determining how useful, interesting or actionable a discovery is; and difficulty in objectively defining a ‘discovery’, which hinders the creation of a standard evaluation set which quantifies when a discovery has been replicated or found. [1]

A popular evaluation method in LBD is to replicate previous discoveries. [49][50][51] These are usually LBD-based discoveries, as they are relatively easy to quantify compared to other discoveries. However, there are only a handful of such discoveries, and approaches tuned to perform well on them might not generalise. In this type of evaluation, the literature published before the discovery to be replicated is used to generate a ranked list of discovery candidates as target or linking terms. Success is measured by reporting the rank of the term(s) of interest; the higher the rank, the better the approach.
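The rank-based measure can be sketched in a few lines of Python. The candidate names and scores are hypothetical placeholders, and `rank_of_target` is an illustrative helper rather than part of any published system; it assumes that a higher score means a stronger candidate.

```python
def rank_of_target(scores, target):
    """Return the 1-based rank of `target` when candidates are sorted by
    descending score (ties broken alphabetically), or None if it is absent."""
    if target not in scores:
        return None
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked.index(target) + 1


# Hypothetical scores produced by some LBD approach on pre-discovery literature.
scores = {"candidate_a": 15.0, "candidate_b": 12.0, "candidate_c": 7.0}
print(rank_of_target(scores, "candidate_b"))
```

A rank of 1 means the approach placed the known discovery at the top of its candidate list.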

Literature- or time-slicing involves splitting the existing literature at a point in time. The LBD system is then exposed to the literature before the split and is evaluated by how many of the discoveries in the later period it can find. LBD systems have used term co-occurrences[52], relationships from external biomedical resources (e.g. SemMedDB)[53] and semantic relationships[54] to generate the gold standards. A high-precision approach is to use expert opinion to generate the gold standard[55], but this is time-consuming, expensive and tends to produce low recall rates. [1]
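A minimal Python sketch of time-slicing with a co-occurrence gold standard is given here. The corpus, the cutoff year and all function names are hypothetical, and the predictor is a bare ABC rule used only so that there is something to evaluate.

```python
from itertools import combinations

# Hypothetical toy corpus: (publication year, concepts co-mentioned in the paper).
CORPUS = [
    (1980, {"A", "B"}),
    (1982, {"B", "C"}),
    (1983, {"A", "D"}),
    (1984, {"E", "F"}),
    (1990, {"A", "C"}),
    (1991, {"D", "E"}),
]


def pairs(papers):
    """Return all unordered concept pairs that co-occur in at least one paper."""
    found = set()
    for _, concepts in papers:
        found |= {frozenset(p) for p in combinations(sorted(concepts), 2)}
    return found


def time_slice(corpus, cutoff):
    """Split at `cutoff`; the gold standard is pairs first seen on or after it."""
    known = pairs([p for p in corpus if p[0] < cutoff])
    gold = pairs([p for p in corpus if p[0] >= cutoff]) - known
    return known, gold


def abc_predictions(known):
    """Predict A-C for every B linked to both, unless A-C is already known."""
    neighbours = {}
    for pair in known:
        x, y = tuple(pair)
        neighbours.setdefault(x, set()).add(y)
        neighbours.setdefault(y, set()).add(x)
    predicted = set()
    for linked in neighbours.values():
        for a, c in combinations(sorted(linked), 2):
            if frozenset((a, c)) not in known:
                predicted.add(frozenset((a, c)))
    return predicted


def precision_recall(predicted, gold):
    """Score predictions against the gold standard from the later period."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


known, gold = time_slice(CORPUS, cutoff=1985)
predicted = abc_predictions(known)
print(precision_recall(predicted, gold))
```

In this toy corpus the system sees only the pre-1985 papers; the A-C pair it predicts appears in the later period, while D-E is a later discovery it misses.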

The advantage of time-slicing over the replication of previous discoveries is that it allows evaluation on a large number of test instances. This raises the need for evaluation metrics that can quantify performance on large, ranked lists. [1] LBD studies have used metrics popular in information retrieval, [56] which include precision, recall, area under the curve (AUC), precision at k, mean average precision (MAP) and others. [1]
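Two of these ranked-list metrics can be sketched in Python using their standard information-retrieval definitions. The function names and the example lists are hypothetical; `relevant` stands for the gold-standard discoveries of one query.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked candidates that are in the gold standard."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears,
    normalised by the total number of relevant items."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant)


def mean_average_precision(runs):
    """Average the per-query average precision over (ranked, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranked outputs for two starting concepts.
runs = [
    (["c1", "c2", "c3", "c4"], {"c1", "c3"}),
    (["x", "y"], {"y"}),
]
print(precision_at_k(runs[0][0], runs[0][1], k=2))
print(mean_average_precision(runs))
```

Precision at k looks only at the top of the list, which matches how a researcher would inspect candidates, whereas MAP rewards placing every gold-standard item early.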

Proposing new discoveries or treatments goes beyond replicating past discoveries or predicting time-sliced instances of a particular relationship, and shows that a system is capable of being used in realistic situations. [57][58][59][60] This is usually accompanied by peer-reviewed publication in the domain or vetting by a domain expert. [1]

Text mining

 
Gene name normalization, an important step in LBD when dealing with genes[61]

The automation of literature-based discovery relies heavily on text mining. [62]

The language of scientific articles often includes ambiguities, and an important step for coherent parsing of the literature is extracting the sense of each term in the context in which it is used, a task called word sense disambiguation (WSD). [63] For example, gene symbols such as CT (PCYT1A) and MR (NR3C2) can be confused with the acronyms for computed tomography and magnetic resonance, requiring sophisticated disambiguation systems.[64] Terms are often reconciled to ontologies or other sources of unique identifiers, such as the Unified Medical Language System (UMLS). [65]

Usage

Life sciences

LBD has already been used in different ways to identify new connections between biomedical entities and new candidate genes and treatments for illnesses. [66][1]

Drug discovery

LBD has seen use in drug development and repurposing [67][68] as well as predicting adverse drug reactions. [69][70][1]

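The method of literature-based discovery has been used to search for treatments for a number of human diseases, including:

  • Neovascularization in diabetic retinopathy [71]
  • Parkinson's disease [72]
  • Multiple sclerosis [73]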

Gene and protein function discovery

The approach has also been used to propose relations between genes and particular diseases, [74] such as breast cancer. [75] In the context of systems vaccinology, it was used to identify proteins that are related to interferon gamma and play a role in the response to vaccines. [76] It has also been used to propose mechanisms for currently used drugs. [77]

Biomarker discovery

LBD has been explored as a tool to identify diagnostic and prognostic biomarkers for diseases, e.g. for the risk of type 2 diabetes. [78]

Other uses

Besides providing scientific hypotheses about the world, LBD has also been used to improve data analysis, via the automatic identification of possible confounding factors using the medical literature. [79]

It has also been used to better understand disease etiology and the relations between different diseases, for example by looking for the genes connecting myocardial infarction and depression, [80] finding connections between psychiatric and somatic diseases,[81] and connecting testosterone blood levels and sleep quality. [82]

Beyond life sciences

LBD has mostly been deployed in the biomedical domain, but it has also been applied outside it, for example to research on water purification systems, to accelerating the development of developing countries, and to identifying promising research collaborations. [83][84][85]

See also

Additional reading

  • Wilson, Patrick (1977). Public Knowledge, Private Ignorance: Toward a Library and Information Policy. Greenwood Publishing Group. p. 156. ISBN 0-8371-9485-7.

References

  1. ^ a b c d e f g h i j k l m n Crichton, Gamal; Baker, Simon; Guo, Yufan; Korhonen, Anna (2020-05-15). "Neural networks for open and closed Literature-based Discovery". PLOS One. 15 (5): e0232891. doi:10.1371/JOURNAL.PONE.0232891. PMC 7228051. PMID 32413059. This article incorporates text available under the CC BY 4.0 license.
  2. ^ Smalheiser, Neil R; Swanson, Don R (1998-11). "Using ARROWSMITH: a computer-assisted approach to formulating and assessing scientific hypotheses". Computer Methods and Programs in Biomedicine. 57 (3): 149–153. doi:10.1016/s0169-2607(98)00033-9. ISSN 0169-2607.
  3. ^ Gordon, Michael D.; Lindsay, Robert K. (1996-02). "Toward discovery support systems: A replication, re-examination, and extension of Swanson's work on literature-based discovery of a connection between Raynaud's and fish oil". Journal of the American Society for Information Science. 47 (2): 116–128. doi:10.1002/(sici)1097-4571(199602)47:2<116::aid-asi3>3.0.co;2-1. ISSN 0002-8231.
  4. ^ Cohen, Trevor; Schvaneveldt, Roger; Widdows, Dominic (2010-04). "Reflective Random Indexing and indirect inference: A scalable method for discovery of implicit connections". Journal of Biomedical Informatics. 43 (2): 240–256. doi:10.1016/j.jbi.2009.09.003. ISSN 1532-0464.
  5. ^ Smalheiser, Neil R. (2017-12-01). "Rediscovering Don Swanson: The Past, Present and Future of Literature-based Discovery". Journal of Data and Information Science. 2 (4): 43–64. doi:10.1515/jdis-2017-0019. PMC 5771422. PMID 29355246.
  6. ^ Swanson, Don R. (1986). "Fish Oil, Raynaud's Syndrome, and Undiscovered Public Knowledge". Perspectives in Biology and Medicine. 30 (1): 7–18. doi:10.1353/pbm.1986.0087. ISSN 1529-8795.
  7. ^ Ricco, Jean Baptiste (1990-05). "Fish-oil dietary supplementation in patients with Raynaud's phenomenon: a double blind, controlled, prospective study". Journal of Vascular Surgery. 11 (5): 733–734. doi:10.1016/0741-5214(90)90229-4. ISSN 0741-5214.
  8. ^ Swanson, Don R. (1988). "Migraine and Magnesium: Eleven Neglected Connections". Perspectives in Biology and Medicine. 31 (4): 526–557. doi:10.1353/pbm.1988.0009. ISSN 1529-8795.
  9. ^ Swanson, Don R. (1990). "Somatomedin C and Arginine: Implicit Connections between Mutually Isolated Literatures". Perspectives in Biology and Medicine. 33 (2): 157–186. doi:10.1353/pbm.1990.0031. ISSN 1529-8795.
  10. ^ Smalheiser, Neil R.; Swanson, Don R. (1996-09). "Linking estrogen to Alzheimer's disease". Neurology. 47 (3): 809–810. doi:10.1212/wnl.47.3.809. ISSN 0028-3878.
  11. ^ Stegmann J, Grohmann G. Hypothesis generation guided by co-word clustering. Scientometrics. 2003;56:111–135. As quoted by Bekhuis
  12. ^ Bekhuis, Tanja (2006). "Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy". Biomedical Digital Libraries. 3: 2. doi:10.1186/1742-5581-3-2. PMC 1459187. PMID 16584552.
  13. ^ Smalheiser, Neil R. (2011-07-26). "Literature-based discovery: Beyond the ABCs". Journal of the Association for Information Science and Technology. 63 (2): 218–224. doi:10.1002/ASI.21599.
  14. ^ Yetisgen-Yildiz, Meliha; Pratt, Wanda (2008-12-16). "A new evaluation methodology for literature-based discovery systems". Journal of Biomedical Informatics. 42 (4): 633–643. doi:10.1016/J.JBI.2008.12.001. PMID 19124086.
  15. ^ Hur, Junguk; Schuyler, Adam D.; States, David J.; Feldman, Eva L. (2009-02-02). "SciMiner: web-based literature mining tool for target identification and functional enrichment analysis". Bioinformatics. 25 (6): 838–840. doi:10.1093/bioinformatics/btp049. ISSN 1460-2059.
  16. ^ a b c d Yetisgen-Yildiz, Meliha; Pratt, Wanda (2006-01-04). "Using statistical and knowledge-based approaches for literature-based discovery". Journal of Biomedical Informatics. 39 (6): 600–611. doi:10.1016/J.JBI.2005.11.010. PMID 16442852.
  17. ^ Smalheiser, Neil R.; Torvik, Vetle I. (2008), Bruza, Peter; Weeber, Marc (eds.), "The Place of Literature-Based Discovery in Contemporary Scientific Practice", Literature-based Discovery, Information Science and Knowledge Management, Berlin, Heidelberg: Springer, pp. 13–22, doi:10.1007/978-3-540-68690-3_2, ISBN 978-3-540-68690-3, retrieved 2022-03-04
  18. ^ "ARROWSMITH: Start". arrowsmith.psych.uic.edu. Retrieved 2022-03-04.
  19. ^ Pyysalo, Sampo; Baker, Simon; Ali, Imran; Haselwimmer, Stefan; Shah, Tejas; Young, Andrew; Guo, Yufan; Högberg, Johan; Stenius, Ulla; Narita, Masashi; Korhonen, Anna (2018-10-09). "LION LBD: a literature-based discovery system for cancer biology". Bioinformatics. 35 (9): 1553–1561. doi:10.1093/bioinformatics/bty845. ISSN 1367-4803.
  20. ^ Wei, Chih-Hsuan; Kao, Hung-Yu; Lu, Zhiyong (2013-05-22). "PubTator: a web-based text mining tool for assisting biocuration". Nucleic Acids Research. 41 (W1): W518–W522. doi:10.1093/nar/gkt441. ISSN 1362-4962.
  21. ^ Baker, Simon; Ali, Imran; Silins, Ilona; Pyysalo, Sampo; Guo, Yufan; Högberg, Johan; Stenius, Ulla; Korhonen, Anna (2017-07-14). "Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer". Bioinformatics. 33 (24): 3973–3981. doi:10.1093/bioinformatics/btx454. ISSN 1367-4803.
  22. ^ Cameron, Delroy; Kavuluru, Ramakanth; Rindflesch, Thomas C.; Sheth, Amit P.; Thirunarayan, Krishnaprasad; Bodenreider, Olivier (2015-02-07). "Context-driven automatic subgraph creation for literature-based discovery". Journal of Biomedical Informatics. 54: 141–157. doi:10.1016/J.JBI.2015.01.014. PMC 4888806. PMID 25661592.
  23. ^ Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C. (2015-01-01). "Constructing a Graph Database for Semantic Literature-Based Discovery". Studies in Health Technology and Informatics. 216: 1094. PMID 26262393.
  24. ^ Preiss, Judita; Stevenson, Mark; Gaizauskas, Robert (2015-05-13). "Exploring relation types for literature-based discovery". Journal of the American Medical Informatics Association. 22 (5): 987–992. doi:10.1093/JAMIA/OCV002. PMC 4986660. PMID 25971437.
  25. ^ Kim, Yong Hwan; Song, Min (2019-04-24). "A context-based ABC model for literature-based discovery". PLOS One. 14 (4): e0215313. doi:10.1371/JOURNAL.PONE.0215313. PMC 6481912. PMID 31017923.
  26. ^ Thomas, Paul D.; Hill, David P.; Mi, Huaiyu; Osumi-Sutherland, David; Auken, Kimberly Van; Carbon, Seth J.; Balhoff, James P.; Albou, Laurent-Philippe; Good, Benjamin M.; Gaudet, Pascale; Lewis, Suzanna (2019-10-01). "Gene Ontology Causal Activity Modeling (GO-CAM) moves beyond GO annotations to structured descriptions of biological functions and systems". Nature Genetics. 51 (10): 1429–1433. doi:10.1038/S41588-019-0500-1. PMC 7012280. PMID 31548717.
  27. ^ Hristovski, Dimitar; Peterlin, Borut; Mitchell, Joyce A.; Humphrey, Susanne M. (2003-01-01). "Improving literature based discovery support by genetic knowledge integration". Studies in Health Technology and Informatics. 95: 68–73. PMID 14663965.
  28. ^ a b Jelier, Rob; Schuemie, Martijn J.; Veldhoven, Antoine; Dorssers, Lambert C. J.; Jenster, Guido; Kors, Jan A. (2008-06-12). "Anni 2.0: a multipurpose text-mining tool for the life sciences". Genome Biology. 9 (6): R96. doi:10.1186/GB-2008-9-6-R96. PMC 2481428. PMID 18549479.
  29. ^ Gopalakrishnan, Vishrawas; Jha, Kishlay; Jin, Wei; Zhang, Aidong (2019-03-09). "A survey on literature based discovery approaches in biomedical domain". Journal of Biomedical Informatics. 93: 103141. doi:10.1016/J.JBI.2019.103141. PMID 30857950.
  30. ^ Swanson, Don R. (1986). "Fish Oil, Raynaud's Syndrome, and Undiscovered Public Knowledge". Perspectives in Biology and Medicine. 30 (1): 7–18. doi:10.1353/pbm.1986.0087. ISSN 1529-8795.
  31. ^ Hristovski, Dimitar; Džeroski, Sašo; Peterlin, Borut; Rožić, Anamajirja (2000), "Supporting Discovery in Medicine by Association Rule Mining of Bibliographic Databases", Principles of Data Mining and Knowledge Discovery, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 446–451, ISBN 978-3-540-41066-9, retrieved 2022-03-17
  32. ^ Weeber, Marc; Klein, Henny; de Jong-van den Berg, Lolkje T.W.; Vos, Rein (2001). "Using concepts in literature-based discovery: Simulating Swanson's Raynaud-fish oil and migraine-magnesium discoveries". Journal of the American Society for Information Science and Technology. 52 (7): 548–557. doi:10.1002/asi.1104. ISSN 1532-2882.
  33. ^ Pratt, Wanda; Yetisgen-Yildiz, Meliha (2003). "LitLinker". Proceedings of the international conference on Knowledge capture - K-CAP '03. New York, New York, USA: ACM Press. doi:10.1145/945645.945662.
  34. ^ van der Eijk, C. Christiaan; van Mulligen, Erik M.; Kors, Jan A.; Mons, Barend; van den Berg, Jan (2004). "Constructing an associative concept space for literature-based discovery". Journal of the American Society for Information Science and Technology. 55 (5): 436–444. doi:10.1002/asi.10392. ISSN 1532-2882.
  35. ^ Srinivasan, P.; Libbus, B. (2004-07-19). "Mining MEDLINE for implicit links between dietary substances and diseases". Bioinformatics. 20 (Suppl 1): i290–i296. doi:10.1093/bioinformatics/bth914. ISSN 1367-4803.
  36. ^ Wren, Jonathan D. Extending the mutual information measure to rank inferred literature relationships. OCLC 1186487448.
  37. ^ Hristovski, Dimitar; Peterlin, Borut; Mitchell, Joyce A.; Humphrey, Susanne M. (2005-03). "Using literature-based discovery to identify disease candidate genes". International Journal of Medical Informatics. 74 (2–4): 289–298. doi:10.1016/j.ijmedinf.2004.04.024. ISSN 1386-5056.
  38. ^ Yetisgen-Yildiz, Meliha; Pratt, Wanda (2006-12). "Using statistical and knowledge-based approaches for literature-based discovery". Journal of Biomedical Informatics. 39 (6): 600–611. doi:10.1016/j.jbi.2005.11.010. ISSN 1532-0464.
  39. ^ Torvik, Vetle I.; Smalheiser, Neil R. (2007-04-26). "A quantitative model for linking two disparate sets of articles in MEDLINE". Bioinformatics. 23 (13): 1658–1665. doi:10.1093/bioinformatics/btm161. ISSN 1460-2059.
  40. ^ Frijters, R.; Heupers, B.; van Beek, P.; Bouwhuis, M.; van Schaik, R.; de Vlieg, J.; Polman, J.; Alkema, W. (2008-05-19). "CoPub: a literature-based keyword enrichment tool for microarray data analysis". Nucleic Acids Research. 36 (Web Server): W406–W410. doi:10.1093/nar/gkn215. ISSN 0305-1048.
  41. ^ Petrič, Ingrid; Urbančič, Tanja; Cestnik, Bojan; Macedoni-Lukšič, Marta (2009-04). "Literature mining method RaJoLink for uncovering relations between biomedical concepts". Journal of Biomedical Informatics. 42 (2): 219–227. doi:10.1016/j.jbi.2008.08.004. ISSN 1532-0464.
  42. ^ Hristovski, Dimitar; Kastrin, Andrej; Peterlin, Borut; Rindflesch, Thomas C. (2010), "Combining Semantic Relations and DNA Microarray Data for Novel Hypotheses Generation", Linking Literature, Information, and Knowledge for Biology, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 53–61, ISBN 978-3-642-13130-1, retrieved 2022-03-17
  43. ^ Cameron, Delroy; Kavuluru, Ramakanth; Rindflesch, Thomas C.; Sheth, Amit P.; Thirunarayan, Krishnaprasad; Bodenreider, Olivier (2015-04). "Context-driven automatic subgraph creation for literature-based discovery". Journal of Biomedical Informatics. 54: 141–157. doi:10.1016/j.jbi.2015.01.014. ISSN 1532-0464.
  44. ^ Workman, T. Elizabeth; Fiszman, Marcelo; Cairelli, Michael J.; Nahl, Diane; Rindflesch, Thomas C. (2016-04-01). "Spark, an application based on Serendipitous Knowledge Discovery". Journal of Biomedical Informatics. 60: 23–37. doi:10.1016/j.jbi.2015.12.014. ISSN 1532-0464.
  45. ^ Peng, Yufang; Bonifield, Gary; Smalheiser, Neil R. (2017-05-22). "Gaps within the Biomedical Literature: Initial Characterization and Assessment of Strategies for Discovery". Frontiers in Research Metrics and Analytics. 2. doi:10.3389/frma.2017.00003. ISSN 2504-0537.
  46. ^ Pyysalo, Sampo; Baker, Simon; Ali, Imran; Haselwimmer, Stefan; Shah, Tejas; Young, Andrew; Guo, Yufan; Högberg, Johan; Stenius, Ulla; Narita, Masashi; Korhonen, Anna (2018-10-09). "LION LBD: a literature-based discovery system for cancer biology". Bioinformatics. 35 (9): 1553–1561. doi:10.1093/bioinformatics/bty845. ISSN 1367-4803.
  47. ^ Henry, M. S. Sam; McInnes, Bridget T. (2017-08-21). "Literature Based Discovery: models, methods, and trends". Journal of Biomedical Informatics. doi:10.1016/J.JBI.2017.08.011.
  48. ^ Preiss, Judita; Stevenson, Mark (2017-05-31). "Quantifying and filtering knowledge generated by literature based discovery". BMC Bioinformatics. 18 (Suppl 7): 249. doi:10.1186/S12859-017-1641-9. PMC 5471938. PMID 28617217.
  49. ^ Cohen, Trevor; Schvaneveldt, Roger; Widdows, Dominic (2010-04). "Reflective Random Indexing and indirect inference: A scalable method for discovery of implicit connections". Journal of Biomedical Informatics. 43 (2): 240–256. doi:10.1016/j.jbi.2009.09.003. ISSN 1532-0464.
  50. ^ Swanson, Don R.; Smalheiser, Neil R. (1997-04). "An interactive system for finding complementary literatures: a stimulus to scientific discovery". Artificial Intelligence. 91 (2): 183–203. doi:10.1016/s0004-3702(97)00008-8. ISSN 0004-3702.
  51. ^ Weeber, M.; Klein, H.; Aronson, A. R.; Mork, J. G.; de Jong-van den Berg, L. T.; Vos, R. Text-based discovery in biomedicine: the architecture of the DAD-system. American Medical Informatics Association. OCLC 678976989.
  52. ^ Hristovski, Dimitar; Džeroski, Sašo; Peterlin, Borut; Rožić, Anamajirja (2000), "Supporting Discovery in Medicine by Association Rule Mining of Bibliographic Databases", Principles of Data Mining and Knowledge Discovery, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 446–451, ISBN 978-3-540-41066-9, retrieved 2022-03-15
  53. ^ Eronen, Lauri; Hintsanen, Petteri; Toivonen, Hannu (2012), "Biomine: A Network-Structured Resource of Biological Entities for Link Prediction", Bisociative Knowledge Discovery, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 364–378, ISBN 978-3-642-31829-0, retrieved 2022-03-15
  54. ^ Preiss, Judita; Stevenson, Mark; Gaizauskas, Robert (2015-05-12). "Exploring relation types for literature-based discovery". Journal of the American Medical Informatics Association. 22 (5): 987–992. doi:10.1093/jamia/ocv002. ISSN 1527-974X.
  55. ^ Yetisgen-Yildiz, Meliha; Pratt, Wanda (2009-08). "A new evaluation methodology for literature-based discovery systems". Journal of Biomedical Informatics. 42 (4): 633–643. doi:10.1016/j.jbi.2008.12.001. ISSN 1532-0464.
  56. ^ Yetisgen-Yildiz, M.; Pratt, W. (2008), "Evaluation of Literature-Based Discovery Systems", Literature-based Discovery, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 101–113, ISBN 978-3-540-68685-9, retrieved 2022-03-15
  57. ^ Hristovski, Dimitar; Kastrin, Andrej; Peterlin, Borut; Rindflesch, Thomas C. (2010), "Combining Semantic Relations and DNA Microarray Data for Novel Hypotheses Generation", Linking Literature, Information, and Knowledge for Biology, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 53–61, ISBN 978-3-642-13130-1, retrieved 2022-03-15
  58. ^ Swanson, Don R.; Smalheiser, Neil R. (1997-04). "An interactive system for finding complementary literatures: a stimulus to scientific discovery". Artificial Intelligence. 91 (2): 183–203. doi:10.1016/s0004-3702(97)00008-8. ISSN 0004-3702.
  59. ^ Stegmann, Johannes. Hypothesis generation guided by co-word clustering. OCLC 1196712382.
  60. ^ Wren, J. D.; Bekeredjian, R.; Stewart, J. A.; Shohet, R. V.; Garner, H. R. (2004-01-22). "Knowledge discovery by automated identification and ranking of implicit relationships". Bioinformatics. 20 (3): 389–398. doi:10.1093/bioinformatics/btg421. ISSN 1367-4803.
  61. ^ Ozgür, Arzucan; Xiang, Zuoshuang; Radev, Dragomir R.; He, Yongqun (2010-06-03). "Literature-based discovery of IFN-gamma and vaccine-mediated gene interaction networks". Journal of Biomedicine and Biotechnology. 2010: 426479. doi:10.1155/2010/426479. PMC 2896678. PMID 20625487.
  62. ^ Korhonen, Anna; Guo, Yufan; Baker, Simon; Yetisgen-Yildiz, Meliha; Stenius, Ulla; Narita, Masashi; Liò, Pietro (2015-01-01). "Improving Literature-Based Discovery with Advanced Text Mining". Lecture Notes in Computer Science: 89–98. doi:10.1007/978-3-319-24462-4_8.
  63. ^ Preiss, Judita; Stevenson, Mark (2016-07). "The effect of word sense disambiguation accuracy on literature based discovery". BMC Medical Informatics and Decision Making. 16 (S1). doi:10.1186/s12911-016-0296-1. ISSN 1472-6947.
  64. ^ Kastrin, Andrej; Hristovski, Dimitar (2008-11-06). "A fast document classification algorithm for gene symbol disambiguation in the BITOLA literature-based discovery support system". AMIA Annual Symposium proceedings: 358–362. PMC 2655979. PMID 18998999.
  65. ^ a b Gabetta, Matteo; Larizza, Cristiana; Bellazzi, Riccardo (2013-01-01). "A Unified Medical Language System (UMLS) based system for Literature-Based Discovery in medicine". Studies in Health Technology and Informatics. 192: 412–416. PMID 23920587.
  66. ^ Hristovski, Dimitar; Rindflesch, Thomas; Peterlin, Borut (2013-01-01). "Using Literature-based Discovery to Identify Novel Therapeutic Approaches". Cardiovascular & Hematological Agents in Medicinal Chemistry. 11 (1): 14–24. doi:10.2174/1871525711311010005. ISSN 1871-5257.
  67. ^ Hristovski, Dimitar; Kastrin, Andrej; Peterlin, Borut; Rindflesch, Thomas C. (2010), "Combining Semantic Relations and DNA Microarray Data for Novel Hypotheses Generation", Linking Literature, Information, and Knowledge for Biology, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 53–61, ISBN 978-3-642-13130-1, retrieved 2022-03-15
  68. ^ a b Zhang, Rui; Cairelli, Michael J.; Fiszman, Marcelo; Kilicoglu, Halil; Rindflesch, Thomas C.; Pakhomov, Serguei V.; Melton, Genevieve B. (2014-01). "Exploiting Literature-derived Knowledge and Semantics to Identify Potential Prostate Cancer Drugs". Cancer Informatics. 13s1: CIN.S13889. doi:10.4137/cin.s13889. ISSN 1176-9351.
  69. ^ Benzschawel, Eric (2016). "Identifying Potential Adverse Drug Events in Tweets Using Bootstrapped Lexicons". Proceedings of the ACL 2016 Student Research Workshop. Stroudsburg, PA, USA: Association for Computational Linguistics. doi:10.18653/v1/p16-3003.
  70. ^ Shang, Ning; Xu, Hua; Rindflesch, Thomas C.; Cohen, Trevor (2014-12). "Identifying plausible adverse drug reactions using knowledge extracted from the literature". Journal of Biomedical Informatics. 52: 293–310. doi:10.1016/j.jbi.2014.07.011. ISSN 1532-0464.
  71. ^ Maver, Ales; Hristovski, Dimitar; Rindflesch, Thomas C.; Peterlin, Borut (2013-11-24). "Integration of Data from Omic Studies with the Literature-Based Discovery towards Identification of Novel Treatments for Neovascularization in Diabetic Retinopathy". BioMed Research International. 2013: e848952. doi:10.1155/2013/848952. ISSN 2314-6133.
  72. ^ Kostoff, Ronald N.; Briggs, Michael B. (2008-02). "Literature-Related Discovery (LRD): Potential treatments for Parkinson's Disease". Technological Forecasting and Social Change. 75 (2): 226–238. doi:10.1016/j.techfore.2007.11.007. ISSN 0040-1625.
  73. ^ Kostoff, Ronald N.; Briggs, Michael B.; Lyons, Terence J. (2008-02). "Literature-related discovery (LRD): Potential treatments for Multiple Sclerosis". Technological Forecasting and Social Change. 75 (2): 239–255. doi:10.1016/j.techfore.2007.11.002. ISSN 0040-1625.
  74. ^ Hristovski, Dimitar; Peterlin, B.; Dzeroski, S. (2001-01-01). "Literature-based Discovery Support System and Its Application to Disease Gene Identification". Proceedings. AMIA Annual Symposium: 928. PMC 2243305.
  75. ^ Sarkar, Indra Neil; Agrawal, Abha (2006). "Literature based discovery of gene clusters using phylogenetic methods". AMIA ... Annual Symposium proceedings. AMIA Symposium: 689–693. ISSN 1942-597X. PMC 1839645. PMID 17238429.
  76. ^ Ozgür, Arzucan; Xiang, Zuoshuang; Radev, Dragomir R.; He, Yongqun (2010-06-03). "Literature-based discovery of IFN-gamma and vaccine-mediated gene interaction networks". Journal of Biomedicine and Biotechnology. 2010: 426479. doi:10.1155/2010/426479. PMC 2896678. PMID 20625487.
  77. ^ Ahlers, Caroline B.; Hristovski, Dimitar; Kilicoglu, Halil; Rindflesch, Thomas C. (2007-10-11). "Using the literature-based discovery paradigm to investigate drug mechanisms". AMIA ... Annual Symposium proceedings. AMIA Symposium: 6–10. ISSN 1942-597X. PMC 2655783. PMID 18693787.
  78. ^ Srinivasan, Mythily; Blackburn, Corinne; Mohamed, Mohamed; Sivagami, A. V.; Blum, Janice S. (2015-05-14). "Literature-based discovery of salivary biomarkers for type 2 diabetes mellitus". Biomarker Insights. 10: 39–45. doi:10.4137/BMI.S22177. PMC 4433061. PMID 26005324.
  79. ^ Malec, Scott A.; Wei, Peng; Xu, Hua; Bernstam, Elmer V.; Myneni, Sahiti; Cohen, Trevor (2016-01-01). "Literature-Based Discovery of Confounding in Observational Clinical Data". AMIA Annual Symposium proceedings. 2016: 1920–1929. PMC 5333204. PMID 28269951.
  80. ^ Dai, Zhenguo; Li, Qian; Yang, Guang; Wang, Yini; Liu, Yang; Zheng, Zhilei; Tu, Yingfeng; Yang, Shuang; Yu, Bo (2019-06-11). "Using literature-based discovery to identify candidate genes for the interaction between myocardial infarction and depression". BMC Medical Genetics. 20 (1): 104. doi:10.1186/S12881-019-0841-8. PMC 6560897. PMID 31185929.
  81. ^ Vos, Rein; Aarts, Sil; Mulligen, Erik M. van; Metsemakers, Job; Boxtel, Martin P. van; Verhey, Frans RJ; Akker, Marjan van den (2013-06-17). "Finding potentially new multimorbidity patterns of psychiatric and somatic diseases: exploring the use of literature-based discovery in primary care research". Journal of the American Medical Informatics Association. 21 (1): 139–145. doi:10.1136/AMIAJNL-2012-001448. PMC 3912726. PMID 23775174.
  82. ^ Miller, Christopher M.; Rindflesch, Thomas C.; Fiszman, Marcelo; Hristovski, Dimitar; Shin, Dongwook; Rosemblat, Graciela; Zhang, Han; Strohl, Kingman P. (2012-02-01). "A closed literature-based discovery technique finds a mechanistic link between hypogonadism and diminished sleep quality in aging men". Sleep. 35 (2): 279–285. doi:10.5665/SLEEP.1640. PMC 3250368. PMID 22294819.
  83. ^ Kostoff, Ronald N.; Solka, Jeffrey L.; Rushenberg, Robert L.; Wyatt, Jeffrey A. (2008-02). "Literature-related discovery (LRD): Water purification". Technological Forecasting and Social Change. 75 (2): 256–275. doi:10.1016/j.techfore.2007.11.009. ISSN 0040-1625.
  84. ^ Gordon, M. D.; Awad, N. F. (2008), "The Tip of the Iceberg: The Quest for Innovation at the Base of the Pyramid", Literature-based Discovery, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 23–37, ISBN 978-3-540-68685-9, retrieved 2022-03-15
  85. ^ Hristovski, Dimitar; Kastrin, Andrej; Rindflesch, Thomas C. (2015-08-25). "Semantics-Based Cross-domain Collaboration Recommendation in the Life Sciences". Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015. New York, NY, USA: ACM. doi:10.1145/2808797.2809300.

Category:Information retrieval techniques Category:Medical research