eTBLAST was a free text-similarity search engine, now defunct, which offered access to the MEDLINE database, the National Institutes of Health (NIH) CRISP database, the Institute of Physics (IOP) database, Wikipedia, arXiv, the NASA technical reports database, Virginia Tech class descriptions and a variety of databases of clinical interest. Additional text-based databases were added over the life of the service. eTBLAST searched citation databases as well as databases containing full text, such as PubMed. The eTBLAST server compared a user's natural-text query to the target databases using a hybrid search algorithm: a low-sensitivity, weighted keyword-based first pass, followed by a novel sentence-alignment-based second pass. eTBLAST was a web-based service of The Innovation Laboratory at the Virginia Bioinformatics Institute.
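The two-pass idea can be illustrated with a minimal Python sketch. This is not eTBLAST's actual implementation: the IDF-style keyword weighting, the use of `difflib.SequenceMatcher` as a stand-in for sentence alignment, and all function names and the tiny corpus are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter (assumption: sentences end in . ! or ?)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def keyword_pass(query, documents, top_n=2):
    """First pass: rank documents by a weighted keyword overlap.

    Rarer terms get a higher (IDF-style) weight. This weighting is an
    assumption, not the scheme eTBLAST used.
    """
    doc_tokens = [set(tokenize(d)) for d in documents]
    n_docs = len(documents)
    doc_freq = Counter(tok for toks in doc_tokens for tok in toks)
    query_terms = set(tokenize(query))
    scores = []
    for idx, toks in enumerate(doc_tokens):
        shared = query_terms & toks
        score = sum(math.log(1 + n_docs / doc_freq[t]) for t in shared)
        scores.append((score, idx))
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    # Keep only the best candidates that share at least one keyword.
    return [idx for score, idx in scores[:top_n] if score > 0]


def alignment_pass(query, document):
    """Second pass: mean best word-level alignment per query sentence."""
    doc_sents = [tokenize(s) for s in split_sentences(document)]
    query_sents = [tokenize(s) for s in split_sentences(query)]
    if not doc_sents or not query_sents:
        return 0.0
    total = 0.0
    for q in query_sents:
        total += max(SequenceMatcher(None, q, d).ratio() for d in doc_sents)
    return total / len(query_sents)


def hybrid_search(query, documents, top_n=2):
    """Cheap keyword filter first, costlier alignment on the survivors."""
    candidates = keyword_pass(query, documents, top_n)
    ranked = [(alignment_pass(query, documents[i]), i) for i in candidates]
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked


# Hypothetical three-document corpus; document 0 duplicates the query.
corpus = [
    "Text similarity offers an alternative way to search MEDLINE. "
    "Abstracts are compared directly.",
    "Gene expression in yeast was measured under heat stress.",
    "An alternative way to search MEDLINE uses text similarity. "
    "Abstracts are compared directly.",
]
query = corpus[0]
results = hybrid_search(query, corpus)
for score, idx in results:
    print(f"doc {idx}: alignment score {score:.2f}")
```

The design point is that the inexpensive keyword pass narrows a large database to a short candidate list, so the more expensive sentence-by-sentence comparison only runs on documents that are already plausible matches.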
As a text-similarity engine, eTBLAST made possible a large study of duplicate publications and potential plagiarism in the biomedical literature. Thousands of randomly sampled MEDLINE abstracts were submitted to eTBLAST, and the pairs with the highest similarity were examined and entered into an online database. This work revealed several trends, including an increasing rate of duplication in the biomedical literature, as reported in the journals Bioinformatics, Anaesthesia and Intensive Care, Clinical Chemistry, Urologic Oncology, Nature, and Science.
It is not clear why the system and the database were taken offline.
- Lewis, J; Ossowski, S; Hicks, J; Errami, M; Garner, HR (2006). "Text similarity: An alternative way to search MEDLINE". Bioinformatics. 22 (18): 2298–304. doi:10.1093/bioinformatics/btl388. PMID 16926219.
- Pertsemlidis, A; Garner, HR (2004). "Text comparison based on dynamic programming". IEEE Engineering in Medicine and Biology Magazine. 23 (6): 66–71. doi:10.1109/MEMB.2004.1378640. PMID 15688594.
- Sun, Z; Errami, M; Long, T; Renard, C; Choradia, N; Garner, H (2010). Curioso, Walter H, ed. "Systematic Characterizations of Text Similarity in Full Text Biomedical Publications". PLoS ONE. 5 (9): e12704. doi:10.1371/journal.pone.0012704. PMID 20856807.
- Errami, M; Hicks, JM; Fisher, W; Trusty, D; Wren, JD; Long, TC; Garner, HR (2007). "Déjà vu: A study of duplicate citations in Medline". Bioinformatics. 24 (2): 243–9. doi:10.1093/bioinformatics/btm574. PMID 18056062.
- Errami, M; Sun, Z; George, AC; Long, TC; Skinner, MA; Wren, JD; Garner, HR (2010). "Identifying duplicate content using statistically improbable phrases". Bioinformatics. 26 (11): 1453–7. doi:10.1093/bioinformatics/btq146. PMID 20472545.
- Loadsman, JA; Garner, HR; Drummond, GB (2008). "Towards the elimination of duplication in Anaesthesia and Intensive Care". Anaesthesia and Intensive Care. 36 (5): 643–5. PMID 18853580.
- George, AC; Long, TC; Garner, HR (2010). "Quaere Verum". Clinical Chemistry. 56 (4): 673–4. doi:10.1373/clinchem.2009.130468. PMID 20093558.
- Garner, HR (2011). "Combating unethical publications with plagiarism detection services". Urologic Oncology. 29: 95–9. doi:10.1016/j.urolonc.2010.09.016. PMID 21194644.
- Errami, M; Garner, H (2008). "A tale of two citations". Nature. 451 (7177): 397–9. doi:10.1038/451397a. PMID 18216832.
- Long, TC; Errami, M; George, AC; Sun, Z; Garner, HR (2009). "Responding to Possible Plagiarism". Science. 323 (5919): 1293–4. doi:10.1126/science.1167408. PMID 19265004.