This page is not only subject to change, but likely to change. We're working on designing a rigorous trial. If you have any suggestions, please join in the discussion.

The Turnitin trial, should it be approved by the community, will seek to answer several questions:

  • Does Turnitin's system effectively screen out false positives created by Wikipedia mirrors or sites that legitimately reuse our content under a compatible license?
  • Can Turnitin's system work on old as well as new articles? Perhaps its web crawler should be excluded so that checks draw only on its content database?
  • What 'percent-match' threshold in a Turnitin report would optimize copyvio detection while minimizing false positives? (See the threshold sketch after this list.)
  • Does Turnitin's system improve upon our current investigation tools, namely Coren's Bot or Madman's Bot? (Note that those two work only on new articles.)
  • Does Turnitin catch known copyvio issues?
  • Does Turnitin have blind spots, i.e. sources it can't check against (e.g. New York Times content...)?
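
One way to approach the percent-match question would be to run Turnitin over a hand-labelled sample of articles and compare detection and false-positive rates at several candidate thresholds. The Python sketch below is only illustrative: the article names, scores, and copyvio labels are invented placeholders, and a real trial would substitute figures taken from actual Turnitin reports on a hand-checked sample.

 # Hypothetical (article, percent-match, known copyvio?) tuples; real values
 # would come from Turnitin reports on a hand-checked sample of articles.
 sample = [
     ("Article A", 92, True),
     ("Article B", 67, True),
     ("Article C", 55, False),  # mirror / compatibly licensed reuse
     ("Article D", 34, True),
     ("Article E", 21, False),
     ("Article F", 8, False),
 ]
 
 def evaluate(threshold, reports):
     """Count hits if we flag everything at or above the given percent-match."""
     tp = sum(1 for _, pct, cv in reports if pct >= threshold and cv)
     fp = sum(1 for _, pct, cv in reports if pct >= threshold and not cv)
     fn = sum(1 for _, pct, cv in reports if pct < threshold and cv)
     recall = tp / (tp + fn) if (tp + fn) else 0.0        # share of copyvios caught
     precision = tp / (tp + fp) if (tp + fp) else 0.0     # share of flags that are real
     return precision, recall
 
 for threshold in (20, 40, 60, 80):
     precision, recall = evaluate(threshold, sample)
     print(f"threshold {threshold}%: precision {precision:.2f}, recall {recall:.2f}")

Comparing precision and recall across thresholds would let the trial pick the percent-match cut-off that catches the most copyvios without burying reviewers in false positives.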

Types of trials

  1. We could run a trial looking at how Turnitin works with new edits before they have been mirrored around the web.
  2. We could also run a trial looking at how Turnitin handles long-established pages. (See the sampling sketch below for one way articles for each trial could be drawn.)
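
A minimal sketch of how article samples for the two trials could be drawn from the MediaWiki API is below. The function names, sample sizes, and user-agent string are placeholders, and the community would still need to agree on the actual selection criteria; the sketch only shows that recently created pages and long-established pages can be sampled separately.

 import requests
 
 API = "https://en.wikipedia.org/w/api.php"
 HEADERS = {"User-Agent": "Turnitin-trial-sampler/0.1 (placeholder contact)"}
 
 def new_articles(limit=20):
     """Recently created mainspace articles (candidates for trial 1)."""
     params = {
         "action": "query", "list": "recentchanges", "rctype": "new",
         "rcnamespace": 0, "rclimit": limit, "format": "json",
     }
     data = requests.get(API, params=params, headers=HEADERS).json()
     return [rc["title"] for rc in data["query"]["recentchanges"]]
 
 def random_articles(limit=10):
     """Random mainspace articles, most of them long established (candidates for trial 2)."""
     params = {
         "action": "query", "list": "random",
         "rnnamespace": 0, "rnlimit": limit, "format": "json",
     }
     data = requests.get(API, params=params, headers=HEADERS).json()
     return [page["title"] for page in data["query"]["random"]]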