TypoScan Roadmap


  •  Done | Do the initial scan of the Wikipedia database extract with DBScanner against the list of typos in RegExTypoFix
  •  Done | Produce a basic SQL backend to track progress
  •  Done | Add initial list of articles with typos to MySQL Database
  •  Done | Build a small application to take the article list from the database scan and add it to the database (takes ~4 seconds to load nearly 70k articles into a local database on a reasonably specified server)
  •  Done | Work with the AWB developers on integration
  •  Done | IListProvider AWB plugin written/modified to parse the XML output generated in PHP from the database, giving a list of articles to work on (known as a workload)
Each workload gives the user 100 articles to process (more than one workload can be collected before processing; the articles are simply appended to the list)
  •  Done | Produce a check-in/check-out/timeout system to track what has and hasn't been typo fixed.
An article is timestamped ("checked out") in the database when a list containing it is requested
If the timestamp is more than 2 hours old and the article is not marked as finished, it will be handed out to another user
  •  Done | Find a way for the client to upload the status to the server (check in/finished)
Articles (article IDs) can be posted back to the script and marked as finished
  •  Done | Decide when to write the status: at intervals of time, after a number of edits, or at the end of the program
Can be done on demand from the Plugins menu
rev 3169: Automatically done when the program is closing, if there are articles to be submitted
rev 3170: Automatically done every 25 finished articles
  • ☒N | Build a small application to "sync" a new article list from a new dump with the database
List of articles to be removed and added (can be based on ListComparer and the already written application)
Periodically it's probably worth clearing the database
  • ☒N | Integrate false-positive reporting with AWB. Then use this reporting to find regular expressions that don't produce good results.
Typo stats have been suggested for this reason
  •  Done | Expand plugin and DB for other projects
  •  Done | Log whether edited or ignored/skipped
Also logs the reason to the database. Statistics are included.
  • ☒N | As a plugin expansion, add a way for DBScanner to add straight to the TypoScan DB
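
The check-out/timeout/check-in items above can be illustrated with a minimal sketch. It is not the TypoScan server code, which is PHP backed by MySQL; this version uses Python with SQLite only so that it is self-contained. The table layout and the names `create_schema`, `check_out`, `check_in`, `checked_out_at` and `finished` are all hypothetical. Only the 100-article workload and the 2-hour timeout come from the roadmap.

```python
import sqlite3

# Values taken from the roadmap; everything else here is an assumption.
WORKLOAD_SIZE = 100
CHECKOUT_TIMEOUT_SECONDS = 2 * 60 * 60


def create_schema(conn):
    """Create a hypothetical articles table for the sketch."""
    conn.execute(
        "CREATE TABLE articles ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " checked_out_at INTEGER,"
        " finished INTEGER NOT NULL DEFAULT 0)"
    )


def check_out(conn, now, limit=WORKLOAD_SIZE):
    """Hand out up to `limit` unfinished articles and timestamp them.

    An article is eligible if it was never checked out, or if its
    check-out timestamp is more than two hours older than `now`
    (Unix seconds).
    """
    cutoff = now - CHECKOUT_TIMEOUT_SECONDS
    rows = conn.execute(
        "SELECT id, title FROM articles"
        " WHERE finished = 0"
        " AND (checked_out_at IS NULL OR checked_out_at < ?)"
        " ORDER BY id LIMIT ?",
        (cutoff, limit),
    ).fetchall()
    conn.executemany(
        "UPDATE articles SET checked_out_at = ? WHERE id = ?",
        [(now, article_id) for article_id, _title in rows],
    )
    conn.commit()
    return rows


def check_in(conn, article_ids):
    """Mark the posted-back article IDs as finished."""
    conn.executemany(
        "UPDATE articles SET finished = 1 WHERE id = ?",
        [(article_id,) for article_id in article_ids],
    )
    conn.commit()
```

A client would call `check_out` to get a workload and later pass the IDs it has processed to `check_in`. Articles that are never checked in become eligible again once their timestamp is more than two hours old.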