1. Should HTML entities be put into searchindex?
  2. Should math code be put into searchindex?
  3. How should very short and very long words be handled?
  4. Should word forms be reduced to a common stem (e.g. forked/forks reduced to fork)?
  5. Should a new stoplist be generated out of current searchindex?
  6. For English: how should the possessive 's be handled?
  7. Other characters?
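
To make the questions above concrete, here is a minimal Python sketch of what one possible set of answers could look like. It is not the existing indexing code. The names index_terms, reduce_form, suggest_stoplist, MIN_LEN and MAX_LEN are hypothetical, the length cutoffs are arbitrary, the suffix stripping is a crude stand-in for a real stemmer, and treating every non-ASCII-alphanumeric character as a separator is a simplification that would not suit non-English text. Question 2 (math code) is not covered.

```python
import html
import re
from collections import Counter

MIN_LEN = 2    # assumed cutoff for "very short" words (question 3)
MAX_LEN = 30   # assumed cutoff for "very long" words (question 3)


def reduce_form(word):
    """Crude suffix stripping; a placeholder for a real stemmer (question 4)."""
    for suffix in ("ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Turn raw text into the list of terms that would be indexed."""
    text = html.unescape(text)                # question 1: decode entities instead of indexing them
    text = text.lower()
    text = re.sub(r"['\u2019]s\b", "", text)  # question 6: drop the English possessive 's
    words = re.findall(r"[a-z0-9]+", text)    # question 7: all other characters act as separators
    words = [w for w in words if MIN_LEN <= len(w) <= MAX_LEN]
    return [reduce_form(w) for w in words]


def suggest_stoplist(documents, top_n=10):
    """Propose stop words as the terms found in the most documents (question 5)."""
    counts = Counter()
    for doc in documents:
        counts.update(set(index_terms(doc)))  # count each term once per document
    return [word for word, _ in counts.most_common(top_n)]
```

For example, index_terms("Tom&#39;s forked forks &amp; a fork") decodes the entities, drops the possessive and the one-letter word, and reduces both "forked" and "forks" to "fork". Whether each of these choices is actually desirable is exactly what the list above leaves open.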