Recoll is a desktop search tool that provides full text search (from single-word to arbitrarily complex boolean searches) in a GUI with few mandatory external dependencies. It runs under many Unix-like operating systems, and is mostly independent of the desktop environment.
|Stable release||1.25.22 / August 27, 2019|
|Written in||C++ and Python|
|Operating system||Unix-like, Windows|
Recoll was designed not to require a permanent daemon, but on Linux systems it can make use of inotify. Recoll updates its index at scheduled intervals (for example through cron jobs), but if desired, the indexing task can run as a file-system monitoring daemon for real-time index updates.
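The interval-based update described above can be pictured with a small Python sketch. It is an illustration only, not Recoll's code: the function name `changed_files` and the use of a modification-time and size signature to detect changes are assumptions made for the example.

```python
import os

def changed_files(root, seen):
    """Return files under `root` that are new or modified since the last pass.

    `seen` maps path -> (mtime_ns, size) and is updated in place, so a
    periodic job can call this repeatedly and reindex only what changed.
    The signature scheme is an assumption for this sketch.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            signature = (st.st_mtime_ns, st.st_size)
            if seen.get(path) != signature:
                seen[path] = signature
                changed.append(path)
    return sorted(changed)
```

A scheduled job would call `changed_files` once per run and pass only the returned paths to the indexer. A monitoring daemon instead reacts to file-system events as they occur, which removes the need to rescan the tree.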
- Qt GUI.
- Xapian backend.
- Indexes the contents of many document types: text, HTML, email stores of all kinds, OpenOffice.org, Microsoft Office and Office Open XML, AbiWord, KWord, Gaim, LyX, Scribus, PDF, WordPerfect, PostScript, RTF, TeX, DVI, DjVu, MP3 and other audio file formats, JPEG and other image file formats.
- Recursively processes embedded documents (email attachments, Zip archives) to arbitrary depths.
- Query facilities, with boolean searches, wildcards, phrases, proximity, filter on file types and directory tree. GUI Boolean search build tool.
- Xesam query language support.
- Word stemming is performed at query time (can switch stemming language after indexing).
- Multiple indexes selectable at query time (e.g. personal + system indexes).
- Natively based on Unicode. Supports many languages and character sets, including good support for East Asian texts (CJK).
- MD5 document hashes for the elimination of duplicates in results.
- Batch and real-time indexing modes.
- Python API.
- GNOME Shell search provider, Web interface, and Firefox history extensions.
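The duplicate elimination mentioned in the feature list can be sketched as follows. This is a minimal illustration using Python's standard `hashlib`, not Recoll's implementation: the function name `dedupe_results` and the `(url, content)` result shape are hypothetical.

```python
import hashlib

def dedupe_results(results):
    """Keep the first result for each distinct document content.

    `results` is an iterable of (url, content_bytes) pairs; this shape is
    an assumption made for the sketch. Returns a list of (url, md5_hex).
    """
    seen_digests = set()
    unique = []
    for url, content in results:
        digest = hashlib.md5(content).hexdigest()
        if digest not in seen_digests:
            seen_digests.add(digest)
            unique.append((url, digest))
    return unique
```

Two copies of the same file stored at different paths produce the same digest, so only the first one reaches the result list. MD5 is used here as a content fingerprint, not as a security measure.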
File types supported
File types indexed natively
- Maildir, MH, and mailbox (Mozilla, Thunderbird and Evolution mail are supported). Evolution note: be sure to remove .cache from the skippedNames list in the GUI Indexing preferences/Local Parameters pane if you want to index local copies of IMAP mail.
- Gaim and purple log files.
- Scribus files.
- Man pages (needs groff).
- MHTML (MIME HTML) web archive format (support is based on the mail filter, which introduces some minor quirks but remains usable).
- All the following need Python 3:
- Dia diagrams.
- Excel and PowerPoint (pre-Open XML formats).
- Tar archives. Tar file indexing is disabled by default (because tar archives don't typically contain the kind of documents that people search for); you will need to enable it explicitly, for example with the following in your $HOME/.recoll/mimeconf file:
    [index]
    application/x-tar = execm rcltar
- Zip archives.
- Konqueror webarchive format (uses the tarfile Python standard library module).
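As a rough picture of how archive contents can be reached recursively, the following sketch walks a Zip archive held in memory and descends into nested Zip members. It uses only the Python standard library and is not Recoll's handler code: the function name `iter_members`, the `|` path separator, and the `max_depth` guard are assumptions made for this example.

```python
import io
import zipfile

def iter_members(data, prefix="", max_depth=10):
    """Yield (path, bytes) for each non-archive member of a Zip archive.

    Members that are themselves Zip archives are opened and walked in turn.
    `max_depth` is a safety limit chosen for this sketch.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            payload = archive.read(info)
            path = prefix + info.filename
            if max_depth > 0 and zipfile.is_zipfile(io.BytesIO(payload)):
                # Nested archive: recurse, marking the boundary with "|".
                yield from iter_members(payload, path + "|", max_depth - 1)
            else:
                yield path, payload
```

Each yielded pair could then be handed to a type-specific text extractor, with the composed path identifying where the embedded document lives.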
File types indexed with external helpers
- PDF files.
- MS-Word files.
- WordPerfect files.
- RTF files.
- Image and audio file tags.
- AbiWord files.
- FB2, EPUB, and CHM ebooks.
- KWord files.
- Microsoft Office traditional and Open XML files.
- OpenOffice files.
- SVG files.
- Gnumeric files.
- Okular annotation files.