User talk:Emijrp/Wikipedia Archive
Ideas
Library of Congress is going to save every public tweet. Why don't they save a copy of Wikipedia? emijrp (talk) 16:47, 10 September 2010 (UTC)
iBiblio
I have contacted iBiblio about hosting a copy of the latest dumps, working as a mirror of download.wikimedia.org. No response yet. emijrp (talk) 13:12, 15 November 2010 (UTC)
- Their response: Unfortunately, we do not have the resources to provide a mirror of wikipedia. Best of luck!
Who can we contact for hosting a mirror of the XML dumps?
We are working on meta:Mirroring Wikimedia project XML dumps. emijrp (talk) 23:36, 10 December 2010 (UTC)
How can I get all the revisions of a language for a given period?
I want all the revisions that happened from 18-10-2010 to 26-10-2010 for a particular language. How do I get them? —Preceding unsigned comment added by 207.46.55.31 (talk) 11:53, 29 November 2010 (UTC)
- You need to extract that date range from a full dump: stub-meta-history (metadata only) or pages-meta-history (metadata + text). You can use xmlreader.py from meta:pywikipediabot. emijrp (talk) 15:58, 29 November 2010 (UTC)
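- As an illustration of the idea (not xmlreader.py itself), here is a minimal stand-alone sketch that streams a MediaWiki export XML file and keeps only the revisions inside a date range; real dumps declare an XML namespace on the root element, which is omitted here for brevity:

```python
# Minimal sketch: stream a MediaWiki-export-style XML file and keep
# only the revisions whose timestamp falls inside a date range.
# iterparse keeps memory flat even on multi-gigabyte dumps.
import io
import xml.etree.ElementTree as ET

SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Example</title>
    <revision><id>1</id><timestamp>2010-10-10T09:00:00Z</timestamp></revision>
    <revision><id>2</id><timestamp>2010-10-20T09:00:00Z</timestamp></revision>
    <revision><id>3</id><timestamp>2010-11-01T09:00:00Z</timestamp></revision>
  </page>
</mediawiki>"""

def revisions_between(xml_file, start, end):
    """Yield (rev_id, timestamp) for revisions with start <= timestamp <= end.

    ISO 8601 timestamps are fixed-width, so plain string comparison is
    enough; no datetime parsing is needed.
    """
    for _, elem in ET.iterparse(xml_file, events=("end",)):
        if elem.tag == "revision":
            ts = elem.findtext("timestamp")
            if start <= ts <= end:
                yield elem.findtext("id"), ts
            elem.clear()  # free the finished element

hits = list(revisions_between(io.StringIO(SAMPLE_DUMP),
                              "2010-10-18T00:00:00Z",
                              "2010-10-26T23:59:59Z"))
print(hits)  # only revision 2 falls inside 18-26 October 2010
```

The same loop applied to a real stub-meta-history or pages-meta-history file (decompressed, or wrapped in bz2.open) extracts the requested window without loading the whole dump.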
- Also, you can request a meta:Toolserver account and run a SQL query against the server. emijrp (talk) 15:58, 29 November 2010 (UTC)
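- The query itself would be of roughly this shape. This sketch demonstrates it against an in-memory SQLite table mimicking MediaWiki's revision table (only a few of its columns, and made-up rows); the real replicas store timestamps as 14-digit strings such as 20101020090000, so a BETWEEN on fixed-width strings does the right thing:

```python
# Hypothetical date-range query against a toy stand-in for MediaWiki's
# `revision` table, using SQLite instead of the Toolserver replicas.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision (rev_id INTEGER, rev_page INTEGER, rev_timestamp TEXT)"
)
conn.executemany("INSERT INTO revision VALUES (?, ?, ?)", [
    (1, 10, "20101010090000"),
    (2, 10, "20101020090000"),
    (3, 10, "20101101090000"),
])

# String comparison works because the timestamps are fixed-width
# and zero-padded.
rows = conn.execute(
    "SELECT rev_id, rev_timestamp FROM revision "
    "WHERE rev_timestamp BETWEEN '20101018000000' AND '20101026235959' "
    "ORDER BY rev_timestamp"
).fetchall()
print(rows)  # only revision 2 is in the 18-26 October 2010 window
```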
Update with latest dump info?
If you are around, would you mind updating with more recent dump info? I'd do it but am reluctant to edit another person's user page area. Thanks... --- 85.72.150.131 (talk) 17:04, 19 August 2011 (UTC)
Offline reader
If you are interested, there is another Offline-Reader (with image databases at archive.org): http://xowa.sourceforge.net/ https://sourceforge.net/projects/xowa/ — Preceding unsigned comment added by 188.100.234.211 (talk) 10:56, 13 January 2014 (UTC)
Tarball archive from 2005
User:Emijrp/Wikipedia_Archive#Image_tarballs says: "Another one from 2005 only covers English Wikipedia images." The file description says: "all current images in use on Wikipedia and its related projects". Is it possible to find out whether these pictures come from all projects or only from the English Wikipedia? Samat (talk) 23:16, 31 October 2014 (UTC)
Offline Wikipedia as epub(s) for e-readers
I'm endeavoring to create an offline Wikipedia in the form of epubs (more than one file if at least one e-reader turns out not to be able to handle a single 2 GB epub) that inexpensive, high-autonomy e-ink readers could read. I intend to download a dump, make the necessary transforms using mediawiki-utilities, sort articles by PageRank using, for instance, https://spark.apache.org/graphx/ , then take the top n articles until they (and the media they link to) reach 2 GB. Does that look sound to you? ZPedro (talk) 21:16, 9 December 2016 (UTC)
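The final selection step described above can be sketched as follows. The titles, PageRank scores, and sizes here are made up for illustration; in the real pipeline the scores would come from the GraphX run, and each article's linked media would be folded into its size:

```python
# Sketch: keep the highest-PageRank articles until the combined size
# reaches the epub budget. Scores and sizes below are illustrative only.
BUDGET = 2 * 1024 ** 3  # assumed 2 GB per-epub limit

articles = [
    # (title, pagerank_score, size_bytes)
    ("Earth", 0.91, 900 * 1024 ** 2),
    ("Mathematics", 0.87, 700 * 1024 ** 2),
    ("History", 0.73, 600 * 1024 ** 2),
    ("Obscure topic", 0.02, 50 * 1024 ** 2),
]

def top_articles_within_budget(articles, budget):
    """Take articles in descending PageRank order, stopping at the
    first one that would push the total past the budget."""
    chosen, used = [], 0
    for title, score, size in sorted(articles, key=lambda a: a[1], reverse=True):
        if used + size > budget:
            break
        chosen.append(title)
        used += size
    return chosen, used

chosen, used = top_articles_within_budget(articles, BUDGET)
print(chosen, used)  # Earth + Mathematics fit; History would overflow
```

A design note: stopping at the first overflow matches "take the top n articles until they reach 2 GB"; skipping the oversized article and continuing down the ranking would pack the epub fuller, at the cost of a less clean rank cutoff.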