Help talk:Using the Wayback Machine
This is the talk page for discussing improvements to the Using the Wayback Machine page.
accessdate={{CURRENTYEAR}}-{{CURRENTMONTH}}-{{CURRENTDAY2}} did not work when I tried it
The sample code at the end of the intro section of this article didn't work for me.
Here was my attempt: [1] (see reference 38).
Can anyone confirm and/or fix this?
Thanks. --Mathieu ottawa (talk) 14:29, 15 August 2015 (UTC)
Add a notice that archive.org is blocked in China
According to Websites blocked in mainland China, archive.org is blocked in China. ShadowYC (talk) 23:11, 15 February 2017 (UTC)
- I agree about this. I think Wikipedia is blocked in China too. Would it be relevant to provide an alternative access method, or should we just expect that people in China use methods to avoid the blocking to access Wikipedia and other sites? Timofonic (talk) 13:07, 19 July 2017 (UTC)
Outdated Browser?
What's with the message: "It seems that you're using an outdated browser. Some things may not work as they should (or don't work at all). We suggest you upgrade newer and better browser like: Chrome, Firefox, Internet Explorer or Opera" (See Curse, Inc. reference 77 as of 09:17, 25 April 2017 edit)
I get the same error message from Firefox and Chrome, both are up to date.
Need a newer example than Washington Post
www.washingtonpost.com no longer blocks ia_archiver. Searching for "user-agent: ia_archiver" ext:txt on Google brought up www.qualcomm.com and xiph.org, but these are lesser-known websites than washingtonpost.com.
Which site(s) could be used instead? — Preceding unsigned comment added by Lol MD4 (talk • contribs) 05:39, 19 May 2017 (UTC)
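Candidate sites can also be checked directly against their robots.txt rather than through a Google search. The following is a minimal Python sketch, not an official tool: it assumes the crawler is matched under the user-agent name ia_archiver (as in the search above) and uses the standard library's urllib.robotparser. The function name blocks_ia_archiver and the sample robots.txt text are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocks_ia_archiver(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text disallows ia_archiver for url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("ia_archiver", url)


# Hypothetical robots.txt content that shuts out ia_archiver entirely.
sample = """\
User-agent: ia_archiver
Disallow: /
"""

print(blocks_ia_archiver(sample))  # True
```

To test a real site, the text of its robots.txt would have to be downloaded first and passed in as the first argument.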
- My suggestion: Historical reasons Timofonic (talk) 15:28, 19 July 2017 (UTC)
- Historical reasons (I got tired of putting exact references, but most of the original ones are on NCSA What's New (mirrored at Desy.de [2] or Unicom.com) and the CUI W3Catalog zipped snapshot (from the W3 Catalog History archive mirror on Software Composition Group)). Feel free to add, modify, suggest or whatever. Maybe finding the most popular and ancient sites would be a good middle ground.
- home of the first website (1990) [1][2]
- Aliweb (1993) [3]
- World Wide Web (1993) [4]
- First International Conference on the World-Wide Web (1993, updated 1994) [5]
- Electronic Frontier Foundation (1993) [6]
- TinyTIM WWW Page (1993)[7]
- Welcome to Netscape (1994) [8]
- The San Francisco FogCam! (1994). The world's oldest webcam.[8]
- Strawberry Pop-Tart Blow-Torches (1994)[8]
- Fantasy Baseball Home Page (1996)[8]
- CNN’s O.J. Simpson Trial Page (1995 or 1996)[8]
- Klingon Language Institute (1996)[8]
- Washington Post’s “Year in Review” (1996) [8]
- Bob Dole/Jack Kemp Presidential Campaign (1996) [8]
- Three Rivers Stadium (1998) [8]
- You’ve Got Mail (1998) [8]
- Internet Explorer is EVIL! (1998) [8]
- The Robert Deniro Page (1999). Currently dead, it's a link to the last archived page. [8]
References (incomplete, sorry)
- ^ World's oldest website revealed: First internet page is now 25 years old, Mirror Online. By Jasper Hamill, 21 Dec 2015
- ^ First website ever made
- ^ 23 Ancient Web Sites That Are Still Alive, Mental Floss. By Attila Nagy 11/15/12 3:05pm.
- ^ What's New, June 1993, desy.de
- ^ First International Conference on the World-Wide Web, 404PageFound
- ^ NCSA What's New, December 1993 (www.desy.de/web/mosaic/old-whats-new/whats-new-1293.html), mirrored by desy.de
- ^ TinyTIM WWW Page, 404PageFound
- ^ 17 Ancient Abandoned Websites That Still Work, Mental Floss. By Lucas Reilly. November 22, 2013.
Suggestions for updates to example wikitext for "EXISTING REFERENCE" in the lede
background
When I was making this edit, -- just today -- I was guided partly by the "example wikitext" shown in the last sentence of the lede of this ("how-to guide") article -- displayed (right before the [Table of] "Contents") as:
In short, this is the code that needs to be added to a reference:
<ref>{{<!--EXISTING REFERENCE-->|archive-url=https://web.archive.org/web/20021128120000/http://www.originalurl.com|archive-date=2002-11-28|access-date={{subst:YYYYMMDD|d}}|dead-url=yes}}</ref>
suggestions for fixing two suspected TYPOs or other possible "mistakes", there
- I think that the advice about what to add should recommend adding an "archive-date" field, but it should not mention (adding nor changing) the "access-date" field. Note that in my recent edit, I added an "archive-date", but I did not modify the "access-date" ... partly because the "{{cite web}}" template [instance] (which was being updated) already had an "access-date" field [value]. That date (which was in 2008) was a date when the "original URL" link was not "yet" a dead link, [!] and ... due to circumstances ... I was (regretfully!) having to add "|dead-url = yes" to the "{{cite web}}" template [instance] which was being updated. So, I ignored the suggestion to add "{{subst:YYYYMMDD|d}}" to the "access-date" field. Doesn't that give a "today's date" value? (it gives the date of the "edit" or update ... right?) IMHO that would have been "false and misleading" for the original URL, and for the "archived" (Wayback machine) URL, I figured that the "archive-date" was more appropriate.
- Also: When I did a "show preview" during editing, (by clicking on the button labeled "show preview"), I got an error message, with some words displayed in a RED font -- ("!") -- saying << "Check date values in: |archive-date= (help)" >>. (...and the word "help" was a hyperlink, to some "help" page). I am not sure what was wrong, but I figured that maybe dates like "2002-11-28" used to be allowed, but are no longer allowed. So, I used an ordinary "date" character string for that field, ("February 14, 2009") (you can view the "date" as part of the URL, by looking at the DIFF "[see above]") in order to get rid of that error message.
- Has this been fixed? I hope someone comments on this Timofonic (talk) 13:09, 19 July 2017 (UTC)
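The relationship between the Wayback URL and the "archive-date" field discussed above can be sketched in Python. This is only an illustration, assuming the 14-digit YYYYMMDDhhmmss timestamp seen in the example URL; the function name archive_fields is hypothetical, and "access-date" is deliberately left untouched, following the suggestion in this thread.

```python
import re
from datetime import datetime

# Assumed layout: https://web.archive.org/web/<14-digit timestamp>[flags]/<original URL>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/.+$")


def archive_fields(archive_url: str) -> str:
    """Build the |archive-url= and |archive-date= parameters for a citation."""
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        raise ValueError(f"not a timestamped Wayback URL: {archive_url!r}")
    # The date comes from the snapshot timestamp, never from today's date.
    snapshot = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    archive_date = snapshot.date().isoformat()
    return f"|archive-url={archive_url}|archive-date={archive_date}"


print(archive_fields(
    "https://web.archive.org/web/20021128120000/http://www.originalurl.com"
))
```

Whether the template accepts the resulting date format in a given article is a separate question, as the error message described above shows.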
Any Comments?
If there are no comments here within a few weeks (OR, if there are only some comments that agree with my "suggestions", or otherwise fail to [or, do not try to] convince me to change my mind), then ... I intend to proceed with implementing these "suggestions", by editing this article (that is, "Help:Using_the_Wayback_Machine") as suggested above.
If you want to prevent that from happening, or suggest some different changes, or entertain readers by including some off topic jokes about the universe, here ... then feel free to chime in. --Mike Schwartz (talk) 20:34, 14 July 2017 (UTC)
- I don't understand what's happening. Also, the user didn't sign her/his comment. Any idea about this? I'm confused, I just wanted to change a dead link to an archived version on the Intel_MCS-48 article (Coprolite 8048 Projects). Now I'm completely confused about what to do. How can this be made clearer? I can contribute in editing, but first I need to understand it. Timofonic (talk) 13:16, 19 July 2017 (UTC)
the EDIT (discussed here) has been done (but comments are still welcomed)
The EDIT (discussed above) [by Mike Schwartz (talk)] has been done now.
The changes made to the "Help:" page (that "goes with" this "Help_talk:" page) can be seen by looking at this DIFF.
If any of you are not happy with that edit, then you are still welcome to "chime in" here. I might not be as confused as Timofonic (talk), but sometimes I do have "some" episodes of misunderstanding.
In fact, it is possible that, if that edit did improve that snippet of "example" wikitext to be added (to an "EXISTING REFERENCE" such as a URL of the dead link [dead link] "persuasion", inside [e.g.] a "{{cite web}}" template instance inside a "<ref>" tag) ... then maybe it will even "Help:" -- (no pun intended!) -- to clarify the confusion ("Now I'm completely confused [...]") allegedly reported by Timofonic (talk). OR ... maybe it will just help a little bit, ... but maybe not that much. --Mike Schwartz (talk) 17:42, 31 July 2017 (UTC)
Valid external link?
Is
[http://javascript:void(window.open('https://web.archive.org/save/'+location.href) Wayback Save]
a valid HTTP link? Yes I am sure it works, but I don't think the [ and ] around it will make it useful to be clicked on from this page in most browsers. Jidanni (talk) 15:25, 4 November 2017 (UTC)
- That does not look like a valid HTTP link. Incidentally, Mozilla Firefox 57 won't even let me bookmark it. (It tries to protect users from creating invalid bookmarks, which might even be a good feature if it didn't also block some valid URIs.) Since the link (and others like it) can neither be followed, nor used to produce a working bookmark (it doesn't actually work, even when the browser allows creation), it doesn't seem to serve any purpose. Takatiej (talk) 00:31, 16 January 2018 (UTC)
Announce: RfC: Nonbinding advisory RfC concerning financial support for The Internet Archive
Wikipedia:Village pump (miscellaneous)/Archive 57#RfC: Nonbinding advisory RfC concerning financial support for The Internet Archive --Guy Macon (talk) 12:13, 22 December 2017 (UTC)
Wayback Machine not returning search results today ....
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Currently, any URL I place in the Wayback search field [3] does not return any results. There's no list of years, nothing. There's just a blank page like this [4] at present. I've tried a half dozen random URLs to check, but nothing comes up (not even the message "Hmm, Wayback doesn't seem to have that page archived"). I don't know what's wrong or what to do about it. It was working fine less than 24 hours ago. Any help or advice of what to do or where/how to report it? (I'm going to post this message at WP:VPT as well.) Softlavender (talk) 06:54, 30 July 2018 (UTC)
- To keep the discussion in one place, I suggest continuing it here, not at WP:VPT. The following comment was posted at WP:VPT. --Pipetricker (talk) 09:56, 30 July 2018 (UTC)
Strange. I found the same results with Safari and Firefox. From a quick look at the page source, it seems it's not loading completely. It does, however, load pages when one has a link, e.g. this will work. You might try posting to their 'FAQ forum' listed at the bottom of this page; you could also email info@archive.org. BlackcurrantTea (talk) 08:32, 30 July 2018 (UTC)
- Yes, as I stated, it's not returning search results for any URL. There is no place to raise the issue at their forums (I checked). I emailed info@archive.org two hours ago (that's the only email they provide); it would help if more people emailed them as well so they take the issue seriously. Thanks! Softlavender (talk) 10:05, 30 July 2018 (UTC)
- @Softlavender: Initially it failed for me but now it seems to be working. I'm not sure why. Jc86035 (talk) 15:10, 30 July 2018 (UTC)
- Thanks, yes, they seem to have altered the interface so that the year sections are very small. Possibly they were in the middle of making that change when the search wasn't working. Anyway, I'm going to close this discussion as resolved. Softlavender (talk) 04:21, 31 July 2018 (UTC)
Suggestion for an archive site run/hosted/initiated by Wikimedia
Since archival has become an integral service used by Wikipedia and other Wikimedia projects, has anyone yet considered the possibility that Wikimedia should have their own alternative/backup archival website (or integrate the feature directly into Wikipedia (seems less independent, but perhaps more convenient))? It could not only provide many/most/all of the same features as existing archival sites, but:
- would have content retention policies directly in line with the needs of Wikipedia, its community, and other Wikimedia projects,
- would be an independent back-up resource should any of the third-party archival sites go down or dramatically alter their services, and
- could help automate a lot of the existing difficulties with working with existing archival sites (e.g. it could automatically generate lists of needed dead links and provide a streamlined user-interface to help editors find and document fixed links; it could go through existing citations and automatically generate archives of currently un-archived citations; etc.).
- Probably many other benefits, especially in terms of integration with WP and other WM projects. These suggestions were just the first ones off the top of my head.
Note: I posted this suggestion to this How To's talk page because I am currently in the process of trying to figure out the correct way to mark a dead link I found, and I found the information about how to do it complex and rather scattered around, and thought "There must be a more streamlined way to do this very basic, crucial type of maintenance task." I read the notice above and decided to post this here because I believe it is relevant to the request to, "Please use this talk page to discuss issues specific to this information or how-to page". The 'How To' of Using the Wayback Machine for the purposes of updating dead links with an archive is too difficult and could use such a Wikimedia-run archival site as a long-term improvement to this process. Sorry that I'm not a frequent editor and don't have the time to find the exact right place to post this feature request. I'm really just trying to fix one dead link! 😊 Please redirect this request to the appropriate location if this is not it. Thanks! --24.57.106.253 (talk) 16:56, 11 August 2018 (UTC)
What happens when an archived page is blocked off from the Wayback?
Got an archived link or two done months ago, but it seems that the main site has since put exclusions in place, so the archived links don't work anymore. Should someone revert the article URL and leave a note that the link requires a subscription to access it? Ominae (talk) 09:44, 15 February 2019 (UTC)
Wikipedia should have an Archive.today bot as well as a wayback machine bot
Archive.today *does not* use robots.txt[5], so it's more reliable for Wikipedia than the Wayback Machine/Internet Archive. IA are still blocking or removing pages from the archive regularly, for example the Windows 7 Update privacy policy: [6]
The Internet Archive bot also seems to be falsely labelling links as dead for some reason: [7] --109.144.209.121 (talk) 11:17, 19 February 2019 (UTC)
Removing the navigational toolbar section no longer applicable
With both @InternetArchiveBot: and @GreenC bot:'s WaybackMedic actively stripping out any flags placed after the timedate stamp, is there any point to advising editors to go to the effort of adding them in the first place? The information is good and I think the iframe flag definitely provides Wikipedia users with the best presentation for archived content, but the standard being enforced appears to offer zero flexibility for any deviation from the basic link as generated on the Wayback Machine site. Is that intentional/by consensus or should the bots be reconfigured to respect editors that follow this guideline? — ⚞ ℛogueScholar🐈 ₨🗩 ⚟ 18:28, 14 October 2019 (UTC)
JavaScript bookmarklet to save live page with outlinks?
I see there is a JavaScript bookmarklet to save a live page here, which is very useful to me; I use it often! Is there a way to save a webpage with outlinks through a bookmark, or does that have to be done through web.archive.org/save itself? --MoofEMP (talk) 00:07, 8 March 2020 (UTC)
- @MoofEMP: It's not a bookmarklet, but there's an official browser extension that allows you to save pages, along with giving you the option to also save outlinks and screenshots. The versions on the Chrome and Firefox stores are outdated, so if you want the newest version with these options you will need to download it directly from GitHub here: https://github.com/internetarchive/wayback-machine-webextension. Hope that helps. Duckduckgoop (talk) 04:58, 16 May 2021 (UTC)
Archiving or future-proofing subscriber-only articles
I'm wondering what steps I can take to ensure the future-proofing of an article that cites subscriber-access only news stories (naturally, without causing copyvio concerns). An article that I'm working on uses several Gold Coast Bulletin articles, which I can access with my paid subscription, however these stories of course can't be archived by the Wayback Machine. News Corp Australia publications seem to have a pattern of removing historical stories from their servers—I'm sure this is the case for many similar websites. Of course, I have tried locating free replacements for the information cited but there are none available. I do have html files of the respective stories saved as offline copies on my computer and I imagine it would of course be copyvio to share these publicly. Given that these refs can't be digitally archived, what steps could I take to future-proof the article's verifiability if/when these are inevitably taken offline and if/when a fellow editor requests to check these sources at a formal review process ie FAC? My first idea is to post on the article's talk page a list of the subscriber articles used with a direct quote for the information being cited, and ask future editors to AGF that these quotes from the dead links were accurate and verifiable at the time they were accessed. I could also use the |quote= parameter in the citation template for the reference itself for the same purpose. — CR4ZE (T • C) 03:36, 17 May 2020 (UTC)
"New URLs added to Wikipedia articles (but not other pages) are usually automatically archived by a bot."
In the lead section of this article it says: "New URLs added to Wikipedia articles (but not other pages) are usually automatically archived by a bot." This is empirically untrue. I have created or significantly rewritten and improved over 15 Wikipedia pages, and every single time only some, half or, if I'm lucky, most of the links are archived when I use IABot. Every single time I find that I have to make custom archive URLs myself, either through the Wayback Machine or Archive.is. So what is the deal? Is this statement in the lead section inaccurate, am I doing something wrong, or both? Does the Wikipedia automatic archiving need to be fixed? And if so, how would I go about correcting the archiving, or notifying the person in charge? Factfanatic1 (talk) 10:48, 18 August 2020 (UTC)
- Factfanatic1: WP:LINKROT is probably the correct page. Can see at the top there it describes the NoMore404 program. Would need to see some examples. If it can be verified I can report it to InternetArchive who maintain the program. It's simple, add some URLs to a page and see if the archives show up on the WaybackMachine. -- GreenC 13:42, 18 August 2020 (UTC)
- Thank you. Factfanatic1 (talk) 13:53, 18 August 2020 (UTC)
"Wayback Save" link issue
When I click the example link "Wayback Save" in Help:Using_the_Wayback_Machine#To_save_a_live_page, I get a new browser tab, showing the Wayback icon, "about:blank#blocked" in the address bar, and an otherwise blank page. Is this supposed to happen? If not what should happen? (I am using Chrome with Windows 10 Pro 1909, in case that is relevant.) FrankSier (talk) 09:43, 24 August 2020 (UTC)
- To save a URL enter this in your address bar:
https://web.archive.org/save/http://example.com
replacing http://example.com with whichever URL to save. I couldn't say why clicking that convenience link doesn't work. -- GreenC 15:04, 24 August 2020 (UTC)
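The address-bar recipe above can also be written as a small Python helper. This is a minimal sketch under the assumption that the save endpoint simply takes the target URL appended after /save/, exactly as in the example; the function name save_url is hypothetical, and the code only builds the address without contacting the Wayback Machine.

```python
from urllib.parse import urlsplit

SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(target: str) -> str:
    """Return the address to open in order to ask the Wayback Machine to save target."""
    parts = urlsplit(target)
    # Reject relative or non-web addresses such as javascript: links.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {target!r}")
    return SAVE_PREFIX + target


print(save_url("http://example.com"))
# https://web.archive.org/save/http://example.com
```

The scheme check is a design choice of this sketch: it keeps out pseudo-links like the javascript: one discussed in the "Valid external link?" section.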
Wayback machine doesn’t work
Does someone have the same problem as me? At the moment the Internet Archive doesn't work: if I save a page, the page disappears immediately, so it has not been possible to save pages for some time now. 87.123.206.98 (talk) 22:45, 20 October 2020 (UTC)
messes up dates
On the Farsi wiki (fa.wiki), the bot mostly breaks dates. Baratiiman (talk) 13:28, 9 November 2020 (UTC)
- Baratiiman, could you provide some example diffs and more detail like what it should look like? Then I can submit a trouble report. Thanks. -- GreenC 14:59, 9 November 2020 (UTC)
- fa.wikipedia.org/wiki/اصفهان I have used IA bot like a hundred times and I think it is changing dates; the dates don't match the cite template. Baratiiman (talk) 15:24, 9 November 2020 (UTC)
Is it good practice to remove archive links from live citations?
I saw somebody doing this and got into a debate with them about it. Just want to double check. Thanks. –Novem Linguae (talk) 13:36, 16 January 2021 (UTC)
- There's been recurrent debate about this in multiple places; a lot of editors think this is useless code bloat in the page, but others don't. It never seems to come to resolution, and probably will not without a very well-trafficked RfC, perhaps at WP:VPTECH or WT:CITE, with a WP:CENT notice so it gets input from more than the page regulars. — SMcCandlish ☏ ¢ 😼 05:21, 4 January 2024 (UTC)
url-status: Third value would be useful
Currently a link that has been archived can have one of two states, dead or live. If the link is live, the original link will be pulled, if dead the archived one. However, this does not work well if the link is still live, but has been moved behind a paywall (or registration requirement, etc). In this case, it would be preferable to serve the archived version. So a third value, to indicate that the link is still alive but not suitable for serving would be useful. Lklundin (talk) 21:02, 14 July 2021 (UTC)
- Access limitations are designated with |url-access=subscription etc.; see Template:Cite_web#Subscription_or_registration_required. The purpose of archiving links is not to bypass paywalls. Should that become our stated policy, at scale, then I imagine paywall sites will block the Wayback Machine entirely, even more than they already do. -- GreenC 21:13, 14 July 2021 (UTC)
- And there are already more than two values. E.g., if the site has been usurped by a spammer or something, |url-status=usurped or |url-status=unfit. If the material has been changed (e.g. at a page that changes radically each year or other time span) and remains a valid URL but one that no longer supports the claim it was cited for and only an archived version does, then |url-status=deviated. — SMcCandlish ☏ ¢ 😼 05:47, 4 January 2024 (UTC)
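The rule discussed in this thread, where the status decides which link a reader is sent to first, can be illustrated with a short Python sketch. It is a simplification and not the citation template's actual code: the function name primary_link is hypothetical, the set of statuses is only the one named in this thread, and treating "dead" as the default when an archive URL is present is an assumption.

```python
from typing import Optional

# Statuses named in this thread for which the archived copy is shown first.
ARCHIVE_FIRST = {"dead", "usurped", "unfit", "deviated"}


def primary_link(url: str, archive_url: Optional[str] = None,
                 url_status: str = "dead") -> str:
    """Pick the link a citation would point to first (simplified model)."""
    if archive_url is None:
        # Nothing archived, so the original URL is the only choice.
        return url
    if url_status == "live":
        return url
    if url_status in ARCHIVE_FIRST:
        return archive_url
    raise ValueError(f"unknown url-status: {url_status!r}")


original = "http://www.originalurl.com"
archived = "https://web.archive.org/web/20021128120000/http://www.originalurl.com"
print(primary_link(original, archived, "live"))      # the original URL
print(primary_link(original, archived, "deviated"))  # the archived URL
```

A paywalled page does not fit this model on purpose: as noted above, access limits belong in |url-access=, not in |url-status=.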
Latest archive copy
In the section it says the use is discouraged, because it doesn't link to a specific version of the page, and this is true and a problem if that is what you want to do. But I have found links like this useful in the case where I want to link to the most current version of a page, but the website that hosts the page is experiencing ongoing reliability issues, configuration problems, &c. — Preceding unsigned comment added by 77.61.180.106 (talk) 16:27, 9 November 2021 (UTC)
Internet Archive library and Wayback Machine blocked by ISPs (Parental filters?)
I frequently access the Internet Archive's library or Wayback Machine while editing at Wikipedia (from the UK), but since swapping over to Three broadband it looks like these sites are blocked by parental filters (which are turned on by default). At the moment, I still have BT broadband and that works just fine, as does a VPN.
In Chrome, attempted access results in this response:
"This site can't provide a secure connection. archive.org sent an invalid response ..."
An article from 2019 suggests that Vodafone, Three, O2, and EE are blocking access: Internet Archive Wayback Machine blocked by Vodafone, Three, O2, and EE: We can change that.
- Could this be related to the UK's Online Safety Act 2023, due to the Internet Archive wittingly and unwittingly hosting adult content without user age verification?
This can be solved by going to the broadband user account and entering credit card details for a card check to verify age, but it won't accept debit cards. Alternatively, you can go into one of the relevant phone stores with a card and photo ID, assuming there is a phone store within reach.
However, I (for one, now a pensioner) can do neither, and I will note that already when I signed-up to Three they carried out a credit check and have my age.
Another way to obtain access (it works for me on Three) is to open an Opera or Avast secure browser (and probably others) and switch to a virtual private network (VPN).
- Could such issues be investigated, and would it be worth adding a new section to the help page?
(Topic posted here after originally posting to the Village pump (technical)).
Thanks and have a happy New Year, Esowteric + Talk + Breadcrumbs 17:55, 30 December 2023 (UTC)
The Washington Post issue
The Washington Post may be blocking Internet Archive's bots now. When I try to archive https://www.washingtonpost.com/entertainment/movies/2024/01/03/barbie-oscars-screenplay-adapted-original Wayback reports a 404 error, despite that page definitely not being a 404. If WashPost is doing this programmatically, then this could be serious bad news for using it as a source and not suffering a lot of fatal linkrot later. — SMcCandlish ☏ ¢ 😼 05:17, 4 January 2024 (UTC)
Wayback is saving the wrong article.
I noticed an unexpected issue in Wayback, which is pretty difficult to explain, but which is noted here. Basically the Wayback link which is saved is a totally different article than the URL that was input, except the headline is the same as that of the input URL. The article numbers and dates are not the same, so it isn't a case of the author editing the article to be something else. The issue has been ongoing, from what I can tell, for at least 2 years. I reported it to info@archive.org and got a reply the very next day that they are working on it. Unsure if it is a widespread problem or not, but when I posted the issue on Women in Red's talk page, I was advised to post it here. (I don't watch this page, if you respond please ping me.) SusunW (talk) 14:32, 7 February 2024 (UTC)
IA down
Please add a warning that IA is down. Luhanopi (talk) 10:19, 12 October 2024 (UTC)
- Seems the Wayback Machine is available again; at this point a warning is likely not needed. Adam8410 (talk) 01:15, 15 October 2024 (UTC)
- Yes. Unfortunately only Wayback :( Luhanopi (talk) 17:20, 15 October 2024 (UTC)
- @Adam8410 No, it's in read-only mode. --Luhanopi (talk) 16:20, 19 October 2024 (UTC)