Wikipedia:Link rot/URL change requests/Archives/2022/January


Ultratop (Flanders / Wallonia)

Hello. I was wondering if all of the Ultratop URLs could be archived. The old URLs currently redirect to new ones. For example, this is now this for Flanders and this is now this for Wallonia. As it's not a straightforward URL update, I was hoping that all of the URLs with the following URL formats could be archived:

I am curious to see how many can be archived and how many would need URL updating. If any of these are dead links, please let me know. Per a discussion at Template_talk:Single_chart, the old URLs currently redirect to the new ones, so they may not need tagging as dead links. Therefore, I think the only ones that would need tagging are those that do not redirect to the new URL. Thanks! --MrLinkinPark333 (talk) 19:17, 23 December 2021 (UTC)

Hi MrLinkinPark333, it appears there are (only) 1,649 pages that contain any of the URLs. The rest are template usages, for example in Old Skool (EP). Are you able to modify the template(s) for this domain so it prepends "https://web.archive.org/web/"? Thus in Old Skool #7 it becomes https://web.archive.org/web/https://www.ultratop.be/nl/showitem.asp?interpret=Armin+van+Buuren&titel=Old+Skool&cat=a .. after a few redirects it gets to the right page (I think). Another option is to use an intentionally outdated timestamp, which causes the server to find the oldest available capture, e.g. https://web.archive.org/web/2000010101010101/https://www.ultratop.be/nl/showitem.asp?interpret=Armin+van+Buuren&titel=Old+Skool&cat=a .. Another way is to use the index page, e.g. https://web.archive.org/web/*/https://www.ultratop.be/nl/showitem.asp?interpret=Armin+van+Buuren&titel=Old+Skool&cat=a .. Another way is for the bot to delete the template and replace it with a {{cite web}}, which is a bit more complicated to program. In the meantime I'll work on the 1,649. -- GreenC 18:08, 4 January 2022 (UTC)
Actually, that won't work because redirects might be working. It would require checking each one. -- GreenC 18:48, 4 January 2022 (UTC)
@GreenC: Whichever is easiest for you would be fine with me. Swiss charts has the same issue, with this becoming that after redirect. Not sure which one of your solutions would work best here, since the URL is not similar at all to the current one. --MrLinkinPark333 (talk) 21:18, 4 January 2022 (UTC)
MrLinkinPark333, sorry if I am misunderstanding. Your original request said "all of the Ultratop URLs could be archived". I ran a test on 50 articles and in all cases the URLs still work, albeit via redirect. For example original is now new. There is nothing to be done since the redirect works. I'll process all and see if it uncovers any dead such as a missing redirect. -- GreenC 17:53, 8 January 2022 (UTC)
@GreenC: While the redirects work now, I wasn't sure if they will work in the future. If it's easier to find ones where there is no working redirect, then that'd be a better solution. --MrLinkinPark333 (talk) 18:00, 8 January 2022 (UTC)
Yeah, that's always a possibility. That's different tooling. I'm currently running the find-and-fix for dead links, then I may go back and retool for removing redirects. -- GreenC 18:06, 8 January 2022 (UTC)
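
For reference, the three Wayback Machine URL forms mentioned in this thread (bare prefix, timestamped, and index page) can be sketched as below. This is only an illustration, not the template's or the bot's actual code. The function names are made up, and the 14-digit timestamp in the usage lines is an assumed example (Wayback timestamps normally have the form YYYYMMDDhhmmss), not the one quoted above.

```python
WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_prefixed(url: str) -> str:
    """Bare prefix form: the archive picks a capture and redirects to it."""
    return WAYBACK_PREFIX + url


def wayback_at(url: str, timestamp: str) -> str:
    """Timestamped form: the archive redirects to the capture nearest to
    `timestamp`, so a deliberately early value lands on the oldest capture."""
    return f"{WAYBACK_PREFIX}{timestamp}/{url}"


def wayback_index(url: str) -> str:
    """Index form: lists the available captures instead of showing one."""
    return f"{WAYBACK_PREFIX}*/{url}"


# Usage with the URL from the thread; the timestamp is an assumed example.
original = ("https://www.ultratop.be/nl/showitem.asp"
            "?interpret=Armin+van+Buuren&titel=Old+Skool&cat=a")
print(wayback_prefixed(original))
print(wayback_at(original, "20000101000000"))
print(wayback_index(original))
```

None of these forms checks whether the original link is still live, which is the objection raised above: a template that always prepends the archive prefix would also archive links whose redirects still work.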

MrLinkinPark333, it processed 1,649 pages and found and tagged dead links for ultratop.be URLs in 47 of them.

It also converted the redirects (Example), making 2,115 conversions. -- GreenC 21:03, 9 January 2022 (UTC)
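
The three outcomes in this run (link still live, redirect converted, dead link tagged) amount to a small decision rule. The sketch below is a guess at that rule for illustration only, not the bot's real logic. It assumes the caller has already followed redirects and passes in the final HTTP status (or None if there was no response) and the final URL. The function name and the "new-location" URL in the usage line are hypothetical.

```python
from typing import Optional


def triage(original_url: str, status: Optional[int], final_url: Optional[str]) -> str:
    """Classify a checked link as 'dead', 'redirected', or 'live'.

    `status` is the HTTP status after following redirects, or None when the
    server did not answer at all. `final_url` is where the request ended up.
    """
    if status is None or status >= 400:
        return "dead"        # candidate for {{dead link}} or an archive URL
    if final_url and final_url != original_url:
        return "redirected"  # candidate for swapping in the new URL
    return "live"            # nothing to do


# Usage: the second URL is a hypothetical redirect target.
old = "https://www.ultratop.be/nl/showitem.asp?interpret=Armin+van+Buuren&titel=Old+Skool&cat=a"
print(triage(old, 200, "https://www.ultratop.be/nl/new-location"))
```

A real checker would also need to catch pages that return 200 but are effectively "not found" pages, which a status code alone cannot show.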

asianfanatics.net

Taken over by scammers. – robertsky (talk) 09:07, 10 January 2022 (UTC)

robertsky, OK, taken care of. If you see any problems let me know; it was in 44 articles. Also noted for IABot for other wikis. -- GreenC 16:41, 10 January 2022 (UTC)

www.businessdictionary.com

I found at least 100 broken links to www.businessdictionary.com: this website seems to be offline now. Jarble (talk) 16:53, 12 January 2022 (UTC)

Straight dead site, sent to IABot to archive, 48 pages. -- GreenC 16:58, 12 January 2022 (UTC)

Gambling

Not many links to this site, but bocaratonnews.com is usurped by a gambling site. Came across it in my bare ref run. Rlink2 (talk) 00:31, 16 January 2022 (UTC)

Rlink2, thanks for this. It is part of the WP:JUDI gambling ring; they have taken over 100s or 1000s of domains on Wikipedia. It's been added to the next batch, and instances will get usurped. -- GreenC 03:54, 16 January 2022 (UTC)

www.forbes.com

I've found many URLs on forbes.com (such as this one) that have not yet been archived. Most of these broken links have a "404" page title, so they should be easy to find and repair. Jarble (talk) 16:29, 4 January 2022 (UTC)

Jarble Agree there are many problems with Forbes. Starting the process now. They are in 35,307 pages on enwiki, plus another 78,534 urls in the IABot database. It will take a bit of time to verify and update in respective locations. Update: they use bot detection/blocking which will slow it down. -- GreenC 03:40, 10 January 2022 (UTC)

Jarble: It took 9 days, with 15 concurrent processes running 24x7. It was slowed by the sheer mass of links, and because Forbes uses bot blocking it required some redundant, roundabout methods to get an accurate header result. The bot blocking is probably why InternetArchiveBot has not been able to correctly determine dead links. -- GreenC 16:38, 19 January 2022 (UTC)

Results

  • Processed 78,534 urls in the IABot database. Found and marked about 7,500 as dead.
  • Processed 35,307 pages on Wikipedia. Found and marked dead or added archive to about 3,000 links.

Cracroft's Peerage

There was a directory change in the website of Cracroft's Peerage in December 2020. The change is as follows.

The change is to remove /online/content. Cracroft's Peerage is deprecated per Wikipedia:Reliable sources/Perennial sources#Self-published peerage websites, but per Special:LinkSearch/http://www.cracroftspeerage.co.uk/online/content/ there are about 2,000 links across all namespaces, so probably still worth fixing. ネイ (talk) 10:03, 20 January 2022 (UTC)

OK, some won't migrate, like [1], so they all need to be tested, i.e. this is not a job for an AWB search-and-replace script. My bot can do that. Running now. -- GreenC 16:05, 20 January 2022 (UTC)

Results

  • Articles checked: 1,434
  • Articles edited: 1,203
  • Swap old URL with new URL: 1,747 [2]
  • New archive URL added: 68 [3]
  • Existing |url-status=live changed to |url-status=dead: 298 [4]
  • Add {{dead link}}: 5 [5]

-- GreenC 03:27, 21 January 2022 (UTC)
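
The path change described in this section (dropping /online/content) can be sketched as a pure URL rewrite. This is an illustration, not the bot's code. The function name and the example.htm page in the usage line are hypothetical. As noted above, some pages did not migrate, so each rewritten URL still has to be tested before it replaces the old one.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

OLD_PREFIX = "/online/content"
HOSTS = ("www.cracroftspeerage.co.uk", "cracroftspeerage.co.uk")


def migrate_cracroft(url: str) -> Optional[str]:
    """Return the candidate new URL, or None if the URL is not an
    old-style Cracroft's Peerage link."""
    parts = urlsplit(url)
    if parts.hostname not in HOSTS:
        return None
    if not parts.path.startswith(OLD_PREFIX + "/"):
        return None
    # Drop only the leading directory; keep scheme, host, query and fragment.
    new_path = parts.path[len(OLD_PREFIX):]
    return urlunsplit(parts._replace(path=new_path))


# Usage: example.htm is a made-up page name.
print(migrate_cracroft("http://www.cracroftspeerage.co.uk/online/content/example.htm"))
```

Returning None for anything that does not match keeps the rewrite from touching unrelated links, and leaves the decision between "swap URL", "add archive" and "tag dead" to whatever check runs on the candidate afterwards.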

Geek & Sundry

Looks like Geek & Sundry is truly dead as their website https://geekandsundry.com now redirects to https://nerdist.com/. It doesn't appear that any of their articles are being transitioned over to Nerdist - just a mass redirect to Nerdist's homepage. Not only are they a primary source for their own article, but also I know I've used many of their articles as secondary sources across RPG/tabletop game articles. A bot adding archive links would be awesome. Thanks! Sariel Xilo (talk) 05:42, 23 January 2022 (UTC)

Some were migrated example. More complicated. I'll take a look. -- GreenC 07:49, 23 January 2022 (UTC)
Thanks for looking! Was it just some of the Critical Role stuff that was migrated? Sariel Xilo (talk) 16:40, 23 January 2022 (UTC)

Sariel Xilo, done. It edited 134 pages. Converted 94 links to Nerdist, and added 165 archive URLs for geekandsundry. -- GreenC 03:17, 25 January 2022 (UTC)
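
The case described in this section, where article URLs are mass-redirected to another site's homepage, is one a plain status check misses because the response is still a success. A minimal sketch of a test for it follows, assuming the final URL after redirects is already known. The function name and the article path in the usage line are hypothetical, and this is not necessarily how the bot told migrated articles apart from homepage redirects.

```python
from urllib.parse import urlsplit


def is_homepage_redirect(original_url: str, final_url: str) -> bool:
    """True when a link to a specific page ended up on a bare homepage."""
    orig, final = urlsplit(original_url), urlsplit(final_url)
    asked_for_page = orig.path not in ("", "/") or bool(orig.query)
    landed_on_home = final.path in ("", "/") and not final.query
    return asked_for_page and landed_on_home


# Usage: the article path is made up.
print(is_homepage_redirect("https://geekandsundry.com/some-article/", "https://nerdist.com/"))
```

Links that pass this test are candidates for an archive URL, while links that redirect to a specific page on the new site are candidates for conversion.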

mangalamvarika.com

Avira is spitting a virus warning at me from visiting mangalamvarika.com. We have about 60 links to pages there. Looks like the domain expired and has been usurped. --Geniac (talk) 20:59, 24 January 2022 (UTC)

Geniac: done. [6] -- GreenC 21:37, 24 January 2022 (UTC)