Welcome to my talk page.

Please leave me a note by starting a new subject here,
and please don't forget to sign your post.

You may want to have a look at the subjects
in the header of this talkpage before starting a new subject.
Your question may already have been answered there.

Dirk Beetstra        
I am the main operator of User:COIBot. If you feel that your name is wrongly on the COI reports list because of an unfortunate overlap between your username and a certain link or text, please ask for whitelisting by starting a new subject on my talkpage. For a better answer please include some specific 'diffs' of your edits (you can copy the link from the report page). If you want a quicker response, make your case at WT:WPSPAM or WP:COIN.
COIBot - Talk to COIBot - listings - Link reports - User reports - Page reports

I will respond to talk messages where they started, trying to keep discussions in one place (you may want to watch this page for some time after adding a question). Otherwise I will clearly state where the discussion will be moved/copied to. Though, with the large number of pages I am watching, it may be wise to contact me here as well if you need a swift response. If I forget to answer, poke me.

I reserve the right not to answer non-civil remarks, or subjects which are covered in this talk-header.


There are several discussions about my link removal here, and in my archives. If you want to contact me about my view of this policy, please read and understand WP:NOT, WP:EL, WP:SPAM and WP:A, and read the discussions on my talkpage or in my archives first.

My view in a nutshell:
External links are not meant to funnel people away from Wikipedia.

Hence, I will remove external links on pages where I think they do not add to the page (per WP:NOT#REPOSITORY and WP:EL), or when they are added in a way that wikipedia defines as spam (understand that wikipedia defines spam as: '... wide-scale external link spamming ...', even if the link is appropriate; also read this). This may mean that I remove links while similar links are already there, or have been there for a long time. Still, the question is not whether your link should be there; the question may be whether those other links should be there (again, see the wording of the policies and guidelines).

Please consider the alternatives before re-adding the link:

  • If the link contains information, use the information to add content to the article, and use the link as a reference (content is not 'see here for more information').
  • Add an appropriate linkfarm like {{dmoz}} (you can consider removing other links covered in the dmoz).
  • Incorporate the information into one of the sister projects.
  • Add the link to other mediawiki projects aimed at advertising (see e.g. this).

If the linkspam of a certain link persists, I will not hesitate to report it to the wikiproject spam for blacklisting (even if the link would be appropriate for wikipedia). It may be wise to consider the alternatives before things get to that point.

The answer in a nutshell
Please consider if the link you want to add complies with the policies and guidelines.

If you have other questions, still have questions on my view of the external link policy, disagree with me, or think I made a mistake in removing a link you added, please poke me by starting a new subject on my talk-page. If you absolutely want an answer, you can try to poke the people at WT:EL or WT:WPSPAM on your specific case. Also, regarding links, I can be contacted on IRC, channel [1].

Reliable sources

I convert inline URLs into references and convert referencing styles to a consistent format. My preferred style is the style provided by cite.php (<ref> and <references/>). When other mechanisms are mainly (but not consistently) used (e.g. {{ref}}/{{note}}/{{cite}}-templates) I will assess whether referencing would benefit from the cite.php-style. Feel free to revert these edits when I am wrong.

Converting inline URLs into references may result in data being retrieved from unreliable sources. In these cases, the link may have been removed and replaced by a {{cn}}. If you feel that the page should be used as a reference (complying with wp:rs!!), please discuss that on the talkpage of the page, or poke me by starting a new subject on my talk-page.

Note: I am working with some other developers on mediawiki to expand the possibilities of cite.php; our attempts can be followed here and here. If you like these features and want them enabled, please vote for these bugs.


I am in general against deletion, except when the page really gives misinformation, or is clear spam or copyvio. Otherwise, these pages may need to be expanded or rewritten. For very short articles there are the different {{stub}} marks, which clearly state that the article is to be expanded. For articles that do not state why they are notable, I will add either {{importance}} or {{notability}}. In my view there is a distinct difference between these two templates. While articles carrying either of these templates may not be notable, the first template says the article is probably notable enough, but the content does not state that (yet). The latter expresses a clear concern that the article is not notable, and should probably be {{prod}}ed or {{AfD}}ed. Removing importance-tags does not take away the backlog, it only hides it from attention; deleting pages does not make the database smaller. If you contest the notability/importance of an article, please consider adding an {{expert-subject}} tag, or raise the subject on an appropriate wikiproject. Remember, there are many, many pages on the wikipedia, and many need attention, so maybe we have to live with a backlog.

Having said this, I generally delete the {{expand}}-template on sight. The template is in most cases superfluous, since expansion is intrinsic to the wikipedia (for stubs, expansion is already mentioned in that template).

Warning to Vandals: This user is armed with VandalProof.
Warning to Spammers: This user is armed with Spamda
This user knows where IRC hides the cookies, and knows how to feed them to AntiSpamBot.
This user is one of the 400 most active English Wikipedians of all time.

Please am sorry if I spammed Wiki, Kindly remove this URL as spam

Hello user Beetstra, I am really new on Wikipedia and I am still learning how to grow Wikipedia as an encyclopedia. I made a series of edits recently and one of the links was recorded as spam; this is the link below:


Please kindly revert it! I won't spam anymore, thanks. If you have any reply, please post on my talk page. Daniel vic (talk) 08:48, 1 August 2020 (UTC)

Daniel vic, there is no need to remove it, no-one has acted on it, and since you are clearly reacting now there is no reason for action either. I will close it accordingly. It’s just a report, nothing more, nothing less. I made my mistakes when I started, and even while I was an admin. No need to worry about it. Dirk Beetstra T C 19:59, 1 August 2020 (UTC)

Okay! Thanks...I will become better sir! Daniel vic (talk) 04:06, 2 August 2020 (UTC)

COIBot/LiWa3 - list all users who added a certain URL

Hey Beetstra, I was wondering if there's a way to get all users who added a particular URL (not the whole domain, just the specific URL). There's an SPI going on which has identified a particular URL as characteristic of a very large xwiki spam ring, but the domain in question is used in too many places for a useful COIBot report (mdpi.com, an open-access journal website of some sort, there will be plenty of legit/unrelated additions). I used "whoadded mdpi.com /2078-2489/11/5/263/htm" on IRC and it says there are 560 (!) users who have added that URL, but it only gives us the top 10. Is there a way to generate a COIBot report just for a specific URL instead of the whole domain? Alternatively, are you able to just pull the results of that query and either post them on the SPI or email them to me and Mz7? GeneralNotability (talk) 20:33, 4 August 2020 (UTC)

GeneralNotability, I can pull the query, that will be the easiest. I’ll try that tomorrow. Dirk Beetstra T C 22:23, 4 August 2020 (UTC)
@GeneralNotability and Mz7: I created the dumps. This is a somewhat broader dump than what you were asking for: it contains all mdpi.com links with '2078' and '2489' in them, plus a dump for the specific url (adding the search term '263'). Hope this helps. --Dirk Beetstra T C 07:13, 5 August 2020 (UTC)
re-ping: @GeneralNotability and Mz7:. --Dirk Beetstra T C 07:13, 5 August 2020 (UTC)
Could you add www.mdpi.com/2073-431X/8/3/60/htm to that? It's an earlier paper by the same authors see e.g. [2]. This is so widespread and over such a long time period that I think we need to add them to the blacklist, but should we wait until they are all removed? SmartSE (talk) 08:35, 5 August 2020 (UTC)
@Smartse, GeneralNotability, and Mz7: I will do Wikipedia_talk:WikiProject_Spam/LinkReports/mdpi.com#specific_dump 2 in a bit.
If any of you has toolforge access (and hence, sql access), can I have your SQL usernames? I can try to grant you SELECT rights on the db (and hope you all don't want to do a SELECT on youtube.com ...). --Dirk Beetstra T C 09:04, 5 August 2020 (UTC)
@JzG: ^^. --Dirk Beetstra T C 09:36, 5 August 2020 (UTC)
Beetstra, no toolforge here Guy (help! - typo?) 09:43, 5 August 2020 (UTC)
Working on getting access now. GeneralNotability (talk) 13:54, 5 August 2020 (UTC)
Got access - assuming I'm looking at the right thing ($HOME/replica.my.cnf), my SQL username is u25662. GeneralNotability (talk) 15:48, 5 August 2020 (UTC)
GeneralNotability, yep, that is it. I’ll try to give you access and some explanation tomorrow. Dirk Beetstra T C 17:18, 5 August 2020 (UTC)
GeneralNotability, you now have access to the table 'linkwatcher_linklog' on database 's51230__linkwatcher' (note that there are two underscores in the database name). 'describe linkwatcher_linklog;' will give you a table description. Please be careful with the queries: "select * from linkwatcher_linklog where domain like 'com.youtube.%';" will take days, and likely both COIBot and LiWa3 will be affected. Dirk Beetstra T C 06:54, 6 August 2020 (UTC)

linklog queries

GeneralNotability I don't have access to IRC at the moment (and my weekend starts). Anyway: the table 'linkwatcher_linklog':

  • ID = just an ID
  • timestamp = unix format timestamp for the time of the edit (taken from the diff)
  • edit_id = (I forgot - I think I used this as a flag once)
  • lang = language of wiki
  • pagename = name of page
  • namespace = namespace (main, draft, user, and template)
  • diff = cleaned up diff-url, or the logitem url for spamblacklisthits
  • revid = the revid number
  • oldid = the oldid number
  • wikidomain = type of wiki (wikipedia, wikiversity ...)
  • user = name of editor / IP
  • fullurl = the complete link that was added
  • domain = encoded domain of the fullurl. The domain is reversed: 'example.com' becomes 'com.example.' (with an extra dot at the end). That makes searching much faster, especially on subdomains: to get all blogspots you can use domain LIKE 'com.blogspot.%', which is much faster than domain LIKE '%.blogspot.com'. If you want to know who added 'blogspot.com' itself you have to search for domain = 'com.blogspot.' (again, with the trailing dot; domain = 'com.blogspot' will not give any results). m:User:LiWa3 strips 'www.' and 'www3.' from the front of each domain, so both 'example.com' and 'www.example.com' are stored as 'com.example.'. Note that the whole path is stripped as well; this is just domain name and TLD, so I can count domain = 'com.mdpi.' quite fast.
  • resolved = IP of domain (some are static enough to be useful as a search term), and some IPs of domains are close to IPs of IP-users.
  • ip = whether the editor is an IP or a named user; that was originally only IPv4, and I turned it on for IPv6 way later than IPv6 was implemented.
  • date = date of diff (used for some search functions).
  • time = time of diff
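
For illustration only, the reversed-domain encoding described for the domain field can be sketched in Python as below. This is not LiWa3's actual code: the function names encode_domain and subdomain_pattern are hypothetical, the prefix-stripping order is an assumption, and urlsplit lower-cases the host, which may or may not match what the bot stores.

```python
from urllib.parse import urlsplit

# Prefixes the list above says are stripped; how LiWa3 really does this is an assumption.
STRIPPED_PREFIXES = ("www.", "www3.")


def encode_domain(url: str) -> str:
    """Return the reversed-domain key for a URL, with the trailing dot.

    Hypothetical helper: the path is dropped and only the host is kept.
    """
    if "://" not in url and not url.startswith("//"):
        url = "//" + url  # let urlsplit treat a bare 'example.com/path' as a host
    host = urlsplit(url).hostname or ""  # note: hostname is lower-cased by urlsplit
    for prefix in STRIPPED_PREFIXES:
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    return ".".join(reversed(host.split("."))) + "."


def subdomain_pattern(domain: str) -> str:
    """Return a LIKE pattern matching the domain and all of its subdomains."""
    return encode_domain(domain) + "%"


print(encode_domain("www.mdpi.com/2078-2489/11/5/263/htm"))  # com.mdpi.
print(encode_domain("example.com"))                          # com.example.
print(subdomain_pattern("blogspot.com"))                     # com.blogspot.%
```

Because the reversed form puts the TLD first, a subdomain search becomes a prefix match ('com.blogspot.%') rather than a suffix match, which is what lets an index on the domain column be used.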

Just to note, your primary search fields will likely be the username, the pagename and the domain.

Choose the queries wisely: SELECT diff,user FROM linkwatcher_linklog WHERE domain LIKE 'com.mdpi.%' AND fullurl LIKE '%2078%' AND fullurl LIKE '%2489%'; (0.25 sec, going through 11277 records out of almost a billion records) will be orders of magnitude faster than SELECT diff,user FROM linkwatcher_linklog WHERE fullurl LIKE '%mdpi.com%' AND fullurl LIKE '%2078%' AND fullurl LIKE '%2489%'; (I haven't tried). Username, domain, pagename, revid, oldid, wikidomain and language all have indexes on them, so as long as you have the beginning of the value correct (hence the 'com.example.') these should be fast. Avoid domains with a very large number of records (I've never even attempted to see who spammed a certain video on youtube with domain LIKE 'com.youtube.%' AND fullurl LIKE '%<videocode>%'; I am afraid that that will take ages).
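
As a rough sketch of the query shape recommended above (narrow by the indexed domain prefix first, then filter on fullurl), the following Python builds the same SELECT with bound parameters. It runs against an in-memory SQLite table as a stand-in, not the real Toolforge database; the helper name find_adders, the four-column table, and the sample rows are all made up for illustration.

```python
import sqlite3


def find_adders(conn, reversed_domain_prefix, url_fragments):
    """Return (diff, user) rows for one domain whose fullurl contains every fragment.

    Hypothetical helper: the domain condition is a prefix match so an index on
    'domain' can be used; the fullurl conditions only filter that smaller set.
    Fragments are not escaped, so '%' or '_' inside them act as LIKE wildcards.
    """
    sql = "SELECT diff, user FROM linkwatcher_linklog WHERE domain LIKE ?"
    params = [reversed_domain_prefix + "%"]
    for fragment in url_fragments:
        sql += " AND fullurl LIKE ?"
        params.append("%" + fragment + "%")
    return conn.execute(sql, params).fetchall()


# Stand-in table with only the columns this sketch needs (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE linkwatcher_linklog (diff TEXT, user TEXT, fullurl TEXT, domain TEXT)"
)
conn.execute("CREATE INDEX idx_domain ON linkwatcher_linklog (domain)")
conn.executemany(
    "INSERT INTO linkwatcher_linklog VALUES (?, ?, ?, ?)",
    [
        ("diff-1", "ExampleUserA", "https://www.mdpi.com/2078-2489/11/5/263/htm", "com.mdpi."),
        ("diff-2", "ExampleUserB", "https://www.mdpi.com/2073-431X/8/3/60/htm", "com.mdpi."),
        ("diff-3", "ExampleUserC", "https://example.com/2078-2489", "com.example."),
    ],
)

rows = find_adders(conn, "com.mdpi.", ["2078", "2489"])
print(rows)  # [('diff-1', 'ExampleUserA')]
```

The third sample row contains both fragments but belongs to another domain, so the domain prefix excludes it before the fullurl filters are ever relevant.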

Have fun. --Dirk Beetstra T C 15:47, 6 August 2020 (UTC)

Beetstra, thanks, this is really helpful - will try not to break anything. GeneralNotability (talk) 19:51, 6 August 2020 (UTC)

Manually archiving blacklist talkpage

Hello Beetstra, I just did a quick test to manually archive 1 old case. Seems to work just fine - should we archive the other old stuff manually as well, or do you see any technical reason not to? It might be easier than lengthy research into why the talkpage archiving is hanging yet again (I suspect something in your huge copypaste from 7 July is bothering the archive bot). Just wanted to double-check with you before further cleanup; please feel free to revert my test archiving if it causes technical problems. GermanJoe (talk) 11:49, 7 August 2020 (UTC)

GermanJoe, just do it. But I would prefer the bot to start working again. Dirk Beetstra T C 11:59, 7 August 2020 (UTC)
OK, will do when I have a bit of time. My hope is that the bot will work again once most of the old issues (and whatever is irritating the bot within them) are archived. GermanJoe (talk) 14:11, 7 August 2020 (UTC)

COIBot bad page creation

COIBot recently created WikiProject Spam/LinkReports/onlinetutorials.tech in article space, rather than project space. power~enwiki (π, ν) 20:58, 7 August 2020 (UTC)

Power~enwiki, thanks for reporting. Every once in a while it misfires. I don’t understand why it sometimes misses the namespace. I’ve moved/merged it. Dirk Beetstra T C 21:11, 7 August 2020 (UTC)

Dirk, can you please add a check so that it can't create pages in the main namespace? Over the last few days, we have all of these (and there may have been others which have since been moved, I can't check this):

I haven't moved them so you can check them, but these all should be moved out of the mainspace, and no new ones created of course. Fram (talk) 12:51, 10 August 2020 (UTC)

Fram, crap. I have partially blocked the bot for now until I have time to program a check (or figure out why it does this so suddenly). I deleted the pages, too lazy for a history merge. Thanks for the report. Dirk Beetstra T C 13:45, 10 August 2020 (UTC)

Hey there, B, twice I've dropped pokes here and here for COIBot to look at these links, and both times the bot hates me, and there are no reports. What am I doing wrong? Thanks, Cyphoidbomb (talk) 04:41, 14 August 2020 (UTC)