Wikipedia talk:WikiProject Disambiguation/Vulnerability of short pages to attack, UD overflow, and other issues of Boke

This discussion was moved from WT:WikiProject Disambiguation, and is now linked from there. It concerns methods of dealing with the possibility of short pages (such as disambiguation pages) being exploited to make very publicly visible attacks.

(Also appended is a second discussion moved from WT:WikiProject Disambiguation, about what should be the name of this subpage, moved here after renaming of this subpage.)

Looking for input on Boke (Disambiguation)

I stumbled on this page while looking for...at this point, I don't remember how I got there. Probably I'm still in a bit of shock from seeing this disambiguation page. Now, I don't pay attention to what the current state of policies, guidelines, or Manual of Style dictates are at any given time (for reasons I won't go into here), but this extensive list of possible meanings for "boke" & its unusual formatting either (a) conforms to standards so new it shouldn't be a surprise that I don't know about them, or (b) is evidence that someone has gone overboard in trying to create the first Featured Disambiguation Article. Since you guys are the experts on this sort of thing (me, I just start the pages with a few entries, & add more entries to them), I'd like to hear your thoughts. Comments please. -- llywrch (talk) 21:48, 27 April 2009 (UTC)

Erm, well, the content, structure and organization of that page [1] is rather idiosyncratic, to say the least. I'll take a whack, although I'm sure I won't get through it all. olderwiser 21:58, 27 April 2009 (UTC)
My point wasn't to nudge anyone to do a rewrite. It was just to get comments from knowledgeable folks. A lot of work has been expended there, & it'd be a shame if it all led to Yet Another Editwar. -- llywrch (talk) 22:15, 27 April 2009 (UTC)
Oy vey. I think I'll let this sit for the judgment of collective wisdom. olderwiser 22:56, 27 April 2009 (UTC)
Let us be clear: there is no edit war here. Strongly suggest Llywrch refrain from leaping to mis-characterizations. The unusual nature of the page is obvious. What is NOT obvious is why the page is like that. THAT will be illuminated below.
I assume the likely result is that the "content" will end up being moved to another platform. HOWEVER the likely result may not be the best result. We shall see.
IN ANY CASE, the particular issues which "inspired" the current version of the page ... are useful to examine. On my user page you may spot the words "Disambiguation Exception Analysis." Hold that thought, while I prepare some more notes in a sandbox before copying them to this page. Proofreader77 (talk) 23:20, 27 April 2009 (UTC)
Proofreader77, are you absolutely sure you want to approach it this way? Why spend "100 hours" (I assume that's intended sarcastically, but I'm sure it instilled dread in the hearts of those reading this thread) on a "presentation" that runs the risk of being ignored or of being unnecessary, when a few well-chosen sentences might do? I'm not sure you even understand what it is people are questioning about this page. I'd suggest going with the short version first, and expand only if necessary. Surely you can describe an outline of what you're doing in a short paragraph in simple English?
If you really want to do it this way, knock yourself out, I suppose. But it would be a shame for you to make a long presentation, only to have people tell you at the end "we already knew that", or "that's not what we were asking about", or "this has already been discussed before, at WT:FOO/Archive 1" or something. --Floquenbeam (talk) 23:29, 27 April 2009 (UTC)
100 hours of my time is to make the case well-designed, easy-to-skim, not tiring on the attention. :) The short answer requires some common terminology. Otherwise we're discussing different things. And surely you'd enjoy it more if it rhymed, yes. ;) Proofreader77 (talk) 23:34, 27 April 2009 (UTC)
Well, OK, I've yet to find a way to force people to follow my brilliant advice against their will, so do what you think best. And I didn't know there'd be rhyming, I do like rhyming. If it's all the same to you, I prefer iambic pentameter, but that's your call. --Floquenbeam (talk) 23:48, 27 April 2009 (UTC)
Iambic pentameter: Is there any other kind? :) Excellent. Proofreader77 (talk) 00:09, 28 April 2009 (UTC)
N.B.: iambic pentameter has nothing to do with rhyme, but rather defines the rhythm and length of a line. And that's all I'm prepared to contribute to this conversation.--ShelfSkewed Talk 03:49, 28 April 2009 (UTC)
LOL Yes, you're quite right; 'twas making a lazy connection/assertion of sonnets. Surely that should tempt you to join us in the discovery of truth and beauty at the end of the universe? Oops, wrong page. Disambiguation page R&D? :) Proofreader77 (talk) 04:29, 28 April 2009 (UTC)
PS Boke talk: I put a clear note on that talk page to contact me about the configuration. But no contact, and it has already been moved from Boke (disambiguation) to Boke ... when there was a reason for the earlier move, which no one knows because I was not contacted. yada yada yada :) Proofreader77 (talk) 23:41, 27 April 2009 (UTC)

My doing

My doing: Dealing with a problem of disambiguation SEO vandalism (and also the Urban dictionary issue) on a Google #1 result (at the time) page.
QUICK NOTES for now (more later):
  • RE SEO VANDALISM (on a top Google result) page. A disambiguation page can be used for sticking an attack in a Google snippet of a #1 result. But more later.
  • The design has successfully deflected the earlier problem, but a bot could be designed for that. (Need to do bot research.)
  • Information is useful and should probably be moved to somewhere else eventually (perhaps a case where, e.g., knol would be a good solution).
-- Proofreader77 (talk) 22:03, 27 April 2009 (UTC)
Unclear what you mean by SEO vandalism. At a glance, most of the content is inappropriate for a disambiguation page. Some might be appropriate for Wiktionary. olderwiser 22:16, 27 April 2009 (UTC)
"Unclear" because it is a situation that is not commonly dealt with, yet. Will give detailed illumination below (in a bit). Proofreader77 (talk) 22:23, 27 April 2009 (UTC)
QUESTION: Are you familiar with "SEO"? Proofreader77 (talk) 22:25, 27 April 2009 (UTC)
Instead of arguing with another editor about definition of words, why don't you supply a diff or two illustrating the vandalism you are worried about. It may be that there is a much simpler way to handle the problem -- & it may be a solution we can use in similar instances. -- llywrch (talk) 22:32, 27 April 2009 (UTC)
I am not arguing. I am asking editors to discuss before taking preemptive action. Proofreader77 (talk) 22:33, 27 April 2009 (UTC)
COMMENT: Mis-characterization of interactions is not helpful to the process. Proofreader77 (talk) 22:53, 27 April 2009 (UTC)
First, let me repeat the question. "Are you familiar with SEO?" (That is not an argument.) Proofreader77 (talk) 22:47, 27 April 2009 (UTC)
If the question is directed at me (which seems a little odd as this is a general talk page unassociated with the specific topic), then no, SEO means nothing in particular to me. olderwiser 22:52, 27 April 2009 (UTC)
Fine. I will proceed from that. Proofreader77 (talk) 22:54, 27 April 2009 (UTC)
PS Why here rather than on Boke talk? Something to do with WP:BEANS. :) Or rather, don't tell the people who are doing SEO vandalism about what you're doing about it. Hence here, rather than there. Proofreader77 (talk) 23:04, 27 April 2009 (UTC)
I posted a request for comments/input here because more people would read a posting here than on that page's disambiguation page. And also because I've been wrong about changes to Wikipedia format in the past, & was looking for a sanity check. You've shown some good ideas on that page, Proofreader but (IMHO) some very bad ones too, which needlessly complicate the page. And, for the record, my comment about "writing the first disambiguation page to be a Featured Article" was not sarcasm -- I believe your work on this page raises the question if a disambig page honestly could become a Featured Article, regardless if your approach was the right one. -- llywrch (talk) 16:15, 28 April 2009 (UTC)
Thank you for your kind words and honest criticism. And let me note that if someone was going to instigate "all this," I'm delighted it's you. :)
Keep one thing in mind ... "all the complexity" was intentional to prevent the "SEO Vandalism" I mention elsewhere on the page. When the page is stripped down, the vulnerability returns. See comments on Talk:Boke about simply reverting to what the page was BEFORE the "complexity" —BUT only after a BOT has been designed to handle the problem that I solved with complexity.
NOTE: There is a second issue Urban Dictionary Overflow, which is handled down in the "WORD" section, but also by the complexity.
FINALLY :) I'll conclude by noting that what came out of solving the SEO Vandalism is good content I'm perfectly happy to put somewhere else (noting that the Wikipedia entry will no longer be worthy of Google #1, but it will remain so).
PS (LOL) BUT remember the SEO problem MUST be solved with a bot if the page is stripped down. As for the Urban Dictionary problem, that's back, but less a big deal. I.E., Cleaning the page resurrects two problems that have vanished.
--Proofreader77 (talk) 16:29, 28 April 2009 (UTC)

The atypical issues of the page

Note that this will take some time to illuminate. I am prepared to invest 100 hours or so in such illumination, but not crammed all into the next minute.:) Relax a bit while I organize the presentation.
(much more to come)
-- Proofreader77 (talk) 22:42, 27 April 2009 (UTC)

SOME TERMINOLOGY/CONCEPTS:

  • SEO (Search Engine Optimization) — Designing web pages (and taking other actions) so that they appear in the top search engine results.
  • Google snippet — A sampling from the page included beneath its entry in the search engine results.
    • NOTE: When information selected from a page becomes a snippet, it will not change in the results until the page is revisited by Googlebot and the "cache" updated. This may take a couple of weeks or more. SO: Vandalism reverted in Wikipedia may persist in the Google cache for some time.
  • Wikipedia Disambiguation SEO Vandalism (DSEOV, my phrase)— Inserting (sometimes only arranging) information on a disambiguation page .... so that it will appear in a Google snippet ... where the (often) high Google rank of Wikipedia disambiguation pages will place the "message" at the top of search results ... and will remain there for some time even after the Wikipedia page has been changed.
  • AN (ACTUAL) EXAMPLE: (not the kind on Boke, but begins to illustrate the concept)
Body Odor, Breath Odor, Barack Obama

This is the Google snippet someone attempted to "engineer" by:

(1) Breaking "B.O." off as a separate disambiguation page from BO, then
(2) putting those three entries at the top of the page, in that order. Then ...
(3) proceeding to argue that what they had done was completely legitimate. (I disagreed, and prevailed. :)
NOTE: Putting Barack Obama on a "B.O" disambiguation page is the kind of thing that happens all the time on disambiguation pages —the initialism kind of partial match which is not usually treated as "vandalism," but as a misunderstanding of disambiguation pages. (JFK is the special case which confuses many people and the issue.)
  • THE Boke CASE ... is a different type— an example of adding a large attack text to a small disambiguation page ... in which case, Google's algorithm treats the new large addition as what's important, and therefore the lede/lead of the attack text becomes a Google snippet ... which persists at the top of Google results after the attack/vandalism has been removed.
  • Urban Dictionary Overflow (UDO) ... which is the product of Frustrated UrbanDictionarian Syndrome (FUDS) — It can take several months (if ever) for an Urban Dictionary submission to appear these days.
That's no fun. SO, if there's a top-of-the-results Wikipedia entry (or two) for the word, why not stick it in there, where it appears immediately?
NOTE: A small, simply-formatted disambiguation page is very attractive to those with FUDS. Whereas a large, complexly-formatted disambiguation page is much more resistant to UDO.
  • Google #1 (For Wikipedia, often #1 and #2 indented)— The top search result.
NOTE: Until a week ago Boke disambiguation was #1. Now #2 under Urban Dictionary (which is rising recently, despite the fact it takes months to get entries in, but that's another matter).
IMPORTANT POINT: Not all disambiguation pages are Google #1 (or 2). The ones that are, are the ones most subjected to the problems of DSEOV and UDO.
  • Disambiguation BLP attack on a non-notable without a biography (D-BANN? figure out your own acronym:) — Under the right circumstances, someone without an article about them, can be attacked "in Wikipedia" by insertion of attack text on a Google #1 disambiguation page. And that attack can remain in a Google snippet for a week or two, or more.
  • Global words (concept) — It is clear why Wikipedia asserts it is not a dictionary. (Skipping long discussion) But of course, there are many exceptions that have grown into articles (of a sort) here.
Boke is an especially complex case. For example, bokeh, which is often spelled "boke," is anglicized Japanese, which has become part of English usage. A kind of comedy team in Japan is anglicized as boke, and that is used in English. Same goes for Mongolian wrestling, with one simplified spelling being "boke." Yada yada yada. Then you have the Urban Dictionary issues (including various spellings) ... Yada yada yada.
WHAT ABOUT THAT? Well, consider if the Wikipedia entry is Google #1: then that page is going to get all kinds of attention. (Among which see FUDS and UDS :). Yet we normally do not think of Wikipedia disambiguation pages as having any special "duty" to be "worthy" of being Google #1.
TO PONDER: If we cleaned this page completely to correspond to the disambiguation guidelines, does that improve Wikipedia, or diminish it? Just laying that "big picture" thought on the table. Not that it is persuasive. And perhaps this page is something else altogether. Perhaps something, with adjustment, that can be allowed. An exception that might be the forerunner of a particular category of page that does not formally exist, yet. Just perhaps.

(to be continued)

-- Proofreader77 (talk) 23:00, 27 April 2009 (UTC)
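
Illustrative note on the bot idea raised in this thread: the Boke case described above is a large block of text added to a small page in a single edit. A rough sketch of how such an edit might be flagged for human review follows. This is only a sketch under stated assumptions: the function name `is_suspicious_growth` and both thresholds are hypothetical, chosen for illustration, and it does not describe any existing Wikipedia bot or API.

```python
def is_suspicious_growth(old_text: str, new_text: str,
                         small_page_limit: int = 2000,
                         growth_ratio: float = 0.5) -> bool:
    """Flag an edit that adds a large block of text to a short page.

    Hypothetical heuristic: both thresholds are assumptions, not values
    taken from any real anti-vandalism tool.
    """
    old_len = len(old_text)
    added = len(new_text) - old_len
    # Ignore brand-new pages and pages that are already long.
    if old_len == 0 or old_len > small_page_limit:
        return False
    # Flag when the added text is at least growth_ratio times the old size.
    return added / old_len >= growth_ratio


# Example: a 500-character page that gains 600 characters is flagged,
# while a 20-character tweak to the same page is not.
short_page = "a" * 500
print(is_suspicious_growth(short_page, short_page + "b" * 600))
print(is_suspicious_growth(short_page, short_page + "b" * 20))
```

A check like this would only shorten the time an insertion stays on the page; it says nothing about how long a search engine keeps an already-captured snippet.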

The sonnetized version (as promised - diversion and illumination)

Part I :)

(BDS.001) Diverting Urban Dictionary Overflow (DUDO:)

{BDS.001.01} _ THIS URBAN DICTIONARY does not work!
{BDS.001.02} _ I wrote my cool submission months ago.
{BDS.001.03} _ Where is it now? It's vanished in the murk
{BDS.001.04} _ of Urban Dict's quite untransparent flow.

{BDS.001.05} _ BUT, AH, HERE'S WIKIPEDIA. Ah hah!
{BDS.001.06} _ I'll put my definition in that place.
{BDS.001.07} _ And no lame Urban Dictionary ma
{BDS.001.08} _ will send it off somewhere to wash its face.

{BDS.001.09} _ My proud, unclean submission not held back
{BDS.001.10} _ from public view in Urban Dict's lost bowels.
{BDS.001.11} _ One save in Wikipedia's the knack
{BDS.001.12} _ that every child can learn. We don't need towels ...

{BDS.001.13} _ BUT WHAT THE F****? This Boke page is too hard.
{BDS.001.14} _ Forget this sh*t — I'll go play in the yard.

 

(BDS.002) Puke! Not? WTF?!

{BDS.002.01} _ I KNOW WHAT "BOKE" MEANS, thinks the Irish lad.
{BDS.002.02} _ His friends say it means "vomit." They can't spell.
{BDS.002.03} _ The TV Scotsman's script spells "boak" a tad
{BDS.002.04} _ bit diff'rently. On TV you can't tell.

{BDS.002.05} _ BUT OFF TO WIKIPEDIA he runs
{BDS.002.06} _ to tell the world of knowledge he is sure.
{BDS.002.07} _ His mind immersed in vomit and its puns.
{BDS.002.08} _ YET what awaits him there's a kind of cure.

{BDS.002.09} _ A HELLISH HOLE OF BOKES beyond all count.
{BDS.002.10} _ What are those bokes a doin' on this page?
{BDS.002.11} _ Don't they know it means puke? He must dismount
{BDS.002.12} _ the pony of his surety — engage . . .

{BDS.002.13} _ . . . the world of bokes beyond the local pub.
{BDS.002.14} _ And maybe learn to spell — aye, that's the rub.

(BDS.003) Where We Turn

{BDS.003.01} _ THE GOOGLE ALGORITHM POINTS to here
{BDS.003.02} _ if you are looking for where "boke" might be.
{BDS.003.03} _ Or what or who or when ... But thinks it queer
{BDS.003.04} _ that you have asked for "boke." (Yes, look and see.)

{BDS.003.05} _ I bet you mean "bo-keh." Or else the crap
{BDS.003.06} _ in Urban Dictionary where we thumb
{BDS.003.07} _ the turds dropped up or down, but rarely map
{BDS.003.08} _ a meaning for ourselves. Can't fix what's dumb.

{BDS.003.09} _ BUT MOST KNOW Wikipedia is where—
{BDS.003.10} _ if there's an answer— someone's spent the time
{BDS.003.11} _ to tell all there's to tell. They do not care
{BDS.003.12} _ if dozens fought to make it true. No crime.

{BDS.003.13} _ The ones who care are busy fighting more.
{BDS.003.14} _ And maybe one makes sonnets of their lore. :)

(to be continued) -- Proofreader77 (talk) 08:23, 28 April 2009 (UTC)

(Housekeeping note)

Rather than continuing to edit this page incrementally, I will prepare a single-edit insertion in a sandbox. (to be continued) Proofreader77 (talk) 23:13, 27 April 2009 (UTC)

Changed my mind, with gentle prodding. :) Now for the sonnetized orchestration of the fragments above ... (You think I'm kidding, don't you? :) But not tonight. Proofreader77 (talk) 03:44, 28 April 2009 (UTC)

Please explain

Proofreader77: All the above seems to relate to issues of what text appears in the Google sample of the dab page. None of it seems to explain the decisions to use many blue links in entries, non-standard format of section headings, and "small" font for an eye-achingly large proportion of the dab page. Are those explanations on their way? PamD (talk) 06:58, 28 April 2009 (UTC)

Tonight, let me begin with your second information (see below). All issues will be addressed in due course.
-- Proofreader77 (talk) 08:53, 28 April 2009 (UTC)
For info: I've found the text which is presumably what's being talked about - added in Feb 2008 by first an IP then a named user, and removed after a few days (not immediately - can't have been on many people's watchlist). It was removed each time, but neither editor got any message on their talk page. PamD (talk) 07:05, 28 April 2009 (UTC)
Preliminary response re content inserted: Some guiding questions ...
  • What do you think the Google snippet for Boke was after the insertion of the large chunk of text into the small disambiguation page? (See TERMINOLOGY)
(NOTE: The text was inserted 3 times: twice with an account, and once by the same user as an IP, probably accidentally.)
(NOTE: We can discuss the specific content, although the Google snippet is the key issue.)
  • How long did the Google snippet of the removed text remain in Google results after it was deleted from Wikipedia? (2 weeks)
  • How often would someone theoretically have to check Boke to guarantee that the Googlebot would not find that text again?
  • What might be done OTHER THAN checking often enough to satisfy the above?
  • Note, that is the primary inspiration for what the page looks like, but not the only things in play. First, see the sonnets. :)
Ponder that for now. More to come. :)
--Proofreader77 (talk) 08:53, 28 April 2009 (UTC)
Yes, the poetry and so on is all very impressive, but is anyone doing anything about tidying this dab page up? I've just made an attempt with the first few groups of entries, but can't spend any more time now - anyone want to take over?--Kotniski (talk) 08:39, 28 April 2009 (UTC)
Cease and desist. The page is under discussion. What is your rush? Proofreader77 (talk)
Actually I think Kotniski's edit improved readability of the dab page. It provides a good means to discuss what you are doing, we can compare pre- vs. post-that-edit. Proofreader, you seem to want others to "ponder" something about Google's caching of the Wikipedia page, but I don't see what the relevance of that is to the question of how to revise the Boke disambiguation page in Wikipedia, for Wikipedia readers. In particular you suggest that the "Google snippet is the key issue", but I don't see any reason why that should be discussed here. It looks to me like the issues about disambiguation page formatting you must want to raise here are:
1. use of "small" font differently than is usual practice
2. inclusion of bluelinks differently than is usual practice
3. inclusion of items that are not wikipedia articles (and are not likely to become wikipedia articles), which is not usual practice
Does this characterization of three points capture everything? Perhaps you could give your reasons for these three differences, for the Boke page or in general, and identify any other differences that you feel are important. Also I am not clear on whether you regard the Boke disambiguation page to be different in some way from all others, or whether these three formatting/inclusion patterns are, in effect, general proposals to change all disambiguation pages. doncram (talk) 09:25, 28 April 2009 (UTC)
Rushing to change the format that is under discussion is outrageous. Clearly the issues are being raised in detail.
Outrageous editing behavior by an administrator is now the issue.
--Proofreader77 (talk) 09:34, 28 April 2009 (UTC)

FORMAL REQUEST that formatting not be changed until it is discussed

I have reverted Kotniski's rush to edit while the discussion is underway. The whys cannot be discussed if there is nothing to look at.
-- Proofreader77 (talk) 09:01, 28 April 2009 (UTC)

FORMAL COMPLAINT re administrator User:Bkonrad and rollbacker User:Kotniski

Careful and diligent examination of why Boke was formatted that way is in progress.

Changing the page while it is under discussion is poor form, and certainly beyond the pale for an administrator.

There is no legitimate reason to rush the reformatting of a page under discussion. Continuing shall result in a DR.

NOTE: A DR not for a content dispute — but for outrageous editing behavior by an administrator and a rollbacker.

Dismissive and contemptuous of the community and of process.
--Proofreader77 (talk) 09:27, 28 April 2009 (UTC)

I've no idea what you mean. I started tidying up a dab page like we do every day on this project. If you want to discuss something about it, then please say specifically - in normal clear language - what it is you object to and why. Then we can move in the right direction. But there isn't any doubt that the dab page in its current format is totally at odds with the guidelines and accepted standards.--Kotniski (talk) 09:33, 28 April 2009 (UTC)
Changing the format while the whys of that format are under discussion is outrageous editing misconduct.
Proofreader77 (talk) 09:37, 28 April 2009 (UTC)
FORMAL REQUEST: Restore the page to the state it was in when discussion began.
-- Proofreader77 (talk) 09:39, 28 April 2009 (UTC)
Proofreader77, perhaps you are not aware that we can see the page in your version of 19 March forever; it does not need to be restored as the current version in order to be discussed. You could just explain here in what ways you think that is better than this Kotniski-edited version. doncram (talk) 09:46, 28 April 2009 (UTC)
I direct your attention to Bkonrad's [diff] with this edit summary:
That insulting and dismissive response to the careful and methodical laying the groundwork for the discussion on this page is outrageous.
Revert it. The edit summary alone requires that bow to civility and process.
Proofreader77 (talk) 09:54, 28 April 2009 (UTC)
I don't think you've quite grasped how we do civility and process at WP. If you find yourself in a minority of one, then you're probably better off trying to convince people with clearly presented arguments, not go shouting everywhere with complaints and demands. I will be ignoring these from now on, but will be happy to listen to any suggestions you may have if they are specific and understandable.--Kotniski (talk) 10:19, 28 April 2009 (UTC)
See the above. All of it. Especially regarding laying the groundwork for discussion of the exceptional nature of that disambiguation page. Your mischaracterization of what has happened so far, is noted for the record. But mostly it demonstrates you have ignored what has been asserted so far, or are pretending to (a common, but ineffective rhetorical strategy).
You have been requested to revert your editing of Boke to the state it was when this discussion began.
And let us specifically note your dismissively commenting on this page thusly dif with the edit summary "back to reality":


Preemptive action, dismissing all discussion with the wave of a hand.
You do not wish to investigate the issues, fine. But do not edit the page while the discussion is in progress.
It would be in your best interest to revert the page. And then read the discussion above—there are interesting things to note, in any case.
-- Proofreader77 (talk) 10:45, 28 April 2009 (UTC)

META COMMENT: The behavior of User:Bkonrad and rollbacker User:Kotniski is an example of damaging patterns of interaction

If you read what I said much earlier, I expected that after discussion, the most likely result would be a complete cleaning of the disambiguation page. HOWEVER, there was a possibility that something else might come of the discussion.

Ignoring all the issues raised (the whys of the unusual formatting) in order to pointlessly rush to conform the page to a format anyone could apply at any time is, let me be clear, vile.

Let me also be clear, no one should have the authority of the administrator bit who would act in this matter.

Let all the above be noted for the record.
Proofreader77 (talk) 11:01, 28 April 2009 (UTC)

Proofreader77: I am myself not that fond of the current disambig guidelines, I too prefer to have more information on the disambig pages. However, you are going about this in the wrong way. It seems there are two issues here:
1: You want to use another style on that page than what is the current standard. In such cases we sometimes boldly try it, like you did. But if we get reverted and other users say they don't like it, then we don't just change back to our non-standard version. Instead you should copy your version to a test page somewhere so you can work on it and show it to people and discuss it. Since Boke is in the article space then the normal place to put the test page is under your own user page. So I suggest you create the page User:Proofreader77/Boke, or perhaps User:Proofreader77/Boke (disambiguation), and put your version there. Then you can work on it there. And then you can come back here and link to it and explain why you want the page to look like that. Then we can discuss that page here.
2: You seem to be worried about some specific kinds of vandalism of the boke page. (I had never heard about such "Google spamming" before, but I understand how it works and that some people might be motivated to do that. Interesting kind of vandalism.) It seems you want to obfuscate the page (make it complicated) to scare off vandals. But that is not an approach we use here at Wikipedia. Instead, if that page has a problem with vandalism, then you can ask us to semi-protect the page to make it harder to vandalise that page.
--David Göthberg (talk) 11:04, 28 April 2009 (UTC)
As I said, I assumed that the current state of the page would probably have to be moved to another platform (perhaps knol).

But the page was that way for a reason largely having to do with preventing exploitation of Google's algorithm to plant persistent attack text in the Google cache by abuse of Wikipedia.

Refusal to wait while I clarified this issue, which is clearly being outlined under TERMINOLOGY — is an act contrary to any sense and civility.

There is no excuse for that behavior. Again, behavior. Not content.
And while an "ordinary" editor is granted more slack, an administrator has a higher duty than to rush to conform to guidelines when special circumstances are being claimed.
The insulting dismissal of the discussion of those concerns is behavior unbecoming an administrator.
There was no need to make this an ugly mess. An administrator didn't have to make that choice. He didn't have to insultingly dismiss legitimate concerns—being raised with great care and detail (And an attempt to lighten the procedures with verse:) But he/she did.
Let it be noted. To be continued in the appropriate venue: not for content disputes, but for behavior. Proofreader77 (talk) 11:30, 28 April 2009 (UTC)
Administrators (and editors in general) can deal with special circumstances when they are identified, not when they are claimed. "Lightening" with verse just puts an extra burden (small or large) on the reader (admin or not) to parse the meaning. Discussion of Boke should take place at Talk:Boke. If there is some reason why it can't be discussed there, that reason needs to be explained so that other editors can understand and possibly agree, not simply claimed. -- JHunterJ (talk) 11:54, 28 April 2009 (UTC)
QUOTING FROM ABOVE:
Yep, that's the claim that needs to be explained. Especially since "there" you put a link to "here", where these same hypothetical SEO vandals could click through and read it, if doing so would yield some additional benefit to them (which I still don't see how). -- JHunterJ (talk) 12:12, 28 April 2009 (UTC)
The link to here from that disambiguation talk page would not have been necessary if the rush to ignore this discussion had not demanded a note there. I adapt to the absurdities that present themselves.
As for "that's the claim that needs to be explained", the explanation is actually already covered briefly in the TERMINOLOGY section. So start there. That is the framework for the discussion interrupted by all the frantic rushing to ignore the discussion.
Proofreader77 (talk) 13:02, 28 April 2009 (UTC)

Proofreader77, I think you'd better take a deep breath & consider your situation here: many of the people commenting on your version of this page have problems with it. The exceptions to this majority are doncram, myself (in part), & maybe David Göthberg (if I understand the implications of his dislike for the guidelines correctly). When that many people say your edits are wrong, perhaps you should consider whether they are, in fact, wrong. Arguing over the behavior & intent of other editors who disagree with you is only making you look bad. -- llywrch (talk) 16:23, 28 April 2009 (UTC)

Proofreader77, I second llywrch's comment above, especially the "take a deep breath" part. If you want to suggest that Boke requires some special treatment unlike other disambiguation pages, then you ought to stick to explaining the reasons for that special treatment and not attacking the editors who disagree with you, rhetorically analyzing their comments, or otherwise shooting the messenger.
As for Boke, can you provide a specific example of SEO vandalism (in the form of a link to an old revision of the page)? I have tried to do so myself, unsuccessfully. The entire history of the page consists of 228 revisions, but 180 of them were submitted by you, so that narrows down the range of possibilities considerably. In fact, I can find only eight or nine revisions between 8 June 2008 and 27 April 2009 (yesterday) that were submitted by anyone other than you, and none of them looks to me like SEO vandalism. There was one instance of apparent spam, which you quite properly reverted, but that's it. Without specific examples of repeated improper edits, I would have to conclude that you are trying to fix a problem that does not actually exist. --R'n'B (call me Russ) 17:21, 28 April 2009 (UTC)

Rhetorical questions for Llywrch and R'n'B, while awaiting archive-link question :)

  1. Are you calling me a liar? :)
  2. How stupid do you think I am? See #1.
  3. What is the difference between a content dispute and a behavior dispute, um, in terms of WP jurisdiction? :)

When I get the answer about the links below, I'll respond with links ... OR with slightly edited text. Fair enough?
--Proofreader77 (talk) 18:49, 28 April 2009 (UTC)

No one called you a liar or stupid, as far as I see. You seem unfamiliar with Wikipedia process, but in my view you've been treated generally very kindly, with multiple editors trying to understand what your points are on the content or formatting or SEO vandalism topics. In my view, you seem to be rather withholding of your answers/responses to polite questions on these topics. I (and I think most others here) am less interested in the topics of administrative or other behavior which you are trying to raise. Your raising questions as to whether there has been a behavior breach is okay, as long as you keep your questions as civil questions, and don't go further into uncivil accusations. As you are a relatively inexperienced Wikipedia user, I think you should defer to others a bit on whether there have in fact been any serious behavior issues worth discussing much more. I see no others agreeing that there has been any serious misbehavior by anyone else. In my view, there were one or two edit summaries that were a bit negative in tone, but nothing "beyond the pale", and you have been free to state your objection to tone in an edit summary and move on. You are just not getting any agreement that there is any problem with others' behavior. You are free to discuss the content, formatting, and SEO vandalism questions. After a while, if you don't communicate on those matters, I am less inclined to participate further and try to help clarify what your views on those matters are. Overall I think you have gotten an extraordinary amount of polite attention.
About your specific "deal" posed, to answer on some matters if you get an answer on an unrelated question, that seems inappropriate, although I tried to give you a response, anyhow. doncram (talk) 19:19, 28 April 2009 (UTC)Reply
re: the technical question's connection: Can you imagine that if a link on this page could be followed into an archive by Googlebot, I would (wisely) refrain from placing such a link to information I have spent a great deal of energy keeping out of Google search?

NOTE: With that question answered, I am preparing the diffs and will post them when formatted. (I have read and take under advisement your other comments. Thank you for your careful attention so far.)
--Proofreader77 (talk) 19:35, 28 April 2009 (UTC)

TECH/WP QUESTION re linking to archive versions (diffs) and search engines

  • If a link to an archive version is put on this page, does that expose the archive copy to search engine archiving (e.g., by Googlebot)?
    -- Proofreader77 (talk) 18:30, 28 April 2009 (UTC)
I think it does not. Google is pretty smart about its inclusion of Wikipedia articles: it seems to pick up newly created articles immediately. I never encounter old versions of articles. It is very easy for Google and other search engines to know from the nature of the URL that a link is an old version, not the current version, of a Wikipedia article. doncram (talk) 19:06, 28 April 2009 (UTC)
Sorry to have troubled you, I now see the answer—Wikipedia archives are meta-tagged noindex, nofollow (that covers it).
--Proofreader77 (talk)
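
As a rough illustration of the meta-tag point above, the sketch below reads the robots directives out of a page's HTML, which is one way to check whether a given old revision really is marked noindex, nofollow. The class and function names are hypothetical, and the sample HTML is made up for the example rather than copied from a real Wikipedia page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Only robots-style meta tags are of interest here.
        if (attr.get("name") or "").lower() not in {"robots", "googlebot"}:
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html_text):
    """Return the set of robots directives found in html_text (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Made-up sample standing in for the HTML of an old revision.
sample = '<html><head><meta name="robots" content="noindex,nofollow"></head><body></body></html>'
print(sorted(robots_directives(sample)))
```

If both "noindex" and "nofollow" come back for an old-revision URL, a link to that revision from this page should not cause the revision to be indexed by crawlers that honor those directives.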

(unindent) Browsing in the past history of the Boke page, I see one vandal edit that is probably one of the 3 vandal attacks mentioned by Proofreader here or on the Talk page of the article. It has an edit label about a "tall tale", and does seem to have been very mean and malicious, in clear violation of multiple Wikipedia policies. The edit I saw was reverted one minute later, in favor of returning the page to a version edited by a user whose username includes the word "Boke", but I assume the malicious text persisted in Google for some time and was legitimately upsetting. I project there is some person nicknamed or otherwise known as "Boke" or some variation in real life, and it is this person that Proofreader wants to protect. I hope I am not describing too much to make it hurtful to Proofreader, who seems to be trying to avoid giving publicity to the specific past vandalism. Sorry if I am calling too much attention to it now. Seeing this, anyhow, I think there are not true general issues present, but rather the main thing going on is that one editor is legitimately concerned about malicious vandalism in this one article. I would support use of Wikipedia's "oversight" edit deletion capability to remove the malicious material from all 3 edits out of the page history entirely. Also, I'll personally put the Boke article onto my watchlist and therefore be one person who might notice and be able to revert a future malicious edit. Perhaps the page could be put on permanent semi-protection, although I am not familiar with what level of attacks justify that (it may not be consistent with Wikipedia protection practice). Is there anything else which can be done to help prevent vandalism from hitting that page in particular, for it to be removed promptly if it occurs, and/or from it being picked up in Google and other search engines? Here or on the Talk page, Proofreader asks whether a bot can watch the page and detect vandalism.
Could someone else explain about how the bots or other systems do work to remove obvious vandalism, and/or put in a report somewhere towards customising one of these systems to protect against the type of past vandalism on that page in particular? doncram (talk) 19:57, 28 April 2009 (UTC)Reply

Thank you, Doncram. I very much appreciate your careful phrasing of the issues. But it seems that so many people want to see the diffs, let's go ahead and display them. Proofreader77 (talk) 20:55, 28 April 2009 (UTC)Reply
Note re oversight (and possible deletion of diffs): Because the editor used an IP once, it was possible to identify who the person is. In the future it may be important to be able to document this should a legal action against that person be required to restrain them. So let it not be deleted via oversight (if possible).
Proofreader77 (talk) 21:04, 28 April 2009 (UTC)

The diffs of the 3 occurrences of (attack-snippet) SEO Vandalism on Boke

22 FEB 08

#1 REVERTED IMMEDIATELY (10:14, 22 February 2008) BY RC Patroller User:Ravichandar84
#2 REVERTED IMMEDIATELY (10:23, 22 February 2008) BY RC Patroller User:Ravichandar84
NOTE: Ravichandar84 moves Boke to Boke (Disambiguation), inspired by the above and by the boke=puke slang entry (which he also removed). THE POINT: With a page clearly labeled as a disambiguation page, it is easier to point out to new editors that it is a disambiguation page and that special rules apply, etc.

25 FEB 08 (3 days later)

  • (3) (two edits: insert text, change one word)
(a) Revision as of 03:17, 25 February 2008
(b) Revision as of 03:17, 25 February 200
by User:Wkt37211
(no edit summary)

[Undeniably insulting edit redacted. -- llywrch (talk) 06:10, 29 April 2009 (UTC)]


MARCH 2 (5 days later) #3 REVERTED BY Proofreader77 (in two steps)


APPROX MARCH 12 (10 days after that) Google snippet disappears from Google results.


PROBLEM: How to prevent a repeat?

  • A. Check Boke entry every 3 hours (24-hours a day) Note: bad solution

THEN, AH HA!

  • B. Add ballast to Boke ... SO that insertion of attack text will NOT appear to be what "the article" is.
    NOTE: The < small > text was ballast for the Google algorithm and THEREFORE reduced in size to indicate its lesser importance. It was carefully researched to be "fitting" to the page, if not the guidelines.

COMMENT: We are now back to "solution 1" due to the outrageous behavior of the dab-cleaners who refused to delay a dab-cleaning in contempt of the requests made of them in good faith and for good reason.

Let us pause there for the moment. -- Proofreader77 (talk) 20:55, 28 April 2009 (UTC)

NOTE: How does this identify a specific person?

Due to changes in Google's algorithm it is now mid-page, with Wikipedia and Urban dictionary up top, but in February of last year www.boke.com was directly above or below the Wikipedia entry ... and the Google snippet you see displayed above.

BOTTOM LINE: The person can be identified with the right searches based on the content within the whole SEO vandalism entry as well. The "cover language" of "tall tale" is an attempt at "covering" the attack.
-- Proofreader77 (talk) 21:17, 28 April 2009 (UTC)

NOTE: How did the "special formatting" and "extra content" help remedy this situation?

  • When a disambiguation page is small (as usual) ... the insertion of a LARGE block of text with sentences ... will be perceived to be what "the article" is about ... AND Google will treat its lede/lead sentence as the source of the snippet.
  • However, if Google is seeing a large, complex page (ballast) ... the block of new text does not completely dominate what is there (and so the snippet is far less likely to be the first sentence of the new text).
  • ALSO NOTE: The elaborate formatting does have the effect of conveying to potential vandals that the page is not a good target. While they could delete the existing text and replace it with new, it appears to be the case that a well-designed complex page tends to discourage vandalism.
  • FURTHER NOTE: The problem of Urban Dictionary Overflow was apparently resolved by the combination of the complex format, AND the listing of the optional spellings and links to clarify the spelling issue ... but for whatever reason, the "boke=puke" entries have not been recurring.

-- Proofreader77 (talk) 21:31, 28 April 2009 (UTC)

Overall comments

I wrote this message some hours ago, but didn't get to save it since I was interrupted by a phone call. This message refers only to messages written further up this page, so some of it is probably redundant now.
llywrch: Clarification: I don't like Proofreader77's style. Small text is a bad thing, and he put in too much stuff on that page, making it hard to read. But I also don't like the current disambig guidelines, since I think they are too strict and don't allow us to put in information on disambig pages that I myself often want when I as a reader land on a disambig page.
Everyone: I think it is a good thing having this discussion on this project talk page, since this talk page has more users watching it. And as Proofreader77 has already explained, in this case there might even be reasons to avoid having this discussion on Talk:Boke.
Proofreader77: No, that is not how we work here at Wikipedia. We are allowed to revert anyone back to the style that is established practice and/or specified in the guidelines, even in mid discussion. When that happens then you (since you are the one that wants to deviate from the established practice and guidelines) have to demonstrate your suggestion on a test page instead. Remember that our pages are on-line, they are published and used the moment we save them. Deployed pages should not be used as test pages. New examples should be done on other pages in the background. Such as on a test page in your user space. So please, for the time being ignore any further discussion here and instead go make an example page in your user space, then write up a good explanation why you want it to look that way. Then come back here to show it and discuss it. (I see you have now written the explanation above. But an example page would still be good.)
And again, we have methods to deal with vandalism. If you think vandalism of that page is a problem, then ask us to semi-protect that page. That makes it harder for the vandals. If they are persistent enough to create several new accounts and do the work to make those accounts "auto-confirmed" then we can have a checkuser check the IP of those accounts and block those IPs. Thus that vandal can't create any more accounts. We also have a whole set of other tools that we can use to stop such vandals: Abuse filters, spam filters, bots that guard and revert, and so on.
--David Göthberg (talk) 21:54, 28 April 2009 (UTC)
Proofreader, thank you for posting the links to the vandal revisions. Now I understand better what you are talking about. If this were a persistent problem, then page protection or the like might indeed be necessary. And if a user tried doing this again after being warned (which apparently never happened in this case), blocking would certainly be an appropriate response. But, really, we are talking about three edits by one user (and one IP address) over a period of three or four days, 14 months ago. Since then, nothing. This is just run-of-the-mill vandalism like we see, unfortunately, five to ten times a minute across Wikipedia. The effect on the Google search results was certainly regrettable but probably not even intended.
Your "solution" to deter future vandalism is like using a steam-powered sledgehammer to swat a fly. I don't question your intentions at all -- I think your motivations are good, and you strike me as one of the most sincere users I have run across on Wikipedia (or else you are an extraordinarily talented faker :) ). But I learned a long time ago that "good faith" and "reasonableness" are two very different concepts. I fear that you have overreacted to a long-ago fleeting incident of vandalism, that is unlikely to be repeated even if the page were three words long. And if it were to be repeated, as David has explained, we have other ways of dealing with it. --R'n'B (call me Russ) 22:53, 28 April 2009 (UTC)Reply
Here's the deal: The person who did this is the relative of someone who I testified against in a criminal proceeding. Will it never cross their mind to do it again? I'm not famous, but I am known by my name/website to some extent. (Not notable.)
One day you get up and Google yourself (yes, checking:) ... and this appears directly under your website ... with the authority of Wikipedia. I repeat:

[Obviously insulting material commented out. -- llywrch (talk)]

How would you feel about that if it was you?
An enemy did it. They will not stop being an enemy. The whim of the enemy. See?
SO WHAT IS THE RESULT OF DISAMBIGUATION CLEANING?:
  • The page is less informative. (But no skin off my nose.:)
  • I must return to the burden of constantly checking the page again. (Skin off my nose, big time!)
  • Nothing to deflect the Urban Dictionary Overflow now. (Boke = puke.... Boke = puke .. )
  • This is not a reasonable result. It is not a good result.
It was a relatively pointless act, with costs that the cleaners do not have to pay; I do.
Until you put a bot on duty there for me. :) Proofreader77 (talk) 23:55, 28 April 2009 (UTC)Reply
But the page is more navigationally useful. Skin restored to many readers' noses. Your issue appears to be one for WP:AN/I (for the personal attack) and/or WP:RFO (to remove the earlier edits), not for the disambiguation project. -- JHunterJ (talk) 00:32, 29 April 2009 (UTC)
Information designers might disagree with you. (Of course I'd have to explain to even them the purpose of the small text "ballast" beneath the primary section) And, no on oversight (would prefer to keep them there for the record) —the problem I see is some patterns of behavior of dab-cleaning (which includes a tendency I've observed before, to be dismissive, which I believe you were alluding to). But suspect you know that in your heart of hearts. :) Cheers! Proofreader77 (talk) 01:42, 29 April 2009 (UTC)Reply
You are dismissive of the efforts and motives of the dab cleaners when you categorize them as "dismissive". You were apparently subject to an attack, and Wikipedia has long-standing, well-exercised pathways for addressing those attacks. You appear to have dismissed those pathways in favor of a new path of unclear effectiveness. -- JHunterJ (talk) 11:28, 29 April 2009 (UTC)Reply
re: dismissive of long-standing well-exercised pathways for addressing those attacks
My first two edits (see the earliest page of my contribs) were the two undos to revert the 3rd insertion of the attack. (I didn't know how to remove it in one edit, I just saw "undo"; sounded good.)
I.E. I knew nothing about Wikipedia.
My third edit was trying to get myself unblocked, because I had been collaterally blocked by the (I've heard gone wild somehow) Can'tSleepClownWillEatMe (something like that see line 33 here: Wikipedia:Former_administrators#Desysopped_by_ArbCom.2C_Jimbo_Wales_or_otherwise) who had range-blocked a whole lot of people (and was subsequently, much later, stripped of his admin bit, I believe I read somewhere) ... Ah, mysterious world.
After trying unsuccessfully to get myself unblocked with the template, and being told by PersianPoetGirl I could not be unblocked because, who knows, LOL (let's see, the reason given was checkuser) ... No one had figured out yet that CantSleepClownWillEatMe had gone wild, and undoing what was claimed to be a checkuser block was beyond the pale of something which I didn't know anything about ... other than I had found myself (collaterally) blocked after having done nothing other than reverting the attack.
Meanwhile I was asking through whatever channel I could figure out (My talk page, and perhaps email) about whether the attack could be deleted from the record (that's what I thought at the time, have changed my mind, re the record, should the person require legal restraint someday), and being told that it didn't really look like it needed removed from the record, but it could be done by the right person.
NOTE: Not a very satisfying encounter with the mysteries of Wikipedia ... and so decided I'd better figure out an answer myself.
AND NOTE: And NO, Wikipedia has no long-standing processes for dealing appropriately with attacks like this (other than oversighting the record away, or locking if the pattern of attack is repeating, blocking an ip—which can be bypassed with another ip), or else I would certainly know of it now. None of this would prevent the insertion of another attack into Google. Only my constant vigilance OR something else to be determined ... yes, which I found. Most of what I'm hearing is, "big deal, get over it, we're not going to do anything special about this." No, it's still up to me. That's why this is not wrapped up.
Re dab-cleaners
My current perception is that an ArbCom case may well be required to enjoin dab-cleaners for their behavior. THERE WAS NO REASON to proceed with a cleaning under those circumstances. The cleaning was abusive of process. Dismissive of reasonable concerns. Such patterns shall be stopped. By discussion, or command.
DEPOSITION QUESTION #1: What was your reason for rushing to clean the page in the face of "all this." (See all the above, the Boke talk page, and your talk page)
re: of unclear effectiveness
Of course it's unclear to you. You were too busy dab-cleaning to see anything clearly. You should have been discussing the matter with me (the one person who knows) as was requested on the Boke talk page, which all dab-cleaners ignored. The dab-cleaning of Boke under these circumstances was a willful and collective act of contempt. Unfortunately this is not a freak occurrence, but representative of dismissive (and abusive) patterns of behavior that have become acceptable among dab-cleaners. And shall now be addressed effectively.
Clear? :)
-- Proofreader77 (talk) 07:39, 30 April 2009 (UTC)

COI www.boke.com? (Preventing libelous attacks on an identifiable non-notable via a disambiguation page)

Proofreader77: Your last few edits just confirmed what I have been thinking for some hours now: this is about your own web site "www.boke.com" and your own nickname "boke". And the vandalism you are referring to was ages ago and amounted to just a few incidents. So you are doing lots of strange edits to a Wikipedia page to optimise what is shown next to your web site when doing Google searches. And you are wasting the time of a whole bunch of editors. And you are really rude towards some of the editors here. And you are demanding we set a bot to patrol "your" page? I think you just broke a whole bunch of Wikipedia policies and guidelines. And I think you have just seriously annoyed a whole bunch of Wikipedia editors and admins. You as an experienced Wikipedia editor should know better than that. I strongly urge you to stand back, and think about what you have just done.
--David Göthberg (talk) 01:59, 29 April 2009 (UTC)
I concur with David Göthberg & JHunterJ. If you feel that these edits were insulting to someone, you should have sought to have them oversighted. Instead, you antagonized several well-meaning people here, & have greatly harmed your own standing. I'll repeat my advice again: step back from this matter, & either allow the rest of us to handle it, or find an uninvolved experienced user to help you fix it. Because if you don't do either of those, you may find yourself the subject of a thread at WP:AN/I, where the consensus may be to block you from Wikipedia, perhaps indefinitely. (BTW, quoting this passage here only helps to get that insult back into Google, & defeats your apparent intent of removing this material. I think everyone here would concede those words are insulting, & since that much information is unnecessary I'm commenting it out -- to protect the person's reputation.) -- llywrch (talk) 06:00, 29 April 2009 (UTC)Reply
Oversighting is not the issue (the record is better there, to allow possible action against the perpetrator) PREVENTION of this abuse of a Wikipedia disambiguation page to make this kind of attack is the issue.
NOTED: The warning of potential blocking from editing of Wikipedia in the specific context of this matter ... is OUT OF ORDER. For the record: Objection.
--Proofreader77 (talk) 06:32, 29 April 2009 (UTC)

[Quotation of obviously insulting language redacted. -- llywrch (talk) 06:10, 29 April 2009 (UTC)]

DO YOU MEAN BY "optimise" PREVENT libelous attacks on a non-notable by inserting text into a Wikipedia disambiguation page specifically intended to appear in the search results?
There is no article about me in Wikipedia (nor should there be), and it was quite an unpleasant surprise to discover this could be done.
It is most certainly not a WP:COI to prevent such attacks on oneself. It should not have to be my job ... but such is the hand one is dealt by technology.
An RC patroller caught the first two attempts to insert the attack text, but not the third ... which remained in Wikipedia until it made it into Google search results ... which is how I arrived to make my first two edits reverting the attack.
I was grateful to the person who had reverted it before and read their page discovering they were a Recent Changes Patroller ... which led me to begin that work (without rollback or automated help) 1,000+ edits reverting vandalism (NOT to Boke) before asking for rollback (Now a couple thousand more to mainspace, and more thousands to talk, etc)
MEANWHILE Visually checking the Boke disambiguation page every three hours 24 hours a day is the burden placed on me by this abuse of Wikipedia. This was the personal cost of preventing a repeat "SEO Vandalism" attack via a Wikipedia disambiguation page.
Clearly, a different solution than checking the page continuously was necessary. The special formatting (and ballast content) added to Boke made constant watching unnecessary.
"BUT IT WAS LAST YEAR, why are you still concerned?" If someone attempted to break into your house three times, and on the third time succeeded ... when would you stop having to be vigilant? Could you stop being concerned after a year?
RHETORICAL QUESTION: Will Project Disambiguation GUARANTEE that the same kind of SEO Vandalism attack will not be successful again?
Of course not. The burden falls on me. It appears to be my responsibility to prevent future attacks. Preventing libelous attacks on a non-notable is NOT WP:COI.
NOTE: See the "Barack Obama" example in TERMINOLOGY section. "Disambiguation SEO Vandalism" isn't just a Boke issue.
FINALLY, check my mainspace article edit count, I think you'll find my anti-vandalism activity has not been dedicated to this page.
-- Proofreader77 (talk) 05:33, 29 April 2009 (UTC)
Since Project Disambiguation has "undone" MY solution to the Google-snippet attack problem AND therefore re-placed the burden of constant vigilance on me ... would it not be reasonable and fair for Project Disambiguation to take some action to ameliorate that burden placed on a private individual? E.G., A bot? (I would be willing to help design/code the algorithm.)

David Göthberg has implied my request for a bot is an unreasonable request. I respectfully, but profoundly disagree. (See all the above)
Proofreader77 (talk) 06:11, 29 April 2009 (UTC)

Well, you can write and run one yourself if you want, but it seems unreasonable to insist that someone else spend time and resources designing and running a bot to meet a concern of yours that, as has been explained, is entirely disproportionate. Gratuitous insults of the sort you mention happen every day, and could happen to any of us - there is no reason for people to expend dedicated effort preventing a repeat of one such attack that happened effectively just once, over a year ago. I suggest you simply let this go, stop worrying yourself over this incident, and stop taking up time and space for others who have other problems to deal with.--Kotniski (talk) 06:34, 29 April 2009 (UTC)
Have you ever discovered a personal attack on you placed in Wikipedia explicitly designed so that it would appear in a top Google search snippet? Attacks on the web may be common, but placing them to appear with the authority of Wikipedia at the top of the search results are not the common and unimportant occurrences you imply.
BEHAVIOR: The "dismissive" nature of your communications in this serious context are specifically noted for the record.
Proofreader77 (talk) 07:02, 29 April 2009 (UTC)
The most effective solution (IMHO) is to have several regular wikipedia users have the page on their watchlist. I'll pop it on mine. I gather, Proofreader77, that you are not and do not plan to be a regular editor here? (John User:Jwy talk) 07:06, 29 April 2009 (UTC)Reply
Thanks for adding to your watchlist. re: regular editor? Check my stats. I've been otherwise occupied lately :), but most of my mainspace edits are RC patrol. Proofreader77 (talk) 07:12, 29 April 2009 (UTC)Reply
Ah yes, re: Watchlist solution: I certainly know of and use the watchlist, so theoretically if we know there would always be someone with Boke on their watchlist "on duty," we could probably guarantee that an attack text could be reverted before scanned by Googlebot.
HOWEVER: Now that the page has been returned to a small K size, a large attack text if, scanned by Googlebot before reversion, is likely to end up with the attack lede appearing in a Google snippet. I.E., I don't think we can assume there will always be someone with it on their watchlist on duty. NOR should we expect to maintain such a congenial group of watchers. :) Not their job.
GENERAL ISSUE: While this seems a very idiosyncratic matter, I did see another (less vile) example of "Disambiguation SEO Vandalism," so we may likely see more of this. SO: Coming up with an automated solution for when attacks have occurred and therefore may occur again, I think is a useful exploration.
-- Proofreader77 (talk) 07:34, 29 April 2009 (UTC)
The problem then is with google and the solution should be addressed there. I suspect there are ways for google to remove on demand defamatory material. Monitor here. If you see someone messing with the information, monitor google. If it appears there, complain to them. (John User:Jwy talk) 15:30, 29 April 2009 (UTC)Reply
AT THE TIME: I tried to get Google to remove it. If I was the webmaster of Wikipedia it would be an easy matter. Get some special code to put in the html head. etc. If you're not the webmaster of Wikipedia, it's not so easy. I tried several times. NOTE: As for "defamatory," to Google that means "court order" (legal system recognizes it as defamation), not that someone claims it is. Wikipedia's position on "attack" text of any kind is less demanding. :)
--Proofreader77 (talk) 16:50, 29 April 2009 (UTC)
(MORE) WHY IS THIS A WIKIPEDIA ISSUE? Because Wikipedia provides the vehicle for the placement of the attack text ... presented in Google as From Wikipedia lifted to prominence with Wikipedia's reputation, and (to many) the authority of Wikipedia (because many eyes can correct errors) ... BUT those who would perform Disambiguation SEO Vandalism can take advantage of that ... because errors/attack can take awhile to correct, and the Googlebot waits for no person. :) OF COURSE, ... There are several (hundred) more paragraphs to write about this, but I'll stop there for the moment.
Proofreader77 (talk) 17:03, 29 April 2009 (UTC)

Helpful suggestion for page monitoring

There's a useful free service at http://www.changedetection.com/monitor.html which emails you when a chosen page is changed, on a daily basis. Could be useful to Proofreader in addition to Watchlisting, as it alerts you even if you're taking a Wikibreak, as long as you're still checking emails. Doesn't prevent vandalism, but picks it up in 24 hours. PamD (talk) 07:20, 29 April 2009 (UTC)Reply

Many thanks. The problem (again, now the page has been cleaned down to 2K) is that 24 hours may not be good enough. When the page is long and complex, Google has a larger context of previous information and current information to work with (and so an attack text lead/lede will be much less likely to be a snippet). With the current size, checking every three hours is a minimum; once an hour, better. LOL See the problem.
OF COURSE, I understand why people think "what's the big deal?," but to be attacked with a vile Wikipedia snippet at the top of search results for a week to two weeks or more damages opportunities. That can happen even if you revert the vandalism within 24 hours (with the current page size where a large attack text appears to be "the article" itself to Google).
-- Proofreader77 (talk) 07:42, 29 April 2009 (UTC)

TECHNICAL SOLUTIONS to SEO Vandalism (inserting e.g., attack text so it will appear in Google snippets)

Let us consider some technical possibilities for effectively addressing the problem of pages that have been attacked in this way (i.e., with the intent of getting text into Google result snippets).

Some preliminary notes before technical solutions:

  • We'll assume that my earlier "solution" on Boke of making the page long (small text ballast) and complex (more links) is not an acceptable solution for disambiguation pages. :)
  • Watchlisting — for this kind of attack, timing is everything. Once the Googlebot scans a vandalized page, even if the page is reverted a minute later, it's going to persist in Google results for awhile (variable, but often a week, two, or more)
  • Protection — Since a page vandalized this way may be vandalized again, but vandalism is not persistent/regular, locking the page is probably not a good solution.

-- Proofreader77 (talk) 11:43, 29 April 2009 (UTC)

Bot?

I have seen Cluebot in action, but really have no concept of their implementation in Wikipedia. Whether a bot to watch a single page makes sense, I don't know. Tell me. :)
-- Proofreader77 (talk) 11:43, 29 April 2009 (UTC)

Metatag Googlebot NOSNIPPET?

It is not clear whether a Wikipedia programmer would be allowed to add the NOSNIPPET metatag to a page that was subject to this problem, but doing so would "solve" the Google snippet issue by producing a Google search result without a snippet.

BUT: This makes the search result much less helpful to the potential reader (and would therefore usually be avoided). Still, there may be circumstances in which it is an acceptable option.
-- Proofreader77 (talk) 11:43, 29 April 2009 (UTC)
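
For reference, a minimal sketch (Python) of how one might check whether a given page's HTML already carries the directive. Google documents "nosnippet" as a value for the robots/googlebot meta tag; everything else here (the names RobotsMetaParser and has_nosnippet, and the sample HTML) is hypothetical and only for illustration. It says nothing about how or whether MediaWiki could emit the tag for a single page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives found in <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_nosnippet(html: str) -> bool:
    """Return True if the HTML asks crawlers not to show a snippet."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "nosnippet" in parser.directives


# Hypothetical sample page, not real Wikipedia output.
sample = (
    '<html><head><meta name="googlebot" content="nosnippet"></head>'
    "<body>Boke may refer to:</body></html>"
)
print(has_nosnippet(sample))
```

A check like this only confirms that the tag is present; the search result would still lose its snippet entirely, which is the trade-off noted above.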

QUESTION: Flagged revisions (timeframe)

Any time frame on when those might be a factor here? Proofreader77 (talk) 20:55, 29 April 2009 (UTC)

QUESTION: Couldn't adding a few lines of code to Cluebot handle this?

E.G.,

IF disambiguation page AND size-increase > (e.g., 25% of current size)
THEN revert

NOTE: That handles what happened on Boke in this case, but we could also handle the case where the current text was erased and replaced (i.e., size didn't change much, but content was rewritten).
-- Proofreader77 (talk) 17:43, 30 April 2009 (UTC)
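
To make the rule above concrete, here is a rough sketch in Python. It is not ClueBot's actual code or API. The function name should_flag, the 25% growth threshold, and the 0.5 similarity cutoff are all assumptions chosen for illustration. The second check is one possible way to cover the "erased and replaced" case, using the standard library's difflib to compare old and new text.

```python
from difflib import SequenceMatcher


def should_flag(is_disambiguation: bool, old_text: str, new_text: str,
                max_growth: float = 0.25, min_similarity: float = 0.5) -> bool:
    """Return True if an edit to a disambiguation page looks suspicious.

    Hypothetical heuristic: thresholds are illustrative, not tuned values.
    """
    if not is_disambiguation:
        return False

    old_size, new_size = len(old_text), len(new_text)

    # Rule 1: the page grew by more than max_growth of its previous size.
    if old_size > 0 and (new_size - old_size) > max_growth * old_size:
        return True

    # Rule 2: size is similar, but most of the content was rewritten.
    similarity = SequenceMatcher(None, old_text, new_text).ratio()
    return similarity < min_similarity


old = "Boke may refer to:\n* Boke (flower)\n* Boke (comedy)\n"
print(should_flag(True, old, old + "x" * 100))      # large insertion
print(should_flag(True, old, "Q" * len(old)))       # same size, rewritten
print(should_flag(True, old, old.replace("comedy", "comedy role")))  # small fix
```

Whether a flagged edit should be reverted automatically or only reported for human review is a separate decision; a legitimate expansion of a short page would also trip the first rule.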

OVERALL COMMENTS (re tech solutions, etc)

doncram's perspective

re topic headings
I removed two more section dividers with long titles inserted by Proofreader, to simplify reading. Proofreader: you cannot keep inserting more sections and hope for many persons to follow. Most readers would prefer to check just at the bottom of one active discussion, and most will not follow multiple discussions at many different subsections. Every time you insert new section dividers, it seems you are expecting to open separate discussions. I for one cannot handle having 2^n separate discussions going on after n visits to this page by you.
(reply) HOUSEKEEPING NOTE INSERTED (along with topic): I am restoring headings removed by doncram, which are intended to summarize and simplify navigation of a long page that is to be a reference text for addressing the issues raised here. I explicitly request that doncram cease and desist from such changes, and especially from making inferences and casting aspersions as to my motives for such headings.
--Proofreader77 (talk) 20:36, 30 April 2009 (UTC) (PS Please excuse emphatic tone. I understand formatting is unusual, if purposeful.) Proofreader77 (talk) 23:41, 30 April 2009 (UTC)
Part 1
To respond: 1. Glad you're accepting that the "ballast-adding" approach is not acceptable. 2. Watchlisting is fine, and you have several more people watchlisting the Boke disambiguation page. 3. Glad you recognize that semi- or otherwise protecting the page does not seem justified, given low frequency/recency of attacks. Also, as noted elsewhere, the recent changes patrol also does a good job (detecting and immediately reverting 2 of the 3 past vandalism changes to the page in question). Likewise, any special bot attention to the page is not justified. If the page is not attacked enough to justify protection (which any of ~2,000 administrators could do), in my view there is no way that one of the relatively few bot programmers would pay any attention to a request to address this one page in some way. Especially as you and I have no specific knowledge of any similar case, and you don't even know what to ask for. (FYI, I once tried to get a bot run to implement an edit on several thousand wikipedia articles, which was clearly feasible and I think was more important than anything here, and I could not get a bot programmer interested.) With no precedent and no urgency, you have no chance of getting some custom treatment. And, likewise about attempting to affect Google treatment by some use of a "Nosnippet" tag: there is no urgency and no known precedent for asking for special treatment of this one article. I think you just have to live with the possibility that someone out there who dislikes you may possibly post defamatory stuff, in wikipedia and elsewhere, and you cannot prevent that in advance. doncram (talk) 18:28, 29 April 2009 (UTC)
Two words :) Undue burden. No one should be given the lifetime job of continuously watching a page in Wikipedia to prevent an attack being inserted into Google search results.
The fact that RC patrol missed it on the third time, and my seeing it days later in Google brought all the words you have seen (plus the hours you have not), indicates a page that has successfully been attacked by SEO vandalism ... and that a method of regularly dealing with such cases must be found, so that the issue of undue burden has no possibility of becoming troublesome on a broader scale. That is important for the project—not just relieving the burden on one person who has been attacked via Wikipedia and whose life requires allotting daily effort/attention to the problem in perpetuity.
-- Proofreader77 (talk) 20:31, 29 April 2009 (UTC)
re bot programming (and lack of programmer interest) - NOTE: I have an M.S. in computer science and have written a several-thousand-line program in Perl for verifying Shakespearean sonnet form (via CGI), i.e., I may be able to do the work, with guidance.
-- Proofreader77 (talk) 20:47, 29 April 2009 (UTC)
Part 2
To respond to your points and questions: About "undue burden", sorry, it is not wikipedia's problem that there is one person out there who dislikes you and might post negative stuff about you in wikipedia or elsewhere. You do not have to check google at all, much less every day: a) you could just ignore it, b) you could set up an email watch on any changes in the Boke article as PamD suggested. Also, as has already been discussed well enough, any vandalism added to wikipedia will usually be caught very quickly and removed. Even if that gets into google, google will remove it soon enough, too. You are talking about one incident. And now there are several more persons following the Boke article who would likely notice and remove vandalism. However, if there is continuing battling about the form of the Boke article, I for one will remove it from my watchlist, as my patience is eventually wearing out, too.
About flagged revisions: I understand that is a possible future feature of wikipedia. I have little idea whether it could or would help address whatever you want. If it does, then it will turn out to do that. If it does not, then it will turn out not to do that. I don't mean to turn sarcastic, but you are asking a question within a subpage of WikiProject Disambiguation that has nothing to do with disambiguation. If you want to know more about flagged revisions, go read up about it, and then wait for it to happen, or not.
About whether a few lines added to ClueBot could do something: A) I don't know, technically. B) As I have asserted already, there is no way that any bot programmer at wikipedia is going to set up a custom bot or put in special treatment of the Boke disambiguation article. All this amounts to is that there was one incident once. To address this with some special treatment is not justified. If requested to make a change, any bot programmer will take one look at this huge discussion and run screaming, I would expect. I am about ready to drop this separate discussion article from my watchlist, and run, screaming, myself! :) doncram (talk) 18:43, 30 April 2009 (UTC)
Response to Doncrom by Proofreader77
  • I have already responded sufficiently to your perspective.
  • This is a reference page for the issues, and for a potential DR regarding behavior of dab-cleaners.
  • In that context, you are specifically requested to leave the headings alone, since they are designed to provide a summary and simplify perusal by new readers (for information or judgment), and quick location by all to specific issues within this LONG page.
  • Since you have decided the issues, and expressed your position (clearly), then removing this page from your watchlist may make sense.
    --Proofreader77 (talk) 20:36, 30 April 2009 (UTC)
Just to note, your "housekeeping" type edit split my responses to three points into a separate section. It seems awkward now, that your "undue burden" point is in one section, your QUESTION: Flagged revisions (timeframe) is one section, and your QUESTION: Couldn't adding a few lines of code to Cluebot handle this? is another section, and then my responses to all three are in another area that looks like a section, but in fact is a bolded statement with my username misspelled. Whatever. No response needed, I am indeed bowing out now. :) Bye! doncram (talk) 22:47, 30 April 2009 (UTC)
My sincere apologies for misspelling your name. Thank you for your attention and help in getting the page moved to this title. (Note that the previous removal of headings was part of the complication here, and I will see if there is a better way to organize this part. Your comments are an "overall comment" (or covering all points), which I think are better handled as separate discussions, but as we have seen above, a section of "overall" comments is also useful. Information design is not always easy. :)
Again, my thanks to you and a salute. In any case best wishes and farewell, and as always welcome back if the situation shifts in a way that justifies that. Cheers!
-- Proofreader77 (talk) 23:24, 30 April 2009 (UTC)

Discussion of name of this subpage

(A notice was posted at Wikipedia talk:WikiProject Disambiguation and then discussion about what the name of the subpage should be grew long itself. After renaming this page, I removed the additional discussion to here. doncram (talk) 22:11, 29 April 2009 (UTC))

The initial posting, plus my followup at the Wikiproject, is as blockquoted here:

Vulnerability of short pages to attacks

I have moved a recent very long discussion to a subpage: WT:WikiProject Disambiguation/Vulnerability to attacks (since it was coming to dominate this talk page). It began with a discussion of the dab page Boke, but concerns a general issue: that short pages (as many dab pages are) can be used to get defamatory vandalism in a very visible position in Google search results, persisting even after it is reverted from Wikipedia. I suggest that discussion focused on that problem continue on the subpage. I've left a note at WP:VPT directing people to that page.--Kotniski (talk) 10:35, 29 April 2009 (UTC)

I moved/renamed the subpage to Wikipedia talk:WikiProject Disambiguation/Vulnerability of short pages to attack, UD overflow, and other issues of Boke, per Proofreader77's stated preference for the name of the subpage. I removed, from here, a lengthy discussion discussing what the name should be, and put it also in that subpage. doncram (talk) 22:04, 29 April 2009 (UTC)

(Following is the removed discussion of the name for this subpage)

(objection to title - in the context of dismissing other issues)

While I agree this is the current main issue, (and that the topic is dominating this page) ... there are other issues re Boke that are raised in that discussion ... and the selection of title for the archive dismisses those with the rename (and moving under that name).
SO: Formal objection to Kotniski's title change, which ignores (dismisses) those issues, and specifically because he is a major participant in the preemptive dab-cleaning of Boke while discussion was underway, ignoring that issues were being specifically raised on this page, and having been requested (here, and on his talk page, and Boke talk page) to cease and desist until the serious matter of the "why" of that page was discussed. ...
I.E., a choice with an implicit conflict of interest in the context of a potential DR for Arbcom regarding the dab-cleaning (and related communication) behavior of (some) members of Project Disambiguation ... which would seek to enjoin such behavior from repetition (i.e., not a matter of content disagreement, but persistent patterns of action that damage the editorial environment of Wikipedia).
NOTE: I understand many here will not agree with that characterization—and it is noted that many of those are members of Project Disambiguation. But let us be convivial in such matters of dispute. We can smile while dealing with serious matters ... which are not beyond lightness in handling. Verse may ensue:) Excuse one concluding smile for this serious block of text.
SO, THE PRACTICAL MATTER OF THE TITLE: Yes, Boke discussion needs to move off this page, but with a title that encompasses all the issues (rather than dismissing all but one). Suggestions? -- Proofreader77 (talk) 17:41, 29 April 2009 (UTC)
  • The title is fine. Thanks, Kotniski, for moving the discussion there and for posting nicely enough about the potential vulnerability issue at the Village pump technical issues area. I, for one, am watchlisting the moved discussion, although not really expecting to participate further. doncram (talk) 17:54, 29 April 2009 (UTC)
While you may agree with Kotniski's dismissing of all other issues with one wave of a (dismissive:) hand, I obviously disagree—and casually observe that your affirmation of Kotniski's title is made while asserting you are not going to participate further. :)
-- Proofreader77 (talk) 18:07, 29 April 2009 (UTC)
  • Support Kotniski's choice of title. Further, entreat Proofreader to find more succinct phrasing for his points. I find my attention span to be adequate for most purposes, but Proofreader's seeming tendency to address many issues simultaneously and slowly detracts from the persuasiveness of his arguments. Smiling nonetheless, --AndrewHowse (talk) 19:08, 29 April 2009 (UTC)
I'm a really good tweeter, but some complex arrays of issues require a text. :) While I understand completely the highly unusual expansiveness of "this thing" we're in the midst of, the other issues raised are simply waved off the table by Kotniski's title, which is therefore simply "not done." :) How about this:
Vulnerability of short pages to attack, UD overflow, and other issues of Boke
The point is that the framework of concepts for discussing ALL of that is on that page, and the title assigned would make one think that they were not there. SO, a grossly inappropriate title.
While I would like to put these issues into nice little separate boxes, the actions taken by some dab-cleaners (without reasonable justification rushing to "clean" the page while extensive information about the page was being provided here) make it particularly necessary to keep those issues arising from Boke (and not limited to the SEO attack issue, although it is predominant) on a page that at least acknowledges where the issues arose. :) (smiling at more-than-tweet-length response)
NOTE: I'm new here. (duh) Providing enough text to see the nature of my participation (and level of attention to these matters), is useful at this stage of, um, initiation. :)
ALSO NOTE: I format for skimmability (so you don't have to read it all). PS entreat (excellent word!)
-- Proofreader77 (talk) 20:02, 29 April 2009 (UTC)
Believe us, nobody reads it all. But doing it that way means that points that you consider important are overlooked. My understanding of your Boke complaints was that they concerned almost exclusively the short-page vulnerability question. But if that was a misunderstanding, I suggest it's your fault as a communicator rather than mine as a listener - your idiosyncratic style of discourse, while entertaining, really doesn't work as an effective means of getting your points across. (If you want to change the other page title, then just move it - but the same principle applies, if you give it a title that is itself longer than people's concentration spans, then you're less likely to attract satisfactory responses.) --Kotniski (talk) 21:10, 29 April 2009 (UTC)
On a related point of communications technique: please keep talk page section headings brief, as headings, unlike some of those above. When I contribute to a section, I don't want to have to scroll through lengthy text before I can add my edit summary, and I don't want the record in my contribs list or other people's watchlists to include a lengthy statement, just a heading. Thanks. PamD (talk) 22:24, 29 April 2009 (UTC)
Hi PamD. Especially with regard to the one too long to fit in edit summary, point well taken. (That one attempting to be a complete summary of one major issue will be shortened soon).
However :) , this is a page which is clearly too long for any reasonable person to wade through it all, so the long topic titles function as a readable summary for a grasp of what "this" is all about.
I understand this is bothersome to someone adding comments to those sections, but perhaps let's assume that discussion of that information will most likely be re-begun in some bottom-of-the-page fresh analysis/discussion topic, which I will strive to keep somewhat closer to the norm. :)
-- Proofreader77 (talk) 23:24, 29 April 2009 (UTC)

Reference data: Google cache refresh rate for Boke

Since the persistence of attack text in Google cache (if successfully inserted) was the reason for "all this," let's see how long Google takes to update to the cleaned version (edited on 4/28/09, with embedded comment added 4/30/09).

I will check this daily until the cache date changes (snapshot of the cleaned version appears as cache) and add an entry below.

Current Google cache date for Boke:

  • 4/30/09 - "March 20 2009" (note: date of last edit to page before 4/28 cleaning) -- Proofreader77 (talk) 21:38, 30 April 2009 (UTC)
  • 5/01/09 - "March 20 2009" (check - timestamp - no change) -- Proofreader77 (talk) 20:20, 1 May 2009 (UTC)
  • 5/03/09 - "March 20 2009" (check - timestamp - no change) -- Proofreader77 (talk) 20:10, 3 May 2009 (UTC)
  • 5/04/09 - "March 20 2009" (check - timestamp - no change) -- Proofreader77 (talk) 20:37, 4 May 2009 (UTC)
  • 5/05/09 - CACHE IS UPDATED (1 wk) (with cleaned version of 4/28) -- Proofreader77 (talk) 23:06, 5 May 2009 (UTC)
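
As a quick arithmetic check of the "1 wk" figure, here is a small Python sketch using the two dates from the log above (cleaning on 28 April 2009, updated cache first observed on 5 May 2009). The variable names are just illustrative. Since checks were made once a day, the real refresh could have happened any time after the 4 May check.

```python
from datetime import date

cleaned = date(2009, 4, 28)        # date the page was cleaned
cache_updated = date(2009, 5, 5)   # date the updated cache was first observed

lag_days = (cache_updated - cleaned).days
print(f"Cache lag: {lag_days} days")
```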

FAQ

Why wasn't a warning issued for the attack insertion? (Vandalism?)

How could the text inserted not appear as vandalism to an RC patroller? Remember the initial "cover story" edit summary that this is a "tall tale."

If that's what it was (rather than a misleading edit summary), then the editor has simply made a mistake about what should go on a Wikipedia disambiguation page. That kind of thing is usually reverted without warning, and depending upon how much time the RC patroller feels like devoting to this, that's all that happens.

BOTTOM LINE: RC Patrollers don't read everything carefully most of the time. They'd go insane. They read enough to make a judgment and move on.

BUT why didn't I (Proofreader77) give a warning since I clearly knew it was not a "good faith" mistake?

These were my first two edits; I had no idea about what should be done. AND in this case, I might have thought "don't feed the trolls" (or antagonize an enemy). REMEMBER, I was not an RC patroller at the time. I was new. (And got to make these two edits before being blocked by collateral range-block by a now desysopped administrator.)

NOTE: RECENT RC STORY (WITH A MORAL): I saw an RC patroller (who is an admin on Dutch Wikipedia) who made a wrong judgment call on "vandalism" ... end up 3RR blocked, accused of being a sockpuppet, and instantly stripped of his rollback bit. After which he "retired" from RC work (duh), went on wikibreak, and has returned, but not to do RC anymore. I.E., when in doubt, don't treat it as vandalism. Revert once or twice, and move on. Proofreader77 (talk) 00:02, 1 May 2009 (UTC)

SECOND BOTTOM LINE: Maybe RC patrol saw it ... but the (always) quick judgment call did not click the "vandalism" switch in their processing.
A bot may be smarter for this than a human. :) -- Proofreader77 (talk) 00:02, 1 May 2009 (UTC)