Wikipedia talk:Naming conventions (use English)/Archive 8


"Ganges" article

I would like to register a complaint. I have been a long-term contributor to Wikipedia (for several years now) and I have been reprimanded for editing the "Ganges" article to read "Ganga" wherever the river is named within the article. My objection is that the official name of the river, in English, is Ganga: the Indian government changed the name from the abominable British imperialist "Ganges" back to Ganga, the name by which the river has always been thought of by Indians, even during the horrible Raj. So the official name of the "Ganges" is now Ganga, even in English. While print encyclopedias may list the "Ganges" in deference to common knowledge, they WILL acknowledge its real name as being Ganga. I would like to apply for a change to the article so that, even though the river is listed as "Ganges", it is spoken of within the article as the Ganga, lest we persist in contributing to and aggravating the readers' ignorance. I would object to a little section about a name change because I doubt that most readers would get all the way through the article. Thus, we should respect the wishes of the Indian government and people that the name in English remain Ganga and not be constantly referred to by the Anglicized British imperialist "Ganges." At least the body of the article should habituate new readers to this permanent change by the government and people of India. Otherwise, this is not so much an encyclopedia as merely a common knowledge forum. --LordSuryaofShropshire (talk) 01:09, 22 June 2010 (UTC)


The right place to discuss naming issues is the article talk page, where it has indeed been discussed. By the way, the Govt. of India itself acknowledges the spelling Ganges as an alternative. In fact, its page on "national river" starts with "The Ganga or Ganges is the longest river of India." Many pages in the .gov.in domain refer to the river as the Ganges. But again, the proper place to discuss this is the article talk page. --Ragib (talk) 01:22, 22 June 2010 (UTC)

French forests

I'm currently undertaking an expansion of WP's coverage on French forests, and am looking for some insight on naming conventions. Currently there are a lot of discrepancies with the articles already placed within Category:Forests of France. There are:

I'm wondering which of the above would be preferable. I'm partial to "Forest of xxx" as it retains the Latin-y naming structure, but "xxx Forest" has a lot more hits in a Google test. Any insights would be much appreciated. Minnecologies (talk) 20:01, 10 August 2010 (UTC)

The names should be derived from reliable English-language sources. If you cannot find one, then either leave the name in French or choose the most common English-language format. The most useful guideline in cases such as this is Wikipedia:Naming conventions (geographic names). -- PBS (talk) 05:54, 11 August 2010 (UTC)

Using local terms for local phenomena

Wikipedia:Manual of Style#National varieties of English has been used to justify naming conventions, not just orthography. Perhaps we should have an explicit statement on this? (I've placed a notice of this question at WP:COMMONNAME.)

Two recent examples I've seen are Ganges, which Indian editors have repeatedly argued should be moved to Ganga because that is the usage in Indian English; and Luganda, which was moved there from Ganda language largely on the argument that it is the form used in Ugandan English. The former have even argued that India has more English speakers than the US or UK, so Indian usage should prevail per WP:COMMONNAME. (Luganda might be justified by a more traditional interpretation of Common Name, so this is a question of principle rather than any particular case.)

Can we decide, or have we already decided,

  1. Should we, or should we not, rank English naming conventions of a host country as we do spelling conventions, so that local forms prevail over international forms?
  2. When usage is divided, should we go with the form used in the country with the largest English-speaking population, or the form most widely used among different English-speaking countries? Population or universality?

I'm fairly confident what the answers will be, but I think we should have this spelled out to save repeatedly making these arguments with editors grasping at every possible reading of policy to push their preferred name. Unless this has already been settled, and I'm just not seeing it? — kwami (talk) 19:41, 24 August 2010 (UTC)


The relevant rule is "Articles are normally titled using the name which is most commonly used to refer to the subject of the article in English-language reliable sources" (emphasis added). Not "in Indian English-language sources", not "in American English-language sources", but in any and all English-language reliable sources, regardless of perceived or actual nationality. This is the English Wikipedia; we would not choose the Urdu or Sanskrit name for the river when most sources are using an English name.
The number of people who speak a given variety of English is irrelevant. It's the number of sources that use a given form that matters.
If English-language sources are (more or less) evenly divided (which they aren't, in this instance), you can apply the ENGVAR solution, e.g., give more weight to Indian English sources than to American English sources for rivers in India, and the other way around for rivers in the USA. You can also default to whatever the first major editor preferred. But these are tie-breaker issues, not the primary principle. WhatamIdoing (talk) 21:18, 24 August 2010 (UTC)
But what if the RSs themselves are divided this way? Say, publications in India use Ganga, and these outnumber publications elsewhere, which use Ganges, so that by number of RSs we would choose one, but by universality of RSs we would choose the other? — kwami (talk) 21:37, 24 August 2010 (UTC)
WP:ENGVAR states that "An article on a topic that has strong ties to a particular English-speaking nation uses the English of that nation." COMMONNAME repeats that "An article title on a topic that has strong ties to a particular English-speaking nation should use the variety of English appropriate for that nation". On that basis "Ganges" should be named based on Indian English-language sources. Ben MacDui 07:56, 25 August 2010 (UTC)
I always read that as concerning spelling, grammar, and style, not names. If it is supposed to include names, that should be addressed specifically. — kwami (talk) 08:51, 25 August 2010 (UTC)
Names and spellings are interrelated, as indicated in WP:ENGVAR by "proper names (use the original spelling, for example United States Department of Defense and Australian Defence Force)". Usually the spelling used in the article title is used throughout the article; hence that militates against using United States Department of Defence and Australian Defense Force as article titles. But see my next comment on where we address this specific issue in the article titles policy page. -- PBS (talk) 09:22, 25 August 2010 (UTC)

The place to discuss this is not here, as this page deals with foreign-language usage. The issue is addressed in the section of the policy page called National varieties of English, so if that section does not address the issue and needs further discussion, it should take place on Wikipedia talk:Article titles -- PBS (talk) 09:22, 25 August 2010 (UTC)

No, what we need is to alert that talk page; this discussion has already been moved once. Septentrionalis PMAnderson 15:26, 25 August 2010 (UTC)

On the merits: In practice, if Indian English uniformly uses a given spelling, it will be adopted by the rest of the world in time - as has now happened with Mumbai. If, however, the "Indian" term is somebody's dialectal gamesmanship, we should not adopt it at all, unless it does catch on - as with the new official name of Bangalore. We should not be precipitate - we probably have been with Chennai, which is still locally known as Madras, as it has been since its foundation.

For more, see WP:Official names, and the pages to which it links. Septentrionalis PMAnderson 15:26, 25 August 2010 (UTC)

I didn't think this applied to either Official Names or National varieties, since the latter only addresses spelling, and here we have different names, not just spellings of those names. — kwami (talk) 16:29, 25 August 2010 (UTC)
Huh? Bombay and Mumbai are different spellings of the same name; so are Ganges and Ganga, if any different strings can ever be regarded as the same name. Chennai v. Madras is more complex, but the issues are virtually identical. Septentrionalis PMAnderson 17:26, 25 August 2010 (UTC)
If they were different spellings of the same name, they would be pronounced the same. At least, that's how most people would understand "spelling". They aren't even cognate: Bombay is Portuguese for 'good bay', and Mumbai is a Marathi adaptation meaning 'mother Mumbadevi'.
Ganges and Ganga are at least cognate, but they aren't different spellings any more than Jesus and Joshua are different spellings. — kwami (talk) 17:46, 25 August 2010 (UTC)
Really? Hanover and Hannover aren't pronounced the same - yet that is as clearly a difference of spelling as any I know. As for "Good Bay", that is a folk-etymology, as has been known since 1876. Septentrionalis PMAnderson 17:58, 25 August 2010 (UTC)
If they aren't pronounced the same, then they aren't simply differences in spelling. The policy speaks of differences in spelling. Perhaps we should treat Chennai and Madras as different "spellings" of the same name, but we need to be explicit about that if we're to avoid this kind of argument every time s.o. wants to move an article. — kwami (talk) 18:08, 25 August 2010 (UTC)
"If they aren't pronounced the same, then they aren't simply differences in spelling." Citation, please. By that definition, there are hardly any differences in spelling. This guideline was not written, and is not argued over, to be an exercise in triviality. Septentrionalis PMAnderson 18:13, 25 August 2010 (UTC)
Really? You need a citation for what "spelling" means? Okay: Oxford English Dictionary: (a) "Manner of expressing or writing words with letters; orthography." (b) "a special collocation of letters representing a word".
That, of course, takes us to what a "word" is. Are two things the same "word" if they're not pronounced the same? Are "Jesus" and "Joshua" the same name? We can't rely on such ambiguous terms when debating how policy applies. It obviously covers things like color/colour, because those are given as examples. It even gives some examples with different pronunciations, such as aluminum/aluminium. Taken to its extreme, that means we should move Ganges to Ganga. But the consensus has been that it doesn't mean that. So, either the consensus is wrong, and we need to move the article, or a literalistic extrapolation of the policy is not what was intended, and we need to clarify it. — kwami (talk) 18:44, 25 August 2010 (UTC)
  • Your source does not support your claim; the OED says nothing about the pronunciation of the word being spelt.
  • "Jesus" is the same name as "Joshua"; see the Septuagint for Joshua 1:1 et seqq.
  • There is no consensus that Ganga is the standard English form for the Ganges; I do not know whether this is because "Indian English" is almost as figmentary as "Ugandan English" or because this is a case where we seek - and can find - an interdialectical variant. Ask the next time that move fails.
  • This guideline says nothing about local varieties of English; that is indeed elsewhere.
  • One editor went on and on about Ugandan English; but the move to Luganda was not made on those grounds.
  • I will clarify that "spelling" in this guideline has nothing to do with pronunciation.
  • I am stepping out of this discussion; I do not think it can be conducted to any profit while arbitrary and fantastic definitions are being pulled out of hats. Septentrionalis PMAnderson 22:12, 25 August 2010 (UTC)
Perhaps you misunderstand me: The reason we need clarity is because arbitrary interpretations are pulled out of hats, quite commonly. If we refuse to clarify the guidelines whenever their meaning can be debated, then the guidelines are effectively useless in resolving these debates. If you can't defend the policy in the face of the interpretations I present here, how am I supposed to defend it when those same interpretations are brought up on article talk pages? It's a waste of everyone's time for me to have to come here every time someone says the policy means something other than what I read it to mean.
"the OED says nothing about the pronunciation of the word being spelt." Exactly my point: A change in pronunciation is not a change in "spelling", as you have presented it.
Are you proposing that 'national varieties' of English should not include Indian or Ugandan? That would definitely need to be made explicit.
kwami (talk) 02:19, 26 August 2010 (UTC)
By your reasoning, Paris and fr:Paris aren't spelling the same word. Or is this the reasoning you are paraphrasing from the nationalists? Septentrionalis PMAnderson 02:45, 26 August 2010 (UTC)
Assuming both were used in English, I'm not sure they would be the same word; it may only be because they're spelled the same that they would be considered so. My point was that that guideline only mentions spelling, and a concomitant change in pronunciation is more than just spelling. (As for the similar point above, if I started calling my friend Joshua "Jesus", he would likely object that that's not his name, implying that in his mind they are not the same word.) I'm rather doubtful that there is a clear-cut definition for "word" that would consistently work in the real world. Again, what I would like to see clarified, if there is any consensus on this at all, is how to approach situations where one form is preferred locally, but another internationally. That's a large part of many naming arguments. (It's rather easier when the argument is based on naming in the local language, such as Croats denying that Serbo-Croatian exists because of the connotations the equivalent word has in Croatian; not so easy when similar arguments are made using regional English.) — kwami (talk) 08:33, 26 August 2010 (UTC)

Suggestion

Hi, I have a suggestion for you page watchers over here. I came to this page trying to figure out whether a diacritic mark should be used in an article title but didn't find the information I needed until I got all the way down the page to wp:EN#No established usage. Could a very tight mention of the recommendation to "Follow the conventions of the language in which the entity is most often talked about (German for German politicians, Turkish for Turkish rivers, Portuguese for Brazilian towns etc.)" be brought up to the lede, somehow?--Hodgson-Burnett's Secret Garden (talk) 16:06, 21 September 2010 (UTC)

It doesn't belong in the lead; it's a very narrow exception. The section you want is called Modified letters - to avoid the argument "that isn't really a diacritic" - and says to follow English usage. Septentrionalis PMAnderson 19:31, 21 September 2010 (UTC)
I have moved it down, as a general default clause, and moved diacritics up; I trust this is clearer. Septentrionalis PMAnderson 19:40, 21 September 2010 (UTC)

With regard to the titling usage question itself: PMAnderson, thanks so much for your help!--Hodgson-Burnett's Secret Garden (talk) 19:44, 21 September 2010 (UTC)

This page is rather dishonest about the whole issue though - observation shows that Wikipedia prefers original diacritics except when there is very well established English usage without them; however, there are a few editors whom this doesn't please, and who therefore refuse to allow this or any other page to document this simple fact.--Kotniski (talk) 09:08, 22 September 2010 (UTC)
My observations show the opposite. There are a handful of nationalists in every language concerned who will move articles to the spelling they are accustomed to, whether it is English or not. But when it is discussed, this position is the usual line held to; sometimes that means anglicization, sometimes it doesn't. Septentrionalis PMAnderson 20:27, 22 September 2010 (UTC)
Probably because it's only discussed in the borderline cases, where the English usage without diacritics is claimed to be very well established. The vast majority of names are written with original diacritics and other modified letters - if that's because they're considered to be "unknown in English", then the guideline shouldn't claim (as it does) that such cases are "rare" - they're actually (numerically) very common. But even well-known subjects like Lech Walesa and Zurich tend to get diacritics on Wikipedia, even though you'd have to say that a majority of reliable sources omit them - so there certainly is a strong (and justified) bias towards using diacritics in the encyclopedia, which the guardians of this guideline apparently refuse to admit to.--Kotniski (talk) 08:35, 23 September 2010 (UTC)
And there is widespread agreement that the decision not to use Zurich - taken by one irresponsible admin who invented a non-existent consensus - was at best marginal and should not be precedent. Septentrionalis PMAnderson 17:37, 23 September 2010 (UTC)
In case this refers to me (?), I have always recused myself from voting on this, and I have always stated that there has never been a consensus either way. This was five years ago now, but if I remember correctly the article was just left where it was created because there never was any consensus to move. A move would certainly be arguable, but it is absurd to claim that the current title "misrepresents the English language". In fact I feel that it is absurd to have any strong opinion on this. --dab (𒁳) 15:50, 24 September 2010 (UTC)
Not at all. I was talking about another admin, who is an open advocate of diacritics at every occasion, whether English usage or not. It has probably been closed more than twice. Septentrionalis PMAnderson 17:05, 24 September 2010 (UTC)
dab, there was never a consensus to move the article from Zurich to Zürich. If anything it ought to be moved back to Zurich until there is a consensus to move it from that name, particularly if it is being used as justification for misnaming other articles. You wrote on the Zurich talk page "There are better ways to spend time with this article, e.g. ways that result in an actual improvement". In which case I presume that if the majority of reliable sources point to Zurich as usage, you would support such a move. If so, then you should support the wording in this guideline. Kotniski, are you seriously saying that we should ignore reliable English-language sources when deciding on the name of articles? -- PBS (talk) 04:08, 26 September 2010 (UTC)
No of course not, but we don't have to (and don't) blindly follow the majority in any particular case, especially when there are too few to indicate any established usage.--Kotniski (talk) 09:03, 26 September 2010 (UTC)
Is that not covered by the wording in "No established usage"? -- PBS (talk) 10:15, 26 September 2010 (UTC)
Well yes, except that for some totally unexplained reason you've just readded the "in some rare cases" wording to that sentence, where as we know, it's not at all rare - it's the most common situation (only a prominent minority of foreign places have established English names). Was your edit summary meant to be a joke - you've just reverted perfectly uncontentious changes along with the one which you presumably objected to?--Kotniski (talk) 10:23, 26 September 2010 (UTC)

In particular, this proposal is absolutely unacceptable:

Names which are originally written in a Latin alphabet, and which have no particularly well-established English name, are normally written in their native form, even if that contains diacritics or letters that do not normally appear in English, as in Strübbel, Łopuchówko and Reyðarfjörður. However, when there is a well-established English form, such as Aragon (for Aragón) or Napoleon (for Napoléon), that is used instead.

If Kotniski wishes to give aid and comfort to our Icelandic nationalists, he can join with them in defending their abusive moves; then they may even approach a majority. In the meantime, we should call places and people (even in "Latin" alphabets - and does that include Gothic? Old Irish?) by what reliable sources in English call them. Septentrionalis PMAnderson 15:17, 25 September 2010 (UTC)

But this describes what happens. What has it to do with nationalists? It's Wikipedia's practice, whether you personally like it or not. This is what I mean about people not allowing this page to correctly describe what Wikipedia does. That being the case, it has no business being marked as a guideline.--Kotniski (talk) 18:38, 25 September 2010 (UTC)
No it doesn't describe what happens; it certainly doesn't describe what our best practice is; it describes what a disruptive minority would like to happen - and sometimes do without and against consultation. Septentrionalis PMAnderson 22:14, 25 September 2010 (UTC)

And yes, the case where an otherwise notable topic has not yet received any significant attention in the English-speaking world, so that there are too few English sources to constitute an established usage is rare. It is only mentioned as a logical possibility; I know of cases where the writers have not bothered to look for English sources, and of articles where there are no English sources and notability is doubtful (is every pool in Europe and every village in the Andes really notable? Yet most of them have by now been discussed in English); there may be some on the boundary between these two cases, but they are, as stated, rare. Septentrionalis PMAnderson 22:54, 25 September 2010 (UTC)

There hasn't been enough attention for an English name to be established. In that case, we use the original spelling. At least, in relation to the countries I'm aware of - are there some where some other convention is applied? We certainly don't trawl the few reliable English sources mentioning a particular village to see whether more or fewer of them use modified letters or not - it would be stupid to do so, as you'd get random and meaningless variation between those that do and those that don't, to the total confusion of readers. Maybe you think this is what ought to be done, but it isn't, so making the guideline imply that it is is simply wishful thinking. If we can't even get consensus to drop the umlaut on "Zurich" on the grounds of established English usage, then there certainly can't be any to do something analogous in cases where there is no such established usage.--Kotniski (talk) 08:54, 26 September 2010 (UTC)
It would not be stupid to do it; it would be due diligence. "you'd get random and meaningless variation between those that do and those that don't" Well, we could go with a solution to that: don't use funny foreign squiggles unless they are found in English-language sources (in which case they are not foreign but adopted). However, that is not a solution that would be source-based, any more than suggesting that we use funny foreign squiggles even when reliable English-language sources usually do not. --PBS (talk) 03:17, 28 September 2010 (UTC)
All right, if and when that solution is adopted, we should certainly document it in this guideline. But while a different solution (funny foreign words come with funny foreign squiggles) is the one that holds sway, we should be honest and say that. --Kotniski (talk) 08:19, 28 September 2010 (UTC)
Sometimes they do; sometimes they don't. Often where the squiggles are used, they can be traced to one -er-determined editor, who rampages through making page moves right and left. When discussions are held, as at Novak Djokovic, the results are otherwise - and ought to be. Septentrionalis PMAnderson 21:31, 28 September 2010 (UTC)

Google hits versus examining usage by the most careful (and often, incidentally, most prestigious) media?

I'm a novice concerning thinking about usage of diacritics in English but I've recently compiled the following fairly random list of their usage with people's names, from the media:

-Article in NY Daily News: "Motley Crue Frontman Vince Neil Arrested on Suspicion of DUI in Las Vegas"
-Article in NY Times: "Mötley Crüe Files Suit Against NBC for Banning It Because of an Expletive"
-Blurb from HarperCollins, publisher of an autobiography of the band: "Motley Crue was the voice of a barely pubescent Generation X"
-Text from the book itself: Mötley Crüe
-From entry for "heavy metal" in the Encyclopædia Britannica: "A wave of 'glam' metal, featuring gender-bending bands such as Mötley Crüe and Ratt, emanated from Los Angeles beginning about 1983...."

-Book sold by Amazon: Ines of My Soul
-Book authored by Allende: Inés of My Soul

-From the NYT: "'I do think The [Hollywood] Reporter lost a lot of its luster,' said Lorenza Munoz, an adjunct professor of journalism at the University of Southern California...."
-From the NYT: "Andrew Muñoz, a public affairs official at the Lake Mead National Recreation Area, said that the economic downturn had led to a steady influx of repo men hunting Lake Mead’s marinas for craft whose owners are delinquent...."

-How the NYT styles names of Mexican reporters: "The authorities have confirmed only one of the disappearances, that of Miguel Ángel Domínguez Zamora of Reynosa’s newspaper El Mañana, who disappeared March 1. ... Ciro Gómez Leyva, the news director at Milenio...."
-An example of how it occasionally styles the name of a person that we thought apparently had slightly anglicized his name's orthography: "José Canseco [the former baseball star] was detained by authorities for nine and a half hours Thursday after they found a small amount of human chorionic gonadotropin, a controlled substance, in his possession as he tried to enter the United States from Mexico...."

-This is how the NYT styles Cesar Chavez.
-This is how Wikipedia styles César Chávez.

-From The New Yorker: "...since President Felipe Calderón’s aggressive drug offensive..."
-Also: "...Jose Canseco: former American League M.V.P., Madonna’s onetime 'bat boy,' best-selling author..."
-Again: "...not...a “matinée idol, ... Catsimatidis was wearing White House cufflinks given to him by Ronald Reagan, but a few of the pictures on the wall—of Fidel Castro (for whom he built a Greek Orthodox church) and César Chávez—seemed to hint at a more left-leaning past. 'I supported César,' Catsimatidis said. 'His people still call us. It wasn’t the unions. It’s that the pesticides they were using on that food were killing my customers.' Another photograph showed Bill Clinton talking closely with an attractive blond woman—Catsimatidis’s wife, Margo. 'That one was in the National Enquirer,' Catsimatidis said. 'I’m sitting on the other side. They cropped me out.'"
-Again: AVATAR ... "An ex-Marine (Sam Worthington) in the shape of a Na’vi—an avatar—is sent to spy, but he falls in love with a warrior princess (Zoë Saldana).... ... NINE ... The singers are a distinguished company: Sophia Loren, Penélope Cruz, Nicole Kidman, Kate Hudson, Fergie, Judi Dench, and Marion Cotillard, with the laurels divided between the last two."

-Profile in NYT: "One of Spain's foremost leading ladies of the 1990s, Penélope Cruz has managed to make her mark with international audiences as well."
-Lede of Britannica bio: "Penélope Cruz, in full Penélope Cruz Sánchez (b. April 28, 1974, Madrid, Spain), Spanish actress known for her beauty and her portrayal of sultry characters. She achieved early success in Spanish cinema and quickly established herself as an international star."

My conclusions? Well, I believe that this guideline page errs on the side of prescribing that Wikipedia fall in line with the way news orgs tend to dispense with such marks, instead of descriptively mimicking the precision that more prestigious publications pride themselves on in more accurately duplicating individuals' own usages. So, in general, I think an encyclopedia should try to go with how the individual himself or herself writes his or her name when that person is doing stuff in English--supposing that he or she puts his or her name to a substantial number of things, whatever "stuff" this might be, in English. That said, of course, in individual cases, maybe one particular writer for The New Yorker or The New York Times might have wrongly assumed, say, that Cesar Chavez had not slightly orthographically anglicized his name whereas another New Yorker / New York Times writer might have correctly known, say, that Jose Canseco had done so--& vice versa. But, the thing is, an encyclopedia should at least try to go with the Encyclopaedia Britannica/The New Yorker/The New York Times/etc. consensus rather than simply Google up results that mix in a lot of media sources that simply dispense with diacritics across the board. Thoughts?--Hodgson-Burnett's Secret Garden (talk) 17:45, 24 September 2010 (UTC)

    • Establishing English convention is not a precise process, but an art about which reasonable people can disagree. Raw search results can only be one (usually small) part of such an assessment. I believe the guideline is clear about this, mentioning at least twice that search results should be treated circumspectly. I agree with you that "serious" publications have particular weight when making assessments on usage, with the caveat that even these can have idiosyncrasies which border on the pretentious. (For instance, the New Yorker generally embraces diacritics whenever possible, even to the point of spelling cooperation as coöperation and elite as élite, etc.) Additionally, the border between "serious" and "technical" can be blurry, and the importance of each on an article's title can also be debated (if a serious but general publication with broad readership renders a term without a diacritic, but a scholar writing an academic monograph for a relatively small community uses it, what is "common" usage? What is more important: recognizability or technical precision?) The usage of an individual, especially if living, is likewise important, but I personally do not believe it is absolute. In my opinion, our guideline does a good job of describing principles to use to resolve these questions, although I'm sure that we can always make improvements. Erudy (talk) 20:39, 24 September 2010 (UTC)
Roll up my comments for tl;dr concerns.--Hodgson-Burnett's Secret Garden (talk) 18:09, 28 September 2010 (UTC)
The following discussion has been closed. Please do not modify it.

I'm just saying that using sources that don't use diacritics at all skews the results. For example, the AP doesn't like 'em, so we get

Associated Press: "It was at turns defiant and deferential, part plea and part plaint, a message as much to the drug gangs with a firm grip on Ciudad Juarez, the bloodiest city in Mexico's drug battles, as to the authorities. 'We want you to explain to us what you want from us,' the front-page editorial in El Diario in Ciudad Juarez asked...." (Notice that an editor added the accent over Juarez's e in the title of the San Antonio Express-News's version of this AP story, though.)

New York Times?

NYT: "It was by turns defiant and deferential, part plea and part plaint, a message as much to the drug gangs with a firm grip on Ciudad Juárez, the bloodiest city in Mexico’s drug battles, as to the authorities and their perceived helplessness. A police officer worked at the site where Mr. Santiago was shot dead. He was killed while leaving a shopping mall after lunch. 'We want you to explain to us what you want from us,' the front-page editorial in El Diario in Ciudad Juárez asked...."
It's not so much being stuffy...imagining a The New Yorker writer pronouncing "elite" with a French accent as he reads his story...; it's simply the de facto way Wikipedians happen to style, in general, titles with foreign-language derivations. What this page should do is to either figure out the way that the Encyclopaedia Britannica decides these issues (...hmm, the Enc. B. is published by the Univ. of Chicago Press; you think their Manual of Style could help here?...) or else create a matrix of sources that do style such titles the way Britannica and Wikipedia do and we would go by what members of this matrix do or else work analogously from what they have done in similar cases in the past.

So in the case of Ines Sainz it would clearly be with an accent over the e because that is the way the New York Times and places that are not shy of diacritics seem to do it:

  1. Yahoo.com: We've already reported the alleged incident between Mexican television reporter Inés Sainz of TV Azteca and the New York Jets.
  2. New York Times: "In the aftermath of the Inés Sainz episode there was a lot of blame-the-victim...."

As a test, plug "Ciudad Juarez" and "Penelope Cruz" into a search engine and get beaucoup results w/o any diacritic in either case; but do the same using the aforementioned "matrix" and we get what Wikipedia editors actually decide to name both the article about the particular city in Chihuahua and the blp of the particular TV Azteca sports reporter. And if in one's mind we hear the voice of "Wikipedia" saying such titles using a pronunciation native to Spanish or whatever the language in question, I don't think such an effect is hoity toity so much as just seeming appropriately "encyclopedically" accurate.--Hodgson-Burnett's Secret Garden (talk) 16:59, 27 September 2010 (UTC)

I've just run across what I believe may be an interesting sociological factor. Of course, the Enc. B. is no longer edited by Brits--BUT UK magazines and newspapers themselves do NOT use foreign language diacritics, as a whole. Why? My own guess is it's because the imagined accent of a prestige publication in England is that of Received Pronunciation. And, dammit, in Received Pronunciation, it would be eye-raen not ee-rahn, for Iran, etc. So, a Brit publ is simply gonna write /Penelope Cruz/, thinking of some posh pronunciation of the very English given name Penelope..... Whereas the American high-brow pubs the NYT or the NYer (or maybe mixed-Spanish-English area pubs like the Miami Herald or the El Paso Times) will write "Penélope," imagining someone's making the effort to mimic the proper pronunciation of this name in Español. You know? In other words, my theory would be that elitism in England would tend toward some type of "Anglo-(singular)nativism" whereas such elitism in the US would tend, rather, toward some kind of "pan-(plural)nativisms" (to overstate my premise)?

Anyway, I think it might be interesting to know where those who rally on this page "Convert everything to English, where possible!" hail from and where those that rally "Keep use of native diacritics when possible!" hail from. In any case, for sure the present standoff--the guideline's saying one thing while individual Wiki pages come to the opposite conclusion--will remain, simply because of the fact that there are more people who hang around individual entries whose editorial POV runs toward whatever flavor from among the pan-nativisms, whereas there appear to be more people hanging around this guideline whose slant tends toward Anglo-nativism.--Hodgson-Burnett's Secret Garden (talk) 19:28, 27 September 2010 (UTC)

...the guideline's saying one thing while individual Wiki pages come to the opposite conclusion....--Hodgson-Burnett's Secret Garden (talk) 19:28, 27 September 2010 (UTC)
Yes, I think you're right that this guideline fails to reflect the truth about what is Wikipedia's actual practice, simply because there are a few editors guarding this page who would like that practice to be different, and are unwilling to admit how it actually is.--Kotniski (talk) 08:19, 28 September 2010 (UTC)
For what it's worth, I've posted a Q about this on the big man's page here: User talk:Jimbo Wales (Archive 65): "A? b? or c? or another choice/no comment?"--Hodgson-Burnett's Secret Garden (talk) 18:09, 28 September 2010 (UTC)
And, well, fwiw, Mr. Wales seemed to agree with those who'd hope to limit the squig's use to when genuinely called for.--Hodgson-Burnett's Secret Garden (talk) 15:11, 29 September 2010 (UTC)

revert

We've had edit warring over some wording recently ("rare" etc.). Really, this is supposed to be a guideline, so we need to have consensus. I reverted. — kwami (talk) 18:27, 26 September 2010 (UTC)

This would appear to be a carry-over of a dispute elsewhere. But until further disputes arise, showing that this rare case has been taken as normative where it does not apply, I am content to say nothing. Septentrionalis PMAnderson 15:50, 27 September 2010 (UTC)
Dispute elsewhere? First I heard. Anyway, why do you keep restoring the bit about Weierstrass p with no explanation? What has it to do with "use English"? (Like I keep saying, it's already stated on the policy page itself.)--Kotniski (talk) 16:10, 27 September 2010 (UTC)
Because it's an exception to "use whatever squiggle is in the source", which is and should be our general rule; the math books don't have to worry about browser capabilities; we should. Septentrionalis PMAnderson 17:33, 27 September 2010 (UTC)

Whatever the merits of the arguments, we don't edit war on guideline and policy pages. They need some modicum of stability. If there's one place WP:BOLD should be respected, it's here. Work it out through discussion or RfC or I'm taking this to ANI. — kwami (talk) 20:34, 27 September 2010 (UTC)

Which we are discussing cheerfully. Let us see if Kotniski has an answer to the above. Septentrionalis PMAnderson 22:46, 27 September 2010 (UTC)
If you two can't agree, you both know the routine. I tried to revert to prior to the dispute (as well as adding a couple minor points of my own, which don't seem to be an issue); if I misjudged, I'll be happy to adjust accordingly. — kwami (talk) 00:36, 28 September 2010 (UTC)
I don't think there's a dispute of any seriousness at the moment (well, I'd like to add some text that explains more clearly what Wikipedia actually does, but I understand that it offends against certain people's ideology, so we'll just have to let new users find out these things for themselves). The thing about the math symbol is not that I disagree with what's written, it's just that it's totally off topic for this page, whose banner is "use English". It's already mentioned at an appropriate place - in the WP:AT policy.--Kotniski (talk) 08:10, 28 September 2010 (UTC)

Request for comment

I'm posting WP's current diacritics usage guideline below. My personal impression is that comprehending its meaning is about as easy as programming a VCR from the owner's manual. Maybe that is by design, but... Anyway, read and then comment below.--Hodgson-Burnett's Secret Garden (talk) 16:04, 29 September 2010 (UTC)

Modified letters

Modified letters. Wikipedia does not decide what characters are to be used in the name of an article's subject; English usage does. Wikipedia has no rule that titles must be written in certain characters, or that certain characters may not be used. Versions of a name which differ only in the use or non-use of modified letters should be treated like any other versions: Follow the general usage in English reliable sources in each case, whatever characters may or may not be used in them.

English usage is often best determined by consulting works of general reference which deal with the subject and seeing what they use. Search engines are always problematic, unless their verdict is overwhelming; modified letters have the additional difficulties that some search engines will not distinguish between the original and modified forms, and others fail to recognize the modified letter because of optical character recognition errors. If there is a consensus on spelling in the sources used for the article, this will normally represent a consensus of English usage.

It may be worth considering reader convenience, not just frequency, in dealing with symbols so rare that many browsers will not render them, especially when usage is divided. For example, in deciding whether to place Weierstrass p under that name, or under the symbol itself, either would be common usage; it is reasonable to consider whether readers will see the symbol or a little square box.

One recurrent issue has been the treatment of ae and oe and their variants. By and large, Wikipedia uses œ and æ to represent the Old Norse and Old English letters. For Latin or Greek-derived words, use e or ae/oe, depending on modern usage and the national variety of English used in the article. German proper names should be treated with care, and attention to English practice. Not all German proper names use umlauts (for example, Emmy Noether is correct in both languages); English resolves umlauts where German need not: Johann Wolfgang von Goethe is standard English usage, although both forms have been found in German.

Beware of over-dramatising these issues: as an example Wikipedia:Manual of Style (Ireland-related articles) may be mentioned, which, as a side-effect, regulated use of diacritics regarding Ireland-related articles – peacefully – before, during and after an extensive dispute on the question of diacritics in 2005, e.g. Inishmore, not Inis Mór; Tomás Ó Fiaich, not Tomas O'Fiaich (see the mentioned MoS page for details).

No established usage. It can happen that an otherwise notable topic has not yet received much attention in the English-speaking world, so that there are too few English sources to constitute an established usage. Very low Google counts can but need not be indicative of this. If this happens, follow the conventions of the language in which this entity is most often talked about (German for German politicians, Turkish for Turkish rivers, Portuguese for Brazilian towns etc.).

If, as will happen, there are several competing foreign terms, a neutral one is often best. For example see the suggestions in the sections "multiple local names" and "use modern names" in WP:NC (geographic names) for ideas on how to deal with this problem.

My take: Someone without a PhD in Wikipedia studies who reads the above comes away with the conclusion that there is no ultimate authority/strict, overriding convention with regard to diacritics on WP and individual cases are to be decided on an ad hoc basis. Let's be less mealy mouthed when saying so; after all even the statement of a lack of a convention can be made reasonably clear and succinct. Conversely, if there are principles that could be useful when approaching the question, would it be a good idea to distill them from out of the wikispeak above (or come up with them independently) and give them top billing? "These are what principles we have, blah blah blah (outlining them simply and clearly). But, other than that, you're on your own."--Hodgson-Burnett's Secret Garden (talk) 16:04, 29 September 2010 (UTC)

We say precisely "do it on a case by case basis"- because that's what we do. There is c; and even that has an exception for practicality. (I see the advice on specific cases - such as it is - is included here; would HB like more - or less?) Septentrionalis PMAnderson 17:26, 29 September 2010 (UTC)
PMAnderson, I think the guideline would be improved even if only it was to put the meat of your comment, just about verbatim, at its very top; then, the rest of the guideline could follow exactly as it is written now, including where it talks of specific cases:
"Use diacritics in the same manner general works of reference do, with practical exceptions enumerated below. If such sources are lacking, decide whether to use diacritics or not on a case-by-case basis." (Then go on here with ... "Wikipedia does not decide what characters..." etc., as it is now.)--Hodgson-Burnett's Secret Garden (talk) 18:41, 29 September 2010 (UTC)
The problem with the wording you suggest is that we have had repeated claims of "[my favorite squiggle] isn't a diacritic, so I can ignore that guideline". This is why the word does not appear in the text; squiggle would probably be equally objectionable to the "guidelines must be solemn" crowd. I'll see what I can do. Septentrionalis PMAnderson 20:14, 29 September 2010 (UTC)
At present the very first sentence is: The title of an article should generally use the version of the name of the subject which is most common in the English language, as you would find it in reliable sources (for example other encyclopedias and reference works). Why isn't this enough? Septentrionalis PMAnderson 20:15, 29 September 2010 (UTC)
But every section should be self-contained to avoid reading out of context; so I've added a sentence before "Wikipedia does not decide..." Septentrionalis PMAnderson 20:31, 29 September 2010 (UTC)
As noted by others above, this guideline no longer reflects the community opinion and practice on the issue. And as noted by Husond, it unfortunately seems to have been the target of partisan editing. The history shows that it has been rewritten by two strong opponents of diacritics; Pmanderson and Philip Baird Shearer (who are now by far the main authors of the page). Pmanderson removed the 2005 poll result from the guideline as "dated", despite having participated in the poll himself (on the "losing" side). In this and other edits, sentences that showed where the community stands on different types of diacritics conveniently disappeared. However, as shown by Wikipedia:Requested moves/Tennis, if the most common spelling in English sources is the same as the correct/native spelling but with the diacritical marks dropped, most users still favor accuracy over common inaccuracy. This seems to be especially the case when dealing with people's names. And no wonder. As an encyclopedia, we must get all aspects of our articles right, whether other sources are successful in that or not, particularly with BLPs. Getting an article right starts at the title.
The current version of the guideline is almost as far from our general practice as a Scuderia Ferrari dress code that requests all team mechanics to wear a blue overall. In this form, were it only a proposal, it would stand no chance of being upgraded to guideline status. Therefore, we need a dramatic revert, demotion to {{essay}} or a rewrite based on the community's consensus (or lack of it). The end result should be honest and descriptive, and not dishonest and prescriptive (as the current version). Prolog (talk) 20:41, 29 September 2010 (UTC)
That's a sweeping claim; do you have any move discussions that support it? Septentrionalis PMAnderson 20:55, 29 September 2010 (UTC)
I'm not sure which claim you are referring to. Move discussions like the 300KB+ tennis biography RM I linked above? Prolog (talk) 21:41, 29 September 2010 (UTC)
No, not the tedious arguments of a handful on an obscure Project Page. Actual move requests at which the name used in reliable sources has been overturned in favor of "correct" Outer Slobbovian spellings. Septentrionalis PMAnderson 21:55, 29 September 2010 (UTC)
The handful is well over 20 editors, and the move discussion was linked at the appropriate talk pages. Prolog (talk) 01:15, 30 September 2010 (UTC)
And it resulted, not in consensus, but in an ArbCom case. Try again. Septentrionalis PMAnderson 02:29, 30 September 2010 (UTC)
Your edit summary was "date-delinking". Are you referring to this ArbCom case? That's all I could find. Besides your 12-month topic ban from all style guidelines, what is the connection? Prolog (talk) 22:05, 30 September 2010 (UTC)
As for the poll; 62-46 is "no consensus"; that's a 57% vote. We are not run by majority rule, especially on a polarized poll like that, with a result which would not approve an admin. Some people want diacritics on all possible occasions; some never want them. This is the middle way - acceptable to all except Husond and his friends on one side and the "always Anglicize" movement on the other. Septentrionalis PMAnderson 21:13, 29 September 2010 (UTC)
Agreed, polls cannot stand in for consensus. If all we have is a 60-40 vote, and no closing decision that one side has greater merit and so prevails regardless of the poll (common for private-garden/ethnic conflicts), then the best we can say is that there is no WP convention and you're on your own with each article. — kwami (talk) 21:34, 29 September 2010 (UTC)
And do go look at the poll; it deals strictly with the issue of spellings which are like Zurich/Zürich - differing only by diacritics - and is a choice between "always" and "never". Never has no consensus - and this page does not support it - but neither does always. Septentrionalis PMAnderson 21:38, 29 September 2010 (UTC)
I didn't say that the poll ended in a consensus. It clearly didn't. It did however show where the community stood on this issue, and that your position (now reflected in the guideline) is also not consensus-supported or the "middle way". The tennis discussion, on the other hand, did end in a clear consensus. Prolog (talk) 21:41, 29 September 2010 (UTC)
Where the community stands is divided; it was then, it is now. The assertion that a divided community should be constrained to the One True Policy by editing Wikipedia space is the effort of one who has never had consensus to invent one; it also ignores what guidelines are: a summary of what the community actually does - which has never been to include diacritics where English does not; occasional disruption by those who know some other language better than English aside. When there is no consensus, we fall back on other bases for decision; in this case, the assertion of the governing policy that we should choose names on the basis of what reliable sources use. Septentrionalis PMAnderson 21:51, 29 September 2010 (UTC)
Yes, divided. So why did you replace the "Disputed issues" section with one that is not descriptive of what the community does or supports? Since I already advocated for bringing the guideline in line with the actual practice in my first comment, no dispute about that. Categories and lists like List of Czechs give a good indication of what the community does. At least with European topics, it is very rare for an article to be missing a diacritic if one could/should be there. Prolog (talk) 01:15, 30 September 2010 (UTC)
Does anyone besides this advocate believe that an inconclusive and narrow poll of five years ago is essential to a description of what the community now believes? Septentrionalis PMAnderson 02:18, 30 September 2010 (UTC)

If this current version is not it, what was the last stable version of this convention? — kwami (talk) 21:34, 29 September 2010 (UTC)

This version from August 2007 was stable for a long time and did not include the controversial changes made soon after. Prolog (talk) 01:15, 30 September 2010 (UTC)
I don't think that one can argue that it is necessary to go back three years! If one goes back to the version current in July of this year there had been next to no changes in the first seven months of this year diffs January to July 2010. This would seem to be a stable version. -- PBS (talk) 02:59, 30 September 2010 (UTC)
Except that by completely excluding the real practice and disputes about modified letters, the version is not a reflection of how things are done in the project but how some users think they should be done. Do you really think that "Versions of a name which differ only in the use or non-use of modified letters should be treated like any other versions: Follow the general usage in English reliable sources in each case, whatever characters may or may not be used in them" describes the consensus or the practice in the community? Prolog (talk) 22:05, 30 September 2010 (UTC)
My experience is that in RMs there are those who "vote" for their preferred version; if they explain their "vote" at all, it is "this is the correct version" or "they say we should not use foreign spellings", and they neither try to justify their comment to others nor to persuade others that their views are correct. For those who do try to persuade others and justify their opinions, the debate inevitably revolves around the usage in reliable sources. So my experience is that the debate is always framed by what is used in reliable sources. -- PBS (talk) 21:56, 1 October 2010 (UTC)

Prolog, any poll taken before the development of the concept in the policy page that names should be based on reliable sources is not going to be of much use. Prior to that simple addition to the policy, many guidelines were based on a complicated set of rules to mimic usage in reliable sources. We still have the vestiges of that system in some of the guidelines and it causes never-ending disputes. Those guidelines that promote following the usage in reliable sources tend to be areas where there is little dispute over the names to use.

This guideline was one of the first if not the first, to start to use the simple formula that names of articles should be based on reliable sources. It became an imperative for this guideline precisely because the community could not agree on which set of rules to use for the naming of articles (whether to nearly always use accent marks, or whether to nearly never use them). By simply following the usage in reliable English language sources we have a method that ties the article names into the content of the articles via verifiable reliable sources and minimise the Wikipedia editorial interference in the process (WP:OR).

Prolog, would you suggest that we go for a method of naming articles that is not based on usage in reliable English language sources? -- PBS (talk) 22:26, 29 September 2010 (UTC)

Of course article titles must be verified from reliable sources. And of course our encyclopedia is not obligated to follow the spelling that is most common in other (almost always; news) sources or weaken the encyclopedia by picking up any other common bad habits either (gossip, undue weight to hot fringe ideas, censorship of "offensive" text/images, capitalization of headers and titles etc.) There's nothing OR about aiming for accuracy and evaluating non-encyclopedic sources when writing an encyclopedia article. This sort of "editorial interference" is strongly preferred. As you can see by browsing articles (or featured articles), the naming method you favor is not in use. Therefore, it can not be favored by this guideline. Prolog (talk) 01:15, 30 September 2010 (UTC)
Where did you get that from? None of the English-speaking countries have an Academy; the only determinant of proper English is the usage of English-speakers as a whole. Septentrionalis PMAnderson 02:26, 30 September 2010 (UTC)
Prolog you write "weaken the encyclopedia by picking up any other common bad habits either by picking up any other common bad habits either (gossip, undue weight to hot fringe ideas, censorship of "offensive" text/images, capitalization of headers and titles etc." but is that not covered by the requirement to use "reliable sources"? You mentioned the article Dominik Hašek but as far as I can tell the last time that a requested move was put in for that article was before the policy was altered to suggest that we used reliable sources to determine names and as you say "Of course article titles must be verified from reliable sources." would you be in favour of moving that article if it were shown that the majority of reliable English language sources used a different spelling? If not why not? -- PBS (talk) 02:59, 30 September 2010 (UTC)
(You duplicated part of what I said. Please fix that.) No, I would not support that move. I haven't formed an opinion on all the diacritic-related disputes, but I do strongly support the usage of a person's real name, if it is supported by (a minority or a majority of) reliable sources and if the most common spelling in English language sources is that name or the name with simply the accent(s) dropped (with the obvious exceptions as noted by Kotniski). I believe my reasoning is similar to those of others (encyclopedic accuracy, pronunciation guidance, not confusing Jonssons with Jönssons and Jónssons etc.) As can be seen from List of Czechs and Category:Czech people, consistency can play a part as well. Prolog (talk) 22:19, 30 September 2010 (UTC)
Prolog, I am amazed by your casual dismissal of "news" as a category of source. It seems disingenuous to me that Wikipedia could rely so heavily on news sources for actual facts, but then dismiss them when it comes to titles. On practically every major article, our citations show that we put a tremendous amount of trust on the reporters and editors of respected news organizations; we expect them to do the difficult work of finding and describing facts about the world, staking their professional reputations on accuracy, sometimes even risking their lives to get the truth. But, when it comes to diacritics, suddenly they are "lazy", "inaccurate", "inconsequential". This continues to baffle me. If news organizations are so unreliable that they can't even spell right, how do they form the backbone of this encyclopedia's content? Erudy (talk) 04:01, 1 October 2010 (UTC)
They may supply a lot of (reasonably) reliable facts, but that doesn't mean we have to follow their style of presentation - newspapers and encyclopedias are written for different purposes and subject to different constraints. It's not a question of "spelling right" - one spelling is just as "right" as the other, but one is more useful for imparting information in an encyclopedia, and it won't necessarily be the one that we can find in the majority of all possible sources.--Kotniski (talk) 08:40, 1 October 2010 (UTC)

I think PMA's recent modification has improved the wording somewhat, though as said in the RFC, we could say all this in a much clearer fashion. In fact I don't think we should have this guideline page in its present form: it's not needed in terms of article naming, as all the points are already covered at WP:AT, where it's made clear that "commonness" (though important) is not the one absolute criterion for deciding on article titles, a point which this guideline generally fails to make (though PMA's modification goes some way to doing this). I would change this page to a guideline called simply "Use English", applicable to all content, not titles specifically. And as regards modified letters, it ought to recognize explicitly the fact that Wikipedia generally uses them (even when only a minority of English sources do) as being more encyclopedically informative at no cost, though there are situations (though the boundary is not absolutely defined) where using the modified forms is considered to be against established English usage (like Napoleon and Aragon - note we even have our own pronunciations for those words) or significantly harms recognizability (as with Djokovic).--Kotniski (talk) 07:09, 30 September 2010 (UTC)

It should not be changed from a naming convention. Content and article names, although related, are not the same thing. I do not think that these recent changes by PMA are an improvement. They repeat what is already said lower down the same paragraph but in my opinion not as clearly. -- PBS (talk) 08:59, 30 September 2010 (UTC)
So in what ways does the exhortation "use English" apply specifically to titles in a way that does not apply to content?--Kotniski (talk) 09:02, 30 September 2010 (UTC)
It is mainly to do with how one surveys the literature to derive a name and the policies used to get there. The content is bound by the three content policies, and using the collation methods we use for deriving the name of an article is OR as far as content is concerned. Also at a Wikipedia political level, given where we are today and the histories of the two developments, I do not think that mixing MOS and AT is desirable. -- PBS (talk)
One example: an article with some math in it might use the integral sign rather than the English word, and no-one would claim that's inappropriate. But our articles on integrals and the integral symbol do not use the integral sign as their title, but an English phrase; ∫ is a redirect. (On the other hand, there are more assimilated symbols which are used as article titles instead of English words, such as 3 (number) rather than three, and D rather than dee.) — kwami (talk) 09:13, 30 September 2010 (UTC)
I suppose so, but that's hardly the sort of thing this page is about - it's covered better at WP:AT (anyway, you wouldn't use the integral sign in text, only in equations).--Kotniski (talk) 10:00, 30 September 2010 (UTC)
I personally could live with the guidance given in the Economist style guide on accents, because I realise that educated English monoglots usually recognise French, German and Spanish spellings, but a simpler way to achieve the same result without the political baggage of internal Wikipedia political correctness (we can not do that with German and not Vietnamese because ...) is to specify that we use what is used in reliable English language sources. It tends to work just as well. If this is no different from the policy then the guidance is doing its job because it explains things that come up in lots of discussions which are too detailed for the policy page. -- PBS (talk) 09:31, 30 September 2010 (UTC)
But it doesn't really explain them, that's the problem. "We use what is used in reliable English language sources" is quite a vague statement, given that different reliable English language sources do different things.--Kotniski (talk) 10:00, 30 September 2010 (UTC)
I think it does, or we go back to the Born2Cycle's arguments and that more weighting should be given to Sunday supplements than to academic journals on botany. The only way out of reliable English language sources would be to write a new definition for reliable sources (in the AT policy) rather than relying on the one used in WP:V and I do not think that is desirable. What this guideline does is give guidance on what to do when the reliable English language sources do not give clear guidance. -- PBS (talk) 22:38, 30 September 2010 (UTC)
  • Question: Do we really need article titles like this one? It just seems counter-productive, especially when very few people with English keyboards will be able to directly get to it, forcing them to either go through a disambiguation page or through a wikilink on another page. I mean, most of the redirects are fairly obscure as well. SilverserenC 16:51, 30 September 2010 (UTC)
I'm not sure what solution you're suggesting, or what exactly you think the problem is ("getting directly to it" isn't really a problem, since there is or could be a redirect from whatever you think the title should be). For me the problem is the length of the title (which consists of a succession of foreign words). For that kind of title I much prefer a translated form if available, or if not, one of the abbreviated forms of the name. --Kotniski (talk) 17:46, 30 September 2010 (UTC)
That's what I mean, the article title isn't really appropriate. Though I do think that it shouldn't use words that have non-English letters. I don't have an issue with a tilde here or there in an article title, but an "ə" is just confusing. SilverserenC 19:21, 30 September 2010 (UTC)

"The person formerly known as Prince". I think we should junk the paragraph "It may be worth considering reader convenience..."; it brings needless complication to this section, and it opens up another area of dispute. If the symbol is rare then it is unlikely to be the title in many reliable sources, so while it is theoretically possible that it is an issue, it is not worth complicating the section with mention of it. It is already on the policy page under WP:AT#Special characters so I think we should remove the paragraph from this guideline. -- PBS (talk) 03:57, 1 October 2010 (UTC)

Yes, that's what I keep saying (but people - mentioning no names - keep putting it back). It would be relevant here if we had an example with a foreign letter rather than a mathematical symbol.--Kotniski (talk) 08:44, 1 October 2010 (UTC)
I have removed the paragraph. -- PBS (talk) 21:41, 1 October 2010 (UTC)
There is an example in Wikipedia:Naming conventions (Norse mythology)#Standard spelling which does relate to this issue but I think as it is addressed in the policy it does not need to be addressed in this guideline as well. -- PBS (talk) 23:26, 1 October 2010 (UTC)

arbitrary break

My language might be disagreed with but I think it is important to state more directly to people who consult this page the lack of an established convention. My language says, or at least implies, to leave "bare" English (or for that matter, diacritics-laden) titles alone, although it is acceptable to change them when backed up by RSes. Other than that, I rearranged the rest of the section slightly to group things related to diacritics (as opposed to graphemes) together.--Hodgson-Burnett's Secret Garden (talk) 15:43, 5 October 2010 (UTC)

I have partially reverted your changes. Firstly, because with changes that throw up big differences it is difficult to see exactly what has changed. Second, you had ended up with more words, not fewer. The third reason is that I disagree with some of the wording you use and do not have the time to go through the text word by word without the aid of a useful diff to see the differences. For example, the very first line (as here) mentions an "established convention"; we do not have conventions per se, we have policy and guidelines. Besides, this is the specific guideline on how to deal with funny foreign squiggles in article names. The thing about leaving alone is given for the content of an article in the MOS (that is not something this naming convention needs to worry about), and there is no need to mention the reasons for it here as it is covered in the policy under Wikipedia:AT#Considering title changes. -- PBS (talk) 22:42, 7 October 2010 (UTC)
Other than suggesting putting a diacritics version in the lede, there was only added the bit about there being no established convention encouraging/discouraging their use. I still think this is the case but don't want to fight the issue (namely, editors coming to the guideline are likely to think the guideline specifies some specific approach when in actuality it's more of an "either or is fine"....)--Hodgson-Burnett's Secret Garden (talk) 20:39, 8 October 2010 (UTC)
Below is the version I'd changed to.
Modified letters; graphemes

As of 2010, Wikipedia has no consistent convention encouraging or discouraging use of modified letters (such as accents or other diacritics) in article titles. If there is a consensus on spelling in the sources used for the article, this will normally represent a consensus of English usage. (Search engines are always problematic, unless their verdict is overwhelming; modified letters have the additional difficulties that some search engines will not distinguish between the original and modified forms, and others fail to recognize the modified letter because of optical character recognition errors.) If the title is the version of a name without diacritics, it is acceptable to include a version of the name with the diacritics in the lede to the article; however, a move from an existing title to a version of a name which differs only in its use or non-use of modified letters is allowed, if not necessarily encouraged, when supported by reference to English reliable sources (for example other encyclopedias and reference works). In all cases, place redirects at alternative titles, such as those with or without diacritics.

Beware of over-dramatising these issues: as an example Wikipedia:Manual of Style (Ireland-related articles) may be mentioned, which, as a side-effect, regulated use of diacritics regarding Ireland-related articles – peacefully – before, during and after an extensive dispute on the question of diacritics in 2005, e.g. Inishmore, not Inis Mór; Tomás Ó Fiaich, not Tomas O'Fiaich (see the mentioned MoS page for details).

German proper names should be treated with care, and attention to English practice. Not all German proper names use umlauts (for example, Emmy Noether is correct in both languages); English resolves umlauts where German need not: Johann Wolfgang von Goethe is standard English usage, although both forms have been found in German.

One recurrent issue has been the treatment of such graphemes as ae and oe and their variants. By and large, Wikipedia uses œ and æ to represent the Old Norse and Old English letters. For Latin or Greek-derived words, use e or ae/oe, depending on modern usage and the national variety of English used in the article.

It (A) added an introductory phrase to the effect "Don't worry, be happy; either/or is fine," and (B) added mention that adding a diacritics-laden version to a lede is OK, as is moving a title to such a version, if supportable by RSes [subject to consensus of ed's at the page].--Hodgson-Burnett's Secret Garden (talk) 16:47, 12 October 2010 (UTC)
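
As a rough illustration of the redirect advice in the draft above ("place redirects at alternative titles, such as those with or without diacritics"), here is a minimal Python sketch that derives the diacritic-free form of a title. The function names `strip_diacritics` and `redirect_candidates` are hypothetical and chosen only for this example; it relies on the standard `unicodedata` module and is not how any Wikipedia tool is known to work.

```python
import unicodedata


def strip_diacritics(title: str) -> str:
    """Return the title with combining marks (accents etc.) removed."""
    # NFD splits a letter such as "ó" into "o" plus a combining accent.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def redirect_candidates(title: str) -> list[str]:
    """Hypothetical helper: suggest a plain-letter redirect if it differs."""
    plain = strip_diacritics(title)
    return [] if plain == title else [plain]


for title in ["Tomás Ó Fiaich", "Inis Mór", "Emmy Noether"]:
    print(title, "->", redirect_candidates(title))
```

Mechanical stripping only removes marks; it does not produce anglicised spellings such as "O'Fiaich", "Inishmore" or "Goethe", so which form becomes the title still has to come from the sources.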

Do guidelines guide how to edit Wikipedia? Yes, as a rule; so my calling the diacritics guideline not really a guideline was bound to be controversial. That a handful of editors currently at the talkpage of a guideline can determine practice throughout the English-language Wikipedia is amazing. But cool. But, we seem to have a nebulous guideline here. That is, there seems to be no history of this guideline's ever very "tightly" determining whether diacritics should be used or not; and, likewise (although wp:OTHERSTUFFEXISTS applies here), the fact is that practice (or "the convention," if you will) throughout English Wikipedia seems to follow no "tight" rule (although I personally would be in favor of adopting a tighter rule).

One thing I want to bring up is the "once-removed," "theoretical/philosophical," techno-speak. I think the supposed precision it enables is more than offset by its allowing us to overstate or understate important facts pertaining to the matter at hand. Cut down to the most basic and readily understood terms: Someone comes across a title that can have diacritics in it and wants to know whether to use them in titling or not. Let's say that s/he is writing a blp about a Spaniard and the article has 100 sources, 99 of which are news sources that leave the diacritic off of the given name (as well as leave off the subject's compound, Spanish surname), and one of which, a reference work used at the top of the article's lede, includes a diacritic in the given name (as well as the mother's surname as part of the subject's complete name). The person writing the blp comes to the diacritics guideline and is guided to follow especially the reference work. The person is about to use that as the article title but then reads the summarizing statement that when the article's sources enjoy a consensus, that is usually English usage.

Another issue is, Is a great deal of benefit supplied by speaking at one remove, in both cases? It's like we're using euphemisms or something. The fact is, reference works favor using especially a recently derived foreign word or name in a form it would be found in that language. (Ah, but I myself am being mealy mouthed here!) I mean to say, ref works use diacritics. Whereas news sources generally avoid them and using a form of a name without diacritics is common English usage, except in the case of cafe and a few other, rare cases.

My suggested version (above) had gone...
-1 Don't worry be happy; either/or is fine
-2 Consensus in article's sources normally renders English usage
-3 Don't mess with search engines
-4 Diacritics version OK in lede, even if not in title--although title can be moved to diacritics version if supported by reference work, too
-5 Don't forget to do redirects
-From 6–end-of-section. Don't overdramatize these issues ... examples from various languages

The problem, though, is that it ADDED to the length of the section. (Along with making the guideline an UN-guideline, lol.)

Well, I'm going to try at least to tighten up some of the language that raises philosophical problems in the current version by making some teensy suggested edits, for which I'll explain the reasoning in edit summaries. Bye for now.--Hodgson-Burnett's Secret Garden (talk) 17:36, 12 October 2010 (UTC)

OK, I added a teensy phrase mentioning that sometimes there is no consensus with regard to what version of a name is in the sources; other than that, I added absolutely nothing. I rearranged the text the bare minimum necessary for the underlying ambiguity to be made apparent. Cheers.--Hodgson-Burnett's Secret Garden (talk) 18:20, 12 October 2010 (UTC)
The recently introduced sentence "A consensus on spelling in the sources used for the article will represent a consensus of English usage." cannot stay as it is, because it invites editors of bad faith to edit the content of the article to fix the sources so that they can then move the page. Even if there is nothing nefarious going on, the content may be based on an old source (like EB1911) which does not reflect modern usage, or it may be by chance that the sources used do not happen to reflect common usage. This is why WP:MOS#Foreign terms says "For foreign names, phrases, and words generally, adopt the spellings most commonly used in English-language references for the article, unless those spellings are idiosyncratic or obsolete". -- PBS (talk) 23:11, 12 October 2010 (UTC)
Oops! I thought my snip from that sentence left it with a meaning essentially the same that you intended back on May 6, 2008, when you'd rendered it as "Follow the general usage in English verifiable reliable sources in each case, whatever characters may or may not be used in them." Then on June 11, PMAnderson contributed, " If there is a consensus on spelling in the sources used for the article, this will normally represent a consensus of English usage," with an edit summary of "add per talk"--and with the talkpage discussion PMAnderson referenced here: Wikipedia talk:Naming conventions (use English)/Archive 7#Diacritcs and this policy.

Note that I reverted to the more recent version, " If there is a consensus on spelling in the sources used for the article, this will normally represent a consensus of English usage." But if this sentence is still problematic, for the reason PBS mentions, I have zero attachment to it and personally would not object to its being removed altogether.--Hodgson-Burnett's Secret Garden (talk) 00:47, 13 October 2010 (UTC)

Since my last edit to the guideline these changes have been made. I do not think that the first two paragraphs of the section are clearer and I think it has brought in some subtle changes that do not help. For example "general usage" and "consensus" are two different things. -- PBS (talk) 01:37, 13 October 2010 (UTC)

You emphasize "since (you) last edited" and link to the changes which you say are less clear. I'm fine with the introduction you prefer. As shown in the diff you provided, the only words I'd added, other than the word graphemes, were "if there is no consensus in the sources, either form is acceptable as a title."

Would it be OK if I leave these two additions in, reverting otherwise back to the intro you prefer? See diff.--Hodgson-Burnett's Secret Garden (talk) 02:07, 13 October 2010 (UTC)

And, if there are no objections, I'll remove the sentence "If there is a consensus on spelling in the sources used for the article, this will normally represent a consensus of English usage." I myself agree with user PBS that Wikipedia's favoring "general usage" alone, rather than simultaneously endorsing "either/or" both "general usage" and "consensus in an article's sources", is an improvement. It's a less ambiguous and "tighter" guideline.--Hodgson-Burnett's Secret Garden (talk) 03:37, 13 October 2010 (UTC)
The point being made with the statement about sources, is that often the reliable sources used to support the content are a good indicator of what is used, just as Google searches can often show if one term is used far more often than another. If the sources in the article, a Google book search of books published since 1980 and other encyclopaedias use a term, then that is fairly conclusive. If any one of those three diverge from agreement then more investigation will be needed. -- PBS (talk) 23:16, 13 October 2010 (UTC)
Wonderful, very-very straightforward, simple language, PBS! about how various mutually-reinforcing ways of analysis contribute to determining usage: a bold contribution of this comment to the guideline would really help distill this concept, even word for word how you just wrote it.--Hodgson-Burnett's Secret Garden (talk) 17:26, 14 October 2010 (UTC)

Bundeswehr

So Bundeswehr should be German military? Or does common name apply instead?Marcus Qwertyus 05:51, 3 October 2010 (UTC)

It isn't "instead" - this page is more or less saying to use the common name (emphasizing that we mean the name commonly used in English, which might not consist of English words).--Kotniski (talk) 09:14, 3 October 2010 (UTC)

Google hits

"Counts over 1000 are likely to be meaningless"? Why so? I checked the search strings mentioned in the article referred:

The numbers look logical. Google may have improved the calculation algorithm or something. I don't think there's any reason to disbelieve the numbers given by Google any longer. The Other Saluton (talk) 11:12, 4 October 2010 (UTC)

Sometimes I search for A B; then search for A B C; and find that the number of hits given for the second option is significantly greater than the first, which is logically impossible. So no, I don't entirely trust Google numbers.--Kotniski (talk) 12:27, 4 October 2010 (UTC)
EG: [climategate] About 3,190,000 results. [climategate wikipedia] About 110,000 results. [climategate -wikipedia] About 841,000 results. So where did the other 2 million ghits go? If one searches on climategate, although it says it finds 3,190,000, if one goes to the last page -- which is page 46 -- Google states it has "only" returned 452 results. Google gives the option on that page to "repeat the search with the omitted results included", which returns 78 pages, or about 780 results. As Google has only returned 780 results we cannot check the other three million, so it is not much use to us as we cannot know how many of those are reliable sources. -- PBS (talk) 00:34, 5 October 2010 (UTC)
Did you search [climategate]? In square brackets? I don't understand where 3,190,000 comes from. The Other Saluton (talk) 04:56, 5 October 2010 (UTC)
No the square brackets represent the box on a Google search. -- PBS (talk) 05:15, 5 October 2010 (UTC)
...and the 3.1 million hits come from googling Google in English, not in Russian. Different national variants of Google give different results, different language settings give different results even for the same Google address, and you even get different results depending on which random server your query is routed to, since Google's distributed database is not consistent (indeed, it is, with current technology, impossible to keep consistent). --Stephan Schulz (talk) 20:20, 5 October 2010 (UTC)
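
The arithmetic behind the "where did the other 2 million ghits go?" question above can be spelled out in a short Python sketch. It uses only the three counts quoted in this thread; the function name `unaccounted_hits` is hypothetical, and the sketch assumes that every page matching [climategate] either mentions "wikipedia" or does not, so the two refined counts should add up to the total.

```python
def unaccounted_hits(total: int, with_term: int, without_term: int) -> int:
    """Hits reported for the bare query that neither refinement accounts for."""
    return total - (with_term + without_term)


total = 3_190_000       # [climategate]
with_wp = 110_000       # [climategate wikipedia]
without_wp = 841_000    # [climategate -wikipedia]

gap = unaccounted_hits(total, with_wp, without_wp)
print(f"{gap:,} hits unaccounted for")
```

With these figures the gap is 2,239,000, which is the inconsistency the comment points to; it says nothing about which of the three estimates is the unreliable one.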

mixed Latin–non-Latin transliteration

The convention of using Latin transliterations for non-Latin text isn't straightforward. It's common to mix in Greek letters such as γ, δ, β, χ in transliterations of Central Asian scripts derived from Aramaic, such as Sogdian and Mongolian. There is a debate at Talk:Mongolian script#Greek gamma or Latin gamma whether these should be made Latin per this convention. There is, for example, Latin gamma, ɣ, which is used in Latin scripts such as Berber, and of course ð is commonly used as the Latin equivalent of Greek δ. (Indeed, for some Central Asian scripts ð is used as the equivalent of δ.)

Personally, I prefer not mixing in Greek where it's possible to avoid it. In many fonts γ, for example, is easily confused with y, and ɣ would help avoid that problem. Where should we come down on using Latin script when our sources may not? — kwami (talk) 23:16, 7 October 2010 (UTC)

So far this page says nothing about non-Latin letters in primarily Latin scripts, because there is no consensus on what letters are non-Latin. One faction even contends that the rune thorn is a Latin letter in Icelandic. Septentrionalis PMAnderson 16:24, 28 October 2010 (UTC)

Turkish

A factoid/curiosity, in case WP ever decides to adopt a policy discouraging altered letters: The honorific /Hanoum/ ( /Hanum/, /Khanoum/, etc.) is nowadays rendered Hanim (well, that is, with its /i/ undotted): hence, this webpage renders Ataturk's mother as Zubeyde Hanoum; whereas many other sites render her Zubeyde Hanim (while her WP article remains at "Zübeyde Hanım").

...My question would be, What transliteration system would one use with Turkish altered letters? Would /g/-breve be /gh/ or simply /g/? /s/-cedilla, /sh/ or /s/? /c/-cedilla, /ch/ or /c/? (some of which are found at this WP page).--Hodgson-Burnett's Secret Garden (talk) 20:23, 12 October 2010 (UTC)

Redirects from non-Latin article names

Recently we had an AFD for deletion of a redirect from a Japanese film title to the English-language article about the film and the novel on which it was based (see Wikipedia:Articles for deletion/失楽園). In the course of the discussion the prevailing argument was that there was no need for such redirects because searching produced the correct article. The redirect was recreated on the same day that it was deleted, by someone who hadn't participated in the discussion.

I personally agree with the result. The statement in this guideline that

Redirects from non-English names are encouraged.

is being used to justify these redirects. I do not see the point of these redirects because someone typing in our search is unlikely to use Kanji or Katakana, and a Google search will find the article if the non-English title appears in the article. But at any rate I think we need to resolve this since there seems to be some consensus against the interpretation that's being applied now. I suggest

Redirects from names using non-English diacritics are encouraged; however redirects for non-Latin character sets are unnecessary and should be omitted.

Mangoe (talk) 19:04, 2 November 2010 (UTC)

Here's my question: Why? What is gained by removing them? You say it's unlikely, but redirects are free and if there is even a chance that one person will someday find the content they were looking for because of one then it is worth having. Nothing is lost by keeping them and nothing is gained by deleting them. Beeblebrox (talk) 19:23, 2 November 2010 (UTC)
What's the difference between this and diacritics? Wouldn't a search find either one?
I copy and paste non-Latin text into the search window all the time. Usually that triggers a search, but if it's a rd, so much the better. — kwami (talk) 19:30, 2 November 2010 (UTC)
I can think of several instances where non-Latin redirects are helpful. One easy one is for a person who signed their name in non-Latin characters. The example I think of off the top of my head is Chihiro Iwasaki- she didn't speak English, and she (somewhat unusually for Japanese people, because she was a kids artist) always signed her works in hiragana, いわさき・ちひろ. If someone were to see her signature on something and search, there's a chance they would search for her name in hiragana; like Kwamikagami says, even if the hiragana redirect didn't exist, you'd manage to find the article easily enough, but redirects are better. And I'd agree with Beeblebrox; redirects are cheap, just let them be. The Blade of the Northern Lights (話して下さい) 00:40, 3 November 2010 (UTC)
I think the whole thing needs to be reconsidered. I don't think that Agua should redirect to Water -- nor Eau, nor Wasser, nor Acqua nor Maji, nor any of the other dozens of words that are verifiably used to indicate that clear, wet liquid in non-English languages.
There are times when it is appropriate. IMO, any non-English proper noun is an excellent candidate for this (e.g., Gdańsk/Danzig, Lech Wałęsa/Lech Walesa, Mikołaja Doświadczyńskiego przypadki/The Adventures of Mr. Nicholas Wisdom). But the unqualified blanket statement goes well beyond this. WhatamIdoing (talk) 22:27, 3 November 2010 (UTC)
I'd agree that something needs to get done; we don't need to create 水 or みず as redirects to water. The problem is, something too narrow can get a bit tough because sometimes foreign language redirects aren't particularly intuitive to people unfamiliar with the context. For instance, as someone who knows some Japanese, I can say that it makes perfect sense to redirect 野球 and やきゅう to Baseball in Japan, because that's the Japanese term for baseball; in fact we even have the romaji, yakyuu, as a redirect as well. However, there's no need to do something like this for アイスホッケー (ice hockey) or バスケトボル (basketball), because those are less significant sports in Japan, and are both taken from English anyways. Since the only Japanese sports without English names are running, ping-pong (which has a Chinese name), and baseball, and we don't have articles on running or ping-pong in Japan, it's a unique situation, and stands out to people who have some knowledge of Japanese culture. However, people unfamiliar with this might not be able to make the distinction when it comes to that sort of thing. The shorter version; I think the wording does need to be broad, however, it could use some narrowing down to prevent things like my first example. The Blade of the Northern Lights (話して下さい) 03:17, 4 November 2010 (UTC)
A rule of thumb is that if the subject of the article is primarily or disproportionately known of in a non-English language, a redirect from how they refer to the subject in that language will be useful. Fences&Windows 00:00, 5 November 2010 (UTC)

Possible conflict between Wikipedia:Manual of Style (Japan-related articles) and Wikipedia:Naming conventions (use English)

It has recently been pointed out that the guideline for article names in the Manual of Style for Japan-related articles is in direct opposition to the common names policy and to Wikipedia:Naming conventions (use English), particularly WP:DIACRITICS. More input is requested to determine whether or not WP:MJ needs to be modified to accurately reflect the community's actual practices and best advice. The discussion is taking place here. Jfgslo (talk) 17:11, 2 December 2010 (UTC)

References works @ modified letters

It reads in the section that under certain conditions further research will be necessary; however, this seems vague as a policy pronouncement, so I think some kind of example of what this might entail would be useful. I suggest something along the lines of

...for example, the consultation of general or niche reference works such as dictionaries and encyclopedias

Any other suggestions/comments?--Hodgson-Burnett's Secret Garden (talk) 15:51, 24 January 2011 (UTC)

OK, I see now that a review of a number of news/mag articles was not mentioned as a specific criterion. My suggestion thus is to substitute it for other encyclopedias in the "three criteria" pgraf and then use other encyclopedias as the possible tie breaker. I believe this is correct because the likelihood is great for such reference works to favor foreign usages, and thus to be out of kilter with the other three. Whereas reference works often err on the side of presenting as much detailed information in as short a form as possible, Wikipedia's purpose is somewhat different in that WP endeavors to utilize English usage but to put variant forms toward the top of the lede: as the title of this page says, "Use English."--Hodgson-Burnett's Secret Garden (talk) 16:45, 24 January 2011 (UTC)

Seems no one wants to discuss this.--Hodgson-Burnett's Secret Garden (talk) 15:10, 2 February 2011 (UTC)
I think "investigation" should not be further qualified. This is to cover exceptions, where the nature of the sources will vary. Giving examples, by implication, puts unwanted limits on the investigation. For instance: for some topics, official databases might be appropriate and magazine articles might be totally inappropriate. Mentioning one but not the other might give undue weight. --Boson (talk) 22:48, 2 February 2011 (UTC)

An exception to the rule?

Please read carefully before making a comment. Hello, everyone. I've found an interesting and unusual situation related to naming conventions. There is a Portuguese monarch called "João VI" (it is pronounced like the French "Jean") who has an article about him under the title John VI of Portugal.

I have noticed at Google books that specialized works written in English (João VI's, Pedro I's, Pedro II's biographies, books about the history of Portugal, or Brazil, etc...) usually, if not always, prefer to use the name "João VI". On the other hand, more generalist books (books about the history of Europe, Napoleonic wars, etc...) prefer to use the name "John VI".

According to the Manual of Style, we need to follow two rules: use the widely known name ("Queen Victoria" instead of "Victoria of the United Kingdom") and/or anglicize the name whenever it's possible ("Ferdinand Magellan" instead of "Fernão de Magalhães"). The goal of both rules is to make the life of the reader easier.

The problem is that "John" VI is the son and successor of Queen Maria I (not "Mary") and father of both Emperor Pedro I of Brazil (not "Peter") and King Miguel of Portugal (not "Michael"). He is also the paternal grandfather of Emperor Pedro II of Brazil (not "Peter II") and of Queen Maria II of Portugal (not "Mary").

As anyone can see, it is quite odd to write or read about a Portuguese monarch who has his name anglicized while everyone else in his immediate family has their name kept in Portuguese. It gets worse once you read his article, where there are also other Portuguese or Brazilian historical characters like José Bonifácio de Andrada, for example.

The point is: shouldn't there be an exception to both rules mentioned above? For cases like this one, for example? My opinion is that:

1) In exceptional cases such as this one, the name should be kept in its original form. I repeat: exceptional only. A simple note explaining the pronunciation would be added ("João" is supposed to be pronounced as in the French "Jean", for example).
2) In his article, or articles closely related to the subject, his name would be spelled in its original form. In the case of João, for example, in his article, as well as in articles related to Brazilian/Portuguese history, his name would be spelled "João".
3) In articles which focus on more generalist subjects (for example: history of Europe, Napoleonic wars, list of European monarchs, list of monarchs who were murdered, etc...) the name should be anglicized and redirected to its proper article.
4) The rule for cases such as this one would be that the preferred name for an article is the one widely used by specialized works (when I say "specialized", I'm not talking about books for specialists only, but ones that are focused on the subject mentioned), not generalist works.

I believe my suggestion is fair and could be implemented. Nonetheless, the present situation cannot be maintained. An article about a monarch who has his name anglicized when the previous monarch and the monarchs after him have their original names is awkward at best. Regards to all, --Lecen (talk) 21:53, 2 February 2011 (UTC)

Comment: As more scholars abandon the anglicization of names for most modern-period monarchs (and some ancient monarchs and other personages as well), this becomes more irritating and inconsistent. João VI is a good example, and most references I have consulted use the João/Joao form of the name. Interestingly, Encyclopaedia Britannica also used João in older editions, though they eventually made the "improvement" to anglicize all instances of Portuguese monarchs named "João" to "John". Recent sources use "João", and I agree that it is confusing for both editors and readers to jump back and forth between anglicized and non-anglicized forms, sometimes within the same dynasty or article where some names are anglicized and others are not. • Astynax talk 09:34, 3 February 2011 (UTC)

Demonstrate a clear prevalence in English usage, and that will resolve the issue. On the other hand, where there is genuine usage, as with the present King of Spain (who is not John Charles I, although his ancestor is, as usage makes him, Charles V), there is usually no issue. Many of these, as with the section below, are artificial issues produced by nationalists. Septentrionalis PMAnderson 15:00, 12 May 2011 (UTC)

Conflict between usage and policy wording

It is a fact that - at least, as far as Polish people or placenames are concerned - we use diacritics. There are very few exceptions to that: either people who have emigrated and changed their name, or a few individuals like Casimir Pulaski whose name has changed to the degree that the real first name (Kazimierz) is almost universally translated in English usage (but those cases are rare, somewhat controversial, and often discussed to the proverbial horse's death on their talk pages).

However, when I read the "Modified letters" section, the above would not be obvious. First:

The use of modified letters (such as accents or other diacritics) in article titles is neither encouraged nor discouraged

This is neutral, but is it really true? We don't really encourage or discourage diacritics, but in many cases (like Polish people or placenames) we use them. If a rare undiacriticized Polish bio or place pops up on our radar, we (members of WikiProject Poland) diacriticize it, and this has been an uncontroversial approach of ours for years, since this issue was discussed a long, long time ago (early 2000s) and a consensus for the use of diacritics has emerged.

As such, the above phrase is misleading, as in places like Poland-topics we don't really encourage the use of diacritics - we treat it as a given, as it is not only common, it is the unwritten rule of what to do. This raises the question: is WikiProject Poland the exception? As far as I know, we are not. Diacritics are used throughout other Latin-alphabet Slavic languages (Czech, Slovak), I've seen them in Nordic languages, German, French... As such I strongly believe that the above sentence needs change, either to "encouraged", or at least to mention that diacritic use is much more common than non-use. In fact I think this policy needs a rewrite so that it makes clear that the use of diacritics is common and encouraged, but with a special section on "exceptions" - rare cases where we do not use diacritics. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:06, 18 April 2011 (UTC)

Yes, I keep making this point - but this guideline seems to be under the control of people who have an aversion to diacritics that isn't shared by the Wikipedia community in general, hence it doesn't reflect actual practice very accurately. --Kotniski (talk) 18:11, 18 April 2011 (UTC)
I've not been active here before, but if this is indeed the case, the usual solution to break a WP:OWN hold on a policy is proper and wide canvassing. An RfC, plus a note on VP:POLICY and various WikiProjects/Regional Noticeboards to attract editors who actively edit articles with diacritics should be enough to counter any bias. I've announced this discussion at WT:POLAND, since I mention this project as an example; would you care to announce it more widely? The more editors that are aware of this discussion, the better. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:15, 18 April 2011 (UTC)
The policy statement mentioned above is too vague and does not reflect the Wikipedia reality. As Kotniski mentions, the guideline is there for some reason. The question is if the guideline should reflect the reality, I think it should. - Darwinek (talk) 08:01, 19 April 2011 (UTC)
IMO, the first best solution is to just embrace diacritics. If the policy cannot be written to support that because of WP:OWN issues with this page, then a bit of federalism would be a good thing - by that I mean letting individual projects decide on best practice. Volunteer Marek (talk) 14:54, 21 April 2011 (UTC)
All the latest English-language scholarly sources (any Roman alphabet language) use diacritics. I believe therefore that diacritics should be encouraged, not discouraged. It's not the days of typesetting with hot lead where there was a limit to how many special character molds you could put on a Linotype. Earlier common English language usage was bound by technology, not just Anglo-centric laziness when it came to ornamenting characters. PЄTЄRS J VTALK 15:06, 21 April 2011 (UTC)
If it's true that "All the latest English-language scholarly sources use" diacritics then the wording of this guideline will be biased in favour of diacritics as it says use reliable sources as a guide. -- PBS (talk) 15:25, 21 April 2011 (UTC)
Wrong, unfortunately. For many biographical articles there is nothing that remotely approaches scholarly sources. I don't think it's reasonable to make a purely orthographic question dependent on whether someone has already been mentioned in a high quality source or not. We want to be a high quality source, and if we already know how the other high quality sources will spell a name, should it ever arrive there, there is no reason to wait for that. Hans Adler 06:04, 22 June 2011 (UTC)

I guess that the reason why Polish editors who contribute to the English Wikipedia use diacritics is because words with which they are familiar without diacritics look odd to them and make them want to change them to the "correct" usage in Polish. However for monoglot English readers to see words that do not usually have diacritics on them having them is equally distracting. As this is a matter of taste the simplest thing to do is to appeal to sources, because while editors may not agree on which looks better, they can agree on the usage in reliable English language sources. I find it baffling why editors who are usually quite happy to agree on content using reliable sources, wish to ignore that usage for the spelling of words, simply because common English language usage does not suit their tastes. I think it is long past time that Polish editors should embrace the advice in this guideline and choose the spelling of a name like "Lech Walesa" the way it is commonly spelt in reliable English language sources. -- PBS (talk) 15:25, 21 April 2011 (UTC)

The monoglots amongst us have seen diacritics in common French and Spanish words forever. Where it is an issue of not only diacritics but a complete change of name or alternate spelling, there a preponderance of particular English language use should hold sway over scholarly. Where the only issue is the addition of diacritics, e.g., "Pēters Jānis Vecrumba" versus "Peters Janis Vecrumba", diacritics certainly are not that strange, and in my case serve as notification that my middle name is a boy's, not girl's, name.
It's not about Polish, it's about every Roman alphabet language between Western Europe and Russia and (per the Portuguese example above) beyond. Our standard is to be a scholarly encyclopedia, not a daily newspaper. PЄTЄRS J VTALK 16:21, 21 April 2011 (UTC)
I merely used Polish as an example because that was what was mentioned above. Your point with French and Spanish is more complicated because an educated English person is meant to be familiar with French (in the UK) and Spanish (in the US) and possibly Portuguese, German and Italian. This is for example reflected in "Accents" in the Style Guide issued by the Economist. However, this is a POV view (and one for which we would never get agreement), but we can do the same thing by using a simpler rule: the usage in reliable sources (if it is a general assumption) will reflect it. -- PBS (talk) 12:36, 23 April 2011 (UTC)
Indeed. Anyway, as it seems that given the choice of diacritics or not, our normal procedure, in most cases is to use them, I see no reason why this shouldn't be clarified in the policy (again, there are exceptions to this rule, such as common English names for placenames covered by NCGN, and that should be clarified here as well). The primary problem is that the current policy is confusing and does not represent our regular naming policies, and this needs to be changed. PS. And of course this is hardly Polish-only issue, Lech Wałęsa is no different than François Mitterrand or all things named Blücher. --Piotr Konieczny aka Prokonsul Piotrus| talk 09:14, 23 April 2011 (UTC)
It is different with names like Lech Walesa, and the simplest formula is to copy what is used in reliable sources; if we do that then we will be in sync with what is used in reliable English language sources. -- PBS (talk) 12:36, 23 April 2011 (UTC)
Different how? --Piotr Konieczny aka Prokonsul Piotrus| talk 17:53, 23 April 2011 (UTC)
Usage in reliable English language sources. The most common usage for Blücher is the general who commanded the Prussians at Waterloo, if Blucher is more common than Blücher the biography on that man should be changed just as should the article on Lech Walesa. -- PBS (talk) 20:27, 25 April 2011 (UTC)
We use writing to help us identify and communicate. In the colonial era, we "Anglicised" words and names according to received English pronunciation. In this day and age of globalisation, democratisation and acceptance of cultural diversity, it is wrong to oversimplify. Using only the 26-lettered alphabet is like squeezing the proverbial square peg into the round hole: "Lech Wałęsa" cannot be correctly rendered; phonetically the name would need to be spelt "Lehk Vowensa", but that's not what we do. We already do that with Russian names: Dimitry Medvedev is one such approximation, which is wrong, and demonstrates the lack of care taken, even by so called reliable sources, in rendering world leaders' names (hint: correct romanisation would be more like "Medvedyev"). What we do, mangling his name by stripping out the diacritics from Polish names is worse, IMHO. It would be akin to rendering your name "Pilip Bird Shera" – that may be fine with you, but it's a serious loss of information for the reader. Spelling it as "Walesa" tells me it's OK to pronounce it "wa-ley-sa", which of course isn't OK. Wałęsa might accept that as a shortcoming of English typography when dealing with westerners, but it's unrealistic to believe he is going to change his name for the English-speaking world. --Ohconfucius ¡digame! 03:52, 22 June 2011 (UTC)
With this edit last year I added a paragraph which I would have thought correctly describes our actual practices in this area. It was pretty quickly removed by someone who clearly doesn't like those practices - but perhaps we could consider readding it (or something like it)? (The wording, added after the paragraph about non-Latin alphabets, said: Names which are originally written in a Latin alphabet, and which have no particularly well-established English name, are normally written in their native form, even if that contains diacritics or letters that do not normally appear in English, as in Strübbel, Łopuchówko and Reyðarfjörður. However, when there is a well-established English form, such as Aragon (for Aragón) or Napoleon (for Napoléon), that is used instead.)--Kotniski (talk) 11:51, 23 April 2011 (UTC)
  • What does "well-established English name" mean?
  • "even if that contains diacritics or letters that do not normally appear in English" We already do that by the usage in reliable English language sources.
I don't see what the advantage of that wording is over the current wording which says something similar, in simpler language. Can you explain the nuances of how it differs from the current wording? An example or two would help, and how would it support the spelling of "Lech Wałęsa" instead of "Lech Walesa"? -- PBS (talk) 12:47, 23 April 2011 (UTC)
Well established needs clarification indeed. And I am not sure if most common spelling is useful, seeing how often for ease of print diacritics are omitted from (particularly older) publications. I'd support use of diacritics if they are used by a minority of English publications, but would have doubts if they are used by none (assuming the object in question is mentioned at least in some other English sources, of course). Wałęsa, for example, seems to have his diacritics used in less than half of English sources; nonetheless, his name is spelled correctly often, even in book titles ([1], [2], [3] and so on). Second, there is clearly no support for moving Lech Wałęsa to Lech Walesa, per various arguments used during the RM request, and as such, this policy should reflect this (and we should recognize this is a common situation, not an exception to a rule). PS. I note, PBS, that in this discussion, you were in the minority. Just like it appears you are here, in arguing against diacritics. --Piotr Konieczny aka Prokonsul Piotrus| talk 17:53, 23 April 2011 (UTC)
"You were in a minority" 11/10! The page did not move because there was not a consensus to move. If the page had been at Lech Walesa it would not have been moved to "Lech Wałęsa". Not quite the small minority your comment "you were in the minority" implied. -- PBS (talk) 20:38, 25 April 2011 (UTC)
Well then, splitting hairs, you were not in a majority big enough to justify the move. Which only proves my point that established procedure on Wikipedia is to use diacritics. This is what we do through titles and text, unless there are specific exceptions (Warsaw, Casimir Pulawski, etc). Those are exceptions and this naming convention should make it clear. As it is, it implies some sort of, cough, "equivalency", cough, between using and not using diacritics, which is quite FALSE, as in practice, we almost always use diacritics. Policies should reflect common application. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:06, 26 April 2011 (UTC)
What does "correct" mean? Surely "correct" is usage in reliable English language sources, and the vast majority of reliable English language sources use "Lech Walesa", which makes it correct as far as verification is concerned. -- PBS (talk) 20:27, 25 April 2011 (UTC)
I think we should deliberately use an ambiguous term like "well-established", since there clearly isn't any consensus (or even possibility of precise definition) as to exactly where the line is to be drawn. The standard is not, however, what the majority of English sources do (and what counts as a "reliable" source in this context is similarly undefined), so we shouldn't keep implying we do that - instead we should explain our practices accurately - namely that in typical (i.e. relatively obscure) cases we use the diacritics and modified letters; in cases like Napoleon we follow established English; somewhere in the middle runs an imperfectly defined boundary between the two treatments. (It should perhaps also be pointed out that we are more likely to diverge from the native spelling where it impairs recognizability, as we have done with the better-known Djokovics.) --Kotniski (talk) 16:52, 25 April 2011 (UTC)
Kotniski, in the last move debate in Talk:Lech Wałęsa you ignored the advice you are giving here and wrote "Oppose, serves no purpose except to 'dumb down' ", yet one could equally argue "to use modified letters when most English language sources do not, is laziness for not bothering to verify usage in reliable sources, and a form of 'dumbing down' "; they are two sides of the same coin. It is much better to go with the usages in reliable English language sources because apart from anything else it reduces the differences between content and article name -- unless one is going to ignore WP:V not only for the article title but also content. -- PBS (talk) 20:27, 25 April 2011 (UTC)
Not really - our practice is to use diacritics both in content and in article names, except in clear cases like Napoleon. That's another thing I keep saying about this guideline - it's titled and phrased as if it's about article titles, but there's no reason for it to be so restricted - the principles set out here (particularly the eponymous "Use English" dictum) apply to all aspects of Wikipedia content (or would do, if we wrote them to properly reflect actual accepted practice). (And we're not "ignoring WP:V" by using diacritics - that policy doesn't say anything to the effect that we have to present information in the same way that a majority of sources present it.)--Kotniski (talk) 11:07, 26 April 2011 (UTC)
Of course it does say that -- we present information in the same way that a majority of sources present it, that is why names like the "Boston Massacre" are used rather than a NPOV name! You have been arguing so strongly for the abandonment of rule based article naming in the area of WP:NCROY (for example you initiated a move for Queen Anne based on common usage). So why not in this area as well? Personally I was very pleased when usage in reliable sources (as opposed to usage in all sources) was introduced into the policy because it simplified the policy and removed the need for rule based guidance in lots of areas and it allowed us a clear formula that ties into the content policies in a simple and elegant way. -- PBS (talk) 12:18, 26 April 2011 (UTC)
Where does WP:V say anything like that? But the difference between the cases of Anne and Lech is that adding the diacritics doesn't make "Wałęsa" any less recognizable to those who are used to seeing it as "Walesa" (and everyone will readily understand why the diacritics are there and what they need to do to write it without diacritics); whereas calling someone as well-known as Queen Anne by an unfamiliar and largely invented title may well cause people not to realize who the article is about, or mislead them as to how the person is normally referred to. Diacritics are a win-win feature for an encyclopedia - they add information (our overriding goal) without harming recognizability or conciseness. And this is clearly the view that Wikipedia has taken. Whether Wałęsa falls into the Napoleon category can be left as a matter for individual discussion (I would say not, partly because we don't pronounce him "wails-a"), but we should certainly try to document truthfully how we treat the vast majority of names of this type.--Kotniski (talk) 14:04, 26 April 2011 (UTC)
Your argument that it is a "win-win feature for an encyclopedia" looks wrong to me because you have misspelt "encyclopaedia". The reason why we do not insist on uniform spelling is because to use British spelling in American centric articles looks odd, as does American spelling in British centric articles, to those used to seeing either usual national spellings. Just as I do not think one can argue that theatre is right and theater is wrong, neither can one say that "Wałęsa" or "Walesa" is correct, but one can say that it looks wrong or right depending on personal preference. I do not think it is insignificant that it is often native speakers who are most vociferous in insisting that Anglicised words are "wrong" (i.e. they look wrong to them as they are used to seeing the words with diacritics. It is for them like "colour" is to an American).
Your argument that it "adds information (our overriding goal) without harming recognizability or conciseness" is in my opinion just one point of view. For example, how is Moscow pronounced? (Differently in London and Washington, as are many other foreign place names. How do Americans pronounce Worcestershire when asking for the sauce?) How is Zurich pronounced in Washington or London? Not the same way as it is pronounced in Zürich! "Recognizability or conciseness" are not harmed by spelling color "colour", but people object to the "wrong" spelling because it hampers their enjoyment of reading of an article as it niggles (just as a piece of food stuck between teeth tends to distract from the enjoyment of a meal while not altering the taste of the meal).
It can also be argued that by following the spelling usually used in reliable English language sources, we are informing our readers as to the commonly used form of the word. After all, if the word is placed under its usual English spelling and that differs from the native spelling, the native spelling can always be placed next to the common English in brackets with the language spelling in italics, which gives the reader a lot more information than just placing the article at the native spelling, e.g. Zurich:
  • Before the move back to Zurich: "Zürich or Zurich (see Name below) is.." [4]
  • Now: "Zurich (German: Zürich, Swiss German: Züri) is..."
It seems to me that the latter is more informative than the former, rather disproving the argument about article titles using native spellings "add information (our overriding goal) without harming recognizability or conciseness" . -- PBS (talk) 13:02, 28 April 2011 (UTC)
Discussing Zurich is a bit of a dead horse, as it is already well covered at WP:NCGN. What is more of a problem is dealing with biographies. --Piotr Konieczny aka Prokonsul Piotrus| talk 16:49, 3 May 2011 (UTC)
Yes, Zurich is hardly a pertinent example, as it would fall into the Napoleon/Aragon category. What we're talking about is the tens of thousands of article subjects that don't have any well-established English name like Rome, Moscow, Zurich or Napoleon. In these cases the established Wikipedia practice is to use a standard transliteration if the original name is not written in a Latin alphabet, but to use the original name - including modified letters - if it is written in a Latin alphabet. The first part of that practice is documented on this page; the second part should be too. (Though obviously with a clear exception for cases that are determined to fall in the Napoleon category.)--Kotniski (talk) 16:43, 4 May 2011 (UTC)
Again, this is part of NCGN. If a village is never mentioned in English sources, we of course use its local name, with diacritics. A case of Łódź could be more interesting, and Kraków has been causing some controversies, but it is noteworthy that in both examples the consensus has been to use diacritics. Indeed, the point here is that with few relatively famous EXCEPTIONS, as far as placenames are concerned, diacritics use is a common procedure. The same seems to be true for biographies. This should be reflected in the policy here, so that when editors ask "do we use diacritics or not" we can point to the policy that reflects established usage and say "mostly, we do, but check the exception section" (which obviously needs creation, too). --Piotr Konieczny aka Prokonsul Piotrus| talk 22:10, 4 May 2011 (UTC)
  • Not using Cracow has not been the result of this page, or of any page; Professional Poles (most of them less than perfectly fluent in English) have insisted on the local spelling on the grounds that it is official usage (which it is consensus to ignore, unless it has affected independent sources). Septentrionalis PMAnderson 16:10, 12 May 2011 (UTC)
    • Not sure what you mean, exactly, but will you agree when I say that past discussions resulted in the name Kraków being preferred over Cracow? --Piotr Konieczny aka Prokonsul Piotrus| talk 17:13, 12 May 2011 (UTC)
      • I am genuinely sorry not to be able to; but I cannot agree even to that. This appears to be the first discussion, and it seems to have been Kraków then; the discussion is the usual no consensus/you can't make me business all too common in defending the "correct" forms some non-native speaker is used to in his language against English idiom. Septentrionalis PMAnderson 18:10, 12 May 2011 (UTC)
        • No consensus to move is still important. Would you prefer to say that "Kraków is used on Wikipedia, and there has been no consensus to use Cracow instead"? --Piotr Konieczny aka Prokonsul Piotrus| talk 18:36, 12 May 2011 (UTC)
          • I see no reason to mention the city at all, especially while it has no consensus; this page is for the principles (on which we may hope to agree) by which consensus ought to be reached. I believe we disagree on why it has not been reached, but I see no reason to mention that either. Septentrionalis PMAnderson 19:00, 12 May 2011 (UTC)
          • I would support adding some such sentence as "Roma is correct Italian; that does not make it "correct" English." And any argument that would lead to using Roma for the capital of Italy, or Warszawa for the capital of Poland, is harmful to the primary purpose of the encyclopedia: to communicate. Septentrionalis PMAnderson 19:19, 12 May 2011 (UTC)

Question/proposal re translation of names of things

I had a question (and a proposal, I guess) regarding the translation of the names of things, such as the names of buildings and organizations. It impinges somewhat on this guideline, so I point to it here: Wikipedia talk:Article titles#Titles of things should be translated, yes? Herostratus (talk) 18:40, 25 April 2011 (UTC)

Use of diacritics in biographical article titles

I am seeking a consensus on whether the policies of WP:UCN and WP:EN continue to be working policies for naming biographical articles, or whether such policies have been replaced by a new status quo.

I have been strongly warned that the above policies are no longer in force, [5] and further, have even been threatened with a block [6] for attempting to invoke the Wikipedia:BOLD, revert, discuss cycle for several articles that were moved without discussion to non-verifiable, non-English forms (with diacritics) as their article titles.

It is my belief that wiki-policy dictates that this is the English Wikipedia, and according to the policy of WP:COMMONNAME and Wikipedia:Naming conventions (use English), a biographical article does not use the subject's name as it might be spelled in Czech or Slovakian (with diacritics) as its article title, nor does it use the person's legal name as it might appear on a birth certificate or passport; it instead uses the name that is most frequently used to refer to the subject in English-language reliable sources. For example, in the case of Marek Židlický, all sources within the article verify the spelling as Marek Zidlicky (with not a single source to verify the spelling with diacritics), yet the ice hockey project supports the use of the non-verified, non-English spelling. See also Category:Czech ice hockey players and Category:Slovak ice hockey players for many more similar examples. There is a small group of editors within the ice hockey project who have been very forceful with their POV to use diacritics, and as a result the policies of WP:UCN and WP:EN are now wilfully violated. Dolovis (talk) 04:57, 17 May 2011 (UTC)

No, it was pointed out to you that the ice hockey naming conventions page is no longer in effect, which is very different from being told that UCN and EN are not in effect. Please don't misrepresent what was said. You were also warned for making pointy moves which you knew were controversial, for trying to trick admins into making the moves for you by trying to speedy articles and then recreate them at your desired location, and for listing them at the uncontroversial page move requests list when you knew them to be controversial moves. As for invoking the BRD cycle, if you keep having the discussion part of the BRD cycle confirm that consensus is to keep the diacritics and then you keep trying to invoke the BRD cycle over and over again, that is called disrupting the wiki to make a point, because you already know the consensus of the discussion that will happen and all you are doing is wasting editors' time. The warning was less about what you were changing and more about how you were trying to change them. You've also been pointed in the past to a wikipedia guideline that neither encourages nor discourages diacritics and that specifically says you shouldn't over dramatize the situation. I think the two ANI reports you have made which went against you and the community discussion at the hockey project that went against you were clear indications that you are over dramatizing and that there is a different way of interpreting UCN than you have interpreted it. As has been mentioned, a large portion of editors think names with and without diacritics are still the common name when it comes to UCN. I would also note EN is a guideline not a policy; as such it is only a recommendation, not an absolute requirement, so it can't be "wilfully violated". -DJSasso (talk) 11:51, 17 May 2011 (UTC)
It is common to use diacritics in biographies. See also discussion above. --Piotr Konieczny aka Prokonsul Piotrus| talk 16:49, 17 May 2011 (UTC)

IMHO, diacritics should be deleted from all article titles on English language Wikipedia, as they're 'non-english' symbols. That's all I've got to say. GoodDay (talk) 23:39, 17 May 2011 (UTC)

This may be an English language encyclopædia, but it covers foreign subjects too. (the clue is in the word "encyclopædia"). Sometimes those foreign subjects use foreign languages, or at least diacritical marks. If foreign content offends you, the best option may be to create a fork, call it anglopedia, and delete all the foreign articles. bobrayner (talk) 21:51, 21 May 2011 (UTC)

As already discussed, Wikipedia's practice is to use diacritics and other modified letters in most cases where the name is "originally" written in a Latin alphabet, and to apply a standard transliteration system from non-Latin alphabets. Exceptions are made when there is some clearly established usage in English (as with Zurich, Napoleon and Tchaikovsky), and in some other cases where some particular issue arises (as with "Djokovic" for recognizability, or e.g. where a person has become a naturalized American and started spelling their name differently). I see no reason to change any of this (though I'm not too keen on these Croatian and Icelandic letters that impair recognizability) - it seems to be the right approach for an encyclopedia to take - we don't have to follow the style used by a majority of sources, when the minority style better serves our purposes (which in this case is to convey information). Neither is right or wrong, English or un-English - there is just a choice of styles, and I think overall we've made a good choice.

About this page, I think it should be edited (as I've attempted to do in the past) to more clearly and accurately describe how we actually do things as regards modified letters; and more globally, the page should also be renamed to simply "WP:Use English", and be refactored so as not to concentrate on article titles, since the principles it expounds are not specific to titles.--Kotniski (talk) 10:27, 18 May 2011 (UTC)

Wikipedia's practice is to use diacritics and other modified letters in most cases where the name is "originally" written in a Latin alphabet, and to apply a standard transliteration system from non-Latin alphabets. No, it isn't. That is the practice of certain nationalists, who are uncomfortable with the idea that English may spell Foolander names differently than Foolander does.
But when seriously considered, our practice is much simpler: do what reliable sources on the subject in English do. Septentrionalis PMAnderson 02:13, 19 May 2011 (UTC)
I think we should look at Britannica as a model for this case. It uses diacritics in the title (and first sentence) when it's useful (for example for pages about middle east and Asia). Alefbe (talk) 05:50, 25 May 2011 (UTC)
Again, not really. That applies when reliable English sources clearly do do something (i.e. in the Napoleon-type cases I mentioned). But in the (numerically) vast majority of cases, English has no established practice one way or the other, and in those cases what we do ("nationalist" or otherwise) is what I described. Just as it makes sense to use one transliteration system consistently for Russian or Chinese (except for names like Tchaikovsky where a particular English usage has become established), it makes sense to be consistent in the way we represent Latin-alphabet names, and the way we do it (not for nationalist reasons, but - I assume - for encyclopedic reasons) is to preserve the modified letters. --Kotniski (talk) 06:20, 19 May 2011 (UTC)
No, it is a practice of the vast majority of editors, whereas a small minority of purists argues against them. Which is proven by the fact that the vast majority of articles that could use diacritics use them. Unless you will argue that suddenly, Wikipedia consensus fails on diacritics, or that most editors are "nationalists", it is obvious that for the majority, diacritics are fine. --Piotr Konieczny aka Prokonsul Piotrus| talk 16:02, 19 May 2011 (UTC)
No, it is the practice of a small, but dedicated, segment of editors: those who are not native speakers of English but of two or three European languages (if Iceland is now part of Europe); whenever a wider pool of editors has been appealed to, these efforts have failed. We do not speak translationese; we adapt names as English has in fact adapted them. Septentrionalis PMAnderson 20:05, 19 May 2011 (UTC)
Can I just point out that calling us "nationalists" and demonstrating plain geographic ignorance isn't really constructive. We're supposed to be writing an encyclopedia of human knowledge, including topics outside of the English speaking world and that which lies within the interests of the average American or Brit (English is, after all, an international language spoken by one and a half billion non-native speakers as well as us natives). - filelakeshoe 20:14, 19 May 2011 (UTC)
No, you may not.
There is a Wikipedia for non-fluent readers of English; there is also a Wikipedia for each of the European languages concerned; there is no other Wikipedia for anglophones. Those who prefer to write in Foolander instead of English should go edit one of them. Septentrionalis PMAnderson 21:09, 19 May 2011 (UTC)
A self-assured and self-referencing statement about "what English is", but until such time as serious English sources stop using diacritics (and it looks like they don't intend to, as repeatedly noted), or English gets itself a regulating body that says "don't use diacritics", the claim is, well, bogus. It only tries to legitimize a frivolous diacritics-hurt-my-eyes claim - mirroring the inflammatory insinuation above, I urge any such users who prefer uncomplicated spellings of neologisms to "go and edit" the Simple English Wikipedia (which, btw, is the other wikipedia for anglophones). Dahn (talk) 21:34, 19 May 2011 (UTC)
Okay then, let's go and delete Polish language. That article is full of "Foolander", and seems to be impossible to rewrite without it. That kind of foreign gibberish has no place on the English Wikipedia. - filelakeshoe 21:38, 19 May 2011 (UTC)
"That is the practice of certain nationalists" – an interesting theory. I was unaware that Timothy Snyder, who writes "Radziwiłł", is a Polish nationalist. That Michael Beckermann, who writes "Dvořák", is a Czech nationalist. That Peter Siani-Davies, who writes "Mănescu", is a Romanian nationalist. That Marcel Cornis-Pope and John Neubauer, who write "Tuđman", are Croatian nationalists. That Kevin O'Connor, who writes "Kārlis Ulmanis", is a Latvian nationalist. That Alan Axelrod, who writes "Mátyás Rákosi", is a Hungarian nationalist. Very enlightening indeed. - Biruitorul Talk 23:04, 19 May 2011 (UTC)

You realize, Dolovis, that you are now violating WP:CANVASS by only inviting people who edited this page and might agree with you, right? It would be fine if you were inviting both sides of the discussion but it appears you are skipping people who are not likely to agree with yourself. -DJSasso (talk) 17:11, 19 May 2011 (UTC)

Not if he invites everybody. Septentrionalis PMAnderson 20:05, 19 May 2011 (UTC)
Which is why I said if he invited both sides of the discussion it wouldn't be an issue. But when I wrote this he had skipped over a few recent editors who clearly looked like they had no problem with them. When I last looked he still hadn't notified them. However, it's not worth arguing about; I was just letting him know. -DJSasso (talk) 11:41, 20 May 2011 (UTC)
Djsasso is simply wrong on his point. I skipped over no one. Every editor who had contributed to Wikipedia talk:Naming conventions (use English)‎ was given the appropriate notification of this discussion. Dolovis (talk) 00:57, 27 May 2011 (UTC)
Did you miss the point where I said "when I wrote this"? You since went on to invite the ones you had skipped. -DJSasso (talk) 01:55, 27 May 2011 (UTC)

I was asked to add my opinion to this discussion by Dolovis, but I'm not sure I agree with their interpretation.

What I understand from reading the project page is that our procedure is this:

  • If the title has been "imported" into English as evidenced by reliable English-language sources using a modified spelling (e.g. dropping diacritics), follow the spelling in those sources (which will usually be diacritic-free).
  • If the title has not been "imported" into English because it is not discussed in reliable English-language sources, use the spelling of the original language (or the closest transliteration in Latin script, including diacritics)
  • Make a redirect from the spelling without diacritics to the one with diacritics, if necessary.


We never strip diacritics on our own initiative with the rationale that English doesn't have them. I agree with that idea, that we are simply using correctly-spelled words from other languages when there is no English word, proper noun or otherwise. That follows common usage, while still allowing people who don't input diacritics to find the article they are looking for through a redirect. We don't use non-Latin scripts because that's just too bonkers for English speakers to mentally pronounce, and we have a good alternative in transliteration. -- Beland (talk) 17:23, 19 May 2011 (UTC)
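
The redirect step in the procedure above can be sketched in code. This is only a minimal illustration in Python, not a description of how MediaWiki actually handles redirects; the function names `strip_diacritics` and `redirect_title` are hypothetical. It assumes that "dropping diacritics" means removing Unicode combining marks after canonical decomposition, which leaves letters such as Ł, ß, ð and þ untouched because they have no decomposition.

```python
import unicodedata


def strip_diacritics(title: str) -> str:
    """Remove combining marks from a title (hypothetical helper).

    Assumption: 'without diacritics' means the base letters that remain
    after NFD decomposition once combining marks are discarded.
    """
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so the result is in the usual NFC form.
    return unicodedata.normalize("NFC", stripped)


def redirect_title(title: str):
    """Return the diacritic-free title to redirect from, or None if no redirect is needed."""
    plain = strip_diacritics(title)
    return plain if plain != title else None
```

Under these assumptions a title such as Gödöllő yields the plain form Godollo, while Piłsudski keeps its ł, so letters of that kind would need a separate, hand-made mapping.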

Bravo. Precisely so. I would only add that the second case, where there is no English usage at all, is less often true of actually notable subjects; we should not have auto-generated articles on every obscure hamlet in Fooland in the first place. Septentrionalis PMAnderson 20:05, 19 May 2011 (UTC)

  • This was brought up years ago, and for the love of me I don't understand what the point is, and why it itches so much to overturn consensus on something that touches millions of pages by now. Seriously: while one may argue that the diacriticized names of people from the Anglosphere are irrelevant, on a case by case basis (though I don't ever see the same point being made about Frenchmen naturalized in England, with their acute accents, and Poles in America with their Ł), it is utterly irrelevant to the wider world. And, no, it doesn't follow that the consistent failure to add diacritics in print in many imperfect and inconsistent American media sources is the established "English usage". It merely speaks volumes about the fact that mediocrity in information will produce mediocrity in culture - something that wikipedia should preferably stay away from. Just what is the concern here? That readers will actually learn something, even though they might not want to? Pass. Dahn (talk) 17:31, 19 May 2011 (UTC)
    • Well said. Particularly in the old sources, diacritics were omitted because they were hard to insert into regular typewriters and printing presses. Wikipedia is a 21st century publication, and there is no reason for us not to give the most correct information (correct names with diacritics, accents, and so on). --Piotr Konieczny aka Prokonsul Piotrus| talk 17:49, 19 May 2011 (UTC)
      • I don't know about your keyboard, but mine has exactly 26 letters on it. I wouldn't know how to type some random foreign squiggle, or even what it was called so I could look it up. Gigs (talk) 17:59, 19 May 2011 (UTC)
        • Which would be why at the top of the edit box of a page there is a link called special characters: if you are someone who can't add them or doesn't know how to add them, you still can. As for people searching for the information, that is why we redirect from the opposite version. -DJSasso (talk) 18:04, 19 May 2011 (UTC)
        • I come from a culture which makes heavy use of diacritics, and I edit heavily in fields related to that culture. Yet I never did install me the necessary keyboard: I have the same "26 letters on it", and, on wikipedia, I always use the special characters set from the edit window. Give it a try, it won't kill you. Dahn (talk) 18:08, 19 May 2011 (UTC)
        • You don't have to use the diacritics. Redirects will take you where you want to be, and others will add them to your texts if you don't care for copy/paste, characters or alt software keyboards settings. --Piotr Konieczny aka Prokonsul Piotrus| talk 17:17, 20 May 2011 (UTC)

First, I am simply appalled that DJSasso used up so much space attacking an editor instead of responding to the question. That said, I must side with those who feel that diacritical marks have no place in an English encyclopedia. I have seen common American family names like Hernandez spelled with an accent, even though accents are not used in the United States. I imagine the same convention exists in Commonwealth countries. These accents cluttering up WP articles are simply added by people who read a foreign language and who think that a word looks odd or peculiar without the diacritics. Well, to an English-reader, the word looks weird and absolutely wrong when it DOES have diacritics. I don't blame the Slovaks for thinking a name looks really funny without the marks, but, honestly, it looks even odder WITH them to the vast majority of readers of this encyclopedia. That's why we have a rule, or anyway a policy, that requires article titles to be in English — a policy that is sadly often not being followed. Sincerely, your friend, GeorgeLouis (talk) 17:44, 19 May 2011 (UTC)

Actually I was responding to an attack on me. His post was indicating he was told things he was not told and was phrased in a way as to attack me. In order to present both sides of the discussion he is referencing I provided the other side of the situation. In no way was what I wrote an attack. -DJSasso (talk) 17:50, 19 May 2011 (UTC)
  • My own opinion is that this is the English Wikipedia, and that our articles should be titled with the most common spelling that is used in English sources. Especially when it comes to German or Icelandic characters that are unrecognizable to most English speakers, it's very important to stick with recognizable English characters. This is in accordance with what sources use, and makes searching and linking more convenient for everyone. --Elonka 17:48, 19 May 2011 (UTC)
Presumably, Encyclopedia Britannica is also an English language (well, British, I guess) encyclopedia and it has no problem with diacritics, uses them frequently, and I suspect is not written by "nationalists" (so let's drop that red herring/ad hominem right away). Volunteer Marek (talk) 17:52, 19 May 2011 (UTC)
Britannica, for the record, can be confusing. The entry on Lech Wałęsa uses diacritics in title and in the article, but not in the page heading. Same with Józef Piłsudski. Note that this affects google results - if you google for britannica + wałesa / piłsudski Google will suggest Britannica DOES NOT use diacritics, which is INCORRECT (vide - those pages). For Hugo Kołłątaj, it is the same, note that 1911 version not only did not use diacritics, it used a grammatically incorrect way to spell his second name... (wikisource:1911 Encyclopædia Britannica/Vol 15:14). For Słowacki, diacritic is used in modern edition, 1911 misspells his first name (and ignores the diacritic: wikisource:1911 Encyclopædia Britannica/Poland). Some poor page building aside, it is obvious that Britannica supports diacritics. There is an advantage, occasionally, to being written by academics - people who get used to seeing diacritics, and don't treat them as "OMG I don't see them on my TV!" problem... (sigh - in the end not using diacritics always comes back to the dumbing down problem... Wikipedia should not be dumbed down, if you don't like diacritics, go to Simple Wikipedia). --Piotr Konieczny aka Prokonsul Piotrus| talk 18:16, 19 May 2011 (UTC)
I think it is somewhat misleading to call Julius a "misspelling" of Juliusz. I would rather call this an anglicisation. JoergenB (talk) 19:30, 19 May 2011 (UTC)
  • I was solicited to comment here, but I don't particularly remember what my previous involvement was with this. This guideline is still in force, and Dolovis is essentially right. It's not mediocrity to transliterate characters that don't mean anything to English speakers into Latin characters, it's a normal and expected practice. This is the English Wikipedia, if you want to use non-English characters, go somewhere else. Gigs (talk) 17:57, 19 May 2011 (UTC)
  • I like to listen to Motörhead as well as Frédéric Chopin. Marie Curie (née Maria Skłodowska) was a great scientist. I'd like to visit Ü eventually. And I think the claim that English or this encyclopædia does not use funky characters is at least naïve. Yes, many, even usually reliable sources drop diacritics. But very often the best and most scholarly sources keep them. Unless there is a strongly established usage in a particular case, I'd stick with the original spelling. --Stephan Schulz (talk) 18:05, 19 May 2011 (UTC)

I was asked for my opinion by Dolovis, so here's what I think should be the policy:

  • About diacritics on foreign names, I agree with Kotniski. If the title was originally written in a Latin alphabet, then keep the diacritics. English borrowed lots of words keeping their diacritics, like naïve, résumé, führer, etc.
  • Every article should be written in the Latin alphabet, which means that .срб, .рф should be moved to the transliterations; there are also plenty of others, like (ε, δ)-definition of limit
  • There should *always* be a redirect from a title that can be written by a normal English keyboard. (If I see somewhere printed (−2,3,7) pretzel knot and I try (-2,3,7)_pretzel_knot, I won't get to the article.)

bogdan (talk) 17:59, 19 May 2011 (UTC)
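
The keyboard point can be checked mechanically. As a rough sketch (the helper name `untypable_characters` is hypothetical, and "normal English keyboard" is assumed here to mean plain ASCII), the following Python lists the characters in a title that would call for a redirect, together with their Unicode names so they can be looked up:

```python
import unicodedata


def untypable_characters(title: str):
    """Return (character, Unicode name) pairs for each distinct non-ASCII character.

    Assumption: a character counts as typable on a normal English
    keyboard if and only if it is ASCII.
    """
    seen = []
    for ch in title:
        if not ch.isascii() and ch not in seen:
            seen.append(ch)
    # unicodedata.name falls back to "UNKNOWN" for unnamed code points.
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in seen]
```

For the pretzel knot title, such a check would flag the printed minus sign as a character distinct from the hyphen-minus found on the keyboard.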


Here is the article I referred to above. The politician normally spells his name without the accent. Mike Hernández. Yours, GeorgeLouis (talk) 18:10, 19 May 2011 (UTC)

Case by case basis, George. I can see how an American politician who did not use diacritics in his own name shouldn't have them in his name. But consider that non-English people use them. Why should the name of Józef Piłsudski be "dumbed down" to Jozef Pilsudski? It's not like we don't have technical means to display the diacritics, or create redirects. And there are plenty of English language publications that use diacritics (in the context of this person), up to and including Britannica. Nobody is saying "we should always use diacritics". What I am saying is that we should use them, unless there are good reasons not to (like, in the case you mention, the subject not using them). --Piotr Konieczny aka Prokonsul Piotrus| talk 18:21, 19 May 2011 (UTC)
  • Yes, WP:EN and WP:UCN are still in force, but neither of them ever says "don't use diacritics". That would be stupid. The rule is to use common English names when they exist – and this rule works great for the names of countries, cities, books, and movies that may have a common English name that's distinct from their local/native name. But with people it's different. People don't have different names in English than in their native language (with the exception of the Pope and some historical monarchs like the various Kings John of Portugal or Philip of Spain - note that the modern monarch of Spain is called Juan Carlos in English, not John Charles). So, since there is no such thing as an English "translation" of a name, we use the spelling that the person uses him/herself (or a romanization of it, since articles are always written in the Latin alphabet). Some people may drop their diacritics when they move to an English-speaking country (like all the Americans named Gonzalez, not González) and that's fine, but people who do use their diacritics should not be deprived of them in their Wikipedia article. (The same also applies to Latin-alphabet letters not found in English, like ß, ð, þ, and ə.) There is absolutely no reason Wikipedia article names should be constrained by the decades-old limitations of the ASCII character set or old typewriters, or by the narrow-mindedness of xenophobes who freak out whenever they see a "funny foreign squiggle" at English Wikipedia. —Angr (talk) 18:13, 19 May 2011 (UTC)
  • What Angr said. — kwami (talk) 18:24, 19 May 2011 (UTC)

(ec) The thing I find most disdainful about this discussion is that you anti-diacriticists actually seem to think removing diacritics from someone's name makes it English. It doesn't! There is no English form of Marek Židlický; the English form of Marek Židlický, if there must be one, is "Mark Fork". Language doesn't work like that. English imports words and names from other languages all the time, and if one comes into common enough use, then the form changes, e.g. with delicatessen. It even happens with people too, like for instance Nicholas Sarkozy or Andy Warhol. When writing about people who are exclusively notable for actions in Slovakia, the chances are there isn't an English form of their name. There might have been in the 18th century if word of them somehow made it across the channel, but as the world becomes more globalised names are far less commonly translated.

With regards to the "use what English sources use" argument, the problem is that these hockey players whose names Dolovis keeps trying to "anglicise" aren't being written about in encyclopedias. They're being written about in news sites, which would never check something so trivial as this, and hockey fan sites, which are great reliable sources for statistics, but not for how to spell someone's name properly. If we're going to go so far as to suggest that what the media say is always correct, then why the hell stop at renaming Tomáš Ujfaluši to get rid of those offensive diacritics, we can also add that his name is actually pronounced /ʊdʒ.fə.lu:si:/ since that's how John Motson says it, therefore that's "English". According to Dolovis' argument.

I also love how "readers of foreign languages" are getting sniped at. Obviously, people with knowledge irrelevant to the average American aren't welcome to push their fringe POV here. I would love to know, aside from making people who've never learned another language in their life twitch and go "zomg wut is this foreign squiggle?!!!!", what purpose on Earth does leaving diacritics out of someone's name achieve? - filelakeshoe 18:27, 19 May 2011 (UTC)

Most of those titles show what once happened in English, as in most European languages, back in the days of dominant cultures, little alphabetization, typographical restrictions, institutionalized xenophobia and charming quirks. As for Ulam: the article could just as well be moved to the "with diacritic" variant, since the current title is arguably erroneous; in any case, it is not a global pattern, nor a global decision. Dahn (talk) 21:58, 19 May 2011 (UTC)
Fuseli and Handel used anglicized forms of their name in their lifetime. Neither of the two simply dropped the diacritics. I don't know what Van Buren is doing there - isn't this a Dutch name that was always spelled this way? --Stephan Schulz (talk) 22:06, 19 May 2011 (UTC)
The spelling in the Netherlands would be Martin(us) van Buren. What both extremes in this discussion fail to recognize is that this guideline is opposed to both global patterns; saying that reality fits no global pattern is true, and supports our present practice: see what the sources do in each case. Septentrionalis PMAnderson 22:18, 19 May 2011 (UTC)
If your answer is specifically directed at me: I have begun by saying that there is no global pattern, in my first post on this thread; but I also feel that to claim "English doesn't use diacritics" based on sources which discard them because they just don't use diacritics at all (for practicalities that needn't concern us), when the most professional sources do, is a point that discredits itself. I also don't think that invoking willing, crude, and ancient, Anglicizations of names that relate to practices from another era (the same era when, for instance, French Francized all neologisms, some of which have stuck as such), counts as a practical solution. Forget ancient names and folkloric spellings, let's talk modern. In modern spellings, the vast majority of them, there is absolutely no reason to discard the diacritics, even if most sources may; we care about the sources that are either most professional or closest to the subject. In the case of hockey players, we may have neither, but, at the very least, if one or the other is shown to exist, or if the names with diacritics are shown to exist in at least one independent source, the natural assumption is that the sources not using diacritics simply don't use them out of ignorance or indifference. Either way, using diacritics on wikipedia, wherever diacritics are the natural or presumed grammatical choice, should be the absolute rule, not the absolute exception. Dahn (talk) 22:47, 19 May 2011 (UTC)

My opinion, for what it is worth:

  • The discussion supra is slightly confusing two subjects. The initial discussion was about ice hockey player names. For these, there has existed a policy/guideline/proposal page, which was degraded to "historical". The main arguments (from both sides) are about general usage of diacritics, and of "commonly used" spellings versus "correct" ones. Here, there are very clearly marked policies, WP:AT and WP:EN, which by no means are history marked. The writers concerned with the article titles in general should relate to these policies, e.g., by suggesting changes, if they think that they are not in accordance with actual standard usage.
  • I think it is artificial to distinguish between common usage of letters and common usage of diacritics. Besides, the "diacritical" signs actually often distinguish separate letters in other languages. As far as I understood the argument, some users think that lots of appearances without diacritics could motivate that we write the article name with "the same letters, but including the diacritics". However, e.g., Zurich is not "the same spelling as Zürich, but without the diacritics".
  • Another argument I don't like is the pedagogical one. We should not use diacritics, just in order to teach our readers the proper way to do things. Actually, I think that this idea partly has to do with a misunderstanding. There are numerous people who believe that "you never, never 'translate' proper names", and therefore think that they encounter mistakes whenever they become aware of exonyms.
  • I think that the "least surprise" principle is a good general policy, and that it should be retained also as concerns diacritics. There are sometimes good reasons to deviate from this policy; in particular, for consistency. Thus, when an authoritative body has decided on a certain naming convention for a natural category of objects, and in most cases this convention coincides with the most common usage, then one could well decide to follow the convention consistently. Cf. the Aluminum versus Aluminium discussions. We could very well decide on (or respect an earlier decision on) using the native spelling for all hockey players, if the convention in itself is less surprising than sometimes using diacritics, and sometimes not.
  • However, my principal opinion is that the application of the "least surprise" principle is easier for native English speakers, and others with a long permanent residence in the English speaking world. JoergenB (talk) 19:09, 19 May 2011 (UTC)
  • We should follow the common usage of English-language sources. Editors who wish to work in Polish, Czech, &c. should please do so in the Wikipedias for those languages. The sidebar links for those other languages will then provide the usage of those languages for those that want them. Colonel Warden (talk) 19:12, 19 May 2011 (UTC)
  • Can I just point out that if you're seriously suggesting journalists are reliable sources for how someone's name is spelt, there's quite a strong case to move a certain article to Obama Bin Laden right now. - filelakeshoe 19:30, 19 May 2011 (UTC)
  • Common usage in English may or may not include the diacritic. If anyone wants to ban diacritics altogether, I say No way José. Aymatth2 (talk) 19:54, 19 May 2011 (UTC)

English has a strong, established tradition of using diacritics in handwriting and competent typesetting, and of dropping them as an expedient, whether out of economy, technical deficiency, or ignorance. Certainly denying that they are a part of the language is either ignorant or obstinate (viz. rose/rosé, lame/lamé, resume/resumé, pate/pâté, and especially maté, a loanword made English by adding a diacritic). Saying that an e is no longer an e when an accent is drawn over it is playing word games.

And drop all the diacritics if you like, but you still won't convince me that Antonín Dvořák is an English name. Like it or not, when you write about the world you must use words and names from the world. Michael Z. 2011-05-19 20:45 z

Is Prague an English word? Is Antonin Dvorak lived in Prague an English sentence? It's the same question.
Now in that case, Antonín Dvořák lived in Prague has become relatively more common, sufficiently so that we have moved the article; as we have moved to Johann von Goethe, with a lower-case v. But nobody who wrote to be understood, instead of showing off, would write Antonín Dvořák lived in Praha - and we write to be understood. Septentrionalis PMAnderson 21:04, 19 May 2011 (UTC)
Prague is an English name. Dvorak (pron. “dvore-zhack”) is a Czech name, regardless of the orthography (although [August] Dvorak is arguably English). In English, we use a mixture of English and foreign diacritics, letters, words, and names. Fie on misguided aspirations to cleanse the language. Michael Z. 2011-05-19 21:36 z
So all this depends on a metaphysical distinction between the director of the National Conservatory in New York, and the educational psychologist in Wisconsin.
But it is not those of us who wish to follow English who want to cleanse her; that we leave to those who wish to ignore her, either to impose diacritics where she does not use them, or to omit them where she does. Septentrionalis PMAnderson 21:43, 19 May 2011 (UTC)
No metaphysics; usage. The difference depends on how the names entered and have been used in English. And the English, she gonna be jus' fine, with or without us. It's editors muddying the discussion with untruths that bugs me. Michael Z. 2011-05-19 21:52 z
That is what the guideline says. Septentrionalis PMAnderson 22:18, 19 May 2011 (UTC)

I have the impression that modern English book sources are more likely to use a diacritical spelling such as "Józef Piłsudski" than older ones. What is "most common" in English sources may thus be subject to drift. Regardless of what our current policy states, my preference for biographical articles is for them to have the same spelling as how the subjects themselves wrote their names at the end of the period for which they are known, provided that spelling uses a Latin alphabet. That would mean Józef Rotblat, but vanilla-Latin Stanislaw Ulam. For some languages the undiacritical version of a name is turned into an annoying vulgarism, like for Turkish that of poor Mr. Ahmet Şık. --Lambiam 23:04, 19 May 2011 (UTC)

For what it's worth, I believe that we should have a policy that diacritics are simply omitted from article titles. It would cut out so much wasted time on WP:RM and similar forums, and I can see no downside. Andrewa (talk) 00:00, 20 May 2011 (UTC)

  • But what about Häagen-Dazs or Touché? Touché! Aymatth2 (talk) 01:30, 20 May 2011 (UTC)
    • I see little benefit in having the diacritics on even these article titles. There is no possibility of confusion or ambiguity if they are omitted, it is perfectly valid and clear English. On the other hand, if we have them, we continue the endless unproductive discussions on where to draw the line. Been there, done that. Let's cut our losses and simply abandon diacritics in article titles. They ain't worth it. Andrewa (talk) 03:13, 20 May 2011 (UTC)
      So you'd rename Öre, Øre, Pâté and Rosé to what? Michael Z. 2011-05-20 05:50 z
      With the Ore articles either a merge, disambiguation or an exception would be necessary, any of them acceptable. With the others, just drop the diacritics in the fullness of time. No great urgency. Andrewa (talk) 12:55, 20 May 2011 (UTC)
      You do realise rose has a different meaning to rosé, right? - filelakeshoe 13:28, 20 May 2011 (UTC)
      Yes, see below. Andrewa (talk) 15:49, 22 May 2011 (UTC)
  • You can see no downside to eliminating diacritics? Really? What about accuracy? Why should we write about Lodz and Poznan and Elblag when their correct spelling is, in fact, Łódź and Poznań and Elbląg? Sure, there may be discussions from time to time over precisely what articles get accents, but overwhelming consensus favors their retention. Any respected newspaper uses diacritics for French or German nowadays, so dropping them would dumb us down to a striking degree. As for more "exotic" languages like Albanian or Lithuanian, diacritics for words in those languages are also becoming more common in scholarly publications. If we're slightly ahead of the curve now, we'd be totally behind after implementing this idea.
    • Wikipedia is not a crystal ball. At present, the undiacriticised versions are quite normal in English. Disagree that it would dumb us down even slightly. And our policies are actually to be behind the curve, rather than in front of it. Andrewa (talk) 12:55, 20 May 2011 (UTC)
      • What does this have to do with WP:CRYSTAL? Diacritics have been used in foreign alphabets since their standardization (often at least 150 years ago), and have become increasingly standard practice in rendering foreign names in English in the last 20 years or so. We're not predicting Prešov will have a diacritic; we're reflecting the fact.
      • It depends on what you mean by "quite normal". Take for instance the French Prime Minister. You'll note that the Washington Post, Wall Street Journal, New York Times, Telegraph, Guardian, Independent, Economist and Irish Times all use the cedilla in his name.
      • For more "exotic" languages like Czech or Polish, diacritics are becoming quite standard in professional publications. So yes, it would dumb us down to deliberately eliminate a correct orthographical feature and one in widespread English usage. If this were 1960, when the only diacritics you'd see in English publications were é, è, á, â, î, ç, ñ, ö and ü, you might have a point. But usage in general has shifted to a more diacritics-inclusive stance. - Biruitorul Talk 16:36, 20 May 2011 (UTC)
  • What about diacritics' use as a pronunciation guide? If you know the basics of a foreign language's phonetics (not a requirement for reading en.wiki, but many of our readers will know them), diacritics can be vital. Take Hungarian, where á and a sound quite different. If you see Salgotarjan or Hajdunanas or Bacsalmas, you may have no clue what kind of as are involved. But Salgótarján, Hajdúnánás and Bácsalmás make it all clear. Or what about Godollo? Without diacritics, there could be about nine ways of pronouncing that; knowing it's Gödöllő gives vital information.
    • Thanks for raising this. Pronunciation belongs in the article text, not the title. Andrewa (talk) 12:55, 20 May 2011 (UTC)
      • Let's not split hairs here. Of course IPA pronunciation belongs in the text. But diacritics are part of a subject's name, and have the additional function of aiding in pronunciation. There's no plausible reason for deliberately misspelling a name in the title (and yes, omitting diacritics is a form of misspelling) but going on to give the correct spelling in the lead. (Well, there may be occasional exceptions like Zbigniew Brzezinski, a naturalized US citizen who doesn't use the diacritic in his public life, but they're few and far between.) - Biruitorul Talk 16:36, 20 May 2011 (UTC)
        • I'm not intending to split any hairs. Omitting the diacritic is not a misspelling, and that's a very important point IMO. It is a valid orthography, and correct within itself. The question is, would Wikipedia be improved by adopting it? And I think we need to consider very carefully the possibility that it would be. Agree that particularly with regard to living people we need to consider their wishes, but I think that if the article lead gives their preferred version of the name that should be sufficient. We don't automatically follow the capitalisation of band names and trademarks, so why not adopt a similar policy with regard to diacritics on personal names? Andrewa (talk) 18:38, 22 May 2011 (UTC)
          • Omitting the diacritic is not a misspelling, and that's a very important point IMO. It is a valid orthography, and correct within itself.[citation needed] When one learns French, or Czech, or probably Polish, or any of the other languages employing Latin-based scripts, one is taught that there are more than 26 letters in the alphabet. These cannot be uninvented just because "English doesn't have these so it must be substitution". We don't automatically follow the capitalisation of band names and trademarks We do, actually. Check out iPad and Motörhead. ;-) --Ohconfucius ¡digame! 02:35, 22 June 2011 (UTC)
  • And even as a native speaker of Romanian, I wouldn't know how to pronounce the names of certain places not known to me. I'd know Niculesti is really Niculeşti because I know of the -eşti suffix (not that anglophone monoglots would have a clue), but what about Cosesti? Is it Coşeşti or Coseşti? Both are plausible; the former happens to be correct. Then again, what about the other Cosesti? Is it Coşeşti too or is it Coseşti? It's Coseşti this time around. Or Posesti: it could be Poşeşti or Poseşti; it's the latter, but I wouldn't have known that without diacritics. Then there's Darabani: the natural tendency is to say Dărăbani, and one might well do that in the total absence of diacritics, but knowing they'd be there if correct ensures no such mistake is made. "No possibility of confusion or ambiguity"? Think again. - Biruitorul Talk 06:00, 20 May 2011 (UTC)
  • That's a bit destructive. The point of this discussion is to resolve the issue which is cluttering up RM. - filelakeshoe 09:43, 20 May 2011 (UTC)
    • Disagree it's destructive. Rather, it's taking a view that this policy would save a lot of work in the long run. It would affect a great many articles, and create a great many redirects which should be there anyway. In fact I think the work would be done very quickly, but I don't think it matters whether it is or not. Andrewa (talk) 12:55, 20 May 2011 (UTC)
      • Let's look into the issue a bit: as a hypothetical user, you don't (don't know how to) create an article under a name with diacritics, so in the worst case scenario you create it under a title of your liking, and someone moves it, slowly and patiently - as things tend to happen already. You are left with the following choices: a) ignore that this happened, since it cannot really be conceived as a move for the worse; b) actually learn how to do it yourself, and no real time will get wasted; c) resist the change for some obscure and contrived reason, and actually make everyone lose time over trumped up notions about how "English doesn't use diacritics". To accept diacritics as a rule cannot harm anyone; to propose mass moves of articles on a fancy, and arbitrarily turn the proper titles into redirects (reverting the practice that we used for years), does. The claim, that "it saves a lot of work in the long run" is a misleading hypothesis: the only work it saves is minor, and is performed by those who actually care about diacritics usage, i.e. the very people whose work you want to see questioned and overturned. You don't want them yourself? Fine, don't use them. But should you ever start an article on a topic with diacritics and for some reason can't copy-paste letters not on your default keyboard, someone will still move it to its proper title. It is what happens all the time. Someone resisting the changes in the name of "less work" is not an everyday occurrence, thank God: we'd all be wasting our editing lives on discussing that "less work". Dahn (talk) 13:36, 20 May 2011 (UTC)

It's all really simple: if the most common spelling in English uses diacritics, then the Wikipedia article should too. If the most common spelling in English doesn't use diacritics, then Wikipedia shouldn't either. Really, quite simple. Masterhatch (talk) 03:36, 20 May 2011 (UTC)

It's not really simple, because there's essentially a dispute about what constitutes a reliable source and how we measure the most "common spelling". If a hundred fan sites and newspapers spell a name without diacritics, for the likely reason that they don't have them on their keyboards or a special characters box like we do, but the person's official website and his hockey club's website spell it with diacritics, what's the common spelling? - filelakeshoe 09:43, 20 May 2011 (UTC)
Actually, it is really simple. The most common spelling is a case by case thing. And most common spelling can be found in published texts, both on and off the Internet. It is not out of laziness or lack of keyboards that publishers don't use diacritics; it is because that is the way it is done and has been done in English for a long, long time. Using diacritics looks just as foreign to most native speakers as not using them does to most non-native speakers. As for "his personal page", he could write anything he wants. He could write that the world is going to end in 2012! But that doesn't make it fact. I don't consider blogs or other "personal pages" to be reliable sources. Masterhatch (talk) 11:37, 20 May 2011 (UTC)
That's wrong on several points. English tends to drop diacritics in loanwords where possible, but many of them have stood the test of time. Cafe (“keyf?”), for example, just looks wrong, and resumé without diacritics is another word, with only two syllables. Poor-quality publishers who don't care to bother with correctness, hurried journalists, and commercial advertisers appealing to the indifferently-literate, are most likely to let proper orthography slip routinely. Michael Z. 2011-05-20 16:54 z
Actually self identification is what we are supposed to follow on Wikipedia for religion and nationality and the like. I don't see why it would be different when it comes to how they spell their name. Which to me is just as big a deal to a person as the other two. But yes for the most part it is laziness or lack of keyboards that cause publishers to not use diacritics if the publishers I have asked in the past are any indication. Most of them say it just takes too long to figure out the ALT codes to the different characters and English speakers don't seem to care if the names aren't quite correct. So to me that seems to sum up that it is a combination of lack of easy keys to push and a laziness of them to learn the codes to use them since there are no easy keys to push. Now of course this is an unscientific sample and was only a few I have talked to in the past but I think it sums up the issue pretty well. As for English doing it for a long time, well that is certainly a technological issue as it cost too much to have that many keys in printing presses for the various diacritics. -DJSasso (talk) 11:46, 20 May 2011 (UTC)
This misunderstanding of RS really bugs me. It's not as black and white as "X is or is not a reliable source" or "X is or is not a reliable source for article Z", it's rather "X is or is not a reliable source for statement Y in article Z". Yes, someone's personal page is not a reliable source for the date of the apocalypse, but it's a pretty reliable for what their name is and, similarly, how to pronounce it. Take Chuck Palahniuk and Bob Moog, whose names are more often than not pronounced differently to how they should be. That doesn't make the mispronunciation "common English usage" that we should document in an encyclopedia. In both these articles we source the pronunciation from self-published sources, because in this case, they are reliable. If you're really suggesting what the majority of newspapers write is always correct, then we really could have moved Osama bin Laden to Obama Bin Laden a few weeks ago. - filelakeshoe 13:26, 20 May 2011 (UTC)

I'm with what Stephan Schuld said above: I'd go with the most correct spelling if the only or major difference lies in leaving diacritics off the names. Only if even scholarly sources drop the diacritics, then Wikipedia should also use that name. (Disclaimer: I've never been too fond of WP:CN; I think "correct names" are more important than "common names".) —Nightstallion 07:07, 20 May 2011 (UTC)

Also, I agree with what Angr and Dahn said. —Nightstallion 07:10, 20 May 2011 (UTC)

Angr, Dahn, Kotniski and others said it perfectly. If the English Wikipedia has to be a modern encyclopedia of the 21st century, it has to use diacritics. Each day dozens of unrelated editors create hundreds of articles with diacritics. In that situation, screaming angrily about "pure" English Wikipedia without "foreign interference" is just a dying man's wish. - Darwinek (talk) 08:21, 20 May 2011 (UTC)

There seem to be three principal possibilities:
  1. We never use diacritics in names of persons.
  2. We always use diacritics in names of persons.
  3. We employ the presently existing policy of naming the articles in such a manner as to cause as little surprise as possible, also for biographical articles.
(With each principle, there is of course room for singular exceptions, due to other overriding concerns in a few cases.)
Some of you argue as if only the two first possibilities exist. Since I support the third principle, I really don't want you to forget the existing general policy.
As for arguments: The long-term trend in English seems to be to use fewer and fewer exonyms and translations of names, and more and more of the exact original name forms. Wikipedia should observe the trend and adjust to actual changes "out there", but not lead or initiate the changes.
Another thing: I repeatedly see reflections of the idea that you don't translate names. Traditionally, proper names were almost always translated. This is still the case in some instances.
For example, as to the Dvorac in Praha example supra, Pmanderson argues in a Wikipedia manner, reasoning about most common usage and least astonishment. On the other hand, Mzajac is convinced that Czech and English personal names are completely different objects. From the older point of view, Mzajak's opinion is completely wrong. The old ways still apply for popes, and to a lesser extent for royalty; but in older texts, they were ubiquitous.
Mzajak, is it your opinion that "Jan Pavel" is a Czech name that cannot be translated; "John Paul" an English name without any translation to other languages, "Giovanni Paolo" an Italian proper name, and thus untranslatable? Or are you of the opinion that the articles cz:Jan Pavel II, Pope John Paul II, it:Papa Giovanni Paolo II all should be renamed Ioannes Paulus II, since this is the only official Latin name of the late pope, and 'proper names never are translated'?
If this is your belief, then you are not alone in your misunderstanding. To-day, quite a lot of people seem to agree that Karel is (inter alia) a Czech name, Charles a different English name, Karl a different Swedish name, and Carlos a different Spanish name; and they don't understand why a Swedish king, who signed his name Carl in Swedish documents, and Carolus in Latin ones, is called Karl XII in Swedish, Charles XII in English, Karel XII. in Czech, and Carlos XII in Spanish. I notice that your articles cz:Karel XII. and cz:Karel II. Stuart do not even mention that the names of the kings in question were written in another manner in their original languages. However, at that time, it was as natural to translate proper names, as to translate concepts like "chair" or "king".
We are maneuvering in a situation, where there are older and newer conventions floating around. Most of us no longer believe that there is some kind of affiliation between people who happen to have the same first names; therefore, many people do not consider Marek and Marcus or Maria, Mary, and Miriam as translations of the same name. We look at the name as a more or less arbitrary label, and think that if the labels appear different, then they are different. Others have a more historical view. This is a matter of taste, not of right or wrong; whence dominating actual usage should be our principal guide in naming biographical articles. JoergenB (talk) 10:51, 20 May 2011 (UTC)
(That's “Mzajac.”) I don't think they called the composer “Tony Dvoh-rack,” even in the old days, so perhaps the way names are used is not so simplistic. Michael Z. 2011-05-20 16:54 z

I've found a note on my talk page asking my opinion on this subject, so here it is:

  • Whenever one (possibly non-native) orthography is dominant throughout the English-speaking world, use that. For instance: Dublin, not Baile Átha Cliath; Moscow, not Moskva and not Москва; Prague, not Praha; Warsaw, not Warszawa; Lisbon, not Lisboa; Copenhagen, not København; Munich, not München; Geneva, not Genève and not Genf; Brussels, not Bruxelles, Brussel or Brüssel.
  • Whatever orthography is used for the title of the main article, all other known spellings (most especially including, if different, the native spelling) should redirect to it.
  • In some cases there are different spellings; then use your best judgment: Dvorak or Dvořak? Or the former for computer keyboards and the latter for music? — Peking or Beijing? — Burma or Myanmar? — Białystok or Bialystok? — Milosevic or Milošević?
    In any case, be aware that the convention may be different in English-speaking countries other than yours (or, if English is not your native language, than the one closer to you).
  • If there is no overwhelmingly majoritarian spelling used throughout the English-speaking world, then no flame wars please: leave the article where its creator put it unless the spelling (s)he chose is really outlandish, not only in your opinion, but also according to the customs of English-spelling nations other than yours.
  • If there is no received English name and the native spelling doesn't use the Latin alphabet, then a Latin transliteration should be used. Which transliteration? Here again, use your best judgment.
  • The lead paragraph of the article should mention the English name (if any), the local name, other names (if any) in different languages (if any) of the same country, probably even, for former colonies, the name(s) (if any) in the former colonial language(s). All these with diacritics if any.

Tonymec (talk) 10:10, 20 May 2011 (UTC)

I'm tired of English Wikipedia being treated as World Wikipedia. The use of diacritical marks is to provide clarity, but we cannot expect speakers of English to be familiar with the use of diacritical marks in all languages. To say that a certain spelling (Łódź and Poznań and Elbląg) is "correct" is only to say that it is correct in its native language. This is not different from using Japanese characters to spell Japanese names because it is "correct". These spellings are impenetrable to the vast majority of English speakers and therefore entirely unhelpful, which is the opposite of their purpose. English publishers traditionally allow only a handful of European diacriticals, particularly those found in Häagen-Dazs and Touché. That's all English Wikipedia should allow in article titles or bodies. But I'm all for showing the source language spelling in parentheses at the beginning. --Tysto (talk) 14:15, 20 May 2011 (UTC)

Tough. English Wikipedia is very much the international Wikipedia, written and read by many who are not English native speakers. While we of course should strive for best English prose, it is high time to realize that this is no longer the 19th or 20th century, where diacritics were too difficult to use because of typesetting. Wikipedia is 21st century, we have redirects, and we can use the correct spellings of words from languages other than English, be it café, Düsseldorf or Wałęsa. This is what other encyclopedias do (ex. Düsseldorf in Columbia, Encarta or Britannica). I'd really love it if somebody would tell me why they are "incorrect"... --Piotr Konieczny aka Prokonsul Piotrus| talk 17:31, 20 May 2011 (UTC)
Those who want a World Wikipedia should go ask Wikimedia for an encyclopedia in pidgin English; in the mean time, this encyclopedia is supposed to be in actual literate English. This entire thread is as destructive of that end as it would be if anglophones were to insist that the Czech and Polish and German wikipedias use Prague and Warsaw and California "because the world does". Septentrionalis PMAnderson 16:28, 21 May 2011 (UTC)
Bydgoszcz, Brno, Grzesik and Cvrk are all probably "impenetrable" to the vast majority of English speakers too. What are the English forms of those? - filelakeshoe 14:59, 20 May 2011 (UTC)
Szczebrzeszyn... --Piotr Konieczny aka Prokonsul Piotrus| talk 17:31, 20 May 2011 (UTC)
Now if that were a Czech city it would probably be named Štěbřešín, and according to some in this discussion massively offensive! I'd be interested to see which out of that and the Polish is easier for an English monoglot to read. - filelakeshoe 21:05, 20 May 2011 (UTC)
They're not more opaque than trough, tough, though, thought, through, thorough, bough, and hiccough. Although English has a tendency to drop unnecessary diacritics (as in hôtel, coöperation), it also has a counter, etymological tendency to retain orthography, whether it “makes sense” or not. So it's easy to read douche and touché, rose and rosé, waif and naïf, chafe and café.
Those calling for eliminating English diacritics are being selective in their desire for orthographic reform. Michael Z. 2011-05-20 18:21 z

And what about “foreign” English names, like Emily Brontë, Seán Cullen, Sinéad O'Connor, Zoë Wanamaker, and Renée Zellweger? Michael Z. 2011-05-20 19:19 z

Easily changed to Emily Bronte, Sean Cullen, Sinead O'Connor, Zoe Wanamaker and Renee Zellweger. GoodDay (talk) 20:25, 20 May 2011 (UTC)
GoodDay, I don't think you got the point. We are sitting here debating how using diacritics is supposedly not backed by English sources - an idea which has now been refuted a number of times around; the reductio ad absurdum shows that even established English names, originating in English usage, are known to have used diacritics. There is no question of changing these article titles, particularly since this is how the people in question chose to spell their names, without asking Wikipedia users if they were right to do so, or if it was good English to do so. It is irrelevant whether changing these titles is "easy" (it isn't even that, btw), since it is also pointless, whimsical, carried by circular argumentation, and anomalous.
Now, what the Luddites were doing was also remarkably easy, but I don't think that was ever a good reason to become a Luddite. Dahn (talk) 20:40, 20 May 2011 (UTC)
Yeah, GoodDay, the discussion is about changing non-English names. Why are you suggesting changing English names? Michael Z. 2011-05-21 02:49 z
I don't think the contribution in question misses the point at all.
But more to the point, there is a serious proposal here to drop all diacritics from article titles. It may or may not get sufficient support. It's up for discussion. Andrewa (talk) 16:01, 22 May 2011 (UTC)
I can't take this proposal seriously until someone clearly articulates it, and sets forth some reasonable justification for it. There seem to be more editors blindly, and incorrectly, arguing that all diacritics are “non-English,” and all should be removed, but I think that the original proposal seemed to be restricted to diacritics in “non-English forms,” which is a different thing. The proposal seems to be lost in a lot of noise generated by people who think they're in favour of it. Michael Z. 2011-05-22 21:53 z
The policy as spelled out at Wikipedia:Article titles requires that the article title is to use the name that is most frequently used to refer to the subject in English-language reliable sources. This applies to the title of the article – but within the text of the article, pursuant to WP:MOSBIO, the person's legal name should usually appear first in the article. I trust that explains the current Wikipedia policy as it relates to this issue. Dolovis (talk) 14:46, 20 June 2011 (UTC)

An uncomfortable gap

  • For article content, en.wikipedia makes no distinction between sources written in English and those written in other languages. (And a good thing too; otherwise our systemic bias would be far worse).
  • For notability, en.wikipedia uses sources in a slightly different way, but again there is no distinction between sources written in English and those written in other languages. Again, this is a Good Thing.
  • For article names, however, we currently have an intermittently-observed rule that the spelling in English sources is all that counts; even if there are other spellings elsewhere. This is an aberration; it is difficult to reconcile with the two policies above, and it can be difficult to reconcile with our quest to write an accurate encyclopædia, rather than one which repeats common misspellings of foreign subjects. The English-sources-only policy as it stands also leaves an awkward gap - it simply doesn't work when we write about subjects which have no English-language sources (and there are plenty of notable subjects out there which have been widely documented but not yet in the anglosphere).

There are many anglophone sources which do not exactly replicate the spelling of non-English subjects, especially the diacritics - that's exactly why we have controversies like these. We should not automatically adopt a poor spelling from English sources if we know that there is a more accurate spelling in other sources. Right now, if you wrote an article on some (say) Eastern European subject, and had a hundred reliable sources which used č in the name but the only anglophone source uses c, then our existing policy requires us to use the c regardless of that source's quality. Such institutionalised distortion is absurd. This is an encyclopædia; surely accuracy is important. bobrayner (talk) 21:36, 21 May 2011 (UTC)

Your assumptions are false; see Wikipedia:V#Non-English_sources.
Your conclusion is harmful to the encyclopedia. By the same logic, we would write like the report in A Tramp Abroad.
About 7 o'clock in the morning, with perfectly fine weather, we started from Hospenthal, and arrived at the maison on the Furka in a little under quatre hours. The want of variety in the scenery from Hospenthal made the kahkahponeeka wearisome; but let none be discouraged: no one can fail to be completely recompensee for his fatigue, when he sees, for the first time, the monarch of the Oberland, the tremendous Finsteraarhorn. A moment before all was dulness, but a pas further has placed us on the summit of the Furka; and exactly in front of us, at a hopow of only fifteen miles, this magnificent mountain lifts its snowwreathed precipices into the deep blue sky. The inferior mountains on each side of the pass form a sort of frame for the picture of their dread lord, and close in the view so completely that no other prominent feature in the Oberland is visible from this bong-abong; nothing withdraws the attention from the solitary grandeur of the Finsteraarhorn and the dependent spurs which form the abutments of the central peak. (The whole passage should be read; about five pages.)
We are written in English; we do what English does with foreign words. If not, why not use bahnhof for "railway station" and hopow for "distance"? Septentrionalis PMAnderson 02:35, 22 May 2011 (UTC)
There's no need for such bizarre strawmen; I'm not asking for any such thing. I do not know whether you deliberately distorted my point, or whether you actually believe that I want such absurdity, but either way, your point can safely be ignored. This was a discussion on diacritics, not on throwing random foreign words into English text in the body of an article. bobrayner (talk) 08:43, 22 May 2011 (UTC)
Funny you should mention "bahnhof", we do have an article on Berlin Hauptbahnhof, which I find to be perfectly reasonable considering that's what you'd find on any map or travel guide... - filelakeshoe 09:40, 22 May 2011 (UTC)
As WP:NCGN says, maps and travel guides are not particularly good guides to English usage. Unlike our articles, their principal purpose is to show you what's on the local street signs. Septentrionalis PMAnderson 22:01, 22 May 2011 (UTC)

Dropping the diacritics can be seen as a change of article title, but it can alternatively be seen as a mere change of the orthography used to represent the existing titles. True, this does introduce a few ambiguities, such as rose and rosé, and ore, öre and øre. These can be dealt with in several acceptable ways, and IMO this is a small price to pay for the elimination of one of Wikipedia's biggest time-wasters and a frequent cause of friction and ill will, some of which can be seen above. Andrewa (talk) 02:52, 22 May 2011 (UTC)

I do not find the 'gap' (that is, the apparent difference in policy) uncomfortable. For article content and notability it would be a serious mistake to treat reliable sources in different languages as having different degrees of weight. What matters about the sources for those purposes is simply whether they are reliable. However, when it comes to naming, it is necessary to differentiate between how our own language uses a name and how others do. If we did not do so, then we should need to have our articles on Germany and Albania at Deutschland and Shqipëria: most reliable sources to do with those topics are sure to be in their own languages.
On the specific matter of diacritics, they are mostly left out of the orthography of encyclopedias, apart from French names and a few others, but I do not think that has anything to do with the practicality of typefaces. Old-fashioned hot-metal type is not used at all in modern mass printing. The reason for the omission is surely that for the vast majority of English-speaking readers the diacritics in most foreign languages, including all Slavonic languages, simply add nothing useful to the spelling. Most educated people understand, more or less, the accents used in French and Spanish, while well under half of us understand the German-language umlaut, but I do not suppose more than one in a hundred English speakers has any real grasp of the diacritics used in the Slavonic languages, Turkish, or Hungarian. That may be a painful reality to those who do understand them and who find names spelt without them ugly, but it was always thus, nothing has changed. For the vast majority of our readers, these unknown diacritics are at best a puzzle and at worst an irritation. Moonraker2 (talk) 15:04, 22 May 2011 (UTC)
That raises some wider issues, but the specific issue here is orthography, not typeface, and use in article titles, not text. Andrewa (talk) 15:38, 22 May 2011 (UTC)
The issue you're raising is in article titles, but the original issue was to remove diacritics from people's names altogether because they're "not English". For the record, I don't think banning them from titles would cut out any drama. The drama would simply shift to whether an article should start Antonin Dvorak (Czech: Antonín Dvořák)... or Antonín Dvořák... if you look at the articles OP Dolovis has been protecting from diacritics you'll see he's been protecting the main article text from them too (Mitja Sivic, Ziga Pance, Matej Hocevar...) - filelakeshoe 20:06, 22 May 2011 (UTC)
Yes, the issue I am attempting to discuss here at Wikipedia talk:Naming conventions (use English)#Use of diacritics in biographical article titles is the use of diacritics in article titles. Yes, it's possible that The drama would simply shift to whether an article should start.... But I don't think it would. People just aren't that logical! And where there are variations, these can and should all be given in the lead, so the text is not as contentious as the title. Andrewa (talk) 03:35, 23 May 2011 (UTC)
Well I'm not sure what the actual issue we're debating is. We're not all discussing the same thing. We have various "problems" being put forward by different editors all of which result in the same thing, removing diacritics from article titles.
  • Dolovis, the OP, wishes to write foreign names without diacritics when the majority of sources (I'm not going to say reliable sources, see below) do so. By the looks of his edits, he would not even put the diacritics in article text under this rationale.
  • GoodDay moves to ban diacritics from all article titles because he believes they're "not English".
  • Andrewa moves to ban diacritics from all article titles to "save drama at WP:RM".
Sorry if this seems like an assumption of bad faith, but to me this just looks like a group of people with sensory issues with diacritics fishing for some rationale or other to ban them. Dolovis' argument is the more sensible imho, but we need a clear cut definition of what constitutes a reliable source for how to spell someone's name. News sources, fan sites, official site of person/club he belongs to, encyclopedias, ?? And the native name with diacritics should always be in the article text, even if the person's name is written in another alphabet. If we banned diacritics simply because they cause drama, this "English usage" issue would still stand. It seems defeatist to me and we're ignoring the reason there is drama in the first place. - filelakeshoe 12:38, 23 May 2011 (UTC)
Well, since you refer to me specifically, I think I should respond that I do find this an uncalled-for departure from WP:AGF, and an excellent example of the main reason that I suggested simply dropping diacritics from article titles. I don't think that you bear me any malice, it's just where the argument naturally led, and often leads. Andrewa (talk) 19:45, 23 May 2011 (UTC)
I must respond because I have also been mentioned in the above comment and my position has been misstated. To be clear, it is my opinion that the current policy of WP:COMMONNAME and Wikipedia:Naming conventions (use English) stipulates that a biographical article does not use the subject's name as it might be spelled in Czech or Slovenian (with diacritics) as its article title, nor does it use the person's legal name as it might appear on a birth certificate or passport; it instead uses the name that is most frequently used to refer to the subject in English-language reliable sources. An article title should only use diacritics if that form of spelling is the most commonly used form of the name as verified by sources used within the article. Unfortunately, there is a group of editors who are moving articles to their non-English forms even when the article contains no sources to verify that form. For example, in Jakub Kovář, all of the sources within the article (NHL.com, Hockeydb.com, and Eliteprospects.com) read “Jakub Kovar” yet the article has been moved to the non-verified form of spelling with diacritics. The same is true for the vast majority of the hockey-bio articles which use diacritics. I believe that using a form of spelling that is not supported by verifiable sources should not be condoned. Dolovis (talk) 02:13, 27 May 2011 (UTC)
Not more than one in a hundred English speakers has any real grasp of familiar letters used in Slavonic languages either. I think the average English speaker is more likely to guess that Š = shirt and Č = chocolate than that J = Yes or C = Bits. If you want to make Slavonic names "penetrable" for English speakers we need to invent a transliteration system, whereby Marek Židlický would be Marek Zhidlitsky (as if he were Russian). And doing that is not at all in the spirit of Wikipedia. We can have soundbites and pronunciations in the article text to show how unfamiliar spellings are pronounced. I really think this "English speakers don't know how to read diacritics" argument is bogus. We're not talking about, for example, ß here.. letters with diacritics still have a familiar Latin letter in them. - filelakeshoe 19:52, 22 May 2011 (UTC)

There are some fallacious arguments being put forward.

  • Diacritics are “non-English.”

This is just false. Diacritics are often dropped or considered optional, but continue to be used in English. Diacritics appear in naturalized English words and names of English origin, and this usage is documented in many current descriptive dictionaries and style guides. Just because someone can't type them with their crappy Windows keyboard driver, doesn't mean that they aren't used and expected by literate anglophones around the world.

  • Diacritics should be dropped because these discussions waste too much time.

This is contrary to our principles, and holds no water here. Wikipedia favours the interests of readers over those of editors. It doesn't matter if we have to hash this over another hundred times, we should use the right English orthography for an article title, whether it includes a diacritic or not. Michael Z. 2011-05-22 22:03 z

  • This is all very confusing. I just started Malmö Konsthall, with the diacritic because the parent article Malmö has it and it seemed best to be consistent. De-orphaning it, I found 15 articles that used the form with the diacritic and one that did not. Of the inbound link article names, diacritics were contained in Malmö (a town in Sweden) Kutluğ Ataman (a Turkish artist) and Clémentine Deliss. The last article was the one that did not use the diacritic but referred to Malmo Konsthall. Possibly this is because Clémentine Deliss is British-born. I have a feeling none of this is very helpful. Aymatth2 (talk) 01:56, 23 May 2011 (UTC)
    • My mistake. List of exhibitions by Ólafur Elíasson also has diacritics. So 25% of the articles referring to Malmö Konsthall also had diacritics in their name. This would be because the article is about an un-English subject, and apart from Clémentine the referencing articles are too. Perhaps that is the root cause of the problem. Aymatth2 (talk) 02:23, 23 May 2011 (UTC)
Agree that diacritics are part of English, and the argument that they are not is therefore invalid.
Agree Wikipedia favours the interests of readers over those of editors (your emphasis). Disagree however that This is contrary to our principles, and holds no water here and It doesn't matter if we have to hash this over another hundred times. It does matter. It discourages editors and wastes their time, and neither of these is in the interests of readers.
Agree that we should use the right English orthography for an article title, except for one very important logical point... saying the right orthography is a bit like saying the present King of France or the road to Rome. There is no single right orthography, many different orthographies exist. The question is, which is best for us? And I'm seriously suggesting that despite a lot of well-intentioned work on diacritics in article titles, the time has come to cut our losses. It has had unforeseen effects that outweigh its benefits.
I think the question of whether there is a single correct orthography is very important here. Nearly all, perhaps all, of the arguments in favour of using diacritics presuppose that there is. Evidence? Andrewa (talk) 19:23, 23 May 2011 (UTC)
A basic principle in Wikipedia is to favor "generally accepted" over "correct". In the articles we try to present what is commonly thought about the subject, backed up by citations to reliable independent sources, whether or not this is the truth. The same applies to diacritics. There is no morally right or wrong usage. We should follow the most common usage in English-language sources, indicating alternate usages where relevant. Ise Ekiti is more common than Ìṣẹ̀-Èkìtì in English-language sources, so the form without diacritics should be used as the article title. Malmö is more common than Malmo in English-language sources, so that is the form to be used as a title. There will always be borderline cases open to argument, but the more heated the argument the less likely it matters much: the "correct" usage is not clear. Redirects can always solve the problem.
"Use no diacritics" will never be accepted. "For a non-English word, use diacritics as used in the native language" will not be accepted either. "Use the form most commonly found in English sources" leaves some room for interpretation, but is surely the simplest and most in line with general WP principles. Aymatth2 (talk) 01:26, 24 May 2011 (UTC)
  • I note that Touché, which I used as an example in a comment here on 20 May 2011, was redirected to Fencing#Terminology on 21 May 2011, apparently without discussion. Some relevant content was dropped in the process. Aymatth2 (talk) 01:47, 24 May 2011 (UTC)
You may well be right that "Use no diacritics" (in article titles) will never be accepted, but to me it's such an obvious solution to such a needless and recurring problem that, asked for an opinion, I gave it. Nothing said since has changed it.
I have no particular axe to grind on this, despite the allegation to the contrary above. No national loyalty to Polish, Russian, Swedish or Norsk, no membership of the Alliance française although I do love to speak French, nor of any asteroid's fan club although I am considering forming FOPP the Friends of the Planet Pluto (Save the Planet Pluto), I'm just a medium to (sometimes) hard working admin who sees a lot of heat over issues that seem to benefit Wikipedia's mission by exactly 0%. But the suggestion just generated more heat. Perhaps that is in hindsight predictable. Andrewa (talk) 03:48, 25 May 2011 (UTC)

I'm not going to restate everything but I agree with just about every argument in favor of diacritics used above. I'd also like to say that it would have been nice of Dolovis to notify me since I'm the one who told him five times in the past two weeks to start a centralized discussion on the topic. Pichpich (talk) 15:59, 25 May 2011 (UTC)

The question to be answered

My question for this discussion, which can be answered with a simple “In force” or “Not in force”, is as follows: Are the wiki-policies of WP:COMMONNAME and Wikipedia:Naming conventions (use English) in force or not? Dolovis (talk) 02:19, 27 May 2011 (UTC)

  • In Force: The policies of WP:COMMONNAME and Wikipedia:Naming conventions (use English) are current and remain as the proper policy to follow when naming biographical articles. That is, biographical articles should use the name that is most frequently used to refer to the subject in English-language reliable sources. Dolovis (talk) 02:23, 27 May 2011 (UTC)
    I think the problem you are having is that you are seeking an answer to a question that isn't really what the problem is about. No one is suggesting the policies are not in force. What they are suggesting is that they can be interpreted differently than the narrow view you have put forward. As you can see above the wiki is quite divided on the interpretation of the policy which is why some projects including but not limited to the hockey project have come to a standardized way of dealing with the situation so we don't have different articles using different methods and to avoid having the heated time wasting debates alluded to by others above. -DJSasso (talk) 02:26, 27 May 2011 (UTC)
With respect, I think that this entire issue has arisen because some editors have taken the position that the policy of WP:ENGLISH has changed and that diacritics are the preferred form of spelling for non-North American hockey players. I have been told (falsely) several times that the consensus is that Wikipedia uses diacritics for Czech, Slovakian, and Slovenian, and other non-North American hockey players; but no editor can show me that consensus. There is no discussion or written policy which supports the use of diacritics when no sources verify that form of spelling, yet several editors have been pushing their point of view that diacritics are required even in situations where no sources are shown to verify that diacritics are used. Some editors have stated that diacritics should be used, even against reliable English-language sources. Their logic is that those sources are wrong, and so therefore, they are not reliable. Well, the threshold for inclusion in Wikipedia is verifiability, not truth, (unless the policy of WP:ENGLISH has been changed or is no longer in force). So yes, I am seeking to answer the question: Are we going to enforce the current policy and require that names be verified by reliable English Sources, or not? Dolovis (talk) 02:51, 27 May 2011 (UTC)
Yes, and verifiability means you need sources to be reliable. Newspapers spell peoples' names wrong all the time. Like Obama Bin Laden. - filelakeshoe 09:45, 27 May 2011 (UTC)
I think Filelakeshoe has hit the nail on the head. Many people believe most Newspapers etc to not be reliable for the spelling of a name when they incorrectly do so many times. The wiki is mostly divided on how to deal with the situation as you can see above. The whole reason the guideline specifically says it does not prefer either version and that you shouldn't make it a bigger deal than it needs to be is because the wiki has never been able as a whole to come to consensus on the situation in either direction. So smaller parts of the wiki have come to a consensus for the articles under their scope to at least provide some sort of consistency. I would also note that WP:ENGLISH is a guideline not a policy. -DJSasso (talk) 10:37, 27 May 2011 (UTC)
The sources that are used in the articles under question include NHL.com, Hockeydb.com, Eurohockey.net, Eliteprospects.com, ESPN.com, TSN.com, Hockeyfutures.com, Legendsofhockey.com, and other sources who have made it their business to accurately record names and statistics of hockey players. I would not raise an issue for any specific player if Filelakeshoe would cite even a single example of his preferred form of spelling from any English-language source. If Filelakeshoe and Djsasso think that the "newspaper" has misspelled the name, then produce a source that spells it right before you move the articles away from their commonly used English form. Dolovis (talk) 14:12, 27 May 2011 (UTC)
Right those sources might be accurate for statistics or whatever and not be accurate for names. Sources can be reliable for one set of information and then not reliable for others. As has been shown to you by others above, sources often leave off diacritics in their pages for a number of reasons even though it is wrong and the correct way would be to leave them. The minute they do that, the source becomes unreliable because they have not taken the time to properly list the name. One of the key components to be considered a reliable source is careful fact checking. Such careful fact checking would lead a reliable source to properly writing their name, so without doing such careful fact checking the source is no longer reliable in this particular subject matter. As you can see above the majority of the people responding to your question have said using the diacritics is not an issue in most cases. -DJSasso (talk) 15:51, 27 May 2011 (UTC)
I have asked you to provide a reliable source that supports the use of diacritics where they are used as an article's title. Will you do that? Dolovis (talk) 16:03, 27 May 2011 (UTC)
And can you provide a reliable source that doesn't use them? If you can't the policy you have quoted indicates we should use official spelling. -DJSasso (talk) 16:07, 27 May 2011 (UTC)
You are proving my point that sources do not exist to verify that diacritics are required. I have provided reliable sources. All of the articles of concern do have reliable sources showing the non-use of diacritics (see again NHL.com, Hockeydb.com, Eurohockey.net, Eliteprospects.com, ESPN.com, TSN.com, Hockeyfutures.com, Legendsofhockey.com, etc.), but you have failed time and time again to show any sources of any kind to verify your position that diacritics are required, but you continue to claim that they should be used. Dolovis (talk) 04:22, 28 May 2011 (UTC)
I think you are missing the point, yes I can find sources for various players as has been done in numerous past discussions on the topic. Can I find them for all no probably not. As for reliable sources no you don't have any. As mentioned all of those sources you mention are spelling the name incorrectly (for whatever reason) thus they are not reliable sources in terms of spelling of names. At least one player has spoken out about the topic in the past as well in regards to nhl.com at least. Welcome to the issue of diacritics and why a compromise was worked out in various locations. Please read the discussion above where most people support the use of them as still complying with the policies/guidelines you quote. Instead of just repeating the same thing over and over. -DJSasso (talk) 19:24, 28 May 2011 (UTC)
If there is no source to verify the spelling, then it should not be used in the article. To suggest otherwise goes against the basic policies of Wikipedia. (i.e. No Original Research and Verifiability). Dolovis (talk) 20:11, 28 May 2011 (UTC)
In agreement. GoodDay (talk) 03:54, 30 May 2011 (UTC)
Strawman warning! The policies are of course in force, but the use of diacritics is supported by them. Dolovis, would you please drop the stick and move away from that dead horse's carcass? --Piotr Konieczny aka Prokonsul Piotrus| talk 20:19, 28 May 2011 (UTC)
The use of diacritics must be supported by reliable sources. If you can point to where the policy says otherwise, then show it to us. Dolovis (talk) 20:29, 28 May 2011 (UTC)
Ah, but they are. In most cases, I am sure there are exceptions that need to be discussed on a case by case basis. But that's all there is to this story. --Piotr Konieczny aka Prokonsul Piotrus| talk 20:48, 28 May 2011 (UTC)
  • In force. Absolutely. Biographical articles should use the name that is most frequently used to refer to the subject in English-language reliable sources. No question at all about that. "Generally accepted" is fundamental to Wikipedia as a way to avoid endless arguments about "right" and "wrong". For example, Ademar José Gevaerd always has the diacritic in English-language reliable sources while George Bush never has it. Aymatth2 (talk) 01:49, 29 May 2011 (UTC)
  • Wrong question. The policy and the guideline are not supposed to be "in force", but represent editors' ongoing best attempt to describe accepted Wikipedia practice. Like any policy or guideline, they do so imperfectly. Generally speaking these two do a pretty good job, but on the question of diacritics, they could do better, since they don't make it clear just how consistently we do in fact use them. (My main criticism of this guideline is unrelated to that point - it ought to be called simply "WP:Use English" and not restrict its scope unnecessarily to just article titles.)--Kotniski (talk) 06:31, 29 May 2011 (UTC)
  • What a silly way to try and win an argument... Are we going to start hiring lawyers next? Guidelines should be descriptive, not prescriptive and in any case, the current wishy-washy phrasing of the guideline is not entirely reflective of the current (longstanding) situation. Pichpich (talk) 21:45, 29 May 2011 (UTC)
  • This is the English language Wikipedia, not the Multiple language Wikipedia. The pushing of usage of non-english symbols by editors is frustrating, particularly when you don't see them on the english alphabet. GoodDay (talk) 03:58, 30 May 2011 (UTC)
Symbols which are not part of the "English alphabet" (26 letters) are still frequently used in good English writing, so I don't really see what this argument is based on.--Kotniski (talk) 10:44, 30 May 2011 (UTC)
I don't understand where this argument is placed in reality either. If English dictionaries contain these words, then there's no reason why it shouldn't leave non-English names in the latin alphabet as they are. - filelakeshoe 12:04, 30 May 2011 (UTC)
Those non-english symbols have little to no meaning for an english reader. Their place is on the French Wikipedia, Slovak Wikipedia, Czech Wikipedia, Swedish Wikipedia etc etc. GoodDay (talk) 12:45, 30 May 2011 (UTC)
Yeah, those stupid foreigners can have their own damn wiki. Speaking of the French Wikipedia (and the German, Spanish and Italian ones to name those I'm familiar with): fr.wiki also uses diacritics for, say, Czech names despite the fact that the Czech diacritics are not used in French (or German, Spanish, or Italian). Pichpich (talk) 14:37, 30 May 2011 (UTC)
Again, "non-English symbols have little meaning for an English reader" is just false. Many English readers (particularly those who will be reading the articles in question) will understand the significance of those symbols; and those who don't will not have their understanding impaired by seeing them. This really does seem to me to be a campaign of "dumbing down" - making things harder for the knowledgeable just in order to soothe the feelings of the ignorant.--Kotniski (talk) 16:28, 30 May 2011 (UTC)
This argument reminds me of far-right politics. Nuff said. - filelakeshoe 17:28, 30 May 2011 (UTC)

No one has disagreed that the wiki-policies of WP:COMMONNAME and Wikipedia:Naming conventions (use English) remain in force, but those editors who support the use of diacritics within article titles suggest that I am asking the wrong question. So, now that we have a consensus that WP:COMMONNAME and WP:EN remain as valid policy for all of Wikipedia, then let's conclude this discussion with a follow-up question. Dolovis (talk) 12:53, 30 May 2011 (UTC)

What?? Several people have said that these things are not "in force" - Wikipedia does not have laws or rules that are "in force" (at least, not in this area), so it really doesn't make sense to argue in this way.--Kotniski (talk) 16:17, 30 May 2011 (UTC)
Dolovis I still think you're missing the point of these "policies" (Naming conventions, by the way, is not a policy, but part of the guidelines that make up the manual of style). If you read the little boxes at the top of them you'll notice they advise use of common sense, and that there will be exceptions. I wouldn't want to describe these guidelines and policies as "in force", they're simply generally accepted conventions which are supposed to help people write articles. Wikipedia is not a bureaucracy. - filelakeshoe 17:28, 30 May 2011 (UTC)

Follow-up question to be answered

When a person's name is the article's title, must that form of name be supported within the article by verifiable sources, or may any editor correct the form of name by moving/changing the article's title without providing sources? Acceptable answers are: Sources that support article's title must be present; or Sources are not required when correcting names. Dolovis (talk) 12:53, 30 May 2011 (UTC)

There ought to be sources, certainly. Though the questions that arise are (1) whether those sources should be in English, and (2) whether we necessarily have to follow the majority of English sources. To those questions I would answer (1) not necessarily, if there are too few English sources for us to conclude anything about English usage; and (2) no.--Kotniski (talk) 16:22, 30 May 2011 (UTC)
And I would add for (2) that "majority of English sources" is an ill-defined concept anyways. Putting scholarly work and Ghits on an equal footing is a sure recipe for the dumbing down of the project. Pichpich (talk) 19:13, 30 May 2011 (UTC)
The english alphabet, is the core of the english language. Last time I checked, there were no diacritics in the english alphabet. GoodDay (talk) 20:23, 30 May 2011 (UTC)
That would rather be English orthography, which is a very complex and deep set of rules governing how English words are spelled (and since English is a living breathing language, are often broken anyway). The reason your argument is fallacious is because it's not just rare cases of diacritics which break the rules of English orthography. Neither Szczebrzeszyn nor Brno nor Holešovice is an "English" name, and all three break the rules of English orthography, since as well as not allowing for /š/, it wouldn't allow for /szcz/ or initial /brn/ either. None of these names are English and all of them break the rules, but since there are no English names, we use them.
Regardless, diacritics are used in English. Check the section "diacritics" in the article English orthography and the article linked. - filelakeshoe 20:41, 30 May 2011 (UTC)
It's not as if anyone is advocating writing Robert Johnson as Ròbéřt Jøhnšöñ just for added kicks. We're talking about foreign names which are written in Latin-based alphabets that any English speaker will be able to read with or without diacritics. Proper pronunciation is of course a different matter and at least readers familiar with diacritics get that info (yes, many English speakers also speak these weird diacritic-filled foreign languages). When we get the chance to be precise, we're precise. Pichpich (talk) 22:07, 30 May 2011 (UTC)
It might help a little if the guideline explicitly distinguished between
  • using English words or names (e.g. Munich, the Luther Bible) where they exist and are the most commonly used names
  • (not) using an "English" (i.e. diacritic-free) spelling of foreign names (e.g. Munchen, Dusseldorf, Hitler und die Endlosung).
The current wording may conflate the two, very different, concepts. --Boson (talk) 00:13, 31 May 2011 (UTC)
There would hardly be an 'English' spelling of München, as we use the exonym Munich. When it comes to the 'English' spelling of most words which in German have an umlaut, the diacritic-free approach is to use an -e- after the vowel instead: Duesseldorf, Fuehrer, Bluecher. These are also alternative spellings in writing German. Moonraker (talk) 04:09, 31 May 2011 (UTC)
Yes, I deliberately used the example of Munchen (Muenchen would also serve) to highlight the difference between the correct use of established exonyms and the use of purported English spellings of non-English words. I am aware of a German convention of replacing "ü" by "ue", etc. when the correct characters are not available (see, for instance Duden Rechtschreibung, Hinweise für das Maschinenschreiben, Fehlende Zeichen), but I am not sure how familiar most non-German-speakers are with this convention. I suspect more are familiar with the convention of just dropping the diacritics (possibly out of ignorance). Some American publishers might also adopt the German convention of using ue for ü, etc. where it causes typesetting problems, for instance, but off-hand, I can't think of a reliable source that describes this as a convention of English spelling.--Boson (talk) 20:12, 31 May 2011 (UTC)

So we are agreed that an article must include a verifiable source to show the form of spelling (with or without diacritics) as used in the article's title. If no source is shown to support the article's title, then the title will be changed to conform to the sources used within the article. Dolovis (talk) 04:39, 31 May 2011 (UTC)

For the purpose of determining the name to be used in the title, English language reliable sources are needed, as English may have a different form of the name from one or more other languages. Also, on "the title will be changed to conform to the sources used within the article", the title may not need to be changed, and (once again) in such an exercise English language reliable sources should have priority over others. Moonraker (talk) 09:52, 31 May 2011 (UTC)
No, we are not agreed. Reliable sources for a particular spelling are required only when deviating from the language in which the topic is most often talked about (often the local language). See Wikipedia:Naming conventions (use English)#No established usage. When deviating from this, we need reliable sources to show that the deviant English spelling is established (as encyclopaedic usage). --Boson (talk) 10:45, 31 May 2011 (UTC)
Quite so. Accents should be used if appropriate, unless common English non-accented usage is established. The Guardian style guide says on accents "Use on French, German, Portuguese, Spanish and Irish Gaelic words (but not anglicised French words such as cafe, apart from exposé, lamé, résumé, roué). People's names, in whatever language, should also be given appropriate accents where known. Thus: "Arsène Wenger was on holiday in Bogotá with Rafa Benítez"" There is no reason not to adopt a similar stance here. Daicaregos (talk) 11:40, 31 May 2011 (UTC)
English language reliable sources use diacritics when appropriate. I do not agree with "unless common English non-accented usage is established". It is very common for English to have a "non-accented usage", but if reliable sources (such as other encyclopedias and specialist academic studies) don't use that, it should be avoided here. Moonraker (talk) 13:06, 31 May 2011 (UTC)
Are Boson and Daicaregos saying that they can use a non-English form of a person's name as the article's title even when there are no sources to support that use? That is unacceptable. There must be a source. Dolovis (talk) 13:13, 31 May 2011 (UTC)
I think we can all agree that aligning ourselves with other encyclopedias or scholarly work makes more sense than aligning ourselves with an Internet database. However, we're unlikely to ever find an entry for František Kaberle in Britannica. This doesn't change the fact that scholarly studies and encyclopedias use diacritic marks and that we should too. Pichpich (talk) 13:32, 31 May 2011 (UTC)
Yes, I am saying that no English sources are required. It is not only acceptable but current consensus:

"It can happen that an otherwise notable topic has not yet received much attention in the English-speaking world, so that there are too few English sources to constitute an established usage. . . . 'If this happens, follow the conventions of the language in which this entity is most often talked about (German for German politicians, Turkish for Turkish rivers, Portuguese for Brazilian towns etc.)."

This rule is appropriate because not all notable subjects are sufficiently notable in the English-speaking world to have an established English name. --Boson (talk) 16:17, 31 May 2011 (UTC)

@ Dolovis: No, I am not saying that a non-English form of a person's name can be used as the article's title even when there are no sources to support that use. How could we know that non-English form without sources? I don't propose they should simply be made up. Of course reliable sources are needed for a person's name. They need not be English language sources though. The point here is that the default position should be the person's correct name in their own language, even when that name uses accents; as suggested by the Guardian style guide above, unless common English non-accented usage is established. Daicaregos (talk) 20:47, 31 May 2011 (UTC)

The catalyst of this discussion is the mass article moves currently being performed by User:Darwinek and others, without any sources at all to verify the non-English form of spelling (with only the edit summary saying "correct name"). The practice of moving articles to names that may very well "simply be made up" is rampant and ongoing, and that is why I started this discussion. Dolovis (talk) 05:48, 2 June 2011 (UTC)
Are you really saying that your concern is that these names are fake or made up for lolz? We don't usually source information which is unlikely to be challenged but if all you care about is a source for the diacritics, let me know of any that you find suspect and I'll fix it for you. Pichpich (talk) 12:51, 2 June 2011 (UTC)
The "mass article moves" is your paranoia, Dolovis. Members of various WikiProjects, like WPP Poland are constantly watching new articles and if diacritics are missing in the article title, they move them. This is a standard practice for many years, there is no sudden "mass moving" going on. If you need confirmation of usage of those names, just use Google or ask members of given WikiProjects for help. - Darwinek (talk) 22:39, 3 June 2011 (UTC)
Seconded, on behalf of the WP:POLAND. --Piotr Konieczny aka Prokonsul Piotrus| talk 23:38, 3 June 2011 (UTC)

I generally prefer to stay away from "dead horses" but my two cents: yes, the burden of proof should be on an editor who proposes changing a name either way. Spend time adding information to articles instead of moving back and forth with no improvement to the articles. Reach a consensus before making wholesale changes is better than "bold, revert" in my (perhaps minority?) opinion. The argument I keep refuting is that removing diacritics somehow converts a name from another language into English: it does not! The case I dealt with is the Hawaiian language. In that case (similar to other non-European languages, especially of indigenous people with mainly oral cultures), the orthography was entirely constructed by English speakers, with the diacritics specifically to help English speakers pronounce the words. So the "English" spelling is the one with diacritics; locals usually do not bother since they already know how to pronounce the words. That is why I prefer to use diacritics, except for words like "Hawaiian" where it is so English that it gets English word endings. Let me add we found out the US 2010 census will use diacritics in the official place names for the first time, and that Geographic Names Information System has mostly adopted them already. Modern scientific journals now use them. So the trend is clear, do not be stuck in the past. And do not underestimate readers: it seems obvious that dropping the diacritics in Latin alphabets at least, is a simplification, and thus reduces information. That is why I hate wording like "Waikīkī, also known as Waikiki"... since it is so obvious to be patronizing. W Nowicki (talk) 16:58, 2 June 2011 (UTC)

Some very good points here. What a contrast to the general tone of discussion above. But despite the possibility that English is increasingly using diacritics, and therefore older sources are less relevant, I remain of the view that at this time the best solution (possibly the only solution) is to drop diacritics from article names and to retain them in text where appropriate.
But no consensus seems likely on anything here at present. Andrewa (talk) 20:45, 3 June 2011 (UTC)

For what it's worth, that comes close to the compromise I proposed for the Hawaiian articles. We kept the non-diacritic article names for places, while the person names with diacritics are generally kept where they are, some in titles, some not, but generally tried to keep the diacritics in the body. The minor downside of this policy is all the piped links, but that is only a minor inconvenience to the editor, not the reader. To clarify, I never said "English" is using diacritics, what I tried to say is that English language encyclopedic quality sources tend to have typesetting support now to handle diacritics, so use them on words in other languages to help English speakers know how to pronounce them. And readers are not surprised by them. Article titles are less critical to readers than editors think. W Nowicki (talk) 21:23, 3 June 2011 (UTC)

Amen to that last point.
In a way I don't care at all whether the article title has the diacritics or not. In either case there should IMO be a redirect from the other version. To suggest that Wikipedia is in any way dumbed down by what should be a pragmatic choice as to whether to use or not to use diacritics in what is merely a database handle is ludicrous. And yet it is repeatedly and passionately argued, as is the equally ludicrous opposite view.
So I simply say, this discussion isn't part of Wikipedia's mission at all. Both viewpoints are WP:SOAP. And there's only one way I can see to get rid of it, but that solution is simple, obvious and without any relevant downside. Andrewa (talk) 01:41, 4 June 2011 (UTC)
Andrewa, concerning those names that use diacritics in their originating countries (such as Slovenia, Slovakia, and Czech Republic), do you have any comments on the repeatedly made claims that WP:COMMONNAME and WP:VERIFY do not apply for such names? Dolovis (talk) 02:07, 4 June 2011 (UTC)
No one said they didn't apply. You keep going back to that. What has been said is that common name is being interpreted differently than you interpret it. Secondly no one said verify doesn't apply, what has been said is that verification doesn't require an english source, which is stated right on the wp:verify itself. -DJSasso (talk) 02:33, 4 June 2011 (UTC)
My view is that these questions are off-topic on this particular talk page. Happy to discuss on my own talk page, and resigned to watching them discussed at length on other talk pages. Andrewa (talk) 03:15, 4 June 2011 (UTC)
DJSasso, you just moved 16 articles from English to non-English form with diacritics.[7] None of those articles have any sources to verify the name is used with diacritics, and none meet the policy of WP:COMMONNAME. So obviously you are saying, with your actions if not your words, that those policies do not apply. Dolovis (talk) 03:31, 4 June 2011 (UTC)
DJSasso is right. Several other users were correcting your disrupting revert moves or intentional creations without diacritics. Dozens of users here disagree with you, and still you are the one pursuing your POV despite general consensus and practice. Dolovis, you are simply not right, and you should deal with the pure fact that diacritics ARE used throughout Wikipedia and nothing's gonna change that. - Darwinek (talk) 09:40, 4 June 2011 (UTC)
You keep on track, Dolovis. Those non-english symbols are annoying, almost as annoying as the push to keep them. It has to be respected that this is the English language Wikipedia, not the Multiple language Wikipedia. For those who prefer dios usage? we've got the French Wikipedia & all those Eastern European Wikipedia's for you (plural) to build. GoodDay (talk) 11:15, 4 June 2011 (UTC)
No what I told you about those moves was that I was fixing copy paste moves that someone else moved, that I was putting them back where they were moving them to. I also told you that you are free to object to their moves. Secondly as it says in verify, you don't need to put a source in for everything, and we generally don't source the spelling of names on any article, but with that being said it still needs to be verifiable and those wishing to do so can easily verify them with a search in the subject's native language google. I would also mention that you keep calling the diacritic version non-English when you have been shown by nearly everyone above in this discussion that calling it non-English is incorrect because diacritics are used in English. -DJSasso (talk) 12:30, 4 June 2011 (UTC)
DJSasso, you are supporting and performing out-of-process moves. Process is important, and it should be unacceptable for an Administrator to encourage out-of-process edits. Dolovis (talk) 12:50, 5 June 2011 (UTC)
No I am not encouraging out of process actions. The process is Bold-Revert-Discuss. He boldly moved them. If you disagree you revert the moves. Then you discuss them via a request for move if the original mover decides to pursue the matter. Where am I encouraging out of process moves? And even if I were, I would point you to Wikipedia is not a bureaucracy which is an actual policy versus the essay about process you linked to which is one (or more) person's opinion. -DJSasso (talk) 12:59, 5 June 2011 (UTC)

A thought from an outsider

Two questions

Why do we use diacritics but not non-Latin alphabets?

What purpose do diacritics serve in an English encyclopedia? Martin Hogbin (talk) 11:24, 5 June 2011 (UTC)

There's a significant difference between recognizable Latin symbols with a diacritic mark and non-recognizable symbols. Transliterating names in, say, Cyrillic alphabets is pretty essential since most readers won't be able to visually match the name to something they've seen elsewhere. On the other hand, readers unfamiliar with diacritics have no problems identifying Pär-Gunnar Jönsson and Par-Gunnar Jonsson, Václav Havel and Vaclav Havel or Agnès Jaoui and Agnes Jaoui. Scholarly publications usually do the same: for instance, every math journal I know transliterates author names over non-Latin alphabets but keeps diacritics over Latin alphabets. As has been pointed out above, this is also the rule of thumb used by Britannica. It's about being precise when we can be. Pichpich (talk) 13:48, 5 June 2011 (UTC)
It seems just a bit parochial. We can cope with a slight variation in the Latin alphabet but not with this foreign stuff. Martin Hogbin (talk) 17:03, 5 June 2011 (UTC)
Dropping diacritics as if English speakers shouldn't be bothered with foreign stuff sounds pretty parochial to me. Pichpich (talk) 17:24, 5 June 2011 (UTC)
But why draw a line at Latin plus diacritics? There is no logic to this. Just the symbols used in English is one obvious option; the other, if we want to be truly international, is to use the correct national symbols. Martin Hogbin (talk) 18:17, 5 June 2011 (UTC)
If we use Cyrillic as an example, one immediate reason is that virtually all scholarly sources in English use transliterations. This practice certainly has roots in typographical limitations but there's also the more fundamental problem that Вале́рия Ильи́нична Новодво́рская is not only hard to pronounce, it's also practically impossible to remember if you're unfamiliar with Cyrillic. Note by the way that the English language does contain words with diacritics and that the use of diacritics for Latin-alphabet foreign names of people and places is commonplace in reference works. Pichpich (talk) 18:55, 5 June 2011 (UTC)
I was just trying to get you to think about this subject from a more fundamental perspective. I do agree that WP should not set trends but should reflect current good practice but every now and then we should ask whether we could do things better. I will leave you to it now. Martin Hogbin (talk) 22:21, 5 June 2011 (UTC)
How very generous of you. All that conversation above and it never once occurred to me to give the issue any thought. :-) Pichpich (talk) 23:20, 5 June 2011 (UTC)
I was going to leave but comments like that make me want to stay and argue the case more. Martin Hogbin (talk) 21:25, 6 June 2011 (UTC)
I think Pichpich explained it quite well, it is simply "common sense". Few people will not understand the function of ł, ó or ę , but Д or ๘ are much less clear. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:48, 6 June 2011 (UTC)
You are joking of course: 'Few people will not understand the function of ł, ó or ę'. I do not even know in which language they might be used. Martin Hogbin (talk) 21:25, 6 June 2011 (UTC)
I think he meant few people would fail to recognise ł as an l with a diacritic, ó as o etc., whereas far fewer would recognise Д as D. Anyway, as for what purpose diacritics serve, they serve the purpose of us being accurate and spelling things right. I'm a native English speaker. I can't speak Polish, but I live next door and I know what an Ł is. If I went to look up information on the town of Kłodzko and found it spelled "Klodzko" throughout the article, I would be none the wiser about how to spell it properly. Just because you don't know doesn't mean nobody does. - filelakeshoe 22:37, 6 June 2011 (UTC)
I have no objection whatever to giving the national spelling in an appropriate alphabet of any foreign word in a WP article, along with the IPA pronunciation. What has still not been made clear is why we should have Kłodzko as an English word but not Αθήνα. Martin Hogbin (talk) 08:11, 7 June 2011 (UTC)
The reason is that one is in a Latin script and the other is not. Whether you accept that reason is a different matter. You may wish to draw the line elsewhere.
But I notice that you use the term "English word". Proper names of foreign entities, however spelled, cannot simply be classified as English words in the normal sense, unless an exonym (like Munich), a translation (like the Luther Bible), or an equivalent English name (like Henry IV) has become established. In other words, neither Kłodzko nor Klodzko is a normal English word. When there is no established English name, the foreign name is normally used, unless it is normally written in a non-Latin character set. When a foreign name contains occasional letters from an extended Latin character set, some publishers have a convention of replacing them with "standard" Latin characters without diacritics, thus using an "English" spelling of foreign words.--Boson (talk) 09:35, 7 June 2011 (UTC)
Yes! And to answer your second question, one good reason I gave above. The diacritics are useful to English speakers to let them know how to pronounce the word, and a clue to its meaning. Locals already know how to pronounce it, so in reality the diacritics are more "English" than the dumbed-down writings. But I do agree that moving articles back and forth is a waste and should be discouraged. Content matters. W Nowicki (talk) 16:54, 5 June 2011 (UTC)
They only give a guide to pronunciation with some effort. First you have to identify the language then determine what the diacritics mean. For example, I have no idea how to pronounce Pär-Gunnar or Václav. We already give the IPA pronunciation. Martin Hogbin (talk) 17:03, 5 June 2011 (UTC)
It's not really that hard and it's certainly easier to remember than IPA which uses plenty of non-Latin symbols. There's plenty of information on Wikipedia that's directly meaningful to a minority but accessible to all who put in some effort (IPA for instance). There's nothing wrong with that. Pichpich (talk) 17:24, 5 June 2011 (UTC)
No, diacritics have completely different effects in different languages. You first have to determine the language then find out what the mark means in that language. We already have IPA pronunciation. Martin Hogbin (talk) 18:17, 5 June 2011 (UTC)
You are right, and yet you are not. They have different meanings in different languages (but non-accented letters have other meanings too, even more), yet it is still pretty easy to learn the pronunciation. As of now I can easily see the approximate pronunciation for some 90% of European languages, though I do not speak those languages and can merely read them. I am not some kind of linguist, just a biologist. (Am I especially clever or something? I do not think so; just try to learn it, and you will learn the pronunciation in a few minutes or hours.) IPA gives an absolutely exact pronunciation, but it is difficult to decipher and is not needed unless you are going to learn the language like a native speaker. --Reo + 10:56, 16 June 2011 (UTC)
The policy as spelled out at Wikipedia:Article titles requires that the article title is to use the name that is most frequently used to refer to the subject in English-language reliable sources. This applies to the title of the article – but within the text of the article, pursuant to WP:MOSBIO, the person's legal name should usually appear first in the article. I trust that explains the current Wikipedia policy as it relates to this issue. Dolovis (talk) 14:49, 20 June 2011 (UTC)

"Diacritics" aren't, and are necessary for disambiguation purposes

I'll drop in to say the things I've said several times over the years when this topic comes up:

  • The notion that, say, Jaromir Jagr is "Jaromír Jágr in English" is ridiculous, as anyone with any experience in translation knows. I wish this canard would stop being bandied about in these discussions, as well as the ludicrous idea that English doesn't have diacritics.
  • Having said that, it is necessary to distinguish between the "diacritical-containing" and "non-diacritical-containing" letters, for instance a and ä, for disambiguation purposes. To take an example that springs to mind, the Finnish painter Johan Backman and academic Johan Bäckman are two different people with different names.

For what it's worth, in my opinion this crusade against "diacriticals" is driven by rather poor motives. User GoodDay said above: "Those non-english symbols are annoying", and to this date this seems to me to be the only motivation for trying to exclude them from Wikipedia. I firmly oppose deliberately misspelling names and inventing bogus "English spellings" of foreign names just because some editors find them "annoying". Elrith (talk) 23:47, 6 June 2011 (UTC)

  • Comment: Because, of course, WP:COMMONNAME and WP:ENGVAR don't count as "real" motivations? That being said, when the non-English Wikis start using standard English proper names with standard English spellings instead of their own national variations, I'll think better of European language warriors coming over to demand that the English Wikipedia conform to theirs.  Ravenswing  03:54, 7 June 2011 (UTC)
  • Not sure what you mean there, I don't see any article on pl.wikipedia about Wince Cable or on cs.wikipedia about Kvincy Jones just because Polish and Czech don't use the letters V and Q respectively. We have Warsaw and Prague and they have Londýn, obviously... - filelakeshoe 09:30, 7 June 2011 (UTC)
  • To be fair, most other language wikis do use the proper English names when no translated/transliterated name exists. Which is exactly what people are stating in the above discussion that we also do. People mistake removing diacritics for translating when it is not. Removing them is neither proper English nor proper native language. This is where the issue is. It's just wrong whatever way you look at it. -DJSasso (talk) 11:24, 7 June 2011 (UTC)
Where does this bizarre idea come from that non-English Wikipedias somehow mistreat English-language names, and that the English-language Wikipedia must reciprocally mistreat non-English ones? It's madness. Elrith (talk) 22:22, 14 June 2011 (UTC)

Yes, both "zealots" are not improving the content: mass moving either direction should be discouraged. And to save the time of constantly repeating this, can we please add some text to this policy to clearly state one of the major issues: "Removing accents or diacritics does not convert a word into English." Maybe a linguist can phrase it more precisely. W Nowicki (talk) 16:03, 7 June 2011 (UTC)

Second, and support further clarification: English words use diacritics, so neither WP:COMMONNAME nor WP:ENGVAR can be invoked in a crusade against diacritics. Here's a written, reliable source that supports this assertion: "For foreign words that have become common in English, no simple rules can be given for when to retain an accent, or diacritic, and when to drop it. The language is in flux. ... Accents and diacritics should be retained in foreign place names (such as São Paulo, Göttingen, and Córdoba) and personal names (such as Salvador Dalí, Molière, and Karel Čapek)." QED. --Piotr Konieczny aka Prokonsul Piotrus| talk 16:21, 7 June 2011 (UTC)
It's not English you are talking about, but proper names. We don't invent verbs or adjectives with foreign characters. The conventions used to produce a suitable English version of a foreign name mostly translate to removal of foreign characters and foreign accents. There is no benefit to using foreign characters 'because we can' on Wikipedia, when readers don't see them, don't speak them and are not trained in them. So, we translate to the combination of characters that sounds as much as possible like the original. There is no -need- for special dispensation for languages similar to English, as opposed to Cyrillic or Asian languages. If we find in the media of a whole continent or English media milieu that ö (for example) is not used, then we have a de facto translation method. ʘ alaney2k ʘ (talk) 18:09, 7 June 2011 (UTC)
That is sort of the point. The removal of the diacritics isn't changing them to the letter that sounds the closest. Diacritics on a character completely change the sound of a letter, so that it often doesn't sound anything like the letter without the diacritic. Which is why someone said above in the discussion that the addition of the diacritics is actually more to help English speakers than for the sake of the native speakers, who probably already know how it is pronounced. This is why removing them isn't a translation at all. A translation would involve switching the characters from the diacritic version to the closest-sounding alternative, which would be completely different characters a lot of the time; this is done in some words but not usually in names. It's actually the case that proper names rarely have the diacritics removed, in academic sources at least. -DJSasso (talk) 18:16, 7 June 2011 (UTC)
An academic work Wikipedia is not. If we were trained with them, then I could see your analogy. But the normal convention in the media is to simply remove them. Whether or not that constitutes an accurate translation is somewhat moot. That would be for the non-anglos to decide. Often an athlete provides a better translation. But should we simply adopt the spelling used in Europe? I don't think we are anywhere near going that way in North America. Correct or not to a European, it is simply not friendly or usable for North American readers. ʘ alaney2k ʘ (talk) 18:27, 7 June 2011 (UTC)
No, but we are an encyclopedia that strives to have correct information, and we strive to be like the Encyclopedia Britannica, which uses them. We are doing a disservice to the reader if we do not include someone's actual name. In most cases with diacritics, someone who doesn't have any training in them can still read them just as easily as if they weren't there. I believe you once said you just ignore them. Well then, there is no issue if they can be ignored? Now in the cases of the really strange characters like Д there is usually an actual translation of the name, and I 100% agree that we should use that, as it's a translation and the original form would be unreadable to someone not in the know. But in the case of é, â, ö or the like, I think we are doing a disservice to the reader by removing them incorrectly. -DJSasso (talk) 18:38, 7 June 2011 (UTC)
I find it hard to argue in favour of something if it is to be ignored. :-) Like I said, the argument to me is not about the correctness either way. Staying in context is most appropriate. I doubt that all US or Canadian professors mark down a paper over an umlaut. We have a lovely mechanism of directing someone to the birth spelling of a name in the article. The foreign spelling shows in the popup. That's enough. I think of it as a spectrum. You've got extremes, such as Ozolinsh, where the birth spelling is just completely foreign, or Selanne, where it's a minor difference. I don't believe it is an egregious offence to omit them, like some do. I don't advocate being lazy, but I don't advocate going beyond the common usage much. Count me out (in). I think that I will always argue in support of not mandating them in common usage. Sometimes you've just got to admit that something is foreign. You can work on specific rules, and I see that it's necessary; but a common start is to omit them. And that should be okay in context. ʘ alaney2k ʘ (talk) 19:09, 7 June 2011 (UTC)
This project is supposed to be for the laymen. An English-only reading laymen sees these non-English wiggly sqigglies as, at best, a distraction. GoodDay (talk) 00:09, 8 June 2011 (UTC)
I thought that's why we developed the Simple Wikipedia... no diacritics there. --Piotr Konieczny aka Prokonsul Piotrus| talk 04:07, 8 June 2011 (UTC)
An "English-only reading laymen" (sic) should go and read Simple Wikipedia. No diacritics, IPA, complex mathematical code, or anything else that might make your head hurt. I use Simple Wikipedia for maths articles because the code here along with the pretext that the reader understands it does my head in, but you don't see me arguing to ban that. I'll repeat. Just because you don't know doesn't mean nobody does. - filelakeshoe 07:12, 8 June 2011 (UTC)
Those who want dios should read French Wikipedia & the Eastern European Wikipedias, instead of forcing their preferred non-English accents & symbols on English Wikipedia. Editors like me are not simple minded, we just happen to recognize bs when we see it. GoodDay (talk) 13:09, 8 June 2011 (UTC)
Sure, sure. <sarcasm> Curse all those academics who use those weird "dios". They should take all of their weird "science" mumbojumbo and go somewhere else, preferably to France and East Europe. We need no science here... Glory to Wikipedia, the first encyclopedia to fight against "dios", where the dinosaurs of Britannica and Columbia still use them. By popular vote (or, vocal demand of a tiny minority...) we will defend the "pure English language" (and woe to those academics who argue for diacritics - those eggheads in the ivory towers surely don't even know what language they speak). </sarcasm> Seriously, please, don't stop others from using proper English and contributing to the encyclopedia using proper names and place names; and pretty please, don't accuse us of "bs". I'll finish by pointing out that neither "dios" nor "bs" is really part of the English language, but usage of such slang simplifications is certainly indicative. Of what, I will not say, per WP:NPA. So please, keep it cool, focus on arguments, and don't accuse others of various attitudes. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:24, 8 June 2011 (UTC)

Once again, per endless earlier discussions, certainly all recently published scholarly sources I've run across regarding Eastern Europe observe the "squigglies." We are, first and foremost, an encyclopedia, after all. Once we've observed common English language "non-squiggly" usage where there is a preponderance thereof, there's no impediment to using the "squigglies." On a related note, I don't think the average reader (mono-lingual English with smatterings of some other European language from school) cares whether the additional decorations modify a letter or create a completely different letter. PЄTЄRS J VTALK 02:24, 8 June 2011 (UTC)

I agree about the last point. The discussion is complicated enough as it is. Not even Unicode differentiates between ö in German (an umlaut, i.e. a letter different from o, historically derived from oe via an e written over the o, and therefore traditionally transliterated as oe where no ö is available), ö in Finnish, Hungarian or Turkish (also a letter different from o, but not an umlaut although borrowed from German; cannot be transliterated as oe) and ö in coördination. Some comments about the core of the matter:
  • To the limited extent that the average name of a person or entity can be said to be part of the English language at all, it is part of the English language both with and without the diacritics. Some sources such as international sports associations use ASCII characters exclusively and broadcast their versions widely. Other sources, such as virtually all academic sources, use the accented versions almost exclusively. E.g., if you search for "Gödel's theorem" on Google Books, you will find more than twice as many (English) publications as when you search for "Goedel's theorem". Closer inspection shows that those using the ö spelling are generally of a higher overall quality and more on-topic. They are generally written by people who have English-sounding names, and appeared with English-language academic publishers such as A K Peters, Routledge, Springer (a huge publisher that started in Germany but has had an international scope and been focused on English for a long time), Blackwell, Wiley etc. There are even more hits with the misspelling "Godel's theorem", but almost all of these are due to OCR errors where the original actually used the umlaut.
  • In our globalised age, people don't just read and write about foreign places and people, they also visit them and get exposed to the original versions of their names in the original linguistic context. As a result, the English language is moving away even from the most established English versions of such names and is gradually replacing them with the original versions. Examples include Lyon which used to be spelled Lyons in English but is now more commonly found without the s, Beijing and Kolkata, which used to be referred to by their traditional English names Peking and Calcutta. Presumably by the same mechanism, it is moving towards the original spellings including diacritics. You can see this at work with Google Books searches for Heinrich Brüning: For books until 1950 the ratio "chancellor Brüning":"chancellor Bruening" is 280:1110. For books since 1971 it is 641:278. The ratio "chancellor Schröder":"chancellor Schroeder" is 4090:1730. (The same tendencies can be observed in German, where people increasingly say and write Nijmegen not Nimwegen, Ústí nad Labem not Aussig, Tallinn not Reval, Győr not Raab, 's-Hertogenbosch not Herzogenbusch etc. No doubt other languages are going through the same evolution.)
  • WP:COMMONNAME does not speak about spellings. It speaks about fundamentally different names such as Nazi Party vs. de:Nationalsozialistische Deutsche Arbeiterpartei. Spelling with or without diacritics is not the kind of question that should routinely be answered by inspecting usage in English sources for each individual topic. For relatively obscure topics we might as well read tea leaves. This is the kind of thing that is addressed by style guides, and our relevant style guide for article titles is WP:DIACRITICS, a section of WP:ENGLISH. It says that by default (no consensus of sources), spellings with and without diacritics are both acceptable. But our policies and guidelines are supposed to be descriptive, not prescriptive, and actual usage in Wikipedia is that where English sources use the original name with or without diacritics, we almost always use it with diacritics. This is firmly within the normal range of style guides for English publications, and we are following the trend. [8] [9]
  • It makes sense to standardise use or non-use of diacritics, because otherwise we are facing categories in which spellings with diacritics are mixed randomly with spellings without them. We will never be able to get rid of them completely, as there are examples in which diacritics are the most natural way of disambiguation, and since academic works and serious encyclopedias such as Britannica use them in titles (they didn't used to use them – another example that the language is in flux). So the most natural solution is to always use them, as we are already doing.
  • Names in Cyrillic or Chinese letters or any other non-Latin writing system are of course a different matter. Most readers of English texts cannot parse them at all, cannot even form conjectures as to the corresponding pronunciations, and would need a lot of effort to compare two such words just to see if it's the same word. That's why we are not using them but use transliterations or transcriptions into the Latin alphabet instead (where there is no English alternative). And guess what, some of the commonly used transliteration/transcription systems use diacritics and other modified Latin letters. This way English even acquired some words with diacritics such as the one for the (Tibetan) Bön religion. There are some fine distinctions to be made here. E.g. Pinyin, the standard system for transcription of Chinese, uses diacritics. But we don't, following a common practice in China and among academics publishing in English.
While it would be nice if this little exposé on háčeks and similar phenomena that I am throwing into the mêlée would give the coup de grâce to this attempted coup d'état, I am aware that, not being an Übermensch, nor a Führer who cannot be ignored, I may be evoking TLDR reactions in some. In that case, take it as a smörgåsbord from which you can pick a few canapés while drinking a Gewürztraminer. (I suggest that you help yourself to an apéritif first, and then start with the crudités. And do try the crêpes and the crème brûlée.) Sorry if there are too many cases of déjà vu on the menu. I hope you won't mistake my arguments for papier-mâché tigers.
So much for my 5 øre. Hans Adler né Scheuermann. 06:23, 8 June 2011 (UTC)
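For anyone who wants to check the Unicode point made above, here is a minimal sketch in Python using only the standard `unicodedata` module. The helper name `strip_marks` is made up for illustration; it is a naive "remove the diacritics" routine, not anything Wikipedia or any publisher is known to use. It shows that ö is a single code point regardless of language, and that stripping combining marks is not the same thing as transliterating.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Naive 'remove diacritics': decompose to NFD, then drop combining marks.

    Hypothetical helper for illustration only.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Precomposed ö (U+00F6) is one code point, whether the word is German,
# Finnish, Hungarian, Turkish or English.
o_umlaut = "\u00f6"
print(hex(ord(o_umlaut)))                                         # 0xf6
print([hex(ord(c)) for c in unicodedata.normalize("NFD", o_umlaut)])  # ['0x6f', '0x308']

names = [
    "G\u00f6del",         # Gödel
    "co\u00f6rdination",  # coördination
    "Ozoli\u0146\u0161",  # Ozoliņš
    "K\u0142odzko",       # Kłodzko
]
for name in names:
    print(name, "->", strip_marks(name))
```

Under these assumptions the routine turns Gödel into Godel (not Goedel) and Ozoliņš into Ozolins (not Ozolinsh), while Kłodzko comes out unchanged, because ł has no canonical decomposition into a base letter plus a combining mark.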
All those words are commonly spelled without the diacritics. We don't all write for the OED. ʘ alaney2k ʘ (talk) 14:46, 9 June 2011 (UTC)
As Google Books searches with these words show, usage is actually mixed. It is of course wrong to write exposé without the accent because that can only lead to confusion. (A Google Books search for "write an expose" has a lot of hits, but they are overwhelmingly OCR errors with exposé in the original. Incidentally, preventing confusion between words that are otherwise spelled identically is one of the reasons for accents in French.) I wouldn't know how to verify this, but I am pretty sure that the English word cannot be spelled ne. In contrast to the (also somewhat odd) spelling nee for née, nobody would know what is meant. And I refuse to use nee with reference to myself because I don't want anyone to believe I was born female. At the other extreme there is smörgåsbord, which is almost always spelled without the accents. But in any case the spellings with accents are correct variant spellings of the respective English words in the same way that colour and color are both correct variant spellings. Hans Adler 15:53, 9 June 2011 (UTC)
  • Please, tell me someone, how could using the diacritic in foreign proper names damage or threaten the purity of the English language? What have the Czech, Slovak or Swedish proper names to do with the English language? A name is just a name. If a person has a name that contains the "wiggly sqigglies" and the person is verifiably known under that name, a really good encyclopedic project should respect that, because it is the correct version, no more and no less. Wikipedia should strive for accuracy. That's my only conclusion. Of course, transliteration is a different matter, but I'm talking about the proper names written in languages which use the Latin alphabet. Do you want to save the English language by deforming the names? --Vejvančický (talk | contribs) 09:55, 9 June 2011 (UTC)
    • No one is arguing for or against the English language purity. What we are talking about is not enforcing the foreign spellings when we have lots of English usage without the character modifiers. I disagree that those diacritics are part of the same alphabet. The letters are rendered using a separate code. Correctness also is not the issue. Correctness is subjective in this instance. For example, the Latvian hockey player who is commonly known as Sandis Ozolinsh has a birth name of Sandis Ozoliņš. Those two diacritics are -unknown- in common usage. The question is really, where do you draw the line? To a regular reader, unknown modifiers give no information to the reader as to the pronunciation. As per WP:COMMONNAME, Sandis is better known as Ozolinsh, as he played the majority of his career in North America. He might have never been notable otherwise. To enforce the use of modifiers would be appropriate if Wikipedia were an academic publication, but it is certainly not. Frankly, I am okay with either spelling for article titles, (leaning slightly to common spelling without) but if a person's common spelling is without diacritics, then it is not appropriate to enforce them in articles where the person is mentioned. As I've said elsewhere, I see no point in arguing for something that will be ignored. ʘ alaney2k ʘ (talk) 14:46, 9 June 2011 (UTC)
      • Thanks for your observations, alaney. The anglicized version of the name of Sandis Ozolinsh follows, at least a bit, the sound and the pronunciation of his name. But what about Jan Srdinko, Roman Kadera, Matus Vizvary, Ivo Kotaska, Vladimir Buril and other players known solely for their careers in European hockey leagues? (This RfC was started mainly as a reaction to the situation at Talk:Vladimir Buril.) The names are cut off like tree stumps. I'm well aware that the diacritic can hardly help an average English reader to better understand the pronunciation; however, I believe it is more encyclopedic than the nonsensical current state. The current state is a result of missing code on major ice hockey websites. Wikipedia should do better, as this project has the technical tools and the international editorship capable of far more accurate work. --Vejvančický (talk | contribs) 16:03, 9 June 2011 (UTC)
        • Thanks for pointing those out. I was unaware of the origin. If this discussion was to eliminate the use of diacritics completely, I don't support that. Those persons have no 'common' English spellings, as far as I can tell. No 'English' name, if you will. I think those articles will be moved to the native spellings. ʘ alaney2k ʘ (talk) 17:04, 9 June 2011 (UTC)
      • Sandis Ozolinsh is a transliteration. This does happen sometimes with names in the Latin alphabet, viz Franz Josef Strauss and Bronislava Nijinska. But it doesn't happen all the time, and one doesn't transliterate by removing diacritics. - filelakeshoe 16:36, 9 June 2011 (UTC)
        • Ozolinsh has no diacritics. I think you mean it is not simply the removal of diacritics that makes it English. There must be a variation of opinion on the acceptability of transliteration. Because the article on Ozolinsh is under his birth spelling. ʘ alaney2k ʘ (talk) 17:04, 9 June 2011 (UTC)
          • Ozolinsh is most likely where it is because of how frustrated most people got with these discussions a few years back, when everything was pushed as far as possible in both directions. We should probably move him back. It's the absolute wiping out of diacritics that is currently the issue. Currently the originator of this thread has been trying to wipe them off every article that uses them, consensus on the issue be damned. -DJSasso (talk) 17:08, 9 June 2011 (UTC)
  • On a side note, Czech Wikipedia attempted to resolve a similar problem, using the feminine surname suffix -ová (a standard component/suffix of female surnames in the Czech language) in foreign female names. The debate was creative, and editors defended all possible stances.[10] The Institute of the Czech Language of the Academy of Sciences CR even issued a special statement, defending the Czech language as a naturally inflective language. However, the statement included the following recommendation: "In an encyclopedia, it is appropriate for users to get information about the original form of a proper name".[11] But it is the Czech language, a different venue, a different problem. Vejvančický (talk | contribs) 09:55, 9 June 2011 (UTC)
We cannot present material that would expose the ignorant, for that would fail to provide balance between knowledge and ignorance. As an encyclopedia we need to strive for balance. The goal must therefore be the LCD. — kwami (talk) 11:38, 9 June 2011 (UTC)
@Vejvančický, agreed, the feminine surname suffix should be observed; that English does not decline its nouns does not mean we force the masculine form of a surname.
Vejvančický was making an analogy to a similar issue on Czech Wikipedia, not an issue whether we should spell Martina Navratilová as Martina Navratil, but whether on cs.wiki they should spell J.K. Rowling as J.K. Rowlingová. It's quite a similar issue to this diacritics thing. I guess in the same way Czech is a naturally inflective language, English is a naturally xenophobic language. - filelakeshoe 15:28, 9 June 2011 (UTC)
Thanks for the clarification, Filelakeshoe. --Vejvančický (talk | contribs) 16:03, 9 June 2011 (UTC)
@Kwamikagami, I am sorry, I cannot agree with "lowest common denominator." "Balance" means we write in accessible language, not that we dumb down proper names. You should have more faith in the intellectual capacity of the average reader. PЄTЄRS J VTALK 12:58, 9 June 2011 (UTC)
I am relatively sure that Kwamikagami was making a joke. Of course in the context of this insane debate the usual assumption that anything that is too outrageously stupid to be possibly a serious statement must be a joke doesn't make much sense, so I am not totally sure. Hans Adler 15:23, 9 June 2011 (UTC)
I'd hope so, but in case it was semi-serious, I'll once again point to Simple English Wikipedia, where editors scared of diacritics, too long words, technical jargon, and such, can find safe refuge. --Piotr Konieczny aka Prokonsul Piotrus| talk 17:38, 10 June 2011 (UTC)
As an admin there I would point out we tend to use them there ironically. -DJSasso (talk) 17:41, 10 June 2011 (UTC)
Excuse me, but editors who wish dios to go away, are not dummies. Therefore, please stop characterizing them directly or indirectly as such. GoodDay (talk) 17:46, 10 June 2011 (UTC)
As opposed to you who has been over and over assuming bad faith of anyone from a country whose primary language isn't English? -DJSasso (talk) 17:47, 10 June 2011 (UTC)
Calling editors like me 'dummies', is just another example of the pro-diacritics crowd's arrogance about their precious wiggly squigglies. GoodDay (talk) 17:49, 10 June 2011 (UTC)
No one called you a dummy. This is just another example of your assuming bad faith of everyone who disagrees with you. -DJSasso (talk) 17:51, 10 June 2011 (UTC)
"I'll once again point to Simple English Wikipedia, where editors scared of diacritics, too long words, technical jargon, and such, can find safe refuge" by Piotrus. That's a poor choice of wording. GoodDay (talk) 17:54, 10 June 2011 (UTC)
And how is that different than you pointing people to the Czech wiki for example? Most of the people who have argued against diacritics have done so stating they are too complex. So it is perfectly valid to point to the Simple English version of Wikipedia where things aren't too complex. Although he was incorrect because they actually use diacritics. -DJSasso (talk) 17:58, 10 June 2011 (UTC)
As an administrator, you should be advising both sides to use moderation, not just the side you're against. GoodDay (talk) 18:03, 10 June 2011 (UTC)
I am not acting as an administrator since I am actively involved in this topic. I am acting as any regular editor. If you want an uninvolved admin to decide if you or both of you are out of order I can certainly go ask for one. I am just pointing out that you are making this discussion far too personal. Step back and be objective. -DJSasso (talk) 18:06, 10 June 2011 (UTC)

I'm merely stating, if the other side knocks off the 'intellectual put downs', I'll stay away from the 'linguistic pride' accusations. GoodDay (talk) 18:10, 10 June 2011 (UTC)

What pride? I am not "proud" that Polish has diacritics. This is just a fact of life, just like English has letters we don't use (q, x and v) - but we will use them on Polish Wikipedia in relevant topics (pl:Quebec, pl:Vincent van Gogh, pl:Xanadu (oprogramowanie)). And of course we accept diacritics that don't exist in Polish (pl:Würzburg). It is just surprising to me that a few people who are supposed to write an encyclopedia, a work of reference, by default discussing many subjects unknown to most, would argue that we should dumb down articles on foreign subjects. If this was a common rule in English publications and encyclopedias, it would be one thing. But as has been shown, English works are split on the use, and other encyclopedias simply use them without any arguments. This makes it clear that the argument "diacritics are not used in English" is false, leaving only the "diacritics are unknown to most and confusing." Well, this is an encyclopedia, and it covers many subjects that are unknown to one. The solution is hardly to remove the topics - or dumb down the articles by removing things like diacritics. I am sure we could find people who are confused by graphs, headings, edit buttons, hyperlinks, footnotes, templates... but we don't pander to them. I see no reason why our treatment of diacritics should be any different. So if there is any "pride", I think it is some misguided pride on the part of English language purists, who don't want to see signs they consider non-English; pride that is obviously misguided, as many modern English-language books and encyclopedias illustrate quite clearly. --Piotr Konieczny aka Prokonsul Piotrus| talk 19:18, 10 June 2011 (UTC)
I really think this round in circles stuff needs to stop, GoodDay has already expressed on his talk page that nothing is going to make him "change his opinions". We might as well be trying to convince a brick wall that diacritics are used in English. - filelakeshoe 19:27, 10 June 2011 (UTC)
True, this (including some comments of mine, perhaps) was getting a bit repetitive, and thus, annoyingly unhelpful. In that case, I'd suggest we move on to discussing how the wording could be changed, see if a change would be acceptable (stable), and if there is still a dissenting group (individual...) who would revert such a change, hold a straw poll to determine consensus (majority's opinion). --Piotr Konieczny aka Prokonsul Piotrus| talk 19:30, 10 June 2011 (UTC)
Basically I think the problem here lies not solely with diacritics, it lies with the "common name" policy being followed absolutely to the letter based on slack research such as "number of google hits" or "a quick look at google news results". As I just said today on Talk:Slavia Prague, following "use the most common name used by English speakers" is insane (read my comments in the discussion), there has to be some kind of balance between common names and correct names, so sure, Caffeine and not 1,3,7-trimethyl-1H-purine-2,6(3H,7H)-dione, but Fellatio rather than Blowjob, Manchester United F.C. rather than Man United and Cattle rather than Cow. By this token it follows that we should use Vladimír Búřil, because it's verifiable and clear that his name is Vladimír Búřil. This "xxx google hits say this name is more common" argument should be used with more caution. - filelakeshoe 19:42, 10 June 2011 (UTC)
Especially when WP:DIACRITICS specifically mentions that google hits are an unreliable method for judgment because optical character recognition errors often miss diacritics thus deflating the numbers. Clearly consensus here is to rewrite to make clear that diacritics are valid but the question is how to do that in an efficient way. -DJSasso (talk) 19:47, 10 June 2011 (UTC)
I think it would be better to get an uninvolved admin or two in to judge consensus, as I know I've been getting a bit worked up with all the linguistic misconceptions and such... - filelakeshoe 19:50, 10 June 2011 (UTC)
The entire "uninvolved admin" concept is scary; how can we be sure that the admin has no - subconscious, even - take on that? I'd rather see proposals (one or more) for modification, one for keeping the policy as it is, and see some votes. This will much more clearly show where the consensus lies. --Piotr Konieczny aka Prokonsul Piotrus| talk 20:20, 10 June 2011 (UTC)
That would work - a "users who endorse this view" style RfC. I don't know enough about the relevant processes to know how to properly go about this.. - filelakeshoe 20:27, 10 June 2011 (UTC)
I wonder if an administrator ruling can be applied? Perhaps, the best course is to merely take things one RM at a time. GoodDay (talk) 20:24, 10 June 2011 (UTC)
Oh of course. I just meant that out of about 30 editors in the discussion only 4 were outright against using them and one said he had no problem with them in titles but not in the rest of the article. So it seemed pretty clear cut. Of course an outsider would have to make the final call. I just meant we should come up with some options for how to word it since we seem to be headed that way. -DJSasso (talk) 20:26, 10 June 2011 (UTC)
Replying to both of you, RM is the reason we are discussing this here. It is my understanding that RMs commonly succeed in moving articles from diacritic-less titles to ones with diacritic, and this RfC was started by a user who was unhappy with this, seeing the RMs as going against our policy. We can just close this discussion and do nothing, but I think it would be better if we just faced the facts (and the consensus of majority), and clarified the rules that diacritics are accepted, to prevent some users being confused and claiming that "diacritics have no place on Wikipedia". It is clear, from common use on Wikipedia, that they do have such a place. --Piotr Konieczny aka Prokonsul Piotrus| talk 21:26, 10 June 2011 (UTC)
We should go by the RM route. GoodDay (talk) 21:57, 10 June 2011 (UTC)
The RM route is ongoing all the time. We need to go beyond it, and improve it by updating policy to prevent false arguments from being used in RMs. --Piotr Konieczny aka Prokonsul Piotrus| talk 22:38, 10 June 2011 (UTC)
Everyone has a different view on what's 'false arguments', though. GoodDay (talk) 22:43, 10 June 2011 (UTC)
Well that is the point of this discussion, we are coming to a consensus of what that is. -DJSasso (talk) 23:18, 10 June 2011 (UTC)

That's quite likely an impossibility. GoodDay (talk) 23:51, 10 June 2011 (UTC)

Consensus is not unanimity. There is no liberum veto on Wikipedia - so I think we will reach an agreement... --Piotr Konieczny aka Prokonsul Piotrus| talk 00:41, 11 June 2011 (UTC)
Maybe. GoodDay (talk) 01:11, 11 June 2011 (UTC)
I agree with Hans's argumentation above. DJSasso, make that 31 v 4. --Stefan talk 01:38, 11 June 2011 (UTC)
This entire argument is tl;dr. I weighed in a couple weeks ago, I think, but just don't have time to follow all the bickering. To reiterate my own position: Article titles should reflect common English usage, as indicated in the majority of English-language sources. Speaking as someone who does a lot of work with disambiguation pages, I would also say that diacritics are often a pain in the you-know-what. I can type about 130 wpm, but when I run into a diacritic, it slows me down to a snail's pace as I have to squint at the hundreds of possible choices and track down the correct one, or switch to copy/pasting rather than typing. Given a choice, I prefer non-diacritic titles. However, if a name is routinely spelled with diacritics in English-language sources, I can, and do, adapt. I do not, however, support the idea of spelling names in a native language just because "that's the way it's supposed to be spelled". What's next, changing all the names of the Chinese biography articles to native Chinese? That would make searching, linking, categorizing, and navigation a nightmare. In any case, moving forward on this discussion, perhaps a straw poll would be a good idea, to make sure we're getting opinions from lots of editors, and not just those that have the time to engage in these endless discussions? --Elonka 15:44, 11 June 2011 (UTC)
"Straw poll" on what exactly? I mean, what would editors have to decide between? - filelakeshoe 15:54, 11 June 2011 (UTC)
It's best to go the RM route. An overall ruling across English Wikipedia, would likely be un-accepted, no matter what the result. GoodDay (talk) 16:01, 11 June 2011 (UTC)
Yes, Elonka, I realized you did. You were one of the ones I counted as not liking them. Diacritics and Chinese characters are two very different situations. One uses a different alphabet/character set whereas in the case being discussed they are using the same alphabet. It is still readable to everyone, doesn't lose any information, and makes the wiki more accurate. I fail to see where causing an editor to slow from 130 wpm to a snail's pace is an actual problem. Our standard is to be the best possible wiki for the readers, not the editors. So slowing down an editor in my view isn't a problem at all if it helps the reader, which I think adding relevant information does. -DJSasso (talk) 16:15, 11 June 2011 (UTC)
There's no diacritics in the english alphabet. GoodDay (talk) 16:17, 11 June 2011 (UTC)
There are no diacritics in any alphabet. As has been mentioned to you time and again, diacritics are part of the orthography of a language. Most of the languages we are talking about use the same Latin alphabet. A few of them add a few letters or remove a few letters. But the diacritics come from the orthography, so every time you say there are no diacritics in the English alphabet you make yourself look like a fool. -DJSasso (talk) 16:20, 11 June 2011 (UTC)
Let the other alphabets worry about diacritics, I'm concerned with the english alphabet. Meanwhile, the RM route is our best choice. Either that or English Wikipedia should be split in two -- English Wikipedia (New World) -i.e no dios & English Wikipedia (Old World) - ie. dios. GoodDay (talk) 16:25, 11 June 2011 (UTC)
There is no such thing as the English alphabet as its own entity, it is just the Latin alphabet. It would help, if you are going to be so anti-diacritics, that you at least learn about the topic you are so vigorously fighting. -DJSasso (talk) 16:27, 11 June 2011 (UTC)
There's no & never will be a consensus on this topic across the entire English Wikipedia. The RM route, though time consuming, is the best route. Arbitrary page moves by either side is disruptive by its arrogant nature & should be discouraged. GoodDay (talk) 16:37, 11 June 2011 (UTC)
Repeating your opinion dozens of times without actually providing any arguments, as soon as you realise that you may be on the losing side, is a bad and disruptive habit. Please drop it. What you are proposing is a huge waste of time, simply for the purpose of pushing through your desired change to some extent locally when there is clearly no general consensus for it. Hans Adler 18:20, 11 June 2011 (UTC)
Please follow Elonka's advice. GoodDay (talk) 18:52, 11 June 2011 (UTC)
Please follow the many (I am sure it's got to be closing in on 100 now) editors that have told you not to repeat one liners like this over and over and over again. We heard you the first time. If you don't have any new arguments to provide just stop writing. Or heck, any arguments, period, other than "I don't like them". -DJSasso (talk) 18:59, 11 June 2011 (UTC)
Ease off the harassing. GoodDay (talk) 19:08, 11 June 2011 (UTC)
Can we please try to keep comments focused on the topic, and not on other editors. Saying "you look like a fool" is not helpful to this discussion. As for the comment about "best possible wiki", we are not talking about removing diacritics from the wiki as a whole, we are just talking about article titles. If someone has a name with diacritics (or Chinese, or Arabic), we can and should put that information in the lead paragraph of the article. It's just the title that should stick with "common English" spelling, as defined by majority usage in English-language sources. --Elonka 16:42, 11 June 2011 (UTC)
I don't mind dios in the lead paragraph, if the article title is devoid of dios. That's assuming having it english in the lead with the dios version next to it in brackets is rejected. GoodDay (talk) 16:49, 11 June 2011 (UTC)
It's not the majority of English sources that matters for us. Quality matters more than quantity. The lower segment of newspapers such as the Daily Mail seems to drop accents and replace any 'un-English' or 'un-American' letters consistently. Quality newspapers such as the New York Times or the Guardian do it sometimes but not consistently. As I have shown, the Chicago Manual of Style does not recommend doing it. (I don't have access to the book itself; maybe someone can look up whether it says something helpful about the matter.) And serious encyclopedias such as Britannica consistently use proper names from Latin-based languages in their original form and use romanizations involving diacritics, where appropriate, as in Brāhmī. (With some exceptions. Britannica replaces ß by ss, for example, as is done routinely even by German speakers in Switzerland, and it replaces þ by th. But it distinguishes correctly between the first names of Thorbjørn Egner and Thorbjörn Fälldin, for example.) Given that English dictionaries list words such as exposé with an accent, I simply won't buy that Britannica is in error when it uses diacritics in titles. Hans Adler 18:20, 11 June 2011 (UTC)
Hans, can you point me to where you've shown that CMoS supports diacritics? --Piotr Konieczny aka Prokonsul Piotrus| talk 18:58, 11 June 2011 (UTC)
It was in two links to "Chicago Style Q&A" in my longest post. To quote from them: "In any case, it is not true that English is without accents. I would guess that accents were often dropped in published material many years ago because of the extra difficulty of typesetting them—especially in the case of a word like façade (Webster’s prefers facade but allows façade; American Heritage prefers façade but allows facade). On that basis, I would guess that in the future, accents will become more rather than less common in English." [12] "Assuming that the readers are to be primarily English-speaking, I’ll follow Webster’s 11th Collegiate Dictionary, which lists Iguaçú first (though Iguazú is listed also, as an equal variant; Chicago usually picks the first-listed term and sticks with it)." [13] Hans Adler 19:11, 11 June 2011 (UTC)
Meanwhile I actually found a way to access the Chicago Manual of Style from home.
  • Some example sentences speak for themselves: "He is a member of the Société d'entraide des membres de l'ordre national de la Légion d'honneur."
  • But it gets more explicit elsewhere: "Any foreign words, phrases or titles that occur in an English-language work should be checked for special characters -- that is, letters with accents [...], diphthongs, ligatures, and other alphabetical forms that do not normally occur in English. Most accented letters used in European languages [...] can easily be reproduced in print from an author's software and need no coding. [...] If type is to be set from an author's hard copy, marginal clarifications may be needed for handwritten accents or special characters (e.g., 'oh with grave accent' or 'Polish crossed el'). If a file is being prepared for an automated typesetting system or for presentation in electronic form (or both), special characters must exist or be 'enabled' in the typesetting and conversion programs, and output must be carefully checked to ensure that the characters appear correctly."
  • The following on typesetting French is particularly interesting: "Although French publishers often omit accents on capital letters [...] they should appear where needed in English works, especially in works whose readers may not be familiar with French typographic usage." (My italics.)
  • And on romanization: "Nearly all systems of transliteration require diacritics [...]. Except in linguistic studies or other highly specialized works, a system using as few diacritics as are needed to aid pronunciation is easier to readers, publisher, and author. [e.g. Shiva not Śiva, Vishnu not Viṣṇu] Transliterated forms without diacritics that are listed in any of the Merriam-Webster dictionaries are acceptable in most contexts."
Unfortunately I am afraid we will still continue to read that using diacritics in English text is just plain wrong... Hans Adler 22:26, 11 June 2011 (UTC)

The question is: are there non-diacritics versions of these diacritized names, being used in english. The answer is yes, so use the non-diacritized version. This is how non-english names should be adopted to English Wikipedia. GoodDay (talk) 18:44, 11 June 2011 (UTC)

And can you prove to me that this simplified English is the correct English? --Piotr Konieczny aka Prokonsul Piotrus| talk 18:59, 11 June 2011 (UTC)
Can you prove to me that diacritics usage is best? GoodDay (talk) 19:08, 11 June 2011 (UTC)
It's the status quo and it's what the other encyclopedias do. You will have to prove that not using them is best, if you want to change the practice. Hans Adler 19:13, 11 June 2011 (UTC)
"...it's what the other wikipedias do", is not a good argument. I tried that argument with Infobox headings & it was rejected. GoodDay (talk) 19:35, 11 June 2011 (UTC)
He didn't say what other wikipedias do. He said what other encyclopedias do. The two are very different things. -DJSasso (talk) 19:43, 11 June 2011 (UTC)
(ec) To you? No you have already said you refuse to accept any evidence shown to you no matter what. There is a tonne of evidence given down below and above in the discussion. Rational open minded people would look at that and probably draw the conclusion that it is a good thing or at least be open to the fact that it might be. You however, have declared numerous times that no matter how much proof you are given that your mind won't change. -DJSasso (talk) 19:14, 11 June 2011 (UTC)
If an RM ends in favour of dios, I would respect the result (even though I wouldn't like it). I haven't been reverting any pro-dios page movements which were done arbitrarily. I'm not known as a page-move warrior. Anyways, If ya'll try and force something across the entire English Wikipedia, it'll be a recipe for disaster. GoodDay (talk) 19:20, 11 June 2011 (UTC)
No one is forcing anything. This is a discussion. This is how things work on the wiki. This is how change happens. And as you can see from the large discussion above, this is already common practice on the wiki, so really I would doubt there would be much disaster since it's already what happens the majority of the time. -DJSasso (talk) 19:23, 11 June 2011 (UTC)
I hope you're right. GoodDay (talk) 19:25, 11 June 2011 (UTC)
It is clear that there is no consensus to change the current policy.

Specific proposals to change the wording of the policy

I agree that this discussion is tl;dr, and likely going in circles. Some have asked "what can we do?"; well, if you scroll up, up and up to my mid-April post (#Conflict between usage and policy wording), you see my proposed wording change. I'd like to suggest that we move to discussing the specific wording change(s), in proposals like my new one below. Let's not vote yet, let's see if we can hash out one (or more) wordings that have some support, then we can put them up for a straw poll. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:58, 11 June 2011 (UTC)

Proposal 1 (Piotrus)

Current wording reads:

The use of modified letters (such as accents or other diacritics) in article titles is neither encouraged nor discouraged

I am proposing a change to:

The use of modified letters (such as accents or other diacritics) in article titles is common, and thus encouraged.

Note that this by no means overrules WP:NCGN (I thought I should state the obvious).

Brief summary of the rationale for that change:

Therefore, I believe it is time to recognize the trends in - so we may as well make it officially sanctioned by the policy (just look at Category:Polish people stubs or Category:Villages in Lower Saxony or many others).

Perhaps, the above rationale could be included in the article, too, although it would be nice to see some more proof for some of those claims.

This clarification is needed to stop the waste of time (WP:DEADHORSE...) that occurs when some editors try to move an article or a small group from a title with diacritics to one without, or object to a move in the opposite direction. Such objections happen to a few new articles, as all established articles have been moved to a diacritic-using name long ago.

Please note that this is a thread to refine the proposed wording change and arguments behind it, so let's keep "I don't like dios" "arguments" out of it. Thoughts? --Piotr Konieczny aka Prokonsul Piotrus| talk 18:58, 11 June 2011 (UTC)

Oppose We should stick with the RM route, even though it's time consuming. I wish to point out that as far as diacritics go, I haven't been moving pages & am quite capable of accepting an RM ruling, even when it favours dios. GoodDay (talk) 19:04, 11 June 2011 (UTC)
Support I support this proposal. It describes current practice through most of the wiki and I agree with most of the reasoning you list in your various links. Guidelines/Policy are supposed to describe practice not prescribe practice so the guideline clearly needs to be fixed to reflect what actually happens on the wiki. It would definitely help stop a very large time sink. -DJSasso (talk) 19:20, 11 June 2011 (UTC)
Comment The following clause in the sentence in question currently reads "when deciding between versions of a name which differ in the use or non-use of modified letters, follow the general usage in English reliable sources (for example other encyclopedias and reference works)." To encourage the use of a specific spelling and immediately afterwards refer to general usage is somewhat contradictory in the case where general usage does not follow that spelling (and if general usage does use the spelling with modified letters, then the proposed changes are redundant). Regarding the broader point, I think it is important to keep in mind that if there is evidence that the subject has adopted a different spelling of its name, then it is appropriate to adopt this spelling, regardless of how the name is spelled in its original language. The proposed change does not take this into account. isaacl (talk) 20:31, 11 June 2011 (UTC)
This seems like something that is too detailed for this policy and should be added to Wikipedia:Naming conventions (biographies). I do agree that if we can show that the subject has dropped diacritics from their name, we should use the version preferred by the subject. Although this may be better discussed elsewhere; subject's preferences are not always taken into account (consider Casimir Pulaski). --Piotr Konieczny aka Prokonsul Piotrus| talk 21:09, 11 June 2011 (UTC)
It goes beyond people names; for example, I don't know if NASA gave official translations of the term "space shuttle", but if it did, then they should be given precedence. I don't believe this policy should provide a blanket encouragement of a specific form of spelling, as each case has its own set of circumstances. It would be better to provide guidance on how to weigh if sources are reliably reporting the spelling of a subject in English. (Unfortunately, this is a pretty difficult task to do in general.) isaacl (talk) 21:34, 11 June 2011 (UTC)
Right, but this discussion is about people. A name for a person is very much a different thing than a term used for an object or company. An object/company won't, for example, be insulted by an incorrect spelling of their name. Which is one of the issues people have with the removal of diacritics when it comes to BLPs. There will be exceptions as mentioned where a person goes by a different name, and that can be dealt with on a case by case basis. However, we should be encouraging the name in its proper form for at least the title in the case of people unless there is a common form (different from just removing the diacritics). I would note we aren't actually talking about the translations of names, pretty much everyone agrees if there is a translated name then we use that. This change is meant to make it clear that removing diacritics isn't a translation. -DJSasso (talk) 21:49, 11 June 2011 (UTC)
"removing diacritics isn't a translation". You hit the nail on its head. This should be added to the policy, too. --Piotr Konieczny aka Prokonsul Piotrus| talk 22:00, 11 June 2011 (UTC)
I was replying to Piotrus who was indicating that the proposed changes went beyond just people's names. You and I know the intent is not to cover cases where someone has translated their name (whether or not the translation involves a dropping of diacritics or something more elaborate), but the proposed change does not make this clear. Wikipedia editors shouldn't be translating names on their own without reference to reliable sources, period, so I suggest that be made clear. isaacl (talk) 22:05, 11 June 2011 (UTC)
I certainly have no problem rewording the proposal if you can think of a better way to word it. But I do think this general change needs to be made. -DJSasso (talk) 22:08, 11 June 2011 (UTC)
Addressing the original question for this section, I suggest making it clear that the policy on common names does not prohibit the use of modified letters:
Wikipedia policy is neutral on using modified letters (such as accents or other diacritics) in article titles; when deciding between versions of a name which differ in the use or non-use of modified letters, in accordance with Wikipedia's policy on using common names, follow the general usage in English reliable sources (for example other encyclopedias and reference works). The policy on using common names does not prohibit the use of modified letters, if they are used in the common name.
isaacl (talk) 23:00, 11 June 2011 (UTC)
That doesn't really solve the issue however, because people will still say that because lots of English sites don't have the diacritics then we shouldn't have them. We need it to clearly say that if the person's actual name contains them, they should be used except in cases of people known better by pseudonyms or if there is a translation. What we are trying to make clear is that "Joè" is still the common name even if lots of English sites show "Joe" (assuming the person's name does actually have the diacritic). -DJSasso (talk) 23:04, 11 June 2011 (UTC)
... and that's why I suggested that better guidelines on evaluating reliable sources be drawn up. However, I don't have a good proposal in mind because, as I said, it's a difficult task to do. isaacl (talk) 23:17, 11 June 2011 (UTC)
I think of it not as a change but as a codification of what we have been doing for years. And of course it applies to all proper names, not just people's names. All style guides that I have seen treat personal names, geographic names, titles of books etc. in the same way. Except that for personal names the Economist style guide says one should use the version preferred by the person, if the person gives any guidance. Hans Adler 22:31, 11 June 2011 (UTC)
I do not support the proposed codification, to use your term, that appears in its current form to unconditionally favour the use of diacritics, since errors can happen both ways. For example, I believe there are articles on Major League Baseball players who have had diacritics added in error to their names. I can imagine there could be cases where Spanish language newspapers would have spelled these names incorrectly, so there would be (otherwise) reliable sources with the wrong spelling. Editors should be seeking clarification in all cases, not assuming that the spelling with diacritics is always correct. isaacl (talk) 22:45, 11 June 2011 (UTC)
We already have a policy for that: it's WP:V. No one is saying anything has to be assumed. You still have to have sources that show you their name is spelled with them. Nowhere in the change are we suggesting otherwise. All that is changing is the reliance only on English sources which are often wrong. There are always errors in some sources, but it isn't up to this policy to solve that issue, that is what WP:RS is for. This is purely a naming convention page. -DJSasso (talk) 22:51, 11 June 2011 (UTC)
The specific proposal doesn't address the reliance on English sources. It unconditionally says that the use of modified letters is encouraged. isaacl (talk) 23:02, 11 June 2011 (UTC)
Right, and they are as long as the name includes them. Questions on the reliability of sources and all that jazz fall into the WP:V and WP:RS pages. This page is just about how we choose the name. We don't have to spell out the other policies again here. That would be redundant. It is not unconditional, it's conditional on all the other verification policies we already have etc. -DJSasso (talk) 23:07, 11 June 2011 (UTC)
Would it help to tag on "... thus encouraged when supported by sources that the name includes them." and perhaps change "English language sources" to just say "reliable sources", period? -DJSasso (talk) 23:15, 11 June 2011 (UTC)
It's also redundant to say that using modified letters is encouraged in article titles. Wikipedia's policies on using common names, verifiability, and identifying reliable sources cover this. Wikipedia policy is agnostic on the use of modified letters. isaacl (talk) 23:17, 11 June 2011 (UTC)
Well that is what the discussion above was about, getting consensus that it actually isn't agnostic about that. And now we are looking at proposals to make that clear. -DJSasso (talk) 23:20, 11 June 2011 (UTC)
In my view, the discussion was about how to identify when there isn't an established usage in English, and what established usage/convention should be followed in this case. Whether or not any of the candidates include modified letters is not important (and is why I believe Wikipedia policy should remain agnostic on spelling). isaacl (talk) 23:25, 11 June 2011 (UTC)
  • Oppose Proposed language simply swings the pendulum one way. Too simple. I don't believe that writing something that is expected to be ignored has much merit. In general, article names, proper name spellings, and translations are in flux until it's clear what is most appropriate. I am certain there is a wide spectrum of cases. We should endeavour to use our best efforts in naming and usage for the sake of all users of Wikipedia. We can use the examples of other encyclopedias wisely. If this means more time must be taken to debate this, then so be it. I predict that diacritics will go on the rise simply out of the mix of international communications. I'd rather append a paragraph along the lines of 'Use the wording and spelling that is most appropriate given the locality, the regional or international notability of the subject. Use valid translations and transliterations as is compatible with the subject and its notability. Do not simply remove valid diacritics on the basis of little evidence, but recognize that the over-use of them may add little to a large segment of readers' comprehension of the topic's name and may be unfamiliar to the reader. A birth name may not be any more credible than the common name if the subject willingly discarded the spelling and usage of the birth name.' Chew on that. Reminds me of my OECD days ... ʘ alaney2k ʘ (talk) 04:08, 12 June 2011 (UTC)
  • Oppose: Proposal 1 (Piotrus) contradicts the policy of WP:COMMONNAME which specifies that articles are to be titled to match the name that is most frequently used to refer to the subject in English-language reliable sources. For biographical articles, this means that the article title does not use the subject's name as it is legally spelled, or as might appear on a birth certificate or passport. It instead uses the most commonly used form of name (not necessarily the “correct” one) as verified by reliable sources. The standard of Wikipedia is "verifiability, not truth", and this fundamental policy must not be ignored to encourage the use of modified letters. Dolovis (talk) 04:36, 12 June 2011 (UTC)
    WP:COMMONNAME is about titles, not about the spelling of titles. Otherwise almost all article titles that are currently at a British English spelling could be moved to the American English spelling. Try enforcing that, and you will see how wrong you are. The simple fact of the matter is that some sources are not reliable for the spelling of foreign names because they routinely drop all accents or otherwise butcher names. While an acceptable practice in some contexts, it is not acceptable in an encyclopedia of international stature. These sources must therefore be discarded, as they do not give information about this fine point of spelling. Hans Adler 08:14, 12 June 2011 (UTC)
  • Questions. (There isn't a discussion section of this proposal, so I apologise if this is the wrong place.) If this proposal is successful, do you still believe this guideline would default to the common name policy, i.e. using the most common name in English sources? If so, how would you reconcile the difference between this guideline and the article titles policy recommending different outcomes?

    You give examples of The New York Times etc. using diacritics. If it is the case that respected newspapers such as the NYT and others are willing and able to use diacritics, what do you suggest should happen in a move discussion where it's demonstrated that the NYT and other high quality newspapers have chosen not to use diacritics (and the subject isn't covered in books or encyclopedias)?

    Please note that I have recently been involved in several move discussions where I have been informed that newspapers like the NYT are not considered a reliable enough source for determining whether or not to use diacritics, so I find it odd that a proposal supporting the use of diacritics is now claiming that they are a reliable source for the use of diacritics. Jenks24 (talk) 07:33, 12 June 2011 (UTC)

    There was not supposed to be a vote at this stage in the first place. The usage of the New York Times is inconsistent. The same person is sometimes spelled with diacritics and sometimes without. (The same holds for the Guardian.) The New York Times Manual of Style has an Amazon preview that goes as far as "accent marks" (page 6), where it says they "are used for French, Italian, Spanish, Portuguese and German words and names. [...] Do not use accents in words or names from other languages (Slavic and Scandinavian ones, for example), which are less familiar to most American writers, editors and readers; such marks would be prone to error, and type fonts often lack characters necessary for consistency. Some foreign words that enter the English language keep their accent marks (protégé, résumé), others lose them (cafe, facade). The dictionary governs spellings, except for those shown in this manual. In the name of a United States resident, use or omit accents as the bearer does; when in doubt, omit them. (Exception: Use accents in Spanish names of Puerto Rico residents.) [...] Some news wires replace the umlaut with an e after the affected vowel. Normally undo that spelling, but check before altering a personal name; some individual Germans use the e form."
    So they sometimes drop accents (1) because of technical restrictions, (2) when they cannot guarantee to get them right (dropping them completely is more acceptable than getting them wrong), or (3) when the bearer drops them. We could think of adopting (3), but otherwise this is the same as our current practice, except we don't have the technical restrictions and we are usually able to distinguish between Julia Görges, who is consistently spelled Goerges by sports associations because they always butcher umlauts, and Angelika Roesch, who is occasionally spelled Rösch when a newspaper writer or editor tries to undo that butchering in a case where it didn't occur. Hans Adler 08:14, 12 June 2011 (UTC)
    You left out (4) "less familiar to most American writers, editors and readers." The NYT MOS also applies this policy to geographic names on page 143–144 (1999 edition). To quote "Retain accent marks in French, Italian, Spanish, Portuguese and German names only" (on page 144) ʘ alaney2k ʘ (talk) 14:11, 17 June 2011 (UTC)
I think that adopting the MoS of the NYTimes or the Guardian would be a narrow and imperfect solution, inapplicable to Wikipedia. The capacity and possibilities of the mentioned newspapers are limited, both technically and by the availability of human resources. Wikipedia is an open source project with international editorship that can reach a far better and more accurate outcome than the imperfect MoS of large newspapers written in English. As you have said, we don't have the technical restrictions, and the multilingual human resources can guarantee the correctness with very good precision (of course, it is always necessary to cite reliable sources). The NYT says: [the "accent marks"] "are used for French, Italian, Spanish, Portuguese and German words and names. [...] Do not use accents in words or names from other languages (Slavic and Scandinavian ones, for example), which are less familiar to most American writers, editors and readers; such marks would be prone to error, and type fonts often lack characters necessary for consistency." It is not a problem for us Wikipedia editors, as we have human resources familiar with Slavic and Scandinavian languages, and we have all the type fonts needed (the biggest advantage is that we can use the fonts in various combinations for any name and thus eliminate the problem with typing unusual characters). With all due respect, I consider the claim a bit discriminatory: Why should we privilege the major world languages over the minor ones? ¶ This proposal offers a bizarre step back: we should adopt the imperfection and limitations, even though we have the human resources and technical mechanisms allowing correct description of facts. It is not a defense of the English language, as the names aren't English.
The names are correct and complete with diacritical marks, no matter what the artificial and unnatural G-News search comparison says (you can search for a specific result by placing + before a name/word containing or lacking the accent marks [14]). Google search can't change someone's name. --Vejvančický (talk | contribs) 07:53, 13 June 2011 (UTC)
I do not endorse the NYT, Economist and Guardian MOS in all aspects. After all, they are the manuals of style of newspapers that are written almost exclusively by native English speakers. Newspapers are under severe time constraints that can make it impractical to check details of spelling in languages that are generally not well known among this demographic. (Actually, it's easier nowadays, but presumably all those MOSes date from the pre-internet era.) These languages are 'privileged' not because they are major world languages but because of practical concerns. When confronted with your name filtered through an ASCII-only medium (e.g. exchange of emails with someone who does not know how to enter accents on a US keyboard, or does not bother, or typical sports result tables) they would have a choice between guessing that "Vejvancicky" should really be spelled, e.g., "Vejvancićky" or playing it safe and omitting all accents.
We should follow the spirit behind these rules, update them to the internet age and adapt them to the international demographic of our editors, our lack of time constraints, and the higher precision requirements of an encyclopedia when compared to a newspaper. The inevitable result is something very much like the practice of Britannica and other English encyclopedias, or indeed our de facto practice. Hans Adler 08:11, 13 June 2011 (UTC)
So, how far would you go to including the diacritics? All of the languages of the European union? ʘ alaney2k ʘ (talk) 14:11, 17 June 2011 (UTC)
I think the style manual of the National Geographic could be a good inspiration for us. --Vejvančický (talk | contribs) 14:19, 17 June 2011 (UTC)
That is an interesting suggestion. Their first rule: Follow Webster's. Pertinent: "Retain the original diacritical marks (accents, apostrophes, dots, cedillas, glottals, etc.) in unanglicized words in the following languages: Czech, Danish, Dutch, Finnish, French, German, Hawaiian, Hungarian, Icelandic, Irish, Italian, Latvian, Norwegian, Polish, Portuguese, Slovak, Spanish, Swedish, and Turkish. Some anglicized terms from these languages also retain their accents (follow Webster’s)." Would that be acceptable to the editors here? There is an interesting quote: "The diaeresis is being dropped, though classical names and a few others still retain it: Laocoön, Brontë, the opera Aïda." I wonder what Finnish editors think of that, as the language uses that in names. E.g. Teemu Selänne. ʘ alaney2k ʘ (talk) 15:44, 17 June 2011 (UTC)
Diaeresis is a function of a trema. There is no diaeresis in "Selänne". In that case ä is a different letter, so the two dots are not dropped. It would be absurd to retain å and ø but drop the dots on ä and ö, and this is clearly not what is meant here. Hans Adler 16:38, 17 June 2011 (UTC)
The National Geographic guidelines are not in conflict with the current naming convention guideline. The crux of the disagreement between most of the editors in this thread is the clause from the second paragraph on the National Geographic page: "that have not become anglicized". Any suggestions on how to provide better guidance on determining when a term has become anglicized? isaacl (talk) 16:24, 17 June 2011 (UTC)
Good point. I think it is impossible to avoid ambiguity and find a perfect way to word a new proposal. The current policy says: The title of an article should generally use the version of the name of the subject which is most common in the English language, as you would find it in reliable sources (for example other encyclopedias and reference works). Imagine a hypothetical situation: We would follow the guideline verbatim, we would compare the results of G-search or another search engine and then divide the names into two groups based solely on the search results. It would be in my opinion unencyclopedic, inconsistent and totally confusing to the majority of our international readership. It would create a crazy and unnatural situation that would have little to do with correct encyclopedic description. A really good encyclopedic project cannot distort reality and make up proper names based on language selection. The current guideline is ambiguous and imperfect; however, it works because common sense and rationality win over the drastic simplification offered by the proponents/supporters of this RfC. I think we should emphasize that in case of a proper name or a toponym in the areas covered by the languages mentioned above, we should prefer the form retaining the original diacritical marks, and we should emphasize that exceptions are allowed (e.g. the subject has decided to drop the diacritics from their name). Unfortunately, our situation is a bit different from the National Geographic or the NYTimes, as we don't have any directive mechanisms to enforce anything. Wikipedia's crowd administration is a benefit but also a handicap ... and it is a cause of the chaotic and disorganized discussion on this page. --Vejvančický (talk | contribs) 08:45, 18 June 2011 (UTC)
Note the guideline cautions against the use of search engine results to determine common English usage, and instead says that "verifiable reliable sources" should be given a higher weighting. Can anyone suggest how to evaluate the reliability of a source's reporting of a name, with the understanding that inaccuracies can occur in both directions (a source might mistakenly assume, for example, that a person of Latino descent has accents in his/her name), so an asymmetric guideline favouring one form of spelling is undesirable? isaacl (talk) 15:07, 18 June 2011 (UTC)
Our readers have a choice: they can type a version omitting all accents and they are subsequently redirected to an article containing the full name. It is beneficial to everyone and it can damage neither the encyclopedia nor the English language. This project should be a modern reference point providing undistorted facts, especially in case of proper names. I completely agree with the summary in the last part of your comment. Vejvančický (talk | contribs) 10:44, 13 June 2011 (UTC)
We all want verifiability, though. How will you be able to propose a Slavic spelling for someone who has moved to North America and subsequently become notable? Will you be willing to accept that the person willingly dropped the diacritics, or will you insist on the birth spelling? What if we find no diacritical spellings in English sources, but we do in a Czech sports article reporting a youth game, or even reporting on the person's activities in North America? Regardless of the rules of translation, we have a verifiable spelling that should not be discarded lightly. ʘ alaney2k ʘ (talk) 16:10, 13 June 2011 (UTC)
The concept of "notability" here on Wikipedia or any number of news articles can't change someone's name and I would insist on the birth spelling. Of course, if a bearer chooses to drop the accents, we should respect that, but I'm afraid in most cases it would be hardly verifiable. Vejvančický (talk | contribs) 16:45, 13 June 2011 (UTC)
Tell us, what is the "birth name" of August Dvorak (the engineer, not the composer)? How do you propose we find out? Septentrionalis PMAnderson 00:05, 16 June 2011 (UTC)
It looks like August Dvorak was born in Glencoe, Minnesota and was American [15], so I think in this case it is correct to write his name without the accent marks. --Vejvančický (talk | contribs) 08:37, 17 June 2011 (UTC)
  • Support - as another data point, I just picked up from my real life mailbox copies of the latest journals from the American Economic Association and just happened to notice that the journal (AEJ: Macroeconomics) uses diacritics as well. Quite frankly this "anti-diacriticism" appears to be some kind of a Wikipedia-particular obsession, where you get a bunch of people who have convinced themselves that they know "what English really is" based on some kind of "I didn't see it in my high school readings or in People magazine so it must not be English" experience, but haven't actually bothered to look around how actual (academic) English language sources approach diacritics (they use them). If it's good enough for top journals in Economics, it should be good enough for Wikipedia. Unless we want our standards to be lower or something, which is sometimes the impression I get. Volunteer Marek (talk) 23:32, 14 June 2011 (UTC)
I saw the same article and thought the same thing! Demokratickid (talk) 00:06, 16 June 2011 (UTC)
The Principal-Agent/Experts paper or a different one? (And that was actually Micro). Edit: there's one in AEJ:Macro too, on the recessions and expansions article. Volunteer Marek (talk) 00:30, 18 June 2011 (UTC)
That is no big thing. The printer has installed the software to print the fonts. You don't have to install anything. Not all symbols are supported on computers. There is a standard, Unicode, but in North America it often requires installing optional components. Not all standard fonts have all of the possible characters. Why? Because they are rarely used. Besides that, Wikipedia is clearly -not- an academic journal. ʘ alaney2k ʘ (talk) 18:47, 17 June 2011 (UTC)
No, it is not. It is an encyclopedia, and as we have shown, all English language encyclopedias like Britannica, Columbia or Encarta, use diacritics. Funny how you keep ignoring this little point... --Piotr Konieczny aka Prokonsul Piotrus| talk 00:12, 18 June 2011 (UTC)
  • Support it would be absurd not to encourage the use of correct versions. As an editor has said, the ignorance on the part of the public is not an excuse. It would make no sense not to use diacritics in German (Dusseldorf?), Czech (Ceska Trebova?), Polish (Lodz?) or in just any more or less known language. Miacek and his crime-fighting dog (woof!) 09:43, 15 June 2011 (UTC)
  • Support. The use of modified letters in article titles IS common, as can be seen every day on the Wikipedia's main page, which features many articles with diacritics in their titles. - Darwinek (talk) 20:27, 15 June 2011 (UTC)
  • Very strongly oppose This effort by the editors of a certain national affiliation to change the English language is appalling; those who are not content to write in English have a wide variety of other Wikipedias to adorn. The arguments for this, where they are not patently false (Dusseldorf remains quite common in actual English, as do analogous forms), or ignorant (what is used in English is correct, in English), amount to claims that diacritics are used in certain contexts; this is true, and where it is true, Wikipedia should use them. We should not change policy on the grounds that it is no change in policy. Septentrionalis PMAnderson 21:23, 15 June 2011 (UTC)
    • This encyclopedia is intended to serve those whose first language is English, especially the majority of them who are not fluent in any other language; they have no other Wikipedia. Others are welcome, but not at the expense of our primary mission; those who are fluent in other languages do have other Wikipedias, where they may spell as they please. Those who really want an "international" Wikipedia, in broken English, would do well to ask MediaWiki to authorize one; I expect any such request would be readily granted, and I should be fascinated by the result. Septentrionalis PMAnderson 01:48, 16 June 2011 (UTC)
      • I'm afraid I'm repeating myself. This encyclopedia is intended to provide correct and undistorted information. The proper names discussed here have nothing to do with the English language, a foreign proper name cannot be called "broken English". How can I break the rules of the English language by signing my full and correct name? --Vejvančický (talk | contribs) 08:37, 17 June 2011 (UTC)
        • The proposal here, however, is to use diacritics where they are, in English, incorrect, even though they are incorrect. Where diacritics are the correct spelling, this policy supports their use. Septentrionalis PMAnderson 22:15, 20 June 2011 (UTC)
          • Please, no straw men. The proposal suggests we use diacritics where they are used in English by Britannica, Encarta, or other sources (which is in most cases, unless there is an established alternative name for a locale, or the subject is naturalized and prefers dediacriticized names). What you and some others seem to argue, however, is to remove all diacritics from the project, with the very few possible exceptions (the oppose votes range from the "few exceptions allowed" to the "death before a single diacritic allowed" camp). --Piotr Konieczny aka Prokonsul Piotrus| talk 22:22, 20 June 2011 (UTC)
            • No, what you paraphrase is more or less what the guideline says now: follow the general usage in English reliable sources (for example other encyclopedias and reference works) (it does not appeal to Encarta, which has gone out of business). You wish to change that to unconditional encouragement, whatever English reliable sources do. I support the present language; I have also supported it against the minority who would kill all diacritics. Septentrionalis PMAnderson 22:45, 20 June 2011 (UTC)
              • The present wording is pretty good, and perhaps my proposed wording can be adjusted. The problem was, and is, that in most cases, sources support the use of diacritics, and so does our common usage. So perhaps we should not encourage or discourage the use, but add a sentence stating the fact that in most cases, when diacritics can be used, they are (think: most names or less known (and thus more common) places). --Piotr Konieczny aka Prokonsul Piotrus| talk 06:09, 21 June 2011 (UTC)
  • Support Unlike the above editor claims, diacritics are used in English and are a part of the English language. I am not of a national affiliation and I support this measure, and take offense that it is presumed that all people in favor of diacritic use are pushing anti-English elements. Demokratickid (talk) 00:06, 16 June 2011 (UTC)
    • This editor has a userbox declaring his membership of WikiProject Slovakia. Septentrionalis PMAnderson 01:30, 16 June 2011 (UTC)
      • Your point being? Or has this discussion turned into a McCarthyist witch hunt? Demokratickid (talk) 16:39, 16 June 2011 (UTC)
        • Not at all. If certain western Slavic editors wish to have a Wikipedia which conforms to usages they are more comfortable with, rather than with English, they should really go build one. I believe Wikimedia would produce a fork which can be adapted to anybody who prefers not to write or read English, but if not, see WP:Mirrors and forks. In the meantime, let the majority write in English, to be read by anglophones. Septentrionalis PMAnderson 18:15, 17 June 2011 (UTC)
          • Don't be obnoxious. We ("certain western Slavic editors") are building one (an English language encyclopedia, that is). Or is this some kind of a megalomaniac WP:OWN on a project wide scale? Other encyclopedias use diacritics, academic journals use'em, books use'em. YOU go fork a dumbed-down version of English Wikipedia if you want to. I'm going to stick around this one and try to make it better. And this is definitely a step in that direction. Volunteer Marek (talk) 00:29, 18 June 2011 (UTC)
          • Excuse me, Pmanderson!? How dare you assume my bad faith. As I have stated numerous times, I am an American. I was born in the USA and have lived here my entire life and I am also an Anglophile. How dare you insult me by off-handedly categorizing me as someone who wishes to trash up 'your' language by using proper diacritics. I demand an apology, not just to me, but to everyone who happens to be of West Slavic descent who edits on English Wikipedia. If you did not intend your remarks as pointedly hurtful or minimally assumptive of bad faith, then I also kindly suggest you rephrase or retract your words. Please, this is Wikipedia, we are supposed to have some standards to live up to. Demokratickid (talk) 02:18, 18 June 2011 (UTC)
              • I don't assume bad faith; you two have carefully provided the evidence I observe. Septentrionalis PMAnderson 20:11, 19 June 2011 (UTC)
  • Support Per Vejvančický's arguments. Also this proposal will lead to a less time consuming method of dealing with the (let's be honest, relatively small) issue of moving articles for the reason of diacritics alone. If I find a diacritic unknown to me I just imagine the letter without it; it's not really a tiring mental exercise, and I believe all of our readers are capable of doing the same. There is no loss of functionality here, or being harder to read or understand, just more information being provided, which is what Wikipedia is about. Hobartimus (talk) 19:57, 16 June 2011 (UTC)
Hear hear! Demokratickid (talk)
  • Support Diacritics are being used in most of the articles already and it seems that there isn't any confusion. Articles on people should use the spelling that is applied in their documents. Every article with diacritics in its title should also have a redirect page without them. Ratipok (talk) 02:34, 17 June 2011 (UTC)
    An inhabitant of Slovenia, according to his user page. Septentrionalis PMAnderson 18:15, 17 June 2011 (UTC)
    Which doesn't matter in the slightest. We don't segregate people here. -DJSasso (talk) 19:22, 17 June 2011 (UTC)
    I have to say, Septentrionalis, I am rather disappointed by this argument. Discuss the edit, not the editor. Labelling editors as "connected with a diacritic-using country thus a special case and likely biased", which is what you are obviously implying, is not very helpful. But please explain to me why is it that this "diacritic cabal" has nonetheless taken over all English language encyclopedias, academic journals and many, many books? Will you claim that Britannica is using "broken English" too? :) --Piotr Konieczny aka Prokonsul Piotrus| talk 00:16, 18 June 2011 (UTC)
    I don't have to explain what is not the case: Had some diacritics cabal taken over "all English language encyclopedias, academic journals and many, many books", you wouldn't need to amend the policy; its present language would provide the guidance you desire. When, as with Besançon, a diacritic is the standard English spelling, this guideline says unequivocally to use it. Septentrionalis PMAnderson 20:11, 19 June 2011 (UTC)
  • Comment I found the UN's Style Manual from 2002 (quick view here), see page 23: "Respect use of accents and special characters in proper names. EXAMPLE: Zéphirin Diabré." I think that in contrast to the English language newspapers, the United Nations is an institution that must follow specific international language standards with far more caution. Their style manual is in my opinion closer to the principles of this international encyclopedic project, and it is worded clearly. Vejvančický (talk | contribs) 12:50, 20 June 2011 (UTC)
    • Comment That's the UN Development Programme style manual that you've linked, not the 'UN Editorial Manual.' Their first rule, in that section : "Follow The Concise Oxford Dictionary (Ninth Edition)." I would not want to follow the UN and its long debates about wordings and spellings. They have to maintain spellings and exact meanings across languages. ʘ alaney2k ʘ (talk) 15:45, 20 June 2011 (UTC)
  • Support - per Vejvančický's arguments, accuracy and in accordance with numerous style guides. Daicaregos (talk) 13:51, 20 June 2011 (UTC)
  • Very strong oppose - our use of diacritics is already very bad and very out of control. There are many reasons why the English Wikipedia should be written in English, and this is one of the strongest.--Jimbo Wales (talk) 14:53, 20 June 2011 (UTC)
  • Comment: English Wikipedia does not follow the UN style manual. It follows Wikipedia policy that has been created by a consensus of editors through discussion. The policy as spelled out at Wikipedia:Article titles requires that the article title is to use the name that is most frequently used to refer to the subject in English-language reliable sources. This applies to the title of the article – but within the text of the article, pursuant to WP:MOSBIO, the person's legal name should usually appear first in the article. I trust that explains the current Wikipedia policy as it relates to this issue. Dolovis (talk) 14:57, 20 June 2011 (UTC)
  • Comment: I would prefer The use of modified letters (such as accents or other diacritics) in article titles is discouraged, but Redirects using these modified letters are actively encouraged. I primarily use the English language Wikipedia, and I use an English language keyboard. When I search for a name, I don't use diacritics because that would require extra effort with my keyboard. I expect most users of the English language Wikipedia do the same, so the most common name for any article is therefore one without diacritics. Of course the full name in the lede, and whatever Redirects are considered appropriate, are proper and sensible. Do a Google search for various names, and you'll see a mix of results. Flatterworld (talk) 15:57, 20 June 2011 (UTC)
  • Oppose. The English spelling of proper nouns often omits the diacritics, and in that case, we "use English" and drop them. It is when there is no accepted English version that we revert back to the native spelling. There are cases where diacritics are absolutely essential (e.g. the Norwegian letters "Æ", "Ø", and "Å" cannot be omitted without changing the name completely, and most English sources keep those letters intact). There are other cases where they are inappropriate since the English version is well established (see the arguments against diacritics at Talk:Peter Leko for an example where the diacritic-free version is used even by the subject himself). From a practical perspective, diacritics make the article harder to read and edit, so I see no need to encourage them unless their omission is blatantly incorrect. Sjakkalle (Check!) 16:43, 20 June 2011 (UTC)
  • Oppose. I see this refrain that "diacritics are used in English" all the time, but the problem is that it's just not true. The only use of diacritics in English are through loan words, and they are always dropped (very) quickly. The "numerous style guides" that a couple of people above have cited are simply wrong, and they don't actually have consensus (not allowing people to change the guideline does not create consensus). The English Wikipedia should be written in English, just as the French Wikipedia should be in French, the German in that language, etc.
    — V = IR (Talk • Contribs) 18:17, 20 June 2011 (UTC)
    • Except that it is true. You obviously haven't bothered to read the numerous examples given above. From the Chicago manual of style, to other encyclopedias, to academic journals - these all use diacritics. Including diacritics IS writing in English, proper, widely used English. Volunteer Marek (talk) 02:05, 21 June 2011 (UTC)
      • Bullshit. Perhaps you should take some time to re-read the examples that you're pointing to, after allowing the scales to fall from your eyes.
        — V = IR (Talk • Contribs) 01:27, 22 June 2011 (UTC)
        You are not in a good position for this kind of attack after making ludicrous, obviously counter-factual claims.
        • Wikipedia linguist Ohms law: "The only use of diacritics in English are through loan words, and they are always dropped (very) quickly."
        • Concise Oxford Companion to the English Language: "There is a continuum in borrowing, from words that remain relatively alien and unassimilated in pronunciation and spelling (as with blasé and soirée from French), through those that become more or less acclimatized (as with elite rather than élite, while retaining a Frenchlike pronunciation, and garage with its various pronunciations) to [...]." [16] (Other examples of pretty old loanwords that are still very commonly spelled with their French accents include bon appétit, café, canapé, château, cliché, communiqué, coup d'état, coup de grâce, crème brûlée, déclassé, décolleté, décor, déjà vu ... If you don't believe that these are all legitimate English spellings, consult a dictionary. They may be rarer among undereducated Americans than among the Brits, but that doesn't make them wrong.)
        • Wikipedia linguist Ohms law: "The 'numerous style guides' that a couple of people above have cited are simply wrong".
        • A large number of English style guides including the Chicago Manual of Style: Detailed information about how and when to use diacritics in English. All agree that foreigners with a Latin-based name are spelled in English precisely as they spell themselves. (Unless they have Napoleon-like fame and hence an English name.)
        • All major encyclopedias: Use diacritics for foreign proper names in all European Latin-based proper names.
        • No style guides that anyone has found yet: Advise dropping diacritics in French or German. Hans Adler 16:09, 2 July 2011 (UTC)
  • Oppose We are the English Wikipedia. We should be using the most common names in English. A Quest For Knowledge (talk) 19:20, 20 June 2011 (UTC)
  • Support, good idea. As an encyclopedia, we educate, we do not censor diacritics to provide an easier-looking (but dumbed down and less accurate) "English". It is great that we have Unicode, no need to go back to ASCII for the sake of stupidity. —Kusma (t·c) 20:18, 20 June 2011 (UTC)
  • Support as a good start in reflecting community practice and opinion, but we should clarify when diacritical marks are common (Dominik Hašek, Düsseldorf, À nous la liberté and Médecins Sans Frontières) and when they're not (George Frideric Handel, Montreal and debut). In addition to the style manuals already mentioned, I found these:
  • British Council: "Accent marks: Retain when using foreign names, whether personal, geographical, or company titles [...] Personal names: When you cite a person's name, it is important that you spell the name correctly, so check, even if the name appears to be a simple one."[17]
  • Council of Science Editors: "Retain diacritics in personal names and place names if the names have not been anglicized. Word-processing programs now offer a wide variety of characters combining letters and the applicable diacritics, but such characters must be checked after typesetting to ensure that the desired characters appear." (Scientific Style and Format: The CSE Manual for Authors, Editors, and Publishers - p. 65)
  • European Commission: "Personal names should retain their original accents, e.g. Grybauskaitė, Potočnik, Wallström."[18]
  • The Times: "Give French, Spanish, Portuguese, German, Italian, Irish and Ancient Greek words their proper accents and diacritical marks; omit in other languages unless you are sure of them. Accents should be used in headlines and on capital letters. With Anglicised words, no need for accents in foreign words that have taken English nationality (hotel, depot, debacle, elite, regime etc), but keep the accent when it makes a crucial difference to pronunciation or understanding - café, communiqué, détente, émigré, façade, fête, fiancée, mêlée, métier, pâté, protégé, raison d'être; also note vis-à-vis."[19]
  • The higher quality the source, the heavier the use of appropriate diacritics seems to be. Prolog (talk) 20:21, 20 June 2011 (UTC)
  • Oppose: We should be following the common practice of using WP:COMMONNAME from English language sources. If the common name from English sources is the one with diacritics, then we should be using that. However, in any case where we have the title in diacritics after following COMMONNAME, we should also make sure to have the name in regular spelling as an alternative within the first line of the article. That solves both problems right there. It is exactly because of the following of COMMONNAME that the use of diacritics is neither encouraged nor discouraged, because we use both where appropriate when it is the most common name in English sources. Thus, we have no preference one way or the other, but follow the proper processes to determine which name is proper from the English sources. That is what it means to be neutral. SilverserenC 20:59, 20 June 2011 (UTC)
  • Support The use of diacritics is normal in scholarly sources. See no reason why Wikipedia shouldn't follow.--MyMoloboaccount (talk) 08:44, 21 June 2011 (UTC)
  • Support Piotrus's arguments are convincing, and I also like Hans Adler's comment about spelling. After all, the policy is titled WP:COMMONNAME, not WP:COMMONSPELLING… Eisfbnore talk 11:32, 21 June 2011 (UTC)
  • Support; I could probably go on a rant about imperialism, but I'll simply copy the relevant bit of my argument from Jimbo's talk page: My given name is Marc-André. It's not "Marc-Andre", nor is it "Marc" and if I magically became notable, an article titled "Marc-Andre Pelletier" would be, simply, erroneous. I do not have a name in English, though I conventionally accept being called Marc for simplicity's sake, and I would be very much insulted at the suggestion that I should pretend that some random sequence of letters that resemble my name are my name to assuage some naming convention. "Marc-Andre" is no closer to my name than "Xarc-André" would be, and just as incorrect: in both cases you'd be randomly substituting some incorrect letter.

    We are an encyclopedia. We should strive to not make clear errors when we can avoid it, and I can't think of an argument that would classify "substitute a letter of the title of an article with another that looks a little bit like it" as anything but a clear error. — Coren (talk) 14:55, 21 June 2011 (UTC)

    • Indeed. For the same reason I am Piotr, not Peter, even though I tell my American friends to call me Peter. We don't translate names in English, and my second name is Bronisław, not Bronislaw, just like Coren is Marc-André. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:50, 21 June 2011 (UTC)
  • Strong Oppose I am in general agreement with Jimbo and others here. This isn’t a simple problem and the proposed guideline is overly simplistic. After having read through the arguments and researched the issue, it is clear that those specializing in the art of English-language technical writing directed to a general-interest readership need to run off somewhere and discuss this further. The current guideline (chaos rules and however some 14-year-old starts out an article goes a long way) is unsatisfactory and the proposed change is half-baked and worse than the existing one. To properly address this issue, IMO, requires a more nuanced guideline.

    My first observation about diacriticals is that respected English-language newspapers like The New York Times (example here) and The Guardian (example here) don’t use diacriticals in titles or body text for “Ho Chi Minh City” but our article has Hồ Chí Minh City throughout.

    I see arguments about the practices of “scholarly sources.” The trouble is that Wikipedia is directed to a general interest readership and is not a scholarly resource for specialists working on their doctoral thesis; if Wikipedia were, our articles would be incomprehensibly abstruse. Some of our articles have obscure diacriticals that a general-interest readership never sees being used in English-language publications that are directed to a general-interest readership; our use of them for a general-interest readership makes Wikipedia look either pretentious, or like it has been hijacked by foreign-language purists who have little knowledge about English-language technical writing and journalistic practices, or both. Greg L (talk) 15:10, 21 June 2011 (UTC)

  • Greg, you miss the point, and are mistaken in the description of the situation.
  • First of all, the article you cite is using the de-diacriticized title (it is at Ho Chi Minh City, not Hồ Chí Minh City, which is a redirect). There are no diacritics in the bolded lead opening, nor in the English name in the infobox. Next, within the article proper, the use of diacritics is inconsistent, which is something that should be fixed. As such, this article does not represent either side of the argument here; it is simply a mess, which does, however, represent the current policy (where the diacritics are neither discouraged nor encouraged). This suggests it is a case ripe for a discussion on the talk page on whether English sources use or don't use diacritics when referring to it. After this is made clear on talk, the article should be updated to reflect the consensus. And while I come from the "diacritics should be encouraged" camp, I am open to being shown that in this particular case the sources do not use diacritics, and that this may be one of the exceptions, where the common name in English sources is without diacritics. But let me repeat that this article is NOT using diacritics (it uses them inconsistently); to use this article as an example of improper diacritic use, without further investigation, is not that helpful (again, I am saying this while being open to the possibility that this article should not be using as many diacritics as it is).
  • Lastly, your argument about the academic use ignores the fact that diacritics are widely used outside academia, primarily by other encyclopedias, and in many other cases by the very newspaper you cite (check for links in the preceding discussion). The fact that they are not used in the two articles you cited may simply mean that the article (name) you cite is one of the exceptions where it has been adopted into English without diacritics. --Piotr Konieczny aka Prokonsul Piotrus| talk 17:13, 21 June 2011 (UTC)
  • You are right, I didn’t fully appreciate the particulars of precisely what is being discussed: article titles only; my post spoke to the whole broad issue of diacriticals in Wikipedia’s articles. I also agree with you that the Ho Chi Minh City article is “a mess.” One thing is quite clear to me: Wikipedia’s classic approach to achieving consensus on bitterly divisive issues (do whatcha like ta) isn’t working with diacriticals. The whole broad issue needs to be sorted out so that a comprehensive, consistent, and clear guideline properly addresses the broad issue of diacriticals in all parts of our articles. The current guideline for titles is beyond-worthless. I find your proposal (encourage use in the titles too) is far too simplistic and seems the wrong direction to go for titles.

    The solution, IMHO, now that Jimbo has spoken, is keep him out of it and for a group of specialists (people who make a living at English-language technical writing and bless Wikipedia with their insight) go run off and come up with a guideline that relies less on chaos theory.

    Jimbo’s input is highly valuable as he serves in a leadership position and his voice is respected and second to none. However, his presence creates the “vortex” phenomenon whereby wiki‑theater and wiki‑drama cause far too many editors with no expertise in a particular matter to do an ill‑considered drive‑by shooting so as to assert that they too exist and matter in Wikipedia affairs. Well, I suppose that beats running about and tagging store fronts with graffiti, but the result after Jimbo has landed somewhere is not an exhibition of Wikipedia deliberations on complex issues at their finest.

    I’ll conclude by observing that, in many cases, I suspect the better way to use diacriticals in articles is to show just once how the word is properly spelled with all its diacritical glory—like how a word is pronounced. But I type on a Mac, where many of the common accents are keyboard-accessible. That’s why I type with curly quotes (“like this”); they’re keyboard accessible. I can also easily type “naïve” from the keyboard. But people using Windows computers have so much difficulty with special characters that curly quotes (an example of fine, exemplary typography) are discouraged. With diacriticals like “Hồ Chí Minh City”, we’re now talking about the use of diacriticals that even I don’t know how to type and it’s safe to say that most others aren’t going to bother when they need to type “Ho Chi Minh City”. I find that our use of them, particularly the really odd stuff that newspapers and magazines like The New York Times and Newsweek invariably skip, looks elitist and pretentious in many cases.

    I don’t know what the final, comprehensive guideline ought to be. But adding über-exotic diacriticals to the titles of articles seems the wrong way to go, which is in agreement with (*sound of audience gasp*) Jimbo’s opinion. Greg L (talk) 18:31, 21 June 2011 (UTC)

  • If we had an expert group... but we don't, and most of our policies have been created by those who care enough, often codifying existing practice. My point is that the practice, on Wikipedia, has evolved to support the use of diacritics; check categories on German or Polish villages or people, and you'll see that in 99% of cases, if a diacritic is to be used, it is. Second, on the expert argument, we have shown above that experts - be it other encyclopedia authors or academics - prefer to use diacritics (note that this does not mean they use them 100% of the time; there are exceptions, and Ho Chi Minh City may be one of them). This discussion was started, IIRC, by an RfC by a user who tried and failed to move a bunch of (Hungarian?) bios to de-diacriticized titles, and who objected, in futility, to a bunch of RMs moving other bios to titles with diacritics. The bios in question were, IIRC, low key sportspeople who are rarely mentioned in English sources; those mentions are inconclusive with regards to diacritics - but they are common in other languages. As those RMs have shown, and the usage of diacritics in names of peoples and places elsewhere on Wikipedia, we have a rough consensus to use diacritics in such situations. My proposed change was an attempt to reflect it, to stop people arguing in RM that "our policy/English language does not support the use of diacritics". You may be right that the wording was not good for that, and I'd appreciate it if you could come up with an idea of how to improve it. But I am pretty sure it is up to us to do it; we will not find any "experts" to do it for us. --Piotr Konieczny aka Prokonsul Piotrus| talk 18:47, 21 June 2011 (UTC)
  • You are misinterpreting what I am referring to when I write of “experts.” I am talking about those wikipedians who have been trained in English journalism and make a living in English-language technical writing. They actually know what they are doing and plenty of them inhabit Wikipedia. I am advocating that we seek their guidance rather than pretend that good and wise manuals of style come about just because some 14-year-old with A) a pulse, and B) a computer, and C) an Internet connection, can come along and magically become wise and knowing on all Wikipedia matters. Whereas I might not entirely agree with him, User:Prolog, above, would be a good editor to include in a focus group as he actually seems adept at looking some of this stuff up and seems to have an appreciation for details.

    Wikipedia has had its share of those who advocate that Wikipedia become way-cool and embrace new-age terminology like The Dell Inspiron came with 256 mebibytes of RAM when the rest of the computing world universally uses “megabyte”. The reasoning was that the new proposed standard was the future and Wikipedia is so way-cool that we ought to lead the way to a New, Unfamiliar, and Baffling Future.™®© In this case, we’re talking about writing in a manner that few English-speaking readers can replicate and seldom see (like “Hồ Chí Minh City”). I didn’t see it that way on those computer prefixes. I’m not seeing it now with using every diacritical known on this pale blue dot as I find it awkward and pretentious in many cases. One always avoids a writing style that unnecessarily calls attention to itself to its target readership (a general-interest one in our case).

    As I wrote above, it’s time for a comprehensive guideline covering the whole issue; not this one aspect, which “encourages” their use in titles of all things. Arguments about “Unicode” (below) and what technology supports (256-bit character sets capable of supporting Klingon?) are beside the point. We’ll just have to agree to disagree here. I do hope that is OK with you. Happy editing. Greg L (talk) 20:18, 21 June 2011 (UTC)

  • re "we’re now talking about the use of diacriticals that even I don’t know how to type": There is an easy way to type those diacritics : copy paste. Munci (talk) 10:18, 23 June 2011 (UTC)
  • Support, per Hans Adler and Vejvančický. We're an encyclopaedia. Our job is to represent the fact of a person or place's actual name - if English-language speakers don't know how to type or to pronounce those unfamiliar characters, that's why we have redirects and a pronunciation guide. (It's not as if normal English spelling can be relied upon for pronunciation - we wouldn't change the title of an article like Kirkcudbright to "Kerrcoobree" because that's how it's pronounced, but we might set up a redirect for the benefit of someone who's heard the name but not seen it written down). We insult our readers by assuming that they can't cope with mentally removing some unfamiliar markings from the basic English character set. The issue of the difficulty of typing diacritics has little relevance, as editors who work on those articles will know how to do it, and everyone can copy and paste, either from elsewhere within the article or from the Unicode Latin character set that Wikipedia conveniently provides immediately below the edit box. Diacritics are no barrier to searching - the search box copes admirably with them. If you search using only non-diacritic'd characters, the search facility will find and suggest any variations that have diacritics, whether or not a redirect exists. In short, using diacritics is the right thing to do and presents only trifling inconvenience to those who prefer to ignore them. Colonies Chris (talk) 19:20, 21 June 2011 (UTC)
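The search behaviour described in the comment above (typing plain letters and still finding titles that carry diacritics) can be illustrated with a minimal sketch. This is not how Wikipedia's search is actually implemented; it only assumes a simple approach using Unicode NFD decomposition from Python's standard library, and the function names `fold_diacritics` and `find_titles` are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed (illustrative helper, not a Wikipedia API)."""
    # NFD splits a letter such as "š" into "s" plus a combining caron.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark and keep the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_titles(query: str, titles: list[str]) -> list[str]:
    """Return the titles whose folded, case-insensitive form equals the folded query."""
    key = fold_diacritics(query).casefold()
    return [t for t in titles if fold_diacritics(t).casefold() == key]


titles = ["Dominik Hašek", "Düsseldorf", "Montreal"]
print(find_titles("Dominik Hasek", titles))
print(find_titles("dusseldorf", titles))
```

Letters that are distinct characters rather than a base letter plus a mark (for example "ł" or "Đ") have no such decomposition, so a real search facility would need an additional mapping for them.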
  • Oppose - This is the English Wikipedia. Article titles should be in English, using English letters. Use of other symbols should be prohibited unless there is a really good reason to do so in a particular case, though I could live with them simply being "discouraged," depending on what that actually means in practice. But certainly not "encouraged," and none of this "neither encouraged nor discouraged" stuff, either. Neutron (talk) 20:16, 21 June 2011 (UTC)
  • Support - This is the English Wikipedia. The English language contains diacritics.
[copy of my post to Jimmy Wales' talk page]
During the British Library Editathon on January 15, Wikipedians were privileged to be given a guided tour of the Evolving English: One language, Many voices exhibition. The curator explained that the English language had evolved through absorbing thousands of loanwords. Sometimes these words retained their original diacritics in common usage, e.g. née, fiancée, façade, déjà vu. This practice goes back to Anglo-Saxon times, so Modern English does indeed contain diacritics through these loanwords.
I personally believe that a great deal of useful information will be lost, or at least not be as accurate as it should, if we exclude diacritics from Wikipedia. -- Marek.69 talk 20:47, 21 June 2011 (UTC)
*exceptions include: diacritics, accents and any other unidentified squiggles

Support (it's probably obvious that I favour this if you've read the rest of the discussion). Jimbo apparently did not since the idea that using diacritics means writing the project in another language has been pretty thoroughly and eloquently dismantled. I'd like to note that as far as I know, the use of diacritics (with exceptions when an alternative name is well-established) is the de facto standard on the German, French, Spanish, Portuguese, Romanian, Italian, Scots, Dutch, Danish, Esperanto, Latin, Hungarian, Finnish, Swedish, Basque, Polish and even Simple English Wikipedias. Of course these versions of Wikipedia are written in German, French, Spanish, Portuguese, Romanian, Italian, Scots, Dutch, Danish, Esperanto, Latin, Hungarian, Finnish, Swedish, Basque, Polish and English respectively. I suppose one can discount this on the grounds that they're all foreign languages anyways so why should we care. A better explanation though is that they're right in doing what most scholarly sources would do. Pichpich (talk) 22:58, 21 June 2011 (UTC)

Oppose proposal to drop diacritics Alternative support move from en.wiki to dumb.wiki Agathoclea (talk) 10:17, 22 June 2011 (UTC)

As seen by my stricken comment, I am against the dropping of diacritics, but I have misread the above proposal; therefore I would support with the caveat of WP:COMMONNAME, i.e. the article Munich should never be at München. Agathoclea (talk) 13:19, 22 June 2011 (UTC)
  • Oppose - My sentiments pretty much echo those of User:A Quest For Knowledge & User:Silver seren's expressed above. I don't necessarily dislike this proposal, I just think that WP:COMMONNAME covers these situations and should be pre-eminent. The big question this change raises in my mind is what happens if there is a conflict where "Joè" is the technically correct name, but "Joe" is the common name (i.e. the name used by the majority of reliable sources). In that situation, I'd opt for the common name, but this policy change might "encourage" people to pick "Joè". NickCT (talk) 12:52, 22 June 2011 (UTC)
  • As I stated some time above, we should certainly add a note that there are recognized exceptions where English prefers a variant without diacritics (Munich), just like the common name will sometimes be different from the name that the subject used (Confucius). You are more than welcome to craft such a disclaimer sentence. Would this appease you? --Piotr Konieczny aka Prokonsul Piotrus| talk 17:14, 22 June 2011 (UTC)
Ok, well dismissing the risk of circumventing WP:COMMONNAME, I'm not sure I see a good reason for "encouraging" people. I've read your initial arguments above, and they seem to be arguing that it is OK to use diacritics (which seems like what's already stated in the "neutral" policy), not that we SHOULD use diacritics. The only argument I see that argues we should use it is the "usage of diacritics ... other English-language encyclopedias" line. I think I'd counter that point by saying that WP:COMMONNAME makes wikipedia better than "other English-language encyclopedias".
Approaching this from another direction, can you give a simple example of an article title where the English WP:COMMONNAME does not include diacritics, but you think that it would be wise to "encourage" their use? NickCT (talk) 19:46, 22 June 2011 (UTC)
  • Oppose Neutral on the basis that no particular rule about diacritics is needed. The principle of least astonishment is all that's needed: titles should be in whichever form, including diacritics, that readers are most likely to be familiar with. That means that WP:COMMONNAME suffices. If no particular variant is the most common, it's probably better to use the one with the correct diacritics, though, simply as a matter of correct spelling.  Sandstein  12:59, 22 June 2011 (UTC)
    • You know, this is roughly what this proposal is trying to say. If you look at the history, it was started because one user questioned through RfC that diacritics are allowed at all, argued that current policies do not support them, was involved in dozens of RMs in which he tried to move articles to de-diacriticized titles, and in dozens more going the other way, which had to be started after he edit warred moving them there and defending "the correct version without diacritics". --Piotr Konieczny aka Prokonsul Piotrus| talk 17:14, 22 June 2011 (UTC)
      The problem is that the "diacritics aren't allowed" crowd is a reaction to the "we need diacritics wherever possible" crowd, who have been running around moving articles, largely without even starting RM's about doing so, for at least the last year. The side that you're representing here has been behaving much worse than the crowd that I happen to be (mostly) representing (although, I think that my own view, and that of Sandstein and most others here, sits more in a middle ground where diacritics are OK but not encouraged). Many, if not most, articles that are at titles with accents and diacritics have copious references that don't use them (easily supporting a COMMONNAME argument), yet they commonly fail RM's due to a deluge of opposition from people who think that we should use accents and diacritics wherever possible. The process is apparently broken, and seemingly because of "bad faith" editors, in this area.
      — V = IR (Talk • Contribs) 17:27, 22 June 2011 (UTC)
I haven't followed these developments, but could support some form of "use COMMONNAME, and when that is unclear, use the form with diacritics" rule. I don't really see how typography relates to language (which I think of as a characteristic of sentences and paragraphs, rather than of proper names and individual glyphs), and so I don't see why one would say that "é" is less "English" than "e". But then I think that the whole issue is also not very important, much like the hyphen-dash-thing, because the lead will contain all typographical variants anyway. Maybe diacritics could be treated as an optional style thing, like British or American spelling, where we follow the first contributor.  Sandstein  18:20, 22 June 2011 (UTC)
Struck the bolded oppose above and replaced with neutral, since I can't agree with either extreme: we should primarily use COMMONNAME, i.e. what English-language readers are most familiar with, so it's Zurich, not Zürich. But where no name is clearly the most common, we should not needlessly spell words (let alone proper names) incorrectly just to make them look less "foreign", hence Übermensch, not Ubermensch.  Sandstein  18:33, 22 June 2011 (UTC)
I think we agree on this. Piotrus' wording is a bit too strong, but it's a reaction to the current push for page moves on an epic scale that would include moving Friedrich Dürrenmatt and François Mitterrand to their "English names", which are of course no such thing. Hans Adler 19:14, 22 June 2011 (UTC)
I am very open to somebody proposing a less strong version of the proposal. I think we have already improved the guideline by adding the "The policy on using common names and on foreign names does not prohibit the use of modified letters, if they are used in the common name as verified by reliable sources." Perhaps my new proposal below, about adding the statement of fact that diacritics are commonly used on English Wikipedia, would be enough? --Piotr Konieczny aka Prokonsul Piotrus| talk 19:19, 22 June 2011 (UTC)
While not ideal, that statement could be sufficient for the moment. Hans Adler 19:35, 22 June 2011 (UTC)
I can agree with that as well, although I still wouldn't recommend the use of accents and diacritics (so, something like "use COMMONNAME, or a form that can be agreed on by consensus through discussion on the talk page."). This is good for as far as it goes, but it fails to address the thousands of page moves which have occurred, largely without discussion, moving pages to titles that include diacritics.
— V = IR (Talk • Contribs) 20:00, 22 June 2011 (UTC)
  • Oppose as framed.
As regards proper names such as places and people, this page should not be concerned with spelling and should at most detail rules for the use of exonyms, where there is a true English name, as in the case of Munich, Cologne, and Warsaw. Where such well-established English names exist, it should be made abundantly clear that they should be used. We should try to avoid redundant (and therefore potentially conflicting) rules in different guidelines. How to spell proper names like André, Günter and Düsseldorf should be (and is) dealt with at WP:Proper names. If necessary, that page should be modified slightly to make absolutely clear that the rules apply to the spelling of the name wherever it is used, including the article title (except, of course, when describing alternative names or spellings). This project page and WP:COMMONNAME should point to that page. Since people may find misspelling of their name not only inaccurate but also insulting, we may also need to add something to WP:BLP. In those cases where we use a spelling other than that actually used by the person concerned, we should perhaps, as a matter of courtesy, provide a hatnote with a suitably diplomatic explanation.
To address one point raised by Jimbo Wales in relation to a specific article: where special characters are used in an article title, we should explicitly say that an explanation of any special characters, their pronunciation, and possible alternative characters (for use when the original characters are not available) should be made easily available, for example by using the Foreignchars template with links to the articles on the relevant extended Latin characters. To achieve GA or FA status, an article should be required to have such links, as well as IPA explanations and, preferably, audio files giving the pronunciation. Relevant projects could even include such requirements in their B-class criteria (criterion 6). -- Boson (talk) 19:50, 22 June 2011 (UTC)
  • Support: I don't see any reason why diacritics shouldn't be used if accurate. Redirects are cheap, anyway. And I would hazard a guess that 99% of Wikipedia readers have at the very least a form of diacritic support; Unicode was even present in Windows 98! There are also issues with using "old" sources here to justify a COMMONNAME, as for technical reasons they may not have been able to print diacritics. There have been times when it has interfered with COMMONSENSE. At the very least, article names should be transliterated. We'd have articles at "Rossiya", not "Poccnr", if the English name "Russia" didn't exist. Sceptre (talk) 17:50, 23 June 2011 (UTC)
  • Support
  • I don't believe this issue should be resolved by case-by-case decisions based on RS as I see this mainly as a stylistic issue. We take guidance from other manuals-of-style, but when all is said and done, we decide how WP displays information. Unless a RS indicates why accents and diacriticals should not be used—e.g. for Handel (he lived in London for almost fifty years, became a citizen, tried very hard to become all things British, and abandoned the umlaut when signing his own name)—we should be encouraging them as much as possible (which is what this proposal is doing).
  • To use the "ç" in "François Mitterrand" but not to use the modifiers in "Đặng Hữu Phúc" is insular. That is his name, and the tonal indications are as important to him as "ç" is to the ex-French President.
  • The use of accents and diacriticals portrays immediate information. "Handel" gives a different visual clue than does "Händel", and if we were to use "Dang Huu Phuc" there is the implication that he himself has Anglicised his name. There is information and even beauty in the markings in "Đặng Hữu Phúc", and such markings provide a visual clue as to the origin of the subject. The modifiers also will awaken an interest in some readers to find out more about their effect on the underlying characters.
  • The WP redirect system is wonderful and permits our readers to find articles no matter what extra markings the underlying characters contain. Also great is the WP search box which offers similar functionality. We must be careful not to base decisions on sources that historically have avoided accents and diacriticals due to technical reasons of their own. The technical world is moving on, and we have a wonderful opportunity (due to our great technical prowess) to lead (stylistically). Are there many modern computer programs that don't permit the use of extended character sets (even if that involves copy-and-paste)?
  • In terms of moving on, "Đặng Hữu Phúc" returns 121,000 Google hits but "Dang Huu Phuc" returns 6,430 Google hits. Therefore the "real" world is almost 19 times more likely to use and recognize the tonal markings in his name. WP should not be left behind.
  • WP has a duty to inform to as high a standard as possible. We are an encyclopaedia and not the big-boys-book-of-easy-to-write-and-comfortable-to-read-information™. To remove accents and diacriticals risks us being perceived as having dumbed-down our content.
  • Trying to find a middle-ground on the use of accents and diacriticals will be a minefield. To try and find a compromise position will lead to never-ending edit-wars on some articles (as individual personalities cherry-pick RS to support a stance). To avoid that, either abandon all accents and diacriticals, or support their use wholeheartedly. With that in mind, here's how I would have stated this proposal:

In article titles the use of modified letters such as accents and other diacritics (when based on Latin-derived alphabets) is encouraged.

GFHandel   21:13, 23 June 2011 (UTC)
  • Oppose the opposers. The correct answer is "use diacritics when appropriate", i.e. to properly represent foreign names, especially in the intro where we usually give people's full names, affiliations, etc. If the diacritics are usually dropped in English, they can be omitted in Wikipedia, but that's not the case for a lot of foreign names. Come on folks, there's absolutely nothing unusual about seeing a name like André Malraux in an English-language source. Does this actually need a policy change? Probably not. But I'm seeing stunningly bad arguments from opposers that for some reason English-speaking users should be shielded from correct renderings of names because diacritics offend them. All I can say about Jimbo's comments is thank god he no longer has the power to make unilateral decisions on Wikipedia. 128.104.61.47 (talk) 21:49, 23 June 2011 (UTC)
  • Respectfully oppose. Following the policy of WP:COMMONNAME is the only logical way to proceed. Encouraging or discouraging the use of diacritics in article titles would indeed contradict this policy. mgeo talk 13:57, 24 June 2011 (UTC)
  • Oppose - The wording doesn't make sense. "The use of modified letters (such as accents or other diacritics) in article titles is common" It is? Compared to what? Under what criteria is it common? "and thus encouraged." - Huh? That doesn't follow. Common and encouraged are different things. Currently it's not encouraged or discouraged. This is the only sane position, as encouraging the use would mean "Please put diacritics there, as many as you can!" and discouraging would mean "please don't use them unless absolutely necessary". Neither position makes sense. WP:COMMONNAME is all that is needed here. --OpenFuture (talk) 04:06, 25 June 2011 (UTC)
  • The word "common" is in reference to the current practice on Wikipedia and the use of diacritics is undeniably the current default. Whether that's a good idea (and whether it's the result of coordinated efforts by a handful of diacritic-loving lunatics) is being debated. Pichpich (talk) 05:16, 25 June 2011 (UTC)
It being "common" is first of all debatable, and second of all, if it *is* common, why would the rule need to be changed? And lastly it still doesn't follow that it is or should be encouraged just because it is common. I stand by the position that WP:COMMONNAME is all that is needed, and that any change from the current rule by necessity must conflict with WP:COMMONNAME and hence cause a move-war for tons of articles. --OpenFuture (talk) 06:01, 25 June 2011 (UTC)
It is actually because of the wording here and in commonname that this proposal has been made. The current wording causes many move wars. The reason for us to codify what is currently extremely common practice is so that we can stop having the move wars that are constantly occurring because of the ambiguity here. -DJSasso (talk)
And this proposal just *adds* ambiguity. I really thought I made that clear in my vote comment above. You don't *have* to comment on every vote you know. ;-) --OpenFuture (talk) 06:34, 25 June 2011 (UTC)
No, your vote didn't really make that clear at all. As for commenting on every actual !vote... I have commented on almost none of the votes. In fact, upon looking, yours was the only one. Two if you count commenting on someone making a comment about someone's ethnicity that wasn't appropriate. -DJSasso (talk) 06:36, 25 June 2011 (UTC)
  • Support. I'm a member of the vast majority of English-speakers who don't understand (for example) Czech; but I do know the difference between ‹c› and ‹č› and I want to know which it is so that I can pronounce it more correctly. There presumably exist readers who can say the same of Vietnamese, and I want them to have the same benefit even if the diacritics are meaningless to me; I am not harmed by seeing the funny squiggles, now that Unicode fonts are generally available. If Wikipedia excluded everything that a substantial number of English-speakers don't understand, I wouldn't bother visiting. —Tamfang (talk) 20:02, 30 June 2011 (UTC)
  • What if there is a well-established English pronunciation that differs from the native one? mgeo talk 14:12, 1 July 2011 (UTC)
Then provide both pronunciations explicitly. —Tamfang (talk) 03:17, 2 July 2011 (UTC)
This would actually mean: "do not use the English name for the title even if it is well-established but the native one", and this clearly contradicts the actual policies. These ask us to use the English name for the article title and to provide the native name and alternative names in the opening sentence, not the contrary. mgeo talk 10:57, 2 July 2011 (UTC)
To call the unaccented form "the English name" begs the question; not everything has, or ought to have, an English name. It strikes me as analogous to insisting that book and movie titles ought to be written with quotation-marks, rather than italics, because that's how it's done by most newspapers and people who don't know how to italicize. —Tamfang (talk) 20:02, 2 July 2011 (UTC)
But, but, but, ..., but replacing quotation marks by italics is original research! If we allow common sense and intelligent editorial decisions, then we infringe on the rights of those who are incapable of either to also edit Wikipedia and decide whether it is an encyclopedia or a compilation of sports reports! And anyway, it follows from WP:UE that we may only write for the uneducated. John Adler 20:57, 2 July 2011 (UTC)
PS: All those communists who know how to pronounce diacritics and where to place them on names clearly have a COI. We can't allow foreigners and their sympathizers to decide how we spell foreign names. It's bad enough that we can't keep them out of our encyclopedia. John the Bald ((Eagle) 20:57, 2 July 2011 (UTC)
@ Tamfang. Of course, the vast majority of foreign names don't have an English translation and if they contain diacritics we surely shouldn't remove them to make the names look more English. My question was about the names that have a "historically" established spelling. Do you think we should use diacritics just because the English spelling is not different enough from the native one? By only encouraging the use of diacritics (instead of their careful use), it seems to me that it is actually what the proposal is suggesting. mgeo talk 23:11, 2 July 2011 (UTC)
I agree with you. This is why I don't actually support this proposal, even though my arguments seem to have convinced a few other people to do so. I don't think there is a danger that Napoleon will be moved to Napoléon if this proposal finds consensus, but it can't hurt to make it clear that that's not what is meant. Hans Adler 00:55, 3 July 2011 (UTC)

Proposal 2 (OpenFuture)

Current wording reads:

The use of modified letters (such as accents or other diacritics) in article titles is neither encouraged nor discouraged

This is claimed to be ambiguous and it is also claimed that the above wording is used to argue against WP:COMMONNAME. I therefore propose to change it to:

The use of modified letters (such as accents or other diacritics) in article titles is neither encouraged nor discouraged. The usage should follow WP:COMMONNAME and where appropriate other naming guidelines such as WP:NCGN.

(Yes, I know this basically duplicates what comes in the sentence fragment after. But it is claimed that it is ambiguous. I don't agree, but this should remove the ambiguity.) --OpenFuture (talk) 06:40, 25 June 2011 (UTC)

  • Neutral - I personally think the current wording is fine, and can't be used to argue against any other rule. --OpenFuture (talk) 06:40, 25 June 2011 (UTC)
  • Oppose It's the reliance on using COMMONNAME that is the issue as it is too ambiguous and implies that only English sources are ok. Which contradicts at least 3 other policies as someone mentioned somewhere on this page that I am quickly getting lost on. -DJSasso (talk) 06:43, 25 June 2011 (UTC)
If there are problems with COMMONNAME, they should be fixed. Adding contradictory policies is not going to help. --OpenFuture (talk) 07:14, 25 June 2011 (UTC)
I believe we can trust English reliable sources, for example the New York Times. ʘ alaney2k ʘ (talk) 08:50, 25 June 2011 (UTC)
  • Support If it really needs to be clarified, then this would be the way to do it. It is rather redundant, but a number of the people supporting Proposal 1 have said they were doing so because of the ambiguity of the current wording. Clarifying it like this without fundamentally changing the purpose of the section seems like the best method. SilverserenC 07:37, 25 June 2011 (UTC)
  • Support In the absence of a formal rule about diacritics in a name, at least it spells out commonname. ʘ alaney2k ʘ (talk) 08:50, 25 June 2011 (UTC)
  • Support. Perfectly reasonable reference to COMMONNAME. Sjakkalle (Check!) 11:39, 25 June 2011 (UTC)
  • Oppose Not only is it redundant, as the proposer admits, it's retrograde. It also seems to be a none-too-transparent attempt to rid Wikipedia of many articles with diacritics in their titles, potentially causing (–redacted) significant disruption through WP:POINTy page moves by purists away from well-established and long-standing articles with diacritics. --Ohconfucius ¡digame! 12:40, 25 June 2011 (UTC)
So you are saying that: 1. WP:COMMONNAME is wrong and 2. Many articles violate it as of today? --OpenFuture (talk) 12:52, 25 June 2011 (UTC)
Not to speak for him, but in my opinion yes; policies are supposed to be descriptive, not prescriptive. The standard across the wiki is that most articles where diacritics exist in the proper name use them (there are exceptions of course). Commonname however encourages that to not be the case. However, policies aren't supposed to describe a way we want things to happen; they are supposed to describe the way they do happen. Currently commonname does not reflect what is community consensus, as is evidenced by how these are handled throughout the wiki. It is often used as a bludgeoning tool to try and strip the diacritics out of all articles, even ones where no English sources exist. Hence why there is an attempt above to try and codify what actually happens on the wiki and to remove the ambiguity that commonname throws into the mix. -DJSasso (talk) 15:14, 25 June 2011 (UTC)
How do you propose to reduce ambiguity by having contradictory policies? Since your answer is that COMMONNAME is wrong, shouldn't COMMONNAME simply be fixed? --OpenFuture (talk) 15:28, 25 June 2011 (UTC)
I don't think clarifying that diacritics are acceptable is contradictory. I think it just expands, since common name and diacritics are subsections of the same policy. In other words, this is fixing commonname. Personally, if I was making a proposal I think I would word it the opposite of how you have it, and state that the use of diacritics does not fall under commonname and instead falls completely under WP:DIACRITICS. Then the wording essentially stays the same in both policies and it more closely reflects actual practice. The only change in wording would be to say the use or non-use of them doesn't fall under commonname but under diacritics, where it says they are neither encouraged nor discouraged. -DJSasso (talk) 15:30, 25 June 2011 (UTC)
You claimed that current practice was not in accordance with COMMONNAME. So let's backtrack, and try this again: So you are saying that: 1. WP:COMMONNAME is wrong and 2. Many articles violate it as of today? Yes or no? --OpenFuture (talk) 15:40, 25 June 2011 (UTC)
Yes, I am saying it is incorrectly used to apply to diacritics. Diacritics fall under wp:diacritics, not wp:commonname. However, people use wp:commonname to try and get around wp:diacritics, which indicates we have no preference. But yes, diacritics are used all over the wiki in thousands of articles where a case could be made through commonname that it's not necessarily the most common way it's put to print. But commonname is talking about different names, i.e. short names or pseudonyms etc. It's not talking about spelling. Whereas wp:diacritics is directly speaking about the use or non-use of diacritics. As such there are thousands if not tens of thousands of articles on the wiki that use diacritics that, if you followed commonname to the T, wouldn't actually comply but do comply with wp:diacritics. -DJSasso (talk) 15:43, 25 June 2011 (UTC)
No, COMMONNAME talks about the titles of the articles. Since your answer is that COMMONNAME is wrong and not followed, shouldn't this problem be fixed in COMMONNAME? --OpenFuture (talk) 16:04, 25 June 2011 (UTC)
Yes, it talks about titles of articles in regards to alternate names. The use of diacritics or non-use does not figure into that. Which is why there is a separate section that talks about it. Commonname works fine for what it's intended for. But it's not intended for diacritics in the title. The use of diacritics as per its section requires you to use sources of a higher quality (i.e. it specifically mentions encyclopedias and other academic works, and when none exist, use of other-language sources is ok). Whereas commonname has a much lower standard, where any old English source is ok, which works fine for differences between Bill Clinton and William Jefferson Clinton, but falls apart when it comes to diacritics. As such the best way to fix commonname is to make it clear it doesn't apply to diacritics and to instead rely upon this one instead. -DJSasso (talk) 16:13, 25 June 2011 (UTC)
I don't believe this is an issue specific to modified letters, but an issue regarding reliable sources for proper nouns. Thus I suggest trying to reach a consensus on modifying Wikipedia's policy on common names for proper nouns and their spelling (regardless of whether or not they contain modified letters). isaacl (talk) 16:24, 25 June 2011 (UTC)
@DJSasso: I'm sorry, WP:COMMONNAME clearly talks about titles in general, completely without restrictions to "alternate names". I see now why this goes in circles. We can't even agree what a text says. You apparently read the words "alternate names" somewhere in the WP:COMMONNAME section, where neither I nor my browser is able to find any mention of alternate names in that section. I guess we just have to conclude that we live in alternate realities, where WP:COMMONNAME says different things in different realities. I don't think it will be possible to have a constructive discussion on those grounds. In my reality, COMMONNAME is about article titles in general. I also see no contradiction between WP:COMMONNAME and WP:DIACRITICS as it is today. I also do not find WP:DIACRITICS unclear or ambiguous. --OpenFuture (talk) 16:56, 25 June 2011 (UTC)
Well then you might want to check out your computer. It clearly says it right below all the examples. It also says it in different wording in the second sentence of the section. -DJSasso (talk) 20:14, 25 June 2011 (UTC)
Did you perhaps mean to say "alternative" when you said "alternate"? The phrases "in regards to alternate names" and "which of several alternative names" have very different meanings. However, I don't see how that changes anything. Now you say COMMONNAME isn't about diacritics and is irrelevant to them. Well, then actual usage with regards to diacritics can hardly go against COMMONNAME, and WP:DIACRITICS isn't contradicting it and there is no ambiguity in this way *either*, so there still is no problem and the current text still doesn't need to be changed. QED. The end? --OpenFuture (talk) 20:58, 25 June 2011 (UTC)
I was frustrated to see the repeated proposals which obviously (to me) fail to address the problem at hand, and now I regret the implication that the proposal is a bad faith attempt at making a point. I have struck the comment accordingly.

I'll make reference to the oft-held maxim: "if your [foreign] interlocutor doesn't understand you when you speak English, you should speak louder". Repeating reference to WP:COMMONNAME at an earlier juncture in the guideline is by no means offering a solution to the problem – you are merely heading down the SHOUTING route. The reams of comments in this and surrounding discussions suggest to me that the problem lies in the definition of what a "common name" is, and its rather subjective but polarised interpretation. What we need is a clearer wording and a tighter definition. Jimbo expects to see "André" but not "Đặng Hữu Phúc", although we more frequently than not see 'Aimé Jacquet' referred to in "reliable sources" by his name sans diacritics – so the line as I see it fails. It needs to be firmly drawn, giving examples. I personally believe that we should abide by reliable English sources as to diacritics' use – Britannica, National Geographic and The Guardian are mainstream and international enough as publications go. They are recognised for their quality, and use diacritics more or less in the way we currently do. So I fail to see why we should play a crude numbers game by weighting the preponderance of "all other reliable sources". Many of these sources do not have such high reputations as the aforementioned; many also face technical and financial constraints in diacritics use (typesetting, or to maintain the quality of linguistic fact-checking). --Ohconfucius ¡digame! 02:29, 27 June 2011 (UTC)

So, there is a problem with the policy, and yet reliable sources use diacritics as Wikipedia does. So then what is the problem? --OpenFuture (talk) 08:33, 27 June 2011 (UTC)

This discussion is still going around in circles

I'm taking this page off my watchlist and will happily join in again once this discussion stops going around in circles. I continue to be disgruntled by some of the arguments being repeatedly put forward as if we didn't hear them the first time, in particular, User:Dolovis' copypasting of the mantra "the inclusion threshold for Wikipedia is verifiability, not truth". Not only does this show he really does believe we're making the names with diacritics up, this is a grave misunderstanding of verifiability which has really compromised Wikipedia in the past.

Talking from memory (I couldn't find the specific issue on a Google search), in late 2007 there was a biography controversy where someone edited some Wikipedia biography saying that the article's subject had died (and he hadn't), and this was reported afterwards in a few news sources. On the article's talk page, an editor wrote, and I remember this quote word for word because it was so stupid, "it's clearly not true, but now that it's been reported in several reliable sources, under Wikipedia rules it doesn't make any difference whether it's true or not". This is where editors who believe that Wikipedia is a bureaucracy made up solely of policies and guidelines get us, people. Sources often make mistakes and lots of sources often make the same mistakes. Whether a source is reliable is entirely subjective depending on what piece of information, not what article or topic, you're trying to source. Diacritics in article titles are hardly as big a deal as some of the other situations this kind of grave misinterpretation of what WP:RS and WP:V are supposed to stand for could get us into, but this wikilawyering battology is really sickening. - filelakeshoe 12:48, 13 June 2011 (UTC)

Indeed, it will always go in circles. Neither side is ever going to convince each other of their respective arguments. GoodDay (talk) 18:59, 13 June 2011 (UTC)
Which is why Dolovis' attempt to circumvent the already existing agreement is so irritating. Filelakeshoe hit it dead on the head: Dolovis' (and other users') gross misunderstanding of existing policy has created a colossal waste of time. This discussion has been ongoing for how long now, and at the end of the day, we're going to end up right where we started. If you ask me, much more productive things could have been going on at Wikipedia over the last few weeks than this nonsense. – Nurmsook! talk... 21:28, 13 June 2011 (UTC)
I tried to end it with the poll above, but since few people care to vote... --Piotr Konieczny aka Prokonsul Piotrus| talk 20:11, 14 June 2011 (UTC)
You asked for discussion, not votes, and I provided my proposed changes, which directly address the original question regarding any conflict with Wikipedia's policy on common names (WP:COMMONNAME). However, if the goal is to try to avoid citations to sources that are reliable in other aspects but not reliable regarding reporting of a person's name in English (as Djsasso refers to above), then your proposal does not address this. (My proposal addresses it a little bit, by referring to WP:COMMONNAME's guidance on the issue.) As I suggested, crafting some guidelines to judge the reliability of sources regarding a person's name would help, if a good set of criteria can be determined. isaacl (talk) 20:42, 14 June 2011 (UTC)
You are right, it was not supposed to be a vote, but it turned into such. And nobody proposed anything alternative to discuss/vote on... --Piotr Konieczny aka Prokonsul Piotrus| talk 20:47, 14 June 2011 (UTC)
Just in case you missed it, here is the edit where I proposed a different wording. (Note this is just an aside to the idea of refining the guidelines to evaluate reliable sources for a person's English name.) isaacl (talk) 20:51, 14 June 2011 (UTC)
You may want to repost it in a separate section for higher visibility. --Piotr Konieczny aka Prokonsul Piotrus| talk 21:20, 14 June 2011 (UTC)
I have included the last sentence of that proposal. Frankly, I think it redundant; but it is certainly true, and if people don't find it in what we have said, there's nothing wrong with adding it. Septentrionalis PMAnderson 01:58, 16 June 2011 (UTC)
It is redundant, but since some of those arguing against the use of any modified letters have referred to Wikipedia's guidelines on using common names, I thought it might be useful to explicitly point out that there is no conflict. isaacl (talk) 02:53, 16 June 2011 (UTC)
The policy as spelled out at Wikipedia:Article titles requires that the article title is to use the name that is most frequently used to refer to the subject in English-language reliable sources. This applies to the title of the article – but within the text of the article, pursuant to WP:MOSBIO, the person's legal name should usually appear first in the article. I trust that explains the current Wikipedia policy as it relates to this issue. Dolovis (talk) 14:42, 20 June 2011 (UTC)

Which variant of English?

IMO the use of diacriticals, foreign names for titles and the like is dependent on the variant of English. For example, I went to school in South Africa. My secondary school, Estcourt High School, catered for tuition in both English and Afrikaans. The Afrikaans name for the school was definitely Estcourt Hoër Skool, not Estcourt Hoer Skool - the word "hoër" means "higher" and "hoer" means "whore". The South African variant of English would certainly demand that the diacritical be used, as Afrikaans is widely understood by English-speaking South Africans - moreover almost all Afrikaans-speaking South Africans also speak English. Carrying on from there, I have a sneaky, though unsubstantiated, feeling that the British have much less resistance to the use of foreign names and therefore diacriticals than the Americans - one slang word that is used in the UK a certain amount these days is the German word "über" ("very", "total" or "higher" - for example "he was driving über fast"). I have no idea how much such expressions are used in the US - maybe an American could advise? Back to my original observation - the use of diacriticals and foreign names should be on a national variant basis - an article that is written in UK English will observe British conventions and one written in US English will observe American conventions (and likewise for Australian, South African, Indian and other variants of English). 21:42, 14 June 2011 (UTC) Preceding unsigned comment added by Martinvl (talk • contribs)

I think you have found a valid point, but are simplifying things too much. The Brits are probably more open to French and German accents, the Americans to Spanish accents, and, as you say, the South African English speakers to Afrikaans accents. This is all very normal. But there are other phenomena that are orthogonal to this. A linguistic research paper uses diacritics more precisely than a literary criticism paper or an encyclopedia, which again uses them more precisely and more often than a high-quality newspaper, which in turn uses them more precisely and more often than a tabloid or a sports association. On this scale we should use the variant of English that other encyclopedias use, which means extensive use of diacritics, with a small number of (relatively rare) simplifications such as rewriting Middle English words: þorn -> thorn, yoȝ -> yogh. And I doubt that encyclopedic British English and encyclopedic American English differ in how they treat diacritics. Hans Adler 22:38, 14 June 2011 (UTC)
The decision on whether to use diacritics in an article title (or not) should be made by looking at as many different English-language sources (that mention the topic) as possible, and applying WP:COMMONNAME (while also taking into account WP:ENGVAR). In other words, if a significant majority of English-language sources use the diacritic when discussing the topic, then so should we... and if a significant majority do not, then neither should we. If there is no significant majority either way, then we are free to choose whichever we like, based on other criteria. I see no need to make this more complicated than that. Blueboar (talk) 23:03, 14 June 2011 (UTC)
I could not agree more, there cannot be a sweeping charge made against the use or non-use of diacritics in articles. This is a case-by-case issue and must be treated as such. The ability to use diacritics must be protected, but beyond that there isn't much else to say. Demokratickid (talk) 23:46, 14 June 2011 (UTC)
By the same argument all articles must be titled in American English. WP:COMMONNAME is WP:COMMONNAME, not WP:COMMONSPELLING. Trying to make it say something about diacritics is an exercise in tea leaf reading. There is a tiny number of primarily non-English topics that do have independent English names, e.g. Lyons for Lyon or Munich for München. That's where inspecting usage in English-language sources makes sense. But if we were to rely on the same method to decide between the spellings Düsseldorf, Dusseldorf and Duesseldorf, we might as well throw dice. (The problem is that every single source either uses these forms randomly, or follows its own manual of style. Therefore which spelling is more common depends on which sources use the word, rather than saying anything about the most standard spelling.) This is not what the other encyclopedias do. They all use diacritics in such cases. Hans Adler 00:00, 15 June 2011 (UTC)
As much as I agree with the above statement, I have to agree with this one even more. I am biased towards using diacritics, as should everyone else be, because they are, essentially, correct. Isn't the purpose of an encyclopedia to be correct? Demokratickid (talk) 00:07, 15 June 2011 (UTC)
The purpose of an encyclopedia is to inform. It should not be the purpose of an encyclopedia to decide what is correct. As Hans pointed out, we have different spellings of the same thing. We have to accept that that is normal. I've edited several articles on hockey players. This is not a diacritic issue. Some had adopted the misspellings of their names, as evidenced by their tombstones. Another had not bothered to correct the misspelling of his name during his playing career. It came out later in his life. We are all trying to cope with representing something or someone. It's not always clear what is the 'correct' name. It can change, too. Wikipedia tries to go with what is most common, as long as it is verifiable as a valid spelling. We must remember 'a rose by any other name would still smell as sweet.' ʘ alaney2k ʘ (talk) 00:37, 15 June 2011 (UTC)
It is not the purpose of an encyclopedia to inform readers about the majority manual of style of the sources covering a topic. An encyclopedia has its own manual of style and follows it. I have given several examples of manuals of style above. They all agree that for a foreign name with accents they prescribe either writing it with or without accents based on criteria that have nothing to do with inspecting other sources. When people move from one culture to another they sometimes change the spellings of their names. That's a relatively rare case that needs special treatment. The vast majority of the cases we are discussing here is politicians, sportspeople etc. who still reside at their country of birth and have not changed their names in any way. They, and the huge number of places with diacritics, should not get random spellings just to make it marginally easier to spell a person who immigrated into the US without the diacritics. Hans Adler 00:50, 15 June 2011 (UTC)
Is it not still the same person? ʘ alaney2k ʘ (talk) 01:04, 15 June 2011 (UTC)
I don't get your point. First, you seem to be arguing only about that very rare case of people with diacritics moving to an English-speaking country. Second, a woman who marries and adopts her husband's surname also remains the same person. That doesn't mean we get to randomly use either of her two names, depending on accidents such as whether most of the sources predate the marriage or not. Instead, we try to find out how she wants to be known in public after the marriage. For people in non-English countries who have diacritics the presumption by all manuals of style that I have seen is that they want to be known under their name with diacritics and therefore, absent technical obstacles against doing so, they are spelled with them. Hans Adler 01:17, 15 June 2011 (UTC)
My father and my mother's parents moved here to Canada. My father changed his name to drop the use of Unicode character 0141 (the L with a slash) (I cannot render it on my computer). I don't believe it is rare enough that the issue should not be considered. And his first name was spelled in various ways. So birth records would not be accurate, once he became known in North America. No-one would be able to find him in the phone book. There are many emigrants here in Canada. We change our names to conform, to fit. This is the standard. ʘ alaney2k ʘ (talk) 16:30, 15 June 2011 (UTC)
I couldn't immediately find the number of first-generation immigrants living in the US, but surely it's dwarfed by the number of people from non-English countries who didn't emigrate to the US. And since this encyclopedia functions as an international encyclopedia, not just one for the English-speaking world, the first-generation immigrants to the US are a tiny minority among all people born with diacritics. Your personal family history can't change this fact. I am not really interested in this small percentage and wouldn't mind different rules for them. But they are not a reason to strip the large majority of Czechs living in the Czech Republic, French living in France, Polish living in Poland, Estonians living in Estonia, Germans living in Germany, Irish Gaelic speakers living in Ireland, etc., of their accents. Hans Adler 18:42, 15 June 2011 (UTC)
I'm not in favour of doing that. Don't lump me with user Dolovis or GoodDay. I'm basically okay with existing policy. Diacritics neither encouraged nor discouraged. I would be okay with spelling it out in more detail, but I'm not okay with pushing diacritics beyond what can be demonstrated to be valid usage. That, of course, is somewhat subjective, but I think editors who are 100% for or 100% against are the problem, not people like me who want something in-between. ʘ alaney2k ʘ (talk) 20:02, 15 June 2011 (UTC)
I have now reread most if not all your posts on this topic here, and I see now that I must apologise for completely misunderstanding you and responding to what I read into your mind rather than what you said. Not sure how this happened; maybe I mixed you up with someone else. (It was recently reported that humans can only keep social relations with up to 150 different people. The number of people I should be able to distinguish on Wikipedia is well above that, and I think we haven't met before. This should have been a reason to do my homework before my aggressive response to you below.) Hans Adler 20:58, 15 June 2011 (UTC)
@alaney2k it is of course to inform, but it must also be correct in the act of informing, and not convey an untruth. Demokratickid (talk) 00:54, 15 June 2011 (UTC)
There are 34 diacritics for the letter 'a'. See the bottom of the Ä page. The usage of the diacritic may not inform at all. That's what I am talking about. It's hard to build enthusiasm for that. Unicode is HUGE. It's one thing to say, yes, let's use them. But, you'd have to be an expert to know them all. So there has to be a point that is reasonable that will be understood. By the majority? By all? ʘ alaney2k ʘ (talk) 01:04, 15 June 2011 (UTC)
So we are back to the tiresome nonsensical "diacritics make me blind and since I don't understand them nobody should have them" argument. Hans Adler 01:17, 15 June 2011 (UTC)
Why is it so black and white to you? Do you not see other colours? :-) It's tiresome. On my system, several of the variants did not even display. Those are 'technical obstacles', I would say. Not all alphabets and characters are installed on my system. I doubt that I am alone in that respect. ʘ alaney2k ʘ (talk) 16:30, 15 June 2011 (UTC)
It is not black and white to me at all. Technical obstacles are a very valid concern where they exist. But you have just changed your argument from "you'd have to be an expert to know them all" to "several of the variants did not even display". It's unfair to blame me for not anticipating that.
One of the letters at the bottom of the Ä article doesn't even display on my system, even though I have a lot of special fonts installed for mathematics, Chinese etc. (It's Unicode 1d8f, "a with a retroflex hook".) It's not on the Windows Glyph List 4, and I doubt that it would be used in many of our articles. We obviously need to adapt our approach for languages that use such extremely rare characters. But for the large majority of Latin-script languages, those where all letters are on this standard glyph list, there is simply no technical problem on any half modern computer that isn't severely broken (in which case it should be fixed). Hans Adler 18:59, 15 June 2011 (UTC)
Unicode's adoption has improved things immensely in the display of characters. We're basically in an age where the technical standards are in place, but it's well ahead of practice and understanding here in North America. ʘ alaney2k ʘ (talk) 20:02, 15 June 2011 (UTC)

The article Nürnberg Hauptbahnhof is a good example. If we use the word "Hauptbahnhof" in the title, then I see no problem with writing "Nürnberg" as well. The article in question translates the title into English in the lede, but thereafter uses the German spelling. This is normal British practice. Is this typical US practice? Martinvl (talk) 07:11, 15 June 2011 (UTC)

It does use both Nürnberg and Nuremberg in the text. It uses both "Nürnberg Hauptbahnhof" and "Nuremberg Hauptbahnhof" as well. I'm not sure what is to be learned from that article. ʘ alaney2k ʘ (talk) 16:30, 15 June 2011 (UTC)
That some editors have difficulty writing English; the article itself says that Central Station is usual. This used to be pretentious anglophones; Mark Twain has a lengthy section on the sort of writing which shows off by using Bahnhof for "Railway Station". Septentrionalis PMAnderson 21:52, 15 June 2011 (UTC)
The policy as spelled out at Wikipedia:Article titles requires that the article title is to use the name that is most frequently used to refer to the subject in English-language reliable sources. This applies to the title of the article – but within the text of the article, pursuant to WP:MOSBIO, the person's legal name should usually appear first in the article. I trust that explains the current Wikipedia policy as it relates to this issue. Dolovis (talk) 14:41, 20 June 2011 (UTC)

I'm with Hans and Demokratickid. I would preface my comment by saying that I refer to proper names of people and places which originate in languages with a Latin script having diacritics. It excludes loan words and names from non-Latin script-based languages. I'll try to be as concise as I can, but given the complexity of the subject there is a risk of being too long-winded; here goes anyway...

English is the über-colonial language: variants abound; Czech characters, Polish characters, Russian, Greek; even Arabic, Japanese and Chinese characters are capable of being rendered into forms recognisable by people who know only the 26 characters they learned in school. The English language is capable of almost infinite assimilation; hundreds of new loanwords are added to the official English vocabulary every year. To some, "proper Anglicisation" implies the dropping of diacritics; resistance to that is futile.

But in the globalised 21st century world, with the trend for information to flow outside of borders, the English alphabet is showing its limitations. The English alphabet, like all other alphabets, is only capable of capturing the pronunciations that are characteristic of that given language. What is more, English is known for its grammatical and pronunciation idiosyncrasies; it is woefully inadequate when trying to capture pronunciations of even many other languages with Romanised characters and standardised pronunciations, such as French and Czech, both of which I speak. As an encyclopaedia, I feel we should strive for a quality higher than the TV newscasters or the journals that still use typesetting (I jest) – both of these often get it terribly wrong, thereby doing a disservice to their target audience. WP is technologically capable of displaying a very wide range of diacritics; we also have armies of editors from various linguistic backgrounds happy to ensure all this is carried out properly. Both these are advantages that can and do give great service to our readers.

I am all in favour of keeping diacritics. The fact is that the letters 'ç' and 'é' are already loan-letters in our alphabet (viz. their fairly pervasive use: café, façade, rôle). Use of other letters, such as the 'á' (long a) and 'ř' ('r' with a haček), for which there are no equivalents, gives clues to a different pronunciation. The reader may not know exactly how such words are pronounced, but they may at least be made aware that it isn't to be pronounced as they might expect an English word to be; those curious will initiate their own enquiries. Expanding their use is to be encouraged and not fought. People may be a little bit puzzled the instant they reach the Václav Havel article, which they accessed by typing 'Vaclav Havel' (without the "long 'a'"). Thankfully for a famous namesake, 'Dvorak' is now universally pronounced using a zh-sound even when the haček is absent. However, for poor Jiří Novák, English people seeing the bare 'Jiri Novak' would undoubtedly call him "Jerry Novak" instead of pronouncing his name as it should be – "Yirzhi Novaak".

I would apply the same logic to the correct use of punctuation (the en dash, em dash, comma, minus sign): material limitations are not, and should not be, an issue. We don't need to take many steps to ensure the reader has the 'best' information. On the other hand, removing diacritics from names that natively have them amounts to misrepresentation and loss of crucial linguistic information. --Ohconfucius ¡digame! 04:53, 21 June 2011 (UTC)

I modified a part of your comment, Ohconfucius: Wikipedia should respect the original form of proper names of people and places which originate in languages with a Latin script having diacritics. It excludes loan words and names from non-Latin script-based languages. I think this should definitely be a part of the Wikipedia manual of style. It is more descriptive and less commanding than the "discourage/encourage" proposal above. Any thoughts? --Vejvančický (talk | contribs) 06:38, 22 June 2011 (UTC)
I think that's definitely along the right lines. --Ohconfucius ¡digame! 06:46, 22 June 2011 (UTC)

Question

In the current wording of the guideline it says: In general, the sources in the article, a Google book search of books published since 1980, ....

I'm just wondering, who picked 1980 and why? Why not 1988 or another year? Volunteer Marek (talk) 02:29, 21 June 2011 (UTC)

I don't know. In general, restricting to more recent years is sensible because in English (and in German and probably many other languages) there is a strong recent trend towards internationalisation of proper names, which typically consists in using the original forms including the spelling. This trend has already made some English versions of non-English names obsolete, while others are in the process of becoming so. E.g. Lyons for Lyon is largely obsolete nowadays, and Frankfort-on-the-Main / Frankfurt on the Main has largely been replaced by Frankfurt am Main. In part this is due to political correctness (especially with the use of Polish names in German for formerly German places, e.g. de:Olsztyn not Allenstein, or the gradual move from Calcutta to Kolkata), but in part this is just the normal language regularisation process, triggered in this case by the increasing relevance of numerous small foreign places that do not have an English (or German, etc.) name and are therefore written in the original form. A special English form of a name was once the standard for most foreign names that appear in English. Due to globalisation it has become an exception, and exceptions are always under pressure to disappear. Hans Adler 07:53, 21 June 2011 (UTC)
Let's update it to 2000 or some such recent epoch. 1980 publications are not hugely different from what appeared in the 1911 Encyclopaedia Britannica, so that cut-off has lost relevance in this day and age. --Ohconfucius ¡digame! 08:25, 21 June 2011 (UTC)
Ha ha. Based on your comments I think that maybe we ought to have a "Young person's Wikipedia" - the world began in 2000 ... and a "Crotchety old people's Wikipedia"? LOL and ROTFL. ʘ alaney2k ʘ (talk) 22:39, 21 June 2011 (UTC)
Who: [20]. Why? Good question. But is 2000 any better as a cut-off? --Piotr Konieczny aka Prokonsul Piotrus| talk 22:12, 21 June 2011 (UTC)
1980 would be better as it encompasses a larger body of work and it goes before current trends, which a cut-off of 2000 might be susceptible to. If you were to pinpoint an epoch change, it would probably be not long after World War II and the rise of the baby boomers, computers, satellites and television. 1990 saw the demise of the USSR and the rise of the internet and globalization, so you could make a case for that year. You could just change it to 'go back 20 years' as a basic reference that never gets dated. ʘ alaney2k ʘ (talk) 22:39, 21 June 2011 (UTC)
Well, a date that makes most sense to me would be 1989 or 1990. Fall of communism, lots of political transformation, reunification of Germany, and all that. Volunteer Marek (talk) 00:04, 22 June 2011 (UTC)
I'd support 1990 per VM's convincing argument. --Piotr Konieczny aka Prokonsul Piotrus| talk 01:15, 22 June 2011 (UTC)
As the year in which the Internet started gaining international acceptance, it's also a good cutoff year. --Ohconfucius ¡digame! 03:14, 22 June 2011 (UTC)

Transliteration is proper action

Tony1 made the best point, I think, on Jimbo's talk page: that transliteration to a proper corresponding name in the language of Wikipedia is the proper action, and that you shouldn't be using the name the way it is originally, but the way that is proper for the language of the Wikipedia you're on. For example, on English Wikipedia, we use Barack Obama. This happens to correspond to his original name as it is, but it's still just an English version of his name. Now, when you go to other Wikipedias, you see his name in the proper form for those languages, namely:

am:ባራክ ኦባማ

ab:Барақ Обама ar:باراك أوباما az:Barak Obama bn:বারাক ওবামা ba:Барак Обама be:Барак Абама be-x-old:Барак Абама bh:बराक ओबामा bi:Barak Obama bo:བ་རག་ཨོ་པྰ་མ། bg:Барак Обама ca:Barack Hussein Obama cv:Барак Обама dv:ބަރަކް އޮބާމާ nv:Hastiin alą́ąjįʼ dahsidáhígíí Barack Obama el:Μπαράκ Ομπάμα myv:Обамань Барак fa:باراک اوباما gan:奧巴馬 ko:버락 오바마 hy:Բարաք Օբամա hi:बराक ओबामा os:Обама, Барак he:ברק אובמה kn:ಬರಾಕ್ ಒಬಾಮ ka:ბარაკ ობამა kk:Барак Обама ky:Барак Хусеин Обама lo:ບາຣັກ ໂອບາມາ la:Baracus Obama lv:Baraks Obama jbo:byRAK.obamas mk:Барак Обама ml:ബറാക്ക് ഒബാമ mr:बराक ओबामा arz:باراك اوباما mzn:باراک اوباما mn:Барак Обама my:ဘာရတ်အိုဘားမား ne:बाराक ओबामा ja:バラク・オバマ mhr:Обама, Барак pnb:بارک اوبامہ ps:باراک حسين اوباما km:បារ៉ាក់ អូបាម៉ា crh:Barak Obama ru:Обама, Барак sah:Барак Обама si:බැරැක් ඔබාමා ckb:باراک ئۆباما sr:Барак Обама ta:பராக் ஒபாமா tt:Baraq Husseyın Obama II te:బరాక్ ఒబామా th:บารัก โอบามา tg:Барак Ҳусейн Обама tk:Barak Obama uk:Барак Обама ur:بارک اوبامہ ug:باراك ئوباما wuu:巴拉克·奥巴马 yi:באראק אבאמא zh-yue:奧巴馬 zh:贝拉克·奥巴马

These are all proper examples of titles in each language Wikipedia. They all properly mean Barack Obama. Even though his name is spelled in English letters (English diacritics, you could say), you don't just use the same name on every language Wikipedia. No, you translate it into the proper form for the language Wikipedia you are on. We should be doing the same here. We should be translating the names into the proper forms for the English language. How do we tell what the proper forms are in English? Look at the most common spelling of it in English language sources, that is the proper form for the English language. Sometimes, this will happen to have diacritics in it and that's fine, it just means that for that one word, it is common practice in English to use a diacritic. However, this is not true for every or even most words. We should be using English translations of names and places (among other things), based on English language sources. That is how it has always been done, that is how every language does it, that is the common sense method of doing it. SilverserenC 06:14, 22 June 2011 (UTC)

This is misleading. The current discussion is primarily about how to handle names from foreign languages that are written in an alphabet based on the Latin alphabet. For names in other scripts we transliterate if there is no established English name. Many transliteration systems make heavy use of diacritics. We tend to prefer those which don't, but for some languages they are not available, or not in common use, and so we have to use transliterations with diacritics. In some cases, such as pinyin, it is defensible to simply drop these diacritics, and then we tend to do this. In others it is not, and then we don't. Hans Adler 22:04, 22 June 2011 (UTC)
Actually, if you look at some of the languages listed above, there are more than enough that use the Latin alphabet, but purposefully spell Barack's name in a manner that is common for their language, with the Latin alphabet involved. So, no, that's not true. SilverserenC 06:13, 23 June 2011 (UTC)
No fair! Barack Obama is already in English, and for languages that employ Latin-based scripts, it is spelt exactly as it is spelt in English. We expect that many of the other languages will translate the name phonetically. I note Jimmy Wales is Iacobus Wales in Latin, but I digress! Let's just take the first Jiří that pops up on our wiki search engine: Jiří Novák. Without necessarily being exhaustive, here are some 'choices' of how his name can be presented in English Wikipedia:
  1. Jiri Novak (transliteration to ASCII, total)
  2. George New (literal, total)
  3. George Novak (literal, partial)
  4. Jerry Novak (phonetic based on ASCII rendering, partial)
  5. Jerry New (hybrid, total)
  6. Yirzhi Novaak (phonetic, total)
  7. Jiří Novák (native)
Which would you choose, and why? --Ohconfucius ¡digame! 07:21, 23 June 2011 (UTC)
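For illustration only: choice 1 in the list above ("transliteration to ASCII") is roughly what a mechanical diacritic-stripping step produces. A minimal Python sketch, assuming Unicode canonical decomposition (NFD) followed by removal of combining marks; the function name `strip_diacritics` is hypothetical and is not a Wikipedia tool or anything a participant proposed.

```python
import unicodedata


def strip_diacritics(name: str) -> str:
    """Decompose to NFD, then drop the combining marks (accents, háčeks)."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Jiří Novák"))  # Jiri Novak
```

Letters that carry no separable mark in Unicode, such as the Polish Ł (U+0141) mentioned earlier in this discussion, pass through this step unchanged, so it is not a complete ASCII transliteration.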
It would have to come down to a choice between 1 and 6. Personally, I would go with 6, for the phonetic translation, because it is specifically accurate for pronunciation. And then the first line of the article would have his native spelling included with diacritics. SilverserenC 07:28, 23 June 2011 (UTC)
That is the most ridiculous thing I've ever heard. Why don't we then move Happisburgh to Hayesborough or Wrotham to Rootham, Cholmondeley to Chumley since that's how they're pronounced? English rules of phonology are far, far from logical so how a foreign name "appears" to a monoglot to be pronounced is essentially erratic. Usually American tourists trying to read Nádraží Holešovice just spit all over the place, so I guess we should move that article to Naddddratpthkewbfubwe. We should NOT be creating new transliteration systems that don't exist. As I keep saying, it's not just names with diacritics which are "unintelligible" to English speakers, it's most foreign names full stop, and if we're going to move everything which "looks unpronounceable" to an English speaker to some nonexistent mangled translation then we have a lot of prescriptive new names to make up. - filelakeshoe 09:45, 23 June 2011 (UTC)
No, what we should be doing is following the common name. I forgot to say that before. What is the common name in English sources for Jiří Novák? I would like for it to be 6, but it's probably 1. Either way, we should be going with the most common name in English sources, because that is de facto what he is called in English. If there are no English sources discussing someone, then we should just default to their name with diacritics. Ohconfucius was asking for my opinion in terms of the name and I gave it, but the question he asked in the first place had nothing to do with the discussion, as I've stated multiple times that WP:COMMONNAME is the process we should be following. SilverserenC 09:57, 23 June 2011 (UTC)
I mean, Filelakeshoe, you have this essay that you wrote and, while it deals with translations, it should be close to the same thing as here. We should be using the common English sources for the names. Really, your essay perfectly fits this. SilverserenC 10:03, 23 June 2011 (UTC)
The problem with COMMONNAME is that it currently contains no provision for when the sources are factually wrong, which does happen from time to time, if only because of human error and/or technical limitations. Billy Joel didn't wake up one morning and think, "Hey, I'm going to release that concert I did in Russia under a completely gibberish name"; he thought "hey, I'm going to release that concert I did in Russia under a Russian name". Same with Paul McCartney when he released an album of cover songs in Russia around the same time. COMMONNAME is not an infallible policy we should always follow, nor was COMMONNAME ever meant for use with regard to diacritics. In relation to the Billy Joel example, a Russian user once stated:

Constructs like 'COHUEPT' were only acceptable in a world where anything beyond English in ASCII code page was technically impossible to maintain in electronic and print media; I suppose we're well past this point.

Character support has improved vastly in the past ten years. It was much simpler for a print publication before widespread character support to print "Jiri Novak" instead of "Jiří Novák". Indeed, other policies state that accurate transliteration is preferred to following the leader; the last thing we want to do is enforce truth-by-being-on-Wikipedia (which I infamously once did; the leitmotif of Requiem for a Dream is commonly known as "Lux Aeterna" these days for the simple reason that when I created the page for the popular track, I consulted the back of the album). You'll note that the Russian Wikipedia page for Obama doesn't transliterate back to "Vagask Ovama". Sceptre (talk) 00:16, 24 June 2011 (UTC)
Then, can you explain to me why the other language Wikipedias that also use the Latin alphabet have the article titles "Barak Obama", or "Baracus Obama", or "Baraks Obama", or "Baraq Husseyın Obama"? If things were done as you say, then everything in the same alphabet should be spelled the same, name-wise, but they aren't. Furthermore, it is not up to you to determine that the majority of reliable sources, per COMMONNAME, are wrong. You are not allowed to do that whatsoever. You are just a random person. I am just a random person. Neither of us is allowed to impose our opinions onto encyclopedia articles; we are supposed to reflect the opinions of reliable sources. SilverserenC 02:31, 24 June 2011 (UTC)
I would guess it's because it's their equivalent version of USEENGLISH. COMMONNAME doesn't extend to diacritics at all, because the removal of diacritics does not an English version make. And source-fetishization has to stop; in this case, a reliable source is simply one that has a known record of accuracy and authority. RS does not, and has never and will never, require every character of every page to be absolutely correct. Imagine if the Guardian was the only source to print a birth-date of an actor, and prints it as "1997" instead of "1967", where the actor is clearly in his forties. Do we say that this actor, who was credited in a 1994 film, has the power to appear in films three years before he was born? Sceptre (talk) 03:38, 24 June 2011 (UTC)
Of course not. There's always a degree of common sense that has to go into the use of reliable sources. However, the accuracy of the facts of sources is not what we're discussing here. What we are asking, or what I am saying we should be asking, is what is the spelling of a name or a place that is common for it in the Western, English-speaking world? If that common spelling is without diacritics in the English-speaking world, then we should not be using them for that name as the title. If there are diacritics used for the name in the English-speaking world, then we should be using them for that title. If the name is not covered in sources in the English-speaking world, then we default to the original name in the native language. All in all, we should always make sure that, while making sure we are covering everything in the world, per WP:BIAS, we are still covering it from an English perspective, which is why we are English Wikipedia. This is just how other language Wikipedias are representing the subjects covered in them from the perspective of that language and the sources in that language. SilverserenC 04:03, 24 June 2011 (UTC)
Transliteration is fine, between different alphabets. Diacritics, however, are part of the Latin alphabet... --Piotr Konieczny aka Prokonsul Piotrus| talk 02:09, 24 June 2011 (UTC)
Yes, but some people seem to be confusing what this whole proposal is about. They seem to think that if you are opposing, then you are saying diacritics should not be used anywhere, which is not at all what this is about. Opposing means that you are keeping the wording "they are neither encouraged or discouraged". SilverserenC 02:28, 24 June 2011 (UTC)
The reason people are thinking that is because that is exactly how this discussion started. A user wanting to strip them out of every article because the current wording says "they are neither encouraged or discouraged" so it was felt stronger language was needed to explain that. -DJSasso (talk) 11:50, 24 June 2011 (UTC)
The original editor has repeatedly referred to requiring English-language reliable sources, and not mentioned the wording you quoted. Changing these words without addressing the question of how to evaluate the sources to use won't resolve anything. isaacl (talk) 12:36, 24 June 2011 (UTC)
Yes, he has in some places. In others he has rejected using them even when confronted with English sources. So frankly the real reason for this discussion is to stop the move to strip them from every article and to reword the sentence to show that diacritics are acceptable to use, and not that English sources are preferred and therefore removing them is preferred, which is what the current language implies but which is clearly not common practice. And policies/guidelines are supposed to reflect practice, not prescribe practice. So we need to remove or in some way change the implication that they should not be used, which is currently in the wording. -DJSasso (talk) 14:03, 24 June 2011 (UTC)
All I'm saying is that the proposal does not address "...not that english sources are preferred", and so if consensus is to alter this, more is required. However, since this guideline essentially repeats Wikipedia's guidance on using common names, to avoid duplication, it may be best to centralize all changes on that policy page. isaacl (talk) 14:16, 24 June 2011 (UTC)