Wikipedia talk:Manual of Style/Arabic/Archive 2

Below is old, archived discussion regarding Wikipedia:Manual of Style (Arabic)

Reviving the project

Since the project started, it has seen a lot of activity. But now, it seems that it stopped without reaching a consensus. To revive the project, I suggest:

  1. Putting down all issues that need to be discussed.
  2. Discussing them one by one to reach consensus.

I hope the policy will be finished. CG 08:48, August 28, 2005 (UTC)

Good idea; would you like to do that, then? ;) Palmiro | Talk 15:37, 28 August 2005 (UTC)

Cedar-Guardian, There seem to be only three outstanding issues: the first rule of alphabetizing (I went ahead and picked one from here, though feel free to change it), transliteration of ة, and transliteration of ى. Therefore, I'd recommend that you just arbitrarily pick a solution you prefer for those three items, explain it on the talk page, and then call the first cut of it done. From then on, the page can evolve organically. --Arcadian 00:56, 1 September 2005 (UTC)

I did some work on articles at the end of July, and then was away from my desk for about a month. I discovered template:lang-ar and started to implement it. It would help produce a tidy look for lead paragraph transliterations. However, I thought that incorporating some of the features of template:unicode or template:IPA might be helpful. These templates suggest to the browser which fonts may be more appropriate to display their contents. This seems to be particularly helpful to Internet Explorer, which gets itself very confused. Such an approach might allow us to use a more standard set of diacritics for transliteration, rather than having to do SAMPA-like workarounds. When it comes to transliteration, I think we should keep it simple for the non-Arabic-reading English-speaking readership of Wikipedia (if they can read Arabic, they don't really need the transliteration anyway). Here are some thoughts on this:
  1. Assimilate sun letters into the definite article: people will then pronounce things better.
  2. Use digraphs before diacritics in letters like sh, dh, th, kh and gh: people will stand a better chance pronouncing these than the diacritics (in a very few words this becomes complicated though).
  3. Fully distinguish every letter. Emphatics should be written with a sublinear dot, except for q and `ayn. Hamza should be represented by either a straight apostrophe or a unicode right half-circle. Likewise, `Ayn should be represented by either a backtick (`) or a left half-circle.
  4. It is an accepted standard to use the vowels a, i and u to transliterate Arabic, and to mark long vowels with a macron.
  5. The Ta Marbuta does not represent a noticeable h-sound in pronunciation, so the transliteration as -ah should be avoided. To mark this letter clearly, I suggest that ä is used (the two dots of the diaeresis reminiscent of the dots on the Ta). In construct situations the form -ät then seems most appropriate.
  6. For the undotted final Ya, I prefer the transliteration à (a with grave accent) as a way of distinguishing it from other endings.
What do you think? Gareth Hughes 12:19, 1 September 2005 (UTC)
I pretty much agree with you on all these points, but I think the ta marbuta should be represented as an 'a' with a small superscript 'h' next to it instead of 'ä' (which is confusing, particularly to Scandinavians like me). As for the representation of 'ayn, I think it should definitely be marked with a half-circle (since the back-tick is extremely ugly in my opinion). Hamza should be a simple apostrophe, to avoid confusion between the two. - ulayiti (talk) 12:35, 1 September 2005 (UTC)
These suggestions are mostly in line with what I've suggested above, and I like the half circle for the `ayn and the apostrophe for the hamzat qat`. I like the idea of distinguishing the 'alif maqsūra with an accent.
Here are the Unicode half ring letters, in normal, italic, and bold:
ʾ ʾ ʾ U+02BE MODIFIER LETTER RIGHT HALF RING
ʿ ʿ ʿ U+02BF MODIFIER LETTER LEFT HALF RING
--JWB 03:34, 13 May 2006 (UTC)
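A minimal Python sketch for checking the two code points listed above against the Unicode character database. The pairing of hamza with the right half ring and `ayn with the left half ring follows the suggestion earlier in this thread; the dictionary and function names are only illustrative.

```python
import unicodedata

# Pairing taken from the suggestion in this thread:
# hamza -> right half ring, `ayn -> left half ring.
HALF_RINGS = {
    "hamza": "\u02be",
    "ayn": "\u02bf",
}

def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for letter, ch in HALF_RINGS.items():
    print(letter, ch, describe(ch))
```

Both characters are modifier letters rather than combining marks, so they survive any step that strips combining diacritics.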
Again, the only question I'm not sure about is the ta' marbuta. I still think there's a strong argument for representing it as a ha' because in formal Arabic it is considered equivalent to a ha' and is actually exactly written as a ha' in contexts where exact pronunciation is specified (such as at the end of a line of poetry). I like the idea of distinguishing it somehow, but I'm not sure how. Technically, the "a" is from the fatha, not from the ta' , so it may be strange to set the "a" apart. I sort of like the idea of superscripting the "h" because it retains the ha' but distinguishes it. My only hesitation is that it may be a bit unwieldy. Treating it simply as an "h" may not properly distinguish it, but then again it's not a "real" letter anyway--in Arabic it's simply considered a double variation on the letters "t" and "h"--and non-Arabic speakers will know how to pronounce it and (in every or nearly every case) Arabic speakers will recognize the origin. Jbenhill 18:35, 1 September 2005 (UTC)
I also agree with Gareth Hughes' points, except I think the ta marbuta (tā' marbūṭah) should always be h unless in construct when it is t. This is nearly across the board the way it is handled in the major transliteration systems in use. Rather than invent something new I think we should stick with common conventions.
Also `ayn with a backtick and hamzah with apostrophe ' is fine and keeps us from using too many fancy characters. It is important to distinguish them, however, so `ayn should not be written with the regular apostrophe.
Another question to be resolved is the handling of hamzat al-waṣl. In BGN, UNGEGN and ALA-LC it is generally written out with the following vowel intact e.g. `Abd ar-Rahmān and I think this should be followed here.
Note that I am talking about transliterations used within articles and not about article titles here. Article titles, I think, should strive to use primary transliterations when they really exist. If they don't exist, we should use Garzo's system above, without diacritics. Thus emphatics, long vowels, and the alif maqsurah would lose their dots and dashes. Note however that I do not count ` for `ayn as a diacritic. In my opinion it should always be written, if at the beginning of a word the following vowel is capitalized e.g. `Abd and so on. Hamzah should be left off at the start of words and after the article, but written everywhere else as '.--Cam 23:52, 15 January 2006 (UTC)
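The "without diacritics" form proposed above for article titles could be derived mechanically from the in-article transliteration. The following is a rough Python sketch, assuming the transliteration uses letters that decompose into a base letter plus combining marks; the function name is hypothetical. It leaves the backtick for `ayn and the apostrophe for hamzah untouched, and it does not implement the rule about dropping hamzah at the start of words.

```python
import unicodedata

def strip_diacritics(text):
    """Drop dots and macrons but keep base letters, ` and '."""
    # NFD splits e.g. "ā" into "a" + COMBINING MACRON.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" covers the combining dots and macrons.
    kept = [c for c in decomposed if unicodedata.category(c) != "Mn"]
    return unicodedata.normalize("NFC", "".join(kept))

print(strip_diacritics("tā' marbūṭah"))
print(strip_diacritics("`Abd ar-Rahmān"))
```

Because the backtick and the apostrophe are not combining marks, they pass through unchanged, which matches the point that ` for `ayn is not counted as a diacritic.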

hamzat wasl

One topic that doesn't seem to have come up here is how the elusive quasi-letter hamzat waṣl should be transcribed in different contexts. Is it preferable always to write the vowel associated with it, even when a preceding vowel causes this implied vowel neither to be written nor pronounced in Arabic? (Many mistake the initial 'alif for a vowel, which it is not--'alif here is something vowels are placed onto, and sometimes it has no vowel and is therefore mute.) Correct pronunciation might be better understood if we only wrote the vowel when not preceded by another vowel, excluding case endings, as in Muḥammad ibn ʿAbd Allāh, but not when it follows another vowel: Al-ḥamd li-Llāh, 'ilá l-'ākhir, Fī ntiẓār. (I find it strange to write Al-ḥamd li-Allāh, 'ilá al-'ākhir, or Fī intiẓār, spellings that involve vowels that are neither written nor pronounced in formal Arabic).

I suggest that we might indicate the hamzat waṣl by its simple vowel where not preceded by another vowel and by dropping it entirely wherever it is preceded by a vowel. Taqī ad-Dīn would then become Taqī d-Dīn. It may look funny but it's closer both to the correct pronunciation and to its fully-vocalized spelling. The hamzat waṣl in this case is not explicitly written but it is presupposed wherever there is an unvoiced initial consonant.
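As a rough illustration of the rule suggested here, the following Python sketch joins two transliterated words and drops the initial vowel of the second when the first ends in a vowel. It assumes the caller already knows that the second word begins with hamzat waṣl; the function name and the vowel set are illustrative only, and hyphenated cases such as li-Llāh (which also change capitalization) are not handled.

```python
import unicodedata

# Short and long vowels as they appear in the examples in this thread.
VOWELS = set("aiuāīūá")

def join_with_wasl(previous, word):
    """Join two words; `word` is assumed to begin with hamzat wasl."""
    previous = unicodedata.normalize("NFC", previous)
    word = unicodedata.normalize("NFC", word)
    ends_in_vowel = previous[-1:].lower() in VOWELS
    starts_with_vowel = word[:1].lower() in VOWELS
    if ends_in_vowel and starts_with_vowel:
        # The vowel of the hamzat wasl is neither written nor pronounced.
        word = word[1:]
    return f"{previous} {word}"

print(join_with_wasl("Taqī", "ad-Dīn"))
print(join_with_wasl("Muḥammad", "ibn"))
```

After a consonant the vowel is kept, so Muḥammad ibn is left as it is.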

Whether we decide to treat each word independently (thus including the hamzat waṣl as if each word were at the beginning of the sentence) or treat them as a phrase, at the very least I think it's important always to differentiate the hamzat waṣl from the hamzat qaṭʿ by indicating the latter, especially in the initial position (thus 'Aḥmad and not Aḥmad), and by not writing it where it doesn't belong (for example 'ibn or 'Allāh).

Along the same lines, the transliteration ibn seems more accurately to represent formal Arabic than bin or ben. After a vowel in Arabic it is pronounced bn and can be written without the initial 'alif, but there is no internal vowel. Aside from primary transliterations, I think we should always transliterate it ibn. Jbenhill 19:33, 1 September 2005 (UTC)

Official transliteration

I liked the definitions of Primary transliteration and Standard transliteration, but I think that an Official transliteration should also be added. By this I mean that the English transliteration of an Arabic name used on its official site (not an official fan site) should be taken into consideration. This official transliteration should have higher priority than the standard one, but the problem is: if the official and the primary are different, which one has the highest priority? CG 20:35, September 7, 2005 (UTC)

Romanization Idea

Greetings everyone:

I have an idea of how to transliterate ﻅ and ﺫ, and I would like your thoughts about it. I realize it is not in keeping with standard conventions (such as putting a dot underneath the consonant to indicate that it is emphatic), but it is the simplest that I could come up with that (hopefully) keeps the correct pronunciation of these two letters. What about transliterating ﻅ as /dh/ and ﺫ as /dh/. I know it may look a bit odd, but we could always include a link with a very brief description as to manner and point of articulation for those who may be interested in more detail.

Carmen 05:05, 18 October 2005 (UTC)

My Romanization Thoughts

Here are my thoughts on the important issues mentioned above:

1. As for long vowels, I like the diacritic idea; most Americans know what /ā/, /ī/, and /ū/ indicate. However, there is a third option: using a colon to indicate vowel length. So, we would have a:, i:, and u:.

2. I am totally in favor of assimilation to sun letters (ad-dars), since it indicates the proper pronunciation of the noun phrase.

3. As for tā' marbūṭa, I think we should romanize it as /a/ at the end of a word: ṭāliba "female student" except:

When it is in construct state: /sayyarat-u ar-rajal/ or /sayyarat ar-rajal/ "the man's car"

and

When (and if) we decide to indicate case endings: dhahabt-u 'ilā al-madīnat-i "I went to the city."

Carmen 05:38, 18 October 2005 (UTC)
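The sun-letter assimilation favoured in point 2 (ad-dars) can be sketched in Python as follows. This is only an illustration under stated assumptions: the input is a transliterated word with an unassimilated al- prefix, digraphs are used for th, dh and sh, and the sun-letter list is the conventional one; the names are hypothetical.

```python
import unicodedata

# Sun letters in transliteration; digraphs are checked first.
SUN_DIGRAPHS = ("th", "dh", "sh")
SUN_SINGLES = set("tdrzsṣḍṭẓln")

def assimilate_article(word):
    """Turn al-dars into ad-dars; leave moon-letter words unchanged."""
    word = unicodedata.normalize("NFC", word)
    if not word.lower().startswith("al-"):
        return word
    a, rest = word[0], word[3:]
    head = rest[:2].lower()
    if head in SUN_DIGRAPHS:
        consonant = head
    elif rest[:1].lower() in SUN_SINGLES:
        consonant = rest[:1].lower()
    else:
        return word  # moon letter: the l of the article is pronounced
    return f"{a}{consonant}-{rest}"

print(assimilate_article("al-dars"))
print(assimilate_article("al-madīnat"))
```

Words beginning with kh or gh are left alone because neither k nor g is in the sun-letter set.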

Templates

I suggest a template to mark text in Arabic script, and Arabic transliteration, each, along the lines of Template:IPA, Template:IAST, Template:PIE, etc. What would the most appropriate names for such templates be? 81.63.114.127 19:30, 25 October 2005 (UTC)

perhaps worth a mention

People involved in this discussion might like to make a contribution to the open question of how to refer to the Baath/Baas/Ba'th/Ba'ath Party at Talk:Ba'ath Party. Palmiro | Talk 21:11, 4 November 2005 (UTC)

Lead Paragraphs is a terrible idea

First let me say that while I can't read a word of Arabic, I do speak several languages and I think I'm an open-minded person. I read the whole project page and most of it looks like good ideas.

That said, putting Arabic script in the lead paragraph of an article in the English Wikipedia is a terrible idea. It's just really, really bad. I mean, what is it for? Most of the people who read English Wikipedia can't read Arabic script, and probably don't even care what the Arabic name is for some city. They want to read something in English. If they want to read it in Arabic, they can click on the Arabic link and read it in Arabic.

Also, the idea here seems to be to stick the Arabic at the very beginning of the article. This is where the most important information should be. If I'm reading an article about, say, Dar es Salaam, the first thing I want to know about it is not some foreign script which I will never remember. It's not a useful piece of information. It sticks out like a sore thumb. Which is why I tracked down this project page and came here to make my comments.

Anyway, obviously a lot of articles have had this change made. I strongly suggest that these Arabic-script names should be eliminated from English Wikipedia articles. I've already taken it out of Tunis which I'm doing a lot of work on right now.

Sbwoodside 08:31, 9 November 2005 (UTC)

I can see your point, but I disagree with you. In some articles, the variations of names can fill the first sentence with parenthetical material. Most other encyclopaedias do not give non-Roman text in articles. However, I think it is really valuable to have information of a word or phrase in its native language. I am suggesting that we accompany all Arabic with transliteration, so that those who cannot read Arabic can get a sense of the Arabic word as well. It is useful to have the Arabic script alongside the transliteration, because there are a number of schemes of transliteration, and the original version is in Arabic script anyway. We have Arabic in the lead section of such articles for the same reason that the article on Moscow tells us what the Russian name of the city is. It seems completely odd that such an article would not have that kind of information. After all, the conventional English spelling is often different or misleading. The position of the Arabic is moot. I believe that if it's straightforward, it's best addressed straightaway. If the translation is more complex, it might be better to have its own sentence further down the article. --Gareth Hughes 16:46, 9 November 2005 (UTC)

Your Moscow example is certainly convincing. Can you point me to a policy or debate around the use of Cyrillic in that article? Also I think that the transliteration is much more useful since it's providing me with information that I (as a normal English Wikipedia user) can actually read and use. So presumably now with Dar es Salaam I should pronounce it the way it's transliterated? Is that the idea, that it's a pronunciation guide? Sbwoodside 19:38, 9 November 2005 (UTC)

Moscow's Cyrillic was added long ago [[1]]. Sbwoodside 19:45, 9 November 2005 (UTC)

OK, I'm against wikipedia policy in much of what I said. See: Wikipedia:Use_English which is policy and clearly says that the non-Latin names should appear on the first line. I'll add a link on the project page. I still think it's really ugly. I think it would be better to have some kind of box on the right that goes through the names. But this is clearly not the place for that discussion so I'll bring it up on WP:UE talk page... Sbwoodside 03:37, 11 November 2005 (UTC)

Transliteration and pronunciation are slightly different. Transliteration is a representation, in Roman letters, of a word written in a non-Roman script. Usually, transliterations are fairly scientific in that they attempt to retain as much information as possible (and sometimes add some). Hence, transliterations often use a whole host of diacritics that differ from language to language. Where transliteration renders script, pronunciation renders speech. Often these are the same, but they may be different. For example, Dar es Salaam is an Arabic phrase, but its Swahili pronunciation differs from modern standard Arabic. The transliteration (and the Arabic original) reveals that the vowel in Dar is long, as is the second vowel in Salaam. The original shows that es is nothing more than the official spelling in the name of the Arabic definite article al- as it coalesces with the following s sound. --Gareth Hughes 13:27, 11 November 2005 (UTC)

Sayyid and what to use as a test of 'primary transliteration'

There is a discussion going on at Talk:Sayyid about whether that transliteration or the transliteration syed should be used. The former is used in pretty much every serious book that I have seen; the latter is far more common on Google (but mainly in names, and it looks like mainly in names of Indian/Pakistani origin).

My contention, apart from anything else, is that use in an academic context is a better guide to what Wikipedia's preferred style is than a Google search.

Any thoughts? Palmiro | Talk 19:14, 11 November 2005 (UTC)

Inactive?

Why has this been marked as inactive? This should be completed. That's rather anti-productive, don't you think? --LakeHMM 22:47, 30 December 2005 (UTC)

Yes you're right, I readded {{proposed}}. CG 17:19, 31 December 2005 (UTC)
  • I labeled it as inactive because there had been no edits to this talk page for over a month (which is quite a long time in Wikispace, hence the term "inactive"). If you wish to reinvigorate discussion on the subject, by all means do so, and advertise it e.g. on RFC or the VP. I had thought {{historical}} made that clear. Radiant_>|< 20:41, 2 January 2006 (UTC)

Non-Arabic letters

The proposal says that it is for all languages using Arabic script, but does not yet seem to include letters such as پ, used in Persian and Urdu. JPD (talk) 16:27, 10 January 2006 (UTC)

Place names

The place names approved by the U.S. Board on Geographic Names (BGN) are available for free [2] and are used in many places on the Web in various gazetteer products. Significantly, they are the names always used in public-domain CIA maps which are found in many articles here. The BGN has a policy of setting aside its standard Arabic transliteration for names with established forms in English (e.g. Mecca, Damascus, Algiers), and also of representing local pronunciation in certain countries (e.g. Lebanon). When Standard Arabic readings are represented, the system is very similar to the ALA-LC. Official BGN names use diacritics (see Arabic transliteration table), and are shown with diacritics on CIA maps, but their products also provide no-diacritic versions which we could use as the titles of articles here. Note that the backtick for `ayn is always written in the no-diacritic version and I would recommend its use in our article names (with redirects if desired).

Also note that the BGN coordinates its naming policies with the quasi-official Permanent Committee on Geographical Names (PCGN) in the UK. My understanding is that the Arabic transliteration system used by PCGN is still exactly the same as the BGN system.

I strongly recommend adopting BGN place names without diacritics as the standard for use in article names. --Cam 21:56, 15 January 2006 (UTC)

Darn, I see BGN/PCGN is missing from the Arabic transliteration table. It is exactly the same as UNGEGN except that the article is shown without a hyphen and the "a" of the article is capitalized.
For example UNGEGN ash-Shishah is BGN/PCGN Ash Shishah (الششة). --Cam 22:05, 15 January 2006 (UTC)
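The difference described here is small enough to show as a Python sketch. It assumes a name that begins with a hyphenated definite article in UNGEGN form and applies only the two changes stated above (no hyphen, capital "A"); the function name and the crude article check are hypothetical and not part of either system.

```python
def ungegn_to_bgn_article(name):
    """Rewrite a leading hyphenated article in BGN/PCGN style."""
    article, hyphen, rest = name.partition("-")
    # Crude check: a short prefix such as al, ash, ad, ar before a hyphen.
    looks_like_article = (
        bool(hyphen)
        and 2 <= len(article) <= 3
        and article.lower().startswith("a")
    )
    if not looks_like_article:
        return name
    # Drop the hyphen and capitalize the "a" of the article.
    return f"{article.capitalize()} {rest}"

print(ungegn_to_bgn_article("ash-Shishah"))
```

Names without a leading article, such as the established English forms, are returned unchanged.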

standard vs. casual

From observing many many WP pages, I've noticed that there are two versions of Arabic transliteration. One is a very standard version, which uses dashes over long vowels, dots, and underscores when necessary. The other form is a casual form, which is one of two things: the standard version without the dashes and dots, or if the word is widely used in English (like Cairo or Mecca), then the English version.

I would like to propose that the official WP policy reflect this. There is great scholarly and academic value in keeping the standard version of transliteration at least once in the article, but using it throughout doesn't make much sense. I think the standard version needs to be mentioned, and guidelines set down for what it should be, and also some guidelines for what the casual version should be. Currently several different versions of casual transliteration are used. For example, Muhammad, Mohammad, Muhammed, and Mohammed are all used.

I'll try to re-write some of the page to reflect this. The standard version I want to use is the ALA-LC, Library of Congress, which is the most common on WP. Cuñado - Talk 07:03, 21 March 2006 (UTC)

I strongly prefer DIN 31635 over ALA-LC. DIN is used more in the academic community and ALA's underscores seem very problematic to me. —Ruud 19:50, 26 March 2006 (UTC)

Muḥammad ibn Mūsā al-Ḵwārizmī

People might be interested in (and might want to comment on) Muḥammad ibn Mūsā al-Ḵwārizmī, an article I'm currently working on in which transliteration is heavily used. —Ruud 03:08, 24 March 2006 (UTC)

Apart from Ruud, I saw a consensus to move that page to Muhammad ibn Musa al-Khwarizmi
In order to prevent future adventures in this sense, I move the present guideline proposal up to naming convention; I see no problem to handle remaining pending issues mentioned on this page with this being a guideline. --Francis Schonken 22:46, 26 March 2006 (UTC)
I object, a short while ago this proposal was inactive and I have (apart from the naming convention of articles) some further objections about this proposal. Sorry, but simply changing this to an accepted policy is not... acceptable. Please raise this at the village pump. —Ruud 22:49, 26 March 2006 (UTC)
Also note that the proposal was recently heavily modified by a single editor. —Ruud 22:55, 26 March 2006 (UTC)
I'm that single user. I'm not sure why a fight over a page title has spilled over onto this talk page, but I don't think it has any relevancy to the content of this page. This page is still a proposal and is not enforced as policy yet. Cuñado - Talk 23:28, 26 March 2006 (UTC)
I suspect it is because this policy is currently named Wikipedia:Naming conventions (Arabic) instead of the more appropriate Wikipedia:Arabic transliteration. Would you support renaming it? —Ruud 23:30, 26 March 2006 (UTC)
No. This title is in accordance with all the other language articles Cuñado - Talk 00:10, 27 March 2006 (UTC)
@Ruud, maybe a good idea to familiarize yourself a bit with the naming conventions series of guidelines.
@Cuñado, "policy" is maybe not the right word: all naming conventions are "guidelines" (only the one grouping them and giving the general principle, wikipedia:naming conventions, is a "policy")
If the final proposal would include the use of diacritics in (some) article titles, I'd encourage to list it in the list of active discussions at wikipedia:naming conventions (use English)#Disputed issues. Note that most other naming conventions guidelines/proposals that are about transcription/transliteration of foreign characters (Wikipedia:Naming conventions (Chinese); Wikipedia:Naming conventions (Cyrillic); Wikipedia:Naming conventions (Hebrew); Wikipedia:Naming conventions (Korean);...) generally avoid the use of English letters with diacritics for article titles, even if this means resorting to a more "traditional", academically less up-to-date system.
As you already may have noticed I moved Al-Kitāb al-muḫtaṣar fī ḥisāb al-ğabr wa-l-muqābala to The Compendious Book on Calculation by Completion and Balancing - that has to do with another naming conventions guideline, see wikipedia:naming conventions (books)#Title translations --Francis Schonken 02:21, 27 March 2006 (UTC)

Official transliteration

I really like the progress in this guideline, and I hope it will soon become an official policy. Just one comment: besides the "primary transliteration" and the "standard transliteration" we should also add an "official transliteration". This transliteration is related to official use of names. For example, the transliterated titles of the Lebanese presidents should be chosen according to the official presidency site, and the names of many singers should be used according to the one used in their official site. CG 11:18, 2 April 2006 (UTC)

This is a good point. I think this would be a subset of "primary transliteration." --Cam 15:55, 2 April 2006 (UTC)
Yes, I think whenever possible the official transliteration as given here should be used, even if it is not the primary transliteration according to the definition given. That would be in line with something I saw on one of the Wikipedia:Naming conventions pages about "self-defining entities", though of course I can't find it again now... Palmiro | Talk 16:08, 2 April 2006 (UTC)

Proposed changes to the transliteration

  • In the section Transliteration, ALA-LC transliteration is used as the accurate transliteration, but DIN 31635 is used in the rest of the proposal. I propose we use DIN 31635, except that ﺥ is transliterated as ḵ instead of ḫ (this system is used in other places, would keep it closer to the (English) standard transliteration and is more readable on computer screens).
  • Currently apostrophes are used in the accurate transliteration. I propose we use the half rings ʾ and ʿ instead.
  • I propose we add the characters necessary in DIN 31635 transliteration to MediaWiki:Edittools.

Cheers, —Ruud 16:51, 2 April 2006 (UTC)

  • DIN is less widely used in the English-speaking world than the ALA or UN (BGN/PCGN) systems. I think their use would be better than DIN.
  • The left and right single quotation marks are the marks specified in the ALA and UN systems for these letters or sounds.
--Cam 17:55, 2 April 2006 (UTC)
Ok, but...
  1. DIN is already quite heavily used on Wikipedia.
  2. The ALA system doesn't use underscores, as this proposal currently suggests (and using underscores would seem problematic to me).
  3. ALA isn't a 1-1 mapping even omitting the Alif; actually, it's not very different from the currently proposed standard transliteration. This makes it, in my eyes, unusable for the accurate transliteration, and we might as well omit it in that case.
Ruud 18:43, 2 April 2006 (UTC)
  1. I think ALA/UN systems are more widely-used in the English-speaking "outside world" and are also more easily converted from standard to accurate than DIN, so I would argue that we should move away from DIN. (It's always possible to use it within an article, but it wouldn't be the "normal" way.)
  2. I don't think we should use underscores if ALA or UN don't.
  3. I would argue that perfect 1-1 mapping ("super-accurate" mapping) isn't really necessary in the accurate transliteration. ALA at its most accurate (see this pdf for the full ALA system) is accurate enough, I think. Anyone who really needs the perfect mapping is going to know the Arabic script anyway, which will be visible right next to the accurate transliteration.
--Cam 20:10, 2 April 2006 (UTC)
Sorry, but if you think accurate transliterations are not necessary, just use the standard transliteration. I think it would be misleading to start calling a not-so-accurate transliteration system the accurate transliteration. Currently DIN transliteration is often used where the original Arabic is not given, and as far as I can see ALA-LC would not always let you reconstruct it. Furthermore, I don't buy your claim that ALA/UN systems are more widely used in the English-speaking "outside world" without some evidence. For example, the library system at the (Dutch) university where I study uses DIN (with the ḵ instead of ḫ modification). So do the few books I've read on Arabic mathematics. To summarize, in cases where accuracy isn't that important the standard transliteration is accurate enough; in cases where accuracy is wanted ALA-LC isn't accurate enough. —Ruud 22:29, 2 April 2006 (UTC)
I will look into DIN some more based on your recommendation. But it doesn't seem to be used that much on the Web. Note that Ḵwārizmī gets 91 hits in Google. Ḫwārizmī gets 40 hits. The common place name element ǧabal gets 54 hits. These are small numbers. Use of DIN does not seem common on the Internet. But again I am interested in DIN and will read more about it. --Cam 23:04, 2 April 2006 (UTC)

Exclude Persian, etc?

I think maybe we should leave languages other than Arabic out of consideration here. Persian, Urdu, etc. names have special problems that might be best handled separately. Perhaps eventually there could be an overarching "Arabic script" policy, but that's something for the future after each language's style is worked out. What do others think? --Cam 17:00, 2 April 2006 (UTC)

  • Yeah, I definitely agree. --LakeHMM 02:09, 26 April 2006 (UTC)
I definitely do not agree. And I'm going to revert the change. A great deal of Persian words come straight from Arabic, and are pronounced the same. Besides grammar, which is not affected by these guidelines, the only difference in Persian is the pronunciation of a few letters like "P", "V", and "Z". I'm copy editing tables of Persian and Urdu, which can be expanded on. Cuñado - Talk 08:26, 26 April 2006 (UTC)
Anyway, this is a MoS page, not a list of letters, so please add proposals for wikipedia's (standard, strict) transliteration for the letters you added, otherwise I don't see how this would fit on a MoS page. --Francis Schonken 09:23, 26 April 2006 (UTC)
If you want to add Persian, Urdu, etc. you should research all the differences these languages have in their transliterations (v instead of w, for example) and add them to the tables. Otherwise this is just about the Arabic language with a couple of vowels added to the long vowel table. --Cam 14:36, 26 April 2006 (UTC)
I added tables for Persian and Urdu. Do you mean add two more columns to the Arabic table? I'm not sure what you mean by the vowels added to the long vowels. The vowels are the same in Persian/Arabic, and I'm not familiar with Urdu, but I assume it's the same. Cuñado - Talk 20:50, 26 April 2006 (UTC)
But it's more complicated than just that. If the words come from Arabic, then they can be transliterated under the Arabic guidelines, otherwise they're different. There are other issues, like, with Arabic, ال and و. I think there should be a different manual of style for Persian words. Plus, there are SO many languages that are written with the Arabic abjad that we could hardly include special information for each. This would follow the same logic as if, for example, the Arabic language encyclopedia had a manual of style for transliteration of all languages written with the Roman alphabet. --LakeHMM 00:41, 27 April 2006 (UTC)
Yes, that's what I meant above by "all the differences". It's not just additional consonant letters; some of the consonants are pronounced differently and they are generally transliterated differently in Western texts. See Persian, Pushto, Urdu, etc, in the ALA-LC Romanization Tables as a starting point, if you want to tackle it. --Cam 04:47, 27 April 2006 (UTC)
I'm not really disagreeing with either of you. Only that those differences are not that great. "Muhammad" is still transliterated "Muhammad" regardless of what language it's used in. Some Persians think that "Mohamed" is the correct "Persian" way of transliterating it, when that is just a non-standard way of spelling. A good example is with Riḍwān, which according to Persian is pronounced and spelled Rizvān. Besides the letters that are already on the page, what else is particular to Persian/Urdu that needs to be addressed? Persian doesn't use the definite article "AL", but what about that needs to be on this page? This page is mainly to provide a standard format of transliterating Arabic script into Latin characters. Persian and Urdu are in Arabic script, and the non-Arabic differences are already noted. Cuñado - Talk 06:24, 27 April 2006 (UTC)

wa

The current proposal doesn't specify how wa ("and") in conjunction with a definite article should be transliterated. I've seen wa-al-, wa-l- and wa'l-. I prefer a version with a hyphen, but think that the choice between wa-al- and wa-l- should probably depend on whether or not we use assimilation. —Ruud 00:13, 3 April 2006 (UTC)

Naming conventions (Persian) and (Arabic)?

Question: where do I post a proposal for a naming convention on Ayatollahs? Thank you everyone... Gryffindor 21:39, 6 April 2006 (UTC)

Persian and Urdu

I filled out the section on Persian, but I'll admit I'm not an expert. I know even less about Urdu, so if anyone is inspired to look them over please go ahead. If the Urdu section gets finished I think this can become an active proposal again. Cuñado - Talk 07:43, 27 April 2006 (UTC)

Digraphs

I think the whole principle of using digraphs over diacritics is a bad idea. Not only is it just logically odd to turn one letter into two, but the underlining is really awkward. I propose that we leave the digraphs in the "standard transliteration", but use letters with diacritics for the strict one. I don't think that appeasing the average English speaker should be our number-one priority. If at all, that should be left to the "standard transliteration". --LakeHMM 09:10, 7 May 2006 (UTC)

As long as the systems correspond to something used in the real world, I'm OK with that. I don't know if I said this somewhere above, but I'm not comfortable with inventing a "Wikipedia system" for transliterating Arabic, it feels like original research. Deciding on which system to adopt and how to use it seems OK to me, though. --Cam 12:06, 7 May 2006 (UTC)
I don't have time to give a long detailed response, but the current version doesn't work and I'm going to revert it. The issue of reversibility is not a concern as long as the strict transliteration can differentiate between s-h and sh. It doesn't matter if one letter maps to two letters, as long as you can go the other way. As far as the comment about a "wikipedia" version... there is no international standard. In fact there have been several standards used by academics at different times and they aren't in complete agreement. The current version is pretty close to a combination of the common standards. Cuñado - Talk 04:40, 8 May 2006 (UTC)
The current version doesn't work either, it uses combining diacritical marks which do not display correctly on my system (Firefox) and probably not at all on other systems. —Ruud 16:55, 8 May 2006 (UTC)
I again express a strong preference for DIN 31635 (very similar to the other major standard ISO 233, see Arabic transliteration#Comparison table). It is widely used in the academic community, unlike ALA-LC which I've never seen in actual use (most likely because it was designed only very recently) and which requires underlining, by itself awkward enough that it probably won't be used even if it were recommended in this guideline. —Ruud 16:55, 8 May 2006 (UTC)
What would be really neat is if everybody could choose their own favourite transliteration system. Maybe I'll code some MediaWiki extension to do that some day. —Ruud 17:30, 8 May 2006 (UTC)
I strongly recommend using DIN 31635, and the {{ArabDIN}} template. Of course there will always be 'lazy' transliterations, I am not suggesting every occurrence of "Saddam Hussein" is spelled in DIN 31635, of course, but DIN should be used whenever the term itself is under discussion; compare to this IAST and {{IAST}} vs. lazy transliterations along the lines of Rigveda. dab () 18:18, 8 May 2006 (UTC)
I prefer the ALA-LC. It's completely reversible and much much closer to the lazy version. I don't see any problems with 1-2 character mapping, and I've only seen DIN used in old academic books. The United Nations naming conventions use a version very close to the ALA-LC. Cuñado - Talk 19:34, 8 May 2006 (UTC)
It isn't the 1->2 character mapping that is the problem, but the underlining that would be necessary. I don't like how it looks and underlining is a bit of a typographical don't. Also I could imagine a few articles where both Arabic and Sanskrit transliteration would be necessary and using IAST for one and ALA-LC for the other would look very inconsistent. —Ruud 12:21, 9 May 2006 (UTC)
we don't have a proper ALA-LC article, but from the list on this page, the "strict transliteration" seems to be practically identical to DIN, including underlines for ṯ, ḏ (Ṯāʼ, Ḏāl). What is the difference between ALA-LC and DIN? It seems to be restricted to ḵ vs. ḫ for ﺥ dab () 13:25, 9 May 2006 (UTC)

The ALA-LC article for Arabic can be seen here. In note 21 they use a prime ′ when two distinct consonant sounds might otherwise be confused with a digraph. I never actually read the entire document before so I didn't know how to deal with those cases. I took the underlining from several examples I had seen, and nothing else seemed to work.

The same article also shows the LC transliteration methods. Dab, you came in after the page was already changed to DIN, take a look at this version to compare the LC and DIN tables. I suggest using the LC version for transliteration. Cuñado - Talk 16:01, 9 May 2006 (UTC)
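The prime convention mentioned above, and the claim that digraphs stay reversible as long as s-h and sh can be told apart, can be sketched in a few lines. This is only an illustration under stated assumptions: the digraph set is a small hypothetical subset rather than the full ALA-LC table, and the function names `join_units` and `split_units` are made up for this example.

```python
# Hypothetical subset of digraphs used in a romanization scheme.
DIGRAPHS = {"sh", "kh", "th", "dh", "gh"}
PRIME = "\u2032"  # the prime character used as a separator


def join_units(units):
    """Join romanized letters, inserting a prime wherever two
    single letters would otherwise be read as one digraph."""
    out = []
    for unit in units:
        if out and len(out[-1]) == 1 and len(unit) == 1 and out[-1] + unit in DIGRAPHS:
            out.append(PRIME)
        out.append(unit)
    return "".join(out)


def split_units(text):
    """Recover the original letters: read digraphs greedily and
    treat a prime as a separator that is not itself a letter."""
    units = []
    i = 0
    while i < len(text):
        if text[i] == PRIME:
            i += 1
        elif text[i:i + 2] in DIGRAPHS:
            units.append(text[i:i + 2])
            i += 2
        else:
            units.append(text[i])
            i += 1
    return units


separate = ["a", "d", "h", "a", "m"]   # d followed by h, two consonants
digraph = ["a", "dh", "a", "n"]        # dh as a single consonant
print(join_units(separate))  # the prime keeps d and h apart
print(join_units(digraph))   # no prime needed
```

Because the prime appears only where an ambiguity would arise, ordinary words look the same as in the digraph-based spelling, while the mapping back to individual letters stays unambiguous.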

OK I moved things back, added the prime symbol, and got rid of the underlining. I also removed vowels which were added to the consonant table. Does this setup work for people? Cuñado - Talk 16:21, 9 May 2006 (UTC)
No, you still use combining diacritical marks and there currently is almost no difference between the standard and strict transliterations, why list a strict transliteration at all in that case? —Ruud 16:26, 9 May 2006 (UTC)
If you find DIN unacceptable, I suggest we simply add both to the guideline, as I see little reason for ever using ALA-LC myself, and I think a few other people would feel the same way. A guideline that is going to be ignored is of little use. —Ruud 16:33, 9 May 2006 (UTC)
That negates the purpose of an MOS. I don't have a solution. I think ALA-LC is the most readable, and the most widely used in my experience. It's completely reversible so I don't see a need to use DIN, which is not comprehensible to an untrained reader. Cuñado - Talk 16:53, 9 May 2006 (UTC)
It seems ALA-LC doesn't use underlining at all, making it somewhat more acceptable. If we would use ALA-LC we should drop the distinction between standard and strict transliterations, though. —Ruud 17:14, 9 May 2006 (UTC)
Actually there are 8 consonants which are different in the strict transliteration: 6 letters with underdots and the ayin and hamza. The reason for having strict and casual is because most editors do not want to use the correct transliteration with dots and quotation marks and accents. They will put up a big fight. I tried to change the page for Shi'a Islam to the proper form of Shi`ah Islam and people had a cow. Also using the strict version in the page content would be a nightmare to maintain. I've resigned myself to cleaning up the introductory sentence and make sure the one-time transliteration is in the correct form. I've also tried to correct just plain awful transliterations (I just moved Ashoura massacre to Ashura massacre), and this MOS is a big help for that. Cuñado - Talk 01:07, 10 May 2006 (UTC)

I still insist that using digraphs is sloppy, and s'h just looks bad. There's a reason why IPA doesn't use sh for ʃ. As for just trying to make it readable by the average English speaker, if you want to use sh and kh, you might as well replace every i with ee and every u with oo. It's linguistically inaccurate to represent one sound with two letters. It's only logical that sh should be the sound of s and then the sound of h. You could also transliterate ف as rp, but it just wouldn't make sense. I would also like to reiterate that if we use digraphs, I think it should just be for the standard transliteration. That's where we're supposed to make it easy to type, read, and pronounce. It seems like the only person really determined to use these sloppy digraphs and awkward transliteration is Cuñado. That's not a good reason to revert a perfectly helpful contribution to the MoS. --LakeHMM 00:27, 12 May 2006 (UTC)

Arabic names are a pain!

I've just been working on Shahab al-Din Suhrawardi and some related pages - finding all the variants of a name is a pain!

I have a question about the prefixes al- and as- - sometimes I see "al-Suhrawardi", sometimes "as-Suhrawardi" (same for ad-Din or al-Din, etc.). Is one form "better"? Why are there two forms? Does it correspond to different situations (like ibn and bin)? To Arabic and Persian conventions? To ignorant Europeans transliterating the wrong way?

It'd be nice if there was a page with info about this, beyond the Arabic name page. Not something to define a convention, but rather to discuss the issues that come up with Arabic names - how to recognize the different bits, which variants are incorrect (but should still be looked for), how to recognize bad transliterations, etc.

I've been working on the Al- article, and it links to the Arabic names page. I haven't gotten around to introducing the one reference I have, or expanding it to include assimilation just yet, but if anyone wants to take a look, they are of course very welcome. -Fsotrain09 16:38, 21 June 2006 (UTC)

I'm under the impression that a good chunk of the "red links" in Arabic don't refer to a missing article, but just have the alternative transliteration of an existing article. Which means a lot of red links could be fixed by searching for the right keywords. A guideline to how to do that would be nice :) Flammifer 14:51, 14 May 2006 (UTC)

Agree completely. -Fsotrain09 16:38, 21 June 2006 (UTC)
al-Suhrawardi corresponds to the way it is written in Arabic, as-Suhrawardi to the way it is pronounced. Both have their advantages, but the al- form seems to be much more popular. Short vowels are not written in Arabic, so ibn/bin is just written bn, but pronounced ibn in the Middle East and bin in North Africa. Arabic only had three vowels which are sometimes rendered as u, a, and i, sometimes as o, e and i. The last two together mean that there are sometimes several different variants of a Romanized Arabic name (Khwarizmi, Khowarizmi, Khuwarizmi, Khawarizmi). Sometimes "al-" is transliterated as "El". Then there are several ways of transliteration plus numerous ad-hoc transliterations of popular Arabic names. Sometimes you have to double a consonant to preserve the correct pronunciation, or do you want to stick closest to the Arabic original? How many titles and names of ancestors do you include? This does tend to lead to a combinatorial explosion. —Ruud 15:56, 14 May 2006 (UTC)
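The combinatorial explosion described above can be made concrete with a small sketch that enumerates candidate spellings to search for when chasing red links. The slot values below are illustrative assumptions built from the examples in this thread, and `spelling_variants` is a hypothetical helper name, not an existing tool.

```python
from itertools import product


def spelling_variants(slots):
    """Return every spelling obtained by picking one option per slot."""
    return ["".join(parts) for parts in product(*slots)]


# Each inner list holds the alternatives for one part of the name.
slots = [
    ["al-", "El-", ""],      # article written, written as "El", or dropped
    ["Kh"],
    ["", "o", "u", "a"],     # unwritten short vowel rendered in different ways
    ["warizmi"],
]

names = spelling_variants(slots)
print(len(names))  # 3 article options x 4 vowel options
for name in names:
    print(name)
```

Every extra point of variation multiplies the number of spellings, which is why even a short name can have a dozen or more plausible romanizations.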

Cool! I wrote a blurb on Wikipedia:Arabic names, hoping it may someday be a useful guideline. I see it having a different purpose from this article, which is more about defining what's the standard thing to do. Wikipedia:Arabic names would be "how to cope with the standard when you don't speak Arabic and don't have an encyclopedic knowledge of all the variants of Arabic names". By the way, I did study Arabic for some time, but haven't retained much of it beyond the alphabet and simple phrases. I didn't know (or had forgotten) about the as-Something thing. Anyway, I hope having a quick reminder of that kind of thing will be useful. Flammifer 16:51, 14 May 2006 (UTC)

This page already actually deals with what you're talking about, read the part about solar letters and it explains when to use the "al" or the other forms. Some standards transliterate every definite article as "al", and some take into account the solar letters, which changes the pronunciation and the "L" turns into the solar letter. Cuñado - Talk 23:01, 14 May 2006 (UTC)
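As a rough illustration of the solar-letter rule described above, the following sketch converts an unassimilated "al-" form into the assimilated one. It works on the romanized spelling only, so it assumes the name is already written with "al-" plus a hyphen, and the letter sets are my own reading of the usual sun letters in romanized form; `assimilate` is a hypothetical helper, not part of any guideline.

```python
# Romanized sun letters (assumed set). Digraphs are checked first so
# that "sh" is not mistaken for plain "s", and "kh"/"gh" stay untouched.
SUN_DIGRAPHS = ("th", "dh", "sh")
# Plain letters plus the dotted forms (s, d, t, z with underdot).
SUN_SINGLE = set("tdrzsln") | {"\u1e63", "\u1e0d", "\u1e6d", "\u1e93"}


def assimilate(name):
    """Rewrite 'al-X' so that the l takes the form of a leading sun letter."""
    if not name.lower().startswith("al-"):
        return name
    rest = name[3:]
    if rest[:2].lower() in SUN_DIGRAPHS:
        letter = rest[:2].lower()
    elif rest[:1].lower() in SUN_SINGLE:
        letter = rest[:1].lower()
    else:
        return name  # moon letter: the article stays "al-"
    return name[0] + letter + "-" + rest


for example in ["al-Din", "al-Suhrawardi", "al-Shams", "al-Qamar"]:
    print(example, "->", assimilate(example))
```

Going the other way is simpler, since any assimilated prefix can be replaced with "al-", which is one reason a search for name variants should probably try both forms.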

I reworked the bit on solar letters so it was more explicit about how that influences transliterations. I think the Arabic names page isn't completely covered by this Manual of Style, but maybe it could be. In the meantime, improving both seems the way to go. I see Arabic names as mostly about how to deal with the combinatorial explosion of transliterations of a name.

Hmm, this manual of style doesn't cover whether al- or ad- (etc.) should be preferred when there is a solar letter ... should it? (I wouldn't be surprised if there was already a lengthy discussion about this in the archives) flammifertalk 17:05, 16 May 2006 (UTC)

No it really doesn't cover which should be used. I've seen both transliterations in different standards. The ALA-LC does not use solar letters, the UN standards do, and I'm not too sure about DIN. I think it's ok right now just to say that both are appropriate. Cuñado - Talk 17:19, 16 May 2006 (UTC)

(Edit conflict! This was written before the answer above)

Ah, so it's the question of assimilation. Reading the discussion above, it seems that a) some standards prefer assimilated, some prefer non-assimilated; and b) most people here seem to prefer assimilated, so - I wrote in the manual of style that the assimilated form was preferred. Feel free to change, I don't feel very strongly about it (except that I'd prefer to pronounce things the right way). flammifertalk 17:24, 16 May 2006 (UTC)

Personally, my preference would be for non-assimilation, but yes, it's best to establish one rule, and the general preference here is for assimilation, so I think you were right to put that in the MOS. Palmiro | Talk 10:30, 17 May 2006 (UTC)

Ayatollah naming

Since I can't seem to get an answer to my first question, I will simply post the proposal here, hopefully it's in the right place. My proposal: Muslim clergy are listed with their titles, just like Christian clergy. It is therefore Grand Ayatollah Ali al-Sistani and not Ali al-Sistani, and Ayatollah Mohammad Yazdi and not Mohammad Yazdi. Gryffindor 16:32, 16 May 2006 (UTC)

This page is an MOS for transliteration, I think what you're looking for is naming conventions. There is a page for Western clergy but that does not cover Islamic clergy. You might have to just create a new page under the title Wikipedia:Naming conventions (Islamic clergy). I suggest you post your ideas on this talk page and go from there. Cuñado - Talk 17:09, 16 May 2006 (UTC)
Thank you for pointing that out, that's really nice of you. I created a new discussion page, please feel free to voice your opinions. Gryffindor 22:00, 23 May 2006 (UTC)

Transliteration into Arabic Script

I am looking for standards regarding transliteration of Non-Arabic names into Arabic language/script, for use in Arabic Wikipedia. This includes the following issues:

  1. Representing the non-Arabic letters P, G, V, ZH, NG, ... possibly using (Persian/Malay) extended Arabic characters.
  2. Transliteration of vowels, using diacritics or otherwise.
  3. Writing dates in Gregorian, Hijri, Coptic, and Berber calendars.

Regards. --Shafei 20:16, 23 May 2006 (UTC)

After checking primary Arabic-language sources, try looking at sites like the BBC, CNN, Xinhua, Radio France International, etc. that have international news in Arabic for ideas on transliterating Western names. --Cam 22:02, 23 May 2006 (UTC)

Missing alphabets in English & Arabic

This is all about those missing alphabets. For example: ط - ظ - ص - ع - غ - ء - ق. You cannot find any alphabets in English equivalent to these Arabic ones (which is also why you cannot pronounce them in English). That's why we are using the closest ones from English. Example:

ط = T

ظ = Dh or Th

ص = S

ع = A or '

غ = Gh

ء = A or '

ق = Q

Also you can find that problem in Arabic where you can find some English Alphabets with no equivalent in Arabic such as:

Ch

G (pronounced as Guy)

V

So people are using the closest alphabets to those in English:

ش or ج = Ch

غ or ج = G

ف = V

Note. There are some unofficial alphabets being assigned to the previous ones in English but since they are unofficial you can not find them in the keyboard. —Preceding unsigned comment added by Bashari (talk • contribs)

I'm not sure why you added all that, and there are quite a few errors in your mapping. Cuñado - Talk 01:11, 6 September 2006 (UTC)