Talk:X-SAMPA

Untitled

I set the layout of this page the way I did, as opposed to the more chart-based layout of SAMPA chart, because I think the people needing a reference for X-SAMPA are more likely to be trying to read it than to form it, and alphabetical order makes symbols simpler to find. Disagree? --Muke Tever 07:37, 1 Mar 2004 (UTC)

I like this layout: it's less scary for users who come here to "decode" a pronunciation guide, and much more user-friendly.
Three things I would like to see:
  • the lowercase letters all set out; many people won't know the IPA, and this article should cover X-SAMPA in its entirety, not refer people to IPA charts
  • images of the IPA characters in the table; the IPA characters won't display on most people's computers - that's why X-SAMPA exists after all. Any idea where we can get these? (We could just chop up an image of an IPA chart, if someone has one of those).
  • The example words written out in X-SAMPA as well. E.g.:
    • D - ð - voiced dental fricative - English then, [DEn]

fabiform | talk 14:47, 1 Mar 2004 (UTC)

Yeah... I'll see about setting to work on a chart for the lower-case letters. I wasn't sure about uploading images (especially for stuff that is technically text), but if it's necessary... --Muke Tever 16:51, 1 Mar 2004 (UTC)
IMHO it would be useful. The vast majority of the IPA symbols come up as boxes for me. If you have a chart, I'd be happy to do the chopping up.  :) fabiform | talk 16:54, 1 Mar 2004 (UTC)
There are IPA charts (and nice ones too) under the article International Phonetic Alphabet. There's also a comparison chart marked with X-SAMPA symbols beside the IPA ones, which is already linked to under this article. (Note that a couple of distinctions X-SAMPA marks as diacritics are technically different characters under the IPA, e.g. the retroflexes... The reverse is also true, e.g. in the case of X-SAMPA /5/ for l with tilde through.) --Muke Tever 09:44, 2 Mar 2004 (UTC)
Why on earth didn't I think of looking there?! I'll start chopping it up now.  :) fabiform | talk 11:16, 2 Mar 2004 (UTC)
I'd like to see the spelled out lowercase letters as well (probably could be done with a fairly low amount of cut'n'paste, I just don't have the time right now). Why upload characters as pictures when the IPA fonts are downloadable? At least I seem to remember links to them in the IPA/SAMPA articles. Pictures are ugly: they're not in the right resolution, cannot be resized correctly, and they're not IPA characters but pictures, so one cannot use the article to actually copy the character into other articles.
The article could use some examples from popular languages to show how to use the system.
I'll see about putting some together. --Muke Tever 17:56, 2 Mar 2004 (UTC)
I hope you don't mind my border around the tables. I tried center as well but it looks ugly. --grin 14:17, 2004 Mar 2 (UTC)
Don't worry, I'm not going to replace the IPA text, but put in a row of images as well. Not everybody will be using computers which they have administrator privileges on, remember, and plenty of people won't know about downloading fonts. fabiform | talk 14:29, 2 Mar 2004 (UTC)

Since we're organising this chart quite differently to the IPA one, do you think it would be a good idea to merge the Backslashed symbols with the upper- and lowercase letter tables? I was thinking of putting the letters followed by ` in with the upper and lowercase tables too, or do you think these would get too long?

Like this:

XS IPA IPA Description Example
r r   alveolar trill ...
r\ ɹ   ... ...
r` ...   ... ...
r\` ...   ... ...

fabiform | talk 17:14, 2 Mar 2004 (UTC)

That seems like a good idea, and I can't see anything wrong with it other than it feels slightly less organized. If you do this it should be noted somewhere that symbols with a following backslash are separate symbols not necessarily related to their unbackslashed counterparts, so that people trying to parse, e.g., [r\`] don't stop at [r] and parse it as some kind of retroflex trill. Actually that should be noted anyway, I'll do it now. --Muke Tever 17:56, 2 Mar 2004 (UTC)
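The parsing point above can be sketched in code. This is only an illustration in Python, not part of the X-SAMPA specification: the names tokenize and TOKEN are hypothetical, and the assumed rule (one base character, then an optional backslash, then an optional backtick) covers only the symbols discussed in this thread. Underscore diacritics such as _h are not handled.

```python
import re

# Assumed, simplified token shape (hypothetical, not the official grammar):
#   one base character, an optional backslash, an optional backtick.
# The backslash and backtick are consumed together with the base character,
# so "r\`" is read as one unit and never as "r" plus leftovers.
TOKEN = re.compile(r".\\?`?", re.DOTALL)


def tokenize(xsampa):
    """Split an X-SAMPA string into symbols, keeping modifiers attached."""
    return TOKEN.findall(xsampa)


if __name__ == "__main__":
    # Four different symbols that all start with the letter r.
    for text in ["r", "r\\", "r`", "r\\`"]:
        print(text, "->", tokenize(text))
```

Because the pattern is greedy, a reader of the output sees [r\`] as a single token rather than a trill followed by stray marks, which is the confusion the comment above warns about.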

It looks like one of us (me?) has made a mistake with X\ and x\, can you check which is which because at the moment they conflict. Not being able to see the IPA font is a slight handicap for me.  :) fabiform | talk 18:06, 2 Mar 2004 (UTC)

It was my mistake, sorted now. :) fabiform | talk 22:16, 2 Mar 2004 (UTC)

I see different IPA1 and IPA2 chars at 3\, they're mirrored. Maybe my charset is buggy, maybe the article. --grin 02:29, 2004 Mar 3 (UTC)

You're right. Mistake on my part (i.e., in the text... the image was correct). It's supposed to be a sort of closed three (hence the mnemonic value of 3\). Fixed now. --Muke Tever 04:12, 3 Mar 2004 (UTC)

I've been adding the SAMPA after the example words in English and French (I hope I've got them right!). I don't think we have any other dictionaries in the house to help me with the other languages.

I realised that we've not covered diphthongs; I think we should if this is going to be user-friendly. I mean, [n@U] doesn't exactly look like it's going to translate to "no" for the average user. Shall I take the diphthongs from the English SAMPA page, or are they too language-specific? Are there diphthongs for other languages? fabiform | talk 23:40, 3 Mar 2004 (UTC)

Missing symbols

While cross-checking the Hungarian X-SAMPA chart-to-be, I noticed several missing IPA/SAMPA symbols.

  • ʦ, LSL Ts Digraph
  • ʧ, LSL Tesh Digraph
  • ʣ, LSL Dz Digraph
  • ʤ, LSL Dezh Digraph

These are not diphthongs but normal phonemes (morphemes? I always mix them up).

The one you want here is "phoneme", but technically they're neither; they're symbols for specific "phones": the unit of linguistically-significant sound. A phone's symbol may also be used to represent a specific phoneme in a given language, but a phoneme is a more abstract concept: it's the "underlying" sound that a speaker has in their head, as opposed to what actually comes out. For instance, English has a phoneme /p/ that comes out as [p_h] at the beginning of words and bare [p] elsewhere; /b/ is normally [b] but is [b_0] when whispering. We know that /p/ and /b/ are distinct because of minimal pairs like "pat" and "bat" and "fob" and "fop"; but the different /p/'s are "the same sound" because they never distinguish words. The phonetic result of two phonemes together may also be different than their results individually; phonemically, "cucumber" starts with /kju/, but phonetically it may come out as [c_hCu].

As far as I see they're not on the normal IPA chart, so I guess they don't even exist. :) So, um, what now? --grin 10:53, 2004 Mar 11 (UTC)

Normally, such sounds - called affricates - may be indicated by simply putting the two symbols together, as long as the distinction between the affricate and the corresponding stop+fricative pair is not germane to the discussion. If such a distinction must be preserved, the affricate may be indicated by tying the two symbols graphically, with an arc much like the tie used in musical notation. The digraphs are a recognized variant of the tied notation, allowed by but not part of the IPA.

According to the X-SAMPA specs, there are basically two options: mark them as /ts tS dz dZ/ plain, and if you need to differentiate clusters, use /t-s t-S/ etc for the clusters (which is what hyphen "separator" is for) -- the other option is to mark the digraph phonemes explicitly as /t_s t_S/ etc. (the diacritics _s _S _z _Z are "reserved" so as not to clash with this kind of use) and leave the clusters as unmarked /ts dz/ etc. The first is probably more readable. --Muke Tever 17:16, 11 Mar 2004 (UTC)
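The two options described above can be compared side by side in a short sketch. This is an illustration only: the function classify, the set PAIRS, and the convention labels "hyphen" and "underscore" are hypothetical names chosen here, and only the four pairs mentioned in this thread are covered.

```python
# The four stop+fricative pairs discussed above.
PAIRS = {"ts", "tS", "dz", "dZ"}


def classify(text, convention):
    """Return "affricate" or "cluster" for a two-symbol sequence.

    convention "hyphen":     plain ts is the affricate, t-s is the cluster.
    convention "underscore": t_s is the affricate, plain ts is the cluster.
    """
    if convention not in ("hyphen", "underscore"):
        raise ValueError("unknown convention: " + convention)
    if text in PAIRS:
        # Unmarked sequence: its reading depends on the chosen convention.
        return "affricate" if convention == "hyphen" else "cluster"
    if len(text) == 3 and text[0] + text[2] in PAIRS:
        if text[1] == "-" and convention == "hyphen":
            return "cluster"
        if text[1] == "_" and convention == "underscore":
            return "affricate"
    raise ValueError("not a recognised sequence under this convention: " + text)


if __name__ == "__main__":
    for text, convention in [("tS", "hyphen"), ("t-S", "hyphen"),
                             ("t_S", "underscore"), ("tS", "underscore")]:
        print(text, convention, "->", classify(text, convention))
```

The sketch makes the trade-off visible: the same unmarked string gets opposite readings under the two conventions, so a transcription needs to state which one it follows.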

I reverted a series of edits by an anonymous user today; they might have contained some useful edits, so someone might want to check them over. I reverted because of his/her use of "Amerikkkan", which made me doubt the content of all his/her edits to this page. fabiform | talk 19:07, 12 Apr 2004 (UTC)

Voiceless versus unvoiced

I disagree with the change of "voiceless" to "unvoiced". While they may for the most part be synonymous, "unvoiced" is ambiguous terminology, which could suggest either "voiceless" or "devoiced" sounds (the latter being a process affecting voiced sounds, which may still retain lenis articulation; the _0 diacritic is generally used to represent this). In addition, both X-SAMPA and the IPA use the term "voiceless". Unless there's a reason for "unvoiced" it should be changed back. —Muke Tever 02:29, 9 Aug 2004 (UTC)

  • OK, I understand what you mean. Now... is there a case when you can have both devoiced and voiceless variants of a voiced sound?
Apparently some Dutch accents have [v_0] and [f], according to [1].
What about the other way around: using the _v diacritic on an initially voiceless sound, does it produce the exact same sound as the corresponding voiced sound? That is, in the case of alveolar fricatives, is [s_v] the same as [z]? Or is it the case that the first one is fortis and the second one lenis?
I suppose it depends on the language. Presumably the choice of using [_v] instead of a voiced character indicates the ordinary voiced character is unsuitable somehow, but alternatively it may be for consistency in representation of a morpheme (as if one wanted to represent knife, knives as /naIf, naIf_vz/). But honestly I do not see [_v] used much, so I can't say.
The "Voiced" and "Voiceless" diacritics are usually used in strict phonetic rendering to show a sound that is treated as a (say) voiceless phoneme but effectively pronounced voiced, AFAIK --199.202.104.120 19:42, 7 Sep 2004 (UTC)
After changing them back to 'voiceless'... What do you suggest we do with the previously existing 'unvoiced' entries? It doesn't seem right to use both terms in the same list so... should we then turn those into voiceless/devoiced?
--Danakil
Presumably the terminology should mirror the official X-SAMPA terms, so 'voiceless' in all cases. (I hadn't noticed there were "unvoiced"s in there before.) --Muke Tever 17:28, 9 Aug 2004 (UTC)
Done. Thanks for pointing this out. --Danakil

Handy comparison chart

This IPA/X-SAMPA comparison image was made by a friend of mine. I uploaded it to Image:X-sampa.gif. It's very handy. Maybe it can be incorporated into the article somehow.--Sonjaaa 03:00, Sep 12, 2004 (UTC)

That link doesn't seem to go anywhere, Sonjaaa. I'm assuming you mean KT's chart? There is a link for that in External Links. --Actually, I just removed the link there to http://www.diku.dk/hjemmesider/studerende/thorinn/xsamchart.gif, and replaced it with http://www.conmicro.cx/~kturtle/language/xsamchart.gif, since the latter is the current, more or less maintained, version. -- Preceding unsigned comment added by 68.115.21.42 (talk) 03:14, 2 February 2006 (UTC)
The latter link now lands on a UK gambling website, and the other gives a page not found (404) error, so neither of them is any use for people who want to compare or convert between IPA and X-SAMPA. yoyo (talk) 07:33, 14 October 2019 (UTC)
@Yahya Abdal-Aziz: Given they're 13 years old it would be surprising if either of them worked. But luckily the Wayback Machine has it saved. Nardog (talk) 08:25, 14 October 2019 (UTC)

Danish pronunciation

I have an objection to Danish vælge ["vElG@]. This is a very old-fashioned way to pronounce the word - a more contemporary way is ["vElj@]. Maybe "væge" [vEG@] is better, but the [G] sound is nowadays used only in very formal Danish. Danish stød is only pronounced [sd2?] in certain dialects - I presume the editor intended a reference to the article about Stød. --Apus 10:25, 23 June 2006 (UTC)

Yep, the velar fricative is ridiculously old-fashioned and I've removed it. Mr KEBAB (talk) 22:19, 2 May 2017 (UTC)

Asterisk?

(Note that it is a convention among some conlangers to use an asterisk (e.g., O*) instead of backslash). -- What's the source for this statement? I'm a long-standing member of the conlanging community and I've never come across this convention, nor has anyone I've asked about it. 84.70.22.61 12:28, 9 July 2006 (UTC)

  • I'd like to second this. I've never come across this in standard or occasional usage.--ATG 14:54, 12 July 2006 (UTC)
    • Likewise here; never seen or heard of this usage, so as I'm fixing other things here I'll remove this too. – Anon, 20/08/06
  • I'm not surprised if some people use asterisk instead of backslash. I think using backslash in an ASCII transcript wasn't a very lucky idea, as this can (and indeed does) sometimes cause problems in programs that interpret it as an escape character. E.g. you cannot use backslash this way in Emu. --Kgeza7 (talk) 15:26, 6 April 2009 (UTC)
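The escape-character problem mentioned in the last comment can be shown with a small Python sketch. Python is used here only as one example of a language that treats backslash as an escape character; the variable names and the sample transcription [pAr\t] are illustrative assumptions, not taken from the article.

```python
# The intended X-SAMPA text has five characters: p, A, r, backslash, t.

# Written naively in an ordinary string literal, backslash + t is read
# as a TAB character, so the transcription is silently corrupted.
naive = "pAr\t"

# Two common ways to keep the backslash literal in Python:
raw = r"pAr\t"        # raw string literal: backslashes are not escapes
escaped = "pAr\\t"    # doubled backslash in an ordinary literal

if __name__ == "__main__":
    print(len(naive), len(raw), len(escaped))
    print(raw == escaped)
    print("\t" in naive)
```

Note that a raw string literal in Python cannot end in a single backslash, so a transcription that ends in a backslashed symbol still needs the doubled form.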

CXS ambiguity?

I just noticed Conlang X-SAMPA not only extends, but also conflicts with the official X-SAMPA, on the meaning of &. Since people looking for the charts may not know about this (I didn't, and I find it a little odd the X-SAMPA encodings for the four redefined vowels don't appear on the CXS-augmented chart), it may be helpful to point out this difference/incompatibility a little more clearly.

Chart, again

I'm setting up one of the fancy IPA charts on my talk page as a reference that I can use to go from IPA to X-SAMPA. Would it (once it's both complete and with its other two main charts) be useful here for others who want to use a phonetic alphabet but don't have a keyboard with, say, "ŋ" on it?

Adiabatic 01:45, 4 May 2007 (UTC)

As a reference I think it's a brilliant idea. Well done! It's a bit less useful than it could have been, now that Unicode IPA fonts are easily available and UTF-8 support is catching on on the Internet, but I think it's wonderful. --Kjoonlee 22:24, 11 May 2007 (UTC)

Sortable charts

You might like to experiment with adding the parameter "sortable" in the Wikitable header (i.e. simply changing class="wikitable" to class="wikitable sortable"). This makes it easier to find the symbols by the shape of the IPA or the description of the sounds. Alternatively, the examples could be grouped by language. Just a suggestion: the article is very good as it stands.

You can see a couple of examples of this technique in Spelling in Gwoyeu Romatzyh. --NigelG (or Ndsg) | Talk 18:42, 3 January 2008 (UTC)


Scat?

Do we really need "scat" as the example for /k/? I am guessing it used to be cat before some vandalism? --Preceding unsigned comment added by 85.10.193.6 (talk) 02:52, 9 August 2008 (UTC)

The sound at the beginning of "cat" is actually aspirated in English ([k_h] in X-SAMPA), so it wouldn't be accurate to use cat as an example of pure [k]. --86.135.181.146 (talk) 23:56, 16 October 2008 (UTC)

Requested move

The following discussion is an archived discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.

The result of the move request was: page moved. Angr (talk) 17:59, 22 June 2012 (UTC)



Extended Speech Assessment Methods Phonetic Alphabet → X-SAMPA – A user (see history) moved this page from the original X-SAMPA to the current page, by expanding an acronym that few people in the world know (only the original members of the very old ESPRIT project 2589 SAM). This makes the page inconsistent with similar pages in the other languages and is not really necessary (it is like moving IBM to International Business Machines). I propose to restore the original X-SAMPA name. Thank you. Relisted. Jenks24 (talk) 11:19, 22 June 2012 (UTC) Nonna Abelarda (talk) 07:33, 13 June 2012 (UTC)

I went ahead and moved it in the absence of any objections. Angr (talk) 17:59, 22 June 2012 (UTC)
The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.

Wrong Example for "M"

The example for M is a Korean word, apparently pronounced (eu), that doesn't use M in its X-SAMPA notation. Suggest a different example or, if the notation is confused and in reality does include a close back unrounded vowel, the correct pronunciation. -- Preceding unsigned comment added by 180.57.62.4 (talk) 13:33, 19 September 2013 (UTC)

The Korean word is transliterated eu but it's pronounced [ɯ], or [M] in X-SAMPA. Aɴɢʀ (talk) 19:46, 19 September 2013 (UTC)

X-SAMPA and SAMPA

Is SAMPA a subset of X-SAMPA (or is X-SAMPA a superset of SAMPA)? Error (talk) 22:54, 12 November 2022 (UTC)

Neither. SAMPA is language-dependent and thus has many forms, and X-SAMPA is not. Nardog (talk) 23:13, 12 November 2022 (UTC)