# Wikipedia talk:WikiProject Linguistics

This page is within the scope of WikiProject Linguistics, a collaborative effort to improve the coverage of linguistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

Welcome to the talk page for WikiProject Linguistics. This is the hub of the Wikipedian linguist community; like the coffee machine in the office, this page is where people get together, share news, and discuss what they are doing. Feel free to ask questions, make suggestions, and keep everyone updated on your progress. New talk goes at the bottom, and remember to sign and date your comments by typing four tildes (~~~~). Thanks!

## Set phrase

There's a long-standing merge proposal for Set phrase and Fixed expression, which might benefit from some input from this project; both are short and almost unreferenced, and I wonder whether idiom might be a better target. The discussion is at Talk:Set phrase#Merge proposal. Klbrain (talk) 21:17, 14 April 2020 (UTC)

Merger complete. Klbrain (talk) 08:38, 6 July 2020 (UTC)

## Point of view on Verbosity

The article Verbosity currently cites a number of authorities advising against wordiness, but seemingly none critiquing this advice. Any sources or other contributions would be appreciated. See Talk:Verbosity#POV Issues. Cnilep (talk) 02:39, 26 June 2020 (UTC)

## Rework source-filter model

I just made a first pass at reworking the article on the source-filter model to organize it better and attempt to explain the concept in a more accessible way. However, I believe it would benefit from other editors taking a look. It still needs a lot of in-line citations, and I don't have access to all of the original articles (for example, Chiba & Kajiyama) that I would need in order to place them. Does anyone want to check it out? Emflazie (talk) 19:38, 2 July 2020 (UTC)

@Emflazie: Send me an email with the papers you need and I'll send any pdfs I have. I'll also take a look at the article. 22:41, 2 July 2020 (UTC)
Wugapodes, thank you. This clarifies a lot. Emflazie (talk) 03:24, 3 July 2020 (UTC)

## RfC on the Cognitive-Theoretic Model of the Universe

Hi, there is an RfC on the Cognitive-Theoretic Model of the Universe that may be of interest to this project. See: Talk:Cognitive-Theoretic Model of the Universe#Request for comment: on the notability of the CTMU in 2020 with sources published after 2006 and "unredirect" of this page to Christopher Langan - Scarpy (talk) 06:45, 8 July 2020 (UTC)

## IPA-xx: part two, through the lens of IPA-en

So, {{IPA-en}} is strange. It's used on three pages. Perhaps it was mass removed/replaced with {{IPAc-en}}? Why does it still exist, then, if deprecated? Nothing on the doc page either. The French version is used on 284 pages, odd, much more popular.[1]

It has no error checking. I came across this revision of Chewa language as it appeared from 20 May 2020 until I edited it just now.

With no parameters at all, it used to look like this:

English pronunciation: /{{{1}}}/

However, it's not really possible to call it without parameters and not notice it as an error, so the more common mistake is to try e.g. {{IPA-en||pron}}, which would show:

pronounced //

I figured we may as well get some use out of this template until we figure out what to do with the whole family of them, so to quickly fix Chewa language I made it so if the label is empty, only the {{small}} appears. So:

In '''Chewa''' (also known as '''Nyanja'''; {{IPA-en||pron}}{{IPAc-en|ˈ|n|j|æ|n|dʒ|ə}})

renders as: Chewa (also known as Nyanja; pronounced /ˈnjændʒə/)

Really, what should happen is, an overarching Lua module covering all the {{IPA-xx}} templates should be made; so one module to cover {{IPA-tl}}, {{IPA-es}}, {{IPA-de}}, and so on, using a similar system to {{cite book}} and its ability to figure out the names of languages from internationally standard codes like nan-tw, ja, etc.

The same module can probably also cover {{IPAc-xx}} as much of the work is very similar, just the parameter layout is different. A good thing is Lua can take arbitrary numbers of template parameters, so {{IPAc-ja}}'s silly limit of 34 can easily be raised. {{IPAc-de}} supports 50. Definitely the lack of uniformity can confuse people, especially when 34, or even 50, is not really that much. There's also the problem that some templates randomly use slashes and others brackets, and there's no logical reason for this that I know of. {{IPA-en}}, slashes. {{IPA-eo}}, square brackets. Of the two forms, most of them use brackets. English is the only major language I see with slashes.

It looks like Mr. Stradivarius and Nardog already got the ball rolling over at Module:IPAc-en. Module:IPAc-en/phonemes is cool.

Proposed game plan

1. Break the {{IPAc}} redirect
   1. First change all 8 transclusions to {{IPA}}, which will be left alone I suppose.
   2. ...perhaps eventually deprecate {{IPA}} and redirect it to {{IPA-}}, or {{IPA-}} to it. ┐(´ ∀｀ )┌
2. Overwrite Erutuon's Module:IPA, which isn't used in the mainspace and has been abandoned since 2017;
   1. Give it two functions, c, like IPAc-xx, and p, like IPA-xx.
3. Create {{IPA-}}
   1. Template just an {{#invoke:IPA|p}}
   2. To get the old {{IPA-en|ˈnjændʒə|pron}} output, now write {{IPA-|en|ˈnjændʒə|w<!--ith-->=pron}}
4. Add a function to Module:IPA, pc ("Old IPA-xx argument style compatible")
   1. Start replacing {{IPA-en}}, etc., with Module:IPA invokes: {{#invoke:IPA|lang=en|pc}}
5. Create {{IPAc}}
   1. Template just an {{#invoke:IPA|c}}. It calls out a lot to other modules like Module:IPAc-en.
   2. To get the old {{IPAc-en|ˈ|n|j|æ|n|dʒ|ə}} output, now write {{IPAc|en|ˈnjændʒə}}
   3. Add a |w= (with) here too to stop hacks like the one I did to Chewa language from proliferating
      1. So, to get {{IPA-en||pron}}{{IPAc-en|ˈ|n|j|æ|n|dʒ|ə}}, just do {{IPAc|en|ˈnjændʒə|w=pron}}
6. Create Module:IPAc-ja, Module:IPAc-de, etc. Work them into {{IPAc}}, the parent template, as we go.
7. Add a function to Module:IPA, cc ("Old IPAc-xx argument style compatible")
   1. Start replacing {{IPAc-en}}, etc., with Module:IPA invokes: {{#invoke:IPA|lang=en|cc}}
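
For illustration only, here is a rough Lua sketch of the compatibility idea behind steps 4 and 7, together with the empty-label handling described for Chewa language. It is plain Lua rather than Scribunto, and the names `join_parts`, `render`, and `labels` are made up for this example; none of them are taken from Module:IPA or Module:IPAc-en. The slashes are hard-coded here only because the example mirrors {{IPA-en}}.

```lua
-- Hypothetical sketch; not the actual Module:IPA.
-- Old IPAc-xx style passes one phoneme per positional argument, while the
-- proposed new style passes the whole transcription as a single string.

-- Join any number of positional parts into one transcription string.
local function join_parts(...)
    return table.concat({...})
end

-- Label codes and their display text (assumed; only "pron" appears above).
local labels = { pron = "pronounced" }

-- Wrap a transcription in slashes, prefixed by an optional label.
-- A missing or empty label yields the bare transcription.
local function render(transcription, label)
    local body = "/" .. transcription .. "/"
    if label == nil or label == "" then
        return body
    end
    return label .. " " .. body
end

-- Old-style arguments, joined and rendered with the "pron" label.
print(render(join_parts("ˈ", "n", "j", "æ", "n", "dʒ", "ə"), labels.pron))
```

Because `join_parts` is variadic, a sketch like this has no fixed ceiling on the number of parts, unlike the 34 or 50 parameter limits mentioned above.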

### Discussion

Ping line

This is all very involved, but things will work much more smoothly when done I believe, and there will be more uniform presentation. I'm happy to work on some of this, but I don't know if I can finish it all alone in any quick timeframe. What do you all think? Psiĥedelisto (talkcontribs) please always ping! 09:33, 9 July 2020 (UTC)

• I'm always in favor of reducing many identical language templates to one general one with a consistent style. This has recently been done with {{In lang}} which was the result of a TfD which deleted hundreds of near identical templates. --Gonnym (talk) 09:40, 9 July 2020 (UTC)
• I don't think your proposal fully accounts for the existing complexity/non-uniformity of IPA(c)-xx templates, but I support consolidating the IPA-xx templates to one {{IPA}} (using Lua) and making a wrapper module for all IPAc-xx. Like I said in the previous discussion, I see the handling of whitespace and labels as the biggest problems with these templates. I have modules sitting in my sandbox but the lack of response to the last discussion discouraged me.
(The use of slashes vs. square brackets isn't random at all. See Phonetic transcription#Narrow versus broad transcription, Help:IPA#Brackets. Help:IPA/English is the only key that uses a phonemic transcription, because readers of the English Wikipedia are expected to be familiar with the phonological system of English, and to account for the variety of accents they would have. The other keys use (rather broad but) phonetic transcriptions, because we can't expect them to be familiar with the phonology of non-English languages.) Nardog (talk) 10:03, 9 July 2020 (UTC)
@Nardog: Wow, very interesting about the brackets, thank you! I learned something today. So, that's something we'd need to preserve then in the #invoke:. On another template I've contributed to, {{unichar}}, I did it by defining a parameter |br= expected to be a string of length 2, then just Module:String's substring stuff. So, {{unichar|40|br=[]|sans=y}}: U+0040 [@] 🙌 About the complexity of the templates, for sure you are right, and it's a rough start, hopefully we will add to it as we go along if this goes from proposal to actually in the mainspace. Best, Psiĥedelisto (talkcontribs) please always ping! 10:41, 9 July 2020 (UTC)
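
A minimal Lua sketch of the |br= idea described above, assuming a hypothetical helper named `wrap_with` (this is not how {{unichar}} or Module:String is actually written). It uses the byte-based `string.sub` from standard Lua, so it only handles single-byte delimiters such as `[]` or `//`; anything else is passed through unwrapped.

```lua
-- Hypothetical helper: split a two-character "br" string into an opening
-- and a closing delimiter and wrap the text with them.
-- string.sub works on bytes, so multi-byte brackets such as ⟨⟩ are not
-- split correctly; in that case the text is returned unchanged.
local function wrap_with(br, text)
    if type(br) ~= "string" or #br ~= 2 then
        return text
    end
    return br:sub(1, 1) .. text .. br:sub(2, 2)
end

print(wrap_with("[]", "@"))
print(wrap_with("//", "a"))
```

On-wiki code would presumably want a Unicode-aware substring function instead, so that angle brackets could be passed the same way.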
The choice between slashes and brackets would depend on the key, not the transcription. If we were to consolidate IPA-xx to IPA, then we would most likely have a data module storing the correspondence between languages, keys to link, and dialects, so the type of brackets could easily be dealt with there. It would look something like this:
["en"] = {
    key = "Help:IPA/English",
    slashes = true,
    dialects = {
        ["uk"] = "British English",
        ["us"] = "American English",
    },
},
["sv"] = {
    key = "Help:IPA/Swedish",
    dialects = {
        ["fi"] = "Finland Swedish",
    },
},

Nardog (talk) 10:48, 9 July 2020 (UTC)
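
As a sketch of how a data table of that shape might be consumed, the plain-Lua example below picks slashes or square brackets from the `slashes` flag and looks up a dialect name. The function names `wrap` and `dialect_name` and the local table `data` are assumptions made for this example; no such module exists yet.

```lua
-- Hypothetical data table in the shape suggested above.
local data = {
    ["en"] = {
        key = "Help:IPA/English",
        slashes = true,
        dialects = {
            ["uk"] = "British English",
            ["us"] = "American English",
        },
    },
    ["sv"] = {
        key = "Help:IPA/Swedish",
        dialects = {
            ["fi"] = "Finland Swedish",
        },
    },
}

-- Wrap a transcription in the delimiters that the language's key calls for.
-- Returns nil when the language code is not in the table.
local function wrap(code, transcription)
    local entry = data[code]
    if not entry then
        return nil
    end
    if entry.slashes then
        return "/" .. transcription .. "/"
    end
    return "[" .. transcription .. "]"
end

-- Look up a dialect label; returns nil for unknown languages or dialects.
local function dialect_name(code, dialect)
    local entry = data[code]
    return entry and entry.dialects and entry.dialects[dialect] or nil
end

print(wrap("en", "ˈnjændʒə"), dialect_name("sv", "fi"))
```

Keeping the bracket choice in the data rather than in each template means it follows the key, as described above, and not the individual transcription.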
The dialects list should probably not duplicate any language module that already has that data. Is there one that handles dialects? --Gonnym (talk) 11:00, 9 July 2020 (UTC)
No dialects list that I'm aware of. I presume that the above examples relate to IETF language tags: en-UK, en-US, sv-FI where UK, US, and FI are region subtags derived from ISO 3166-1. Module:Lang/data maps some lang-region tags to adjectival language names that link the tag to its en.wiki article but Module:lang doesn't attempt to create anything like 'Finland Swedish' from sv-FI. Do I remember correctly? Weren't you looking for an official source of country names in their adjective forms? Did you ever find such a source?
Trappist the monk (talk) 17:47, 9 July 2020 (UTC)
You remembered right. I did look for something and ended up creating a module Module:Country adjective and basing the data on the list at List of adjectival and demonymic forms for countries and nations. Not sure if it is useful here though. --Gonnym (talk) 01:30, 10 July 2020 (UTC)
@Nardog: Sure, but could there not be a writer who can do narrow transcriptions of non-English languages? Many editors speak more than one language. So maybe we need both? Or is it more that the reader isn't likely to understand the nuances of a narrow transcription so it doesn't matter to mark one as such? Psiĥedelisto (talkcontribs) please always ping! 11:01, 9 July 2020 (UTC)
We just write {{IPA|[...]}} for that. There's also {{IPA-all}}. The IPA-xx templates that link to specific keys under Help:IPA/ are exclusively for transcriptions that adhere to those keys, which exist to help readers figure out what each symbol means—otherwise readers would be utterly confused. See Wikipedia:Manual of Style/Pronunciation. Nardog (talk) 11:06, 9 July 2020 (UTC)
• (Also, IPAc-en supports labels too—they just have to be the first argument. (Except lang, which converts to nothing. It's a long story!)) Nardog (talk) 10:30, 9 July 2020 (UTC)

Is there a template I can use, or some wiki markup, to generate a non-underlined hyperlink? I'm working on a couple articles on proto-Semitic, which by convention uses underdots to symbolize emphatic consonants and underlines to symbolize interdentals. When wikilinking a single character, this looks pretty bad. The IPA template turns off link underlining, which is handy, but IPA notation isn't appropriate in context, as these usages require traditional semiticist transliteration. --April Arcus (talk) 18:40, 13 July 2020 (UTC)

Template:Nounderlines. Nardog (talk) 21:39, 13 July 2020 (UTC)

There is a discussion proposing that part of the Phoenician alphabet article might need to be split off into an article that is to be named Canaanite scripts. Please comment at Talk:Phoenician_alphabet#This_page_might_need_to_be_split. Debresser (talk) 12:09, 26 July 2020 (UTC)

## Could someone undo the May 16th edit on article "Nabataean alphabet"?

See https://en.wikipedia.org/w/index.php?title=Nabataean_alphabet&action=history . This edit is wrong for reasons explained at Talk:Nabataean alphabet#Problems in the table. Due to a combination of Coronavirus isolation and the stupid encryption protocol upgrade, I'm editing with a non-fully-Unicode-compliant tool, so I can't do it myself right now... AnonMoos (talk) 21:54, 26 July 2020 (UTC)

## New dab page Islamic language

FYI I have opened a discussion about the rationale of the new dab page in Talk:Islamic language. –Austronesier (talk) 07:13, 28 July 2020 (UTC)

## Draft:Powari language

Could somebody look at Draft:Powari language? It's been kicking around WP:AfC for two years. We're trying to figure out if it's worth keeping, but it needs a subject-matter expert. -- RoySmith (talk) 13:35, 28 July 2020 (UTC)

I have collected some articles which have language- or linguistics-related links to DAB pages, where expert attention would be welcome. Search for "disam" in read mode and for "d" in edit mode, and if you solve any of these puzzles remove the {{dn}} tag and post {{done}} here.

Thanks in advance, Narky Blert (talk) 14:03, 28 July 2020 (UTC)

## Third-person pronoun

Need more eyeballs at Third-person pronoun, which is an OR disaster. I'm particularly interested in the table in this section, especially the portion under the header, Gender-neutral singular pronouns. This entire portion of the table should come out; it is populated by hapax and other obscure science fiction trivia, and is completely unencyclopedic. Thanks, Mathglot (talk) 08:25, 4 August 2020 (UTC)

I have seen this page a while ago and closed it with mixed feelings of shudder and cringe, for various reasons. It's not about third-person pronouns in general. Much of it is about gender-distinctions, with a gender-distinction-normative bias: e.g. pronouns in gender-neutral languages are called "gender-inclusive", even if these languages never had an exclusion problem. Plus lengthy stray material that is not about third-person at all, e.g. the subsection Rapa; not to mention the non-notable hapax from hobby conlangs. –Austronesier (talk) 11:18, 4 August 2020 (UTC)

I moved it to Gender neutrality in languages with gendered third-person pronouns and removed all the sections on languages that didn't fit the topic, which left only Swedish and Norwegian and examples of the opposite trend in CJK. Please don't hesitate to try something else. — kwami (talk) 18:48, 10 August 2020 (UTC)

## Is this reference on the Hajong language reliable?

Publisher: SIL

Context: A citation to this was flagged ([2]). The issue at the heart is the alphabet used for this language. The language is strongly associated with an ethnic group and there is an effort to preserve and advance it into a literate form. Currently a number of different scripts are being used by the speakers, who are distributed in Assam in India and Bangladesh. The scripts in use are Latin and the Bengali-Assamese script, and within the latter both alphabets, Bengali and Assamese, are being used.

The point here is, is this reference reliable for the claim that the Assamese alphabet is one of the alphabets used to write the Hajong language, from the examples in the book and from the sentence - "Each word in the word lists is written first in Roman script followed by Assamese script in brackets." (p.1)

Thanks! Chaipau (talk) 15:13, 4 August 2020 (UTC)

@Chaipau: Sorry, I have just noticed your question here. The answer is yes, it is just as reliable as the source (Ethnologue) for the use of the Bengali alphabet in the same box of the table. Both sources are published by SIL. But actually, I find this SIL source[3] most enlightening (p. 27), which is also cited in Hajong language. This survey indicates that some Hajong speakers actually are less sensitive to these details than Bengalis/Assamese would expect. –Austronesier (talk) 09:48, 14 August 2020 (UTC)
You have gone ahead and responded to the tag as well. Thank you! Chaipau (talk) 11:50, 14 August 2020 (UTC)

## Do IPA click letters require velar/uvular symbols?

Kwamikagami edited {{IPA non-pulmonic consonants}} to show all click letters with velar/uvular symbols preceding them, saying ⟨ʘ ǀ ǃ ǂ ǁ⟩ only symbolize releases according to the Handbook of the IPA, citing the fact that pp. 20–1 of the Handbook show ⟨k͡ʘ k͡ǀ k͡ǃ k͡ǂ k͡ǁʰ⟩ to illustrate the sounds through examples from ǃXóõ and Xhosa. P. 10 of the Handbook says:

'Velaric' airstream sounds, usually known as 'clicks', again involve creating an enclosed cavity in which the pressure of the air can be changed, but this time the back closure is made not with the glottis but with the back of the tongue against the soft palate, such that air is sucked into the mouth when the closure further forward is released. The 'tut-tut' or 'tsk-tsk' sound, used by many English speakers as an indication of disapproval, is produced in this way, but only in isolation and not as part of ordinary words. Some other languages use clicks as consonants. A separate set of symbols such as [ǂ] is provided for clicks. Since any click involves a velar or uvular closure, it is possible to symbolize factors such as voicelessness, voicing, or nasality of the click by combining the click symbol with the appropriate velar or uvular symbol: [k͡ǂ ɡ͡ǂ ŋ͡ǂ q͡ǃ].

This doesn't strike me as saying the letters must be used with velar or uvular symbols, especially when the letters are seen by themselves in works, including the IPA Illustration for Sandawe.

Is it true that ⟨ʘ ǀ ǃ ǂ ǁ⟩ can only represent releases? How should our {{IPA non-pulmonic consonants}} be arranged? Nardog (talk) 08:27, 10 August 2020 (UTC)

I don't see that the Handbook says affricates "must" be written with two symbols either, they just give a couple examples written that way on p. 22. The examples are all we have to go on.
It's very common to leave out the accompaniment if it's <k>. However, it's less common to do so if there's a velar-uvular distinction to worry about.
Before the Handbook, the illustrations of the IPA implied that the clicks needed two letters but then wrote the tenuis clicks with only one. That was evidently cleaned up for the Handbook.
As for the Sandawe illustration, note that many of the illustrations in the Handbook omit the tie bars from the affricates (Igbo even uses the tie bar for labial-velars but not for affricates!), but it's still proper to write them with tie bars.
When giving prescriptive explanations, we should be careful not to take shortcuts that might confuse the reader. We can always explain common shortcuts and use them ourselves in the language articles, but when readers refer back to the IPA articles they should be clear on official usage.
For several years, there was a shift toward using the simple click letters for the complete consonant, with diacritics for anything else except for the velar-uvular distinction. Now usage seems to be swinging back. There's a phonetics volume on clicks coming out, and much of the transcription is explicit about both places, though typically using superscripts rather than tie bars (a convention also commonly seen for affricates and labial-velars). — kwami (talk) 08:45, 10 August 2020 (UTC)
The official IPA chart only says "Affricates and double articulations can be represented by two symbols joined by a tie bar if necessary" (emphasis added). I don't understand how that's germane even if what you say about clicks were true. Nardog (talk) 08:51, 10 August 2020 (UTC)
It's germane because it's exactly parallel. You object that the IPA doesn't explicitly require two letters for a click, but then it doesn't explicitly require two letters for an affricate either. So, should we assume that the second letter is optional? We can only go by the examples. The Chart doesn't give any, the Handbook does, and in the 'Guide to IPA notation' all clicks and affricates are written by two letters joined with tie bars. — kwami (talk) 08:56, 10 August 2020 (UTC)
It's possible to do all sorts of things with the IPA. It's possible to use ⟨c⟩ for an affricate, for example, but that doesn't mean we should define it as one. — kwami (talk) 08:56, 10 August 2020 (UTC)
What the IPA doesn't explicitly require for an affricate is a tie bar, not two letters. Nardog (talk) 09:02, 10 August 2020 (UTC)
It never says that two letters are required, and indeed in some of the illustrations affricates are written with only one letter (e.g. Korean, Sindhi).
Note also with Sandawe that they incorrectly transcribed the glottalized nasal clicks as ejective. They just adopted common conventions, they weren't being precise.
It's also quite common to use ⟨ɾ ɽ⟩ for laterals, e.g. in Indic languages, but that doesn't mean [ɾ ɽ] are lateral. — kwami (talk) 09:05, 10 August 2020 (UTC)
(edit conflict) But you can't possibly represent a double articulation with one letter, with a tie bar or not. So "if necessary" is clearly in reference to the use of a tie bar, not two symbols. I'm not saying you can't represent an affricate with one letter, just that a tie bar is only optional when representing an affricate according to the official chart. What the Igbo illustration does is actually a very sensible choice: [tʃ, dʒ] are in fact transitions from [t, d] to [ʃ, ʒ], while [k͡p, ɡ͡b] not so much, following closely the IPA Principle #4 (c).
By this analogy, which you brought up, then, the Handbook is saying that ⟨ʘ ǀ ǃ ǂ ǁ⟩ can indeed represent entire clicks. That's not quite the same as "The Handbook requires them. What the Chart calls 'clicks' aren't consonants at all, just the releases", which is what you said at Template talk:IPA non-pulmonic consonants#velar vs uvular clicks. Nardog (talk) 09:25, 10 August 2020 (UTC)
You're reading a lot of your own opinions into the motivations of the writers of the Handbook. It makes just as much sense to write a labial-velar stop as ⟨kp⟩ in a language that doesn't contrast that with k + p as it does to write an affricate ⟨ts⟩ in a language that doesn't contrast that with t + s. And I don't see how you can possibly think that Principle 4(c) is any more relevant to one than to the other. — kwami (talk) 09:38, 10 August 2020 (UTC)

The question is whether you wish to be precise in your presentation of how the IPA works. In explaining the IPA, I feel that we should be precise. When being precise, [t͡s] and [k͡p] are one segment, [ts] and [kp] are two. [c] is a plosive, not an affricate. [ɽ] is central, not lateral. [ɨ] is a central vowel, not back. [k] is pulmonic, not ejective. [ǂ] is a click type, not a velar click. Outside pedagogy, it's fine to use the alphabet more broadly -- [ts] and [c] can both be affricates, [ɽ] can be lateral, [ɨ] can be a back vowel, [k] can be ejective, and [ǂ] can be a velar click. But you don't start off presenting them that way to people who are not familiar with how the IPA works. — kwami (talk) 09:35, 10 August 2020 (UTC)

As for the phrasing "it is possible to symbolize factors such as voicelessness, voicing, or nasality of the click by combining the click symbol with the appropriate velar or uvular symbol", they're saying that it's convenient to transcribe features that have nothing to do with the rear articulation as if they were part of that articulation rather than of the entire consonant. The voicing and nasalization don't belong to the rear articulation -- that's only specified for uvular-velar, affrication, ejection and the like. The IPA transcription of clicks is weird, like writing labial-velars as *⟨k͡p, g͡p, ŋ͡p⟩. A lot of the variability and debate in transcription is related to this fact, that the IPA letters for clicks don't really fit in with the rest of the alphabet. I suppose that one might address this by transcribing clicks as e.g. ⟨k͡ǂ̥, ɡ͡ǂ̬, ŋ͜ǂ̃⟩, but I've never seen anyone do that. — kwami (talk) 09:55, 10 August 2020 (UTC)

The Chinese Wikipedia's IPA article uses the diacritics for voicelessness and nasalization to modify click symbols, see zh:國際音標#非肺部氣流音. Love —LiliCharlie (talk) 04:51, 12 August 2020 (UTC)
I'm sure that's just copied from a version of our {{IPA non-pulmonic consonants}} before recent edits. What I want to know is the community's opinion on how that template should be presenting the links to articles about clicks. I for one think the articles themselves (most of which are unreferenced) are hardly notable and should be merged into just Bilabial click, Dental click, Alveolar click, Retroflex click, and Palatal click, so that then the template can simply have a row of ⟨ʘ ǀ ǃ ǂ ‼ ǁ (ʞ)⟩ much like the actual IPA chart. But even barring that, do we need both velar and uvular symbols, making the rows twice as tall? My understanding is that the use of velar symbols has been far more prevalent, even if the actual posterior closure of such clicks may be more accurately described as uvular. Nardog (talk) 05:07, 12 August 2020 (UTC)
Following Ladefoged & Maddieson (1996:265–266) velar and uvular symbols are both required to account for a phonemic contrast in ǃXóõ. However Miller et al. (2007) say in their study of Nǀuu that "evidence suggests that the contrast between “velar” and “uvular” clicks proposed for the related language ǃXóõ is likely also one of airstream and that a contrast solely in terms of posterior place would be articulatorily impossible." Love —LiliCharlie (talk) 05:50, 12 August 2020 (UTC)
The Cornell link is dead, here's the new URL: [4] –Austronesier (talk) 10:11, 12 August 2020 (UTC)
Good spot, thanks for pointing out and providing a working link, Austronesier. My outdated link was actually to the earlier 2007 version of the study that was submitted to JIPA where it was published in 2009. I've managed to find the 2007 version archived on WaybackMachine, but the 2009 JIPA version is even better and certainly more accessible. Love —LiliCharlie (talk) 17:07, 12 August 2020 (UTC)

Pace Miller, there's at least one language that distinguishes velar from uvular clicks without any airstream contour. You hear it in the vowel rather than in the release of the click. (Miller discovered when working on N|uu that in that language where a velar-uvular distinction had been posited, all clicks were uvular and the distinction was one of timing, e.g. [q͡ǂ] vs [ǂ͡q] -- that is, whether or not you could hear the uvular release. She suggested that all languages had uvular clicks only in this fashion, and that "velar" clicks simply had an inaudible back release. But it turns out that not all of them do -- some have only velar clicks, and some have both. Why she should say that such an easy distinction might be "articulatorily impossible" is beyond me. They're easy to articulate and the spectrograms are pretty clear, with e.g. a velar pinch after velar clicks.) But regardless, the question here is what is the IPA convention, not what is Miller's. Lots of people use the bare click letter for a tenuis velar click. And that's fine, if you're not sticking to strict IPA. The IPA itself did that when it introduced the Beech letters in 1923. But the 1999 Handbook -- the replacement for the 1949 Principles so they could accommodate the Kiel convention that had replaced the original click letters with the ones we see now -- doesn't take such shortcuts in the examples it gives.

(Side note, Sandawe and Hadza have only (somewhat backed) velar plosives and ejectives, and clicks at the same rear place of articulation. Some Khoe langs have both velar and uvular plosives and ejectives, and clicks at both places of articulation. If you were looking only at those languages, it would be natural to conclude that clicks are doubly articulated. But Xhosa has only velar plosives and ejectives and only uvular clicks. If you were to look only at that language, it would be natural to conclude that the uvular closure is part of the airstream mechanism, not a place of articulation. So there's plenty of reason for theoretical differences between phoneticians, which are reflected in how they choose to symbolize clicks.)

As for merging the articles, that would effectively be saying that click consonants aren't important enough to bother distinguishing. It's as if a French-speaker were to say that our articles on affricates should be merged into the corresponding fricatives because affricates aren't important -- after all, they don't occur in French and there are no IPA letters for them. — kwami (talk) 04:51, 14 August 2020 (UTC)

## Reliable sources noticeboard: EtymOnline

Wikipedia:Reliable sources/Noticeboard#etymonline could use input from this project's participants. Nardog (talk) 13:35, 13 August 2020 (UTC)

## etimo aut no etimo

Hey There,
many psychological pages have no etymology whatsoever
ie "panic attack" does not refer to Pan. Goddess Psyche is never mentioned anywhere
thanks Linguists --Wittgenstein51 (talk) 19:08, 15 August 2020 (UTC)

Please go ahead and add what you deem to be missing. −Woodstone (talk) 07:33, 16 August 2020 (UTC)

## Unicode chart template references

Regarding templates within Category:Unicode charts, would there be a reason that the superscript numbers at the top could not be replaced by, for example, letters, to better distinguish them from article references? This would make them more clearly linked to the template notes they apply to. (Drmccreedy,BabelStone) CMD (talk) 14:52, 19 August 2020 (UTC)

Personally, I find these less confusing/irritating than the notes/references in Help:IPA/English :) But sure, there is no reason not to convert them into something that better meets common expectations, even if not prescribed by MOS. –Austronesier (talk) 15:31, 19 August 2020 (UTC)

## Feedback requested at Talk:Anti-LGBT rhetoric#Merger

Your feedback would be appreciated at this discussion, which proposes the merger of four articles into Anti-LGBT rhetoric. Thanks, Mathglot (talk) 18:23, 22 August 2020 (UTC)

## Is an "alphabet" and a "script" the same?

Is an "alphabet" and a "script" the same thing? I know this is probably not a strictly linguistic issue, but I can think of no other expert group that can help with this. If this is not the forum, please point me to the right one.

Context: Today we have Bengali-Assamese script and the two alphabets: Bengali alphabet and Assamese alphabet. The "script" article came about because the then Bengali script was too language specific and after some discussion it was decided that a "parent" article was required (Talk:Bengali_alphabet#Merge_with_Assamese_script?, 2006-2007). After some meandering the article name settled on "Bengali-Assamese script".

The immediate context is the discussion at Talk:Rangpuri_language#Writing_system. In short the discussion is on whether we should link the script of the Rangpuri language as [[Bengali-Assamese script]] or [[Bengali alphabet|Bengali script]].

I am tagging the other interested parties: user:Za-ari-masen and user:Msasag. And also user:SameerKhan who was instrumental in the 2006/2007 decision.

Thank you!

Chaipau (talk) 12:53, 29 August 2020 (UTC)

Terminology varies. A good starting point might be the Glossary of Unicode Terms. In their strict terminology the Bengali-Assamese script is a script ("Bengali script") of the abugida type (and not of the alphabet type), and the Bengali writing system as well as the Assamese writing system use the Bengali script. HTH. Love —LiliCharlie (talk) 13:08, 29 August 2020 (UTC)
Does Unicode provide names of scripts or blocks of codes? There is in fact a proposal to change the block to "Bengali-Assamese" ("It may be possible to change the block header name, though the block property values cannot. The most neutral and least disruptive name would be “Bengali-Assamese”. This is an editorial, not a normative, matter." [5]) Nevertheless, the script is already called "Bengali-Assamese" in Salomon (1998) Bengali–Assamese_script#cite_note-1. Chaipau (talk) 13:34, 29 August 2020 (UTC)
Unicode provides a lot. The latest standard has over 1000 pages. And they also host ISO 15924. Love —LiliCharlie (talk) 13:50, 29 August 2020 (UTC)
I don't think Unicode's stability policy allows script or block name changes, not even if the names contain obvious spelling errors, but formal name aliases are allowed. Love —LiliCharlie (talk) 14:07, 29 August 2020 (UTC)
But Unicode does not encode scripts per se, according to their FAQ. For instance, Bengali uses the "danda" defined in the Devanagari block, so does it mean that Bengali encoded in Unicode uses a hybrid Bengali-Devanagari script? Yes we started with Unicode, but we need to move on. Chaipau (talk) 14:24, 29 August 2020 (UTC)
Blocks are handy, but they don't determine script. Characters have character properties, and one value of the script property is Zyyy for "undetermined script" aka "common". (Many scripts share punctuation, numerals, diacritics, etc.) Love —LiliCharlie (talk) 14:35, 29 August 2020 (UTC)
I respect the knowledge and opinions of the editors who joined that discussion of 2006-07 that Chaipau showed, but it just appears to be a case of WP:OR where the editors came up with the term "Bengali-Assamese script" which now seems like a WP:NEOLOGISM as there are some visible efforts to popularize the term. All the relevant sources call it "Bengali script" including the Unicode glossary that LiliCharlie showed and in the context of Rangpuri language, Ethnologue states the writing system of Rangpuri as "Bengali script". The sources use "Bengali script" and "Bengali alphabet" interchangeably, even the article on Bengali alphabet uses "script" numerous times in its description. I think the best way to solve this issue is to rename Bengali-Assamese script to Bengali script. Za-ari-masen (talk) 09:34, 30 August 2020 (UTC)

A script and an alphabet aren't the same, imo. A script is a set of characters, while an alphabet is based on one or more scripts, with the characters assigned certain sound values and other rules. The script we are talking about isn't just used for Assamese and Bengali but also for Maithili, Meitei Manipuri, Kamtapuri, Bishnupriya Manipuri, Sylheti, Hajong, Santali, Chittagonian, etc. And this script is known by many different names. It currently has two Unicode blocks: Bengali and Tirhuta. The Tirhuta block is, as of now, only used for the Maithili language; it's also known as Mithilakshar. And the Bengali block is used for many different languages like Assamese, Bengali, Rangpuri/Kamtapuri, etc. Unicode has three names for the script: Tirhuta script, Bengali script and Assamese script. Since this is one script, though, a unified name should be used. I prefer the name Eastern Nagari. The Siddham script has two descendants: 1) Nagari or Devanagari or Western Nagari and 2) Eastern Nagari. So the term Eastern Nagari is suitable for the script since it's used in the Eastern region. Scripts like Odia and the Nepalese script came from early Eastern Nagari, which emerged in the 13th-14th century. We cannot choose any of the regional names like Bengali or Assamese or Tirhuta, because people from other regions don't accept any specific regional name. For example, if it's renamed as Bengali script, then people from Assam, Bihar, Jharkhand and the Kamtapur region will feel offended. They feel offended and disadvantaged when their script, languages, culture, etc. are mistaken to be Bengali. This leads to hatred among different groups. So this issue will never be solved; people will keep demanding to change the name "Bengali script". So I think it's best not to favour any specific regional term, and we should use a unified term like "Eastern Nagari" for the script. Outside Wikipedia, the unified name "Eastern Nagari" or "Purvinagari" is quite accepted, as far as I've seen. Only a few people opposed this term; it seems that they prefer a term with which their cultural identity is associated. I'm a Bengali and I've many Bengali friends from Bangladesh and West Bengal who have no issues using the term Eastern Nagari. Msasag (talk) 10:41, 30 August 2020 (UTC)

No. When user:SameerKhan suggested the name "Bengali-Assamese" in 2006 it was already prevalent ("Indian Epigraphy", Salomon 1998). It was to accommodate the non-Assamese/Bengali languages that for a time this article was named "Eastern Nagari script". (Manipuri language uses the Bengali and the Assamese for example. - addendum) We know from Brandt 2014 that the academic community rightly prefers "Eastern Nagari script" for the very same reason. [6] Chaipau (talk) 14:24, 30 August 2020 (UTC)
Ethnic pride is not among our five criteria for article titles and we will continue to call Serbo-Croatian by that name in spite of animosities between fervent Serbs and Croats who hate their linguistic varieties to be described as varieties of a common language, and in spite of Bosnians and Montenegrins who hate not to be mentioned. We should be guided by our existing policy and not invent new ad hoc rules to cater for the taste of people who lack scientific objectivity (i.e., maximum distance between observer and the observed). Love —LiliCharlie (talk) 15:30, 30 August 2020 (UTC)
Chaipau, these are just one or two sources where the terms "Bengali-Assamese" or "Eastern Nagari" are mentioned, but there are thousands of sources that describe the script as "Bengali script". Even Brandt himself notes that "Bengali script" is the most common and popular term for this script; hence, it seems to be the most suitable title per WP:COMMONNAME. You should see what LiliCharlie stated above: ethnic pride is not a criterion for choosing article titles. Za-ari-masen (talk) 09:13, 31 August 2020 (UTC)
What applies here is WP:NAMINGCRITERIA, not WP:COMMONNAME. The point is that academics and others have recognized the name "Bengali script" is problematic. You are misquoting Brandt—this is what she says: "In fact, the term 'Eastern Nagari' seems to be the only designation which does not favour one or the other language. However, it is only applied in academic discourses, whereas the name 'Bengali script' dominates the global public sphere." In other words, she (and the academic community) is rejecting the dominant name ("Bengali script") and is preferring quite another name ("Eastern Nagari script").
And the claim to WP:COMMONNAME is a little misleading. For determining the common name, the policy recommends: "In determining which of several alternative names is most frequently used, it is useful to observe the usage of major international organizations, major English-language media outlets, quality encyclopedias, geographic name servers, major scientific bodies, and notable scientific journals." Just a search on the web is not enough. Again, this points back to Brandt's statement preferring "Eastern Nagari script" over the popular "Bengali script".
Chaipau (talk) 11:23, 31 August 2020 (UTC)
Addendum: Using solely WP:COMMONNAME, one should use "Bengali-Assamese script" rather than "Bengali script". This is because there are significant works that mention the script as Assamese or Asamiya script (e.g. in "Indo-Aryan Languages, Cardona") and it improves recognizability. Chaipau (talk) 11:33, 31 August 2020 (UTC)

Za-ari-masen I don't see any reason to consider the "popularity" of a word, rather we should use a name that is acceptable to all (not just an individual). We should also keep in mind the publication dates of those "thousand" sources. Mohsin274 (talk) 10:20, 31 August 2020 (UTC)

For what it is worth, Unicode does call it "Bengali and Assamese"[7]. Chaipau (talk) 12:06, 31 August 2020 (UTC)

"It"? No. Unicode calls the script "Bengali script", and the block starting at U+0980 "Bengali", cf. chapter 12.2 of the current standard which also mentions "Bangla script", "Asamiya", and "Assamese" as synonyms for the script. What you are citing is a page to help users find charts of Unicode blocks rather than scripts. Love —LiliCharlie (talk) 12:45, 31 August 2020 (UTC)
I think we have addressed these issues earlier.
• The block header name will never change in Unicode. It will break too many things and it was designed not to change.
• Blocks encode codes, not scripts (look at the FAQ link I provided above). It says they do not encode scripts, per se. I also gave you an example why every complete sentence used in Bengali Unicode is a hybrid Devanagari-Bengali code.
• Furthermore, look up the answer to the FAQ "Can I determine the script of a character by the character or block name?" The answer: "No, not at all. The character names and block names are not reliable indicators of the script of a character." In other words, the name "Bengali script" may or may not determine the name of the script to which the characters in the block belong. For example, take the letter called "BENGALI LETTER RA WITH LOWER DIAGONAL". This letter does not even exist in the Bengali alphabet, and it is not "RA" but "WO".
In this case at least, we cannot go by Unicode naming conventions.
Chaipau (talk) 14:07, 31 August 2020 (UTC)
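To illustrate the point about character and block names, here is a minimal Python sketch using the standard-library `unicodedata` module. It prints the formal Unicode names of the two characters mentioned in this thread, the danda (U+0964) and the letter at U+09F1. The helper names `block_of` and `describe` are hypothetical, and the two block ranges are hardcoded here for illustration only (Devanagari U+0900–U+097F, Bengali U+0980–U+09FF). Note that `unicodedata` does not expose the Script property, so this only shows names and blocks, not scripts.

```python
import unicodedata

# Hardcoded for illustration: only the two blocks discussed in this thread.
BLOCKS = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
}

def block_of(ch):
    """Return the block name for ch, or 'other' if it is outside BLOCKS."""
    code = ord(ch)
    for name, (start, end) in BLOCKS.items():
        if start <= code <= end:
            return name
    return "other"

def describe(ch):
    """Format the code point, formal character name and block of ch."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} (block: {block_of(ch)})"

# The danda used to end sentences in Bengali text sits in the Devanagari block.
print(describe("\u0964"))
# The letter at U+09F1 carries "BENGALI" in its formal name.
print(describe("\u09F1"))
```

The formal names and block membership are fixed properties of the code points; neither says which writing system actually uses the character.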
I don't know if these will help but you should read these news articles once: [8], [9], [10] [11]. Mohsin274 (talk) 14:19, 31 August 2020 (UTC)
You are aware that the whining Sentinel editorial is utter BS that conflates script with language? –Austronesier (talk) 14:48, 31 August 2020 (UTC)
I don't know. You may or may not be right, but I personally don't have any issues with the Sentinel editorial. I am just showing a few articles about "The London sitting of the International Organization for Standardization... held between June 18 and June 22, 2018." And if you think the article from the Sentinel is unreliable or biased, then you can read the other three, from The Assam Tribune, NE Now, and Indian Express. Mohsin274 (talk) 15:17, 31 August 2020 (UTC)
Don't get me wrong, but we need peer-reviewed scholarly articles rather than newspaper articles by people who seem involved. Love —LiliCharlie (talk) 15:27, 31 August 2020 (UTC)
I agree. Opinion columns are the bane of Wikipedia in many instances. The Indian Express reports are also too opinionated. It would be better to look at the Unicode ad hoc committee report, which is some kind of a peer review of the submission made by the BIS. Here they are:
• The proposal: [12]
• The Ad Hoc Committee report: [13]
• The Working Group Report: [14]
Please note the Recommendation M67.25b from the Working Group (page 5): Change the block header from Bengali to Bengali-Assamese. Obviously, the WG did not accept everything the BIS submitted.
Chaipau (talk) 15:47, 31 August 2020 (UTC)
I never said we should follow Unicode or ISO 15924 practice. What I said was we should be guided by our own five criteria for article titles, and not consider ethnic pride. And I now add: We shouldn't try to settle any political issues. Love —LiliCharlie (talk) 14:46, 31 August 2020 (UTC)
Yes, I agree with you. We should apply WP:NAMINGCRITERIA diligently here. We have seen that the old usage has some problems, and the Unicode, the academics and scholars are moving in a certain direction. We are best off being mindful of that direction. Not doing so is political. We should not overstep them either. This debate has been going on for some time in different talk pages, and I believe the experts in this Linguistic forum are possibly the best equipped to take the nuances into consideration and resolve the issue. Chaipau (talk) 15:28, 31 August 2020 (UTC)
My above example of Serbo-Croat was chosen because issues are involved that led to atrocious wars with massacres and many casualties. I refuse to fuel tensions by taking sides. Neither the Bengali-speaking, nor the Assamese-speaking, nor any other linguistic or ethnic group has the right to demand considerateness that might result in hurting somebody else's feelings. I prefer to remain completely neutral by not agreeing with any of the parties involved. And certainly not with the loudest one. Love —LiliCharlie (talk) 16:04, 31 August 2020 (UTC)
According to user:Za-ari-masen, reliable sources like Ethnologue use "Bengali script" (not "Bengali-Assamese script"). And, according to Ethnologue, they use ISO Standard 15924 for identifying writing systems or scripts (as stated here). Therefore, we are indirectly following ISO 15924. But if ISO renamed "Bengali script" to "Bengali-Assamese script", then we should use the same. Mohsin274 (talk) 15:42, 31 August 2020 (UTC)

## Wikipedia policy on including etymology information?

Hello etymology friends. I often consider adding an etymology section to articles without one, but I'm never sure if that's acceptable. It's not clear to me that there's a consistent threshold, if you will, even for what one would imagine to be the most vetted topics. For example, Tree, Future, and March (music) don't have etymology info, but Animal, History, and March (month) all do. What gives? What is the policy/common practice/tradition for including etymology on a topic's page?

• Does it depend on how notable the page is?
• How important the topic is?
• How "obvious" and/or well-known it is what the etymology is?
• How attested the etymology is?
• Whether the origin is Germanic, Latin/French, or other?
• Whether the word is shared by other languages?
• How abstract the topic is?
• Should etymology be explained in a parenthetical in the lede? Or in its own section?
• Is there a policy at all?

CampWood (talk) 02:09, 30 August 2020 (UTC)

I have no idea if there is any guideline hidden somewhere, but intuitively I find an etymology section helpful if it explains how the concept described by the term developed, as in the case of History. For Tree, etymological information adds little to our understanding of what a tree is, so should be left out per WP:NOTDICTIONARY. –Austronesier (talk) 07:51, 31 August 2020 (UTC)

## Discussion of example number formatting on helpdesk

I'm just gonna leave a link to this discussion about using running numbering schemes for linguistic examples. The idea would be basically to have Wikimarkup support something like the LaTeX/linguex "\label" and "\ref" system. Botterweg14 (talk) 12:42, 31 August 2020 (UTC)

There is an ongoing discussion about Late Greek in Talk:Late Greek -- is it a "period" of Greek? is it a "register"? should it have a standalone article, or be part of some other article? Kindly help us out! --Macrakis (talk) 17:05, 31 August 2020 (UTC)

## Is Ethnologue reliable for the Kamta group of languages?

The Ethnologue seems to give classifications and names in a very different system, at variance with accepted knowledge and recent findings. Here are some examples:

1. Ethnologue calls Rangpuri language a language [15], whereas Masica 1991, p. 25 calls it Rajbangsi ("Thus the Rajbangsi dialect of the Rangpur District (Bangladesh), and the adjacent Indian Districts of Jalpaiguri and Cooch Behar, has been classed with Bengali because its speakers identify with the Bengali culture and literary language, although it is linguistically closer to Assamese.")
2. Ethnologue, on the other hand, calls Rajbangsi a different language from Nepal [16].
3. Ethnologue calls Kamtapuri an alternative name for Rangpuri [17], whereas Toulmin (PhD 2006) finds "However, with a sizeable number of speakers now located within a different country to Rangpur, and lacking any special historical reason for choosing Rangpuri over Kamta, it is unlikely that this term will catch on further afield."

It seems Ethnologue is at complete variance with linguists and their findings and reports.

Could we then consider Ethnologue, at least for these entries, reliable?

Chaipau (talk) 17:39, 1 September 2020 (UTC)

The Ethnologue is a tertiary source because it is a compendium of other secondary sources (which in turn rely on primary sources). As such, it may be helpful, but proper secondary sources should be preferred. For our policy regarding primary, secondary and tertiary sources, see WP:PSTS.
Regarding the Ethnologue, I do not know about the Kamta group of languages, but I know cases where the Ethnologue does not reflect the best consensus in Linguistics, namely when it comes to the differentiation between Western Upper German varieties (which is what I know about the most), which includes entries such as “Swiss German” (not a linguistic division, but rather a cultural or national one), “Walser” (various Highest Alemannic German varieties, but not the only ones), but scandalously lacks Alsatian.
I think it is problematic that the ISO has basically copied the Ethnologue classifications. Of course, a hard classification scheme like ISO 639-3 is a necessity for computers, and it has many benefits. However, it obscures the inherent fuzziness of linguistic classifications and perpetuates one classification system, in this case the Ethnologue’s. Also, the Ethnologue now has a hard paywall. --mach 🙈🙉🙊 18:54, 1 September 2020 (UTC)
"I think it is problematic that the ISO has basically copied the Ethnologue classifications."
It's the other way round, see Ethnologue's The Problem of Language Identification page where it says: "Since the fifteenth edition (2005), Ethnologue has followed the ISO 639-3 inventory of identified languages (http://iso639-3.sil.org/) as the basis for our listing of distinct languages." (A more direct link to the language identification policy of ISO 639-3 is https://iso639-3.sil.org/about/scope. See articles SIL International, Ethnologue, and ISO 639-3 for the relationship between Ethnologue and ISO 639-3.) Love —LiliCharlie (talk) 19:39, 1 September 2020 (UTC)
P.S. The starting point to request an ISO 639-3 entry for Alsatian is their Introduction to the Code Change Process page. Love —LiliCharlie (talk) 19:57, 1 September 2020 (UTC)
Oh-oh, shows that it’s better to research first and rant later. Thanks for the corrections. --mach 🙈🙉🙊 22:10, 1 September 2020 (UTC)
Yes. In the Indo-Aryan context, where "The speech of each village differs slightly from the next, without loss of mutual intelligibility, all the way from Assam to Afghanistan.", Masica 1991 p.21 has a very comprehensive description of the language/dialect problem. This is a much bigger problem that cannot be adequately captured by the mutually exclusive categories of Ethnologue. Chaipau (talk) 10:04, 2 September 2020 (UTC)
This is typical of dialect continua, of course, and by no means restricted to Indo-Aryan. Mach's Western Upper German example within the Continental West Germanic continuum is of the same kind. A language is a dialect with an army and navy. Love —LiliCharlie (talk) 10:30, 2 September 2020 (UTC)

LiliCharlie, J. 'mach' wust, could an unpublished thesis be considered a reliable source over Ethnologue? Za-ari-masen (talk) 09:26, 2 September 2020 (UTC)

Sources are required to be verifiable, and our verifiability policy rules that "content is determined by previously published information". Love —LiliCharlie (talk) 09:43, 2 September 2020 (UTC)
What do you mean by "unpublished"? If you're talking about a PhD thesis that has been submitted and accepted then it counts as published (WP:SCHOLARSHIP). Nardog (talk) 10:05, 2 September 2020 (UTC)
• The current setup on Ethnologue for Rajbanshi dates to 2008, and like other recent changes it's got a paper trail that you can follow [18] (you will recognise the name of Toulmin somewhere in there). My experience with similar code changes in this part of the world is that they're usually based on the results of a sociolinguistic survey. Of course, conclusions could be different if other methods were used, and even the same sociolinguistic data is often open to different interpretations. Also, a recent survey can paint a different picture from the one gleaned from a three-decades-old reference text. – Uanfala (talk) 10:30, 2 September 2020 (UTC)
Yes, I agree with user:Nardog on the general principle. Furthermore, the PhD thesis in question, Toulmin 2006, is open-access published by the University: [19]. Therefore, it satisfies WP:V too, as required by user:LiliCharlie. Chaipau (talk) 10:40, 2 September 2020 (UTC)
So it looks like Toulmin himself was part of the team at Ethnologue to create the database for Rangpuri, so shouldn't we follow Ethnologue over Toulmin's earlier thesis? Za-ari-masen (talk) 11:09, 2 September 2020 (UTC)

## Feedback requested at Portuguese vocabulary

Your feedback would be appreciated at Talk:Portuguese vocabulary#Examples and article title. Thanks, Mathglot (talk) 01:35, 8 September 2020 (UTC)