Wikipedia talk:WikiProject Languages/Archive 11

basic language article in need of review

National language is a rather basic article for a lot of what we write, but has been listed as needing attention since 2009. According to the intro, a "national language" is any variety of speech associated with an ethnicity, which is a bit useless as a definition. We should clean it up if fixable, or state the phrase has no meaning if it's not. — kwami (talk) 23:36, 17 January 2015 (UTC)

Focus in Spanish

I find that the articles on grammar are so devoid of any linguistics that it is revolting. Instead they all rely on traditional ideas about grammar and in Spanish also shared with RAE. Like the Spanish preterite but why not call it aorist but in Greek it is aorist not preterite, so that instead of making effective and quick the teaching of Spanish they just stick to old ways. Most important however is that which I did not say which is FOCUS since just as they did in the article for Tuareg a Berber language which is very related to Spanish and other Southern European languages, as many other Afro-Asiatic languages, I would also like to see talk about focus in Spanish which literally dominates everything that is said much like in English when we start a sentence with the object we make a passive construction yet Spanish has its own mechanism whereby to accomplish such object focus. Nevertheless I found no such thing. What's worse is this whole coger thing where supposedly it is to "have relations", when in reality there are many dialectal words as that is a very colloquial word in any language unlike in English where it is always the f word, but in reality the old word for it in Spanish is "joder" which comes straight from the Latin futuere just as vulgar although standardly it can mean to bother since such puritanical self-censorship is natural in language so that there is no standard word for the private parts and sexuality outside of medical formalities. Therefore I find many articles superficially scholarly but indeed devoid of examination of important typological and syntactical features which are more important than some traditional morphological prescriptivism. Furthermore the dialectal differences between Latin American dialects are completely neglected because Spanish is not just Argentinian voseo, Spaniard vosotros, and Latin American seseo.
Indeed there is hardly anything similar between Argentinian (standard voseo) and Dominican (extreme coda deletion) and Mexican (reduced vowels) and Antioquia Colombian (apico-alveolar sibilant) and areas affected by an Indian language substratum or by foreign intonation or isochrony or even the fact some dialects find weird that others extremely use usted yet others that others would even use tú. I concludingly find these Spanish articles representative of a fake Spanish being a fusion of different characteristics and standard codified Spanish ideas about grammar frequently neglected all over Latin America, because in reality what happens in Latin America and even more in Spain is not whatsoever far from the Arabic situation and to speak standard Spanish is as much a matter of much practice and little success as it is to use the Standard for Arabic speakers. Thank you very much. — Preceding unsigned comment added by 98.254.198.111 (talk) 02:30, 13 February 2015 (UTC)

Language examples in new Palatalization articles

Palatalization was recently split into Palatalization (phonetics) and Palatalization (sound change).

The phonetics article is the place where palatalization as a phonemic feature is described. If you know a language with this feature, please add a section for its language family, and a section on the language, under the Examples section. Adding a few examples of minimal pairs, or notes on typologically unusual features relating to the palatalized phonemes, would be good too.

The sound change article is where palatalization as a historical phonological change is covered. This includes Romance and Slavic historical palatalizations, phonemic splits relating to development of palatal or palatalized consonants, development of alveolar and postalveolar affricates and fricatives, vowel fronting and raising, etc. Examples of palatalization sound changes can be placed in the Examples section. A brief statement of when and where the sound change applied, with a few examples, would be great.

The two must be distinguished: examples of languages with palatalized and unpalatalized phonemes belong in the phonetics article; all examples of historical sound changes, even ones resulting in palatalized phonemes, belong in the sound change article. — Eru·tuon 22:14, 14 February 2015 (UTC)

Punic language, Tunisian Arabic

Edit-war by editor who believes that Punic is spoken by 50 million people in the Maghreb. Doesn't appear to understand what a substratum is. (Unless his sources actually say what he thinks they do, in which case we'll need to address it per RS.) — kwami (talk) 19:42, 16 February 2015 (UTC)

"Altaic ?" in the Infobox?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


(Note: This is a centralization and reinitiation of an RfC that has occurred in different forms elsewhere) In the Infoboxes for Turkic languages, Mongolic languages, Tungusic languages, Japonic languages, and Koreanic languages and most, if not all, of their daughters, assigning the color "Altaic (areal)" to the infobox automatically fills in the text "Altaic ?" in the "family" line of the classification at the top node unless it is specifically overridden by inserting another value into the "family" slot. This should be eliminated since Altaic has been generally discredited among historical linguists and only a small minority of linguists still cling to it. --Taivo (talk) 08:42, 7 February 2015 (UTC)

  • Support. "Altaic", as a genetic unit, has virtually disappeared from serious historical linguistic consideration. Only a decreasingly small number of linguists still cling to it as to a floating deck chair from the Titanic. "Altaic" should be described in the history of classification sections, of course, but should no longer occur (automatically or otherwise) in the infoboxes of the constituent clades or the daughter languages. --Taivo (talk) 08:42, 7 February 2015 (UTC)
  • Support - Colouring should be Language Isolate with linguistic classification being something along the lines of: Language Isolate (controversially classified as Altaic) or Language Isolate/Altaic (?) Luxure Σ 07:30, 12 February 2015 (UTC)
  • Support - In my understanding, the term Altaic, whether it has many or only a few supporters, has not achieved a wide consensus among linguists. As much as possible, the infobox should only include uncontroversial material. I know that this is not always possible, but I'm sure that classifying a language as Japonic or Mongolic is certainly uncontroversial, whereas Altaic is not. Landroving Linguist (talk) 11:19, 9 February 2015 (UTC)
  • Comment: "Language isolate (generally accepted)" is misleading; it's not that the majority of modern linguists are actively proposing that, say, Basque, Sumerian, Korean (minus Jeju), or Haida has literally no relatives (or even no living ones), but that they're refusing to take a position in any direction owing to lack of evidence. I think "Altaic (controversial)" or the like would be more appropriate. Tezero (talk) 15:04, 9 February 2015 (UTC)
Completely untrue. Why do you make statements like this? HammerFilmFan (talk) 14:13, 15 February 2015 (UTC)
Changed. Luxure Σ 07:30, 12 February 2015 (UTC)
  • Support - per Landroving Linguist. ミーラー強斗武 (StG88ぬ会話) 17:23, 9 February 2015 (UTC)
  • Support Controversial classifications should not be in the infobox. It is not the case that "linguists refusing to take a position" somehow lends credibility to the hypothesis; that is a misunderstanding of how classification works. The default position is "isolate" until any relations have been satisfactorily demonstrated. User:Maunus ·ʍaunus·snunɐw· 20:43, 9 February 2015 (UTC)
  • Support - this needed to be done some time ago. HammerFilmFan (talk) 14:13, 15 February 2015 (UTC)
  • I recognize that; I'm merely arguing that "generally accepted" is misleading given what "isolate" means. It'd be like saying agnostic atheism is the religious position the majority of some subset of philosophers or scientists believe in - even if that's the most common response to a self-identification survey, it's not really "believing in" anything. If the concept is to be included in the infobox with no mention of Altaic, but Altaic is considered to be major enough not to label the proposed member only an isolate, I'd prefer something like "No demonstrable relationship to other languages", which I think Mayan languages used last I checked. Tezero (talk) 21:43, 10 February 2015 (UTC)
  • Support per Maunus. As a rule, only well-accepted classifications should be in the infobox. —Granger (talk · contribs) 21:30, 9 February 2015 (UTC)
  • Comment: For the record, the Altaic languages' article lists noticeably more supporters of the theory than opponents. If it's as fringe an idea as you people are saying, you ought to amend that in the interest of due weight. Tezero (talk) 21:43, 10 February 2015 (UTC)
Most of those Altaic supporters are either dead or have abandoned Altaic. --Taivo (talk) 01:00, 11 February 2015 (UTC)
Yes, your assertions are not based on any facts, and that makes me somewhat suspicious of your motives here, in spite of AGF. HammerFilmFan (talk) 14:11, 15 February 2015 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

English language article GA cleanup

Hi, everyone,

I'm writing here to let project participants know that I'd like to work collaboratively with other Wikipedians to bring the article English language, a very high-page-view article, up to good article status and beyond to featured article status. The previous (failed) good article review from January 2009 and the helpful peer review from September 2012 agree on many points of improvement needed in the article. This project's own template for articles on spoken languages is also a good resource for restructuring the English language article to mention the most important issues about English as one language among many. What I have been doing for the last month or so is exhaustively going through every reference now cited in the article, reference by reference, to verify the references, complete the bibliographic description of each reference, and to collect all the references gradually into a bibliography at the end of the article, with a new inline citation format to ease verifying controversial statements and finding examples. I have done very little rewriting of the article text as yet, but that will eventually have to be done by someone to bring the article to GA status. The previous comments point out that the article lists a lot of miscellaneous facts without unifying them through clear prose. I invite you to suggest current, reliable sources for the article, to query dubious statements in the article, and to discuss on the article talk page what restructuring of the article will place due emphasis on the various aspects of the English language. And, yes, feel free to fix problems as sources identify problems to fix. Please let me know what I can do to help. -- WeijiBaikeBianji (talk, how I edit) 16:25, 16 February 2015 (UTC)

I'm interested in phonology, so I will be editing the Pronunciation section to make it more readable, accurate, and so on.
One endemic problem is that some traditional vowel symbols are no longer phonetically accurate. /ʊ/, for instance, is pronounced as lowered and centralized to [ɵ] in standard US pronunciation. This is problematic, because the actual vowel [ʊ] occurs in other languages, like German and Hindi, and use of one symbol for both is misleading. There's a certain amount of wiggle-room in phonetic transcription, but the difference between German Stunde /ˈʃtʊndə/ and English could /kʊd/ is great enough that it should be noted in the transcription. Not sure if this observation is supported by reliable sources. — Eru·tuon 06:06, 21 February 2015 (UTC)

Smallcaps

I have initiated a discussion regarding the use of small caps, which some editors consider to be deprecated by the MOS in general but which are of course necessary for writing interlinear glosses in language and linguistics articles. User:Maunus ·ʍaunus·snunɐw· 19:55, 16 February 2015 (UTC)

Here is the RfC about the issue: Wikipedia_talk:Manual_of_Style/Capital_letters#RfC:_Proposed_exceptions_to_general_deprecation_of_Allcaps Your input will be valued. User:Maunus ·ʍaunus·snunɐw· 21:16, 16 February 2015 (UTC)
It's actually a muddled RFC mostly about two unrelated all-caps style issues, unrelated to linguistic matters. The linguistic issue should be a separate proposal.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:20, 20 February 2015 (UTC)
No, it is about three unrelated allcaps issues. The question of using small caps in author names in references is however also related to linguistics, since the Linguistic Society of America style guide uses this. User:Maunus ·ʍaunus·snunɐw· 22:40, 20 February 2015 (UTC)
Irrelevant; WP does have its own citation styles, and does not use those of the LSA or other organizations.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  02:47, 25 February 2015 (UTC)
That is wrong. WP does not have its own citation styles and allows the use of all citation styles. Please read the actual policy WP:CITEVAR.

Conservation Status

Hello fellow linguists. I wanted to propose a change to the language infobox that adds a section concerning language conservation and vitality. I was hoping we could throw the idea around of making a language conservation template or diagram of some kind that could be used on each language's page in the infobox to help the reader visualize the vitality of the language, similar to the endangered species one. Each language in the world has a status and vitality, as do species.

Here is my current idea about language conservation statuses we could have in the infobox (I am always open to discussion and other ideas):

  • Global/Least Concern (for international or national languages used for wider communication like English and French)
  • Developing/Vigorous/near Thriving (for smaller languages that still show considerable vitality; children are still learning them but they are not yet widespread and may have limited official/regional status)
  • Vulnerable/threatened (the language is spoken by all generations but is a minority language, and its use may be restricted to certain domains; perhaps the language community needs some kind of conservation to maintain their language)
  • Shifting/moribund (the language is no longer spoken/acquired by children as a first language, but is in use among the parent generation and older who could theoretically turn around and start speaking the language to their children)
  • nearly extinct (only a few elders remain)
  • dormant/dead (no known living first language speakers, but perhaps revitalization attempts)
  • extinct (completely gone)

Here is the endangered species diagram, which I was hoping the language status diagram might look like:  

However, the problem with language conservation status is that there is no single concrete source as there is for defining the conservation status of species (the Red List of Endangered Species). UNESCO can be reliable for language conservation status, but it appears to struggle with original research and has trouble differentiating between a language and a dialect. Ethnologue by SIL International is generally reliable and does provide a status for each language, but it is a missionary resource and is thus biased; some of the data is manipulated and linguists do not agree on its accuracy. I was thinking that the Catalogue of Endangered Languages by the University of Michigan and the University of Hawaii looks reliable, and defines the various degrees of language endangerment/vitality well, but I'd like to hear everyone's ideas. We need a source that linguists agree is generally reliable to prevent potential edit warring between users nitpicking various sources.

--user:Neddy1234

  • I think that in principle it is a good idea, and that the template should certainly support it if it doesn't already. I am not sure I would want to make it a requirement, however. Sometimes the status is controversial (for example calling a language dead when revival efforts are ongoing), or sometimes the authoritative sources are wrong (I have myself brought a "dead language" back to life by writing to the Ethnologue to tell them that I found speakers of a variety they listed as extinct). I also definitely think that we should not tie ourselves to one single source, but use whichever source is best for a given language and use editorial discretion to do so. But having the option and making it the standard is a good idea. User:Maunus ·ʍaunus·snunɐw· 21:32, 19 December 2014 (UTC)
    • Having an infobox display discrete levels of language endangerment strikes me as unsupportable OR. Even Ethnologue doesn't divide all languages into seven neat categories, and neither (AFAIK) does anyone else. Therefore, neither should we. —Aɴɢʀ (talk) 23:44, 19 December 2014 (UTC)
It is only OR if we require it in absence of sources. There are several resources that are attempting to make global endangerment indexes that we could use; Ethnologue is only one of them. UNESCO is one, and the Google-based endangered language project is another. No reason we couldn't use those categorizations in infoboxes. User:Maunus ·ʍaunus·snunɐw· 23:51, 19 December 2014 (UTC)
  • Thank you guys for all taking my idea into account! I'm very thankful for your feedback! As is the case of individual species whose conservation status is data deficient, we need not include such a template/diagram on every language article if there is nothing known. My vision is that if we have enough data from reliable sources like Ethnologue, the google-based endangered languages database, and UNESCO, we can make an accurate (and non-Original) conservation status evaluation. In describing the categories, I have incorporated the various guidelines for language health and vitality from Ethnologue, UNESCO, and Google's endangered language project. And, by the way, Aɴɢʀ, Ethnologue ABSOLUTELY DOES put languages in neat little categories concerning their vitality: http://www.ethnologue.com/about/language-status. So why should we not if we are using these reliable sources to make status evaluations based solely on the information found there? There is no Original research involved.
--user:Neddy1234

I see nothing wrong with it as long as the data is from a RS. What do ppl think of the Catalogue of Endangered Languages? I would oppose using Ethnologue, as SIL admits that the goal of their assessment is to make the lang look as vital as possible in order to facilitate funding for translating scripture, which you're not going to get if the lang is dying. The result is that non-missionary linguists have been denied funding for documenting endangered languages when the funding agency checks Ethn., which says the lang is not endangered. There are many more endangered langs in Africa than you'd understand from Ethn., compared to other continents, for example, with the result that Africa would be under-funded if ppl relied on Ethn. for funding decisions. Maybe our ref'ing some other source would help remedy that.

As for revitalization, IMO we should have a category for that. But even if ppl want to deny it, once a lang is gone, it's gone. If you are able to bring s.t. back, it won't be the same language. That's even the case for Modern Hebrew, which is arguably relexified Slavic rather than Semitic. And few revitalization efforts actually change ppl's native language like that. — kwami (talk) 17:39, 20 December 2014 (UTC)

Kwami, do you have a reliable source for your claim that SIL admits cooking the vitality books to facilitate better funding of its activities? I guess you don't, and it is plainly not true. On the contrary, you could claim that many linguists classify a language as highly endangered in order to get access to grants from Rausing or other organizations who have an interest in endangered languages only. Just recently I was reading in an MA thesis about a language that it is "on the brink of extinction", when the writer knows as well as I do that it is actually quite vital, and still generally being passed on to the next generation. It was written that way because the thesis was part of a language documentation program, and they always should have endangered languages as their subject. Therefore, if there is any possible bias about language endangerment status, I would expect it rather from that side. SIL is not going to invest resources into developing a language that is dying, if they know it is dying. This is not to say that all of Ethnologue's vitality assessments are correct (I know they are not), but that there is no motivation to tamper with the status of a language. Landroving Linguist (talk) 08:47, 25 December 2014 (UTC)
I must admit that that is one accusation against the SIL that I also haven't heard myself. And I have heard and read a lot. As I say, I have myself had to correct languages listed as extinct that are in fact alive. User:Maunus ·ʍaunus·snunɐw· 18:50, 25 December 2014 (UTC)
That's what SIL said when I asked them, after I'd heard complaints. I am aware that there is a lot of exaggeration in the other direction too. These are largely subjective categories, so there could be bias in either direction even without intentional misrepresentation. And there's the anger we'd provoke by saying a language is extinct if there are attempts at reviving it. I guess we'd need to decide which POV we wish to represent, if we're going to add this category. — kwami (talk) 19:36, 25 December 2014 (UTC)
I don't think it is entirely subjective - for most languages there is hard and fast data, such as whether the language is used in education, how many monolingual speakers there are, whether parents pass it on to their children, or whether there is any institutional effort for language development. In the case I mentioned above, the language in question is doing well according to three of these criteria, and any claim that the language is seriously endangered can be easily debunked, based on published sources. I agree that this kind of data may not be available for all languages, and then the situation may be more difficult. In any case, just like the similarly troublesome question about speaker numbers, maybe we can agree here to refer to the best published sources available, and only if nothing else is available default to sources like the Ethnologue. For reasons mentioned above, published sources may be in disagreement, and then this should just be mentioned. Landroving Linguist (talk) 21:48, 25 December 2014 (UTC)
I think the key is not to make the information obligatory but to exclude it on editorial discretion if there is reason to doubt the validity of the extant sources. User:Maunus ·ʍaunus·snunɐw· 22:02, 25 December 2014 (UTC)
I think that it's messy, subjective, and guaranteed to cause disagreements, but worth it, for the most part. Except for the absolute no-brainers like English or Sumerian, there should always be a link to the section of the article discussing the issue, and there should be a way of marking statuses as disputed and/or unknown (unknown as in "no reliable source has that information", as opposed to "no one at Wikipedia has checked"). It may even be worth it to have multiple statuses possible (with annotation) in cases where there are differences between reliable sources. At any rate, it should always be made clear that it's only a simplified graphical representation of potentially very complex and disputable facts. Chuck Entz (talk) 23:34, 25 December 2014 (UTC)
Oddly enough, as a member of SIL for 35 years, and actively interacting with Ethnologue since they started including EGIDS ratings, I've never heard anyone in SIL suggest that we should or do bias an evaluation of a language's vitality upward so as to make it easier to justify funding for work in that language. Now, I'm not saying that kwami is wrong; I trust him that someone in SIL did actually say that to him. But, given that my experience within the organization is so different, I would guess that the opinion expressed is not widely held, and it certainly is not a matter of policy. AlbertBickford (talk) 23:07, 23 January 2015 (UTC)
One other factor I just thought of, and that is the confusion that can arise over the term "endangered". According to one definition, a language is endangered if there is likelihood that it may disappear within the next century. By that definition, a language can be endangered even when it is still being transmitted to all children. The EGIDS scale used in Ethnologue attempts to rate current level of vitality, rather than "endangerment" in this sense. Other uses of "endangered" that I've seen are more along the lines of languages that are beginning to fade away--where children are no longer learning the language. So, when people use the term in different ways, there is great potential for misunderstanding--especially when money is involved, such as getting funding for research. AlbertBickford (talk) 23:15, 23 January 2015 (UTC)
The word can indeed be ambiguous. I've seen linguists get funding for "endangered" languages that to me seem quite robust, to an extent that many communities could only dream of. That would be the opposite bias to the one I mentioned.
BTW, in Ethn.18, Lyons SL is described as 6a "vigorous", but then there's a note saying that a survey is needed to determine if it's still spoken. Just a heads-up on the problem with copying categories blindly. — kwami (talk) 18:39, 5 March 2015 (UTC)

RfC: The MoS and the generic he

A conversation about the Wikipedia Manual of Style's stance on the generic he and gender-neutral language that started on this talk page has progressed to two RfCs at the village pump. Further opinions are welcome. Darkfrog24 (talk) 18:57, 5 March 2015 (UTC)

Hey, thanks for posting. I've copied your note on the WikiProject Linguistics talk page, since users there will also be interested. — Eru·tuon 00:38, 7 March 2015 (UTC)

Khowar language or Chitrali?

There seems to be a campaign by some editors, mostly IPs, who are trying to say that this language does not exist, or disputing its name, see here and here.

I'm not a student of language, so it's confusing for me (it appears to me that Chitrali is the language of the Khowar people? However, the page is called Khowar language but uses both names in the text and Chitrali in the lead?)

See this edit here (copy-pasting content) and later here at Khowar, a redirect that had the text from Khowar language pasted into it, for example. See also edits at Languages of Chitral and Chitrali language. More information/discussion at Talk:Khowar language#Vandalism. 220 of Borg 02:41, 7 March 2015 (UTC)

I've reverted the last year's edits to the lead. It's not just the name: the population was falsified, as at various times were the refs.
There are several languages in Chitral. Khowar is commonly called "Chitrali" because it is the most populous, but the term can be ambiguous. The ISO name is the more precise "Khowar".
Thanks for catching this. — kwami (talk) 04:06, 7 March 2015 (UTC)
Ah, very good. I came to the right noticeboard then! Just a little POV IP editing on this topic. :-/ - 220 of Borg 06:29, 7 March 2015 (UTC)

Ethnologue 18 is out

I've updated the language info box, so now citations of E17 put the article in Category:Language articles citing Ethnologue 17. There will soon be thousands of articles in that cat that need to be updated. We might want to start with those in Category:Language articles with old Ethnologue 17 speaker data, which should fill up in coming days. I've started with the oldest pop dates (< 1983), but anyone who wants to help would be appreciated. (If Ethnologue does not provide a date for its figure, and the figure has not changed since E17, then please leave it as E17, and add a comment that it shouldn't be changed.) — kwami (talk) 21:09, 27 February 2015 (UTC)

(I think most updates were in Eurasia and sign languages, while most of our old data is from outside Eurasia. — kwami (talk) 21:40, 27 February 2015 (UTC))

I've added one update that I knew about because it was me who suggested it. User:Maunus ·ʍaunus·snunɐw· 22:49, 27 February 2015 (UTC)
I moved some links to match what we had elsewhere. If that was wrong, we'll need to correct the statements supporting it. Would be nice if you could create a stub for Chiapas Nahuatl as well, since all I'm working on right now are the new(ish) editions of Glottolog & Ethnologue. — kwami (talk) 00:32, 28 February 2015 (UTC)
Also, if you could ID the remaining INALI names at Nahuan languages#List of Nahuatl dialects recognized by the Mexican government and Wikipedia:WikiProject Languages/INALI names for Mexican languages, that would be wonderful. — kwami (talk) 00:39, 28 February 2015 (UTC)
Yeah, unfortunately Glottolog is weirdly listing Tabasco as part of Isthmus, but really it is not. It does not in fact share any of the main innovations characteristic of the Isthmus dialects; it is closer to Pipil. I will try to make an article on Chiapas Nahuatl as well. User:Maunus ·ʍaunus·snunɐw· 01:06, 28 February 2015 (UTC)

I've been recruited to this job, and I will do some work on it. I'd like some input on something. I updated Abkhaz language; the number didn't change, but appears to have three significant figures rather than two. I might be making a mistake (I'm not exactly a math major), so could one of you glance at it and let me know? — Eru·tuon 03:52, 4 March 2015 (UTC)

Yes, the 101,000 + 4,000 + whatever makes up the remaining 7,740 would be to the nearest 1000 and so 3 figs. But consider that the 4,000 is not from the citation date of 1993, but from 1980, and that we have no idea how old the data adding up to the other 7,740 is. Also, the published Turkish pop. might have been, maybe, a range of 3–5k, and Ethn. just reported the mid-point. (They do that a lot. Old editions of Ethn. are often more reliable in this regard than recent editions.) So, yes, just following the math, it would be 3 sig figs, but I seriously doubt that the data is really that reliable. I generally don't like to report anything greater than 2 sig figs, though other editors might disagree with me. Never more than 3 sigfigs, though: that would be greater than 1% accuracy, and population data is hardly ever going to be that accurate. — kwami (talk) 02:44, 5 March 2015 (UTC)
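For anyone who wants to check the rounding in the comment above, here is a minimal Python sketch of rounding a speaker count to a given number of significant figures. The function name `round_sig` is hypothetical, and the total of 112,740 is an assumption obtained simply by adding the three components quoted above (101,000 + 4,000 + 7,740); it is not a figure taken from Ethnologue directly.

```python
def round_sig(n: int, sig: int) -> int:
    """Round a non-negative integer count to `sig` significant figures (half rounds up)."""
    if sig < 1:
        raise ValueError("sig must be at least 1")
    digits = len(str(n))
    if n == 0 or digits <= sig:
        return n  # already has no more than `sig` digits
    factor = 10 ** (digits - sig)        # size of the last kept digit's place
    quotient, remainder = divmod(n, factor)
    if 2 * remainder >= factor:          # round half up using integer arithmetic
        quotient += 1
    return quotient * factor


# Assumed total: the sum of the three components mentioned in the comment.
total = 101_000 + 4_000 + 7_740

print(round_sig(total, 3))  # three significant figures
print(round_sig(total, 2))  # two significant figures
```

Integer arithmetic is used here only to sidestep floating-point rounding surprises; whether two or three figures is the right choice for a given language remains the editorial question discussed in this thread.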
Ethnologue figures for speakers of native languages in the US are still woefully inflated and out of date. Are other published sources acceptable besides Ethnologue? --Vihelik (talk) 20:40, 4 March 2015 (UTC)
Lots of other sources are acceptable. Just follow WP:RS. Ethnologue isn't exactly a RS, really, but it's more complete than anything else, and heads off edit-wars by POV editors cherry-picking sources to inflate the population of their favorite language. But if you can find something that covers the whole US, that would prevent concerns about cherry-picking. You might want to ask a specialist like @Taivo: for the most up-to-date sources. — kwami (talk) 02:44, 5 March 2015 (UTC)

Thanks to @Abrahamic Faiths: and @Miniapolis: on helping with the drudge work.

See also Category:ISO language articles citing sources other than Ethnologue. Some of these could be updated to E18. For most of the top 100 languages of the world, we cite the Swedish national encyclopedia rather than Ethn. Also, a number of langs (esp. in Ethiopia and Canada) are cited directly to the census that Ethn. uses. We might as well leave those alone. But some others might be old, or cherry-picked to maximize the population estimate. — kwami (talk) 23:43, 6 March 2015 (UTC)

UPDATE: 1,400 articles have been updated, including all the ones with old population figures. Someone started updating all the Caucasian languages; that may be an approach for those of you interested in a particular family or region. — kwami (talk) 04:28, 11 March 2015 (UTC)

Extinction dates needed

For those of you interested in ancient or extinct languages, the lang box now generates two new tracking categories: Category:Language articles with unknown extinction date and Category:Language articles with unreferenced extinction date. Many articles in the latter are ref'd in the text, just not in the box, but many have no ref at all. The first isn't actually unknown (sorry for the poor choice of wording), but just where we haven't yet found a date. — kwami (talk) 23:32, 12 March 2015 (UTC)

Is the extinction date of a language defined as the year of the death of the last native speaker?
Wavelength (talk) 23:38, 12 March 2015 (UTC)
For most of them presumably it will be the date of last documentation. The fetishization of last native speakers is pretty much only a North American phenomenon.·maunus · snunɐɯ· 23:41, 12 March 2015 (UTC)
For ancient or historical languages, we have an "era" field that may be more informative than "extinct". And in many cases we can only say "mid-20th century", "some time before 1931", etc. If all we have to go on is a few documents, then we can use their dates. (Should that be under 'era' or 'extinct'?) But if we have the date the last native speaker died, that would be good to include. — kwami (talk) 00:12, 13 March 2015 (UTC)
  • A little weird to list Early Modern English etc as extinct languages.·maunus · snunɐɯ· 23:40, 12 March 2015 (UTC)
It doesn't have an "extinct" field, but an "era" field, and the dates in that field are unreferenced. I lumped in historical languages for two reasons: We already have plenty of tracking categories, and many older articles use the "extinct" field rather than the newer "era" field anyway. Feel free to change the names of the categories if you like. I didn't put much thought into them, since few readers are ever going to see them. I suppose we could create a separate cat for "unreferenced era", which might help us review where we should change the box from "extinct" to "era". — kwami (talk) 00:12, 13 March 2015 (UTC)

Some of these articles link to Linguist List for the ISO code description. If there's a date there, you can ref it by entering "linglist" in the ref field. (Can do s.t. similar with AIATSIS for Australian langs.) Also, the 'unknown date' cat is only populated if there is no ref. If the ref is set to e17 or e18 (in some cases where Ethn. does not give a date), then we won't see it. Should those articles be included? Maybe as a subcat? — kwami (talk) 00:24, 13 March 2015 (UTC)

There are Wikipedia wikis in Old English (ang) and Latin (la). DMOZ has links to web pages in Latin. In a sense, those two languages have current documentation. See also "Revival of the Hebrew language". How are extinction date criteria applied to those three languages?
Wavelength (talk) 02:56, 13 March 2015 (UTC)
Liturgical languages are going to have additional dates of L2 use, but that should be kept distinct from L1 use, as we do for living languages. With Hebrew, you have two periods of L1 use, if you accept that they're the same language. Old English is more straightforward. AFAIK, there's no significant modern usage. — kwami (talk) 03:52, 13 March 2015 (UTC)

Deleted Puntland Arabic

This article may have been incompetent rather than a hoax, in case anyone wants to rescue it. The author seems to be invested, but the info is either fake or unref'd. I turned it into a redirect. — kwami (talk) 20:53, 17 March 2015 (UTC)

Splits of IPA help pages

Several splits of IPA help pages are being discussed or are in progress.

Also, a question is unresolved: whether Ecclesiastical Latin has four mid vowels, as in Italian, or only two. I think it must only have two, since the difference is not marked in spelling. To comment on this, head over to Wikipedia talk:WikiProject Latin § Pronunciation of Ecclesiastical Latin.

It's helpful to have classical and modern Latin and Greek side by side, so that readers can compare them. — kwami (talk) 17:51, 20 March 2015 (UTC)

"Revival" field in language infobox

If you enter a value for "revived" in the info box, it will now produce a "revival" field. It can be used in conjunction with "speakers" for revitalization efforts of endangered or moribund languages that still have L1 speakers, and with "extinct" for reconstruction or revival of extinct languages. I'm hoping this will encourage greater description of these efforts, as well as take some of the sting out of reporting a language is extinct when the community is trying to maintain it. — kwami (talk) 02:07, 18 March 2015 (UTC)

I think this is a great idea, thanks for implementing it.·maunus · snunɐɯ· 22:35, 20 March 2015 (UTC)

Sanskrit article

The illustration of Devanagari as used for writing Sanskrit has associated info/text that looks like this: <<

 
"My name is 'incomplete third word is the name'" (written) in Sanskrit

>>. I can't figure out what is intended or how to fix it. (Posted at talk:Sanskrit and talk:WikiProject Language). -- Jo3sampl (talk) 19:29, 21 March 2015 (UTC)

Ethnologue update update

Thanks to several editors, especially Abrahamic Faiths, all language articles with e17 population estimates of 10k or more have been updated to e18. That's 68% of our articles, making the remainder of the job all that much easier for the rest of you! — kwami (talk) 03:13, 5 April 2015 (UTC)

Arabic language

Can someone who is familiar with Arabic writing please review the Pending Changes for this article? Thank you, --Scalhotrod (Talk) ☮ღ☺ 15:13, 7 April 2015 (UTC)

Orthography tables in letter articles

It occurs to me that just as we have tables giving the languages in which phones occur, there should be tables of the pronunciations of letters in different languages. In the article on the letter i, I added a table showing what phonemes the letter represents in French, German, and Italian. I've got to think more about what sort of information the tables should include, but sourcing may not be too hard, since we have many good articles on orthography. — Eru·tuon 09:26, 9 April 2015 (UTC)

RfC: Minority languages in geographical articles

Please see an RfC at Talk:Minority language § Minority languages in geographical articles. sroc 💬 08:36, 12 April 2015 (UTC)

Members of this project...

...who are interested in English-language slang, and in proper word usage in English, might be interested in this discussion. BMK (talk) 06:52, 20 April 2015 (UTC)

omniglot.com

There is a discussion to blacklist omniglot.com at MediaWiki talk:Spam-blacklist#omniglot.com. Please read and join if you can help resolve it. Richard-of-Earth (talk) 20:19, 20 April 2015 (UTC)

#lingwiki editathons

I'm organizing a series of editathons to encourage linguists to improve linguistics-related articles on Wikipedia. Although many participants have been working on more technical linguistics topics (for which I've been posting on WikiProject:Linguistics), I've also been encouraging those with specific expertise on a particular language or family to add to those articles, especially for under-documented languages where the grammars may exist only in paper copy in academic libraries, so I thought I'd mention it here as well.

Here are some dates, if anyone wants to use them as an excuse to get some editing done, follow along on #lingwiki, or even organize a local meetup or satellite editathon (feel free to get in touch if you want editathon-organizing tips):

May 2015 - Editathon at Canadian Linguistics Association (CLA) annual meeting in Ottawa

July 2015 - 4 weekly editathons (Wednesday afternoons) at the month-long LSA summer institute in Chicago

October 2015 - Editathon at NWAV (Toronto) - main North American sociolinguistics conference & Editathon at NELS (Montreal) - large regional north-east theoretical linguistics conference

January 2016 - Editathon at LSA annual meeting in Washington DC

Also, if anyone has any particular pages or topics that you've noticed need attention but don't have time for/don't match your expertise, feel free to let me know and I'll try to find someone for them!

You can see lists of articles edited in previous editathons here and here. I'm also currently applying for a grant from Wikimedia to support these events, which you can see/comment on here. --Gretchenmcc (talk) 01:16, 21 April 2015 (UTC)

Does this have anything to do with the multiple single-purpose accounts I've been reverting this past week? The edits are quite similar, but the articles have nothing to do with each other. The editor typically starts in a sandbox, but deletes a lot of stuff (e.g. glottolog links) from the infobox. There's usually a "general info" section containing an incoherent collection of factoids, some having nothing to do with the language, a "further reading" section full of refs that have little to do with the language (maybe it's mentioned somewhere), and a "see also" section that has generic links like "Africa" and "Christianity". A lot of the info is taken from sources like Ethnologue, so it doesn't appear the editor is an expert in the language, and the quality of the writing suggests high school students. At first, I tried to save the improvements, but after several articles edited this way, I'm starting to just revert them.
Some of the articles are Adi language, Twendi language, Somyev language, Tregami language, Xiri language (useful, but led to a merger), Kiong language, Massalat (rd'd to the language article), Wancho (despite that already being a dab page to an existing language article).
kwami (talk) 21:54, 6 May 2015 (UTC)
Hi @Kwamikagami:, thanks for checking in and I had a look through some of the edits, but unfortunately I have no idea who these people are. (I've also been telling people to put something on their user page before editing, not to delete things, and my participants have been linguistics graduate students and profs who should be writing better than that and be aware of the pros and cons of Ethnologue.) The event that I'm organizing in May is the last weekend in May and I have not talked to or heard of anyone editing in conjunction with #lingwiki since the first weekend of April. It's possible that a few random people have seen a post I made about it on social media and just decided to "help", but I'm not sure why that would happen now when I've been posting about this in general since November and yet I haven't posted about it recently. Good luck in finding your high schoolers or pseudo-highschoolers, I wish I could help! --Gretchenmcc (talk) 00:00, 8 May 2015 (UTC)
Thanks, Gretchen. I suspect it might be a school project somewhere. There are so many similarities that I almost thought it was a single editor evading a block, but I can see individual differences. There is also an odd combination of knowledge and ignorance of how to edit WP, so perhaps they're working off a template provided by their teacher. — kwami (talk) 00:10, 8 May 2015 (UTC)
Ah, it is a school project. Last year the articles included Korku language, Puroik language, Bongo language, Kumzari language (maybe), Vafsi language, Tegali language, Homshetsi dialect, Kota language, Suri language, Neo-Mandaic. I'll let the teacher introduce himself. — kwami (talk) 02:11, 8 May 2015 (UTC)
83 articles to be revised tomorrow. The prof is upset that I'd criticize him for using WP as his personal writing tutorial, and seems to be about to walk off in a huff. Oh well. — kwami (talk) 02:41, 8 May 2015 (UTC)
This project may be well-intentioned, but it is leading to incredibly disruptive editing, as you can see from the revision history of Tregami language. An editor is now arguing on the talk page that her edits must remain in place for some arbitrary period of time, after which they may be reverted, which suggests a total lack of understanding of Wikipedia's purpose and normal editing processes. Something needs to be done about this. Suggestions? FreeKnowledgeCreator (talk) 06:34, 8 May 2015 (UTC)
They're just taking their lead from their prof. I invited him to introduce his project, but he thought that was somehow an affront to academia, and is upset that editors have been reverting the unintelligible writing, irrelevant material, and falsehoods his more clueless students have been adding. I created a template they can post on the top of the page, that will populate Category:Articles_in_class_projects/Rutgers. Wish we had a list of articles in the project, but I can scan for key words in the template he provided his students. — kwami (talk) 23:05, 12 May 2015 (UTC)

Gutnish and WP:V

I have started a thread at talk:Gutnish about the fact that this article has been left uncited for over a decade. I urge anyone who wants to improve the article to join the discussion.

Peter Isotalo 12:09, 24 May 2015 (UTC)

Dari language (Zoroastrian) listed at Requested moves

 

A requested move discussion has been initiated for Dari language (Zoroastrian) to be moved to Zoroastrian Dari language. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. —RMCD bot 23:02, 28 May 2015 (UTC)

Southern Kurdish dialects listed at Requested moves

 

A requested move discussion has been initiated for Southern Kurdish dialects to be moved to Southern Kurdish language. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. —RMCD bot 23:03, 28 May 2015 (UTC)

Romanian subdialects listed at Requested moves

 

A requested move discussion has been initiated for Romanian subdialects to be moved to Romanian dialects. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. —RMCD bot 23:16, 28 May 2015 (UTC)

Sama language listed at Requested moves

 

A requested move discussion has been initiated for Sama language to be moved to Sinama. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. —RMCD bot 23:17, 28 May 2015 (UTC)

Western Persian listed at Requested moves

 

A requested move discussion has been initiated for Western Persian to be moved to Iranian Persian. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. —RMCD bot 23:17, 28 May 2015 (UTC)

Modern Hebrew and the ELL2 as FRINGE

We have a debate at Modern Hebrew with some editors characterizing the ELL as WP:FRINGE and therefore to be disregarded when summarizing reliable sources. At issue is the characterization in the ELL and other sources of Modern Hebrew as a mixed, hybrid, or relexified language. (I suspect scholars are struggling to account theoretically for revived languages, as they did half a century ago with creoles, and it's rare to find two who use the same wording.) Other RSs characterize it simply as a Semitic language, though a revived one. The info box now says "mixed", which is one-sided, with people edit-warring to copy over the genealogy of Biblical Hebrew, which IMO is also one-sided. I would suggest "revitalised Mishnaic Hebrew or mixed Yiddish–Hebrew", which is pretty close to what we'd had before the current debate. — kwami (talk) 23:33, 15 June 2015 (UTC)

ELCat student projects

There's the possibility that editors of the Endangered Language Catalog at Manoa will have WP class projects like the one we recently had through Rutgers. This last time was a bit of a headache: there are still articles that have a "conclusion" section as if they were an essay on endangerment, and lots still have a "general info" section full of miscellaneous and unorganized information. Are there things we might do to make the process more productive and hopefully of higher quality in future years, such as a student sign-up sheet here? Could we maybe update or expand Wikipedia:WikiProject Languages/Template, say by adding a "Status" section for endangerment? — kwami (talk) 17:41, 23 June 2015 (UTC)

Requested move at Tagalog

 

A requested move discussion has been initiated for Tagalog to be moved back to Tagalog language. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. — kwami (talk) 00:30, 26 July 2015 (UTC)

(request closed with move back to Tagalog language and accompanying move of disambiguation page to Tagalog.)

Requested move at Tigrinya language

A move request has been initiated at Tigrinya language to move it to "Tigrinya". --Taivo (talk) 01:15, 26 July 2015 (UTC)

(request closed with no move)

Request for Move at Tagalog

Another request for move has been filed to move Tagalog to "Tagalog (disambiguation)" with a redirect at "Tagalog" to "Tagalog language". The discussion is here. --Taivo (talk) 23:16, 9 August 2015 (UTC) (Resolved)

Request for Move at Kapampangan language

Kapampangan > Pampango. — kwami (talk) 02:40, 3 September 2015 (UTC)

Origins of Arabic

Just in case anybody with specialist knowledge in Semitistics reads here, I've started a discussion on Talk:Arabic language#Classification of Safaitic and Hismaic where expert input would be highly appreciated. --Florian Blaschke (talk) 19:55, 26 June 2015 (UTC)

Copyright Violation Detection - EranBot Project

A new copy-paste detection bot is now in general use on English Wikipedia. Come check it out at the EranBot reporting page. This bot utilizes the Turnitin software (ithenticate), unlike User:CorenSearchBot that relies on a web search API from Yahoo. It checks individual edits rather than just new articles. Please take 15 seconds to visit the EranBot reporting page and check a few of the flagged concerns. Comments welcome regarding potential improvements. These likely copyright violations can be searched by WikiProject categories. Use "control-f" to jump to your area of interest (if such a copyvio is present). --Lucas559 (talk) 15:35, 1 July 2015 (UTC)

Maps where a language family is official

At Indo-European languages, Eurasiatic languages, and recently Turkic languages, the info box has/had a map of countries where a language from that family is official. I replaced them with normal language-family maps, but it appears we now have an edit war over it. — kwami (talk) 23:56, 3 July 2015 (UTC)

The normal language-family maps are much better and more informative. The officialness maps may even be misleading, since their distribution includes countries where the languages have been imported over the last few hundred years. --JorisvS (talk) 08:47, 4 July 2015 (UTC)

Merge discussion at WT:WikiProject Deaf

 

There is a discussion on merging Nigerian Sign Language, Bolivian Sign Language, and Ghanaian Sign Language into Dialects of American Sign Language. If interested, you can contribute to the discussion. Thank you. Wugapodes (talk) 18:52, 4 July 2015 (UTC)

Do we want to list a family in the info-box genealogy of a language, when the language is the ancestor of that family?

This came up at Danish Sign Language, but would also be a question if we were to ever add language info boxes to proto-Indo-European etc.

Danish Sign Language is the ancestor of several other sign languages, which together constitute the Danish Sign Language family. I hadn't included the DSL family in the tree of DSL, as it seemed weird, as if we were saying DSL descends from itself. User:JorisvS wants to include it, thinking it weird to leave it out, and that seems reasonable too. Similarly, would we want to put Indo-European in the tree of proto-Indo-European? Or, should we maybe say "ancestor of the DSL family" or "ancestor of the IE languages" for the genealogy in the infobox?

We don't currently use infoboxes for protolanguages, partly for this reason, but maybe we should think about it. — kwami (talk) 20:51, 25 July 2015 (UTC)

Valencian

There is an edit warrior at Valencian (with no activity at other articles), who insists on removing that it is a variety of Catalan from the first sentence, instead calling it "a language spoken in ..." and saying it is a "glossonym for the Catalan of the area" (whatever that is supposed to mean). He refuses to take it to the talk page and just keeps on reverting. --JorisvS (talk) 09:31, 19 June 2015 (UTC)

@JorisvS: The user is back. Ogress smash! 02:25, 26 July 2015 (UTC)

Sinitic languages, Varieties of Chinese, Chinese languages, Spoken Chinese, and other titles

Your discussion is welcomed at Talk: Varieties of Chinese. Thanks. -- WeijiBaikeBianji (talk, how I edit) 14:34, 27 July 2015 (UTC)

Khoshey language

  An article of interest to this WikiProject, Khoshey language, has been created without any references to reliable sources. I have been unable to verify that this language even exists. If you can assist, please see Talk:Khoshey language#Unsourced article. Thanks. Wdchk (talk) 02:17, 28 July 2015 (UTC)

I think the creator is a sock of User:Najaf ali bhayo, the editor who vandalizes articles about Chitral and Khowar. Khestwol (talk) 03:57, 28 July 2015 (UTC)
@Khestwol: Please file a sock report so we can have it removed (assuming it gets no cites). Ogress smash! 05:00, 28 July 2015 (UTC)

Were Bulgar and Hunnic the same language?

We have an edit-war at Bulgars and Bulgar language over citing a paper by a prof at U. Göteborg that *starts* by assuming that Bulgar and Hunnic are the same language. That strikes me as dubious. Is there anyone here who knows anything about this area and can evaluate the claim? — kwami (talk) 00:07, 31 July 2015 (UTC)

WP:NCLANG

A discussion on the relevance of "primary topic" to language article titles has been initiated at Wikipedia_talk:Naming_conventions_(languages)#.22Primary_Topic.22. --Taivo (talk) 04:30, 6 August 2015 (UTC)

Do we want automatic notification of all move requests on this talk page?

See discussion at User_talk:RMCD_bot#WP:LANG. — kwami (talk) 20:15, 27 July 2015 (UTC)

An article alerts bug was fixed, and now Wikipedia:WikiProject Languages/Article alerts, which shows all recent requested moves, is up-to-date again. This is transcluded at the top of Wikipedia:WikiProject Languages, and could easily be transcluded at the top of this talk page as well. Is the AA report sufficient, or do you still want the notices from RMCD bot as a supplement to that? Wbm1058 (talk) 14:41, 6 August 2015 (UTC)

Request for Comment

A request for comment has been initiated at Wikipedia_talk:Naming_conventions_(languages)#RfC:_Should_the_NCLANG_guideline_include_references_to_PRIMARYTOPIC.3F. Your participation would be appreciated. --Taivo (talk) 20:57, 6 August 2015 (UTC)

Pending language RMs

The following requested moves have not been advertised here; they require expert input. Alakzi (talk) 00:51, 11 August 2015 (UTC)

South American Phonological Inventory Database

A linguist at UC Berkeley has put together an online database of phoneme inventories of over 300 South American Indigenous languages, with citations back to published grammars and other reputable sources, which seems like it would be good information to add to the Wikipedia articles of these languages. I haven't looked through it exhaustively but from a bit of clicking around it seems quite good, as many of these languages don't have a whole lot of information about them online. I'm planning on directing the attention of #lingwiki participants to it, but I thought that it might also be useful to other editors on WikiProject:Languages. Or if others have already been using this resource or discovered problems with it please let me know so we don't duplicate effort! Here's the link to the database. --Gretchenmcc (talk) 22:36, 19 July 2015 (UTC)

@Gretchenmcc:: Are you associated with the DB? It would be easy to have a bot link all of their entries if they were cross-linked by ISO code. If they had an index by ISO code as well, we could copy the list, and a bot would follow the individual codes to their WP pages, where it could place a standardized link (preferably using a template) to the ISO code in the DB, which would redirect to the proper page.
[1] is a start, but I don't know how we would automate a connection to the abbreviated names they use (like "TenaQ") except manually. Easier if they used ISO on their end. (Exceptions, where they use sub-ISO codes, we could enter manually.) — kwami (talk) 20:44, 25 July 2015 (UTC)
Created {{SAPhon}} to format the ref. — kwami (talk) 21:58, 25 July 2015 (UTC)
@Kwamikagami:: I'm not associated with the project itself but I know the prof and I think he edits Wikipedia sometimes. I don't know his username though, so I'll send him this thread via email and see what could happen. --Gretchenmcc (talk) 03:02, 19 August 2015 (UTC)

Hi, my name is Lev Michael, and I’m in charge of SAPhon. Thanks, gretchenmcc, for bringing this discussion to my attention. I’m not sure whether this will fit the bill entirely, but the Language Lists page includes the ‘code’ for each language (with a link to the inventory). This code is typically the iso code, except in cases where we needed to make a finer-grained (typically, dialectal) distinction than given by the iso nomenclature, or in cases where a language lacked an iso code. I imagine that one issue for the systematic ingestion of SAPhon inventories into Wikipedia is that the URLs for each inventory do not include the above-described code, but rather a more human-readable abbreviated language name. (I had a recent-ish email exchange with another Wikipedian, who indicated the desirability of rationalizing these URLs to include the above-mentioned codes instead of the abbreviated language names. I agree, but being in the Amazon right now, and having no research assistants currently employed on the SAPhon project, this is beyond my ability to implement at this time.) Ldmanthroling (talk) 17:22, 19 August 2015 (UTC)
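As a rough illustration of the split discussed in this thread (plain ISO codes a bot could follow automatically, versus finer-grained or missing codes that would need manual entry), here is a minimal sketch. Everything in it is an assumption for illustration: the function name, the sample rows, and the abbreviated names are placeholders and not real SAPhon entries, and the only rule applied is that an ISO 639-3 code is exactly three lowercase letters.

```python
import re

# An ISO 639-3 code is exactly three lowercase ASCII letters.
ISO_639_3 = re.compile(r"[a-z]{3}")


def split_codes(entries):
    """Separate (code, abbreviated_name) pairs into bot-linkable and manual ones.

    Returns a dict mapping each plain ISO code to its abbreviated names,
    plus a list of the pairs whose code is missing or finer-grained than ISO.
    """
    by_iso = {}
    manual = []
    for code, name in entries:
        if ISO_639_3.fullmatch(code):
            by_iso.setdefault(code, []).append(name)
        else:
            manual.append((code, name))
    return by_iso, manual


# Hypothetical rows: placeholder codes and names, not taken from the database.
SAMPLE = [
    ("qaa", "LangA"),         # plain three-letter code: a bot could follow it
    ("qaa_n", "LangANorth"),  # dialect-level code: needs manual handling
    ("", "LangB"),            # no ISO code assigned: needs manual handling
    ("qab", "LangC"),
]

by_iso, manual = split_codes(SAMPLE)
```

With an index like `by_iso`, the manual list stays short and is limited to the dialectal and uncoded cases mentioned above.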

RfC: Are personal pronouns (including "who") to be avoided for fictional characters?

Please take part in the discussion at Wikipedia talk:Manual of Style#RfC: Are personal pronouns (including "who") to be avoided for fictional characters? Curly Turkey ¡gobble! 23:09, 19 August 2015 (UTC)

Flags in info boxes

A couple years ago we had a small response (positive) to a request to only allow flag icons in language info boxes where a language is listed as official (i.e., in the "nation" and "minority" fields); everywhere else in an info box, flags should be removed. This follows general MOS advice about not cluttering up articles with flags. I've been doing this manually, but this is about having a bot take care of it. I'm reopening the request, which got little feedback at the time. Please comment at User_talk:Anomie#Removing_flag_icons_from_language_info_boxes if you have an opinion. — kwami (talk) 01:57, 26 August 2015 (UTC)