Timing of Old vs. Middle Indo-Aryan

I've noticed in several articles about Middle Indo-Aryan (and I think articles about Sanskrit have the same issue) a point that many laypeople seem to be confused about, but which is actually a no-brainer and should be made clearer and more explicit for the reader. For example, Pali#Ardha-Magadhi points out that Classical Sanskrit, despite being linguistically Old Indo-Aryan, was contemporary with Middle Indo-Aryan. Also, Middle Indo-Aryan linguistic elements can be found in Sanskrit. However, I'm pretty sure no Indologist takes this to mean that Middle Indo-Aryan doesn't represent a linguistically later stage than Old Indo-Aryan, and doesn't directly descend from it (more specifically, individual Middle Indo-Aryan dialects directly continue individual Old Indo-Aryan dialects). Classical Sanskrit is simply a standardised form of Old Indo-Aryan, but the effective standardiser, Panini, like his contemporaries, certainly didn't speak Sanskrit as a first language from childhood on, but rather some Middle Indo-Aryan dialect, and Sanskrit was a second language for him, just like Latin was a second language for a medieval Italian cleric. Indologists generally agree that Vedic is based on a stage of Indo-Aryan which was spoken as a first language in the late 2nd and early 1st millennium BC; the language of the Rig-Veda is often dated to the 13th century BC or thereabouts (even if the oldest layer may date to about 1500 BC, its linguistic form as available in the received text, especially its phonetics, is probably not representative of it and instead somewhat more recent; Mitanni Indo-Aryan gives us a glimpse of what Indo-Aryan must have been like back then, which is well compatible with a reconstructible stage of Proto-Indo-Aryan). 
The transition from Old to Middle Indo-Aryan must therefore have already taken place in the 1st millennium BC, probably in the early part of the millennium, and this eminently reasonable assumption makes it obvious how it is possible that Middle Indo-Aryan linguistic material or features turn up in Classical (or Epic) Sanskrit: MIA was simply the vernacular language (or the "natural language", which is what "Prakrit" means, as opposed to the highly artificial Sanskrit – although MIA dialects such as Pali eventually stop changing with the spoken language as well). Only very educated people such as brahmins, scholars and poets, and presumably aristocrats knew Sanskrit in addition. That high-status individuals do not speak Prakrit in the dramas and instead only Sanskrit should not be taken to mean that they literally had Sanskrit as their mother tongue, unlike the commoners, even if they may have learned Sanskrit very early in their youth. This is similar, only much more extreme, to how RP is nobody's natural accent (well, granted, I don't know what HM The Queen spoke like as a child; maybe for the high nobility, RP is in fact their natural accent, I'm not clear on this point) and only acquired at British boarding schools (hence "received"), or, for a closer analogy, how Modern Standard Arabic is a second language for everyone in the Arab world, even the most highly educated religious scholars and members of royal houses. --Florian Blaschke (talk) 17:06, 26 August 2016 (UTC)

Prakrit#Etymology says:

"Traditionally, many have believed that the Prakrits are older than Sanskrit, and that it was from the Prakrits that Sanskrit was refined. However, from a comparative point of view, Sanskrit (especially Vedic Sanskrit) is closer to the reconstructed Proto-Indo-European language than the Prakrits, so Sanskrit belongs to a linguistically earlier stage of history."

I suspect that this is an opinion harking back to the early days of Indology in the 18th and 19th centuries, and based on (arguably a misinterpretation of) the meanings of the terms Sanskrit and Prakrit. (There's a marginal opinion that sees Middle Indo-Aryan features already in Mitanni Aryan.) This would be like the idea that Latin is simply a "refined" version of Italian, or Old English a "refined" version of Modern English; it is difficult (to put it mildly) to conceive how this should work, and it will make sense only to a linguistically naive reader. The only vaguely similar concept that comes to mind is linguistic reconstruction, especially under the naive premise that more ancient languages are somehow more sublime and sophisticated (because of their cultural cachet, association with sophisticated culture and because in much of Indo-European-speaking Europe and Asia, more ancient stages of modern languages tend to be more complex morphologically), but the idea seems to be explicitly not that Sanskrit was a recovered more archaic form of the Prakrits, but that a language similar to Sanskrit had never even existed before. To be fair, however, to some extent "artificial" Sanskritisation of Middle Indo-Aryan material must have happened indeed, as evidently mechanically derived ahistorical "pseudo-Sanskrit" forms attest ("false Sanskritisation", such as Dravida from something like damila); however, this is of course a secondary phenomenon established in an environment where the vernacular was MIA and the high-status or written language Sanskrit, and Sanskrit-speakers could observe and imitate existing patterns of correspondence. --Florian Blaschke (talk) 02:59, 28 August 2016 (UTC)

I'm not sure what's needed here. That sentence has been unsourced for six and a half years so I just removed it. The whole area is lacking in reliable sources, in particular scholarly work from actual linguists but that's another issue entirely. The entire Middle_Indo-Aryan_languages#References needs work but there's an issue of systemic bias in that ancient South Asian languages aren't easy to find information about and thus we have a lot of citations to 1960s books in India of dubious if any relevancy. -- Ricky81682 (talk) 21:40, 29 August 2016 (UTC)
Just a quick note (about a largely orthogonal matter): there might be systemic bias in the coverage this topic receives on Wikipedia, but I don't think there is one (or at least none of that kind) in scholarship generally. There is an enormous body of research accumulated in the last two centuries, and a lot of it is easily accessible through the most basic channels (like JSTOR), so I don't think that ancient South Asian languages are generally difficult to find information about. Uanfala (talk) 22:16, 29 August 2016 (UTC)

ISO 639:prb

What language does prb refer to? ISO 639:prb currently redirects to a disambig, and neither of the target pages lists this code. --ἀνυπόδητος (talk) 07:33, 15 September 2016 (UTC)

From what I see at Ethnologue and Glottolog, prb refers to a language that is separate from any of the three languages listed at the dab page. And until Kwamikagami's edit from a month ago, that dab page used to be an article about the language [1]. Uanfala (talk) 07:53, 15 September 2016 (UTC)
Read this Request for Change to ISO 639-3 Language Code: "There is no evidence that the language exists. ... Lua' is a cover term for the Prai and Mal languages. The term Lua' has been widely used in northern Thailand to refer to different langauge groups such as Prai [prt], Western Lawa [lcp], and Eastern Lawa [lwl] ... Since the term Lua', Lua, or Lawa is widely used to refer to Mon-Khmer langauges through the region, it was likely that Lua' was added to the Ethnologue by mistake." Love —LiliCharlie (talk) 08:01, 15 September 2016 (UTC)

Template:Wikt-lang

I've created {{wikt-lang}}, and users here might find it useful. It is essentially a combination of {{lang}} with a link to Wiktionary. It tags text with a language, adds italics if the language is defined as being written with the Latin script in Module:Language (the list of languages is incomplete, however), and links to that language's section of the Wiktionary entry. I've added it to a few articles with Wiktionary links, but I'm sure there are more. If there are any suggestions for improvements (for instance, perhaps there should be a way to turn off italics), or languages that should be added to the data list, post on Module talk:Language. — Eru·tuon 21:15, 4 October 2016 (UTC)

How should languages be categorized

The following conclusions are drawn:

  • All languages should be categorised by date of first attestation.
  • Only "literally" extinct languages should be categorised by date of extinction. This would include "dead" and "extinct" languages which are nowadays solely used in academic study of old corpuses, but exclude "obsolete" languages that have evolved into later languages and languages that remain active in "liturgical" use.
  • Only artificial languages should be directly categorised into the existing "establishment" and "disestablishment" categories. Natural languages should be removed from them in favour of attestation categories.
  • Status categories can be used as appropriate. There is currently little desire to standardise all languages into one such category tree.

As WP:ANRFC closing admin, I'd also like to remind editors to follow reliable sources in all content decisions. Deryck C. 11:26, 10 October 2016 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

As discussed above, I think we categorize languages in a number of ways. I propose both categories for when the language is first attested and when it is extinct. If there are issues with the titling, that can be discussed later. I also propose that the same language articles be put into the general establishments/disestablishment categories based on locations. Please either support/oppose the concept of these categories and renaming can be done later. For example, Meroitic language is in Category:Languages attested from the 3rd century BC, Category:3rd-century BC establishments in Africa, Category:Languages extinct in the 4th century and Category:4th-century disestablishments in Africa. As noted above, the introductions category can be used as is done for Esperanto. I put separate discussion sections for each portion. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Attestation categories

As discussed above, there were some suggestions to limit this to constructed or artificial languages. If supported, please indicate whether you support this for all languages or just for a subset of some sort. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Support (attestation)

  • Support for all languages. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)
  • Support attestation categories, but not sure where exactly these should go into the higher category structure. "Establishment" categories are problematic, as has become clear in the discussion right before this RfC. Uanfala (talk)
  • Support for attestation of a language. The first known record, or a somewhat accurate guess based on language comparison for proto-languages, could be used here. Landroving Linguist (talk) 11:28, 10 August 2016 (UTC)
  • Support for all; this is pretty basic encyclopedic information, and is not frequently overturned by new revelations or new hypotheses becoming widely accepted, so it will be stable.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:04, 13 August 2016 (UTC)

Oppose (attestation)

Discussion (attestation)

Attestation is a somewhat problematic category in itself, because there are sometimes widely diverging views about when a language is first attested. As an example, there is currently an academic debate whether the Arabic language was first attested shortly before Islamic times, coming from Yemen, or whether it was attested in much older inscriptions in the Western Syrian desert. I envision some endless edit-wars resulting from that. Landroving Linguist (talk) 11:33, 10 August 2016 (UTC)

  • I agree but that's a content issue related to the specific case. There are individual disputes about when countries were established, for example, but the actual concept isn't at issue. I think it's fair to say that for the most part (and a lot of our pages lack this) reliable sources exist about it, even if it's usually about the "Early/Middle/Modern" versions with very vague (3rd-century-type) sourcing. -- Ricky81682 (talk) 16:59, 10 August 2016 (UTC)

Extinction categories

I think this one is more obvious but just in case. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Support (extinction)

  • Support for all languages. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)
The new category is a neat idea, but it doesn't address the main issue and that is the vagueness and arbitrariness of drawing the line between one stage in the evolution of a language and another. Uanfala (talk) 17:55, 10 August 2016 (UTC)
I'm not sure I agree here. Certainly the lines are imprecise and arbitrary, but quite often still a matter of academic consensus. Most historical linguists sort of agree about the temporal limits of Middle English or many other out-of-use language varieties. When such a consensus exists, it is also usually published, and therefore a matter of careful categorization on Wikipedia. A lot about languages is fuzzy by nature, such as dialect and language definitions. Many of the language entities placed in Wikipedia are somewhat arbitrary and even a matter of dispute. Still we work with them, because enough people agree on them and have published their agreement. Landroving Linguist (talk) 18:28, 10 August 2016 (UTC)
My view is that if reasonable sources can definitely say that the languages are different, they should be able to identify when the languages differ. Now our actual information here is quite lacking, but that just means there's much more to do here in terms of citations. These exist. Linguists and archaeologists write on these things. That's not the issue. For example, a number of our South American language articles can definitely define basic extinction periods (not evolution but change upon colonization) but don't have a starting point since (a) the archaeological record is sparse and (b) most of these are spoken only and not written. Nevertheless, that doesn't mean that if I pinned someone down, they couldn't at least say a millennium for when it started (likely based on the first people being there), and if that's our best guess, that's the best estimate for now. Remember, we are a wiki; more detail means this can be revised later. -- Ricky81682 (talk) 23:09, 10 August 2016 (UTC)
Hmm, Latin is indeed a bad example for an extinct language, because a) it developed into other languages and b) it is still used as a second language today. Other similar languages may be Ge'ez language or Coptic language. I wouldn't know what to do about them regarding the categories you suggest. The languages shown by the already existing categories you mentioned are accordingly all of a very different nature - languages that have never served as a language of any official standing, and that have usually never been written. They died out by not being transmitted by the last generation of speakers, and quite naturally it is often very difficult to pinpoint the language death in time. Actually, there are even different definitions of language death, as some linguists insist that a language is dead when exactly one speaker is still alive, as s/he cannot use it any longer for communication. There are some prominent cases when the death date of the last speaker of a language is known and announced, but in most cases language death looks a lot more messy. Be that as it may, I think for most extinct languages the point of death can be placed within a 50-year bracket, but then most authors don't bother to mention this, because compared to the span of a lifetime 50 years are rather imprecise. For ancient extinct languages, instead, 50 years would be pretty good. It may still be enough to build up a set of categories on Wikipedia. I wonder how much we would succeed in populating these categories based on the information at hand. We can certainly try and see how far it gets us. I find the two category pages you mention somewhat disheartening in this respect, as they seem to contain almost all known extinct languages. Maybe the information is just missing by neglect, and not by ignorance. I like your 'out-of-use'-idea, which might avoid some of the problems of 'extinct'. Middle English is certainly not extinct, but decidedly out of use. Latin is not even out of use, however. Landroving Linguist (talk) 18:21, 10 August 2016 (UTC)
Latin isn't just Latin though. As Template:Latin periods notes, there's Old Latin, Classical Latin, Late Latin, Medieval Latin, Renaissance Latin, New Latin and finally Contemporary Latin as the major separate stages, I'm certain. Medieval Latin for example is in Category:Languages extinct in the 15th century (or better yet renamed as Category:Languages out of use by the 15th century) which includes Anglo-Norman language, Greenlandic Norse and other "Medieval" languages (largely a lack of citations with more specifics). On that basis, I think the specific dialects and variations are a useful categorization. My concern isn't the major languages, anyone can see those, but categorizing all the minor languages and dialects like those found in something like Category:Languages attested from the 14th century or all the Alaska and Native American languages with some information. Again, I've touched maybe 1% of all languages by hand, so imagine a fully fleshed-out categorization scheme and people will see how languages are evolving simultaneously at the same time. -- Ricky81682 (talk) 22:59, 10 August 2016 (UTC)
  • Prefer the "out of use by [era]" approach. But as second choice, support extinction categories for languages that have become literally extinct, without support for including languages that have evolved into later languages.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:07, 13 August 2016 (UTC)
    • Can we at least call this a "support the concept" category and then perhaps do another RFA on the actual naming? I'm aware I didn't fully think this through on the first round but I'm not moving forward until there's some clarity here. If people want to separate "extinct" from "evolved out" (I find that unnecessary), we can. I think there's a possible name we can agree on that covers both. -- Ricky81682 (talk) 22:30, 13 August 2016 (UTC)

Oppose (extinction)

Discussion (extinction)

Establishment and disestablishment categories

This of course is separate from the main categories but these would put the language in both a "subject by time" category and a "time by location" category. Rather than a separate section, if people prefer the "introductions" structure, that can be suggested in the oppose or something. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Support (establishment/disestablishment)

  • Support for all languages, if that's debated. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Oppose (establishment/disestablishment)

  • Oppose natural languages to be included directly under a category with "(dis)establishment" in its name. Uanfala (talk) 08:02, 9 August 2016 (UTC)
  • Oppose natural languages to be included directly under a category with "(dis)establishment" in its name. Uanfala has made a good case why these categories cannot be reasonably applied to natural languages. Landroving Linguist (talk) 11:36, 10 August 2016 (UTC)
  • @Uanfala and Landroving Linguist: Do you support then artificial or constructed languages in these categories? Or a blanket objection to all languages? -- Ricky81682 (talk) 20:55, 10 August 2016 (UTC)
I think the "introduction" categories (like Category:20th-century introductions), which you suggested earlier, might be more suitable as they contain inventions and products – things that were brought into existence by a deliberate action, like constructed languages. Uanfala (talk) 21:29, 10 August 2016 (UTC)
  • Oppose - with the exception of artificial languages, we can almost never have a provable time of establishment; for one thing, the question of when it actually became that language, as opposed to a predecessor language, is unclear; for another thing, it may have been spoken for a long time before it was written down. And it's quite possible that only a sub-population of the speakers of the language actually used the writing system, so even a time of its "disestablishment" is unclear. I would support placing any language which is provably artificial/constructed (e.g. Esperanto) in an establishment category, though. עוד מישהו Od Mishehu 13:51, 12 August 2016 (UTC)
  • Oppose except for artificial languages, per all the above. It doesn't make sense as applied to natural ones.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:08, 13 August 2016 (UTC)
  • Strong Oppose with the exception of artificial languages, as mentioned above. Because languages are constantly evolving it is incredibly difficult (and controversial) to establish exactly when a language was established and disestablished. Take Sumerian for example: it was probably spoken long before it was written down, and even after it ceased being spoken it was still used liturgically and had a strong influence on Akkadian for hundreds of years. It is nearly impossible to say when it was established or disestablished (not to mention that many extinct languages are undergoing revitalization efforts), so my vote is a strong oppose. Inter&anthro (talk) 16:29, 18 August 2016 (UTC)
  • Oppose not useful.·maunus · snunɐɯ· 10:31, 6 September 2016 (UTC)

Discussion (establishment/disestablishment)

  • Couldn't the "subject by time" and "time by location" cross-categorisation for language extinction happen under a new category (say, Category:Obsolescence) that would also include archaeological cultures, obsolete scientific theories and cultural traditions, as well as maybe extinct species? I imagine this could go under the rather broad Category:Former entities (where Category:Disestablishments is located too). Uanfala (talk) 09:15, 3 August 2016 (UTC)
    • Yeah I guess but seems unnecessary to have a whole new structure for just languages. -- Ricky81682 (talk) 19:17, 3 August 2016 (UTC)
Not just for languages, but also archaeological cultures, obsolete scientific theories etc. Uanfala (talk) 08:02, 9 August 2016 (UTC)
  • I think that these categories clearly do not apply to languages and should simply be removed. They are meant for states and organizations. ·maunus · snunɐɯ· 08:52, 9 August 2016 (UTC)
  • The very idea of establishment and disestablishment seems to view natural languages only from the perspective of developed languages or even official languages. For most languages in the world, this is not applicable. They exist often alongside languages of wider communication and serve their communities in restricted environments, even if this community is fully bilingual in the other language. Would you call a language disestablished, just because it is only used at home? Probably not. But then we have a huge boundary problem with this category, and again lots of edit wars, because some people's disestablished language is another person's established language. The criterion of extinction is much more clear-cut and useful when it comes to languages. I would also not hesitate to call an artificial language extinct when no-one is using it any more. Landroving Linguist (talk) 11:48, 10 August 2016 (UTC)

Based Upon Status

Have all languages be in a category for their status, such as "extinct", meaning no one knows how to speak it; perhaps "obsolete" or else "outdated" for Latin or Old English; and "endangered", for which, if added, a number of speakers required to qualify for the status would have to be agreed upon. One of the categories would be "Active", to include all languages declared an official language by at least one country (or subsection of a country). Each category would contain all of its languages directly and also in its subcategories; the subcategories would be organised by the date at which the language acquired the status. For instance, the extinct category would have "languages that became extinct in ___ century". Tiers of categories:

  • 1. Status of language (Example: Extinct)
  • 2. Timeframe of that status (Example: Languages that became extinct in the 10th century)

The languages themselves would be in both categories, for example Saka would be in both the extinct category and the Languages that became extinct in the 10th century category.

(Old proposal) Statuses EGIDS and UNESCO (Kept for archival reasons)

  • These statuses were made based upon EGIDS, with some being modified, some added, and some removed that I thought were superfluous; all are up for debate/change. Names changed will be marked with bold, statuses I made will have the number bolded.
  • 1. "International" The language is widely used between nations in trade, knowledge exchange, and international policy.
  • 2. "National" The language is used in education, work, mass media, and government at the national level.
  • 3. "Provincial" The language is used in education, work, mass media, and government within major administrative subdivisions of a nation.
  • 4. "Lingua Franca" Used in work and mass media without official status to transcend language differences across a region.
  • 5. "Educational" The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.
  • 6. "Vigorous" The language is used for face-to-face communication by all generations and the situation is sustainable.
  • 7. "Threatened" The language is used for face-to-face communication within all generations, but it is losing users.
  • 8. "Shifting" The child-bearing generation use the language among themselves, but it is not being transmitted to children.
  • 9. "Moribund" The only remaining active users of the language are members of the grandparent generation and older.
  • 10. "Near Extinct" The only remaining users of the language are members of the grandparent generation or older who have little opportunity to use the language.
  • 11. "Obsolete" A language that has evolved into something else and is itself rarely used. Examples: Old English or Old Latin.
  • 12. "Dead" No one speaks it outside of translating it academically (i.e. it is only spoken by archeologists who learned it to translate it).
  • 13. "Extinct" No one speaks it.
  • 14. "Lost" No one knows how to speak it; it cannot be learned.

Comment your thoughts. Iazyges (talk) 00:05, 16 August 2016 (UTC)

(New) Status proposal (not necessarily exclusive against the above)

  • "Active" Still used as a modern language.
  • "Evolved" Languages like Latin or Anglo-Saxon that evolved into something else and, while they may still exist, are rarely used.
  • "Liturgical" Languages learned for liturgical or special purposes.
  • "Academic" Well studied but not used outside of research.
  • "Dead" No speakers; some fragments of it remain, but it is not well studied.
  • "Unknown" No speakers; no remains of the language believed legitimate.

Support

  1. User:Iazyges (OP)
  2. Include EGIDS, UNESCO, and anything else sourceable (including other clarifications of usage, e.g. limited to liturgical or academic study, etc.) We have no rationale to stick exclusively to one particular published system if it is not accurate enough for our purposes, or is contradicted by others, per WP:UNDUE.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  21:13, 9 September 2016 (UTC)

Oppose

  • Tentative oppose. I'm confused on whether we are relying exclusively on Ethnologue as the source here or it's kind of WP:OR which language belongs where. I don't see the evidence that Ethnologue is on the same level of consensus in terms of classification as something like IUCN Red List is for species. The article itself cites some opposition from the editor at Glottolog. This seems more like a discussion for Template talk:Infobox language first and then from there to create categories, as I'd expect us to include that information in the infoboxes as standard practice. If there's evidence that actual linguists use these criteria and there's at least some ability for us to find reliable sources to classify each language here, then I'm fine with it. -- Ricky81682 (talk) 08:36, 21 August 2016 (UTC)
@Ricky81682: I agree, it needs work, I put it here as a form of peer review of sorts, to get feedback on how to improve it. Iazyges (talk) 13:14, 21 August 2016 (UTC)
It's a matter of just reporting what the sources tell us. It's not our job to decide which one is right (that's the OR). If sources give us conflicting information, we tell the readers that sources on the matter are in conflict.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  21:14, 9 September 2016 (UTC)

Discussion

  • If something like this is adopted, it had better be based on a classification system that is already in place. One such system is the EGIDS scale. Data (probably not always entirely reliable) about each living language's position on that scale is published by Ethnologue [2]. I'm wondering if a somewhat less fine-grained version can be adopted here.
Whatever system is adopted, it's worth pointing out that we can't ever hope to achieve total categorisation, as for the majority of languages there isn't (and there can't be) any historical data about when their status changed.
There are two elements in the OP's proposal that don't appear in the EGIDS. One is the distinction between extinct languages and ones that have evolved (like Old English), and we run again into the question (which I remain agnostic about) of the applicability of chronological categories to the latter: there is discussion about that earlier in the RfC. Another distinction is within the category of extinct languages: between ones that are extinct but studied academically ("extinct by speakers") and ones that no-one knows how to speak ("extinct"). If this is retained then I'd suggest the more transparent labels "extinct" and "unattested". Uanfala (talk) 09:56, 16 August 2016 (UTC)
@Uanfala: I think that while we cannot fit every language into a time, it can fit in the status folder until such a time as we know when it acquired that status; perhaps we could even have a folder like "Unknown date of status" for any language where we don't know when that happened. Iazyges (talk) 12:36, 16 August 2016 (UTC)
@Uanfala: I have checked out the link you gave and changed the names to it. Iazyges (talk) 12:47, 16 August 2016 (UTC)
Iazyges, you base your updated descriptions on speaker numbers. While the absolute number of speakers is sometimes a good indication of a language's vitality (and it is vitality that we want to indicate with this new categorisation, right?), in many many instances it is not. A community of 500 speakers can be a vigorous one, using the language in all domains and passing it to the next generation, while there are communities of hundreds of thousands of speakers that have stopped teaching their children how to speak the language. Both the EGIDS and UNESCO's scales take account of speaker numbers, but they also register other relevant factors. Uanfala (talk) 15:44, 16 August 2016 (UTC)
@Uanfala: Hm, how would you propose we fix that? Should we mix it, or perhaps have two separate categories, one for the number of speakers, and one for whether or not it is actively taught? Iazyges (talk) 15:53, 16 August 2016 (UTC)
It depends on what we want to categorise for. Your original proposal suggested that it was language vitality – i.e. whether a language is endangered or how actively it is used or supported. We can base such a categorisation on one of the two vitality scales in use. If we choose to also categorise by number of speakers, that's a completely different matter. I don't know if such categorisation would be helpful (but other people might disagree), and if we go that way, we'll have to use precise category names ("Languages spoken by more than 100,000 people" ....) rather than fuzzy labels like "uncommon". Uanfala (talk) 16:41, 16 August 2016 (UTC)
@Uanfala: I suppose you're right, I'll try to find the right blend of UNESCO and EGIDS.
  • UNESCO also has a language vitality scale which is also used by Glottolog. I would suggest using this instead of EGIDS.·maunus · snunɐɯ· 12:58, 16 August 2016 (UTC)
@Maunus: Yes, but the UNESCO one is mostly about generational levels of speaking, i.e. whether one generation speaks it but their children don't. The system I propose works off either a nation or state recognizing it officially, or the number of speakers, while the UNESCO system appears to be based upon generations. Iazyges (talk) 15:14, 16 August 2016 (UTC)
Your suggested categories move us into OR territory for some of the categories. It is much better to strictly follow one of the established systems, either EGIDS or UNESCO (or both); that way we don't have to do interpretation from sources ourselves. The UNESCO system, by the way, is not based only on generations but on vitality, and it includes aspects of language policy such as official status. Pure speaker numbers don't really give any useful information about vitality.·maunus · snunɐɯ· 18:45, 16 August 2016 (UTC)
@Maunus: I will go with EGIDS for now, as I find it better.

Maunus (talk · contribs), Uanfala (talk · contribs) I have edited the statuses so that most fit the EGIDS scale. I changed the name of the "Wider Communication" category to Lingua Franca because I thought it was more encyclopedic. I also kept the Obsolete, Academic only, and Extinct categories, and added a "Lost" category for languages no one knows how to translate into any other language. Iazyges (talk) 20:01, 16 August 2016 (UTC)

A distinction between "extinct" and "unattested" to respectively mean "extinct except among specialists" and "completely extinct" makes no sense to me at all. It's confusing as to "extinct", and that's not what "unattested" means. Proto-Germanic is unattested; there are no documents, inscriptions, etc. – testaments – in it. This two-way split isn't particularly useful anyway. Few attested languages are unstudied. I think what we'd be trying to get at is 1) languages that are still learned to fluency or near-fluency for special (often liturgical) purposes, like Latin, Koine Greek, and Old Church Slavonic, as well as the more ancient forms of written Sanskrit and Chinese, and the rare case of revived dead languages like Cornish; 2) languages that have no fluent speakers at all but about which a great deal is known and which are regular subjects of academic study, such as Anglo-Saxon and Aramaic; 3) just plain ol' dead languages, about which nothing is known, very little from ancient surviving works, or some amount from 18th- to 21st-century basic philological and linguistic analyses before the last speakers died in modern times. And maybe even that last category should distinguish between languages we know a little about and totally unknown ones about which we can only surmise. It's noteworthy that in many cases an ancient language has evolved into later ones; even Latin isn't really a dead language in any sense, even aside from liturgical use; it simply mutated into all the Romance languages, just as Anglo-Saxon became English and Scots. How much we know and whether anyone is fluent are two distinct bits of information that are much more meaningful than overly sentimental and emotive notions like "living" and "dead", which have a whiff of that Last of the Mohicans noble savage romanticism to them.

I agree we should include EGIDS and UNESCO labels when available, but this doesn't preclude us doing our own source (not original!) research, since we already know that both of those systems have their weaknesses, and it is not our job (per another policy, WP:NPOV, especially WP:UNDUE) to heavily favor them against other RS evidence of what the facts are.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  03:21, 9 September 2016 (UTC)

  • I don't have any desire to categorize unattested languages. It's just that languages with attestation exist and I think categorizing said attestation dates is helpful, nothing more. Obvious proto-languages, being proto and not actual languages, are unattested as they may not exist. As to extinction, I think the problem is trying to figure out what actual linguists use. I don't know if the better term is Language death, but that includes both a description inside it as language shift and a separate section for language change. From that, we should follow the technical definition of the last native speaker. I don't think being "used" or "learned" is a relevant criterion, as technically every language that is studied is being learned at the moment. -- Ricky81682 (talk) 19:36, 9 September 2016 (UTC)
    • Not sure I disagree strongly with any of that (thus a basis for compromise!), but would quibble with two bits. PIE, PGmc, etc., were (to the extent we're correct about them) real languages that were spoken; we just don't know anything about them directly, only through reconstructive hypotheses. A great deal of linguistic work involves studying languages, side-by-side, without necessarily learning them in the language acquisition sense. I would be hard-pressed to give you a single word in a Puebloan language, other than proper names and some food, etc., terms adopted into local English and Spanish in the American Southwest, despite having written a paper years ago comparing several of their grammars with those of nearby non-Puebloan languages like Navajo. At any rate, my main point was we should not use misleading terms, or focus on something that may not be a good encyclopedic approach. Linguists' tendency to use biological metaphor ("genetic relationship", "living", "extinct", etc.) can produce multiple problems that we need to work around for our audience.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  21:05, 9 September 2016 (UTC)

SMcCandlish (talk · contribs), Ricky81682 (talk · contribs), I have added a proposal based upon your suggestions. If you believe it should be different or have more feedback, I would love to hear it. Iazyges (talk) 20:47, 9 September 2016 (UTC)


The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Importance assessment

How to assess the importance of articles under this WikiProject? Pratyush (talk) 09:48, 11 October 2016 (UTC)

Dispute on template: languages of Iran

User:LouisAragon has been removing the listing of Kholosi language from the template: languages of Iran. I placed it in the minority languages section and he has removed it on the grounds that it has so few speakers and is not recognized by the Iranian government. He is insisting on "consensus" even though a language is listed in a country's languages template not on the basis of government recognition, but on whether it's actually spoken there. Even if it were an immigrant language (which it is not in this case), it would still belong in the immigrant languages section.

I don't have much time for Wikipedia as of now, so I will not be able to keep up with this user's edits. I think third party input can be added here to prevent further edit warring, though I have expressed to the user, it's a very silly argument to be edit warring over.--NadirAli نادر علی (talk) 18:21, 20 October 2016 (UTC)

A newly discovered language? Interesting. Do you have any sources about it? The only thing I'm seeing so far is this abstract for a paper. Google scholar says it's now published in Studia Iranica [3] but I don't see it on the journal's website [4] (assuming that this is the right journal). – Uanfala (talk) 18:59, 20 October 2016 (UTC)
Noting that the template only lists a handful of major languages and most languages of Iran aren't included. – Uanfala (talk) 19:07, 20 October 2016 (UTC)
(edit conflict) Let's see:
  1. Entirely based on one source, as well as literally the only source that makes a mention of it (Google Scholar confirms this)
  2. No mention of it by Ethnologue
  3. No mention of it by Glottolog
  4. No hits in Google Books
  5. "Recently discovered" according to the source
  6. "Spoken solely in two villages" according to the source
  7. No historical/present-day recorded significance
Based on these reasons, it does not belong in the template. See also the entire matter talked about here as well for more information. Notice btw that said user was reverted not only by me, but also by another user.
In my opinion, this is nothing more than tendentious editing by said user, e.g., giving absolute undue significance to something that should not be given such, in every sense of the word. And then I haven't even mentioned the edit-warring and, unfortunately, the false accusations of "vandalism" against other users. His analogy with Hebrew, a language spoken there for over 2,000 years by a minority with major historical significance, and who have had official representation in the parliament ever since the first constitution (1906), was quite mind-boggling as well. - LouisAragon (talk) 19:20, 20 October 2016 (UTC)
There is at least now an actual academic publication on the topic (ToC link here: [5], apparently the published version of that conference presentation linked to earlier), so it does seem to be a genuine topic potentially worth developing as an article. But of course, that article should be written first, before it's worth discussing links to it from elsewhere. Fut.Perf. 20:27, 20 October 2016 (UTC)
I don't think it's at all a requirement that an article exists before it is included in an infobox template. There are numerous examples of redlinks in templates of this type. Part of the reason is that people would click the links, find that the article is undeveloped and have the option to do some research to flesh it out. 0x0077BE (talk · contrib) 21:06, 20 October 2016 (UTC)
I placed it in the minority languages section of the template for the obvious reason that it is indeed a minority language. I believe it should be there because it's a language spoken in Iran, regardless of number of speakers just as it has to be in category:Languages of Iran (which the disputing user did not remove), but I'll let others work this out. LouisAragon should still have opened a discussion on the appropriate WikiProject rather than edit warring and then posting bogus vandalism messages to others for keeping what he didn't want. The Kalash language is a language spoken by no more than 3000 people, yet it stays on template: Languages of Pakistan because that's where it's spoken.--NadirAli نادر علی (talk) 20:46, 20 October 2016 (UTC)
What are the actual policies for inclusion in an infobox template like this? I assume it's somewhat more strict than the policies for inclusion in a category, but less strict than inclusion in an article? I'm genuinely uncertain here, because it sorta sends a mixed message that somehow it crosses the threshold of notability for having an article about it, but doesn't meet the criteria for inclusion in an infobox template designed to navigate between languages spoken in Iran. I understand not including it if it's somehow controversial whether or not the language exists. 0x0077BE (talk · contrib) 21:06, 20 October 2016 (UTC)
  • I think it shouldn't be included. More than 80 languages are spoken in Iran (here is a list). The template will explode if we want to include all of them. We should only include the most significant languages. This language was discovered only recently. There's only one source about it, and its article is not more than two lines. It is not clear in what respects this language is important. I disagree about the Kalash example. Unlike this newly discovered language, the Kalash people are of significant importance. -- Kouhi (talk) 22:10, 20 October 2016 (UTC)
  • Kouhi, you should provide further discussion about why the Kalash are of more significant importance. Their language is also restricted to villages. I placed it in the minority languages which is where it is.--NadirAli نادر علی (talk) 22:44, 20 October 2016 (UTC)
@NadirAli: Kalash people are of significant importance, simply because they have been discussed by many many reliable sources. Please see WP:N. -- Kouhi (talk) 23:07, 20 October 2016 (UTC)

Discussion about bracketing around English diaphonemic transcriptions

Some members of this WikiProject may want to participate in a discussion I have started. I am proposing that the diaphonemic (cross-dialectal) transcriptions of English that are used across Wikipedia use different bracketing to distinguish them from phonemic transcriptions (i.e., transcriptions using symbols that accurately represent a particular dialect's vowel system).

Currently the two types of transcriptions use single slashes, like /ðɪs/. It seems that the easiest solution would be to use double slashes for diaphonemic transcriptions, like //ðɪs//.

If you have an opinion on what symbols should be used, or if you don't see the need for diaphonemic transcriptions to be visually distinguished from phonemic transcriptions, please post in the thread at Help talk:IPA for English § Marking transcriptions explicitly as diaphonemic. — Eru·tuon 02:11, 21 October 2016 (UTC)

Tlitliltzin vs tlilitzin

The Nahuatl word for Ipomoea tricolor, according to our article, meaning "black + reverential suffix". Can somebody tell which (if any) is the correct form? Thanks, ἀνυπόδητος (talk) 08:05, 23 October 2016 (UTC)

As far as I can see, the provided source states that it is Ipomoea violacea that was called tlilitzin. I have not been able to find the word in a Nahuatl dictionary. Tlilli means "black ink", tliltic means "(something) black". So it is not strictly correct that tlilitzin means "black+reverential"; if anything it means "ink+reverential". But it depends a little on the exact way it is attested, which I cannot find at present in any of the major dictionaries. It seems that the source of most citations of tlitliltzin (many of which are mangled) is Wasson. Wasson has circulated a bunch of erroneous Nahuatl etymologies, so it would be good to find his original source if there is any. ·maunus · snunɐɯ· 08:47, 23 October 2016 (UTC)
OK, Wasson's source is Pedro Ponce's "Breve Relación de los Dioses y Ritos de la Gentilidad". I will see if I can find it. Regardless Wasson's proposed etymology is not entirely correct, although the elements "black/ink" and "little" seem to be involved.·maunus · snunɐɯ· 08:57, 23 October 2016 (UTC)
In any case it seems that neither Ponce nor Wasson claims that tlitliltzin is the name of the plant - only of its hallucinogenic seeds.[6]·maunus · snunɐɯ· 08:58, 23 October 2016 (UTC)

Wu romanisation

Hi All, I was just wondering if we could establish a standard form of Wu romanisation throughout Wikipedia? It's a bit all over the place with some Hangzhou Romanisation here and Shanghai Romanisation there. Thoughts? Opacitatic (talk) 09:47, 23 October 2016 (UTC)

I believe Long-short (romanization) for Shanghainese and Suzhou, and Wenzhounese romanisation for Wenzhounese, may be valuable tools. If you are looking for help establishing a page, I'd love to help. Iazyges Consermonor Opus meum 02:30, 24 October 2016 (UTC)
The Wiktionary romanisation system is pretty solid and there are some good examples there too. Maybe use that, since it's already on Wiki? Opacitatic (talk) 06:25, 24 October 2016 (UTC)

Draft:South Tyrolean dialect

Hello language experts. Your advice is welcomed regarding whether Draft:South Tyrolean dialect has a future in mainspace. It cites no peer-reviewed scholarly sources, and it isn't clear how it fits in with German dialects and Southern Bavarian. Please comment at the draft's talk page. Thank you. --Worldbruce (talk) 00:54, 25 October 2016 (UTC)

Wikidata property proposal for Ethnologue language status

Please contribute to d:Wikidata:Property_proposal/Ethnologue_language_status. – Uanfala (talk) 10:27, 5 November 2016 (UTC)

Language vs. dialect in South Asia: can we choose a single authority to apply across the board?

There are a few Indo-Aryan varieties that spawn regular article-naming controversies that centre around the use of either "language" or "dialect" in the title. This results in occasional page moving back-and-forth and WP:RM discussions that attract people with strong WP:POVs, invite socking and result in long walls of text on talk pages, and the final decisions often have less to do with an assessment of sources and more with the particular configuration of outspoken editors with an emotional stake. There are recent examples at Talk:Saraiki dialect and Talk:Hindko dialect.

Can't this be avoided? I think this could be achieved if we adopt a rule that all article titles in this area should follow one given source. Barring Ethnologue, the most comprehensive such source that I know of is Masica's 1991 The Indo-Aryan Languages. It has a lengthy appendix (pp. 420–45) that lists varieties, giving each an explicit qualification as either "(former/emerging/current) literary language", "language" or "dialect". Of course, that would apply only to article titles. If there are controversies or subtleties, they will then be explained in the article text.

In order to adopt such a rule, a full WP:RfC would be needed, but I just want to probe the ground now. Does the idea make sense? – Uanfala (talk) 20:21, 28 October 2016 (UTC)

  • Oppose Not at all; we apply exceptions to a single language, Punjabi, where you had been WP:FORUMSHOPING on various dialect pages. Talk pages allow discussions, which are important as part of community consensus. Linguistic tagging should be allowed, such as Language, Dialect, Accent, Pidgin, Creole etc. If you want one source out of many which suits your preferred tagging (e.g. deleting "dialect"), then that is not WP:neutral point of view. ₯€₠€₯ — Preceding unsigned comment added by 39.50.84.196 (talk) 00:54, 29 October 2016 (UTC)
  • Comment - I am not sure that this is viable. WP:NPOV asks us to summarise all the published views in reliable sources. However, we should avoid using dialect in page title when the common usage in the literature doesn't call it a dialect. The fact that it is considered a dialect by linguists can be mentioned in the article. -- Kautilya3 (talk) 09:07, 5 November 2016 (UTC)
    • Well, Masica's book does try to "summarise all published views", but you're very definitely right that we should try to do that ourselves. Not that we are managing to. Looking at the past discussions on the talk pages of Saraiki and Hindko I see that we've consistently failed to do that. – Uanfala (talk) 13:39, 6 November 2016 (UTC)
      • I see that you actually are attempting to change things for the better here, Uanfala. Help me see how, for example at the Saraiki talk page, we haven't managed to summarize at least several published views? The problem seems to be largely that the reliable linguistic sources that must be summarized don't always make it easy and simple to say with certainty that some rather controversial tongues are languages or dialects. I would also like to address the following from Kautilya3:
...we should avoid using dialect in page title when the common usage in the literature doesn't call it a dialect.
That would be a great procedure as long as the "literature" in question is well-defined. Some editors like to use all sources reliable, whether or not linguistics-based. Other editors consistently cite the fact that in many cases, even the linguistics sources can't seem to agree which qualifier, language or dialect, is best-use. In still other cases, such as Saraiki dialect, that language seems to be in transition and on its way to becoming a "full-fledged" language; however, many very reliable sources have yet to catch up and come to grips with this. It's not easy to close a pertinent requested page move under these circumstances, so that is I think the dilemma that Uanfala brings here for possibly a better solution than our present policies and guidelines are equipped to cover. And I support that editor's attempt to better define how Wikipedia handles the often controversial language vs. dialect natural page name disambiguators.  Paine  u/c 13:45, 12 November 2016 (UTC)
Sorry, I guess my comment wasn't entirely clear. I do mean all reliable sources, not just linguistic sources, to be used as the basis for page titles. Whether something gets to be called a language or a dialect depends on a variety of sociological factors, which may have nothing to do with the scientific status of the thing. By trying to impose a scientific view on the page titles, we would be merely asking for trouble. Imagine retitling Urdu as Urdu register (Hindustani) or something like that. All hell will break loose. The world is not scientific. Neither should we try to be.
As for Saraiki dialect, the very first citation on the page points out that there is no consensus on the status of Saraiki. So, by putting "dialect" in the title, we are asking for trouble. I am recommending calling it just Saraiki and explain the diversity of the views in the article body. -- Kautilya3 (talk) 14:13, 12 November 2016 (UTC)
That is an interesting explanation, Kautilya3. It seems to be the trend and desire within the science of linguistics to raise the level of study and application. While they may not yet be up to the level of, say, medicine, we might want to strive to make encyclopedic articles better defined in much the same way as are the medical articles. To me that would mean just the opposite to what you say about using other-than linguistic sources to determine both article titles and content. We want to see our way forward, and a more scientific application may be the forward way to go. Now, the dialect (or language) of Saraiki was at least first a dialect. If Saraiki should now be considered a language, then there should be consensus on its status in reliable sources, especially linguistic sources. Failing that, then there should be consensus in Wikipedia policies and guidelines as to when titles like Saraiki dialect should be altered to Saraiki language. This does not seem to be the case. Next, look at Saraiki and we see that there does not seem to be a primary topic, so titling the dialect (language) article just "Saraiki" does not appear to be an option. We do need to hash these things out, though, and in the process build a better, improved encyclopedia.  Paine  u/c 14:37, 12 November 2016 (UTC)
Two somewhat orthogonal comments: 1. When it appears that there is disagreement in the literature about whether a given variety is a language or a dialect, this disagreement is usually only apparent and it's all down to the ambiguity of the words used (this is covered in Dialect, and there's a relevant brief discussion at Saraiki dialect#Status of language or dialect). 2. The distinction between "language" and "dialect", even in these words' narrower linguistic sense, isn't "scientific" in any way. I stand to be corrected if I'm wrong, but I think it's clear these terms aren't part of linguistics proper, but just elements of the metalinguistic framework. This is analogous to the metageographical notion of continents: labels like "Europe" and "Asia" are used all the time, but they don't have any scientific meaning and the distinction between the two has no geological, biogeographical, cultural, social or political meaning. – Uanfala (talk) 14:55, 12 November 2016 (UTC)
Uanfala is right - language and dialect are not clearly distinguished in linguistics, beyond the bon mot by Max Weinreich that a language is a dialect with an army and a navy. Well, nowadays, at least in the country where I work, a language is a dialect with an education budget, regardless of what other linguistic criteria, such as mutual intelligibility or lexicostatistics, you want to employ. Political and sociolinguistic factors contribute to this question at least as much as hard linguistic facts. This is by the way also the stance of the Ethnologue, which tries to adapt its classification to these factors where necessary. Many languages in Mexico are linguistically not much more than barely distinguishable dialects, but the speakers can't and don't want to see it that way, and presto, we have another language. And the opposite happens elsewhere, too. I therefore think that any attempt to make Wikipedia adhere to so-called hard linguistic facts is doomed to fail, much as I wish that we could. Landroving Linguist (talk) 13:27, 13 November 2016 (UTC)
So, are you then saying that Wikipedia should go with the "presto" and call a dialect a language because the speakers don't want to see (for example) Saraiki or Hindko as dialects and prefer to see them as languages? What we seem to be getting in the requested moves is a little of both: speakers seem to want their tongue to be considered a language, and others who perhaps live nearby and speak (in this case) the main language of Punjabi want to see Wikipedia keep Saraiki, Hindko and others as dialects. After reading what you and Uanfala have said, I still don't have an inkling how we would change things or even if a change is actually needed. And I still lack a clue as to whether the Saraiki article title should be naturally disambiguated with "dialect" or "language". There appear to be excellent arguments for both.  Paine  u/c 14:24, 13 November 2016 (UTC)
PS. Or maybe you are saying that Wikipedia should go with the criteria of "educational budget" or system? PS added by  Paine  u/c
I'm not sure what I am saying myself. Maybe just that determining the status of a language is complicated. I know nothing about the situations in South Asia, so I won't speak into these particular cases. The education budget is not a clear indicator, because there are languages of uncontested status which do not have any education going on, nor even an orthography. And an education budget is subject to policy, which may change, and both ways. Still, I'm pretty sure that most linguists (including the editors of the Ethnologue) have now given up on the idea that you can distinguish between language and dialect entirely on the basis of mutual intelligibility and lexicostatistics. More frequent, by the way, than dialect speakers demanding language status for their variety, is the situation of speakers of several non-mutually-intelligible varieties insisting that they all speak the same language, usually for reasons of ethnic identity. I know of at least two situations like this in Ethiopia. Other famous examples are the Kurdic languages, or even the Arabic linguistic landscape. That I am writing this here will guarantee me some hate-mail by those who see this very differently, usually members of these ethnic groups. Ethnologue has now resolved to accommodate these situations by calling the groups macro-languages, still listing the other varieties below these nodes, leaving it for the readers to decide whether these are dialects or languages. Landroving Linguist (talk) 06:16, 14 November 2016 (UTC)
I certainly and sincerely hope that you don't receive any hate mail just for discussing these issues. It doesn't really matter what language or dialect we're discussing, they're still only words. Yes, it does seem that some words have power, but only that power which we give them or allow them to have. Ethnologue is a special case because all they deal in are varieties of tongues that people speak and write. "Saraiki" and "Hindko" on Ethnologue undeniably refer to tongues, whereas on Wikipedia, those terms must be considered ambiguous and imprecise, since they may refer to more than one subject, and determining which subject might be a primary topic can be very complicated if not impossible. Thank you, Landroving Linguist, for your thoughts on this, and complicated though it may be, I think the science of linguistics has come a long way and will continue to nail things down as much as possible in coming years. Until then, Wikipedia will just have to continue to deal with the consequences and controversies, which just means that we'll have to continue to have these edifying talks!  Paine  u/c 17:01, 14 November 2016 (UTC)

Status of Sammarinese

Most sources list Sammarinese as a variety of the Romagnol dialect. Should it therefore be called a language (due to its being an official language of San Marino), a dialect of Italian, or a subdialect of Romagnol (which is itself technically a subdialect)? Iazyges Consermonor Opus meum 21:48, 20 November 2016 (UTC)

It may be worth noting that the article on Romagnol says that Sammarinese is one and the same with Romagnol. Iazyges Consermonor Opus meum 21:52, 20 November 2016 (UTC)

Survey/Answer

Sammarinese is the same as Romagnol

Sammarinese is its own language

Sammarinese is a subdialect of Romagnol

Sammarinese is a subdialect of Italian

Discussion

  • Well, just go for the term used in relevant up-to-date reliable sources? Noting that usage is likely to vary with the context: when talking about the similarities/differences with Romagnol, I'd imagine "dialect" would be more natural, whereas "language" would clearly be the appropriate designation to use when referring to its official status. I don't think there's a need to choose a single term to use across the board. – Uanfala (talk) 22:24, 20 November 2016 (UTC)

Volunteer note give due weight to every aspect as stated by rules. — Preceding unsigned comment added by AksheKumar (talkcontribs) 07:45, 4 December 2016 (UTC)

Meitei language → Manipuri language

There is a move discussion going on at Talk:Meitei language, if you're interested in commenting. Thanks. – ishwar  (speak) 23:40, 4 December 2016 (UTC)

Internet world language stats

Any objection to adding it as a source for demographic citations? AksheKumar (talk) 05:58, 4 December 2016 (UTC)

I guess the context you have in mind is something like this, and the source you're asking about is http://www.internetworldstats.com/languages.htm. Well, a random website that publishes internet usage statistics is not a reliable source for information about numbers of speakers of languages. Also, what proportion of a nation's population speaks a given language is probably a useful piece of information, but even if properly sourced, it doesn't belong in the first sentence of the article about that language. – Uanfala (talk) 13:08, 4 December 2016 (UTC)

@Uanfala: properly sourced?

That is, backed up by a relevant reliable source. – Uanfala (talk) 00:02, 13 December 2016 (UTC)

Saraiki dialect -> Saraiki language

There's a move discussion at Talk:Saraiki dialect#Requested move 23 December 2016 where you're welcome to participate. – Uanfala (talk) 02:48, 23 December 2016 (UTC)

AfC help

Hi. There's a draft on loanwords at AfC that it would sure be nice if someone with an interest in languages took a look at. Here's the link: Draft:Loanwords in Hawai'i. Thanks. Onel5969 TT me 21:30, 6 January 2017 (UTC)

Missing topics list

My list of missing topics about languages is updated - Skysmith (talk) 14:09, 8 January 2017 (UTC)

garbled sentence

(Copied from Talk:Turoyo language#garbled sentence)

Turoyo is not mutually intelligible with Western Neo-Aramaic, having been separated for over a thousand years is considerable, but to a limited degree.

Intro section, 2nd paragraph. Maybe from cut-and-paste, but who can tell what it's supposed to mean?

(Cross-posted to the Talk pages of WikiProjects Syria and Assyria.)--Thnidu (talk) 06:42, 23 January 2017 (UTC)

Conflict vs Dispute

This might already have been discussed, but could "dispute", rather than "conflict", be used to describe the Faroese language conflict and the Norwegian language conflict? These aren't conflicts per se; they're disputes. No one is fighting or getting killed over this as with the Gospel riots; rather, it is a situation where people with different linguistic philosophies disagree. Anyway, that's my two cents; feel free to disagree. I thought it would be good to have a discussion here before I make any edits. Inter&anthro (talk) 07:57, 5 February 2017 (UTC)

Notice to participants at this page about adminship

Many participants here create a lot of content, may have to evaluate whether or not a subject is notable, decide if content complies with BLP policy, and much more. Well, these are just some of the skills considered at Wikipedia:Requests for adminship.

So, please consider taking a look at and watchlisting this page:

You could be very helpful in evaluating potential candidates, and even finding out if you would be a suitable RfA candidate.

Many thanks and best wishes,

Anna Frodesiak (talk) 01:07, 10 February 2017 (UTC)

Commons:Photo challenge February 2017 is Multilingualism

FYI, take a look at commons:Commons:Photo challenge/2017 - February - Multilingualism if you have any files you'd like to reuse or upload, perhaps from your archive at home. --Alexmar983 (talk) 07:49, 10 February 2017 (UTC)

Using familycolor altaic to Korean and Japanese

Should we use this familycolor for Korean and Japanese? Someone wants to stop using it for the two languages I mentioned. (see User talk:Vindication) --117.53.77.84 (talk) 14:10, 12 February 2017 (UTC)

Among others, Vovin (2011) provides "the evidence that personal pronouns in Japanese and Korean that have been also cited as a 'proof' of the genetic relationship of these languages to other 'Altaic' languages have nothing to do with them except superficial chance resemblance." --Vindication (talk) 03:41, 13 February 2017 (UTC)

Copied from User talk:Vindication#Korean language

Korean is generally included as part of the Altaic hypothesis. See Altaic languages. --117.53.77.84 (talk) 14:25, 11 February 2017 (UTC)

Yes, but the Altaic hypothesis is a hypothesis, and a largely discredited one. You know that Wikipedia articles are not always a reliable source of information. --Vindication (talk) 15:24, 11 February 2017 (UTC)
There are other examples that use familycolor Altaic (e.g. Turkish language and Mongolian language). Also, we use the colour as an areal classification, not a language family. --117.53.77.84 (talk) 17:39, 11 February 2017 (UTC)
And if we shouldn't use familycolor Altaic just because the hypothesis is discredited, it shouldn't exist. --117.53.77.84 (talk) 17:44, 11 February 2017 (UTC)
Turkish and Mongolian belong to Turkic and Mongolic languages, which are/were widely accepted as Altaic, by the majority of the supporters of the Altaic hypothesis. Koreanic languages aren't. The page Koreanic languages itself contains the sentences "Among extant languages, Korean is considered by most linguists to be a language isolate and by some others as part of the widely rejected Altaic family or the Dravido-Korean languages. Some even suggest an Austronesian connection." in its lead section. --Vindication (talk) 01:32, 12 February 2017 (UTC)
Dravido-Korean is just a minor hypothesis, and it is even more widely discredited than the Altaic hypothesis. The Koreanic languages are generally included as part of the Altaic hypothesis along with the Turkic and the Mongolian languages. --117.53.77.84 (talk) 05:16, 12 February 2017 (UTC)
There are two problems: 1. Koreanic languages do not have the same status as Mongolic or Turkic languages in the Altaic hypothesis. 2. The Altaic hypothesis is not proven. But as you said, English Wikipedia seems to be using the colour "as an areal classification" to mark Mongolic and Turkic languages, which may enable us to ignore the second problem. The first problem still remains. Koreanic languages are not generally regarded as Altaic even by the supporters of the hypothesis. --Vindication (talk) 09:06, 12 February 2017 (UTC)
Georg et al. 1999 says 'The hypothesis of an Altaic language family, comprising the Turkic, Mongolic, Tungusic, Korean...' --117.53.77.84 (talk) 09:36, 12 February 2017 (UTC)
If you want to argue that Korean is not a part of the Altaic hypothesis, you should cite a reliable source which can prove your argument. --117.53.77.84 (talk) 09:40, 12 February 2017 (UTC)
For all that anything going back that far into the past can be proved, we could just as reasonably argue that Korean is the third branch of the Uralic family... which (I can't recall where, though) I *have* actually seen being suggested... 2Q (talk) 10:58, 12 February 2017 (UTC)
The paper you linked (Georg et al. 1999) concludes that "Despite the great amount of work which has been done to this effect (and most of the credit here surely must go to Doerfer 1963-75), this enterprise [Altaic hypothesis] still has far to go, especially as regards Korean and Japanese." (See p. 92) And the sentence you cited from its abstract says "The hypothesis of an Altaic language family, comprising the Turkic, Mongolic, Tungusic, Korean and, in most recent versions, Japanese languages continues to be a viable linguistic proposal, despite various published claims that it is no longer accepted." which means that claims supporting the Altaic hypothesis (and among them, some supporting the inclusion of Korean and Japanese) continue to be raised. It does not say the claims are valid or accepted. Openings typically say "there are these claims", and you should read the article through to see if a "but" follows. --Vindication (talk) 13:45, 12 February 2017 (UTC)

List of historical common names

I just created List of historical common names. Please help improve it. Thanks. Anna Frodesiak (talk) 00:49, 22 March 2017 (UTC)

Phonologies of Native American languages

A new user, Fdomanico51997 (talk · contribs), has been changing the phonological inventories of many articles about Native American languages without providing adequate sourcing for the changes. A couple directly contradicted reliable sources. Could someone help me take a look at these changes, as they all need to be checked against reliable sources? ·maunus · snunɐɯ· 19:59, 25 March 2017 (UTC)

Stress indicators

The symbols for indicating primary and secondary stress appear to be identical, a right-hand single quotation mark or something similar. Are they really the same? If they're different, then the real symbols should be inserted. If they are given correctly, then the difference or apparent similarity should be mentioned. — Preceding unsigned comment added by 189.142.209.180 (talk) 12:04, 1 April 2017 (UTC)

Is there a particular article you have in mind? The two symbols are identical in form (they're both a small vertical line), but they differ in position: [ˌfɒnɪˈtɪʃən], see Stress (linguistics). – Uanfala (talk) 12:14, 1 April 2017 (UTC)

One of your project's articles has been selected for improvement!

 

Hello,
Please note that Synchrony and diachrony, which is within this project's scope, has been selected as one of Today's articles for improvement. The article was scheduled to appear on Wikipedia's Community portal in the "Today's articles for improvement" section for one week, beginning today. Everyone is encouraged to collaborate to improve the article. Thanks, and happy editing!
Delivered by MusikBot talk 00:05, 3 April 2017 (UTC) on behalf of the TAFI team

Northwest Caucasian languages

I have started a discussion here (permlink), which requires third-party opinions. In short, User:Listofpeople claims that Adyghe language, Kabardian language, and Ubykh language are all dialects of a Circassian language, and I claim that those three are typologically distinct languages and that the Circassian languages are a subdivision of the Northwest Caucasian language family. I believe I have successfully demonstrated that the consensus among the linguistic community supports what I am saying, while Listofpeople relies mostly on the terms that are locally used in the region and cites non-linguistic sources. The discussion needs more input. Please come and join! :)

Vito Genovese 12:27, 15 April 2017 (UTC)

Outside opinions at Talk:Chineasy

Could use some feedback regarding a disagreement at Chineasy over the inclusion of some stuff. Some relevant edit background:

118.141.127.168 (talk) 01:01, 17 April 2017 (UTC)

RfC on the WP:ANDOR guideline

Hi, all. Opinions are needed on the following: Wikipedia talk:Manual of Style#RfC: Should the WP:ANDOR guideline be softened to begin with "Avoid unless" wording or similar?. A WP:Permalink for it is here. Flyer22 Reborn (talk) 22:51, 17 April 2017 (UTC)

Sanskrit Transliteration Corrections

Hello all. I'm looking for experts in Sanskrit who can verify the transliteration to English for the articles in Category:Asanas, the template {{asanas}}, and the article list of asanas. I also need to make sure the Sanskrit spelling (for example: [7]) is correct. Thanks. Gakiwa (talk) 18:54, 20 April 2017 (UTC)

Swahili Language

Hello to the members of this project. A rotating set of IPs is editing against policy at the Swahili language (edit | talk | history | links | watch | logs) infobox. A talk page thread about this has been started. If any of you could add the page to your watchlist or post at the talk page, it would be appreciated. Thanks for your time. MarnetteD|Talk 14:25, 6 May 2017 (UTC)

Update. Sections of the article are now being removed. If any members of this project could add their input on the talk page that would be appreciated. MarnetteD|Talk 16:37, 7 May 2017 (UTC)

Citation overkill proposal at WP:Citation overkill talk page

Opinions are needed on the following: Wikipedia talk:Citation overkill#Citations. A permalink for it is here. Flyer22 Reborn (talk) 06:44, 9 May 2017 (UTC)

Page moves

Please see here, e.g. Blackfoot language to Siksiká... Siksiká is not something English keyboards can type. --Moxy (talk) 03:53, 12 May 2017 (UTC)

I have begun a discussion at Talk:Siksiká#Page name, and mentioned that discussion at Talk:Niitsítapi and Talk:Siksika Nation. It is probably best to discuss the name of the language article in one place. Cnilep (talk) 03:34, 13 May 2017 (UTC)

Popular pages report

We – Community Tech – are happy to announce that the Popular pages bot is back up-and-running (after a one year hiatus)! You're receiving this message because your WikiProject or task force is signed up to receive the popular pages report. Every month, Community Tech bot will post at Wikipedia:WikiProject Languages/Archive 13/Popular pages with a list of the most-viewed pages over the previous month that are within the scope of WikiProject Languages.

We've made some enhancements to the original report. Here's what's new:

  • The pageview data includes both desktop and mobile data.
  • The report will include a link to the pageviews tool for each article, to dig deeper into any surprises or anomalies.
  • The report will include the total pageviews for the entire project (including redirects).

We're grateful to Mr.Z-man for his original Mr.Z-bot, and we wish his bot a happy robot retirement. Just as before, we hope the popular pages reports will aid you in understanding the reach of WikiProject Languages, and what articles may be deserving of more attention. If you have any questions or concerns please contact us at m:User talk:Community Tech bot.

Warm regards, the Community Tech Team 17:15, 17 May 2017 (UTC)

Baltic languages

There is currently a discussion taking place at Baltic languages, which might be interesting to the members of this project. – Sabbatino (talk) 16:58, 24 May 2017 (UTC)

WP:Citation overkill RfC

Opinions are needed on the following matter: Wikipedia talk:Citation overkill#Should this essay be changed to encourage more citations?. A WP:Permalink for it is here. Flyer22 Reborn (talk) 01:35, 9 June 2017 (UTC)

RfC

There is a discussion at Wikipedia talk:Manual of Style/Layout#Placement of expand language templates that may be of interest to those watching this page. Thanks. TimothyJosephWood 12:10, 18 June 2017 (UTC)

RfC regarding the WP:Lead guideline -- the first sentence

Opinions are needed on the following matter: Wikipedia talk:Manual of Style/Lead section#Request for comment on parenthetical information in first sentence. A WP:Permalink for it is here. Flyer22 Reborn (talk) 05:08, 2 July 2017 (UTC)

Rhaetian language

Hello. I'm not really sure that this is the right place to report a potentially inaccurate article. Still, I think someone should examine the recent changes that were made to Rhaetian language. I am not an expert in linguistics, but I have always associated linking ancient European languages to the Semitic family with fringe linguists. Yeowe (talk) 21:00, 15 July 2017 (UTC)

Should we move the Colloquialism article?

Opinions are needed on the following: Talk:Colloquialism#Recent move of article. A WP:Permalink for it is here. Flyer22 Reborn (talk) 06:30, 17 July 2017 (UTC)

Request for Comment - Introduction to Whataboutism

There is an ongoing Request for Comment about the introduction to the article Whataboutism.

You may comment if you wish, at Talk:Whataboutism#RfC:_Introduction_to_the_subject. Sagecandor (talk) 17:24, 21 July 2017 (UTC)

This article is a mess.

Hello. I apologize in advance to anyone I offend in writing this comment. The article that the following link refers to is a mess: https://en.wikipedia.org/wiki/Matis_language. I'd like to request some guidance as to how I can improve it. Thank you for your attention. AWearerOfScarves (talk) 20:57, 13 April 2017 (UTC)

This could do with some copyediting, but I think it's in a much better overall shape than most language articles (in the sense that it does have content and that it is neutral). I volunteer to tidy up the morphology section at some point towards the start of next month (around the time when the new interlinear glossing template will be available). Leaving the rest for others to do. – Uanfala (talk) 21:04, 13 April 2017 (UTC)
@AWearerOfScarves: I think it's great that you are willing to help clean up the article. I notice what appear to be comments about editing in the text; I assume those were added accidentally and can be removed without controversy. Your edits so far look good. For advice on formatting, you might check the Wikipedia:Manual of Style. Thanks again, Cnilep (talk) 00:50, 14 April 2017 (UTC)

Thanks! AWearerOfScarves (talk) 10:42, 14 April 2017 (UTC)

Hi again. Just found something of some concern. It appears the majority of the contents of the article have been copied from this website: https://topics.revolvy.com/topic/Matis%20language AWearerOfScarves (talk) 16:05, 11 July 2017 (UTC)

Revolvy is a Wikipedia mirror. They copied their content from a version of the Wikipedia article, not the other way around.--William Thweatt TalkContribs 21:31, 11 July 2017 (UTC)

I see. Thank you. AWearerOfScarves (talk) 12:22, 22 July 2017 (UTC)

Seeking feedback on Tswana vs Setswana

Your feedback is requested at Talk:List of Setswana medical terms#Setstwana or Tswana. Thanks, Mathglot (talk) 04:07, 11 September 2017 (UTC)

RfC: Should the WP:TALK guideline discourage interleaving?

Opinions are needed on the following matter: Wikipedia talk:Talk page guidelines#RfC: Should the guideline discourage interleaving? #2. A permalink for it is here. Flyer22 Reborn (talk) 19:25, 19 September 2017 (UTC)

VPT query: Mapping Hiragana and Katakana?

Please take a look at WP:VPT#Mapping Hiragana and Katakana? and comment there if you might have an understanding about whether this is a good/bad idea. --Izno (talk) 14:36, 20 September 2017 (UTC)

Adding a video to an article

I was thinking of adding a video/sound clip of a speaker(s) to a language article, specifically Southwestern Mandarin, and I was wondering if there's a specific process to it. The Verified Cactus 100% 00:36, 6 November 2017 (UTC)

Have you read Wikipedia:Creation and usage of media files? There is a lot of info and plenty of links to more within that page. There is also a more sparse Wikipedia:Videos, although, IMHO, an audio clip is better for this purpose.--William Thweatt TalkContribs 00:53, 6 November 2017 (UTC)
Great, will do! The Verified Cactus 100% 19:28, 6 November 2017 (UTC)

Nuclear Austronesian

Is there any reason why there isn't an article for Nuclear Austronesian? The Verified Cactus 100% 21:32, 8 November 2017 (UTC)

The Austronesian situation is very complex. Most sub-groupings outside of Taiwan exist as vast dialect continuums. The relationship of the dialects or local languages to each other is fairly well established, but how the higher-level subgroups relate to one another is not. "Nuclear Austronesian" is not a term that is used in the traditional classification. It was proposed by Malcolm Ross (2009), who posited four top-level divisions. His "Nuclear Austronesian" includes every Austronesian language except Puyuma, Tsou and Rukai, each of which would then form a single-member first-order subgroup. "Nuclear Austronesian", therefore, only has meaning within the context of Ross' proposal, but his proposal is relatively new and not accepted by many Austronesian specialists. See e.g. the papers of Laurent Sagart and others for arguments against it. Also, Ross' proposal only defines the four highest-level subgroups; it doesn't break down relationships within each subgroup, meaning all that we know about "Nuclear Austronesian" is that it is a term Ross uses in his 2009 proposed classification to include all the Austronesian languages except those three. It doesn't warrant an encyclopedia article; it can be (and is) handled sufficiently in the Austronesian languages article.--William Thweatt TalkContribs 01:47, 9 November 2017 (UTC)
I see, thanks! The Verified Cactus 100% 00:52, 10 November 2017 (UTC)

Input needed on Quanzhang/Hokkien/Southern Min/Minnan Proper mess

  FYI
 – Pointer to relevant discussion elsewhere.

Please see Talk:Hokkien#Quanzhang confusion.
 — SMcCandlish ¢ >ʌⱷ҅ʌ<  10:18, 27 November 2017 (UTC)