Template-protected edit request on 14 October 2023

Diff with my sandbox implementation: diff

This does three major things that address real, perennial limitations of the {{zh}} template:

  • It adds a new |out= parameter, allowing one to select one of the terms to place before the rest, which are then put in brackets, an extremely common presentation format when writing Chinese text inline in paragraphs and tables.
  • It now uses double quotes for the |tr= parameter, for "full translations" instead of glosses, as is prescribed in MOS:FOREIGN and MOS:ZH.
  • It enables the use of multiple, correctly quoted glosses (literal translations, |l=), delineated by commas.

Examples:

{{Lang-zh/sandbox|c=我|p=wǒ|l=I, me|j=ngo<sup>5</sup>|out=j}}
Jyutping: ngo5 (Chinese: 我; pinyin: wǒ; lit. 'I', 'me')
Note: |out=j puts the value for |j= outside the brackets.

{{Lang-zh/sandbox|s=中华|t=中華|p=Zhōnghuá|l=China|out=p|labels=no}}
Zhōnghuá (中华; 中華; 'China')

{{Lang-zh/sandbox|s=她刚刚离开了|l=she's just left|out=l}}
lit. 'she's just left' (Chinese: 她刚刚离开了)

{{Lang-zh/sandbox|s=电脑|t=電腦|tr=computer|p=diànnǎo|l=electric brain|out=tr|labels=no}}
"computer" (电脑; 電腦; diànnǎo; 'electric brain')
Note: double quotes are now used for |tr=.
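To make the intended behaviour of |out=, |tr= and comma-delimited |l= concrete, here is a minimal standalone sketch in plain Lua. It is not the sandbox code: the helper names (quote_glosses, render, format_terms) and the hard-coded field order are assumptions made for illustration, and labels are omitted, as with |labels=no.

```lua
-- Hypothetical helpers for illustration; none of these names come from Module:Lang-zh.

-- Split a comma-delimited |l= value into individually single-quoted glosses.
local function quote_glosses(value)
	local quoted = {}
	for gloss in value:gmatch("[^,]+") do
		-- Trim surrounding whitespace before quoting.
		gloss = gloss:match("^%s*(.-)%s*$")
		if gloss ~= "" then
			quoted[#quoted + 1] = "'" .. gloss .. "'"
		end
	end
	return table.concat(quoted, ", ")
end

-- Render one field: |l= glosses get single quotes, |tr= gets double quotes.
local function render(key, value)
	if key == "l" then
		return quote_glosses(value)
	elseif key == "tr" then
		return '"' .. value .. '"'
	end
	return value
end

-- Put the field named by `out` first; the remaining fields follow in brackets,
-- in the given order, separated by semicolons.
local function format_terms(order, args, out)
	local inside = {}
	for _, key in ipairs(order) do
		if args[key] and key ~= out then
			inside[#inside + 1] = render(key, args[key])
		end
	end
	local outside = render(out, args[out])
	if #inside == 0 then
		return outside
	end
	return outside .. " (" .. table.concat(inside, "; ") .. ")"
end

-- Assumed field order for this sketch only.
local order = { "s", "t", "tr", "p", "l" }
local args = { s = "电脑", t = "電腦", tr = "computer", p = "diànnǎo", l = "electric brain" }
print(format_terms(order, args, "tr"))
-- prints: "computer" (电脑; 電腦; diànnǎo; 'electric brain')
```

With the same helpers, quote_glosses("I, me") gives 'I', 'me', which is the multiple-gloss format shown in the first example.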

I was apprehensive about patching the template, but I'm pretty sure I haven't broken anything. I'm sure someone more experienced than I am will take a look-see before it gets merged. :) Remsense 17:20, 14 October 2023 (UTC)

Remsense, would you be so kind as to add testcases for this change at Template:Lang-zh/testcases? SWinxy (talk) 20:36, 14 October 2023 (UTC)
Certainly! I'll try to cover all the edge cases I can think of. Remsense 20:37, 14 October 2023 (UTC)
I've gone ahead and added some! If you'd like me to be more thorough, or there are any test cases you need to see, let me know. (Oh, and I found a bug in the process.) Remsense 21:17, 14 October 2023 (UTC)
I fixed the (very obvious!) bug, and added the ability to put both simplified and traditional outside the parentheses. Hopefully everything is good now! Remsense 00:37, 18 October 2023 (UTC)
I have had a look at the testcases, and it's unclear from them what the use case for this is, what articles it would be used in. I mean when you have something like
"computer" (电脑; 電腦; diànnǎo; 'electric brain')
It would be normal to use the template for everything inside the bracket, and normal text for "computer". The ones where that wouldn't work such as:
pinyin: Cài Yīngwén (Chinese: 蔡英文)
I can't see a use for. You would never put pinyin like that in body text, with the Chinese characters in brackets. Generally in an article both the Chinese and Romanisations should go in brackets so as not to upset the flow of the English text, with the Chinese first. When the Chinese is being discussed, such as in an article on the language, this template isn't really suited for it.--2A04:4A43:90AF:FAB6:29E3:5D64:C243:2A6D (talk) 02:08, 15 October 2023 (UTC)
You are able to put any field outside the brackets: gloss, characters, or romanization. There are situations where any of the three are appropriate, usually depending on whether the orthography, semantics, or phonetics are the focus of the prose. I've worked with all of them editing Chinese-related articles. The Tsai Ing-wen example was copied from above; I wouldn't actually write a personal name in such a way in an article. The cases were meant to demonstrate that all the features are working properly.
Here is a tweaked excerpt from Chinese characters where a pinyin-first example is the best-flowing, in my opinion.
The barrier between pronunciation and meaning is never total, however: in the Chinese system, phonetic characters may be deliberately chosen so as to create certain connotations. This regularly happens for corporate brand names: for example, 'Coca-Cola' is translated phonetically as Kěkǒu Kělè (可口可乐; 可口可樂), with the characters selected so as to possess an additional meaning of 'delicious and enjoyable'. A more literal translation would be 'the mouth can be happy', though the phrase is technically grammatically sound.
Also, I think it's worth putting non-diacritical pinyin in |lang= tags when you can, because a screenreader will still pronounce it better than an English voice attempting to read it aloud would, e.g. {{zh|labels=no|c=蔡英文|p=Cai Yingwen|out=p}} is better for screenreaders than just Cai Yingwen ({{zh|labels=no|c=蔡英文}}).
Remsense 02:13, 15 October 2023 (UTC)
  Done * Pppery * it has begun... 00:01, 27 October 2023 (UTC)
Pppery, thank you so much! Remsense 00:13, 27 October 2023 (UTC)
Pppery, actually, I made several revisions to the module after the initial one, including some fixes. Could you merge the newest revision of the sandbox? Remsense 00:30, 27 October 2023 (UTC)
  Done Those too. * Pppery * it has begun... 00:44, 27 October 2023 (UTC)

@Pppery and Remsense: Was it after these edits that the lead of Prunus kansuensis started to look so bold? 77.223.109.164 (talk) 17:09, 19 December 2023 (UTC)

Yes. I thought I tested adequately for this. It's not a bug that should've been introduced, but also I think there's never a real reason to put bold text inside this template, so it's good to identify when it's happening at least. Remsense 20:48, 19 December 2023 (UTC)

Template-protected edit request

Allow specifying t variant: zh-Hant-HK vs zh-Hant-TW. See this image for an example of vs . NM 02:45, 20 November 2023 (UTC)

I will write a patch for this ASAP. Here's the question: how should it work? I am thinking of just adding parameters |tw= and |th=. However, as the module presently works, if only one character field is specified, it just gets tagged as zh. It should be trivial to tweak the code so that the more specific language tag is used when specifying only |t= or |s=, et al. Remsense 02:54, 20 November 2023 (UTC)
I don’t think there’s a need to modify the logic for |t= and |s=. Sometimes the region is so irrelevant that you only want to specify the script (for example, in a page that talks about the history of the writing system).
We should also avoid ambiguous abbreviations and just go with a slightly longer format like t_hk and s_sg.
NM 05:08, 26 November 2023 (UTC)
Fair. It's really difficult, re: abbreviations, because after the 10th time entering it in an article, you might wish concision had been favored over disambiguation, but yeah. I still haven't started on this, but I'll keep this in mind when I get to it shortly. Remsense 05:26, 26 November 2023 (UTC)
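As a rough illustration of the longer parameter names discussed in this thread, here is a minimal sketch of a lookup from parameter name to language tag. Only zh-Hant-HK and zh-Hant-TW are named in the request; the parameter t_tw, the tag zh-Hans-SG, the table name variant_tags and the function lang_tag are assumptions for the example. It leaves plain |t= and |s= script-only, as suggested above, and does not model the module's existing single-field behaviour.

```lua
-- Hypothetical mapping; parameter names follow the "t_hk" / "s_sg" style
-- proposed in the thread, and are not taken from Module:Lang-zh.
local variant_tags = {
	t = "zh-Hant",          -- script only, region deliberately unspecified
	s = "zh-Hans",          -- script only, region deliberately unspecified
	t_hk = "zh-Hant-HK",    -- named in the request
	t_tw = "zh-Hant-TW",    -- tag named in the request; parameter name assumed
	s_sg = "zh-Hans-SG",    -- assumed tag for the "s_sg" parameter
}

-- Return the language tag for a parameter, falling back to generic "zh".
local function lang_tag(param)
	return variant_tags[param] or "zh"
end

print(lang_tag("t_hk")) -- prints: zh-Hant-HK
print(lang_tag("t"))    -- prints: zh-Hant
print(lang_tag("c"))    -- prints: zh
```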

Template-protected edit request on 5 February 2024

This may be politically inclined, but I am highly unsure if Tongyong Pinyin should be our generic zh-Latn, as the mainland Pinyin wins by a very large margin in users.

I propose the following changes:

--- Module:Lang-zh
+++ Module:Lang-zh
@@ -54,8 +54,8 @@ local ISOlang = {
 	["c"] = "zh",
 	["t"] = "zh-Hant",
 	["s"] = "zh-Hans",
-	["p"] = "zh-Latn-pinyin",
-	["tp"] = "zh-Latn",
+	["p"] = "zh-Latn",
+	["tp"] = "zh-Latn-tongyong",
 	["w"] = "zh-Latn-wadegile",
 	["j"] = "yue-Latn-jyutping",
 	["cy"] = "yue-Latn",

Alternatively, if Pinyin is not a suitable generic zh-Latn, we should still add variant tags for Tongyong Pinyin:

--- Module:Lang-zh
+++ Module:Lang-zh
@@ -55,7 +55,7 @@ local ISOlang = {
 	["t"] = "zh-Hant",
 	["s"] = "zh-Hans",
 	["p"] = "zh-Latn-pinyin",
-	["tp"] = "zh-Latn",
+	["tp"] = "zh-Latn-tongyong",
 	["w"] = "zh-Latn-wadegile",
 	["j"] = "yue-Latn-jyutping",
 	["cy"] = "yue-Latn",

NasssaNsertalk 11:01, 5 February 2024 (UTC)

This seems sensible to me – hanyu pinyin is more widely used than tongyong pinyin, and indeed MOS:PINYIN mentions that hanyu pinyin is the default romanization we use. What would the effect of this change be from the reader's perspective? Mx. Granger (talk · contribs) 15:30, 10 February 2024 (UTC)
Like the rest of the language metadata, it adds specificity for screenreaders, as well as other possible presentations of articles. Remsense 23:21, 10 February 2024 (UTC)
  Agree – I double-checked, and IANA added Tongyong as a valid language subtag variant in 2020. Remsense 23:20, 10 February 2024 (UTC)
  Done * Pppery * it has begun... 04:53, 11 February 2024 (UTC)
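For anyone wondering what the metadata change amounts to, here is a minimal sketch, assuming the first proposal above was the one applied and assuming the romanised text ends up inside an element carrying a lang attribute. The function name tag_text and the span markup are illustrative assumptions, not the module's actual output code; only the tag values come from the diff.

```lua
-- Tag values as in the first proposed diff; the rest is hypothetical.
local ISOlang = {
	p = "zh-Latn",
	tp = "zh-Latn-tongyong",
	w = "zh-Latn-wadegile",
}

-- Wrap text in a span whose lang attribute is looked up from the parameter name.
local function tag_text(param, text)
	local tag = ISOlang[param] or "zh" -- assumed fallback for unknown parameters
	return '<span lang="' .. tag .. '">' .. text .. "</span>"
end

print(tag_text("p", "Zhōnghuá"))
-- prints: <span lang="zh-Latn">Zhōnghuá</span>
```

Under the old table the same call would have carried zh-Latn-pinyin, and |tp= text would have carried the bare zh-Latn tag.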

Template-protected edit request on 5 April 2024

I would like to enable the option "first=poj" analogously to "first=j". The "first=j" option allows Cantonese romanisations to be given before Mandarin romanisations, in articles where Cantonese is more relevant. The proposed "first=poj" option would allow Hokkien romanisation (POJ) to be given first, in articles where Hokkien is more relevant, e.g. for Bukit Ho Swee, Hong-Gah Museum, Tamsui District.

I believe this could be achieved by adding the following:

From line 114, after:

	local j1 = false -- whether Cantonese Romanisations go first

insert:

	local poj1 = false -- whether Hokkien Romanisations go first

From line 121, after:

			if (testChar == "j") then
				j1 = true
			 end

insert:

			if (testChar == "poj") then
				poj1 = true
			end

(The variable is named "testChar", but it is defined by the Lua pattern "%a+", which matches not only a single character but also longer strings.)

(On a separate note, there seems to be a superfluous space before "end" on lines 120 and 123.)
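The point about "%a+" can be checked in plain Lua. This is a minimal sketch using the standard string.gmatch; whether the module uses string.gmatch or a MediaWiki-specific equivalent is not shown in the excerpt, so treat the helper below (words, a hypothetical name) as an approximation of how the "first" value is scanned.

```lua
-- Collect every run of letters matched by "%a+" in the given value.
local function words(value)
	local found = {}
	for word in string.gmatch(value, "%a+") do
		found[#found + 1] = word
	end
	return found
end

print(table.concat(words("j"), "|"))      -- prints: j
print(table.concat(words("poj"), "|"))    -- prints: poj
print(table.concat(words("t, poj"), "|")) -- prints: t|poj
```

Because "%a+" is greedy over consecutive letters, "poj" arrives as one three-letter word rather than as "p", "o" and "j", so a comparison against "poj" can succeed.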

From line 137, after:

	if (j1) then
		orderlist[4] = "j"
		orderlist[5] = "cy"
		orderlist[6] = "sl"
		orderlist[7] = "p"
		orderlist[8] = "tp"
		orderlist[9] = "w"
	end

insert:

	if (poj1) then
		orderlist[4] = "poj"
		orderlist[5] = "p"
		orderlist[6] = "tp"
		orderlist[7] = "w"
		orderlist[8] = "j"
		orderlist[9] = "cy"
		orderlist[10] = "sl"
	end

This puts POJ before the Mandarin and Cantonese romanisations. Freelance Intellectual (talk) 08:49, 5 April 2024 (UTC)
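The reordering can be tried outside the module with a small standalone sketch. The default order below is an assumption: positions 4 to 10 are inferred from the indices that the two blocks reassign, and the first three entries are a guess. The function names build_order and place_from are hypothetical.

```lua
-- Overwrite consecutive slots of `list`, starting at index `start`.
local function place_from(list, start, keys)
	for offset, key in ipairs(keys) do
		list[start + offset - 1] = key
	end
end

-- Build the display order for a given "first" value (nil for the default).
local function build_order(first)
	-- Assumed default order; not copied from Module:Lang-zh.
	local orderlist = { "c", "s", "t", "p", "tp", "w", "j", "cy", "sl", "poj" }
	local j1, poj1 = false, false
	for word in string.gmatch(first or "", "%a+") do
		if word == "j" then j1 = true end
		if word == "poj" then poj1 = true end
	end
	if j1 then
		-- Cantonese romanisations before Mandarin ones.
		place_from(orderlist, 4, { "j", "cy", "sl", "p", "tp", "w" })
	end
	if poj1 then
		-- POJ before both Mandarin and Cantonese romanisations.
		place_from(orderlist, 4, { "poj", "p", "tp", "w", "j", "cy", "sl" })
	end
	return orderlist
end

print(table.concat(build_order("poj"), " ")) -- prints: c s t poj p tp w j cy sl
print(table.concat(build_order("j"), " "))   -- prints: c s t j cy sl p tp w poj
```

In this sketch, as in the proposed patch, the poj block runs after the j block, so if both flags were set the POJ-first order would win.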

  Done * Pppery * it has begun... 02:53, 15 April 2024 (UTC)