Template talk:Lang/Archive 5

This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Proper way to handle two versions of non-English text?

Many of the old templates got broken because they used italics in cases like this:

{{lang-lt|Rusijos lietuvių seimas Petrograde'' or ''Visos Rusijos lietuvių seimas}}

What's the best way to fix it? I think Template:Lang/doc should be updated to include this scenario as well. Thanks. Renata (talk) 17:38, 31 December 2017 (UTC)

We should not insert the English language (or any other, for that matter) into a span of text that the {{lang-lt}} template is defining as Lithuanian because the English-language conjunction, 'or', is not Lithuanian. So, the correct way to write this is with two templates:

{{lang-lt|Rusijos lietuvių seimas Petrograde}} or {{lang|lt|Visos Rusijos lietuvių seimas}} → Lithuanian: Rusijos lietuvių seimas Petrograde or Visos Rusijos lietuvių seimas

I agree that the documentation could be / should be improved (in this case Template:Lang-x/doc). Please do. I'll add this condition to the error message help text.

—Trappist the monk (talk) 18:02, 31 December 2017 (UTC)

This fix also applies to two strings separated by a comma or slash, but where one is in a Latin script and one is in a non-Latin script. – Jonesey95 (talk) 19:22, 31 December 2017 (UTC)

Here are regular expressions and their replacements that are working for me in AutoEd:

  //Fix lang template with non-Latn script followed by latin script, separated by comma and a space
  str = str.replace(/(\{\{\s*lang[\|\-])([a-z]+)\s*\|([^,\}]+), \'\'([^\']+)\'\'(\}\})/gi, '$1$2\|$3\$5, \{\{lang\|$2\-latn\|$4$5');

  //Fix lang template with two words separated by "or" in italic markup
  str = str.replace(/(\{\{\s*lang[\|\-])([a-z]+)\s*\|([^\'\}]+) \'\'or\'\' ([^\}]+)(\}\})/gi, '$1$2\|$3$5 or \{\{lang\|$2\|$4$5');

  //Fix lang-xx using italics when it should use regular non-Latin script (bail out if you find a comma or slash)
  str = str.replace(/(\{\{\s*lang\-[a-z]+\s*\|)\s*\'\'([^\},\/]+)\'\'\s*(\}\})/gi, '$1$2$3');

I still have to inspect each potential edit manually, but they usually work. – Jonesey95 (talk) 02:33, 3 January 2018 (UTC)
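
For the form shown at the top of this section, where the italic markup closes before the conjunction and reopens after it, the same two-template fix can be sketched as a standalone JavaScript function. This is only an illustration: `splitLangOr` is a hypothetical name and is not part of AutoEd, and, as with the expressions above, each result would still need manual inspection.

```javascript
// Hypothetical helper (not part of AutoEd): rewrite
//   {{lang-xx|A'' or ''B}}
// as
//   {{lang-xx|A}} or {{lang|xx|B}}
// so that the English conjunction sits outside the language markup.
function splitLangOr(wikitext) {
  const pattern = /\{\{\s*lang-([a-z]+)\s*\|([^'{}|]+)'' or ''([^'{}|]+)\}\}/gi;
  return wikitext.replace(
    pattern,
    (match, code, first, second) =>
      "{{lang-" + code + "|" + first + "}} or {{lang|" + code + "|" + second + "}}"
  );
}

console.log(
  splitLangOr("{{lang-lt|Rusijos lietuvių seimas Petrograde'' or ''Visos Rusijos lietuvių seimas}}")
);
```

Text that does not match the pattern is returned unchanged, so the function is safe to run over a whole page before reviewing the diff.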

Who thought this was broken?

Obviously you just did this and are going to defend it for a while, but add me to the voice of those who want the automatic italicization killed. It helps nothing and means, e.g., that Latinized names can't be wrapped in {{lang}} tags in the running text without looking like they're scientific. I can just remove the template (done) but that shouldn't be necessary. — LlywelynII 13:58, 1 January 2018 (UTC)

The changed behaviour of {{Lang}} presents a dilemma when attempting to fix it. One could, as User:LlywelynII suggests, remove it, or using the same effort, fix it by adding |italic=no. This dilemma could have been, and still can be, avoided by having a new name for the template's new functionality, restore {{Lang}} to its previous code, and advise users of the new version. -- Michael Bednarek (talk) 00:24, 2 January 2018 (UTC)

Examples from actual articles are always helpful in discussions like this. Can you please provide some? Thanks. – Jonesey95 (talk) 02:18, 2 January 2018 (UTC)

As mentioned before, Bist du bei mir is an obvious one because it's mentioned at Wikipedia:Naming conventions (music). Further examples are several hundred hymns (start at Ach Gott, vom Himmel sieh darein and wander through its navboxes and categories) and Lieder. All those ought to be upright per MOS:MINORWORK. -- Michael Bednarek (talk) 03:59, 2 January 2018 (UTC)

Adding to this: Latin words that became part of English but would still be nice to have marked as lang:la. Magnificat, Requiem, Nunc dimittis, other Latin phrases and hymns. Classical music even had a discussion about not italicising them. (Not that it happened everywhere.)

I support the proposal to restore tl:lang to the functionality it had, and make a new one with a different functionality if so desired. Yesterday we had the first article on the Main page where I removed the template, to not look silly. --Gerda Arendt (talk) 10:30, 2 January 2018 (UTC)

I agree with LlywelynII. This is astonishingly ill-advised. Please change it back to the way it was until this is sorted out.

No matter how well-reasoned the proposed change is (and no doubt it is well-reasoned), the way to go about it is not to break thousands of articles and annoy thousands of people. Instead, you create a template that is backward compatible and then send in bots to silently change things to the new way you are advocating.

It is hard to understand how you can create a template throwing up a specific error message to the effect of "It looks like you want to see italic text. Well, well, we can't have that unless you jump through these hoops first" instead of just having the template render the damned italic text that was obviously requested by the user. [This doesn't even touch upon the new regime of forcing italic text on people who do not want to use it for some reason] It is also hard to understand why you would use a parameter forcing people to type out the string "italic" instead of just using wiki-markup. Jesus, even using explicit HTML would save you keystrokes over this! If for some reason you must use the parameter name "italic", don't bother users with it but just use bots to change user-friendly input to whatever weird format you happen to prefer. --dab (𒁳) 09:52, 2 January 2018 (UTC)

I agree as well, this change seems ill advised. —TheDJ (talk • contribs) 10:17, 2 January 2018 (UTC)

Before you set all of your pitchforks on fire, please scroll up until you find the section containing The lang templates were "working" only in the sense that a Rube Goldberg machine "works" and then read down the page from there. There is a larger context beyond these hundreds of hymn articles and a few thousand articles that are currently showing errors. – Jonesey95 (talk) 12:44, 2 January 2018 (UTC)

I support the proposal to restore tl:lang to the functionality it had, per User:Gerda Arendt and User:Dbachmann. Backtracking should never be allowed. Please bring this issue to the attention of our administration. There's a reason why most of our high-traffic templates are offered in read-only mode. Poeticbent talk 13:12, 2 January 2018 (UTC)

The previous functionality, about which people had complained for years, was a mess. It was not compliant with MOS, it silently emitted erroneous formatting, and it required goofy workarounds like adding italic markup to suppress italic markup. We are in a transition period. In order to understand the complexity of these lang templates and the state of this transition, you need to read, or at least scan, this very long talk page from close to the top, starting on 30 October 2017. – Jonesey95 (talk) 13:56, 2 January 2018 (UTC)

Who thought this was broken? That would be me. The general rule stated at MOS:FOREIGNITALIC is that non-English text written with the Latin script is to be italicized. There are exceptions to the general rule: proper names, titles of minor works, loanwords. In the not-too-distant past, {{lang}} was incapable of discriminating between Latin-script writings and writings using other scripts so a decision was taken to have {{lang}} render without styling. This decision required editors writing general-rule, non-English Latin-scripted text, to always include wiki markup while editors writing exception-to-the-rule Latin-scripted non-English text were relieved of that requirement. I think that is backwards. I think that the template should render according to the general-rule and that the exceptions to the general-rule should require the extra effort.

For the most part, {{lang}} is incapable, has always been incapable, of determining the context in which it is used. Templates cannot see what is outside their bounding {{ and }} (it is said that out there, beyond those horizons, dragons do dwell).

The argument for applying {{lang}} to loanwords seems problematic to me. Certainly, if we are talking about 'magnificat', 'requiem' or 'nunc dimittis' as Latin-language words then {{lang|la|magnificat}}, {{lang|la|requiem}}, and {{lang|la|nunc dimittis}} are appropriate as is the automatic italics. When using the words as loanwords for their grammatical meaning, the words themselves should not be marked up as Latin because, by virtue of their loanword status, they are for all intents and purposes, English. When used as titles, the application of capitalization, italics, language markup should be done on a case-by-case basis.

Editor Dbachmann would have me create a template that is backward compatible and then send in bots to silently change things to the new way [I am] advocating. Really? By subterfuge and sneakiness? I don't think so. That's not how I do things – as is evidenced by all of the words I have expended on this talk page since I started on this project two months ago. Nothing I have done here has been done silently nor will it be. The obnoxious red error messages have been present since this edit on the fifth day (4 November); from this template since this edit on 18 November.

Editor Poeticbent wrote: Backtracking should never be allowed. I do not know what 'backtracking' means in this context.

Rather than rolling back to the golden days, if they ever truly existed, perhaps a possible solution for the 'music problem' might be the creation of wrapper templates around {{lang}}. I don't know what one might call a template that, for example, rendered minor titles, maybe {{mt-lang}}. Such a template would then require only two positional parameters (just like the basic {{lang}} template):

{{mt-lang|??|<minor-title>}} – where ?? is the IETF language tag and <minor-title> is the minor title

which template might internally look like this:

"{{lang|{{{1}}}|{{{2}}}|italic=unset}}" – where |italic=unset disables styling normally provided by {{lang}}

and renders:

"minor-title"

A complementary template for major titles would not use double quotes and would use |italic=yes so that regardless of script, the title would be italicized. CJK titles, an exception to the major title exception to the non-Latin script general rule, would need to be handled individually or by another wrapper template with |italic=no (to prevent external wiki markup from violating the CJK-no-italics rule).

Maybe there are other solutions that are more appropriate than the simple expedient of improvement-be-damned-reversion.

—Trappist the monk (talk) 16:51, 2 January 2018 (UTC)
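
The wrapper idea proposed above can be illustrated by generating the wrapped wikitext from a language tag and a title. This is a sketch only: {{mt-lang}} is a proposal rather than an existing template, and the function names `mtLang` and `majorTitleLang` are hypothetical.

```javascript
// Hypothetical sketch of the proposed wrappers, expressed as functions that
// build the wikitext a wrapper template might expand to.

// Minor titles: double quotes outside, and |italic=unset so that {{lang}}
// applies no styling of its own.
function mtLang(tag, minorTitle) {
  return '"{{lang|' + tag + "|" + minorTitle + '|italic=unset}}"';
}

// Major titles: no quotes, and |italic=yes so the title is italicized
// regardless of script.
function majorTitleLang(tag, title) {
  return "{{lang|" + tag + "|" + title + "|italic=yes}}";
}

console.log(mtLang("de", "Bist du bei mir"));
```

Both wrappers keep the two positional parameters of the basic template, so an editor only has to choose the wrapper that matches the kind of title.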

Creating a wrapper template which would implement the previous behaviour of {{Lang}} misses the point. It would still necessitate revisiting all those articles mentioned above, in which case the desired non-italic option could then be added instead. I notice that as an unintended consequence of the changes to this template, some editors have now taken to removing it, which is a lot simpler than making it do what the old template did. A rewrite of the template should never have changed its displayed output. Whether that always conforms with the MoS is not a matter that can be controlled by a template. Example: if a major work is cited by {{IMSLP}}, the editor is expected to provide italics, for minor works, quotes. If "Bist du bei mir" is not surrounded by italicising quotes, it should not be italicised. -- Michael Bednarek (talk) 03:30, 3 January 2018 (UTC)

A rewrite of the template should never have changed its displayed output.??? How would you have fixed a broken template that displayed text incorrectly in thousands of articles without changing the way that template displayed its output? As for visiting pages to fix or change templates, yes, that will need to happen during a transition such as this one. I fixed about 500 pages with lang templates today, as did another editor who is working on this project. In the long term, however, there should be much less need for maintenance, and readers and editors will experience greatly improved consistency when editing and viewing text rendered by lang templates. As you can see from the discussion in the multiple sections above, we are working our way toward a general solution as well as particular solutions for individual cases. – Jonesey95 (talk) 03:41, 3 January 2018 (UTC)

This exercise started because it amused Trappist the monk to improve the {{Lang-??}} family. This then spread to {{Lang}}, which wasn't broken, if properly used. -- Michael Bednarek (talk) 04:07, 3 January 2018 (UTC)

{{Lang}} was indeed broken, as you can see if you scan the archives of this talk page looking at problem reports and suggested workarounds. For example, at the top of Template talk:Lang/Archive 3, there is a list of ISO 639 templates that were being requested by the lang template, but which didn't exist. These lang template instances did not work properly, even though many of them were properly constructed in accordance with the documentation. I created hundreds of these ISO 639 templates and categories, a tedious (and now no longer necessary) process. Conversion of the template to a module has made it so that all valid language codes work, without creation of additional templates and categories. That is just one example of a way in which the template was broken and is now working better. – Jonesey95 (talk) 04:36, 3 January 2018 (UTC)

There is a difference between a template not covering all possible language codes without any visible effect, and emitting output flatly against editors' intentions. The list from October 2016 you mention simply shows omissions in the previous system, not that it was broken. I agree that simplifying support for ISO codes was desirable, but I fail to see why this template was overloaded with the impossible aim of enforcing MoS elements. -- Michael Bednarek (talk) 10:48, 3 January 2018 (UTC)

The changes to {{lang}} are not intended to produce output that always conforms with the MoS nor are they an attempt to meet the impossible aim of enforcing MoS elements. They are made so that {{lang}} produces the correct MOS-compliant output for Latin-script text most of the time.

{{IMSLP}} has nothing to do with {{lang}} as far as I can see. Are you suggesting that because {{IMSLP}} requires editors to apply stylistic markup with every use, {{lang}} must do the same? And does that extend also to the {{lang-??}} templates? And if it does, then to any other templates that apply styling? The templates exist, in part, to make editors' lives a bit easier. This change to {{lang}} accomplishes that for most uses when the template wraps Latin-script text. I've suggested one solution to the minor title issue and am happy to implement it or whatever better solution might be found.

—Trappist the monk (talk) 13:13, 3 January 2018 (UTC)

The problem is that the behaviour of {{Lang}} changed. It was widely used with the markup appropriate in the context (just like e.g. {{IMSLP}}); that broke. -- Michael Bednarek (talk) 06:25, 4 January 2018 (UTC)

Trappist the monk, have I understood correctly? You've changed the behaviour of {{lang|xx}} to default to italic? I'm not smart enough to know whether the figure of 624 547 transclusions includes {{lang-xx}} uses as well, but even if it does, that seems to be a quite unacceptable amount of collateral damage (however few of the uses were previously in accordance with MOS, every one of those is now bust). At United Nations, a fairly high-profile article, {{lang|ru|Организация Объединённых Наций}} displays correctly as Организация Объединённых Наций, but {{lang|es|Organización de las Naciones Unidas}} gives Organización de las Naciones Unidas, with unwanted italics. It's one thing to hope that volunteer labour will fix a few thousand articles, but 600 000 is not within the realms of realistic hope. Please revert this change until you can find a way of making it without breaking current uses of the template. Thanks, Justlettersandnumbers (talk) 17:47, 4 January 2018 (UTC)

That 625k number was way off. Perhaps that is what it used to be when {{lang}} was called by all of the 650ish {{lang-??}} templates, but no longer. I've fixed the number to 257k. Don't believe me? this link.

You are somewhat mistaken. The default behavior for {{lang}} is italic only when the text in the template is entirely Latin script. This is why {{lang|ru|Организация Объединённых Наций}} is not rendered in italics but {{lang|es|Organización de las Naciones Unidas}} is. As you might expect, given the five example uses of the template at United Nations, not all uses of the template render Latin-script text. Similarly, not all uses of the template with Latin-script text are titles or proper names.

I have fixed United Nations (including the proper use of |rtl=yes for the Arabic name).

—Trappist the monk (talk) 23:06, 4 January 2018 (UTC)
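
The default described above (italic only when the text is entirely Latin script, unless |italic= says otherwise) can be sketched as follows. This is not the module's actual code: the function names are hypothetical, and "entirely Latin script" is approximated here as "every letter in the text belongs to the Latin script".

```javascript
// Assumption: punctuation, digits and spaces are ignored; only letters decide.
function isLatinOnly(text) {
  const letters = text.match(/\p{L}/gu) || [];
  return letters.length > 0 && letters.every((ch) => /\p{Script=Latin}/u.test(ch));
}

// italicParam mirrors |italic=: "yes", "no", "unset", or undefined when absent.
// The return value only says whether the template itself would add italics.
function shouldItalicize(text, italicParam) {
  if (italicParam === "yes") return true;
  if (italicParam === "no" || italicParam === "unset") return false;
  return isLatinOnly(text); // the general rule for Latin-script text
}

console.log(shouldItalicize("Organización de las Naciones Unidas")); // true
console.log(shouldItalicize("Организация Объединённых Наций"));      // false
```

Under this rule a proper name such as the Spanish one needs an explicit |italic=no, which is the exception-takes-extra-effort trade-off argued for earlier in this section.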

Trappist the monk, it's good to see that the overall number is lower than was previously advertised, it's a certainty that I'm at least partly mistaken (obviously, non-Latin uses of the template should not have been affected by your change), and it's nice that you fixed United Nations – thank you for that and for all your work here. I've also fixed a few pages myself. As far as I can see, none of that comes close to solving the apparent problem. So I ask again: could you please revert your change until a strategy has been developed for making it without so much collateral damage (and preferably without any)? I imagine that a series of bot runs will be needed to prepare for the switch, and they will surely have to be designed by someone a lot cleverer than I. Justlettersandnumbers (talk) 11:12, 5 January 2018 (UTC)

Wish list for future enhancement

An issue I was just thinking of again today (and grinding my teeth) is that we need a way to suppress the labels entirely e.g. with a |labels=no and |labels=lang; we don't need the language name, the "translit.", or the "lit." labels after the first occurrence in the same block of material, or sometimes we need the language one only, e.g. when comparing cognates. What we're doing now is using the template once, then abandoning it for manual markup with a {{lang|xx}} in it; or reusing the {{lang-xx}} and driving readers nuts by repeating the same crap over and over at them as if they have dain bramage. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 14:18, 5 November 2017 (UTC)
In the sandbox |label=none suppresses all labels in {{lang-??}} templates:
|label= (empty or missing):
{{lang-sr/sandbox|Иван Иво Андрић|script=Cyrl|translit=Transliteration|translation=Translation}} → Serbian: Иван Иво Андрић, romanized: Transliteration, lit. 'Translation'

|label=none:
{{lang-sr/sandbox|Иван Иво Андрић|script=Cyrl|label=none|translit=Transliteration|translation=Translation}} → Иван Иво Андрић, Transliteration, 'Translation'

|label=[[Serbian Cyrillic alphabet|Serbian Cyrillic]]:
{{lang-sr/sandbox|Иван Иво Андрић|script=Cyrl|label=[[Serbian Cyrillic alphabet|Serbian Cyrillic]]}} → Serbian Cyrillic: Иван Иво Андрић

—Trappist the monk (talk) 15:46, 31 December 2017 (UTC)
Implemented in the live module.

—Trappist the monk (talk) 11:19, 6 January 2018 (UTC)
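
The three |label= cases shown above can be mimicked with a small plain-text renderer. This is a sketch under stated assumptions: `renderLangX` and its option names are hypothetical, and links, italics and other styling emitted by the real templates are ignored.

```javascript
// Hypothetical plain-text model of the |label= behaviour:
//   label missing  -> language name plus the "romanized:" and "lit." labels
//   label "none"   -> no labels at all
//   any other text -> that text replaces the language name
function renderLangX({ languageName, text, translit, translation, label }) {
  const showLabels = label !== "none";
  const parts = [];
  parts.push((showLabels ? (label || languageName) + ": " : "") + text);
  if (translit) parts.push((showLabels ? "romanized: " : "") + translit);
  if (translation) parts.push((showLabels ? "lit. " : "") + "'" + translation + "'");
  return parts.join(", ");
}

console.log(renderLangX({
  languageName: "Serbian",
  text: "Иван Иво Андрић",
  translit: "Transliteration",
  translation: "Translation",
  label: "none",
}));
```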
Should we also be warning against or disallowing language tags with suppressed script codes, e.g. ru-Cyrl? – Quoth (talk) 11:51, 6 November 2017 (UTC). If a warning or error is too heavy-handed, another option could be to just suppress the script code from the output, depending on the language it's attached to. – Quoth (talk) 16:32, 6 November 2017 (UTC)
I have added to Module:Language/name/data/iana data extraction tool so that it now extracts suppressed script data from the IANA language-subtag-registry file. Those data are now in Module:Language/data/iana suppressed scripts. I have also added code to Module:lang/sandbox that uses the new data but for the moment have left it disabled so that I don't have to rewrite examples elsewhere that are presently being discussed in other topics on this talk page.

—Trappist the monk (talk) 16:09, 21 December 2017 (UTC)
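
The "suppress the script code from the output" option mentioned above could look roughly like this. It is a sketch, not the sandbox code: `dropSuppressedScript` is a hypothetical name, the table is a three-entry excerpt standing in for the data extracted from the IANA registry, and the script subtag is assumed to come directly after the language subtag (tags with extended language subtags are not handled).

```javascript
// Small illustrative excerpt; the full data come from the Suppress-Script
// fields of the IANA language-subtag-registry.
const SUPPRESS_SCRIPT = { en: "Latn", ru: "Cyrl", ja: "Jpan" };

// Remove a script subtag that is redundant for its language, e.g. ru-Cyrl -> ru.
function dropSuppressedScript(tag) {
  const [language, second, ...rest] = tag.split("-");
  const suppressed = SUPPRESS_SCRIPT[language.toLowerCase()];
  if (second && suppressed && suppressed.toLowerCase() === second.toLowerCase()) {
    return [language, ...rest].join("-");
  }
  return tag;
}

console.log(dropSuppressedScript("ru-Cyrl")); // ru
console.log(dropSuppressedScript("ru-Latn")); // ru-Latn
```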
Support for bold-face – please see the section #Recent change immediately above this. Thanks to all who are working on this – it's long overdue. Justlettersandnumbers (talk) 16:47, 7 November 2017 (UTC)
I think that is handled:
{{lang-sco|''''''Dumbairton''''''}} → [''Dumbairton''] Error: {{Lang-xx}}: text has malformed markup (help)

{{lang-sco|'''''Dumbairton'''''}} → [Dumbairton] Error: {{Lang-xx}}: text has italic markup (help)

{{lang-sco|''''Dumbairton''''}} → ['Dumbairton'] Error: {{Lang-xx}}: text has malformed markup (help)

{{lang-sco|'''Dumbairton'''}} → [[Scots language|Scots]]: '''Dumbairton''' → Scots: Dumbairton

{{lang-sco|''Dumbairton''}} → [Dumbairton] Error: {{Lang-xx}}: text has italic markup (help)

{{lang-sco|'Dumbairton'}} → [[Scots language|Scots]]: 'Dumbairton' → Scots: 'Dumbairton'

for bold face without italics:
{{lang-sco|'''Dumbairton'''|italic=no}} → [[Scots language|Scots]]: '''Dumbairton''' → Scots: Dumbairton

—Trappist the monk (talk) 18:19, 7 November 2017 (UTC)
The behavior of initial or final single quotes should be changed; when I do {{lang-nl|'t}} → Dutch: 't on its own, the apostrophe is not italicized.

When I do {{lang-sco|'Dumbairton'}} blah blah {{lang-nl|'t}} → Scots: 'Dumbairton' blah blah Dutch: 't, this paragraph is messed up with uncontrolled bolding and italicization. — Eru·tuon 23:27, 21 November 2017 (UTC)
fixed.

—Trappist the monk (talk) 15:02, 26 November 2017 (UTC)
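
The outcomes listed above follow the length of the apostrophe run that opens the text: two or five apostrophes mean italic markup, three mean bold (accepted), one is a literal apostrophe, and four or six cannot be interpreted cleanly. A minimal sketch of that classification, assuming only the leading run is inspected (the function name is hypothetical and this is not the module's code):

```javascript
// Hypothetical classifier: looks only at the run of apostrophes that opens the text.
function classifyLeadingApostrophes(text) {
  const run = text.match(/^'*/)[0].length;
  if (run <= 1) return "ok";              // plain text, or a literal apostrophe such as 't
  if (run === 2) return "italic markup";  // ''...''
  if (run === 3) return "ok";             // '''...''' is bold, which is accepted
  if (run === 5) return "italic markup";  // '''''...''''' is bold plus italic
  return "malformed markup";              // 4, or 6 and more
}

console.log(classifyLeadingApostrophes("''Dumbairton''"));
```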
probably a good idea to consider single-template support for languages with multiple writing systems. Kazakh, for example, uses Latin, Cyrillic, and Arabic scripts. One template accepts a language code and |textn= and |scriptn= so for Kazakh {{lang-kk|text=<Latin text>|script=Latn|text2=<Cyrillic text>|script2=Cyrl|text3=<Arabic text>|script3=Arab}} where the text in all three cases is the same thing written in different scripts distinct from transliterations. No idea yet how this might be implemented.—Trappist the monk (talk) 14:15, 8 November 2017 (UTC)
- @Trappist the monk: If the problem is gathering the parameters, list parameters like |text=, |textN= would be simple to implement using wikt:Module:parameters. It can gather list parameters into an array, or convert a parameter to a boolean. The latter would be useful for |rtl=. It's sort of the Wiktionary equivalent of Module:Arguments. — Eru·tuon 22:41, 10 November 2017 (UTC)
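
Gathering |text=, |text2=, |text3= and the matching |scriptN= parameters into a list is the part that is easy to sketch. The following is an illustration only: `gatherVariants` is a hypothetical name, the argument object stands in for the template's parameters, and numbering is assumed to stop at the first missing |textN=.

```javascript
// Hypothetical sketch: collect text/script pairs from numbered parameters.
// The first pair uses the bare names (text, script); later pairs use text2, script2, ...
function gatherVariants(args) {
  const variants = [];
  for (let n = 1; ; n++) {
    const suffix = n === 1 ? "" : String(n);
    const textKey = "text" + suffix;
    if (!Object.prototype.hasOwnProperty.call(args, textKey)) break; // stop at the first gap
    variants.push({ text: args[textKey], script: args["script" + suffix] });
  }
  return variants;
}

const kazakh = gatherVariants({
  text: "<Latin text>", script: "Latn",
  text2: "<Cyrillic text>", script2: "Cyrl",
  text3: "<Arabic text>", script3: "Arab",
});
console.log(kazakh.map((v) => v.script).join(", ")); // Latn, Cyrl, Arab
```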
Language-agnostic script-detection function, to make the |script= parameter unnecessary. It can be built on something similar to the function in wikt:Module:Unicode data that determines the script of a single character by looking up the codepoint in wikt:Module:Unicode data/scripts. It would need some way to determine which script code to return when text consists of multiple scripts (or characters not assigned a script). — Eru·tuon 21:36, 9 November 2017 (UTC)
How much overhead would that add if, say, the template were used 100 times in a long article? — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 03:19, 10 November 2017 (UTC)
The detection function itself is very fast, much faster than the language-dependent script detection function that is used in Wiktionary language- and script-tagging templates. For instance, in one of my sandboxes, detecting the script of each character in about 28,000 bytes of text from Russian language takes less than a second. (The list of scripts is at the bottom.) So, it probably won't add much overhead, assuming the function for deciding which of the scripts to return is relatively simple. — Eru·tuon 18:36, 10 November 2017 (UTC)

Current version that returns official Unicode script properties uses quite a bit of memory on my massive testcase (8 something MB), or about 2-3 MB with a smaller amount of text. If this is too much memory for a function like this, perhaps it could be reduced by breaking up the data module, or removing Zyyy and Zinh. Or maybe there would be a more creative solution. — Eru·tuon 01:07, 15 November 2017 (UTC)

Fixed memory problem and sped things up. Memory and time are now at about 1.6 MB and 0.05 second in my giant testcase. — Eru·tuon 11:33, 16 November 2017 (UTC)
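
The per-character approach discussed above, followed by a simple rule for choosing one script, can be sketched as below. This is not the Wiktionary module: it relies on the Unicode script properties built into JavaScript regular expressions rather than a data table, covers only a handful of scripts, and resolves mixed text by picking the script with the most characters, which is just one possible rule.

```javascript
// Illustrative subset of scripts, keyed by ISO 15924 code.
const SCRIPTS = [
  ["Latn", /\p{Script=Latin}/u],
  ["Cyrl", /\p{Script=Cyrillic}/u],
  ["Arab", /\p{Script=Arabic}/u],
  ["Hani", /\p{Script=Han}/u],
  ["Hira", /\p{Script=Hiragana}/u],
  ["Kana", /\p{Script=Katakana}/u],
];

// Return the code of the most frequent script in the text, or null when no
// character belongs to any listed script (digits, punctuation, spaces).
function detectScript(text) {
  const counts = new Map();
  for (const ch of text) {
    const hit = SCRIPTS.find(([, re]) => re.test(ch));
    if (hit) counts.set(hit[0], (counts.get(hit[0]) || 0) + 1);
  }
  let best = null;
  let bestCount = 0;
  for (const [code, n] of counts) {
    if (n > bestCount) {
      best = code;
      bestCount = n;
    }
  }
  return best;
}

console.log(detectScript("Иван Иво Андрић")); // Cyrl
```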
~~Have the lang-xx templates stop transcluding the lang|xx variant, and instead directly call the same Lua functions to reduce the transclusion count.~~ — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 21:06, 13 November 2017 (UTC); struck as moot: 19:02, 16 November 2017 (UTC)
Nothing to do here; not a new feature.

—Trappist the monk (talk) 21:44, 13 November 2017 (UTC)

Italics

This is just ridiculous. I don't think there's any style guide in which the name of an article's subject need be in italics, no matter the language. Given the frequent use of lang for names and for other purposes where it is not required or even wished for the words to be in italics, would it be possible to just remove that default italics? There's like way too many (German, in this case) foreign language names for anybody to wish spending time on it manually. I understand that default italics might be because of a desire to comply with WP:MOS guidelines or just general usage, but it's far worse when something is italicized when it shouldn't (it really makes no sense and requires a lot of fixing and extra wikicode, i.e. not WP:KISS) than when something which isn't italicized should be - usually, it remains rather clear that it's in a foreign language (I mean, the only foreign language which might look even vaguely similar to English is Scots, and even that would require the reader to be confused in some way, given the, hum, rather different orthography). 198.84.253.202 (talk) 02:18, 6 January 2018 (UTC)

Removing the template is a valid option. -- Michael Bednarek (talk) 02:35, 6 January 2018 (UTC)

The template is often used in accordance with WP:ACCESSIBILITY to single out terms or passages as being in other languages for screen readers or other such technological tools - as such, removing it (in addition to being as long a process as adding |italic=no to each one) is, despite being a valid option, not really helpful. The spirit of WP:KISS, which I invoked above, is to attempt to keep things as simple as possible without hurting functionality - hence, why, for example, I removed a template here because it didn't add any useful functionality (or any such feature was greatly outweighed by the inconveniences). However, lang does add something in the vast majority of cases and simply removing it is, well, too simple. 198.84.253.202 (talk) 02:54, 6 January 2018 (UTC)

See the discussions above. Suggestions have been made for working with proper nouns, names of minor works, and other text that should not be italicized. – Jonesey95 (talk) 04:36, 6 January 2018 (UTC)

But I was looking at an attempt to make this simpler. Automatic italics, or automated anything, as per discussion above, "is handy but contributes to laziness and complacency and that impacts the quality of the encyclopedia". It is possible to add italic=no but we'd need to see how often the template is used for proper nouns which shouldn't be italicized and for foreign-language prose which should - if the former outweighs the latter (and I think it does), it would be much simpler to not italicize and simply revert to the previous template functionality since it causes more harm than good (as per other requests or remarks that it doesn't work, no 1 no 2 3 4, and the above section (Erde, singe)).

In fact, I think it could be even simpler to change the default behaviour of the template so it treats the italic parameter as being italic=unset ([[1]]) - this would make the template work as intended, require no major change and keep it simple for use by everyone (users not aware of the italic parameter would simply use (lang template), which still works), including newer editors who might not be aware of all the complexities created by the present situation (namely, the change of behaviour so it italicizes by default). 198.84.253.202 (talk) 15:47, 6 January 2018 (UTC)

+1 -- Michael Bednarek (talk) 00:26, 7 January 2018 (UTC)

Also, you should make |italic and |italics both usable alternatives, if simply to avoid errors (make it simpler to use, WP:KISS) like the one I had just made above. 198.84.253.202 (talk) 15:50, 6 January 2018 (UTC)

+1 -- Michael Bednarek (talk) 00:26, 7 January 2018 (UTC)

{{lang/sandbox|de|Erde, singe|italics=no|italic=yes}} → [Erde, singe] Error: {{Lang}}: only one of |italic=, |italics=, or |i= can be specified (help)

{{lang/sandbox|de|Erde, singe|italics=no}} → Erde, singe

{{lang-de/sandbox|Erde, singe|italics=no|italic=yes}} → [Erde, singe] Error: {{Lang-xx}}: only one of |italic=, |italics=, or |i= can be specified (help)

{{lang-de/sandbox|Erde, singe|italics=no}} → German: Erde, singe

—Trappist the monk (talk) 11:24, 7 January 2018 (UTC)

Well, (if I understand the reason for the comment) I think using both |italic and |italics (and with conflicting arguments) is a legitimate error and the editor should see such a warning... 198.84.253.202 (talk) 16:14, 7 January 2018 (UTC)
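
The alias handling demonstrated in the sandbox examples above (accept |italic=, |italics=, or |i=, but report an error when more than one is given) can be sketched like this. The function name and the shape of the return value are hypothetical; only the error wording is taken from the examples.

```javascript
// Hypothetical sketch: resolve the three aliases to a single value, or
// report the conflict when more than one alias is present.
function resolveItalic(args) {
  const aliases = ["italic", "italics", "i"];
  const given = aliases.filter((name) => args[name] !== undefined && args[name] !== "");
  if (given.length > 1) {
    return { error: "only one of |italic=, |italics=, or |i= can be specified" };
  }
  return { value: given.length === 1 ? args[given[0]] : undefined };
}

console.log(resolveItalic({ italics: "no" }).value);                // no
console.log(resolveItalic({ italics: "no", italic: "yes" }).error); // the conflict message
```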

Minor Latn bug

Using {{lang|ja|script=Latn|go}} produces go and this in an odd font (doesn't look like go); it seems to be pulling the Latin characters from the Japanese character set instead of just using whatever the default one is for en.wikipedia. Maybe this is not fixable, if it's something the browser is doing. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 08:31, 4 January 2018 (UTC)

I think that this is not the template but some css applied somewhere related to the lang= attribute. There is the same noticeable difference if I handwrite this:

gogo

gogo

—Trappist the monk (talk) 12:22, 4 January 2018 (UTC)

This happens for many languages, not just Japanese:

{{lang|ru|script=latn|ABC üéîøå DEF abcdef}} → ABC üéîøå DEF abcdef

{{lang|ko|script=latn|ABC üéîøå DEF abcdef}} → ABC üéîøå DEF abcdef

{{lang|ja|script=latn|ABC üéîøå DEF abcdef}} → ABC üéîøå DEF abcdef

{{lang|zh|script=latn|ABC üéîøå DEF abcdef}} → ABC üéîøå DEF abcdef

{{lang|es|script=latn|ABC üéîøå DEF abcdef}} → ABC üéîøå DEF abcdef

{{lang|en|script=latn|ABC üéîøå DEF abcdef|italic=yes}} → ABC üéîøå DEF abcdef

I don't have anything constructive to say, just that I have noticed it as well. – Jonesey95 (talk) 13:27, 4 January 2018 (UTC)

Are you all using Firefox? There is a Language and Appearance section in the Firefox about:preferences page where you can click on the "Advanced..." button and change the default fonts for each language.

go produces go
go produces go

Should the lang value be "ja-latn" rather than just "ja"? -- WOSlinker (talk) 13:48, 4 January 2018 (UTC)

Chrome with its default settings.

But, you make an important point. The template as originally written:

{{lang|ja|script=Latn|go}}

produces this:

go

Notice the lang= attribute. |script= is not supported by {{lang}} because, when writing the {{lang}} template, editors can modify the IETF language tag to include script, region, and variant subtags; something that is not appropriate with the {{lang-??}} templates. Rewriting the template:

{{lang|ja-Latn|go}}go

produces this:

go

which compared to ''go'' produces this:

gogo

—Trappist the monk (talk) 14:57, 4 January 2018 (UTC)

This is a browser thing, as MediaWiki:Common.css has nothing that references lang attributes. Many browsers apply certain fonts to Japanese, Chinese, and various other languages, based on the lang attributes of HTML tags.

Some people noted this (see the discussion here) when Wiktionary added a lang attribute (lang="ja-Latn") to transliteration of Japanese terms (in wikt:Template:link and wikt:Template:mention for instance; the HTML tags are generated by wikt:Module:script utilities). The transliteration (romaji) was displaying with fonts that are appropriate for kanji and kana. So, I removed the language attribute for Japanese to avoid this. This wasn't the fault of wikt:MediaWiki:Common.css, because it only applies fonts to script classes (class="Latn", class="Jpan", class="Hani"), or to the combination of a script class and language attribute, not to language attributes alone, as some languages are written in multiple scripts. — Eru·tuon 00:05, 5 January 2018 (UTC)

I suspected that might be the case (thus "Maybe this is not fixable, if it's something the browser is doing.") However, I think the "Wiktionary nuclear option" is not the way to go. We shouldn't lose the language markup just because the font ends up looking a little different. It's certainly still readable, and as Ttm points out, we have a workaround anyway. But there's a better one we could do in CSS. Demo:

''go'' → go
{{lang|ja|script=Latn|go}} → go
{{lang|ja-Latn|go}} → go
{{lang|ja|script=Latn|text=go}} → go

The potential problem with #3 is that it requires us to support |xx-Latn= for every language for which this issue might come up (maybe we're already doing that). A definite problem with it as the only solution is that it's the only solution; it's not at all intuitive that {{lang|ja-Latn|go}} works and {{lang|ja|script=Latn|go}} doesn't, and probably fewer than 10 editors are ever going to memorize this. And it ultimately doesn't make any sense that the output of these would look different.

However, the fact that #4 works indicates that we can have the site-wide stylesheet reset the font-family for any {{lang|xx|script=Latn|...}} case to the default font stack on the way to outputting the value of |text= when |script=Latn is present. This would be more robust.
— SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 07:07, 5 January 2018 (UTC)

|script= is not supported by {{lang}} which can be seen by this example:

{{lang|ja|script=Latn|行く}} → 行く ← 行く

It has been in the back of my mind to implement a |style= parameter that would do what is done in your example #4. If you think that that is a good idea, add it to the list at §Wish list for future enhancement.

—Trappist the monk (talk) 12:02, 5 January 2018 (UTC)

This is something that has come up a couple of times with {{zh}}. See Module talk:Zh/Archive 4#Language tagging for pinyin yet again and Module talk:Zh/Archive 3#Font?. It’s a browser problem, a particular problem with Firefox, and then not for everyone but for people who override the default settings. Firefox’s whole approach seems badly broken, legacy code which is still around from when you often needed to tweak language settings to get things to display properly.--JohnBlackburne^words_deeds 07:55, 5 January 2018 (UTC)

Perhaps if Wikipedia starts using the tags the way they're meant to be used, then there will be so many people complaining about a bug in the browser that the browser devs will fix it? It would be nice if we could nudge them in the right direction. :) Rua (mew) 20:55, 7 January 2018 (UTC)

You would think so. But this has been an issue for years, and they seem uninterested in fixing it. I think it just affects too few people, only those using Firefox who have configured it with custom fonts for e.g. Asian languages. Also because it’s a user setting users can usually fix it themselves.--JohnBlackburne^words_deeds 21:02, 7 January 2018 (UTC)

Really? I use Firefox and haven't changed any settings, but the font difference happens for me. Where would it be configured? Rua (mew) 21:33, 7 January 2018 (UTC)

On the preferences (about:preferences) pane, under 'Language and Appearance' click on 'Advanced'. Then you can set fonts for every language/script. There are two issues with it. It seems to use the specified font even when '-Latn' is specified for the text. E.g. zh-Latn for pinyin is treated as 'zh'/Chinese. This means it can look different, as the font used for Chinese might have subtly different Roman characters from those used for English text. It can become a serious problem if a user has overridden the defaults for English or Chinese. E.g. a bitmap font can work better for Chinese at small font sizes but can cause the pinyin to also use the same bitmap font which looks horrible, especially italicised.--JohnBlackburne^words_deeds 06:57, 8 January 2018 (UTC)

Trappist the monk's edits broke the template {{Template:lang-uk|}}

@SMcCandlish: Hi. User:Trappist the monk's edits from December 2017 broke the template {{lang-uk}}. See an example of how the template doesn't work now here: Hej Sokoly. Can someone revert to the version before User:Trappist the monk's edits?--Piznajko (talk) 19:41, 14 January 2018 (UTC)

Fixed in that article. The help text, while long, explains how to fix this template usage error. – Jonesey95 (talk) 19:50, 14 January 2018 (UTC)

Problem with Proto-Celtic

Hey guys, I just noticed a problem with Proto-Celtic ("cel-x-proto") in the template. Here are two quotations to illustrate. The first using the language "sga" (Old Irish), and the second using "cel-x-proto". At the moment the first works great, but the second is messed up (a bullet point appears out of nowhere breaking up the sentence into another line, and the italics are transferred over into the following text).

Example 1: The Old Irish ech, derived from the Proto-Celtic *ekʷos, means "horse".

Example 2: The Old Irish ech, derived from the Proto-Celtic **ekʷos, means "horse".

--Brianann MacAmhlaidh (talk) 01:49, 15 January 2018 (UTC)

I don't know why this is happening, but I notice that in Module:Lang/data, some bits are surrounded by double quote marks and some bits are surrounded by single quote marks. I don't know if that makes a difference. – Jonesey95 (talk) 01:57, 15 January 2018 (UTC)

I don't know why it's happening either. As an interim fix, you can write this:

Example 2: The Old Irish {{lang|sga|ech}}, derived from the {{lang-cel-x-proto|ekʷos|link=no}}, means "horse".

Example 2: The Old Irish ech, derived from the Proto-Celtic: *ekʷos, means "horse".

I'll look into it in the morning.

—Trappist the monk (talk) 02:21, 15 January 2018 (UTC)

Something about the plain asterisk that the template adds for proto languages confuses MediaWiki. I've replaced that with a &#42; numeric character reference. Results in op's examples.

—Trappist the monk (talk) 11:02, 15 January 2018 (UTC)
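As a rough illustration of the fix described above, the following sketch prefixes text in a proto-language with the numeric character reference for an asterisk rather than a literal one. It is not the actual Module:Lang code: the names isProtoTag and prefixProtoText are hypothetical, the test for a tag ending in -x-proto is an assumption based on the cel-x-proto example in this thread, and stripping an asterisk that the editor already typed inside the template is an added assumption, not something the template is stated to do.

```javascript
// Hypothetical sketch; not the actual Module:Lang implementation.
const ASTERISK_NCR = '&#42;'; // numeric character reference for "*"

// Assumption: proto-language tags end in the private-use subtags "-x-proto".
function isProtoTag(tag) {
  return /-x-proto$/i.test(tag);
}

function prefixProtoText(tag, text) {
  if (!isProtoTag(tag)) {
    return text; // attested languages are left untouched
  }
  // Assumption: drop one leading literal asterisk typed inside the
  // template so that the output does not carry two asterisks.
  const bare = text.replace(/^\*/, '');
  return ASTERISK_NCR + bare;
}
```

Under these assumptions, prefixProtoText('cel-x-proto', 'ekʷos') and prefixProtoText('cel-x-proto', '*ekʷos') both yield the same output, while text tagged sga passes through unchanged. An asterisk written outside the template, as in Example 2, cannot be detected this way.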

On an unrelated note, it's a good intention to have the template supply the prefixed asterisk if it's missing, and I guess it will be helpful for people who have forgotten to add it (not sure how likely is this to occur: if an editor is diligent enough to use lang formatting with the obscure codes for protolanguages, then it's unlikely they will have forgotten to add the asterisk). But it will result in duplicate asterisks if one has already been supplied outside of the template (as above) – Uanfala (talk) 02:58, 15 January 2018 (UTC)

An editor who is that diligent will notice the duplicated asterisk at preview, right?

—Trappist the monk (talk) 11:02, 15 January 2018 (UTC)

Unless there is a tangible benefit in forcing the asterisk to appear within the lang template, I don't see the point of this feature. The intention behind it is nice, but in effect it only adds busywork. An asterisk can be added for either ungrammatical or for unattested/reconstructed forms, and having to learn that in one of the several contexts of use the asterisk shouldn't be added by hand adds complexity. And there's also the fact that the asterisk isn't always used even for reconstructed proto-languages: for example in long running texts (see Schleicher's fable), or in interlinearised glossed sentences, where the asterisk will appear only once at the start of the sentence and because of the formatting the lang markup will need to be applied separately for each word. – Uanfala (talk) 13:42, 15 January 2018 (UTC)

I'm not going to pretend that I understand anything about what you have written here; I don't. But, in both of the articles you cite, there are no asterisks except where used for unordered list markup and there are no {{lang|???-x-proto}} or {{lang-??-x-proto}} templates; Schleicher's fable has a single instance of {{lang|de|...}}.

—Trappist the monk (talk) 14:39, 15 January 2018 (UTC)

My point is that not all instances of text marked up as being in a reconstructed language need to have an asterisk. If the template obligatorily puts such asterisks, then this is really a bug, not a feature. – Uanfala (talk) 14:48, 15 January 2018 (UTC)