Module talk:JCW


p.selected

I'm looking at the request at WP:VPT. However I can't help but fiddle with code and I'm afraid I would also adjust several mostly style issues. First I need to understand what p.selected and the following are supposed to do.

if doi then
    text = text..string.format("\n** <code>\{\{doi\|[https://en.wikipedia.org/w/index.php?sort=relevance&title=Special%%3ASearch&profile=advanced&fulltext=1&advancedSearch-current={}&ns0=1&ns118=1&search=insource%%3A%%2F%s%%5C%%2F%%20*%%2F %s]\}\}<\/code>", doi, doi)
    text = mw.ustring.gsub(text, "%%2F10%.(%d)", "\/10\\.%1")
end

Is there an example of p.selected being used and its output (possibly from Special:ExpandTemplates)?

The above code puts a URL into text with the doi value inserted in two places. It then does the gsub to possibly insert /10\ in the URL (I think I decoded that correctly, perhaps not). What is a typical doi value? Surely it never includes %2F? I'm thinking that the doi value, if necessary, should be changed with a gsub, and the result inserted into the URL in two locations. There is no need to put the URL through the gsub? Would that work? Johnuniq (talk) 02:59, 5 January 2020 (UTC)Reply

@Johnuniq: Basically, {{JCW-selected|Foobar|Rawr|doi1=10.4321|doi2=10.3216|doi3=10.5446}} gives

This could equivalently be rendered with {{doi prefix}} like so

But the module choked on putting stuff in templates last time around, so I had to URL-encode a bunch of stuff so that the insource:/10\.1234\/ */ search links work. Headbomb {t · c · p · b} 03:06, 5 January 2020 (UTC)

Imprint/Parents

@Johnuniq and Galobtter:, I'm trying to introduce two new iterative parameters |imprint#= and |parent#= in {{JCW-selected}}

The idea being that

{{JCW-selected|Foobar|YoYeah|imprint1=Imprint1|imprint2=Imprint2|imprint3=...|doi1=10.4321}}
{{JCW-selected|Foobar|parent1=Barfoo1|parent2=Barfoo2|parent3=...|doi1=10.4321}}

would give

and

I tried [1], but it made things explode and time out. A bit of help would be nice here. Headbomb {t · c · p · b} 09:06, 16 February 2020 (UTC)Reply

@Johnuniq and Galobtter: anything you can do to help here? Headbomb {t · c · p · b} 19:03, 6 March 2020 (UTC)Reply
@Headbomb: I'll have a look while hoping that Galobtter either says they will do it, or gives me a couple of days to play around (I imagine neither of us wants to do some work and then find the other finishes first). What is the following trying to do?
if mw.title.getCurrentTitle().fullText == 'User:JL-Bot/Publishers.cfg' or 'User:JL-Bot/Maintenance.cfg' then
    source = nil
end
Is it trying to see if the current page is one of the two user cfg pages? As written, the if condition evaluates to true if the == is true, or to the string 'User:JL-Bot/Maintenance.cfg' otherwise, so it is always truthy. I need to understand stuff before diving in. Johnuniq (talk) 00:44, 7 March 2020 (UTC)
Johnuniq, feel free to (and thanks for) work on this - I really don't have much time for wikistuff atm so won't really be able to help with this. Galobtter (pingó mió) 04:10, 7 March 2020 (UTC)Reply
Where can I play? What-links-here suggests Module:JCW is not used much. What is its status? Can I just edit it without worrying if something will break? Are there any testcases? There should be at least one test in a sandbox with the wikitext of a full example call to {{JCW-selected}} using all wanted bells and whistles, and the wikitext of what the output should be. That would help me understand what it's trying to do, and would be needed for testing. Preferably without red links. Johnuniq (talk) 00:58, 7 March 2020 (UTC)Reply
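A minimal plain-Lua sketch of the `==`/`or` pitfall in the condition quoted above. The function names `buggy_check` and `fixed_check` are hypothetical, and the page title is passed in as a plain string instead of being read from mw.title, so this is an illustration and not the module's actual code.

```lua
-- Page names taken from the discussion; everything else is illustrative.
local PUBLISHERS = 'User:JL-Bot/Publishers.cfg'
local MAINTENANCE = 'User:JL-Bot/Maintenance.cfg'

-- Mirrors the quoted condition: `or` is applied to the result of `==`,
-- so a false comparison falls through to a non-empty string, which is truthy.
local function buggy_check(title)
    if title == PUBLISHERS or MAINTENANCE then
        return true
    end
    return false
end

-- Compares the title against each page name explicitly.
local function fixed_check(title)
    return title == PUBLISHERS or title == MAINTENANCE
end

print(buggy_check('Module talk:JCW'))  -- true for every title
print(fixed_check('Module talk:JCW'))  -- false
print(fixed_check(MAINTENANCE))        -- true
```

With more than a handful of pages, a lookup table keyed by page name would be the usual alternative to a chain of comparisons.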

@Johnuniq: the general idea is that there are four pages where this is called

I believe that piece of code is simply saying that |source= of {{JCW-selected}} is irrelevant on User:JL-Bot/Publishers.cfg and User:JL-Bot/Maintenance.cfg. I think this is mostly because {{JCW-selected}} could be used on that page, but currently isn't.

As far as playing is concerned, you can pretty much play on the live versions of the templates and modules, and with the examples below generated from the above code.

JL-Bot only cares about the plain text of the .cfg pages. Only the people maintaining the pages (i.e. me) might care about how they render, and it's really not a big deal if they're temporarily borked. Headbomb {t · c · p · b} 01:39, 7 March 2020 (UTC)Reply

Or if you want a more concrete example

Code Live Goal
{{JCW-selected
 |Hindawi Publishing Corporation
 |imprint1=International Scholarly Research Notices
 |doi1=10.1155
 |doi2=10.3814
 |doi3=10.4061
 |doi4=10.5402
 |doi5=10.6064
 |doi6=10.7167
 |doi7=10.7217
}}
{{JCW-selected
 |International Scholarly Research Notices
 |parent1=Hindawi Publishing Corporation
 |doi1=10.5402
}}

Headbomb {t · c · p · b} 01:48, 7 March 2020 (UTC)Reply

@Headbomb: I'll have a look at it but I will also refactor the code. Details matter to me and I want to know if you really want the first of the following for one of the results, or should it be the second? The first has the <code> font for the 10.1155 search link, while the second uses the normal font.
I will have some other questions, and will put a version of what you posted above at User:Johnuniq/sandbox2 adjusted to suit the tests I will use. Johnuniq (talk) 02:55, 7 March 2020 (UTC)Reply
@Johnuniq: Fixed the fonts. I think? There was an inconsistency, but it didn't look like your example. Headbomb {t · c · p · b} 02:59, 7 March 2020 (UTC)Reply
I think you're saying you want the first. Johnuniq (talk) 03:36, 7 March 2020 (UTC)Reply

Questions

@Headbomb: The search URL from the module is slightly different from {{doi-prefix}}'s output. To see this, copy the URL shown in the above examples and paste it into an editor. The first of the following lines is the URL parameter from the module, and the second is from the template. I've tried to highlight the two differences.

?sort=relevance&title=Special%3ASearch&profile=advanced&fulltext=1&advancedSearch-current={}&ns0=1&ns118=1&search=insource%3A/10\.1155%5C%2F%20*%2F
?sort=relevance&title=Special%%3ASearch&profile=advanced&fulltext=1&advancedSearch-current={}&ns0=1&ns118=1&search=insource%3A%2F10%5C.1155%5C%2F%20*%2F

Which is wanted? Perhaps there is a good reason for what the template outputs and the second should be used (although they seem to do the same thing)? Johnuniq (talk) 07:14, 7 March 2020 (UTC)Reply

I believe the difference is mostly in how urls are handled by modules and templates, but that they both produce the same thing in the end? It's also possible that there's a mistake somewhere. Headbomb {t · c · p · b} 07:17, 7 March 2020 (UTC)Reply
The goal is an insource:/10\.1155\/ */ search in the main and draft spaces, so whatever achieves that. Headbomb {t · c · p · b} 07:22, 7 March 2020 (UTC)
There was an error in the doi-prefix template (%%3A instead of %3A), although it didn't affect anything. The second part is a module vs template difference, I think. %2F10%5C is simply the percent-encoded (URL-encoded) version of /10\. Headbomb {t · c · p · b} 07:27, 7 March 2020 (UTC)
Also I notice that your recent changes to the module have made (Unknown) appear when it wasn't there before. Might not be important, since this is just the talk page of the module and is still suppressed on the relevant user pages. Might be worth switching to a whitelist to allow, instead of a blacklist to disallow. Headbomb {t · c · p · b} 07:33, 7 March 2020 (UTC)
After my edit, the source is nil (not displayed) at User:JL-Bot/Publishers.cfg and User:JL-Bot/Maintenance.cfg and is either the given source or "Unknown" at other pages. I can easily do a whitelist instead. On what pages should the given source or "Unknown" be displayed? Johnuniq (talk) 08:16, 7 March 2020 (UTC)Reply
On User:JL-Bot/Questionable.cfg is the only place it's relevant. Headbomb {t · c · p · b} 17:58, 7 March 2020 (UTC)Reply
I have done a first pass at implementing the requested changes. I will do a more thorough check in a day or two. Some behavior is different, for example the module will now show an error if there is no target or if both doi and doi1 parameters are used. Testing needed! Johnuniq (talk) 08:38, 7 March 2020 (UTC)Reply
From a brief look, everything seems to work just fine. I haven't tried intentionally breaking things though. Headbomb {t · c · p · b} 18:04, 7 March 2020 (UTC)Reply
I have finished tweaking the code. More testing soon would be good because if more than a few days elapse before another request I will have forgotten what it's all about and will be harder to move. Johnuniq (talk) 00:00, 8 March 2020 (UTC)Reply
Looks good, thanks. Headbomb {t · c · p · b} 00:19, 8 March 2020 (UTC)Reply
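A rough plain-Lua sketch of how the search term discussed in this section could be built from a DOI prefix and then percent-encoded. The helpers `search_term` and `percent_encode` are hypothetical names, and the choice of which characters to leave unencoded is an assumption picked to match the template URL quoted above. Inside Scribunto one would more likely reach for the mw.uri library; this version avoids it so it can run anywhere.

```lua
-- Builds the regex search term, e.g. insource:/10\.1155\/ */
local function search_term(prefix)
    -- Escape the dot so the regex matches a literal "." in the DOI.
    local escaped = prefix:gsub('%.', '\\.')
    return 'insource:/' .. escaped .. '\\/ */'
end

-- Percent-encodes every byte except letters, digits and - . _ ~ *
-- (leaving "*" alone is an assumption made to mirror the quoted URL).
local function percent_encode(s)
    return (s:gsub('[^%w%-%._~%*]', function(c)
        return string.format('%%%02X', string.byte(c))
    end))
end

print(search_term('10.1155'))
print(percent_encode(search_term('10.1155')))
```

Escaping the prefix first and encoding the finished term once means the rest of the URL never has to be passed through a second gsub.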

Update for JCW-doi-redirects

@Johnuniq:, could you do a tweak where the DOI prefix links generated by {{JCW-doi-redirects|Foobar|10.1001|10.1002}}

and those generated by {{JCW-selected|Foobar|10.1001|10.1002}}

are the same (i.e. in the JCW-selected style), so that it's easy to search for which articles are using those DOI prefixes? In {{JCW-doi-redirects}}, |1= will always be a regular article link (Foobar), while |2/3/4...= will always be DOI prefixes. Headbomb {t · c · p · b} 01:42, 29 October 2020 (UTC)

I have totally forgotten what this is all about and need simple instructions. The first example shows 10.1001 as a link to an article (which is a redirect). The second shows the doi template syntax and links to an enwiki search. I don't know what "DOI prefix links" are generated. It would be clearer for me if you were to show the wikitext for what the first example should output. Also show any other changes needed with sample input wikitext and the wanted output. I can use that to understand the problem and as testcases. Johnuniq (talk) 06:43, 29 October 2020 (UTC)Reply
@Johnuniq: Well, there are the examples above. The current version of {{JCW-doi-redirects}} outputs the undesired 10.1001. The desired output is {{doi|10.1001}}, from a template that already has the functionality ({{JCW-selected}}). Headbomb {t · c · p · b} 06:48, 29 October 2020 (UTC)
If you want to see it used in concrete examples, see User:JL-Bot/Questionable.cfg/Publishers#B. Compare the entries for the 'Baishideng Publishing Group', with plain links, with those from Bentham Science Publishers, with search links. Or those from the two entries of the European Mathematical Society in User:JL-Bot/Publishers.cfg#E (one with plain links, the other with search links). Headbomb {t · c · p · b} 06:53, 29 October 2020 (UTC)Reply
I'll try to look in about 24 hours. Remind me if I'm not back in 36 hours. Johnuniq (talk) 08:17, 29 October 2020 (UTC)Reply

Both {{JCW-doi-redirects}} and {{JCW-selected}} call {{#invoke:JCW|selected}} and the result from one is exactly the same as from the other. Currently, there is a difference between positional parameters and named parameters used to specify doi values. Examples:

I could change function selected so the positional parameters give the same results as the named parameters. Doing that in selected would mean both templates continue to give exactly the same results. Is that what is wanted? Johnuniq (talk) 05:58, 31 October 2020 (UTC)Reply

For the DOIs, yes. But JCW-selected is also used for things like

So as long as that still works, it should be good, I think. Tweaking the templates won't screw up the bot, the bot only looks at the raw text of these pages. So feel free to iterate. The visual output is only for human convenience. Headbomb {t · c · p · b} 06:11, 31 October 2020 (UTC)Reply

So the module needs a simple rule to guess when a positional parameter is a name to be linked. Is a DOI always "digits dot digits"? If so, that could be the rule to decide when to display as a DOI. Any other text would just be linked. Johnuniq (talk) 08:15, 31 October 2020 (UTC)Reply
DOIs are always either 10.#### or 10.#####. The rest of positional parameters can be plainlinked (with a : at the start of the link to handle categories and such). Headbomb {t · c · p · b} 14:47, 31 October 2020 (UTC)Reply
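The rule agreed above (a positional parameter is a DOI prefix if it is "10." followed by four or five digits, and anything else is plain-linked with a leading colon) might look roughly like this in plain Lua. The names `is_doi_prefix` and `format_positional` are hypothetical, and the DOI branch returns a simplified placeholder instead of the real search link.

```lua
-- True for "10.####" or "10.#####", per the rule stated in the discussion.
local function is_doi_prefix(value)
    return value:match('^10%.%d%d%d%d%d?$') ~= nil
end

-- Chooses how to display one positional parameter.
local function format_positional(value)
    if is_doi_prefix(value) then
        -- Placeholder display; the real module emits a search link here.
        return '<code>{{doi|' .. value .. '}}</code>'
    end
    -- The leading colon keeps category and file names as ordinary links.
    return '[[:' .. value .. ']]'
end

print(format_positional('10.1001'))
print(format_positional('Category:Foobar'))
```

Anchoring the pattern with `^` and `$` keeps article titles that merely contain something DOI-like from being treated as prefixes.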

I think that's done. Please check results. Johnuniq (talk) 06:12, 1 November 2020 (UTC)Reply

Seems fine, although makes the template expansion size explode. Headbomb {t · c · p · b} 16:42, 1 November 2020 (UTC)Reply
See User:JL-Bot/Publishers.cfg, where things blow up in the U section. I already bypassed the columns-list. Headbomb {t · c · p · b} 16:56, 1 November 2020 (UTC)Reply
I fixed it by pruning the search URL: diff. The search still seems to work, although how is unclear. More fat could be trimmed by removing the code tags, and then the {{doi}} markup. After that, a really major improvement would result from replacing the template calls with direct invokes of the module. That would give slightly uglier wikitext (and would require significant tweaks to the bot), and I would need to check and possibly fix the module so directly invoking it worked. Johnuniq (talk) 06:14, 2 November 2020 (UTC)
The bot tweaks would be pretty easy to implement by switching to #invoke, but it would be very ugly in wikicode. If things work right now, that's really all that matters. The publisher page is fairly unlikely to significantly blow up in size (meaning number of entries on the page) any time soon. Anyway, thanks a bunch. Headbomb {t · c · p · b} 06:22, 2 November 2020 (UTC)Reply

Exclude

@Headbomb: While trying to get an overview of what Module:JCW is all about, I found WP:JCW/EXCLUDE which redirects to User:JL-Bot/Citations.cfg. That page is in Category:Pages where template include size is exceeded. Do you want me to fix that error? The problem is the use of {{columns-list}} which means that virtually the whole page is transcluded twice. The fix is to replace each template call with ugly wikitext. I don't know if that page is used or if a change would interfere with the bot. Johnuniq (talk) 06:12, 30 October 2020 (UTC)Reply

It wouldn't interfere with the bot, but if the columns-lists are the issue (I think I figured that out a while back on other pages, just never cared to bother with those), then feel free to subst/hardcode them. Headbomb {t · c · p · b} 06:21, 30 October 2020 (UTC)
Actually, I took care of it. Thanks for pointing out the solution. Headbomb {t · c · p · b} 06:33, 30 October 2020 (UTC)Reply
That's quite irritating because I got an edit conflict after doing the edit myself. Please check my edit which fixed an error and tweaked the doc. Johnuniq (talk) 06:42, 30 October 2020 (UTC)Reply
Yup it's fine. Headbomb {t · c · p · b} 06:44, 30 October 2020 (UTC)Reply

Pattern

Function p.pattern in Module:JCW was giving "Lua error: not enough memory" at User:JL-Bot/Maintenance.cfg and was calling gsub excessively (each parameter caused the function to gsub the entire string again). That's the reason it was hitting memory limits as shown by running {{JCW-pattern|Example|.*Abc.*|!Def!|G}} in Special:ExpandTemplates with the module before I fixed it just now. The output follows.

*[[Example]]
** <code><b><font style=color:#006400;><b><font style=color:#006400;><b><font style=color:#006400;>.*</font></b></font></b></font></b>Abc<b><font style=color:#006400;><b><font style=color:#006400;><b><font style=color:#006400;>.*</font></b></font></b></font></b></code>
** <code><b><font style=color:#8B0000;><b><font style=color:#8B0000;>!</font></b></font></b>Def<b><font style=color:#8B0000;><b><font style=color:#8B0000;>!</font></b></font></b></code>
** <code>G</code>

After fixing the module, the output is:

*[[Example]]
**<code><b><font style=color:#006400;>.*</font></b>Abc<b><font style=color:#006400;>.*</font></b></code>
**<code><b><font style=color:#8B0000;>!</font></b>Def<b><font style=color:#8B0000;>!</font></b></code>
**<code>G</code>
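For illustration, a plain-Lua sketch of the idea behind the fix: each parameter is highlighted on its own, exactly once, and the pieces are joined at the end, so markup already added for one parameter is never run through gsub again. The names `highlight` and `render` are hypothetical and this is not the module's actual code; the colours and markup are copied from the output shown above.

```lua
local GREEN = '<b><font style=color:#006400;>%s</font></b>'
local RED = '<b><font style=color:#8B0000;>%s</font></b>'

-- Highlights ".*" and "!" inside a single pattern string.
local function highlight(pattern)
    local s = pattern:gsub('%.%*', function(m) return GREEN:format(m) end)
    s = s:gsub('!', function(m) return RED:format(m) end)
    return s
end

-- Builds the list: one gsub pass per parameter, never over the whole text.
local function render(target, patterns)
    local lines = { '*[[' .. target .. ']]' }
    for _, pattern in ipairs(patterns) do
        lines[#lines + 1] = '**<code>' .. highlight(pattern) .. '</code>'
    end
    return table.concat(lines, '\n')
end

print(render('Example', { '.*Abc.*', '!Def!', 'G' }))
```

Collecting the lines in a table and calling table.concat once also avoids building many intermediate strings, which matters when a page has hundreds of entries.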

Checking User:JL-Bot/Maintenance.cfg shows "BREAK .*Palaeontologie.*" which is due to a missing pipe. You might like to fix that and contemplate whether the bot has been doing anything unwanted because of the typo (I have no idea what the bot is doing).

Question: I have removed TableTools.compressSparseArray from the pattern function because I was running into memory and then time limits while working out the problem. The new module assumes that the template is used in a reasonable manner. For example:

  • {{JCW-pattern|Example|.*Abc.*|!Def!|G}} (this works)
  • {{JCW-pattern|Example|3=.*Abc.*|4=!Def!|9=G}} (this fails because numbered parameters are skipped)

I assume that skipping numbers will never occur and can be ignored. Johnuniq (talk) 01:39, 31 October 2020 (UTC)Reply
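A small plain-Lua sketch of why skipped numbers matter: ipairs stops at the first missing index, so a table with gaps has to be compacted first if every value is needed. The `compress` helper below is a hypothetical stand-in written for this example; it is not the TableTools implementation.

```lua
-- Arguments as they would arrive from the second call above (with gaps).
local sparse = { [1] = 'Example', [3] = '.*Abc.*', [4] = '!Def!', [9] = 'G' }

-- Counts how many entries ipairs actually visits.
local function count_ipairs(t)
    local n = 0
    for _ in ipairs(t) do
        n = n + 1
    end
    return n
end

-- Collects the numeric keys, sorts them, and returns the values in order.
local function compress(t)
    local keys = {}
    for k in pairs(t) do
        if type(k) == 'number' then
            keys[#keys + 1] = k
        end
    end
    table.sort(keys)
    local out = {}
    for _, k in ipairs(keys) do
        out[#out + 1] = t[k]
    end
    return out
end

print(count_ipairs(sparse))            -- only index 1 is reached
print(count_ipairs(compress(sparse)))  -- all four values are reached
```

The compaction costs an extra pass and a sort over the arguments, which is the trade-off against simply assuming the parameters are numbered without gaps.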

Yeah that's fine. Headbomb {t · c · p · b} 01:56, 31 October 2020 (UTC)Reply

JCW tweak

@Johnuniq:, the next tweak would be to have Template:JCW-PUB-rank (same in Template:JCW-CRAP-rank) handle an arbitrary number of DOI prefixes (in the 'Publisher column').

Rank Publisher Entries (Citations, Articles) Total Citations Distinct Articles Citations/article


618 University of Zagreb | {{doi|10.15233}} · {{doi|10.15644}} · {{doi|10.20532}} · {{doi|10.20901}} · {{doi|10.21278}} · {{doi|10.21861}} · {{doi|10.22598}} · {{doi|10.24099}} · {{doi|10.26582}} · {{doi|10.3336}} · {{doi|10.5552}} · {{doi|10.5592}}
All 11 University of Zagreb-related entries
23 23 1.000


The maximum case currently in use in the compilation is |doi47=, but I'm sure there's a smarter way than a dumb

-->{{#if:{{{doi1|}}}|{{·}}{{doi prefix|{{{doi1|}}}|mode=t}}}}<!--

...

-->{{#if:{{{doi50|}}}|{{·}}{{doi prefix|{{{doi50|}}}|mode=t}}}}<!--

type of copy-paste in the template to handle this. Headbomb {t · c · p · b} 06:31, 2 November 2020 (UTC)Reply

I'll take a couple of days off but will have a look at this later. If I'm not back by Friday, remind me. Johnuniq (talk) 06:39, 2 November 2020 (UTC)Reply
@Johnuniq: No worries, this is hardly urgent. AFAICT, there's no issue anywhere with this, save for it not handling something arbitrary. Headbomb {t · c · p · b} 06:44, 2 November 2020 (UTC)Reply
@Johnuniq: courtesy ping, as requested. Any updates on this? It's still not urgent, so no worries if you don't have time for a while. Headbomb {t · c · p · b} 03:25, 24 November 2020 (UTC)Reply
Oops, I totally forgot. I've noted it now and should have a look in the next few days. You'd better remind me again if I miss. Johnuniq (talk) 09:45, 24 November 2020 (UTC)Reply

@Headbomb: I am planning a new function in Module:JCW which might be called DOI_LIST. In {{JCW-PUB-rank}} and {{JCW-CRAP-rank}}, you would replace the guts of each template (all the stuff that refers to doi1, doi2, etc.) with {{#invoke:JCW|DOI_LIST}}. The new function would output equivalent wikitext. It would not call {{doi prefix}} but would output equivalent wikitext assuming mode=t is wanted. I assume you would be happy with this. What should the function be called? The module currently has functions selected, exclude and pattern. Would list be enough? doilist? Johnuniq (talk) 03:00, 28 November 2020 (UTC)Reply

Name it whatever you feel is natural. It should be fairly trivial to update if I decide I don't like the name down the road. The only real usability criterion is that the current output remains unchanged. Headbomb {t · c · p · b} 03:05, 28 November 2020 (UTC)
As you saw, I have finished and it should be ok for any number of doiN parameters. You might like to edit {{JCW-PUB-rank}} and remove the blank line at the bottom before the category. I suspect that the category should be moved into the noinclude tags which are already there. Johnuniq (talk) 00:11, 29 November 2020 (UTC)Reply
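As a rough illustration of handling any number of doiN parameters without one #if per number, a plain-Lua sketch follows. The function name `doi_list` and the simplified `{{doi|…}}` display text are assumptions made for this example; the real function emits wikitext equivalent to {{doi prefix}} with mode=t, which is not reproduced here.

```lua
-- Collects doi1, doi2, ... in numeric order, skipping empty values,
-- and joins them with a middle dot.
local function doi_list(args)
    local found = {}
    for key, value in pairs(args) do
        local n = type(key) == 'string' and key:match('^doi(%d+)$')
        if n and value ~= '' then
            found[#found + 1] = { index = tonumber(n), prefix = value }
        end
    end
    -- pairs() has no defined order, so sort by the number in the name.
    table.sort(found, function(a, b) return a.index < b.index end)
    local parts = {}
    for _, item in ipairs(found) do
        -- Simplified display; a real version would build the search link.
        parts[#parts + 1] = '{{doi|' .. item.prefix .. '}}'
    end
    return table.concat(parts, ' · ')
end

print(doi_list({ doi1 = '10.15233', doi2 = '', doi3 = '10.20532', doi10 = '10.3336' }))
```

Scanning the argument names instead of counting up from 1 means a gap in the numbering does not cut the list short, and an empty value produces nothing at all.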

Issue found

@Johnuniq: Found a small issue. In something like, where | is empty

Rank Target/Group Entries (Citations, Articles) Total Citations Distinct Articles Citations/article


405 Communications on Applied Electronics
[Beall's journal list]
1 1 1.000

It shows a {{doi|{{{1}}}}} instead of just nothing. Headbomb {t · c · p · b} 15:45, 12 December 2020 (UTC)Reply

Ouch, I'm writing some shitty code lately. I think I fixed that. Johnuniq (talk) 00:06, 13 December 2020 (UTC)Reply
It's OK, no babies, giraffes, or armadillos were harmed. Headbomb {t · c · p · b} 00:23, 13 December 2020 (UTC)Reply