Wikipedia talk:WikiProject Tree of Life/Archive 44

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Facto Post – Issue 21 – 28 February 2019

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Back numbers are here.

What is a systematic review?

Systematic reviews are basic building blocks of evidence-based medicine, surveys of existing literature devoted typically to a definite question that aim to bring out scientific conclusions. They are principled in a way Wikipedians can appreciate, taking a critical view of their sources.

PRISMA flow diagram for a systematic review

Ben Goldacre in 2014 wrote (link below) "[...] : the "information architecture" of evidence based medicine (if you can tolerate such a phrase) is a chaotic, ad hoc, poorly connected ecosystem of legacy projects. In some respects the whole show is still run on paper, like it's the 19th century." Is there a Wikidatan in the house? Wouldn't some machine-readable content that is structured data help?

2011 photograph by Bernard Schittny of the "Legacy Projects" group

Most likely it would, but the arcana of systematic reviews and how they add value would still need formal handling. The PRISMA standard dates from 2009, with an update started in 2018. The concerns there include the corpus of papers used: how selected and filtered? Now that Wikidata has a 20.9 million item bibliography, one can at least pose questions. Each systematic review is a tagging opportunity for a bibliography. Could that tagging be reproduced by a query, in principle? Can it even be second-guessed by a query (i.e. simulated by a protocol which translates into SPARQL)? Homing in on the arcana, do the inclusion and filtering criteria translate into metadata? At some level they must, but are these metadata explicitly expressed in the articles themselves? The answer to that is surely "no" at this point, but can TDM find them? Again "no", right now. Automatic identification doesn't just happen.

Actually these questions lack originality. It should be noted though that WP:MEDRS, the reliable sources guideline used here for health information, hinges on the assumption that the usefully systematic reviews of biomedical literature can be recognised. Its nutshell summary, normally the part of a guideline with the highest density of common sense, allows literature reviews in general validity, but WP:MEDASSESS qualifies that indication heavily. Process wonkery about systematic reviews definitely has merit.

Links

Evidence-Based Practice: Appraise, resources page from Duke University Medical Library & Archives.
What should Cochrane do next?, Bad Science blogpost 5 November 2014, Ben Goldacre.
Cambridge (UK) Science Festival event, How do scientific discoveries become clinical medicine?, ScienceSource workshop for ContentMine 23 March 2019, with systematic review process diagram. Also on Eventbrite for tickets, taking place in Makespace, 16 Mill Lane.
PROSPERO database of PRISMA, for registration of systematic review protocols.
Process wonkery thread, wikien-l mailing list, September 2006.
Meta-Analysis, xkcd cartoon.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 10:02, 28 February 2019 (UTC)

Vespertilionidae changes

I've made some taxonomic suggestions to cygnis insignis (I would have made them myself, but the editor and I sometimes conflict, so I thought that it would be a better way to make changes), namely subfamily Antrozoinae should now be changed to tribe Antrozoini somewhere under the Vespertilioninae as that is how the article is referenced now... Miniopterinae should now be changed to family Miniopteridae... Vespertilioninae article taxonomy can be changed as well... it looks like the higher and lower taxa for these conflict. Cygnis cites Mammal Species of the World as the reference, and states the higher taxa is correct. Cistugidae also seems to follow this way as well. I thought somewhere along the line MSW was out of date. Any experts on bats out there who can determine if MSW is out of date for these groups? Which should Wikipedia follow, the higher or lower taxa? I've also posted this on the bats task force talk page... feel free to leave your comments there... Pvmoutside (talk) 16:45, 7 March 2019 (UTC)

MSW is still stuck in a 2005 state (with some spotty partial updates, IIRC). For many groups, it does not reflect current taxonomy. If such a list in a higher taxon article is expressly sourced to MSW, I suppose deviations will have to be noted specifically, as is currently, awkwardly, the case with Vespertilionidae. Restructuring this to conform to, e.g., current ITIS status might be preferable. --Elmidae (talk · contribs) 17:29, 7 March 2019 (UTC)

Notability of scientists who have named taxa

I have started a discussion at Wikipedia talk:Notability (academics)#Notability of taxonomists about an issue that has been bothering me lately when wearing my NPP hat: does being a recognized taxonomic authority make you notable? Please pop over and comment; I suspect for some of you this may not be a new topic, and I think some codification would be useful. --Elmidae (talk · contribs) 22:09, 8 March 2019 (UTC)

Home of Ichneumonoidea

Does anyone know anything about the status of Home of Ichneumonoidea (Taxapad), run by Dicky Sick Ki Yu? The web address (www.taxapad.com) has returned nothing for the last week or so. It's used as a reference on Wikipedia almost 4,000 times. I've sent an email to Yu and have received no response. Internet Archive's Wayback Machine usually seems to have copies, so the information is recoverable. What do we do next? SchreiberBike | ⌨ 00:57, 8 February 2019 (UTC)

You should also ask at Wikipedia talk:WikiProject Insects/Hymenoptera task force --Nessie (talk) 01:34, 8 February 2019 (UTC)

The site is still down. I think I could partially automate (using AWB) a process to clean them up and create links to Internet Archive copies of the pages. With 4,000 of them it would be tedious at best. Is there a better way? SchreiberBike | ⌨ 03:08, 13 February 2019 (UTC)
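As a rough illustration of the kind of automation being discussed here, the following is a minimal sketch in Python, not the AWB process that was actually used. It assumes the standard Wayback Machine URL pattern (`https://web.archive.org/web/<timestamp>/<original URL>`), under which the Wayback Machine redirects to the capture closest to the given timestamp. The function names and the default timestamp are hypothetical choices made for the example.

```python
import re

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, timestamp: str = "20190101000000") -> str:
    """Build a Wayback Machine URL for original_url near timestamp.

    timestamp is a run of 4 to 14 digits (YYYY up to YYYYMMDDhhmmss).
    The default value is an arbitrary assumption for this sketch.
    """
    if not re.fullmatch(r"\d{4,14}", timestamp):
        raise ValueError("timestamp must be 4 to 14 digits")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def archive_links(wikitext: str, domain: str = "www.taxapad.com") -> dict[str, str]:
    """Map every URL on the given domain found in wikitext to a Wayback URL."""
    # Stop at whitespace and at characters that usually end a URL in wikitext.
    pattern = re.compile(r"https?://" + re.escape(domain) + r"[^\s\]|}<]*")
    return {url: wayback_url(url) for url in pattern.findall(wikitext)}
```

This only builds candidate archive links; whether a capture actually exists for a given page would still need checking, which fits the later note that a few references could not be linked to an archive.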

Maybe Wikipedia:Bot requests? --Nessie (talk) 03:42, 13 February 2019 (UTC)

These are fixed, except for a few which could not be linked to an archive and some other oddities. SchreiberBike | ⌨ 21:14, 13 March 2019 (UTC)

Motivation

A discussion on my talk page was started by Cygnis insignis. I thought I'd bring it here for a more comprehensive discussion if people feel like talking about it. I'll continue doing my thing unless a consensus develops to do something different... Pvmoutside (talk) 17:16, 14 March 2019 (UTC)

What is your motivation for moving articles on species to a common name? No doubt you regard it as an improvement, that is not what I am asking about, my query is about why that is an improvement. There are several reasons advanced that I am already aware of, they haven't persuaded me, are you able to justify your primary focus? cygnis insignis 15:10, 14 March 2019 (UTC)

Cygnis insignis, I only move articles from scientific name to common name when there is only one common name referenced from the references I use. These days, it is usually in reptiles with the Reptile Database and the IUCN. If there are multiple common names, I've switched many to scientific names. Wikipedia is not a scientific reference, but one used generally by the public. Most of the species readily recognized use common names, I'm simply following that format unless the wikiproject objects (i.e. many of the invertebrate ones, plants, etc.). If a wikiproject uses both scientific names and common ones, then the question becomes at what point does one use which... some sort of at least minimal criteria would need to be developed rather than using personal judgement to decide which one to use, in my opinion... Pvmoutside (talk) 16:17, 14 March 2019 (UTC)

I agree cygnis insignis 16:57, 14 March 2019 (UTC)

I'm going to move this discussion to tree of life wikiproject to see if it generates anything...Pvmoutside (talk) 17:16, 14 March 2019 (UTC)

There's no strictly rule-based approach that can be followed. It's determined by WP:AT and the application of the five criteria (not just Recognizability, particularly when the link WP:COMMONNAME is mis-interpreted to mean the English name). English names for some groups are purely made-up, and are no better known than the scientific name – and often less well known to "someone familiar with, although not necessarily an expert in, the subject area". The other crucial issue with English names is Precision – does the name unambiguously identify the taxon? These often conflict: genuinely vernacular names vary within and between countries, so although they are highly recognizable to some readers, they are not to others, and they are regularly ambiguous. Made-up and standardized English names, like the IOC list of bird names or the BSBI list of names of British plants, have the merit of precision, but are not actually the genuinely common name. The preference of the relevant WikiProject should also be considered. Each case needs to be decided on its merits by discussion and consensus. Peter coxhead (talk) 18:01, 14 March 2019 (UTC)

Peter, I hear what you are saying, but it sounds a little impractical... are you saying each time a new species page using an English name gets written for a wikiproject that uses both common and scientific names, then each page needs to be discussed first? So I'm writing amphisbaena species pages right now. I write about 10/day. Many of them have only scientific names or multiple English names, in which case there is no conflict, and the scientific name is used. If I come across one that has only an English name, then that needs to be discussed before the page is created despite having 2 references using just the one English name? That slows things down considerably and seems unnecessary. What if I choose a genus in which case most or all of the species have one English name each, that one name is referenced for example, by the Reptile Database and/or the IUCN, and there are 100 species in the genus? They all need to be discussed first?... Pvmoutside (talk) 20:01, 14 March 2019 (UTC)

@Pvmoutside: most species are 'obscure', little known by anybody but specialists, so the default should be to use the scientific name. If you want to use the English name for an amphisbaenian, I'd say that the onus is on you to be able to show that it's more recognizable than, and as precise as, the scientific name. Peter coxhead (talk) 23:20, 14 March 2019 (UTC)

P.S. watch spelling: amphisbaena is not the same as amphisbaenia. Peter coxhead (talk) 23:23, 14 March 2019 (UTC)

I'm working on writing up a much longer response, but I'd take the Reptile Database with a big grain of salt. I think many people are aware that Agkistrodon piscivorus has two very widely used English common names. Reptile Database technically doesn't list ANY common names for the species (although one of the common names can be inferred from the subspecies), while presenting a German common name that is a clear translation of the other English common name (and is a German common name for an American species really very useful for German speakers)? It appears to me that coverage of common names on the Reptile Database is far from complete. Nor are RD listed names necessarily unique; "Mexican Moccasin" is the only common name listed for Agkistrodon taylori, but "Mexican Mocassin" is also listed as a common name for Agkistrodon bilineatus. Plantdrew (talk) 03:06, 15 March 2019 (UTC)

That sometimes happens as species are split, and the same common names are used after the split for now distinct species across both home regions. Also, the same common name might be used for different species in different regions or continents. In both cases, if a common name was written for one of the species first, I move it to the scientific name and create a disambig page for both species. Peter, I agree with you most species are obscure, but, again, at what point is a common name considered obscure vs one that is recognized in those wikiprojects that use both.....my preference is to still use common names when there is only one referenced, but wouldn't object to someone moving it to the scientific name if someone feels strongly that way.....Pvmoutside (talk) 14:41, 15 March 2019 (UTC)

Pvmoutside, what is meant by "those wikiprojects that use both"? cygnis insignis 17:38, 15 March 2019 (UTC)

Most of the vertebrate wikiprojects use both......birds have English names for all species, mammals for most....the others (reptiles, amphibians, fish) use either......all invertebrates default to scientific name unless there is a common name readily recognized (ie American lobster, Monarch butterfly, Eastern oyster, etc.), and plants use scientific names for everything….Pvmoutside (talk) 18:48, 15 March 2019 (UTC)

Pvmoutside, Plants tend to be under the name of the species, and any name that is used to refer to it in the sources (not an uncited note in a database) is also mentioned, probably in the lead. Another thing you are confused about. cygnis insignis 20:14, 15 March 2019 (UTC)

Without knowing an enormous amount about reptile names, if an experienced editor feels an article can/should be moved a certain amount of boldness is hardly going to worry me. I've certainly done it myself, where bird genera can be moved to an unambiguous common name (Teretistris to Cuban warbler, as I did recently, for example). Sabine's Sunbird talk 19:20, 15 March 2019 (UTC)

And if someone else were inclined to boldly move reptiles (sensu lato) to the most common, universal and nearly certain to be unambiguous names they would not be able to without exercising a privilege. And it is only reversible by using the same privilege to undo one unilateral decision with another. I use sources to … no, I read sources to improve content and reflect the stuff that is important, the common names are merely an interesting diversion. And same general inquiry stands, how is moving bird genera, or whatever, to whatever is found, moved to, and elevated as a 'common name' an improvement? cygnis insignis 19:44, 15 March 2019 (UTC)

All editing here is a privilege and everything we edit can and probably will be edited by someone else and most of our editing is 100% unilateral. Me moving something doesn't and shouldn't preclude someone else from moving it somewhere else if they feel the need. If disagreements arise we can always talk about it (and conversations can be better if we can drop the haughtiness from your tone, Cygnis). A quick review of Pvmoutside's edits shows that they too use sources to draft material that is "important". I do too. So maybe don't paint this discussion in terms of people contributing and those not, please. Sabine's Sunbird talk 19:54, 15 March 2019 (UTC)

Sabine's Sunbird, read it how you like, I mean no offence to anyone, especially you. Please don't read in a way in which any concern can be dismissed outright. I grant more courtesy than I am given here. Please excuse that I am continuing to challenge something that was habitually done for an as yet unexplained reason. cygnis insignis 20:06, 15 March 2019 (UTC)
show where this isn't a strawman. And note that I cannot, and would not, move the pages with the tool you have. I can use sources to add or remove content, that is my point. cygnis insignis 20:11, 15 March 2019 (UTC)

And as to why things are often better on a common name... years of experience have made me come to the conclusion that articles with common names have higher page hit counts than scientific names, and unaccountably I want the stuff I write to be read. Sabine's Sunbird talk 20:03, 15 March 2019 (UTC)

with the disclaimer that you have been moving pages for years, on the assumption it was going to be better. Maybe you are right, I don't know. cygnis insignis 20:18, 15 March 2019 (UTC)

14,000 estimated Google hits for Teretistris and 283 estimated Google hits for "Cuban warbler", some of which refer to Miguelito Valdés. Yes, WP:GOOGLETESTS have many problems; in this case it's not picking up the paywalled reliable source for the vernacular name. Surely a "common" name that is so uncommonly used needs to cite a source. I don't see any major uptick in page views following the move (barring the spike from a DYK), but it's probably too soon to tell. Using the vernacular title cardinal (bird) for Cardinalidae (most of which are known as tanagers, buntings or grosbeaks) certainly helps that page have high hit counts (at the expense of all the readers who are trying to find information about northern cardinals). Plantdrew (talk) 20:27, 15 March 2019 (UTC)
I cited 3 - the change was in no small measure to denote the newish status of the group as a family (we have an inexplicable policy here about placing article names on the genus instead of family). But my data was more along the lines of this - common name article duck - 39 thousand views last month - the more important article Anatidae - 7 thousand views. Readers are looking for common names, know things by common names and where possible we have generally tried to do that in the past; with the obvious proviso that various wikiprojects have made decisions about how best to name their articles and I'm in favour of letting them continue to do so. WP:BIRD has gone all in for common names (and one specific set) for species and generally uses them where possible for higher level taxa, and I'll continue to work in that spirit going forward. That doesn't involve many moves anymore, btw. Most article names are pretty stable. Sabine's Sunbird talk 20:43, 15 March 2019 (UTC)

The three sources happen to agree in this example pp 44-45, which refers to two taxa, or one, depending on the treatment and does not indicate the rank. The bird subproject mostly defers to the IOC, which is yet another attempt to substitute one English name for bird species, they don't attempt this for genera and collate the names for families ('Cuban Warblers' being the only genus of 'Cuban Warblers', there is only the taxon 'Cuban Warbler' to note). cygnis insignis 22:16, 15 March 2019 (UTC)

RfC on whether or not to allow the use of templates to store data

See Wikipedia talk:Template namespace#RfC on templates storing data. The automated taxobox system currently relies on storing the taxonomic hierarchy in taxonomy templates, which one outcome of this RfC would forbid. Peter coxhead (talk) 10:13, 16 March 2019 (UTC)

Edit warring about common & scientific name format in lede - can we please come to some current consensus here?

I am getting completely fed up by what is currently going on with cygnis insignis and the formatting of the first sentence in articles on species with common names. To wit: for whatever reason, they insist on phrasing the first sentence as "Common name, species scientific name, is ...". Whereas the vast majority of articles (at a guess, 90%+ - didn't want to work up a regex for it) use "Common name (scientific name) is ...". They will happily edit war over it (e.g., [1][2]), and appear to consider this short exchange as a justification for continuing to do so [3]. I'm not the only one who has blown a tyre on that particular pile of caltrops.

Now I'm aware that we are generally handicapped by the fact that our nominal topic-specific MOS, which contains the latter usage, still has this big rider on top that says "feel free to ignore me". I've also seen that there have been a couple of discussions in the past already ([4]; [5]) that touched on the issue and went nowhere fast. However, I think the current state of affairs is quite counter-productive. I see no benefit in scattering around a small minority of articles that have divergent formatting and are enforced to remain in that state by ownership behaviour, for no practical gain. All the more since, when you as a reader are used to the C(S) format, encountering these articles looks just as if no-one had cleaned them up yet. To top it off, "Common name, species scientific name" is plain misleading - both names are the species name; the usual phrasing does make that clear.

I don't want to drag this to AN/I as disruptive behaviour (although it is getting there) and I don't want to get into one of these easily-provoked spats with cygnis insignis. Instead, I want to suggest that editors currently active in this area arrive at a consensus about preferred usage, and come to some decision about how to handle these instances. Do we treat this particular formatting issue as free-for-all, article creator gets to determine their version (the WP:ENGVAR/WP:REFVAR approach)? Or do we promote a uniform approach that keeps articles in step with the majority? Then there would be something to point to, and we can ask people to stick to it.

(As an addendum, I would be very keen to get Wikipedia:Manual_of_Style/Organisms to functional status, and I'm not sure as to the reasons why it is still lingering in draft mode at this point?) --Elmidae (talk · contribs) 22:08, 4 March 2019 (UTC)
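For anyone wanting to put a number on the "90%+" guess above, the two lead formats could be told apart with a pair of regular expressions. The following is only a sketch under stated assumptions: it works on the plain-text first sentence (wiki markup such as bold and italics already removed), it assumes the scientific name is a simple two-word binomial, and the function and label names are hypothetical. It has not been run against real article text and would certainly miss edge cases.

```python
import re

# "Common name (Genus species) is ..."
PAREN_FORMAT = re.compile(r"^(?:The\s+)?[^(),]+\s\([A-Z][a-z]+ [a-z-]+\)")

# "Common name, Genus species, is ..." or "Common name, species Genus species, is ..."
COMMA_FORMAT = re.compile(r"^(?:The\s+)?[^(),]+,\s+(?:species\s+)?[A-Z][a-z]+ [a-z-]+,")


def lead_format(first_sentence: str) -> str:
    """Classify the opening of a species article's first sentence."""
    if PAREN_FORMAT.match(first_sentence):
        return "parenthesised"
    if COMMA_FORMAT.match(first_sentence):
        return "comma-delimited"
    return "other"


# Example sentences taken from this discussion.
print(lead_format("Thomson's gazelle (Eudorcas thomsonii) is the ..."))
print(lead_format("The lion, Panthera leo, is ..."))
```

Counting the labels over a sample of articles titled at the English name would give a rough share for each format; leads that begin with the scientific name fall into "other" by design.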

The standard even in scientific writing is common name (species name) if a common name exists. Considering there isn't really any justification for whatever they are trying to do, this posting should at least be notice for them to knock it off. It's just fluff and redundant, especially with the species box spelling out the species name. To answer your question, this is really more of a MOS issue than just picking whatever the article creator wants. I'm usually wary of WP:CREEP because enforcing this convention shouldn't need to be spelled out, but if we're getting the oddball editor causing problems with it and not letting up, it's probably time to strengthen the standards to lessen any wikilawyering. Kingofaces43 (talk) 22:34, 4 March 2019 (UTC)

We do have Wikipedia:Manual of Style/Lead section#Organisms which I think agrees with you. (Having just looked at it, I may make a suggestion that where the article is under the scientific name, that we should not use parentheses for the common name because in Google Knowledge Graph and Wikipedia's own Page Previews, the material in parentheses is skipped over, but that's another issue.) Hope that helps. SchreiberBike | ⌨ 00:56, 5 March 2019 (UTC)

I know I'd brought up MoS guidance and Knowledge Graph/Page Previews stripping parentheticals somewhere before. Was that discussion on your talkpage, SchreiberBike? Plantdrew (talk) 02:12, 5 March 2019 (UTC)

We are just talking about it stripping the common name, not the fluff. cygnis insignis 02:24, 5 March 2019 (UTC)

I whole-heartedly support making it the standard usage across the project to list as "Common name (species name)" in the lead. I hope we can put this issue to bed. Enwebb (talk) 02:42, 5 March 2019 (UTC)

~~Easier than creating article content.~~ For what reason? cygnis insignis 06:39, 5 March 2019 (UTC) struck, mistaken identity, but if anyone can answer why this is better or where "The standard …" that states how scientific writing must use common names is published, I can put this to rest. Things that are not commonly known do not have 'common names', and sources make it very clear that they are not intended to replace or run parallel or mean anything next to the accepted method of communication about organisms. cygnis insignis 09:49, 5 March 2019 (UTC)

@Plantdrew: I do remember you talking about it before, see the discussion I just opened at Wikipedia talk:Manual of Style/Lead section#Parentheses in the lead affecting Knowledge Graph and Page Previews SchreiberBike | ⌨ 03:18, 5 March 2019 (UTC)

While I think the current approach of using common name (scientific name) is good style and suitable for the lede, the stripping out on preview is an issue. If it is undesirable for the common name to be stripped out on preview, isn't it equally undesirable to strip out the scientific name? Perhaps "The lion, Panthera leo, is ... " would be better than using parentheses although it looks odd to my eye. There is certainly no need to include the term species before the scientific name. Jts1882 | talk 08:17, 5 March 2019 (UTC)

Two points:

Let's be really clear that we are only talking about articles whose title is the English name. Articles should begin with their title wherever possible (MOS:FIRST). Articles at the scientific name (ultimately the great majority of taxon articles) should begin with the scientific name.
The scientific name should not be stripped off on preview, since in many cases it's necessary for precise identification, and even if not, it's a key part of the article. So if material in parentheses is not shown, we should stop using parentheses in the opening sentence. (Perhaps dashes? "The lion – Panthera leo – ..."?)

Peter coxhead (talk) 08:34, 5 March 2019 (UTC)

Previews are indeed stripping out the name when in parentheses; didn't know that. (I'm not sure if that is a great feature of previews - seems excessive.) I would agree that this is a problem regardless of whether the format is "scientific name followed by common", or the other way around. It's definitely desirable to have both names present. Under these circumstances, we maybe do need to adopt another format. If so, I would suggest Jts1882's version above ("The lion, Panthera leo, ..."), as retaining the standard bold for the article title, and using a comma-delineated, non-parenthesized, italic scientific name, as is reasonably common in literature. For the inverse case, "Vitis vinifera, the common grapevine, ..." (as per MOS).

Note that if we care how it looks in the previews, that's a bot task coming up to reformat a large number of articles. Basically all species of "common interest" already have articles and are set up along these lines, after all. --Elmidae (talk · contribs) 14:50, 5 March 2019 (UTC)

I just previewed Chukar partridge. Both common name and scientific name remained intact....Pvmoutside (talk) 15:04, 5 March 2019 (UTC)

I see the preview without the scientific name. Instead of "or simply chukar (Alectoris chukar)" I see "or simply chukar". Are you sure you're not mistaking the "also called Chukor" for the scientific name? Not sure why that particular common name is italicised. Jts1882 | talk 15:17, 5 March 2019 (UTC)

No scientific name visible in preview. You want to share your secret settings, Pvmoutside? :) --Elmidae (talk · contribs) 16:26, 5 March 2019 (UTC)

Are we talking previews in google or with the article viewer inside Wikipedia (which I have disabled)? Either way, I would favour Elmidae's ("The lion, Panthera leo, ...") over any that inserted the word species before the binomial if we consider this an important enough reason to change (and I'd be inclined to think it is). I'd suggest we want a pretty wide discussion to make sure everyone's on board. Sabine's Sunbird talk 17:04, 5 March 2019 (UTC)

Stripping happens with both Google Knowledge Graph and WP Previews; tested. --Elmidae (talk · contribs) 17:28, 5 March 2019 (UTC)
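To make the stripping behaviour concrete, here is a minimal Python sketch that mimics the effect described in this thread: parenthesised text is dropped from the lead sentence. It is not the code used by Page Previews or Google Knowledge Graph, whose actual rules are not known from this discussion; the function name is hypothetical and the approach (repeated regex removal of innermost parentheses) is just one simple way to reproduce the observed result.

```python
import re

# Matches an innermost parenthetical together with any whitespace before it.
_PARENTHETICAL = re.compile(r"\s*\([^()]*\)")


def strip_parentheticals(text: str) -> str:
    """Remove parenthesised spans, repeating until nested ones are gone."""
    previous = None
    while previous != text:
        previous = text
        text = _PARENTHETICAL.sub("", text)
    return text


lead = "The chukar partridge, or simply chukar (Alectoris chukar), also called Chukor"
print(strip_parentheticals(lead))
```

On the chukar lead quoted in this thread the scientific name disappears, while a comma-delimited lead such as "The lion, Panthera leo, ..." passes through unchanged, which is the trade-off being weighed here.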

so when I select edit, scroll to the bottom of the page, and select show preview, this is the line that appears...…"The chukar partridge, or simply chukar (Alectoris chukar), also called Chukor"...….

You are talking about edit previews, which is another thing entirely. This discussion is about the popup previews that appear when you mouse over a wikilink. Jts1882 | talk 18:30, 5 March 2019 (UTC)

I see parentheticals in the pop-up previews when I'm logged in, but not when I'm logged out (I'm pretty sure when I first brought this up, I wasn't seeing them even when logged in). Note that in general, stripping parentheticals is probably a good thing; there are more articles that use parentheticals for IPA pronunciations, or to present the original form of foreign language terms that have been translated/transliterated, than there are articles that use parentheticals for common/scientific names. Plantdrew (talk) 18:21, 5 March 2019 (UTC)

I just logged out and the parentheses are still stripped. You could have seen a different version or possibly be using a different popup. There were two optional gadgets for popups before they made the current one default. The other (still an option?) has information more useful to editors. Jts1882 | talk 18:30, 5 March 2019 (UTC)

I agree that adding "species" in front of scientific names is an uncomfortable and unusual construction (is there some field in which it's common? I've never encountered it...). That said, I'm generally happy to give major page contributors relatively broad leeway in stylistic decisions. As for previews omitting text between parentheses, I feel unconvinced that this is something we need to fix. Is it essential the scientific name appear in the preview? Folks seeing the Google (or Wikipedia) preview for "Tiger" can find out the scientific name by clicking on the link and reading the first sentence of the article. How many Googlers are really going to need the scientific name in the preview to figure out what a tiger is? Like Peter coxhead noted above, most species articles already start with the scientific names. For the ones that have common names, I think my first preference is the status quo ("The lion (panthera leo)..."). If folks feel it's critical for previews to include scientific names of species with common names, then I think the suggestion above ("The lion, panthera leo,...") is also perfectly readable. Thanks, and happy editing! Ajpolino (talk) 18:16, 5 March 2019 (UTC)

I'll second Ajpolino's comments. If we decide to go with Sabine's Sunbird's suggestion of a different format, then I agree a wider discussion needs to take place, and if that suggestion is taken up, a bot would need to be created to change all the thousands of pages to make them all consistent....Pvmoutside (talk) 18:27, 5 March 2019 (UTC)

I agree with Ajpolino and oppose the use of the word species in this way, except (minor quibble) in the parenthetical case it should be "The lion (panthera leo)..." with the binomial in italics, not "The lion (panthera leo)..." in plain text; scientific names should always be italicised. - Nick Thorne ^talk 21:27, 5 March 2019 (UTC)

I strongly support the desire that all editors continue to use the standard format example found in Wikipedia:Manual_of_Style/Organisms#Lead_section (ie. Thomson's gazelle (Eudorcas thomsonii) is the ...) for the article lead. No consensus exists (or has even been proposed) to divert from this status quo. The short exchange at Wikipedia_talk:Manual_of_Style/Lead_section#Systematic_suppression does not provide any justification for changing styles, let alone edit-warring over it, without a new RfC being proposed and accepted. As a secondary issue, the pop-up preview stripping should be addressed separately if there is actually any concern that the scientific binomial needs to be visible in the preview. Loopy30 (talk) 22:05, 6 March 2019 (UTC)

There was a consensus not to do it, linked above. There was an agreement to do it among several bird project members. There is no justification to have that MOS constraint, what other topic has such a dictate? The conversation I opened at the MOS talk page is short for a reason, this is no reason that anyone wants to publicly admit. Where it exists is because of gnomes who mechanically applied this in the belief they were doing something righteous, stopping undue weight being given to 'science'. cygnis insignis 02:32, 7 March 2019 (UTC)

@cygnis insignis, I do not see where there was "a consensus not to do it" ("it" being continuing to follow the standard MOS lead style for organisms). The "links above", including your post to the MOS talk page, do not support the statement that there exists "a consensus not to do it". As far as being a constraint, yes it is a constraint that all articles on the project "should" follow the MOS guidelines wherever possible. Although still languishing as a draft in the MOS pages, what exists there is still the most current guideline that we have on this issue. As this discussion thread started with a request that we all continue to follow this style rather than start to adopt a mix of styles or even worse, needlessly edit war over which is the "correct style" for the lead in organism articles, I strongly support continuing to use the current standard until such time as a new consensus is reached. Loopy30 (talk) 15:29, 11 March 2019 (UTC)

@Loopy30: Nice of you to pile on to the user's big spit, but perhaps things are not as simple as they make out. And are you in favour of or opposed to the rewording that avoids it also being suppressed in some outputs? That is crucial to your demand that there be a moratorium on changing (or fixing) the lede. I accept that users think it should not be emphasised and that if it is missing that is not a concern. I also notice that users believe a common name, or names, replaces the systematic name. This is not supported in sources, which state the opposite: common names range from imperfect to useless to misleading, and they do not replace the name used to identify the topic of the page. So why do it? Why did several gnomes do this to thousands of Polbot articles and point to that as consensus? Nobody can, or is willing to, say WHY this is better. Can you answer that without recourse to CONSISTENCY or 'this is what readers want or at least I know what is good for them'? How about templates for distribution, lists of taxa, making the author after the name small (is that another rule I'm breaking?) cygnis insignis 18:50, 11 March 2019 (UTC)

Hi cygnis insignis, before I chime in on any discussion to avoid suppression of names on previews, I would like to first do a bit more research and weigh the views of other editors who may have more technical knowledge than myself. That being said, my first impression would be that the stripping doesn't really matter in a preview because the full information is still present on the main article page. All this, however, is unrelated (let alone crucial) to my request that we a) recognise that we already have a current consensus for the style of presentation for names in the lead and b) agree to keep to the current style in the MOS until such time that a consensus is reached to change it. Loopy30 (talk) 20:38, 11 March 2019 (UTC)

Loopy30, show me that consensus, please. cygnis insignis 21:23, 11 March 2019 (UTC)

Loopy, I think the current conversation, and the revelation that the parentheses are stripping binomials from previews, is sufficient justification to re-examine the long-running consensus. It may be that the solution is technical as proposed below by SMcCandlish, or, if this isn't possible, changing our approach to how we introduce species in the lead. Sabine's Sunbird talk 21:50, 8 March 2019 (UTC)

@Sabine's Sunbird, yes by all means we can go ahead and "re-examine the long-running consensus" while still agreeing that we currently have a consensus on the issue. While I recognise that this consensus may end up changing in the near future based on a new discussion of the need to accommodate previews without stripping binomials from the information presented, that discussion to change the consensus should not lead us to abandon the current style format in the meantime or justify edit-warring to prematurely change the lead style. If such a consensus to change is reached in the future, it will likely then take a bot to re-word the lead sentences of all the many articles affected. Loopy30 (talk) 15:29, 11 March 2019 (UTC)

Consistent presentation is helpful to readers and to editors and potentially to various other things (automated parsing, etc.) If we have proof that parentheticals are being stripped by tools under our control, then we should either change the tools' behavior or use a different markup (like en dashes). Another approach, which I've suggested before for other reasons, is to create a template for this purpose; doing so would permit us to use a CSS class around a parenthetical bi- or trinomial and have tools make an exception for it: The {{TemplateName|foo|or=phoo|sci=Genus species|sci2=G. s. subspecies|aka=barbaz}} in South Africa, is ... might output "The foo or phoo (Genus species, or G. s. subspecies), also known as the barbaz in South Africa, is ...)" on the browser side (see wikisource for example class output). At any rate, we need not concern ourselves with what Google or other tools outside our control are doing because we can't control them and they could change tomorrow. — SMcCandlish ☏ ¢ 😼 12:45, 7 March 2019 (UTC)
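To make the template idea above concrete, here is a minimal Python sketch of how a preview tool might spare a marked-up scientific name while still stripping ordinary parentheticals. The class name `sci-name`, the helper names, and the simplified HTML are all assumptions made for illustration; none of this reflects the actual behaviour of MediaWiki previews or of any existing template.

```python
import re

# Hypothetical class name; the proposal above does not specify one.
SCI_CLASS = "sci-name"


def render_lead(common, sci):
    """Mimic what a lead-sentence template might emit:
    the binomial wrapped in a span that tools can recognise."""
    return f'The {common} <span class="{SCI_CLASS}">(<i>{sci}</i>)</span> is ...'


# Spans carrying the marker class are protected from stripping.
_PROTECTED = re.compile(rf'(<span class="{SCI_CLASS}">.*?</span>)')
# An ordinary (non-nested) parenthetical with any leading whitespace.
_PAREN = re.compile(r"\s*\([^()]*\)")
# Any HTML tag, removed last to produce plain preview text.
_TAG = re.compile(r"<[^>]+>")


def preview(html):
    """Strip parentheticals except those inside a protected span."""
    parts = _PROTECTED.split(html)
    # With one capturing group, odd-indexed parts are the protected spans.
    kept = [p if i % 2 else _PAREN.sub("", p) for i, p in enumerate(parts)]
    return _TAG.sub("", "".join(kept))


print(preview(render_lead("lion", "Panthera leo")))
print(preview("The lion (<i>Panthera leo</i>) is ..."))
```

The first call keeps the binomial because it sits inside the marked span; the second shows the stripping that an unmarked parenthetical would receive.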

Then have the default of the template suppressing that sciencey name with a preference to switch it on? This thing you are not bothered about one way or the other? I am a breath away from asking at FTN for an opinion on this 'undue weight to meaningful classification'. cygnis insignis 13:46, 7 March 2019 (UTC)

I spent way too much time looking for it, but I finally found the discussion that established the current recommendation: Wikipedia_talk:WikiProject_Biology/Archive_4#Consensus_how_scientific_names_are_displayed_in_the_lead_of_species_articles_listed_under_common_names (WikiProject Biology?? I never would've guessed that any important discussion had ever happened there). The recommendation was apparently moved from WP:FAUNA to MOS following Wikipedia_talk:Manual_of_Style/Archive_128#Merge_WP:FAUNA_sections_to_MOS. Plantdrew (talk) 21:24, 11 March 2019 (UTC)

Thank you, most sincerely, I think that fills in most of the blanks for me. Or is going to once I finish reviewing some interactions, so bloody average! cygnis insignis 22:00, 11 March 2019 (UTC)

The previous discussion was not advertised here, and among ToL descendant projects, appears to have been only advertised at some of the vertebrate ones (birds, mammals, AAR, fish and sharks, but not turtles or rodents). WP:MOSLEAD guidance on placing common name in parentheses when the article has a scientific name title appears to have been entirely undiscussed (but WP:MOSORGANISM suggests using commas in that case due to the parenthetical stripping issue). Plantdrew (talk) 23:24, 11 March 2019 (UTC)

Attempted summary

Since the discussion seems to have about run its course, I'll try to sum up my overall impression:

The form used in the vast majority of species articles under a common name: "The lion (Panthera leo) is..." - is the consensus of previous discussions, MOS-compliant, conforms to standard publication usage, and is generally regarded as what should be used. A parenthesis-less form: "The lion, Panthera leo, is..." - while less common, is probably not worth tussling over if it's used, and we may end up having to use it for other reasons anyway (see below). Other constructions, especially those inserting an extra "species" into the expression, should be avoided.
We don't seem to have arrived at any consensus regarding what is probably the more important point: is it desirable to make sure that in such articles, the scientific name in parentheses is not stripped out in Preview? And if so, what should be done about it? - I believe it might not be a bad idea to set up an RfC to determine whether the community at large considers this an issue. Because if people overall don't think so, large-scale (bot) retrofitting, for example, would be a no-go from the start. --Elmidae (talk · contribs) 22:58, 19 March 2019 (UTC)

I think this is an accurate summary. The current consensus is the preferred format but there is the issue with the preview, which I think is important. Ideally the preview would recognise that scientific names are a special case, but this might not be possible. I wonder if a template such as {{organism name|lion|panthera leo}} could change the preview behaviour or be a flag for it to not strip the name, although I generally don't like these types of template as they impair readability. Jts1882 | talk 09:17, 20 March 2019 (UTC)

Wikidata discussion

For anyone interested, there is a discussion on Wikidata regarding taxa and their names, and whether they should be separated into two items. The decision will not affect how we treat taxonomic articles on Wikipedia, but people with nomenclatural and/or data-management expertise may have good insight. --Animalparty! (talk) 00:23, 19 March 2019 (UTC)

The substance is largely a repeat of an earlier discussion. The relevance to us here is:

It confirms, yet again, that what Wikidata currently labels "taxon" is actually a "taxon name", so when entries like Reynoutria japonica (Q18421053) say that something is an instance of Q16521, this doesn't mean "taxon" but "taxon name". There appears to be opposition to changing the label, for reasons that I don't understand. I'm still hopeful!
Wikidata currently doesn't model taxa, and it seems that, although it's regularly suggested that it should, no-one has yet put forward, with a worked example, a way to do it. This matters to us because it's been suggested that we should use Wikidata for taxoboxes rather than taxonomy templates, but this would require them to model taxonomic hierarchies in the way that taxonomy templates do.

Peter coxhead (talk) 17:28, 20 March 2019 (UTC)

These discussions make my head hurt. The Wikidata taxon seems to be both the object, the "group of one or more organism(s), which a taxonomist adjudges to be a unit", and the name for the object, which causes problems when the taxon name changes. It cannot work unless the name is separated from its object or, at least, only does if the name is permanently linked to the object. The argument that a taxon is a concept with four parts, "the name, the author(s), the publication and the date" makes more sense to me. The name is linked to an author and publication that defines a particular hypothesis for what the taxon is and what it applies to. Anyway, this argument will be running for a while yet.

P.S. Can anyone remember the name of the site that graphically displays the network of taxa on Wikidata? Someone suggested it be added to articles a while back along with wikispecies and commons. My memory says it was a single word beginning with S. Jts1882 | talk 17:55, 20 March 2019 (UTC)

Scholia? Here's the Scholia for C. elegans. Plantdrew (talk) 18:52, 20 March 2019 (UTC)

The Scholia for Scilla (Q157238) is interesting. It shows the problem that arises when Wikidata is forced to allow multiple parent taxa to preserve NPOV, but then lacks any way to choose a self-consistent classification. It's a nice example for those who think taxoboxes can be built from Wikidata. Peter coxhead (talk) 22:42, 20 March 2019 (UTC)

Reasonator is another Wikidata visualization tool; see here for Scilla. Plantdrew (talk) 16:13, 21 March 2019 (UTC)

When more than one parent is given, Scholia merges them in the table and shows all in the graph; Reasonator uses only the second. Peter coxhead (talk) 16:26, 21 March 2019 (UTC)

Scholia is inconsistent in the graphic. With Felidae it only shows Feliformia as parent (the third one, like Reasonator) and ignores the two superclasses. Not sure if this is because there are more than two parents or something to do with two superclasses both giving Feliformia as the parent.

Have you seen this graphic of wikidata rank relationships? Jts1882 | talk 18:00, 21 March 2019 (UTC)

@Jts1882: the name, the author(s), the publication and the date do not specify a taxon in the sense of a particular group of organisms. These four define the taxon name, not its scope (circumscription), other than the need to include the type. The IPNI entry "Asparagaceae Juss. -- Gen. Pl. [Jussieu] 40. 1789 [4 Aug 1789] ; nom. cons." does not distinguish between the APG's Asparagaceae s.l. and the Asparagaceae s.s. of those who do not accept APG's lumping. As per my comments at the Wikidata discussion, Hyacinthaceae is the same taxon as Scilloideae in the two currently used alternative systems (which is why we rightly have a redirect from the former to the latter), although not one of the "four parts" is the same. Peter coxhead (talk) 22:42, 20 March 2019 (UTC)

What I was trying to convey was the idea that a taxon is a concept, a hypothesis about a group of organisms, which will be a particular point of view in a reference applied to a taxon name (the taxon name sensu that reference). So Hyacinthaceae and Scilloideae would be different taxon concepts applied to the same group of organisms. There seem to be three levels: the actual group of organisms, the name with its formal authority, then a variety of hypotheses about that name. Which is the taxon? When we now apply a name such as Felidae Fischer von Waldheim, 1817 to a group of organisms such as the cat family, we are neither using the exact name he used nor describing the same group of organisms. I have no idea how this should be put in a database. There is also the problem of how to reconcile public editing with curation of a strict database structure. It seems impossible. Jts1882 | talk 07:21, 21 March 2019 (UTC)

I'm beginning to agree that it's impossible! A variety of hypotheses about that name isn't right, I think. The hypotheses (i.e. the choices that can be made by taxonomists) are (1) the circumscription of the taxon (2) its position (rank, placement) relative to other taxa. Once (1) and (2) are settled, the name follows algorithmically – (ignoring conservation of names) by finding the oldest name used for a type that falls into the taxon thus established. Applying the slightly different algorithms in the different codes isn't easy, partly because they are complicated, and partly because the research needed to find the oldest (highest priority) name is difficult. However, the variety of hypotheses apply to the taxon, not the name. The circumscription, rank and placement determine the name uniquely. You can't hypothesize what the name for a family in which Felis is placed should be called; it has to be "Felidae", unless the family includes a genus with a higher priority family name formed from it.

At the risk of becoming repetitive (but writing this out helps me at least to be sure I've understood it), perhaps in your terms the three "levels" are:

Taxon = group of organisms, i.e. circumscription
Taxon concept = rank and placement of the taxon
Taxon name – determined by (1) and (2).

However, crucially, it's not reversible: the taxon name does not determine fully the taxon concept or taxon. It's largely this that makes putting it into a database difficult, I think. Names of things other than taxa may need to be disambiguated, but then refer to one of a defined set of entities. Papers by Manning et al. (such as this one) explicitly use "Hyacinthaceae" as a synonym for the APG's "Scilloideae", but there's no guarantee that any future paper even by the same authors would do so, hence the meaning of these two names and their synonymy is not fixed. Peter coxhead (talk) 10:25, 21 March 2019 (UTC)
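The one-way relationship described above can be illustrated with a toy Python sketch. It is a deliberate simplification: the names, dates, and type genera are hypothetical, rank and conservation of names are ignored, and the real codes of nomenclature are far more involved. It only shows that fixing the circumscription selects a name by priority, while the name alone does not recover the circumscription.

```python
from typing import Iterable, NamedTuple


class Name(NamedTuple):
    name: str
    year: int         # year of publication, used here as the only priority criterion
    type_genus: str   # the type that anchors the name


# Hypothetical names and dates, for illustration only.
CANDIDATES = [
    Name("Aidae", 1800, "Aus"),
    Name("Bidae", 1850, "Bus"),
]


def name_for(circumscription: Iterable[str]) -> str:
    """Return the oldest candidate name whose type falls inside the taxon."""
    included = set(circumscription)
    eligible = [n for n in CANDIDATES if n.type_genus in included]
    if not eligible:
        raise ValueError("no available name has its type in this circumscription")
    return min(eligible, key=lambda n: n.year).name


# Two different circumscriptions yield the same name,
# so the name cannot be mapped back to a single taxon.
print(name_for({"Aus", "Bus"}))
print(name_for({"Aus"}))
print(name_for({"Bus"}))
```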

Isn't that an argument for a structured data approach? Hyacinthaceae sensu [whatever] is something you could map to Manning et al. 2009 on one hand, and the circumscription they're using on the other, isn't it? Synapomorphies, seed size ranges: couldn't all these things be included in your data model, built into Wikidata? Guettarda (talk) 12:30, 21 March 2019 (UTC)

@Guettarda: it would seem so, and my first assumption was that Wikidata should have a data model that included such information. The key issue seems to be that, yes, you need entities in the model with the same name but the equivalent of "sensu" qualifiers (just as we less formally have Template:Taxonomy/Spermatophyta and Template:Taxonomy/Spermatophytes/Plantae to capture two different approaches to classification) but you also need to be able to choose which paths to follow. In the figure opposite, which I posted to a Wikidata discussion, there are three names (rounded boxes) and three taxa (square boxes). You must either follow the green arrows (sensu Stace, 2010, for example) or the red arrows (sensu APG, for example). Crucially, if you arrive at Taxon 3 (which is Lemnaceae sensu Stace and Lemnoideae sensu APG), by following the green arrow from Lemnaceae, you must not follow the red arrow and say that its parent is the taxon which is Araceae sensu APG. Now this can be handled via special purpose code (as we can write here in Module:Autotaxobox), but no-one seems to be able to say how to handle it in Wikidata, which is a relational database. As the Scholia example I gave above shows, without special handling, the classifications get merged and make no sense. Hopefully, it's just a failure to work out how to do it, rather than a total block. Peter coxhead (talk) 15:41, 21 March 2019 (UTC)
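As a rough illustration of the "choose which paths to follow" requirement, the following Python sketch keys every name and every parent link by the classification it belongs to, so that a lookup never mixes the two systems. Only Taxon 3 and its two names come from the comment above; the identifiers `taxon1` and `taxon2`, the exact arrows, and the function names are assumptions, since the figure itself is not reproduced here. It is special-purpose code, not a proposal for how Wikidata should store this.

```python
# (classification, name) -> taxon id. "taxon3" follows the comment above;
# "taxon1" and "taxon2" are assumed ids for the two senses of Araceae.
NAME_TO_TAXON = {
    ("Stace", "Lemnaceae"): "taxon3",
    ("APG", "Lemnoideae"): "taxon3",
    ("APG", "Araceae"): "taxon1",
    ("Stace", "Araceae"): "taxon2",
}

# (classification, taxon id) -> parent taxon id.
# Only the APG link mentioned in the comment is modelled.
PARENT = {
    ("APG", "taxon3"): "taxon1",
}


def resolve(system, name):
    """Find the taxon a name refers to within one classification."""
    return NAME_TO_TAXON[(system, name)]


def parent(system, taxon):
    """Follow a parent link, staying inside the same classification."""
    return PARENT.get((system, taxon))


def name_of(system, taxon):
    """Name applied to a taxon within one classification, if any."""
    for (sys_, name), t in NAME_TO_TAXON.items():
        if sys_ == system and t == taxon:
            return name
    return None


def merged_parents(taxon):
    """What happens without the classification key: all parents are pooled."""
    return {p for (_, t), p in PARENT.items() if t == taxon}


t3 = resolve("Stace", "Lemnaceae")
print(parent("Stace", t3))                 # no APG arrow is followed
print(name_of("APG", parent("APG", t3)))   # consistent APG path
print(merged_parents(t3))                  # pooled view loses the distinction
```

Arriving at Taxon 3 via the Stace name and then asking for its Stace parent does not return the APG parent, which is the constraint described above; the pooled lookup shows how the classifications collapse together once the key is dropped.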

My comments at Wikidata (I'm not sure I would agree it's a relational database, but that is not the issue) were basically pointing out that in many cases there is not a single go-to ref to make a determination on all this. The reality is that for many taxa you need to be specialised in these taxa to make such a determination. This causes several problems, but the two prominent ones are whether you have the people who can do this, and NPOV. You would have to make specialised determinations based on your own experience with the taxa. A relational database can do all this; I use them myself for making such databases of turtle names (in SQL), which is my job. Although the structs I use could be applied to any animal group, I do not have the knowledge to apply it more widely. Hence each taxonomic group tends to have some form of database maintained somewhere by specialists in that field. Getting all this under Wikidata I think is a massive undertaking, and in all honesty impossible. Cheers Scott Thomson (Faendalimas) ^talk 20:46, 21 March 2019 (UTC)

@Faendalimas: clearly editors should not be making such determinations; they have to be based on reliable sources, as per WP:OR – although I do understand that to use taxonomic information in sources does require a grasp of the relevant code of nomenclature. To keep to my example, there are multiple reliable sources for the relationships shown in the diagram above. So these at least could be put into Wikidata with no issues around OR or NPOV. The problem seems to be extracting self-consistent alternative classifications, neither collapsing them together as Scholia does, nor arbitrarily, and so inconsistently, selecting one parent taxon as Reasonator does. It's a problem we haven't fully solved here in the automated taxobox system, witness the recent problems with keeping the incompatible 'bird' and 'dinosaur' classifications from interacting. Peter coxhead (talk) 12:47, 22 March 2019 (UTC)

Changes at Monkey

Please see Talk:Monkey/Archive 1#Changed without discussion re this reversion I made. Peter coxhead (talk) 16:30, 27 March 2019 (UTC)

Facto Post – Issue 22 – 28 March 2019

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Back numbers are here.

When in the cloud, do as the APIs do

Half a century ago, it was the era of the mainframe computer, with its air-conditioned room, twitching tape-drives, and appearance in the title of a spy novel Billion-Dollar Brain then made into a Hollywood film. Now we have the cloud, with server farms and the client–server model as quotidian: this text is being typed on a Chromebook.

Logo of Cloud API on Google Cloud Platform

The term Application Programming Interface or API is 50 years old, and refers to a type of software library as well as the interface to its use. While a compiler is what you need to get high-level code executed by a mainframe, an API out in the cloud somewhere offers a chance to perform operations on a remote server. For example, the multifarious bots active on Wikipedia have owners who exploit the MediaWiki API.

APIs (called RESTful) that allow for the GET HTTP request are fundamental for what could colloquially be called "moving data around the Web"; from which Wikidata benefits 24/7. So the fact that the Wikidata SPARQL endpoint at query.wikidata.org has a RESTful API means that, in lay terms, Wikidata content can be GOT from it. The programming involved, besides the SPARQL language, could be in Python, younger by a few months than the Web.
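A minimal Python sketch of what such a GET request looks like follows. It only builds the URL and does not send it; the example query (parent taxa of Scilla, Q157238, via the parent-taxon property P171) and the helper name are chosen here for illustration. A real client would also be expected to identify itself with a descriptive User-Agent header.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: parent taxa recorded for Scilla (Q157238).
QUERY = """
SELECT ?parent ?parentLabel WHERE {
  wd:Q157238 wdt:P171 ?parent .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def build_get_url(sparql):
    """Encode a SPARQL query as the query string of a GET request."""
    return ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


url = build_get_url(QUERY)
print(url[:60] + "...")
```

The resulting URL can be pasted into a browser or passed to any HTTP client; the whole query travels in the `query` parameter.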

Magic words, such as occur in fantasy stories, are wishful (rather than RESTful) solutions to gaining access. You may need to be a linguist to enter Ali Baba's cave or the western door of Moria (the respective languages being French, in fact, in the case of "Open Sesame", and Sindarin). Talking to an API requires a bigger toolkit, which first means you have to recognise the tools in terms of what they can do. On the way to the impactful or polymathic modern handling of facts, one must perhaps take only tactful notice of tech's endemic problem with documentation, and absorb the insightful point that the code in APIs does articulate the customary procedures now in place on the cloud for getting information. As Owl explained to Winnie-the-Pooh, it tells you The Thing to Do.

Links

Wikidata as a semantic framework for the Gene Wiki initiative, 2016 paper by Andrawaag and others, commenting inter alia on the role of the API on Wikidata
Working With Wikibase From Go, Digital Flapjack blogpost 26 November 2018, Michael Dales, developer for ScienceSource using golang, with a software engineer's view on Wikibase and the MediaWiki API
Dealing with the Rust, Magnus Manske blogpost 12 March 2019, on the Rust language and the MediaWiki API
mw:API:RecentChanges, mediawiki.org page on the API for access to "recent changes" on a wiki
wikitech:Analytics/AQS/Pageviews, wikitech.wikimedia.org for the Pageview API, giving Wikimedia traffic information
xkcd cartoon, API Guide

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 11:46, 28 March 2019 (UTC)

Request for comment, qBugbot and self-redirects

I would like to use qBugbot to resolve some self-redirects in Animal articles. For clarity, and because it confuses me, when page A links to page B, and page B is a redirect page pointing back to page A, I am saying that page A contains a self-redirect link to page B.
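The definition above can be stated compactly in code. This is a minimal Python sketch with hypothetical page titles and in-memory dictionaries; it is not qBugbot's actual implementation, which is not described here.

```python
def self_redirect_links(links, redirects):
    """Find self-redirect links.

    links:     page title -> set of titles that page links to
    redirects: redirect page title -> title it redirects to
    Returns    page title -> sorted list of its self-redirect links.
    """
    found = {}
    for page, targets in links.items():
        # Page A has a self-redirect link to B when B redirects back to A.
        hits = sorted(t for t in targets if redirects.get(t) == page)
        if hits:
            found[page] = hits
    return found


# Hypothetical titles, for illustration only.
links = {"Aus": {"Aus bus", "Cus"}, "Cus": {"Aus"}}
redirects = {"Aus bus": "Aus"}
print(self_redirect_links(links, redirects))
```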

About 1181 animal articles contain self-redirect links pointing to 7622 redirect pages.

Pages with self-redirect links:

640 arthropods
541 other animals
223 extinct

(These numbers could be a little low -- I may not have caught everything.)

Here is what I am considering:

A self-redirect link in a monotypic genus to its sole species would be de-linked in the genus page.
Other self-redirects to species would have the species articles created to replace the redirect.
Self-redirect links to genus pages would have an article created to replace the redirect page.
Self-redirect links to higher monotypic taxa would be de-linked.
Self-redirects to tribes, subfamilies, and superfamilies would be de-linked in general, but some of these articles may be created by the bot after manual review.
Self-redirects to families would generally have the article created to replace the redirect, but these pages would be manually reviewed.
Self-redirects to ranks higher than superfamily would be reviewed manually and either ignored, replaced, or de-linked.

Creation Exceptions:

Redirect pages of extinct organisms would not be replaced by articles.
The self-redirect links point to a total of 7622 redirect pages. 2220 of these are not in GBIF with good status, and they would not be replaced by articles.

The only case a self-redirect link would be left intact is when it goes to a subheading in the article. In total, 1181 pages would have the self-redirect links de-linked, and something less than 5400 redirect pages would be replaced with stub articles.
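Read together, the rules and exceptions above amount to a small decision table. The Python sketch below is one possible reading of them; the function name, the flags, and the action labels are assumptions, and the manual-review steps are reduced to labels rather than modelled.

```python
# Ranks that are de-linked by default (some may get articles after manual review).
SECONDARY_RANKS = {"tribe", "subfamily", "superfamily"}


def plan(rank, monotypic=False, extinct=False, gbif_ok=True, to_section=False):
    """Choose an action for one self-redirect link.

    rank:       rank of the taxon the redirect page is named for
    monotypic:  sole species of a monotypic genus, or a higher monotypic taxon
    extinct:    the redirect page is for an extinct organism
    gbif_ok:    the name is in GBIF with good status
    to_section: the redirect points to a subheading of the linking article
    """
    if to_section:
        return "keep"
    if monotypic:
        return "delink"
    if rank in ("species", "genus"):
        action = "create"
    elif rank == "family":
        action = "create, then review"
    elif rank in SECONDARY_RANKS:
        action = "delink"
    else:
        # Assumed here to mean ranks above superfamily.
        action = "review"
    # Creation exceptions: fall back to de-linking.
    if action.startswith("create") and (extinct or not gbif_ok):
        return "delink"
    return action


print(plan("species"))
print(plan("genus", extinct=True))
print(plan("family"))
```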

List of Tachinidae genera is a good example of a page with lots of self-redirects. I started fixing these manually a while back, and decided it was taking too long.

For context, qBugbot was used to create about 18,000 arthropod stubs last year, and is currently awaiting authorization to make an editing/update pass over these articles.

I'm looking for comments, criticism, and suggestions.

Thanks! Bob Webster (talk) 00:16, 6 April 2019 (UTC)

I was going to say, I'm almost positive there already is a special bot that's creeping around and fixing such redirects as it finds them - I believe I've seen that in article histories. But then what you are proposing seems a lot more nuanced. Good idea, I think, but: isn't it a little ambitious for the bot to straight up create species or genus pages when there aren't any? I suppose you have that set up for arthropods from the previous runs, but would it yield useful results for other taxa? --Elmidae (talk · contribs) 00:52, 6 April 2019 (UTC)

There might be a bot that unlinks self-redirects, but there certainly isn't one that turns self-redirects into articles. Plantdrew (talk) 01:41, 6 April 2019 (UTC)

I'm very enthusiastic about this proposal. As Qbugbot's ability to create useful articles for non-arthropods hasn't been demonstrated yet, article creation should be restricted to arthropods for now. Non-arthropod articles could be created pending testing.

I'd be happy to forego creating articles on non-arthropods. It would definitely take some work to get categories, common names, synonyms, etc. handled properly. It might be an improvement to unlink the self-redirects without replacing them, but either way is fine with me. Bob Webster (talk) 03:57, 6 April 2019 (UTC)

"Other self-redirects to species would have the species articles created to replace the redirect". Sometimes there are synonyms of a species that are self-redirects. These should be unlinked, not given articles. I assume you're mostly thinking of genus articles with lists of species full of self-redirects.

The synonyms and other invalid taxa would be handled properly -- unlinked and left as redirects. That is the majority of those 2000+ I mentioned without good status in GBIF. Most of the articles with self-redirects have lists of genus or species with self-redirects. There are a few with lists of families or other ranks, and a few that have a single or few self-redirects as oversights. Bob Webster (talk) 03:57, 6 April 2019 (UTC)

"Self-redirect links to higher monotypic taxa would be de-linked." There can't be very many of these, can there? Could you produce a list of these instances for manual review? My concern is that the redirects may well be missing some categories that they should have, and I'm not very clear on Qbugbot's capabilities for categorization of redirects.

I'll get a list of these.

Some redirects (e.g. Cambarus elkensis) previously existed as articles created by Polbot that had an IUCN status. I haven't seen Qbugbot deal with IUCN status yet. Would that capability be added?

Yes, Qbugbot has that capability now. Here's a test page I generated earlier today, by coincidence. It's a Polbot page that was changed to a redirect. (I added the type locality manually.) I've just replaced the Cambarus elkensis redirect with a generated article. I see (like you mentioned above) that I'll need to preserve most of the categories on the redirect pages. I had not considered that. Bob Webster (talk) 03:57, 6 April 2019 (UTC)

7622 self-redirects is in the range I'd expect. User:Galactikapedia created about 5000 redirects de novo in 2017. Stemonitis turned a lot of arthropod articles into redirects in mid 2010 (including Cambarus elkensis and various Tachinidae genera). These two editors are, I believe, responsible for most of the self-redirects for non-extinct organisms. Category:Animal redirects with possibilities is where some of us have been putting redirects that should have articles. Redirects in this category are less likely to be self-redirects, but I'd suggest you include any arthropods in the category in your next round of article creation. Plantdrew (talk) 01:34, 6 April 2019 (UTC)

That's a good idea. I'll add those to my list for the next round. Bob Webster (talk) 03:57, 6 April 2019 (UTC)

I followed the qbugbot story from when it was just getting approved. I like the idea and I like how it has worked out so well so far. This project seems like a good idea. Can you talk more about why the outcomes differ based on taxonomic rank? Is it a technical issue? Concerns over synonymy and taxonomic shifts? I also would be keen to review the list of taxa to be delinked when that happens. --Nessie (talk) 03:24, 6 April 2019 (UTC)

According to some naming conventions, monotypic taxa are supposed to be redirected toward genus, and left as redirects. The outcomes of the non-major ranks (subfamily, tribe, etc.) are different because it's hard to find a complete list of taxonomic descendants for them in the large databases, so specific sources are often necessary. I'll post a list of the taxa to be delinked. Bob Webster (talk) 04:12, 6 April 2019 (UTC)

It would be good to see the list. This seems a very good idea – most redirects from a species back to its genus are a very bad idea, since it makes it look as though the species article exists. But as with all bots, there can be unforeseen issues when they run. Peter coxhead (talk)

@Peter coxhead, @NessieVL: Here's a list of the articles and their self-redirects. There are two lists in this zip file, one with all the articles, and one in which the redirect page has a higher taxonomic rank than the article linking to the redirect. Bob Webster (talk) 19:05, 8 April 2019 (UTC)

A new newsletter directory is out!

A new Newsletter directory has been created to replace the old, out-of-date one. If your WikiProject and its taskforces have newsletters (even inactive ones), or if you know of a missing newsletter (including from sister projects like WikiSpecies), please include it in the directory! The template can be a bit tricky, so if you need help, just post the newsletter on the template's talk page and someone will add it for you.

– Sent on behalf of Headbomb. 03:11, 11 April 2019 (UTC)

Missing images

Just throwing an idea out, haven't really thought about how to make it work. Many species don't have images. I've previously thought about going through Commons somehow to find instances where Commons has an image, but our articles don't. The downside of Commons is that there is no guarantee that images are correctly identified (images of museum specimens are likely to be correctly identified, but are generally poor representations of how living organisms appear; biological illustrations are likely to be correct, but plates are often cluttered with illustrations of several species). However, it now strikes me that iNaturalist would be an excellent resource for images. It's possible to filter iNaturalist images for high confidence identifications and Wikimedia compatible licensing. Plantdrew (talk) 16:11, 19 April 2019 (UTC)

The Biodiversity Heritage Library identifies scientific names in their catalogues; the indexed images in the works are then hosted at Flickr, which I think means they can be uploaded directly to Commons. There are many lithographs and line drawings that are good additions to articles, but obviously a lot that is less useful to sift through. BHL links the names verbatim, so a search on synonyms is also required. (They also have recent articles on organisms, from publicly available texts.) — cygnis insignis 17:12, 19 April 2019 (UTC)

Are there many images in Commons that are incorrectly identified? I’ve been using FIST to add these images where I can. I’ve seen only a small number that were wrong. --Nessie (talk) 21:40, 19 April 2019 (UTC)

What's FIST; could you provide a link? Diagnostic characters may not be visible on Commons images, so there is a lot of stuff that may or may not be wrongly identified. Most of my contributions on Commons deal with misidentifications. There's a bunch of stuff where range and photo locale don't match up. My former boss published a book that included two photos from Commons of supposedly South American medicinal plants that were taken in a garden in Japan. I've seen stuff where the label of an adjacent plant in a botanical garden (or seed catalog) is at the root of a wrong ID. Two years ago I questioned whether Commons had ANY correctly identified photos of Eutrochium purpureum. I suppose the quality of Commons IDs varies by groups of organisms. I haven't really paid attention to where photos of fish come from, but fishes are not an easy subject to capture for a casual photographer. Birders are pretty knowledgeable as a group. Plantdrew (talk) 22:12, 19 April 2019 (UTC)

The Free Image Search Tool is linked in the {{Image requested}} template and some of the Wikipedia requested images categories. I forget how I first found it. It needs an update but it's pretty useful, especially for categories. If you find something incorrect in Wikidata, it has the link for you to remove it. --Nessie (talk) 02:23, 20 April 2019 (UTC)

Facto Post – Issue 23 – 30 April 2019

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Back numbers are here.

Completely clouded?

Cloud computing logo

Talk of cloud computing draws a veil over hardware, but also, less obviously but more importantly, obscures such intellectual distinction as matters most in its use. Wikidata begins to allow tasks to be undertaken that were out of easy reach. The facility should not be taken as the real point.

Coming in from another angle, the "executive decision" is more glamorous; but the "administrative decision" should be admired for its command of facts. Think of the attitudes ad fontes, so prevalent here on Wikipedia as "can you give me a source for that?", and being prepared to deal with complicated analyses into specified subcases. Impatience expressed as a disdain for such pedantry is quite understandable, but neither dirty data nor false dichotomies are at all good to have around.

Issue 13 and Issue 21, respectively on WP:MEDRS and systematic reviews, talk about biomedical literature and computing tasks that would be of higher quality if they could be made more "administrative". For example, it is desirable that the decisions involved be consistent, explicable, and reproducible by non-experts from specified inputs.

What gets clouded out is not impossibly hard to understand. You do need to put together the insights of functional programming, which is a doctrinaire and purist but clearcut approach, with the practicality of office software. Loopless computation can be conceived of as a seamless forward march of spreadsheet columns, each determined by the content of previous ones. Very well: to do a backward audit, when now we are talking about Wikidata, we rely on integrity of data and its scrupulous sourcing: and clearcut case analyses. The MEDRS example forces attention on purge attempts such as Beall's list.
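The "forward march of spreadsheet columns" can be sketched in a few lines of Python, written without explicit loops so that each column is computed only from the columns before it. The journal names and the flagged set below are invented placeholders, not data from Beall's list or any other source, and the include/exclude rule is an assumption made purely for illustration.

```python
# Column 1: hypothetical journal names attached to cited articles.
journals = ["Journal A", "Journal B", "Journal C"]

# Hypothetical reference set standing in for a list of flagged journals.
flagged = frozenset({"Journal B"})

# Column 2 is determined only by column 1 and the fixed reference set.
is_flagged = list(map(lambda name: name in flagged, journals))

# Column 3 is determined only by column 2.
decision = list(map(lambda hit: "exclude" if hit else "include", is_flagged))

# Reading a row from right to left is the backward audit: every decision
# can be traced to the value that produced it.
rows = list(zip(journals, is_flagged, decision))
print(rows)
```

Because no column is ever rewritten after it is computed, the same inputs always give the same decisions, which is the consistency and reproducibility the text asks of "administrative" decisions.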

Links

Wikipedia:Wikipedia_Signpost/2019-03-31/In focus#The_Wikipedia_SourceWatch by Headbomb.
Wikipedia:WikiProject Academic Journals/Journals cited by Wikipedia/Questionable1
d:Wikidata:ScienceSource project/Beall's list: Beall's list, final version, matched into Wikidata.
SPARQL query for Quackwatch: query to find items on Wikidata for articles subject to the Quackwatch blacklist of "Nonrecommended Periodicals", under "Journals (Fundamentally Flawed)".
SPARQL query to find retracted articles on Wikidata.
d:Wikidata:ScienceSource project/NCBI2wikidata dashboard, metadata for biomedical articles being built up, sourced from PubMed and PubMed Central.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 11:27, 30 April 2019 (UTC)

Facto Post – Issue 24 – 17 May 2019

Text mining display of noun phrases from the US Presidential Election 2012

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Back numbers are here.

Semantic Web and TDM – a ContentMine view

Two dozen issues, and this may be the last, a valediction at least for a while.

It's time for a two-year summation of ContentMine projects involving TDM (text and data mining).

Wikidata and now Structured Data on Commons represent the overlap of Wikimedia with the Semantic Web. This common ground is helping to convert an engineering concept into a movement. TDM generally has little enough connection with the Semantic Web, being instead in the orbit of machine learning which is no respecter of the semantic. Don't break a taboo by asking bots "and what do you mean by that?"

The ScienceSource project innovates in TDM, by storing its text mining results in a Wikibase site. It strives for compliance of its fact mining, on drug treatments of diseases, with an automated form of the relevant Wikipedia referencing guideline MEDRS. Where WikiFactMine set up an API for reuse of its results, ScienceSource has a SPARQL query service, with look-and-feel exactly that of Wikidata's at query.wikidata.org. It also now has a custom front end, and its content can be federated, in other words used in data mashups: it is one of over 50 sites that can federate with Wikidata.

The human factor comes to bear through the front end, which combines a link to the HTML version of a paper, text mining results organised in drug and disease columns, and a SPARQL display of nearby drug and disease terms. Much software to develop and explain, so little time! Rather than telling the tale, Facto Post brings you ScienceSource links, starting from the how-to video, lower right.

ScienceSourceReview, introductory video: but you need to run it from the original upload file on Commons

Links for participation

http://sciencesource-review.wmflabs.org/, review tool link in the left-hand sidebar at http://sciencesource.wmflabs.org/wiki/Main_Page

The review tool requires a login on sciencesource.wmflabs.org, and an OAuth permission (bottom of a review page) to operate. It can be used in simple and more advanced workflows. Examples of queries for the latter are at d:Wikidata_talk:ScienceSource project/Queries#SS_disease_list and d:Wikidata_talk:ScienceSource_project/Queries#NDF-RT issue.

Please be aware that this is a research project in development, and may have outages for planned maintenance. That will apply for the next few days, at least. The ScienceSource wiki main page carries information on practical matters. Email is not enabled on the wiki: use site mail here to Charles Matthews in case of difficulty, or if you need support. Further explanatory videos will be put into commons:Category:ContentMine videos.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 18:52, 17 May 2019 (UTC)

Single or double quotes?

Is there a difference between "Asthena" subditaria and 'Asthena' subditaria, and is one preferred over the other on Wikipedia? SchreiberBike| ⌨ 00:58, 18 May 2019 (UTC)

I see that the original source used single quotes, and Wikipedia double quotes. I don't think there's a difference; the format is not defined by the ICZN, and these are "scare" quotes. WP:MOS doesn't seem to prescribe a particular form for scare quotes.

I've seen this format used in the reverse situation - giving the corrected name when it is known that a species should be moved to another genus but no-one has formally made the requisite new combination. Perhaps also for situations like "Pan" sapiens when people are suggesting that humans and chimpanzees should be congeneric - but as Homo has priority "Homo" paniscus would be more correct (the combination Homo troglodytes already exists). Lavateraguy (talk) 09:44, 18 May 2019 (UTC)

My impression is that double quotes are more common, but I doubt there is a different meaning. The format is often used for extinct stem genera that are known to be paraphyletic but whose relationships are still uncertain, e.g. "Miacis" spp. Jts1882 | talk 12:09, 18 May 2019 (UTC)