Telomerase reverse transcriptase

Hey Andrew! I hate to complain, but I really can't understand the PBB_Summary text you put in Telomerase reverse transcriptase. (on a related note, I can't understand why the body of this article is partly inside a template; that's very unusual). The summary seems to consist entirely of information on telomerase in general, but not on TERT specifically. Also, the summary contains no links, which makes it difficult to understand since it contains so many words that I, and other laypeople, don't know. I was wondering if you might be willing to edit it for clarity. Thanks! --Hyperbole (talk) 23:17, 15 January 2008 (UTC)

Hi Hyperbole... The text in the PBB_Summary template is taken from the Entrez Gene summary for that gene (click the link in the reference). The edits I made to this page are part of a larger effort with the ProteinBoxBot, which systematically adds content from public databases to create/amend gene stubs. The advantage of encasing that text in the PBB_Summary template is that PBB can then update that text whenever it changes at its source without disturbing the rest of the page. Having said that, I'm confident any amount of human contributions will be far superior to our bot auto-added content. If you don't think the text is relevant, feel free to delete it. If you change it substantially, feel free to extract it from the PBB_Summary template. In short, feel free to override anything that PBB does (and that equally applies to the other pages with PBB content). Cheers, AndrewGNF (talk) 23:39, 15 January 2008 (UTC)

ProteinBoxBot's uploads

Per the switch to the new preprocessor (see m:Migration_to_the_new_preprocessor#Expected_differences), the trick of passing template parameters via {{!}} no longer works. This was actually a bug in the old preprocessor, which can be verified by the fact that it only worked if the template argument was inside a parserfunction. On {{self}}, for example, it would not work for the first argument, only for arguments 2 and beyond, which were inside #if.

Furthermore, a bot should probably not be using {{self}}. It would probably be best to replace all instances of:

{{self|GFDL-no-disclaimers|cc-by-sa-3.0{{!}}[[Genomics Institute of the Novartis Research Foundation]]}}

with:

{{GFDL}}
{{cc-by-sa-3.0|[[Genomics Institute of the Novartis Research Foundation]]}}

Would you be willing to fix ProteinBoxBot's uploaded images to no longer break with the new preprocessor (as per the above suggestion)? --MZMcBride (talk) 02:31, 25 January 2008 (UTC)
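
The replacement suggested above is a literal substitution, so it can be sketched in a few lines of Python. This is only an illustration, not the script anyone actually ran: the helper name `fix_license_tags` is made up here, and it assumes the old tag appears on the image pages exactly as quoted in this thread.

```python
# Old combined tag, as quoted above; it relied on {{!}} to pass a
# pipe character through {{self}}.
OLD_TAG = (
    "{{self|GFDL-no-disclaimers|cc-by-sa-3.0{{!}}"
    "[[Genomics Institute of the Novartis Research Foundation]]}}"
)

# Suggested replacement: two separate license templates.
NEW_TAGS = (
    "{{GFDL}}\n"
    "{{cc-by-sa-3.0|"
    "[[Genomics Institute of the Novartis Research Foundation]]}}"
)


def fix_license_tags(wikitext: str) -> str:
    """Replace the old combined tag with the two separate tags.

    Hypothetical helper: pages that do not contain the exact old tag
    are returned unchanged.
    """
    return wikitext.replace(OLD_TAG, NEW_TAGS)
```

A real bot run would also need to fetch and save each page; that part is left out because it depends on the bot framework in use.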

Thanks for the note. I'm going to move this discussion over to User talk:ProteinBoxBot for a slightly broader audience. Cheers, AndrewGNF (talk) 12:36, 25 January 2008 (UTC)

"I know we're all on the same page looking from slightly different angles"

Hello there...

I am Treesoulja and I am a new wikipedian. I was browsing Tim's talk page and I stumbled upon a quote of yours that I like. I often harvest quotes that I like for some reason or another to use as away messages on AIM or on facebook etc. and I really like your "I know we're all on the same page looking from slightly different angles." How would I go about citing you considering that all I have to go with is Andrew GNF and you probably don't want to give me your full name? This seems silly and you may laugh at my naivete but I thought I would ask.

Newly wiki

Treesoulja (talk) 04:45, 3 February 2008 (UTC)

You made me laugh, but not at your naivete. It's just not a common accusation that I've said something noteworthy, much less something worth quoting. (I've heard another quote that I think applies -- even a blind pig finds an acorn once in a while...) Feel free to cite me as "anonymous", since I'm sure many others have said similar things before. Welcome to wikipedia... Cheers, AndrewGNF (talk) 05:25, 3 February 2008 (UTC)

Re: Image help

I'd be more than happy, but I do have a question or two.

  1. Do you mind if I use bullets instead of a <br> tag?
  2. Will all the sources use the same URL (http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=pubmed&list_uids=15075390&dopt=Abstract)?
  3. The _tn images use redirects that no longer work. Should these redirects be removed?

--MZMcBride (talk) 22:47, 8 February 2008 (UTC)

Thanks much for your help... Switching to bullets sounds like a good idea. Yes, all images will reference that same link. And, sigh, what to do with those _tn redirects? I really liked the old behavior actually. Well, what do you think of this change for the thumbnails? [1] Think people will care that one image page will include another image in its text region? AndrewGNF (talk) 00:05, 9 February 2008 (UTC)

I can help with the large number of edits that need to be done, but I need to make sure they get done right the first time. There are a couple questions I have:

  • Exactly what information should be decoded from the image title? How is it decoded?
  • What should happen to the images that had redirect tags on them?

I am somewhat busy, or I would be able to do this tomorrow. But I can definitely get the code ready in the first half of this week. — Carl (CBM · talk) 06:05, 9 February 2008 (UTC)

Thanks Carl for your offer of help too... I don't know if MZMcBride was particularly excited to help out with this, and if he's started I'd hate for someone else to start making a similar set of changes and have two people stepping on each other's toes. So, at the point that one of you actually starts coding, can you leave a note here so the other knows? Sound reasonable?
Regardless, your questions are good ones, regardless of who does the work...
  • Image titles look like this: PBB_GE_XXXXX_gnf1hYYYYY_at_ZZ.png. The "XXXXX" is a variable-length string (numbers and characters) that is the gene symbol. "YYYYY" is a variable-length string that can be ignored here. "ZZ" is always two characters, either "fs" or "tn" (fs = full size, tn = thumbnail). The gene symbol should go into the description line in the Summary section (e.g., replacing "NIPA1" in "Gene expression pattern of the NIPA1 gene." in Image:PBB_GE_NIPA1_gnf1h07157_at_tn.png)
  • All of the "tn" images used to redirect to the "fs" images. So, I propose to replace the redirect by displaying the FS image itself. For example, [2].
Hope that's clear, but let me know if anything isn't. Appreciate both of your offers for help... Cheers, AndrewGNF (talk) 06:44, 9 February 2008 (UTC)
I was the one who asked Carl for help, because the changes that need to be made are more complicated than a simple find / replace. Either Carl or I will take care of this shortly, depending on who has the time / ability. --MZMcBride (talk) 07:01, 9 February 2008 (UTC)
Some of the images have titles with an extra letter that you didn't mention, like PBB_GE_ACTB_200801_x_at_fs.png and PBB_GE_ACSL4_202422_s_at_fs.png. Should I ignore that? It looks like I will be able to start this today, although it will take a while to complete. — Carl (CBM · talk) 13:30, 9 February 2008 (UTC)
Also, we should say who holds the copyright on the images. Is that you? — Carl (CBM · talk) 13:48, 9 February 2008 (UTC)
Super, thanks much to both of you. Yes, that extra character should be treated as part of the YYYY here. As for copyright, is it sufficient to use {{GFDL}} and {{cc-by-sa-3.0|[[Genomics Institute of the Novartis Research Foundation]]}} as in Image:PBB GE NIPA1 gnf1h07157 at tn.png? (Actually, all images should be properly tagged with those templates -- MZMcBride helped make those changes previously...) AndrewGNF (talk) 16:30, 9 February 2008 (UTC)
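
For anyone following along, the title scheme described in this thread (gene symbol, then a probe-set string that may carry an extra letter such as "_x" or "_s", then "fs" or "tn") can be decoded with one regular expression. This is a rough sketch, not the code that was actually run. The name `parse_image_title` and the returned keys are made up for illustration, and it assumes gene symbols never contain an underscore.

```python
import re

# symbol: everything up to the next underscore (assumed underscore-free)
# probe:  the part to ignore, including any extra "_x" / "_s" letter
# size:   "fs" (full size) or "tn" (thumbnail)
TITLE_RE = re.compile(
    r"^PBB_GE_(?P<symbol>[^_]+)_(?P<probe>.+)_at_(?P<size>fs|tn)\.png$"
)


def parse_image_title(title: str) -> dict:
    """Split an image title into symbol, probe and size (hypothetical helper)."""
    name = title.strip()
    if name.startswith("Image:"):
        name = name[len("Image:"):]
    # Titles are written with either spaces or underscores in this thread.
    name = name.replace(" ", "_")
    match = TITLE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized image title: {title!r}")
    return match.groupdict()
```

The "symbol" value is what would go into the description line; "probe" is kept only so that nothing is silently thrown away.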
I just wanted to find out who created the diagrams - did you create them? Also, what do you think of the full reference at Image:PBB GE A2BP1 221217 s at fs.png? — Carl (CBM · talk) 20:37, 9 February 2008 (UTC)
As long as we're editing anyway, may be worthwhile to switch Template:GFDL to Template:GFDL. --MZMcBride (talk) 20:41, 9 February 2008 (UTC)
I made one change to the formatting of the reference at Image:PBB GE A2BP1 221217 s at fs.png -- figured we might as well use a standard reference template. And yes, I created them. Agreed on the update to {{GFDL}}. Thanks! AndrewGNF (talk) 00:29, 10 February 2008 (UTC)
Not to throw another wrench in the machinery, but perhaps a template for the source information could be used instead of inserting "Description:..." 20,000 times. A template could simply be {{Protein_cite|A2BP1}} or whatever. It would make any future changes a lot easier (wording, grammar, adding links, etc.). --MZMcBride (talk) 01:02, 10 February 2008 (UTC)
Yes, that would make sense. — Carl (CBM · talk) 01:07, 10 February 2008 (UTC)
Andrew: If you made the images, even if the numerical data come from somewhere else, the copyright is probably yours; I want to make sure everything is perfect before making all the changes. Is this right - you got numerical data from somewhere, and made the images from that? — Carl (CBM · talk) 01:07, 10 February 2008 (UTC)
The copyright for both the images and the data should go to the Genomics Institute of the Novartis Research Foundation. Although I made the images, I did it with resources provided by them and (loosely) under the guise of an academic project funded by them. Hope that makes sense... AndrewGNF (talk) 01:15, 10 February 2008 (UTC)
Oh, also want to point out that the gene symbol was parsed out incorrectly. I made the change on our example Image:PBB GE A2BP1 221217 s at fs.png. AndrewGNF (talk) 01:18, 10 February 2008 (UTC)
Took a crack at the citation template (e.g., {{PBB Image citation|TEST}}):

{{PBB Image citation|TEST}}

How does that look? (good idea MZMcBride, hopefully that will make this be the very last mass edit we need to make...) AndrewGNF (talk) 01:23, 10 February 2008 (UTC)

Thanks for pointing out the parsing error and for making the template. I understand the copyright better now. I'm going to set things up and make a few test edits. — Carl (CBM · talk) 01:27, 10 February 2008 (UTC)

I made two test edits, at Image:PBB_GE_A4GALT_219488_at_fs.png and Image:PBB_GE_A2BP1_221217_s_at_tn.png. Let me know if those look right. — Carl (CBM · talk) 01:48, 10 February 2008 (UTC)

Looks great to me! AndrewGNF (talk) 02:54, 10 February 2008 (UTC)
Carl, just wanted to say that the image corrections look great! I'll be in and out for the next couple of days (mostly out), but just wanted to note my appreciation... Many thanks! AndrewGNF (talk) 15:24, 10 February 2008 (UTC)
No problem, you're welcome. I'm also very busy this week, and I forgot to leave a note here a couple days ago when the corrections finished. I think the images are all changed, but I'll double check in the coming week. — Carl (CBM · talk) 12:57, 14 February 2008 (UTC)

Speedy deletion of Template:GNF SymAtlas

A tag has been placed on Template:GNF SymAtlas requesting that it be speedily deleted from Wikipedia. This has been done under section T3 of the criteria for speedy deletion, because it is a deprecated or orphaned template. After seven days, if it is still unused and the speedy deletion tag has not been removed, the template will be deleted.

If the template is intended to be substituted, please feel free to remove the speedy deletion tag and please consider putting a note on the template's page indicating that it is substituted so as to avoid any future mistakes (<noinclude>{{transclusionless}}</noinclude>).

Thanks. --MZMcBride (talk) 21:31, 14 February 2008 (UTC)

Thanks for the note. I think that must have been an obsolete template we played with while we were developing. No objection from me for this deletion... AndrewGNF (talk) 21:49, 14 February 2008 (UTC)

My what a lovely bot!

Hi Andrew, I constantly stumble onto the many pages created by your wonderful ProteinBoxBot, which I think is great for getting the basic gene/protein info to all us sciency folks who need it. However, as I run into these pages, I always find I need to tag 'em with the MCB wikiproject tag, to alert the project a new stub is out there that needs some work. Is there any way you could get the bot (or a "follow up" bot?) to automate this process too, when a new page is created? Otherwise, I don't think we'll be able to keep up ;o)

All the bot would need to do is add:

{{Wikiproject MCB|class=stub}}


to the top of each talk page that exists for a ProteinBoxBot generated page. Is it possible, or too difficult to do - I am not a programming genius, so have no clue what is involved!!! ~ Ciar ~ (Talk to me!) 23:43, 15 February 2008 (UTC)
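
For what it's worth, the text-handling side of this request is small. Below is a minimal sketch in Python, assuming the talk page's wikitext is already available as a string; the name `add_mcb_banner` is made up for illustration, and fetching and saving pages (which depends on the bot framework) is left out.

```python
BANNER = "{{Wikiproject MCB|class=stub}}"


def add_mcb_banner(talk_wikitext: str) -> str:
    """Put the MCB banner at the top of a talk page unless it is already there.

    Hypothetical helper. The check is case-insensitive and only looks for
    the template name, so a banner with a different class is left alone.
    """
    if "{{wikiproject mcb" in talk_wikitext.lower():
        return talk_wikitext
    return BANNER + "\n" + talk_wikitext
```

Running it twice on the same text gives the same result, so a follow-up bot could safely revisit pages it has already tagged.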

Thanks for the note Ciar. And yes, I've noticed you've been among the faithful few who have been adding the wikiproject templates, so thanks for that. Unfortunately, right now we're trying hard to wrap up the version 1.0 run of the bot without making too many changes to the code itself. (Incidentally, I think we're pretty close. I count 8427 PBB-enabled gene pages in existence.) Adding the MCB template is definitely on our list of Version 2.0 specs, but no promises on when that may get done. A separate "follow up" bot may also be a possibility... Sorry to be noncommittal, but you're welcome to try to drum up support at the MCB proposals page or the bot requests page. Cheers, AndrewGNF (talk) 01:19, 16 February 2008 (UTC)

Re: have we met?

Hello, Andrew. :) Yes, we met briefly before I left for my extended wikibreak about a year ago. We worked on the WP:MCB a little together. – ClockworkSoul 20:09, 25 February 2008 (UTC)

Thanks

Thank you so much for all your help. I was struggling to get the references right! You're a life saver. My genetics professor had us expand a wiki stub for a quiz. Thanks again. ~Daisher


Wanted to let you know I told my professor that you helped edit my references and actually she had a good laugh when I told her the story how I kept editing and you broke in saying, "If you stop editing for a min I will help you". She said the purpose was to find more info on the topic and expand the article, not the actual posting as much. Thanks once again! —Preceding unsigned comment added by Daishermj (talkcontribs) 05:32, 1 March 2008 (UTC)

Gene expression data on Insulin Receptor Protein Box

The gene expression section of the protein box makes no sense (even to an Affymetrix user) as the images are unlabelled when you link through to them. Thus rendering it simply a pretty picture devoid of information. I can update the at_xxxx tag info when I get time. But at present you can't work out why there are two graphs or why the expression data is so different. Just a comment in case this isn't just a problem with this single protein... Ianmc (talk) 21:22, 3 March 2008 (UTC)

Hi Ianmc... I'm not sure what you mean by "unlabelled when you link through to them". Can you clarify? Also, yes, undoubtedly some of the gene expression graphs do not report meaningful data. Thankfully, in the case of INSR, it was easy to see which was the "dead" probe set so I went ahead and deleted it from the page. Cheers, AndrewGNF (talk) 01:45, 4 March 2008 (UTC)
For instance the source article is human and mouse, but the graph is actually human data. Also you might want to point out that the title shown on the graph is a microarray code and can be ignored, and explain what the bars actually represent. I don't mean to teach you to suck eggs, but a general rule in scientific publishing is that a figure (+legend) should be interpretable without reference to the text. I assume you would be able to add a couple of sentences of boilerplate to your bot to cover all of the preceding suggestions. Hopefully this is constructive criticism; I think it is a very useful piece of information to include in the protein box (I spend all day making this sort of data :-). Ianmc (talk) 23:12, 4 March 2008 (UTC)

RE: Too biting?

I understand what you mean, but I don't think the warning was too severe. It points out what he did wrong instead of saying "don't do this or else". WEBURIEDOURSECRETSINTHEGARDEN aka john lennon 20:00, 4 March 2008 (UTC)

Ok, will keep that in mind, thanks mate WEBURIEDOURSECRETSINTHEGARDEN aka john lennon 20:12, 4 March 2008 (UTC)

Move, sure

No probs. Tim Vickers (talk) 22:51, 4 March 2008 (UTC)

Protein/gene stubs

You may wish to comment on this proposal to split up the present Category:protein stubs, as it would affect the large number of gene articles your bot's been creating. (Not deleteriously, I trust.) Alai (talk) 04:53, 16 March 2008 (UTC)

Reply

Thank you very much for offer, but I consider my activities in WP mostly as entertainment. You can tell thanks to WP community in general if you wish.Biophys (talk) 21:12, 19 March 2008 (UTC)

I have definitely done that, with particular emphasis on the MCB project. Anyway, thanks again for past and future feedback... Cheers, AndrewGNF (talk) 00:16, 20 March 2008 (UTC)

'Protein Box Bot' and 'GNF Protein box'?

Hi! Great bot! Really staggeringly cool!

I am here because I started to try to unify the various {{PDB}} templates... I would like to turn all calls to all the different templates [3] into simple {{PDB link|1xyz}} calls.

If I do this, can you change your bot to use {{PDB link}} in future? Or do you prefer {{PDB}}?

Anyway... I was wondering also if you could add all pages that use the 'GNF Protein box' into a "GNF Protein" Category (or similar)? I was going to do this myself, but I am not sure if a suitable category already exists.

Finally, what other "protein box"es does Protein Box Bot (or you) maintain? Can you make a category for the boxes that protein box bot is in charge of? Sorry - but I didn't see an easy way to navigate to that data if it exists.

BTW, I didn't read all your work in detail yet, how do you handle user edits of database derived data? Do you have any protocol / plan to sync data back to the source databases?

If you are interested, you may like to look at the site I set up with some friends of mine here PDBWiki - if you have any interesting ideas for bots on that site we would be very interested to collaborate :-D

In the mean time I will read up more about your work!

All the best, --Dan|(talk) 13:20, 3 April 2008 (UTC)

Hi Dan, I replied to your post over at the MCB help page. In short, we use {{PDB2}} because it is the most visually concise method, and that helps keep the overall infobox a manageable size. As far as categorization, all pages maintained by PBB are tagged with Category:Human proteins. The goal was not to create a PBB-specific category, since categories should really reflect the scientific concept and not the technical aspect of how content is created/maintained. In terms of edits to database-derived data, we certainly encourage people to correct missing/incorrect content, and they can turn off automatic-updates using the {{PBB_Controls}} template found at the top of every PBB-maintained page. No mechanism to propagate that data back to the source databases since NCBI hasn't given me the keys to their database yet... ;) Phil Bourne did point out PDBWiki to me recently. Looks cool! I offered to change our PDB links to go there instead, but never heard back from him whether that would be desirable. Hey, maybe an interesting bot task would be to add a link to the PDBWiki from all the PDB images that PBB uploads? That would be a very useful link to have somewhere (although it would be one link deep...) Cheers, AndrewGNF (talk) 16:17, 3 April 2008 (UTC)
Hey Andrew, I am really glad that you like PDBWiki - it's an ongoing experiment in user contributed feedback and collaborative annotation... It's going to be fun to see how it turns out. Links to PDBWiki from Wikipedia would be really great, as I think it (partly) solves the debate about which derived database we should link to. PDBWiki links each 'PDB entry' page to a growing (and user contribute-able) list of external databases. In this way people could click through to their favorite derived resource, instead of having one specific external link as it is at the moment. Thanks for the comments on the {{PDB}} - it all makes sense. BTW, do you have examples of user edits to database derived data? Do you check for (and annotate) conflicts as part of the update process? I am still in awe of what you have achieved with PBB. I am really looking forward to watching this all pan out! All the best, --Dan|(talk) 17:09, 5 April 2008 (UTC)
If Dan tells which PDBWiki entries have been annotated and contain information of interest, we could make some links to them. Right now, this can better work the opposite way: Dan should make the links from PDBwiki to Wikipedia. Of course Dan could ask Phil about a possibility of making links from original PDB entries (rather than from PDBWiki entries) to WP. In the long run, all PDB entries should have at least one link to WP articles, but that would be a tall order.Biophys (talk) 17:47, 5 April 2008 (UTC)

Hmmm, what do you guys think about combining this task with the mass upload of PDB images we discussed earlier? (I know Willow is tentatively signed up to take this on, but it looks like she's got her hands full with the action potential FAR. Dan, if you wanted to take this on, I don't think she would object...) Then, on the image pages themselves, we can put links to all the various databases (including PDBwiki) in a "See also" section so that readers can individually pick what pros and cons matter to them. Thoughts? AndrewGNF (talk) 19:55, 5 April 2008 (UTC)

Right. I left my suggestions there. If Dan or someone else can implement this, it would be really great.Biophys (talk) 16:13, 6 April 2008 (UTC)

DCC Page

Hi Andrew, thanks for the positive comments! (Hopefully this is a good place to reply to them.) The wiki page was a project completed for a course on the genetics of cancer. This was the first year the course has been run, so I'm not sure if they'll continue the idea for next year, but it was well received by the students. Approximately 30 students created similar pages on cancer related genes for the course, but I believe that only some of them received approval (and bonus marks) by the course director for wiki posting. Thanks for the edits - we were given instruction on wiki functions, but not much in the way of style or formatting details. I'll forward what you've posted on to the course director and hopefully he can include some of that in the syllabus for the next round. If you'd like to take a look at the course page, here's the link - http://burgundy.cmmt.ubc.ca/medg421/wiki/index.php/Category:Genes. Kkott (talk) 00:33, 23 April 2008 (UTC)

Hi there! I just stumbled on your page by accident and saw this bit regarding the educational assignment. Well, I just happened to have been involved (as a wikipedian, not student or professor) in the last three months in what was the most ambitious and highest achieving WP educational project to date. In the end, 3 featured articles and 8 Good articles were created and a lot of lessons learned. The professor involved, user:jbmurray, wrote an essay about the experience and is in the process of writing a guide of sorts, which I believe would be very useful to anybody wanting to use WP as an educational tool. The project's page is WP:MMM. It might be worth taking a look. Cheers! Acer (talk) 00:50, 27 April 2008 (UTC)

BBI Content

Greetings! I added the note that the text was copied with permission of BBI just to make sure someone didn't worry about plagiarism. I'm trying to link to specific references to make sure that I'm not link spamming; unfortunately, since I'm not a protein expert, I'm hoping that people can link to some of the original works as well. A lot of the ubiquitination pages are lacking in the references department. Hopefully one of the projects we're working on at BBI will encourage people to add more content to Wikipedia in this area. Rabbitvalley (talk) 14:19, 1 May 2008 (UTC)

Added comment on their userpage. Tim Vickers (talk) 17:03, 21 May 2008 (UTC)

Example of ProteinBoxBot config file

Hello, my name is Jonathan and I am involved in a university research project that has similar goals to your own project. I'd like to try out your bot. Do you have an example of a ProteinBoxBot config file as required by your bot? Masterhomer 15:46, 2 May 2008 (UTC)

Hi Jonathan, thanks for your interest. Sure, I put an example config file here. (Look at the source, not the rendered page.) But you should know that the server that is normally used to download annotation is currently down. We should have it back up and running next week sometime. Care to share any details of your research project? If we're playing in the same space, it would be great to coordinate so we don't duplicate work. FYI, I previously jotted down some notes for PBB here and here. Feel free to steal any ideas from there... Cheers, AndrewGNF (talk) 20:06, 3 May 2008 (UTC)

Re: Bot further reading references

Hi,

OK, I thought they were quite silly; I'll just remove them next time. Nice to know you enjoyed Cystatin C, it's mostly medical. I'd appreciate any copy-editing for style/grammar etc. and of the molecular section, of course.

--Steven Fruitsmaak (Reply) 12:21, 20 May 2008 (UTC)

Extra space at top of articles with bot boxes

I noticed that articles with the bot's boxes tend to have a couple of unwanted line returns before the start of the article (see OR10G7). Is there a way to possibly fix that? I tried some simple tricks but it didn't work. Jason Quinn (talk) 00:37, 1 June 2008 (UTC)

Hi Jason... Yup, we (and our testers) didn't notice that before we did a lot of edits. Fixing that issue for future gene pages is on our list before we do a mass run. Whether we do a bot to fix all past pages too is not yet determined. (On the scale of things, it seems like a relatively minor issue.) In any case, if you want to fix specific ones that are bothering you by hand, this is how you do it. Cheers, AndrewGNF (talk) 22:07, 1 June 2008 (UTC)

Subpage

To be honest I'm not sure, the page with template does look a lot clearer and less intimidating, but it isn't at all obvious how you would go about editing the template or finding the subpage. Perhaps some more detailed instructions in the hidden text, with a url they can paste into their browsers? Tim Vickers (talk) 16:20, 27 June 2008 (UTC)

Reading about this, it doesn't seem to work. ITK (gene)/PBB isn't a subpage of ITK (gene), it is classified by the software as a page in its own right. I'll ask about how we could do this at the technical section in the village pump. Tim Vickers (talk) 16:28, 27 June 2008 (UTC)
See this section. Tim Vickers (talk) 16:37, 27 June 2008 (UTC)

The comments added above {{PBB_Controls}} and {{GNF_Protein_box}} on each article end up creating WP:GAPS that get rendered at the top of articles. Would it be possible to adjust these comments (or, at the very least, accommodate others' adjustment of these comments) so no extra blank line gets rendered? (example fix)

The problem exists in >50% of the {{PBB_Controls}} articles. Examples include: p53, Myoglobin, Cytochrome c, Dihydrofolate reductase, Fibronectin, Proopiomelanocortin, Corticotropin-releasing hormone. --Underpants (talk) 16:23, 29 June 2008 (UTC)
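
One possible cleanup, sketched in Python for illustration only (the linked example fix is not reproduced here, so this is a guess at the general idea rather than that fix): remove the blank lines around the leading comments so each comment sits on its own line directly above the first template. It assumes MediaWiki drops a line that contains nothing but a comment, and the helper name `strip_leading_gaps` is made up.

```python
import re

# Leading run of whitespace and HTML comments at the very top of the page.
LEADING = re.compile(r"\A(?:\s*<!--.*?-->)*\s*", re.DOTALL)
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_leading_gaps(wikitext: str) -> str:
    """Remove blank lines before and between the comments at the top of a page.

    Hypothetical helper: comments are kept, one per line, immediately
    followed by the first real content (e.g. {{PBB_Controls}}).
    """
    head_end = LEADING.match(wikitext).end()
    head, rest = wikitext[:head_end], wikitext[head_end:]
    comments = COMMENT.findall(head)
    return "".join(comment + "\n" for comment in comments) + rest
```

Only the top of the page is touched; blank lines further down are left as they are.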

Apologies, I completely overlooked this comment from a couple weeks ago. Yes, this is an issue that has been nagging us for quite some time. The good news though is that the latest bot run fixes many of the gaps issues you mentioned above (the exceptions being myoglobin, which I fixed by hand, and cytochrome c, which looks funny when I try to fix by hand). Let us know if you notice any more systematic gap issues and we'll do our best to address them. Cheers, AndrewGNF (talk) 16:49, 9 July 2008 (UTC)

User:ProteinBoxBot

Is there any way to set up the bot to add a template category to the templates it's creating? I'd think that a Proteinbox cat would be sufficient. It could go under this main cat. Otherwise, we're ending up with hundreds of uncategorized templates. --WoohookittyWoohoo! 04:20, 8 July 2008 (UTC)

Yup, sure, once this current run finishes we'll go back and do a second run to add the cat. Thanks for the heads-up... AndrewGNF (talk) 04:49, 8 July 2008 (UTC)
Awesome. :) Thanks. --WoohookittyWoohoo! 04:13, 9 July 2008 (UTC)

Survey request

Hi,
I need your help. I am working on a research project at Boston College, studying creation of medical information on Wikipedia. You are being contacted, because you have been identified as an important contributor to one or more articles.

Would you be willing to answer a few questions about your experience? We've done considerable background research, but we would also like to gather the insight of the actual editors. Details about the project can be found at the user page of the project leader, geraldckane. Survey questions can be found at geraldckane/medsurvey. Your privacy and confidentiality will be strictly protected!

The questions should only take a few minutes. I hope you will be willing to complete the survey, as we do value your insight. Please do not hesitate to contact me or Professor Kane if you have any questions.

Thank You, BCeagle0312 (talk) 16:33, 13 July 2008 (UTC)

DQA1

After a protracted search I think I have found the person who created this page. DQA1 encodes a subunit, the alpha subunit of DQ. HLA-DQ has a page, which the bot apparently failed to find. The image shown for HLA-DQA1 is not that of the alpha chain but the entire DQ. I want to replace the image on the HLA-DQ page with that image but I cannot find a description. Can you write a description of the images added for HLA molecules describing which is the alpha or beta chain? There are going to be a lot of structures for HLA appearing, and I don't think it serves the encyclopedic function of Wikipedia to link out to every structure, as some of these are not even published within papers. Is it possible for the protein bot to limit structures to a single instance of each HLA molecule, instead of every structure for dozens of bound peptides for each molecule?

In any case there appears to be a flaw in the bot when it comes to recognizing genes for heteropolymer subunits; as a result it assigned an image to DQA1 but not DQB1. Since there is no isolated structure of either (as the image shows, the structure has a peptide within the binding pocket, making this the DQ αβ-peptide complex), the best solution is to color each chain a single color and refer to them by color in the description. Alternatively, a pointer in the image can point to the alpha chain. PB666 yap 13:42, 16 July 2008 (UTC)

Hello... Well, I'm afraid this type of detailed gene annotation (especially of a protein family as complex as the HLAs) moves beyond the bot specs as well as my own expertise. In short, I think you should make whatever changes you see fit, and I'll watch and make sure that I make the appropriate changes so that the bot does not undo your edits. Does that sound okay? If you'd like to discuss the specs of the bot in general (for example, whether all PDB structures are linked), feel free to discuss at User talk:ProteinBoxBot. Unfortunately, I can't annotate the structures better -- they come directly from RCSB, so perhaps they have more info on that site. (click the PDB code under the ribbon diagram.) If you'd like to replace that image with one made by hand, I think that would be great. If you'd like help doing that, I'd suggest posting on one of the WP:MCB discussion pages. Hope that helps... Cheers, AndrewGNF (talk) 14:10, 16 July 2008 (UTC)
Based on your suggestions PB666, I have created a new graphic for the HLA-DQA1 article in which the alpha and beta subunits are colored differently. Is this OK? Also I have a question for Andrew: In order to update the structure, I edited Template:PBB/3117. Is this OK? If the Bot updates the template, will the structure be over written? Cheers. Boghog2 (talk) 15:24, 16 July 2008 (UTC)
That looks great... We'll be sure that bot updates don't change the PDB image (i.e., it only will add if one previously didn't exist). Thanks, AndrewGNF (talk) 15:52, 16 July 2008 (UTC)

Entrez

Hi, I was looking around on the web and found a database called NextBio (www.nextbio.com). I think it would work great with ProteinBoxBot. It uses information from Entrez PubMed and a lot of other places. I think this would work well, but I don't know. Could you look into using the NextBio database? This is just a thought for making the Wikipedia gene section much bigger, or for adding more information to it. Movado73 (talk) 16:40, 23 July 2008 (UTC)

Thanks, I've added the idea to our ideas page. As an aside, there are an overwhelming number of biological resources available, and clearly our goal shouldn't be to include them all on the gene pages. Ultimately it will be up to the user community how the gene pages will look. When we start planning for the version 2 design, we'll post on WP:MCB for input... Cheers, AndrewGNF (talk) 17:44, 23 July 2008 (UTC)

Mentioning your recent PLoS Biology paper

Hello Andrew. Regarding:

Jon W. Huss, III; et al. (July 2008). "A gene wiki for community annotation of gene function". PLoS Biology. 6 (7): pp. 1-5.

Should this be publicized somewhere on the MCB wikiproject? I see the paper has been very discreetly linked at User:ProteinBoxBot#Links, and there is some discussion on the related Talk page. I hope the paper, and the resulting publicity, are being considered for future mention in the Wikipedia:Signpost. EdJohnston (talk) 18:03, 23 July 2008 (UTC)

Hi Ed, sure, I think adding a link to the paper somewhere would be great. Any suggestions on specifically where? To be honest, I'm almost embarrassed by the publicity since the larger MCB community in total (and often individually) has contributed far more to WP than the ProteinBoxBot. I'd hate for that perspective to be lost... On the other hand, I also understand that for the MCB community to grow, perhaps a bit of evangelism and self-promotion is appropriate. In any case, if anyone can find a good home for it, I think a link to the paper from somewhere on WP:MCB would be great... AndrewGNF (talk) 19:07, 23 July 2008 (UTC)
As for the Signpost, someone already tipped them about the paper. See here. Let's hope the editors pick up the story. 189.104.60.1 (talk) 19:11, 23 July 2008 (UTC)

Expression profiles

Hi Andrew. Congratulations on the publications and publicity! I just looked at the expression profile in the Rhodopsin article. This protein is supposed to be expressed only in the eye, but the profile shows it being expressed everywhere, and I did not find the eye or any eye tissues there. Maybe I am misunderstanding something? Thank you. Biophys (talk) 18:13, 28 July 2008 (UTC)

Gene Wiki - Gen-Wiki - GenWiki

Hi Andrew, since I do not know who is coordinating activities around "Gene Wiki", I am posting this to you:

Since 2004, our German "Society for Computergenealogy e. V." has provided, as one of its projects, a genealogical wiki named GenWiki:

During the last few weeks, many sources have written about your project and named it "Gene Wiki", or in German "Gen-Wiki"/"GenWiki". It would be nice if you or someone responsible could place information about that somewhere suitable.

Thank you for your understanding and support.

Uwe (talk) 12:21, 9 August 2008 (UTC)

Thanks for the note. I've created another section for "other biological wikis" at Portal:Gene_Wiki. Cheers, AndrewGNF (talk) 16:46, 12 August 2008 (UTC)

Cdt1

Hi - Cdt1 was discovered in fission yeast by Hofmann and Beach (EMBO J. 1994 Jan 15;13(2):425-34.). They described it as a gene regulated by the Cdc10 transcription factor, thus the acronym Cdc10-dependent transcript 1. At the time, its function was not known. The HUGO name is historically incorrect, but maybe they chose it to give a better picture of what the protein does in DNA replication. Or maybe they just didn't bother to look up the literature. Perhaps you should include the original reference in your description of the protein. Regards - Stephen Kearsey -- Preceding unsigned comment added by Nornour (talk • contribs) 19:16, 14 September 2008 (UTC)

Hello Stephen, thanks for the clarification. As usual, Boghog beat me to the punch. How do these changes look to you? AndrewGNF (talk) 18:45, 17 September 2008 (UTC)

ProteinBoxBot

I left a note on the bot's talk page. --MZMcBride (talk) 02:31, 15 September 2008 (UTC)

EBI License

Hi! I posted you a message there. --Giac83 (talk) 19:08, 21 September 2008 (UTC)

Hello

Hi, Andy. I notice your bot created this page. I need permission to make an extremely drastic edit, but I'm a bit frightened I'll mess the bot up. What I'm asking to do is move the PRPF3#Further reading section to a subpage because it is larger than the rest of the article put together. Thanks, Stereotyper (talk) 21:42, 27 October 2008 (UTC)

Hi there... Go ahead and make whatever changes you like. It's very unlikely you'll break the bot, and I'll watch and fix anything that might look problematic. Though it's worth noting that I think subpages are frowned upon at WP (at least in the main namespace). If you know something about that gene, then I'd rather you trim out irrelevant references in the further reading section. Oh, and the other way to fix the problem is to add a ton of other content to balance out that section... ;) Cheers, AndrewGNF (talk) 23:26, 27 October 2008 (UTC)

RfD nomination of ITK (gene)/PBB

I have nominated ITK (gene)/PBB for discussion. Your opinions on the matter are welcome; please participate in the discussion by adding your comments at the discussion page. Thank you. MBisanz talk 00:51, 1 November 2008 (UTC)

Warning Vandals

  Hello. Regarding the recent revert you made to Hadewijch: You may already know about them, but you might find Wikipedia:Template messages/User talk namespace useful. After a revert, these can be placed on the user's talk page to let them know you considered their edit was inappropriate, and also direct new users towards the sandbox. They can also be used to give a stern warning to a vandal when they've been previously warned. Thank you. - Fastily (talk) 05:22, 28 March 2009 (UTC)

Protein Box Bot question

Hi, I understand that you are involved with the PBB project. I came across the protein Wee1, of which there seem to be two isoforms in humans: http://www.expasy.org/cgi-bin/niceprot.pl?P30291 and http://www.expasy.org/cgi-bin/niceprot.pl?P0C1S8. Yet only the first one has its own Wikipedia page created by the bot. Is there some reason why the second one was not created? Do you have a policy of having only one article in this kind of case? Cheers --hroest 14:19, 18 April 2009 (UTC)

Hi Hannes... To make sure pages created by ProteinBoxBot satisfied notability requirements, the bot was approved to create ~10,000 gene pages. Those gene pages were selected based on the number of linked citations to PubMed. In this case, it looks like Wee1 made the cut but Wee2 did not. Of course, there are certainly notable genes below our arbitrary threshold. Especially if you know something about Wee2, I'd suggest that you go ahead and create Wee2. We're working on a way to generate the standard PBB content template for any user-specified gene (but in the meantime, I notice someone created an earlier version of Wee2 at the bottom of Wee1). Hope that helps... Cheers, AndrewGNF (talk) 01:06, 20 April 2009 (UTC)
OK, I see; I thought there was a page for every gene. I just helped a friend create Wee1. We decided to make the article about the yeast gene, but it is still not satisfactory to have many different pages for different organisms. Is there a general rule for handling that problem? The PBB boxes provide a lot of information, but they are clumsy in a small article about the yeast gene. It might also make less sense to have an extra article on Wee2 if there is no knowledge beyond what could be written in the Wee1 article. I hope that you agree with the solution of having the articles Wee1 and Wee1-like protein kinase as they are right now. Greetings --hroest 11:11, 23 April 2009 (UTC)
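The selection rule described in this thread (rank genes by their number of linked PubMed citations and create pages only for the top ones) can be sketched as follows. This is an illustration under assumptions, not the bot's real selection code: the function name is hypothetical and the citation counts are made up.

```python
def select_genes(citation_counts: dict, limit: int) -> list:
    """Return up to `limit` gene symbols, most-cited first.

    Ties are broken alphabetically so the result is deterministic.
    (Hypothetical helper; the real cutoff was roughly 10,000 genes.)
    """
    ranked = sorted(citation_counts.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _count in ranked[:limit]]


# Made-up citation counts, for illustration only.
counts = {"WEE1": 120, "WEE2": 8, "TERT": 900}
selected = select_genes(counts, limit=2)
```

With these made-up numbers and a limit of two, WEE2 falls below the cutoff, which mirrors the situation described above: a gene can be notable and still miss an arbitrary threshold.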

Template talk:Gallery

Didn't catch your message, sorry. Replied to you on Template talk:Gallery. ChyranandChloe (talk) 06:10, 17 June 2009 (UTC)

Contact?

Hi Andrew,

Great work with the PBB!

I sent you an e-mail a few days ago, but I might've mistyped the address. Drop me a line when you get the chance? Thanks! Proteins (talk) 22:50, 25 June 2009 (UTC)

Commons:File:PBB GE MYH6 204737 s at fs.png

This file was moved to Commons from English Wikipedia, but some description information may have got lost in the process.

As you are noted as the original uploader, or in the history for the file, it would be appreciated if you could help in reconstructing this information.

Please also consider checking Commons for other media that you may have uploaded locally, but which was subsequently transferred.

Special:Log for uploads can help in this.

Thanks for your assistance and keep uploading 'free' media :) Sfan00 IMG (talk) 14:23, 21 July 2009 (UTC)

BogBot

Hi Andrew. This is a brief update as to what I have been doing lately. Thanks for your pointers on getting a bot up and running. I have made a start as described here. The first task I was planning to use the bot for was to reformat some enzyme pages. I have a prototype that seems to work reasonably well. Of course, I also intend to use the Bot for some Wiki Gene projects. JonSDSUGrad and Plindenbaum have been doing a great job, but there are some additional things I think I could contribute to.

In order to come up to speed, I need to start out with some simple projects. For example modifying the first sentence in the lead of the gene articles. If this works out, I would then move on to more involved tasks. Suggestions are welcome. Cheers. Boghog2 (talk) 22:32, 30 July 2009 (UTC)

Fantastic! It's almost scary for me to think of you being even more productive, but I suppose a bot would be the way... Modifying the first sentence would be very welcome. (It's even come up over here.) And the enzyme box is great as well. Of course, let us know if we can help in any way (debugging, testing, whatever...) Can't wait! Cheers, AndrewGNF (talk) 23:17, 30 July 2009 (UTC)

PBB requests

I submitted one here, but are these still being granted? I hope to apply for WP:DYK and would do a manual box if it couldn't be done within 5 days. --Steven Fruitsmaak (Reply) 21:42, 31 July 2009 (UTC)

Funny you should ask, we're close to having a web tool modeled after the diberri pubmed formatter where people can retrieve the templates themselves. Let me check on status and get back to you... Cheers, AndrewGNF (talk) 22:51, 31 July 2009 (UTC)
Though checking again, it looks like this gene's name and symbol have recently been updated to RNLS/renalase, but our database is still using the less-than-descriptive C10orf59 (http://biogps.gnf.org/genereport/55328). Usually we do quarterly updates. Anyway, you can see all the data we have at the BioGPS link. If you want us to create it and you can modify/update, we'll try to create that page for you ASAP. If too much has changed, it might be easier to manually create it yourself... Cheers, AndrewGNF (talk) 22:56, 31 July 2009 (UTC)
Hi Andrew,
thanks for adding the protein box!
--Steven Fruitsmaak (Reply) 20:24, 5 August 2009 (UTC)

Categories for discussion nomination of Category:User:AndrewGNF/SWL/PPI

 

Category:User:AndrewGNF/SWL/PPI, which you created, has been nominated for deletion, merging, or renaming. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the Categories for discussion page. Thank you. VegaDark (talk) 01:37, 7 August 2009 (UTC)

IUPHAR external links

Hi Andrew,

After opening my big mouth here, I feel an obligation to follow-up on what I had suggested. Since this proposal involves GNF_Protein_box, I wanted to run this by you first to make sure that I have your support before asking the wider community. I was planning to post the following "request for comment" on WT:PHARM with cross links from WT:MCB and Portal_talk:Gene_Wiki.

First of all, do you think adding IUPHAR links to protein info boxes is a good idea? Second, if the wider community supports it, would you have any objections or concerns about including an optional IUPHAR link in the GNF_protein_box? Any suggestions you might have are certainly welcome. Cheers. Boghog2 (talk) 18:24, 19 August 2009 (UTC)

  • Request for comment:
dopamine receptor D1
Identifiers
Symbol: DRD1
IUPHAR: 214
NCBI gene: 1812
HGNC: 3020
OMIM: 126449
RefSeq: NM_000794
UniProt: P21728
Other data
Locus: Chr. 5 q34-q35
Search for
Structures: Swiss-model
Domains: InterPro

A proposal was made here to add IUPHAR database links to protein info boxes. An example using the {{protein}} template is found to the right. More specifically, this proposal is to add an optional link to the {{GNF_Protein_box}} and then include the link on Wikipedia receptor and ion channel pages to the corresponding IUPHAR database entry.

Before implementing these links, I would like to ask the community if there is support for doing so. Any comments, concerns, or suggestions you might have are welcome. Cheers. Boghog2 (talk) 18:24, 19 August 2009 (UTC)

Sure, I think that should be fine. We can add it as an optional variable to the GNF Protein box, and when present, we can add it to the External IDs section. The only work on our part will be to ignore that field on any updates, which shouldn't be too bad. Does that sound good? AndrewGNF (talk) 20:01, 19 August 2009 (UTC)
Great! The optional variable is exactly what I had in mind. Thanks for your support. Boghog2 (talk) 20:19, 19 August 2009 (UTC)
I think I now have a clean solution for adding IUPHAR links to GNF_Protein_box which should not require any changes to the GNF_Protein_box template code. I have created a template called {{IUPHAR2}}. For example the code {{IUPHAR2|DRD1}} (where DRD1 = HUGO gene symbol) returns: IUPHAR: D1. This template can then be appended to the AltSymbols list. For a live example, see Dopamine receptor D1. Alternatively the {{IUPHAR}} template could be added to the external IDs section, but this would require some modification of the GNF_Protein_box template. Since the primary purpose of the IUPHAR database is to establish and document a set of consistent names and symbols for receptors and ion channels (analogous to the HUGO link), there is a certain logic for including the link in the Symbols section. Does this look OK? One of the few reservations I have about this solution is the overhead of using the {{IUPHAR2}} template. It is 44K bytes long and it contains one large lookup table that returns an appropriate IUPHAR link. Your thoughts? Boghog (talk) 06:07, 22 August 2009 (UTC)
Sorry, out of town for a couple days. I see your rationale for including it in the symbols section, but I think it would be a more natural fit in the External Links section. As for {{IUPHAR}} versus {{IUPHAR2}}, I'd recommend we steer away from index lookups. We considered doing the same thing with the Gene Ontology annotations early on, but decided that it wasn't worth the complexity. I think we should add an optional parameter to GNF Protein box, and have it just show up linked through its ID (e.g., IUPHAR: 2552). What do you think? (of course, we'd need an admin to help modify GNF Protein box...) Cheers, AndrewGNF (talk) 21:45, 24 August 2009 (UTC)
Thanks for your response. My first thoughts of how to implement were very similar to yours. However when I started to dig into this deeper, I quickly discovered things were not so simple. As discussed here, the main complication is the format for linking to IUPHAR database. The format of the link is different depending on whether one is linking to a receptor or ion channel. Furthermore the receptor links contain a single accession number whereas two numbers (family and member) are required to link to ion channels. Unfortunately there currently is no unified way to link to all entries in the database. In addition, Chido Mpamhanga from IUPHAR specifically requested here that we display the IUPHAR receptor/ion channel name/symbol. While we are under no obligation to do so, I think it is a reasonable request that we display the IUPHAR symbol instead of an obscure accession number in the link. Chido provided a csv file which provides a mapping between GNC_gene_name, IUPHAR_name, and IUPHAR_URL for all 1513 entries in the IUPHAR database. From this file, it was very easy to create the {{IUPHAR}} template which given a GNC_gene_name, returns a link to IUPHAR_URL and displays the IUPHAR_name. In order to implement this in the GNF_Protein_box template without using {{IUPHAR}}, we would need four parameters (e.g., IUPHAR_receptor_id, IUPHAR_channel_family_id, IUPHAR_channel_member_id, IUPHAR_symbol) which is very messy. If {{IUPHAR}} were included in the External Links section, we could have a single logical yes/no parameter IUPHAR. The IUPHAR template in turn could get its argument from the Symbol variable in the GNF_Protein_box. Does this sound reasonable? Cheers. Boghog (talk) 05:19, 26 August 2009 (UTC)
Thanks for the summary of all the previous discussion. Yes, makes sense now... On the one hand, I'm not entirely convinced the IUPHAR symbol should be different than the HGNC symbol. On the other hand, I'm not that passionate about it either if everyone else has been sufficiently convinced. As far as the technical solution, I think the one you propose at the end of your last message sounds great. Do you need me to mock this up, or do you want to do it? Either way, we should make a copy in a sandbox first to test... Cheers, AndrewGNF (talk) 21:04, 26 August 2009 (UTC)
Great! I agree that it would be nice if HUGO and IUPHAR could agree on common gene/protein names but I don't think that is going to happen any time soon. Furthermore both sets of symbols are in wide use so I think it is appropriate that the GNF_Protein_box contain both. Since you are much more familiar with the GNF_Protein_box template code than I am, I would appreciate if you would make a test version. In addition, I left a small request here that I don't think is too controversial and should be very easy to implement. Thanks for your help! Cheers. Boghog (talk) 21:31, 26 August 2009 (UTC)
I requested the change to hyperlink "OMIM" here: (Template_talk:OMIM5). I'll work on the other one... Cheers, AndrewGNF (talk) 22:12, 26 August 2009 (UTC)
And how about this? User:AndrewGNF/Template:PBB/1812 If that looks good, then we just need to request this change to {{GNF Protein box}}. Cheers, AndrewGNF (talk) 22:23, 26 August 2009 (UTC)
... actually, we should request this change, which also includes the beginning of the demise of {{GNF Ortholog box}}. AndrewGNF (talk) 23:46, 26 August 2009 (UTC)
Wow! You have been busy. The modifications look perfect. Thanks for your quick response! Cheers. Boghog (talk) 04:55, 27 August 2009 (UTC)
Requested! Every once in a while, I get motivated to actually do things myself, rather than just talking about doing things... ;) Cheers, AndrewGNF (talk) 15:49, 27 August 2009 (UTC)
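The lookup-table approach discussed in this thread (a CSV with the columns GNC_gene_name, IUPHAR_name and IUPHAR_URL, used to turn a gene symbol into a link that displays the IUPHAR name) might look roughly like the sketch below. It is written in Python rather than template code purely for illustration. The column names come from the thread; the function names, the sample rows and the example.org URLs are hypothetical placeholders, not real IUPHAR links.

```python
import csv
import io

# Hypothetical sample rows; the real file had 1513 entries and real URLs.
# The first URL stands in for a receptor link (one accession number), the
# second for an ion channel link (family and member numbers).
SAMPLE_CSV = """GNC_gene_name,IUPHAR_name,IUPHAR_URL
DRD1,D1,https://example.org/receptor?id=214
EXAMPLE1,Ex1.1,https://example.org/channel?family=1&member=1
"""


def load_mapping(csv_text: str) -> dict:
    """Build {gene symbol: (IUPHAR name, IUPHAR URL)} from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["GNC_gene_name"]: (row["IUPHAR_name"], row["IUPHAR_URL"])
        for row in reader
    }


def iuphar_link(symbol: str, mapping: dict) -> str:
    """Return an external-link string in wiki syntax, or '' if unknown."""
    entry = mapping.get(symbol)
    if entry is None:
        return ""  # gene has no IUPHAR entry, so show nothing
    name, url = entry
    return f"[{url} {name}]"


mapping = load_mapping(SAMPLE_CSV)
link = iuphar_link("DRD1", mapping)
```

Because the whole URL is stored per entry, the caller never needs to know whether a gene is a receptor or an ion channel, which is the reason the thread prefers a single lookup over four separate box parameters.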

Gene Expression Atlas Box

Hi Andrew - responding to your comment on User_talk:Mus4musculus. I wrote up a brief proposal for the Atlas box on Wikipedia_talk:WikiProject_Molecular_and_Cellular_Biology. Looking forward to your reply. -- ostolop 11:04, 10 September 2009 (UTC) -- Preceding unsigned comment added by Ostolop (talk • contribs)

File permission problem with File:PBB Protein VDR image.jpg
 
File Copyright problem

Thanks for uploading File:PBB Protein VDR image.jpg. I noticed that while you provided a valid copyright licensing tag, there is no proof that the creator of the file agreed to license it under the given license.

If you created this media entirely yourself but have previously published it elsewhere (especially online), please either

  • make a note permitting reuse under the CC-BY-SA or another acceptable free license (see this list) at the site of the original publication; or
  • Send an email from an address associated with the original publication to permissions-en@wikimedia.org, stating your ownership of the material and your intention to publish it under a free license. You can find a sample permission letter here.

If you did not create it entirely yourself, please ask the person who created the file to take one of the two steps listed above, or if the owner of the file has already given their permission to you via email, please forward that email to permissions-en@wikimedia.org.

If you believe the media meets the criteria at Wikipedia:Non-free content, use a tag such as {{non-free fair use in|article name}} or one of the other tags listed at Wikipedia:Image copyright tags#Fair use, and add a rationale justifying the file's use on the article or articles where it is included. See Wikipedia:Image copyright tags for the full list of copyright tags that you can use.

If you have uploaded other files, consider checking that you have provided evidence that their copyright owners have agreed to license their works under the tags you supplied, too. You can find a list of files you have uploaded by following this link. Files lacking evidence of permission may be deleted one week after they have been tagged, as described on criteria for speedy deletion. If you have any questions please ask them at the Media copyright questions page. Thank you. Trixt (talk) 22:14, 19 September 2009 (UTC)

Thanks a lot

Thanks for getting rid of the ugly vandalism on my userpage (still don't understand why people do such things -_-"); I owe you one for that. =) --Twilight Helryx (talk) 17:25, 25 September 2009 (UTC)

To Commons

Hi AndrewGNF, I'm transferring your images to Commons, see commons:Category:Files by User:AndrewGNF from en.wikipedia. multichill (talk) 14:12, 28 February 2010 (UTC)

Great, thanks! Next time, we'll put it in the right place to start... Cheers, AndrewGNF (talk) 11:42, 2 March 2010 (UTC)
I see some images are still here, but these images don't contain an awful lot of information. Could you provide me with some information so the images can be transferred to Commons? multichill (talk) 19:58, 13 May 2010 (UTC)
I think those images are actually not used anymore. They used to appear on our gene pages, but they were replaced by higher quality images ([4]). Perhaps you can help me confirm that they are not being used in the main namespace, and if so, they should just be deleted? Cheers, AndrewGNF (talk) 20:33, 13 May 2010 (UTC)
Some (136 out of 2739 or 5%) of them are in use, see http://pywiki.pastey.net/136427
If the unused files are not worth keeping we can nominate them for deletion. multichill (talk) 21:26, 13 May 2010 (UTC)
Fantastic, thanks for doing that analysis. Yes, I think it's safe to mark the unused ones for deletion. Thanks... Cheers, AndrewGNF (talk) 22:10, 13 May 2010 (UTC)
And what to do with the used images? These images should contain more information before being moved to Commons or are better images available as replacements so we can orphan these too? multichill (talk) 15:41, 14 May 2010 (UTC)
I think you're right, we can proactively move to orphan these images. Emw was the one who spearheaded the bulk movement. Not sure how these slipped through the cracks, but I'm betting he will be happy to work on the remaining ones. I'll drop a note on his talk page... Cheers, AndrewGNF (talk) 16:17, 14 May 2010 (UTC)
Hi all. Thanks for bringing this problem to my attention. I will begin filling in these cracks early next week, and drop a note once the upgraded images have been uploaded and the old images replaced. Best, Emw (talk) 16:38, 14 May 2010 (UTC)
Great, I'll just wait for you to finish and then we can nominate all the old images in one go. multichill (talk) 16:50, 14 May 2010 (UTC)
Any news? multichill (talk) 19:05, 29 May 2010 (UTC)
Ping! multichill (talk) 11:25, 17 July 2010 (UTC)
Apologies for the delay. I've replaced about 30 of the 136 PBB images still in use. I don't anticipate being able to replace the remaining 100 or so images in the near future. Emw (talk) 18:46, 18 July 2010 (UTC)
I wonder if it is possible for users like me who do not know anything about this area to help replace usage? Is there some systematic approach that could be described? Also, if replacement will take a long time, could we nominate all the unused files for deletion so that only the images that need replacement are still here? --MGA73 (talk) 22:18, 12 August 2010 (UTC)

HELP PLEASE

Hi! I am from Hindi Wikipedia, where I am an admin. I want to know how to add more tools to the edit options. Can you tell me which file contains this code? It seems to be something like mediawiki:edittools, but that page only provides the tools shown on the lower side. I saw you on the recent changes list, and I hope you will be able to help us. --IMayBeWrong (talk) 19:01, 16 June 2010 (UTC)

You are now a Reviewer

 

Hello. Your account has been granted the "reviewer" userright, allowing you to review other users' edits on certain flagged pages. Pending changes, also known as flagged protection, is currently undergoing a two-month trial scheduled to end 15 August 2010.

Reviewers can review edits made by users who are not autoconfirmed to articles placed under pending changes. Pending changes is applied to only a small number of articles, similarly to how semi-protection is applied but in a more controlled way for the trial. The list of articles with pending changes awaiting review is located at Special:OldReviewedPages.

When reviewing, edits should be accepted if they are not obvious vandalism or BLP violations, and not clearly problematic in light of the reason given for protection (see Wikipedia:Reviewing process). More detailed documentation and guidelines can be found here.

If you do not want this userright, you may ask any administrator to remove it for you at any time. Courcelles (talk) 04:46, 20 June 2010 (UTC)

Bot talk page

I noticed that User talk:ProteinBoxBot is redirected to Portal:Gene Wiki/Discussion. However, it isn't clear whether you were aware that some users have been leaving messages on the redirected talk page. These messages are only visible if you edit the redirect page. If you need to refer to them in future, you might want to copy the text to another page where it will actually be visible. If not, you may wish to delete them. --R'n'B (call me Russ) 13:29, 16 July 2010 (UTC)

RANKL mRNA expression histogram by human tissue type is odd in the levels of the bars

Dear AndrewGNF, the levels in the graph at [5] that you uploaded do not agree with its source at [6]. What's going on here? Please find me at mjenik@dc.uba.ar. Michael Jenik, Buenos Aires, Argentina -- Preceding unsigned comment added by 186.136.135.237 (talk) 23:39, 8 February 2011 (UTC)

File:PBB_GE_PCDHB3_221408_x_at_tn.png and others

Some of your files have been marked as duplicates on Commons. They are identical in content but relate to different things, so I am not sure if they should be deleted. Could you have a look? Thanks, Amada44  talk to me 11:33, 29 July 2010 (UTC)

Oops I made a comment in your archive

Hi! I made a comment in your archive User_talk:AndrewGNF/Archive2#To_Commons. If you tell me what should be done perhaps I can help. But I need a "cleanup for dummies" :-) --MGA73 (talk) 20:21, 13 August 2010 (UTC)

Happy Holidays!

Season's greetings and best wishes for 2011!
Boghog (talk) 16:13, 24 December 2010 (UTC)

Fig4 copyvio

Hi Andrew

I tried moving Fig4_homolog to Fig4 incorrectly, then tried to go back and move it. Could you double-check that this was done right? I think it's better that the gene is referred to as Fig4.

Davebridges (talk) 01:36, 8 March 2011 (UTC)

Hi Dave, You mostly had it right, but I made a few minor changes. Now, FIG4 and FIG4 homolog both point to Fig4. Possibly the thing that you missed is that article titles are case sensitive, so "FIG4" is different than "Fig4". Anyway, I hope it's all to your liking now. Let me know if not... Cheers, AndrewGNF (talk) 02:07, 8 March 2011 (UTC)

How do I add gene expression graphs?

I read about KLF14 in the popular press, and decided to do a little work on its WP page. It's expressed ubiquitously, strikingly so as seen on its gene report page. One of those pretty gene-expression graphs would look nice in its protein box. How do I add it? --Dan Wylie-Sears 2 (talk) 17:48, 19 May 2011 (UTC)

Hi Dan, sorry for the late reply... Unfortunately, the expression pattern in human for KLF14 is pretty uninteresting. That looks to be a pretty typical example of a "dead" probe set. The mouse version is a bit more interesting, but unfortunately we've standardized on human expression images in the infobox. Sorry, in this case I think KLF14 will have to go without an expression graph... (But nice job with the article, BTW...) Cheers, AndrewGNF (talk) 22:11, 23 May 2011 (UTC)

SPRED1 and SPRED3

Hi,

I have added an article on SPRED1. Individuals who may have signs for but do not have Neurofibromin 1 mutations may have SPRED1 gene mutations, and if not SPRED1 then SPRED2, SPRED3, SPRY1, SPRY2, SPRY3 or SPRY4.

I noticed that you have a gadget called Protein Box Bot. Could you please apply this bot to SPRED1 and SPRED3?

Also, do you know of a public gene mutation database similar to WikiGenes or WikiPathways? I am aware of the Human Gene Mutation Database, but that is essentially a commercial project with limited academic and no public access. With respect to the NF1, NF2, SPRED1-3 and SPRY1-4 genes, I would like to make the known mutations publicly available on Wikipedia to the extent that they are in the public domain, because I don't think there is a WikiMutations yet, so I might as well use Wikipedia.

Thanks, Erxnmedia (talk) 02:40, 7 June 2011 (UTC)

Hi Erxnmedia, Sorry for the late reply -- not checking into WP as often as I should. Looks like you and Boghog have managed to get SPRED1 and SPRED3 up and running. Looks great. As for a SNP wiki, definitely check out SNPedia. Hope that helps! Cheers, AndrewGNF (talk) 22:58, 14 June 2011 (UTC)

ProteinBoxBot edit summaries

Could you please take a look at Wikipedia:Bot owners' noticeboard#General notice to bot owners about edit summaries and see if the suggestions might apply to your bot? Feel free to add your own suggestions and comments there too. Headbomb {talk / contribs / physics / books} 21:19, 21 August 2011 (UTC)

Thanks for the note! We'll redouble our efforts to put in informative edit summaries... Cheers, AndrewGNF (talk) 17:46, 29 August 2011 (UTC)

Phenotypic annotation of Gene wiki genes

Hi Andrew. At the Sanger Institute, we have a large-scale programme to systematically knock out genes (in mouse ES cells, see [7]), produce mutant mice and phenotype them in a standardized format, see [8]. We are at the phase where we are producing a couple of hundred lines a year and are about to publish the first 300 or so soon. I thought it would be nice to use the functional data from model organisms to enhance the Gene wiki project, so I have started to collate the phenotypic information into a standardized format (the staging area is at User:Rockpocket/MGP) and will release the first few hundred into wikispace concordant with the publication of the paper describing them. You can review one example I have already put into wikispace as a trial, RAD18.

Firstly, I wanted to let you know what we are planning; but secondly, you can see from the staging area that a number of genes we have phenotypic data for do not yet have a Wikipedia article. I can create a stub for each one before I port the phenotypic data across, but I was wondering whether it would be possible to have your bot create these first? Also, any thoughts or advice you have on this are most welcome. Rockpocket 09:47, 25 August 2011 (UTC)

Hi Darren, thanks for the note (and sorry for the delay in responding)... Sounds great! Looks like a very interesting initiative, and we'd be happy to help out. If you can give me a list of human gene IDs that you want created (NCBI preferred, Ensembl will work too), we can get those created in short order. Cheers, AndrewGNF (talk) 17:42, 29 August 2011 (UTC)
That's great, Andrew - thanks! It turns out there are more than I first thought (probably because genes that we knew little about were specifically selected for KO). I'll list about half of them at the moment and provide the other half later:
Rockpocket 09:26, 30 August 2011 (UTC)
Hi all, as you can see from the link color, the bot's finished creating the stubs and templates. There are a few bugs in some of the pages (empty citations, "null" instead of summaries in pages that were missing summary text in NCBI, etc) so some cleanup is needed. Please let me know if there's anything else we should fix. Pleiotrope (talk) 02:42, 31 August 2011 (UTC)
That's amazing! I'll clean them up as I go through them. There is another set of about the same size; I'll try to get that list to you within a week or so. Thanks again! Rockpocket 09:38, 31 August 2011 (UTC)
Hi again. Below is the remainder of the genes. By the way, I gave a talk today at the Wellcome Trust (who funds my research) about Wikipedia and opportunities for biomedicine, focused heavily on the Gene Wiki project. It went over pretty well, I think. Our long term goal is to get them to engage more with Wikimedia and continue to support these types of projects.
Thanks again for your assistance! Rockpocket 17:39, 8 September 2011 (UTC)
Super, we'll get on it asap. Unless you're in a rush, we're going to finish fixing those bugs from last time before beginning round two. It should just be a couple days or so. Glad to hear your pitch to the Wellcome Trust went well. If this is a big emphasis for your group moving forward, perhaps we should connect on a phone call sometime to compare notes and coordinate. Generally speaking, we've been interested in constructing semantic mashups (e.g., [9]) and in extracting structured annotations from Gene Wiki text (e.g., [10]). Anyway, it would be great to work together directly if there were clear overlaps... Cheers, AndrewGNF (talk) 21:05, 8 September 2011 (UTC)
No rush at all. Our plan is to sync the release of the data onto WP with the publication of the paper describing the mutants (we were a little concerned about WP:V otherwise) - that is still months away. For me, this is strictly an after-hours sideline inspired by the Gene wiki concept. I'm in no way a computational biologist; my lab's primary focus remains on the behavioural analysis of mice. But maybe 6 months to a year after we release this first batch I'll revisit it and see what effect it has in terms of article edits, views, click-through to the data, and mouse / ES cell requests. I would certainly like to chat to you about that then, as if there is a measurable impact I'd like to write it up. Perhaps we could even collaborate on that. Rockpocket 22:06, 8 September 2011 (UTC)
Sounds great... Our infrastructure for looking at various edit/view metrics is pretty detailed and robust now (a recent development as we prepare for our NAR database submission next week). We'd be happy to help you out if you need it... Cheers, AndrewGNF (talk) 19:02, 9 September 2011 (UTC)

ProteinBoxBot edit request

Hi,

I've placed a request on the Gene Wiki portal discussion page for ProteinBoxBot to fill in the name parameter of the infoboxes it updates as part of an enhancement of the {{PBB}} template. Any help you can provide would be much appreciated. Thanks! Chris Cunningham (user:thumperward) - talk 09:27, 15 September 2011 (UTC)

Nomination for deletion of Template:NCBI Taxonomy link

 Template:NCBI Taxonomy link has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Bulwersator (talk) 10:31, 7 December 2011 (UTC)

Ping

 
Hello, Andrew Su. You have new messages at Portal:Gene_Wiki/Discussion.
Message added 21:59, 28 January 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.
 
Hello, Andrew Su. You have new messages at Template_talk:GNF_Protein_box.
Message added 21:59, 28 January 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

ProteinBoxBot and Template:PBB/2194

Could you please have a look at {{PBB/2194}}? ProteinBoxBot is making an edit to this template that is causing display problems at Fatty acid synthase. Thank you. 148.177.1.210 (talk) 19:02, 23 March 2012 (UTC)

Aack, good point... We'll have a look at fixing either the bot or the template... Cheers, AndrewGNF (talk) 16:48, 12 April 2012 (UTC)

Bot edit corrupted template

This edit - http://en.wikipedia.org/w/index.php?title=Template%3APBB%2F3717&diff=492719402&oldid=490593689 - corrupted the template, I have reverted it. We had a message at OTRS that Janus kinase 2 wasn't displaying properly  Ronhjones  (Talk) 19:29, 22 May 2012 (UTC)

Thank you for the note, and for reverting that errant edit. Someone reported another example yesterday, and since then we have been hard at work on a fix. Hopefully soon... Cheers, AndrewGNF (talk) 23:23, 22 May 2012 (UTC)

Collaboration with NCBI?

Hi Andrew, I would appreciate your opinion on opportunities to collaborate with NCBI. Thanks, -- Daniel Mietchen - WiR/OS (talk) 02:18, 19 July 2012 (UTC)

ProteinBoxBot

Your bot, ProteinBoxBot, uploaded a bundle of images a few years ago (eg. File:PBB Protein INDO image.jpg) with no description. There looks to be about 50 or so that are populating in Category:Wikipedia files lacking a description. I was wondering if there would be a way for you or your fellow bot owner to get the bot to go and add a description to the files. (I will also be posting on your bot co-owner's talk page) Thanks in advance. -- ТимофейЛееСуда. 21:57, 4 September 2012 (UTC)

Yes, thanks for the note. We'll look into those ASAP. Ideally actually we'll just replace those older images with newer versions that are better looking (and have descriptions). Cheers, AndrewGNF (talk) 17:00, 5 September 2012 (UTC)

Portal:Gene Wiki/PDB listed at Redirects for discussion

 

An editor has asked for a discussion to address the redirect Portal:Gene Wiki/PDB. Since you had some involvement with the Portal:Gene Wiki/PDB redirect, you might want to participate in the redirect discussion (if you have not already done so). Chris Cunningham (user:thumperward) (talk) 12:15, 23 September 2012 (UTC)

Proper Citation Request

I would like to know the methods you used to get the testicular receptor 2 graph because I would like to cite your work correctly.

Can you provide me with a link to your original paper or something along those lines so I can read more?

Thank you,

Jennifer 74.96.74.226 (talk) 17:13, 24 January 2013 (UTC)

Hi Jennifer, those expression graphs came from this paper. Cheers, AndrewGNF (talk) 18:27, 24 January 2013 (UTC)

Portal:Gene Wiki/SCOP listed at Redirects for discussion

 

An editor has asked for a discussion to address the redirect Portal:Gene Wiki/SCOP. Since you had some involvement with the Portal:Gene Wiki/SCOP redirect, you might want to participate in the redirect discussion (if you have not already done so). Chris Cunningham (user:thumperward) (talk) 12:15, 23 September 2012 (UTC)

Hi Andrew,

I've read that already but don't see any reference to TR2 so I wanted to know more about this- did you reanalyze this yourself?

Thanks,

Jen — Preceding unsigned comment added by 74.96.74.226 (talk) 15:38, 1 February 2013 (UTC)



PBB/2232

Based on my comments, the EC commission changed the EC number from 1.8.1.2 to 1.8.1.6. I noted this on the page: http://en.wikipedia.org/wiki/Template_talk:PBB/2232

I keep changing the number to the updated number and you reverse it to the old number. Please check EC 1.8.1.6 to verify that it is the correct number. Redactor271 (talk) 06:23, 25 March 2013 (UTC)

Welcome to the WP:MED/WP:PHARM

I really like these ideas and look forward to discussing them more. How does your current bot work? Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:22, 14 May 2013 (UTC)

A barnstar for you!

  The Teamwork Barnstar
For reaching out to new partners, charging ahead with new ventures, and taking Wikipedia's role in science and medicine so seriously, I award you this Teamwork Barnstar. Keep being awesome! Ocaasi t | c 01:20, 17 May 2013 (UTC)
Thank you, very kind! Cheers, Andrew Su (talk) 05:08, 17 May 2013 (UTC)

Hey Andrew Su

I'm sending you this because you've made quite a few edits to the template namespace in the past couple of months. If I've got this wrong, or if I haven't but you're not interested in my request, don't worry; this is the only notice I'm sending out on the subject :).

So, as you know (or should know - we sent out a centralnotice and several watchlist notices) we're planning to deploy the VisualEditor on Monday, 1 July, as the default editor. For those of us who prefer markup editing, fear not; we'll still be able to use the markup editor, which isn't going anywhere.

What's important here, though, is that the VisualEditor features an interactive template inspector; you click an icon on a template and it shows you the parameters, the contents of those fields, and human-readable parameter names, along with descriptions of what each parameter does. Personally, I find this pretty awesome, and from Monday it's going to be heavily used, since, as said, the VisualEditor will become the default.

The thing that generates the human-readable names and descriptions is a small JSON data structure, loaded through an extension called TemplateData. I'm reaching out to you in the hopes that you'd be willing and able to put some time into adding TemplateData to high-profile templates. It's pretty easy to understand (heck, if I can write it, anyone can) and you can find a guide here, along with a list of prominent templates, although I suspect we can all hazard a guess as to high-profile templates that would benefit from this. Hopefully you're willing to give it a try; the more TemplateData sections get added, the better the interface can be. If you run into any problems, drop a note on the Feedback page.

Thanks, Okeyes (WMF) (talk) 21:20, 28 June 2013 (UTC)
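
For readers who have not seen TemplateData, here is a minimal sketch of the kind of JSON block the message above describes. It is an illustration only: the helper names (`build_templatedata`, `to_wikitext`) and the example parameters `Name` and `Symbol` are hypothetical, not taken from any actual template. The keys used (`description`, `params`, `label`, `type`, `required`) follow the general shape of the TemplateData format.

```python
import json


def build_templatedata(description, params):
    """Build a TemplateData-style dict.

    params maps a parameter name to a tuple of
    (label, description, type, required).
    """
    return {
        "description": description,
        "params": {
            name: {
                "label": label,
                "description": desc,
                "type": ptype,
                "required": required,
            }
            for name, (label, desc, ptype, required) in params.items()
        },
    }


def to_wikitext(data):
    """Wrap the JSON in <templatedata> tags, as placed on a template's doc page."""
    return "<templatedata>\n" + json.dumps(data, indent=2) + "\n</templatedata>"


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    example = build_templatedata(
        "Infobox for a gene or protein (illustrative example).",
        {
            "Name": ("Name", "Full name of the gene", "string", True),
            "Symbol": ("Symbol", "Official gene symbol", "string", False),
        },
    )
    print(to_wikitext(example))
```

The label and description of each parameter are what the template inspector would show in place of the raw parameter name.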

AANAT (gene)

Hello, Andrew Su, and thank you for your contributions!

An article you worked on, AANAT (gene), appears to be directly copied from http://rgd.mcw.edu/rgdweb/report/gene/main.html?id=736736. Please take a minute to make sure that the text is freely licensed and properly attributed as a reference, otherwise the article may be deleted.

It's entirely possible that this bot made a mistake, so please feel free to remove this notice and the tag it placed on AANAT (gene) if necessary. MadmanBot (talk) 23:28, 11 July 2013 (UTC)

Template:PBB

I've asked a question at Template talk:PBB#Just a hardcoding? but not received a response. As one of ProteinBoxBot's operators, I'm wondering if you can shed a light on it for me. VanIsaacWS Vexcontribs 07:58, 30 July 2013 (UTC)

PBB templates

I have been editing the PBB templates in Category:Human protein templates making changes in the Name field:

  1. If the name includes a number kDa without a space, e.g. 30kDa, insert a nonbreaking space, e.g. 30&nbsp;kDa
  2. If the name includes a genus, e.g. Drosophila, or a genus and species, italicize it, e.g. ''Drosophila''
  3. If the name includes a plus sign that is superscripted in chemical notation, e.g. H+, superscript it, e.g. H<sup>+</sup>
  4. If the name includes a genus abbreviated to initial, e.g. S. cerevisiae, spell out the genus, e.g. ''Saccharomyces cerevisiae''

These changes are consistent with Wikipedia style.
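
The first three edits listed above are mechanical enough to sketch as regular-expression substitutions. What follows is a hypothetical illustration, not ProteinBoxBot's code: the function names and the short genus list are assumptions, the italics rule handles a genus on its own (not genus plus species), and the output uses wiki markup (`&nbsp;`, `''...''`, `<sup>`).

```python
import re

# Hypothetical genus list for illustration; a real run would need a fuller list.
GENERA = ("Drosophila", "Saccharomyces")


def space_before_kda(name):
    """Edit 1: '30kDa' -> '30&nbsp;kDa' (only when no space is present)."""
    return re.sub(r"(\d)kDa\b", r"\1&nbsp;kDa", name)


def italicize_genera(name, genera=GENERA):
    """Edit 2: wrap known genus names in wiki italics, skipping any already wrapped."""
    alternatives = "|".join(re.escape(g) for g in genera)
    pattern = r"(?<!')\b(" + alternatives + r")\b(?!')"
    return re.sub(pattern, r"''\1''", name)


def superscript_h_plus(name):
    """Edit 3: 'H+' -> 'H<sup>+</sup>'."""
    return re.sub(r"\bH\+", "H<sup>+</sup>", name)


def apply_style_edits(name):
    """Apply edits 1 to 3 in order; running it twice gives the same result."""
    for edit in (space_before_kda, italicize_genera, superscript_h_plus):
        name = edit(name)
    return name


if __name__ == "__main__":
    print(apply_style_edits("lunatic fringe homolog (Drosophila)"))
    print(apply_style_edits("ATPase, H+ transporting, 30kDa"))
```

Each substitution is written so that text already in the target form is left alone, which matters for templates that are edited repeatedly.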

At the beginning of this project, for the fourth type of edit, I set the edit summary to: "italics; spell out genus name — if you don't like it put a note on my talk page". I used this summary for Template:PBB/10111, Template:PBB/10248, Template:PBB/10412, Template:PBB/10427, Template:PBB/10436, Template:PBB/10483, Template:PBB/10484.

When I got to the first one that included C. elegans, Template:PBB/10497, I used the edit summary "italics; spell out genus name — there are over 170 C. elegans species on C. elegans (disambiguation), not counting synonyms!"

By this time I had put the request to put a note on my talk page in 7 edit summaries, so I thought that was enough. But now I see that your ProteinBoxBot has been systematically removing all of the edits I have been making.

I request that you modify ProteinBoxBot to automatically make at least the first 3 of my four changes, if not all four, consistent with WP:MOS. —Anomalocaris (talk) 18:56, 26 September 2013 (UTC)

Sorry for butting in here. Concerning gene names in {{PBB}} templates, these are based on the official Human Genome Organisation names (see HGNC Guidelines and PMID 11944974). The bot is acting to make sure that the gene names in the PBB templates match the current official HUGO gene names. While your changes have merit, they do differ from the approved names. By HUGO convention gene names, in part or in whole, are never italicized (gene symbols are italicized but not names) nor do they contain sub- or superscripts. In addition, "... molecular weights may be specified in kilodaltons using the SI unit: kDa with no space after the molecular weight". Finally the genus is abbreviated to keep the gene names from becoming too long. Boghog (talk) 05:27, 27 September 2013 (UTC)
One additional note. The gene names in PBB articles were systematically created by an approved bot. Hence changing these names is very likely to be controversial. As a general rule, it is wise to first seek consensus before making potentially controversial edits to a large number of articles or templates. Boghog (talk) 05:50, 27 September 2013 (UTC)
Thank you, Boghog, for this information. I didn't purchase "Guidelines for Human Gene Nomenclature" in Genomics, but I read HGNC's "Guidelines for Human Gene Nomenclature" section 3: Gene names, and here, it shows Drosophila italicized in the example "lunatic fringe homolog (Drosophila)" and the example "anillin, actin binding protein (scraps homolog, Drosophila)". But the same guideline in section 5: Homologies with other species, includes the example "BarH-like 1 (Drosophila)". [without italics!]
I began to insert the space before kDa because I found some templates had spaces, e.g. Template:PBB/10621, with name "Polymerase (RNA) III (DNA directed) polypeptide F, 39 kDa". I suspect that ProteinBoxBot removes nonbreaking spaces but not ordinary spaces before kDa. According to HGNC, there is not supposed to be a space between the number and kDa. Why isn't ProteinBoxBot taking out regular spaces between the number and kDa? —Anomalocaris (talk) 08:28, 27 September 2013 (UTC)
Since this discussion began, HGNC's "Guidelines for Human Gene Nomenclature" section 3: Gene names has been updated. The example with a genus or species name is now ASXL1 "additional sex combs like 1 (Drosophila)", without italics. As I said on 27 September, I suggest modifying ProteinBoxBot to take out regular spaces between the number and kDa. —Anomalocaris (talk) 15:47, 3 October 2013 (UTC)
Sorry for joining late. And thanks Boghog for chiming in -- as usual, I have very little to add. I'll just mention that PBB does not do anything to the gene symbols and titles as they come from HGNC (through NCBI) -- neither italicization nor adding/removing whitespaces. While technically it's possible to add logic to do as you suggest, I think we'd want to make sure we had consensus first (please post at WP:MCB). Cheers, Andrew Su (talk) 05:10, 10 October 2013 (UTC)

ProteinBoxBot updates to UniProt links

Hi Andrew. ProteinBoxBot is making a large number of updates that seem to be in error. See for example diff. In the meantime, I have requested a temporary emergency shutdown of the bot. Cheers. Boghog (talk) 11:03, 15 November 2013 (UTC)

Thanks for the note, Boghog. We're on it now and will report back here soon... Cheers, Andrew Su (talk) 17:24, 15 November 2013 (UTC)
All should be fixed now thanks to User:X0xMaximus. Thanks for reporting this! Cheers, Andrew Su (talk) 21:21, 16 November 2013 (UTC)

User pages in Category:Human proteins

Hello, Andrew. I was perusing "Category:Human proteins" and saw that two of your user pages are in the category. I do not know how to remove them from the category because it seems like a template is putting them in there. You are probably more capable than I am with templates, so I will leave it to you. You do not need to reply to this message, unless you want to.

Warmest regards, Kjkolb (talk) 01:39, 6 March 2014 (UTC)

Fixed, thank you! Cheers, Andrew Su (talk) 05:39, 6 March 2014 (UTC)

PBB Bot at Czech wiki

Hi,

Our community at Czech Wikipedia is interested in enriching our articles on proteins with PBB templates (as seen on en.wikipedia), which are filled in by the bot you operate. I would like to know whether you or one of your collaborators could operate the bot on the Czech-language Wikipedia, or whether we need to do it ourselves (and, if so, whether you can provide guidance). Thank you very much! --Hypothalamus (talk) 12:38, 24 March 2014 (UTC)

Hi there. We unfortunately don't have the bandwidth to take on the Czech pages. (Having enough trouble keeping up to date on English WP!) You are welcome to use/adapt the code base here: https://bitbucket.org/sulab/pygenewiki. Cheers, Andrew Su (talk) 17:37, 24 March 2014 (UTC)
We really need to focus on Wikidata. A single lua-infobox and a single bot could keep all languages of Wikipedia up-to-date. I directed the discussion here: d:Wikidata_talk:WikiProject_Molecular_biology#Wikidata_Infobox_on_Czech_Wikipedia. --Tobias1984 (talk) 11:28, 25 March 2014 (UTC)
I agree completely of course. Wikidata to me is the long-term solution. I suggested the pygenewiki code base only if you wanted to do a short-term hack, but in retrospect, any effort you would have put into that would be much better directed to Wikidata... Cheers, Andrew Su (talk) 18:29, 25 March 2014 (UTC)

WikiHack in DC on April 5-6

It's a long shot, but if you were going to be in DC for the weekend after next, you might consider going to this event, Open Government WikiHack. Klortho (talk) 05:01, 26 March 2014 (UTC)

Looks awesome, but yeah, unfortunately not possible for me to attend... Thanks for the heads up! Cheers, Andrew Su (talk) 21:39, 26 March 2014 (UTC)

ProteinBoxBot Errors

As documented here, the ProteinBoxBot appears to have made a large number of erroneous edits from July 2 to 4. Please recheck the bot script before running the bot again. Thanks. Boghog (talk) 08:19, 5 July 2014 (UTC)

Aaack, sincere apologies. The bot was dormant for a few months for an unknown reason. We did a first pass of debugging to work through some changes in the mwclient library, and assumed all would be good for a small run. Obviously we were wrong. We'll clearly apply much more scrutiny and caution as we move forward. Thank you for being our second set of eyes and fixing those errors! Cheers, Andrew Su (talk) 14:06, 5 July 2014 (UTC)
... and doing more spot checking, it's clear that the error rate is unacceptably high. I'm going to just manually revert all edits from PBB made on 3 July 2014. If you have an easy mechanism to programmatically do that, feel free... Cheers, Andrew Su (talk) 14:27, 5 July 2014 (UTC)
No problem. It appears only the edits marked as Minor aesthetic updates have problems. I have been manually reverting these and leaving the other edits which appear to be OK alone. Cheers. Boghog (talk) 14:33, 5 July 2014 (UTC)
But there are hundreds of minor revisions that we made, right? Ugh... Andrew Su (talk) 14:36, 5 July 2014 (UTC)
Plus I've noticed several minor removals of correct content (EC number, chromosome). Certainly not as egregious as the others, but that also makes me not opposed to a wholesale reversion of all edits... Andrew Su (talk) 14:39, 5 July 2014 (UTC)
222 to be exact. And most, but not all of these edits are faulty. So far, I have reverted about 1/3 of them. Boghog (talk) 14:40, 5 July 2014 (UTC)
Okay, all done now... Thanks for the eagle eyes. Spotting that problem after a couple hundred edits is a lot easier to fix than after a couple thousand edits. As always, let me know if you see any other problems! Cheers, Andrew Su (talk) 23:28, 5 July 2014 (UTC)
Great! Thanks for your diligence in fixing this. Cheers. Boghog (talk) 09:19, 6 July 2014 (UTC)

New bot run with same errors

Hi Andrew. Just a friendly alert. An apparently new bot run but with the same types of errors. Cheers. Boghog (talk) 15:02, 9 July 2014 (UTC)

Thanks! Our enthusiastic new programmer gets in earlier than I do (hadn't yet had a chance to debrief on last week's run). A quick email and chat later, we're back on track. Those new edits are now being reverted... Cheers, Andrew Su (talk) 15:31, 9 July 2014 (UTC)

Proposed GNF Protein box name change

Hi Andrew. Just a heads up to the above proposed name change. Cheers. Boghog (talk) 18:29, 21 July 2014 (UTC)

Thank you, much appreciated! Cheers, Andrew Su (talk) 18:36, 21 July 2014 (UTC)

ProteinBoxBot creates blank page?

So, like, 20 minutes ago, ProteinBoxBot has created a blank page named "IFITM5". Is this accidental? TVShowFan122 (talk) 13:32, 5 September 2015 (UTC)

@TVShowFan122: sorry, I don't see evidence of such an edit in the bot's user contribution history? Can you clarify or send a diff? Best, Andrew Su (talk) 20:45, 9 September 2015 (UTC)
It seems the edit has disappeared. The deletion log states the page was deleted 3 hours after I informed you. It's been recreated since then, as a redirect. TVShowFan122 (talk) 13:28, 10 September 2015 (UTC)
@TVShowFan122: Great, thanks for the recap! We obviously don't intend to create blank pages and haven't seen widespread evidence of this, but we'll keep our eye out... Best, Andrew Su (talk)

GeneWikiGenerator

Hi Andrew. The BioGPS GeneWikiGenerator no longer seems to work. For quite a while, it would create template code and article text, but would not directly create GeneWiki articles. One had to manually copy and paste the code and text into Wikipedia. Now the GeneWikiGenerator will not even create code or text. After selecting a gene, all that is returned is an empty "Gene Wiki Code Creator" window. I would appreciate it if you or someone in your group would look into this when you get a chance. Thanks. Boghog (talk) 10:40, 3 August 2015 (UTC)

Thanks Boghog. Yes, we know some things fell into disrepair on our end, my apologies. Since about a month ago we have a new programmer working on getting things back to stable ground, and I hope we have that complete soon. Fixing the GeneWikiGenerator is definitely on the to-do list. More soon hopefully... Best, Andrew Su (talk) 04:11, 4 August 2015 (UTC)
Actually @Boghog:, it was an even dumber error than that. We had updated the generator code to a new code base that operates directly within the Gene Wiki plugin on BioGPS (instead of a dedicated Code Generator plugin). But I forgot to update the link when using http://biogps.org/GeneWikiGenerator/. Apologies, done now. (For example: http://biogps.org/GeneWikiGenerator/#goto=genereport&id=11275 -- if the page exists then it redirects, and if not then it displays the creation interface.) Anyway, I noticed that the tool does return an error for "Invalid or missing Entrez Identifier" for some valid Entrez Gene IDs -- looking into that one... Best, Andrew Su (talk) 04:36, 4 August 2015 (UTC)
Thanks Andrew. The generator now works for creating new pages. As you say, if the article already exists, but the template is missing, the GeneWikiGenerator layout shows only the Wikipedia article and does not generate a template (see for example ELF1). I would appreciate if you would also fix that when you get a chance. Cheers. Boghog (talk) 07:28, 4 August 2015 (UTC)
Thanks Boghog, yes, we'll check that one out too... Best, Andrew Su (talk) 22:00, 4 August 2015 (UTC)
@Boghog: We just moved things to a more stable server and fixed all the bugs we were aware of. So things should be in good working order at the moment. Also, the "GeneWikiGenerator" links will now always display the interface to create a new page (e.g., [11]). Obviously, let us know if you find any issues big or small! Best, Andrew Su (talk) 22:46, 16 September 2015 (UTC)
Hi Andrew. Thanks for the fixes. Things have improved, but there still are some problems:
  • There is no article text code toggle similar to the template code toggle. The reason this is useful is to add bot generated text to pre-existing articles.
  • The Gene Wiki generator is creating blank pages (see for example diff, diff, diff), at least when the article already exists under a different name.
  • Mouse data is missing (see for example diff).
When you get a chance, I would appreciate if you would take a look at this. Thanks. Boghog (talk) 05:08, 4 October 2015 (UTC)
Thanks Boghog, checking on those issues now... Best, Andrew Su (talk) 15:30, 5 October 2015 (UTC)

ArbCom elections are now open!

Hi,
You appear to be eligible to vote in the current Arbitration Committee election. The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to enact binding solutions for disputes between editors, primarily related to serious behavioural issues that the community has been unable to resolve. This includes the ability to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail. If you wish to participate, you are welcome to review the candidates' statements and submit your choices on the voting page. For the Election committee, MediaWiki message delivery (talk) 16:32, 23 November 2015 (UTC)

Protein stubs

Lots of your articles are in the list of long stubs, which I work through when I have nothing more exciting to do. I think they are mostly created by your bot. I've been removing the stub tag because the articles look quite substantial to me, but I don't understand much of them, and I certainly don't feel competent to decide how complete they are in relation to what could be written. Am I doing the right thing? Do you regard them as stubs? Could the process of deciding whether they are stubs be automated? Rathfelder (talk) 19:25, 16 December 2015 (UTC)

Hi @Rathfelder: Thanks for checking in! I'm not sure I'm the best person to say what is or isn't a stub. Definitely read Wikipedia:Stub if you haven't already, and post on the talk page if you have questions. In general though for gene and protein pages, I would say that you can look at the edit history. If it has only been touched by bots, then probably best to leave the stub tag. For example, KCNA2 should remain a stub IMHO. Best, Andrew Su (talk) 19:38, 16 December 2015 (UTC)
Wikipedia:Stub isn't much help because it talks about comparing the content of the article with that which could be written - You would expect more in an article about London than about a small village. But I have no idea at all what could be written. Most of these articles incorporate extensive further reading, and I think that should be taken into account. I also look at the history. If an article like KCNA2 has been largely untouched for many years I'm inclined to think that nobody has found much more to say on the subject. Rathfelder (talk) 19:46, 16 December 2015 (UTC)
@Rathfelder: Got it. However, I think your assumption that "if it's untouched then it's probably finished" does not apply for genes and proteins. We're still drawing the molecular biology community here, so there is lots known that isn't reflected in these articles. (Not to mention that more is continually being discovered.) So for these articles, I would err on the side of having a higher bar than your "untouched" rule of thumb. As examples, I think FLT3LG and OS9_(gene) are actually still stubs, whereas RAD51C and FERMT3 are not (IMO). Also, I would note that sections on "function", "interactions", and "model organisms" are for the most part automatically added, so their presence does NOT support removing the stub tags. Other sections like "clinical significance", "structure", "regulation", "isoforms", "discovery" would be better evidence that an expert has visited and perhaps updated the article to the current state of knowledge... Best, Andrew Su (talk) 19:56, 16 December 2015 (UTC)
thank you very much. I'll be more cautious in removing the tag. Rathfelder (talk) 20:04, 16 December 2015 (UTC)

ArbCom Elections 2016: Voting now open!

Hello, Andrew Su. Voting in the 2016 Arbitration Committee elections is open from Monday, 00:00, 21 November through Sunday, 23:59, 4 December to all unblocked users who have registered an account before Wednesday, 00:00, 28 October 2016 and have made at least 150 mainspace edits before Sunday, 00:00, 1 November 2016.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2016 election, please review the candidates' statements and submit your choices on the voting page. MediaWiki message delivery (talk) 22:08, 21 November 2016 (UTC)

Bot being discussed

here Best Doc James (talk · contribs · email) 20:38, 11 December 2016 (UTC)

Presentation at Wikimania, eventually?

Hello. Wikimania is the international Wikipedia / Wikimedia conference. I think that Gene Wiki has never presented at this conference, but that it would be a good venue for claiming community support that your project merits.

When the year is right and other projects align, could someone from Gene Wiki consider proposing a presentation and attending? This is a general interest gathering of all Wikimedia contributors from all language backgrounds, so it is a layman event. Gene Wiki is a major success story in wiki but it is not so well known in the community consciousness. I think that if its story were told in a way that a general audience could understand, then Gene Wiki could become a celebrated success even among layman contributors. Gene Wiki as a project is in a class of its own because of the large amount of content which has been shared and how closely it is tied with sharing the results of monumental research projects in genetics.

Wikimania this year is in Montreal August 11-13, with some preconference events including a medicine meetup on August 10. Next year it will be in the summer also in South Africa.

Could you float the idea in the Gene Wiki group that I think that if you made a proposal, you would be invited to present in the general schedule? As an example, submissions will be taken at wm2017:Submissions from 2 February - 30 March for this year. There is a similar process every year. Thanks for your attention and consideration. If you ever make a proposal then I would rally support. Blue Rasberry (talk) 18:57, 24 January 2017 (UTC)

Hi @Bluerasberry: Hmmm, this is a very tempting idea. Obviously I would love to make some more in-person contacts with the community. The limiting ingredients (as always) are time and money. We usually prioritize the academic biomedical conferences to try to spread the word in those communities. But then again, more outreach within our own WD/WP community would clearly be valuable too... Hmmm, let me give it some thought. Thank you for reaching out! Cheers, Andrew Su (talk) 21:58, 24 January 2017 (UTC)
There was Open biomedical knowledge: Wikipedia, Wikidata, and beyond which your team did in 2015. Scholarships are offered for Wikimania and a representative from your team would be a good candidate, and really in another league as compared to typical applicants, but your own time matters also. Another consideration is the ripeness of things. Your project is ripe, but I am not sure that the wiki community is ready to begin understanding what you do. I care, and I hardly understand. It is nice to have some preparation so that attendees are not completely overwhelmed, and I am not sure it is possible to talk about a project like this and be understood.
For now I will leave the idea out. There are conferences every year - this is not a fleeting opportunity. Other options for outreach could be a video presentation. Webcam works, or I think you are at a university - they must have a team which could help and at least the right viewers would find that. I am not sure what is right, but thanks for coming to WikiProject Medicine to respond. Blue Rasberry (talk) 00:03, 26 January 2017 (UTC)

Proposal of using full size image for RNA expression pattern

Hello Andrew! In infoboxes of gene/protein articles, there are small thumbnail images of the RNA expression pattern which you created. Thank you for your great images! I have proposed switching these small thumbnails to full-size images, because the Wikipedia/MediaWiki image system has changed, as I suppose you have already noticed. Since the image data are retrieved from Wikidata, I posted the proposal on the Wikidata page (wikidata:Property talk:P692#How about using full size image instead of small thumbnail?). I would like to get your thoughts on that. Thank you. --Was a bee (talk) 06:35, 18 March 2017 (UTC)

Gene Infobox

Your bot is creating stubs that display an error stating that there is a problem with the infobox. I have pulled the error text from the most recent stub that the bot created. Please take a look at what the bot is doing. This happened with other recent stubs. I have marked the most recent stub as unreviewed, so that another reviewer may look at it and also get in touch with you. Robert McClenon (talk) 01:27, 23 March 2017 (UTC)

@Robert McClenon: thanks for the note. We are looking into this now and will post an update shortly... Best, Andrew Su (talk) 18:03, 23 March 2017 (UTC)

Facto Post – Issue 1 – 14 June 2017

Editorial

This newsletter starts with the motto "common endeavour for 21st century content". To unpack that slogan somewhat, we are particularly interested in the new, post-Wikidata collection of techniques that are flourishing under the Wikimedia collaborative umbrella. To linked data, SPARQL queries and WikiCite, add gamified participation, text mining and new holding areas, with bots, tech and humans working harmoniously.

Scientists, librarians and Wikimedians are coming together and providing a more unified view of an emerging area. Further integration of both its community and its technical aspects can be anticipated.

While Wikipedia will remain the discursive heart of Wikimedia, data-rich and semantic content will support it. We'll aim to be both broad and selective in our coverage. This publication Facto Post (the very opposite of retroactive) and call to action are brought to you monthly by ContentMine.

Links
Editor Charles Matthews. Please leave feedback for him.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Opted-out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 09:33, 14 June 2017 (UTC)

Invitation to editathon at ISMB/ECCB 2017

Facto Post – Issue 3 – 11 August 2017

Wikimania report

Interviewed by Facto Post at the hackathon, Lydia Pintscher of Wikidata said that the most significant recent development is that Wikidata now accounts for one third of Wikimedia edits, along with the essential growth of human editing.

 
Internet-In-A-Box

Impressive development work on Internet-in-a-Box featured in the WikiMedFoundation annual conference on Thursday. Hardware is Raspberry Pi, running Linux and the Kiwix browser. It can operate as a wifi hotspot and support a local intranet in parts of the world lacking phone signal. The medical use case is for those delivering care, who have smartphones but have to function in clinics in just such areas with few reference resources. Wikipedia medical content can be served to their phones, and power supplied by standard lithium battery packages.

Yesterday Katherine Maher unveiled the draft Wikimedia 2030 strategy, featuring a picturesque metaphor, "roads, bridges and villages". Here "bridges" could do with illustration. Perhaps it stands for engineering round or over the obstacles to progress down the obvious highways. Internet-in-a-Box would then do fine as an example.

"Bridging the gap" explains a take on that same metaphor, with its human component. If you are at Wikimania, come talk to WikiFactMine at its stall in the Community Village, just by the 3D-printed display for Bassel Khartabil; come hear T Arrow talk at 3 pm today in Drummond West, Level 3.

Link

  • Plaudit for the Medical Wikipedia app, content that is loaded into Internet-In-A-Box with other material, such as per-country documentation.
Editor Charles Matthews. Please leave feedback for him.


MediaWiki message delivery (talk) 10:55, 12 August 2017 (UTC)

Facto Post – Issue 4 – 18 September 2017

Editorial: Conservation data

The IUCN Red List update of 14 September led with a threat to North American ash trees. The International Union for Conservation of Nature produces authoritative species listings that are peer-reviewed. Examples used as metonyms for loss of species and biodiversity, and discussion of extinction rates, are the usual topics covered in the media to inform us about this area. But actual data matters.

 
Dorstenia elata, a critically endangered South American herb, contained in Moraceae, the family of figs and mulberries

Clearly, conservation work depends on decisions about what should be done, and where. While animals, particularly mammals, are photogenic, species numbers run into millions. Plant species lie at the base of typical land-based food chains, and vegetation is key to the habitats of most animals.

ContentMine dictionaries, for example as tabulated at d:Wikidata:WikiFactMine/Dictionary list, enable detailed control of queries about endangered species, in their taxonomic context. To target conservation measures properly, species listings running into the thousands are not what is needed: range maps showing current distribution are. Between the will to act, and effective steps taken, the services of data handling are required. There is now no reason at all why Wikidata should not take up the burden.

Links

Editor Charles Matthews. Please leave feedback for him.


MediaWiki message delivery (talk) 14:46, 18 September 2017 (UTC)

Nomination for deletion of Template:HPO

 Template:HPO has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Ten Pound Hammer(What did I screw up now?) 18:08, 28 September 2017 (UTC)

ISCB Wikipedia Competition: call for participation

Facto Post – Issue 5 – 17 October 2017

Editorial: Annotations

Annotation is nothing new. The glossators of medieval Europe annotated between the lines, or in the margins of legal manuscripts of texts going back to Roman times, and created a new discipline. In the form of web annotation, the idea is back, with texts being marked up inline, or with a stand-off system. Where could it lead?

 
1495 print version of the Digesta of Justinian, with the annotations of the glossator Accursius from the 13th century

ContentMine operates in the field of text and data mining (TDM), where annotation, simply put, can add value to mined text. It now sees annotation as a possible advance in semi-automation, the use of human judgement assisted by bot editing, which now plays a large part in Wikidata tools. While a human judgement call of yes/no, on the addition of a statement to Wikidata, is usually taken as decisive, it need not be. The human assent may be passed into an annotation system, and stored: this idea is standard on Wikisource, for example, where text is considered "validated" only when two different accounts have stated that the proof-reading is correct. A typical application would be to require more than one person to agree that what is said in the reference translates correctly into the formal Wikidata statement. Rejections are also potentially useful to record, for machine learning.
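The multi-account agreement rule described above can be illustrated with a minimal sketch. This is not ContentMine's or Wikisource's implementation: the function names, the `(account, verdict)` record shape, and the threshold of two approvals are assumptions chosen only to mirror the text.

```python
def is_validated(decisions, required_approvals=2):
    """Return True once enough *distinct* accounts have approved.

    `decisions` is a list of (account, verdict) pairs, where verdict is
    "yes" or "no". The record shape is hypothetical.
    """
    # A set ensures that repeated approvals from one account count once.
    approvers = {account for account, verdict in decisions if verdict == "yes"}
    return len(approvers) >= required_approvals


def rejections(decisions):
    """Collect the "no" verdicts, which the text suggests are worth recording."""
    return [(account, verdict) for account, verdict in decisions if verdict == "no"]


# Hypothetical decision log for one proposed Wikidata statement.
log = [("Alice", "yes"), ("Alice", "yes"), ("Bob", "no"), ("Carol", "yes")]
validated = is_validated(log)
rejected = rejections(log)
```

Storing every decision, rather than only the final outcome, is what would provide the audit trail and the negative examples mentioned in the text.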

As a contribution to data integrity on Wikidata, annotation has much to offer. Some "hard cases" on importing data are much more difficult than average. There are for example biographical puzzles: whether person A in one context is really identical with person B, of the same name, in another context. In science, clinical medicine requires special attention to sourcing (WP:MEDRS), and is challenging in terms of connecting findings with the methodology employed. Currently decisions in areas such as these, on Wikipedia and Wikidata, are often made ad hoc. In particular there may be no audit trail for those who want to check what is decided.

Annotations are subject to a World Wide Web Consortium standard, and behind the terminology constitute a simple JSON data structure. What WikiFactMine proposes to do with them is to implement the MEDRS guideline, as a formal algorithm, on bibliographical and methodological data. The structure will integrate with those inputs the human decisions on the interpretation of scientific papers that underlie claims on Wikidata. What is added to Wikidata will therefore be supported by a transparent and rigorous system that documents decisions.
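To give a sense of the "simple JSON data structure" mentioned above, here is a hedged sketch of a single annotation built in Python. The field names (`@context`, `type`, `body`, `target`, `TextualBody`, `TextQuoteSelector`) follow the W3C Web Annotation Data Model as generally documented; the identifiers, URLs and text values are placeholders, and nothing here represents the structure WikiFactMine actually proposes.

```python
import json

# One annotation: a human comment (body) attached to a quoted passage (target).
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/annotations/1",  # placeholder identifier
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "Placeholder note on how the passage supports a statement.",
        "format": "text/plain",
    },
    "target": {
        "source": "http://example.org/papers/1",  # placeholder document URL
        "selector": {
            "type": "TextQuoteSelector",
            "exact": "placeholder quoted passage",
        },
    },
}

# Serialise to JSON, the form in which annotations are stored and exchanged.
serialized = json.dumps(annotation, indent=2)
```

Because the annotation stands off from the text it describes, the decision record can be kept and audited separately from both the paper and the Wikidata statement.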

An example of the possible future scope of annotation, for medical content, is in the first link below. That sort of detailed abstract of a publication can be a target for TDM, adds great value, and could be presented in machine-readable form. You are invited to discuss the detailed proposal on Wikidata, via its talk page.

Links

Editor Charles Matthews. Please leave feedback for him.


MediaWiki message delivery (talk) 08:45, 17 October 2017 (UTC)

Facto Post – Issue 6 – 15 November 2017

WikidataCon Berlin 28–9 October 2017

 
WikidataCon 2017 group photo

Under the heading rerum causas cognoscere, the first ever Wikidata conference got under way in the Tagesspiegel building with two keynotes. One was on YAGO, a knowledge base conceived ten years ago on the assumption of automatic compilation from Wikipedia. The other was from manager Lydia Pintscher, on the "state of the data". Interesting rumours flourished: the mix'n'match tool and its 600+ datasets, mostly in digital humanities, to be taken off the hands of its author Magnus Manske by the WMF; a Wikibase incubator site is on its way. Announcements came in talks: structured data on Wikimedia Commons is scheduled to make substantive progress by 2019. The lexeme development on Wikidata is now not expected to make the Wiktionary sites redundant, but may facilitate automated compilation of dictionaries.

 
WD-FIST explained

And so it went, with five strands of talks and workshops, through to 11 pm on Saturday. Wikidata applies to GLAM work via metadata. It may be used in education, raises issues such as author disambiguation, and lends itself to different types of graphical display and reuse. Many millions of SPARQL queries are run on the site every day. Over the summer a large open science bibliography has come into existence there.

Wikidata's fifth birthday party on the Sunday brought matters to a close. See a dozen and more reports by other hands.

Links

Editor Charles Matthews. Please leave feedback for him.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 10:02, 15 November 2017 (UTC)

ArbCom 2017 election voter message

Hello, Andrew Su. Voting in the 2017 Arbitration Committee elections is now open until 23.59 on Sunday, 10 December. All users who registered an account before Saturday, 28 October 2017, made at least 150 mainspace edits before Wednesday, 1 November 2017 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2017 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 3 December 2017 (UTC)

Facto Post – Issue 7 – 15 December 2017

A new bibliographical landscape

At the beginning of December, Wikidata items on individual scientific articles passed the 10 million mark. This figure contrasts with the state of play in early summer, when there were around half a million. In the big picture, Wikidata is now documenting the scientific literature at a rate that is about eight times as fast as papers are published. As 2017 ends, progress is quite evident.

Behind this achievement are a technical advance (fatameh), and bots that do the lifting. Much more than dry migration of metadata is potentially involved, however. If paper A cites paper B, both papers having an item, a link can be created on Wikidata, and the information presented to both human readers, and machines. This cross-linking is one of the most significant aspects of the scientific literature, and now a long-sought open version is rapidly being built up.

 

The effort for the lifting of copyright restrictions on citation data of this kind has had real momentum behind it during 2017. WikiCite and the I4OC have been pushing hard, with the result that on CrossRef over 50% of the citation data is open. Now the holdout publishers are being lobbied to release rights on citations.

But all that is just the beginning. Topics of papers are identified, authors disambiguated, with significant progress on the use of the four million ORCID IDs for researchers, and proposals formulated to identify methodology in a machine-readable way. P4510 on Wikidata has been introduced so that methodology can sit comfortably on items about papers.

More is on the way. OABot applies the unpaywall principle to Wikipedia referencing. It has been proposed that Wikidata could assist WorldCat in compiling the global history of book translation. Watch this space.

And make promoting #1lib1ref one of your New Year's resolutions. Happy holidays, all!

 
November 2017 map of geolocated Wikidata items, made by Addshore

Links


To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.


MediaWiki message delivery (talk) 14:54, 15 December 2017 (UTC)

Facto Post – Issue 8 – 15 January 2018

Metadata on the March

From the days of hard-copy liner notes on music albums, metadata have stood outside a piece or file, while adding to understanding of where it comes from, and some of what needs to be appreciated about its content. In the GLAM sector, the accumulation of accurate metadata for objects is key to the mission of an institution, and its presentation in cataloguing.

Today Wikipedia turns 17, with worlds still to conquer. Zooming out from the individual GLAM object to the ontology in which it is set, one such world becomes apparent: GLAMs use custom ontologies, and those introduce massive incompatibilities. From a recent article by sadads, we quote the observation that "vocabularies needed for many collections, topics and intellectual spaces defy the expectations of the larger professional communities." A job for the encyclopedist, certainly. But the data-minded Wikimedian has the advantages of Wikidata, starting with its multilingual data, and facility with aliases. The controlled vocabulary — sometimes referred to as a "thesaurus" as term of art — simplifies search: if a "spade" must be called that, rather than "shovel", it is easier to find all spade references. That control comes at a cost.

 
SVG pedestrian crosses road
 
Zebra crossing/crosswalk, Singapore

Case studies in that article show what can lie ahead. The schema crosswalk, in jargon, is a potential answer to the GLAM Babel of proliferating and expanding vocabularies. Even if you have no interest in Wikidata as such, only in vocabularies V and W, matching both V and W to Wikidata yields a "crosswalk" from term v in V to term w in W, whenever v and w both match to the same item d in Wikidata.
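The crosswalk construction described above amounts to a join on the shared Wikidata item. A minimal sketch follows, assuming each vocabulary has already been matched and is available as a plain mapping from term to item; the function name, the sample terms and the item identifiers are all hypothetical placeholders, not real Wikidata IDs.

```python
from collections import defaultdict


def build_crosswalk(v_to_item, w_to_item):
    """Map each term v in V to the terms w in W that share its Wikidata item d."""
    # Invert W: for each item d, list the W terms matched to it.
    item_to_w = defaultdict(list)
    for w_term, item in w_to_item.items():
        item_to_w[item].append(w_term)

    # A crosswalk entry v -> w exists only where both match the same item.
    crosswalk = {}
    for v_term, item in v_to_item.items():
        if item in item_to_w:
            crosswalk[v_term] = sorted(item_to_w[item])
    return crosswalk


# Hypothetical vocabularies V and W with placeholder item identifiers.
V = {"spade": "ITEM_1", "trowel": "ITEM_2", "hoe": "ITEM_3"}
W = {"shovel": "ITEM_1", "hand trowel": "ITEM_2", "rake": "ITEM_4"}
crosswalk = build_crosswalk(V, W)
```

Terms with no counterpart ("hoe" and "rake" here) simply drop out, which is why the coverage of a crosswalk depends on how completely each vocabulary has been matched.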

For metadata mobility, match to Wikidata. It's apparently that simple: infrastructure requirements have turned out, so far, to be challenges that can be met.

Links


To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.


MediaWiki message delivery (talk) 12:38, 15 January 2018 (UTC)

ISCB Wikipedia Competition 2018: entries open!

Facto Post – Issue 9 – 5 February 2018

m:Grants:Project/ScienceSource is the new ContentMine proposal: please take a look.

Wikidata as Hub

One way of looking at Wikidata relates it to the semantic web concept, around for about as long as Wikipedia, and realised in dozens of distributed Web institutions. It sees Wikidata as supplying central, encyclopedic coverage of linked structured data, and looks ahead to greater support for "federated queries" that draw together information from all parts of the emerging network of websites.

 

Another perspective might be likened to a photographic negative of that one: Wikidata as an already-functioning Web hub. Over half of its properties are identifiers on other websites. These are Wikidata's "external links", to use Wikipedia terminology: one type for the DOI of a publication, another for the VIAF page of an author, with thousands more such. Wikidata links out to sites that are not nominally part of the semantic web, effectively drawing them into a larger system. The crosswalk possibilities of the systematic construction of these links were covered in Issue 8.

Wikipedia:External links speaks of them as kept "minimal, meritable, and directly relevant to the article." Here Wikidata finds more of a function. On viaf.org one can type a VIAF author identifier into the search box, and find the author page. The Wikidata Resolver tool, these days including Open Street Map, Scholia etc., allows this kind of lookup. The hub tool by maxlath takes a major step further, allowing both lookup and crosswalk to be encoded in a single URL.

Links


To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.


MediaWiki message delivery (talk) 11:50, 5 February 2018 (UTC)

Facto Post – Issue 10 – 12 March 2018

Milestone for mix'n'match

Around the time in February when Wikidata clicked past item Q50000000, another milestone was reached: the mix'n'match tool uploaded its 1000th dataset. Concisely defined by its author, Magnus Manske, it works "to match entries in external catalogs to Wikidata". The total number of entries is now well into eight figures, and more are constantly being added: a couple of new catalogs each day is normal.

Since the end of 2013, mix'n'match has gradually come to play a significant part in adding statements to Wikidata. This is particularly so in areas with the flavour of digital humanities, but datasets can of course be about practically anything. There is a catalog on skyscrapers, and two on spiders.

These days mix'n'match can be used in numerous modes, from the relaxed gamified click through a catalog looking for matches, with prompts, to the fantastically useful and often demanding search across all catalogs. I'll type that again: you can search 1000+ datasets from the simple box at the top right. The drop-down menu top left offers "creation candidates", Magnus's personal favourite. See m:Mix'n'match/Manual for more.

For the Wikidatan, a key point is that these matches, however carried out, add statements to Wikidata if, and naturally only if, there is a Wikidata property associated with the catalog. For everyone, however, the hands-on experience of deciding what is a good match is an education, in a scholarly area, biographical catalogs being particularly fraught. Underpinning recent rapid progress is an open infrastructure for scraping and uploading.

Congratulations to Magnus, our data Stakhanovite!

Links

 
3D printing

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.


MediaWiki message delivery (talk) 12:26, 12 March 2018 (UTC)

Facto Post – Issue 11 – 9 April 2018

The 100 Skins of the Onion

Open Citations Month, with its eminently guessable hashtag, is upon us. We should be utterly grateful that in the past 12 months, so much data on which papers cite which other papers has been made open, and that Wikidata is playing its part in hosting it as "cites" statements. At the time of writing, there are 15.3M Wikidata items that can do that.

Pulling back to look at open access papers in the large, though, there is less reason for celebration. Access in theory does not yet equate to practical access. A recent LSE IMPACT blogpost puts that issue down to "heterogeneity". A useful euphemism to save us from thinking that the whole concept doesn't fall into the realm of the oxymoron.

Some home truths: aggregation is not content management, if it falls short on reusability. The PDF file format is wedded to how humans read documents, not how machines ingest them. The salami-slicer is our friend in the current downloading of open access papers, but for a better metaphor, think about skinning an onion, laboriously, 100 times with diminishing returns. There are of the order of 100 major publisher sites hosting open access papers, and the predominant offer there is still a PDF.

 
Red onion cross section

From the discoverability angle, Wikidata's bibliographic resources combined with the SPARQL query are superior in principle, by far, to existing keyword searches run over papers. Open access content should be managed into consistent HTML, something that is currently strenuous. The good news, such as it is, would be that much of it is already in XML. The organisational problem of removing further skins from the onion, with sensible prioritisation, is certainly not insuperable. The CORE group (the bloggers in the LSE posting) has some answers, but actually not all that is needed for the text and data mining purposes they highlight. The long tail, or in other words the onion heart when it has become fiddly beyond patience to skin, does call for a pis aller. But the real knack is to do more between the XML and the heart.

Links


To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.


MediaWiki message delivery (talk) 16:25, 9 April 2018 (UTC)

Facto Post – Issue 12 – 28 May 2018

ScienceSource funded

The Wikimedia Foundation announced full funding of the ScienceSource grant proposal from ContentMine on May 18. See the ScienceSource Twitter announcement and 60 second video.

A medical canon?

The proposal includes downloading 30,000 open access papers, aiming (roughly speaking) to create a baseline for medical referencing on Wikipedia. It leaves open the question of how these are to be chosen.

The basic criteria of WP:MEDRS include a concentration on secondary literature. Attention has to be given to the long tail of diseases that receive less current research. The MEDRS guideline supposes that edge cases will have to be handled, and the premature exclusion of publications that would be in those marginal positions would reduce the value of the collection. Prophylaxis misses the point that gate-keeping will be done by an algorithm.

Two well-known but rather different areas where such considerations apply are tropical diseases and alternative medicine. There are also a number of potential downloading troubles, and these were mentioned in Issue 11. There is likely to be a gap, even with the guideline, between conditions taken to be necessary but not sufficient, and conditions sufficient but not necessary, for candidate papers to be included. With around 10,000 recognised medical conditions in standard lists, being comprehensive is demanding. With all of these aspects of the task, ScienceSource will seek community help.

Links

 
OpenRefine logo, courtesy of Google

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM. ScienceSource pages will be announced there, and in this mass message.


MediaWiki message delivery (talk) 10:16, 28 May 2018 (UTC)

Facto Post – Issue 13 – 29 May 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Respecting MEDRS

Facto Post enters its second year, with a Cambridge Blue (OK, Aquamarine) background, a new logo, but no Cambridge blues. On-topic for the ScienceSource project is a project page here. It contains some case studies on how the WP:MEDRS guideline, for the referencing of articles at all related to human health, is applied in typical discussions.

Close to home also, a template, called {{medrs}} for short, is used to express dissatisfaction with particular references. Technology can help with patrolling, and this Petscan query finds over 450 articles where there is at least one use of the template. Of course the template is merely suggesting there is a possible issue with the reliability of a reference. Deciding the truth of the allegation is another matter.

This maintenance issue is one example of where ScienceSource aims to help. Where the reference is to a scientific paper, its type of algorithm could give a pass/fail opinion on such references. It could assist patrollers of medical articles, therefore, with the templated references and more generally. There may be more to proper referencing than that, indeed: context, quite what the statement supported by the reference expresses, prominence and weight. For that kind of consideration, case studies can help. But an algorithm might help to clear the backlog.

 
Evidence pyramid leading up to clinical guidelines, from WP:MEDRS
Links


MediaWiki message delivery (talk) 18:19, 29 June 2018 (UTC)

Facto Post – Issue 14 – 21 July 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Plugging the gaps – Wikimania report

Officially it is "bridging the gaps in knowledge", with Wikimania 2018 in Cape Town paying tribute to the southern African concept of ubuntu to implement it. Besides face-to-face interactions, Wikimedians do need their power sources.

 
Hackathon mentoring table wiring

Facto Post interviewed Jdforrester, who has attended every Wikimania, and now works as Senior Product Manager for the Wikimedia Foundation. His take on tackling the gaps in the Wikimedia movement is that "if we were an army, we could march in a column and close up all the gaps". In his view though, that is a faulty metaphor, and it leads to a completely false understanding of the movement, its diversity and different aspirations, and the nature of the work as "fighting" to be done in the open sector. There are many fronts, and as an eventualist he feels the gaps experienced both by editors and by users of Wikimedia content are inevitable. He would like to see a greater emphasis on reuse of content, not simply its volume.

If that may not sound like radicalism, the Decolonizing the Internet conference here organized jointly with Whose Knowledge? can redress the picture. It comes with the claim to be "the first ever conference about centering marginalized knowledge online".

 
Plugbar buildup at the Hackathon

MediaWiki message delivery (talk) 06:10, 21 July 2018 (UTC)

Infobox gene proposal

Hi Andrew. I have made several proposals at Module talk:Infobox_gene (IUPHAR links, EC links, Making Infobox gene more understandable). I would appreciate your feedback. Thanks. Boghog (talk) 08:06, 28 July 2018 (UTC)

8th ISCB Wikipedia Competition: entries open!

Facto Post – Issue 15 – 21 August 2018

Neglected diseases
 
Anti-parasitic drugs being distributed in Côte d'Ivoire
What's a Neglected Disease?, ScienceSource video

To grasp the nettle, there are rare diseases, there are tropical diseases and then there are "neglected diseases". Evidently a rare enough disease is likely to be neglected, but neglected disease these days means a disease not rare, but tropical, and most often infectious or parasitic. Rare diseases as a group are dominated, in contrast, by genetic diseases.

A major aspect of neglect is found in tracking drug discovery. Orphan drugs are those developed to treat rare diseases (rare enough not to have market-driven research), but there is some overlap in practice with the WHO's neglected diseases, where snakebite, a "neglected public health issue", is on the list.

From an encyclopedic point of view, lack of research also may mean lack of high-quality references: the core medical literature differs from primary research, since it operates by aggregating trials. This bibliographic deficit clearly hinders Wikipedia's mission. The ScienceSource project is currently addressing this issue, on Wikidata. Its Wikidata focus list at WD:SSFL is trying to ensure that neglect does not turn into bias in its selection of science papers.

MediaWiki message delivery (talk) 13:23, 21 August 2018 (UTC)

Facto Post – Issue 16 – 30 September 2018

The science publishing landscape
 

In an ideal world ... no, bear with your editor for just a minute ... there would be a format for scientific publishing online that was as much a standard as SI units are for the content. Likewise cataloguing publications would not be onerous, because part of the process would be to generate uniform metadata. Without claiming it could be the mythical free lunch, it might reasonably be argued that sandwiches can be packaged much alike and have barcodes, whatever the fillings.

The best on offer, to stretch the metaphor, is the meal kit option, in the form of XML. Where scientific papers are delivered as XML downloads, you get all the ingredients ready to cook. But you have to prepare the actual meal of slow food yourself. See Scholarly HTML for a recent pass at heading off XML with HTML, in other words in the native language of the Web.

The argument from real life is a traditional mixture of frictional forces, vested interests, and the classic irony of the principle of unripe time. On the other hand, discoverability actually diminishes with the prolific progress of science publishing. No, it really doesn't scale. Wikimedia as a movement can do something in such cases. We know from open access, we grok the Web, we have our own horse in the HTML race, we have Wikidata and WikiJournal, and we have the chops to act.

MediaWiki message delivery (talk) 17:57, 30 September 2018 (UTC)

Discussion

See here. My very best wishes (talk) 02:29, 6 October 2018 (UTC)

User:ProteinBoxBot blocked

I have blocked User:ProteinBoxBot and started a discussion at Wikipedia:Bots/Noticeboard#User:ProteinBoxBot. Fram (talk) 08:26, 24 October 2018 (UTC)

Facto Post – Issue 17 – 29 October 2018

Wikidata imaged

Around 2.7 million Wikidata items have an illustrative image. These files, you might say, are Wikimedia's stock images, and if the number is large, it is still only 5% or so of items that have one. All such images are taken from Wikimedia Commons, which has 50 million media files. One key issue is how to expand the stock.

Indeed, there is a tool. WD-FIST exploits the fact that each Wikipedia is differently illustrated, mostly with images from Commons but also with fair use images. An item that has sitelinks but no illustrative image can be tested to see if the linked wikis have a suitable one. This works well for a volunteer who wants to add images at a reasonable scale, and a small amount of SPARQL knowledge goes a long way in producing checklists.
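For readers curious what such a checklist query might look like, here is a minimal sketch in Python that only assembles the SPARQL text. It is not how WD-FIST itself works; the function name, the choice of English Wikipedia as the linked wiki, and the example class (Q5, human) are assumptions made for illustration. P18 (image) and P31 (instance of) are the standard Wikidata properties.

```python
import re


def missing_image_checklist(class_qid: str,
                            wiki: str = "https://en.wikipedia.org/",
                            limit: int = 50) -> str:
    """Build a SPARQL query listing items of one class that have an
    article on the given wiki but no P18 (image) statement.

    The function name and defaults are hypothetical, chosen for this sketch.
    """
    # Accept only well-formed item identifiers such as "Q5".
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a Wikidata item ID: {class_qid!r}")
    return (
        "SELECT ?item ?article WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        "  ?article schema:about ?item ;\n"
        f"           schema:isPartOf <{wiki}> .\n"
        "  FILTER NOT EXISTS { ?item wdt:P18 [] }\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Example class is an assumption; substitute the class you care about.
    print(missing_image_checklist("Q5"))
```

The printed text can be pasted into the Wikidata Query Service. A narrow class and a small limit keep the result manageable as a checklist.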

 
Gran Teatro, Cáceres, Spain, at night

It should be noted, though, that there are currently 53 Wikidata properties that link to Commons, of which P18 for the basic image is just one. WD-FIST prompts the user to add signatures, plaques, pictures of graves and so on. There are a couple of hundred monograms, mostly of historical figures, and this query allows you to view all of them. commons:Category:Monograms and its subcategories provide rich scope for adding more.

And so it is generally. The list of properties linking to Commons does contain a few that concern video and audio files, and rather more for maps. But it contains gems such as P3451 for "nighttime view". Over 1000 of those on Wikidata, but as for so much else, there could be yet more.

Go on. Today is Wikidata's birthday. An illustrative image is always an acceptable gift, so why not add one? You can follow these easy steps: (i) log in at https://tools.wmflabs.org/widar/, (ii) paste the Petscan ID 6263583 into https://tools.wmflabs.org/fist/wdfist/ and click run, and (iii) just add cake.

 
Birthday logo

MediaWiki message delivery (talk) 15:01, 29 October 2018 (UTC)

ArbCom 2018 election voter message

Hello, Andrew Su. Voting in the 2018 Arbitration Committee elections is now open until 23.59 on Sunday, 2 December. All users who registered an account before Sunday, 28 October 2018, made at least 150 mainspace edits before Thursday, 1 November 2018 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2018 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 19 November 2018 (UTC)

ArbCom 2018 election voter message

Hello, Andrew Su. Voting in the 2018 Arbitration Committee elections is now open until 23.59 on Sunday, 2 December. All users who registered an account before Sunday, 28 October 2018, made at least 150 mainspace edits before Thursday, 1 November 2018 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2018 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 19 November 2018 (UTC)

Facto Post – Issue 18 – 30 November 2018

WikiCite issue

GLAM ♥ data — what is a gallery, library, archive or museum without a catalogue? It follows that Wikidata must love librarians. Bibliography supports students and researchers in any topic, but open and machine-readable bibliographic data even more so, outside the silo. Cue the WikiCite initiative, which was meeting in conference this week, in the Bay Area of California.

 
Wikidata training for librarians at WikiCite 2018

In fact there is a broad scope: "Open Knowledge Maps via SPARQL" and the "Sum of All Welsh Literature", identification of research outputs, Library.Link Network and Bibframe 2.0, OSCAR and LUCINDA (who they?), OCLC and Scholia, all these co-exist on the agenda. Certainly more library science is coming Wikidata's way. That poses the question about the other direction: is more Wikimedia technology advancing on libraries? Good point.

Wikimedians generally are not aware of the tech background that can be assumed, unless they are close to current training for librarians. A baseline definition is useful here: "bash, git and OpenRefine". Compare and contrast with pywikibot, GitHub and mix'n'match. Translation: scripting for automation, version control, data set matching and wrangling in the large, are on the agenda also for contemporary library work. Certainly there is some possible common ground here. Time to understand rather more about the motivations that operate in the library sector.

Links

Account creation is now open on the ScienceSource wiki, where you can see SPARQL visualisations of text mining.

MediaWiki message delivery (talk) 11:20, 30 November 2018 (UTC)

Facto Post – Issue 19 – 27 December 2018

Learning from Zotero

Zotero is free software for reference management by the Center for History and New Media: see Wikipedia:Citing sources with Zotero. It is also an active user community, and has broad-based language support.

 
Zotero logo

Besides the handiness of Zotero's warehousing of personal citation collections, the Zotero translator underlies the citoid service, at work behind the VisualEditor. Metadata from Wikidata can be imported into Zotero; and in the other direction the zotkat tool from the University of Mannheim allows Zotero bibliographies to be exported to Wikidata, by item creation. With an extra feature to add statements, that route could lead to much development of the focus list (P5008) tagging on Wikidata, by WikiProjects.

Zotero demo video

There is also a large-scale encyclopedic dimension here. The construction of Zotero translators is one facet of Web scraping that has a strong community and open source basis. In that it resembles the less formal mix'n'match import community, and growing networks around other approaches that can integrate datasets into Wikidata, such as the use of OpenRefine.

Looking ahead, the thirtieth birthday of the World Wide Web falls in 2019, and yet the ambition to make webpages routinely readable by machines can still seem an ever-retreating mirage. Wikidata should not only be helping Wikimedia integrate its projects, an ongoing process represented by Structured Data on Commons and lexemes. It should also be acting as a catalyst to bring scraping in from the cold, with institutional strengths as well as resourceful code.

Links

Diversitech, the latest ContentMine grant application to the Wikimedia Foundation, is in its community review stage until January 2.

MediaWiki message delivery (talk) 19:08, 27 December 2018 (UTC)

Facto Post – Issue 20 – 31 January 2019

Everything flows (and certainly data does)

Recently Jimmy Wales has made the point that computer home assistants take much of their data from Wikipedia, one way or another. So as well as getting Spotify to play Frosty the Snowman for you, they may be able to answer the question "is the Pope Catholic?" Possibly by asking for disambiguation (Coptic?).

Amazon Echo device using the Amazon Alexa service in voice search showdown with the Google rival on an Android phone

Headlines about data breaches are now familiar, but the unannounced circulation of information raises other issues. One of those is Gresham's law stated as "bad data drives out good". Wikipedia and now Wikidata have been criticised on related grounds: what if their content, unattributed, is taken to have a higher standing than Wikimedians themselves would grant it? See Wikiquote on a misattribution to Bismarck for the usual quip about "law and sausages", and why one shouldn't watch them in the making.

Wikipedia has now turned 18, so should act like an adult, as well as being treated like one. The Web itself turns 30 some time between March and November this year, per Tim Berners-Lee. If the Knowledge Graph by Google exemplifies Heraclitean Web technology gaining authority, contra GIGO, Wikimedians still have a role in its critique. But not just with the teenage skill of detecting phoniness.

There is more to beating Gresham than exposing the factoid and urban myth, where WP:V does do a great job. Placeholders must be detected, and working with Wikidata is a good way to understand how having one statement as data can blind us to replacing it by a more accurate one. An example that is important to open access is that, firstly, the term itself needs considerable unpacking, because just being able to read material online is a poor relation of "open"; and secondly, trying to get Creative Commons license information into Wikidata shows up issues with classes of license (such as CC-BY) standing for the actual license in major repositories. Detailed investigation shows that "everything flows" exacerbates the issue. But Wikidata can solve it.

MediaWiki message delivery (talk) 10:53, 31 January 2019 (UTC)

8th ISCB Wikipedia Competition: a reminder

Nomination for deletion of Template:PBB

 Template:PBB has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Zackmann (Talk to me/What I been doing) 18:58, 27 February 2019 (UTC)

Facto Post – Issue 21 – 28 February 2019

What is a systematic review?

Systematic reviews are basic building blocks of evidence-based medicine, surveys of existing literature devoted typically to a definite question that aim to bring out scientific conclusions. They are principled in a way Wikipedians can appreciate, taking a critical view of their sources.

 
PRISMA flow diagram for a systematic review

Ben Goldacre in 2014 wrote (link below) "[...] : the "information architecture" of evidence based medicine (if you can tolerate such a phrase) is a chaotic, ad hoc, poorly connected ecosystem of legacy projects. In some respects the whole show is still run on paper, like it's the 19th century." Is there a Wikidatan in the house? Wouldn't some machine-readable content that is structured data help?

2011 photograph by Bernard Schittny of the "Legacy Projects" group

Most likely it would, but the arcana of systematic reviews and how they add value would still need formal handling. The PRISMA standard dates from 2009, with an update started in 2018. The concerns there include the corpus of papers used: how selected and filtered? Now that Wikidata has a 20.9 million item bibliography, one can at least pose questions. Each systematic review is a tagging opportunity for a bibliography. Could that tagging be reproduced by a query, in principle? Can it even be second-guessed by a query (i.e. simulated by a protocol which translates into SPARQL)? Homing in on the arcana, do the inclusion and filtering criteria translate into metadata? At some level they must, but are these metadata explicitly expressed in the articles themselves? The answer to that is surely "no" at this point, but can TDM find them? Again "no", right now. Automatic identification doesn't just happen.

Actually these questions lack originality. It should be noted though that WP:MEDRS, the reliable sources guideline used here for health information, hinges on the assumption that the usefully systematic reviews of biomedical literature can be recognised. Its nutshell summary, normally the part of a guideline with the highest density of common sense, allows literature reviews in general validity, but WP:MEDASSESS qualifies that indication heavily. Process wonkery about systematic reviews definitely has merit.

Links

MediaWiki message delivery (talk) 10:01, 28 February 2019 (UTC)

Nomination for deletion of Template:PBB Controls

 Template:PBB Controls has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. {{3x|p}}ery (talk) 19:03, 9 March 2019 (UTC)

Nomination for deletion of Template:PDB Gallery

 Template:PDB Gallery has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. {{3x|p}}ery (talk) 14:54, 10 March 2019 (UTC)

Nomination for merging of Template:GNF Protein box

 Template:GNF Protein box has been nominated for merging with Template:Infobox gene. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Thank you. Gonnym (talk) 08:43, 20 March 2019 (UTC)

Facto Post – Issue 22 – 28 March 2019

When in the cloud, do as the APIs do

Half a century ago, it was the era of the mainframe computer, with its air-conditioned room, twitching tape-drives, and appearance in the title of a spy novel Billion-Dollar Brain then made into a Hollywood film. Now we have the cloud, with server farms and the client–server model as quotidian: this text is being typed on a Chromebook.

Logo of Cloud API on Google Cloud Platform

The term Applications Programming Interface or API is 50 years old, and refers to a type of software library as well as the interface to its use. While a compiler is what you need to get high-level code executed by a mainframe, an API out in the cloud somewhere offers a chance to perform operations on a remote server. For example, the multifarious bots active on Wikipedia have owners who exploit the MediaWiki API.

APIs (called RESTful) that allow for the GET HTTP request are fundamental for what could colloquially be called "moving data around the Web"; from which Wikidata benefits 24/7. So the fact that the Wikidata SPARQL endpoint at query.wikidata.org has a RESTful API means that, in lay terms, Wikidata content can be GOT from it. The programming involved, besides the SPARQL language, could be in Python, younger by a few months than the Web.
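To make "GOT" a little more concrete, the following is a minimal sketch, using only the Python standard library, of how a SPARQL query might be packed into a GET request to the endpoint. The helper names, the example query and the User-Agent string are assumptions made for this illustration; the `query` and `format` parameters follow the usual convention for the Wikidata Query Service.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query (an assumption): three items that are instances of house cat.
EXAMPLE_QUERY = "SELECT ?item WHERE { ?item wdt:P31 wd:Q146 } LIMIT 3"


def build_get_url(sparql: str) -> str:
    """Encode the query as URL parameters for a plain HTTP GET."""
    params = urllib.parse.urlencode({"query": sparql, "format": "json"})
    return f"{ENDPOINT}?{params}"


def run_query(sparql: str,
              user_agent: str = "FactoPostExample/0.1 (hypothetical)") -> list:
    """Send the GET request and return the list of result bindings.

    Needs network access; the User-Agent value is a placeholder and
    should be replaced with one identifying your own script.
    """
    request = urllib.request.Request(build_get_url(sparql),
                                     headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    return data["results"]["bindings"]


if __name__ == "__main__":
    # Printing the URL needs no network; paste it into a browser to GET the data.
    print(build_get_url(EXAMPLE_QUERY))
```

Everything the server needs travels in the URL itself, which is why the same request works equally well from a browser address bar, a script or a bot.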

Magic words, such as occur in fantasy stories, are wishful (rather than RESTful) solutions to gaining access. You may need to be a linguist to enter Ali Baba's cave or the western door of Moria (French in the case of "Open Sesame", in fact, and Sindarin being the respective languages). Talking to an API requires a bigger toolkit, which first means you have to recognise the tools in terms of what they can do. On the way to the wikt:impactful or polymathic modern handling of facts, one must perhaps take only tactful notice of tech's endemic problem with documentation, and absorb the insightful point that the code in APIs does articulate the customary procedures now in place on the cloud for getting information. As Owl explained to Winnie-the-Pooh, it tells you The Thing to Do.

MediaWiki message delivery (talk) 11:45, 28 March 2019 (UTC)

Nomination for deletion of Template:EntrezGene2

 Template:EntrezGene2 has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Gonnym (talk) 10:51, 2 April 2019 (UTC)