Wikipedia:WikiProject Tropical cyclones/Data migration
The following is a proposed Wikipedia policy, guideline, or process. The proposal may still be in development, under discussion, or in the process of gathering consensus for adoption.
This page details the planned implementation of a solution to the ever-growing number of data-related edits on tropical cyclone season pages. These edits only change statistics and inflate an article's edit count without significantly affecting its content. This page is built upon the idea that "data edits go on Wikidata, content edits go on Wikipedia", which hopefully serves as a clear distinction between the two kinds of change.
Consensus for this implementation plan has not yet been gathered; it will be sought once the first draft of the plan is complete. Participants from both Wikidata and Wikipedia are encouraged to join the discussion.
Preface
As found in this analysis, nearly half of the edits to the 2021 Pacific typhoon season article change fewer than 10 bytes. Most of these edits change numbers and other statistics. Since Wikipedia is not an indiscriminate collection of information, and because these edits are essentially nullified at the end of a tropical cyclone's lifespan, when all data about its current status is removed from the article, the recommended course of action is to move statistical edits off Wikipedia and onto Wikidata.
Goals
This plan has the following goals:
- Create a systematic method of updating current storm data on Wikidata and Wikipedia.
- Move all "current storm information" statistics (and possibly information in {{infobox tropical cyclone current}} infoboxes) to Wikidata.
- Ensure compatibility across all storm basins and meteorological agencies when implementing the changes.
- (Optional) Increase WikiProject Tropical cyclones project activity on Wikidata.
Participating
If you have any comments or concerns, please leave a message on the talk page.
If you'd like to participate in developing templates, bots, modules, or any other technical details of the plan, please include your name below. A modicum of technical proficiency is required, as the tasks involved in this project are highly dependent on code and other software work.
Phase 1: Infrastructure
In order to facilitate this change, the proper infrastructure must be created on both wikis.
Adjust Wikidata properties and identifiers
Wikidata currently has the following properties related to storm information:
- instance of (P31) (storm category)
- coordinate location (P625) (current position)
- lowest atmospheric pressure (P2532)
- maximum sustained winds (P2895)
- speed (P2052)
- statement supported by (P3680) (for assigning agency (e.g. National Hurricane Center, Japan Meteorological Agency, Joint Typhoon Warning Center))
- determination method or standard (P459) (for 10-minute/3-minute/1-minute winds, Wikidata items to be created later)
Wikidata does not have properties related to:
- gusts
- closest reference point (e.g. 200 nautical mile (Q93318) southeast (Q6452640) of Tokyo (Q1490))
- basin (North Atlantic, Northwestern Pacific, Southwestern Indian, etc.)
Wikidata also does not have identifiers for agency tropical cyclone details. These should be supplied in order to automatically generate the "for the latest official information" part of the current storm information.
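The "closest reference point" item above could be derived from coordinates that are already stored rather than typed in by hand. The following is a minimal sketch, assuming the storm position and the reference place are both available as decimal latitude/longitude pairs (as coordinate location (P625) provides). The function names, the Earth radius constant, and the 8-point compass are illustrative choices and not part of the plan.

```python
import math

# Approximate mean Earth radius in nautical miles (assumed constant).
EARTH_RADIUS_NMI = 3440.065

COMPASS_POINTS = ["north", "northeast", "east", "southeast",
                  "south", "southwest", "west", "northwest"]


def distance_and_direction(ref_lat, ref_lon, storm_lat, storm_lon):
    """Return (nautical miles, compass direction) of the storm from the reference point."""
    phi1, phi2 = math.radians(ref_lat), math.radians(storm_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(storm_lon - ref_lon)

    # Great-circle distance using the haversine formula.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance = 2 * EARTH_RADIUS_NMI * math.asin(math.sqrt(a))

    # Initial bearing from the reference point towards the storm, in degrees.
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360

    # Snap the bearing to the nearest of eight compass points.
    direction = COMPASS_POINTS[int((bearing + 22.5) // 45) % 8]
    return distance, direction


def describe_position(place, ref_lat, ref_lon, storm_lat, storm_lon):
    """Build a phrase such as '200 nautical miles southeast of Tokyo'."""
    distance, direction = distance_and_direction(ref_lat, ref_lon, storm_lat, storm_lon)
    return f"{round(distance)} nautical miles {direction} of {place}"
```

Whether the phrase is computed like this on the fly or stored as its own property is a design decision the plan leaves open; computing it avoids one more value that has to be updated with every advisory.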
Create items for all storm classifications
Classifications will be on a per-agency basis, to be used for instance of (P31). This list outlines classifications with (linked) and without (unlinked) items:
- Joint Typhoon Warning Center (Q1142111)
  - subtropical cyclone
  - tropical cyclone
  - tropical depression (Q96096134)
  - tropical storm (Q96096178)
  - severe tropical storm
  - typhoon (Q140588)
  - super typhoon (Q15941028)
  - extratropical cyclone
- Japan Meteorological Agency (Q860935)
- National Hurricane Center (Q1329523)/Central Pacific Hurricane Center (Q1053937)
- Météo-France (Q1810406)
- Bureau of Meteorology (Q923429)
  - tropical low
  - tropical depression
  - category 1 tropical cyclone
  - category 2 tropical cyclone
  - category 3 severe tropical cyclone
  - category 4 severe tropical cyclone
  - category 5 severe tropical cyclone
- India Meteorological Department (Q923628)
  - depression
  - deep depression
  - cyclonic storm
  - severe cyclonic storm
  - very severe cyclonic storm
  - extremely severe cyclonic storm
  - super cyclonic storm
Adjust future Wikidata items
This change primarily targets current storm information. Thus, there is no need for retroactive changes to existing tropical cyclone items (although they are highly suggested, especially since Wikidata items for storms are poorly maintained). For future Wikidata items, the following must be observed:
- The storm should be an instance of (P31) whatever storm classification it is under. This should only be the highest actual category achieved.
  - For example, a storm designated by the JMA as a severe tropical storm should be an instance of severe tropical storm (Q11069306).
  - For example, a storm designated by the JTWC as a typhoon should be an instance of typhoon (Q140588).
  - Since instance of (P31) can contain multiple entities, a storm can be classified under both classifications.
  - This requires changes to existing Wikidata items. See below section for details.
- Data on a storm's lowest atmospheric pressure (P2532) should be qualified with point in time (P585), determination method or standard (P459), and statement supported by (P3680).
- Data on a storm's maximum sustained winds (P2895) should be qualified with point in time (P585), determination method or standard (P459), and statement supported by (P3680).
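The qualifier rules above can be checked mechanically before a value is saved. The sketch below uses a simplified stand-in for a Wikidata statement, not the actual Wikibase JSON format; the Statement class, its field names, and the sample timestamp are hypothetical, while the property IDs are the ones named in this plan.

```python
from dataclasses import dataclass, field

# Qualifiers this plan asks for on pressure (P2532) and wind (P2895) values:
# point in time, determination method or standard, statement supported by.
REQUIRED_QUALIFIERS = ("P585", "P459", "P3680")


@dataclass
class Statement:
    """Simplified stand-in for one Wikidata statement (hypothetical structure)."""
    prop: str
    value: float
    unit: str
    qualifiers: dict = field(default_factory=dict)


def missing_qualifiers(statement):
    """List the required qualifier properties that the statement lacks."""
    return [p for p in REQUIRED_QUALIFIERS if p not in statement.qualifiers]


# A complete wind statement attributed to the Japan Meteorological Agency (Q860935).
winds = Statement(
    prop="P2895",
    value=65,
    unit="knot",
    qualifiers={
        "P585": "2021-09-10T06:00:00+00:00",
        "P459": "10-minute sustained winds (item to be created)",
        "P3680": "Q860935",
    },
)
```

A bot or gadget could refuse to save a value while missing_qualifiers returns a non-empty list, so that every displayed figure carries its time, averaging method, and agency.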
Create templates on Wikipedia
In order to display the data on Wikipedia, there must be a template that specifically generates the content for a "current storm information" section.
Since transcluded Wikidata statistics are automatically given a pencil icon, it shouldn't be hard for existing editors to make changes to the storm data. This also makes the data easy for vandals to edit; protection details are outlined in the following sections. This template will display the latest (most recent point in time (P585)) value of the maximum sustained wind speed and lowest atmospheric pressure. Depending on the basin, either 10-minute winds or 3-minute winds will be displayed, along with 1-minute winds, which are always shown if available. Official sources will be provided through identifiers. The closest reference point will always be shown, along with gusts (if available) and movement (if available).
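The selection logic described above (latest value by point in time, one basin-dependent averaging period plus 1-minute winds) might look like the following. This is only a sketch in Python for illustration; the real template would most likely be backed by a Lua module. The observation dictionaries, the basin-to-period mapping, and the function names are assumptions.

```python
from datetime import datetime

# Hypothetical mapping of basins to the agency averaging period shown there.
# Basins not listed fall back to 10-minute winds.
PRIMARY_PERIOD_BY_BASIN = {"North Indian": "3-minute"}


def latest(observations, method):
    """Return the most recent observation made with the given averaging method, or None."""
    candidates = [o for o in observations if o["method"] == method]
    if not candidates:
        return None
    return max(candidates, key=lambda o: datetime.fromisoformat(o["time"]))


def winds_to_display(observations, basin):
    """Pick the wind values to show: the basin's primary period, then 1-minute winds."""
    primary = PRIMARY_PERIOD_BY_BASIN.get(basin, "10-minute")
    shown = {}
    for method in (primary, "1-minute"):
        obs = latest(observations, method)
        if obs is not None:
            shown[method] = obs["knots"]
    return shown


sample = [
    {"time": "2021-09-10T00:00:00+00:00", "method": "10-minute", "knots": 65},
    {"time": "2021-09-10T06:00:00+00:00", "method": "10-minute", "knots": 70},
    {"time": "2021-09-10T06:00:00+00:00", "method": "1-minute", "knots": 80},
]
```

With the sample data, winds_to_display(sample, "Northwestern Pacific") picks the 06:00 values for both averaging periods and ignores the older 00:00 entry.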
Phase 2: Optimization
Data importation
Most of the data can be automatically generated. Best track data from the JTWC and RSMCs are freely available and can be scraped for inclusion on Wikidata. This data can be imported by specifically designed bots.
Cewbot by Kanashimi is already responsible for uploading images from meteorological agencies to Commons (see BRfA). Because of this, Wikidata items for storms can automatically be updated with the uploaded track maps. This, however, does not include automatic updates involving best track data. For this, another bot should be used instead, or an existing bot can be used as long as it can accurately import data at proper intervals.
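A bot that runs at fixed intervals will often fetch records it has already imported, so the import step should be safe to repeat. The sketch below shows one way to do that, assuming each fetched record is a dictionary with an ISO 8601 "time" key in a uniform format; the record layout and the function name are hypothetical and do not describe any agency's actual file format.

```python
def new_records(existing_times, fetched_records):
    """Return fetched records whose timestamps are not yet on the item, oldest first.

    existing_times: timestamps already present as point in time (P585) qualifiers.
    fetched_records: dictionaries with at least a "time" key (hypothetical layout).
    """
    seen = set(existing_times)
    fresh = []
    # Uniformly formatted ISO 8601 strings sort chronologically as plain text.
    for record in sorted(fetched_records, key=lambda r: r["time"]):
        if record["time"] not in seen:
            seen.add(record["time"])  # also guards against duplicates within one fetch
            fresh.append(record)
    return fresh
```

Only the records returned here would be written to Wikidata, so running the bot twice on the same advisory produces no duplicate statements.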
Template:Infobox tropical cyclone current (ITCC) is an extremely complicated and messy infobox written entirely in wikitext. Given that this infobox is used for all basins, it should be standardized in order to ensure that (a) editors are no longer confused by the available types and classifications, and (b) Wikipedia can pull data from Wikidata without errors.
The following changes are suggested for standardization:
- Replace switches on category, AUScategory, JMAtype, JMAcategory, IMDtype, IMDcategory, MFRtype, and MFRcategory with one centralized category database (i.e. Module:Storm categories).
  - (On the module) Prevent specific categories from being used on basins they are not supposed to be in.
- Fix parsing problems with lat and lon.
- Stop relying on multiple if statements for gusts position.
NOTE: Many of the issues here have been fixed in an in-development modular infobox, {{Infobox weather event}}.
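The idea of one centralized category database with per-basin restrictions can be sketched as a single lookup table. The example is written in Python for illustration only; on Wikipedia this would be a Lua module such as the suggested Module:Storm categories. The table entries, basin names, agency abbreviations, and function name are illustrative assumptions, not the module's actual contents.

```python
# Hypothetical category table keyed by (agency, category name).
STORM_CATEGORIES = {
    ("JTWC", "typhoon"): {"basins": {"Northwestern Pacific"}},
    ("JTWC", "super typhoon"): {"basins": {"Northwestern Pacific"}},
    ("BoM", "tropical low"): {"basins": {"Australian region"}},
    ("BoM", "category 3 severe tropical cyclone"): {"basins": {"Australian region"}},
    ("IMD", "cyclonic storm"): {"basins": {"North Indian"}},
    ("IMD", "super cyclonic storm"): {"basins": {"North Indian"}},
}


def validate_category(agency, category, basin):
    """Return the normalized category name, or raise ValueError if it does not fit the basin."""
    key = (agency, category.lower())
    entry = STORM_CATEGORIES.get(key)
    if entry is None:
        raise ValueError(f"Unknown category {category!r} for agency {agency}")
    if basin not in entry["basins"]:
        raise ValueError(f"{category!r} ({agency}) is not used in the {basin} basin")
    return key[1]
```

A single table like this replaces the separate per-agency switches: adding a classification means adding one row, and the basin check happens in one place.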
Timelines
Timelines can also be automatically generated by grabbing the entity data of the current storm season, finding the cyclones that are part of it, and creating the graph using a module. Such a graph no longer needs updating on Wikipedia; updates are made on Wikidata instead.
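The data-gathering half of that process could be sketched as follows, assuming each storm has already been reduced to a dictionary holding its name, its season, and its start and end dates. The dictionary layout, the placeholder storm names, and the function name are hypothetical; drawing the graph itself is left to the module.

```python
from datetime import date


def timeline_rows(storms, season):
    """Return (name, start, end, duration in days) for storms in the season, earliest first."""
    rows = [
        (s["name"], s["start"], s["end"], (s["end"] - s["start"]).days + 1)
        for s in storms
        if s["season"] == season
    ]
    return sorted(rows, key=lambda row: row[1])


# Placeholder storms, not real data.
storms = [
    {"name": "Storm B", "season": "2021 Pacific typhoon season",
     "start": date(2021, 4, 12), "end": date(2021, 4, 25)},
    {"name": "Storm A", "season": "2021 Pacific typhoon season",
     "start": date(2021, 2, 16), "end": date(2021, 2, 23)},
    {"name": "Storm C", "season": "2021 Atlantic hurricane season",
     "start": date(2021, 5, 22), "end": date(2021, 5, 24)},
]
```

Calling timeline_rows(storms, "2021 Pacific typhoon season") returns Storm A followed by Storm B and leaves out Storm C, which belongs to another season.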
Phase 3: Conclusion
All related changes should be made to documentation pages, and current WikiProject members should be notified of them. This does have the side effect of changing several established norms, but the new system should lead to a much more standardized and improved workflow. Otherwise, WikiProject Tropical cyclones will be stuck with old code and inflated season article edit counts, which are a hurdle in administration and organization.
In order to maintain this system, WikiProject members are highly encouraged to participate in Wikidata. Wikidata items can be watchlisted much like Wikipedia pages, so it wouldn't be difficult to patrol the items of current storms (which rely on Wikidata). This new system also supports adding the source of a data point (for example, maximum sustained winds (P2895)). This removes the need to edit the extremely vague "winds" parameter, which currently leads to edit wars over which agency to prioritize.
Backporting this system to old storms in order to also automatically supply values in {{Infobox tropical cyclone}} is possible. Access to best track data is not required for this, as there only needs to be a single value, the bare minimum that the infobox requires. This, however, is not as important, since edits to previous storm infoboxes do not significantly inflate the edit count of an article.
Implementation
Implementation should begin as soon as consensus has been achieved. For template changes, the WikiProject will be given a 15-day notice with a list of changes and usage examples. New templates should likewise be announced on the WikiProject talk page in order to increase their usage.
Expected outcomes
- Begin adding storm data to Wikidata
- Massively reduce non-prose (or data-only) edits to cyclone season articles
- Automatically update current storm information
- Automatically write the "current storm information" section of articles
- Use a bot to automatically add data to Wikidata items
- Automatically generate cyclone season article timelines
- Invite patrollers to Wikidata items of cyclones to prevent disruption (or automatically flag such edits)