Cleaning up textual Copyright/Plagiarism on Wikipedia


What constitutes a "copyright problem" on Wikipedia?

Copyright infringement is a matter of law. It occurs when copyrighted content is used without permission in a way that violates the copyright holder's exclusive rights. While some cases are more obvious than others, whether infringement has occurred is ultimately determined by a court of law. Copyright protection does not extend to facts, but rather to the creative elements of expression, which can include language, structure and the creative selection of facts. The copyright law that governs the English-language Wikipedia is that of the United States.

Wikipedia's policies regarding non-free text content are designed to help us avoid copyright infringement. In brief, we are forbidden to import creative content unless we can prove that the content is public domain or compatibly licensed (and unless we handle it properly under that license). Otherwise, except for brief excerpts used in accordance with non-free content guidelines, we must write the content we add in our own words. While we may properly paraphrase non-free sources, we cannot follow them so closely in language or structure as to create a derivative work. (See also Wikipedia:Close paraphrasing.) Essentially, any added content that does not meet the standards of these policies is a "copyright problem". Copyright problems might include (but are not limited to):

  • straightforward copy-pastes of non-free sources,
  • licensing violations that copy compatibly licensed sources without attribution,
  • content that has been directly or closely translated from copyrighted works in other languages,
  • articles that follow non-free sources so closely as to be clear derivative works, and
  • articles that include extensive non-free content.

What constitutes plagiarism on Wikipedia?

Plagiarism is a question of ethics. Standards of plagiarism vary by discipline. The standard of plagiarism adopted as a guideline on Wikipedia is set out at Wikipedia:Plagiarism. It requires that creative text or striking information be properly attributed and either (a) summarized or (b) denoted as copied, whether by using the standard markings for quotations set out in the Manual of Style, by noting explicit copying in a footnote, or by using an attribution template. Content that does not follow the practices recommended in the guideline may be plagiarism.

Is it a copyright problem or plagiarism, and what difference does it make?

This is a false dichotomy. Copyright and plagiarism are two entirely different concepts, one legal and one ethical. Something may be both a copyright problem and plagiarism. It may be only a copyright problem, if the content is fully attributed but still violates our copyright policies (as with overly extensive quotations). It may be only plagiarism if the content that isn't attributed is public domain.

If something is both a copyright problem and plagiarism, the content should be handled in accordance with copyright procedures. Procedures for dealing with copyright issues are a matter of policy based on recommendations and requirements set by the Wikimedia Foundation. Plagiarism, in both its definition and its handling, is a matter of community consensus expressed through a guideline. Plagiarism can generally be addressed swiftly through proper attribution, but copyright problems may require swift removal or alteration of the content to protect the project and its reusers from legal difficulties and to protect copyright holders from damages related to its misuse.

How do I handle a "copyright problem" on Wikipedia?

Text-based copyright problems on Wikipedia are all handled through a two-step (sometimes three-step) process:

  1. Tag, revert or rewrite the content,
  2. Notify the contributor, and
  3. (if necessary) List the article for review.

Whether to tag, revert or rewrite depends on such factors as (a) when the copyvio was introduced, (b) how substantial the copyvio is in comparison to the rest of the article, and (c) whether there's a credible reason to believe we might get permission. If you forget step 3, it's not generally a big deal, as there are bots that will complete that for you when the tag is connected to a specific review board. (The only time the bots might not work is if a contributor reverts your tag before they can; it's a good idea to keep an eye out for that, as copyvio tags are often removed out of process.) Notification is a bit more important, since it's essential we help those who infringe to stop doing so. And, if they will not, the notifications will serve as evidence that we tried.

WP:Cv101 sets out about as brief an overview as possible of the processes, based on how much time you have, as the person who found the problem.

How do I handle plagiarism on Wikipedia?

In a word: "attribute." In three words: "attribute and educate." See Wikipedia:Plagiarism#How to respond to plagiarism for some specific suggestions. In brief, provide the missing attribution for the content, if you can. And, if the contributor is still active, politely let them know about the guideline and how to provide proper attribution under it.

How can I write so that I don't create "copyright problems"?

Don't copy non-free content, not even temporarily

First, never copy/paste unless the content you are copying and pasting is verifiably public domain or compatibly licensed or unless you are planning to use the entire amount of content you are copying and pasting in a cited quote. Do not temporarily copy over copyrighted content so that you can rewrite it here. That's the fast track to a derivative work. If it's not free and it's too long to use in a cited quote, it should never be placed on Wikipedia at all...not in an article, not in a sandbox, not anywhere. (See Wikipedia:Public domain, Wikipedia:Copy-paste.)

Rewrite non-free content with your own creative words and structure

Make sure that the content you add reflects your own creative writing, not somebody else's. The text you add based on information you find in non-free sources (except for brief quotations) should not retain creative elements of the original. That can include language and structure. The threshold for creativity in US copyright law is intentionally set very low. Content doesn't have to be fiction to be creative; most non-fiction has ample creativity to qualify. The United States Supreme Court noted in Feist v. Rural, 499 U.S. 340 (1991), that "originality is not a stringent standard; it does not require that facts be presented in an innovative or surprising way." All that is required is a "spark" of creativity, the court said, "no matter how crude, humble or obvious it may be." Wikipedia:Close paraphrasing includes some advice on how to properly rewrite content to avoid following sources too closely. The article Wikipedia:Wikipedia Signpost/2009-04-13/Dispatches, while about plagiarism rather than copyright concerns, also contains some suggestions for reusing material from sources that may be helpful, beginning under "Avoiding plagiarism".

Be sure that the quotes you do use meet guidelines

Wikipedia does allow the use of brief excerpts of copyrighted content, so long as these are clearly marked and implemented for good reasons, such as those set out at the non-free content guideline. Our purpose is to ensure that we remain well within the fair use allowances of the United States. Quotes should be used transformatively; examples in the guideline include illustrating a point, establishing context, or attributing a point of view or idea. The length of quotes should be kept brief relative to the length of the source and in proportion to the size of the article in which they are being used.