Wikipedia talk:Close paraphrasing/Archive 2

First example re-write

Can someone re-write the first example that starts "A statement from the receiver, David Carson..." in a manner that does not violate copyright? --Kenatipo speak! 16:55, 7 February 2012 (UTC)

Reversion of change to milder problems template

Some improvements to the template for milder problems were reverted in [1], commented "rv seems to make casual/specious accusations more plausible". I reverted that in [2] explaining "This is about communication, not accusation. See Wikipedia:Revert_only_when_necessary". This was brutally re-reverted in [3] commenting "no it's not about communication, it's about making things optional that you would have been well advised not to consider optional". This re-reversion, which I'm reverting, confuses the changes with the section. The section is about communication (not accusation). [Some of] the changes I made were about making things optional. If there is any reason why you do not agree with the generalization, please explain. --Chealer (talk) 20:44, 7 March 2012 (UTC)

I disagreed with those changes, although they're very simple. I've only reverted your changes. How can it be a “re-reversion” that “confuses the changes with the section”? You're the one who wants to make changes, not me. Can you give a reason why you want to move the important communication advice down from the top of the section to the bottom? Obviously that seems topsy-turvy. What's your rationale that the important advice should go last?
You won't admit you're accusing me of close paraphrasing a direct quotation. You pasted my original quote intact as the sole “example” into your revised form, and you plastered the banner across the top of the Materialism article, pointed out my edit, (including some kind of “Duplication Detector report”). You've reverted that 3 times now, but you are no doubt aware that a close paraphrase of a direct quote is absurd. Robotically repeating “Unjustified reversion”, so repeatedly justifying your actions with some vague claim about improving communication is not acceptable. Had you taken the original format seriously, it wouldn't have happened. If that's brutal, you'd better sit down: a) you want to make the line that says you have other examples optional, b) that another editor dropped a quotation mark in the two months after I made that edit, shows how easy it would be to get it wrong, in contrast to an article someone wrote being very close paraphrasing. That's a lot more clear cut than someone contributing to “parts” that aren't necessarily very close paraphrasing c) that you might be wrong, all you can say, is how it seems to you.
I'm not the only editor in connection with naturalism/creationism/intelligent design that you've policed/retaliated against under this relaxed standard.—Machine Elf 1735 00:52, 8 March 2012 (UTC)
There seems to be a content dispute spilling over here, which isn't helpful. If you guys want to discuss whether a specific passage in some article is a close paraphrasing, I'm sure we can do that, but maybe keep it separate from talking about the form of the examples themselves. :)
What we're talking about here are examples. They were written fairly simply so that they could be copied (which is why markup is visible). They certainly aren't meant to be set in stone, but I would myself prefer that we keep them clean, without a lot of visible options which people will be forced to clean up. How about if we simply make a note above the examples that they can be modified according to the needs of the situation? I wrote those examples, and I modify them all the time. :) --Moonriddengirl (talk) 12:27, 8 March 2012 (UTC)
Hi Moonriddengirl, I noticed an instance of close paraphrasing in January and while warning the contributor, I found this page. I made 2 minor modifications to the template elsewhere but served the template to the contributor with 2 other modifications, the first 2. However, when I used the template a second time, I went a little too fast, forgot about the first change (wrote -> contributed to) and only did the second one (plus the third one). (In fact, I forgot about the second one too at first and edited my comment after re-reading myself.) The editor, probably feeling insulted, replied that he was not the only one responsible. I then apologized and proceeded to review this, making errors such as the one I made less likely: [4][5]
A note about adapting could help, however I think highlighting the parts that will typically need adaptation is even better. I certainly do not want a lot of options that people have to deal with each time any more than you do. If the 3 variables are too much... well, let's at least remove one. The original "wrote", which I had generalized to "wrote/contributed to" is now just "contributed to". This will certainly sound a little canny if someone wrote the article, but that's not a great problem, nor going to happen often. That leaves the 2 last variables, "very" and the example part. If some still consider that too much, it wouldn't be too hard to remove the "very". It would also be possible to duplicate the example, but that will cause readers to spend time distinguishing the templates and choosing which one is most appropriate. --Chealer (talk) 01:35, 9 March 2012 (UTC)
It's absurd for close paraphrasing to apply to cited direct quotes, so that's not an option, but should the examples be slackened just enough to accommodate harassing editors when there's only one “example”, when multiple editors have contributed, and when it doesn't even seem very much like close paraphrasing?—Machine Elf 1735 02:11, 9 March 2012 (UTC)
Close paraphrasing doesn't apply to cited direct quotes, which are, pretty much by definition, not paraphrased. :) There can be other issues with quotations under non-free policies and guidelines, but close paraphrasing is not one of them. Chealer raises good points about the difference between writing and contributing to; hopefully, anybody using these examples will do enough due diligence to identify the specific contributor involved. But I agree that these examples are probably not appropriate when the close paraphrasing is so borderline that "very" doesn't apply. I really value keeping Wikipedia free of copyright issues, and I understand that concerns about plagiarism are really important to a lot of people, but we don't want to draw a line in the sand so rigid that legitimate use of close paraphrase becomes problematic. --Moonriddengirl (talk) 12:42, 9 March 2012 (UTC)
Thank you, yes if anything it would be “copying”. Perhaps, it's surprisingly difficult to anticipate how much diligence is due? Even though Chealer was keenly aware of the issue, in a passage of less than 250 char, he failed to notice a subsequent editor had dropped a quotation mark. (This guinea pig is partial to “content non-issue” if not “case study”—we learn by example, but point well taken). I'm quite keen on copyright issues as well, moral rights in particular, (living in the US). That's why I prefer to quote if possible; it's higher fidelity. All things being equal, the materials shouldn't imply they're too narrowly applicable. For example, I'd be comfortable with “contributed to” rather than “wrote” if some cautionary advice preceded the instructions: that a diligence effort in multiple contributor scenarios can rapidly overtake that of just fixing it. It does seem to endorse a certain lack of accountability if “very” is marked as optional, better that it's either all the way in or all the way out.—Machine Elf 1735 16:00, 9 March 2012 (UTC)
I partially disagree with this (see my explanation in the Markup changes subsection). --Chealer (talk) 21:00, 9 March 2012 (UTC)
The re-reversion was commented "no it's not about communication, it's about making things optional that you would have been well advised not to consider optional", which I interpret as a reply to my reversion commented "This is about communication, not accusation. See Wikipedia:Revert_only_when_necessary". In my comment, "This" referred to the section. In your reply, if "it" also referred to the section, that would amount to saying "no the section is not about communication, it's about making things optional that you would have been well advised not to consider optional", which I assume is not actually what you meant. You must have used "it" referring to the changes, rather than the section. In other words, I am saying "This section is about communication, not accusation. See Wikipedia:Revert_only_when_necessary" and you are replying with "no the change is not about communication, it's about making things optional that you would have been well advised not to consider optional", which does not actually contradict my comment.
As for the move of the part on communication, as my edit summary explained, the rationale was to make it closer to the communication examples, so that readers would find the advice and the examples together and to facilitate an eventual structuring.
Regarding your insatisfaction about Materialism, please keep the discussion on its talk page or yours, this is the Close paraphrasing essay's talk page. --Chealer (talk) 22:45, 8 March 2012 (UTC)
Hypocritical advice about reverting only when necessary, (which you also tried to redefine to suit your purposes). My insatisfaction? With the changes you're trying to make here, yes… with your abuse? bite me.—Machine Elf 1735 02:11, 9 March 2012 (UTC)
All other things aside, policy does support that reverts should only happen when necessary, but also notes that they are a normal part of establishing consensus. :) There's nothing wrong with boldly implementing change, but when you make a change without establishing consensus for that change first, reversion is part of the process. When disagreeing with a revert, you need to then establish consensus before putting your change into the page. --Moonriddengirl (talk) 12:35, 9 March 2012 (UTC)
This is correct if the reversion in question is a change reversion, not an edit reversion (in general). Generally, an edit cannot be simply reverted because one of the changes it includes is allegedly undesirable. --Chealer (talk) 21:00, 9 March 2012 (UTC)
I'm afraid that I'm not entirely sure what you mean by "change reversion, not an edit reversion (in general)." Policy doesn't include that language. If you mean that it is preferable to fix rather than revert, this is true, but whether or not a change can be fixed is highly subjective. Revert is permissible when an editor believes a change cannot be salvaged. If somebody reverts a change, the thing to do is discuss. See also WP:STATUSQUO. --Moonriddengirl (talk) 14:47, 10 March 2012 (UTC)
I agree, policy does not distinguish between an edit reversion and a change reversion, so what I was trying to say in my last message is that it should be read carefully. I did not mean that it is preferable to fix than to revert. What I meant to point out is that policy shouldn't be understood as supporting reverts of edits (in general), but reverts of changes. Policy does not generally support reverting an edit simply because you consider one of the changes it implements undesirable. --Chealer (talk) 20:45, 10 March 2012 (UTC)

Markup changes

  • I'm separating this out because the conversation above is complicated by the content dispute. While I'm fine with changing "wrote" to "contributed to", I've removed the complicating markup from the examples. As I said, I think this should remain simple. If consensus goes against me, so be it. :) But please don't put this back until there is consensus for it.
    If the "these are just examples" line is not appropriate, I would prefer to remove it altogether rather than to complicate use of this form by requiring that people must edit it either way. Personally, I think it's appropriate because, while a single passage of close paraphrasing can represent a problem worthy of addressing (depending on the size of the passage and proportion to the source and article), close paraphrasing concerns can be overstated. I would not wish to subtly encourage people to issue warnings for problems that are very minor. The use of bolding in the parameters that must be changed (title, for instance, url) isn't bad, but I'm not sure it's necessary given that the section is already using code. I wouldn't mind replacing code with bolding, but I think code is more likely to catch the eye, as it is less common. What do you think? --Moonriddengirl (talk) 12:23, 9 March 2012 (UTC)
Regarding the two content changes, I am not a lawyer, however I think that both alternative versions (without the very and without the more examples part) are useful. I have used the template 3 times in 2012. In these 3 uses, I removed the very part 2 times and the more examples part 2 times. My third use is more complicated, but to discuss concrete cases, we can use the first 2 uses I made to see whether these variations were warranted. The first use is [6], in which I dropped the "very" part. The second case is User_talk:Mthoodhood#Close_paraphrasing, in which I dropped both parts. Let's discuss these 2 issues separately.
I believe the more examples part is useful, because there *are* cases with multiple close paraphrases and 1) we don't want reviewers to have to point out each one 2) nor do we want to give the contributor the impression that the problem is limited to the examples given. So I would rather keep it.
As for the formatting, I agree in principle that less common formatting catches the eye more, however, this difference is diminished by the fact that the examples do not use bold anywhere else, and must be largely compensated by the difference in prominence between boldening and monospacing. At least on this machine/browser, monospace is quite discreet: [7]. I actually have to look at "source" a little bit to convince myself that it is monospaced. On the other hand, bold is just obvious. If anyone actually thinks monospace is more prominent than bold, please show a similar comparison. --Chealer (talk) 21:00, 9 March 2012 (UTC)
I can't really tell what the dispute is here (it seems to be spilling over from elsewhere) and haven't been following closely to be honest, but I do agree with Moonriddengirl regarding the changes to the template. The markup should be kept as simple as possible in my view. Personally I've never used the template; I tend to use a personal message when necessary. But adding complicated markup to a template makes it less user friendly I think. Truthkeeper (talk) 03:04, 10 March 2012 (UTC)
Yes, it's clearly spilling over from elsewhere, per. Keeping the markup simple is, in my opinion, essential. Chealer, your own experiences with using it highlight that. What good is it to have markup that requires people to remember to fix the "[very]"? And if they don't remember, bolding it is only going to make it more confusing for the recipient. Moonriddengirl , — (continues after insertion below.)
I'm not sure how my experiences with using the template highlight the need to keep the markup simple. I'm also not sure I understand your question. If the markup you are referring to is the boldening of "[very]", its purpose is, well, to remind people to modify the "[very]" part, since we want the ultimate message to have either "very" or nothing, but not "[very]". I'm afraid there may be some confusion here (yes, managing a wiki syntax template with wiki syntax is a little complicated :-| ). I boldened parts of how the template displays to the reviewer. I did not modify the template itself (causing the optional parts to ultimately show in bold to the faulty editor). --Chealer (talk) 21:15, 10 March 2012 (UTC)
Chealer, when interrupting somebody else's comments, please follow WP:TPO; you need to note when you're interrupting, and you generally should not do it when responding to short notes such as this one. It's important that talk pages remain readable for those who come later. I've added the {{interrupted}} template for you. --Moonriddengirl (talk) 21:43, 10 March 2012 (UTC)
Sorry. Thanks, I was not aware of that template. --Chealer (talk) 16:49, 11 March 2012 (UTC)
Chealer, this is not a set in stone warning or a template. This is an example. You are welcome to make whatever changes you like to it when placing it on talk page. But I think the example needs to be kept simple and universal. --Moonriddengirl (talk) 14:40, 10 March 2012 (UTC)
"Universality" (or at least some generalization in the way of universality) is precisely what I'm trying to achieve here. I understand this is just a template, but I still think its use should be made as easy as possible. I really don't think that making 2 parts optional prevents the template from being simple. The template's users are editors already aware of how wiki syntax works... they've seen worse. However, you make me realize that now that we only have optional parts (as opposed to choices), the variable parts could be indicated without changing the template's syntax... I'm trying that just now by putting optional parts in italics. Let me know what you think. --Chealer (talk) 21:15, 10 March 2012 (UTC)
I am removing your changes. It's disappointing to me that you continue to complicate these examples in spite of multiple objections to your doing so on this talk page. :/ There is not yet consensus for complicating these examples. --Moonriddengirl (talk) 21:31, 10 March 2012 (UTC)
Well, sorry to disappoint, but the only objections I remember seeing were about the "complicating markup". I don't think anyone suggested that my last change was complicating. Yes, it adds text, but well, simplicity is not an ultimate goal, we want the template to be as optimal as possible. --Chealer (talk) 16:49, 11 March 2012 (UTC)
I mentioned complicated mark-up, but I also agree with Moonriddengirl in regards to the issue of "very". I think something that's simple is being made unnecessarily complicated. Moonriddengirl's recent changes were good, I thought. There isn't a reason not to edit and personalize the templates to fit one's own preferences, but what's posted here should be a generic vanilla version imo. Truthkeeper (talk) 17:16, 11 March 2012 (UTC)
Getting to a generic version is precisely what I'm trying to achieve here. However, the previous version had some interesting optional parts, and I hope that we don't need to stop offering them by generalizing. --Chealer (talk) 21:50, 11 March 2012 (UTC)

Optional part: "very"

The template used to say

I'm afraid the ArticleName article you contributed to has parts which are very closely paraphrased from source.

I made the "very" optional:

I'm afraid the ArticleName article you contributed to has parts which are [very] closely paraphrased from source.

Moonriddengirl thinks "these examples are probably not appropriate when the close paraphrasing is so borderline that "very" doesn't apply.". I am not sure what is meant by "borderline". However, I do not think that how close the paraphrases are has [significant] impact on the close paraphrasing's severity. What really matters, I believe, is the extent of close paraphrasing. Is there any evidence that the "proximity" matters? --Chealer (talk) 21:00, 9 March 2012 (UTC)

The example has never said "which are very closely paraphrased". It says and has always said, "which are very closely paraphrased." The emphasis on very would be extremely inappropriate for an example. The degree of closeness in paraphrase of course has an impact on its severity. Close paraphrase is not an either/or situation; there is a wide variance between a proper paraphrase and direct copying. The closer content moves towards proper paraphrase, the less likely that it will rise to the level of substantial similarity.
In the event where paraphrasing falls closer to proper paraphrase, I do not believe that the example is appropriate because the example is written from the perspective that the content may constitute a copyright issue and suggests that "So that we can be sure it does not constitute a derivative work, this article should be revised to separate it further from its source." If content is not very closely paraphrased, it is less likely to constitute a derivative work.
I do not support any change to this example which is likely to result in its being used against people who are not at substantial risk of closely paraphrasing to the point of copyright issue. --Moonriddengirl (talk) 14:53, 10 March 2012 (UTC)
Right, it says the quoted text, but it does not say that in the format above, in which I boldened the varying parts.
This is not a simple topic. I agree that theoretically, similarity of form between 2 texts carrying the same information is absolutely relative. As Wikipedia:Close_paraphrasing#When_there_are_a_limited_number_of_ways_to_say_the_same_thing suggests, there is not really an infinite number of ways to formulate an idea. For example, 2 people could have lived a similar life, and written 2 identical plays, per the infinite monkey theorem.
It is hard to rate that similarity. Clearly, identical texts are completely similar. However, is there such a thing as 2 formulations of the same idea which have absolutely no similarity? Still, even if evaluating similarity is at least difficult, we agree that there are ways to reformulate an idea which are not similar enough to constitute plagiarism.
But how should we determine whether a formulation is plagiarized, if similarity can't be counted? In fact, as "When there are a limited number of ways to say the same thing" shows, similarity is not the problem - derivation is. I believe copying is what matters: if a formulation was copied from another one, it is plagiarism, no matter how many modifications were done. I understand that in theory, at one point, a copy, if it has enough modifications, could end up as far from the original as a random new formulation. But these won't be called "close paraphrases".
So, I consider that a close paraphrase is a derivative work, and should be avoided. A very close paraphrase is obviously a derivative work, and should obviously be avoided. That being said, if you still think the "very" is essential, I offered 2 concrete cases in which I used the template without the "very". What is your stance on these? Would you have put the "very" in these cases? If not, do you consider the paraphrases were not a plagiarism issue? --Chealer (talk) 20:27, 10 March 2012 (UTC)
There is a difference between plagiarism and copyright concerns, both legally and locally. Both of the examples are about copyright concerns, not about plagiarism, although the plagiarism guideline is mentioned. A derivative work is a very specifically and legally definable thing; to borrow our article's definition, "a derivative work is an expressive creation that includes major, copyright-protected elements of an original, previously created first work (the underlying work)." A paraphrase that is not very close may not be a derivative work, if the copyright-protected elements being used are de minimis or if the use is not close enough to qualify as substantially similar. If the similarity does not rise to the point where copyright is in question, or if the content is public domain, neither example is really appropriate for use without major overhaul. --Moonriddengirl (talk) 22:00, 10 March 2012 (UTC)
Absolutely, de minimis may apply to a close paraphrase which is not very close... just like it may apply to a very close paraphrase. Anyway, there is an interesting point here. Should the examples mention the plagiarism guideline if the examples are only supposed to be used for copyright concerns? --Chealer (talk) 17:26, 11 March 2012 (UTC)

Optional part: more examples

The template used to say

This is an example, there are other passages that similarly follow quite closely.

I made that part optional, however Moonriddengirl "think[s] it's appropriate because, while a single passage of close paraphrasing can represent a problem worthy of addressing (depending on the size of the passage and proportion to the source and article), close paraphrasing concerns can be overstated.". However, I do not think that the number of paraphrases impacts the close paraphrasing's severity. What really matters, I believe, is the total extent of close paraphrasing (the length, if you wish). That's really what Moonriddengirl's parenthesis says - the number of close paraphrases matters if one assumes that each close paraphrase has a constant length, but that's not the case. Basically, a single close paraphrase could be illegal, if long/important enough. I am not saying that close paraphrasing concerns can't be overstated. In fact, I think removing the more examples part avoids overstating concerns. --Chealer (talk) 21:00, 9 March 2012 (UTC)

I don't think the issue is so much specifying what use of the materials is acceptable, but rather merely suggesting options via bolding and/or square brackets. After all, the users can customize it as they see fit. For the user name, article title, etc. bold is nice. In a perfect world, it would be best if a user who's familiar with the tool does point out all the problems they can spot. The “Duplication Detector report” doesn't seem to be geared toward general readers/editors, but rather specialized users, who know what to make of it. Developing good judgement of what's “borderline” would come with practice, but theoretically, you should feel comfortable hazarding a guess as to what borderline close paraphrasing means. It's not a distinctly legal matter, that's WP:COPYVIO.
From your first time in Jan, was there any additional evidence apart from the user's initial upload? If not, the speculations on the Royalties talk page seem harsh. Taking into consideration that the user isn't currently active, is it an appropriate good faith inference?
Parts of this article, including the Software royalties section, are closely paraphrased. This section was added in [2]. I did not investigate all the other edits of the contributor behind that edit, but since he contributed almost half of the article's edits, this problem is probably not isolated to that section. --Chealer (talk) 22:52, 30 January 2012 (UTC)
Also, not that it was unjustified, but was it necessary to lower the article rating simultaneously? A certain detachment would lend itself to an impression of objectivity, helpfulness and authority. March 1, the second user made corrections to Metaphysical naturalism and asked “is that better?” They'd probably like you to review it and remove the banner, if it passes muster.—Machine Elf 1735 02:35, 10 March 2012 (UTC)
The example didn't used to say this. It says it. :) MachineElf is quite right that my main issue is with suggesting options, in bold and in brackets. I do not believe it is beneficial to complicate these examples, and your own experiences, Chealer, in forgetting to update something explain why. People should be able to use them quickly and easily, but they always have the option to remove or alter anything that is inappropriate...and should. That said, I believe that the existing language is better as a base for the reasons I've stated above as well as the fact that in some four years of working copyright problems on Wikipedia, I find the shotgun issue with close paraphrasing is far more prevalent than single passage issues. I have very seldom had to remove the "other passages that similarly follow too closely" language since I created the template for my own use in my user space in 2009 (based on messages I had left many times.) --Moonriddengirl (talk) 15:03, 10 March 2012 (UTC)
Hum, if your main issue is with suggesting options in bold and in brackets, how would you prefer the options be suggested? I don't understand how my experiences explain why it wouldn't be beneficial to "complicate the example". I precisely forgot because the optional parts were not marked as options. I agree that people should be able to use the examples quickly, and for me, that says they should quickly see which parts will need adaptation, and which will likely need adaptation. Which "shotgun issue with close paraphrasing" are you referring to? --Chealer (talk) 20:23, 10 March 2012 (UTC)
I would prefer that options be suggested outside of the examples, in supporting text. The language inside the examples should be formulated to best and common practices. --Moonriddengirl (talk) 21:36, 10 March 2012 (UTC)
Hum. I have very reluctantly implemented that. I am absolutely convinced that it would be better to have the optional parts inside the examples to stay optimal, and only chose this avenue in the interest of consensus. --Chealer (talk) 17:42, 11 March 2012 (UTC)
I do not support your changes at all, as you make the default example too weak, in my opinion. I have repeatedly expressed that I believe that the examples should default to the typical case and the ones in which they are most likely to be needed. Given, for instance, that several have supported the inclusion of the word "very" on this page and you are the only one to support its removal, your reading that this is "more consensual" is puzzling. I have restored the material as it was and am willing to discuss other options, but would hope that you might stop making changes to the example before having some reasonable expectation that others will agree with them. --Moonriddengirl (talk) 19:28, 11 March 2012 (UTC)
You stated yesterday that you "would prefer that options be suggested outside of the examples, in supporting text.", which is what I implemented today, thinking it would be more consensual. If that's not actually what you prefer, I'm back to my question, how would you prefer the options be suggested? Note that I never supported the removal of the "very" part. --Chealer (talk) 21:59, 11 March 2012 (UTC)
Additional evidence for what? I'm not sure what speculations/inference you are referring to. It was not necessary to do both changes simultaneously, of course.--Chealer (talk) 20:23, 10 March 2012 (UTC)

Modification of both examples

These examples were generated to give contributors guidance in talking to other editors about close paraphrasing concerns where problems are sufficient enough to warrant tagging of the article, either by blanking with {{copyvio}} or with {{close paraphrasing}}. The section above contains some of the background of the conversation. The specifics at this point seem to revolve around whether or not the examples should be altered to default to use on a single passage of close paraphrasing (by removing the current text "This is an example, there are other passages that similarly follow quite closely") or altered to embrace paraphrasing that may not be as close (by eliminating the term "very" from "very closely paraphrased"). Additional input in this conversation would be welcome to help establish consensus. (While there isn't exactly a noticeboard for this kind of thing, I have left notices at Wikipedia:Plagiarism, Wikipedia:Copyright problems and Wikipedia:Copyclean inviting others to help establish consensus.) --Moonriddengirl (talk) 19:52, 11 March 2012 (UTC)

  • Comment I think it's important to avoid creating an environment where articles are tagged without sufficient reason and people instructed they must revise their content "[S]o that we can be sure it does not constitute a derivative work" where concerns may be minimal. There are certainly circumstances in which these examples should be modified - when I originally wrote them, I didn't turn them into templates for that reason. We should remain flexible in use particularly on a subject as sensitive and, to me, important as this one. While there does exist the possibility that a significant problem will occur in a single passage of close paraphrasing or that not-so-very-close paraphrasing may be an issue if extensive, I am uncomfortable with these examples being potentially watered down to encourage their use where minimal issues exist. If a single instance of "not very" close paraphrasing occurs, it may warrant discussion with the contributor, but not, in my opinion, the tagging of the article or the rest of the text in the examples. --Moonriddengirl (talk) 20:06, 11 March 2012 (UTC)
    I believe a single close paraphrase may warrant the rest of the text in the second example, even if it has superficial changes. --Chealer (talk) 22:22, 11 March 2012 (UTC)
  • Agreed with Moonriddengirl on all points, per my comments above. BTW, I think you missed a "not" in "problems are sufficient enough" and one in "embrace paraphrasing that may be as close".—Machine Elf 1735 20:31, 11 March 2012 (UTC)
  • Actually, the problems are meant to be sufficient enough to warrant tagging. :) The first ones, by blanking; the second one, by {{close paraphrasing}}. But right you are on the second not! I've taken the liberty of repairing it. Thank you! --Moonriddengirl (talk) 20:44, 11 March 2012 (UTC)
  • Comment - Thanks. Note that I do not think anyone is suggesting to alter the examples "to default to use on a single passage of close paraphrasing (by removing the current text 'This is an example, there are other passages that similarly follow quite closely')". I think the only suggestion is to make that part optional. --Chealer (talk) 22:22, 11 March 2012 (UTC)
  • Then why not say outside of the template something like, "If the concerns relate to a single, extensive closely paraphrased passage, the text that reads 'This is an example; there are other passages that similarly follow quite closely' may be removed"? I wouldn't object to that (although in this case it's important to me to note that the single passage should be extensive to avoid people tagging articles over a single brief passage of close paraphrasing, which is almost certainly inappropriate, and I wouldn't support the inclusion of such text if some language were not introduced to note that the passage should be extensive, if sole). I do object to making the example default to a single instance of close paraphrasing. I do not wish to see these tags subtly encouraging use for minimal concerns. (We do need to fix the punctuation, though; that's a comma splice. I don't have any problems with "This is an example", since that was removed from the earlier verbiage (no issues with that, either), but it should be joined to the following independent clause with a semicolon or a colon. :)) --Moonriddengirl (talk) 21:34, 12 March 2012 (UTC)
    Explaining that copyright concerns need to be extensive might be an improvement, although "extensive" is relative, and there is no actual "minimal extensiveness". In any case, this is specific to copyright concerns. To come back to the general plagiarism concerns, if people want to separate both concerns and the purpose of the current templates is limited to copyright concerns, I guess we can just create a template for plagiarism. Note that we are discussing communication here, not tagging. --Chealer (talk) 17:07, 20 March 2012 (UTC)
  • Perhaps, the good news is… "This appears to be the full extent of the problem, but it seems substantial enough to warrant your attention." ?—Machine Elf 1735 00:05, 13 March 2012 (UTC)
    I do not understand the question. --Chealer (talk) 17:07, 20 March 2012 (UTC)

Research paper on generating non-close paraphrases of text from Wikipedia

Not sure whether it could ever contribute to alleviating the problem, but it seems worthwhile to mention this recent research paper which proposes a method "to collect good-quality paraphrases in a cost-effective manner" which "maximizes the lexical divergence between an original sentence s and its valid paraphrases [i.e. finding the least close paraphrase which still has about the same meaning] by running a sequence of paraphrasing jobs carried out by a crowd of non-expert workers", demonstrating it on sentences taken from Wikipedia. As a sample, two of the "good" examples (p2663):

In the face of demand for higher fuel efficiency and falling sales of minivans, Ford moved to introduce a range of new vehicles, including ”Crossover SUVs” built on unibody car platforms, rather than more body-on-frame chassis.

became

Ford’s introduction of a new range of vehicles (like ”Crossover SUVs”) that were built on unibody car platforms instead of the body-on-frame chassis, was in response to both plunging minivans sales and demands for greater fuel efficiency.

and

The Gates of Alexander was a legendary barrier supposedly built by Alexander the Great in the Caucasus to keep the uncivilized barbarians of the north (typically associated with Gog and Magog) from invading the land to the south.

became

To prevent the uncivilized barbarians from the north (who were typically associated with Gog and Magog) from overrunning the land to the south, Alexander the Great is thought to have built, in the Caucasus, the legendary barrier referred to as the Gates of Alexander.

(Found during the preparation of the Wikimedia Research Newsletter although I don't know whether we will cover the paper there.)

Regards, Tbayer (WMF) (talk) 18:36, 29 May 2012 (UTC)

Macmillan Co. v. King

This century-old case is not very relevant. a) It was written long before the modern fair use rules were adopted. b) It deals with a tutor who made a condensed version of a two-volume 1000-page textbook so that students taking the course could learn the material without buying the book. The judge said it "resulted in an appropriation by him of the author's ideas and language more extensive than the copyright law permits." He found that the notes "constitute 'versions' of substantial portions of the book" (decision). Rjensen (talk) 11:03, 3 June 2012 (UTC)

How to section

Like others I use and refer to this and its relatives ([quotefarm] (essay), [copyvio] (policy), [copyright] (policy), [plagiarism] (guideline), [copypaste] (infopage)). I do have one observation. The RFC below probably isn't the right place for feedback comments, so it gets its own section.

The how to examples, after dispensing advice to just stick to the material facts, concentrate on advising use of multiple sources. That's all well and good where multiple sources exist on what's being written about. Unfortunately many topics or certainly aspects of topics don't have that advantage. The article subject itself nonetheless sufficiently passes notability guidelines.

It's common with niche topics or aspects of them especially, to find one source that covers something. In articles about smaller topics, there may be one source that has a little about one aspect, another source only has a little about a completely different aspect and so forth. As the editor must piece together single sources on disparate aspects throughout the article or section it's inevitable they run into difficulties.

If they fall back to quoting widely, they risk the page becoming a quotefarm, a copyright violation. There may not be another source that addresses that aspect. They come up against obstacles like, and particularly when dealing with small parts of source text, how there's only so many ways to rewrite a small sentence/paragraph or the passage uses technical terms which can make the task harder, etc. The subject used in the how to examples is slave narratives of the Federal Writers' Project, a widely-studied area. It's lovely to have it as an example I think, but hard to apply it to narrower topics. --92.6.202.54 (talk) 22:40, 3 June 2012 (UTC)

Or another way to look at it is that for whatever the subject is, it turns out that only one other person ever anywhere chose to take a decent and scholarly interest, spent years of their life carefully researching a subject, did all the work of making it publishable - and now rather than repay them for the work, we need better reasons to give it away for free? It is of course more complicated than that, but I've seen the situation arise so many many times, when there is just the one good source, what "right" do we have to appropriate the work in order to have a decent Wikipedia article? Franamax (talk) 00:04, 4 June 2012 (UTC)

RfC: Should this be a guideline?

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
No consensus. Firstly, an RFC is an appropriate venue to gauge community consensus, and the community certainly provided a variety of arguments most of which were strong. The discussion below certainly supports the {{infopage}} labelling, which is described in Wikipedia:Policy as being equivalent to an essay in stringency of development of structure. But the community would only go so far as to support doing so with significant cleaning and harmonisation of this page. Some arguments put were unfortunately specious, such as making a WP:CREEP argument with no content other than the undesirability of creep in general. The quality of such arguments ought to be improved by providing specificity on why CREEP is specifically undesirable in this case. For example, other editors made strong and specific arguments that the current status of this document suits its position and function within a hierarchy of documents discussing appropriate editing. While editors certainly agree that the material contained within this document is worthy, they simultaneously agree that it is unpolished. Some arguments go further and put that the document can never be sufficiently polished to meet standards of more clearly enumerated documents on editing behaviour as the topic covered is fundamentally incapable of being enumerated. While editors made strong arguments regarding the worth of this document, and its impact and usefulness, these arguments were countered by equally strong arguments basically putting that an essay is a worthy document to convey information of high impact and usefulness. Finally, a number of editors noted that the document resembles the actual living consensus of the encyclopaedic community in this terrain, a point worth considering as editors improve this document and consider whether to proceed towards policy, guideline, infopage or essay status. (Thank you for your wonderful and clearly put arguments) Fifelfoo (talk) 02:17, 12 July 2012 (UTC)

Should Wikipedia:Close paraphrasing be a guideline rather than an essay? Dpmuk (talk) 18:19, 2 June 2012 (UTC)

  • Yes: this is an important and high-impact topic, and should be more widely recognized. Nikkimaria (talk) 18:23, 2 June 2012 (UTC)
  • Yes - it should be seen as an official supplement to our copyvio policy. Thanks Dpmuk for paying enough attention to my suggestion to start this RfC. Dougweller (talk) 18:33, 2 June 2012 (UTC)
  • Yes, I'm sure there are a few others like me who've just assumed that this is a guideline! This definitely needs to be part of our copyvio policy/guideline set. —SpacemanSpiff 19:06, 2 June 2012 (UTC)
  • Yes - per Nikkimaria. Truthkeeper (talk) 20:46, 2 June 2012 (UTC)
  • Yes, it's a central copyvio issue that desperately needs to be more recognized. The essay has been worked over and refined by several editors and has been stable for quite a while now. Siawase (talk) 21:36, 2 June 2012 (UTC)
  • Yes – definitely. Long overdue and very necessary, especially given the scale of the problem (I clerk at Wikipedia:Copyright problems.) As per Siawase, this "essay" has been carefully written and refined by several editors with great expertise in this area, most notably Moonriddengirl and Dcoetzee, and has been stable for quite some time. Voceditenore (talk) 09:06, 3 June 2012 (UTC)
  • Oppose at this time, and possibly always, given the advisory nature of the material. A guideline needs to set out clear parameters on what is covered and what is not, how situations should be handled, etc. I'm not reading the necessary focus in what is currently here. It's certainly part of my standard trio ("please review our policy on [copyright], our guideline on [plagiarism] and our essay on [close paraphrasing]"), the concept is highly relevant - but this doesn't read to me like a guideline. After some clause-by-clause review maybe, but not now. Franamax (talk) 19:22, 3 June 2012 (UTC)
  • Comment I think we need a guideline. If this document could be improved, I'd be all for improving it. That said, I routinely use it in spite of its essay status. --Moonriddengirl (talk) 19:42, 3 June 2012 (UTC)
  • Comment I don't think anyone disagrees that a close paraphrase can be considered a copyright violation. I think the issue is more about when does content become a close paraphrase. My alternative suggestion to this essay becoming a guideline is that we make it clear at WP:Copyright violations that close paraphrasing can count as a copyright violation, and that this essay is used to help determine when content is or is not a close paraphrase. If my alternative suggestion is not accepted by the community consensus, then I would accept the primary proposal that this essay be turned into a guideline. Singularity42 (talk) 21:38, 3 June 2012 (UTC)
  • No, it's too much in flux. It's a mistake to think that because avoiding close-paraphrasing is important therefore this page ought to be given elevated status. Between this Talk page and WT:Plagiarism there's ongoing confusion about what/when things should be addressed under that page, versus under this. There are sections above about modifying the examples and their presentation. Once a brief definition has been given, examples are probably the most important part of this document, essential to training editors how to self-avoid the problem as well as being a resource to point others to. A proposed-guideline needs to be stable (settled), clear, concise and effective. The proposed-guideline—not merely the behaviour or acts it prohibits—should be supported by broad consensus agreement. When fundamental unresolved issues on its content exist, it can't be said to be stable or have consensus on its efficacy. --92.6.202.54 (talk) 22:40, 3 June 2012 (UTC)
  • Oppose. The guideline at present does not help an editor tell what is and what is not "too close". There is an ambiguity: "too close" to the original words (bad) and "too close to the original ideas" (good). I have trouble with section 4's statement "in many cases close paraphrasing of a non-free copyrighted source is likely to be an infringement of the copyright of the source." Does that also mean "in many cases close paraphrasing of a non-free copyrighted source is likely to NOT be an infringement of the copyright of the source"? There are no citations to actual copyright legal cases even though section 4 says, "Wikipedia's primary concern is with the legal constraints imposed by copyright law". That apparently means fair use law, but the four criteria for legal fair use are not even mentioned. Rjensen (talk) 00:50, 4 June 2012 (UTC)
I recommend no. It starts off "Yes. Among other rights, copyright law grants a copyright owner exclusive control over any unauthorized copying of the copyrighted work." That's false; the law does NOT say that. It says "Subject to sections 107 through 122, the owner of copyright under this title has the exclusive rights..." (section 106). The "subject to" matters: section 107 is fair use, and that's central. Section 107 says "the fair use of a copyrighted work, including such use by reproduction in copies...for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright" (section 107). Rjensen (talk) 10:23, 4 June 2012 (UTC)
That is kind of a wonky sentence. :) "exclusive control over any unauthorized copying" - what does that even mean? Maybe she meant something more like "exclusive right to authorize copying"? (Which still wouldn't mention the exclusions, but at least would make sense.) I have to admit I'd glossed over that one in reading the document. However, we don't have to take all of the language. There could be valuable content to be mined - she does discuss fair use -- and there is case law cited if, unfortunately, not linked. It could be worth looking at those cases to see how directly they apply. --Moonriddengirl (talk) 11:14, 4 June 2012 (UTC)
  • Yes - But it does need a little work to make it conform to the style of other guidelines. It covers behavior that isn't the same as wholesale infringement yet still has legal ramifications as it is a form of copyright infringement. It should be elevated to a guideline to allow a better understanding of the importance of the issue, and facilitate enforcement of copyright problems in general. Dennis Brown - © 13:39, 9 June 2012 (UTC)
  • Yes Per Dennis Brown. Dusty777 23:23, 9 June 2012 (UTC)
  • Comment: I got the RfC bot notice. I was not aware of this essay or the content guideline Wikipedia:Plagiarism. I don't have a strong opinion against making it a guideline as an extension of Wikipedia:Plagiarism. I don't know if that will help solve the problem I do run into too often when I check references in some new article and find people just copied text word for word from the source. (Usually way after it happened and not worth checking who did it.) Perhaps more publicity about these guidelines/essays is in order. CarolMooreDC 01:36, 13 June 2012 (UTC)
  • Not yet. Nikkimaria asked me to comment. At this stage in the project life cycle it will be extremely difficult to make a major change such as this, and a half-baked attempt will make it that much harder. A few random thoughts:
  • The essay needs to be carefully reviewed and adjusted to ensure that it is consistent with other policies and guidelines. For example, the following statement in the lead is incorrect: "Wikipedia's copyright policy ... forbids Wikipedia contributors from copying information directly from other sources". Inaccuracies and inconsistencies such as this will be pounced upon by those opposed to promotion.
  • A guideline must carefully steer clear of the law-versus-morals debate. For every editor that passionately feels that close paraphrasing is morally wrong, there is another that feels that attempts to impose moral standards beyond those required by the law may lead to a Taliban-like form of self-censorship that could be extremely disruptive to the project. There are hints in the essay that it leans towards the "moral" side of this debate.
  • The section on where close paraphrasing is permitted needs to be expanded. For example, a statement like, "Smith found the waterfalls both stunning and inspiring" may be a sensible alternative to a direct quotation. It may be the only option when the source says, "In a letter to his sister, Smith said he found the waterfalls both stunning and inspiring". Placing the key words in quotes as, "Smith found the waterfalls both 'stunning' and 'inspiring'" would make Fowler turn in his grave.
  • In general, as in the example above, a guideline should make it much easier to distinguish the many cases where close paraphrasing is correct, or at least fully acceptable, from the many other cases where it is not acceptable at all. In its present form the essay is decidedly confused.
  • A guideline should make it clear that the problem arises whenever creative expression is copied, and cannot be fixed by simply shuffling word sequence and substituting synonyms. The following example from Salinger v. Random House, 811 F.2d 90 (2d Cir. 1987), found to be an infringement, should be dissected in detail:
  • (Source) He looks to me like a guy who makes his wife keep a scrapbook for him.
  • (Close paraphrase) [Salinger] had fingered [Wilkie] as the sort of fellow who makes his wife keep an album of press clippings.
A guideline would definitely be useful, but this essay is not yet ready for prime time. Aymatth2 (talk) 02:05, 13 June 2012 (UTC)
  • No: We already have a copyright policy. Anything that needs to be said about this topic that needs to be binding should be said there. Also, an RfC is not the proper venue for guideline proposals; use {{Proposal}} and post to WP:VP/PR next time. I also have to agree with Aymatth2 that this essay leans toward moralizing instead of simply sticking to the applicable law (and disagree that a new guideline on this would be helpful, rather than ensuring the topic is covered by existing policy). Basically, something like this cannot be written accurately by someone who is not an attorney specializing in intellectual property. Wikimedia Foundation's own legal counsel keep a close eye on WP:COPYRIGHT, so that would again be the most appropriate place to cover this, to the extent that it needs to be covered. — SMcCandlish   Talk⇒ ɖ∘¿¤þ   Contrib. 17:24, 13 June 2012 (UTC)
    With respect to your statement that "an RfC is not the proper venue for guideline proposals; use {{Proposal}} and post to WP:VP/PR next time" I'd like to point out that Wikipedia:Policies and guidelines#Proposals, which is itself a policy, states "Most commonly, editors use a Request for comments (RfC) to determine consensus for a newly proposed policy or guideline, via the {{rfc|policy}} tag." Dpmuk (talk) 18:44, 13 June 2012 (UTC)
    Not to mix my hats, but "Wikimedia Foundation's own legal counsel keep a close eye on WP:COPYRIGHT"? What do you mean? I work for those attorneys, and I'm pretty sure they don't keep an eye on it at all. :/ Community policies are left to community to develop, although I don't doubt that if the community tried to make sweeping changes to one of our legal policies the Foundation would be asked by somebody to weigh in. (At legalquestions wikimedia.org, maybe. <-shameless plug, new address>) --Moonriddengirl (talk) 13:30, 17 June 2012 (UTC)
  • Guideline maybe, but not policy - And it would need to be made clear that this guideline in no way supersedes copyright policy, and further that this guideline should be edited to address the concerns above so as to keep it in line with the extant policy (not to mention various laws). If this is not - or cannot - be done, then oppose. - jc37 23:55, 16 June 2012 (UTC)
  • No. Sure, close paraphrasing is an issue, but this essay is poorly written and not clearly defined. In my opinion, "close" in this regard cannot be unilaterally defined. What equates to close paraphrasing for one person, editor, and admin is not the same across the board. We need consistency, but it will never happen with this essay. Focus on what works. This doesn't. Best regards, Cindy(talk to me) 01:52, 17 June 2012 (UTC)
  • If "close" in your opinion cannot be unilaterally defined, how would you propose consistently offering guidance on close paraphrase? Since it is a matter of legal compliance, we can't pretend there isn't a necessary standard to which we must adhere. Regardless of what happens to this essay, too closely following your source is already against policy at WP:C. Providing people information on what following too closely means and how to avoid it seems like a good idea, since people have been blocked for this very thing. --Moonriddengirl (talk) 13:30, 17 June 2012 (UTC)
  • "Close" paraphrasing has been defined by various courts. One definition is that it copies the "association, presentation, and combination of the ideas and thought which go to make up the literary composition." That is, much more than just the wording. But also less, because using obvious words to state facts does not count as a copy. I am sure the Wikipedia legal counsel could give us their preferred definition for this essay. I suspect the lawyers would prefer to keep advice and examples down at the essay level, with the policy and guidelines limited to the legal minimum. Aymatth2 (talk) 18:27, 18 June 2012 (UTC)
  • I think that's the difference between an essay and a guideline. An essay can talk in general terms, use examples, etc. to give advice, whereas a guideline would need to be more prescriptive (or more clearly descriptive I guess). Cindamuse is right in that there will always be cases where reasonable people do differ on whether something is or is not a close paraphrase, so IMO a guideline would need to spell out how a consensus decision can be reached, whereas an essay doesn't need that level of detail. And actually, reading the "Addressing" section just now, if it's not a copyright problem, why would we be tagging it as a close paraphrase anyway (just asking)? Franamax (talk) 16:44, 17 June 2012 (UTC)
  • We already have a guideline, which is Wikipedia:Plagiarism, and a policy, which is Wikipedia:Copyrights. The rules we already have are correctly arranged, with copyright (as a legal issue) set up as policy and plagiarism (as an academic discourtesy) set up as a guideline. This content is very worthwhile, but it's not necessary to have a separate rule about close paraphrasing when we can move the content into our two existing rules instead.

    Basically, I'm saying this content should be a guideline but not a separate guideline.—S Marshall T/C 19:32, 19 June 2012 (UTC)

  • Oppose, as rule creep. What needs to be said to observe copyright as a legal matter (i.e., no excessive quotations or near-quotations) can be said briefly in the copyright policy. Encouraging editors to depart as far as possible from the source's wording can also be counterproductive, because changes in wording can easily distort the source's meaning, which we must avoid as a matter of factual correctness and per WP:NOR. When in doubt, I prefer that we err on the side of too-close paraphrasing (if properly attributed!) rather than on the side of imprecise or awkward wording.  Sandstein  17:19, 20 June 2012 (UTC)
  • Mild oppose. As with others who have opposed here, I certainly agree that it is important to spell out when paraphrasing is or is not acceptable. And I'm in favor of doing more to send that message, loud and clear. My concern, however, is that the essay in its present form has a lot of how-to advice that really isn't appropriate to a guideline, more consistent with an essay. --Tryptofish (talk) 21:11, 20 June 2012 (UTC)
  • Oppose I agree with everything that Sandstein says above. In addition, this will tend to create a whipsaw which would harm the project: if a text is close to sources then it's close paraphrasing; if it departs from the sources too far then it's OR. Warden (talk) 17:25, 21 June 2012 (UTC)
  • Oppose - Nothing like a couple more thousand words of written policy to make WP even more difficult for newcomers. We've survived this long without this bloated essay as an official guideline... Carrite (talk) 00:31, 22 June 2012 (UTC)
  • Oppose Like Carrite above, I think we already have a proliferation of guidelines and policies. We already have WP:COPYVIO and WP:Plagiarism, which address "close paraphrasing". I would be supportive of content changes within those policies in the spirit of what TransporterMan proposes below. Gigs (talk) 16:05, 22 June 2012 (UTC)
  • Oppose This essay is just in too crude a state to become a policy. I agree with the avoidance of plagiarism via close paraphrasing, but the examples are too sparse for me to apply this policy to a given alleged incident and decide if the paraphrasing is too close. Often a sentence in an article contains facts: the rivers in a continent, the new elements a scientist discovered, the components of a dynamo, the schools a person attended, the main exports of a country, the films a producer made, the teams a pro sportsman played for. It would be too easy to jump on someone for "excessively close paraphrasing" if her Wikipedia edit included such facts in a logical or chronological order as stated in the reference. The defense mentioned in the essay that "there are only so many ways to say some things" is unlikely to satisfy those who accuse others of plagiarism. It seems to fall back on the old definition of obscenity: "I know it when I see it," when it is something less egregious than cut and paste with a couple of words changed. Come back when you can provide something like a bright-line test which I can apply to someone's writing. Alternatively, the essay/guideline/policy could provide an extensive set of examples: an original text; "A", which is just barely too close a paraphrase; and "B", which is just original enough to satisfy the policy, where both are variations on the original text. The present two examples are not sufficient. This proposed policy could be used to harass productive editors who create and improve articles, when someone gets into an editing dispute with them, just on the basis of "It's too close because I say it's too close." Edison (talk) 16:48, 22 June 2012 (UTC)
  • Oppose - Per S Marshall. We should retain the 3-tier document hierarchy: top-level policies (WP:Copyright); Detailed guidelines (WP:Plagiarism); and then informal essays (WP:Close paraphrasing). Essays, like this one, are a good home for tons of examples and - sometimes - humor. On the other hand, if there are a couple of important tidbits in this essay, they should be moved into the WP:Close paraphrasing guideline. --Noleander (talk) 00:26, 26 June 2012 (UTC)

Proposal by TransporterMan

What is lacking is that Wikipedia:Copyrights never says that close paraphrasing — using that term — is a copyright violation unless done with attribution in a fair-use manner. The closest it comes is in the WP:COPYOTHERS section, when it says,

Note that copyright law governs the creative expression of ideas, not the ideas or information themselves. Therefore, it is legal to read an encyclopedia article or other work, reformulate the concepts in your own words, and submit it to Wikipedia, so long as you do not follow the source too closely. (See our Copyright FAQ for more on how much reformulation may be necessary as well as the distinction between summary and abridgment.) However, it would still be unethical (but not illegal) to do so without citing the original as a reference.

The Copyright FAQ then does little more than repeat the same thing. I would propose changing the quoted language to

... so long as you do not follow the source too closely. (Following the source too closely is called [[WP:PARAPHRASE|close paraphrasing]].) ...

With that in place to clearly define policy, I would then recommend upgrading this page to an {{infopage}} rather than a guideline, which would mean that it need not be quite as precise as a policy or guideline.

Finally, just in passing, let me note that the Plagiarism guideline only indirectly has anything to do with copyright issues; plagiarism is largely an ethical issue, copyright is largely a legal issue. The plagiarism article uses the term "close paraphrase" correctly but in an unfortunate manner in that if skimmed or read carelessly it can be read to imply (though it says otherwise, if you read all the way through it) that close paraphrasing is acceptable (which, indeed, it is with proper attribution in fair-use size chunks, but so is direct copying with ditto in ditto).

Best regards, TransporterMan (TALK) 21:19, 19 June 2012 (UTC)

Perhaps the problem is that this essay is too broad in scope and should be split. I see two very different topics:
  • The Rules: What "close paraphrasing" means, and when it is encouraged / allowed / discouraged / forbidden
  • How To: Avoid, detect, deal with inappropriate close paraphrasing
I am not sure "The Rules" should be a separate article. I would not merge them into WP:Plagiarism since the only connection is that paraphrasing is a form of copying, so should be properly attributed. The Rules may be better merged into Wikipedia:Copyrights, or possibly made into a sub-page of some sort within the Wikipedia:Copyrights family. The "How To" article should reference "The Rules" with a very short summary to avoid forking. Then it could be expanded with a lot more examples. An Infopage seems reasonable. A possible sequence:
  1. Ask for consensus on the Wikipedia:Copyrights page to open a new section in the policy statement discussing close paraphrasing, holding "The Rules"
  2. Assuming agreement, implement the new section and let it stabilize
  3. Cut that material out of this essay, replacing with a very short summary and a pointer to the new Wikipedia:Copyrights section
  4. Upgrade this essay to an infopage
Aymatth2 (talk) 15:20, 20 June 2012 (UTC)
  • I support TransporterMan's proposal, for similar reasons to what's in my initial comment in this RFC. Singularity42 (talk) 17:12, 20 June 2012 (UTC)
  • Not a bad idea. Though I think it should be noted somewhere that several editors (as I read on this page) felt that the page and its wording could use some cleanup for clarity and accuracy. - jc37 17:19, 20 June 2012 (UTC)
  • Close paraphrase is already a prohibited form of copyright violation under existing guidelines and practice, as Moonriddengirl etc. can attest. Having an essay which explains close paraphrase to those unclear on the concept which can be pointed to is useful. Carrite (talk) 13:54, 22 June 2012 (UTC)
  • Oppose - The essay is (and, because of its nature, will probably always be) quite ambiguous. There is no definitive way to determine what is and is not close paraphrasing and, while it is good advice for editors to follow, making it a guideline suggests some level of enforceability. It is a good essay and good advice to follow, but need not be given the authority of being a guideline. ItsZippy (talk • contributions) 17:08, 24 June 2012 (UTC)
  • It would be a mistake for contributors to think that it is not already enforced. People are blocked for persistent close paraphrasing and have been for years. Likewise, people are blocked for violating guidelines and policies related to misuse of non-free media content, even though these are also ambiguous. It would be awesome if copyright law were straightforward, but it's not - and we're in the position of having to enforce it anyway. --Moonriddengirl (talk) 11:19, 25 June 2012 (UTC)

Tolerance

Close paraphrasing is, by its nature, always going to be quite debatable and tentative. Today in the news, Jimbo is pushing back against a draconian copyright action and has made public statements such as "The internet as a whole must not tolerate censorship in response to mere allegations of copyright infringement." We ought to follow his lead and take a correspondingly lenient and tolerant position. Warden (talk) 14:43, 25 June 2012 (UTC)

Generalizing the circumstances of that case would probably not be advisable. Besides, there's no need for us to "follow his lead" any more than anyone else's. Nikkimaria (talk) 18:48, 25 June 2012 (UTC)
They aren't "mere allegations" if we create paraphrasing policy that is too lenient and allows editors to insert violations. I think you are reading too much into the situation Colonel Warden. Gigs (talk) 20:20, 25 June 2012 (UTC)
Jimmy also said, "Copyright is an important institution, serving a beneficial moral and economic purpose." This is a question of jurisdiction, akin to National Portrait Gallery and Wikimedia Foundation copyright dispute. It doesn't change the legal reality that we are governed by a court system that has found close paraphrasing infringement. --Moonriddengirl (talk) 12:07, 26 June 2012 (UTC)

There is a risk that this essay could be seen as teaching editors how to conceal copying. Amateurish attempts to avoid close paraphrasing may also violate moral rights, to the extent that they are protected in the United States, where close paraphrasing would not. The sloppy wording in this essay is dangerous. It is much more important to ensure that editors understand and apply the principles of fair use. Aymatth2 (talk) 00:52, 26 June 2012 (UTC)

Which particular sloppy wording do you find dangerous in this respect, as applies to moral rights as they exist in the United States? I'd be interested in hearing more of your thoughts on that. --Moonriddengirl (talk) 12:07, 26 June 2012 (UTC)
For example, the subsection "Brief indirect quotation of non-free text" is the only part of the essay that discusses acceptable close paraphrasing other than copying of free content or obvious expressions of fact. It says indirect quotation is only allowed when it is brief, is not markedly creative and has in-text attribution, but direct quotation is preferred. Here is an example of a problem that could result:
The New York Times reports a speech by Senator X with a careful close paraphrase: "The Senator said that Mexicans could never make good American citizens". The essay seems to imply that the preferred way to copy this is "According to the New York Times, 'the Senator said that Mexicans could never make good American citizens'"[39] That is very clumsy, so the editor attempts a loose paraphrase: "The Senator expressed hostility to Mexicans."[39] Wikipedia has misrepresented what the New York Times said and we have defamed the Senator.
By failing to elaborate the underlying copyright and moral rights principles, and by making an excessively strong case for avoiding close paraphrasing, the essay may create problems that could be avoided with more careful wording and more extensive examples. It is not ready to become a guideline. Aymatth2 (talk) 13:07, 26 June 2012 (UTC)
I don't believe the essay makes an excessively strong case for avoiding close paraphrasing. :) WP:NFCC says, "There is no automatic entitlement to use non-free content in an article or elsewhere on Wikipedia. Articles and other Wikipedia pages may, in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author, and specifically indicated as direct quotations via quotation marks, <blockquote>, or a similar method." This is consistent with our approach to fair use images, which are marked to permit reusers from around the world to assess the usage within their local jurisdiction. There is no other way than by using explicit quotation marks to indicate that text is being claimed under fair use. That said, the example you supply is brief, so it's hard for me to imagine how the issue could apply to lengthy text, since we should not be closely paraphrasing that much from a single source anyway; with that example, "According to the New York Times, the Senator told attendees at [event] on [date] that Mexicans could never be good citizens of the United States." Brief extent of limited close paraphrase, contextualized and neither inaccurate nor defamatory. It's even avoiding antagonizing those people who would rightly point out that Mexicans are already citizens of America, along with everyone else who shares the continent. :) (Yes, they exist. I've inadvertently offended them myself.) (Note that in no case would Wikipedia have misrepresented or defamed the Senator; the editor who placed the content would have done those things. You may know this, I know, but it's a distinction I try to point out when it arises because I'm concerned that a lot of editors do not understand that they personally are liable for their actions.) --Moonriddengirl (talk) 13:51, 27 June 2012 (UTC)
  • I have not explained myself well. I should have made it clear that the example was not part of an article on US immigration, quoting the Senator as an authority, but part of an article on the Senator, trying to accurately depict his views without bowdlerization. The copied material in this case may perhaps be changed slightly, as you suggest, but only slightly. Anything more drastic risks further upsetting Senator X, who may regret making that rash remark, or upsetting the New York Times, which thought carefully about how to paraphrase it. Even after your change, the New York Times could still claim that their expression had been infringed and also mutilated. It may therefore be better to change it less, e.g. do not change "America" to "the United States", and trust to fair use. The fair use case is not rock solid. In theory, Wikipedia should find out what the Senator actually said rather than copying the New York Times paraphrase ...
Still, this copying problem can be solved, as they all can. That is not the issue. My hypothetical editor goofed up badly because the essay did not give any guidance on dealing with controversial statements. The editor thought a loose paraphrase must be better than a direct quote. That is the larger problem. In general, the essay is far too much about the mechanics of paraphrasing and does not say nearly enough about the principles, particularly about when copying is appropriate and when it is not. Aymatth2 (talk) 19:51, 27 June 2012 (UTC)
Wiki rules on OR forbid the use of unpublished manuscripts. The copyright law I have seen on "close paraphrasing" all involve the use of close paraphrasing of unpublished material. Since this is already forbidden, it seems to me the "close paraphrasing" rules in US law are irrelevant and should not be guidelines to editors. Rjensen (talk) 23:45, 27 June 2012 (UTC)
My limited understanding is that US copyright law does not make much distinction between literal copying and paraphrasing, and rarely mentions "close paraphrasing". Crudely stated, if an ordinary reasonable person would consider that there has been copying of the protected expression, and if fair use does not apply, there is an infringement. If the copying is concealed so it cannot be detected, there is no problem. There used to be special rules with unpublished material, connected with the right of first publication, but they were loosened up about twenty years ago.
The WP rules are a different thing. Wikipedia is very much a derivative work, so using unpublished material is clearly counter to the "Original Research" policy. The term "close paraphrasing" is used in an internal WP sense unconnected with US law. WP policy is more restrictive than US laws perhaps in part so it is less likely that someone cloning WP content in another country will get into trouble, but maybe more because if we set our editors a tough standard that they don't entirely manage to meet, they are likely to still stay within US laws.
One concern with this essay is that it does not seem to be fully consistent with policies and guidelines. Another is that it combines an attempt at rigorous definitions and rules with a sort of "how to" guide. My main concern is that the lack of explanation of principles and the simplistic rules may encourage editors to do the wrong thing, and thus to compromise the integrity of the project. Aymatth2 (talk) 00:26, 28 June 2012 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Morality or legality?

I've reverted this edit by SV and I would like to open a discussion here, which I briefly touched on in the guideline RFC above. I'm concerned about the reference to plagiarism as distinct from copyright violation. In my time here, I've observed that labeling content as plagiarism and/or editors as plagiarists produces among the worst possible outcomes. We discussed all this at great length when developing the en:wiki plagiarism guideline and in that context made every effort to push everything possible over to copyvio where the process is much more clinical - simply because it is a much more well-defined process. Our own definition for plagiarism does try to confine itself to copying from "free" sources.

So basically, here is what I think should be discussed: IFF we are talking about a copyrighted source work, then can there ever be a case where we can say that some contribution here based on that source is NOT a copyright violation but is also NOT acceptable for inclusion? This is the lacuna where accusations of plagiarism can arise, and have their maleficent consequences. This is distinct from recognition in our various "audited" (DYK/FA/GA) processes where the recognizers can choose to set whichever rules for acceptance. As a basic tenet of the encyclopedia though, if the contributed material can not be found by reasonable tests to be a copyvio, why should it be excluded? What extra bar does this moral hurdle of "plagiarism" set? Why is that bar justified?

(P.S. If anyone responds with academic definitions of plagiarism, been there :) Franamax (talk) 03:34, 12 July 2012 (UTC)

Hi, the development of Wikipedia's guidelines about plagiarism has been hampered by the attempt to prioritize copyright violation, or to discuss plagiarism in terms of it. And that has caused people to plagiarize without realizing they were doing it. I wasn't even able to get in-text attribution added as a requirement to the verifiability policy, in part because someone who wanted to copy public domain texts turned up to stop it. But the absence of that requirement from V has caused people to plagiarize inadvertently, so it has been very harmful.
The practice of plagiarism is well understood in academia and in publishing in general; it is intellectual theft, and we should stick to the academic definitions and advice, and not try to reinvent (or ignore) the wheel. Whether it rises on any given occasion to a violation of copyright is a separate (complex, legal) issue. SlimVirgin (talk) 03:46, 12 July 2012 (UTC)
But the problem with academic definitions of plagiarism is that they are suited for academia, where the emphasis is heavily on Original Research - which, umm, no more questions for this witness your honour... Franamax (talk) 04:01, 12 July 2012 (UTC)
I don't see what OR has to do with it, Franamax. If I use an idea or words from John Brown, then I write: "Brown (2009) argues that ..." followed by an inline ref. Not to do that will almost always look like plagiarism (unless the words or ideas are very common). A copyright violation is a completely different animal. It is where I make Brown's copyrighted work available to the public without his permission, and crediting him makes no difference to that. How much I would have to make available is a legal question. We should stop confusing this with plagiarism, which is only about intellectual theft.
The point again is that academics know what plagiarism is, and there's no reason not to use their definitions and trust their advice regarding how to avoid it. SlimVirgin (talk) 04:19, 12 July 2012 (UTC)
But close paraphrasing (and accusations of plagiarism) can extend beyond the scope of your chosen vehicle of in-text attribution, good though it is for sentence-by-sentence work. Close paraphrasing can also involve copying with minimal restatement of the very structure of a work. I don't see how your preferred solution addresses such a situation - or are you saying that is beyond the scope of this essay? Also, if you want to create another essay about WP:Intellectual theft, that may be a better place for you to describe such values. I'm not disagreeing with it, just sayin' I'd like to see the whole text first. :) Franamax (talk) 04:47, 12 July 2012 (UTC)
If someone is copying the structure of a work entirely, where the whole article rests on one source and the source has been mined extensively, it could constitute copyright violation, which is beyond the scope of this page. It will always depend on the context (length of article, length of source material, how unique the material is, etc). I don't understand your point about intellectual theft: the word for it in this context is plagiarism, so we're going round in circles. My point is that I wish people would (a) stop conflating plagiarism and copyright violation, and/or (b) stop prioritizing the latter. SlimVirgin (talk) 05:01, 12 July 2012 (UTC)
Plagiarism as I've seen it in consensus practice is reliance upon a source, without appropriate attribution, for quotes, text, paraphrase, novel facts, novel analysis, novel opinion, novel evaluation (weighting), novel organisation or novel structural presentation. It is plagiarism regardless of whether the source legally allows copying or not, or whether moral rights exist at law, because plagiarism is a moral fault against appropriate encyclopaedism. Appropriate attribution may be significantly more strenuous than mere citation, "As Swanson puts it bacon is the perfect food." indicates that while the remainder of the sentence is paraphrase, it relies upon Swanson's work fundamentally for the opinion, rather than the opinion merely being published by Swanson.
The problem for wikipedia is determining novelty and appropriate levels of attribution. In particular, wikipedians often poorly attribute paraphrase, analysis, opinion, evaluation, organisation and structure.
Failure to attribute is poor encyclopaedism, as it fails to document our work process, and fails to acknowledge the work of others we rely upon. It is "immoral" as it is poor encyclopaedic work. It is immoral to the point at which the content should be excluded from the encyclopaedia's current edition.
One of the reasons for the FAC spotchecking upset is that failure to acknowledge the work of others, even when we have the legal right to do so, is immoral encyclopaedism. The depth of upset at believing that we might purvey the unacknowledged work of others as our best work caused that moral crisis. I say that unacknowledged use of others' work is so immoral that we should remove the error (by appropriate attribution, editing, or removal of content).
Close paraphrase is a moral issue separate to the potential for close paraphrases to be copyright violation or plagiarism. A close paraphrase of a free work, appropriately attributed, may be immoral to the extent that as encyclopaedists and an encyclopaedia we ought to purvey our own unique expressions, presentations, and verbal and adjectival judgements.
Finally I would note that both plagiarism and close paraphrase are deceptive, deceitful and conceal. As an encyclopaedia we make clear, are honest and reveal. Fifelfoo (talk) 03:58, 12 July 2012 (UTC)
There is some good advice in "What Constitutes Plagiarism?" in the Harvard Guide to Using Sources. SlimVirgin (talk) 04:23, 12 July 2012 (UTC)
(after e/c, haven't reviewed new) But this is not about failure to attribute, in those cases it is a clear-cut copyvio or plagio, which we have procedures to correct. This essay is about where the sources are attributed but insufficiently restructured / reworded, or at least if it were to become a guideline it would fill that niche. So what specific definitions are we using, assuming that attribution exists somewhere?
Much of your response is aspirational, I do agree with it, but can we agree that you are expressing an aspirational view? Wikipedia does not have "moral beacon to the world" among its pillars. Again, no problem with applying further standards to audited stuff, but we should be tempering the message in P/G/E that could be used as a bludgeon by anyone against anyone. Franamax (talk) 04:29, 12 July 2012 (UTC)
There is a definition of plagiarism at Wikipedia:Plagiarism for example. It isn't a complex idea. Where to draw the line can be complicated, but in-text attribution is always a solution. SlimVirgin (talk) 04:36, 12 July 2012 (UTC)
A free encyclopaedia that anyone can edit. We're not the free chapbook that anyone can edit. That's the basis of the morality here—the bad conduct is bad conduct against encyclopaedism, not (for example) personal attacks which is bad conduct because it breaks down the editorial process. If the issue is "what is plagiarism" that is the answer. Close paraphrase is generally plagiarism, but not all plagiarisms are close paraphrase. Close paraphrase as plagiarism is only problematic when it is wide-spread, ie: it was a pattern of encyclopaedic disruption from an editor. Hell, in my article for the MILHIST newspaper, I found close paraphrase by accident in the work of a brilliant editor. One problem is that in-text attribution (of the appropriate kind) doesn't overcome the immorality of close paraphrase. Saying this in an article, "To closely paraphrase the Wikipedia edit screen, "Do submit only cyclopaedic data which may be checked against off-wiki material. Do keep a non-partisan viewpoint. (en.wikipedia (2012) Edit screen)" and yet it is still offensive because I am not presenting original encyclopaedic work (apart from the mangled language). Plagiarism is about non-attribution and deceit. Close paraphrase is about non-attribution, deceit, and failure to present an original expression. Fifelfoo (talk) 04:45, 12 July 2012 (UTC)

I think that the emphasis in the nutshell should be on copyright and not plagiarism. Hence I support Fifelfoo's position. -- PBS (talk) 09:53, 13 July 2012 (UTC)

This has been discussed many times. See Wikipedia talk:Plagiarism#Clarification of what "with very few changes" means for a list of some sections on this subject. As discussed in the archives I think it is advantageous to quote and summarise as a style issue rather than using close paraphrasing (how do I know when someone writes Sir Robert Armstrong said he told the truth is a summary of Sir Robert Armstrong said he was "economical with the truth" rather than an unquoted statement of "[I] told the truth"? -- such a nuance is lost in the wash of close paraphrasing). I think that close paraphrasing is creating so many problems that we should go back to requiring that information from copyrighted sources be either summarised or quoted (and discourage close paraphrasing).

I give two examples in the section I mentioned previously where collateral damage over close paraphrasing has affected this project. So while a talented writer like SV never makes a mistake and violates copyright when close paraphrasing, it is far too easy for many editors to create copyright problems, and so in the interests of the project, I think it better if close paraphrasing is discouraged in favour of summary and quotation. -- PBS (talk) 09:43, 13 July 2012 (UTC)

Request for comment - clearer expression of principles

I am concerned that this essay focuses too much on a mechanistic approach to paraphrasing so as to avoid copyright concerns, and does not say enough about the underlying principles.

A. As we all know, copyright applies to any creative expression. This excludes facts and ideas but may include the choice of words and phrases, unusual sequence or structure, metaphors, plots or scenes in works of fiction, and so on.

B. Salinger v. Random House illustrates the point. A biographer extensively paraphrased letters written by the author J. D. Salinger. The court singled out the following comment about Wendell Willkie:

  • Original: He looks to me like a guy who makes his wife keep a scrapbook for him.
  • Paraphrase: [Salinger] had fingered [Willkie] as the sort of fellow who makes his wife keep an album of his press cuttings.

The concern was that the simile had been copied. Even if the biographer had used quite different words, sequence and structure, they would still have violated Salinger’s copyright by reproducing the simile.

C. The following hypothetical example shows the opposite effect:

  • Original: He spent the next three years immersed in legal tomes at an Ivy League institution in Cambridge, earning the right to append "Bachelor" to his name in 1975.
  • Paraphrase: He studied law at Harvard for three years, gaining a Bachelor’s degree in 1975.

The paraphrase, although closely following the structure and sequence of the original, does not violate copyright since all creative expression has been stripped away to leave only the facts, which are presented in a very conventional form.

D. Substantial similarity of creative expression is both more and less than the choice of words and structure. More, because even with different words and structure, a reasonable person may think an aspect of the creative expression has been copied. Less, because a close paraphrase in plain language of facts given in a short passage is not a copyright violation.

E. Failure to discuss moral rights is a serious omission. Most countries other than the USA fully respect the Berne convention. The author has the right to insist on attribution or anonymity, and the right to demand that their work is not mutilated. Editors must respect moral rights where possible, so Wikipedia content will be reusable in as many jurisdictions as possible. Paraphrasing may violate an author's rights in the integrity of their work. The essay should discuss moral rights problems and ways to avoid them.

F. This is to request comment on the above, and views on how, or whether, this essay should be changed to better explain the principles and to counteract the simplistic view that close similarity in wording is always a problem, while drastically different wording is always acceptable. Please feel free to canvass editors who may not be watching this page. Thanks, Aymatth2 (talk) 19:40, 27 October 2012 (UTC)

Comments

  • I agree the essay should emphasize that it's a judgement call... that there's no mechanical approach to paraphrasing. However, perhaps it's better to refer to the official policy and guidelines, so as to maintain a more consistent and centralized attempt to elucidate the underlying principles? We wouldn't want to imply that a bland, factual (encyclopedic) account is more defensible per se in terms of copyright violation (even if WP's educational value would be a consideration in terms of fair use). What's the current policy on moral rights? I think attributing a direct quotation might be preferable in that regard, as opposed to attributing a none-too-close (if not far-fetched) paraphrase... a bit confusing in this context.—Machine Elf 1735 03:29, 28 October 2012 (UTC)
I find it hard to get a clear view of policy and guidelines regarding copying and paraphrasing. They seem to say "write in your own words with minimal quotation", and not much else. I don't see anything on moral rights, even though they apply in most of Europe and in many countries with similar laws. I imagine direct quotation would be safest, but we can't directly quote every source, and I have a feeling many countries do not allow even small excerpts. I wonder what de.wiki says? Aymatth2 (talk) 12:21, 28 October 2012 (UTC)
I would imagine that our primary policy on moral rights is WP:NOR: "Best practice is to research the most reliable sources on the topic and summarize what they say in your own words, with each statement in the article attributable to a source that makes that statement explicitly. Source material should be carefully summarized or rephrased without changing its meaning or implication. Take care not to go beyond what is expressed in the sources, or to use them in ways inconsistent with the intention of the source, such as using material out of context. In short, stick to the sources." This prevents "distortion, mutilation or other modification of, or other derogatory action in relation to the said work, which would be prejudicial to the author's honor or reputation." [8] --Moonriddengirl (talk) 12:38, 28 October 2012 (UTC)
That is a good, clear statement - I would not have thought to look there. Has anyone tried to make an index of policy and guidelines that apply to text copyright, quotation and paraphrasing? It would be useful to have that in this essay, near the front. Aymatth2 (talk) 13:09, 28 October 2012 (UTC)
Funny thing, the defense attorney in the Salinger v. Random House case said: "If you take this opinion to an extreme, what it says is that you cannot quote anything that has not been published before...", so it's WP:RS that would have kept our noses clean, to say nothing of WP:BLP, but the law was amended afterward in favor of fair use... Nonetheless, even an elaborate figure of speech is still on the expression side of the idea-expression divide and there's more great examples in the article. Is the idea that Willkie was vain? I'd want a secondary source for that.—Machine Elf 1735 14:30, 28 October 2012 (UTC)
  • Yeah. I went on a roll a few months ago and started Salinger and a string of other copyright case law articles, then summarized in Paraphrasing of copyrighted material, which gives me the dangerous illusion or delusion that I understand something of copyright law. Salinger was special because it dealt with unpublished material, until then protected, and contributed to the decision to allow fair use for unpublished material in the U.S. Moral rights give the author full control over first publication, which means in those jurisdictions an unpublished work cannot be quoted or even paraphrased. Wikipedia rules seem to be the same: unpublished work cannot be used as a source. I am strongly tempted to start a section in this essay called "Related policy and guidelines" with a list of links to ones I can think of, assuming other editors will add links to the ones I forgot. Aymatth2 (talk) 15:09, 28 October 2012 (UTC)

Nutshell and lead para

  1. Any objection to my edit, to bring forward the salient point of summarizing in your own words?
  2. Is there a guideline addressing and expanding on this notion? This essay seems more of a warning against "what's not good about that para you wrote", rather than a constructive guideline. The title and dominant shortcut WP:PARAPHRASE implies that all paraphrase is bad. If that's really the case, there's a problem with WP's use of the word "paraphrase" itself: this whole time (6 years) I thought our goal was to paraphrase, but not too closely. The very definition of paraphrase includes "restatement of the meaning in other words", so if one is paraphrasing, one is probably doing exactly the right thing.
  3. Perhaps instead of the clumsy "close paraphrasing" which must be repeated ad nauseam, what if we call the problem edit what it is: paraquoting? --Lexein (talk) 09:08, 16 November 2012 (UTC)
  1. Looks good to me.
  2. Do you mean, has somebody written a guideline talking about how to properly paraphrase? If so, not to my knowledge, but there are links at the bottom to University efforts to do that very thing. I don't agree with you that the title and shortcut suggest "all paraphrase is bad" - the modifier "close" seems pretty clear to me.
  3. "Close paraphrasing" is what it's called. See, for instance, [9], [10], [11], [12]. (Semi-randomly chosen) It's the accepted term for what we're talking about. I don't think we need to grab new terms for it, and paraquoting is already taken. :) --Moonriddengirl (talk) 11:12, 16 November 2012 (UTC)
3- Heh - thanks. --Lexein (talk) 12:26, 18 November 2012 (UTC)
  • Most rulings in copyright cases do not use the word "paraphrase" at all. The argument is about whether there has been copying, and if so whether it is fair. "Paraphrase" implies a process of shuffling and substituting words, perhaps with the aim of concealment. The judges are more interested in the outcome. They try to look behind the words and phrases to determine whether creative content has been copied. That suggests a rather different approach from the one suggested in this essay. In a nutshell: "extract the facts from the sources, then state those facts in the most straightforward way possible." Presumably creative expression will get dropped in the process. The resulting text may sometimes be similar to the source, but that should not be a problem since only the facts were copied. "Paraphrase" in the sense of "restate what the source said, but in different words" carries the risk of copying creative expression. "Summarize" makes no sense to me, since it implies arbitrarily dropping information that may be relevant. How do you summarize or paraphrase "Smith was born on 13 April 1957 in London, England"? Aymatth2 (talk) 03:09, 18 November 2012 (UTC)
The terms generally used in academia are to summarize, to convey information from a range of pages or a chapter in a shorter, summarized, version, and paraphrase, to reword short passages such as paragraphs in one's own words. I don't think we should drop the term paraphrase - it's quite well known. Truthkeeper (talk) 03:13, 18 November 2012 (UTC)
  • All: fair enough, it was just a late-night thought. --Lexein (talk) 03:27, 18 November 2012 (UTC)
  • But an interesting one. The meaning of the term "paraphrase" is quite well known, but the question is whether we want editors to do it. Summarizing may or may not be appropriate depending on the level of detail in the source. There is obviously a concern about substantiality of copying when everything in a short source is relevant, but that is a different question. "Reword short passages in one's own words" may carry forward opinions, similes, metaphors and so on, violating copyright. Straining to reword may be pointless when there is nothing in the original but dull facts. Aymatth2 (talk) 03:32, 18 November 2012 (UTC)
  • Right. Ok, I'm seeing your point more clearly now. I'd like to see the lead of the nutshell be really punchy and clear, and I see how mine is a problem. Shall we discuss "punching up" the lead sentence in the nutshell then, starting with yours?
Previous: Closely paraphrased material that infringes on the copyright of its source material should be rewritten or deleted to avoid infringement, and to ensure that it complies with Wikipedia policy.
Current: Summarize in your own words, instead of closely paraphrasing. (continues with above text).
Alt 1: Extract the facts from the sources, then state those facts in the most straightforward way possible, rather than closely paraphrasing.
Alt 2: Extract the source facts, then fully rewrite them concisely, rather than closely paraphrasing.
Look, we can just put it all back the way it was, and I can run away. --Lexein (talk) 12:26, 18 November 2012 (UTC)
No, don't run away. I think you're on to something. I hadn't actually read the lead in a while and looking at it, agree that it definitely lacks punchiness. I went back in history and found this older version that seems to be more direct, and I think that, combined with what you've proposed above, is closer to what we want to convey. Truthkeeper (talk) 13:36, 18 November 2012 (UTC)
Keeping in mind I was only referring to the nutshell... --Lexein (talk) 13:45, 18 November 2012 (UTC)
  • (edit conflict) No, this matters, and the essay does need clarification. At present it is too much a recipe for "how to paraphrase so thoroughly as to conceal copying", but it does not do a very good job. Editors may think they have to mangle their sentences to avoid any similarity to the sources. The essay does not deal with the pitfalls of metaphors and opinions. I would like to put across the ideas that:
  1. Material that infringes on copyright must be replaced or removed
  2. A plain, concise presentation of the facts in natural sequence is appropriate for an encyclopedia, and will never violate copyright
  3. Editors should not copy any original words, phrases, sequence, conclusions, metaphors, similes or other creative elements. Stick to the facts!
  4. When presenting someone's opinion, it is generally safer to quote them directly than to attempt a paraphrase.
That is too long for the nutshell, but something along those lines. Not just in the nutshell and lead, but in the body. Aymatth2 (talk) 14:23, 18 November 2012 (UTC)
How is "extract the salient points, and use your own words, style and sentence structure to draft text for an article" a recipe for "how to paraphrase so thoroughly as to conceal copying"? If you use your own words, style and sentence structure to state the salient point, there's no copying at all. :) --Moonriddengirl (talk) 11:22, 19 November 2012 (UTC)
  • There could still be copying. I go back to the Salinger example: "He looks to me like a guy who makes his wife keep a scrapbook for him". Salinger's metaphor is a creative expression protected by copyright - and is the salient point. That point could be reproduced back to front in Hungarian and still violate copyright. I don't think we should be pushing editors too hard to find alternate words and sentence structure, which can give daft results: "The day on which he entered the world was the first in the month of February in the fifth year of the twentieth century." To me the main thing is to stick to the facts, and to avoid carrying forward anything creative. "He was born on 1 February 1905" does not violate copyright, however the source is worded. The "How To" part of this essay describes an approach that will work for many people, but other approaches will work better for some. I would not prescribe it as the only way to get to the result. What seems missing is an explanation of what can and cannot be copied. Aymatth2 (talk) 15:53, 19 November 2012 (UTC)
  • Non-creative language is discussed in Wikipedia:Close_paraphrasing#When there are a limited number of ways to say the same thing. You're quite right that the sentence you cite is not creative, but it's important not to overlook the impact of the aggregate. Just as an image collage of public domain pictures is copyrightable, even uncreative text can accrue copyright by structure. A stack of uncreative sentences can display creativity in their selection and organization. Any advice we give on copyright needs to be carefully given, particularly as there can be real-world consequences for getting it wrong. :/ --Moonriddengirl (talk) 16:36, 19 November 2012 (UTC)
  • I fully agree. Advice has to be very carefully worded. But we do need to give advice. I would like to add two sections to the front of this essay. The first would list all relevant policies and guidelines, with a short one-sentence description of each. The second, perhaps called "Concepts", would give examples of different types of creative expression, and would also explain broader issues such as the substantiality concern you raise, and the risks of paraphrasing people's opinions when reporting those opinions. The first statement should be the familiar "Material that violates any copyrights must be replaced or removed". Then, without limiting this general statement, we can give specific examples: " Metaphors: A metaphor is usually the author's creative expression ... Substantiality: Even if the language used in a source is purely factual, copying more than a small portion ..." I would also like to slightly tone down the intro to the "how to" section, basically saying "the following approach works well for a lot of editors." Having done that, we could go back to the nutshell and lead. Aymatth2 (talk) 17:37, 19 November 2012 (UTC)

Relevant policies and guidelines

I propose to add a section with this title at the front of the article, after the lead, that links to relevant policies and guidelines. I think it should be placed up front, rather than buried at the back, although it could perhaps be made a box floating to the right. Draft follows:

A number of Wikipedia policies and guidelines are relevant to this essay. They include:

  • WP:ORIGINAL. Policy stating that, while articles should be written in your own words, they should substantially retain the meaning of the source material
  • WP:COPYRIGHT. Policy that describes general principles that apply to use of copyrighted work
  • WP:NFCC. Policy that defines limitations on use of non-free content
  • WP:NONFREE. Guideline that expands on WP:NFCC and describes when non-free material may be used under the "fair use" principle
  • WP:PLAGIARISM. Guideline that describes the importance of attributing the sources used, even when they may be out of copyright
  • WP:MOSQUOTE. Guideline that describes how quotations should be faithfully reproduced, clearly identified as quotations

Several Wikipedia articles discuss related topics such as Copyright law of the United States, fair use, plagiarism, moral rights and paraphrasing of copyrighted material. These may be of interest to editors. However, they may have inaccuracies or omissions, and Wikipedia has a broader aim of providing material that may be used anywhere for any purpose, which imposes further restrictions that are defined in our policies and guidelines.

I am sure I have missed some relevant ones - anyone should feel free to add to the list. But are there any concerns about this proposal? Aymatth2 (talk) 13:05, 24 November 2012 (UTC)

Good idea - all for it ... missing Wikipedia:Copy-paste. Moxy (talk) 17:44, 24 November 2012 (UTC)
  • I thought about Wikipedia:Copy-paste, but it is an "information page", floating in the limbo between a guideline and an essay. Relevant policies and guidelines should definitely be listed, and in my view essays should not, or at least not in this section. "See also" maybe. I think Wikipedia:Copy-paste explains but does not add to content from the policies and guidelines. Not a strong feeling... Aymatth2 (talk) 18:56, 24 November 2012 (UTC)
  • Since no objections were raised, and the proposal hardly seems controversial, I have gone ahead and added the section. Aymatth2 (talk) 17:33, 6 December 2012 (UTC)

Concepts

 
Copying isn't the only way to violate copyright or plagiarize. Close paraphrasing can be a problem, too.

There are legal, ethical and organizational standard considerations regarding the use of close paraphrasing.

Copyright law

Wikipedia's primary concern is with the legal constraints imposed by copyright law; in many cases close paraphrasing of a non-free copyrighted source is likely to be an infringement of the copyright of the source. Close paraphrasing rises to the level of copyright infringement when taking is substantial. Depending on the context and extent of the paraphrasing, limited close paraphrase may be permitted under the doctrine of fair use; close paraphrase of a single sentence is not as much of a concern as an entire section or article.

Wikipedia's guidelines

But even when content is verifiably public domain or released under a compatible free license (see Wikipedia:Donating copyrighted materials), close paraphrasing may be at odds with Wikipedia's guideline related to plagiarism (see Wikipedia:Plagiarism). While in this context, too, close paraphrasing of a single sentence is not as much of a concern, if a contributor closely paraphrases public domain or freely licensed content, he or she should explicitly acknowledge that content is closely paraphrased. (See above.)

Another potential problem arises when a contributor copies or closely paraphrases a biased source either purposefully or without understanding the bias. This can make the article appear to directly espouse the bias of the source, which violates our neutral point of view policy.

When is close paraphrase permitted?

There are a few specific situations when close paraphrasing is permitted. If information is gathered from the public domain or is free use content, close paraphrase may be acceptable. In some instances it is helpful to capture the words as written, in which case the guidelines for Quotations apply. Lastly, there may be some instances where it's difficult to paraphrase because of the nature of the content; in such cases there are a couple of tips about how to limit the degree of close paraphrasing.

When using a close paraphrase legitimately, citing a source is in most cases required and always highly recommended.

Public domain or free use content

In some limited cases, close paraphrase may be an acceptable way of writing an article. For example, many Wikipedia articles are (or were) based on text from the 1911 Encyclopedia Britannica (see Wikipedia:1911 Encyclopaedia Britannica). If the source is public domain, such as work of the U.S. government, or available under a CC-By-SA-compatible free license, it may be closely paraphrased if it is fully attributed. Acknowledging the source in such instances may include accompaniment by in-text attribution that makes clear whose words or ideas are being used (e.g. "John Smith wrote that ...") or in some cases more general attribution (see Wikipedia:Plagiarism).

Brief indirect quotation of non-free text

If a non-free copyrighted source is being used, it is recommended to use original language and direct quotations, to clearly separate source material from original material. This is in keeping with non-free content policy and guideline. However, brief instances of indirect quotation may be acceptable without quotation marks with in-text attribution. If the text is markedly creative or if content to be duplicated is extensive, direct quotation should be used instead. Extensive instances of indirect quotation are not generally acceptable; even if content is attributed, it can still create copyright problems if the taking is too substantial. To avoid this risk, Wikipedia keeps this—like other non-free content—minimal.

When there are a limited number of ways to say the same thing

Close paraphrasing is also permitted when there are only a limited number of ways to say the same thing. In general, sentences like "Dr. John Smith earned his medical degree at State University" can be rephrased "John Smith earned his M.D. at State University" without copyright problems. Note, however, that closely paraphrasing extensively from a non-free source may be a copyright problem, even if it is difficult to find different means of expression. The more extensively we rely on this exception, the more likely we are to run afoul of compilation protection.[1]

Comments?

Comments? Aymatth2 (talk) 02:00, 7 December 2012 (UTC)

It seems like a good approach substantially to me, but it would be much easier to judge on detail if specific changes were highlighted. :) I think the approach brings good clarity. I reserve the right to quibble over specific wording within it, though I don't see anything that raises red flags for me (that are not already in the text). (I look forward to the archival of this content, as the header levels rather mess up the TOC and editing of this page, though. :P) --Moonriddengirl (talk) 13:30, 7 December 2012 (UTC)
  • At this point I have not made any wording changes so there is nothing to highlight. It is better to move step by step. First get the changes to headings and sequence agreed and implemented, then discuss any additions and changes to the text. This is a "high impact" essay, so no changes should be made without discussion here. This proposal is just to move and re-sequence the text with no wording changes at all. The idea is to make the essay flow more naturally from the general and abstract to the specific and prescriptive, and to provide a natural structure for insertion of new material. New material can be discussed in following steps. Aymatth2 (talk) 02:13, 8 December 2012 (UTC)
  • Well, nobody seems to be objecting, and I like the approach, so I'd say it seems safe to implement if you want to. :) --Moonriddengirl (talk) 11:51, 10 December 2012 (UTC)
  • No rush. This essay has been around for years, so there can't be an urgent need for changes. I think a proposal should sit for 8 days minimum, so people who only check WP once a week get to see it and react. Aymatth2 (talk) 14:57, 10 December 2012 (UTC)
  • I've been lurking here, and I have no objections to the proposed reorganization. I don't have strong feeling either way, but it certainly couldn't hurt. VernoWhitney (talk) 23:11, 10 December 2012 (UTC)
  • I have gone ahead and made that change. Same wording, different sequence. Aymatth2 (talk) 02:52, 16 December 2012 (UTC)

Opinions requested

I'd like to get some extra eyes on some examples of possible close paraphrasing I've posted at the Education Noticeboard. I've been reviewing student articles to see how accurately they used sources, and I decided to review all source usage in a given article, regardless of whether the editor was a student. There are eight examples from three different editors that I'd like to get opinions on. Thanks. Mike Christie (talk - contribs - library) 13:53, 19 December 2012 (UTC)

Resolving the Example

WP:Close paraphrasing#Example doesn't close with a "do" finisher; IMHO it should. I'll take a stab or two at that here:

One of many possible solutions
After the decision to lay off 480 of 670 employees, as announced by Deloitte's David Carson, about 100 workers (and local Sinn Féin Councillor Joe Kelly), sat-in in the visitors' gallery at the factory, and insisted on meeting with Carson. Minor scuffles were reported to have damaged the visitors' centre door.[1]

Here, the irreducible facts are included, unadorned. Thoughts? --Lexein (talk) 10:27, 16 December 2012 (UTC)

Seems like a rather long and complicated sentence. Kaldari (talk) 23:02, 18 December 2012 (UTC)
How about the following: "After Deloitte announced the layoff of 480 employees, over 100 workers, including Sinn Féin Councillor Joe Kelly, staged a sit-in at the factory. Fights broke out at one point during the sit-in, damaging the door to the visitor center." Kaldari (talk) 19:13, 21 December 2012 (UTC)
Better. "But", I hear a voice say, "shouldn't Carson be in there, for completeness, for this example?" No, not if we're really summarizing. --Lexein (talk) 19:24, 21 December 2012 (UTC)

Proposal: Creative expression definition

This is to propose adding a section under "Concepts" after "Copyright law" but before "Wikipedia's guidelines" called "Creative expression". The wording below is tentative, because it is a subtle concept but one that has to be expressed clearly and accurately. I may have limited connectivity for the next two weeks, but am throwing it out for comment and suggestions.

Creative expression

Facts and ideas cannot be protected by copyright, but creative expression is protected.

Hilaire Belloc illustrates two forms of creative expression in his 1897 More Beasts: (for Worse Children), where he informs us that "The Llama is a woolly sort of fleecy hairy goat, with an indolent expression and an undulating throat; like an unsuccessful literary man". If this somewhat dubious source was used for the article on Llama and was still protected by copyright, it would be acceptable to say that the Llama is an animal with a shaggy coat, and perhaps that it has a long neck. These are facts. But use of the phrases "indolent expression" and "undulating throat" would violate copyright. The original choice of words is part of Belloc's creative expression. Going further, the simile "like an unsuccessful literary man" is also creative, and is also protected. A clumsy paraphrase like "resembling a failed writer" would violate copyright even though the words are entirely different. More than the facts have been copied.

Although facts are not subject to copyright, a selection or arrangement of facts may be considered creative and therefore protected. For example, an alphabetical list of states in the USA giving their name, size and population cannot be copyrighted. However, a shorter list of states giving the name, size and population as before, but ranked as the "top most livable states" would be subject to copyright. The selection and ranking is creative. To avoid problems, editors should avoid any fanciful wording, imaginative metaphors or similes, or subjective interpretations, and stick to giving the facts in plain words. That is what the reader wants from an encyclopedia.

Thoughts, suggestions? Aymatth2 (talk) 02:52, 16 December 2012 (UTC)

  • Given lack of objections, I have gone ahead and made the changes. I feel slightly uncomfortable, since this is a crucial section that should be carefully debated. Perhaps a new section on this talk page can be started for suggested improvements. Aymatth2 (talk) 00:54, 31 December 2012 (UTC)
  1. ^ In Feist Publications v. Rural Telephone Service, the United States Supreme Court noted that factual compilations of information may be protected with respect to "selection and arrangement, so long as they are made independently by the compiler and entail a minimal degree of creativity," as "[t]he compilation author typically chooses which facts to include, in what order to place them, and how to arrange the collected data so that they may be used effectively by readers"; the Court also indicated that "originality is not a stringent standard; it does not require that facts be presented in an innovative or surprising way" and that "[t]he vast majority of works make the grade quite easily, as they possess some creative spark, 'no matter how crude, humble or obvious' it might be." ("Decision". Feist Publications, Inc., v. Rural Telephone Service Co., 499 U.S. 340 (1991).)