Can You Cure the Copy/Paste Disease?
secretGeek .:dot Nuts about dot Net:.

The other day, someone on Stack Overflow said they wished copy and paste were not permitted within an IDE.

It's a wacky idea - but sometimes I'm inclined to agree.

Maybe it shouldn't be completely illegal - but it could be harder to do (in specific circumstances), or tracked more closely.

At one point (perhaps long ago, perhaps today...) I stumbled onto this chunk of code [at right], and I thought "Hmm, interesting lack of reuse..."

Why use a data structure when you can copy and paste your code?

It's only a little code smell, right?

Not really, because you know that where people have the 'copy/paste' bug... it's only going to get worse.

Let's zoom out and see what we find...

The code sits inside this method... -->

(There are other issues there too, but let's leave them aside today, please.)

And the very next method down is almost identical -->

Then - I look at the next class - it's an almost identical class, and it has two methods of its own with identical names to the ones above and only very slightly different code...

So we're talking about a serious case of "Achieves reuse through the cunning application of copy and paste"

And you know it doesn't stop there.

It just keeps going - thousands of lines of copy/pasted code. File after file. Bugs here and there (notice the 'harmless' bug in the first snippet?). Quirks abound.

It's a fractal! The more you zoom out, the more you see the same pattern copied, pasted, modified, and reapplied over and over at greater and greater levels of scale. The author's entire career could be composed of the same programs, copied, pasted, and modified, retrofitted to slightly different problems.

Copy/Pasting is like the weather. Everyone complains about it, but nobody does anything about it.

If copying and pasting were illegal (or more difficult) - would the implementer have thought about achieving reuse through other methods?

Illegal is taking it a bit far. So: how could the copy/paste virus be reduced?

Maybe Clippy could pop up and say: "It looks like you're copying and pasting code. Do you know that you can achieve reuse in other ways?" (Maybe we could use Scott Bellware's face on Clippy? eh?)

How about this... every time you use the "paste" function inside your IDE, a little counter is incremented. And that counter is displayed on a gigantic, glowing LED panel suspended from the ceiling above your desk.

Or: Every time you use the "paste" function inside your IDE, a loud buzzer sounds for ten minutes, and a flashing red siren pops up over your desk.

Or: Every time you use the "paste" function inside your IDE, your screen is locked out for ten seconds.

I think you'd only need to apply a technique like that for a few minutes a week and you'd soon be permanently cured of copy/pastitis.

My friend OJ suggests...

"an enterprise Copy and Paste server, which acts like a petrol tank. Hence you only get to use your copy and pastes wisely before you run out."

Perhaps a plugin would need to detect when code is copied and pasted from within the IDE. It would only punish you when the copied (and subsequently pasted) code contains some degree of logic.
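
As a rough sketch of what "some degree of logic" might mean, here is one possible heuristic in C#. Everything in it is an assumption made for illustration: the `PasteInspector` class, the `LooksLikeLogic` method, the keyword list and the two-line threshold are all hypothetical, and no real IDE plugin API is involved.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: guesses whether pasted text contains logic
// rather than just a name or a single statement being moved around.
public static class PasteInspector
{
    // Assumed signs of logic: a control-flow keyword, something that
    // looks like a call ("name("), or an assignment (a lone "=").
    private static readonly Regex LogicPattern = new Regex(
        @"\b(if|else|for|foreach|while|switch|return|try|catch)\b|\w+\s*\(|[^=!<>]=[^=]",
        RegexOptions.Compiled);

    // Returns true when at least minLogicLines non-blank lines match the pattern.
    public static bool LooksLikeLogic(string pastedText, int minLogicLines = 2)
    {
        if (string.IsNullOrWhiteSpace(pastedText)) return false;

        int logicLines = pastedText
            .Split('\n')
            .Select(line => line.Trim())
            .Count(line => line.Length > 0 && LogicPattern.IsMatch(line));

        return logicLines >= minLogicLines;
    }
}
```

Under these assumptions, pasting a long variable name or a single statement is ignored, while pasting an `if` block with an assignment inside it is flagged. A real plugin would need something far smarter to avoid false positives.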

And before you say it -- I know there's a tool called Simian that you can plug in to your continuous integration server, to detect this sort of nonsense. I haven't tried it, but I'd like to know more. (They're Australian by the way)

Also -- an apology

Let me be crystal clear about one thing... writing this kind of code doesn't make you a bad programmer.

This particular code was written by a very successful programmer, who does an excellent job of delivering value to the business.

It is definitely a better example of programming than a hell of a lot of code I've seen in the business world.

I'm just decrying the fact that sometimes we work harder not smarter, and if we trained ourselves into the right coding habits, we could achieve both: work hard and smart.

Tools for Tackling Copy-Pastitis





'Josh Bush' on Sat, 24 Jan 2009 10:25:38 GMT, sez:

I'm in the unfortunate circumstance where I'm maintaining a fairly large app where the devs all had this copy/paste bug. I love knowing that the bug I'm fixing exists in some multiple of 2 throughout the code. This code came as a result of the previous devs having the "just bang it out" mentality.

At least the IDE has a find in solution feature to supplement the copy and paste feature. I think find was created because of copy/paste.



'Steven Nagy' on Sat, 24 Jan 2009 10:45:01 GMT, sez:

Why not just enhance our code analysis tools? Then turn them on during the build, so that the build fails when code looks the same (since if you block copy/paste people will just type out the same code again).

I think refactoring tools will get better at this as well.
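
A minimal sketch of the kind of check a build step could run, assuming a deliberately naive definition of "looks the same": identical runs of consecutive non-blank lines once whitespace is normalised. The `CloneFinder` class and its method are hypothetical names, not part of any real analysis tool, and real clone detectors work on tokens or syntax trees rather than raw lines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, deliberately naive clone finder.
public static class CloneFinder
{
    // Returns one list per duplicated block. Each list holds the 1-based
    // line numbers where a window of windowSize consecutive non-blank
    // lines starts, for windows that occur more than once.
    public static List<List<int>> FindDuplicateBlocks(IList<string> lines, int windowSize)
    {
        // Keep only non-blank lines, remembering where each came from.
        var kept = lines
            .Select((text, index) => new { Text = Normalize(text), Line = index + 1 })
            .Where(x => x.Text.Length > 0)
            .ToList();

        var seen = new Dictionary<string, List<int>>();
        for (int i = 0; i + windowSize <= kept.Count; i++)
        {
            string key = string.Join("\n", kept.Skip(i).Take(windowSize).Select(x => x.Text));
            if (!seen.TryGetValue(key, out List<int> startLines))
            {
                startLines = new List<int>();
                seen[key] = startLines;
            }
            startLines.Add(kept[i].Line);
        }

        return seen.Values.Where(s => s.Count > 1).ToList();
    }

    // Trims the line and collapses runs of whitespace so that
    // re-indented copies still compare equal.
    private static string Normalize(string line)
    {
        return string.Join(" ", line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

A build script could call `FindDuplicateBlocks` on each source file and exit with a non-zero code when the result is not empty. Note that a long clone shows up as several overlapping windows, and that renaming a single variable is enough to hide a copy from this version.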



'Omer van Kloeten' on Sat, 24 Jan 2009 11:24:11 GMT, sez:

I respectfully disagree. I use copy/paste *all the time* in my IDE and none of my code looks like that. Well, almost none.

Considering copy/paste to be the culprit here is akin to saying typing is to blame for most programming bugs and as such either ban or make it very hard to type. :)



'Mark Dalgarno' on Sat, 24 Jan 2009 11:53:23 GMT, sez:

If you're going to talk about specific tools to address this problem - and it is a problem - then please mention Axivion Bauhaus Suite which allows just the continuous integration check that Steven mentions.

FWIW Researchers have shown that the *average* code base is made up of 20% code clones. The problem is that everyone thinks they are better than this 20% average :-)



'Henrik' on Sat, 24 Jan 2009 15:30:20 GMT, sez:

node.selectNodesSeparatedBy(", ","node1","node2","node3")



'mat roberts' on Sat, 24 Jan 2009 18:07:08 GMT, sez:

The problem is that copy/paste is *quicker now* than factoring your code to remove duplication.

If you want to stop this stuff though, I say you code review every check in.



'(omervk via twitter)' on Sat, 24 Jan 2009 22:40:54 GMT, sez:

isn't it ironic that in a post talking about copy/paste, your copied pieces of code differ? :P

[i pasted in this message from twitter so i could respond here ;-) --lb]



'lb' on Sat, 24 Jan 2009 22:58:29 GMT, sez:

since writing this article i've been watching my own use of copy/paste a lot more closely -- and it's amazing how often that functionality is needed!

99% of copy/paste usage is harmless, all good. Stuff like moving around the order of statements, shuffling things around to clean them up.

So any plugin would need to be very intelligent to avoid getting false positives.



'Jeremy' on Sun, 25 Jan 2009 02:42:05 GMT, sez:

I would disagree with that last statement about the author not being a bad programmer. To write code like that is almost the definition of "bad".

I am generally not very judgmental about other programmers. I don't care if they use the right patterns or whatever as long as their code is clear and readable. But one other thing I do harp on (almost over anything else) is to not repeat yourself in code like that. It is so incredibly easy to extract those duplicate blocks into reusable functions. To not do so is BAD. It just is. It makes a mess of everything, and it makes it so much harder to maintain going forward.

Just step up and say it Leon... you work with a BAD programmer. ;-)



'Joseph Cooney' on Sun, 25 Jan 2009 07:39:52 GMT, sez:

On the tools side Clone Detective (freeware IIRR) is pretty cool. You need to install Java to use it, but it is worth even that high price. Plus I know you're always up for a Star Wars in-joke or two, Bambrick.



'Don2' on Sun, 25 Jan 2009 09:02:56 GMT, sez:

Here's my take on it.

(I'm surprised no one else has responded to this blatant FizzBuzz style question with a code snippet of their own ;-) )

i'd set up a utility function like this --

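// Needs: using System.Collections.Generic; using System.Xml;
// Joins the inner text of the named child nodes into one comma-separated string.
public static string ConcatenateNodeValues(XmlNode root, params string[] nodeNames)
{
    var nodeValues = new List<string>();
    foreach (var nodeName in nodeNames)
    {
        // Note: SelectSingleNode returns null for a missing node, so this throws if one is absent.
        nodeValues.Add(root.SelectSingleNode(nodeName).InnerText);
    }
    return string.Join(", ", nodeValues.ToArray());
}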

and then each of the others would be a one-line function like this:


private string GetPostal(XmlNode root)
{
    return ConcatenateNodeValues(root,
        "PostalAddress",
        "PostalSuburb",
        "PostalState",
        "PostalPostcode");
}



'lb' on Sun, 25 Jan 2009 09:22:18 GMT, sez:

@Don2
Thanks for the snippet. I've done my best to format the code appropriately, by hand.

@omervk
yes indeed -- the copied pieces of code differ. This is part of the issue. A 'harmless bug' was fixed in the copies, but not in the original.



'nellboy' on Sun, 25 Jan 2009 10:55:27 GMT, sez:

Great post!!

I agree 100%... but then, if someone wants to write nasty, tightly coupled code, then it keeps the rest of us in a job, AND makes us look good



'John' on Sun, 25 Jan 2009 14:20:46 GMT, sez:

lb - I think we all need to make a distinction between CUT-paste and COPY-paste. CUTting and pasting is refactoring; COPYing and pasting is probably laziness (unless it's just a single line or long variable name or whatever).



'Stu Smith' on Mon, 26 Jan 2009 06:22:19 GMT, sez:

It might not make you a /bad/ programmer, but I reckon it makes you a /lazy/ programmer.

I wrote a little article about ten kinds of smelly code (one of them being copy-paste):

http://www.hackification.com/2009/01/20/sniff-out-that-smelly-code/

...and was amazed by how controversial it was (I thought they were fairly cut-and-dried topics).



'KristofU' on Mon, 26 Jan 2009 07:00:34 GMT, sez:

An idea could be to get the information density out of code by compressing it.
The compressed ratio could then be compared to that of compressed reference code.
This would then result in some kind of ( automatically calculated ) metric and could serve to keep an eye on redundancy within the codebase.
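
One way this idea could be sketched in C#, assuming gzip as the compressor and a plain size ratio as the metric. The `RedundancyMetric` class is a hypothetical name, and what counts as a worrying ratio would have to come from the reference code mentioned above.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical metric: how well a piece of source text compresses.
public static class RedundancyMetric
{
    // Compressed size divided by original size, both in bytes.
    // Lower values mean the text compresses well, i.e. it repeats itself more.
    public static double CompressionRatio(string source)
    {
        byte[] raw = Encoding.UTF8.GetBytes(source);
        if (raw.Length == 0) return 1.0;

        using (var buffer = new MemoryStream())
        {
            // leaveOpen keeps the buffer usable after the gzip stream is flushed and closed.
            using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return (double)buffer.Length / raw.Length;
        }
    }
}
```

The ratio is only meaningful in comparison: very short files are dominated by gzip's fixed overhead, and all source code compresses fairly well, so a codebase would be judged against the ratio of similar reference code rather than against a fixed number.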



'Mario Gleichmann' on Mon, 26 Jan 2009 12:05:04 GMT, sez:

Nice entry, enjoyed reading!

I had some similar ruminations some time ago, considering the motivations why developers may want to use copy n' paste.

For further details, you may want to take a look at

http://gleichmann.wordpress.com/2007/11/21/clipboard-considered-harmful-a-funny-look-at-developers-laziness/

Greetings

Mario



'Todd' on Mon, 26 Jan 2009 18:16:01 GMT, sez:

Don't forget the circumstances where developers are given 15 minutes to fix a bug, and it's just sometimes easier to copy and paste, swap a few items out, and then done. As opposed to refactoring, thinking about what's common, designing some reusable classes, writing dynamic code, etc...

The real trouble is continued application of this to a codebase.

But realistically, all code rots over time...



'mike' on Mon, 26 Jan 2009 20:15:00 GMT, sez:

I can't imagine why it would be a good idea to cripple the IDE because some people write bad code in it. If copy/paste were disabled, would this make the copy/paste programmer into a better programmer?

People blab in their cell phones while driving, which is a terrible practice. We should have an interlock system that as soon as you turn on the ignition, it disables the cell phone. That's it, no one gets to use a cell phone in a car because some people do it wrong. Etc., etc.

It's not the tools, it's the people using them.



'Thomas G. Mayfield' on Tue, 27 Jan 2009 00:43:01 GMT, sez:

@(Those decrying laziness):

I firmly believe that being lazy is the primary defining factor of a good programmer.

You should be snarking at _short-sighted_, lazy programmers. The rest of us know that it's likely going to be us that have to alter/fix that nasty bit of code later. It's so much lazier to make it easy on our future selves.



'OJ' on Tue, 27 Jan 2009 02:50:12 GMT, sez:

The problem with having tools as part of the CI build is that they're not smart enough to know the difference between the copy and paste example above and VALID uses of copy and paste.

So you see a call to Foo(bar, baz, smoo);, then somewhere else in the code, possibly in the same file, you again see Foo(bar, baz, smoo);.

Is this a copy and paste? Or is this good reuse of code? (poor example, but you get the point).

I agree with Mat Roberts, code reviews are the only way to make this stuff disappear. I don't think you can rely on tools to catch this sort of stuff.

On a slightly different note, if you're in the job and you notice that kind of code lying around, be sure to refactor it if you can. As I said to lb a while back, I liken it to a small pile of rubbish on a building site. If you don't clean it up quickly, then the pile of rubbish will get bigger and bigger as everyone else thinks that it's ok to throw their rubbish there. Before you know it the job of rubbish removal is so big it's ignored and left for the owner to clean up.

If you see copy and paste, remove it before a stack of other programming plebs come in and follow the example. Trust me, it happens.

I've done it myself ;)



'John' on Tue, 27 Jan 2009 11:33:28 GMT, sez:

Thomas - of course you are correct! Stupid lazy (avoiding work now) is bad; smart lazy (avoiding work in the long run) is good.



'lb' on Tue, 27 Jan 2009 17:47:30 GMT, sez:

@(Thomas and John)
"Stupid Lazy versus Smart Lazy"
Well i'm half way there ;-)

@Todd
re: "all code rots over time"

While this is consistent with what I've observed -- i know that it's not actually true!

Code never rots or rusts... any decay that sets in is our own doing. We put it there. We are the termites that destroy it over time -- not some other "out of our control" effect.



'freida' on Wed, 04 Feb 2009 06:14:41 GMT, sez:

Doesn't hurt to try.

Mearbag...is that 'lingo' for "me air bag"...

or something...

Or, maybe me at bag...and I got can't see as good as I used to?

I'd sure like to blow this up, all out of proportion, for some reason...

What a concept...reason 'is'...and 'why?'

So it's really meatbag, lol.






'freida' on Wed, 04 Feb 2009 06:22:57 GMT, sez:

"All code does not rot over time"

It's the same today, as it was yesterday, and will be forever and ever.

The lasting impression of 'i' and 'o'...



'freida' on Wed, 04 Feb 2009 06:33:05 GMT, sez:

Now I'll try entering 'the word' without the http...



'freida' on Wed, 04 Feb 2009 06:34:41 GMT, sez:

That was fast, I'm sure impressed, with some 'thang' either a person or a program, not sure which.



'freida' on Wed, 04 Feb 2009 06:41:26 GMT, sez:

I wonder if I'll remember 'this trick' tomorrow?

One thang's for darn sure, I have a lot of respect for those farsighted, forefathers, that sat and entered code endlessly all day and night, more than likely, probably.

Reminds me of those men that pick up our garbage everyday...

Yet, some develop muscles, and that's a good thing...

Rather it's your body or your brain.

Nice to see a video and relax with some 'moral to a story' or read a good book, if your eyes are still able.



'kristen' on Sat, 07 Feb 2009 22:18:15 GMT, sez:

dont copy this poem



'Seth' on Wed, 22 Apr 2009 19:54:14 GMT, sez:

dont copy this poem



'yoda' on Thu, 23 Apr 2009 18:51:07 GMT, sez:

"It's a fractal! The more you zoom out, the more you see the same pattern copied, pasted, modified, and reapplied over and over at greater and greater levels of scale."

Wow! That's exactly how I recently described some code that I'm currently maintaining to a friend of mine!

It makes me feel better to know that I'm not the only one suffering with this kind of horrible code.

Thanks for a great article!



'Alex' on Mon, 27 Apr 2009 18:02:17 GMT, sez:

Your proposed solutions, while amusing (and most likely effective), are all based on the "paste" action, with no regard for whether the clipboard text was previously "copied", or simply "cut". Copy/paste is a bad idea. Cut/Paste makes refactoring plausible.

Unfortunately, there's a hole in my idea too: even if copy/paste was detected, and cut/paste ignored, there would be nothing to stop the user from cheating the system by doing a simple cut, paste to same area, paste elsewhere (essentially a copy with an extra step). If IDEs could detect that and respond accordingly, though, it'd be gravy.
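
The cut-then-paste-twice loophole can be closed with a small amount of state. The sketch below is a guess at how that might look in C#: the `ClipboardTracker` class and its event methods are hypothetical, and it assumes the IDE can tell the tracker about every cut, copy and paste.

```csharp
using System;

// Hypothetical tracker: counts only the pastes that duplicate code.
public sealed class ClipboardTracker
{
    // True when the clipboard text already exists somewhere in the code.
    private bool _textExistsInCode;

    // Number of pastes that created a second copy of some text.
    public int DuplicatingPastes { get; private set; }

    // Copy leaves the original in place, so any paste will duplicate it.
    public void OnCopy() { _textExistsInCode = true; }

    // Cut removes the original, so the first paste merely moves it.
    public void OnCut() { _textExistsInCode = false; }

    public void OnPaste()
    {
        if (_textExistsInCode) DuplicatingPastes++;

        // After any paste the text is back in the code, so pasting
        // again (the "cut, paste, paste" trick) counts as a copy.
        _textExistsInCode = true;
    }
}
```

With this state, cut followed by one paste counts as zero, while cut followed by two pastes counts as one, the same as copy followed by one paste. The counter could feed whichever punishment is preferred.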



'Dave Kaye' on Sat, 06 Jun 2009 14:07:38 GMT, sez:

I'm on the fence. While this code could certainly be written better, is it bad code per se? Sometimes you have to bash it out like this to see what you "should" have done, and I rarely have time to do that. Whenever I get into the whole "couldn't I just make a function" mode, I'm up until 4AM doing something that should have taken me 15 minutes.

I tell my students (I taught programming for 8 years) that "Good programmers are truly lazy!"








 

 
home .: about .: sign up .: sitemap .: secretGeek RSS .: © Leon Bambrick 2006 .: privacy
