The Virtue of Junk Code

Brett Schuchert wrote a fine article yesterday on the subject of refactoring, or cleaning up your code. And though I agree with most of his arguments against junk code (or what he calls "bit rot"), I believe it is time for someone to stand up in defense of the millions of developers who cannot seem to get around to cleaning up their act, or their code.

Rejuvenate Your Code
Brett compares software systems with the human body. His argument is that most cells in the human body are replaced periodically, so why don’t we keep rejuvenating our code? Brett argues that humans have to sleep, exercise and eat healthy food to stay in shape. And programmers –who are often not the first to care about the shape of their body– should pay more attention to regular health and fitness processes that would keep their code in good shape.

Now, I’m not going to argue with this using simple arguments like "lack of funding" or "lack of time". These issues can be solved with proper time management and customer expectation management practices. I am more interested in the intrinsic problems of refactoring processes. And here’s what I think:

Brett Schuchert has the analogy all wrong!

Most Software Doesn’t Grow, It Evolves
You cannot compare the life cycle of software systems with the life cycle of a human body, because human bodies don’t evolve. I wouldn’t mind if my body could evolve better eyesight, a faster brain, or any of the enlargements promised to me in the email I receive daily. But alas, such improvements are out of my reach. The human body is static, with only minor parts being renewed. It grows, but it is not improved.

It might be better to compare software systems with the human genome, not the human body. The genome of the human species has been written and updated by nature over a period of 6 million years.

Has the human genome ever been refactored? Hell no!

Junk DNA and Junk Code
Almost 90% of the human genome consists of Junk DNA. It serves no purpose. It contains repetitive genes for redundancy, parasitic and viral genes, genes that seem to have lost their function, and genes that might simply be a basis for further evolution. There is no biological process to remove junk DNA from the human genome. And why should there be? What harm is there in having a bloated genome? Human beings seem to function fine, thank you very much. Obviously, nature has decided that the cost of removing junk DNA, and the cost of the minor inconveniences of carrying it around, do not outweigh the benefits of having it readily available.

Likewise, most software systems in this world contain a lot of Junk Code. They contain repetitive code for redundancy, parasitic code (bad and useless code attached to good code and being copied around with it), old code that has lost its function, and unfinished code that might someday be the basis for further evolution. What reason is there to remove all this junk from your software systems? Your systems are doing fine, are they not? Then what harm is there in having a bloated code base? Has anyone actually proven that continuously removing junk code from software systems costs us less than carrying it around?

Nipples, Teeth and Eyes
Male nipples have no use, but nature has decided against refactoring the male nipple out of our DNA. It’s simply not worth the trouble. (And I know some men who are glad it turned out this way.) So why remove code that serves no purpose, when it’s not in the way?

But junk code can be harmful, I hear you say. Well, sure. We don’t need wisdom teeth either, and they can be harmful. Having a smaller, weaker jaw allowed us to grow larger brains, but it has left less room for molars. And the consequences can sometimes be fatal! But apparently, it is even more trouble to cut the wisdom teeth from our DNA. If your code sometimes gives you trouble, are you prepared to go through the pain of refactoring it? Have you estimated if this is really going to cost you less than the trouble of keeping the bad code around for another hundred years?

But some designs are bad, and should be improved before anything else is built on top of them, I hear you say. Well maybe, but maybe not. Our vertebrate eye has a blind spot where the wiring goes through the retina. It is a bad but workable design. And nature has decided to let us keep our inferior eyes, while other animals got much better ones. Do we care? I don’t. I’m more envious of dogs being able to reach lots of places with their own tongue. But the human spine doesn’t allow for that. Scientists agree that the spine is one of the worst designs of the human body. I agree. (And I question the scientists’ hidden motivations.)

Don’t Fix It
"If it ain’t broke, don’t fix it," is what people sometimes used to say, before the refactoring hype took over. There’s a lot of common sense in refactoring. Yes, I sleep regularly, I do work out (occasionally), and I eat healthy food (sometimes). But I believe a lot of people’s refactoring efforts fly in the face of that other well-known agile principle: defer decisions. If you don’t have to add a new feature, then don’t. If you don’t have to fix a bug that nobody notices, then don’t. If you don’t have to remove junk code that doesn’t harm you, then don’t.

All problems will disappear, if you wait long enough.

This has been a favorite phrase of mine for years. I tend to postpone activities when nobody really needs them done. There is a good chance that problems cease to be problems if you give them some time. I was happy not to have replaced a bad bicycle saddle with a new one, before my bike was stolen a week later. And I am quite happy not to have spent time on refactoring code in a troublesome system that was taken out of production for unrelated business reasons a short while later.

It appears that I’m not the only one who cannot be bothered to tune my code to near perfection. Just run FxCop on any of the .NET assemblies and you will be presented with many thousands of warnings and even critical errors. I guess the Microsoft developers had other priorities as well. Refactoring their code to conform to Microsoft’s own coding standards was, apparently, the least of their problems.

Refactoring is the practice of improving existing code.

The idea behind refactoring sounds good. On paper. But most software systems evolve. They don’t just grow. And there’s a difference. Evolving systems change their purpose. Growing systems simply renew themselves. Refactoring is a sound idea for growing systems. But for evolving systems you have to take two things into account:

  1. The effort of refactoring code might cost you more in the short term than it will gain you in the long term. Don’t expect improvements to pay for themselves. Many of them won’t.
  2. Cleaning up code will not only prevent your junk from doing any harm. It will also prevent you from enjoying any (often unexpected) benefits in the near future.
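The trade-off in the first point can be put into numbers. What follows is a minimal Python sketch, not a method taken from this post: the function names `break_even_months` and `refactoring_net_benefit`, their parameters, and every figure used are hypothetical and chosen purely for illustration. It assumes the code has a limited remaining lifetime (because the system evolves or is retired) and asks whether a clean-up would pay for itself within that time.

```python
def break_even_months(refactor_cost_hours, hours_saved_per_month):
    """Months the code must stay in use before the refactoring pays for itself.

    Both inputs are hypothetical estimates, not measured data.
    """
    if hours_saved_per_month <= 0:
        # Junk that costs nothing to carry around never pays back its removal.
        return float("inf")
    return refactor_cost_hours / hours_saved_per_month


def refactoring_net_benefit(refactor_cost_hours, hours_saved_per_month, remaining_months):
    """Net hours gained (positive) or lost (negative) by refactoring now."""
    return hours_saved_per_month * remaining_months - refactor_cost_hours


if __name__ == "__main__":
    # Illustrative numbers only: a 40-hour clean-up that saves 2 hours per month.
    print(break_even_months(40, 2))            # 20.0
    print(refactoring_net_benefit(40, 2, 6))   # -28
    print(refactoring_net_benefit(40, 2, 36))  # 32
```

With these made-up numbers, a system that is taken out of production after six months never earns back the clean-up, while one that lives for three years does. The hard part in practice is that both the savings and the remaining lifetime are guesses.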

It’s your choice!

  • wisher

    I think that in a perfect world all of our code should be refactored. In an evolutionary context we can use refactoring in order to move evolution in the direction we desire.
    I agree with you when you say that sometimes refactoring is too time-consuming to be worth it, but this is because a refactor on a bigger scale is pushing off the feature from our software or even the software from the platform.

  • Jurgen Appelo

    Wisher, thanks for the input. Though I think I don’t agree. Even in a perfect world (setting aside the cost of refactoring) it can still be better not to remove junk code.
    Junk code might enable stability (through redundancy) and new opportunities in innovation and evolution. Just like junk DNA does for the human genome. You lose that when you strip away the junk.
    I believe junk can have many hidden purposes, though I admit that scientists don’t agree on the extent to which junk can be beneficial.

  • Max Pool

    Personally, I feel both you and Brett are correct. I view software like a garden. It needs weeding, trimming, and replanting.
    Software that organically grows without refactoring becomes bloated, and not all software needs to be replaced simply because it is ‘old’.
    Fantastic post Jurgen!

  • wisher

    @Jurgen: I think that achieving stability through redundancy isn’t a really good idea.
    When you have to evolve your software, you must change the source code of each redundant component you maintain.
    On the other hand, innovation is too often limited by ‘legacy’ components – that prevent us from designing the best solution for the problem – without having to play with junk code. How can I think that being constrained by junk code could offer new opportunities?

  • Hoba

    Great post and great comments!
    I really don’t get why people always have to draw hard lines. Why not use both solutions?
    If it makes sense to refactor code, do it. Otherwise don’t.
    Personally I would not encourage anybody to do something like “oh boy, I have no current task or some free time on my hands. What do I do? I know, I just start refactoring some old code for no apparent reason”.
    Also I have seen some refactorings just because someone else liked a different approach for the same solution. So the same code got refactored at least twice and might get refactored again without much visible benefit.

  • Ngu Soon Hui

    I agree. You can defer refactoring as long as you like. But just in case you need to do it later, you should always have a set of unit tests to back you up.

  • Jurgen Appelo

    Hoba, Max, thanks for the compliments! Naturally, the truth is somewhere in the middle. I’ve just seen too many glorified descriptions of refactoring to resist the temptation to counterbalance that somewhat.
    @Wisher: I don’t blame you for feeling that legacy code is constraining you. I often feel the same. You might compare it to junk in the attic or in the basement. You don’t want to see it. But you don’t want to throw it away either.

  • Frank Silbermann

    If evolution were guided by a single intelligent but intellectually limited Designer, then He would have to refactor the human genome or it would become too complex for Him to comprehend. However, millions of random mutations and sexual interchanges, combined with natural selection, allow the human genome to evolve despite its incomprehensibility to the human mind. A process based on human evolution is what you’d need if human evolution were the basis of your refactoring strategy.
    Unfortunately, I don’t think your boss would approve a software process based on human evolution. Such a process might require, for each application, a million programmers each forking that application in a different random direction, each selling each version to a single customer, and killing those versions whose customers went out of business.

  • Jurgen Appelo

    @Frank: biological evolution is not the only kind of evolution. Languages evolve, businesses evolve, society evolves, and software processes evolve.
    Sure, we can learn from biology. But what nature does randomly, and which takes millions of years, we can do much faster using conscious planning, foresight and anticipation.

  • Artem Marchenko

    I’m afraid the DNA analogy is not close enough. If you happen to be a docker and move 5 tons of cargo every day, your body is likely to refactor its muscles and make them adequate for the load.
    I believe a good attitude to software refactoring should go along similar lines. There has to be a reason for removing the junk. Though, naturally, if your system expects long-time evolution you might like to prepare in advance, just the same way as it is a good idea to stop drinking whiskey and train the body before becoming a docker.
    Those insisting on permanent, ruthless refactoring just believe that there will always be change and that it usually pays off to train the muscles a bit every week.

  • Huperniketes

    I’m surprised that an intelligent and insightful thinker such as you is seeing refactoring through the eyes of the hammer. Perhaps it is in reaction to the hype surrounding its virtues. I certainly wouldn’t blame you in that respect. The refactoring evangelists promote the practice of refactoring with its best use-case, that of improving a code base by making it more focused on its application.
    But the true point of refactoring isn’t the why or when of moving code, merely the how. Just as knowing the ways various chess pieces are moved doesn’t guarantee that each move will be made optimally, refactoring doesn’t pre-determine the quality of the outcome. It is merely an enabler, so that when you see an opening or an opportunity for improving code in whatever quality you deem important you have knowledge of the path(s) to get there. Knowledge is indeed power, and knowledge of the what and how is tactic, knowledge of the why, when and where is strategy.
    As for “junk” DNA, that’s an issue entirely unrelated to whether one should practice refactoring or not. As in the development of the movie Toy Story, some elements proved to be a burden to the flow of the film. But rather than discard them entirely, the story team kept the ideas on the back burner. When it came time to do a sequel those elements were finally used, which ended up making both films much stronger.
    For the audience’s sake, the developers, don’t burden them with unnecessary elements in the code. Leave them behind in the SCM repository, or better yet in some snippets database, where they can be safe and yet out of the way of current work. And should you have need of them for some other project, you can weigh their usability on their own merits.

  • Jurgen Appelo

    @Huperniketes: Thanks for your insightful comment!
    I’m not the only one to see a certain intention behind the practice of refactoring. Just look up its definition on Wikipedia and you’ll see that refactoring has a purpose: *improving* the design of existing code. Martin Fowler even used this as the subtitle of his famous book!
    So yes, there is a “why” behind refactoring, not just a “how”.
    I agree with you that unnecessary code can be kept in the SCM repository. But it takes an effort to remove it. And to add it back later. I’m afraid your analogy with Toy Story falls short, because people will see everything that is not refactored out of a movie. (Like the car that can be seen in Lord of the Rings.) And movies, once they are released, are dead. They are not growing systems. Software is different. I can hide anything inside my code without you ever noticing it as a user. And my software keeps growing after its initial use.
    Thanks for your contribution!

  • Adam Johnson

    Your comments are intriguing but I can’t help thinking that the analogy is somewhat ‘off’. I spend a lot of time refactoring and I do believe it to be useful in general. Speaking as an observer it’s easy to forget that someone has to deal with all this code eventually. Couple this with the fact that some of your staff have drifted out through normal turnover so some of your staff will be a little wet behind the ears. Now factor in that favourite gem – 95% of our codebase does nothing. Those new programmers are going to have a hell of a job trying to figure out what’s really important, and that’s before they even get down and dirty trying to change something. Refactoring is also about keeping the codebase ‘relevant’ and making sure the codebase remains fluid enough to adapt to changing circumstances. Duplicated code is the greatest sin as this leaves you with multiple questionable implementations where you’re never quite sure that every patch made it to every copy. This leads to endless similar bugs which simply won’t go away. The users also start to get wise and may well just smell this duplication if every similar bug isn’t fixed; this erodes customer confidence.
    If we’re to believe the scientific community then the human genome has grown up over millions of years through a massive process of trial and error…liken this to millions of programmers randomly changing one line of code at a time and compiling. Sure, eventually someone will get that new feature right, but you might be waiting another million years for it.

  • Neil Kandalgaonkar

    I am working on a project which has accumulated a lot of “junk code” as you might call it. Based on my recent experience I do not find your analogy holds.
    Evolution is an undirected process in which 99.999999999999% of the changes made are harmful. It still succeeds because of the virtue of reproduction; many parallel experiments are running at once.
    Except for genetic algorithms, we cannot hope to repeat this success in software. We want a very low percentage of harmful changes. Furthermore we want to achieve certain functionality in a given time period.
    You are right that “dead” code is not actually detracting from the quality of the product. I don’t think anyone has ever claimed that refactoring improves functionality; the very definition of refactoring implies no change of functionality. We refactor in order to support *future* change better. Partially that is accomplished through decoupling and other organizational tricks. But the most important thing is that we refresh our understanding about what the code is really doing.

  • Jurgen Appelo

    Neil, only the mechanisms of evolution differ, but the goals do not. I believe the goal of adaptation in biology is the same as the goal of adaptation in software development. It is completely beside the point that mother nature achieves this through random mutations and software developers through anticipation.
    My message stays the same: we have to adapt to survive, and there might be lots of value in the errors we keep carrying around.

  • Stefan Billiet

    This post has left me conflicted.
    Deferring decisions is all nice and dandy when everyone pulls their weight. I’ve been working in a new company for about 2 months now, and it took the (overly bureaucratic) management somewhat long to decide what kind of work to give me. So I started to take ownership of a legacy project, which has been left to its own devices for about ten years. It has evolved past more platforms and languages than I would care to point a large pointy stick at; it started in Assembler and wound up in C#.
    It is the prime candidate for the description “Big Ball of Mud” and I honestly believe that a big part of the reason for this stems from all parties involved deferring the decision of just diving in and pulling the bloody thing straight indefinitely.
    Jurgen, you yourself have claimed that a project’s success is simply the postponement of its failure. Pre-emptive refactoring (within reason) is one of the straightforward ways to postpone failure.
    If my predecessors had shared this vision, I could have understood certain pieces of code in 15 minutes, instead of a day :-p That totals to about 80€ in wages alone that that particular deferred decision has cost the company.

  • Sraf

    Removing junk developers is more urgent than removing the junk code they produce. The problem has to be fixed at the source.
