Two common coding weaknesses

Posted 27 Oct 2000 at 16:42 UTC by apm Share This

When I review code, especially code from less experienced programmers, there are two issues that come up again and again. (Actually, with 25 years of programming, almost everyone is less experienced! Of course, that doesn't mean I'm any good - just that I've been doing it a long time.) These two common coding weaknesses are: naming and duplication.

Since neither of these issues necessarily affect the functionality of the code, cleaning them up falls under the "refactoring" umbrella. Since they are not functional issues, why are they so important? I think they're important because most code gets "read" and "maintained" much more than it gets "written", so anything that improves the readability and maintainability is important. It might seem contradictory, but I also feel you should write code as if it is "temporary" or "transient", since most code that you write will end up getting rewritten, often several times. But this isn't really a contradiction. Extreme Programming tells us "do the simplest thing that could possibly work". In other words, plan on incrementally improving your design, a little bit at a time. So any given line of code will likely be transient, but the system at a whole is likely to last much longer than you might think.

Some of the naming problems I run into:

  • calling the same thing by different names in different parts of the code

  • calling different things by the same name in different parts of the code (An exception to this is the use of generic names like i and n and x. My rule is that this is okay if their use is limited to a small area of code, so it is easy to see what they refer to.)

  • calling something by a misleading name e.g. calling a function Check_Totals when it actually does the totalling and doesn't check anything, or calling something "prompt" when it's actually the field name, not a prompt.

  • calling something by an uninformative name e.g. calling a function "Process" or "Calculate" which doesn't really tell you anything about what it does

  • excessive abbreviation (e.g. rpt instead of report) Even if the abbreviation is understandable, it's often harder to remember which abbreviation you used, than it is to simply write out the whole word. (e.g. was that rpt or rep or rprt ...) (Someone suggested we start a "vowel movement".)

  • unnecessarily long names (e.g. Print_History_Report - Print_History or History_Report are probably sufficient)

  • inconsistent naming i.e. not having a "system" for your names (e.g. cur_amount & amount_ytd should probably be either cur_amount & ytd_amount or amount_cur & amount_ytd) Again, it's not just the matter of readability, but also write-ability i.e. is it easy to remember the names.

Some of the duplication issues I run into:

  • putting the same code into both branches of an if-else, for example:
    	if (condition)
    		something
    		stuff1
    	else
    		something
    		stuff2
    
    is probably better written as:
    	something
    	if (condition)
    		stuff1
    	else
    		stuff2
    

  • putting the same code all the places you call a function, for example:
    	something
    	func()
    	...
    	something
    	func()
    
    is probably better written by putting "something" into func()

  • not putting common methods into a base class (in object-oriented code)

  • repeating a complicated expression instead of assigning it once to a temporary variable
    	print(invoice.customer.address.street)
    	print(invoice.customer.address.city)
    
    is probably better written as:
    	address = invoice.customer.address
    	print(address.street)
    	print(address.city)
    

  • two (or more!) functions that only differ in very small ways, where the difference could easily be made into an argument passed into a single generic version of the function

My advice to programmers - if you find yourself cutting and pasting code (or even worse, typing the same code) this should be a sign that you should look for ways to avoid the duplication.

The thing to remember is that coding is an iterative process. You won't necessarily get all these things right the first time you write the code. In fact, trying to write code "perfectly" the first time is a mistake, because you would be investing too much time in code that is very likely to get thrown out later. Instead, get in the habit of reviewing your own and each others code, and cleaning it up as you work on it.

For a much more thorough and well thought out treatment of these and many other issues, I highly recommend Martin Fowler's "Refactoring" book (Addison-Wesley, 1999).

And since I can't resist the opportunity for shameless self-promotion - check out my open source project Suneido, an integrated object- oriented language and client-server relational database.

Andrew McKinlay
mckinlay@suneido.com


The GNOME Programming Guidelines, posted 27 Oct 2000 at 21:51 UTC by federico » (Master)

The GNOME Programming Guidelines describe the preferred way of writing code for GNOME. This is not just a coding style standards guide; it describes issues pertaining to consistency, maintainability, extensibility, correctness, and security. You may find it interesting.

but don't refactor too much, posted 27 Oct 2000 at 23:58 UTC by sej » (Master)

I'm of the opinion that most any halfway capable programmer devotes a lot of thought to increasing the consistency of their naming strategies and maximizing their use of functions and methods. At least I did when I was halfway capable. And I stumbled on the once-and-only-once ideal a long time ago and saw the power of it. These fundamentals were somewhat overlooked with the initial hype that arose around object-oriented programming, but they were what was important, and finally people are writing books about it.

But I think there are more caveats to the refactoring methodology than doing it too early. You also don't want to do it too much. Sometimes it is easier on the aesthetics (and logistics) to have replicated blocks of code that call a series of methods than to combine them (and all their possible parameter variations) into a single method. At least for C-like code without keyword defaulted arguments. And everyone probably knows an object design they liked less after it went through extensive refactoring to eliminate perceived redundancies.

sadly, many programmers don't think about what they do, posted 28 Oct 2000 at 01:15 UTC by apm » (Journeyer)

I would have to say that by your definition there are an awful lot of programmers that aren't to the "halfway capable" stage. I don't mean this in a critical or cynical way. And certainly there are a lot of good programmers out there. And anyone who's are reading the kind of stuff you're referring to is likely well past halfway capable. What prompted this article was that I just finished reviewing a pile of code from one of my programmers and it seemed like I kept running into the same "mistakes" over and over.

I totally agree that over-generalization can be just as big a problem, and that refactoring too early is also a mistake. But I seem to run into these less often than simple naming and duplication issues. If they haven't got past these simple issues, they likely won't really understand the deeper issues.

overuse of flags, posted 28 Oct 2000 at 03:47 UTC by forrest » (Journeyer)

I'm in a situation where programmers are adding/changing a mature body of code, where many programmers of differing skill levels have come and gone.

When adding to this mess, some of my fellow programmers absolutely love to use boolean flags. It often happens that code in place A determines some flow control that now has to happen in place B. When the code was originally written, there was no conceptual relationship between A and B, but changing requirements have brought this relationship about.

So, add a flag! It's easy, and it works. The problem is, soon you are trying to juggle 4 or 5 of these flags in a module ... and it doesn't help that these flag-loving programmers aren't too good with the consistent-naming concept either.

Sometimes, a flag is actually what you need, but in well over half these cases, A and B can be made adjacent by a very minimal amount of rewriting. The only reason they are apart is because there was no requirement for them to be related before.

Is this a common problem that others have seen, or is my working environment just special?

Coding standards, posted 28 Oct 2000 at 04:09 UTC by Skud » (Master)

An article I wrote about coding standards can be found at perl.com. It may be of interest.

those seductive flags!, posted 28 Oct 2000 at 19:02 UTC by apm » (Journeyer)

I've run into the flag problem as well. It seems so innocuous - until, as you say, you have 4 or 5 of them and it becomes impossible to figure out what's going on. Very occasionally they're the best solution, but in most cases there is a better way. Sometimes you can use derived classes and virtual functions, or the "State" Design Pattern (GOF), or sometimes you need to recognize that you really should have a state machine, or sometimes you can just redesign the code so there aren't so many special cases.

The Flags of Death, posted 29 Oct 2000 at 04:54 UTC by nelsonrn » (Master)

The last time I saw flags overused was at HP. The HP-41 processor only had four levels on its subroutine call stack. Call too deeply and you toast the stack. Very easy to do. So you can't just create a subroutine out of repeated code. Instead, you have to retain state in flags. Made the code extremely hairy and tense.

Yes, you shouldn't have more than one or two flags passed into a subroutine. More than that and your code is structured wrongly. Split that subroutine into two subroutines calling the code that was executed using the flags. The basic problem with flags is the same problem with ifdefs -- it's very hard to follow the sequence of instructions if you have to retain a lot of state.

In fact, I'd go so far as to say that ifdefs should *never* be used. Instead, arrange your makefile so that it includes one file full of conditional code instead of a different file. That way, you know the semantics of the code and you don't have to worry about "If this ifdef is turned on, then we're searching for things that aren't in the set, as opposed to the other sense of the ifdef."
-russ

pointers are evil, security bugs, posted 29 Oct 2000 at 11:59 UTC by ajv » (Master)

The biggest single problem when I review code is that programmers try to simplify at the expense of maintainability.

    for ( p = fgbuf; *(p++) = ++t; i *= 2) ;
    if ( *p )
       p++;

is the sort of thing that drives me nuts. It's too terse. It involves pointers and subtle buffer overruns that only occur sometimes.

I suggest the use of Steve McConnell's Code Complete for a really good coding how-to.

Re: pointers are evil, security bugs, posted 29 Oct 2000 at 15:53 UTC by sneakums » (Journeyer)

Pointers are not evil per se. Random unchecked use of pointer arithmetic is evil.

re: overuse of flags, posted 29 Oct 2000 at 19:11 UTC by sej » (Master)

I've found flags to be necessary in the evolution of non-object-oriented code. The need for them seems to dissipate with object-oriented code, because specialization of generic behavior is what OOP is all about.

Effect you're missing, posted 29 Oct 2000 at 22:50 UTC by Fyndo » (Journeyer)

Something else to keep in mind, is that an outside reviewer will almost always spot problems in the design of code that the orignal author misses. This is just due to familiarity, The 2 different names may have made sense at the time, (possibly were actually 2 different concepts, but soem later revelation came up with a way to unify them), and they know both parts of the code so intimately, that they don't need to thnk what identifiers they used. Admittedly an experienced programmer will catch these things more often, but the origin of stylistic failures isn't always ignoarance of stylistic guidelines, but often knowledge of the code preventing the programmer from seeing that it's unreadable, as she doesn't need to read it, really.

Certainly this happens to me in my writing, I skim right over typos, repetitions of words, etc, because I know what I meant.

My intent isn't to gloss over the importance of good style, just to question whether teaching people what is good style, or teaching people how to recognize bad style that snuck into their code, or teaching people to have someone review their code more often is more important.

re: overuse of flags, posted 30 Oct 2000 at 00:18 UTC by forrest » (Journeyer)

sej:

In the codebase I'm working with, some flags are definitely necessary.

The problem is, it's some programmers' first recourse to add a flag, when it should really be their last. When you know you're probably going to have to be juggling some flags, it makes it more important not to add them gratuitously.

Re: overuse of flags, posted 30 Oct 2000 at 02:57 UTC by stefan » (Master)

sej: you seem to suggest that OOD is a priori superior to other techniques for parametrizing code, or making it polymorphic. Please have a look at the wonderful "Multi-Paradigm DESIGN for C++" by James Coplien. It explains in great detail analysis techniques such as commonality and variability analysis and as well as solution domain analysis. Preprocessor and compiler are both used to provide [compile time, link time, run time] polymorphism. Every such technique has its place, depending on your exact requirements.
Take the berlin project, it is very object oriented. Yet there are a couple of places, where we use compile time switches to map a common API to a variety of underlying layers, such as the console (GGI, SDL, ...) . Doug Schmidt's ACE framework is another wonderful example of a healthy mixture of these techniques.

Biggest Weakness = Doesn't Learn From Past Mistakes, posted 30 Oct 2000 at 18:53 UTC by mrorganic » (Journeyer)

Programmers are the worst professional class I've ever seen for not learning from past mistakes. I've had to give virtual (and occasionally literal!) head-knocks to coders who consistently make the same dumbass mistakes (dangling pointers are the most common). It's like they're "blind" to certain flaws in their code; they continue to use the same approaches to problems even when those approaches have failed in the past. And they tend to brute-force problems rather than sit back and think out a design before coding.

Ultimately, the question of "quality" is not related to the writing of code; it is related to the behavior of the programmer himself. In my view, this is the partially the fault of CS courses in colleges: they train programmers to write code, not to design software. And it shows.

Look at this from birds flight, posted 31 Oct 2000 at 00:47 UTC by Malx » (Journeyer)

I have read all this message and see how it resembles talks of authors, painters - any atrtists

You are talking about how to present computer-control language the way people will understand it at once.

I thinks this is mistake.

You need different languages to present different data!!! In better CS courses you'll lean about 4GL languages for data flow description. Or about Eifell :) for better and cleaner way to design programs.
But still it is mix of data!! Just remember that you still need to write documentation to describe your project. ;-) and just remember - are you happy to write it after program is completed?!

What is needed:
1) Description of goals - this must be metalanguage (English will not help you any more - it must be language with the only interpretation and complex enough to present all goals of humanity :). So it will be easy to convert it to any human language (it is one way automatic conversion so... developing THING in this language would require programmer + writer + artists + linguists skills, but inline help with context-depending lists of meaning(for Natural-lang word) will help).

2) Description of types - here would be son of OOP...:) Have a look at Yahoo or Odigo catalogues. They are step to all-defined data types with all-defined properties names and types and value ranges. So all will be in same (1) metalnaguage(UniLang?). So you will have no need to make notes, docs, lessons etc. to tell what you mean with this names. Also easy extensions and aging must be considered.

3) Descriptions of algorithms - here I would point out that algorithms must be separated from API!!! The algorithms developers are not API developers at all! They are mathematiotions really. And.... you need a way to present properties of algorithm (ranges of values, speed, number of iterations for different types of data). In simple case just table of test - wich algorithm is best for wich OS/data/delays etc... It would be automatically choosen for goal set in (1).

4) Configure - just dependencies to environment. Choosing better definitions, data representations of data types (ex. - encodings for given Natural Language), algorithms (performing needed test on system software is installed to choose one is better suite the goal - speed/quality/stability).

5) User interfaceses - same time it would be program Help , but compiled to English or Russian :). Also you see - there is no difference are they User interfaces with English help or Programm interaction interfaces with sources.

6) compilers - Just choose better implementations of algorithm for given processor/architecture. With all neded tests and list to choose from (still (1) goal will guide you here).

7) Converters - just reorganising data - ways of gating one information from other. Also it is not converting actually... for example you could convert DATA(image) to DATA(height) - and get linear value, represented in any sutable way -> Meters/Pixels -> byte/Integer/float

---------------------------
Ok it is just a knowlege base :)

Until it will be build - there would be discussions of the sort above shows. Just representations misunderstanding...Current programming languages(a) are not suted for human-reading. Current docs/manuals(b) are not suted for automatic processing. And the only compilers of (b)->(a) is human :)

The thins to solve is - to find a way of building this system so it will be usefull from the first week of it's existence. Nevertheless it will be redesigned and rebuid from scratch during next 5-10 years It must save all data.

Sorry for mistakes in English ... I just one of those persuaded to learn English to be able to learn programming. You will never do it in other language... it's pity :-/

re: overuse of flags, posted 31 Oct 2000 at 19:31 UTC by sej » (Master)

stefan said: you seem to suggest that OOD is a priori superior to other techniques for parametrizing code, or making it polymorphic

I was trying to make the point that with mature C code I find myself adding global variables to get custom behavior, where with mature C++ code I find myself adding derived classes to get custom behavior. I do prefer the latter activity. Other than that I'm open to any kind of parameterization that works.

Misnaming of variables, posted 1 Nov 2000 at 19:54 UTC by jkdufair » (Apprentice)

Great topic!

There's a guy about 10 feet away from me that names his production variables things like "bud" and "weiser". Or I see a lot of "temp1", "temp2", "temp3" in production code. Sometimes I feel like I need a dipshit neutralizer.

re: over-use of flags, posted 11 Dec 2000 at 15:49 UTC by jay » (Journeyer)

Another helpful thing when refactoring or throwing functionality away (e.g. just to simplify things, or to reuse the code in another project) is to make the compiler/linker do the work.

In other words, avoid flags which allow the programmer use use inline calculated values like foo(p, bar!=baz) or variables foo(p, bazflag). The point here is that when cutting away what you don't need, it's really useful to rename/remove a function and see what drops away.

One feature I used to like about the Whitesmith linker it that it would list "deadwood" for you.

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!

X
Share this page