My fascinating question for today:
What should malloc(0) return?
Composing a critique of somebody else's crappy half-baked stdlib
implementation today caused me to take another look at my
own crappy half-baked stdlib implementation. In part, my critique puts
the memory allocation functions malloc(), free(),
calloc(), and realloc() under the microscope.
One of the four is easy:
calloc() is really just malloc() in disguise. The only wrinkle is to
remember to call memset() afterwards, which the somebody else in
question forgot :-(. The other three, however, engage in an
interesting dance.
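To make the "malloc() in disguise" remark concrete, here is a minimal sketch of calloc() built on malloc() and memset(). The name sketch_calloc is made up so that it doesn't clash with the real function, and the overflow check on nmemb * size is my own addition to the description above, not something taken from either library under discussion.

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* memset */

/* Hypothetical name, chosen to avoid clashing with the real calloc(). */
void *sketch_calloc(size_t nmemb, size_t size)
{
    /* Assumption: refuse requests whose byte count would overflow size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);   /* the step that is easy to forget */
    return p;
}
```

Note that when total is zero this simply inherits whatever malloc(0) does, which is the subject of the rest of this post.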
The C99 standard has the following to say about malloc(0) and
friends, and the text in C90 is similar:
If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or the behavior
is as if the size were some nonzero value, except that the returned pointer
shall not be used to access an object.
I've always preferred the latter behaviour for malloc():
it avoids overloading the meaning of a NULL return, so, especially if your
size was a variable rather than a constant, you can say
"if malloc returned NULL, then abort due to out-of-memory" without having
to fudge about and check that it wasn't really a
returned-NULL-due-to-zero-size situation. (Of course, you could
say that people who actually call malloc() with a size of zero
deserve what they get!)
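The "fudge" looks roughly like the sketch below, which should work under either implementation-defined choice. The names is_out_of_memory and xmalloc are hypothetical, and aborting with a message is just one possible way of handling exhaustion.

```c
#include <stdio.h>    /* fputs */
#include <stdlib.h>   /* malloc, abort */

/* Hypothetical helper: does a NULL result from malloc(size) mean
   exhaustion, rather than a legitimate answer to a zero-sized request? */
int is_out_of_memory(const void *p, size_t size)
{
    return p == NULL && size != 0;
}

/* Hypothetical wrapper: abort on exhaustion, whatever malloc(0) does. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (is_out_of_memory(p, size)) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;   /* may still be NULL, but only when size == 0 */
}
```

If malloc(0) is known never to return NULL except on exhaustion, the "size != 0" test can be dropped, which is exactly the convenience I was after.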
That (the latter) is also the behaviour required of a C++ allocation function.
So if you're writing operator new() in terms of
malloc(), you have to know that your malloc() has
the same preference as I do, or take extra care instead of just
calling malloc().
Now let's look at some of the other requirements that C90 and C99 place
on these functions:
free (NULL) = ({})
realloc (NULL, n) = malloc (n)
realloc (p, 0) = ({ free (p); return NULL; }) when p != NULL
The first is just saying that free(NULL) does nothing, which is,
of course, very convenient -- you get to avoid some tedious checking.
The other two are parts of the definition of realloc(). They are
spelt out in the C90 text; the C99 wording for the last one is quite
different, but the rule can still be derived from it.
Now, the "when p != NULL" condition on the last rule is not very
pleasant: it makes for an ugly algebraic rule. Furthermore, I would contend
that programmers who actually write code that's literally
"realloc (p, 0)", with p variable but a constant 0,
expect realloc() to Do The Right Thing even when p is NULL,
just as free() does.
Thus I would contend that programmers actually expect that they are using
a stronger set of rules, in which the third rule holds regardless of the
value of p. (In particular, this is what the Linux man page says
realloc() does; yes, the man page is way stronger here than the
C Standard!)
In that case, we can calculate the value of malloc(0):
malloc (0) = realloc (NULL, 0)
           = ({ free (NULL); return NULL; })
           = NULL
So that means that the choice is really between:
- either a realloc() that Does The Right Thing for size=0
- or a malloc() that does what I've always preferred for size=0
You can't have both! So I've changed my mind: being able to write
realloc(p, 0) and have it Just Work, like free(p) does, is more
important to me than having malloc(0), which I never actually use,
be easily distinguishable from memory exhaustion.
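For the record, the stronger rule set can be sketched as a pair of wrappers around the library's own functions. The names strong_realloc and strong_malloc are invented for illustration; this is one way of getting the behaviour described above on any conforming implementation, not a claim about how any particular library does it.

```c
#include <stdlib.h>   /* realloc, free */

/* Hypothetical wrapper: the third rule holds for every p, NULL included. */
void *strong_realloc(void *p, size_t n)
{
    if (n == 0) {
        free(p);            /* free(NULL) does nothing, so no check needed */
        return NULL;
    }
    return realloc(p, n);   /* realloc(NULL, n) behaves as malloc(n) */
}

/* The second rule read right to left: malloc(n) = realloc(NULL, n). */
void *strong_malloc(size_t n)
{
    return strong_realloc(NULL, n);
}
```

With these definitions strong_malloc(0) is NULL by construction, which is the calculation above carried out by the compiler instead of by hand.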
In retrospect, this was all obvious. The specification of
implementation-defined behaviour that I quoted at the beginning applies to
both malloc() and realloc(). Wanting
realloc(p, 0) to be useful in cleanup functions (by just freeing
through p, and not also allocating some dummy "zero-sized" object
that the caller will ignore and will become garbage) means that you've already
made the choice:
realloc() given a size of zero will return a null pointer.
Either that means you've already made the choice for malloc() too, or you've
decided to make the choice differently for the two functions, which
would be really ugly and I'm not even sure is allowed by the Standard.
But playing with the algebra was fun while it lasted!
(Looks like I'm going to have to put that "malloc (size? size : 1)"
stuff back in my library's default C++ allocation functions.)
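In C terms, that stuff amounts to the helper sketched below. The name malloc_nonzero is hypothetical. A real C++ allocation function built on top of it would still have to deal with a NULL result itself (calling the new-handler, throwing std::bad_alloc), which is not shown here.

```c
#include <stdlib.h>   /* malloc */

/* Hypothetical helper: never passes a zero size to malloc(), so a NULL
   result can only mean that memory is exhausted. */
void *malloc_nonzero(size_t size)
{
    return malloc(size ? size : 1);
}
```

The pointer returned for a zero-sized request must still not be used to access an object; it exists only so that it is non-null and can be passed to free().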