3 Oct 2010 fxn   » (Master)

Ruby, C, and Java are pass-by-value, Perl is pass-by-reference

Call semantics in languages that manage references often confuse people. It is a recurring thread in the Java and Ruby communities. The reason is simple: "pass-by-reference" has the word "reference" in it, and thus people assume it has something to do with the language's "references". Not really.

In Ruby and Java "reference" is a term that is close to the concept of "pointer" in C. You have a handle that somehow points to something, rather than being that very something. The language may hide the indirection for you: usage is direct in Ruby or Java. Not so in C or Perl, where to go from a pointer/reference to the thing it points to you need explicit syntax such as an arrow.

When we talk about "pass-by-value", though, the fact that the language has references is irrelevant. We are really talking about where the value associated with the parameter's name is stored. In particular, we mean that it is stored in an area unrelated to the caller's storage. Let's see this.

Say we have an assignment


    a = 1

This assignment means that somewhere there's an association between the name "a" and the value "1", which is itself stored somewhere:


    +-----+       +-----+
    |  a  | ----> |  1  |
    +-----+       +-----+

If you assign


    b = 1

conceptually we get two associations to two different value storages:


    +-----+       +-----+
    |  a  | ----> |  1  |
    +-----+       +-----+
 
    +-----+       +-----+
    |  b  | ----> |  1  |
    +-----+       +-----+

In particular, if in the next line we change b:


    b = 2

we all know the situation becomes


    +-----+       +-----+
    |  a  | ----> |  1  |
    +-----+       +-----+
 
    +-----+       +-----+
    |  b  | ----> |  2  |
    +-----+       +-----+

Note that a still holds 1.

In languages like Perl you can have this other diagram:


    +-----+
    |  a  | --+
    +-----+   |    +-----+
              +--> |  1  |
    +-----+   |    +-----+
    |  b  | --+
    +-----+

In Perl jargon you say that a and b are aliases. In that situation, any assignment to a is reflected in b, and any assignment to b is reflected in a. Those names are associated with the same storage area.
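
Ruby cannot alias local variables, but it does let you alias global variables with the alias keyword, which is enough to sketch the diagram above in runnable form. The globals $a and $b are illustrative names only, and this says nothing yet about how arguments are passed:

```ruby
# Two names, one storage area: a sketch using Ruby's global-variable aliasing.
$a = 1
alias $b $a   # $b now names the same storage as $a

$b = 2        # assign through one name...
puts $a       # ...and the other name sees 2

$a = 3        # it works in both directions
puts $b       # 3
```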

The terms "pass-by-value" and "pass-by-reference" are about names linked to storage, and with those pictures you can understand what they mean. I am going to ignore scope to simplify this and use different variable names on purpose, so this is not exact, but the essence is there.

Say you have


    def foo(b)
      # ...
    end

    a = 1
    foo(a)

In a pass-by-value language the situation is:


    +-----+       +-----+
    |  a  | ----> |  1  |
    +-----+       +-----+
 
    +-----+       +-----+
    |  b  | ----> |  1  |
    +-----+       +-----+

The interpreter, or whatever runs your language, copies the storage area associated with "a" behind the scenes and associates the copy with "b". That's why a is unaffected if you reassign b inside foo.
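
A minimal sketch of that in Ruby, filling in the body of foo with a reassignment purely for illustration (the body and the variable result are assumptions, not part of the example above):

```ruby
# Reassigning the parameter rebinds only the method's own name b.
def foo(b)
  b = 2
  b
end

a = 1
result = foo(a)

puts a       # 1, the caller's variable is untouched
puts result  # 2, the value foo ended up with internally
```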

On the other hand, in a pass-by-reference language the situation is:


    +-----+
    |  a  | --+
    +-----+   |    +-----+
              +--> |  1  |
    +-----+   |    +-----+
    |  b  | --+
    +-----+

That's why you can implement swap in such languages.
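
Conversely, in a pass-by-value language a swap cannot be written as a plain method. A small Ruby sketch of the failed attempt, where swap is a hypothetical helper and not a library method:

```ruby
# A would-be swap: it only exchanges the method's own parameters.
def swap(x, y)
  x, y = y, x
  [x, y]
end

a = 1
b = 2
inside = swap(a, b)

p inside   # [2, 1], swapped inside the method
p [a, b]   # [1, 2], unchanged in the caller
```

With aliases, as in the diagram above, x and y would be the caller's a and b, and the exchange would stick.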

But I can change the state of a mutable object in Ruby/Java because I pass a reference!

That is true, and it has no bearing on this. Since Ruby is pass-by-value, you can be certain that when the method returns your variable will refer to the same object. object_id is guaranteed to be the same after a method invocation (modulo black magic). Same for Java.
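
Both halves of that can be seen in a short Ruby sketch. The method shout and the variable names are made up for illustration:

```ruby
def shout(s)
  s << "!"         # mutates the object the caller's variable also refers to
  s = "replaced"   # rebinds the local name s only
  s
end

greeting = String.new("hello")   # an unfrozen, mutable String
id_before = greeting.object_id
shout(greeting)

puts greeting                          # hello!
puts greeting.object_id == id_before   # true
```

The mutation is visible to the caller, but the reassignment inside shout is not: greeting still refers to the very same object.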

But I can change the integer a variable holds by passing a pointer in C!

That is true, but you are not passing the integer, you are passing a pointer to the integer. Since C is pass-by-value, if you had a variable holding the pointer before the call, you can be totally certain the variable will hold the same exact pointer after the call.

Summary

The terms pass-by-value and pass-by-reference are about links from names to storage areas; they have nothing to do with the references or pointers of your language.

That's a bit simplified; in Perl, for example, the aliasing happens through the elements of @_, but that's the key idea.
