Object embedding

Kragen Javier Sitaker, 2014-02-24 (13 minutes)

In Java (as in Smalltalk, Lisp, Python, ML, and many other languages) when an object A “contains” an object B, the in-RAM representation of object A contains actually just a pointer to B, i.e. the address at which B’s in-RAM representation can be found. This is in some ways convenient, but it has some drawbacks — not only for efficiency, but even for program correctness.

In C, instead, “A containing B” means that the in-RAM representation of B is actually contained inside the in-RAM representation of A. I don’t know of a standard term for this, so I’m calling this “object embedding.” (Object embedding is one aspect of one possible implementation of what is sometimes called “compound value types”, for example in C# and in Elements of Programming.) C, C++, Pascal, and Golang do the same thing; they all inherit it from COBOL’s “group items”, from which Algol got “record types”, which C renamed “struct”.

The benefits of by-reference object fields are significant:

However, compared to Java’s approach, it occurs to me that the embedding approach has some real benefits. More than once, I’ve written code like this:

public class LogFileParser {
    private Map<String, String> messageIds;
    public void sawMessageId(String messageId, String queueId) {
        messageIds.put(queueId, messageId);
    }
}

This has the tendency to throw a NullPointerException, because of course declaring that your LogFileParser objects could have a Map from Strings to Strings doesn’t give them one. You need to say:

    private Map<String, String> messageIds =
        new HashMap<String, String>();

final on instance variables is not nearly as useful as you might think when you’re obligated to use mutable objects for a lot of those final instance variables. It does, however, avoid the above problem.

By contrast, embedded objects aren’t nullable. If you say:

struct foo {
    struct foo *next;
    struct bar my_bar;
    int weight;
};

then you can never have a struct foo whose my_bar is NULL, because it’s not a pointer. So the above NullPointerException cannot arise, not even as a compile error. In C++ you can even have a default constructor for bar that ensures that its memory doesn’t contain random garbage; in Golang it’s initially zero, and you get the same guarantee in C if you use calloc or initialize a local struct foo f = {0};.

The consequence is that in C++, if you say

std::map<string, string> message_ids;

— either as a local variable or an instance variable — you have a full-fledged empty map sitting there, without any need to repeat again the fact that you wanted one. This kind of thing, in combination with the STL container library, allows you to write entire useful C++ programs that don’t mention heap memory allocation at all, because it’s all hidden away inside the STL.

(ML and its descendants take another approach to this problem, one which inspired the final mechanism in Java: although relationships by objects are by means of pointers, there is no null value, and so you can’t create an object without supplying values for all of its fields.)

Aside from this implicit nullability imposed on every object field, there are a few other problems with by-reference objects.

(Having more pointers also makes the garbage collector’s job a lot harder, since it has to traverse the pointers in your live objects.)

Quantitatively, it seems that half to two-thirds of a typical Java program’s memory is taken up by pointers, because moving to 64-bit increases the needed heap size by 50%.

This sounds like a bigger problem than it is in practice, because most software can be called upon to handle arbitrarily-sized problems, which means that you can always find some way to make it run out of memory and fail. Dynamic memory allocation is built into the problem, not just your solution to it. So increasing the number of places you can get an out-of-memory error isn’t a big deal.

But there is some software that needs to work all the time instead of sometimes failing unpredictably, and that software should be written without dynamic allocation, which means it should not be written in a language where object fields are by reference. It’s not acceptable if your antilock braking system, your jet engine controller, or your air-traffic control system gets an out-of-memory exception and reboots while you’re using it.

On the other hand, if it’s okay for your software to fail every once in a while, and it’s just a matter of reducing the frequency to once a day, once a year, or once a millennium, dynamic allocation is not a big deal. In fact, dynamic allocation can help a lot with that. One of the major reasons the Fuzz paper found that the GNU tools were so much more reliable than the Unix tools they replaced is that they use dynamic allocation all over the place, so they have few arbitrary limits. Every dynamic allocation is a potential failure in your program; but every arbitrary limit is a potential bug.

However, these disadvantages are irrelevant if you want to practice OOP, because if you don’t have ubiquitous LSP polymorphism, you aren’t practicing OOP. So, in a way, these can be seen as five disadvantages of OOP.

(In theory I don’t see why you couldn’t have a functional programming language where object embedding was the norm, but I haven’t seen one. Perhaps this is because functional programming languages have mostly adopted other approaches to solving the problems here that are not questions of mere efficiency: for nullability, sum types like ML’s Option or Haskell’s Maybe and constructor expressions; for aliasing, a default of immutability; and, to some extent, for garbage collection, sophisticated generational garbage collectors.)

Parametric polymorphism

One big disadvantage of object embedding is that it requires you to implement parametric polymorphism (aka generics, generic types, or parametrized types) by means of generative programming, which can create a combinatorial explosion in code size. ML-family languages get parametric polymorphism basically for free, because every value is the size of a machine word (ironic that the 1980s explosion of functional programming traces back to Backus’s misplaced attack on the “conceptual von Neumann bottleneck”!) so you only need one copy of the machine code for List.map, not one copy for each size of list item.

Zero instructions!

My assertion above that accessing embedded objects is not only cheap but actually free may seem dubious. What I mean is that, often, what you’re doing with an embedded object is accessing fields within it, and if those fields themselves are embedded objects, you can easily end up with an entire chain of three or four or more field accesses compiling down to a single indexed fetch instruction, rather than a series of three or four of them.

As an example, in My Very First Raytracer, I have a function which begins:

static bool
find_intersections(ray rr, sphere ss, sc *intersections)
{
  vec center_rel = sub(rr.start, ss.cp);
…
}

which invokes these functions:

static vec
add(vec aa, vec bb) { vec rv = { aa.x+bb.x, aa.y+bb.y, aa.z+bb.z }; return rv; }
static vec
sub(vec aa, vec bb) { return add(aa, scale(bb, -1)); }
static vec
scale(vec vv, sc c) { vec rv = { vv.x*c, vv.y*c, vv.z*c }; return rv; }

So, given all this, the center_rel code works out to

vec center_rel = { rr.start.x + ss.cp.x * -1,
                   rr.start.y + ss.cp.y * -1,
                   rr.start.z + ss.cp.z * -1 };

Even mild inlining optimization can turn that into six indexed memory fetches and three subtractions, followed by three stores, but as far as I can tell, GCC actually somehow manages to allocate an SSE register to each of the six values, and so those 12 field accesses are incorporated into three instructions. (I can’t follow the generated assembly at all, even with the aid of -g -Wa,-adhlns=raytracer.lst. GCC has inlined basically the entire raytracer program into a single giant recursive function called trace of some 500-plus instructions.)

Again, that’s 12 field accesses, plus some math, in 3 instructions, thanks to object embedding.

Further alternatives

What was the name of that guy who advocated using parallel arrays in C, Fortran-style? He was also pretty unhappy about the decline in hi-fi audio systems since the 1970s.

Topics