Sunday, November 30, 2008

Where's the Metaphor?

Remember that television commercial for a fast food restaurant (I think it was Wendy's) that boasted about the size of their burgers? They would say "Where's the beef?" when they looked at a competitor's product. When I look at the supposedly object oriented designs that some people code to, I often wonder, "Where's the metaphor?"

Object Oriented Design is Metaphorical Design
OK, the example might not be the best one I could have used, but the point is that many people claim to be Object Oriented Developers just because they use a language like Java. To my mind, what makes a design truly object oriented is that it has a metaphor that illustrates the expected behaviour. Sure, strictly speaking it doesn't have to have a metaphor to be object oriented. If you have a good imagination, you can dream up your own objects that have no relationship to the real world at all... but you'll be the only one who understands them. Some might say that if a design uses encapsulation, inheritance, and perhaps even polymorphism, it is object oriented. I say that those are only some of the hallmarks of object oriented design. I feel quite strongly that a good object oriented design requires a metaphor. And in bigger projects, the metaphor has many layers that need to be spelled out.

Large projects almost always involve more than one designer, and the designers need to be able to explain and discuss the design with each other and with the implementers. A solid, well thought out metaphor gives your design clarity. For example, if you are designing a data transfer mechanism, referring to it as a truck automatically conjures an image in the mind of whoever is involved in the design discussion. The characteristics of your design become obvious. It has a payload, or cargo; it has a route; it has an identification; it has a driver. What if your data transfer mechanism doesn't need a driver? Well, you could just refer to it as a "driver-less truck", or perhaps you pick a different metaphor like "rail car". The point is that when you use a metaphor to discuss your objects, problems with your design jump out at you as absurdities in the metaphor. If you try to use your truck as a storage mechanism, it becomes quite obvious that although you could park a truck and leave the cargo in it, that's not what it is designed to do.
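To make the truck metaphor concrete, here is a minimal Java sketch. The class names (Route, Driver, Truck) and their methods are my own illustration, not code from any real project, and a null driver is just one assumed way to model the "driver-less truck".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: Route, Driver, and Truck are hypothetical names
// chosen to mirror the metaphor.
final class Route {
    final String origin;
    final String destination;

    Route(String origin, String destination) {
        this.origin = origin;
        this.destination = destination;
    }
}

final class Driver {
    final String name;

    Driver(String name) {
        this.name = name;
    }
}

final class Truck {
    private final String id;      // the truck's identification
    private final Route route;    // where the cargo is going
    private final Driver driver;  // null models the "driver-less truck"
    private final List<String> cargo = new ArrayList<>(); // the payload

    Truck(String id, Route route, Driver driver) {
        this.id = id;
        this.route = route;
        this.driver = driver;
    }

    String id() { return id; }

    Route route() { return route; }

    boolean isDriverless() { return driver == null; }

    boolean isEmpty() { return cargo.isEmpty(); }

    void load(String item) { cargo.add(item); }

    // A truck delivers its cargo; it is not a warehouse. Unloading hands
    // over everything on board and leaves the truck empty.
    List<String> unload() {
        List<String> delivered = new ArrayList<>(cargo);
        cargo.clear();
        return delivered;
    }
}
```

Notice that there is no method for fetching a single item while leaving the rest on board. The moment someone asks for one, the metaphor tells you they are really asking for a warehouse.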

Just because you can does not mean you should.
The lack of understanding of even the most basic of metaphors leads to sloppy code that is difficult to maintain. Take the parent-child metaphor as an example. A parent has one or more children and each child has a parent. Yes, one could argue that a parent could have lost a child, so the parent could have zero children, and we could argue about orphans too. In fact, it is just that sort of argument that illustrates the value of a metaphor. We could say that a Person has no children, and when they have children they become a parent.

In discussing this model, we often refer to the "has a" relationship. What I am proposing is that we use the "has knowledge of a" relationship instead. The "has a" relationship denotes containment, whereas the "has knowledge of a" relationship denotes a reference. It really boils down to this: "Can one object exist without the other?" If not, then containment may be correct. However, many people mistakenly use containment when a reference is correct. The use of metaphors exposes this error. Does a son really contain his mother? No. The son has knowledge of his mother, but he doesn't actually contain her. It is almost conceivable to say that a mother contains a son, but even then it is only during the first nine months that she does. If the mother contains her children, and the children contain the mother, you have a situation where the son contains not only his mother, but also his brothers, sisters, and even himself (since they are all contained within the mother). (I really wish people had used "contains a" instead of "has a" from the beginning... it's much clearer.)
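Here is one way the "has knowledge of a" relationship might look in Java. This is a sketch under my own assumptions: Person and Family are hypothetical names, and I have chosen to keep the reference one-directional (the child knows the mother) and to answer "who are her children?" by asking the family, rather than having the mother hold her children.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: Person and Family are hypothetical names.
final class Person {
    private final String name;
    private final Person mother; // "has knowledge of": a reference, null if unknown

    Person(String name, Person mother) {
        this.name = name;
        this.mother = mother;
    }

    String name() { return name; }

    Person mother() { return mother; }
}

final class Family {
    private final List<Person> members = new ArrayList<>();

    Person add(String name, Person mother) {
        Person person = new Person(name, mother);
        members.add(person);
        return person;
    }

    // Children are found on demand; no Person ever holds a list of children.
    List<Person> childrenOf(Person parent) {
        List<Person> children = new ArrayList<>();
        for (Person member : members) {
            if (member.mother() == parent) {
                children.add(member);
            }
        }
        return children;
    }

    // A Person "becomes a parent" simply by having children in the family.
    boolean isParent(Person person) {
        return !childrenOf(person).isEmpty();
    }
}
```

The son can reach his mother, but nothing in the model lets him contain her, and nobody ends up containing themselves.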

So here come the critics. "But in Java all object relationships are by reference, so it's just a pointer anyway!" You would be correct to say so, and in a standalone project I might not be concerned. There is no real harm in circular references. The mechanics of it will be sorted out for you. But even there, I would say use them with caution. Where it is much more problematic, and the problems harder to spot, is when you are using an Object Relational Mapping tool like Hibernate, especially if you have a complex data structure. In such situations, one MUST think about whether containment is the correct approach or whether merely a reference to the id would be sufficient.

I was brought in to work on a web project that used Hibernate to, among other things, retrieve a collection of 15 objects. No problem. However, each of those objects contained multiple collections of other objects which in turn contained references (sometimes several layers deep) to the first object. To get the 15 objects that were sought, our result set had to pull down 1.5 million records! This situation did not materialize until some realistic data was used. The problem was not the ORM tool (as some people claimed); it was that the objects all contained each other, and Hibernate had no choice but to do a bunch of joins (12 in this case). Once the object model was cleaned up we got the result set down to something like fifty records. What used to cause the web container to time out now happens in a flash.

Yes, you are allowed to use bi-directional references. But just because you can doesn't mean you should. In a web application I would take it a step further and say that you may do it only if you have to do it. (And if you are using something like Hibernate, you had better read up on it so that you are sure you are using it correctly! More on that some other time.)

Keep It Stupidly Simple. The KISS rule does not give license to laziness! In fact, it will often require extra effort to keep it simple. In my real world example, it turned out that many of the contained objects were only needed from within the data access layer anyway, so they were easily retrieved on an as-needed basis. I suspect that they were only ever contained so that they could be retrieved using dot notation.
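As a rough illustration of "a reference to the id" and retrieval on an as-needed basis, consider the sketch below. It is plain Java, not Hibernate code: Customer, Invoice, and CustomerRepository are hypothetical names, and the in-memory map is only a stand-in for whatever your data access layer really does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only: all names here are hypothetical.
final class Customer {
    private final long id;
    private final String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long id() { return id; }

    String name() { return name; }
}

final class Invoice {
    private final long id;
    private final long customerId; // knowledge of the customer, not the customer itself

    Invoice(long id, long customerId) {
        this.id = id;
        this.customerId = customerId;
    }

    long id() { return id; }

    long customerId() { return customerId; }
}

// Stand-in for the data access layer; a real one would query the database.
final class CustomerRepository {
    private final Map<Long, Customer> rows = new HashMap<>();

    void save(Customer customer) {
        rows.put(customer.id(), customer);
    }

    // The "few extra lines": fetch the customer only when it is needed.
    Optional<Customer> findById(long id) {
        return Optional.ofNullable(rows.get(id));
    }
}
```

Instead of writing invoice.getCustomer().getName() and dragging the customer (and everything it contains) along with every invoice, the caller asks the repository with invoice.customerId() at the moment the name is actually required.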

It may mean that you have to write a few extra lines of code to get the desired object when you need it, but it will likely mean that the rest of your application runs more smoothly and is easier to debug.
