the meaning of a class

stuff I've learned

jan, 1998

The purpose of a class is literally classification, categorization. It provides meaning and semantics through its interface: an integer by itself is meaningless. However, a year is not.

The meaning of a class is found in what it represents and how it is intended to be used. Containers are a good example of this, and they are where I think this point is most often misunderstood. Imagine that we want a collection of transactions, that is, a bunch of transaction objects. It is very tempting to do something like this (in C++, using a typical Rogue Wave template collection class):

	RWCPtrSlist<transaction> theTransactions;

and then use the RWCPtrSlist<T> interface to add and remove transactions. So what's wrong with this? Isn't this adhering to appropriate OO principles like encapsulation, abstraction, etc.? After all, RWCPtrSlist<T> is a really cool template class and doesn't have to know anything about transaction objects. Well, sort of. But the way this list is meant to be used is not present in the collection at all. What if there are specific rules about how collections of transaction objects are to be used? What if they have to be kept in a specific sorted order, say, by date? There would be an "insert transaction" function or method somewhere that operates on this list. The list and that function are separate and separately changeable... unless we actually use OO design principles such as encapsulation and encode the meaning of a collection of transactions into its own class:

	class transactionCollection {
	public:
		void addTransaction(transaction *& t);
		// etc.
	private:
		RWCPtrSlist<transaction> theTransactions;
	};
This is good because:

Now, hardly anyone ever does this. Why not? My guess is that people treat classes as big, huge things that take forever to design and code, and subsequently think that there should only be a few, or only the big ideas should be modeled as classes. My perspective is that you can't do object oriented stuff without classes, so the more code you write with fewer classes, the less you're doing OO -- which means the less you benefit from OO. Object-based programming is good, but object-oriented programming is light years beyond it.

I think that every time a thing is used in ways that are not clearly stated by that thing's interface, it should be encapsulated, and the meaning and the ways that thing will be used should be encoded in, and obvious from, its abstracted interface. The transaction collection is one such example.
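To make the sorted-by-date rule concrete, here is a rough sketch of what such a class might look like. It is only an illustration under some assumptions of mine: std::list stands in for RWCPtrSlist, the transaction struct with a numeric date field is hypothetical, and count(), oldest() and newest() are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical transaction type; the date is a plain yyyymmdd number
// only to keep the sketch short.
struct transaction {
    long date;
    double amount;
};

// Owns the rule "transactions are kept in date order", so no caller
// can insert one out of order.
class transactionCollection {
public:
    // Inserts t before the first transaction with a later date, so
    // transactions with equal dates keep their arrival order.
    void addTransaction(transaction* t) {
        assert(t != nullptr);  // the collection never stores null pointers
        std::list<transaction*>::iterator pos = theTransactions.begin();
        while (pos != theTransactions.end() && (*pos)->date <= t->date) {
            ++pos;
        }
        theTransactions.insert(pos, t);
    }

    std::size_t count() const { return theTransactions.size(); }

    // Earliest-dated transaction, or a null pointer if there are none.
    const transaction* oldest() const {
        return theTransactions.empty() ? nullptr : theTransactions.front();
    }

    // Latest-dated transaction, or a null pointer if there are none.
    const transaction* newest() const {
        return theTransactions.empty() ? nullptr : theTransactions.back();
    }

private:
    std::list<transaction*> theTransactions;  // does not own the pointees
};
```

The ordering rule lives in exactly one place. If it later changes to, say, ordering by amount, only addTransaction has to change, and callers keep calling the same function.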

Perhaps you don't believe me when I say that hardly anyone ever does this.

Consider the following simple example that almost everyone has seen. I bet it is even in your code, in more than one place:

	int status;    // -1: false, 0: true, 1: error

I call this an untyped variable, for while it does have a type from the compiler's perspective, it doesn't help. The fact that we're working within a strongly typed system has no benefit here whatsoever when it comes to usage of this variable around the concept of status; assignment to or from an age, a height, or a cubic volume -- obviously algorithmic or design flaws -- are just fine if they're all ints.
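A small sketch of the kind of mix-up that goes unnoticed when everything is an int. The function names ageNextYear and confusedStatus are made up for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: nothing in the signature says the int is an age.
int ageNextYear(int age) { return age + 1; }

// Every line below compiles cleanly, although two of them are design errors.
int confusedStatus() {
    int status = 1;   // meant as "error" under the -1/0/1 convention
    int age = 42;

    status = age;                 // a status now holds someone's age
    assert(status == 42);         // ...and nothing objected
    return ageNextYear(status);   // a status passed where an age is expected
}
```

The compiler has no way to tell these uses apart, so it cannot flag either mistake.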

While supposedly quick and easy to implement, from a long-term perspective this is crazy. Don't even try to tell me that your #define statements will help. What if, suddenly, you get a status value of 7 returned from some function? Or someone changes those #defines? Or the #defines conflict -- not particularly unlikely in a large system whose interdependencies are supposed to be minimal for stability and reuse purposes.

Forget it.

Hoping not only that programmers will always adhere to this, but also that their code will be obvious to a reader two years from now, is, frankly, not something one should rely on.

Instead, consider creating a class for this purpose. Encode the way it will be used into its interface. Try a simple technique such as this:

	class status {
	public:
		enum value_t { False, True, Error };
		status(value_t v) : theValue(v) { }
		value_t value() const { return theValue; }
	private:
		value_t theValue;
	};

Then you could have functions return status values such as

	status someFunction() { return status(status::False); }

Or if you are high-strung about efficiency, you could declare several status objects and return them or references to them instead of constructing new ones every time:

	status statusFalse(status::False), statusTrue(status::True),
	       statusError(status::Error);
	status& someFunction() { /* .. */ return statusFalse; }

This is better than using an enum type (like status::value_t) by itself since the status class can be designed to control access to the value itself, as all classes can. Centralized, encapsulated control versus distributed, not-always-allowed const enums.

Additionally, other operations can be attached and the class can be expanded or contracted as necessary. For example, it might be desired that these status values also be expressed in English (char *, that is). Or German. Or Unicode. Or, as another example, if one of the values were to be made obsolete, the class could be extended to report any situation where this obsolete value is set: this functionality would be totally encapsulated within the status class and no other code would have to change. The simple act of encapsulating the data value and giving it meaning through an interface allows all these things to be added at a later date without changing the code that uses it!
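As a sketch of how such extensions might look, here is the status class grown by two features. The details are assumptions made for illustration only: the obsolete value Unknown, the text() function and the obsoleteUseCount() counter are all hypothetical, and a real system might log or assert rather than count.

```cpp
class status {
public:
    // Unknown is a hypothetical value that has since been declared obsolete.
    enum value_t { False, True, Error, Unknown };

    status(value_t v) : theValue(v) {
        // The one place where use of the obsolete value is noticed.
        if (v == Unknown) {
            ++obsoleteUses;
        }
    }

    value_t value() const { return theValue; }

    // English text for the value; other languages could be added here
    // without touching any calling code.
    const char* text() const {
        switch (theValue) {
            case False:   return "false";
            case True:    return "true";
            case Error:   return "error";
            case Unknown: return "unknown (obsolete)";
        }
        return "invalid";
    }

    // How many status objects have been constructed with the obsolete value.
    static int obsoleteUseCount() { return obsoleteUses; }

private:
    value_t theValue;
    static int obsoleteUses;
};

int status::obsoleteUses = 0;
```

Code that only calls value() is untouched by either addition, and it keeps compiling and behaving as before.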

Imagine that!