Do I need to explain why global data is bad?
Suppose you write a function to calculate square roots. You call it "root" and it takes a floating point number as a parameter and returns the square root of that number. You put it in a "MathUtil" class. So now you can write code like the following Java code:
    float f=9.0f;
    float rf=MathUtil.root(f);
    System.out.println(rf); // prints "3.0"
Now suppose you discover that you also need to calculate cube roots. So you decide to create a global variable in the MathUtil class to hold the power of the root you need. Then you write code like this:
    float f=9.0f;
    MathUtil.power=2;
    float rf=MathUtil.root(f);
    System.out.println(rf); // prints "3.0"
    MathUtil.power=3;
    f=64.0f;
    rf=MathUtil.root(f);
    System.out.println(rf); // prints "4.0"
    f=125.0f;
    rf=MathUtil.root(f);
    System.out.println(rf); // prints "5.0"
Would creating this "power" field be a good design decision?
Absolutely not. It creates mysterious connections between the caller and the called. Someone modifying this program later may not realize that the behavior of the root function is controlled by this "hidden" parameter.
Suppose module A uses root to calculate square roots. So at the top of the module it sets MathUtil.power=2 and never touches it again. Then every call to root within the module will give a square root, right? We test the program and indeed this works.
Suppose module A calls module B somewhere along the line. And months or years after we write the first draft of these programs, someone comes along and modifies module B to calculate a cube root. Something like this:
    void moduleA() {
        MathUtil.power=2;
        float sqrt3=MathUtil.root(3.0f);
        ... hundreds of lines of other code ...
        moduleB();
        ... more other code ...
        float sqrt64=MathUtil.root(64.0f);
    }

    void moduleB() {
        ... whatever ...
        MathUtil.power=3;
        float curt27=MathUtil.root(27.0f);
        ... whatever ...
    }
What is the value of sqrt64? Someone just reading moduleA might think it would be 8, but they would be wrong. moduleB sets power to 3, and this value will still be set when we get back to moduleA, so sqrt64 will actually contain the cube root of 64, or 4.
We could, of course, set the value of power before every call to root. But if we're going to do that, why not just make it a parameter of root? And I can guarantee that even if you document that that's what people should do, someone will surely say, "Why bother? I know it's already 2. It's more efficient to just leave it alone."
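Here is a minimal sketch of what "just make it a parameter" could look like. The class name RootUtil, the two-argument signature, and the Math.pow implementation are my own illustration, not the original MathUtil, and the sketch assumes non-negative inputs.

```java
// Hypothetical sketch: RootUtil and root(value, power) are illustrative names.
final class RootUtil {
    private RootUtil() {} // no instances and no stored state

    // The power travels with each call, so no other module can change it
    // behind the caller's back. Assumes value >= 0.
    static float root(float value, int power) {
        if (power <= 0) {
            throw new IllegalArgumentException("power must be positive");
        }
        return (float) Math.pow(value, 1.0 / power);
    }
}

class RootUtilDemo {
    public static void main(String[] args) {
        System.out.println(RootUtil.root(9.0f, 2));  // square root of 9
        System.out.println(RootUtil.root(64.0f, 3)); // cube root of 64
        System.out.println(RootUtil.root(64.0f, 2)); // square root of 64
    }
}
```

With the power in the argument list, anyone reading a call can see which root is being taken without hunting for the last assignment.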
So global data is bad.
But I was rather surprised to come across two separate sources in the last couple of weeks that said that beans are a solution to this problem. (I won't name them -- that would be rude.)
If you're not familiar with beans, the idea is that a bean is a class whose data can only be accessed through getter and setter functions. So in this example, instead of MathUtil having a field named power that can be directly accessed by moduleA and moduleB, this field would be private. Now MathUtil would look something like this:
    public class MathUtil {
        private int power;

        public void setPower(int power) {
            this.power=power;
        }

        public int getPower() {
            return power;
        }

        public float root(float f) {
            ... log to find "power"th root of f
        }
    }
So now "power" is private. It's no longer a global field. But, umm, exactly what did this "solution" accomplish?
The problem I described above -- moduleB changing the value of power while moduleA is not looking -- is not solved or lessened in any way. Requiring the caller to use getters and setters instead of changing the value directly did not make this any less "global data" in any relevant way. Sure, now it's declared "private". But that's a semantic trick. It's private but we provide public functions to update it. It's still public in practice. It's like those car dealers who say that a car is "pre-owned" rather than calling it "used". It may sound nicer, but the car is still a rusty clunker.
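To make that concrete, here is a small sketch of the same moduleA/moduleB scenario rewritten with a bean. The names PowerBean, SharedBeanDemo, and MATH are hypothetical, and the Math.pow body of root is my assumption. I also assume both modules share one instance, which is what you get when a bean is used as a stand-in for a global.

```java
// Hypothetical sketch: PowerBean stands in for the bean-style MathUtil.
class PowerBean {
    private int power = 2;

    public void setPower(int power) { this.power = power; }

    public int getPower() { return power; }

    // Assumed implementation; takes the "power"th root of f.
    public float root(float f) {
        return (float) Math.pow(f, 1.0 / power);
    }
}

class SharedBeanDemo {
    // One instance shared by every module: global data behind accessors.
    static final PowerBean MATH = new PowerBean();

    static float moduleA() {
        MATH.setPower(2);
        moduleB();               // quietly changes the shared power
        return MATH.root(64.0f); // a reader of moduleA expects 8
    }

    static void moduleB() {
        MATH.setPower(3);
        MATH.root(27.0f);
    }

    public static void main(String[] args) {
        // Shows the cube root of 64 (about 4), not the square root.
        System.out.println(moduleA());
    }
}
```

The field is private and every access goes through a setter, yet moduleA still gets the wrong answer for exactly the same reason as before.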
Okay, the bean approach does offer some advantages.
Beans do give us some flexibility if in the future we decide we need to change the data type. Note that I declared power as an int above. Maybe later we discover that we also need to find fractional roots, so we want to change power to a float. In a non-bean class, every place that accesses power would have to be changed to set it to a float instead of an int. In a bean class, we could keep the old int getter and setter and make them do type conversions, while adding a new getter and setter for the more flexible float. This would let old programs continue to work unchanged while giving the extra capabilities to new programs.
    public class MathUtil {
        private float power;

        public void setPower(int power) {
            this.power=(float)power;
        }

        public int getPower() {
            return (int)power;
        }

        public void setFloatPower(float power) {
            this.power=power;
        }

        public float getFloatPower() {
            return power;
        }

        public float root(float f) {
            ... log to find "power"th root of f
        }
    }
Beans also allow getters and setters to have side effects. For example, in many applications we want to keep a log of every time a variable is changed. With a setter, we can have the function write to the log as well as set the value. If the only way to update the value is through the setter, we can then be confident that every change of the value has been logged. Without the setter, we would have to rely on every caller being sure to update the log. There's no way we could guarantee that.
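A logging setter might look roughly like this. The class name LoggedPower, the in-memory changeLog list, and the message format are all assumptions for the sake of the sketch. A real application would more likely write to its own logging system.

```java
// Hypothetical sketch: a setter that records every change it makes.
class LoggedPower {
    private int power = 2;

    // Stand-in for a real log; fully qualified names avoid needing imports.
    private final java.util.List<String> changeLog = new java.util.ArrayList<>();

    public void setPower(int power) {
        // The only way to change the field is here, so every change is logged.
        changeLog.add("power: " + this.power + " -> " + power);
        this.power = power;
    }

    public int getPower() { return power; }

    // Read-only view, so callers cannot tamper with the history.
    public java.util.List<String> getChangeLog() {
        return java.util.Collections.unmodifiableList(changeLog);
    }
}

class LoggedPowerDemo {
    public static void main(String[] args) {
        LoggedPower p = new LoggedPower();
        p.setPower(3);
        p.setPower(5);
        for (String entry : p.getChangeLog()) {
            System.out.println(entry);
        }
    }
}
```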
Getters and setters can also do more complex manipulations. We might decide that it is more convenient to store the power as an exponent rather than a root: for a square root we store 1/2 instead of 2, for a cube root we store 1/3 instead of 3, and so on. Getters and setters could easily do these conversions.
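As a rough sketch of that conversion, the class below keeps the same int getter and setter on the outside but stores the reciprocal internally. The name ExponentPower, the double field, and the Math.pow call are my assumptions, not part of the original MathUtil.

```java
// Hypothetical sketch: callers still talk in roots (2, 3, ...),
// while the class stores the exponent (1/2, 1/3, ...).
class ExponentPower {
    private double exponent = 0.5; // 1/power; defaults to a square root

    public void setPower(int power) {
        if (power <= 0) {
            throw new IllegalArgumentException("power must be positive");
        }
        this.exponent = 1.0 / power;
    }

    // Convert back to the root the caller originally supplied.
    public int getPower() {
        return (int) Math.round(1.0 / exponent);
    }

    // Assumes f >= 0.
    public float root(float f) {
        return (float) Math.pow(f, exponent);
    }
}

class ExponentPowerDemo {
    public static void main(String[] args) {
        ExponentPower p = new ExponentPower();
        p.setPower(3);
        System.out.println(p.getPower());    // the root we set
        System.out.println(p.root(125.0f));  // cube root of 125
    }
}
```

Callers never see the change of representation, which is the flexibility beans buy you. It still does nothing to stop another module from calling setPower.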
So I'm not saying not to use beans. I am saying that you should not delude yourself into thinking that you have eliminated global data because you used beans.
Global data is evil. Putting a pretty wig on it does not make it good.
© 2008 by Jay Johansen