Very briefly, it makes program state unpredictable.
To elaborate, imagine you have a couple of objects that both use the same global variable. Assuming you’re not using a source of randomness anywhere within either object, then the output of a particular method can be predicted (and therefore tested) if the state of the system is known before you execute the method.
However, if a method in one of the objects triggers a side effect which changes the value of the shared global state, then you no longer know what the starting state is when you execute a method in the other object. You can now no longer predict what output you’ll get when you execute the method, and therefore you can’t test it.
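To make that concrete, here is a minimal Python sketch. The names `counter`, `Incrementer` and `Reporter` are made up for illustration. The same call on the same object gives two different answers because something else touched the global in between.

```python
# Hypothetical module-level state shared by both classes.
counter = 0


class Incrementer:
    def bump(self) -> None:
        """Side effect: mutates the shared global."""
        global counter
        counter += 1


class Reporter:
    def report(self) -> str:
        """The output depends on the global, not on any argument."""
        return f"count={counter}"


reporter = Reporter()
before = reporter.report()

Incrementer().bump()  # an unrelated object changes the shared state

after = reporter.report()  # same call, same arguments, different result
```

A test for `Reporter.report()` can only pass reliably if it also controls everything that might call `bump()`.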
On an academic level this might not sound all that serious, but being able to unit test code is a major step in the process of proving its correctness (or at least fitness for purpose).
In the real world, this can have some very serious consequences. Suppose you have one class that populates a global data structure, and a different class that consumes the data in that data structure, changing its state or destroying it in the process. If the processor class executes a method before the populator class is done, the result is that the processor class will probably have incomplete data to process, and the data structure the populator class was working on could be corrupted or destroyed. Program behaviour in these circumstances becomes completely unpredictable, and will probably lead to epic lossage.
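A deterministic sketch of that populator/processor scenario, without threads so the bad ordering is reproducible. All names here (`pending`, `Populator`, `Processor`) are hypothetical. In a real concurrent program the interleaving would be up to the scheduler, which is what makes it so hard to debug.

```python
# Hypothetical global buffer shared by a populator and a processor.
pending: list[int] = []


class Populator:
    def add_batch(self, start: int, size: int) -> None:
        pending.extend(range(start, start + size))


class Processor:
    def drain(self) -> int:
        """Consumes (and destroys) whatever is in the buffer right now."""
        total = sum(pending)
        pending.clear()
        return total


populator, processor = Populator(), Processor()

populator.add_batch(0, 3)           # first half of the data: 0, 1, 2
partial_total = processor.drain()   # runs too early and sees only half
populator.add_batch(3, 3)           # second half: 3, 4, 5
leftover = list(pending)            # items the processor never summed
```

The processor reports a total for half the data, and the other half is left stranded in the global buffer with nothing to tell either class that anything went wrong.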
Further, global state hurts the readability of your code. If your code has an external dependency that isn’t explicitly introduced into the code then whoever gets the job of maintaining your code will have to go looking for it to figure out where it came from.
As for what alternatives exist, well it’s impossible to have no global state at all, but in practice it is usually possible to restrict global state to a single object that wraps all the others, and which must never be referenced by relying on the scoping rules of the language you’re using. If a particular object needs a particular state, then it should explicitly ask for it by having it passed as an argument to its constructor or by a setter method. This is known as Dependency Injection.
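As a rough illustration of constructor injection in Python (the `Settings` and `Greeter` classes are invented for this example, not taken from any library):

```python
class Settings:
    """Hypothetical state holder, created once at the top of the program."""

    def __init__(self, greeting: str) -> None:
        self.greeting = greeting


class Greeter:
    def __init__(self, settings: Settings) -> None:
        # The dependency is requested explicitly instead of being
        # picked up through the language's scoping rules.
        self._settings = settings

    def greet(self, name: str) -> str:
        return f"{self._settings.greeting}, {name}!"


def main() -> str:
    settings = Settings(greeting="Hello")  # the single wrapping object
    return Greeter(settings).greet("world")
```

A test can construct `Greeter(Settings("Hi"))` directly, with no global setup or teardown.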
It may seem silly to pass in a piece of state that you can already access due to the scoping rules of whatever language you’re using, but the advantages are enormous. Now if someone looks at the code in isolation, it’s clear what state it needs and where it’s coming from. It also has huge benefits regarding the flexibility of your code module and therefore the opportunities for reusing it in different contexts. If the state is passed in and changes to the state are local to the code block, then you can pass in any state you like (if it’s the correct data type) and have your code process it. Code written in this style tends to have the appearance of a collection of loosely associated components that can easily be interchanged. The code of a module shouldn’t care where state comes from, just how to process it. If you pass state into a code block then that code block can exist in isolation; that isn’t the case if you rely on global state.
There are plenty of other reasons why passing state around is vastly superior to relying on global state. This answer is by no means comprehensive. You could probably write an entire book on why global state is bad.
Mutable global state is evil for many reasons:
- Bugs from mutable global state – a lot of tricky bugs are caused by mutability. Bugs that can be caused by mutation from anywhere in the program are even trickier, as it’s often hard to track down the exact cause.
- Poor testability – if you have mutable global state, you will need to configure it for any tests that you write. This makes testing harder (and people being people are therefore less likely to do it!). e.g. in the case of application-wide database credentials, what if one test needs to access a specific test database different from everything else?
- Inflexibility – what if one part of the code requires one value in the global state, but another part requires another value (e.g. a temporary value during a transaction)? You suddenly have a nasty bit of refactoring on your hands
- Function impurity – “pure” functions (i.e. ones whose result depends only on the input parameters and which have no side effects) are much easier to reason about and compose to build larger programs. Functions that read or manipulate mutable global state are inherently impure.
- Code comprehension – code behaviour that depends on a lot of mutable global variables is much trickier to understand – you need to understand the range of possible interactions with the global variable before you can reason about the behaviour of the code. In some situations, this problem can become intractable.
- Concurrency issues – mutable global state typically requires some form of locking when used in a concurrent situation. This is very hard to get right (and so is a common cause of bugs) and adds considerably more complexity to your code (making it hard and expensive to maintain).
- Performance – multiple threads continually bashing on the same global state causes cache contention and will slow down your system overall.
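The purity and testability points can be sketched in a few lines of Python. The names `tax_percent`, `gross_from_global` and `gross` are hypothetical, and integer cents are used only to keep the arithmetic exact.

```python
# Hypothetical mutable global: any code anywhere may reassign it.
tax_percent = 20


def gross_from_global(net_cents: int) -> int:
    """Impure: the result depends on whatever tax_percent is right now."""
    return net_cents * (100 + tax_percent) // 100


def gross(net_cents: int, percent: int) -> int:
    """Pure: the result depends only on the arguments."""
    return net_cents * (100 + percent) // 100
```

Testing `gross` with a different rate is a one-line call such as `gross(1000, 5)`. Testing `gross_from_global` with a different rate means reassigning the global and remembering to restore it afterwards.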
Alternatives to mutable global state:
- Function parameters – often overlooked, but parameterising your functions better is often the best way to avoid global state. It forces you to solve the important conceptual question: what information does this function require to do its job? Sometimes it makes sense to have a data structure called “Context” that can be passed down a chain of functions that wraps up all relevant information.
- Dependency injection – same as for function parameters, just done a bit earlier (at object construction rather than function invocation). Be careful if your dependencies are mutable objects though; this can quickly cause the same problems as mutable global state.
- Immutable global state is mostly harmless – it is effectively a constant. But make sure that it really is a constant, and that you aren’t going to be tempted to turn it into mutable global state at a later point!
- Immutable singletons – pretty much the same as immutable global state, except that you can defer instantiation until they are needed. Useful for e.g. large fixed data structures that need expensive one-off pre-calculation. Mutable singletons are of course equivalent to mutable global state and are therefore evil 🙂
- Dynamic binding – only available in some languages like Common Lisp/Clojure, but this effectively lets you bind a value within a controlled scope (typically on a thread-local basis) which does not affect other threads. To some extent this is a “safe” way of getting the same effect as a global variable, since you know that only the current thread of execution will be affected. This is particularly useful in the case where you have multiple threads each handling independent transactions, for example.
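Python has no dynamic binding in the Lisp sense, but the standard library's `contextvars` module gives a roughly similar effect, so here is a sketch of the last bullet using it. The variable `current_db` and the helpers `using_db` and `describe_query` are made up for this example.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical setting; the default plays the role of the "global" value.
current_db = contextvars.ContextVar("current_db", default="production")


@contextmanager
def using_db(name: str):
    """Rebind current_db for the duration of a with-block, then restore it."""
    token = current_db.set(name)
    try:
        yield
    finally:
        current_db.reset(token)


def describe_query(sql: str) -> str:
    # Reads the binding that is current for this thread of execution.
    return f"[{current_db.get()}] {sql}"


outside = describe_query("SELECT 1")
with using_db("test"):
    inside = describe_query("SELECT 1")
restored = describe_query("SELECT 1")
```

The rebinding is visible only inside the `with` block and is undone on exit, even if the block raises.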
- Since your whole damn app can be using it, it’s always incredibly hard to factor them back out again. If you ever change anything to do with your global, all your code needs changing. This is a maintenance headache, far more than simply being able to `grep` for the type name to find out which functions use it.
- They’re bad because they introduce hidden dependencies, which break multithreading, which is increasingly vital to increasingly many applications.
- The state of the global variable is always completely unreliable, because all of your code could be doing anything to it.
- They’re really hard to test.
- They make calling the API hard. “You must remember to call SET_MAGIC_VARIABLE() before calling API” is just begging for someone forget to call it. It makes using the API error-prone, causing difficult-to-find bugs. By using it as a regular parameter, you force the caller to properly provide a value.
Just pass a reference into functions which need it. It’s not that hard.
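For instance, here is a hedged Python sketch of the two API styles. `set_magic_variable`, `api_with_global` and `api` are hypothetical stand-ins for the SET_MAGIC_VARIABLE() case above.

```python
from typing import Optional

# Error-prone style: a hidden precondition on a module-level variable.
_magic: Optional[int] = None


def set_magic_variable(value: int) -> None:
    global _magic
    _magic = value


def api_with_global(x: int) -> int:
    if _magic is None:
        # The mistake only shows up at run time, on whichever path forgot.
        raise RuntimeError("call set_magic_variable() first")
    return x * _magic


# Safer style: the value is an ordinary required parameter.
def api(x: int, magic: int) -> int:
    return x * magic
```

With the second version, forgetting the value is an immediate `TypeError` at the call site (and something a type checker can flag), instead of a failure that depends on what ran earlier.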