Why is Global State so Evil?

Solutions:


Very briefly, it makes program state unpredictable.

To elaborate, imagine you have a couple of objects that both use the same global variable. Assuming neither object uses a source of randomness anywhere, the output of a particular method can be predicted (and therefore tested) if the state of the system is known before you execute the method.

However, if a method in one of the objects triggers a side effect which changes the value of the shared global state, then you no longer know what the starting state is when you execute a method in the other object. You can now no longer predict what output you’ll get when you execute the method, and therefore you can’t test it.
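The scenario can be sketched in a few lines of Python. Every name here (`tax_percent`, `InvoiceCalculator`, `PromotionEngine`) is hypothetical and invented for illustration; the point is only that the same call returns a different result depending on what some other object did to the shared global first.

```python
# Hypothetical module-level (global) state shared by two classes.
tax_percent = 20


class InvoiceCalculator:
    def total_cents(self, net_cents: int) -> int:
        # Reads the global: the result depends on whatever it holds right now.
        return net_cents * (100 + tax_percent) // 100


class PromotionEngine:
    def start_tax_holiday(self) -> None:
        # Side effect: silently changes the state InvoiceCalculator relies on.
        global tax_percent
        tax_percent = 0


calculator = InvoiceCalculator()
before = calculator.total_cents(10_000)  # global still holds its initial value
PromotionEngine().start_tax_holiday()
after = calculator.total_cents(10_000)   # same call, same argument, new result
```

Nothing in the signature of `total_cents` hints that `PromotionEngine` can change its answer, which is what makes it hard to test in isolation.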

On an academic level this might not sound all that serious, but being able to unit test code is a major step in the process of proving its correctness (or at least fitness for purpose).

In the real world, this can have some very serious consequences. Suppose you have one class that populates a global data structure, and a different class that consumes the data in that data structure, changing its state or destroying it in the process. If the processor class executes a method before the populator class is done, the result is that the processor class will probably have incomplete data to process, and the data structure the populator class was working on could be corrupted or destroyed. Program behaviour in these circumstances becomes completely unpredictable, and will probably lead to epic lossage.

Further, global state hurts the readability of your code. If your code has an external dependency that isn’t explicitly introduced into the code then whoever gets the job of maintaining your code will have to go looking for it to figure out where it came from.

As for what alternatives exist, well it’s impossible to have no global state at all, but in practice it is usually possible to restrict global state to a single object that wraps all the others, and which must never be referenced by relying on the scoping rules of the language you’re using. If a particular object needs a particular state, then it should explicitly ask for it by having it passed as an argument to its constructor or by a setter method. This is known as Dependency Injection.
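As a minimal sketch of constructor injection in Python, assuming a hypothetical `TaxPolicy` value that would otherwise have lived in a global, the object asks for its state explicitly instead of reaching out for it:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxPolicy:
    # Hypothetical piece of state that would otherwise live in a global.
    percent: int


class InvoiceCalculator:
    def __init__(self, policy: TaxPolicy) -> None:
        # The dependency is handed in explicitly (constructor injection).
        self._policy = policy

    def total_cents(self, net_cents: int) -> int:
        return net_cents * (100 + self._policy.percent) // 100


# The outermost code (the "single object that wraps all the others")
# decides which state each object receives.
standard = InvoiceCalculator(TaxPolicy(percent=20))
tax_free = InvoiceCalculator(TaxPolicy(percent=0))
```

Two calculators with different policies can now coexist in the same program, and a test can build whichever one it needs without touching anything shared.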

It may seem silly to pass in a piece of state that you can already access due to the scoping rules of whatever language you’re using, but the advantages are enormous. Now if someone looks at the code in isolation, it’s clear what state it needs and where it’s coming from. It also has huge benefits regarding the flexibility of your code module and therefore the opportunities for reusing it in different contexts. If the state is passed in and changes to the state are local to the code block, then you can pass in any state you like (if it’s the correct data type) and have your code process it. Code written in this style tends to have the appearance of a collection of loosely associated components that can easily be interchanged. The code of a module shouldn’t care where state comes from, just how to process it. If you pass state into a code block then that code block can exist in isolation; that isn’t the case if you rely on global state.

There are plenty of other reasons why passing state around is vastly superior to relying on global state. This answer is by no means comprehensive. You could probably write an entire book on why global state is bad.

Mutable global state is evil for many reasons:

  • Bugs from mutable global state – a lot of tricky bugs are caused by mutability. Bugs that can be caused by mutation from anywhere in the program are even trickier, as it’s often hard to track down the exact cause
  • Poor testability – if you have mutable global state, you will need to configure it for any tests that you write. This makes testing harder (and people being people are therefore less likely to do it!). e.g. in the case of application-wide database credentials, what if one test needs to access a specific test database different from everything else?
  • Inflexibility – what if one part of the code requires one value in the global state, but another part requires another value (e.g. a temporary value during a transaction)? You suddenly have a nasty bit of refactoring on your hands
  • Function impurity – “pure” functions (i.e. ones where the result depends only on the input parameters and have no side effects) are much easier to reason about and compose to build larger programs. Functions that read or manipulate mutable global state are inherently impure.
  • Code comprehension – code behaviour that depends on a lot of mutable global variables is much trickier to understand – you need to understand the range of possible interactions with the global variable before you can reason about the behaviour of the code. In some situations, this problem can become intractable.
  • Concurrency issues – mutable global state typically requires some form of locking when used in a concurrent situation. This is very hard to get right (and so is a cause of bugs) and adds considerably more complexity to your code (hard/expensive to maintain).
  • Performance – multiple threads continually bashing on the same global state causes cache contention and will slow down your system overall.
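The testability and purity points can be illustrated with the database example from the list. This is only a sketch: `DATABASE_NAME`, the `db://` connection-string format, and the function names are all made up for the example.

```python
# Hypothetical application-wide setting held in a mutable global.
DATABASE_NAME = "production"


def connection_string_impure() -> str:
    # Impure: the result depends on hidden state outside the parameters.
    return f"db://localhost/{DATABASE_NAME}"


def connection_string_pure(database_name: str) -> str:
    # Pure: the result depends only on the argument.
    return f"db://localhost/{database_name}"


def check_impure_version() -> str:
    # The check has to patch the global and remember to restore it,
    # otherwise every later check sees the test database too.
    global DATABASE_NAME
    saved = DATABASE_NAME
    DATABASE_NAME = "test"
    try:
        return connection_string_impure()
    finally:
        DATABASE_NAME = saved


def check_pure_version() -> str:
    # No setup or teardown: just pass the value the check needs.
    return connection_string_pure("test")
```

Both checks produce the same string, but only the impure one needs save-and-restore scaffolding, and only the impure one can be broken by another test that forgets to restore the global.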

Alternatives to mutable global state:

  • Function parameters – often overlooked, but parameterising your functions better is often the best way to avoid global state. It forces you to solve the important conceptual question: what information does this function require to do its job? Sometimes it makes sense to have a data structure called “Context” that can be passed down a chain of functions that wraps up all relevant information.
  • Dependency injection – same as for function parameters, just done a bit earlier (at object construction rather than function invocation). Be careful if your dependencies are mutable objects, though: this can quickly cause the same problems as mutable global state.
  • Immutable global state is mostly harmless – it is effectively a constant. But make sure that it really is a constant, and that you aren’t going to be tempted to turn it into mutable global state at a later point!
  • Immutable singletons – pretty much the same as immutable global state, except that you can defer instantiation until they are needed. Useful for e.g. large fixed data structures that need expensive one-off pre-calculation. Mutable singletons are of course equivalent to mutable global state and are therefore evil 🙂
  • Dynamic binding – only available in some languages like Common Lisp/Clojure, but this effectively lets you bind a value within a controlled scope (typically on a thread-local basis) which does not affect other threads. To some extent this is a “safe” way of getting the same effect as a global variable, since you know that only the current thread of execution will be affected. This is particularly useful in the case where you have multiple threads each handling independent transactions, for example.
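Python does not have Lisp-style dynamic binding, but the standard library's `contextvars` module gives a rough analogue of the last item: a value is bound for the duration of a scope and restored afterwards. The sketch below is an approximation of the idea, not an equivalent of Common Lisp or Clojure semantics; `current_transaction`, `binding`, and `describe` are hypothetical names.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical context variable with a default used outside any binding.
current_transaction = ContextVar("current_transaction", default="none")


@contextmanager
def binding(var, value):
    # Bind `value` for the duration of the with-block, then restore
    # whatever was bound before, even if the block raises.
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)


def describe() -> str:
    # Reads the current binding without taking it as a parameter.
    return f"running in transaction {current_transaction.get()}"


outside_before = describe()
with binding(current_transaction, "tx-42"):
    inside = describe()
outside_after = describe()
```

The binding is visible only inside the `with` block, so code outside that scope never observes the temporary value.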

  1. Since your whole damn app can be using them, it’s always incredibly hard to factor them back out again. If you ever change anything to do with your global, all your code needs changing. This is a maintenance headache, far more so than when you can simply grep for the type name to find out which functions use it.
  2. They’re bad because they introduce hidden dependencies, which break multithreading, which is increasingly vital to increasingly many applications.
  3. The state of the global variable is always completely unreliable, because all of your code could be doing anything to it.
  4. They’re really hard to test.
  5. They make calling the API hard. “You must remember to call SET_MAGIC_VARIABLE() before calling API” is just begging for someone to forget to call it. It makes using the API error-prone, causing difficult-to-find bugs. By making it a regular parameter, you force the caller to properly provide a value.

Just pass a reference into functions which need it. It’s not that hard.
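To make the SET_MAGIC_VARIABLE() point concrete, here is a small Python sketch contrasting the two API styles. The functions `set_magic_width`, `pad_fragile`, and `pad` are invented for this example.

```python
from typing import Optional

# Fragile style: a hidden setup step through a module-level variable.
_magic_width: Optional[int] = None


def set_magic_width(width: int) -> None:
    global _magic_width
    _magic_width = width


def pad_fragile(text: str) -> str:
    # Fails if the caller forgot the setup call. In a less careful API
    # it might silently misbehave instead of raising.
    if _magic_width is None:
        raise RuntimeError("set_magic_width() must be called first")
    return text.rjust(_magic_width)


def pad(text: str, width: int) -> str:
    # Explicit style: the signature itself tells the caller what is required.
    return text.rjust(width)
```

With `pad`, forgetting the width is an immediate `TypeError` at the call site; with `pad_fragile`, the mistake only surfaces at runtime, and only if the function bothers to check.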
