Hoooo, doctoral comp flashback. Okay, here goes.

We start with the idea of a decision problem, a problem for which an algorithm can always answer “yes” or “no.” We also need the idea of two models of computer (Turing machine, really): deterministic and non-deterministic. A deterministic computer is the regular computer we always think of; a non-deterministic computer is one that is just like the ones we’re used to, except that it has unlimited parallelism, so that any time you come to a branch, you spawn a new “process” and examine both sides. Like Yogi Berra said, when you come to a fork in the road, you should take it.

A decision problem is in P if there is a known polynomial-time algorithm to get that answer. A decision problem is in NP if there is a known polynomial-time algorithm for a non-deterministic machine to get the answer.

Problems known to be in P are trivially in NP — the nondeterministic machine just never troubles itself to fork another process, and acts just like a deterministic one. There are also problems known to be in neither P nor NP; a simple example is enumerating all the bit vectors of length n. No matter what, that takes at least 2^{n} steps, because the output itself has 2^{n} entries.
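The enumeration example can be made concrete in a few lines. This is just an illustrative sketch (the function name `all_bit_vectors` is my own); the point is that the output alone forces 2^n work:

```python
from itertools import product

def all_bit_vectors(n):
    """Enumerate every bit vector of length n.

    The result has 2**n entries, so no algorithm can produce it in
    fewer than 2**n steps, no matter how clever it is.
    """
    return list(product((0, 1), repeat=n))

vectors = all_bit_vectors(3)
print(len(vectors))  # 2**3 = 8
```

Doubling n squares the running time, which is why this sits outside polynomial time entirely.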

(Strictly, a decision problem is in NP if a nondeterministic machine can arrive at an answer in poly-time, and a deterministic machine can *verify* that the solution is correct in poly time.)

But there are some problems which are known to be in NP for which no poly-time deterministic algorithm is known; in other words, we know they’re in NP, but don’t know if they’re in P. The traditional example is the decision-problem version of the Traveling Salesman Problem (decision-TSP): given the cities and distances, is there a route that covers all the cities, returning to the starting point, in less than *x* distance? It’s easy in a nondeterministic machine, because every time the nondeterministic traveling salesman comes to a fork in the road, he takes it: his clones head on to the next city they haven’t visited, and at the end they compare notes and see if any of the clones took less than *x* distance.

(Then, the exponentially many clones get to fight it out for which ones must be killed.)

It’s not known whether decision-TSP is in P: there’s no known poly-time solution, but there’s no proof such a solution doesn’t exist.
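A deterministic machine can only simulate the salesman’s clones by trying the branches one after another. Here is a minimal sketch of decision-TSP done that way (the function name and the distance matrix are my own invention); it tries all (n−1)! tours, which is why it takes worse-than-polynomial time:

```python
from itertools import permutations

def decision_tsp(dist, x):
    """Is there a tour visiting every city once, returning to the
    start, with total distance strictly less than x?

    dist is a symmetric distance matrix, dist[i][j]. A deterministic
    machine walks the branches one at a time: all (n-1)! orderings of
    the cities after the starting city.
    """
    n = len(dist)
    for route in permutations(range(1, n)):
        tour = (0,) + route + (0,)
        total = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if total < x:
            return True
    return False

# Four cities on a square: sides have distance 1, diagonals 2.
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]
print(decision_tsp(dist, 5))  # True: the tour 0-1-2-3-0 has length 4
print(decision_tsp(dist, 4))  # False: no tour is shorter than 4
```

The nondeterministic salesman gets the same answer by exploring all those tours “at once”; the deterministic version pays for each clone separately.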

Now, one more concept: given decision problems A and B, if an algorithm can transform a solution for A into a solution for B in polynomial time, it’s said that B is *poly-time reducible* (or just reducible) to A.

A problem is NP-complete if you can prove that (1) it’s in NP, and (2) show that it’s poly-time reducible to a problem already known to be NP-complete. (The hard part of that was proving the *first* example of an NP-complete problem: that was done by Steve Cook in Cook’s Theorem.)

So really, what it says is that if anyone ever finds a poly-time solution to one NP-complete problem, they’ve automatically got one for *all* the NP-complete problems; that will also mean that P=NP.

A problem is NP-hard if and only if it’s at least as hard as an NP-complete problem. The more conventional Traveling Salesman Problem of finding the shortest route is NP-hard, not strictly NP-complete, because it isn’t a decision problem.

Michael Sipser’s Introduction to the Theory of Computation is a great book, and is very readable. Another **great** resource is Scott Aaronson’s Great Ideas in Theoretical Computer Science course.

The formalism that is used is to look at decision problems (problems with a Yes/No answer, e.g. “does this graph have a Hamiltonian cycle”) as “languages” — sets of strings — inputs for which the answer is Yes. There is a formal notion of what a “computer” is (Turing machine), and a problem is in P if there is a polynomial time algorithm for deciding that problem (given an input string, say Yes or No) on a Turing machine.

A problem is in NP if it is *checkable* in polynomial time, i.e. if, for inputs where the answer is Yes, there is a (polynomial-size) certificate given which you can check that the answer is Yes in polynomial time. [E.g. given a Hamiltonian cycle as certificate, you can obviously check that it is one.]
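The Hamiltonian-cycle check mentioned above really is a single poly-time pass over the certificate. A minimal sketch (the function name and graph encoding are my own choices, not standard API):

```python
def verify_hamiltonian_cycle(adj, cycle):
    """Check a claimed Hamiltonian cycle against a graph.

    adj maps each vertex to the set of its neighbours; cycle is the
    certificate, a list of vertices. One pass over the certificate,
    i.e. polynomial time.
    """
    n = len(adj)
    if len(cycle) != n or set(cycle) != set(adj):
        return False  # must visit every vertex exactly once
    # every consecutive pair (wrapping around) must be an edge
    return all(cycle[(i + 1) % n] in adj[cycle[i]] for i in range(n))

# The 4-cycle graph 0-1-2-3-0.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(verify_hamiltonian_cycle(adj, [0, 1, 2, 3]))  # True
print(verify_hamiltonian_cycle(adj, [0, 2, 1, 3]))  # False: 0-2 is not an edge
```

Note the asymmetry: checking a given cycle is cheap, but nothing here tells you how to *find* one.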

It doesn’t say anything about how to *find* that certificate. Obviously, you can try “all possible certificates” but that can take exponential time; it is not clear whether you will always *have* to take more than polynomial time to decide Yes or No; this is the P vs NP question.

A problem is NP-hard if being able to solve that problem means being able to solve all problems in NP.

Also see this question:

What is an NP-complete in computer science?

But really, all these are probably only vague to you; it is worth taking the time to read e.g. Sipser’s book. It is a beautiful theory.

This is a comment on Charlie’s post.

> A problem is NP-complete if you can prove that (1) it’s in NP, and (2) show that it’s poly-time reducible to a problem already known to be NP-complete.

There is a subtle error with the second condition. Actually, what you need to prove is that a known NP-complete problem (say *Y*) is polynomial-time reducible to this problem (let’s call it problem *X*).

The reasoning behind this manner of proof is that if you could reduce an *NP*-complete problem to this problem and somehow manage to solve this problem in poly-time, then you’d also have found a poly-time solution to the *NP*-complete problem, which would be a remarkable (if not impossible) feat, since you would then have resolved the long-standing *P = NP* problem.

Another way to look at this proof is as an application of the contrapositive: a reduction from *Y* to *X* says that a poly-time solution to *X* yields a poly-time solution to *Y*; contrapositively, if *Y* cannot be solved in poly-time, then *X* cannot be solved in poly-time either. On the other hand, if you could solve *X* in poly-time, then you could solve *Y* in poly-time as well, and, by transitivity, all problems that reduce to *Y*.
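To make the correct direction concrete, here is the classic textbook reduction *from* Hamiltonian cycle (known NP-complete, playing the role of *Y*) *to* decision-TSP (the problem *X* being proven hard). This is a sketch with my own function name and graph encoding:

```python
def ham_cycle_to_tsp(adj):
    """Reduce a Hamiltonian-cycle instance to a decision-TSP instance.

    adj maps each vertex to the set of its neighbours. Edges get
    distance 1, non-edges distance 2; the graph has a Hamiltonian
    cycle iff the TSP instance has a tour of total distance < n + 1
    (i.e. exactly n). The transformation is one pass over all vertex
    pairs, so it is clearly polynomial.
    """
    nodes = sorted(adj)
    n = len(nodes)
    dist = [[0 if i == j else (1 if nodes[j] in adj[nodes[i]] else 2)
             for j in range(n)]
            for i in range(n)]
    return dist, n + 1  # ask: is there a tour of distance < n + 1?

# The 4-cycle 0-1-2-3-0 has a Hamiltonian cycle, so the produced
# TSP instance has a tour of distance 4 < 5.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
dist, bound = ham_cycle_to_tsp(adj)
print(bound)       # 5
print(dist[0][2])  # 2: vertices 0 and 2 are not adjacent
```

Note that the known-hard problem is transformed into the new one, never the reverse: a fast TSP solver fed this output would decide Hamiltonian cycle.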

I hope my explanation above is clear enough. A good source is Chapter 8 of *Algorithm Design* by Kleinberg and Tardos, or Chapter 34 of *Introduction to Algorithms* by Cormen et al.