A socket is a pseudo-file that represents a network connection. Once a socket has been created (identifying the other host and port), writes to that socket are turned into network packets that get sent out, and data received from the network can be read from the socket.
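To make that concrete, here is a minimal sketch in Python of a TCP socket on the loopback interface. The helper names echo_once and tcp_round_trip are made up for this illustration, and letting the OS pick the port (port 0) is just a convenience, not something the description above requires.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and echo back whatever arrives until EOF."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # empty read: the peer finished sending
                break
            conn.sendall(data)

def tcp_round_trip(message: bytes) -> bytes:
    # Listening socket on loopback; port 0 asks the OS for any free port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        host, port = server.getsockname()
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        # Connecting is the step that identifies the other host and port.
        with socket.create_connection((host, port)) as client:
            client.sendall(message)           # write: becomes outgoing packets
            client.shutdown(socket.SHUT_WR)   # tell the peer we are done writing
            chunks = []
            while True:
                chunk = client.recv(1024)     # read: data received from the peer
                if not chunk:
                    break
                chunks.append(chunk)
        worker.join()
        return b"".join(chunks)

reply = tcp_round_trip(b"hello over TCP")
```

After this runs, reply holds the same bytes that were sent, having travelled through the socket in both directions.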
Sockets are similar to pipes. Both look like files to the programs using them. Both facilitate interprocess communication. Pipes communicate with a local program; sockets communicate with a remote program. Sockets also offer, as you mention, bidirectional communication (much like a pair of properly connected pipes could).
Finally, it is common for programs on a single machine to communicate using standard network protocols, such as TCP; it would be wasteful to go all the way to the network hardware (if any!), compute checksums, etc., just to end up back at the same host. Unix domain sockets handle this case. They connect processes on the same host rather than remote processes, bypassing the network.
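As a rough sketch of the Unix domain case, assuming a Unix-like system where Python exposes socket.AF_UNIX: the address is a filesystem path instead of a host and port, and everything else looks like ordinary socket code. The path demo.sock and the function unix_round_trip are hypothetical names chosen for this example.

```python
import os
import socket
import tempfile
import threading

def unix_round_trip(message: bytes) -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")   # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)                   # creates the socket file
            server.listen(1)

            def serve():
                conn, _ = server.accept()
                with conn:
                    # Reply with the upper-cased request (short message assumed).
                    conn.sendall(conn.recv(1024).upper())

            worker = threading.Thread(target=serve)
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)            # address is a path, not host:port
                client.sendall(message)
                answer = client.recv(1024)
            worker.join()
            return answer

unix_reply = unix_round_trip(b"ping")
```

Both processes (threads here, for brevity) stay on the same host, and no network interface is involved.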
As tripleee mentioned, in the course of the history of BSD, pipes were introduced earlier than sockets, and were reimplemented using sockets once those existed. The same reference, The Design and Implementation of the FreeBSD Operating System, mentions that pipes were then reverted to a non-socket implementation for performance reasons.
A socket is just a logical endpoint for communication. Sockets exist at the transport layer. You can send and receive data on a socket, and you can bind a socket to an address and listen on it. A socket is specific to a protocol, machine, and port, and is addressed as such in the header of a packet.
Beej’s guides to Network Programming and Inter-Process Communication both have good information on how to use sockets, and even answer this exact question.
Now, what is it?
A socket can be several things:
First of all, it is a thought model and an application programming interface (API). That means you have a set of rules you need to follow and a set of functions that you can use to write programs that do something, according to a precisely specified contract. In this particular case, “something” means exchanging data with another program.
The sockets API abstracts away most of the details of “communication” in general. It encapsulates who you talk with and how, all through one (almost) consistent, cookie-cutter form.
You can create sockets in different “domains” (e.g. a “unix socket” or an “internet socket”) and of different types of communication (e.g. a “datagram” socket or a “stream” socket) and talk to different recipients, and everything works exactly the same (well, 99%, there are obviously minute differences that you’ll have to account for).
You do not need to know (and you do not even want to know!) whether you talk to another program on the same computer or on a different computer, or whether there is an IPv4 or IPv6 network in between those computers, or maybe some other protocol that you have never heard of.
socket is also the name of the library function (or syscall) which creates “the socket”, which is a special kind of file (in Unix, nearly everything is a file).
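A small sketch of that uniformity, assuming a Unix-like system with Python's socket module: the same call creates sockets in different domains and of different types, and each one is backed by a file descriptor that the OS reports as a socket-type file. The list names combos and results are only for this illustration.

```python
import os
import socket
import stat

combos = [
    (socket.AF_INET, socket.SOCK_STREAM),   # internet domain, stream (TCP)
    (socket.AF_INET, socket.SOCK_DGRAM),    # internet domain, datagram (UDP)
    (socket.AF_UNIX, socket.SOCK_STREAM),   # Unix domain, stream
    (socket.AF_UNIX, socket.SOCK_DGRAM),    # Unix domain, datagram
]

results = []
for family, kind in combos:
    # Same constructor for every domain/type combination.
    with socket.socket(family, kind) as s:
        fd = s.fileno()                      # plain integer file descriptor
        is_socket_file = stat.S_ISSOCK(os.fstat(fd).st_mode)
        results.append((s.family, s.type, is_socket_file))

for family, kind, is_socket_file in results:
    print(family.name, kind.name, is_socket_file)
```

Only the arguments to socket.socket differ between the four cases; the object you get back exposes the same send, recv, bind and connect calls.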
How does it compare to…
sockets fall into the same category as pipes and named pipes
A pipe is a means of one-way communication between a reader and a writer (both being programs) on the same computer. It simulates a stream of data (just like e.g. TCP).
That is, no individual “messages” or “blocks of data” exist from the pipe’s point of view. You can copy any amount of data into “one end”, and someone else can read any amount of data (not necessarily the same amount, and not necessarily in one go) at the “other end”, with the bytes arriving in the same order as you’ve pushed them in.
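A quick way to see the missing message boundaries, sketched with Python's os.pipe on a Unix-like system (the variable names are arbitrary): two separate writes go in, and a single read comes back with bytes from both.

```python
import os

read_fd, write_fd = os.pipe()

# Two separate writes; the pipe keeps no record of where one ends.
os.write(write_fd, b"abc")
os.write(write_fd, b"def")
os.close(write_fd)              # closing the write end lets the reader see EOF

# With 6 bytes already buffered, a 4-byte read spans both writes.
first = os.read(read_fd, 4)

# Drain whatever is left until EOF (an empty read).
rest = b""
while True:
    chunk = os.read(read_fd, 4)
    if not chunk:
        break
    rest += chunk
os.close(read_fd)
```

Here first ends up as b"abcd" and rest as b"ef": the order of the bytes is preserved, but the split between the two writes is gone.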
A named pipe is, well, simply a pipe which owns a name in the filesystem. That is, it’s something that looks and behaves just like a file: it appears in the directory listing and you can open it, write to it, etc. Note that you can also create socket special files (that would be a named socket).
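For illustration, a named pipe can be created with os.mkfifo on a Unix-like system and then used through the ordinary open call. This is only a sketch; the file name demo.fifo and the function fifo_round_trip are invented for the example.

```python
import os
import stat
import tempfile
import threading

def fifo_round_trip(message: bytes):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")     # hypothetical name
        os.mkfifo(path)                           # the pipe now has a filesystem entry
        is_fifo = stat.S_ISFIFO(os.stat(path).st_mode)

        def writer():
            # Opening for writing blocks until a reader opens the other end.
            with open(path, "wb") as f:
                f.write(message)

        worker = threading.Thread(target=writer)
        worker.start()
        with open(path, "rb") as f:               # opened like any other file
            data = f.read()                       # reads until the writer closes
        worker.join()
        return is_fifo, data

fifo_flag, fifo_data = fifo_round_trip(b"through a named pipe")
```

The writer and reader only need to agree on the path; in real use they would normally be two unrelated processes rather than two threads.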
A socket, on the other hand, is a means of two-way (“duplex”) communication. That means you can write to and read from the same socket, and you do not need two separate sockets for two-way communication.
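The duplex property is easy to show with socket.socketpair, which hands back two already-connected sockets. A minimal sketch, with made-up payloads:

```python
import socket

# Two connected endpoints; each one can both send and receive.
left, right = socket.socketpair()

left.sendall(b"question")        # left -> right
got_right = right.recv(1024)

right.sendall(b"answer")         # right -> left, over the same pair
got_left = left.recv(1024)

left.close()
right.close()
```

Doing the same with pipes would take two of them, one per direction.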
Also, a socket can act as a stream (much like a pipe), or it can send discrete, unreliable messages, or it can send discrete, ordered messages (the first two are available in most domains; the last is chiefly available in the “unix domain”, though some other protocols such as SCTP offer it too). It can send messages (or simulate a stream) to someone on an entirely different computer. A socket can even do a form of one-to-many communication (multicast) under some conditions.
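To contrast with the stream behaviour, here is a sketch of a datagram socket pair in the Unix domain (assuming a Unix-like system where socket.AF_UNIX is available). Each send is delivered as one discrete message, so the boundaries survive.

```python
import socket

# A connected pair of datagram sockets in the Unix domain.
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

sender.send(b"abc")              # first message
sender.send(b"def")              # second message

# Each recv returns exactly one message, even though the buffer is larger.
msg1 = receiver.recv(1024)
msg2 = receiver.recv(1024)

sender.close()
receiver.close()
```

With a stream socket or a pipe, the first read could have returned all six bytes at once; here msg1 and msg2 come back as the two separate messages.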
With that in mind, it is clear that sockets do something much more complicated and generally have more overhead than pipes (which are basically no more than a simple memcpy to and from a buffer!), but if you create local sockets (i.e. on the same computer), the operating system usually applies a heavily optimized fast path, so there is really not much of a difference.
sometimes mentioned with regard to networks
Yes, sockets are one possible way of inter-process communication (shared memory and pipes being examples of alternatives). At the same time, they are used for “networking”, as explained above.