What is the difference between “cat file | ./binary” and “./binary < file”?

Solutions:


In:

./binary < file

binary’s stdin is the file, opened in read-only mode. Note that bash doesn’t read the file at all; it just opens it for reading on file descriptor 0 (stdin) of the process in which it executes binary.

In:

./binary << EOF
test
EOF

Depending on the shell, binary’s stdin will be either a deleted temporary file (AT&T ksh, zsh, bash…) that contains test\n as put there by the shell, or the reading end of a pipe (dash, yash; the shell writes test\n in parallel at the other end of the pipe). In your case, if you’re using bash, it would be a temp file (bash 5.1 and later use a pipe instead when the here-document is small enough to fit in the pipe buffer).

In:

cat file | ./binary

Depending on the shell, binary’s stdin will be either the reading end of a pipe or one end of a socket pair whose writing direction has been shut down (ksh93); in both cases cat writes the content of file at the other end.

When stdin is a regular file (temporary or not), it is seekable: binary may go to the beginning or end, rewind, etc. It can also mmap() it or do some ioctl()s like FIEMAP/FIBMAP (and with <> instead of <, it could also truncate it, punch holes in it, etc.).

Pipes and socket pairs, on the other hand, are a means of inter-process communication. There’s not much binary can do with them besides reading the data (though there are some operations, such as pipe-specific ioctl()s, that it could do on them and not on regular files).

Most of the time, it’s the missing ability to seek that causes applications to fail or complain when working with pipes, but it could be any of the other system calls that are valid on regular files but not on other types of files (like mmap(), ftruncate(), fallocate()). On Linux, there’s also a big difference in behaviour when you open /dev/stdin depending on whether fd 0 is on a pipe or on a regular file.

There are many commands out there that can only deal with seekable files, but when that’s the case, that’s generally not for the files open on their stdin.

$ unzip -l file.zip
Archive:  file.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
       11  2016-12-21 14:43   file
---------                     -------
       11                     1 file
$ unzip -l <(cat file.zip)
     # more or less the same as cat file.zip | unzip -l /dev/stdin
Archive:  /proc/self/fd/11
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of /proc/self/fd/11 or
        /proc/self/fd/11.zip, and cannot find /proc/self/fd/11.ZIP, period.

unzip needs to read the index stored at the end of the file, and then seek within the file to read the archive members. But here, the file (regular in the first case, pipe in the second) is given as a path argument to unzip, and unzip opens it itself (typically on a fd other than 0) instead of inheriting a fd already opened by the caller. It doesn’t read zip files from its stdin; stdin is mostly used for user interaction.

If you run that binary of yours without redirection at the prompt of an interactive shell running in a terminal emulator, then binary’s stdin will be inherited from its caller, the shell, which itself will have inherited it from its caller, the terminal emulator. It will be a pty device open in read+write mode (something like /dev/pts/n).

Those devices are not seekable either. So, if binary works OK when taking input from the terminal, possibly the issue is not about seeking.

If that 14 is meant to be an errno (an error code set by failing system calls), then on most systems, that would be EFAULT (Bad address). The read() system call would fail with that error if asked to read into a memory address that is not writable. That would be independent of whether the fd to read the data from points to a pipe or regular file and would generally indicate a bug [1].

binary possibly determines the type of file open on its stdin (with fstat()) and runs into a bug when it’s neither a regular file nor a tty device.

Hard to tell without knowing more about the application. Running it under strace (or truss/tusc equivalent on your system) could help us see what is the system call if any that is failing here.


[1] The scenario envisaged by Matthew Ife in a comment to your question sounds quite plausible here. Quoting him:

I suspect it is seeking to the end of file to get a buffer size for reading the data, badly handling the fact that seek doesn’t work and attempting to allocate a negative size (not handling a bad malloc). Passing the buffer to read which faults given the buffer is not valid.

Here’s a simple example program that illustrates Stéphane Chazelas’ answer using lseek(2) on its input:

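#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int c;
    off_t off;

    /* Skip the first 10 bytes of stdin; this fails on a pipe (ESPIPE). */
    off = lseek(0, 10, SEEK_SET);
    if (off == -1)
    {
        perror("Error");
        return 1;
    }
    /* Print the 11th byte of the input. */
    c = getchar();
    printf("%c\n", c);
    return 0;
}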

Testing:

$ make seek
cc     seek.c   -o seek
$ cat foo
abcdefghijklmnopqrstuwxyz
$ ./seek < foo
k
$ ./seek <<EOF
> abcdefghijklmnopqrstuvwxyz
> EOF
k
$ cat foo | ./seek
Error: Illegal seek

Pipes are not seekable, and that’s one place where a program might complain about pipes.

Pipes and redirections are different animals, so to speak. When you use a here-document ( << ) or redirect stdin with <, the text doesn’t come out of thin air: the shell opens a file (a temporary file, in the here-document case) on a file descriptor, and that is what the binary’s stdin will be pointing to.

Specifically, here’s an excerpt from bash's source code, redir.c file (version 4.3):

/* Create a temporary file holding the text of the here document pointed to
   by REDIRECTEE, and return a file descriptor open for reading to the temp
   file.  Return -1 on any error, and make sure errno is set appropriately. */
static int
here_document_to_fd (redirectee, ri)

Since such a redirection is backed by a file, the binary can navigate it easily, using lseek() to jump to any byte of the file.

Pipes, being kernel buffers (64 KiB by default on Linux, with writes of 4096 bytes or less guaranteed to be atomic), aren’t seekable: you cannot freely navigate them, only read them sequentially. I once implemented the tail command in Python. The end of a 29-million-line file can be reached in microseconds when the file is redirected, but when it is cat’ed through a pipe, nothing can be done: it all has to be read sequentially.

Another possibility is that the binary specifically wants a regular file and refuses input from a pipe. That’s usually done by calling fstat() on the input and testing the file type with the S_ISFIFO() macro, which is true for a pipe or named pipe.

Your specific binary, since we don’t know what it is, probably attempts to seek, which doesn’t work on pipes. Consult its documentation to find out exactly what error code 14 means.

NOTE: Some shells, such as dash (Debian Almquist Shell, the default /bin/sh on Ubuntu), implement here-doc redirection with pipes internally, so their here-documents may not be seekable. The point remains the same: pipes are sequential and cannot be navigated freely, and attempts to do so will result in errors.
