What’s the difference between a CTE and a Temp Table?

Solutions:


This is pretty broad, but I’ll give you as general an answer as I can.

CTEs…

  • Are unindexable (but can use existing indexes on referenced objects)
  • Cannot have constraints
  • Are essentially disposable VIEWs
  • Are scoped to the single statement that immediately follows the definition
  • Can be recursive
  • Do not have dedicated stats (rely on stats on the underlying objects)
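A minimal T-SQL sketch of two of the points above: recursion and single-statement scope. The CTE name Numbers and its column n are made up for illustration.

```sql
-- Recursive CTE: the anchor member yields 1, and the recursive member
-- keeps adding 1 until its WHERE clause filters everything out.
WITH Numbers (n) AS
(
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 5
)
SELECT n FROM Numbers;   -- the CTE is in scope for this statement only

-- A later statement cannot see the CTE. Uncommenting the next line
-- fails with an "Invalid object name" error.
-- SELECT n FROM Numbers;
```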

#Temp Tables…

  • Are real materialized tables that exist in tempdb
  • Can be indexed
  • Can have constraints
  • Persist for the life of the current CONNECTION unless dropped earlier (one created inside a stored procedure is dropped when that procedure ends)
  • Can be referenced by other queries or subprocedures
  • Have dedicated stats generated by the engine
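As a rough illustration of the list above, here is a #temp table with constraints and an extra index that is then read by more than one statement. The table, column, and index names are hypothetical.

```sql
-- Hypothetical temp table; all names are for illustration only.
CREATE TABLE #Orders
(
    OrderId    INT            NOT NULL PRIMARY KEY,         -- constraint, also creates an index
    CustomerId INT            NOT NULL,
    Amount     DECIMAL(10, 2) NOT NULL CHECK (Amount >= 0)  -- constraint
);

-- An additional index, just as on a permanent table.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId ON #Orders (CustomerId);

INSERT INTO #Orders (OrderId, CustomerId, Amount)
VALUES (1, 10, 25.00), (2, 10, 40.00), (3, 20, 15.50);

-- Unlike a CTE, the table is still there for the next statement.
SELECT CustomerId, SUM(Amount) AS Total FROM #Orders GROUP BY CustomerId;
SELECT COUNT(*) AS OrderCount FROM #Orders;

-- Clean up explicitly rather than waiting for the connection to close.
DROP TABLE #Orders;
```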

As far as when to use each, they have very different use cases. If you will have a very large result set, or need to refer to it more than once, put it in a #temp table. If it needs to be recursive, is disposable, or is just to simplify something logically, a CTE is preferred.

Also, a CTE should never be used for performance. You will almost never speed things up by using a CTE, because, again, it’s just a disposable view. You can do some neat things with them but speeding up a query isn’t really one of them.

EDIT:

Please see Martin’s comments below:

The CTE is not materialised as a table in memory. It is just a way of encapsulating a query definition. In the case of the OP it will be inlined and the same as just doing SELECT Column1, Column2, Column3 FROM SomeTable. Most of the time they do not get materialised up front, which is why this returns no rows:

    WITH T(X) AS (SELECT NEWID()) SELECT * FROM T T1 JOIN T T2 ON T1.X = T2.X

Also check the execution plans. Though sometimes it is possible to hack the plan to get a spool. There is a connect item requesting a hint for this. – Martin Smith Feb 15 ’12 at 17:08
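To make the contrast in that comment concrete, the sketch below runs the same self-join twice: once against the inlined CTE and once against a #temp table, where the NEWID() value really is computed once and stored. The temp table name #T is an assumption for illustration.

```sql
-- Inlined CTE: each reference to T evaluates NEWID() separately,
-- so the two sides of the join normally carry different values.
WITH T (X) AS (SELECT NEWID())
SELECT * FROM T T1 JOIN T T2 ON T1.X = T2.X;

-- Materialised copy: the value is generated once and stored in tempdb,
-- so both sides of the join read the same row and it matches itself.
SELECT NEWID() AS X INTO #T;
SELECT * FROM #T T1 JOIN #T T2 ON T1.X = T2.X;

DROP TABLE #T;
```

Comparing the two execution plans, as the comment suggests, shows the difference directly.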


Original answer

CTE

Read more on MSDN

A CTE creates the table being used in memory, but is only valid for the specific query following it. When using recursion, this can be an effective structure.

You might also want to consider using a table variable. It is used much like a temp table and can be referenced multiple times without being re-materialized for each join. It is also handy if you need to persist a few records now, add a few more after the next select, add a few more after another operation, and then return just that handful of records, because it doesn’t need to be dropped after execution. Mostly this is syntactic sugar. However, if you keep the row count low, it may never be written to disk. See What’s the difference between a temp table and table variable in SQL Server? for more details.
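A small sketch of the accumulate-then-return pattern described above, using a table variable. The variable name @Picked and its columns are hypothetical.

```sql
-- Hypothetical table variable; names are for illustration only.
DECLARE @Picked TABLE
(
    Id    INT         NOT NULL PRIMARY KEY,
    Label VARCHAR(20) NOT NULL
);

INSERT INTO @Picked (Id, Label) VALUES (1, 'first');
-- ... some other work happens here ...
INSERT INTO @Picked (Id, Label) VALUES (2, 'second');
-- ... and some more ...
INSERT INTO @Picked (Id, Label) VALUES (3, 'third');

-- Return just the handful of rows collected along the way.
SELECT Id, Label FROM @Picked ORDER BY Id;

-- No DROP is needed: the variable goes out of scope when the batch ends.
```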

Temp Table

Read more on MSDN – Scroll down about 40% of the way

A temp table is literally a table created on disk, just in a specific database that everyone knows can be deleted. It is the responsibility of a good dev to destroy those tables when they are no longer needed, but a DBA can also wipe them.

Temporary tables come in two varieties: local and global. In MS SQL Server you use a #tableName designation for local and a ##tableName designation for global (note the single or double # as the identifying characteristic).
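The two naming forms look like this in practice. The table names are made up, and the visibility notes in the comments describe standard SQL Server behaviour rather than anything specific to this answer.

```sql
-- Local temp table: visible only to the session that created it.
CREATE TABLE #LocalScratch (Id INT NOT NULL);

-- Global temp table: visible to every session for as long as it exists.
CREATE TABLE ##GlobalScratch (Id INT NOT NULL);

-- Both are created in tempdb, which is where OBJECT_ID has to look.
SELECT OBJECT_ID('tempdb..#LocalScratch')   AS LocalObjectId,
       OBJECT_ID('tempdb..##GlobalScratch') AS GlobalObjectId;

DROP TABLE #LocalScratch;
DROP TABLE ##GlobalScratch;
```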

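Notice that with temp tables, as opposed to CTEs, you can apply indexes and the like, as these are legitimately tables in the normal sense of the word. Table variables are more limited: you cannot run CREATE INDEX on one after it is declared, though PRIMARY KEY and UNIQUE constraints declared with the variable do create indexes.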


Generally I would use temp tables for longer or larger queries, and CTEs or table variables if I had a small dataset already and wanted to just quickly script up a bit of code for something small. Experience and the advice of others indicate that you should use CTEs where you have a small number of rows being returned. If you have a large number, you would probably benefit from the ability to index the temp table.

The accepted answer here says “a CTE should never be used for performance” – but that could mislead. In the context of CTEs versus temp tables, I’ve just finished removing a swathe of junk from a suite of stored procs because some doofus must’ve thought there was little or no overhead to using temp tables. I shoved the lot into CTEs, except those which were legitimately going to be re-used throughout the process. I gained about 20% performance by all metrics. I then set about removing all the cursors which were trying to implement recursive processing. This was where I saw the greatest gain. I ended up slashing response times by a factor of ten.
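The stored procedures in question are not shown, so the following is only a sketch of the general idea: a hierarchy walk that might otherwise be written as a cursor loop, expressed as one recursive CTE. The @Employee table variable, its columns, and the sample rows are all hypothetical.

```sql
-- Hypothetical sample data: each row points at its manager.
DECLARE @Employee TABLE
(
    EmployeeId INT         NOT NULL PRIMARY KEY,
    ManagerId  INT         NULL,
    Name       VARCHAR(50) NOT NULL
);

INSERT INTO @Employee (EmployeeId, ManagerId, Name)
VALUES (1, NULL, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cy'), (4, 1, 'Di');

-- Set-based walk down the tree: the anchor member picks the root,
-- and the recursive member joins each level to its direct reports.
WITH Chain AS
(
    SELECT EmployeeId, ManagerId, Name, 0 AS Depth
    FROM @Employee
    WHERE ManagerId IS NULL

    UNION ALL

    SELECT e.EmployeeId, e.ManagerId, e.Name, c.Depth + 1
    FROM @Employee AS e
    JOIN Chain AS c ON e.ManagerId = c.EmployeeId
)
SELECT EmployeeId, Name, Depth
FROM Chain
ORDER BY Depth, EmployeeId;
```

Whether this beats a cursor in a given procedure depends on the data and the plan, so it is worth measuring both.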

CTEs and temp tables do have very different use cases. I just want to emphasise that, while not a panacea, the comprehension and correct use of CTEs can lead to some truly stellar improvements in both code quality/maintainability and speed. Since I got a handle on them, I see temp tables and cursors as the great evils of SQL processing. I can get by just fine with table variables and CTEs for almost everything now. My code is cleaner and faster.
