How do mobile carriers know video resolution over HTTPS connections?


This is an active area of research. I happen to have done some work in this area, so I’ll share what I can about the basic idea (this work was with industry partners and I can’t share the secret details 🙂).

The tl;dr is that it’s often possible to identify an encrypted traffic stream as carrying video, and it’s often possible to estimate its resolution – but it’s complicated, and not always accurate. There are a lot of people working on ways to do this more consistently and more accurately.

Video traffic has some specific characteristics that can distinguish it from other kinds of traffic. Here I refer specifically to video on demand – not live streaming video. Video on demand doesn’t often have those priority tags mentioned in this answer. Also I refer specifically to adaptive video, meaning that the video is divided into segments (each about 2-10 seconds long), and each segment of video is encoded at multiple quality levels (quality level meaning: long-term video bitrate, codec, and resolution). As you play the video, the quality level at which the next segment is downloaded depends on what data rate the application thinks your network can support. (That’s the DASH protocol referred to in this answer.)

If your phone is playing a video, and you look at the (weighted moving average of) data rate of the traffic going to your phone over time, it might look something like this:

[Figure: data rate over time]

(This was captured from a YouTube session over Verizon. It shows the moving average over 15 seconds and also a short-term average.)
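As a rough illustration of how such a curve could be computed, here is a minimal Python sketch. It assumes you already have a list of `(timestamp_s, size_bytes)` records for one flow; the function name and parameters are made up for this example, and it uses a plain trailing window rather than any particular weighting.

```python
from collections import deque

def windowed_rate_mbps(packets, window_s, step_s=1.0):
    """Average data rate over a trailing window.

    packets: list of (timestamp_s, size_bytes), sorted by timestamp.
    Returns a list of (time_s, rate_mbps) sampled every step_s seconds.
    """
    if not packets:
        return []
    samples = []
    window = deque()          # packets currently inside the window
    bytes_in_window = 0
    i = 0
    t = packets[0][0] + step_s
    end = packets[-1][0]
    while t <= end:
        # Add packets that arrived up to time t.
        while i < len(packets) and packets[i][0] <= t:
            window.append(packets[i])
            bytes_in_window += packets[i][1]
            i += 1
        # Drop packets that are older than the window.
        while window and window[0][0] <= t - window_s:
            bytes_in_window -= window.popleft()[1]
        samples.append((t, bytes_in_window * 8 / window_s / 1e6))
        t += step_s
    return samples
```

Calling it with `window_s=15` gives a long-term curve, and a smaller window gives a short-term one. The first few samples underestimate the rate because the window is not yet full.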

There are a few different parts to this session:

First, the video application (YouTube player) tries to fill the buffer up to the buffer capacity. During this time, it is pulling data at whatever rate the network can support. At this stage, it’s basically indistinguishable from a large file download, unless you can infer that it’s video traffic from the remote address (as mentioned in this answer).

Once the buffer is full, then you get “bursts” at sort-of-regular intervals. Suppose your buffer can hold 200 seconds of video. When the buffer has 200 seconds of video in it, the application stops downloading. Then after a segment of video has played back (say 5 seconds), there is room in the buffer again, so it’ll download the next segment, then stop again. That’s what causes this bursty pattern.
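To make the two phases concrete, here is a toy simulation in Python. It is only a sketch under simplifying assumptions (constant network rate, constant video bitrate, fixed segment duration, network faster than the video bitrate); real players behave in more complicated ways, and all names and numbers here are made up.

```python
def simulate_player(video_mbps, net_mbps, segment_s, buffer_cap_s, total_video_s):
    """Return (start, end) times of each segment download for a toy player."""
    t = 0.0
    buffered = 0.0      # seconds of video currently in the buffer
    downloaded = 0.0    # seconds of video fetched so far
    bursts = []
    # Time to fetch one segment at the full network rate.
    dl_time = segment_s * video_mbps / net_mbps
    while downloaded < total_video_s:
        if buffered + segment_s > buffer_cap_s:
            # Buffer is full: wait until playback frees room for one segment.
            wait = buffered + segment_s - buffer_cap_s
            t += wait
            buffered -= wait
        start = t
        t += dl_time
        if bursts:
            # Playback has started, so the buffer drained during the download.
            buffered = max(0.0, buffered - dl_time)
        buffered += segment_s
        downloaded += segment_s
        bursts.append((start, t))
    return bursts
```

With `simulate_player(4.0, 20.0, 5.0, 30.0, 100.0)`, the first downloads run back to back (buffer filling), and once the buffer is full the downloads start one segment duration apart, which is the bursty pattern described above.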

This pattern is very characteristic of video – traffic from other applications doesn’t have this pattern – so a network service provider can pretty easily pick out flows that carry video traffic. In some cases, you might not ever observe this pattern – for example, if the video is so short that the entire thing is loaded into the buffer at once and then the client stops downloading. Under those circumstances, it’s very difficult to distinguish video traffic from a file download (unless you can figure it out by remote address).
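A very simple way to look for this pattern is sketched below. It assumes you have per-second byte counts for a single flow, and the thresholds (`idle_threshold`, `min_bursts`, `max_cv`) are illustrative guesses rather than values anyone is known to use in practice.

```python
from statistics import mean, pstdev

def find_bursts(bytes_per_s, idle_threshold=1000):
    """Group consecutive active seconds into bursts: (start_index, total_bytes)."""
    bursts = []
    current = None
    for i, b in enumerate(bytes_per_s):
        if b > idle_threshold:
            if current is None:
                current = [i, 0]
            current[1] += b
        elif current is not None:
            bursts.append(tuple(current))
            current = None
    if current is not None:
        bursts.append(tuple(current))
    return bursts

def looks_like_steady_state_video(bytes_per_s, min_bursts=5, max_cv=0.3):
    """True if the flow shows several bursts at fairly regular intervals."""
    bursts = find_bursts(bytes_per_s)
    if len(bursts) < min_bursts:
        return False
    starts = [start for start, _ in bursts]
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    # Coefficient of variation of the gaps: low means regular spacing.
    cv = pstdev(gaps) / mean(gaps)
    return cv <= max_cv
```

A continuous download shows up as one long burst and is rejected, which matches the limitation above: a short video that fits entirely in the buffer never produces the pattern.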

Anyway, once you have identified the flow as carrying video traffic – either by the remote address (not always possible, since major video providers use content distribution networks that are not exclusive to video) or by its traffic pattern (possible if the video session is long, much more difficult if it is so short that the whole video is loaded into the buffer all at once)…

Now, as Hector said, you can try to guess the resolution from the bitrate by looking at the size (in bytes) of each “burst” of data:

From the size per duration you could make a reasonable estimate of the resolution – especially if you keep a rolling average.
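In its most naive form, that estimate could look like the Python sketch below. The bitrate ladder is entirely hypothetical (real services use different and changing ladders), and the segment duration is assumed to be known and fixed.

```python
from collections import deque

# Hypothetical ladder: (upper bound on bitrate in Mbps, label).
LADDER = [(0.7, "240p"), (1.2, "360p"), (2.5, "480p"), (5.0, "720p"), (8.0, "1080p")]

def guess_resolution(bitrate_mbps):
    """Map a bitrate to the first ladder entry that can contain it."""
    for upper, label in LADDER:
        if bitrate_mbps <= upper:
            return label
    return "above 1080p"

def rolling_estimates(burst_bytes, segment_s, window=3):
    """Rolling-average bitrate and resolution guess after each burst."""
    recent = deque(maxlen=window)
    estimates = []
    for size in burst_bytes:
        recent.append(size * 8 / segment_s / 1e6)   # Mbps for this burst
        avg = sum(recent) / len(recent)
        estimates.append((round(avg, 2), guess_resolution(avg)))
    return estimates
```

For example, bursts of 1,250,000 bytes with an assumed 5-second segment work out to 2 Mbps, which this made-up ladder maps to 480p.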

But, this can be difficult. Take the YouTube session in my example:

  • Not all segments are the same duration – the duration of video requested at a time depends on several factors (the quality level, network status, what kind of device you are playing the video on, and others). So you can’t necessarily look at a “burst” and say, “OK, this was X bytes representing 5 seconds of video, so I know the video data rate”. Sometimes you can figure out the likely segment duration but other times it is tricky.
  • For a given video quality level and segment duration, different segments will have different sizes (depending on things like how much motion takes place in that part of the video).
  • Even for the same video resolution, the long-term data rate can vary – a 1080p video encoded with VP9 won’t have the same long-term data rate as one encoded with H.264.
  • The video quality level changes according to perceived network quality (which is visible to the network service provider) and buffer status (which is not). So you can look at long-term data rates over 30 seconds, but it’s possible that the actual video quality level changed several times over that 30 seconds.
  • During periods when the buffer is draining or filling as fast as possible (when you don’t have those “bursts”), it’s much harder to estimate what’s going on in the video.
  • To complicate things even further: sometimes a video flow will be “striped” across multiple lower-layer flows. Sometimes part of the video will be retrieved from one address, and then it will switch to retrieving the video from a different address.

That graph of data rate I showed you just above? Here’s what the video resolution was over that time interval:

[Figure: video resolution]

Here, the color indicates the video resolution. So… you can sort of estimate what’s going on just from the traffic patterns. But it’s a difficult problem! There are other markers in the traffic that you can look at. I can’t say definitively how any one service provider is doing it. But at least as far as the academic state-of-the-art goes, there isn’t any way to do this with perfect accuracy, all of the time (unless you have the cooperation of the video providers…)

If you’re interested in learning more about the techniques used for this kind of problem, there’s a lot of academic literature out there – see for example BUFFEST: Predicting Buffer Conditions and Real-time Requirements of HTTP(S) Adaptive Streaming Clients as a starting point. (Not my paper – just one I happen to have read recently.)

Nothing maxes out bandwidth at a consistent rate other than streaming video.

Also, in order to make sure that stream is handled with priority (and not like a big file download, for instance) streaming sources tag the packets in a way that tells the carriers that it is streaming video. The rest of the packet is encrypted, but the metadata that tells the ISP how to route it gets to see this part. If they did not do this, there would be a high chance that the stream would get interrupted or degraded as the ISP tried to balance all the needs of the network traffic at that time.

And here is how Verizon said they will do it:

Verizon apparently won’t be converting videos to lower resolutions
itself. Instead, it will set a bandwidth limit that video applications
will have to adjust to. “We manage HD video throughput by setting
speeds at no more than 10Mbps, which provides HD video at up to 1080p
video,” Verizon told Ars. The Mbps will presumably be lower than that
in cases where Verizon limits video to 480p or 720p.

That means the traffic is identified (tagged) both by subscriber and by type: it is shaped a certain way because it is a certain type of video.

How? Verizon has a video optimization system that has been shown to limit Netflix and YouTube to 10 Mbps even before the Aug 2017 announcement of the new caps.

Verizon acknowledged using a new video optimization system but said it
is part of a temporary test and that it did not affect the actual
quality of video. The video optimization appears to apply both to
unlimited and limited mobile plans.

But some YouTube users are reporting degraded video, saying that using
a VPN service can bypass the Verizon throttling.

This points to Verizon’s ability to identify video streams and limit their bandwidth accordingly, even if the content is delivered over HTTPS (but not over VPNs).
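How Verizon actually enforces the limit is not public. As a generic illustration of what "limiting bandwidth" on a flow can mean, here is a minimal token bucket in Python, a common textbook rate-limiting technique. The class and its parameters are made up for this sketch and are not a description of Verizon's system.

```python
class TokenBucket:
    """Allow packets through at a long-term rate, with a small burst allowance."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.capacity = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0

    def allow(self, now, packet_bytes):
        """Return True if a packet of this size may pass at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the elapsed time, up to the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

With `TokenBucket(10_000_000, 15_000)`, a sender offering 12 Mbps gets roughly 10 Mbps through. An adaptive video player would see that lower throughput and step down to a quality level that fits.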

Schroeder is almost certainly right that it’s just a marketing way of saying they restrict bandwidth to certain sites’ IP addresses or look for priority markers on the packets.

It is worth noting, however, that in theory there are ways they could make this work better if the sole aim were to force users to a certain resolution while streaming video and nothing else.

Much internet streaming these days uses a process called DASH (Dynamic Adaptive Streaming over HTTP). It works by requesting a small chunk of video, measuring the bandwidth while the chunk downloads, and then selecting the next chunk at a resolution / compression scheme that would allow it to arrive before the current chunk has finished playing.
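The selection step can be sketched as a simple throughput-based rule in Python. DASH itself does not prescribe the adaptation logic, so this is just one plausible rule; the representation table and the safety factor are assumptions for illustration.

```python
# Hypothetical set of available encodings and their bitrates in Mbps.
REPRESENTATIONS_MBPS = {"360p": 1.0, "480p": 2.5, "720p": 5.0, "1080p": 8.0}

def measured_throughput_mbps(chunk_bytes, download_s):
    """Throughput observed while downloading the last chunk."""
    return chunk_bytes * 8 / download_s / 1e6

def choose_next(throughput_mbps, safety=0.8):
    """Pick the highest-bitrate encoding that fits under a safety margin."""
    budget = throughput_mbps * safety
    fitting = [(rate, name) for name, rate in REPRESENTATIONS_MBPS.items()
               if rate <= budget]
    if not fitting:
        # Nothing fits: fall back to the lowest bitrate available.
        return min(REPRESENTATIONS_MBPS, key=REPRESENTATIONS_MBPS.get)
    return max(fitting)[1]
```

Under this rule a client that measures about 12 Mbps picks 1080p, while one that measures 4 Mbps drops to 480p. That is why capping throughput indirectly caps resolution.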

This means there are hints in the requests as to what the user is doing. If your device sends a request to a website every 3 seconds for a file that takes just under 3 seconds to download, then there is a very high chance that the site is streaming video. From the size per duration you could make a reasonable estimate of the resolution – especially if you keep a rolling average. You can then just restrict bandwidth to that IP address.

By using known IP addresses for major video providers (googlevideo (YouTube), Netflix, etc.) as a weighting factor in the decision, you could make the algorithm more aggressive without too many false positives.
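One way to picture that weighting is the Python sketch below. It assumes a traffic-pattern score between 0 and 1 already exists; the prefix list uses a reserved documentation range as a placeholder, and the bonus and threshold values are arbitrary.

```python
import ipaddress

# Placeholder prefix (a reserved documentation range), not a real provider.
KNOWN_VIDEO_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]

def is_known_video_address(addr):
    """True if the remote address falls inside a known video-provider prefix."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in KNOWN_VIDEO_PREFIXES)

def video_score(addr, pattern_score, address_bonus=0.3):
    """Combine the traffic-pattern score with a bonus for known addresses."""
    bonus = address_bonus if is_known_video_address(addr) else 0.0
    return min(1.0, pattern_score + bonus)

def classify(addr, pattern_score, threshold=0.6):
    """Decide whether to treat the flow as video."""
    return video_score(addr, pattern_score) >= threshold
```

A borderline pattern score of 0.4 is classified as video only when the remote address is in a known prefix, while a strong pattern score is enough on its own.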
