Arthur C. Clarke’s Three Laws of Prediction have the dubious distinction of being the most well-known set after Newton’s Three Laws of Motion and Kepler’s Three Laws of Planetary Motion. The third of Clarke's laws is oft-quoted and applies to a variety of different industries, including online video: “Any sufficiently advanced technology is indistinguishable from magic.” When you think about what goes into the making of a 30-second promo or even a 6-second Vine, and how such marketing videos are received by viewers (8-second grace period, then drop-off), it’s almost sort of creepy how reliable Clarke’s law has become in the 21st century. Over half of all mobile traffic is video…and we take it completely for granted.
But when you’re in the business of online video, knowing back-end operations pays dividends. And if you’ve worked before with ad execs or videographers, you know how hard it can be to communicate your needs (especially when you don’t know the difference between a .wmv and a Flash video). That’s why we’ve decided to put together a quick primer on the “stagehands” of online video – the bits of data running around behind the curtain as images and audio stream at you, 29.97 frames per second.
Frames Per Second (frame/s)
Also called frame rate, FPS is, at its essence, the reason video even works at all – because video is really just a bunch of images coming at you really fast, faster than Derrick Rose on a fast break, and faster than the Road Runner can outrun Wile E. Coyote. How fast is that? Only 16 frame/s, actually. Fun fact: the highest quality cameras can record at over 120 frame/s.
Back in the day, when analogue televisions roamed the world, knowing the difference between the PAL (Europe, Asia, Australia; 25 frame/s), NTSC (USA, Canada, Japan; 29.97 frame/s), and SECAM (France, Russia, Africa; 25 frame/s) specifications actually mattered. Now that video has gone digital, FPS is mostly determined by distribution method and desired quality.
If, for example, you wanted to watch a bullet impact, you would shoot a video at 1 million FPS…
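To put those frame rates in perspective, here is a minimal Python sketch that converts frames per second into how long each frame stays on screen. The function name frame_duration_ms and the labels are illustrative, not part of any standard.

```python
def frame_duration_ms(fps: float) -> float:
    """Return how long a single frame is shown, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates mentioned in this article (labels are illustrative).
examples = [
    ("minimum for motion", 16),
    ("PAL/SECAM", 25),
    ("NTSC", 29.97),
    ("high-speed camera", 120),
]

for label, fps in examples:
    print(f"{label}: {fps} frame/s = {frame_duration_ms(fps):.2f} ms per frame")
```

At 25 frame/s each image is on screen for 40 milliseconds. At 1 million FPS, each one lasts a single microsecond.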
Interlaced Vs Progressive
While FPS determines how “real” a video appears to the human eye, it doesn’t really take into account how much data video actually takes up. When you need to stream a 5-minute product demo, for example, a high FPS will only be a hindrance. Compression (we’ll get to that later) helps make high-quality video streamable, but there are also ways of displaying each frame of a video optimally in order to reduce data requirements. “Interlacing” refers to how some video types are “painted” on a screen one half at a time.
Two “fields” are used to create each frame – one field contains all odd lines in an image while the other field contains all the even lines. Normally, a video would be displayed one static image at a time in succession, and quality would be determined by resolution and framerate. Interlacing essentially ups the perceived framerate by creating and displaying half-frames instead of full-frames.
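The field idea is easier to see in code. Below is a rough Python sketch, assuming a frame is just a list of scan lines. The helper names split_fields and weave are made up for this example, and real video hardware does considerably more than this.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its two fields.

    Lines are counted from 1, so the "odd" field holds lines 1, 3, 5, ...
    which are list indices 0, 2, 4, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave(odd_field, even_field):
    """Rebuild a full frame by alternating lines from the two fields."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


frame = ["line 1", "line 2", "line 3", "line 4", "line 5", "line 6"]
odd, even = split_fields(frame)
print(odd)                          # the lines sent in the first field
print(even)                         # the lines sent in the second field
print(weave(odd, even) == frame)    # weaving the fields restores the frame
```

Each field carries only half the lines, which is why an interlaced signal needs roughly half the data per screen refresh.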
Interlacing was created to deal with the flicker issues in old CRT displays, and today it’s mostly a holdover from CRT technology. Progressive scans, on the other hand, draw all lines of each frame in sequence, and are universal in computing and online video because every frame keeps all of its lines.
This “Indian Head” graphic shows the difference between interlace and progressive scans. The two frames on the right are progressive, and look solid. The two on the left are interlaced, and suffer from “interline twitter” (no relation to the company). Top images are “as-is”, while bottom images have anti-aliasing to reduce twitter (but are lower quality as a result).
Aspect Ratio And Quality
All this talk about lines per image brings us to aspect ratios. While all of us know a little about aspect ratios just because we’ve all played around with wallpapers in Windows and changed the resolution of a video game to get gorier blood splatters, there’s no reason anyone who doesn’t work directly with a camera should know about all the different types of aspect ratios. Which is why there’s this handy cheat sheet:
Before your brain erupts – let me try to explain. The diagonal dotted lines running from upper left to lower right signify the most common aspect ratios, and each color corresponds to a specific aspect ratio. Each colored box correlates a common type of video standard to its corresponding aspect ratio.
In 2013, most of us will be familiar with the aspect ratios in dark green. HD720 and HD1080 are the two most common types of HD quality video, and the quickest way to understand the relationship between aspect ratio and video quality is to look at the numbers. A 1080p video is higher quality than a 720p video because 1920 x 1080 means each frame is 1920 pixels wide by 1080 lines tall, while 1280 x 720 means 1280 pixels wide by 720 lines tall. Both share the same 16:9 aspect ratio, but more lines means more pixels, and more pixels means higher color and luminance fidelity to the real thing.
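If you want to check the numbers yourself, here is a small Python sketch that works out the pixel count and the simplified aspect ratio for a given frame size. The function name describe_resolution is an illustrative choice.

```python
from math import gcd


def describe_resolution(width: int, height: int):
    """Return (total pixels, simplified aspect ratio) for a frame size."""
    divisor = gcd(width, height)
    return width * height, f"{width // divisor}:{height // divisor}"


for name, (w, h) in {"HD720": (1280, 720), "HD1080": (1920, 1080)}.items():
    pixels, ratio = describe_resolution(w, h)
    print(f"{name}: {w} x {h} = {pixels:,} pixels, aspect ratio {ratio}")
```

Both formats reduce to 16:9, but HD1080 packs 2,073,600 pixels into each frame versus 921,600 for HD720, which is 2.25 times as many.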
Lossy Vs Lossless Compression
As mentioned earlier, high FPS video takes up a lot of storage space. Typical Blu-ray movies actually take up 30 to 50 gigabytes of data, more than 10x the usual 2-3GB a 2-hour movie will occupy on your hard drive. The difference in data size is due to heavier compression, which is the equivalent of taking large, chunky kidney beans and squeezing them into small tin cans.
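For a sense of scale, here is a back-of-the-envelope Python sketch of how big a movie would be with no compression at all. It assumes 3 bytes per pixel (8 bits each for red, green, and blue), 24 frame/s, and no audio. Those figures are assumptions chosen for illustration, and the function name is made up.

```python
def uncompressed_size_gb(width, height, fps, seconds, bytes_per_pixel=3):
    """Estimate raw video size in gigabytes (1 GB = 10**9 bytes).

    Assumes every pixel of every frame is stored in full, with no audio.
    """
    bytes_per_frame = width * height * bytes_per_pixel
    total_bytes = bytes_per_frame * fps * seconds
    return total_bytes / 10**9


two_hours = 2 * 60 * 60
raw = uncompressed_size_gb(1920, 1080, 24, two_hours)
print(f"Raw 1080p, 2 hours: about {raw:,.0f} GB")
```

Under those assumptions a 2-hour 1080p movie comes to about 1,075 GB, so both the Blu-ray and the 2-3GB file are already a small fraction of the raw footage.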
Well, that’s more like “lossy” compression, which this doggy demonstrates very well.
On the far left you have man’s best friend in glorious, high-quality realism. Enter lossy compression. The middle frame shows the dog at medium compression (92% less information than uncompressed), and high compression is displayed on the right (98% less information than uncompressed). Take note that the middle image is already about 12x smaller in size than the original, uncompressed image, and the one on the right is about 50x smaller.
As you can see, high compression is undesirable, but low-to-medium compression is still pretty good-looking. This is because the human eye is actually very hue-insensitive. In other words, we can detect broad changes in color, like the difference between orange and blue, but tend not to notice small changes in hue or shading. Lossy compression algorithms take advantage of this by chunking together pixels of similar color and displaying them all as one large pixel, or “macroblock”, in order to save space. These macroblocks are especially obvious in the rightmost, most heavily compressed image.
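Here is a deliberately simplified Python sketch of the chunking idea: every small tile of a grayscale image is replaced by its average value. This is a stand-in for illustration only. Real codecs such as JPEG and H.264 apply a transform and quantization inside each block rather than plain averaging, and the function name block_average is hypothetical.

```python
def block_average(image, block=2):
    """Replace each block x block tile of a grayscale image with its mean.

    `image` is a list of rows of integer brightness values (0-255).
    Returns a new image and leaves the input untouched.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            tile = [image[r][c] for r in rows for c in cols]
            mean = round(sum(tile) / len(tile))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


image = [
    [10, 12, 200, 204],
    [14, 12, 202, 202],
    [90, 94, 50, 54],
    [92, 92, 52, 52],
]
for row in block_average(image):
    print(row)
```

After averaging, the sixteen original values collapse into four distinct ones, one per tile. The small differences inside each tile are gone for good, which is exactly what makes this kind of compression lossy.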
This is all an example of intraframe (i.e. within a frame) compression. There is also the more powerful interframe (i.e. across frames) compression. Interframe compression algorithms take into account the frames preceding and following any given frame and chunk based on similarities between frames. This allows for even greater compression. Unfortunately, both types of lossy compression result in irreversible data loss.
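To make the across-frames idea concrete, here is a toy Python sketch that stores the first frame whole and then records only the pixels that change from one frame to the next. It is a simplification under stated assumptions: it only looks at the preceding frame, and on its own it loses nothing. Real interframe codecs add motion estimation and lossy steps on top. The names encode_deltas and decode_deltas are made up for this example.

```python
def encode_deltas(frames):
    """Store the first frame whole, then only the pixels that change.

    Each frame is a flat list of pixel values of equal length.
    Returns (keyframe, list of {pixel_index: new_value} dicts).
    """
    keyframe = list(frames[0])
    deltas = []
    previous = keyframe
    for frame in frames[1:]:
        changes = {
            i: new
            for i, (old, new) in enumerate(zip(previous, frame))
            if old != new
        }
        deltas.append(changes)
        previous = frame
    return keyframe, deltas


def decode_deltas(keyframe, deltas):
    """Rebuild every frame by replaying the changes onto the keyframe."""
    frames = [list(keyframe)]
    for changes in deltas:
        frame = list(frames[-1])
        for i, value in changes.items():
            frame[i] = value
        frames.append(frame)
    return frames


frames = [
    [0, 0, 0, 0, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 9, 0, 0],  # the bright pixel moves one step right
    [0, 0, 0, 0, 0, 0, 9, 0],
]
keyframe, deltas = encode_deltas(frames)
print(deltas)
print(decode_deltas(keyframe, deltas) == frames)
```

Each later frame shrinks from eight stored values to two changes. The less that moves between frames, the more space this approach saves.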
Lossless compression avoids this problem because no data is lost in the compression process: the compressed data can always be unpacked into its original form. The trade-off is that lossless files stay much larger than lossy ones. You’re probably already familiar with lossless compression via WinZip and 7-Zip, but it’s also used in online video when, say, a ton of high-quality footage and B-roll needs to be put on a USB drive and transferred to another computer. Broadly speaking, lossless compression algorithms look for repeated patterns in the data and replace them with shorter references, keeping a “map” of what each reference stands for. When the video needs to be unpacked into its original size and quality, that map tells the decoder how to rebuild every bit.
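You can see the lossless round trip for yourself with Python’s built-in zlib module, which implements DEFLATE, the same algorithm used in ordinary ZIP files. The sample data below is a made-up stand-in for real footage, chosen because repetitive data compresses especially well.

```python
import zlib

# Stand-in for raw footage: highly repetitive bytes compress very well.
original = b"blue sky " * 1000

compressed = zlib.compress(original, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes before")
print(len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The restored bytes match the original exactly, which is the whole point of lossless compression. Real footage is far less repetitive than this sample, so expect much more modest savings in practice.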
Video Codecs Vs Video Containers
Okay, now that we’ve gone over frames per second, interlaced vs. progressive, the relationship between aspect ratio and quality, and compression, we can get into the differences between codecs and containers, or the actual “format” of online video.
A lot of articles have already covered the different types of codecs and containers, so I won’t get into that here (and there really is no reason to, since most online video formats change from year to year based on trends and it’s just a big popularity contest). But you can still take a gander at them here. Honestly, it seems a lot more helpful to understand the difference between codecs and containers.
The word “codec” is short for “coder-decoder” or “compressor-decompressor”, and a codec is a computer program that can both encode and decode data. A video codec can both “write” a video into a smaller, compressed format, and also “read” a compressed file and unpack it back into a playable video. There are actually thousands of codecs available for audio and video, and many of the codecs used for video are actually multimedia codecs that take into account both audio and video elements. Examples of codecs include DivX, Xvid, and MPEG-4.
Most people mistake video file endings like .wmv and .mov for codecs when they’re actually containers. A video “container” is just a wrapper that holds all the data in a multimedia video together in one package for a codec to work with. Some proprietary containers also happen to contain their own codecs, but they are not the same thing. Here’s a comparison list of video containers, so you can get a better sense of their limits and capabilities. Some examples of containers include .wmv, .flv, .mov/.qt, and .ts.
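One way to picture the split is as a small data model: the container is the box, and each stream inside it is encoded with its own codec. The Python sketch below is purely illustrative. The class names, fields, and the two example files are hypothetical, and they do not mirror any real library.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    kind: str   # "video", "audio", or "subtitle"
    codec: str  # the codec used to encode this stream


@dataclass
class Container:
    extension: str
    streams: list = field(default_factory=list)

    def codecs(self):
        """List the codecs a player would need to decode this file."""
        return sorted({s.codec for s in self.streams})


# Two hypothetical files: same codecs, different wrappers.
clip_mov = Container(".mov", [Stream("video", "H.264"), Stream("audio", "AAC")])
clip_ts = Container(".ts", [Stream("video", "H.264"), Stream("audio", "AAC")])

print(clip_mov.codecs())
print(clip_mov.codecs() == clip_ts.codecs())
```

The file ending only tells you which wrapper you are holding. Whether your player can actually show the video depends on the codecs inside it.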
Next week, we’ll go into the differences between 2-D and 3-D video. Stay tuned!