Understanding Codecs & Container Files
Codec and container are two words that are often used interchangeably but mean two entirely different things. Understanding the distinction — and knowing which formats appear at which stage of a production — is essential for anyone navigating a modern post pipeline.
01 — The container and the codec
When you look at a file on a drive — say, Interview_Cam-A_001.mov — what you are seeing is a container file. The .mov extension tells you the format of the wrapper. It tells you nothing about how the video and audio inside it are compressed.
The container is the file format itself: a structured wrapper that holds one or more streams of data. A video container typically holds a video stream, one or more audio streams, and often additional data such as timecode, subtitle tracks, and embedded metadata. Think of it as a box.
The codec — short for compressor/decompressor — is the algorithm that encodes and decodes the video data inside that box. It determines how images are compressed to reduce file size, and how they are reconstructed during playback. The same container can hold video compressed with entirely different codecs: a .mov file might contain ProRes 422, H.264, or DNxHD depending on what was put into it.
A useful analogy: the container is an envelope. The codec determines what language the letter inside is written in. You can receive envelopes of the same size and shape carrying letters in completely different languages — and you need the right decoder to read each one.
02 — How codecs compress video
All digital video compression involves discarding or approximating some information to reduce file size. The codec determines how much is discarded, what kind of information is prioritised, and how the original signal can be reconstructed. There are several axes on which codecs differ.
Intra-frame vs inter-frame compression
This is the most fundamental division between codec types and the one that most directly affects editing.
Intra-frame codecs — sometimes called I-frame-only codecs — compress each frame of video completely independently. Every frame contains all the information needed to reconstruct it without reference to any other frame. ProRes, DNxHR, and JPEG 2000 are all intra-frame. They produce larger files than inter-frame alternatives, but any frame can be read and decoded independently. This makes editing fast and accurate: the NLE can seek to any point in the timeline instantly.
Inter-frame codecs use temporal compression — they look across multiple frames and only encode the differences between them. H.264 and H.265 use three frame types: I-frames (complete reference frames), P-frames (which contain only the difference from the previous frame), and B-frames (which reference both the previous and next frames). This produces dramatically smaller files, but to decode any given frame the player may need to assemble data from several surrounding frames. Editing highly compressed inter-frame footage is computationally demanding, which is why proxies are common in H.264/H.265 workflows.
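The I-frame/P-frame dependency can be sketched with a toy delta encoder (plain Python; B-frames are omitted, and real codecs work on motion-compensated blocks, not individual pixel lists):

```python
# Toy sketch of inter-frame (temporal) compression: frames are lists of
# pixel values. The first frame is stored whole (an "I-frame"); each
# following frame stores only the positions where it differs from the
# previous one (a simplified "P-frame").

def encode(frames):
    stream = [("I", list(frames[0]))]
    for prev, cur in zip(frames, frames[1:]):
        # Record only the (index, new_value) pairs that changed.
        delta = [(i, v) for i, (p, v) in enumerate(zip(prev, cur)) if p != v]
        stream.append(("P", delta))
    return stream

def decode(stream):
    frames = []
    for kind, payload in stream:
        if kind == "I":
            frames.append(list(payload))
        else:
            # A P-frame cannot be decoded without the previous frame.
            # This dependency chain is why seeking through inter-frame
            # footage is slow: the player must reconstruct its ancestors.
            frame = list(frames[-1])
            for i, v in payload:
                frame[i] = v
            frames.append(frame)
    return frames

frames = [[10, 10, 10, 10], [10, 10, 90, 10], [10, 10, 90, 20]]
stream = encode(frames)
assert decode(stream) == frames
# The two P-frames each carry a single changed pixel instead of four values:
print([len(payload) for kind, payload in stream])  # [4, 1, 1]
```

An intra-frame codec, by contrast, would store every frame the way this sketch stores its I-frame: larger, but independently decodable.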
The traditional rule of thumb: intra-frame codecs are for production and post, inter-frame codecs are for delivery. This remains broadly true — but inter-frame codecs have also found a valuable second role as proxy formats, particularly in RAW workflows.
H.264 and H.265 as proxy formats in RAW pipelines
In our day-to-day work at Xenon Post, H.264 and H.265 appear not as acquisition formats but as the output of a proxy generation step. The camera originals are RAW — ARRIRAW, R3D, BRAW — and it is from those files that lightweight proxy copies are generated for editing.
DaVinci Resolve’s built-in proxy generator, and Blackmagic’s standalone Proxy Generator application, can process RAW camera originals and output H.264 or H.265 proxy files at a fraction of the original file size. These proxies are timecode-matched and file-name-matched to their originals. An editor can cut on them on almost any machine — a laptop on location, a remote editor’s workstation — without needing access to the full RAW files at all. When the edit is complete, Resolve relinks the timeline to the RAW originals for grading and output, and the proxies are discarded.
H.264 and H.265 are well suited to this role precisely because of their efficiency: a proxy of a large RAW file can be small enough to share over the internet or store on a laptop drive, while still being visually accurate enough to cut on confidently. The inter-frame compression that makes them unsuitable as mastering formats is exactly what makes them practical as proxies.
Bit depth
Bit depth determines how many distinct tonal values each colour channel can represent. An 8-bit codec encodes 256 steps per channel — 256 levels of red, 256 of green, 256 of blue. This is adequate for web delivery and viewing, but provides limited headroom for grading: when you push shadows or highlights, you can reveal the steps between tonal values as visible banding.
10-bit encoding raises this to 1,024 steps per channel, which is the professional production standard and essential for any HDR workflow. 12-bit (4,096 steps) is used by higher-end cameras and VFX pipelines. OpenEXR images used in visual effects compositing typically operate at 16-bit or 32-bit float, preserving enormous tonal range for complex compositing operations.
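The step counts above are simply powers of two, which makes the grading-headroom argument easy to put in numbers (a minimal sketch):

```python
# Tonal steps per channel as a function of bit depth. Each extra bit
# doubles the number of distinct values a channel can represent.
def levels(bits: int) -> int:
    return 2 ** bits

assert levels(8) == 256      # web delivery
assert levels(10) == 1024    # professional production / HDR minimum
assert levels(12) == 4096    # high-end acquisition and VFX

# A 10-bit file subdivides the same signal range into 4x finer steps
# than an 8-bit file, so stretching shadows or highlights in the grade
# is far less likely to expose visible banding.
print(levels(10) // levels(8))  # 4
```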
Chroma subsampling
The human visual system is more sensitive to brightness (luminance) than to colour (chrominance). Chroma subsampling exploits this by recording colour at lower resolution than brightness information. The J:a:b notation describes a sampling block two rows tall and J pixels wide (J is almost always 4): a is the number of colour samples in the first row, and b is the number of additional colour samples in the second row. A b of zero means the second row reuses the first row's colour samples, so 4:4:4 keeps colour at every pixel, 4:2:2 halves the horizontal colour resolution, and 4:2:0 halves it both horizontally and vertically.
For green-screen work, keying, and any effects work requiring precise colour edges, 4:2:2 is the minimum practical requirement. The reduced chroma resolution of 4:2:0 makes clean key extraction significantly harder. For archive and highest-quality masters, 4:4:4 retains full colour at every pixel.
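The bandwidth savings fall out of simple arithmetic. The sketch below assumes one luma sample per pixel and two chroma channels (Cb and Cr) sampled over a two-row block of width J:

```python
# Average samples per pixel under J:a:b chroma subsampling: one luma
# sample per pixel, plus two chroma channels (Cb and Cr) shared across
# a 2-row-by-J-pixel sampling block. "a" is the number of chroma sample
# positions in the first row, "b" the additional positions in the second.
def samples_per_pixel(j: int, a: int, b: int) -> float:
    luma = 1.0                       # every pixel keeps its brightness
    chroma = 2 * (a + b) / (2 * j)   # Cb + Cr, averaged over the block
    return luma + chroma

assert samples_per_pixel(4, 4, 4) == 3.0   # 4:4:4 -- full colour
assert samples_per_pixel(4, 2, 2) == 2.0   # 4:2:2 -- one third smaller
assert samples_per_pixel(4, 2, 0) == 1.5   # 4:2:0 -- half the data of 4:4:4
```

This is why 4:2:0 is so attractive for delivery (half the raw sample count of 4:4:4) and so hostile to keying: three quarters of the colour information is simply never recorded.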
03 — Common container formats
Container formats are largely ecosystem decisions — the post house, the NLE, and the delivery platform each have preferences. The most important characteristic of a container is not quality (the codec determines that) but rather compatibility: what software can read and write it, and what delivery destinations accept it.
04 — Common codecs by use case
Codecs divide naturally into three groups that reflect where in the pipeline they are used: acquisition codecs that cameras record to, editorial or mezzanine codecs used through post production, and delivery codecs for the final output. Using the wrong codec at the wrong stage — editing in a delivery codec, or delivering in an acquisition codec — is a very common and avoidable problem.
Acquisition & camera-native codecs
Editorial & mezzanine codecs
Mezzanine codecs are high-quality, intra-frame codecs designed to be decoded quickly during editing, survive multiple transcode generations without significant quality loss, and serve as the master format through colour and finishing. They are not efficient enough for delivery, but they are the right tool for everything between acquisition and output.
ProRes and DNxHR are functionally equivalent in quality — they represent Apple’s and Avid’s respective approaches to the same problem. The right choice depends on your NLE and facility ecosystem, not on any inherent quality difference.
Delivery codecs
05 — RAW acquisition
RAW is frequently described as a codec, but it is more accurate to call it a data format. When a camera records RAW, it is capturing the unprocessed electrical signal from the sensor — the raw photon count at each photosite — before any in-camera processing has been applied. There is no white balance baked in, no sharpening, no noise reduction, and no colour science decisions locked to the image. All of that happens in post.
How camera sensors capture colour
Most digital cinema cameras use a single sensor with a colour filter array — typically a Bayer pattern — placed in front of the photosites. Each individual photosite can only record one colour (red, green, or blue) because of the filter above it. The camera records the intensity at each photosite, and a process called debayering (or demosaicing) reconstructs the full-colour image by interpolating the two missing colour values at each pixel from its neighbours. This is the step that converts RAW data into a viewable image, and it happens in post — not in the camera.
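A heavily simplified demosaic can be sketched in a few lines. Real debayering uses edge-aware interpolation and per-camera tuning, so treat this purely as an illustration of the missing-channel problem:

```python
# Minimal demosaic (debayer) sketch for an RGGB Bayer pattern: each
# photosite records one channel, and the two missing channels at every
# pixel are interpolated by averaging the 3x3 neighbourhood. Production
# debayering is far more sophisticated -- this is illustrative only.

def bayer_channel(y: int, x: int) -> str:
    # RGGB layout: even rows alternate R,G; odd rows alternate G,B.
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"

def demosaic(raw):
    h, w = len(raw), len(raw[0])
    rgb = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sums = {"R": [0, 0], "G": [0, 0], "B": [0, 0]}
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        c = bayer_channel(ny, nx)
                        sums[c][0] += raw[ny][nx]
                        sums[c][1] += 1
            rgb[y][x] = tuple(sums[c][0] / sums[c][1] for c in "RGB")
    return rgb

# Sanity check: a uniform grey scene (every photosite reads 100) should
# reconstruct to neutral grey at every pixel.
flat = [[100] * 4 for _ in range(4)]
assert all(px == (100.0, 100.0, 100.0) for row in demosaic(flat) for px in row)
```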
The consequence is that RAW images contain more information than a processed video codec, but they look flat and unprocessed until colour science is applied. The image that comes off a camera in RAW is not what the finished film looks like — it is the raw material from which the finished image is built.
Why shoot RAW
The primary advantage is exposure latitude. Because no in-camera processing decisions have been baked in, the full dynamic range captured by the sensor is preserved. Highlights and shadows that would be clipped in a processed codec remain recoverable in a RAW file, up to the physical limits of the sensor. A well-exposed RAW file from a modern cinema camera can yield 14 or more stops of usable dynamic range.
The secondary advantage is colour science flexibility. The choice of colour space, white balance interpretation, and tone mapping are all made in post — and can be changed non-destructively at any stage. Shooting RAW means a colourist is working with the maximum possible information, not a processed approximation of it.
The cost of RAW
RAW files are large. An ARRIRAW recording in open gate at 4K generates enormous data volumes — typical data rates range from 700 MB/s to over 2 GB/s for uncompressed or losslessly compressed formats. This has downstream effects: media costs are higher, data wrangling takes longer, and real-time playback requires either dedicated hardware or a transcode step. Most RAW workflows include a transcode to ProRes or DNxHR before editing, with the RAW originals retained as camera masters.
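The arithmetic behind those data rates is straightforward. The sketch below uses an illustrative 4.5K-class frame size and 12-bit depth rather than any specific camera's published spec:

```python
# Back-of-envelope RAW data rate: a RAW frame stores one value per
# photosite (not three colour channels), so the rate before wrapper
# overhead or compression is roughly width x height x bit_depth / 8 x fps.
# The frame size below is illustrative, not a particular camera's spec.

def raw_rate_mb_s(width, height, bit_depth, fps):
    bytes_per_frame = width * height * bit_depth / 8
    return bytes_per_frame * fps / 1e6  # decimal MB/s

rate = raw_rate_mb_s(4448, 3096, 12, 24)
print(round(rate))                  # 496 MB/s
print(round(rate * 3600 / 1e6, 2))  # 1.78 TB per hour of recording
```

Doubling the frame rate doubles the rate, which is why high-speed RAW work puts such pressure on media budgets and offload time.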
BRAW (Blackmagic RAW) occupies an interesting middle ground: it is a compressed RAW format that bakes in the camera’s colour metadata while still deferring the actual image development to post. It delivers much of the latitude and flexibility of traditional RAW at significantly lower data rates, and it plays back natively in DaVinci Resolve without a transcode step.
Planning a shoot and working out how much storage you need? We built a Data Rate Calculator that covers 45 cameras and every major acquisition codec — useful for estimating media costs and drives before you step on set.
06 — The post pipeline: from acquisition to delivery
Understanding individual codecs and containers is most useful in context — knowing how they connect through the stages of a production. A typical narrative or commercial pipeline moves through several distinct codec stages, each chosen for the demands of that stage rather than any single format serving all purposes.
The mezzanine principle
A key concept the pipeline diagram illustrates is the mezzanine file — a high-quality intermediate that exists purely to serve the post process. Camera originals are archived and never touched again. An editorial transcode (ProRes 422 or DNxHR) is made from them, and it is this file that the editor works with day to day. From the finished grade, delivery files are transcoded downward for each platform. The mezzanine sits in the middle of this chain, absorbing the repeated read/write and decode operations that would degrade a more compressed format.
The proxy workflow
On projects where the full-resolution editorial codec is still too demanding for a given edit system — common with RAW 4K or 6K material, or on lighter laptops — a proxy is created alongside the full-resolution file. ProRes 422 Proxy or DNxHR LB are the standard choices: small enough to edit smoothly on almost any machine, but timecode-matched and file-name-matched to the high-resolution originals. When the edit is complete, the NLE reconforms the timeline to the full-resolution files for finishing. The proxy is discarded.
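The file-name half of that reconform can be sketched as a basename match. The file names below are invented for illustration, and a real relink also verifies timecode and duration:

```python
# Sketch of name-matched relinking: pair each proxy with its camera
# original by shared basename, ignoring folder and extension. File names
# here are hypothetical examples; NLE relinks also check timecode.
from pathlib import PurePosixPath

def relink(originals, proxies):
    by_stem = {PurePosixPath(p).stem: p for p in originals}
    matched, orphaned = {}, []
    for proxy in proxies:
        stem = PurePosixPath(proxy).stem
        if stem in by_stem:
            matched[proxy] = by_stem[stem]
        else:
            orphaned.append(proxy)  # a proxy with no master is a red flag
    return matched, orphaned

originals = ["raw/Interview_Cam-A_001.ari", "raw/Interview_Cam-A_002.ari"]
proxies = ["proxy/Interview_Cam-A_001.mov", "proxy/Interview_Cam-B_001.mov"]
matched, orphaned = relink(originals, proxies)
assert matched == {"proxy/Interview_Cam-A_001.mov": "raw/Interview_Cam-A_001.ari"}
assert orphaned == ["proxy/Interview_Cam-B_001.mov"]
```

An orphaned proxy at reconform time is exactly the failure mode the next paragraph warns about: it means the edit references media that cannot be traced back to a camera master.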
A common and costly mistake: delivering from the proxy rather than the original, or losing track of which files are proxies and which are camera masters. Rigorous file naming and folder structure at the data wrangling stage prevents this — and is one of the less glamorous but most valuable parts of what a good post facility manages.
07 — Quick reference
The right format at every stage
Codec and container decisions made early in a production have consequences that run all the way through to delivery. At Xenon Post, we build pipeline specifications before a project starts — so there are no surprises when it comes time to finish.
Talk to the team