Ivan Methuselah's Digi-Box Ration Book

[Cover image: ration book R.B.4, 2007/8, Serial No. IM 448199]

IVAN METHUSELAH'S
DIGI-BOX SIGNAL BOOK
3: GROUPS OF PICTURES

Also of relevance to picture quality is the maximum length of the "group of pictures" (GOP) employed by each channel. Were the signal simply made up of a lot of pictures, it would be too big to fit through the bandwidth allocated. To make best use of the room, each frame is compared to the next and a mathematical route from one to the other is established. The first frame is broadcast and then the second frame is created from it using manipulations of the original data (all a bit Adam's rib). If the maximum group of pictures is 12, then the MPEG encoding can make up to 11 iterations on an original image (this original image being known as the "i-frame"): one in 12 frames contains real data, with the further 11 frames being built up from extrapolations of that data. At 25 frames per second, 12 frames is about half a second, so with a GOP of 12 an actual image is only transmitted twice a second, with the rest of the information being maths. It's all kind of like JPEG compression but in three dimensions (the usual two spatial dimensions plus a temporal dimension). Here's a simplified depiction:

These three chess-boards represent three frames of footage we wish to broadcast. Rather than transmit photos of all three, we save room by just transmitting the first frame. Then in place of the second frame we send a telegram to the effect of: "RED SQUARE TO QUEENS BISHOP SIX STOP", and in place of the third frame: "RED SQUARE TO KING FIVE STOP". Of course, MPEG-2 doesn't actually use bastardised chess notation, but it does love blocks of 8 pixels and multiples thereof, so it's not an absolutely absurd example.
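
The telegram idea can be sketched in a few lines of Python. This is only a toy under loud assumptions: the 64-square board, the "R" marker for the red square and the change-list format are all invented for illustration, and none of it resembles what MPEG-2 really puts on the wire.

```python
# Toy "telegram" codec: send the first board whole, then only what changed.
# Board layout and change format are illustrative assumptions, not MPEG-2.

def diff(prev, curr):
    """List the (square, new_value) pairs that differ between two frames."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

def apply_changes(prev, changes):
    """Rebuild a frame from the previous frame plus a list of changes."""
    frame = list(prev)
    for square, value in changes:
        frame[square] = value
    return frame

def encode(frames):
    """Return the first frame in full, then one change list per later frame."""
    deltas = [diff(a, b) for a, b in zip(frames, frames[1:])]
    return frames[0], deltas

def decode(first, deltas):
    """Replay the change lists on top of the first frame."""
    frames = [list(first)]
    for changes in deltas:
        frames.append(apply_changes(frames[-1], changes))
    return frames

def board_with_red_square(square):
    """A 64-square board, empty except for one red square (hypothetical)."""
    board = ["."] * 64
    board[square] = "R"
    return board

frames = [board_with_red_square(s) for s in (10, 18, 28)]
first, deltas = encode(frames)

# Each change costs two values (which square, what it becomes).
sent = len(first) + sum(2 * len(d) for d in deltas)
print(sent, "values sent instead of", 64 * len(frames))
```

Decoding the first frame plus the two change lists gives back all three boards exactly, which is the whole trick: the later frames never travel as pictures.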

Multiplexes 1 and 2 use a maximum GOP of 12, Multiplexes B and C use a max of 18, and Multiplexes A and D use a variable GOP length that can produce GOPs in excess of 40 frames (almost 2 seconds) long. The longer the group of pictures, the greater the saving on bandwidth, but also the greater the possibility of errors. If an error occurs at some point it will not be corrected until the start of the next group of pictures, which in the case of Mux A or D could be almost 2 seconds away. This makes their channels all the more flaky (particularly the 64-QAM Mux A). A long GOP can be disastrous if it breaks down, due to the gaps between i-frames, but for a fixed bandwidth it does have the advantage of permitting higher quality i-frames than a short GOP.
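
For a rough feel of those figures, here is a small Python sketch that turns GOP lengths into seconds. It assumes a frame rate of 25 frames per second, and it uses 40 frames as a stand-in for the long GOPs on Mux A and D, which can run longer than that.

```python
FPS = 25  # assumed frame rate in frames per second

def gop_seconds(gop_length, fps=FPS):
    """Duration of one group of pictures, i.e. the longest an error can linger."""
    return gop_length / fps

def iframes_per_second(gop_length, fps=FPS):
    """How many real pictures (i-frames) arrive each second."""
    return fps / gop_length

# GOP lengths quoted in the text; 40 is a representative figure for Mux A and D.
multiplexes = [("Mux 1 and 2", 12), ("Mux B and C", 18), ("Mux A and D", 40)]

for name, gop in multiplexes:
    print(f"{name}: {gop} frames = {gop_seconds(gop):.2f} s, "
          f"{iframes_per_second(gop):.2f} i-frames per second")
```

The trade-off is visible in the two columns: as the GOP gets longer, fewer full pictures have to be paid for each second, but a mistake has longer to sit on screen before the next i-frame wipes it away.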

A full explanation of the mathematics behind groups of pictures would be superfluous and needlessly technical for this discussion (cosines are involved). But a simplified reference to it is useful in explaining some of the errors that result from MPEG-2 encoding and the GOP phenomenon. In the example, right (ignoring the red disc for the moment), we see a very simple i-frame which we can easily cut up into 64 referential chunks. Occasionally chunks such as these (though smaller relative to the image as a whole: 16x16 pixel "macroblocks" are typically used in the GOP transformations of digital television) might fall down the back of the telly and have to be replaced with educated guesses: a form of "blocking" often seen as a result of a weak signal.

In the case here though, something else is happening. The intended image for broadcast has a red disc moving across a chess-board, but because the maths relates the movement of the dissected macroblocks (in this example helpfully aligned to the squares), some of the background gets shunted along with the red disc. Now in reality the maths is a bit cleverer than that, but it is only so clever, and if the movement in the image (and hence the maths involved) gets too complicated, mistakes will start happening, especially as the sums get further and further from the i-frame they're manipulating.

It's a lot like Chinese Whispers. A fairly stationary or uncomplicated image such as a testcard or a newsreader will look fine over a long GOP, just as a simple word will pass well between whisperers (indeed, given the same bandwidth, such a transmission will look better on a longer GOP than on a shorter one because the space saved by the maths lets the i-frames be transmitted at a higher quality). However, a fussy image such as birds in flight will be far too complicated for a long GOP to cope with, and televisual slurry will result. This is down to a couple of things. On the one hand, the referencing from one frame to another (as exemplified by the chess notation telegrams above) becomes too elaborate for the system to manage and it starts to guess things wrong. On the other hand, the birds are smaller than the macroblocks, and so we see a subtler version of the problem demonstrated in the animation we just looked at: normally the system would be smart enough to adjust the backgrounds, but as its frames of reference disappear it is forced to surrender. And the longer the GOP, the greater the distortion a complicated sentence undergoes as it passes from whisperer to whisperer.

An intelligent combination of variable GOP and variable bitrate can efficiently produce decent results, as Film4 goes a moderate way to testifying. But it can all too easily fall prey to gremlins.
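
The shunted-background problem with the red disc can be shown with a deliberately crude Python sketch. Everything here is an assumption made for illustration: the picture is a one-dimensional strip rather than a board, the macroblocks are 4 pixels wide rather than 16x16, the block-copying instructions are hand-written rather than searched for, and no correction data is sent afterwards (a real encoder would try to patch up the difference).

```python
BLOCK = 4  # toy macroblock width in pixels (real MPEG-2 macroblocks are 16x16)

def split_blocks(strip):
    """Cut a strip of pixels into consecutive macroblocks."""
    return [strip[i:i + BLOCK] for i in range(0, len(strip), BLOCK)]

def predict(reference, sources):
    """Build a new frame by copying whole blocks out of the reference frame.

    sources[n] is the index of the reference block copied into output block n.
    """
    ref_blocks = split_blocks(reference)
    return "".join(ref_blocks[s] for s in sources)

# B = black square, W = white square, R = red disc.
reference = "BRRB" "WWWW" "BBBB" "WWWW"  # the i-frame: disc on a black square
intended  = "BBBB" "WRRW" "BBBB" "WWWW"  # disc has moved one square to the right

# Hypothetical motion instructions: block 1 is copied from block 0 (the disc
# moves across) and the vacated block 0 is patched from the clean black
# square at block 2. Blocks 2 and 3 stay where they were.
predicted = predict(reference, [2, 0, 2, 3])

# Pixels where the reconstruction disagrees with the intended picture.
wrong = [i for i, (p, t) in enumerate(zip(predicted, intended)) if p != t]

print("intended :", intended)
print("predicted:", predicted)
print("wrong pixels:", wrong)
```

The disc arrives on the white square still wearing the black corners of the square it left, because the instruction moved the whole block and not just the disc.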

While we're on the subject, GOPs are also why it takes so long to change channels on digital TV: the box has to wait for a new i-frame to arrive before it can display anything.
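
As a back-of-envelope illustration, this Python sketch estimates the i-frame wait alone. It assumes 25 frames per second, a fixed GOP length, and a viewer who tunes in at a random point in the GOP; it ignores everything else a box does when changing channel (retuning, buffering and so on), so treat the numbers as a lower bound on the delay rather than a measurement.

```python
FPS = 25  # assumed frame rate in frames per second

def wait_frames(gop_length):
    """Frames to sit through before the next i-frame, for each tune-in point.

    Position 0 is the i-frame itself, so tuning in there costs no wait.
    """
    return [(gop_length - k) % gop_length for k in range(gop_length)]

def average_wait_seconds(gop_length, fps=FPS):
    """Mean wait if every tune-in point in the GOP is equally likely."""
    waits = wait_frames(gop_length)
    return sum(waits) / len(waits) / fps

def worst_wait_seconds(gop_length, fps=FPS):
    """Wait when the box tunes in just after an i-frame has gone by."""
    return max(wait_frames(gop_length)) / fps

for gop in (12, 18, 40):
    print(f"GOP {gop}: average {average_wait_seconds(gop):.2f} s, "
          f"worst {worst_wait_seconds(gop):.2f} s")
```

On these assumptions the long-GOP multiplexes cost several times the wait of the 12-frame ones before a picture can appear.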
