FILM'S BASIC VISUAL UNITS: THE FRAME[1]
Since film is primarily a visual medium, the grammar and syntax governing the stream of cinema's visual imagery are of first importance. The visual grammar and syntax of film concern the ways a filmmaker arranges shots into scenes and scenes into sequences, just as the grammar and syntax of spoken and written languages deal with the way words are arranged into sentences and sentences into paragraphs.
The smallest discernible unit in film is the frame. A frame is a single photographic image printed on a length of film.[2] A viewer can see a single frame only under certain artificial conditions: when a projector is stopped at "still" position; when a frame is excerpted and projected as a slide or printed on photographic paper; or when a freeze-frame appears on the screen.[3] Like a single letter in a word, a frame is not a part of a viewer's perceptions until it is isolated. Even then, it seldom has meaning.
Although a single photographic frame cannot be discerned during actual viewing, it contributes to a larger unit and is understood in terms of that unit. During normal projection, twenty-four frames per second (approximately a foot and a half of 35-mm film) pass through the projector's gate. Each image flashes on the screen, then the screen turns black and is followed by another frame. However, the human eye misses the period of blackout since the eye retains an image one-tenth of a second longer than the image exists. It is this physiological phenomenon that allows motion pictures to be seen in continuous movement with no apparent jumps or single frames visible. (Take two frames out of a shot, however, and the eye can often detect a jump.) The average feature contains close to 130,000 separate frames.
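The figures in the paragraph above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original text: it assumes standard 4-perforation 35-mm film, which carries 16 frames per foot, and takes the one-tenth-of-a-second retention figure directly from the paragraph. The function names are illustrative only.

```python
FRAMES_PER_SECOND = 24        # normal projection speed, from the text
FRAMES_PER_FOOT_35MM = 16     # assumption: standard 4-perforation 35-mm film
RETENTION_SECONDS = 0.1       # image retention claimed in the text


def feet_per_second(fps=FRAMES_PER_SECOND, frames_per_foot=FRAMES_PER_FOOT_35MM):
    """Length of film passing through the projector's gate each second."""
    return fps / frames_per_foot


def frame_interval_seconds(fps=FRAMES_PER_SECOND):
    """Time from the start of one frame to the start of the next."""
    return 1 / fps


def running_time_minutes(total_frames, fps=FRAMES_PER_SECOND):
    """Running time of a film containing total_frames frames."""
    return total_frames / fps / 60


def film_length_feet(total_frames, frames_per_foot=FRAMES_PER_FOOT_35MM):
    """Physical length of a 35-mm print containing total_frames frames."""
    return total_frames / frames_per_foot


if __name__ == "__main__":
    print(f"Film per second: {feet_per_second():.2f} feet")
    print(f"Frame interval:  {frame_interval_seconds():.4f} seconds")
    print(f"Shorter than retention time: {frame_interval_seconds() < RETENTION_SECONDS}")
    print(f"130,000 frames:  {running_time_minutes(130_000):.1f} minutes, "
          f"{film_length_feet(130_000):.0f} feet")
```

Under these assumptions, twenty-four frames come to a foot and a half of film, each frame occupies well under a tenth of a second, and 130,000 frames correspond to roughly an hour and a half of running time.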
The word “frame” also has another meaning in the filmmaker's jargon. The frame is the outer boundary of a projected image -- the lines of the rectangle on the screen where an image ends and blackness begins. Because the frame serves as the boundary of an image, it is the starting point in the filmmaker's composition. The camera itself sees indiscriminately. The filmmaker must make a variety of choices to be sure that he will put boundaries around a segment of experience that, when projected, will have meaning for the viewer.
THE SHOT (See also The Photographic Characteristics of a Shot and Shots)
At a normal projection speed of twenty-four frames per second, it is quickly evident that a large number of frames make up the basic perceivable unit of the film, the shot. A shot is a single uninterrupted action of a camera.[4] Like the verbal word, the cinematic shot is the smallest functional unit of filmmaking. Some shots last only one or two frames, although such short shots appear rarely in commercial films. But anyone who has seen experimental films (such as Charles Braverman's An American Time Capsule or The World of '68) knows how rapidly shots can operate and how many shots the eye will accept in a small amount of time. Although longer shots are "standard," few last over thirty seconds. The exceptions, of course, run for as long as a filmmaker chooses to keep film running through his camera. The average shot runs from about two to thirty seconds.
Because it is the smallest functional unit of film and combines to form a larger statement, the shot syntactically parallels the word of spoken and written communication. The frame, on the other hand, resembles the single phoneme or letter of a word. Shots make up the vocabulary that film's visual grammar and syntax connect into statements with meaning. The vocabulary of film is primarily the vocabulary of a series of photographic images.
It is illuminating to consider the notion "shot" in relation to the notion "word" in order to grasp the syntactical workings of the basic unit of cinematic composition. The shots of a film draw meaning from their context much as words derive significance almost exclusively from their linguistic context. When isolated, the meaning of either a word or a shot is imprecise at best. Consider the word "stand." Is it a verb (such as a command to assume a certain physical position, or a description of what someone is doing or did do) or is it a noun (such as an ideological position one takes, a structure to sit on, a courtroom place of witness, or a group of trees)? Without a context, one cannot ascertain meaning or function. Similarly, a single shot has meaning, but without a context, a particular meaning is difficult to identify. Consider, for example, a frame showing a saloon with men drinking at tables while a man stands just outside the swinging doors. Is the situation comical or threatening? Or are we seeing a typical Tuesday afternoon at Hank's saloon?
While analogies can be drawn between shot and word, the shot also resembles the written paragraph. A paragraph normally articulates an idea, then offers supportive evidence or arguments. Similarly, a shot in context assumes a general idea or mood and also offers many equivalents of simple declarative and descriptive sentences, providing a viewer with supportive information. Imagine the elements of a hypothetical shot put into statement form: The woman sits in the kitchen. The baby is in the highchair. The baby is crying. The woman is holding baby food. The wall is yellow. On the wall stands a picture of a horse. There is a table in the foreground. The table is round. The table is dark. There are four chairs around the table. All this, and far more, a viewer perceives as he watches a shot. A shot, like a paragraph, offers both detailed information and an idea or mood.
Any direct analogy between the shot and the paragraph, however, will quickly break down. The elements of a paragraph are met with one at a time. They are linear. The content of a shot is, for all practical purposes, available all at once.[5] Ideas and details are not easily separated. Abstract ideas are seldom stated as such in film, and then usually in documentaries. Film argues almost entirely by evidence, inexorably forcing a viewer to supply appropriate abstract ideas. We are not told, for instance, that Mr. Jones loves his wife. We see him love her. Film is a visual medium, and it must make its statements visually.
Shots are categorized according to the apparent closeness of the camera to the person or object photographed. With the early single focal length lenses, distance literally became the factor determining the "length" of a shot. With the present variety of lenses, only the illusion of distance counts. If an object or person seems very far away, the result is normally called an extreme long-shot (ELS), also called an establishing shot because it places objects in context and prepares a viewer for a closer look later. If a person or object appears extremely close, the shot is called an extreme close-up (ECU). In between lie the long-shot (LS), medium long-shot (MLS), medium shot (MS), medium close-up (MCU), and close-up (CU).
The distinctions among shots by distance are relative, and no precise lines or measurements separate the various shots. Usually, the human figure provides the chief standard for measurement. In an extreme long-shot a person might be visible, but the setting clearly dominates. The person fills a good part of the vertical line of the frame in a long-shot, although the setting also receives strong emphasis. A medium long-shot reveals about three-fourths of the subject, while a medium shot (or a mid-shot) would show the subject only from the waist up, focusing the viewer's attention more on the subject than on the setting but keeping a clear relationship between the two. A medium close-up shows a person from the shoulders up, and a close-up shows only the head. An extreme close-up reveals only a small part of the face, such as an eye.
What is a close-up in one situation, however, can be a mid- or long-shot in another. The length of a shot depends on the subject of a film (or of an individual scene). The longer a shot is, the more it shows of the subject; the shorter a shot is, the more it emphasizes detail that is part of the subject. Hence, the length of a shot is relative and depends on what the filmmaker has chosen for a subject. If, for example, a film is about cities of the world, a shot of a red double-decker bus in London would be a close-up. The same shot would be a long-shot in a film about buses, while a shot of an instrument panel would be a close-up.
THE SCENE
A filmmaker puts shots together to make up a scene. A scene is a series of shots that the viewer perceives as taken at the same location during a rather brief period of time. The classic western gunfight furnishes a good example of a scene. Some action prompts two men to face each other, one draws, the other follows suit, and one is killed or wounded. The gunfight might be preceded by a number of shots in the same location, and the scene typically ends quickly after the fight, with action resuming at another time or location.
Film scenes vary greatly in length. Sometimes a scene will be only a single shot long; in other cases a whole movie will have but one scene. Usually, however, scenes last for several minutes. Cinematic scenes tend to be much more crisp than scenes from plays (which offer the nearest literary analogue to cinematic scenes). The average film contains far more scenes than the average play. Since a filmmaker can cut instantly to another scene, he need not worry (as must a dramatist) about moving his cast on and off the stage and about scenery changes. A cinematic scene need last no longer than it takes the filmmaker to convey a single point.
THE SEQUENCE
The largest unit of film's visual grammar is the sequence. The nearest literary analogues to the sequence are the chapters of novels or the acts of plays. A number of scenes make up a sequence, which is the largest working unit of a film. A sequence is usually composed of a series of scenes that are related in location, time, generating action, point of view, or cast. At one time sequences were clearly set off by strong punctuation marks such as the fade-out and fade-in. These strong forms of punctuation alerted a viewer that a major segment of a film had ended and that a new one was about to begin. Contemporary filmmakers, however, have abandoned such obvious punctuation marks, relying instead on jump cuts[6] or other forms of transition.
A sequence provides an enlarged context to which individual shots and scenes contribute. Individual shots must always be read in terms of surrounding shots, and scenes must be considered in relation to other scenes. Our mental expectations when viewing film lead us to read all actions in terms of what precedes them. In the case of a gunfight, a filmmaker would normally include a number of significant scenes before the gunfight to enable us to read the actions properly. By the time of the gunfight we are set to respond in a certain way to its occurrence and its outcome. A sequence, then, provides a self-contained unit that can undergo evaluation and criticism.
Frame, shot, scene and sequence are the most basic terms in the lexicon of film. They suggest the fundamental structures with which film operates, and they indicate the complex “cut-and-paste” nature of the medium. The rhetoric of any film grows from the way a filmmaker manipulates these basic structural units.
[1] From The Rhetoric of Film, John Harrington (University of Massachusetts), pp. 8-20.
[2] In normal usage a frame differs from a still. A still is a photograph taken with a still (versus motion) camera and printed on photographic paper. Most pictures displayed outside theaters or appearing in newspaper ads or magazine articles are taken with still cameras on a movie’s set and are made to be photographs standing alone even though an almost identical frame might appear in a film. Note the difference in a person’s response to a still before or after viewing a film. Afterwards, he has a context, and the still reminds him of a piece of action -- it is part of a continuous stream of images. Before a person sees a film, he views the same photograph as an independent composition, considered in terms of itself. The stream of images of a film conditions a viewer’s later responses.
[3] A freeze-frame is produced mechanically in a laboratory by printing the same frame over and over until the image on the screen resembles a projected slide. Actually, the viewer does not see a single frame during a freeze-frame; rather he sees a repetition of the same picture, although his eye cannot detect any difference between a projected frame and a freeze-frame.
[4] A take is also a single uninterrupted action of a camera, but a take is the unedited footage and is seen from the point of view of the filmmaker rather than of the viewer. A take will frequently be shortened at both ends, and perhaps another shot or two will be cut into the middle, creating three, four, five or more shots out of a single take. For instance, during an interview, two cameras might be trained on the two persons talking. Later, an editor will cut and splice to alternate between the two speakers, creating many separate shots from only two takes.
[5] Some shots do, of course, reveal certain components of content linearly; for example, a moving camera presents different pieces of information in a defined order.
[6] A jump cut is an instantaneous shift from one action to another, at first seemingly unrelated, action. A cut from Paris in 1813 to New York in 1972, with no apparent transition, would normally constitute a jump cut. Such cuts rely upon the audience to fill in missing information and allow a filmmaker to move action rapidly forward through ellipses. Once a viewer consciously or unconsciously fills in missing information, the paradoxical unrelatedness of the two separate actions is, of course, resolved.