Descriptive captions

Captions aren’t only about speech. Descriptive captions cover anything audible that isn’t spoken: the tone of voice, a significant non-verbal sound made by a speaker, a substantial interruption that blocks out speech, media played within a recording, and so on.

Whatever the case, descriptive captions are all formatted the same way: in all capitals, inside round brackets (parentheses), for example (LAUGHS).
Here are some commonly used descriptive captions:

(LAUGHS), (LAUGHTER) Use if there’s a significant reaction by either the speaker or the audience, so the viewer gets the joke.
(OVERLAPPING SPEECH), (CROSSTALK) Sometimes speakers will talk over or interrupt each other. Caption everything you can make out, but if the speech isn’t clear, use this label.
(PHONE RINGS), (DOOR SLAMS) Common interruptions that might occur in talks or lectures. Only use if the speaker refers to them or if they block out audio.
(WHISPERING), (SARCASTICALLY) Used to indicate the tone of voice. Use only if necessary for understanding.
E.g. “(WHISPERING) But don’t tell anyone.” “(SARCASTICALLY) I’m so excited for class today.”
(VIDEO PLAYING), (VIDEO STOPPED) Sometimes speakers will play media within a lecture or presentation. These labels don’t replace captions – if there is audible speech, sound effects or music, these must be captioned as usual.
(WHISTLES), (SNAPS FINGERS), (SIGHS HEAVILY) Common sounds that speakers might make during talks. Only use if they add meaning to speech.
E.g. “And it was gone, just like… (SNAPS FINGERS).” “(SIGHS HEAVILY) I’ve had a tough day.”
Use when there is any significant break in audio, not just natural pauses during speech. Make these captions five seconds long, and begin captioning with a speaker label when captionable content resumes.
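
If you review captions in bulk, the formatting rule (all capitals, in round brackets) can be checked mechanically. The sketch below is only an illustration, not part of any captioning tool: the function name `find_misformatted_labels` and both patterns are assumptions, and it treats every bracketed span in a caption as an intended descriptive label.

```python
import re
from typing import List

# Assumed shape of a well-formed label: capitalised words separated by
# single spaces, wrapped in round brackets, e.g. "(SNAPS FINGERS)".
WELL_FORMED_LABEL = re.compile(r"\([A-Z]+(?: [A-Z]+)*\)")

# Any bracketed span at all (no nested brackets).
ANY_BRACKETED = re.compile(r"\([^()]*\)")


def find_misformatted_labels(caption: str) -> List[str]:
    """Return bracketed spans in `caption` that are not in all capitals."""
    return [
        match.group(0)
        for match in ANY_BRACKETED.finditer(caption)
        if not WELL_FORMED_LABEL.fullmatch(match.group(0))
    ]
```

For instance, “(WHISPERING) But don’t tell anyone.” passes the check, while “(Sighs heavily) I’ve had a tough day.” is flagged. A check like this only catches formatting slips; whether a label is actually needed is still a judgment call for the captioner.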

More questions? See our Recorded Captioning Style Guide.

Updated on September 6, 2018


