When They Hear Us: Creating the Period-Specific Soundtrack for a Tragedy of Justice


Chris Carpenter, left, Susan Dudeck, Joe DeAngelis, John Benson and Bruce Tanis. Photo by Wm. Stetz

by Mel Lambert • portraits by Wm. Stetz

The Netflix four-part miniseries When They See Us, which dropped May 31, was originally called Central Park 5. That working title identified the core of the plot: the notorious case of five teenagers of color who were wrongly convicted of raping a white female jogger in New York, a case that popularized the term “wilding” to describe out-of-control gang activity. The revised title better reflects the fate of the young people unjustly ensnared in the criminal justice system by embracing their humanity rather than their politicized role.

John Benson.

The sound editorial and dubbing crew for the show was based at Technicolor Sound Services on the Paramount Studios lot and consisted of supervising sound editor John Benson, sound designer/sound effects editor Bruce Tanis, CAS, and ADR/dialogue supervisor Susan Dudeck, who worked closely with dialogue/music re-recording mixer Joe DeAngelis, CAS, and sound effects re-recording mixer Chris Carpenter. The mixers used the facility’s Stage 9 to develop a Dolby Native Atmos 7.1.4-channel soundtrack, as well as a wide, 5.1-channel mix for the majority viewing audience.

Also on the post sound team were Foley artists Dawn Lunsford and Alicia Stevenson and Foley mixer David Jobe. DeAngelis, Carpenter and Benson recently completed work on The Umbrella Academy series for Netflix, which debuted in February.

“Our director Ava DuVernay — herself a person of color — wanted a specific, urban feel for the soundtrack,” Benson says. “She asked us to create the sound of New York in the dialogue, Foley and effects tracks — to really put the audience into those closed environments and follow the action as the film moves from the boys’ homes in Harlem, to Central Park, to the police station, to the courtroom, to juvenile detention, then to the Rikers Island and Attica prisons. ‘Realism’ was our watchword when making soundtrack decisions.”

Contrasting Period Sounds

The team’s primary focus was to establish and contrast the 1989 sound of Harlem projects with uptown Manhattan, and the more affluent neighborhood of rape victim Patricia Meili. “We wanted to contrast the neighborhoods and cultures within New York City, and the authority of the police stations and courtrooms,” Benson explains. “The audience needs to understand how it felt to come from these poor backgrounds, and the cultural differences between that and the courtroom environments.”

When They See Us. Netflix

Benson asked Tanis to oversee sound design to enhance the fear and anxiety that the accused teens were experiencing, and to integrate that with the music score to develop a feeling of oppression and isolation, “specifically during the prison mess hall scenes and flashbacks.” Adds Tanis, “Obviously, the basic intention was to cut full backgrounds and hard effects, but also that spotted sounds would form a large element of the edit, especially street-based call-outs and people talking and shouting back and forth to each other between tall buildings.”

Tanis discusses the director’s intention that Harlem should sound vibrant and alive, with crowds of people, loud traffic, sirens, buses and jackhammers, noting, “This would be contrasted with Central Park, which was ‘big city’ around the edges but would become much quieter and pastoral as we moved deeper inside to the crime scene. This serene woodland sense evolved through further discussions with John [Benson]. It became a sort of leafy, wind transition swell, as a memory of the last place the boys were free — each time we see one of them being released from prison, in Part 3, and a few other scenes.”

Benson, along with Dudeck and Tanis, spotted each part with DuVernay and each part’s specific picture editor. (See related story on the show’s picture editorial crew on Page 38.) “We discussed all specifics of the scenes, problems with dialogue, the ADR needed, design for particular moments and the general feel of the scenes,” relates Benson. “In other words, identifying the dramatic beats so that our design can support and enhance them. After the spotting session, we would discuss the director’s notes and expand on them, figuring a plan for that part.”

Susan Dudeck.

As the action moves for the first time into a squad room full of police officers, there are lots of phones, typewriters, talk and activity, according to the supervising sound editor. “We go to a reception area full of worried parents trying to get information about their children who were picked up in the park the night before,” he elaborates. “There is a holding room where the boys have been all night, and the interrogation rooms away from the busier station where the boys are questioned. All these specific rooms within the police station need to support their part in the story of the creation of the park attack narrative.

“The quiet of the interrogation room, the buzz of the fluorescent lights, squeaks of the chair…,” Benson continues, describing the sonics of the scenes. “Similarly, the courtroom is often quiet so that the movement of lawyers making their points is featured. This is in contrast to the busy police station areas and, in the case of the court scenes, the crowds and media outside the courtroom.” According to Benson, ADR/dialogue supervisor Dudeck recorded chants for the protesters and call-outs for the law-and-order crowd to create a cultural collision that the boys walk through to get to the courtroom, a clash that is quickly contrasted with the authority and order of the court itself.

Bruce Tanis.

Dudeck is candid about her editorial options, specifically for dealing with the “very dirty” dialogue tracks. “Because Ava shot many scenes using multiple cameras, the boom channel was very often unusable — and unfortunately, the lavalier tracks, which were all we had to depend on, were often full of clothing noise and other contamination,” she concedes. “Normally, due to those conditions we would have done a good deal more ADR, but Ava was protective of her performances. So we had to use a lot of iZotope RX solutions, going a bit deeper than we normally would, or that we would have liked to do. But there was no other way to get audible dialogue.”

Due to that dialogue issue, Dudeck and her crew hit the ground running. “We had to jump right in,” she recalls. “Of course, I flagged certain scenes, having done my own very quick RX sessions on each piece of dialogue as I cued ADR — to make sure it was at all salvageable. There was a lot of group ADR, more than a day’s worth on two parts. That was mostly because of the variety of different ages, genders and ethnic groups required for each part. Also, a few parts had several news reporters, both on-camera and wild, that needed to be recorded; those took quite a bit of work to cast and shoot and, in a few cases, reshoot.”

Reinforcing Environments Through Sound

Tanis maintains that his sound design was focused on character motivation: “A lot was based on coming up with ‘fear’ and ‘dread’ tones, plus accents for when the boys are being interrogated and, later on, sentenced in court. It’s a pretty dark sequence in the first part because these children are being held and questioned by police, and it’s very intimidating.” Benson asked if he could come up with some sounds that would give a sense that the boys were struggling in an alien and very unfriendly environment. “This developed to include scene transitions from incarceration to freedom, and also emotional transitions for some of the other characters,” the sound designer explains. “Since a lot of the story takes place within some context of the criminal justice system, I used cell doors, sirens, dark metal drones and things like that to create sounds that helped fulfill John’s request.”

One of Tanis’ biggest sound effects challenges came from the director’s need to “hear a lot of the nearby but off-stage world from wherever we are,” he explains. “If we are in an office, there are people down the hall doing things; if we’re out on the street in Harlem, there are all kinds of layers and distances to make it vibrant and alive. The prison scenes are full of all kinds of voices. In effect, by having these full layers of detail, we cut the show a couple of times over.”

For the interrogation scenes during Part 1, and again during the solitary confinement in Part 4, Tanis developed some metal- and glass-based tones and swells that play very quietly in the mix, but suggest a sense of something uncomfortable. “There’s just an edge to them that makes these scenes a bit darker and lonely,” he says. “At one point, while still in solitary, the air conditioning goes out and the cell becomes a sweat box; the heat is intense and there’s just no air coming in. I made some tones created from insect buzzes, and incredibly slowed-down metal impacts, to play up the heat aspect of the scene. Whenever the character touches something in his cell, which is basically all metal, there’s a soft, low sizzle effect to play up the sense that everything he touches is burning. When the AC finally comes back on, there’s a stream babble added to enhance the feeling that not just cool air is coming out, but that wet, lush, life-giving air is flowing again.”

In terms of plug-ins, Tanis elected to use his regulars. “Mostly they were used to create solitary confinement tones and ‘heat’ tones,” he explains. “Some of the work was on creating the ‘fear’ tones for specific parts of scenes. Some of the gate buzzers, L train sounds and other things were sort of ‘outdoor-ized’ using Audio Ease Altiverb presets. A few things used multiple passes of several plug-ins to really push them into making a dark, foreboding presence.”

According to the sound designer, it was the time crunch to complete everything by deadline that proved the most challenging part of his job. “Both Ava and John had specific things they wished to bring out in the edit, in addition to all the regular things like wallas, cell doors, 1989-90 period things, etc.,” Tanis reveals. He thanks his sound effects editors Mark Larry, Elliott Koretz, MPSE, Matt Wilson and Suat Ayas, MPSE.

Dialogue Guides the Mixing

Dialogue intelligibility was of paramount importance throughout the mix. “It was vital for the audience that the dialogue was clear and present,” re-recording mixer DeAngelis says. “After all, it’s the vehicle by which the story is told. Principally, there are three ways in which we can protect the dialogue: volume, clarity and physical location. The production mixer was able to get us a fair bit of volume in some really tough situations, and that was quite valuable. I attempted to match angles from take to take with volume and EQ, as well as some subtle reverb. Once I got it in the ballpark, I’d use iZotope to clean out as much noise as I could without causing track degradation. It’s really easy to overuse iZotope and I’m always double-checking the work to make sure I’m not digging in too hard.

Chris Carpenter, left, and Joe DeAngelis.

“Lastly, but importantly, we are fortunate to be able to mix When They See Us in Atmos, which lets us pull the music off the screen a bit and helps with dialogue clarity, because the center channel isn’t too overloaded with dialogue, music and sound effects,” DeAngelis continues. “Pads and strings really benefit from being placed in the space, but sources like drums and bass don’t come off the screen as elegantly. Additionally, since source music and futz dialogue often want to be placed specifically with reference to some radio or physical speaker seen on screen, those are obvious Atmos objects. It’s easy to overuse overheads that will make your mix sound mono, but I’ve found they really support the tone and texture of certain reverbs.”

The Dolby RMU handled folding down of the Native Atmos mix to 5.1-channel and stereo mixes, which preserves a lot of acoustic separation, according to DeAngelis. “We were always checking the stereo mix in a separate nearfield room to make sure the fold-down was translating; we’ve been able to pick up a number of tricks over our past Atmos shows. I always think that what I might do in Atmos surround will benefit the intention of the story, and not just what I see on screen.”
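For readers unfamiliar with the idea of a fold-down, here is a minimal sketch of a static 5.1-to-stereo downmix for one sample frame. It is a textbook illustration only and is not how the Dolby RMU renders an Atmos mix. The function name, the -3 dB gains on the center and surround channels, and the decision to discard the LFE channel are all assumptions made for this example.

```python
import math

# -3 dB expressed as a linear gain (1/sqrt(2)); an assumed, commonly used value.
MINUS_3_DB = 1.0 / math.sqrt(2.0)

def fold_down_to_stereo(left, right, center, lfe, left_sur, right_sur,
                        center_gain=MINUS_3_DB, surround_gain=MINUS_3_DB):
    """Fold one 5.1 sample frame down to a (left_out, right_out) pair.

    The lfe argument is accepted but deliberately ignored (an assumption).
    """
    # The center is shared equally between both outputs; each surround
    # feeds only the output on its own side.
    left_out = left + center_gain * center + surround_gain * left_sur
    right_out = right + center_gain * center + surround_gain * right_sur
    return left_out, right_out

# A center-only signal lands equally in both stereo channels.
center_only = fold_down_to_stereo(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
print(center_only)
```

Because the center channel feeds both outputs, a dense center can crowd a stereo fold-down, which is one reason to check the stereo version in a separate room as the mixers describe.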

In terms of track layouts, the mixer recalls that they were pretty consistent. “I carried 12 dialogue tracks, eight futzed dialogue tracks [four of which were objects], 16 ADR tracks, 16 group tracks, four foreign dialogue tracks and eight foreign group tracks,” he says. “On the music side, I had 24 music 5.1 stem tracks, 12 stereo music Atmos objects and four music stereo song tracks. The foreign and music object tracks expand or contract to fit the needs of a specific part.”
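As a quick arithmetic check on that layout, the sketch below tallies the counts exactly as DeAngelis quotes them. The dictionary keys and the function name are hypothetical labels chosen for illustration, and each figure is taken at face value as a number of tracks, with no assumption about channel widths.

```python
# Track counts as quoted in the article; key names are illustrative labels only.
DIALOGUE_SIDE = {
    "dialogue": 12,
    "futzed_dialogue": 8,   # four of these eight were Atmos objects
    "adr": 16,
    "group": 16,
    "foreign_dialogue": 4,
    "foreign_group": 8,
}
MUSIC_SIDE = {
    "music_5_1_stems": 24,
    "stereo_music_objects": 12,
    "stereo_songs": 4,
}

def total_tracks(layout):
    """Sum the track counts in one side of the layout."""
    return sum(layout.values())

print(total_tracks(DIALOGUE_SIDE))  # 64
print(total_tracks(MUSIC_SIDE))     # 40
```

On these figures DeAngelis was carrying 64 tracks on the dialogue side and 40 on the music side before any per-part expansion or contraction.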

The two re-recording mixers report that they have always worked a little differently from most dub stages. “We mix completely in-the-box on an Avid S6 console, so we can separate at will and simultaneously work on different sections, which dramatically reduces downtime,” DeAngelis explains. “Depending on the track and availability, I’ll either work on a pair of custom in-ear monitors that JH Audio built for me, or in a pre-dub room behind our stage. In this way, both Chris [Carpenter] and I get a lot more time on the speakers than otherwise. Once I’ve been through the dialogue, ADR and group, and Chris has been through the effects, backgrounds and Foley, we will both begin a pass together where I lay in music and we work the tracks against each other.”

In most cases, the mixers took their direction from the temp track. “It’s usually quite instructional as to what the picture editor and producer/director want,” DeAngelis continues. “Chris and I have a great rapport built over 20 years; we rarely have to negotiate anything since the track comes together and the story really dictates what is sonically most important. That doesn’t mean there aren’t times where we have a lot of sound to try and cram into a scene, but it always works out.”

Despite the arsenal of tools at his disposal, Carpenter feels, “The mix is still dependent on the old standbys: level, panning, EQ, compression and reverb; those five tools are all you need for a great track. Obviously, the sound effects track must not dominate dialogue unless there is a very specific reason for it to do so. The sound effects team did a fantastic job of preparing effects and backgrounds; they kept a lot of material isolated in various pre-dubs so that I had a quick way of managing major aspects of the mix, as well as being able to get very granular when needed.”

Separate Pre-Dubs

Carpenter and DeAngelis will pre-dub separately until “both of us feel that our elements play in a way that we won’t be slowed down by mixing together,” the former explains. “Then we join up together and work through the show, Joe integrating music while I polish Foley and keep the sound effects from negatively impacting dialogue. Sonic impact is more effective if it occurs after a lull in the track. And surrounds pop if they have been ignored for a moment; ditto the overhead speakers in an Atmos environment. Leaning on the quiet moments allows the track to have greater dynamics and impact without dominating the dialogue or becoming monotonous.”

Each part of When They See Us is cut and mixed like a feature film, with the shortest running 60-plus minutes and the longest 82 minutes. “There is a lot of detail that needs to play in order to establish emotion and sense of place,” DeAngelis explains. “Additionally, there is a lot of wonderful score [from composer Kris Bowers] and dense, thoughtful dialogue. Needless to say, it’s taken a serious amount of good work. Typically, our mixes averaged 10 days per part. I’ve been using UAD plug-ins and the firm’s DSP Accelerators for the past eight years; they really give the warmth we were used to with the older analog gear, with the huge benefit of having it all automated and in-the-box.”

The team used Nugen ISL2 as a mastering/true-peak limiter. “Netflix’s current specification of -27 dB LUFS, measured with a 1770-1 algorithm and Dialogue Intelligence on, allows for both creative expression and compliance,” stresses DeAngelis. “It’s a great spec to work in. We use the Nugen ISL2 to maintain true peak compliance but don’t automate any other part of our mixing process.”
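To make the compliance idea concrete, here is a minimal sketch of a pass/fail check on values that a loudness meter has already measured. Only the -27 target comes from the article. The 2 LU tolerance, the -2 dBTP true-peak ceiling and the function name are assumptions for illustration and should be checked against the current delivery specification.

```python
def check_delivery(integrated_lufs, true_peak_dbtp,
                   target_lufs=-27.0,        # target quoted in the article
                   tolerance_lu=2.0,         # assumed tolerance, not from the article
                   peak_ceiling_dbtp=-2.0):  # assumed ceiling, not from the article
    """Return (loudness_ok, peak_ok) for already-measured programme values."""
    # Loudness passes when it sits within the tolerance window around the target.
    loudness_ok = abs(integrated_lufs - target_lufs) <= tolerance_lu
    # True peak passes when it does not exceed the ceiling.
    peak_ok = true_peak_dbtp <= peak_ceiling_dbtp
    return loudness_ok, peak_ok

print(check_delivery(-26.5, -3.0))  # (True, True)
print(check_delivery(-23.0, -1.0))  # (False, False)
```

The check does no measurement itself. In practice the two input values would come from a dialogue-gated loudness meter and a true-peak meter run over the finished mix.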

Benson recalls that DuVernay’s notes from playback were specific: “We would typically have two playbacks with her to refine the mix. Often, it was about finding the right balance between music and sound effects to feature the element that best addressed the emotion or feel of the scene. Sometimes, ideas we had for background activity in the end distracted from the scene’s main story. An example was a sequence in juvenile detention in which Raymond Santana, Jr. [Freddy Miyares] was calling his father. As their conversation takes place, there is background action where one boy is controlling access to the phones. We used group, Foley and effects to help tell this story.”

But those sounds became distracting from the foreground conversation between Raymond and his father, the supervising sound editor acknowledges. “In the end,” says Benson, “we realized that this story was told mostly visually.”

About Mel Lambert
Mel Lambert is intimately involved with production industries on both sides of the Atlantic. He is a 30-year member of the UK’s National Union of Journalists. He can be reached at mel.lambert@content-creators.com.