The AES31 Standard

The exchange of audio between workstations is a critical issue for audio professionals in the film, television, broadcast, and music industries. It's this problem that the Audio Engineering Society set out to address with a standard for digital audio interchange, designated AES31.

“The primary target of AES31 is the interchange of audio editing projects between DAWs,” says Brooks Harris, president of Brooks Harris Film & Tape in New York City. Harris is also vice chairman of AES SC06-01, the working group in charge of AES31, and is the primary author of AES31-3, the Audio Decision List portion of the standard.

“The goal,” he continues, “is to allow a disk to be moved to another audio editing system with a minimum of fuss and to reproduce the project as accurately as possible.”

The Need for Speed

According to Joe Bull, managing director of SADiE in Cambridge, England, the need for simple, reliable interchange of assets and edit decision lists arises both between and within post facilities.

“Most facilities acquire their audio from a variety of sources, some out-of-house,” Bull says. “Unless the operator knows exactly which product and version of software was used on the source material, it can waste valuable time — at the facility's expense — to fix audio that doesn't transfer correctly.”

Even when the particulars of a file's history are known, interchange hurdles can still impede an efficient workflow.

“Different manufacturers' machines are more or less suited to each task throughout the postproduction process,” Bull says, “and facilities need the flexibility to use the best machine for each task.”

The biggest bottleneck, says Mike Parker, managing director of Digital Audio Research in Surrey, England, is found at the intersection of the picture and sound departments.

“The main interchange application in sound-for-picture,” he says, “is transferring rough-cut audio from video systems — where location sound is digitized when the picture is captured for editing — into digital audio editing systems, where effects and music are added.” Parker adds that the ability to port original location sound directly into audio editing machines is also crucial, as it avoids the need for slow real-time transfers.

Manufacturers haven't completely ignored the interchange difficulties faced by their customers, but prior efforts to address the issue have mostly been driven by the needs of individual vendors.

“All the manufacturers have proprietary formats,” Harris says, “but these are only rarely used for interchange amongst systems. And other manufacturers are reluctant to invest significant development in proprietary solutions that they cannot control.” The real solution, Harris believes, is a public-domain standard that can be implemented at a reasonable cost.

Pieces of the Puzzle

The AES31 standard is broken down into four parts that address different aspects of the interchange puzzle. While interrelated, each of these four parts is at a different stage in the adoption process.

Part 1 is concerned with the issue of media readability. The goal is that any workstation will be able to read the file system used on a storage medium (a removable hard drive, for instance) containing files from a different workstation. As described on the AES Web site, technical requirements include support for large storage devices, large audio files, and arbitrary file names of reasonable length. In addition, the file system should be capable of being implemented on existing platforms and computer operating systems, as well as within various embedded systems.

To meet these requirements, the working group has recommended adoption of the FAT32 file system. Bull says three overriding attributes make FAT32 suitable.

“It's very simple to implement — typically a few man-weeks of development on any platform,” he notes. “It overcomes the 2GB limit of other older filing systems. And it can be easily read by both the Mac and PC platforms, which can't be said of any other disk format.”

Parker adds that FAT32 is simple and robust, efficient for multi-stream applications, and has some level of redundancy, which means that some errors can be fixed. The choice of FAT32 is expected to be finalized at the 2001 European AES convention in Amsterdam.

Part 2 of the standard, which deals with the file format for audio data, has already been ratified. AES31-2 calls for each track of audio to be interchanged as a mono Broadcast Wave file. The Broadcast Wave format is based on the common WAVE file, and was adopted as a standard by the European Broadcasting Union (EBU) back in 1996. Because the AES and EBU are both internationally recognized standards bodies, each can quote existing standards from the other.

Files in Broadcast Wave format are a “restricted subset” of all possible WAVE files. The basic audio format in Broadcast Wave is 16-bit linear PCM sampled at 48kHz, but additional sample rates and bit depths may also be used, and MPEG-encoded audio is supported. Broadcast Wave files also contain an extra “chunk” with information about the audio content, such as the title, origination, date, time, etc. The “.wav” file extension is used, and WAVE-capable applications that do not recognize the Broadcast Wave chunk can still open and play a file's audio.

“The WAVE specification provided a foundation with widespread availability, and relatively simple and well-documented implementation,” Harris says. “The ‘Broadcast Wave’ chunk was added to meet professional requirements, especially a high-resolution time stamp [timecode].”
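As a rough illustration of why an application that does not recognize the extra chunk can still play the audio, the Python sketch below walks the chunks of a WAVE file and simply skips any it does not need. The chunk ID "bext" is the identifier the Broadcast Wave specification uses for its extension chunk. Everything else is an assumption made for the example: the helper names (iter_riff_chunks, make_chunk, build_demo_file, audio_payload, bext_description) are hypothetical, and the tiny in-memory demo file carries only a description field in its "bext" payload rather than the full set of fields a conforming file would hold.

```python
import struct


def iter_riff_chunks(data):
    """Yield (chunk_id, payload) for each top-level chunk in a RIFF/WAVE byte string."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE byte string")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size % 2)  # chunks are padded to an even length


def audio_payload(data):
    """Return the raw sample bytes, ignoring every chunk except 'data'."""
    for chunk_id, payload in iter_riff_chunks(data):
        if chunk_id == b"data":
            return payload
    raise ValueError("no data chunk found")


def bext_description(data):
    """Return the description text from a 'bext' chunk, or None if absent."""
    for chunk_id, payload in iter_riff_chunks(data):
        if chunk_id == b"bext":
            # The description is the first 256 bytes, padded with NULs.
            return payload[:256].split(b"\x00", 1)[0].decode("ascii")
    return None


def make_chunk(chunk_id, payload):
    """Serialize one chunk: 4-byte ID, little-endian size, payload, pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def build_demo_file():
    """Build a tiny mono 16-bit, 48kHz file in memory (demo only, not conformant)."""
    # fmt fields: PCM tag, 1 channel, 48000 Hz, 96000 bytes/s, block align 2, 16 bits
    fmt = struct.pack("<HHIIHH", 1, 1, 48000, 96000, 2, 16)
    bext = b"Shot w/sync".ljust(256, b"\x00")  # description field only (assumption)
    audio = struct.pack("<4h", 0, 1000, 0, -1000)
    body = (b"WAVE" + make_chunk(b"bext", bext)
            + make_chunk(b"fmt ", fmt) + make_chunk(b"data", audio))
    return b"RIFF" + struct.pack("<I", len(body)) + body


demo = build_demo_file()
print([chunk_id for chunk_id, _ in iter_riff_chunks(demo)])
print(bext_description(demo), len(audio_payload(demo)))
```

A reader that only calls audio_payload never looks inside the extra chunk, which is the behavior the format relies on for backward compatibility.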

The Audio Decision List

While the media and file readability provided for in parts 1 and 2 of the standard offer the necessary preconditions for interchange, they don't go far enough to allow the work-product of one editing system to be meaningful to another. It's the question of how much further to go that has proven most challenging in terms of reaching a true, industry-wide consensus.

To make the issues involved more manageable, the standard tackles the problem in two stages. The first stage, already ratified and in effect, is AES31-3, Audio Decision List. This standardized “ADL” format, Harris says, is a “Simple Project Interchange” format, intended solely to interchange an audio project's current state. It includes a provision for a single track of video.

The first section of an ADL is header information about the project, system, and timeline. Next is the source index (references to the source files and tapes), followed by the event list itself. One of the defining characteristics of the ADL is that it is ASCII-based, making it human-readable.

“Human readability has shown its value over the years in EDLs,” Harris says. “If software or hardware fails somehow, you can still just look at the list.”

To preserve sync between the multiple mono Broadcast Wave files that make up a multitrack project, the files referenced by an ADL are aligned along timeline tracks using the time stamps in the audio files. The timecode type is Timecode Character Format (TCF), which supports sample accuracy, multiple picture frame rates, and other information critical to ensuring proper time alignment.
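To make the idea of sample-accurate timecode concrete, here is a minimal Python sketch that splits a value written in the style of the examples accompanying this article (such as 03:04:17.03/1601) into hours, minutes, seconds, frames, and a sample remainder, then converts it to an absolute sample count. The names Timecode, parse_tcf, and to_samples are hypothetical, and the 25-frame rate and 48kHz sample rate are assumptions chosen for the example. The sketch records the separator in front of the frame field but does not interpret it; how TCF signals frame rate is defined by the standard and is not reproduced here.

```python
import re
from typing import NamedTuple


class Timecode(NamedTuple):
    hours: int
    minutes: int
    seconds: int
    frame_separator: str   # kept as-is; not interpreted in this sketch
    frames: int
    sample_remainder: int  # samples past the start of the frame


# HH?MM?SS?FF/SSSS, where "?" is any single non-digit separator
_TCF_PATTERN = re.compile(r"^(\d{2})\D(\d{2})\D(\d{2})(\D)(\d{2})/(\d{4})$")


def parse_tcf(text):
    """Split a timecode string into its numeric fields."""
    match = _TCF_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized timecode: {text!r}")
    h, m, s, sep, f, rem = match.groups()
    return Timecode(int(h), int(m), int(s), sep, int(f), int(rem))


def to_samples(tc, frames_per_second=25, sample_rate=48000):
    """Convert to a sample count, assuming a non-drop integer frame rate."""
    if sample_rate % frames_per_second:
        raise ValueError("sample rate must be a whole multiple of the frame rate")
    samples_per_frame = sample_rate // frames_per_second
    if tc.frames >= frames_per_second:
        raise ValueError("frame number exceeds the assumed frame rate")
    if tc.sample_remainder >= samples_per_frame:
        raise ValueError("sample remainder does not fit in one frame")
    whole_seconds = tc.hours * 3600 + tc.minutes * 60 + tc.seconds
    total_frames = whole_seconds * frames_per_second + tc.frames
    return total_frames * samples_per_frame + tc.sample_remainder


start = parse_tcf("03:04:17.03/1601")
print(start)
print(to_samples(start))
```

Under these assumptions each frame spans 1,920 samples, so a remainder of 1601 is valid. At 30 frames per second and 48kHz a frame would hold only 1,600 samples, and the sketch would reject the same value.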

Describing the committee's other priorities for Part 3, Bull says that the inclusion of certain fields in the ADL was “obvious and essential.” Among these, he lists clip name, source-material file, source time, destination time, track number, in time, and out time.

Bull says that “non-essential fields” were left out of the ADL “in the interests of simplicity.” For example, AES31-3 does not address “edit heritage,” meaning the complete history of all editing sessions contributing to the current version.

Another valuable capability that is missing is the interchange of equalization, dynamics, and real-time automation parameters. But Bull, using as an example the fact that the same EQ curve can sound quite different on different machines, suggests that such information is “at best likely to be interpreted as an approximation of the sending user's wishes.” To have included it in Part 3, he argues, “would merely complicate the Simple Interchange Format with no discernible benefit.”

The committee is addressing the need for more sophisticated interchange in Part 4, Object-Oriented Project Interchange. Still in its early stages, Part 4 is not progressing quickly.

“Complete design of an object-based project interchange format is difficult,” Harris says. “This design work is probably beyond the means of AES on a volunteer basis.”

Now, or Never?

Many in the industry advocate swift implementation of AES31, without waiting for Part 4 to be defined.

“When the disk, files, and ADL are all readable by the receiving machine, real interchange will work for the majority of post facilities,” Bull says. Moving ahead now as an industry, he adds, means “giving our customers the benefit of a real working interchange today, saving them time and therefore money.”

In February, SADiE announced the immediate availability of AES31 in its workstation products, celebrating the fact that it was the first company to offer the format. But the company shouldn't have long to wait before others join in.

“To my knowledge at this moment,” Harris says, “DAR and Euphonix also have implementations, while WaveFrame, Merging, Fairlight, and others have announced their intentions.” Bull adds Genex, Akai, DSP Media, AMS Neve, and SEK'D to the list of those who have committed their support.

While this list suggests impressive industry cooperation, there are nonetheless a couple of obvious gaps.

“Unfortunately, we do not see the video-editing manufacturers implementing this standard, which will limit its use to after the picture department has handed off,” says David Van Hoy, chairman of WaveFrame in Emeryville, California.

The most crucial of these video vendors, of course, is Avid, which also owns Digidesign. The company has long had its own ideas about interchange, based on its proprietary Open Media Framework.

“OMF does everything AES31 does and more,” says Scott Dailey, VP of product management and business development at Digidesign in Palo Alto, California. “It's a mature, second-generation technology. It's been adopted by essentially every important company in postproduction audio, including Akai, AMS Neve, DAR, DSP, EDLMax, Fairlight, SADiE, SSL, and TASCAM/TimeLine, as well as most — if not all — of the important film and video post companies. Thousands of working professionals rely on it every day to get their work done.”

Given OMF's widespread adoption and its “now rock-solid reliability,” Dailey says that Digidesign doesn't understand the point of the AES31 ADL. Adopting AES31, he adds, would take resources away from other products and features that Digidesign's customers are asking the company to implement.

To AES31's proponents, however, OMF's complexity and proprietary ownership make it a problematic alternative.

“OMF is documented and freely licensed,” Harris says, “but it is difficult to implement and is not a due-process standard.”

Bull, meanwhile, rejects the notion that any existing format is similar to the new standard.

“AES31 specifically excluded the complex video information required by OMF,” he says, “to make the standard cost-effective for the audio industry to implement.”

Bull adds that without the backing of an internationally recognized standards body, agreements to support proprietary solutions are likely to last only as long as software versions stay static on the products involved.

“This has always been a problem with OMF,” he says. “It's been through several iterations, making it extremely difficult for audio manufacturers and customers alike to keep abreast of the latest developments.”

Despite this apparent impasse, there's a good chance that AES31-4 will eventually bring all the manufacturers together on common ground. That's because a leading contender for object-oriented interchange is Advanced Authoring Format, which Dailey describes as the “successor” to OMF. AAF is promoted by a consortium led by Avid and Microsoft.

“AAF is complex,” Harris says, “but it has many technical features to recommend it. The consortium has announced its intention to move AAF through the SMPTE standardization process, and this bodes well for its future. It has a reasonable chance of being accepted, at least by those willing to accept the terms of the consortium. It may be the best candidate for AES31-4, though time will tell.”

In the meantime, Harris says that the internationally backed AES31 promises to become a truly open standard for project interchange. For now, however, it remains to be seen whether the new format turns out to be truly universal or merely additional.

Keyword URL
Unique Identifier
Start Timecode Length
Usage Code

(F) “URL:file:3/26/01/localhost/MyDisk/AUD001.WAV”
03:04:17.03/1601 00:00:05.01/1601
“NAME: Shot w/sync”

The layout of data fields for a file in the Source Index section of an ADL. The lower half of the graphic shows an example of the data as it would appear in an actual ADL. The upper half of the graphic (bold lettering) identifies the type of data that goes into each field.

Keyword Tape Name Tape Timecode File Timecode Available Channels
(T) “TAPE001” 03:04:17.04 03:04:17.03/1601 V1-4 1

The layout of data fields for a tape in the Source Index section of an ADL. The lower half of the graphic shows an example of the data as it would appear in an actual ADL. The upper half of the graphic (bold lettering) identifies the type of data that goes into each field.

(Index) 0001
(F) “URL:file:3/26/01/localhost/NO NAME/DARDemo2/2CFZ8U41.WAV”
_|0000 _
N /* Usage:Normal Codec:WAVE Filesystem:FAT32 */
(Index) 0002
(F) “URL:file:3/26/01/localhost/NO NAME/DARDemo2/2CFZ80RI.WAV”
_|0000 _
N /* Usage:Normal Codec:WAVE Filesystem:FAT32 */
(Index) 0003
(F) “URL:file:3/26/01/localhost/NO NAME/DARDemo2/5MVIIQC0.WAV”
_|0000 _
N /* Usage:Normal Codec:WAVE Filesystem:FAT32 */

A sample Source Index section of an ADL, showing three source files.

(Entry) 0001 (Cut) I 0001 1 1|1218|0598|1582 _
(Outfade)|0000 LIN _ _ _ _ _ _
(Rem) SOURCE “”
(Entry) 0002 (Cut) I 0002 1 2|1218|0598|1582 _
(Outfade)|0000 LIN _ _ _ _ _ _
(Rem) SOURCE “”
(Entry) 0003 (Cut) I 0001 1 3|0438|0826|0931 _
(Infade)|0000 LIN _ _ _ _ _ _
(Rem) SOURCE “”

A sample Event List section of an ADL, showing three events.

Keyword Event # Event Type Source Type Source Index Source Channel Dest. Channel
Source IN Dest. IN Dest. OUT
(Entry) 0001 (Cut) I 0001 1~9999 1~9999
03:10:04:11/1218 01:00:00.00/0000 _

The layout of data fields for an event in the Event List section of an ADL. The lower half of the graphic shows an example of the data as it would appear in an actual ADL. The upper half of the graphic (bold lettering) identifies the type of data that goes into each field.