Provide audio descriptions for video or animated content - with Synchronised Multimedia Integration Language (SMIL)
Why this is important
Audio descriptions make media whose visual content is important to understanding it accessible to anyone who is blind or visually impaired and unable to see the video. SMIL is an open standard created by the World Wide Web Consortium (W3C), and is therefore in theory independent of any particular media player or format. It is used to create accessible media by combining caption and audio description files with media files and specifying how they are synchronised and displayed.
General Principles
SMIL is based on Extensible Markup Language (XML). Version 1.0 became a W3C Recommendation in 1998; version 2.0 followed in 2001, with a second edition of its specification published at the beginning of 2005. A SMIL file is plain text, so it can be created using a basic text editor. In practice, though, authoring can be done more effectively and quickly using an application such as MAGpie.
Real Player and QuickTime both support SMIL, but Windows Media Player uses a separate technology, Synchronised Accessible Media Interchange (SAMI), whose files confusingly share the .smi extension that is also used for SMIL files. Microsoft has also developed HTML+TIME as a technology for synchronising embedded media within a web page. Version 2.0 of HTML+TIME is a subset of SMIL 2.0, and is currently only supported by Internet Explorer (from version 5.5).
Since all three main media players can open files with the .smi extension, the wrong player may open a SMIL file. To avoid this, it is necessary to reference the SMIL file from a separate player-specific file (for example a .ram file for Real Player or a .mov file for QuickTime).
NB: We have provided general advice on audio description in a separate How To: Provide audio descriptions for video or animated content - general advice.
Before you continue
The advice on this page helps you avoid introducing a specific accessibility barrier, but it's not a magic formula. To avoid attempting to follow a technical solution that is not appropriate to the resource and its intended purpose, you need to know the context in which the multimedia resource is being used:
- The purpose or aim of the multimedia resource in question, and whether it is being used to supplement another resource in the learning environment, or whether its use is required by students.
- The target audience, their knowledge and expectations, and the type of browsing and assistive technology that they may be using.
- Whether the information and experiences provided by the multimedia technology are already available in an equivalent, alternative form.
For more background on this approach, see our Guide to the use of multimedia in accessible e-learning.
Technique Details
SMIL is a powerful technology, and an in-depth discussion of its capabilities is beyond the scope of this resource. From an accessibility perspective, the process of adding audio descriptions to a video file using SMIL involves:
- Creating audio description files, normally very short files of spoken content, each created using a microphone and sound recording software. Audio description files can either be one file containing all the descriptions which is then synchronised with the video using timed cue points, or individual audio files for each description clip. While the latter obviously requires multiple files to be created and managed, it is easier to code and is recommended.
- Creating a SMIL file which references the media file and audio description file(s) (and caption file assuming one is provided). It can also define properties of the area of the screen to be used to show the media, the area of the screen to be used to show captions, and provide metadata about the clip. All assets must reside in the same folder - the digitised video, caption and audio description file(s), and SMIL file.
- Making the SMIL file available on a web page, using HTML. This is normally more complex than might be expected, given the lack of native browser support for SMIL and the uncertainty over which media player on a user's computer will open the SMIL file. How the SMIL file is referenced in a web page thus depends on the media player in which the resulting audio-described video will open, but it generally requires a media-player-specific meta-file, which in turn references the SMIL file.
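The single-file approach to audio description mentioned in the first step can be sketched roughly as follows. This is a minimal SMIL 1.0 outline under stated assumptions, not a tested production file: the file names, region sizes and timings are all hypothetical, and support for the clip-begin and clip-end attributes and for the textstream element should be checked against the documentation for your chosen player.

```xml
<smil>
  <head>
    <layout>
      <!-- Hypothetical layout: a 320x240 video area with a 60-pixel caption strip beneath it -->
      <root-layout id="frame" width="320" height="300" background-color="#000000"/>
      <region id="main" top="0" left="0" width="320" height="240"/>
      <region id="captions" top="240" left="0" width="320" height="60"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="lecture.mov" region="main" id="video"/>
      <!-- Optional caption track; the caption file format depends on the player -->
      <textstream src="lecture-captions.txt" region="captions"/>
      <!-- One description file, played in slices at the chosen cue points.
           begin is the delay from the start of the par; clip-begin and
           clip-end select the slice of the description file to play. -->
      <audio src="lecture-descriptions.mp3" clip-begin="npt=0s" clip-end="npt=4s" begin="3s"/>
      <audio src="lecture-descriptions.mp3" clip-begin="npt=4s" clip-end="npt=9s" begin="10s"/>
    </par>
  </body>
</smil>
```

Each audio element here needs three timing values rather than one, which illustrates why separate files for each description clip are usually easier to code.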
It can take time to hand-code a SMIL file the first time round, and the process will require several rounds of reviewing timing and content and making adjustments. Subsequent files should, however, be quicker to prepare, since the original can be used as a template. The National Centre for Accessible Media (NCAM) also offers some SMIL templates for download from its web site.
SMIL can also be used to create content based on a user's settings, for example alternative-language formats for captions and audio descriptions, based on the user's system language; or alternative bandwidth versions of the same media file. SMIL is by its definition an extensible language, so player- or format-specific extensions can be written to provide additional functionality. For example, a number of QuickTime extensions exist for SMIL.
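As a rough illustration of content chosen by user settings, the SMIL 1.0 fragment below uses the switch element with the system-bitrate and system-language test attributes. The file names, bitrate threshold and timing are hypothetical, and players differ in how fully they support switch and its test attributes, so treat this as a sketch to test rather than a guaranteed recipe.

```xml
<par>
  <!-- The player uses the first child of each switch whose test attribute
       evaluates to true; a child with no test attribute acts as the default. -->
  <switch>
    <!-- Chosen only if the user's connection offers at least 300 kbit/s -->
    <video src="interview-high.mov" region="main" system-bitrate="300000"/>
    <video src="interview-low.mov" region="main"/>
  </switch>
  <switch>
    <!-- French description for users whose system language is French -->
    <audio src="description-01-fr.mp3" system-language="fr" begin="3s"/>
    <audio src="description-01-en.mp3" begin="3s"/>
  </switch>
</par>
```

This fragment would sit inside the body element of a SMIL file and assumes a region with the id "main" defined in the layout.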
Example 1 shows a portion of the SMIL code used to associate audio description files with the QuickTime video shown in Clip 1.
<!-- The xmlns:qt declaration identifies the QuickTime extension attributes used below -->
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:time-slider="true" qt:autoplay="true">
  <head>
    <meta content="" name="title"/>
    <meta content="" name="author"/>
    <meta content="" name="copyright"/>
    <layout>
      <root-layout id="frame" background-color="#336699" width="320" height="240"/>
      <region id="main" top="0" left="0" width="320" height="240"/>
    </layout>
  </head>
  <body>
    <!-- The video and its description clips play in parallel; each clip starts at an offset from the video's start -->
    <par>
      <video src="/media/audio_described1.mov" region="main" id="video"/>
      <audio src="/media/audio_described1-11.mp3" begin="id(video)(0:03)"/>
      <audio src="/media/audio_described1-12.mp3" begin="id(video)(0:10)"/>
      <audio src="/media/audio_described1-13.mp3" begin="id(video)(0:17)"/>
      <audio src="/media/audio_described1-14.mp3" begin="id(video)(0:28)"/>
      <audio src="/media/audio_described1-15.mp3" begin="id(video)(0:32)"/>
      ...
    </par>
  </body>
</smil>
Clip 1: Audio Described QuickTime Video
Length: 3 min 38 sec
Clip 2: A further example of an Audio Described QuickTime Video
Length: 4 min 8 sec
Note that SMIL 1.0 provides no way to adjust audio levels, so the audio description clips must fit in the gaps in the main soundtrack and be loud enough to be heard over it. The best way to achieve the desired relative levels is to prepare the video soundtrack at a lower level when recording, for example -6 dB relative to the audio description track (which is prepared at full level). Actual levels will, of course, need to be determined by testing for each job.
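As a rough guide to what a -6 dB figure means in practice, the standard decibel conversion for signal amplitude gives the following. The symbols are introduced here for illustration only, and -6 dB is simply the example value from the paragraph above.

```latex
\[
\frac{A_{\text{video}}}{A_{\text{description}}} = 10^{-6/20} \approx 0.50
\]
```

In other words, a video soundtrack prepared at -6 dB has about half the signal amplitude of a description track prepared at full level.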
SMIL 2.0 provides a way for authors to pause the main video while an audio description clip is playing, so that extended audio descriptions can be supplied for content that requires a longer description than the gaps in the main soundtrack allow. NCAM also describes a way in which extended descriptions can be provided using SMIL 1.0.
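One way the SMIL 2.0 pausing behaviour might be written is sketched below, using the excl and priorityClass elements. It assumes a player that implements these SMIL 2.0 elements, which QuickTime's SMIL 1.0 support does not cover; the file names, layout and begin times are hypothetical.

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <head>
    <layout>
      <root-layout backgroundColor="black" width="320" height="240"/>
      <region id="main" top="0" left="0" width="320" height="240"/>
    </layout>
  </head>
  <body>
    <!-- Only one child of an excl plays at a time. With peers="pause",
         a child that begins pauses the one currently playing, which
         resumes when the newer child ends. -->
    <excl>
      <priorityClass peers="pause">
        <video src="interview.mov" region="main" begin="0s"/>
        <audio src="extended-description-01.mp3" begin="12s"/>
        <audio src="extended-description-02.mp3" begin="40s"/>
      </priorityClass>
    </excl>
  </body>
</smil>
```

The begin offsets are measured from the start of the excl container, so it is worth checking in your player whether later offsets need to allow for the time the video has already spent paused.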
Testing
As with many W3C open technologies, support for SMIL is improving but remains inconsistent from player to player. For example, to view SMIL in QuickTime, version 4.1 or more recent is required. It's important that you carefully read the documentation on how well SMIL is supported by your media player of choice, and how it should be used.
The process of creating an audio described video using SMIL is by its nature lengthy and iterative, requiring constant reviewing of presentation and timing. Once the resource is available on the web, playing it on as many different platforms as possible is recommended. At the same time, it is well worth seeking feedback from blind and visually impaired people, on the assumption that they are unlikely to be familiar already with the content of the video.
Related Sites
- Accessibility Features of SMIL (W3C): a W3C article on how the SMIL specification supports accessible multimedia.
- Extended Audio Descriptions in SMIL 1.0 (NCAM): a tutorial from the National Centre for Accessible Media on creating and providing extended audio descriptions using SMIL 1.0.
- Introduction to HTML+TIME (Microsoft): an article outlining Microsoft's HTML+TIME, a subset of SMIL 2.0.
- SMIL 1.0 Specification (W3C)
- SMIL 2.0 Specification (W3C)
- SMIL Reference (RealNetworks): a detailed reference on SMIL from RealNetworks, aimed at developers writing SMIL for Real Player.
- SMIL Resources (NCAM): the National Centre for Accessible Media's list of resources relating to SMIL, including tutorials, examples of SMIL in action, and links to external resources.
- SMIL templates (NCAM): the National Centre for Accessible Media provides some SMIL templates for download, to help authors get started in creating SMIL files.
- SMIL Tutorial (Helio.org): an introduction to SMIL and its capabilities.
- SMIL Tutorial (WikiBooks): a general introduction to SMIL, covering what it can do and how it's created and accessed.
- Synchronised Multimedia Home Page (W3C): the World Wide Web Consortium's resource on all things relating to synchronised multimedia, including links to SMIL tutorials, references, specifications, authoring tools, and players.
Related Resources
How To
- Provide audio descriptions for video or animated content - in MAGpie
- Provide audio descriptions for video or animated content - general advice
- Provide text equivalents for audio - with Synchronised Multimedia Integration Language (SMIL)
- Provide text equivalents for audio - general advice on transcripts
- Use media to enhance text - using video