Provide text equivalents for audio - with Synchronised Multimedia Integration Language (SMIL)

Why this is important

Captions make media accessible to anyone who has difficulty hearing, or is unable to hear, the soundtrack, whenever spoken or other audio information (on- or off-screen sound effects, or background music) is important to understanding the media's content. SMIL is an open standard created by the World Wide Web Consortium (W3C), and is therefore in theory independent of any particular media player or format. It is used to create accessible media by combining caption and audio description files with media files, and by specifying how they are synchronised and displayed.

General Principles

SMIL is based on Extensible Markup Language (XML). Version 1.0 has been a W3C Recommendation since 1998; version 2.0 became a Recommendation in 2001, and a second edition of that specification was published at the beginning of 2005. A SMIL file is a simple text file so, like HTML, it can be created using a basic text editor. In practice, though, SMIL files are created more quickly and effectively using authoring software such as MAGpie.

Real Player and QuickTime both support SMIL, but Windows Media Player uses a separate technology: Synchronised Accessible Media Interchange (SAMI), which, confusingly, also uses the file extension .smi (or .sami). Microsoft have also developed HTML+TIME as a technology for synchronising embedded media within a web page. Version 2.0 of HTML+TIME is a subset of SMIL 2.0, and is currently only supported by Internet Explorer (from version 5.5).

Since all three main media players can open files with the .smi extension, the wrong media player may open a SMIL file that is linked to directly. To avoid this, provide the reference to the SMIL file in a separate player-specific file (for example .ram for Real Player, and .mov or .sml for QuickTime).

Before you continue

The advice on this page helps you avoid introducing a specific accessibility barrier, but it's not a magic formula. To avoid attempting to follow a technical solution that is not appropriate to the resource and its intended purpose, you need to know the context in which the multimedia resource is being used:

  1. The purpose or aim of the multimedia resource in question, and whether it is being used to supplement another resource in the learning environment, or whether its use is required by students.
  2. The target audience, their knowledge and expectations, and the type of browsing and assistive technology that they may be using.
  3. Whether the information and experiences provided by the multimedia technology are already available in an equivalent, alternative form.

For more background on this approach, see our Guide to the use of multimedia in accessible e-learning.

Technique Details

SMIL is a powerful technology, and an in-depth discussion of its capabilities is beyond the scope of this resource. From an accessibility perspective, the process of adding captions to a video file using SMIL involves:

  1. Creating a caption file containing individual captions, either from a pre-existing transcript or by playing the video and transcribing the spoken content plus any important non-spoken sound, and associating a timestamp with each caption.
  2. Creating a SMIL file which references the media file and caption file (and audio description files if provided). It can also define properties of the area of the screen to be used to show the media, the area of the screen to be used to show captions, and provide metadata about the clip. All assets must reside in the same folder - the digitised video, caption and audio description file(s), and SMIL file.
  3. Making the SMIL file available on a web page, using HTML. This is normally more complex than might be expected, because browsers lack native support for SMIL and it is not known in advance which media player on a user's computer will open the SMIL file. The way the SMIL file is referenced in a web page thus depends on the media player in which the resultant captioned video will open, but generally requires the creation of a media player-specific meta-file, which in turn references the SMIL file.

Example 1 shows a portion of the SMIL file used to combine a QTText file with a QuickTime video clip.

<?xml version="1.0" encoding="UTF-8"?>
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" xmlns="http://www.w3.org/TR/REC-smil" qt:time-slider="true">
<head>
    <meta content="" name="title"/>
    <meta content="" name="author"/>
    <meta content="" name="copyright"/>
    <layout>
        <root-layout width="330" height="335" background-color="black"/>
        <region top="5" width="320" height="240" left="5" background-color="black" id="videoregion"/>
        <region top="245" width="320" height="80" left="5" background-color="black" id="textregion"/>
    </layout>
</head>
<body>
    <par dur="0:01:01.00">
        <video dur="0:01:01.00" region="videoregion" src="/media/awvidq1a_220k.mov"/>
        <textstream dur="0:01:01.00" region="textregion" src="/media/awvidq1a_qt.txt"/>
    </par>
</body>
</smil>

Example 1: Sample of SMIL file.
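
Step 2 above mentions that a SMIL file can also reference audio description files. As a minimal sketch only, assuming the same SMIL 1.0 syntax as Example 1, the body of that file might be extended with an audio element that starts a description clip during a pause in the dialogue. The file name awvidq1a_desc1.mp3 and the 12-second start time are hypothetical, and support for this should be checked against your media player's documentation.

```xml
<body>
    <par dur="0:01:01.00">
        <!-- Video and captions, exactly as in Example 1 -->
        <video dur="0:01:01.00" region="videoregion" src="/media/awvidq1a_220k.mov"/>
        <textstream dur="0:01:01.00" region="textregion" src="/media/awvidq1a_qt.txt"/>
        <!-- Hypothetical audio description clip: begins 12 seconds into the
             presentation and plays in parallel with the video -->
        <audio begin="12s" src="/media/awvidq1a_desc1.mp3"/>
    </par>
</body>
```

Because all three elements sit inside the same par element, they share one timeline; further description clips could be added in the same way, each with its own begin value.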

SMIL can also be used to deliver content based on a user's settings, for example alternative-language versions of captions and audio descriptions, based on the user's system language; or alternative bandwidth versions of the same media file. SMIL is by definition an extensible language, so player- or format-specific extensions can be written to provide additional functionality. For example, a number of QuickTime extensions exist for SMIL.
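
The selection of content by user settings described above is done with the switch element, which plays the first child whose test attributes are satisfied. The fragment below is a sketch under stated assumptions: it uses the SMIL 1.0 attribute names system-bitrate and system-language (SMIL 2.0 renames them systemBitrate and systemLanguage), the region names come from Example 1, and all file names are hypothetical. How completely a given player honours these attributes varies, so treat this as illustrative rather than as a tested recipe.

```xml
<par>
    <!-- Bandwidth alternatives: list the most demanding version first and
         leave the last child untested so that it acts as the fallback -->
    <switch>
        <video region="videoregion" src="/media/clip_220k.mov" system-bitrate="220000"/>
        <video region="videoregion" src="/media/clip_56k.mov"/>
    </switch>
    <!-- Caption language alternatives: French captions when the user's
         system language is French, otherwise the default caption file -->
    <switch>
        <textstream region="textregion" src="/media/captions_fr.txt" system-language="fr"/>
        <textstream region="textregion" src="/media/captions_default.txt"/>
    </switch>
</par>
```

The order of the children matters: a player stops at the first acceptable child, so an untested default placed first would always be chosen.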

NB The W3C has also initiated a Timed Text activity, which appears to be an acknowledgement of current limitations in the ability of SMIL to deliver captions in an open, player-independent form. See Related Sites on this page for more information.

Testing

As with many W3C open technologies, support for SMIL is improving but still inconsistent from player to player. For example, to view SMIL in QuickTime, version 4.1 or later is required. It's important to read carefully the documentation on how well your media player of choice supports SMIL, and how SMIL should be used with it.

The process of creating a captioned video using SMIL is by its nature lengthy and iterative, requiring constant review of presentation and timing. Once the resource is available on the web, playing it on as many different platforms as possible is recommended. It is also well worth seeking feedback from people who are deaf or hard of hearing, ideally people who are not already familiar with the content of the video.


Related Sites

Accessibility Features of SMIL (W3C)
W3C article on how the SMIL specification supports accessible multimedia.
Introduction to HTML+TIME (Microsoft)
An article outlining Microsoft's HTML+TIME, a subset of SMIL 2.0.
SMIL 1.0 Specification (W3C)
SMIL 2.0 Specification (W3C)
SMIL Reference (RealNetworks)
A detailed reference on SMIL from RealNetworks, aimed at developers writing SMIL for Real Player.
SMIL Resources (NCAM)
The National Center for Accessible Media's list of resources relating to SMIL - tutorials, examples of SMIL in action, and links to external resources.
SMIL templates (NCAM)
The National Center for Accessible Media provides some SMIL templates for download, to help authors get started in creating SMIL files.
SMIL Tutorial (Helio.org)
An introduction to SMIL and its capabilities.
SMIL Tutorial (WikiBooks)
A general introduction to SMIL - what it can do, how it's created and accessed.
Synchronised Multimedia Home Page (W3C)
The World Wide Web Consortium's resource on all things relating to synchronised multimedia, including links to SMIL tutorials, references, specifications, authoring tools, and players.
Timed Text (W3C)
The W3C has established its Timed Text activity acknowledging certain current limitations of SMIL. According to the W3C, "The Timed-Text specification should (cover) all necessary aspects of timed text on the Web".