Aural CSS: Support for CSS 2 Aural Style Sheets / CSS 3 Speech Module

Last updated: February 3, 2008

Questions about support for aural CSS (Cascading Style Sheets) have been popping up in various corners of the Web lately, so I thought I would compile what I know as a supplementary page to my Screen Readers and Abbreviations tests.

If you find this information to be incomplete or inaccurate, please let me know so that I can update this page.

Note: Some new sources of information will be added to this page pending review. In the meantime, you may like to follow the links I have included at the bottom of the page and read for yourself.

Introduction

CSS includes 'aural' (or 'speech') properties that give Web designers and developers control over how HTML (and XML) is synthesised as speech by CSS-aware software. However, these properties enjoy very limited support in current Web browsers, screen readers and other assistive technology software where they could be of benefit.

Unfortunately, such limited support makes aural CSS next to useless in practice, and without better support from software vendors, Web designers and developers are unlikely to use it. We have something of a paradox, however, as vendors are unlikely to prioritise support for something that is not used and brings no benefit to them. Instead, current screen readers (such as JAWS) and speaking browsers (such as Home Page Reader) analyse words to determine how they should be pronounced using their own non-CSS-based algorithms.

However, even if support were better than it is, using these properties is another matter. The average Web designer or developer would still need the skill to write an appropriate and considerate aural style sheet, selecting voices and perhaps positioning them spatially. If you think about how many designers actually use print style sheets, how many might actually implement aural style sheets?
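
To give a feel for what is involved, here is a minimal sketch of an aural style sheet using CSS 2 properties. The selectors, the "initialism" class and all of the values are my own illustrative assumptions rather than anything taken from a real site, and given the level of support described on this page, few user agents will act on it.

```css
/* Illustrative only: selectors, class names and values are assumptions */
@media aural {
  /* Headings: a lower, fuller male voice, preceded by a short pause */
  h1, h2, h3 {
    voice-family: male;
    pitch: low;
    richness: 90;
    pause-before: 500ms;
  }

  /* Quotations: a different voice, positioned slightly to the listener's left */
  blockquote {
    voice-family: female;
    azimuth: center-left;
  }

  /* Hypothetical class marking abbreviations that should be read letter by letter */
  abbr.initialism {
    speak: spell-out;
  }
}
```

Even this small example involves several judgement calls (which voice, how long a pause, which abbreviations to spell out), which is the kind of skill I am referring to above.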

Aural CSS first appeared in the CSS 2 Specification, the current official W3C Recommendation for CSS. The CSS 2.1 Specification – currently a "last call" Working Draft that will become the next official W3C Recommendation – extends the specification to include a new property, but deprecates the 'aural' media type and reserves the favoured 'speech' media type. The CSS 3 Speech module reworks and replaces the 'aural' properties as specified in CSS 2, section 19 (Aural style sheets) and CSS 2.1, Appendix A (Aural style sheets). To quote some relevant sections of the CSS specifications:

“UAs are not required to implement the properties of this chapter in order to conform to CSS 2.1.”

CSS 2.1, Appendix A. Aural style sheets

And:

“We expect that in a future level of CSS there will be new properties and values defined for speech output. Therefore CSS 2.1 reserves the 'speech' media type (see chapter 7, "Media types"), but does not yet define which properties do or do not apply to it.

“The properties in this appendix apply to a media type 'aural', that was introduced in CSS 2. The type 'aural' is now deprecated.”

CSS 2.1, Appendix A. Aural style sheets, A.1 The media types 'aural' and 'speech'
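
In style sheet terms, the difference looks something like the sketch below. The first rule targets the deprecated 'aural' media type with CSS 2 properties; the second targets the reserved 'speech' media type using a property name from the CSS 3 Speech Working Draft. The heading selector and the values are purely illustrative assumptions.

```css
/* CSS 2 / CSS 2.1 Appendix A: these properties apply to the 'aural' media type */
@media aural {
  h1 {
    voice-family: female;
    volume: loud;
  }
}

/* CSS 2.1 reserves 'speech' without defining which properties apply to it;
   the CSS 3 Speech Working Draft supplies names such as voice-volume */
@media speech {
  h1 {
    voice-family: female;
    voice-volume: loud;
  }
}
```

A user agent that recognises neither media type should simply skip both blocks, so including them ought to be harmless for visual browsers.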

If you want to know more, the “Aural stylesheets” section of Joe Clark's book, Building Accessible Websites, is very informative.

Summary of Known Support

The CSS 3 Speech module is currently supported in:

Note about Fire Vox and Firefox: Firefox does not parse aural/speech CSS properties, so Fire Vox support is achieved by parsing the CSS directly.

CSS 2 Aural Style Sheets are currently supported in:

Note about Safari with VoiceOver: It has been suggested that using Safari with VoiceOver offers support for aural CSS. I have not tested this, but it seems that it is just rumour.

Note about iCab: It has also been implied that iCab should support CSS 2 Aural style sheets as it claims full CSS 2.1 support. I currently have no information to confirm support.

Note about Window-Eyes: GW Micro are quoted as having said in December 2003 that they have no plans to support aural style sheets in Window-Eyes (see addendum to Shortened forms on the Web).

References:

Details of Aural CSS Properties

The following table shows which properties are available in the different CSS specifications.

Table of Aural CSS Properties
CSS property | CSS 2 | CSS 2.1 | CSS 3
azimuth | y | y | n
cue | y | y | y
cue-after | y | y | y
cue-before | y | y | y
elevation | y | y | n
mark | n | n | y
mark-after | n | n | y
mark-before | n | n | y
pause | y | y | y
pause-after | y | y | y
pause-before | y | y | y
phonemes | n | n | y
pitch | y | y | n
pitch-range | y | y | n
play-during | y | y | n
rest | n | n | y
rest-after | n | n | y
rest-before | n | n | y
richness | y | y | n
speak | y | y | y
speak-header | n | y | n
speak-numeral | y | y | n
speak-punctuation | y | y | n
speech-rate | y | y | n
stress | y | y | n
voice-balance | n | n | y
voice-duration | n | n | y
voice-family | y | y | y
voice-pitch | n | n | y
voice-pitch-range | n | n | y
voice-rate | n | n | y
voice-stress | n | n | y
voice-volume | n | n | y
volume | y | y | n

The following table shows the current support for aural/speech CSS properties.

Key:

n = not supported
y = is supported
/ = partial support
? = not tested / level of support unknown (but unlikely if CSS 2)

Table of Known Support for Aural CSS Properties
CSS property | Opera 9 | Fire Vox | Emacspeak
azimuth | n | ? | ?
cue | y | ? | ?
cue-after | y | ? | ?
cue-before | y | ? | ?
elevation | n | ? | ?
mark | ? | ? | ?
mark-after | ? | ? | ?
mark-before | ? | ? | ?
pause | y | ? | ?
pause-after | y | ? | ?
pause-before | y | ? | ?
phonemes | y | ? | ?
pitch | n | ? | ?
pitch-range | n | ? | ?
play-during | n | ? | ?
rest | ? | ? | ?
rest-after | ? | ? | ?
rest-before | ? | ? | ?
richness | n | ? | ?
speak | y | ? | ?
speak-header | n | ? | ?
speak-numeral | n | ? | ?
speak-punctuation | n | ? | ?
speech-rate | n | ? | ?
stress | n | ? | ?
voice-balance | y | ? | ?
voice-duration | y | ? | ?
voice-family | y | ? | ?
voice-pitch | y | / | ?
voice-pitch-range | y | ? | ?
voice-rate | y | / | ?
voice-stress | y | ? | ?
voice-volume | y | / | ?
volume | n | ? | ?

CSS 2 (W3C Recommendation)

http://www.w3.org/TR/CSS2/aural.html

19 properties:

  1. azimuth
  2. cue
  3. cue-after
  4. cue-before
  5. elevation
  6. pause
  7. pause-after
  8. pause-before
  9. pitch
  10. pitch-range
  11. play-during
  12. richness
  13. speak
  14. speak-numeral
  15. speak-punctuation
  16. speech-rate
  17. stress
  18. voice-family
  19. volume

Note: The speak-date and speak-time properties were referenced in a W3C note in 1997, but never made it into a specification.

CSS 2.1 ("last call" Working Draft, 06 November 2006)

http://www.w3.org/TR/CSS21/aural.html

The new speak-header property is introduced, and the 'aural' media type is deprecated in favour of the 'speech' media type.

20 properties:

  1. azimuth
  2. cue
  3. cue-after
  4. cue-before
  5. elevation
  6. pause
  7. pause-after
  8. pause-before
  9. pitch
  10. pitch-range
  11. play-during
  12. richness
  13. speak
  14. speak-header
  15. speak-numeral
  16. speak-punctuation
  17. speech-rate
  18. stress
  19. voice-family
  20. volume
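
As a rough illustration of the one addition in this list, speak-header controls whether a table header is spoken once before a series of cells or before every relevant cell. The snippet assumes a data table marked up with th elements, and the "wide" class is hypothetical.

```css
@media aural {
  /* Speak each header one time, before a series of cells */
  th {
    speak-header: once;
  }

  /* Hypothetical class for wide tables: repeat the header before every cell */
  table.wide th {
    speak-header: always;
  }
}
```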

CSS 3 (Working Draft, 16 December 2004)

http://www.w3.org/TR/css3-speech/#property-index

22 properties:

  1. cue
  2. cue-after
  3. cue-before
  4. mark
  5. mark-after
  6. mark-before
  7. pause
  8. pause-after
  9. pause-before
  10. phonemes
  11. rest
  12. rest-after
  13. rest-before
  14. speak
  15. voice-balance
  16. voice-duration
  17. voice-family
  18. voice-pitch
  19. voice-pitch-range
  20. voice-rate
  21. voice-stress
  22. voice-volume
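
For comparison with the CSS 2 properties, here is a short sketch using some of the CSS 3 names listed above. Properties such as voice-rate, voice-pitch and voice-volume roughly correspond to the CSS 2 speech-rate, pitch and volume properties. The values follow my reading of the Working Draft and are assumptions that a given implementation may not accept; the "aside" class is hypothetical.

```css
/* Values are assumptions based on the 2004 Working Draft; class names are hypothetical */
@media speech {
  /* Headings: slower, higher voice, followed by a short rest */
  h1 {
    voice-family: female;
    voice-rate: slow;
    voice-pitch: high;
    rest-after: 400ms;
  }

  /* Asides: quieter and balanced towards the left channel */
  .aside {
    voice-volume: soft;
    voice-balance: left;
  }
}
```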

New Sources Pending Addition

There are a few pages of information I've found that I still need to read through and/or digest, but you can take a look yourself in the meantime: