Accessible tables in DocBook
This is the requested explanation of the DocBook Publishers RFE regarding the inclusion of the HTML table model in the Publishers schema.
One of the issues in making tabular data accessible to blind people using a screen reader lies in providing the appropriate markup that will allow a correct correlation of table data with its headings.
Where sighted people can intuitively assign a header placed atop a table column or leftmost in a row (or any combination thereof), these cues through visual placement are not available to blind users.
- First, I'll give a quick overview of the different mechanisms available in HTML for achieving the correlation of data with their headers,
- then I'll look at the abilities of the CALS table model by means of an example,
- closing with a comparative summary of the two table models.
Fully accessible tables in HTML output
HTML provides three mechanisms for attributing the respective headers to tabular data that provide screen readers with the necessary cues to replace visual placement with explicit verbalizations thereof.
They enable the screen reader to repeat the associated header(s) as it reads through the data, thus expressing the semantic relationships between the information provided (as opposed to reading out the table's contents linearly, which invariably results in confusing gibberish):
- Use of `th` and `td`: At its simplest level, HTML provides these two elements to distinguish between headers and data. The Techniques section of WCAG 2.0 actually recommends the bare use of these for simple tables, since some screen readers still seem to have issues in matching the correct headers with the table data via `scope` (I'm not sure how up to date this is, however).
- The `scope` attribute: It is added to table headers (i.e., `th`, which can be column or row headers alike) but is only viable for simple tables (with simple header:data relationships). A simple example to demonstrate its use:
    <table>
      <caption>Shelly's Daughters</caption>
      <tr>
        <th scope="col">Name</th>
        <th scope="col">Age</th>
        <th scope="col">Birthday</th>
      </tr>
      <tr>
        <th scope="row">Jackie</th>
        <td>5</td>
        <td>April 5</td>
      </tr>
      <tr>
        <th scope="row">Beth</th>
        <td>8</td>
        <td>January 14</td>
      </tr>
    </table>
which translates to:
Shelly's Daughters

Name | Age | Birthday |
---|---|---|
Jackie | 5 | April 5 |
Beth | 8 | January 14 |

- `id` and `headers` attributes: This duo is useful for complex tables in which a complex set of headers is to be attributed to the individual table data, and is recommended for complex tables in the WCAG 2.0 Techniques. In this case, the `id` is added to the header cells, while each data cell references the respectively relevant header `id`s via the `headers` attribute. Here is the example from above, with an added differentiation for more complexity:
    <table>
      <caption>Shelly's Daughters</caption>
      <tr>
        <td> </td>
        <th id="name">Name</th>
        <th id="age">Age</th>
        <th id="birthday">Birthday</th>
      </tr>
      <tr>
        <th rowspan="2" id="birth">daughters by birth</th>
        <th id="jackie" headers="birth name">Jackie</th>
        <td headers="birth jackie age">5</td>
        <td headers="birth jackie birthday">April 5</td>
      </tr>
      <tr>
        <th id="beth" headers="birth name">Beth</th>
        <td headers="birth beth age">8</td>
        <td headers="birth beth birthday">January 14</td>
      </tr>
      <tr>
        <th id="step">daughters by marriage</th>
        <th id="jenny" headers="step name">Jenny</th>
        <td headers="step jenny age">12</td>
        <td headers="step jenny birthday">Feb 12</td>
      </tr>
    </table>
which translates to:
Shelly's Daughters

| | Name | Age | Birthday |
|---|---|---|---|
| daughters by birth | Jackie | 5 | April 5 |
| | Beth | 8 | January 14 |
| daughters by marriage | Jenny | 12 | Feb 12 |
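To make the `id`/`headers` mechanism more tangible, here is a minimal Python sketch that resolves each data cell's `headers` references into the header texts they point to, roughly what a screen reader could then verbalize. It is only an illustration: the function name `resolve_headers` is made up for this example, the table is a copy of the markup above, and real assistive technology follows its own, more elaborate logic.

```python
import xml.etree.ElementTree as ET

# Copy of the complex example table (well-formed, so ElementTree can parse it).
TABLE = """
<table>
  <caption>Shelly's Daughters</caption>
  <tr>
    <td></td>
    <th id="name">Name</th>
    <th id="age">Age</th>
    <th id="birthday">Birthday</th>
  </tr>
  <tr>
    <th rowspan="2" id="birth">daughters by birth</th>
    <th id="jackie" headers="birth name">Jackie</th>
    <td headers="birth jackie age">5</td>
    <td headers="birth jackie birthday">April 5</td>
  </tr>
  <tr>
    <th id="beth" headers="birth name">Beth</th>
    <td headers="birth beth age">8</td>
    <td headers="birth beth birthday">January 14</td>
  </tr>
  <tr>
    <th id="step">daughters by marriage</th>
    <th id="jenny" headers="step name">Jenny</th>
    <td headers="step jenny age">12</td>
    <td headers="step jenny birthday">Feb 12</td>
  </tr>
</table>
"""


def resolve_headers(table_xml):
    """Return (header texts, cell text) for every td carrying a headers attribute."""
    root = ET.fromstring(table_xml)
    # Map each id to the text of the cell that carries it.
    text_by_id = {
        cell.get("id"): (cell.text or "").strip()
        for cell in root.iter()
        if cell.get("id") is not None
    }
    resolved = []
    for cell in root.iter("td"):
        refs = cell.get("headers", "").split()
        if refs:
            # Look up the referenced headers in the order the author listed them.
            headers = [text_by_id[ref] for ref in refs]
            resolved.append((headers, (cell.text or "").strip()))
    return resolved


for headers, value in resolve_headers(TABLE):
    print(", ".join(headers) + ": " + value)
```

For the cell containing "8", this prints `daughters by birth, Beth, Age: 8`: the full chain of headers is recovered from the markup alone, without any reliance on visual placement.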
None of these mechanisms is a one-size-fits-all solution: which one to use must be decided case by case, and their usefulness may change over time depending on screen reader developments.
The capabilities of the CALS table model
With regard to DocBook tables, I found that the CALS table model does not provide sufficient "hooks" that an XSL transformation could use to output, case by case, the best possible HTML for making table data understandable via screen readers.
Here is the example table from above in CALS markup:
    <table frame='all' rowheader='firstcol'>
      <title>'Shelly's Daughters' in CALS Markup</title>
      <tgroup cols='4'>
        <colspec colname='provenience'/>
        <colspec colname='Name'/>
        <colspec colname='Age'/>
        <colspec colname='Birthday'/>
        <thead>
          <row>
            <entry> </entry>
            <entry>Name</entry>
            <entry>Age</entry>
            <entry>Birthday</entry>
          </row>
        </thead>
        <tbody>
          <row>
            <entry morerows="1" xml:id="Natural">Daughters by birth</entry>
            <entry xml:id="Jackie">Jackie</entry>
            <entry>5</entry>
            <entry>April 5</entry>
          </row>
          <row>
            <entry xml:id="Beth">Beth</entry>
            <entry>8</entry>
            <entry>January 14</entry>
          </row>
          <row>
            <entry xml:id="Step">Daughters by marriage</entry>
            <entry xml:id="Jenny">Jenny</entry>
            <entry>12</entry>
            <entry>Feb 12</entry>
          </row>
        </tbody>
      </tgroup>
    </table>
CALS does allow for the implicit establishment of simple semantic relationships via `thead` (column headers) and the `rowheader` attribute (here the shortcomings begin: only for the first column). For very simple tables, this would allow XSL transformation of the `thead` `entry`s and those of the first column into the `th`s needed to express their "header" nature in HTML.
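The simple case can be sketched in a few lines. The following uses Python instead of XSL purely for illustration, and rests on several assumptions: the input is a span-free CALS table without the DocBook namespace, `rowheader` is set to `firstcol`, and the function name `cals_to_html` is invented for this example. It is not how the DocBook stylesheets do it.

```python
import xml.etree.ElementTree as ET

# A deliberately simple CALS table: one header row plus rowheader="firstcol".
CALS = """
<table rowheader="firstcol">
  <title>Shelly's Daughters</title>
  <tgroup cols="3">
    <thead>
      <row><entry>Name</entry><entry>Age</entry><entry>Birthday</entry></row>
    </thead>
    <tbody>
      <row><entry>Jackie</entry><entry>5</entry><entry>April 5</entry></row>
      <row><entry>Beth</entry><entry>8</entry><entry>January 14</entry></row>
    </tbody>
  </tgroup>
</table>
"""


def cals_to_html(cals_xml):
    """Convert a simple, span-free CALS table into an HTML table element."""
    cals = ET.fromstring(cals_xml)
    first_col_is_header = cals.get("rowheader") == "firstcol"
    html = ET.Element("table")
    title = cals.find("title")
    if title is not None:
        ET.SubElement(html, "caption").text = title.text

    # Every entry inside thead becomes a column header.
    for row in cals.findall("tgroup/thead/row"):
        tr = ET.SubElement(html, "tr")
        for entry in row.findall("entry"):
            ET.SubElement(tr, "th", scope="col").text = entry.text

    # In tbody, only the first entry of a row can become a row header.
    for row in cals.findall("tgroup/tbody/row"):
        tr = ET.SubElement(html, "tr")
        for index, entry in enumerate(row.findall("entry")):
            if index == 0 and first_col_is_header:
                cell = ET.SubElement(tr, "th", scope="row")
            else:
                cell = ET.SubElement(tr, "td")
            cell.text = entry.text
    return html


html_table = cals_to_html(CALS)
print(ET.tostring(html_table, encoding="unicode"))
```

The sketch works only because every header here is either in `thead` or in the first column. Nothing in the input would tell it that a second-column entry is a header too.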
However, it stops there. As soon as relationships become more complex, either by two-level asymmetric header relationships as in the example above, or because of spanned rows or columns, CALS does not offer the means to express these. Regarding the example:
- there is no implicit way of identifying the second column's contents as row (sub-)headers (along the lines of `rowheader` or the possibility of including more than one row of column headers within the `thead`);
- header `entry`s can be unambiguously labelled via `xml:id`s, or perhaps even (partially) via `colname` (whereby a uniform method would surely be preferable?),
- but there is no way in CALS to associate them to one another (e.g. that "Jackie" and "Beth" are both "Daughters by birth": a screen reader typically would have problems especially with the second if there is no explicit pointer in place), except perhaps by constructing a hack around `morerows`...?
- And there is no way to reference the headers from the data `entry`s (e.g. that "Daughter by birth", "named" "Beth", is "Age" "8"). This approach of explicit attribution becomes necessary as soon as relationships are non-linear to keep the meaning of complex data clear. Sighted readers intuit these by means of visual cues, but screen readers only see code, which can quickly become ambiguous with increasing complexity.
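To give an idea of what a hack around `morerows` might look like, here is a speculative Python sketch. It guesses group membership from the row span alone. Everything about it is an assumption made for illustration: the function name `group_rows_by_span`, the rule that a first-column entry is a group header, and the rule that the next entry in the row is the sub-header. None of this is expressed by CALS itself, which is exactly the problem.

```python
import xml.etree.ElementTree as ET

# tbody of the CALS example above (xml:id attributes omitted for brevity).
TBODY = """
<tbody>
  <row>
    <entry morerows="1">Daughters by birth</entry>
    <entry>Jackie</entry><entry>5</entry><entry>April 5</entry>
  </row>
  <row>
    <entry>Beth</entry><entry>8</entry><entry>January 14</entry>
  </row>
  <row>
    <entry>Daughters by marriage</entry>
    <entry>Jenny</entry><entry>12</entry><entry>Feb 12</entry>
  </row>
</tbody>
"""


def group_rows_by_span(tbody_xml):
    """Guess, from morerows alone, which group header each row belongs to."""
    tbody = ET.fromstring(tbody_xml)
    pairs = []      # one (group header, row sub-header) tuple per row
    group = None    # text of the first-column entry currently in effect
    rows_left = 0   # further rows still covered by that entry's span
    for row in tbody.findall("row"):
        entries = row.findall("entry")
        if rows_left > 0:
            # The first column is occupied by the span from an earlier row,
            # so this row's first entry is already the sub-header.
            rows_left -= 1
            sub_header = entries[0].text
        else:
            # Assumption: a first-column entry starts a new group.
            group = entries[0].text
            rows_left = int(entries[0].get("morerows", "0"))
            sub_header = entries[1].text
        pairs.append((group, sub_header))
    return pairs


print(group_rows_by_span(TBODY))
```

For the example this associates "Beth" with "Daughters by birth", but only by reading meaning into a presentational attribute. The guess fails as soon as the first column holds ordinary data or spans occur elsewhere in the table.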
In going through the many options the CALS model offers for table markup, it becomes clear that these (especially also those available to the `entry` element) predominantly target visual representation: size, spans, positioning, etc. CALS does not provide for the expression of semantic relationships between tabular data.
Providing table accessibility via DocBook
In a nutshell: CALS shows limitations regarding the semantic attribution of headers to complex data.
Requirement | CALS | HTML |
---|---|---|
Column headers | marked via `thead`/`row`/`entry`; if `scope` is needed, XSL customization necessary? | native `th` and `scope` support |
Row headers | marked via attribute `rowheader` (only first column); if `scope` is needed, XSL customization necessary? | native `th` and `scope` support |
Multiple header attributions | perhaps by customization??? | native `id` and `headers` support |
So, although CALS does include the possibility of indicating which table cells are headers, this only covers the simplest of cases. In all other cases (such as the 'Overview' table above with its row-spanned cell), CALS does not of itself provide the necessary structures.
In the end, one is faced with the alternative of expanding or at least hacking the CALS model - or simply turning to the HTML model, which already comes with all the required features and possibilities of choice built-in and ready-for-use.
An easy decision, I would think :)
Thus my request to add the HTML table model to the DocBook Publishers schema, considering that the Publisher's realm does cover types of publications that will contain complex tables too.
Nathalie Sequeira
2012-04-18, added CALS example 2012-06-19
Implementation Update
2015
In the course of dealing with this RFE, the DocBook Publishers Committee decided in favour of extending the DocBook CALS table model to accommodate accessibility needs (for details, see the 2013 update to the RFE).
The changes are currently being implemented in DocBook 5.1 and being integrated into the CALS table model itself, thus making the newly added mechanisms for CALS table accessibility available to a wider user base.
Many thanks to Scott Hudson who championed the changes in the DocBook and Oasis CALS committees!