Accessible tables in DocBook

This is the requested explanation of my RFE (request for enhancement) regarding the inclusion of the HTML table model in the DocBook Publishers schema.

One of the issues in making tabular data accessible to blind people using a screen reader lies in providing the appropriate markup that will allow a correct correlation of table data with its headings.
Where sighted people intuitively associate a cell with the header placed atop its column or leftmost in its row (or any combination thereof), these visual placement cues are not available to blind users.

Fully accessible tables in HTML output

HTML provides three mechanisms for associating data cells with their headers, giving screen readers the cues they need to replace visual placement with explicit verbalization.
They enable the screen reader to repeat the associated header(s) as it reads through the data, thus expressing the semantic relationships within the information (as opposed to reading out the table's contents linearly, which invariably results in confusing gibberish).

None of these mechanisms is a one-size-fits-all solution: which one to use must be decided case by case, and their usefulness may change over time as screen readers develop.
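
As an illustration, the "Shelly's Daughters" example that appears further on in CALS markup could be written in HTML roughly as follows. This is only a sketch: it assumes that the three mechanisms meant here are the th element, the scope attribute, and the id/headers attribute pair. The column ids (name, age, birthday) are hypothetical; the other ids reuse the xml:id values of the CALS example.

```xml
<table>
  <caption>Shelly's Daughters</caption>
  <thead>
    <tr>
      <td></td>
      <!-- th marks a header cell; scope="col" says it heads its column -->
      <th id="name" scope="col">Name</th>
      <th id="age" scope="col">Age</th>
      <th id="birthday" scope="col">Birthday</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <!-- id gives this header a name that other cells can point to -->
      <th id="Natural" rowspan="2">Daughters by birth</th>
      <th id="Jackie" headers="Natural name">Jackie</th>
      <!-- headers lists every header that applies to this data cell -->
      <td headers="Natural Jackie age">5</td>
      <td headers="Natural Jackie birthday">April 5</td>
    </tr>
    <tr>
      <th id="Beth" headers="Natural name">Beth</th>
      <td headers="Natural Beth age">8</td>
      <td headers="Natural Beth birthday">January 14</td>
    </tr>
    <tr>
      <th id="Step">Daughters by marriage</th>
      <th id="Jenny" headers="Step name">Jenny</th>
      <td headers="Step Jenny age">12</td>
      <td headers="Step Jenny birthday">Feb 12</td>
    </tr>
  </tbody>
</table>
```

With markup of this kind, a screen reader has what it needs to announce something like "Daughters by birth, Jackie, Age: 5" instead of reading the cells in a flat sequence.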

The capabilities of the CALS table model

With regard to DocBook tables, I found that the CALS table model does not provide sufficient "hooks" that an XSL transformation could use to output, case by case, the best possible HTML for conveying table data via screen readers.

Here is the example table "Shelly's Daughters" in CALS markup:

<table frame='all' rowheader='firstcol'>
    <title>'Shelly's Daughters' in CALS Markup</title>
    <tgroup cols='4'>
        <colspec colname='provenience'/>
        <colspec colname='Name'/>
        <colspec colname='Age'/>
        <colspec colname='Birthday'/>
        <thead>
            <row>
                <entry> </entry>
                <entry>Name</entry>
                <entry>Age</entry>
                <entry>Birthday</entry>
            </row>
        </thead>
        <tbody>
            <row>
                <entry morerows="1" xml:id="Natural">Daughters by birth</entry>
                <entry xml:id="Jackie">Jackie</entry>
                <entry>5</entry>
                <entry>April 5</entry>
            </row>
            <row>
                <entry xml:id="Beth">Beth</entry>
                <entry>8</entry>
                <entry>January 14</entry>
            </row>
            <row>
                <entry xml:id="Step">Daughters by marriage</entry>
                <entry xml:id="Jenny">Jenny</entry>
                <entry>12</entry>
                <entry>Feb 12</entry>
            </row>
        </tbody>
    </tgroup>
</table>

CALS does allow simple semantic relationships to be established implicitly via thead (column headers) and the rowheader attribute (and here the shortcomings begin: it covers only the first column). For very simple tables, this would allow an XSL transformation to turn the thead entries and the first-column entries into the th elements needed to express their "header" nature in HTML.
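
For such a very simple table, the transformation could look roughly like the following XSLT 1.0 sketch. It is not how the DocBook XSL stylesheets actually do it; it assumes that the input is in the DocBook 5 namespace (the xml:id attributes in the example suggest DocBook 5, but the fragment shows no namespace declaration), and it ignores colspec, tfoot, and spans altogether.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="http://docbook.org/ns/docbook"
    exclude-result-prefixes="d">

  <xsl:output method="html" indent="yes"/>

  <!-- Structural mapping: table, caption, thead, tbody, tr -->
  <xsl:template match="d:table">
    <table>
      <xsl:apply-templates select="d:title"/>
      <xsl:apply-templates select="d:tgroup/d:thead | d:tgroup/d:tbody"/>
    </table>
  </xsl:template>

  <xsl:template match="d:table/d:title">
    <caption><xsl:apply-templates/></caption>
  </xsl:template>

  <xsl:template match="d:thead">
    <thead><xsl:apply-templates select="d:row"/></thead>
  </xsl:template>

  <xsl:template match="d:tbody">
    <tbody><xsl:apply-templates select="d:row"/></tbody>
  </xsl:template>

  <xsl:template match="d:row">
    <tr><xsl:apply-templates select="d:entry"/></tr>
  </xsl:template>

  <!-- Column headers: every entry inside thead becomes a th for its column -->
  <xsl:template match="d:thead/d:row/d:entry">
    <th scope="col"><xsl:apply-templates/></th>
  </xsl:template>

  <!-- Row headers: the first entry of each body row, but only when the
       table carries rowheader="firstcol" -->
  <xsl:template match="d:tbody/d:row/d:entry[1][ancestor::d:table/@rowheader = 'firstcol']">
    <th scope="row"><xsl:apply-templates/></th>
  </xsl:template>

  <!-- Everything else is a plain data cell -->
  <xsl:template match="d:entry">
    <td><xsl:apply-templates/></td>
  </xsl:template>

</xsl:stylesheet>
```

Note that the row-header template is purely positional: it takes the first entry in each row. In a row whose first column is already occupied by a cell spanning down from the row above (as in the "Beth" row of the example), it would pick the wrong cell, which is one reason this approach only fits simple tables.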

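However, it stops there. As soon as relationships become more complex, whether through two-level asymmetric header relationships as in the example above or through spanned rows or columns, CALS does not offer the means to express them. In the example, a data cell such as Jackie's age depends on three headers ("Age", "Jackie", and "Daughters by birth"), yet the names sit in the second column, which rowheader does not cover, and an entry has no attribute for referencing the headers that apply to it.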

Going through the many options the CALS model offers for table markup, it becomes clear that they (including those available on the entry element) predominantly target visual representation: size, spans, positioning, etc. CALS does not provide for the expression of semantic relationships between tabular data.

Providing table accessibility via DocBook

In a nutshell: CALS shows limitations regarding the semantic attribution of headers to complex data.

Overview of CALS capabilities with regard to HTML requirements for table accessibility

| Requirement | CALS | HTML |
|---|---|---|
| Column headers | marked via thead/row/entry; if scope is needed, XSL customization necessary? | native th and scope support |
| Row headers | marked via attribute rowheader (only first column); if scope is needed, XSL customization necessary? | (same cell as above, spanning both rows) |
| Multiple header attributions | perhaps by customization? Possibly: mark up entries containing header information with xml:id. But: there is no entry attribute that can be used to reference applicable headers? | native id and headers support |

So, although CALS does include the possibility of indicating which table cells are headers, this only covers the simplest of cases. In all other cases (such as the 'Overview' table above with its row-spanned cell), CALS does not of itself provide the necessary structures.

In the end, one is faced with the alternative of expanding or at least hacking the CALS model - or simply turning to the HTML model, which already comes with all the required features and possibilities of choice built-in and ready-for-use.
An easy decision, I would think :)

Hence my request to add the HTML table model to the DocBook Publishers schema: the publishers' realm does cover types of publications that will contain complex tables too.

Nathalie Sequeira
2012-04-18, added CALS example 2012-06-19