Against semantic markup

@siracusa tweeted this a little while ago:

Hypercritical #33 correction: <em> and <strong> … not <b> and <i>. Apologies to @gruber and semantic markup sticklers.

(Hypercritical is a podcast he does with Dan Benjamin at 5×5.com; go listen, but there’s nothing in the podcast relevant to what follows here.)

The idea here is that the <b> and <i> tags (bold and italic) are typographical, or display, instructions, and as such should be left up to the page designer. We should supply semantic markup instead, giving the designer enough information about what we mean that an italic or bold typeface can be chosen as appropriate. The semantic tags in question are <em> and <strong>, which HTML5 defines as “stress emphasis” and “strong importance”. <strong> can be nested to indicate stronger and stronger importance.

This kind of semantic markup is fine in its place, but HTML isn’t the place to enforce it. A sufficient reason is that HTML doesn’t have a rich enough set of tags to do the work. The APA Style Manual lists seven reasons to use italics:

  • Titles of books, periodicals, and microfilm publications
  • Genera, species and varieties
  • Introduction of a new, technical, or key term
  • Emphasis
  • A letter, word, or phrase referred to as such
  • Letters used as statistical symbols or algebraic variables
  • Anchors of a scale

Sure, “emphasis” is on the list…along with six others that HTML has no tag for. And that’s not an exhaustive list.

One of the WordPress themes I use oddly inverts the representation of em/strong from i/b to b/i. It must have seemed like a good idea to someone at some time, but the only way I could use it on my site was to “fix” the CSS, which fortunately I was in a position to do. The thing is, there’s nothing technically wrong with doing that: “emphasis” is nowhere defined as “italics”.
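For the curious, here is a minimal sketch of what such an inversion and its “fix” might look like. The rules below are my assumption about the general shape of the thing, not the theme’s actual stylesheet:

```html
<style>
  /* Hypothetical theme rules: em rendered bold, strong rendered italic */
  em     { font-style: normal; font-weight: bold; }
  strong { font-weight: normal; font-style: italic; }

  /* Later override restoring the conventional rendering.
     Same specificity, so the rules that come last win. */
  em     { font-style: italic; font-weight: normal; }
  strong { font-weight: bold; font-style: normal; }
</style>

<p>This is <em>emphasized</em> and this is <strong>important</strong>.</p>
```

Both sets of rules are perfectly valid CSS, which is the point: the markup says “emphasis”, and the stylesheet decides what that looks like.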

So (except for cases where you’ve already taken care of things via CSS and classes), if you want italics, go ahead and use <i>. Ditto <b> for bold. And don’t apologize for it.
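As a rough illustration of the “CSS and classes” route next to the plain <i> route, here is a small sketch. The class names are invented for the example; nothing standardizes them:

```html
<style>
  /* Hypothetical classes, one per reason to italicize */
  .book-title,
  .species,
  .new-term { font-style: italic; }
</style>

<!-- With classes: the reason for the italics is recorded in the markup -->
<p>
  In <span class="book-title">Moby-Dick</span> the whale is a sperm whale,
  <span class="species">Physeter macrocephalus</span>.
</p>

<!-- Without classes: plain <i> renders the same way -->
<p>
  In <i>Moby-Dick</i> the whale is a sperm whale,
  <i>Physeter macrocephalus</i>.
</p>
```

The two paragraphs render identically in a default browser; the first merely carries extra information that a stylesheet or script could use, if anyone ever writes one that cares.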

And now for a slight digression. HTML5 adds a bunch of new “semantic tags”, like <header> and <section>. Notice that “semantics” ends up referring to at least two rather distinct categories. The new HTML5 tags describe document structure, a kind of containerization where the container names aren’t all “div”. But the kind of semantic reference we’re talking about in the list of reasons to italicize above has nothing to do with document structure; it has to do with the connection between the pieces of the document and the great outside world: movie names, species, name-vs-use.

I mention this as an introduction to an oldish essay by John Allsopp, Semantics in HTML5. It’s the kind of thing that’s just as well to keep in the back of your mind when you start creating The Semantic Web.

Oh, the title. I’m not against semantic markup. Really. Just against using em/strong as fancified ways of saying italic/bold and then calling it “semantic markup”.
