David Baron's Weblog

Semantic markup, browsers, and identity in the DOM

Friday, 2020-02-21, 12:32 -0800

HTML was initially designed as a semantic markup language, with elements having semantics (meaning) describing general roles within a document. The set of semantic elements has grown over time. Markup as it is used on the Web is often criticized for not following the semantics, but rather being a soup of divs and spans, the most generic sorts of elements. The Web has also evolved over the last 25 years from a web of documents to a web where many of the most visited pages are really applications rather than documents. The HTML markup used on the Web is a representation of a tree structure, and the user interface of these web applications is often based on dynamic changes made through the DOM, which is what we call both the live representation of that tree structure and the API through which that representation is accessed.

Browsers exist as tools for users to browse the Web; they strike a balance between showing the content as its author intended and adapting that content to the device it is being displayed on and to the preferences or needs of the user.

Given the unreliable use of semantics on the Web, most of the ways browsers adapt content to the user do not depend deeply on semantics, although some of them (such as reader mode) do have significant dependencies. However, browser adaptations of content or interventions that browsers make on behalf of the user very frequently depend on persistent object identity in the DOM. That is, nodes in the DOM tree (such as sections of the page, or paragraphs) have an identity over the lifetime of the page, and many things that browsers do depend on that identity being consistent over time. For example, exposing the page to a screen reader, scroll anchoring, and I think some aspects of ad blocking all depend on the idea that there are elements in the web page whose identity the browser understands over time.

This might seem like it's not a very interesting observation. However, I believe it's important in the context of frameworks, like React, that use a programming model (which many developers find easier) where the developer writes code to map application state to user interface rather than having to worry about constantly altering the DOM to match the current state. These frameworks have an expensive step where they have to map the generated virtual DOM into a minimal set of changes to the real DOM. It is well known that it's important for performance for this set of changes to be minimal, since making fewer changes to the DOM results in the browser doing less work to render the updated page. However, this process is also important for the site to be a true part of the Web: this reconciliation keeps the identity of unchanged DOM nodes intact, and that persistent identity is what lets the browser properly adapt the page to the device and to the user's needs.
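To make the point about identity concrete, here is a minimal sketch in Python rather than in any real framework's code. It is an illustration under stated assumptions: `Node` is a hypothetical stand-in for a DOM element, Python object identity stands in for DOM node identity, and `rebuild`, `reconcile`, and `preserved` are names I made up for this example. Real frameworks use considerably more involved algorithms; this only shows the difference between throwing the tree away and reusing nodes that survive an update.

```python
class Node:
    """Hypothetical stand-in for a DOM element.

    Python object identity (the `is` operator) models DOM node identity.
    """

    def __init__(self, key, text):
        self.key = key    # stable key chosen by the application
        self.text = text  # the node's current content


def rebuild(old_nodes, new_state):
    """Naive update: discard every old node and create fresh ones."""
    return [Node(key, text) for key, text in new_state]


def reconcile(old_nodes, new_state):
    """Keyed update: reuse the existing node for every key that survives."""
    by_key = {node.key: node for node in old_nodes}
    result = []
    for key, text in new_state:
        node = by_key.get(key)
        if node is None:
            node = Node(key, text)   # genuinely new content
        elif node.text != text:
            node.text = text         # same node, updated in place
        result.append(node)
    return result


def preserved(old_nodes, new_nodes):
    """Count nodes in new_nodes that are the very same objects as before."""
    return sum(1 for n in new_nodes if any(n is o for o in old_nodes))


old = [Node("a", "Intro"), Node("b", "Body"), Node("c", "Footer")]
new_state = [
    ("a", "Intro"),
    ("x", "New section"),
    ("b", "Body (edited)"),
    ("c", "Footer"),
]

print(preserved(old, rebuild(old, new_state)))    # prints 0
print(preserved(old, reconcile(old, new_state)))  # prints 3
```

Both updates produce the same visible content, but only the keyed one leaves the three surviving nodes as the same objects. In a browser, those surviving nodes are what something like scroll anchoring or a screen reader could keep tracking across the update.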