Module xml


HTML and XML parsers that produce an XmlTree.

Two parsers are provided, each suited to a different use case:

  • XmlParser — a hand-rolled recursive-descent parser. Node offsets are exact byte positions of each token in the source string. Use this wherever reading positions need to be persisted to disk (EPUB spine chapters, standalone HTML files).

  • parse_html5 — a thin wrapper around html5ever. Handles entities, void elements, and the full HTML5 error-recovery algorithm. Node offsets are synthetic (a monotonically increasing counter, not source positions). Use this for ephemeral rendering where offset precision is not required (e.g. the dictionary view).
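The difference between the two offset schemes can be illustrated with a small standalone sketch. It does not use this module's API: `exact_tag_offsets` and `synthetic_tag_offsets` are hypothetical helpers written only for this illustration. The first records the byte position of each `<` in the source, the kind of offset that can be stored and later used to index back into the same string. The second hands out a monotonically increasing counter, which preserves document order but carries no position information.

```rust
/// Hypothetical helper (not part of this module): the byte offset of every
/// '<' in `source`. Each value is a real position in the source string.
fn exact_tag_offsets(source: &str) -> Vec<usize> {
    source
        .char_indices()
        .filter(|&(_, c)| c == '<')
        .map(|(byte_index, _)| byte_index)
        .collect()
}

/// Hypothetical helper (not part of this module): one synthetic offset per
/// '<', taken from a counter. Order is preserved, source position is not.
fn synthetic_tag_offsets(source: &str) -> Vec<usize> {
    source
        .chars()
        .filter(|&c| c == '<')
        .enumerate()
        .map(|(counter, _)| counter)
        .collect()
}
```

For the input `<p>é</p>`, the exact offsets are `[0, 5]` because `é` occupies two bytes in UTF-8, and slicing the source at byte 5 yields `</p>`. The synthetic offsets are `[0, 1]`, which is enough to order nodes but cannot be used to seek back into the source.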

Structs§

Html5Sink 🔒
TreeSink implementation that bridges html5ever’s push-based API into an XmlTree.
XmlParser
Hand-rolled recursive-descent parser for XML and basic HTML documents.

Traits§

XmlExt
Extension trait that adds XML whitespace detection to char.
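As a rough sketch of what such an extension trait might look like (the trait name `XmlWhitespaceSketch` and the method name `is_xml_whitespace` are assumptions for illustration, not the actual items of this module): XML 1.0 defines whitespace as exactly four characters, namely space, tab, line feed, and carriage return. That set is narrower than `char::is_whitespace`, which also accepts Unicode spaces such as U+00A0, and narrower than `char::is_ascii_whitespace`, which also accepts form feed.

```rust
/// Hypothetical extension trait; the real XmlExt may differ in its method
/// names and signatures.
trait XmlWhitespaceSketch {
    /// Returns true for the four characters matched by the XML 1.0 `S`
    /// production: space, tab, line feed, carriage return.
    fn is_xml_whitespace(&self) -> bool;
}

impl XmlWhitespaceSketch for char {
    fn is_xml_whitespace(&self) -> bool {
        matches!(*self, ' ' | '\t' | '\n' | '\r')
    }
}
```

With the trait in scope, `'\t'.is_xml_whitespace()` is true, while `'\u{00A0}'.is_xml_whitespace()` and `'\u{000C}'.is_xml_whitespace()` are both false.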

Functions§

parse_html5
Parses input as HTML using the html5ever spec-compliant parser and returns the resulting XmlTree.