Html5Sink

Struct Html5Sink 

Source
struct Html5Sink {
    tree: RefCell<XmlTree>,
    qual_names: RefCell<FxHashMap<NodeId, QualName>>,
    template_contents: RefCell<FxHashMap<NodeId, NodeId>>,
    offset_counter: RefCell<usize>,
}
A TreeSink implementation that bridges html5ever’s push-based API into an XmlTree.

Node offsets are assigned from a monotonically increasing counter rather than from source byte positions, because html5ever’s TreeSink callbacks do not receive source positions. The counter advances by 1 per element and by text.len() per text chunk, preserving the non-overlap invariant needed by the layout engine’s page-finding binary search.
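The offset scheme described above can be sketched with the standard library alone. This is a minimal illustration rather than the crate's code: the Node enum, assign_offsets, and node_at are hypothetical names, and the binary search merely stands in for the layout engine's page-finding lookup, which is not shown on this page.

```rust
/// Hypothetical node kinds; the real XmlTree node type is not shown here.
enum Node {
    Element,
    Text(String),
}

/// Assigns each node a start offset: 1 per element, `text.len()` per text
/// chunk (at least 1, so that an empty chunk still gets its own slot).
fn assign_offsets(nodes: &[Node]) -> Vec<usize> {
    let mut next = 0;
    let mut starts = Vec::with_capacity(nodes.len());
    for node in nodes {
        starts.push(next);
        next += match node {
            Node::Element => 1,
            Node::Text(text) => text.len().max(1),
        };
    }
    starts
}

/// Returns the index of the last node that starts at or before `target`.
/// Relies on `starts` being strictly increasing, i.e. on ranges not overlapping.
fn node_at(starts: &[usize], target: usize) -> Option<usize> {
    match starts.binary_search(&target) {
        Ok(index) => Some(index),
        Err(0) => None,
        Err(index) => Some(index - 1),
    }
}
```

Because every node advances the counter by at least 1, the start offsets are strictly increasing, which is the precondition binary_search needs.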

Fields§

§tree: RefCell<XmlTree>

The tree being built. RefCell is required because the TreeSink methods take &self yet several of them must mutate the tree, and Rust’s borrow checker cannot see that html5ever calls them non-concurrently.
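The interior-mutability pattern this field relies on might look like the sketch below. MiniSink and its Vec-backed "tree" are hypothetical stand-ins; only the use of RefCell behind &self methods mirrors the struct documented here.

```rust
use std::cell::RefCell;

/// Hypothetical miniature sink: the "tree" is just a list of node labels.
struct MiniSink {
    tree: RefCell<Vec<String>>,
}

impl MiniSink {
    fn new() -> Self {
        MiniSink { tree: RefCell::new(Vec::new()) }
    }

    /// Takes `&self`, like the TreeSink callbacks, yet still mutates the tree.
    /// The mutable borrow is checked at run time and released on return.
    fn create_node(&self, label: &str) -> usize {
        let mut tree = self.tree.borrow_mut();
        tree.push(label.to_string());
        tree.len() - 1
    }

    /// Consumes the sink and unwraps the finished tree without a runtime check.
    fn finish(self) -> Vec<String> {
        self.tree.into_inner()
    }
}
```

RefCell panics if two borrows overlap, so each method in this sketch keeps its borrow local and drops it before returning.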

§qual_names: RefCell<FxHashMap<NodeId, QualName>>

Maps each element NodeId to its fully-qualified name so that elem_name can return a borrowed reference as required by the trait.
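A rough sketch of how a lookup can hand out a borrowed name from a RefCell-guarded map, using std::cell::Ref::map. The Names type is hypothetical, and usize, String, and HashMap stand in for NodeId, QualName, and FxHashMap.

```rust
use std::cell::{Ref, RefCell};
use std::collections::HashMap;

/// Hypothetical stand-in for the sink's name table.
struct Names {
    qual_names: RefCell<HashMap<usize, String>>,
}

impl Names {
    fn new() -> Self {
        Names { qual_names: RefCell::new(HashMap::new()) }
    }

    fn register(&self, id: usize, name: &str) {
        self.qual_names.borrow_mut().insert(id, name.to_string());
    }

    /// Narrows the borrow of the whole map down to a single entry.
    /// Panics if `id` was never registered (an assumption of this sketch).
    fn elem_name(&self, id: usize) -> Ref<'_, String> {
        Ref::map(self.qual_names.borrow(), |names| {
            names.get(&id).expect("id does not refer to an element")
        })
    }
}
```

The returned Ref keeps the map immutably borrowed for as long as it lives, so callers in this sketch must drop it before registering further names.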

§template_contents: RefCell<FxHashMap<NodeId, NodeId>>

Maps <template> element NodeIds to their associated content root NodeId, as required by the HTML5 template element spec.

§offset_counter: RefCell<usize>

Synthetic position counter. Advanced by at least 1 for every node created so that all offsets are unique and ordered by document position.

Implementations§

Source§

impl Html5Sink

Source

fn new() -> Self

Creates a new sink with an empty tree and a zeroed offset counter.

Source

fn next_offset(&self) -> usize

Returns the current value of the offset counter without advancing it.

Source

fn advance_offset(&self, by: usize)

Advances the offset counter by the given amount (by), clamped to a minimum of 1, so that every node receives a strictly larger offset than the previous one, even for zero-length text runs.
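The two counter helpers could be sketched as follows. The Counter wrapper type is hypothetical; the field name and method signatures follow the ones listed on this page, and the bodies are an assumption based on their descriptions.

```rust
use std::cell::RefCell;

/// Hypothetical wrapper holding only the synthetic offset counter.
struct Counter {
    offset_counter: RefCell<usize>,
}

impl Counter {
    fn new() -> Self {
        Counter { offset_counter: RefCell::new(0) }
    }

    /// Reads the counter without advancing it.
    fn next_offset(&self) -> usize {
        *self.offset_counter.borrow()
    }

    /// Advances by `by`, but never by less than 1.
    fn advance_offset(&self, by: usize) {
        *self.offset_counter.borrow_mut() += by.max(1);
    }
}
```

With the clamp in place, a zero-length text run still moves the counter forward by one.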

Source

fn is_whitespace_only(text: &str) -> bool

Returns true when text contains only ASCII whitespace characters.
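One plausible body for this check, assuming "ASCII whitespace" means what u8::is_ascii_whitespace accepts (space, tab, line feed, form feed, carriage return); the actual implementation is not shown on this page.

```rust
/// Returns true when every byte of `text` is ASCII whitespace.
/// In this sketch an empty string also counts as whitespace-only.
fn is_whitespace_only(text: &str) -> bool {
    text.bytes().all(|byte| byte.is_ascii_whitespace())
}
```

Non-ASCII spaces such as U+00A0 are encoded as non-ASCII bytes, so this sketch does not treat them as whitespace.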

Source

fn attr_name(attr: &Attribute) -> String

Converts an html5ever Attribute name to its string representation, prepending the namespace prefix when one is present (e.g. xml:lang).
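The naming rule can be illustrated without html5ever. AttrName below is a hypothetical stand-in with just an optional prefix and a local name, and the colon separator is inferred from the xml:lang example.

```rust
/// Hypothetical stand-in for an attribute's qualified name.
struct AttrName {
    prefix: Option<String>,
    local: String,
}

/// Joins prefix and local name with a colon when a prefix is present.
fn attr_name(name: &AttrName) -> String {
    match &name.prefix {
        Some(prefix) => format!("{}:{}", prefix, name.local),
        None => name.local.clone(),
    }
}
```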

Source

fn build_attributes(attrs: Vec<Attribute>) -> Attributes

Converts a Vec<Attribute> from html5ever into the Attributes map used by the DOM.

Trait Implementations§

Source§

impl TreeSink for Html5Sink

Source§

fn parse_error(&self, _msg: Cow<'static, str>)

Silently ignores all parse errors. The dictionary content from reader-dict is often malformed HTML, and we rely on html5ever’s error-recovery rather than failing on bad input.

Source§

fn create_element( &self, name: QualName, attrs: Vec<Attribute>, flags: ElementFlags, ) -> Self::Handle

Creates a new element node, assigns it the next synthetic offset, and registers its qualified name for later elem_name lookups.

For <template> elements an additional content-root node is created and stored in template_contents, as required by the spec.
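A simplified sketch of the template bookkeeping, with usize ids and string names in place of NodeId and QualName. MiniSink, new_node, and the "#template-content" label are hypothetical, and get_template_contents returns an Option here only to keep the sketch panic-free.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical miniature sink; node ids are indices into `names`.
struct MiniSink {
    names: RefCell<Vec<String>>,
    template_contents: RefCell<HashMap<usize, usize>>,
}

impl MiniSink {
    fn new() -> Self {
        MiniSink {
            names: RefCell::new(Vec::new()),
            template_contents: RefCell::new(HashMap::new()),
        }
    }

    /// Allocates a node and returns its id.
    fn new_node(&self, name: &str) -> usize {
        let mut names = self.names.borrow_mut();
        names.push(name.to_string());
        names.len() - 1
    }

    /// Creates an element; a template also gets a separate content root.
    fn create_element(&self, name: &str) -> usize {
        let id = self.new_node(name);
        if name == "template" {
            let content = self.new_node("#template-content");
            self.template_contents.borrow_mut().insert(id, content);
        }
        id
    }

    /// Looks up the content root recorded for a template element.
    fn get_template_contents(&self, target: usize) -> Option<usize> {
        self.template_contents.borrow().get(&target).copied()
    }
}
```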

Source§

fn create_comment(&self, _text: Tendril<UTF8>) -> Self::Handle

Maps an HTML comment to an empty whitespace node so it occupies a slot in the offset space without contributing visible content.

Source§

fn create_pi( &self, _target: Tendril<UTF8>, _data: Tendril<UTF8>, ) -> Self::Handle

Maps a processing instruction to an empty whitespace node so it occupies a slot in the offset space without contributing visible content.

Source§

fn append(&self, parent: &Self::Handle, child: NodeOrText<Self::Handle>)

Appends a child node or text run to parent.

Text runs are coalesced into the preceding sibling text node when one exists, to match the behaviour of the hand-rolled parser and avoid producing redundant nodes for adjacent text chunks.
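The coalescing rule might be sketched like this, assuming a parent's children live in a Vec. The Child enum and append_text function are hypothetical names used only for illustration.

```rust
/// Hypothetical child representation for a single parent node.
#[derive(Debug, PartialEq)]
enum Child {
    Element(String),
    Text(String),
}

/// Appends `text`, merging it into the last child when that child is text.
fn append_text(children: &mut Vec<Child>, text: &str) {
    match children.last_mut() {
        Some(Child::Text(existing)) => existing.push_str(text),
        _ => children.push(Child::Text(text.to_string())),
    }
}
```

Only the immediately preceding sibling is considered, so text separated by an element stays in two nodes.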

Source§

fn append_based_on_parent_node( &self, element: &Self::Handle, prev_element: &Self::Handle, child: NodeOrText<Self::Handle>, )

Delegates to Self::append using element as the target parent.

Called by html5ever during foster-parenting and similar error-recovery situations where the intended parent is determined by the element rather than its previous sibling.

Source§

fn append_before_sibling( &self, sibling: &Self::Handle, new_node: NodeOrText<Self::Handle>, )

Inserts a node or text run immediately before sibling.
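A minimal sketch of inserting before a sibling, assuming the parent's children are stored as a Vec of ids. The insert_before function and its bool return value are hypothetical; the trait method itself returns nothing.

```rust
/// Inserts `new_node` directly before `sibling` in `children`.
/// Returns false, leaving the list untouched, when `sibling` is absent.
fn insert_before(children: &mut Vec<usize>, sibling: usize, new_node: usize) -> bool {
    match children.iter().position(|&child| child == sibling) {
        Some(index) => {
            children.insert(index, new_node);
            true
        }
        None => false,
    }
}
```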

Source§

fn append_doctype_to_document( &self, _name: Tendril<UTF8>, _public_id: Tendril<UTF8>, _system_id: Tendril<UTF8>, )

DOCTYPE declarations are not represented in the tree.

Source§

fn set_quirks_mode(&self, _mode: QuirksMode)

Quirks mode is accepted but has no effect on the tree representation.

Source§

type Handle = NodeId

Handle is a reference to a DOM node. The tree builder requires that a Handle implements Clone to get another reference to the same node.
Source§

type Output = XmlTree

The overall result of parsing.
Source§

type ElemName<'a> = Ref<'a, QualName>

Source§

fn finish(self) -> Self::Output

Consume this sink and return the overall result of parsing.
Source§

fn get_document(&self) -> Self::Handle

Get a handle to the Document node.
Source§

fn elem_name<'a>(&'a self, target: &'a Self::Handle) -> Self::ElemName<'a>

What is the name of this element?
Source§

fn get_template_contents(&self, target: &Self::Handle) -> Self::Handle

Get a handle to a template’s template contents. The tree builder promises this will never be called with anything other than a template element.
Source§

fn same_node(&self, x: &Self::Handle, y: &Self::Handle) -> bool

Do two handles refer to the same node?
Source§

fn add_attrs_if_missing(&self, target: &Self::Handle, attrs: Vec<Attribute>)

Add each attribute to the given element, if no attribute with that name already exists. The tree builder promises this will never be called with anything other than an element.
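The "add if missing" rule maps naturally onto a map's entry API. The sketch below assumes attributes are stored as name-to-value strings in a BTreeMap; the real Attributes type is not shown on this page.

```rust
use std::collections::BTreeMap;

/// Adds each attribute unless one with the same name is already present.
/// Existing values are never overwritten.
fn add_attrs_if_missing(existing: &mut BTreeMap<String, String>, attrs: Vec<(String, String)>) {
    for (name, value) in attrs {
        existing.entry(name).or_insert(value);
    }
}
```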
Source§

fn remove_from_parent(&self, target: &Self::Handle)

Detach the given node from its parent.
Source§

fn reparent_children(&self, node: &Self::Handle, new_parent: &Self::Handle)

Remove all the children from node and append them to new_parent.
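Reparenting can be sketched over a pair of maps. The Tree type, its fields, and add_child are hypothetical; XmlTree's real storage is not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical tree storage: child lists plus a parent back-pointer per node.
#[derive(Default)]
struct Tree {
    children: HashMap<usize, Vec<usize>>,
    parent: HashMap<usize, usize>,
}

impl Tree {
    fn add_child(&mut self, parent: usize, child: usize) {
        self.children.entry(parent).or_default().push(child);
        self.parent.insert(child, parent);
    }

    /// Moves every child of `node` to the end of `new_parent`'s child list,
    /// preserving their relative order.
    fn reparent_children(&mut self, node: usize, new_parent: usize) {
        let moved = self.children.remove(&node).unwrap_or_default();
        for &child in &moved {
            self.parent.insert(child, new_parent);
        }
        self.children.entry(new_parent).or_default().extend(moved);
    }
}
```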
§

fn mark_script_already_started(&self, _node: &Self::Handle)

Mark an HTML <script> as “already started”.
§

fn pop(&self, _node: &Self::Handle)

Indicate that a node was popped off the stack of open elements.
§

fn associate_with_form( &self, _target: &Self::Handle, _form: &Self::Handle, _nodes: (&Self::Handle, Option<&Self::Handle>), )

Associate the given form-associatable element with the form element.
§

fn is_mathml_annotation_xml_integration_point( &self, _handle: &Self::Handle, ) -> bool

Returns true if the adjusted current node is an HTML integration point and the token is a start tag.
§

fn set_current_line(&self, _line_number: u64)

Called whenever the line number changes.
§

fn allow_declarative_shadow_roots( &self, _intended_parent: &Self::Handle, ) -> bool

§

fn attach_declarative_shadow( &self, _location: &Self::Handle, _template: &Self::Handle, _attrs: &[Attribute], ) -> bool

Attempt to attach a declarative shadow root at the given location.
§

fn maybe_clone_an_option_into_selectedcontent(&self, option: &Self::Handle)

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
§

impl<T> Downcast for T
where T: Any,

§

fn into_any(self: Box<T>) -> Box<dyn Any>

Converts Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>, which can then be downcast into Box<dyn ConcreteType> where ConcreteType implements Trait.
§

fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>

Converts Rc<Trait> (where Trait: Downcast) to Rc<Any>, which can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.
§

fn as_any(&self) -> &(dyn Any + 'static)

Converts &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.
§

fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)

Converts &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.
§

impl<T> DowncastSend for T
where T: Any + Send,

§

fn into_any_send(self: Box<T>) -> Box<dyn Any + Send>

Converts Box<Trait> (where Trait: DowncastSend) to Box<dyn Any + Send>, which can then be downcast into Box<ConcreteType> where ConcreteType implements Trait.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

§

impl<T> Instrument for T

§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
§

impl<T> Pointable for T

§

const ALIGN: usize

The alignment of pointer.
§

type Init = T

The type for initializers.
§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
§

impl<T> PolicyExt for T
where T: ?Sized,

§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.
§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<T> WithSubscriber for T

§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.