9 Sep 2014 mbrubeck   » (Journeyer)

Let's build a browser engine! Part 5: Boxes

This is the latest in a series of articles about writing a simple HTML rendering engine:

This article will begin the layout module, which takes the style tree and translates it into a bunch of rectangles in a two-dimensional space. This is a big module, so I’m going to split it into several articles. Also, some of the code I share in this article may need to change as I write the code for the later parts.

The layout module’s input is the style tree from Part 4, and its output is yet another tree, the layout tree. This takes us one step further in our mini rendering pipeline:

I’ll start by talking about the basic HTML/CSS layout model. If you’ve ever learned to develop web pages you might be familiar with this already—but it may look a bit different from the implementer’s point of view.

The Box Model

Layout is all about boxes. A box is a rectangular section of a web page. It has a width, a height, and a position on the page. This rectangle is called the content area because it’s where the box’s content is drawn. The content may be text, image, video, or other boxes.

A box may also have padding, borders, and margins surrounding its content area. The CSS spec has a diagram showing how all these layers fit together.

Robinson stores a box’s content area and surrounding areas in the following structure. [Rust note: f32 is a 32-bit floating point type.]

    // CSS box model. All sizes are in px.
struct Dimensions {
    // Top left corner of the content area, relative to the document origin:
    x: f32,
    y: f32,

    // Content area size:
    width: f32,
    height: f32,

    // Surrounding edges:
    padding: EdgeSizes,
    border: EdgeSizes,
    margin: EdgeSizes,
}

struct EdgeSizes {
    left: f32,
    right: f32,
    top: f32,
    bottom: f32,
}
  

Block and Inline Layout

Note: This section contains diagrams that won't make sense if you are reading them without the associated visual styles. If you are reading this in a feed reader, try opening the original page in a regular browser tab. I also included text descriptions for those of you using screen readers or other assistive technologies.

The CSS display property determines which type of box an element generates. CSS defines several box types, each with its own layout rules. I’m only going to talk about two of them: block and inline.

I’ll use this bit of pseudo-HTML to illustrate the difference:

    <container>
  <a></a>
  <b></b>
  <c></c>
  <d></d>
</container>
  

Block boxes are placed vertically within their container, from top to bottom.

    a, b, c, d { display: block; }
  

Description: The diagram below shows four rectangles in a vertical stack.

a
b
c
d

Inline boxes are placed horizontally within their container, from left to right. If they reach the right edge of the container, they will wrap around and continue on a new line below.

    a, b, c, d { display: inline; }
  

Description: The diagram below shows boxes `a`, `b`, and `c` in a horizontal line from left to right, and box `d` in the next line.

a
b
c
d

Each box must contain only block children, or only inline children. When an DOM element contains a mix of block and inline children, the layout engine inserts anonymous boxes to separate the two types. (These boxes are “anonymous” because they aren’t associated with nodes in the DOM tree.)

In this example, the inline boxes b and c are surrounded by an anonymous block box, shown in pink:

    a    { display: block; }
b, c { display: inline; }
d    { display: block; }
  

Description: The diagram below shows three boxes in a vertical stack. The first is labeled `a`; the second contains two boxes in a horizonal row labeled `b` and `c`; the third box in the stack is labeled `d`.

a
b
c
d

Note that content grows vertically by default. That is, adding children to a container generally makes it taller, not wider. Another way to say this is that, by default, the width of a block or line depends on its container’s width, while the height of a container depends on its children’s heights.

This gets more complicated if you override the default values for properties like width and height, and way more complicated if you want to support features like vertical writing.

The Layout Tree

The layout tree is a collection of boxes. A box has dimensions, and it may contain child boxes.

    struct LayoutBox<'a> {
    dimensions: Dimensions,
    box_type: BoxType<'a>,
    children: Vec<LayoutBox<'a>>,
}
  

A box can be a block node, an inline node, or an anonymous block box. (This will need to change when I implement text layout, because line wrapping can cause a single inline node to split into multiple boxes. But it will do for now.)

    enum BoxType<'a> {
    BlockNode(&'a StyledNode<'a>),
    InlineNode(&'a StyledNode<'a>),
    AnonymousBlock,
}
  

To build the layout tree, we need to look at the display property for each DOM node. I added some code to the style module to get the display value for a node. If there’s no specified value it returns the initial value, 'inline'.

    enum Display {
    Inline,
    Block,
    DisplayNone,
}

impl StyledNode {
    /// Return the specified value of a property if it exists, otherwise `None`.
    fn value(&self, name: &str) -> Option<Value> {
        self.specified_values.find_equiv(&name).map(|v| v.clone())
    }

    /// The value of the `display` property (defaults to inline).
    fn display(&self) -> Display {
        match self.value("display") {
            Some(Keyword(s)) => match s.as_slice() {
                "block" => Block,
                "none" => DisplayNone,
                _ => Inline
            },
            _ => Inline
        }
    }
}
  

Now we can walk through the style tree, build a LayoutBox for each node, and then insert boxes for the node’s children. If a node’s display property is set to 'none' then it is not included in the layout tree.

    /// Build the tree of LayoutBoxes, but don't perform any layout calculations yet.
fn build_layout_tree<'a>(style_node: &'a StyledNode<'a>) -> LayoutBox<'a> {
    // Create the root box.
    let mut root = LayoutBox::new(match style_node.display() {
        Block => BlockNode(style_node),
        Inline => InlineNode(style_node),
        DisplayNone => fail!("Root node has display: none.")
    });

    // Create the descendant boxes.
    for child in style_node.children.iter() {
        match child.display() {
            Block => root.children.push(build_layout_tree(child)),
            Inline => root.get_inline_container().children.push(build_layout_tree(child)),
            DisplayNone => {} // Skip nodes with `display: none;`
        }
    }
    return root;
}

impl LayoutBox {
    /// Constructor function
    fn new(box_type: BoxType) -> LayoutBox {
        LayoutBox {
            box_type: box_type,
            dimensions: Default::default(), // initially set all fields to 0.0
            children: Vec::new(),
        }
    }
}
  

If a block node contains an inline child, create an anonymous block box to contain it. If there are several inline children in a row, put them all in the same anonymous container.

    impl LayoutBox {
    /// Where a new inline child should go.
    fn get_inline_container(&mut self) -> &mut LayoutBox {
        match self.box_type {
            InlineNode(_) | AnonymousBlock => self,
            BlockNode(_) => {
                // If we've just generated an anonymous block box, keep using it.
                // Otherwise, create a new one.
                match self.children.last() {
                    Some(&LayoutBox { box_type: AnonymousBlock,..}) => {}
                    _ => self.children.push(LayoutBox::new(AnonymousBlock))
                }
                self.children.mut_last().unwrap()
            }
        }
    }
}
  

This is intentionally simplified in a number of ways from the standard CSS box generation algorithm. For example, it doesn’t handle the case where an inline box contains a block-level child. Also, it generates an unnecessary anonymous box if a block-level node has only inline children.

To Be Continued…

Whew, that took longer than I expected. I think I’ll stop here for now, but don’t worry: Part 6 is coming soon, and will cover block-level layout.

Once block layout is finished, we could jump ahead to the next stage of the pipeline: painting! I think I might do that, because then we can finally see the rendering engine’s output as pretty pictures instead of just numbers.

However, the pictures will just be a bunch of colored rectangles unless we finish the layout module by implementing inline layout and text layout. I’m

Syndicated 2014-09-08 23:16:00 from Matt Brubeck

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!