Build a simple template engine in <100 lines of Rust code

Use Regex and a little bit of Rustafarian magic to create your own toy HTML templating engine using Jinja-like syntax

Spike
·Jun 17, 2022·

11 min read


If you've ever built a full-stack web app, you've probably run into a templating engine like the one provided by Django or Flask. These neat utility packages parse your HTML files and "fill-in-the-blanks" (so to speak) with dynamic content.

A common example is sending the titles of posts from your back-end to be listed in a <ul> on your static HTML5 blog page, each as their own <li>.

This tutorial is aimed at beginner Rustaceans, but you should already be comfortable with programming in general and with working in the terminal.

Why Rust?

Rust is the modern, memory-safe programming language (with a fantastic web ecosystem!) that I'll be using in this tutorial. Packages in Rust are called "crates" and are typically published to crates.io.

If you'd like to learn more about the most loved language on StackOverflow, I can't recommend the Rust Book and Rust by Example enough!

In line with the notoriously-idiomatic Rust community, the file structure for this example crate will be:

template-engine/
├─ dist/
│  ├─ index.html
├─ src/
│  ├─ lib.rs
├─ Cargo.toml

🦀 The primary code file for Rust libraries is src/lib.rs; for binaries, src/main.rs

src/lib.rs will house the crate's code, while the dependencies and other metadata will be kept neatly tucked away in Cargo.toml.

What is Jinja?

Jinja is a web template engine for Python that uses a simple set of rules for its syntax:

Delimiter   Example            It is used for
{{..}}      {{ customer }}     Print the value of the variable between the braces.
{%..%}      {% set x = 2 %}    Statements that do not have an output.
{#..#}      {# Blog: #}        Comments to explain the code.

This will provide us with a simple, standardized set of methods to implement in our toy template engine.

The Plan

The only thing programmers hate more than working with legacy software is a boring, repetitive process. To avoid rewriting your code over and over (while you come up with slightly better names for your variables), it's useful to plan as much as possible before you start coding.

Features

By the end of this tutorial, you should have a basic, safe (no Rust code is ever evaluated), and minimal template engine. It will support the following features:

  1. Printing variables:
    <body>Hello, my name is {{ name }}!</body>
    
  2. If statements (only supports boolean expressions currently):
    {% if allowed %}
     <p>Welcome to the internet!</p>
    {% else %}
     <p>No trespassing.</p>
    {% endif %}
    
  3. Repeat statements:
    {% repeat 3 times %}
    <li>{{ date }}</li>
    {% endrepeat %}
    
  4. Comments:
    {# List of blog posts #}
    

Process

The render() function will be defined as:

fn render(mut template: String, data: HashMap<&str, Data>) -> String {

and will take a template as an input such as:

<body>Hello {{ name }}!</body>

along with a HashMap (key-value list) of data:

HashMap::from([
    ("name", Data::Text("world".to_string())),
])

and output an HTML5-compatible String with the correctly replaced data:

<body>Hello world!</body>

Let's begin!

Setup

Use Rust's package manager, cargo, to initialize the crate's (library) directory structure:

cargo init template-engine --lib

This will create a new folder, template-engine/, with the file structure listed above (minus dist/, which we'll add later).

Start by filling up Cargo.toml with the metadata of the crate and its dependencies:

[package]
name = "template-engine"
version = "0.1.0"
edition = "2021"

[dependencies]
regex = "1"

We'll be using good ol' RegEx to parse the delimiters.

Imports

Rendering will require both importing the regex crate as well as loading the standard library's HashMap implementation. Additionally, the fmt module of the standard library should be loaded to support converting Data elements to strings.

use regex::{Regex, Captures};
use std::{fmt, collections::HashMap};

Data

You may have noticed in the Process section that the 'value' type of the HashMap<&str, Data> is a custom type called Data. Types like this can be defined in Rust with the enum keyword, which creates a type whose values are always exactly one of a few different variants.

The Data type is necessary for the template engine to support more than just string values and to properly convert back and forth between integers/booleans (to be processed on the Rust side) and strings (to be rendered in the HTML).

enum Data {
    Number(i32),
    Boolean(bool),
    Text(String)
}

Rust is statically and strongly typed, so a conversion must be defined for your custom type before the template engine can render the inputted Data values as strings.

This can be done with the following code, which matches on each Data variant and writes out its inner value (an i32, a bool, or a String):

impl fmt::Display for Data {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Self::Text(x) => write!(f, "{}", x),
            Self::Number(x) => write!(f, "{}", x),
            Self::Boolean(x) => write!(f, "{}", x)
        }
    }
}
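A nice side effect of implementing Display is that Rust's standard library then provides .to_string() for free, and the type also works inside format!. Here is a minimal sketch of that; the greet() helper is a hypothetical name used only for illustration and is not part of the engine:

```rust
use std::fmt;

enum Data {
    Number(i32),
    Boolean(bool),
    Text(String),
}

impl fmt::Display for Data {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Self::Text(x) => write!(f, "{}", x),
            Self::Number(x) => write!(f, "{}", x),
            Self::Boolean(x) => write!(f, "{}", x),
        }
    }
}

/// Hypothetical helper: build a greeting from any Data value.
fn greet(value: &Data) -> String {
    // format! uses the Display impl above to render `value`
    format!("Hello {}!", value)
}
```

For example, Data::Number(42).to_string() gives "42", and greet(&Data::Text("world".to_string())) gives "Hello world!".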

Rendering

Print delimiters

The core feature of a template engine is printing the value of the inputted data based on what key is typed between {{ and }} in the HTML file.

To match the instances where text is surrounded by double braces, we'll be using a Regular Expression — the industry standardized search pattern language.

💡 I recommend MDN's cheat sheet and RegExr for testing your regular expressions

For the braces to be properly read by Rust's regex crate, you need to escape the characters by placing backslashes before them.

let print_regex = Regex::new(r"\{\{(.*?)\}\}").unwrap();
RegEx Breakdown
  • {{ — Match opening double braces
  • (.*?) — Match any text lazily
  • }} — Match closing double braces

Next, iterate through each instance and replace it with the value that corresponds to the entered key (e.g. "world" is the value stored under the key "name" in the Process example).

template = print_regex.replace_all(&template, |caps: &Captures| {
    ...
    // Find the corresponding key in the Data and return the value
    data[key].to_string()
}).to_string();

What should replace the ... to extract the key?

To extract the key, you must traverse through the capture groups and identify only the key.

The regex crate for Rust always inserts the entire match as the first element of captures. Thus, the indices and elements of caps are as follows:

0. "{{ hello }}"
1. " hello "

To ensure the Data HashMap can locate the same key as specified in the template, use Rust's .trim() method to remove the whitespace at the start and end of capture group 1.

// Extract the text in between the {{ and the }} in the template
let key = caps.get(1).unwrap().as_str().trim();
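One caveat worth knowing: indexing the HashMap with data[key] panics if the template mentions a key that isn't in the data. If you'd prefer a missing key to render as an empty string, something like the sketch below could replace the indexing. The lookup() helper is a hypothetical name, and the empty-string fallback is just one possible choice:

```rust
use std::collections::HashMap;
use std::fmt;

enum Data {
    Number(i32),
    Boolean(bool),
    Text(String),
}

impl fmt::Display for Data {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Self::Text(x) => write!(f, "{}", x),
            Self::Number(x) => write!(f, "{}", x),
            Self::Boolean(x) => write!(f, "{}", x),
        }
    }
}

/// Hypothetical helper: trim the captured text and look it up,
/// returning an empty string when the key is missing.
fn lookup(data: &HashMap<&str, Data>, raw_key: &str) -> String {
    let key = raw_key.trim();
    // .get() returns None for a missing key instead of panicking
    data.get(key).map_or(String::new(), |value| value.to_string())
}
```

With ("name", Data::Text("world".to_string())) in the map, lookup(&data, " name ") returns "world" and lookup(&data, " missing ") returns an empty string.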

Repeat statements

Once again, through lots of StackOverflow reading, a new regular expression is born—this time, to match the repeat statements as defined above:

let repeat_regex = Regex::new(r"\{% repeat (\d*?) times %\}((.|\n)*?)\{% endrepeat %\}").unwrap();
RegEx Breakdown
  • {% repeat — Match first part of opening tag
  • (\d*?) — Match digit(s) lazily as capture group
  • times %} — Match end part of opening tag
  • ((.|\n)*?) — Match multi-line string of text lazily
  • {% endrepeat %} — Match closing tag

This beaut will capture both the number of times to repeat the code and the code block itself (capture groups 1 and 2 respectively; group 3 is the inner (.|\n) group, which we can ignore).

From there, you can apply the same syntax as before but this time utilizing Rust's .repeat() method to repeat the code block a certain number of times:

template = repeat_regex.replace_all(&template, |caps: &Captures| {
    // Extract the number of times to repeat the code
    let times = caps.get(1).unwrap().as_str().trim();

    // Parse the code block to be repeated
    let code = caps.get(2).unwrap().as_str().trim();

    // Repeat the code `times` number of times
    code.repeat(times.parse::<usize>().unwrap())
}).to_string();

Note how after each statement, the template variable is reassigned to the rendered version.
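The body of the repeat closure can also be tried out on its own, without any regex. Below is a small standard-library sketch; repeat_block() is a hypothetical name, and unlike the closure it returns None instead of panicking when the count can't be parsed (which can happen, since (\d*?) also matches an empty string):

```rust
/// Hypothetical helper mirroring the repeat closure.
/// Returns None if `times` is not a valid non-negative integer.
fn repeat_block(times: &str, code: &str) -> Option<String> {
    // .ok()? turns a parse error into an early `None`
    let count = times.trim().parse::<usize>().ok()?;
    // The block is trimmed first, so the copies are joined back to back
    Some(code.trim().repeat(count))
}
```

Because the block is trimmed before repeating, repeat_block("3", "\n<li>x</li>\n") yields the three list items on a single line with nothing in between.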

If/else statements

This regular expression will match {% if KEY %}...{% else %}...{% endif %}:

let if_else_regex = Regex::new(r"\{% if (.*?) %\}((.|\n)*?)(\{% else %\}((.|\n)*?)\{% endif %\}|\{% endif %\})").unwrap();
RegEx Breakdown
  • {% if — Match first part of opening tag
  • (.*?) — Match any text lazily
  • %} — Match end part of opening tag
  • ((.|\n)*?) — Match multi-line string of text lazily
  • {% else %} — Match else tag
  • ((.|\n)*?) — Match multi-line string of text lazily
  • {% endif %} — Match closing tag

From there, simply use the regex crate's .replace_all() method to swap the capture with either the parsed if code block or the parsed else code block, depending on the value of the boolean that matches key.

template = if_else_regex.replace_all(&template, |caps: &Captures| {
    // Extract the name of the bool being tested
    let key = caps.get(1).unwrap().as_str().trim();
    // Parse the 'if' and (optional) 'else' code blocks
    let if_code = caps.get(2).unwrap().as_str().trim();
    let else_code = caps.get(5).map_or("", |m| m.as_str()).trim();
    // Find the corresponding key in the Data and return the value
    if let Data::Boolean(exp) = data[key] {
        if exp { if_code.to_string() }
        else { else_code.to_string() }
    } else {
        "ERROR PARSING KEY".to_string()
    }
}).to_string();

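.unwrap() is one of the simplest ways to deal with an Option in Rust, but it panics (crashing the program) if the unwrapped Option is None (Rust's equivalent of null).

Using .map_or() when handling the 5th capture group (line recreated below) allows us to store an empty string as the else_code if there is no else code block in the template. Capture groups are numbered by their opening parenthesis: group 3 is the inner (.|\n) of the if block and group 4 wraps the whole else-or-endif alternative, which is why the else block lands in group 5.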

let else_code = caps.get(5).map_or("", |m| m.as_str()).trim();
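To see the branch selection in isolation, here is a standard-library sketch of the same logic. choose_branch() is a hypothetical name, and the optional else block is passed in as a plain Option<&str> rather than a regex capture:

```rust
enum Data {
    Number(i32),
    Boolean(bool),
    Text(String),
}

/// Hypothetical helper: pick which block to render.
/// `else_code` is None when the template has no {% else %} block.
fn choose_branch(value: &Data, if_code: &str, else_code: Option<&str>) -> String {
    // map_or supplies "" when there is no else block, so nothing is unwrapped
    let else_code = else_code.map_or("", |code| code.trim());
    match value {
        Data::Boolean(true) => if_code.trim().to_string(),
        Data::Boolean(false) => else_code.to_string(),
        // Same fallback text as the closure above for non-boolean data
        _ => "ERROR PARSING KEY".to_string(),
    }
}
```

A false boolean with no else block renders as an empty string, and any non-boolean value falls through to the error text.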

Comments

Lastly, Jinja-style comments should be converted to HTML comments so that they can still be read in the rendered code. Comments are useful to disable parts of the template for debugging or provide documentation to your template code.

Here, we don't need to match the key or code block surrounded by {# and #} so a simple string .replace() method will suffice:

template = template.replace("{#", "<!--").replace("#}", "-->");
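If you'd rather drop comments from the output entirely (which is how Jinja itself treats them), a small loop over the string can do it. This is only a sketch of that alternative, not what the rest of this tutorial uses; strip_comments() is a hypothetical name, and an unterminated {# is simply left in place:

```rust
/// Hypothetical helper: remove every `{# ... #}` span from `template`.
fn strip_comments(template: &str) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{#") {
        // Look for the closing delimiter after the opening one
        match rest[start + 2..].find("#}") {
            Some(len) => {
                // Keep the text before the comment
                out.push_str(&rest[..start]);
                // Skip past "{#", the comment body, and "#}"
                rest = &rest[start + 2 + len + 2..];
            }
            // No closing delimiter: keep the remainder verbatim
            None => break,
        }
    }
    out.push_str(rest);
    out
}
```

Keep in mind that with this approach, anything inside a comment (including disabled template code) never reaches the browser.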

Put it all together!

Your finished render function should look something like this:

fn render(mut template: String, data: HashMap<&str, Data>) -> String {
    // Render variable printing
    let print_regex = Regex::new(r"\{\{(.*?)\}\}").unwrap();
    template = print_regex.replace_all(&template, |caps: &Captures| {
        // Extract the text in between the {{ and the }} in the template
        let key = caps.get(1).unwrap().as_str().trim();
        // Find the corresponding key in the Data and return the value
        data[key].to_string()
    }).to_string();

    // Render repeat statements
    let repeat_regex = Regex::new(r"\{% repeat (\d*?) times %\}((.|\n)*?)\{% endrepeat %\}").unwrap();
    template = repeat_regex.replace_all(&template, |caps: &Captures| {
        // Extract the number of times to repeat the code
        let times = caps.get(1).unwrap().as_str().trim();
        // Parse the code block to be repeated
        let code = caps.get(2).unwrap().as_str().trim();
        // Repeat the code `times` number of times
        code.repeat(times.parse::<usize>().unwrap())
    }).to_string();

    // Render if/else statements
    let if_else_regex = Regex::new(r"\{% if (.*?) %\}((.|\n)*?)(\{% else %\}((.|\n)*?)\{% endif %\}|\{% endif %\})").unwrap();
    template = if_else_regex.replace_all(&template, |caps: &Captures| {
        // Extract the name of the bool being tested
        let key = caps.get(1).unwrap().as_str().trim();
        // Parse the 'if' and (optional) 'else' code blocks
        let if_code = caps.get(2).unwrap().as_str().trim();
        let else_code = caps.get(5).map_or("", |m| m.as_str()).trim();
        // Find the corresponding key in the Data and return the value
        if let Data::Boolean(exp) = data[key] {
            if exp { if_code.to_string() }
            else { else_code.to_string() }
        } else {
            "ERROR PARSING KEY".to_string()
        }
    }).to_string();

    // Process comments
    template = template.replace("{#", "<!--").replace("#}", "-->");

    // Return output
    template
}

Testing it out

Writing a template

Inside dist/index.html, we'll design a simple template to test documentation comments, code-disabling comments, variable printing, if/else statements, and repeat statements.

<body>
    {# This is a comment! #}
    <h1>{{ hello }}</h1>

    {% if allowed %}
        {% repeat 3 times %}
            <p>Welcome to the {{ hello }}!</p>
        {% endrepeat %}
    {% else %}
        <p>No trespassing.</p>
    {% endif %}

    {#
    <p>Hidden from the {{ hello }}...</p>
    #}
</body>

Create the test

Unlike Rust binaries, libraries have no main function to run, which is why executing cargo run in the root directory won't work.

Instead, you can play around with your library code using Rust tests.

Unit tests live in their own module inside src/lib.rs, so you'll need to import the items of your crate that the test uses:

#[cfg(test)]
mod tests {
    use crate::{render, Data};
    use std::collections::HashMap;

    #[test]
    fn basic_template() {
        ...
    }
}

In our basic_template() test, we'll use the fs module of the Rust standard library to read the contents of our example template and save it as a string:

let input = std::fs::read_to_string("dist/index.html").expect("Something went wrong reading the file");

Next, we'll construct a HashMap of data to pass to the render() function:

let data = HashMap::from([
    ("hello", Data::Text("internet".to_string())),
    ("allowed", Data::Boolean(true))
]);

Finally, we'll call our render() function and print the output:

println!("{}", render(input, data));

Run the test

To run a Rust test without hiding printed text, use:

cargo test -- --nocapture

The test should pass and print something like this to the terminal:

<body>
    <!-- This is a comment! -->
    <h1>internet</h1>

    <p>Welcome to the internet!</p><p>Welcome to the internet!</p><p>Welcome to the internet!</p>

    <!--
    <p>Hidden from the internet...</p>
    -->
</body>

Congrats on creating your own template engine! 🥳

What's next?

In comparison to most template engines, the toy we built today is pretty bare-bones. Currently, no variables can be set in the template code, nesting only partially(?) works, and few data types are supported.

If you're willing to take on the strict typing of Rust, you could expand this project by:

  • Supporting more operators in the If/Else Statement expression syntax (<, >, ==, !=, etc.)
  • Allowing the editing of variables in the templates and rendering the template delimiters as ordered commands rather than simultaneous replacements
  • Implementing more Jinja features such as macros, importing of other templates, correct escaping, etc. (see Bloomreach's documentation on Jinja syntax)
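As a possible starting point for the first idea, the condition between {% if and %} could be split on whitespace and evaluated by hand. The sketch below is only one way to approach it: eval_condition() is a hypothetical name, it handles just a bare boolean key or KEY OP NUMBER with ==, !=, <, and >, and it returns None for anything it doesn't understand:

```rust
use std::collections::HashMap;

enum Data {
    Number(i32),
    Boolean(bool),
    Text(String),
}

/// Hypothetical helper: evaluate `KEY` or `KEY OP NUMBER`.
/// Returns None when the expression cannot be evaluated.
fn eval_condition(expr: &str, data: &HashMap<&str, Data>) -> Option<bool> {
    let parts: Vec<&str> = expr.split_whitespace().collect();
    match parts.as_slice() {
        // Bare key: must hold a boolean, as in the tutorial
        [key] => match data.get(*key)? {
            Data::Boolean(b) => Some(*b),
            _ => None,
        },
        // Key, operator, integer literal
        [key, op, literal] => {
            let left = match data.get(*key)? {
                Data::Number(n) => *n,
                _ => return None,
            };
            let right: i32 = literal.parse().ok()?;
            match *op {
                "==" => Some(left == right),
                "!=" => Some(left != right),
                "<" => Some(left < right),
                ">" => Some(left > right),
                _ => None,
            }
        }
        _ => None,
    }
}
```

The if/else closure could then call a function like this on capture group 1 instead of requiring a Data::Boolean directly.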

The code for this project can be found here:


Please leave a like if you enjoyed this post! Let me know in the comments if you have any suggestions or clarifications for making this article accurate, easy to understand, and up-to-date.

 