Prose linting with Vale

⚠️

Due to the recent documentation site migration, our documentation doesn't currently use Vale. You can still view our style guide on GitHub.

Writing is a time-consuming process. Depending on the writer and the review process, it can take a while to get content ready for publication. Whether you’re a solo writer or a big team, prose linters can help ensure a consistent tone and style.

Similar to code linters, prose linters automatically check your text for errors. Unlike grammar checkers, which highlight violations of grammatical rules, prose linters focus on how you can make your text better by addressing common usage problems like extra spaces, repeated words, excessive use of jargon, sexist language, and incorrect capitalization.

Prose linters can also provide a framework for creating and enforcing an editorial style guide. This helps with the review process as you can now focus on reviewing the content itself instead of pointing out typos and preferred usage patterns. This is particularly important when working in open-source projects like Meilisearch, where you have many contributors unfamiliar with your style guide.

What is Vale?

Vale is an open-source, highly customizable, syntax-aware prose linter. It supports documents written in many different formats such as Markdown, HTML, reStructuredText, AsciiDoc, DITA, and XML.

Vale isn’t your only option when it comes to prose linting. There are many other open-source tools available, including proselint, textlint, and alex.

At Meilisearch, we decided to go with Vale because it’s fast, easy to set up, flexible, and comes with existing rules to help you get started.

Where do I start?

Though it may seem daunting, setting up Vale can be fairly straightforward if you start small and keep things simple. In this post, we’ll go over how to use Vale in a project much like Meilisearch’s documentation.

Step 1: Style guide

The first step is to create a style guide. A style guide ensures a consistent tone and style regardless of how big your team gets. It establishes standard practices where people may have different opinions, such as whether or not to use the Oxford comma.

If you don’t have an in-house style guide, you can check out Google’s or Microsoft’s to help you get started. Over time you will memorize most of the rules, but it’s possible to overlook mistakes and sometimes forget the rules altogether. We’re only human.

And that is where Vale comes in. It allows you to “codify” a style guide and checks your text against all the rules in that style guide. If it detects any issues, it displays suggestions, warnings, or errors on the console.

Step 2: Install Vale

I’m using macOS and ran brew install vale in my console to install Vale.

If you’re using a different operating system, check out Vale’s documentation for installation instructions.

Verify the installation by typing vale -v in your console. If this command returns Vale’s version number, the installation was successful.
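
If you would rather run this check from a script (for example, as part of a setup task), here is a minimal Python sketch. It assumes only that the executable is called vale and accepts the -v flag mentioned above; the function name vale_version is made up for this example.

```python
import shutil
import subprocess
from typing import Optional


def vale_version(executable: str = "vale") -> Optional[str]:
    """Return the output of `<executable> -v`, or None if it cannot be run."""
    # shutil.which mirrors what the shell does when it looks up a command.
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "-v"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


# Usage: print(vale_version() or "Vale was not found on PATH")
```

A return value of None means the installation still needs attention.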

Finally, create the following files and folders:

├── .vale.ini
└── .vale
    └── styles
        ├── Style guide
        └── Vocab
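
The same scaffold can be created from a script. This is a sketch under one assumption: the styles folder sits at .vale/styles, which is the value given to StylesPath in Step 3. The function name scaffold_vale and the placeholder folder name "Style guide" are illustrative only.

```python
from pathlib import Path
from typing import List


def scaffold_vale(root: Path, style_name: str = "Style guide") -> List[Path]:
    """Create .vale.ini and the styles folders under root; return their paths."""
    config = root / ".vale.ini"
    styles = root / ".vale" / "styles"  # assumed to match StylesPath in Step 3
    folders = [styles / style_name, styles / "Vocab"]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    config.touch(exist_ok=True)  # left empty here; Step 3 fills it in
    return [config, *folders]


# Usage: scaffold_vale(Path("."), style_name="Meilisearch")
```

Running it twice is harmless, since existing files and folders are left untouched.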

Step 3: Configuring Vale - .vale.ini

Create a .vale.ini file at the root of your repository. This is the Vale configuration file, where you define what you want Vale to do and which files to lint. Let’s start with a basic setup; you can always add to it later based on your project’s needs.

StylesPath = .vale/styles
MinAlertLevel = suggestion

Vocab = word_list

[*.md]
BasedOnStyles = Meilisearch

StylesPath is where Vale looks for your style guide (more on that in the next step). The path can be absolute or relative to the location of the .vale.ini file.
MinAlertLevel specifies the minimum alert level that Vale will report. By default, it’s set to warning. The other options are error and suggestion.
An error indicates you did something wrong, like using extra spaces or making a typo. A warning isn’t as severe as an error, but indicates something you should avoid, like making sure your sentences don’t become too long. A suggestion is a recommendation to do something that is usually—but not always—a good idea, like breaking your sentence into two instead of using a semicolon.
If MinAlertLevel is set to suggestion, you will see suggestions, warnings, and errors. If it is set to warning, Vale will only show errors and warnings, no suggestions.
Vocab names a directory containing the accept.txt and reject.txt files. Both files accept words, phrases, and regular expressions. If your text contains words that do not exist in the dictionary (e.g., “Meilisearch”), you can add them to accept.txt and Vale won’t angrily scream at you for making a “typo”. Vale will flag every occurrence of the terms listed in reject.txt as an error. This can be useful when you want writers to avoid a specific word: if you are writing about search engines and databases, for example, using “indexation” can be confusing.
[*.md] tells Vale to only lint Markdown files. If you want to lint plain text files, use [*.txt].
BasedOnStyles specifies the style guide Vale should use for linting.
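
To make the MinAlertLevel threshold concrete, here is a small Python sketch of the filtering it describes. This is an illustration of the behavior, not Vale's implementation; the alert dictionaries and the levels assigned to the Semicolons and Spelling rules are invented for the example.

```python
LEVELS = ["suggestion", "warning", "error"]  # ordered from least to most severe


def visible_alerts(alerts, min_alert_level="warning"):
    """Keep only the alerts at or above min_alert_level."""
    threshold = LEVELS.index(min_alert_level)
    return [a for a in alerts if LEVELS.index(a["level"]) >= threshold]


# Hypothetical alerts, one per level.
alerts = [
    {"rule": "Meilisearch.Semicolons", "level": "suggestion"},
    {"rule": "Meilisearch.SentenceLength", "level": "warning"},
    {"rule": "Meilisearch.Spelling", "level": "error"},
]

# visible_alerts(alerts, "suggestion") keeps all three alerts.
# visible_alerts(alerts, "warning") drops the suggestion.
# visible_alerts(alerts, "error") keeps only the error.
```

Starting at suggestion, as the configuration above does, shows everything; raising the threshold later is an easy way to quiet the output.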

You can specify other settings, including what tokens and HTML tags to ignore and what Vale should consider an individual word. You can read more about the .vale.ini file in Vale’s documentation.

Step 4: Rules and the styles folder

As I mentioned earlier, you need a style guide to use Vale. You then convert this style guide into something Vale can understand: rules.

Rules use different extension points to perform specific tasks. For example, the existence extension point looks for the existence of a particular token, repetition looks for repeated tokens, spelling implements spell checking, and so on. In Vale, each rule is a YAML file. The styles folder contains the individual rules that make up the style guide.
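
To give a feel for what two of these extension points check, here is a rough Python approximation using regular expressions. It is only a sketch of the idea: Vale's own matching is syntax-aware and configurable, and the function names and sample sentences below are made up.

```python
import re


def existence(text, tokens):
    """Return every listed token found in text (whole words, any case)."""
    if not tokens:
        return []
    alternatives = "|".join(re.escape(token) for token in tokens)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return [match.group(0) for match in pattern.finditer(text)]


def repetition(text):
    """Return words that appear twice in a row, such as 'the the'."""
    pattern = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)
    return [match.group(1) for match in pattern.finditer(text)]


# existence("This is very, very simple.", ["very", "simply"])  -> two matches
# repetition("Vale checks the the text.")                      -> one match
```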

If you don’t want to create your own style guide or need a starting point to build from, Vale comes with ready-to-use style guides that you can apply to your docs and start linting. Here are some style guides that helped us get started:

You can find more on Vale’s GitHub repository.

Let’s start with a rule for sentence length. Your style guide would say something like “Ensure sentences don’t exceed 40 words”. This is what the rule looks like as a YAML file:

# Warning: Meilisearch.SentenceLength

# Counts words in a sentence and alerts if a sentence exceeds 40 words.

extends: occurrence
message: 'Shorter sentences improve readability (max 40 words).'
scope: sentence
link: https://docs.gitlab.com/ee/development/documentation/styleguide/index.html#language
level: warning
max: 40
token: \b(\w+)\b

This rule counts the words in a sentence and throws a warning if it exceeds 40 words. The scope is set to sentence: this ensures that Vale does not apply this rule to other parts of the text, like headings or tables.
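
If it helps to see the counting spelled out, the following Python sketch applies the same token pattern and limit to plain text. It is an approximation for illustration only: the sentence splitting here is deliberately naive, and Vale's own segmentation and scoping are more careful. The function name long_sentences is hypothetical.

```python
import re

TOKEN = re.compile(r"\b(\w+)\b")  # same token pattern as the rule above
MAX_WORDS = 40                    # same limit as `max` in the rule above


def long_sentences(text, max_words=MAX_WORDS):
    """Return (word_count, sentence) pairs for sentences over max_words."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        count = len(TOKEN.findall(sentence))
        if count > max_words:
            flagged.append((count, sentence))
    return flagged
```

A sentence with exactly 40 words passes; the 41st word triggers the alert. Note that this pattern counts a contraction such as "don't" as two tokens.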

The level is set to warning. Written text is complicated, and Vale will find false positives. There are no sure-fire ways of deciding when a rule should be a suggestion, warning, or an error. You will need to make decisions and learn as you go.

I suggest reviewing your rules over time to update and, in some cases, delete outdated rules. Initially, the Meilisearch docs didn’t have a rule on sentence length. When we added it, the maximum length of a sentence was 45 words. Now it’s 40, and we plan on bringing it down to 35.

You can also enable or disable specific rules within a style guide by adding them to .vale.ini:

Meilisearch.Headings = NO
Meilisearch.Spelling = NO
Meilisearch.Semicolons = NO

The above lines disable the Headings, Spelling, and Semicolons rules from the Meilisearch style guide.

Step 5: Run Vale

Now, run the following command in your console to lint your whole project:

vale .

Vale checks all your files against the rules in the style guide listed in BasedOnStyles. If it detects any issues, it displays suggestions, warnings, and errors on the console.

You can also lint individual files using:

vale {file_path}
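
If you want to call Vale from a script instead of typing the command, a thin Python wrapper might look like the sketch below. The function names are made up, and the --minAlertLevel flag is an assumption worth confirming with vale --help for your version.

```python
import subprocess
from typing import List, Optional, Sequence


def build_vale_command(
    paths: Sequence[str], min_alert_level: Optional[str] = None
) -> List[str]:
    """Build the argument list for linting the given files or folders."""
    command = ["vale"]
    if min_alert_level is not None:
        # Assumed CLI flag; it overrides MinAlertLevel from the config file.
        command.append(f"--minAlertLevel={min_alert_level}")
    command.extend(paths)
    return command


def run_vale(paths: Sequence[str], min_alert_level: Optional[str] = None) -> int:
    """Run Vale and return its exit code; alerts are printed by Vale itself."""
    completed = subprocess.run(
        build_vale_command(paths, min_alert_level), check=False
    )
    return completed.returncode


# Usage: run_vale(["."]) lints the whole project from the current folder.
```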

Step 6: Automate Vale checks

All the checks we’ve discussed so far run on your local files. Once you’re confident the rules work as intended, you can automate these checks using the Vale GitHub Action. In the Meilisearch documentation repository, we configured it to run on every pull request. As mentioned before, Vale may find false positives. Since you don’t want a PR blocked because of a false positive, I recommend starting with a few rules and tweaking them gradually to avoid failing checks.
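
One way to keep false positives from blocking a pull request is to let only errors fail the check while warnings and suggestions stay informational. The Python sketch below shows that decision in isolation. It assumes the report comes from running Vale with --output=JSON and that the JSON maps each file path to a list of alerts with a "Severity" field; verify both against the output of your Vale version. The sample report and function names are invented for the example.

```python
import json
from collections import Counter


def count_by_severity(vale_json: str) -> Counter:
    """Count alerts per severity in a JSON report (assumed shape, see above)."""
    report = json.loads(vale_json)  # assumed: {file path: [alert, ...]}
    return Counter(
        alert["Severity"] for alerts in report.values() for alert in alerts
    )


def should_block(counts: Counter, blocking=("error",)) -> bool:
    """Return True when any alert has a severity listed in blocking."""
    return any(counts[level] > 0 for level in blocking)


# Made-up report with one warning and one error.
SAMPLE = """
{"docs/intro.md": [
  {"Check": "Meilisearch.SentenceLength", "Severity": "warning", "Line": 12},
  {"Check": "Meilisearch.Spelling", "Severity": "error", "Line": 30}
]}
"""

# should_block(count_by_severity(SAMPLE)) is True because of the error.
```

Widening the blocking tuple to include "warning" later is a small change once the rules have settled.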

Conclusion

That is all, folks! I hope I was able to help you get started with Vale (and a style guide). This was a quick overview to introduce you to Vale’s features. Tweaking it to your needs takes time and many, many iterations.

Once configured, Vale can automate parts of the review process and allow you to focus on the parts of a text that computers are not very good at. At least not yet.

Oh, and if you’re curious, check out our style guide on GitHub to see how we use Vale!

Maryam Sulemani
