How this blog detects broken links - update

Posted by db on Mon 01 February 2021

This article is a follow-up to how this blog detects broken links, describing updated ways to use the markdown-link-check npm package.

Batch Operation

The previous article used a for loop and a short-circuit OR operator (||) with an exit command.

for file in  $(find ./content -name \*.md); do markdown-link-check --verbose "$file" || exit 1; done;

That exit command turned out to terminate the calling shell in some cases, which was undesirable on some machines. Instead, we switched to xargs and enforce one file per invocation with the --max-lines argument:

find content -name \*.md -print0 | xargs --null --max-lines=1 markdown-link-check --config .markdown-link-check.json --verbose

Using xargs changes the behavior in a few ways. First, xargs by default passes as many arguments as it can to a single markdown-link-check invocation, but the --max-lines=1 flag makes it run the command once per file. Second, xargs returns exit code 123 if any invocation exits with a non-zero status (1 to 125 with GNU xargs), which is fine in the CI pipeline. Lastly, xargs processes all arguments regardless of whether one invocation fails. This is desirable in our case because we need to check all links and not quit at the first failure.
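The exit-code and keep-going behavior can be tried without any markdown files. The following is a minimal sketch, assuming GNU xargs; fake_check is a hypothetical stand-in for markdown-link-check that fails for any file name containing "bad", and the three file names are made up for illustration.

```shell
# Hypothetical stand-in for markdown-link-check: it "fails" for any
# file name containing "bad" and succeeds for everything else.
fake_check='case "$1" in *bad*) echo "FAIL $1"; exit 1 ;; *) echo "ok $1" ;; esac'

# Feed three NUL-separated names to xargs, one invocation per name,
# using the same flags as the real command above.
status=0
output=$(printf '%s\0' a.md bad.md c.md |
    xargs --null --max-lines=1 sh -c "$fake_check" _) || status=$?

echo "$output"
echo "xargs exit status: $status"
```

Under those assumptions, c.md should still be checked after bad.md fails, and the overall status should be 123 rather than the 1 returned by the failing invocation.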

Ignore Rules

We have another update regarding Pelican's shorthand for local links: {filename} and {static}. These are not valid links, so we create a configuration file, .markdown-link-check.json, and use the ignorePatterns option to skip those special cases. Previously, we could use the raw { and } characters. Now, it looks like we have to use the URL percent-encoded forms %7B and %7D:

{
    "ignorePatterns": [
        {
            "pattern": "^({|%7B)filename(}|%7D)"
        },
        ....
    ]
}

For backwards compatibility, the raw and the percent-encoded form of each brace are combined in a regex alternation (|) group.
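A quick way to sanity-check such a pattern is to run it against sample link targets with grep. This is only a rough sketch: grep -E uses POSIX extended regular expressions rather than the JavaScript regex engine that markdown-link-check uses, so the braces are written as bracket expressions here, and the sample link targets are made up for illustration.

```shell
# ERE version of the ignore pattern. The braces are written as [{] and [}]
# because a bare { may be parsed as an interval operator by grep -E.
pattern='^([{]|%7B)filename([}]|%7D)'

# Two Pelican-style targets (raw and percent-encoded) and one ordinary URL.
matches=$(printf '%s\n' \
    '{filename}/pages/about.md' \
    '%7Bfilename%7D/pages/about.md' \
    'https://example.com/filename' |
    grep -E -c "$pattern")

echo "lines matching the ignore pattern: $matches"
```

Only the two Pelican-style targets should be counted; the ordinary URL does not start with a brace or its encoded form, so it would still be checked.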

tags: markdown, travis, bash