April 8, 2026

Learn Regex the Easy Way, Part 2: Anchors and Boundaries

Quick Recap #

In Part 1, we talked about what regular expressions are, why they look intimidating (they really do look like a cat sat on your keyboard), and why verbose mode with comments is the way to learn them. We also set up regex101.com in PCRE2 mode. Now let’s write some actual regex.

Anchors Match Positions, Not Characters #

Here’s the first counterintuitive thing about regex: not everything matches a character. Some symbols match a position in the text. These are called anchors.

Think of it this way. When you type the letter a in a regex, the engine finds the letter “a” in your text and consumes it. The match moves forward one character. Anchors don’t do that. They check “am I at a certain position?” and if yes, the match continues, but the engine doesn’t move forward. Nothing is consumed. No character is matched. Just a position.
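To see the "nothing is consumed" idea concretely, here is a minimal sketch in Python. Python's built-in re module is an assumption on my part (this series uses PCRE2 on regex101.com), but the two engines behave the same way for this simple case. The span of a match is its start and end index, so a zero-width match has the same number twice.

```python
import re

text = "cat"

# A literal character consumes text: the match runs from index 0 to index 1.
literal = re.search(r"c", text)
literal_span = literal.span()

# An anchor only checks a position: the match is empty, so start == end.
start_anchor = re.search(r"^", text)
end_anchor = re.search(r"$", text)

print(literal_span)               # (0, 1)
print(start_anchor.span())        # (0, 0)
print(end_anchor.span())          # (3, 3)
print(repr(start_anchor.group())) # ''
```

The anchors succeed, yet the matched text is an empty string. That is what "matching a position" looks like from the engine's point of view.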

The Line Anchors: `^` and `$` #

These two are the anchors you’ll use constantly:

^ matches the position at the beginning of a line
$ matches the position at the end of a line

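When you use ^ and $ together, you’re saying “match the entire line, start to finish. Nothing extra before, nothing extra after.” This per-line behavior depends on multiline mode (the m flag, which regex101.com normally has switched on); without it, ^ and $ anchor to the start and end of the whole text instead.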

Here’s a verbose regex that matches the exact words “cat”, “dog”, or “bird” as complete lines:

^                # Start of line
(cat|dog|bird)   # Match one of these words
$                # End of line

The compact version: ^(cat|dog|bird)$

MATCH THESE

cat

dog

bird

DO NOT MATCH THESE

catfish

hotdog

birds

the cat sat

Without the anchors, the pattern (cat|dog|bird) would also match the “cat” inside “catfish” and “the cat sat,” the “dog” inside “hotdog,” and the “bird” inside “birds.” The anchors prevent that by requiring the match to span the entire line.
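If you'd like to check this outside regex101.com, the following is a rough sketch using Python's re module (my choice for illustration, not something this series depends on). It assumes re.VERBOSE for the commented layout and re.MULTILINE so that ^ and $ work line by line. The test text is just the sample lines from above joined with newlines.

```python
import re

# Same verbose pattern as above.
# re.VERBOSE allows whitespace and # comments inside the pattern.
# re.MULTILINE makes ^ and $ match at every line, not only the whole text.
ANIMAL_LINE = re.compile(
    r"""
    ^                # Start of line
    (cat|dog|bird)   # Match one of these words
    $                # End of line
    """,
    re.VERBOSE | re.MULTILINE,
)

text = "cat\ndog\nbird\ncatfish\nhotdog\nbirds\nthe cat sat"

# With anchors: only the three complete lines match.
anchored = [m.group() for m in ANIMAL_LINE.finditer(text)]

# Without anchors: the words are also found inside longer lines.
unanchored = re.findall(r"cat|dog|bird", text)

print(anchored)         # ['cat', 'dog', 'bird']
print(len(unanchored))  # 7
```

The unanchored version finds seven hits because it also picks up the words buried in “catfish,” “hotdog,” “birds,” and “the cat sat.”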

Word Boundaries: `\b` #

Sometimes you don’t want to match an entire line. You just want to match a whole word. That’s what \b does. It matches the position where a word character sits on one side and a non-word character, or the start or end of the text, sits on the other. Think of it as the edge of a word.

A “word character” is any letter, digit, or underscore (the same characters matched by [[:word:]] or \w). Everything else is a “non-word character.” The boundary \b sits right at that edge.

\b     # Word boundary (edge of word)
cat    # The literal text "cat"
\b     # Word boundary (other edge)

Compact: \bcat\b

MATCH THESE

the cat sat on the mat (matches “cat”)

I saw a cat today (matches “cat”)

cat (matches “cat”)

DO NOT MATCH THESE

concatenate (no match)

category (no match)

catfish (no match)

scat (no match)

There’s also \B (capital B), which matches a position that is NOT a word boundary. You’ll rarely use it, but it exists.
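Here is a small sketch of both \b and \B, again using Python's re module as an assumed stand-in for PCRE2. One difference to be aware of: Python's \w covers Unicode letters by default, which doesn't matter for these ASCII samples. Note the r"..." raw strings; in an ordinary Python string, "\b" would mean a backspace character.

```python
import re

WHOLE_CAT = re.compile(r"\bcat\b")   # "cat" with a word edge on both sides
INNER_CAT = re.compile(r"\Bcat\B")   # "cat" with NO word edge on either side

samples = [
    "the cat sat on the mat",
    "I saw a cat today",
    "cat",
    "concatenate",
    "category",
    "catfish",
    "scat",
]

# True where "cat" appears as a whole word.
whole_word = {s: bool(WHOLE_CAT.search(s)) for s in samples}

# Where the word boundaries fall in a short text (indexes between characters).
boundary_positions = [m.start() for m in re.finditer(r"\b", "the cat")]

for sample, found in whole_word.items():
    print(f"{sample!r}: {found}")
print(boundary_positions)                        # [0, 3, 4, 7]
print(bool(INNER_CAT.search("concatenate")))     # True
print(bool(INNER_CAT.search("cat")))             # False
```

The boundary positions show that \b fires at the start and end of the text as well as on either side of the space, which is why a line containing only “cat” still matches.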

Practical Example: Matching `.txt` Filenames #

Let’s match filenames that end with .txt but not .txt.bak:

^                  # Start of line
[[:alnum:]_.-]+    # One or more valid filename characters
\.txt              # Literal ".txt" (dot is escaped)
$                  # End of line (nothing after .txt)

Compact: ^[[:alnum:]_.-]+\.txt$

MATCH THESE

report.txt

my-file.txt

data_2025.txt

DO NOT MATCH THESE

report.txt.bak

image.png

readme.md

The $ anchor at the end is doing the heavy lifting here. It ensures nothing comes after .txt, which is what excludes .txt.bak files. Open regex101.com, set it to PCRE2, and try this yourself.
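For readers who want to run the same check in a script, here is a sketch in Python. One loud caveat: Python's re module does not support POSIX classes such as [[:alnum:]], so this version swaps in [A-Za-z0-9_.-] as an ASCII-only approximation. The function name is_txt_filename is a hypothetical helper made up for this example.

```python
import re

# ASSUMPTION: [A-Za-z0-9_.-] stands in for the PCRE2 class [[:alnum:]_.-],
# because Python's re module has no POSIX bracket classes.
TXT_FILE = re.compile(
    r"""
    ^                  # Start of the text
    [A-Za-z0-9_.-]+    # One or more valid filename characters
    \.txt              # Literal ".txt" (dot is escaped)
    $                  # End of the text (nothing after .txt)
    """,
    re.VERBOSE,
)


def is_txt_filename(name: str) -> bool:
    """Return True if name looks like a .txt filename (hypothetical helper)."""
    return TXT_FILE.search(name) is not None


for name in ["report.txt", "my-file.txt", "data_2025.txt",
             "report.txt.bak", "image.png", "readme.md"]:
    print(f"{name}: {is_txt_filename(name)}")
```

Each name is tested on its own here, so multiline mode isn't needed. If you were scanning a whole file listing in one string, you would add re.MULTILINE so the anchors apply to each line.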

What to Practice #

Write a regex that matches lines containing only a single digit (0 through 9). Use anchors.
Write a regex that finds the word “error” as a whole word (not inside “errors” or “terrorism”). Use word boundaries.
Write a regex that matches filenames ending in .log but not .log.gz.
Write a regex that matches lines that are completely empty (hint: ^$).

Definitions #

Anchor — A regex element that matches a position in the text, not a character. Anchors don’t consume any characters; they just check “am I at this position?”
Line Anchor (^, $) — The caret ^ matches the start of a line; the dollar sign $ matches the end of a line. Together, they require the pattern to span the entire line.
Metacharacter — A character that has special meaning in regex instead of representing itself literally. Examples include ., *, ^, $, and \.
Position (in regex context) — A point between characters (or before the first/after the last character) in the text. Anchors and boundaries match positions, not characters.
Word Boundary (\b) — Matches the position between a word character (letter, digit, or underscore) and a non-word character. Think of it as the edge of a word.

Series Navigation #

Part 1: Make Regular Expressions the Easy Way
Part 2: Anchors and Boundaries (this post)
