Learn Regex the Easy Way, Part 6: Alternation and Grouping

Quick Recap #

In Part 5, we covered metacharacters, escaping with backslash, and the dot wildcard. We learned the 14 special characters and how to match them literally. Now let’s combine patterns with “or” logic and control what our quantifiers apply to.

The Pipe Means OR #

The pipe character | means “or” in regex. cat|dog matches either “cat” or “dog.”
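To try this out, here is a minimal sketch using Python's built-in `re` module. The choice of Python is an assumption (this series does not tie itself to one engine), and `matches_whole` is a hypothetical helper name used only for illustration.

```python
import re

# Assumption: Python's re module stands in for whatever regex engine you use.
pattern = re.compile(r"cat|dog")

def matches_whole(text):
    """Return True when the entire string is exactly one of the alternatives."""
    return pattern.fullmatch(text) is not None

print(matches_whole("cat"))   # True
print(matches_whole("dog"))   # True
print(matches_whole("bird"))  # False

# findall scans the text and returns every non-overlapping match.
print(pattern.findall("my cat chased a dog"))  # ['cat', 'dog']
```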

Grouping with Parentheses #

Here’s where it gets critical. The pipe has the lowest precedence of any regex operator, so without parentheses it splits the entire pattern into alternatives:

cat|dog food    # Matches "cat" OR "dog food"
                # NOT "cat food" or "dog food"

With parentheses, you limit the scope:

(cat|dog)       # Either "cat" or "dog"
[[:blank:]]     # A space (or tab)
food            # Literal "food"

Compact: (cat|dog) food

MATCH THESE

  • cat food
  • dog food

DO NOT MATCH THESE

  • bird food
  • cat
  • dog
  • food

Multiple Alternatives #

You can have as many alternatives as you need: (red|green|blue|yellow).
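When the list of alternatives lives in code, one possible approach is to build the group from a list. This sketch assumes Python's `re` module, and the variable names are illustrative. `re.escape` is used so that any metacharacter in an alternative would be matched literally.

```python
import re

colors = ["red", "green", "blue", "yellow"]

# Join the escaped alternatives with "|" and wrap them in one group.
source = "(" + "|".join(re.escape(c) for c in colors) + ")"
pattern = re.compile(source)

print(source)                                           # (red|green|blue|yellow)
print(pattern.findall("a red door and a blue window"))  # ['red', 'blue']
```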

Grouping for Quantifiers #

Parentheses also let you apply quantifiers to multi-character sequences:

(ha)+           # One or more "ha"
                # Matches: ha, haha, hahaha
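A short sketch, again assuming Python's `re` module, shows what the group changes. Without parentheses, the `+` in `ha+` applies only to the preceding `a`.

```python
import re

grouped = re.compile(r"(ha)+")   # "+" repeats the whole "ha" sequence
ungrouped = re.compile(r"ha+")   # "+" repeats only the "a"

def is_whole_match(pattern, text):
    """Return True when the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None

for text in ["ha", "haha", "hahaha", "hah", "haaa"]:
    print(text, is_whole_match(grouped, text), is_whole_match(ungrouped, text))
# ha True True
# haha True False
# hahaha True False
# hah False False
# haaa False True
```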

Non-Capturing Groups #

Regular parentheses create a “capturing group” (we’ll cover capturing in Part 7). If you just want grouping without capturing, use (?:...). The ?: says “group this but don’t save it.”

(?:cat|dog)     # Group without capturing
[[:blank:]]     # Space (or tab)
food            # Literal "food"

Compact: (?:cat|dog) food

This behaves identically for matching. The difference only matters when you need capture groups (Part 7).
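To see that both forms accept the same strings, here is a small check in Python's `re` module (an assumed engine, with a literal space standing in for `[[:blank:]]`). The `groups` attribute of a compiled pattern reports how many capturing groups it contains.

```python
import re

capturing = re.compile(r"(cat|dog) food")
non_capturing = re.compile(r"(?:cat|dog) food")

samples = ["cat food", "dog food", "bird food", "cat", "dog", "food"]

# Both patterns should agree on every sample string.
same_results = all(
    bool(capturing.fullmatch(t)) == bool(non_capturing.fullmatch(t))
    for t in samples
)
print(same_results)          # True

# Only the plain parentheses save what they matched.
print(capturing.groups)      # 1
print(non_capturing.groups)  # 0
```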

Practical Example: Image File Extensions #

^               # Start of line
[[:alnum:]_.-]+ # Filename characters
\.              # Literal dot
(?:jpg|jpeg|png|gif|webp)  # Image extensions
$               # End of line

Compact: ^[[:alnum:]_.-]+\.(?:jpg|jpeg|png|gif|webp)$

MATCH THESE

  • photo.jpg
  • logo.png
  • banner.webp

DO NOT MATCH THESE

  • document.pdf
  • photo.jpg.bak
  • script.js
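The lists above can be verified with a short script. This sketch assumes Python's `re` module, which does not support POSIX classes such as `[[:alnum:]]`, so the ASCII range `A-Za-z0-9` is substituted. That substitution and the helper name `is_image_filename` are assumptions, not part of the pattern as written above.

```python
import re

# Assumption: A-Za-z0-9 approximates [[:alnum:]] for ASCII filenames.
image_name = re.compile(r"^[A-Za-z0-9_.-]+\.(?:jpg|jpeg|png|gif|webp)$")

def is_image_filename(name):
    """Return True when the name ends in one of the listed image extensions."""
    return image_name.search(name) is not None

for name in ["photo.jpg", "logo.png", "banner.webp"]:
    print(name, is_image_filename(name))   # all True

for name in ["document.pdf", "photo.jpg.bak", "script.js"]:
    print(name, is_image_filename(name))   # all False
```

Note that `photo.jpg.bak` is rejected because the `$` anchor requires the extension to be the last thing in the string.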

What to Practice #

  1. Write a regex that matches “http” or “https” (hint: https? is one approach, http|https is another).
  2. Write a regex matching US states: “Texas”, “California”, or “New York” as whole words.
  3. Write a regex for “ha”, “haha”, “hahaha” using (ha)+ with anchors.
  4. What is the difference between [cat] and (cat)? Write a sentence explaining it.
