Regular Expressions
Ripgrep uses Rust's regex crate by default, a finite-automata engine built on the regex-automata library. Its search time grows linearly with the size of the input, which gives excellent performance for most patterns while supporting a rich set of regex features.
Basic Metacharacters
# . matches any character (except newline)
rg "error." # Matches "error1", "errors", "error!"
# * means zero or more of the preceding element
rg "lo*p" # Matches "lp", "loop", "looop"
# + means one or more of the preceding element
rg "lo+p" # Matches "loop", "looop" but not "lp"
# ? means zero or one of the preceding element
rg "colou?r" # Matches "color" or "colour"
# ^ matches start of line
rg "^TODO" # Matches lines starting with "TODO"
# $ matches end of line
rg "error$" # Matches lines ending with "error"
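The quantifier behavior above can be checked outside ripgrep. The sketch below uses Python's re module as a stand-in, on the assumption that these basic metacharacters behave the same way there; the sample words are invented for illustration.

```python
import re

# Sample inputs, invented for illustration.
words = ["lp", "lop", "loop", "lap"]

# Keep the words that each pattern matches in full.
star = [w for w in words if re.fullmatch(r"lo*p", w)]  # zero or more "o"
plus = [w for w in words if re.fullmatch(r"lo+p", w)]  # one or more "o"
opt = [w for w in words if re.fullmatch(r"lo?p", w)]   # zero or one "o"

# "." matches any character, so escape it to require a literal period.
any_char = bool(re.search(r"error.", "errors"))      # "." matches the "s"
literal_dot = bool(re.search(r"error\.", "errors"))  # no "." in the input

print(star, plus, opt, any_char, literal_dot)
```

The escaped form `error\.` is the one to reach for when the period itself is part of what you are searching for.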
Character Classes
# [abc] matches any single character a, b, or c
rg "[aeiou]" # Matches any vowel
# [a-z] matches any character in the range
rg "[0-9]+" # Matches one or more digits
rg "[a-zA-Z]+" # Matches alphabetic words
# [^abc] matches any character except a, b, or c
rg "[^0-9]" # Matches any non-digit character
Character Class Negation
The ^ character only means negation when it's the first character inside square brackets. [^0-9] means "not a digit", but [0-9^] means "a digit or a caret character".
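A quick way to see the two meanings of ^ inside brackets is to run both classes over the same text. This is a minimal sketch using Python's re module as a stand-in (bracket negation works the same way there); the input string is made up.

```python
import re

text = "a1^b2"

# ^ as the first character negates the class: everything that is not a digit.
non_digits = re.findall(r"[^0-9]", text)

# ^ anywhere else is a literal caret: a digit or "^".
digits_or_caret = re.findall(r"[0-9^]", text)

print(non_digits)
print(digits_or_caret)
```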
Prefer Predefined Classes
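Use predefined classes like \d and \w instead of spelling out ranges for better readability, but keep in mind that they are not exact synonyms for [0-9] and [a-zA-Z0-9_].
Note on Unicode support: In the default regex engine, \d, \w, and \s are Unicode-aware, so \d also matches decimal digits from other scripts (for example "৪২"), not only [0-9]. To restrict a class to ASCII, pass --no-unicode or turn Unicode mode off inline with (?-u:\d). \p{N} is broader than \d: it also covers number characters such as "½" and "Ⅳ". PCRE2 may have different Unicode handling.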
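The difference between Unicode-aware and ASCII-only digit matching can be sketched with Python's re module, used here only as a stand-in: its default mode for text is Unicode-aware and its re.ASCII flag restricts \d to [0-9]. Which of the two modes applies in ripgrep depends on the engine and on Unicode settings such as --no-unicode, so treat this as an illustration of the concept rather than of ripgrep's output.

```python
import re

text = "42 and ৪২"  # ASCII digits followed by Bengali digits

unicode_digits = re.findall(r"\d+", text)          # Unicode-aware \d
ascii_digits = re.findall(r"\d+", text, re.ASCII)  # \d limited to [0-9]
explicit_range = re.findall(r"[0-9]+", text)       # always ASCII only

print(unicode_digits, ascii_digits, explicit_range)
```

When only ASCII digits should match, the explicit range [0-9] says so regardless of the engine's Unicode mode.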
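Predefined Character Classes
# \d matches any decimal digit (Unicode-aware by default; only [0-9] with --no-unicode)
rg "\d+" # Matches numbers like "42", "123"
# \w matches word characters: letters, digits, and underscore (Unicode-aware by default)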
rg "\w+" # Matches words
# \s matches whitespace characters (space, tab, newline)
rg "\s+error" # Matches "error" with leading whitespace
# \D matches non-digits
# \W matches non-word characters
# \S matches non-whitespace
Unicode Character Classes
Ripgrep supports Unicode character classes using the \p{} syntax:
# \p{L} matches any Unicode letter
rg "\p{L}+" # Matches words in any language
# \p{N} matches any Unicode number
rg "\p{N}+" # Matches numeric characters
# \p{P} matches any Unicode punctuation
rg "\p{P}" # Matches punctuation marks
# \p{Ll} matches lowercase letters
rg "\p{Ll}+" # Matches lowercase words
# \p{Lu} matches uppercase letters
rg "\p{Lu}+" # Matches uppercase words
# \P{} negates the class (matches anything except)
rg "\P{L}+" # Matches non-letter characters
Common Unicode categories:
| Pattern | Description | Example |
|---|---|---|
| \p{L} | Any letter | Matches "café", "日本", "hello" |
| \p{N} | Any number | Matches "42", "৪২" (Bengali) |
| \p{P} | Any punctuation | Matches ".", "!", "?" |
| \p{S} | Any symbol | Matches "$", "€", "©" |
| \p{Ll} | Lowercase letter | Matches "abc", "ñ" |
| \p{Lu} | Uppercase letter | Matches "ABC", "Ñ" |
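Each category name is a two-letter Unicode general category whose first letter is the major class, which is why \p{L} covers \p{Ll}, \p{Lu}, and the other letter categories. Python's re module has no \p{...} syntax, so this sketch looks the categories up directly with the standard unicodedata module; the sample characters are chosen for illustration.

```python
import unicodedata

samples = ["é", "Ñ", "日", "৪", "!", "€"]

# Map each character to its two-letter general category.
categories = {ch: unicodedata.category(ch) for ch in samples}

# Characters that a class like \p{L} would cover: any category starting with "L".
letters = [ch for ch in samples if unicodedata.category(ch).startswith("L")]

print(categories)
print(letters)
```

Looking up a character this way is a handy check when a \p{...} class matches more or less than expected.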
Unicode scripts allow matching specific writing systems:
# \p{Greek} matches Greek script characters
rg "\p{Greek}+" # Matches "Ελληνικά", "Ω"
# \p{Han} matches Chinese/Japanese/Korean Han characters
rg "\p{Han}+" # Matches "日本語", "中文"
# \p{Arabic} matches Arabic script
rg "\p{Arabic}+" # Matches "العربية"
# \p{Cyrillic} matches Cyrillic script
rg "\p{Cyrillic}+" # Matches "Русский"
Quantifiers
# {n} matches exactly n times
rg "[0-9]{3}" # Matches exactly 3 digits like "123"
# {n,} matches n or more times
rg "[a-z]{5,}" # Matches runs of 5 or more lowercase letters
# {n,m} matches between n and m times
rg "[0-9]{2,4}" # Matches 2-4 digits like "42", "123", "1234"
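A counted quantifier limits how much one match consumes, not how long the surrounding run may be. The sketch below uses Python's re module as a stand-in (an assumption: {n,m} and \b behave the same for this input) to show the effect and how word boundaries tighten it; the sample text is invented.

```python
import re

text = "7 42 12345"

# Without boundaries, the first four digits of "12345" still count as a match.
unanchored = re.findall(r"[0-9]{2,4}", text)

# With \b on both sides, only whole runs of 2 to 4 digits match.
bounded = re.findall(r"\b[0-9]{2,4}\b", text)

print(unanchored)
print(bounded)
```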
Greedy vs Non-Greedy Quantifiers
By default, quantifiers (*, +, ?, {n,m}) are greedy - they match as much text as possible. Add ? after the quantifier to make it non-greedy (match as little as possible):
- .* matches as much as possible (greedy)
- .*? matches as little as possible (non-greedy)
Example: In "<div>text</div>", the pattern <.*> matches the entire string, but <.*?> matches just <div>.
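The example can be reproduced with a short script. This sketch uses Python's re module as a stand-in, assuming greedy and non-greedy quantifiers behave identically for this input.

```python
import re

html = "<div>text</div>"

greedy = re.search(r"<.*>", html).group()  # runs to the last ">"
lazy = re.search(r"<.*?>", html).group()   # stops at the first ">"
all_lazy = re.findall(r"<.*?>", html)      # every tag, one match each

print(greedy)
print(lazy)
print(all_lazy)
```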
Groups and Alternation
# () creates a capturing group
rg "(error|warning): (.+)" # Captures error/warning and message
# Named capture groups with (?P<name>...)
rg "(?P<level>error|warning): (?P<msg>.+)" # Named captures for clarity
# | means "or"
rg "error|warning" # Matches either "error" or "warning"
# Non-capturing groups with (?:)
rg "(?:http|https)://\S+" # Matches URLs
Named capture groups are especially useful with replacement operations (-r/--replace, see the Replacements chapter), where you can refer to captures by name, such as $level, instead of by number.
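To see what the named groups capture, here is a minimal sketch in Python's re module, which accepts the same (?P<name>...) syntax. The log line is invented, and the \g<name> replacement syntax is Python's own; ripgrep's replacement syntax is covered in the Replacements chapter.

```python
import re

pattern = r"(?P<level>error|warning): (?P<msg>.+)"
line = "warning: disk almost full"  # hypothetical log line

match = re.search(pattern, line)
by_name = match.groupdict()                     # captures keyed by group name
by_number = (match.group(1), match.group(2))    # the same captures by position

# Rebuild the line from the named captures (Python replacement syntax).
rewritten = re.sub(pattern, r"[\g<level>] \g<msg>", line)

print(by_name)
print(rewritten)
```

Referring to groups by name keeps a replacement readable when the pattern later gains or loses a group.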
Common Pattern Examples
Real-World Patterns
These patterns are useful for searching codebases and logs:
# Email addresses
rg "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" # (1)
# IP addresses (simplified)
rg "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b" # (2)
# Hexadecimal colors
rg "#[0-9a-fA-F]{6}" # (3)
# Function calls (basic)
rg "\w+\([^)]*\)" # (4)
# URLs
rg "https?://[^\s]+" # (5)
1. \b ensures word boundaries; + matches one or more allowed email characters. The top-level domain is [A-Za-z]{2,}; a | inside brackets would be a literal pipe, not alternation.
2. Matches four groups of 1-3 digits separated by dots (does not check that each group is in the range 0-255).
3. # followed by exactly 6 hexadecimal digits, for colors like #FF5733.
4. \w+ matches the function name; [^)]* matches any characters except a closing parenthesis.
5. ? makes the "s" in https optional; [^\s]+ matches any run of non-whitespace.
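Because the simplified IP pattern accepts out-of-range groups such as 999, one option is to treat regex hits as candidates and validate them in a second step. The sketch below does this with Python's standard ipaddress module; the helper name valid_ipv4 and the sample log line are hypothetical.

```python
import ipaddress
import re

# Same simplified pattern as above; re.ASCII keeps \d to [0-9].
IP_CANDIDATE = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b", re.ASCII)


def valid_ipv4(text):
    """Return the candidates in text that parse as real IPv4 addresses."""
    found = []
    for candidate in IP_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)  # raises ValueError if invalid
        except ValueError:
            continue
        found.append(candidate)
    return found


log = "ok 192.168.1.10, bad 999.1.1.1, ok 10.0.0.1"  # hypothetical input
addresses = valid_ipv4(log)
print(addresses)
```

Keeping the regex loose and validating afterwards is usually easier to read than encoding the 0-255 range in the pattern itself.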
Regex Engine Selection
Ripgrep supports multiple regex engines that you can select using the --engine flag:
# Use default engine (Rust regex-automata, DFA-based)
rg pattern # Implicit default
rg --engine default pattern
# Use PCRE2 engine (same as -P flag)
rg --engine pcre2 pattern
rg -P pattern # Shorthand
# Auto-select engine based on pattern
rg --engine auto pattern
rg --auto-hybrid-regex pattern # DEPRECATED: Use --engine auto instead
When to use different engines:
- Default (regex-automata): Fast, efficient for most patterns. Use for general searches.
- PCRE2: Supports advanced features not in default engine. Use when you need backreferences, lookahead/lookbehind, or other Perl-compatible features.
- Auto: ripgrep uses the default engine whenever the pattern compiles with it and switches to PCRE2 otherwise.
flowchart TD
Start[Need Regex Pattern] --> Check{"Pattern<br/>Requirements?"}
Check -->|Basic matching, Character classes, Quantifiers| Default["Use Default Engine<br/>Fast & Efficient"]
Check -->|Backreferences, Lookahead/behind, Advanced features| PCRE["Use PCRE2 Engine<br/>-P flag"]
Check -->|Not sure| Auto["Use Auto Mode<br/>--engine auto"]
Default --> Search[Execute Search]
PCRE --> Search
Auto --> Detect{"Auto-detect<br/>Pattern Type"}
Detect -->|Simple| UseDefault[Select Default]
Detect -->|Complex| UsePCRE[Select PCRE2]
UseDefault --> Search
UsePCRE --> Search
style Default fill:#e8f5e9
style PCRE fill:#fff3e0
style Auto fill:#e1f5ff
style Search fill:#f3e5f5
Figure: Decision flow for selecting the appropriate regex engine based on pattern requirements.
Auto Engine Selection
Use --engine auto when you're not sure which engine to use. Patterns that need PCRE2-only features then work without -P, provided your ripgrep build includes PCRE2.
Deprecated Flag
The --auto-hybrid-regex flag is deprecated. Use --engine auto instead for future compatibility.
PCRE2 Engine
Use the -P or --pcre2 flag to enable the Perl-compatible regex engine, which supports advanced features not available in the default engine. PCRE2 support is optional at build time; rg --pcre2-version reports whether your build includes it.
Features available only with PCRE2:
# Single quotes stop interactive shells from treating "!" in the pattern as history expansion
# Positive lookahead (?=...)
rg -P 'error(?=:)' # Matches "error" only if followed by ":"
# Negative lookahead (?!...)
rg -P 'test(?!ing)' # Matches "test" unless it is followed by "ing"
# Positive lookbehind (?<=...)
rg -P '(?<=@)\w+' # Matches the word right after "@", such as the domain in an email address
# Negative lookbehind (?<!...)
rg -P '(?<!un)happy' # Matches "happy" unless it is preceded by "un"
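Look-around assertions check context without consuming it, which is easiest to see from where the matches land. This sketch uses Python's re module as a stand-in, since it supports the same four assertions; the sample strings are invented.

```python
import re

# Negative lookahead: "test" is rejected only where "ing" follows.
lookahead_starts = [m.start() for m in re.finditer(r"test(?!ing)", "test testing tested")]

# Positive lookbehind: the "@" is required but is not part of the match.
after_at = re.findall(r"(?<=@)\w+", "user@example.com")

# Negative lookbehind: "happy" is rejected where "un" precedes it.
lookbehind_starts = [m.start() for m in re.finditer(r"(?<!un)happy", "happy unhappy")]

print(lookahead_starts, after_at, lookbehind_starts)
```

Note that test(?!ing) still matches inside "tested": the assertion rules out one continuation, not every longer word.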
Performance Tradeoffs
- PCRE2 is more powerful but typically slower than the default engine
- Use PCRE2 only when you need its specific features
- The default engine is optimized for speed and handles most use cases
Default Engine Limitations
The default Rust regex engine provides excellent performance but does not support some advanced features:
Not supported in default engine:
- Backreferences (\1, \2, etc.)
- Lookahead and lookbehind assertions
- Possessive quantifiers (*+, ++, etc.)
- Atomic groups ((?>...))
If you need these features, use the PCRE2 engine with -P:
# This pattern requires PCRE2 for backreferences
rg -P "(\w+)\s+\1"
# This pattern requires PCRE2 for lookahead
rg -P "error(?=:)"
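One caveat with the backreference pattern: without word boundaries it can match inside words, so "this is" counts as a repeat because "is" ends "this". The sketch below shows the effect with Python's re module as a stand-in (backreferences work the same way there); the variant with \b is a suggested tightening, not something the pattern above already includes.

```python
import re

loose = r"(\w+)\s+\1"
strict = r"\b(\w+)\s+\1\b"  # suggested variant with word boundaries

# The loose pattern finds "is is" inside "this is".
loose_hit = re.search(loose, "this is a test").group()

# The strict pattern ignores that, but still finds a real repeated word.
strict_miss = re.search(strict, "this is a test")
strict_hit = re.search(strict, "this is is a test").group()

print(repr(loose_hit), strict_miss, repr(strict_hit))
```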
Troubleshooting Regex Patterns
If your regex pattern isn't working as expected and uses backreferences or lookahead/lookbehind, try adding the -P flag to enable PCRE2.
Multiline Patterns
For patterns that span multiple lines, ripgrep provides the -U (--multiline) flag. Add --multiline-dotall to make . match newlines as well. See the Advanced Patterns chapter for details.
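The role of a "dot matches newline" switch can be sketched with Python's re module, where the re.DOTALL flag plays a part comparable to --multiline-dotall. This is an analogy under that assumption, not a reproduction of ripgrep's behavior; the input text is invented.

```python
import re

text = "fn main() {\n    body\n}"

# "." stops at newlines by default, so the braces on different lines never pair up.
without_dotall = re.search(r"\{.*\}", text)

# With DOTALL, "." also matches "\n" and the match spans all three lines.
with_dotall = re.search(r"\{.*\}", text, re.DOTALL).group()

# Alternatively, match the newlines explicitly and leave "." unchanged.
explicit_newlines = re.search(r"\{\n.*\n\}", text).group()

print(without_dotall, repr(with_dotall), repr(explicit_newlines))
```

Matching newlines explicitly keeps . confined to a single line, which limits how far a .* can run.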