The core building blocks of regex
A regex pattern is made up of literals and metacharacters. Literals match themselves: the pattern cat matches the characters 'cat' wherever they appear, including inside a longer word such as 'concatenate'. Metacharacters have special meaning and let you build flexible patterns.
The dot (.) matches any single character except a newline. The asterisk (*) means 'zero or more of the previous item'. The plus (+) means 'one or more'. The question mark (?) means 'zero or one', which makes the previous item optional. The dot and these three quantifiers are the first metacharacters to learn.
Character classes let you match one character from a set. [abc] matches any one of a, b, or c. [a-z] matches any lowercase letter from a to z. [0-9] matches any digit. Negated classes like [^abc] match any single character except those listed.
- . — any single character except a newline
- * — zero or more of the previous item
- + — one or more of the previous item
- ? — zero or one (makes previous item optional)
- [abc] — character class (matches a, b, or c)
- ^ — start of string (or negation inside [])
- $ — end of string
- \d — any digit (shorthand for [0-9])
- \w — any word character (letters, digits, underscore)
- \s — any whitespace character
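To see these building blocks in action, here is a minimal TypeScript sketch. The sample inputs and the names basicsChecks and basicsResults are made up for illustration; each row pairs a pattern with an input and the result you should expect.

```typescript
// Each entry: [pattern, input, whether the pattern should match the input].
// The inputs are illustrative, not taken from a real data set.
const basicsChecks: Array<[RegExp, string, boolean]> = [
  [/c.t/, "cat", true],              // . stands in for any one character
  [/c.t/, "ct", false],              // . must consume exactly one character
  [/ab*c/, "ac", true],              // * allows zero b's
  [/ab*c/, "abbbc", true],           // ...or several
  [/ab+c/, "ac", false],             // + needs at least one b
  [/colou?r/, "color", true],        // ? makes the u optional
  [/colou?r/, "colour", true],
  [/[a-z]/, "7", false],             // class of lowercase letters only
  [/[^abc]/, "b", false],            // negated class rejects a, b, and c
  [/\d\d/, "42", true],              // two digits in a row
  [/\w+\s\w+/, "hello world", true], // word, whitespace, word
];

// Run every pattern against its input.
const basicsResults: boolean[] = basicsChecks.map(([pattern, input]) => pattern.test(input));

basicsChecks.forEach(([pattern, input, expected], i) => {
  console.log(`${pattern} on "${input}": ${basicsResults[i]} (expected ${expected})`);
});
```

None of these patterns is anchored, so test() reports a match if the pattern is found anywhere in the input.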
Anchors, groups, and alternation
Anchors don't match characters; they match positions. ^ matches the start of a string and $ matches the end. So ^hello matches 'hello world' but not 'say hello'. Using both, as in ^hello$, matches only the string 'hello' with nothing else. With the multiline flag, ^ and $ also match at the start and end of each line.
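The anchor examples can be checked with a short TypeScript sketch. The variable names are illustrative, and the unanchored pattern is added only for contrast.

```typescript
const startsWithHello = /^hello/;  // anchored at the start only
const exactlyHello = /^hello$/;    // anchored at both ends
const unanchoredHello = /hello/;   // no anchors, for comparison

const anchorResults = {
  startAnchoredOnPrefix: startsWithHello.test("hello world"), // true
  startAnchoredOnSuffix: startsWithHello.test("say hello"),   // false
  unanchoredOnSuffix: unanchoredHello.test("say hello"),      // true
  exactOnExact: exactlyHello.test("hello"),                   // true
  exactOnLonger: exactlyHello.test("hello world"),            // false
};

console.log(anchorResults);
```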
Parentheses create groups: (abc)+ matches 'abc', 'abcabc', 'abcabcabc', and so on. Groups also capture the matched text, which you can reference in replacements or extract in code.
The pipe | means 'or'. cat|dog matches 'cat' or 'dog'. Combined with groups: (cat|dog)s matches 'cats' or 'dogs'.
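Here is a small TypeScript sketch of groups and alternation working together. The sample sentences are made up, and the first pattern adds ^ and $ (not in the text above) so that a partial repetition is rejected.

```typescript
const repeatedGroup = /^(abc)+$/; // anchors added so the whole input must be repetitions of abc
const petPlural = /(cat|dog)s/;

const repeatedOk = repeatedGroup.test("abcabc"); // true: the group repeats twice
const repeatedBad = repeatedGroup.test("abcab"); // false: the last repetition is incomplete

// match() returns null when nothing matches, so guard before indexing.
const petMatch = "I like dogs".match(petPlural);
const wholeMatch = petMatch ? petMatch[0] : null;   // "dogs" (the full match)
const capturedPet = petMatch ? petMatch[1] : null;  // "dog" (the first capture group)

// In a replacement string, $1 refers to the text captured by group 1.
const singularized = "cats and dogs".replace(/(cat|dog)s/g, "$1"); // "cat and dog"

console.log(repeatedOk, repeatedBad, wholeMatch, capturedPet, singularized);
```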
Practical regex patterns you'll actually use
Regex is most useful for validation and text extraction. Here are patterns that come up constantly in real projects.
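Email validation is a classic use case. A simple pattern like ^[\w.-]+@[\w-]+\.[a-z]{2,}$ accepts common addresses such as user@example.com without being overly strict. It does reject some valid addresses, including those with a + in the local part or a subdomain after the @ (user@mail.example.com), and it needs the case-insensitive flag to accept an uppercase domain ending. Perfect email validation via regex is famously complex, so for most purposes a reasonable approximation is fine.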
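To get a feel for where this approximation draws the line, here is a TypeScript sketch that runs the pattern over a few made-up addresses. The last two samples are valid addresses that this simple pattern turns away.

```typescript
const emailPattern = /^[\w.-]+@[\w-]+\.[a-z]{2,}$/;

// Illustrative sample addresses, not real ones.
const emailSamples: string[] = [
  "user@example.com",        // accepted
  "first.last@my-site.org",  // accepted: dots and hyphens are allowed
  "no-at-sign.com",          // rejected: no @
  "user@example",            // rejected: no dot followed by a domain ending
  "user@mail.example.com",   // rejected, although valid: subdomain after the @
  "user+tag@example.com",    // rejected, although valid: + is not in the class
];

const emailVerdicts: boolean[] = emailSamples.map((address) => emailPattern.test(address));

emailSamples.forEach((address, i) => {
  console.log(`${address}: ${emailVerdicts[i] ? "accepted" : "rejected"}`);
});
```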
For extracting data from text such as log files, HTML, or structured strings, regex combined with capture groups lets you pull out specific fields. A pattern like (\d{4})-(\d{2})-(\d{2}) extracts year, month, and day as capture groups 1, 2, and 3 from a date string like '2026-01-10'.
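In code, the extraction might look like the TypeScript sketch below. The log line and the names logLine and dateParts are hypothetical; only the pattern comes from the text above.

```typescript
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// Hypothetical log line containing a date somewhere in the middle.
const logLine = "Backup finished on 2026-01-10 without errors";

const dateMatch = logLine.match(datePattern);

// Index 0 is the whole match; indexes 1 to 3 are the capture groups.
const dateParts = dateMatch
  ? {
      year: Number(dateMatch[1]),  // 2026
      month: Number(dateMatch[2]), // 1
      day: Number(dateMatch[3]),   // 10
    }
  : null;

console.log(dateParts);
```

Because the pattern is not anchored, it finds the date wherever it sits in the line. The null check matters: match() returns null when the text contains no date.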
- Email: ^[\w.-]+@[\w-]+\.[a-z]{2,}$
- URL: https?:\/\/[\w-]+(\.[\w-]+)+(\/[\S]*)?
- US phone: ^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$
- IPv4: ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
- Date YYYY-MM-DD: ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
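As a rough check of the patterns listed above, the TypeScript sketch below runs four of them against made-up inputs. The variable names and sample values are illustrative. Note the last date check: the pattern validates the shape of a date, not the calendar, so February 31 still passes.

```typescript
const ipv4Pattern = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;
const phonePattern = /^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$/;
const isoDatePattern = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;
const urlPattern = /https?:\/\/[\w-]+(\.[\w-]+)+(\/[\S]*)?/;

const patternChecks = {
  ipv4Valid: ipv4Pattern.test("192.168.0.1"),         // true
  ipv4OutOfRange: ipv4Pattern.test("256.1.1.1"),      // false: 256 exceeds 255
  isoDateValid: isoDatePattern.test("2026-01-10"),    // true
  isoDateBadMonth: isoDatePattern.test("2026-13-01"), // false: no month 13
  isoDateFeb31: isoDatePattern.test("2026-02-31"),    // true: month lengths are not checked
};

// The phone pattern captures area code, prefix, and line number as groups 1 to 3.
const phoneMatch = "(555) 123-4567".match(phonePattern);
const phoneDigits = phoneMatch ? phoneMatch.slice(1, 4).join("") : null; // "5551234567"

// The URL pattern is unanchored, so it can find a URL inside a sentence.
const urlMatch = "See https://example.com/docs?page=2 for details".match(urlPattern);
const foundUrl = urlMatch ? urlMatch[0] : null; // "https://example.com/docs?page=2"

console.log(patternChecks, phoneDigits, foundUrl);
```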
How to test regex online
Writing regex without a live tester is painful. An online regex tester shows you which parts of your test string match as you type, highlights capture groups, and tells you immediately when a pattern change breaks something.
The Irreva Regex Tester supports JavaScript regex syntax, lets you set flags (global, case-insensitive, multiline), and shows all matches highlighted in your input text. You can also see capture groups extracted separately.
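The same three flags exist in JavaScript code as the letters g, i, and m after the closing slash. The TypeScript sketch below shows how each one changes the result; the sample text is made up for illustration.

```typescript
// Three lines of illustrative text.
const flagSample = "Cat sat\ncat nap\nconcat CAT";

// No flags: only the first case-sensitive match is returned.
const firstOnly = flagSample.match(/cat/);
const firstIndex = firstOnly ? firstOnly.index : undefined; // 8 (start of line two)

// g (global): every case-sensitive match.
const lowerCount = (flagSample.match(/cat/g) || []).length;   // 2: "cat" and the end of "concat"

// g + i (case-insensitive): every match regardless of case.
const anyCaseCount = (flagSample.match(/cat/gi) || []).length; // 4

// Without m, ^ matches only at the very start of the string.
const stringStartCount = (flagSample.match(/^cat/gi) || []).length; // 1

// With m (multiline), ^ also matches at the start of each line.
const lineStartCount = (flagSample.match(/^cat/gim) || []).length;  // 2

console.log(firstIndex, lowerCount, anyCaseCount, stringStartCount, lineStartCount);
```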
When writing a new pattern, start with a handful of test strings that should match and a few that shouldn't. Build the pattern piece by piece and verify each component before combining them.
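The same workflow can be sketched in code. In the TypeScript example below, the helper failingCases and the PatternCase type are hypothetical names invented for this illustration. The helper takes a pattern plus a list of test strings and returns the ones that behave unexpectedly, so you can tighten a draft pattern until the list is empty.

```typescript
interface PatternCase {
  input: string;
  shouldMatch: boolean;
}

// Hypothetical helper: returns every case where the pattern's verdict
// differs from what the case expects.
function failingCases(pattern: RegExp, cases: PatternCase[]): PatternCase[] {
  return cases.filter((testCase) => {
    pattern.lastIndex = 0; // reset state in case the pattern uses the g flag
    return pattern.test(testCase.input) !== testCase.shouldMatch;
  });
}

// A handful of strings that should match and a few that should not.
const dateCases: PatternCase[] = [
  { input: "2026-01-10", shouldMatch: true },
  { input: "2026-1-10", shouldMatch: false },  // month needs two digits
  { input: "2026-13-01", shouldMatch: false }, // there is no month 13
  { input: "10-01-2026", shouldMatch: false }, // wrong field order
];

// First draft: checks only the shape, so month 13 slips through.
const draftFailures = failingCases(/^\d{4}-\d{2}-\d{2}$/, dateCases);

// Refined: month and day ranges restricted, as in the date pattern listed earlier.
const refinedFailures = failingCases(/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/, dateCases);

console.log(draftFailures.map((c) => c.input));   // ["2026-13-01"]
console.log(refinedFailures.map((c) => c.input)); // []
```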
