Pattern Syntax
PHP regex patterns are wrapped in delimiters (commonly /, #, or ~) followed by optional modifiers:
/pattern/modifiers
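As a rough illustration of delimiter choice (the sample path and variable names are invented for this sketch): picking a delimiter that does not occur in the pattern avoids escaping, and preg_quote() with its second argument escapes the delimiter inside literal text.

```php
<?php
$path = '/var/www/html/index.php';

// With / as the delimiter, every slash in the pattern must be escaped.
$withSlash = preg_match('/^\/var\/www\//', $path);      // 1

// With # as the delimiter, the same pattern needs no escaping.
$withHash = preg_match('#^/var/www/#', $path);          // 1

// Bracket pairs such as {} are also accepted as delimiters.
$withBraces = preg_match('{^/var/www/}', $path);        // 1

// preg_quote() escapes regex metacharacters in literal text; the second
// argument additionally escapes the chosen delimiter.
$quoted = preg_quote('a.b/c', '/');                     // a\.b\/c
$matchesLiteral = preg_match('/^' . $quoted . '$/', 'a.b/c');   // 1
$matchesOther   = preg_match('/^' . $quoted . '$/', 'axb/c');   // 0
```

Without preg_quote(), the dot in the literal would match any character and the slash would end the pattern early.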
| Token | Meaning |
|---|---|
| . | Any single character (except newline) |
| \d \D | Digit / non-digit |
| \w \W | Word char [A-Za-z0-9_] / non-word |
| \s \S | Whitespace / non-whitespace |
| ^ $ | Start / end of string (or of each line with m) |
| * + ? | 0+, 1+, 0 or 1 |
| {n} {n,m} | Exactly n / between n and m |
| [...] | Character class |
| (...) | Capture group |
| (?:...) | Non-capturing group |
| \| | Alternation (OR) |
| \b | Word boundary |
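A few of the tokens above in action. This is only a sketch; the sample strings and variable names are made up for illustration.

```php
<?php
// \b: "cat" matches as a whole word but not inside "concatenate".
$wholeWords = preg_match_all('/\bcat\b/', 'cat concatenate cat');   // 2

// (?:...) groups the alternation without taking a capture slot,
// so the surname ends up in group 1.
preg_match('/(?:Mr|Ms)\. (\w+)/', 'Ms. Lovelace', $m);
$surname = $m[1];                                                   // "Lovelace"

// {n,m} combined with ^ and $ bounds the length of the whole subject.
$threeDigits = preg_match('/^\d{2,4}$/', '123');                    // 1
$fiveDigits  = preg_match('/^\d{2,4}$/', '12345');                  // 0

// [^...] is a negated character class: everything up to the first comma.
preg_match('/^[^,]+/', 'alpha,beta', $m2);
$firstField = $m2[0];                                               // "alpha"
```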
preg_match - Test & Capture
<?php
$text = "The order #12345 was placed";
// Just test
if (preg_match("/order #\d+/", $text)) {
echo "Matches!";
}
// Capture groups
if (preg_match("/order #(\d+)/", $text, $m)) {
echo $m[0]; // "order #12345" (whole match)
echo $m[1]; // "12345" (first group)
}
// Multiple groups
$date = "2024-11-15";
if (preg_match("/(\d{4})-(\d{2})-(\d{2})/", $date, $m)) {
[$full, $year, $month, $day] = $m;
}
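preg_match() returns 1 on a match, 0 on no match, and false when the engine fails (for example on invalid UTF-8 with the u modifier), so a plain truthiness check treats an error as "no match". A minimal sketch of handling all three cases follows; findOrderNumber() is a hypothetical helper, and preg_last_error_msg() assumes PHP 8.0 or later.

```php
<?php
// Hypothetical helper: returns the captured order number, null when nothing
// matches, and throws when the regex engine itself reports an error.
function findOrderNumber(string $text): ?string
{
    $result = preg_match('/order #(\d+)/u', $text, $m);

    if ($result === false) {
        // preg_last_error_msg() is available from PHP 8.0.
        throw new RuntimeException('Regex failed: ' . preg_last_error_msg());
    }

    return $result === 1 ? $m[1] : null;
}

$found   = findOrderNumber('The order #12345 was placed');  // "12345"
$missing = findOrderNumber('nothing to see here');          // null
```

Passing a subject that is not valid UTF-8, such as "order #\xff", makes preg_match() return false here, so the helper throws instead of silently reporting no match.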
preg_match_all - All Hits
<?php
$html = '<a href="/a">A</a> <a href="/b">B</a>';
preg_match_all('/href="([^"]+)"/', $html, $m);
print_r($m[1]); // ["/a", "/b"]
// PREG_SET_ORDER groups by match instead of by capture group
preg_match_all("/(\w+):(\d+)/", "a:1 b:2 c:3", $m, PREG_SET_ORDER);
// $m = [["a:1","a","1"], ["b:2","b","2"], ["c:3","c","3"]]
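PREG_SET_ORDER is convenient when each match should be processed as a unit. The sketch below turns the key:value pairs into an associative array; the variable names are illustrative only.

```php
<?php
$input = 'a:1 b:2 c:3';

// preg_match_all() returns the number of full matches.
$count = preg_match_all('/(\w+):(\d+)/', $input, $matches, PREG_SET_ORDER);

$pairs = [];
// Each $matches entry is [whole match, group 1, group 2]; skip the whole match.
foreach ($matches as [, $key, $value]) {
    $pairs[$key] = (int) $value;
}
// $count is 3; $pairs is ['a' => 1, 'b' => 2, 'c' => 3]
```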
preg_replace - Substitute
<?php
// Replace with $1 backreferences
echo preg_replace('/(\w+)@(\w+)/', '$1 at $2', 'user@host');
// "user at host"
// Multiple patterns at once
$out = preg_replace(
["/foo/", "/bar/"],
["FOO", "BAR"],
"foo and bar"
); // "FOO and BAR"
// Callback - transform each match
$out = preg_replace_callback("/\b(\w+)\b/", function ($m) {
return ucfirst($m[1]);
}, "hello world");
// "Hello World"
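Three further details of the replace functions, sketched with made-up inputs: the ${1} form for a reference followed by a digit, the optional limit and count arguments, and a callback that computes its replacement.

```php
<?php
// '${1}0' means "group 1, then a literal 0"; '$10' would be read as group 10.
$scaled = preg_replace('/(\d+)px/', '${1}0px', 'width: 12px');
// "width: 120px"

// Fourth argument: maximum replacements (-1 = no limit).
// Fifth argument: receives the number of replacements actually made.
$collapsed = preg_replace('/\s+/', ' ', "a  b   c", -1, $count);
// "a b c", $count is 2

// A callback can compute the replacement from the match.
$doubled = preg_replace_callback(
    '/\d+/',
    fn (array $m): string => (string) ((int) $m[0] * 2),
    '3 apples and 10 pears'
);
// "6 apples and 20 pears"
```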
preg_split - Split
<?php
// Split on any whitespace
$words = preg_split("/\s+/", " hello world foo");
// ["", "hello", "world", "foo"]
// Skip empty pieces
$words = preg_split("/\s+/", " hello world ", -1, PREG_SPLIT_NO_EMPTY);
// ["hello", "world"]
// Split on multiple delimiters
$parts = preg_split("/[,;\s]+/", "a, b; c\td");
// ["a", "b", "c", "d"]
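preg_split() also accepts a limit and further flags. A short sketch with illustrative inputs follows: keeping the delimiters with PREG_SPLIT_DELIM_CAPTURE, capping the number of pieces, and splitting a UTF-8 string into characters.

```php
<?php
// Captured delimiters are returned as pieces when PREG_SPLIT_DELIM_CAPTURE is set.
$tokens = preg_split(
    '#\s*([+\-*/])\s*#',
    '3 + 4*2',
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
// ["3", "+", "4", "*", "2"]

// A limit of 2 splits only at the first delimiter.
$pair = preg_split('/=/', 'key=a=b', 2);
// ["key", "a=b"]

// An empty pattern with the u modifier splits between UTF-8 characters.
$chars = preg_split('//u', "h\u{00E9}llo", -1, PREG_SPLIT_NO_EMPTY);
// five one-character strings
```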
Named Capture Groups
<?php
$pattern = "/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/";
preg_match($pattern, "2024-11-15", $m);
echo $m["year"]; // 2024
echo $m["month"]; // 11
echo $m["day"]; // 15
// Replacement strings accept only numbered references; named groups are also numbered
echo preg_replace($pattern, '$3/$2/$1', "2024-11-15"); // 15/11/2024
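To use the group names in a replacement, one option is preg_replace_callback(), whose callback receives the same array preg_match() fills in. The sketch below also shows a way to keep only the named entries of a match; the variable names are illustrative.

```php
<?php
$pattern = '/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/';

// The callback sees both numeric and named keys, so names can be used here.
$reordered = preg_replace_callback(
    $pattern,
    fn (array $m): string => "{$m['day']}/{$m['month']}/{$m['year']}",
    'Due 2024-11-15'
);
// "Due 15/11/2024"

// $m holds every group twice (by name and by number); keep the named keys only.
preg_match($pattern, '2024-11-15', $m);
$named = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
// ['year' => '2024', 'month' => '11', 'day' => '15']
```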
Pattern Modifiers
| Mod | Effect |
|---|---|
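| i | Case-insensitive |
| m | Multiline: ^ and $ also match at the start and end of each line |
| s | . matches newlines too |
| u | UTF-8 mode: pattern and subject are treated as UTF-8 (use for non-ASCII text; invalid UTF-8 makes the call fail) |
| x | Extended - ignore whitespace, allow # comments |
| U | Ungreedy: quantifiers are lazy by default |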
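The effect of each modifier is easiest to see side by side. The following is a sketch with made-up subjects, comparing the same pattern with and without the modifier.

```php
<?php
// i: case-insensitive matching.
$i = preg_match('/hello/i', 'HeLLo');                        // 1

// m: ^ anchors at the start of every line, not just the start of the subject.
$withoutM = preg_match_all('/^\w+/', "one\ntwo\nthree");     // 1
$withM    = preg_match_all('/^\w+/m', "one\ntwo\nthree");    // 3

// s: . also matches "\n".
$withoutS = preg_match('/a.b/', "a\nb");                     // 0
$withS    = preg_match('/a.b/s', "a\nb");                    // 1

// u: . consumes one UTF-8 code point rather than one byte.
$withoutU = preg_match('/^.$/', "\u{00E9}");                 // 0 (two bytes)
$withU    = preg_match('/^.$/u', "\u{00E9}");                // 1

// x: whitespace and # comments inside the pattern are ignored.
$datePattern = '/
    (\d{4}) -   # year
    (\d{2})     # month
/x';
$withX = preg_match($datePattern, '2024-11', $d);            // 1

// U: quantifiers become lazy by default.
preg_match('/<.+>/', '<a><b>', $greedy);                     // "<a><b>"
preg_match('/<.+>/U', '<a><b>', $lazy);                      // "<a>"
```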
Common Patterns
<?php
// Email (prefer filter_var($value, FILTER_VALIDATE_EMAIL) for production)
$email = '/^[\w.+-]+@[\w-]+(\.[\w-]+)+$/';
// Strong password - 8+ chars, mixed case, digit, symbol
$password = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w]).{8,}$/';
// URL
$url = '#^https?://[\w.-]+(:\d+)?(/[^\s]*)?$#i';
// Phone (US: +1-555-123-4567 or variants)
$phone = '/^\+?1?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/';
// IPv4 (checks the shape only; octets above 255 still pass)
$ipv4 = '/^(\d{1,3}\.){3}\d{1,3}$/';
// Hex color
$hexColor = '/^#([\da-f]{3}|[\da-f]{6})$/i';
// Slug (URL-safe)
$slug = '/^[a-z0-9]+(-[a-z0-9]+)*$/';
// HTML tag (use a real HTML parser instead!)
// Keep the single quotes: in a double-quoted string "\1" is an octal escape, not a backreference
$htmlTag = '/<([a-z]+)([^>]*)>(.*?)<\/\1>/is';
HTML is not a regular language. Use DOMDocument or a library like symfony/dom-crawler (SimpleXML only handles well-formed XML, such as XHTML). Regex on HTML works for trivial cases and breaks the moment markup gets nested or quoted weirdly.
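For comparison, here is a minimal sketch of extracting link targets with DOMDocument instead of a regex. It assumes the dom extension is enabled and reuses a small sample string; the variable names are illustrative.

```php
<?php
$html = '<a href="/a">A</a> <a href="/b">B</a>';

// Collect parser warnings internally instead of emitting them;
// real-world markup is rarely perfectly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$hrefs = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $hrefs[] = $anchor->getAttribute('href');
}
// ['/a', '/b']
```

Unlike the regex version, this keeps working when attributes use single quotes, appear in a different order, or the anchors are nested inside other elements.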