Linux Regular Expressions (Regex) Cheat Sheet
This article is a translation of my original article:
Original: Linux 正規表現(Regex)チートシート
* Translated automatically by Google.
* Please note that some links or referenced content in this article may be in Japanese.
* Comments in the code are basically in Japanese.
by bokumin
I put together a quick cheat sheet for Linux regular expressions. I hope it is useful as a reference.
Special characters (meta characters)
| Symbol | Explanation | Usage example (matching characters) |
|---|---|---|
| . | Matches any single character | a.c → abc, adc, a1c, etc. |
| ^ | Matches the beginning of the line | ^abc → Lines starting with “abc” |
| $ | Matches the end of the line | abc$ → Lines ending with “abc” |
| * | 0 or more repetitions of the previous character | ab*c → ac, abc, abbc, abbbc… |
| \ | Escapes the next character | \. → A literal dot |
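* +, ?, |, (), and {} can be used unescaped in Extended Regular Expressions (ERE) and PCRE. In Basic Regular Expressions (BRE), grouping and intervals are written \( \) and \{ \}; \+, \?, and \| also work in GNU tools, but they are GNU extensions rather than part of POSIX BRE.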
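As a quick check of the metacharacters above, here is a minimal sketch using grep. The sample strings are made up for illustration and are not from any real file.

```shell
# Sample input: one candidate string per line (invented for this sketch).
sample='abc
a1c
ac
abbbc
a.c
xabc'

# "." needs exactly one character between a and c, so "ac" is not counted.
dot_matches=$(printf '%s\n' "$sample" | grep -c 'a.c')

# "^" and "$" anchor the pattern to the whole line; "b*" allows zero or more b.
anchored=$(printf '%s\n' "$sample" | grep '^ab*c$')

# "\." matches only a literal dot.
literal_dot=$(printf '%s\n' "$sample" | grep 'a\.c')

printf 'lines matching a.c: %s\n' "$dot_matches"   # 4
printf '%s\n' "$anchored"                          # abc, ac, abbbc
printf '%s\n' "$literal_dot"                       # a.c
```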
Character classes and ranges
| Symbol | Explanation | Usage example (matching characters) |
|---|---|---|
| [abc] | Matches any one character listed in the brackets | [abc] → a, b, or c |
| [^abc] | Matches any single character not listed in the brackets | [^abc] → Any character other than a, b, or c |
| [a-z] | Range specification (lowercase letters from a to z) | [a-z] → 1 lowercase alphabetic character |
| [A-Z] | Range specification (uppercase letters from A to Z) | [A-Z] → 1 uppercase alphabetic character |
| [0-9] | Range specification (digits from 0 to 9) | [0-9] → 1 digit |
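The bracket expressions above can be tried out as follows. This is only a sketch with invented sample words; LC_ALL=C is set as an assumption on my part so that ranges such as a-z are compared by byte value and behave the same in every locale.

```shell
# Invented sample words, one per line.
sample='cat
cot
cut
c9t
cAt'

# [aou] accepts exactly one of the listed characters.
listed=$(printf '%s\n' "$sample" | LC_ALL=C grep 'c[aou]t')

# [^a-z] accepts one character that is NOT a lowercase ASCII letter.
negated=$(printf '%s\n' "$sample" | LC_ALL=C grep 'c[^a-z]t')

# [0-9] accepts one digit.
digit=$(printf '%s\n' "$sample" | LC_ALL=C grep 'c[0-9]t')

printf '%s\n' "$listed"    # cat, cot, cut
printf '%s\n' "$negated"   # c9t, cAt
printf '%s\n' "$digit"     # c9t
```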
POSIX character class
A POSIX class is only valid inside a bracket expression, so in practice it is written with double brackets, like [[:alnum:]].
| Symbol | Explanation | Example of use |
|---|---|---|
| [:alnum:] | Alphanumeric characters | [[:alnum:]] → a-z, A-Z, 0-9 |
| [:alpha:] | Alphabetic characters | [[:alpha:]] → a-z, A-Z |
| [:blank:] | Blanks (spaces and tabs) | [[:blank:]] → Space or tab |
| [:digit:] | Digits | [[:digit:]] → 0-9 |
| [:lower:] | Lowercase letters | [[:lower:]] → a-z |
| [:upper:] | Uppercase letters | [[:upper:]] → A-Z |
| [:space:] | Whitespace characters | [[:space:]] → Spaces, tabs, line breaks, etc. |
| [:punct:] | Punctuation marks | [[:punct:]] → Punctuation characters |
| [:xdigit:] | Hexadecimal digits | [[:xdigit:]] → 0-9, a-f, A-F |
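Here is a small sketch of POSIX classes in use. The sample lines are invented; note that a class can be combined with other items inside the same bracket expression or repeated like any other single-character pattern.

```shell
# Invented sample lines; the second one starts with two spaces.
sample='user42
  indented
ABC
dead-beef'

# A letter followed only by letters or digits.
identifiers=$(printf '%s\n' "$sample" | grep '^[[:alpha:]][[:alnum:]]*$')

# Lines containing at least one punctuation character.
with_punct=$(printf '%s\n' "$sample" | grep '[[:punct:]]')

# Number of lines that begin with a space or a tab.
leading_blank=$(printf '%s\n' "$sample" | grep -c '^[[:blank:]]')

printf '%s\n' "$identifiers"    # user42, ABC
printf '%s\n' "$with_punct"     # dead-beef
printf '%s\n' "$leading_blank"  # 1
```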
Repeat specification
Basic regular expressions (BRE: grep, sed)
| Symbol | Description | Example of use |
|---|---|---|
| \{n\} | Repeat the previous character exactly n times | a\{3\} → aaa |
| \{n,\} | Repeat the previous character n or more times | a\{2,\} → aa, aaa, aaaa… |
| \{n,m\} | Repeat the previous character between n and m times | a\{2,4\} → aa, aaa, aaaa |
Extended regular expressions (ERE: grep -E, egrep)
| Symbol | Explanation | Usage example |
|---|---|---|
| + | One or more repetitions of the previous character | ab+c → abc, abbc, abbbc… |
| ? | 0 or 1 occurrence of the previous character | ab?c → ac, abc |
| {n} | Repeat the previous character exactly n times | a{3} → aaa |
| {n,} | Repeat the previous character n or more times | a{2,} → aa, aaa, aaaa… |
| {n,m} | Repeat the previous character between n and m times | a{2,4} → aa, aaa, aaaa |
* GNU grep accepts \+ (and \?) in basic regular expressions, but this is a GNU extension and is not portable to other implementations.
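To compare the two notations side by side, here is a minimal sketch that runs the same interval in BRE and ERE form. The input strings are made up for illustration.

```shell
# Invented sample: runs of the letter a, one per line.
runs='a
aa
aaa
aaaaa'

# BRE: the braces must be escaped.
bre=$(printf '%s\n' "$runs" | grep '^a\{2,3\}$')

# ERE: the braces are used directly.
ere=$(printf '%s\n' "$runs" | grep -E '^a{2,3}$')

# ERE "?" makes the preceding character optional.
optional=$(printf 'color\ncolour\ncolouur\n' | grep -E '^colou?r$')

printf '%s\n' "$bre"       # aa, aaa
printf '%s\n' "$ere"       # aa, aaa
printf '%s\n' "$optional"  # color, colour
```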
Selection and Alternatives
Basic Regular Expressions (BRE)
| Symbol | Explanation | Usage example |
|---|---|---|
| \| | Alternation (OR operator; a GNU extension in BRE) | cat\|dog → “cat” or “dog” |
| \( \) | Grouping | \(cat\|dog\)s → “cats” or “dogs” |
Extended Regular Expressions (ERE)
| Notation | Explanation | Usage example |
|---|---|---|
| | | Alternation (OR operator) | cat|dog → “cat” or “dog” |
| ( ) | Grouping | (cat|dog)s → “cats” or “dogs” |
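The following sketch shows alternation with grouping in both syntaxes. The word list is invented. The BRE form relies on the GNU \| extension, so the last command shows an alternative that I believe is more portable: passing several patterns with -e.

```shell
# Invented word list.
animals='cats
dogs
birds
cat'

# ERE: ( ) groups and | alternates without escaping.
ere=$(printf '%s\n' "$animals" | grep -E '^(cat|dog)s$')

# BRE: same pattern with escapes (\| is a GNU extension).
bre=$(printf '%s\n' "$animals" | grep '^\(cat\|dog\)s$')

# Alternative without alternation: one -e per pattern.
multi=$(printf '%s\n' "$animals" | grep -e '^cats$' -e '^dogs$')

printf '%s\n' "$ere"    # cats, dogs
printf '%s\n' "$bre"    # cats, dogs
printf '%s\n' "$multi"  # cats, dogs
```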
Escape sequences and special expressions
Common to BRE/ERE
| Symbol | Explanation | Example of use |
|---|---|---|
| \n | Line feed character | line\n → Line break after “line” |
| \t | Tab character | column\t → Tab after “column” |
* GNU sed interprets \n and \t in patterns. GNU grep does not, unless -P is used.
PCRE only (grep -P)
| Symbol | Explanation | Example of use |
|---|---|---|
| \d | A single digit; equivalent to [0-9] | \d{3} → 3-digit number |
| \D | A non-digit; equivalent to [^0-9] | \D+ → One or more non-digit characters |
| \w | A word character; equivalent to [a-zA-Z0-9_] | \w+ → One or more word characters |
| \W | A non-word character | \W+ → One or more non-word characters |
| \s | A whitespace character | \s+ → One or more whitespace characters |
| \S | A non-whitespace character | \S+ → One or more non-whitespace characters |
| \b | Word boundary | \bcat\b → Only the word “cat” |
| \B | Non-word boundary | \Bcat\B → “cat” in the middle of a word |
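* \d is a Perl Compatible Regular Expressions (PCRE) feature and is only available with grep -P. GNU grep and GNU sed also accept \w, \W, \s, \S, \b, and \B without -P as GNU extensions, but these are not part of POSIX, so they may not work in other implementations.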
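Here is a short sketch of the shorthand classes with grep -P, using -o to print only the matched part. It assumes a GNU grep built with PCRE support, and the input line is invented.

```shell
# Invented input line.
line='order 123 shipped to cat-lovers, not concatenated'

# \d+ : a run of digits.
digits=$(printf '%s\n' "$line" | grep -oP '\d+')

# \bcat\b : "cat" as a whole word (the one in "cat-lovers").
whole_word=$(printf '%s\n' "$line" | grep -oP '\bcat\b')

# \Bcat\B : "cat" with word characters on both sides; \w* widens the
# match to show the surrounding word.
inside_word=$(printf '%s\n' "$line" | grep -oP '\w*\Bcat\B\w*')

printf '%s\n' "$digits"       # 123
printf '%s\n' "$whole_word"   # cat
printf '%s\n' "$inside_word"  # concatenated
```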
Character code specification
| Symbol | Explanation | Usage example |
|---|---|---|
| \ooo | Character with the given octal code (3 digits) | \101 → A |
| \xhh | Character with the given hexadecimal code (2 digits) | \x41 → A |
Back reference
Basic Regular Expressions (BRE)
sed 's/\(abc\)\1/XYZ/g' # abcabc → XYZ
Extended Regular Expressions (ERE)
sed -E 's/(abc)\1/XYZ/g' # abcabc → XYZ
| Symbol | Explanation | Example of use |
|---|---|---|
| \1, \2… | Refers back to the text matched by the nth parenthesized group | \(abc\)\1 → abcabc (BRE); (abc)\1 → abcabc (ERE) |
* Named capture groups, written (?<name>), can only be used with PCRE.
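Back references are useful both in the pattern and in the replacement. The sketch below swaps two words with sed and detects a doubled word with grep -E. The sample text is invented, and back references inside ERE are, as far as I know, an extension supported by the GNU tools rather than something POSIX requires.

```shell
# In the replacement, \1 and \2 refer to the captured groups (BRE).
swapped=$(printf 'hello world\n' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# Invented sample text with one doubled word.
text='this is is a test
nothing repeated here'

# In the pattern, \2 must match the same text that group 2 matched.
doubled=$(printf '%s\n' "$text" | grep -E '(^| )([a-z]+) \2( |$)')

# Collapse the doubled word while keeping the surrounding separators.
fixed=$(printf '%s\n' "$doubled" | sed -E 's/(^| )([a-z]+) \2( |$)/\1\2\3/')

printf '%s\n' "$swapped"  # world hello
printf '%s\n' "$doubled"  # this is is a test
printf '%s\n' "$fixed"    # this is a test
```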
Lookahead/lookbehind (PCRE only: grep -P)
| Symbol | Explanation | Example of use |
|---|---|---|
| (?=pattern) | Positive lookahead | foo(?=bar) → “foo” in “foobar” |
| (?!pattern) | Negative lookahead | foo(?!bar) → “foo” not followed by “bar” |
| (?<=pattern) | Positive lookbehind | (?<=foo)bar → “bar” in “foobar” |
| (?<!pattern) | Negative lookbehind | (?<!foo)bar → “bar” not preceded by “foo” |
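Lookarounds check the context without including it in the match, which pairs well with grep -o. A minimal sketch, assuming grep -P is available; the key=value line is invented.

```shell
# Invented input line.
line='price=100 cost=250 priceless'

# Lookbehind: digits that come right after "price=" (the prefix is not printed).
after_price=$(printf '%s\n' "$line" | grep -oP '(?<=price=)\d+')

# Lookahead: words that are directly followed by "=" (the "=" is not printed).
keys=$(printf '%s\n' "$line" | grep -oP '\w+(?==)')

# Negative lookahead: "price" NOT followed by "=", plus the rest of the word.
not_assigned=$(printf '%s\n' "$line" | grep -oP 'price(?!=)\w*')

printf '%s\n' "$after_price"   # 100
printf '%s\n' "$keys"          # price, cost
printf '%s\n' "$not_assigned"  # priceless
```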
Other special syntax (PCRE)
| Symbol | Explanation | Example of use |
|---|---|---|
| (?:pattern) | Non-capturing group | (?:abc)+ → One or more repetitions of “abc” |
| (?i) | Case insensitive | (?i)abc → ABC, abc, Abc, etc. |
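Both constructs can be checked quickly with grep -P. This is only a sketch with invented input, again assuming a grep built with PCRE support.

```shell
# Invented sample lines.
sample='abcabc
ABC
xyz'

# (?i) switches on case-insensitive matching for the rest of the pattern.
any_case=$(printf '%s\n' "$sample" | grep -P '(?i)^abc$')

# (?:...) groups for repetition without creating a numbered capture.
repeated=$(printf '%s\n' "$sample" | grep -P '^(?:abc){2}$')

printf '%s\n' "$any_case"  # ABC
printf '%s\n' "$repeated"  # abcabc
```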
sed replacement command flags
The following flags can be used with the sed substitution command s/pattern/replacement/flags.
| Flag | Description | Example |
|---|---|---|
| g | Replace all matches in each line | sed 's/old/new/g' |
| p | Print the line if a replacement was made | sed -n 's/old/new/p' |
| i or I | Ignore case (GNU extension) | sed 's/old/new/gi' |
| number | Replace only the nth match | sed 's/old/new/2' (second match only) |
| w file | Write lines where a replacement was made to a file | sed 's/old/new/w out.txt' |
Flags can be combined, as in sed 's/old/new/gp'.
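The effect of each flag is easiest to see on a single line that contains several matches. The input below is invented; the I flag is a GNU sed extension, so that line assumes GNU sed.

```shell
# Invented input with four candidate matches, one of them uppercase.
line='old old OLD old'

first=$(printf '%s\n' "$line" | sed 's/old/new/')     # no flag: first match only
all=$(printf '%s\n' "$line" | sed 's/old/new/g')      # g: every match
second=$(printf '%s\n' "$line" | sed 's/old/new/2')   # 2: second match only
nocase=$(printf '%s\n' "$line" | sed 's/old/new/gI')  # g + I: ignore case (GNU)

# -n suppresses normal output; p prints only lines where a replacement happened.
printed=$(printf 'keep\nold\n' | sed -n 's/old/new/p')

printf '%s\n' "$first"    # new old OLD old
printf '%s\n' "$all"      # new new OLD new
printf '%s\n' "$second"   # old new OLD old
printf '%s\n' "$nocase"   # new new new new
printf '%s\n' "$printed"  # new
```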
sed command options
| Options | Description | Example of use |
|---|---|---|
| -e | Script specification (multiple specifications are possible) | sed -e 's/a/b/' -e 's/c/d/' |
| -i | Edit the file directly (GNU …) | sed -i 's/old/new/g' file.txt |
| -E or -r | Use extended regular expressions | sed -E 's/(abc)+/XYZ/g' |
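Here is a small sketch of these options in action. It works on a temporary file created with mktemp so nothing real is modified. The -i.bak form (in-place edit with a backup suffix) is my own choice for the example because, to my knowledge, it is accepted by both GNU and BSD sed.

```shell
# -e: several scripts applied in order to each line.
multi=$(printf 'a c\n' | sed -e 's/a/b/' -e 's/c/d/')

# -i: edit a file in place; here a throwaway temp file, keeping a .bak backup.
tmp=$(mktemp)
printf 'old value\n' > "$tmp"
sed -i.bak 's/old/new/' "$tmp"
edited=$(cat "$tmp")
rm -f "$tmp" "$tmp.bak"

# -E: extended syntax, so ( ) and + need no backslashes.
extended=$(printf 'abcabc!\n' | sed -E 's/(abc)+/XYZ/')

printf '%s\n' "$multi"     # b d
printf '%s\n' "$edited"    # new value
printf '%s\n' "$extended"  # XYZ!
```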
Differences in regular expressions for different Linux tools
| Tool | Regular expression type | Options | Notes |
|---|---|---|---|
| grep | Basic Regular Expressions (BRE) | Default | +, ?, {}, (), and the pipe must be escaped with a backslash |
| grep -E / egrep | Extended Regular Expressions (ERE) | -E | +, ?, {}, (), and the pipe can be used without escaping |
| grep -P | Perl Compatible Regular Expressions (PCRE) | -P | \d, \w, \s, lookahead, lookbehind, etc. can be used (experimental feature) |
| sed -E | Extended Regular Expressions (ERE) | -E or -r | +, ?, {}, (), and the pipe can be used without escaping |
| awk | Extended Regular Expressions (ERE) | Default | AWK-specific syntax and limitations |
| perl | Perl regular expressions (the model for PCRE) | Default | The most powerful and versatile |
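To make the table concrete, the sketch below selects the same lines with four of the tools. The input is invented; only the quoting and escaping of the pattern changes between them.

```shell
# Invented sample lines.
sample='ab
abab
abc'

# grep, BRE: group and interval are escaped.
with_bre=$(printf '%s\n' "$sample" | grep '^\(ab\)\{1,\}$')

# grep -E, ERE: no escaping needed.
with_ere=$(printf '%s\n' "$sample" | grep -E '^(ab)+$')

# sed -E: /pattern/p prints matching lines; -n hides the rest.
with_sed=$(printf '%s\n' "$sample" | sed -n -E '/^(ab)+$/p')

# awk: ERE by default; a bare /pattern/ prints matching lines.
with_awk=$(printf '%s\n' "$sample" | awk '/^(ab)+$/')

printf '%s\n' "$with_ere"  # ab, abab
```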
Notes
1. Difference between basic regular expressions (BRE) and extended regular expressions (ERE)
In basic regular expressions, +, ?, |, (), and {} only act as operators when preceded by a backslash (and \+, \?, and \| are GNU extensions), whereas in extended regular expressions they can be used directly.
# BRE (basic regular expressions)
grep 'a\{5\}' file.txt # matches aaaaa
grep 'aa\+' file.txt # matches aa, aaa, aaaa... (GNU extension)
# ERE (extended regular expressions)
grep -E 'a{5}' file.txt # matches aaaaa
grep -E 'aa+' file.txt # matches aa, aaa, aaaa...
2. Using PCRE (Perl Compatible Regular Expressions)
In GNU grep, the -P option enables PCRE, which provides shorthand character classes such as \d, \w, and \s as well as lookahead and lookbehind. However, grep -P is an experimental feature and may not be supported in all environments.
# PCRE examples
grep -P '\d{3}-\d{4}' file.txt # phone numbers such as 123-4567
grep -P '\w+@\w+\.\w+' file.txt # email addresses (simplified pattern)
3. Basic and extended regular expressions in sed
sed enables extended regular expressions with the -E option (-r in some environments).
# BRE
sed 's/\(abc\)\+/XYZ/g' file.txt
# ERE
sed -E 's/(abc)+/XYZ/g' file.txt
Practical example
Search with grep
# Basic search
grep 'error' log.txt
# Multiple patterns with extended regular expressions
grep -E 'error|warning|fatal' log.txt
# Show line numbers
grep -n 'error' log.txt
# Ignore case
grep -i 'error' log.txt
# Lines consisting only of digits (PCRE)
grep -P '^\d+$' file.txt
Replacement with sed
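# Basic replacement (first match on each line)
sed 's/old/new/' file.txt
# Replace all matches
sed 's/old/new/g' file.txt
# Edit the file in place
sed -i 's/old/new/g' file.txt
# Use extended regular expressions (YYYY-MM-DD → DD/MM/YYYY)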
sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/g' file.txt
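The grep and sed examples above can also be chained. The following sketch filters an invented log for errors and warnings and then rewrites the leading date; the log lines and their format are assumptions made only for this example.

```shell
# Invented log excerpt: "YYYY-MM-DD level: message".
log='2024-01-15 error: disk full
2024-01-16 info: ok
2024-01-17 WARNING: high load'

# 1) keep error/warning lines regardless of case,
# 2) rewrite the leading date from YYYY-MM-DD to DD/MM/YYYY.
report=$(printf '%s\n' "$log" \
  | grep -iE 'error|warning' \
  | sed -E 's/^([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/')

printf '%s\n' "$report"
# 15/01/2024 error: disk full
# 17/01/2024 WARNING: high load
```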
Summary
Note that Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE) differ in which special characters need to be escaped. Unicode-related features are not supported by every tool and are mostly limited to tools that use Perl Compatible Regular Expressions (PCRE).
Supported features also vary with the Linux distribution and tool version, so in the end it is important to use regular expressions that suit your environment.
I tend to forget things I don’t use often, and looking them up takes time, so I keep them here as notes.