Linux Regular Expressions (Regex) Cheat Sheet – Blog

This article is a translation of my original Japanese article.

* Translated automatically by Google.
* Please note that some links or referenced content in this article may be in Japanese.
* Comments in the code were originally written in Japanese.

by bokumin

Linux Regular Expressions (Regex) Cheat Sheet

I put together a quick cheat sheet for Linux regular expressions. I hope it serves as a useful reference.

Special characters (metacharacters)

Symbol	Explanation	Usage example (matching characters)
`.`	Match any single character	`a.c` → abc, adc, a1c, etc.
`^`	Matches the beginning of the line	`^abc` → Lines starting with “abc”
`$`	Matches the end of the line	`abc$` → Lines ending with “abc”
`*`	0 or more repetitions of the previous character	`ab*c` → ac, abc, abbc, abbbc…
`\`	Escape next character	`\.` → Dot character itself

* `+`, `?`, `|`, `()`, and `{}` can be used as-is in Extended Regular Expressions (ERE) and PCRE. Basic Regular Expressions (BRE) require backslash escapes: `\+`, `\?`, `\|`, `\( \)`, `\{ \}` (`\+`, `\?`, and `\|` are GNU extensions in BRE).
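
The metacharacters in the table above can be tried out with a small sketch like the following. The sample strings and variable names are made up for illustration, and the commands assume a POSIX shell with grep.

```shell
# Hypothetical sample input: one candidate string per line.
sample=$(printf '%s\n' 'abc' 'a.c' 'ac' 'abbc' 'xabc' 'abcx')

# '.' matches any single character, so both "abc" and "a.c" match.
any_char=$(printf '%s\n' "$sample" | grep '^a.c$')

# '\.' matches only a literal dot.
literal_dot=$(printf '%s\n' "$sample" | grep '^a\.c$')

# '*' repeats the previous character zero or more times.
star=$(printf '%s\n' "$sample" | grep '^ab*c$')

# '^' anchors the match to the start of the line, '$' to the end.
starts=$(printf '%s\n' "$sample" | grep '^abc')
ends=$(printf '%s\n' "$sample" | grep 'abc$')

printf 'any_char:    %s\n' "$(printf '%s' "$any_char" | tr '\n' ' ')"     # abc a.c
printf 'literal_dot: %s\n' "$literal_dot"                                # a.c
printf 'star:        %s\n' "$(printf '%s' "$star" | tr '\n' ' ')"         # abc ac abbc
printf 'starts:      %s\n' "$(printf '%s' "$starts" | tr '\n' ' ')"       # abc abcx
printf 'ends:        %s\n' "$(printf '%s' "$ends" | tr '\n' ' ')"         # abc xabc
```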

Character classes and ranges

Symbol	Explanation	Usage example (matching characters)
`[abc]`	Match any one character listed in the brackets	`[abc]` → a, b, or c
`[^abc]`	Match any single character not listed in the brackets	`[^abc]` → Any character other than a, b, or c
`[a-z]`	Range specification (lowercase letters from a to z)	`[a-z]` → 1 lowercase alphabetic character
`[A-Z]`	Range specification (uppercase letters from A to Z)	`[A-Z]` → 1 uppercase alphabetic character
`[0-9]`	Range specification (numbers from 0 to 9)	`[0-9]` → 1 arbitrary numeric character
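
As a minimal sketch of bracket expressions, the following filters a made-up word list. `LC_ALL=C` is set here as a precaution, on the assumption that ranges such as `[A-Z]` may behave differently in other locales; the word list and variable names are illustrative only.

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' 'cat' 'cot' 'cut' 'c9t' 'cAt')

# [ao] matches exactly one of the listed characters.
set_match=$(printf '%s\n' "$words" | LC_ALL=C grep '^c[ao]t$')

# [^ao] matches any single character that is NOT listed.
negated=$(printf '%s\n' "$words" | LC_ALL=C grep '^c[^ao]t$')

# Ranges: one digit, or one uppercase letter.
digit=$(printf '%s\n' "$words" | LC_ALL=C grep '^c[0-9]t$')
upper=$(printf '%s\n' "$words" | LC_ALL=C grep '^c[A-Z]t$')

printf '%s\n' "$set_match"   # cat, cot
printf '%s\n' "$negated"     # cut, c9t, cAt
printf '%s\n' "$digit"       # c9t
printf '%s\n' "$upper"       # cAt
```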

POSIX character classes

To use one, enclose it in double brackets, as in [[:alnum:]].

Symbol	Explanation	Example of use
`[:alnum:]`	Alphanumeric characters	`[[:alnum:]]` → a-z, A-Z, 0-9
`[:alpha:]`	Alphabetic characters	`[[:alpha:]]` → a-z, A-Z
`[:blank:]`	Blanks (spaces and tabs)	`[[:blank:]]` → Space or tab
`[:digit:]`	Digits	`[[:digit:]]` → 0-9
`[:lower:]`	Lowercase letters	`[[:lower:]]` → a-z
`[:upper:]`	Uppercase letters	`[[:upper:]]` → A-Z
`[:space:]`	Whitespace characters	`[[:space:]]` → Spaces, tabs, line breaks, etc.
`[:punct:]`	Punctuation characters	`[[:punct:]]` → Punctuation such as ! ? , .
`[:xdigit:]`	Hex digit	`[[:xdigit:]]` → 0-9, a-f, A-F
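
The POSIX classes above work in both grep and sed. The sketch below runs a few of them against one made-up line; the input string and variable names are assumptions for illustration, and `grep -o` (print only the matched part) is assumed to be available, as it is in GNU grep.

```shell
# Hypothetical input line.
line='User42: id=0x1F, ok!'

# Runs of digits; -o prints each match on its own line.
digits=$(printf '%s\n' "$line" | grep -o '[[:digit:]][[:digit:]]*')

# A "0x" prefix followed by one or more hexadecimal digits.
hex=$(printf '%s\n' "$line" | grep -o '0x[[:xdigit:]][[:xdigit:]]*')

# Single uppercase letters.
uppers=$(printf '%s\n' "$line" | grep -o '[[:upper:]]')

# Delete every punctuation character with sed.
no_punct=$(printf '%s\n' "$line" | sed 's/[[:punct:]]//g')

printf '%s\n' "$digits"     # 42, 0, 1 (one per line)
printf '%s\n' "$hex"        # 0x1F
printf '%s\n' "$uppers"     # U, F (one per line)
printf '%s\n' "$no_punct"   # User42 id0x1F ok
```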

Repetition

Basic regular expressions (BRE: grep, sed)

Symbol	Description	Example of use
`\{n\}`	Repeat previous character exactly n times	`a\{3\}` → aaa
`\{n,\}`	Repeat the previous character n or more times	`a\{2,\}` → aa, aaa, aaaa…
`\{n,m\}`	Repeat the previous character between n and m times.	`a\{2,4\}` → aa, aaa, aaaa

Extended regular expressions (ERE: grep -E, egrep)

Symbol	Explanation	Usage example
`+`	One or more repetitions of the previous character	`ab+c` → abc, abbc, abbbc…
`?`	0 or 1 occurrence of previous character	`ab?c` → ac, abc
`{n}`	Repeat previous character exactly n times	`a{3}` → aaa
`{n,}`	Repeat the previous character n or more times	`a{2,}` → aa, aaa, aaaa…
`{n,m}`	Repeat the previous character between n and m times.	`a{2,4}` → aa, aaa, aaaa

* GNU grep also accepts `\+` in basic regular expressions, but this is a GNU extension.
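
To make the BRE/ERE difference in repetition syntax concrete, here is a small sketch that runs the same interval in both flavours. The inputs and variable names are illustrative assumptions.

```shell
# Hypothetical input: runs of "a" of length 1 to 5.
runs=$(printf '%s\n' 'a' 'aa' 'aaa' 'aaaa' 'aaaaa')

# BRE needs backslashes around the interval; ERE does not.
bre=$(printf '%s\n' "$runs" | grep '^a\{2,4\}$')
ere=$(printf '%s\n' "$runs" | grep -E '^a{2,4}$')

# ERE-only quantifiers: '?' (0 or 1) and '+' (1 or more).
words=$(printf '%s\n' 'ac' 'abc' 'abbc')
optional=$(printf '%s\n' "$words" | grep -E '^ab?c$')
plus=$(printf '%s\n' "$words" | grep -E '^ab+c$')

# In BRE an unescaped '{' is an ordinary character.
literal_brace=$(printf '%s\n' 'a{2}' | grep 'a{2}')

printf '%s\n' "$bre"             # aa, aaa, aaaa
printf '%s\n' "$ere"             # aa, aaa, aaaa
printf '%s\n' "$optional"        # ac, abc
printf '%s\n' "$plus"            # abc, abbc
printf '%s\n' "$literal_brace"   # a{2}
```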

Alternation and grouping

Basic Regular Expressions (BRE)

Symbol	Explanation	Usage example
`\|`	Alternation (OR operator)	`cat\|dog` → “cat” or “dog”
`\(` `\)`	Grouping	`\(cat\|dog\)s` → “cats” or “dogs”

Extended Regular Expressions (ERE)

Symbol	Explanation	Usage example
`|`	Alternation (OR operator)	`cat|dog` → “cat” or “dog”
`(` `)`	Grouping	`(cat|dog)s` → “cats” or “dogs”
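
A short sketch of alternation and grouping follows, including what happens when the group is left out. The word list and variable names are made up, and the BRE form relies on the GNU `\|` extension.

```shell
# Hypothetical word list.
pets=$(printf '%s\n' 'cats' 'dogs' 'cat' 'birds' 'hotdogs')

# ERE: the group limits the alternation to "cat" or "dog" before the "s".
ere_match=$(printf '%s\n' "$pets" | grep -E '^(cat|dog)s$')

# BRE: the same pattern with escaped operators (\| is a GNU extension).
bre_match=$(printf '%s\n' "$pets" | grep '^\(cat\|dog\)s$')

# Without the group, the alternation splits the whole pattern:
# "starts with cat" OR "ends with dogs".
ungrouped=$(printf '%s\n' "$pets" | grep -E '^cat|dogs$')

printf '%s\n' "$ere_match"   # cats, dogs
printf '%s\n' "$bre_match"   # cats, dogs
printf '%s\n' "$ungrouped"   # cats, dogs, cat, hotdogs
```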

Escape sequences and special expressions

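Common to BRE and ERE

Symbol	Explanation	Example of use
`\n`	Line feed character	`line\n` → Line break after “line”
`\t`	Tab character	`column\t` → Tab after “column”

* Support depends on the tool: GNU sed interprets `\n` and `\t`, but grep (without `-P`) does not treat `\t` as a tab.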

PCRE only (grep -P)

Symbol	Explanation	Example of use
`\d`	One digit; equivalent to `[0-9]`	`\d{3}` → 3-digit number
`\D`	One non-digit; equivalent to `[^0-9]`	`\D+` → One or more non-digit characters
`\w`	Word character; equivalent to `[a-zA-Z0-9_]`	`\w+` → One or more word characters
`\W`	Non-word character	`\W+` → One or more non-word characters
`\s`	Whitespace character	`\s+` → One or more whitespace characters
`\S`	Non-whitespace character	`\S+` → One or more non-whitespace characters
`\b`	Word boundary	`\bcat\b` → Only the word “cat”
`\B`	Non-word boundary	`\Bcat\B` → “cat” in the middle of a word

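* `\d` and the other shorthands are features of Perl Compatible Regular Expressions (PCRE), available through grep -P. Plain grep and sed do not support `\d`; GNU grep and GNU sed accept `\w`, `\W`, `\s`, `\S`, `\b`, and `\B` as GNU extensions, but these are not portable.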

Character code specification

Symbol	Explanation	Usage example
`\ooo`	Character with the given octal code (3 digits)	`\101` → A
`\xhh`	Character with the given hexadecimal code (2 digits)	`\x41` → A

Backreferences

Basic Regular Expressions (BRE)

sed 's/\(abc\)\1/XYZ/g'  # abcabc → XYZ

Extended Regular Expressions (ERE)

sed -E 's/(abc)\1/XYZ/g'  # abcabc → XYZ

Symbol	Explanation	Example of use
`\1`, `\2`…	Refers to the text matched by the nth parenthesized group	`\(abc\)\1` → abcabc (BRE)
		`(abc)\1` → abcabc (ERE)

* Named capture groups `(?<name>...)` can only be used with PCRE.
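
The following sketch shows backreferences both inside the pattern and in the replacement text. The input strings and variable names are invented for illustration.

```shell
# \1 in the pattern must repeat exactly what the first group matched.
doubled_bre=$(printf '%s\n' 'abcabc abc' | sed 's/\(abc\)\1/XYZ/g')
doubled_ere=$(printf '%s\n' 'abcabc abc' | sed -E 's/(abc)\1/XYZ/g')

# \1 and \2 in the replacement reuse the captured text (here: swap key and value).
swapped=$(printf '%s\n' 'key=value' | sed -E 's/^([^=]+)=(.*)$/\2=\1/')

# grep: keep only lines that contain the same character twice in a row.
repeated=$(printf '%s\n' 'hello' 'world' | grep '\(.\)\1')

printf '%s\n' "$doubled_bre"   # XYZ abc
printf '%s\n' "$doubled_ere"   # XYZ abc
printf '%s\n' "$swapped"       # value=key
printf '%s\n' "$repeated"      # hello
```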

Lookahead/lookbehind (PCRE only: grep -P)

Symbol	Explanation	Example of use
`(?=pattern)`	Positive lookahead	`foo(?=bar)` → “foo” in “foobar”
`(?!pattern)`	Negative lookahead	`foo(?!bar)` → “foo” not followed by “bar”
`(?<=pattern)`	Positive lookbehind	`(?<=foo)bar` → “bar” in “foobar”
`(?<!pattern)`	Negative lookbehind	`(?<!foo)bar` → “bar” not preceded by “foo”
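
Lookarounds are easiest to see with `grep -o`, because the asserted text is checked but not included in the output. This is a sketch that assumes a grep built with PCRE support (`-P`); the sample line and variable names are hypothetical.

```shell
# Hypothetical input line.
log='price=100 cost=250 id=abc'

# \d+ : every run of digits.
numbers=$(printf '%s\n' "$log" | grep -oP '\d+')

# Lookbehind: digits preceded by "cost=", without printing "cost=".
cost=$(printf '%s\n' "$log" | grep -oP '(?<=cost=)\d+')

# Lookahead: key names that are followed by "=" and a digit.
numeric_keys=$(printf '%s\n' "$log" | grep -oP '\w+(?==\d)')

printf '%s\n' "$numbers"        # 100, 250 (one per line)
printf '%s\n' "$cost"           # 250
printf '%s\n' "$numeric_keys"   # price, cost (one per line)
```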

Other special syntax (PCRE)

Symbol	Explanation	Example of use
`(?:pattern)`	Non-capturing group	`(?:abc)+` → One or more repetitions of “abc”
`(?i)`	Case insensitive	`(?i)abc` → ABC, abc, Abc, etc.

sed replacement command flags

The following flags can be used with the sed substitution command s/search/replacement/flags.

Flag	Description	Example
`g`	Replace all matches in each line	`sed 's/old/new/g'`
`p`	Print the line if a substitution was made	`sed -n 's/old/new/p'`
`i` or `I`	Ignore case (GNU extension)	`sed 's/old/new/gi'`
`number`	Replace only the nth match	`sed 's/old/new/2'` (second match only)
`w file`	Write the substituted line to a file	`sed 's/old/new/w out.txt'`

Flags can be combined, as in sed 's/old/new/gp'.
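
The effect of each flag is easiest to compare on the same input. The sketch below uses made-up strings and variable names; the `I` flag assumes GNU sed.

```shell
# Hypothetical input line with three matches.
line='old old old'

first=$(printf '%s\n' "$line" | sed 's/old/new/')     # no flag: first match only
all=$(printf '%s\n' "$line" | sed 's/old/new/g')      # g: every match
second=$(printf '%s\n' "$line" | sed 's/old/new/2')   # 2: second match only

# -n suppresses normal output; p prints only lines where a substitution happened.
printed=$(printf '%s\n' 'keep this' 'old line' | sed -n 's/old/new/p')

# I (GNU sed): case-insensitive matching, combined with g.
nocase=$(printf '%s\n' 'OLD Old old' | sed 's/old/new/gI')

printf '%s\n' "$first"     # new old old
printf '%s\n' "$all"       # new new new
printf '%s\n' "$second"    # old new old
printf '%s\n' "$printed"   # new line
printf '%s\n' "$nocase"    # new new new
```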

sed command options

Option	Description	Example of use
`-e`	Specify a script (can be given multiple times)	`sed -e 's/a/b/' -e 's/c/d/'`
`-i`	Edit the file directly (GNU sed)	`sed -i 's/old/new/g' file.txt`
`-E` or `-r`	Use extended regular expressions	`sed -E 's/(abc)+/XYZ/g'`
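
Here is a small sketch of these options in action. It assumes GNU sed (the `-i` form without a suffix argument) and uses a temporary file from `mktemp` so that nothing real is modified; all names and inputs are illustrative.

```shell
# -e: apply several scripts in order.
multi=$(printf '%s\n' 'a c' | sed -e 's/a/b/' -e 's/c/d/')

# -E: extended syntax, so ( ) and + need no backslashes.
extended=$(printf '%s\n' 'abcabc-abc' | sed -E 's/(abc)+/XYZ/g')

# -i: edit a file in place (GNU sed syntax), demonstrated on a throwaway file.
tmpfile=$(mktemp)
printf '%s\n' 'old value' > "$tmpfile"
sed -i 's/old/new/' "$tmpfile"
edited=$(cat "$tmpfile")
rm -f "$tmpfile"

printf '%s\n' "$multi"      # b d
printf '%s\n' "$extended"   # XYZ-XYZ
printf '%s\n' "$edited"     # new value
```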

Differences in regular expressions for different Linux tools

Tool	Regular expression type	Option	Notes
`grep`	Basic Regular Expressions (BRE)	Default	`+`, `?`, `{}`, `|`, `()` must be escaped with a backslash
`grep -E` / `egrep`	Extended Regular Expressions (ERE)	`-E`	`+`, `?`, `{}`, `|`, `()` can be used directly
`grep -P`	Perl Compatible Regular Expressions (PCRE)	`-P`	`\d`, `\w`, `\s`, lookahead, lookbehind, etc. can be used (experimental feature)
`sed -E`	Extended Regular Expressions (ERE)	`-E` or `-r`	`+`, `?`, `{}`, `|`, `()` can be used directly
`awk`	Extended Regular Expressions (ERE)	Default	AWK-specific syntax and limitations
`perl`	Perl Compatible Regular Expressions (PCRE)	Default	The most powerful and versatile
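
To compare the tools side by side, the sketch below looks for repetitions of "ab" with each one. The data and variable names are made up; the awk line uses `+` rather than an interval, on the assumption that some awk implementations do not support `{n}`.

```shell
# Hypothetical input.
data=$(printf '%s\n' 'ab' 'abab' 'abc')

# grep (BRE): group and interval need backslashes.
grep_bre=$(printf '%s\n' "$data" | grep '^\(ab\)\{2\}$')

# grep -E and sed -E (ERE): the same pattern without backslashes.
grep_ere=$(printf '%s\n' "$data" | grep -E '^(ab){2}$')
sed_ere=$(printf '%s\n' "$data" | sed -n -E '/^(ab){2}$/p')

# awk uses ERE by default; a bare /regex/ prints the matching lines.
awk_ere=$(printf '%s\n' "$data" | awk '/^(ab)+$/')

printf '%s\n' "$grep_bre"   # abab
printf '%s\n' "$grep_ere"   # abab
printf '%s\n' "$sed_ere"    # abab
printf '%s\n' "$awk_ere"    # ab, abab
```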

Notes

1. Difference between basic regular expressions (BRE) and extended regular expressions (ERE)

In basic regular expressions, metacharacters such as +, ?, |, (), {} need to be escaped with a backslash when using them, but in extended regular expressions they can be used directly.

# BRE (basic regular expressions)
grep 'a\{5\}' file.txt          # matches aaaaa
grep 'aa\+' file.txt            # matches aa, aaa, aaaa... (GNU extension)

# ERE (extended regular expressions)
grep -E 'a{5}' file.txt         # matches aaaaa
grep -E 'aa+' file.txt          # matches aa, aaa, aaaa...

2. Using PCRE (Perl Compatible Regular Expressions)

GNU grep supports PCRE through the -P option, which makes shorthand character classes such as \d, \w, \s and lookahead/lookbehind assertions available. However, grep -P is an experimental feature and may not be supported in all environments.

# PCRE examples
grep -P '\d{3}-\d{4}' file.txt  # phone numbers such as 123-4567
grep -P '\w+@\w+\.\w+' file.txt # email addresses (simplified pattern)

3. Basic and extended regular expressions in sed

sed enables extended regular expressions with the -E option (-r in some environments).

# BRE
sed 's/\(abc\)\+/XYZ/g' file.txt

# ERE
sed -E 's/(abc)+/XYZ/g' file.txt

Practical examples

Search with grep

# Basic search
grep 'error' log.txt

# Multiple patterns with extended regular expressions
grep -E 'error|warning|fatal' log.txt

# Show line numbers
grep -n 'error' log.txt

# Case-insensitive search
grep -i 'error' log.txt

# Lines consisting only of digits (PCRE)
grep -P '^\d+$' file.txt
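
To see what these searches actually return, here is a sketch that runs similar commands against a small inline log instead of a file. The log contents and variable names are invented, and the last command uses `[0-9]` as a portable stand-in for `\d`.

```shell
# Hypothetical log contents.
log=$(printf '%s\n' 'INFO start' 'error: disk full' 'Warning: low memory' 'ERROR: timeout' '12345')

# Alternation is case-sensitive by default.
case_sensitive=$(printf '%s\n' "$log" | grep -E 'error|warning|fatal')

# -i ignores case, -n prefixes each match with its line number.
numbered=$(printf '%s\n' "$log" | grep -inE 'error|warning|fatal')

# -c counts matching lines instead of printing them.
error_count=$(printf '%s\n' "$log" | grep -ci 'error')

# Digits-only lines without PCRE.
digits_only=$(printf '%s\n' "$log" | grep -E '^[0-9]+$')

printf '%s\n' "$case_sensitive"   # error: disk full
printf '%s\n' "$numbered"         # 2:error: disk full, 3:Warning: low memory, 4:ERROR: timeout
printf '%s\n' "$error_count"      # 2
printf '%s\n' "$digits_only"      # 12345
```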

Replacement with sed

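# Basic substitution (first match on each line)
sed 's/old/new/' file.txt

# Replace all matches
sed 's/old/new/g' file.txt

# Edit the file in place
sed -i 's/old/new/g' file.txt

# Use extended regular expressions (YYYY-MM-DD → DD/MM/YYYY)
sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/g' file.txt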
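
The date substitution above is worth tracing on a concrete line. In this sketch the input sentence and variable names are made up, and the second command shows one possible variation: using `#` as the delimiter so the slashes in the replacement need no escaping.

```shell
# Hypothetical input containing two ISO-style dates.
iso='Released on 2024-03-15 and patched on 2024-04-01'

# Groups 1, 2, 3 capture year, month, day; the replacement reorders them.
dmy=$(printf '%s\n' "$iso" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/g')

# Same substitution with '#' as the delimiter instead of '/'.
dmy_alt=$(printf '%s\n' "$iso" | sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#g')

printf '%s\n' "$dmy"       # Released on 15/03/2024 and patched on 01/04/2024
printf '%s\n' "$dmy_alt"   # Released on 15/03/2024 and patched on 01/04/2024
```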

Summary

Keep in mind that Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE) escape special characters differently. Unicode-related features are not supported by every tool and are mainly available in tools that use Perl Compatible Regular Expressions (PCRE).
Supported features also vary with the Linux distribution and tool version, so in the end you need to write regular expressions that suit your environment.

I tend to forget things that I don’t use often, and it often takes time to find them, so I keep them as notes.
