bokumin.org


Linux Regular Expressions (Regex) Cheat Sheet

This article is a translation of my original article.

 

 

* Translated automatically by Google.
* Please note that some links, referenced content, or code comments in this article may be in Japanese.

 

by bokumin

 


 

I made a quick cheat sheet for Linux regular expressions. I hope you find it useful as a reference.

 

Special characters (metacharacters)

 

Symbol  Explanation                             Usage example (matching characters)
.       Matches any single character            a.c → abc, adc, a1c, etc.
^       Matches the beginning of the line       ^abc → lines starting with “abc”
$       Matches the end of the line             abc$ → lines ending with “abc”
*       Zero or more of the previous character  ab*c → ac, abc, abbc, abbbc…
\       Escapes the next character              \. → the dot character itself
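
As a quick check of the symbols above, here is a minimal sketch that runs each one against a few sample lines. The sample strings and the helper name grep_sample are made up for this example, and it only assumes a POSIX shell with grep and paste.

```shell
# Made-up sample lines for this sketch, one candidate per line.
sample=$(printf '%s\n' 'abc' 'a.c' 'ac' 'abbc' 'xabc' 'abcx')

# Hypothetical helper: run grep on the sample and join matching lines with commas.
grep_sample() {
  printf '%s\n' "$sample" | grep "$@" | paste -sd, -
}

# -x makes grep match the whole line, which keeps the results easy to read.
dot_any=$(grep_sample -x 'a.c')       # "." matches any single character
dot_literal=$(grep_sample -x 'a\.c')  # "\." matches only a literal dot
starts=$(grep_sample '^abc')          # "^" anchors to the start of the line
ends=$(grep_sample 'abc$')            # "$" anchors to the end of the line
star=$(grep_sample -x 'ab*c')         # "*" repeats the previous character zero or more times

printf 'dot_any=%s\n' "$dot_any"          # abc,a.c
printf 'dot_literal=%s\n' "$dot_literal"  # a.c
printf 'starts=%s\n' "$starts"            # abc,abcx
printf 'ends=%s\n' "$ends"                # abc,xabc
printf 'star=%s\n' "$star"                # abc,ac,abbc
```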

 

* +, ?, |, (), {} can be used directly in Extended Regular Expressions (ERE) or PCRE. In Basic Regular Expressions (BRE), grouping and intervals are written \( \) and \{ \}, and GNU tools also accept \+, \?, and \| as extensions.

 


 

Character classes and ranges

 

Symbol  Explanation                                        Usage example (matching characters)
[abc]   Matches any one character inside the brackets      [abc] → a, b, or c
[^abc]  Matches any one character not inside the brackets  [^abc] → any character other than a, b, or c
[a-z]   Range (lowercase letters a to z)                   [a-z] → one lowercase letter
[A-Z]   Range (uppercase letters A to Z)                   [A-Z] → one uppercase letter
[0-9]   Range (digits 0 to 9)                              [0-9] → one digit

 


 

POSIX character class

 

To use one of these, place it inside a bracket expression, which gives double brackets such as [[:alnum:]].

 

Symbol      Explanation                 Example of use
[:alnum:]   Alphanumeric characters     [[:alnum:]] → a-z, A-Z, 0-9
[:alpha:]   Letters                     [[:alpha:]] → a-z, A-Z
[:blank:]   Blanks (spaces and tabs)    [[:blank:]] → space or tab
[:digit:]   Digits                      [[:digit:]] → 0-9
[:lower:]   Lowercase letters           [[:lower:]] → a-z
[:upper:]   Uppercase letters           [[:upper:]] → A-Z
[:space:]   Whitespace characters       [[:space:]] → spaces, tabs, line breaks, etc.
[:punct:]   Punctuation characters      [[:punct:]] → punctuation marks
[:xdigit:]  Hexadecimal digits          [[:xdigit:]] → 0-9, a-f, A-F
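
To see a few of these classes in action, the sketch below pulls matching characters out of one sample line. The sample line is made up, and grep -o is assumed to be available (it is a GNU/BSD extension, not part of POSIX grep).

```shell
# Use the C locale so the classes cover only ASCII in this sketch.
export LC_ALL=C

# Made-up sample line.
line='User42 paid $3.50, OK?'

# grep -o prints each match on its own line; paste/tr join them again.
digits=$(printf '%s\n' "$line" | grep -o '[[:digit:]][[:digit:]]*' | paste -sd, -)
uppers=$(printf '%s\n' "$line" | grep -o '[[:upper:]]' | tr -d '\n')
puncts=$(printf '%s\n' "$line" | grep -o '[[:punct:]]' | tr -d '\n')

# Classes can be combined and negated inside a single bracket expression.
not_alnum_space=$(printf '%s\n' "$line" | grep -o '[^[:alnum:][:space:]]' | tr -d '\n')

printf 'digits=%s\n' "$digits"                    # 42,3,50
printf 'uppers=%s\n' "$uppers"                    # UOK
printf 'puncts=%s\n' "$puncts"                    # $.,?
printf 'not_alnum_space=%s\n' "$not_alnum_space"  # $.,?
```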

 


 

Repetition

 

Basic regular expressions (BRE: grep, sed)

 

Symbol   Description                                           Example of use
\{n\}    Repeats the previous character exactly n times        a\{3\} → aaa
\{n,\}   Repeats the previous character n or more times        a\{2,\} → aa, aaa, aaaa…
\{n,m\}  Repeats the previous character between n and m times  a\{2,4\} → aa, aaa, aaaa

 

Extended regular expressions (ERE: grep -E, egrep)

 

Symbol  Explanation                                           Usage example
+       One or more repetitions of the previous character     ab+c → abc, abbc, abbbc…
?       Zero or one occurrence of the previous character      ab?c → ac, abc
{n}     Repeats the previous character exactly n times        a{3} → aaa
{n,}    Repeats the previous character n or more times        a{2,} → aa, aaa, aaaa…
{n,m}   Repeats the previous character between n and m times  a{2,4} → aa, aaa, aaaa

 

* GNU grep also accepts \+ and \? in basic regular expressions, but these are GNU extensions.
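
The BRE and ERE forms can be compared side by side with a small sketch like the one below. The sample strings are made up, and the last line assumes GNU grep because \+ in a BRE is a GNU extension.

```shell
# Made-up samples: runs of "a", and variations of "a", optional "b"s, "c".
runs=$(printf '%s\n' 'a' 'aa' 'aaa' 'aaaa' 'aaaaa')
words=$(printf '%s\n' 'ac' 'abc' 'abbc')

# -x matches whole lines only, so "aaaaa" does not count as containing "aaaa".
bre_interval=$(printf '%s\n' "$runs" | grep -x 'a\{2,4\}' | paste -sd, -)  # BRE: escaped braces
ere_interval=$(printf '%s\n' "$runs" | grep -Ex 'a{2,4}' | paste -sd, -)   # ERE: plain braces
ere_plus=$(printf '%s\n' "$words" | grep -Ex 'ab+c' | paste -sd, -)        # one or more "b"
ere_optional=$(printf '%s\n' "$words" | grep -Ex 'ab?c' | paste -sd, -)    # zero or one "b"
gnu_bre_plus=$(printf '%s\n' "$words" | grep -x 'ab\+c' | paste -sd, -)    # GNU extension in BRE

printf 'bre_interval=%s\n' "$bre_interval"  # aa,aaa,aaaa
printf 'ere_interval=%s\n' "$ere_interval"  # aa,aaa,aaaa
printf 'ere_plus=%s\n' "$ere_plus"          # abc,abbc
printf 'ere_optional=%s\n' "$ere_optional"  # ac,abc
printf 'gnu_bre_plus=%s\n' "$gnu_bre_plus"  # abc,abbc
```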

 


 

Alternation and grouping

 

Basic regular expressions (BRE)

 

Symbol  Explanation                               Usage example
\|      Alternation (OR operator, GNU extension)  cat\|dog → “cat” or “dog”
\( \)   Grouping                                  \(cat\|dog\)s → “cats” or “dogs”

 

Extended regular expressions (ERE)

 

Notation  Explanation                Usage example
|         Alternation (OR operator)  cat|dog → “cat” or “dog”
( )       Grouping                   (cat|dog)s → “cats” or “dogs”
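
Grouping changes what the OR operator applies to, which is easy to get wrong. The following sketch, with made-up sample words, shows the difference. The BRE line assumes GNU grep, since \| is a GNU extension.

```shell
# Made-up sample words.
animals=$(printf '%s\n' 'cats' 'dogs' 'birds' 'cat' 'hotdogs')

# With a group, the trailing "s" applies to both alternatives.
grouped=$(printf '%s\n' "$animals" | grep -E '^(cat|dog)s$' | paste -sd, -)

# Without a group, the alternatives are "cat" and "dogs" (-x matches whole lines).
ungrouped=$(printf '%s\n' "$animals" | grep -Ex 'cat|dogs' | paste -sd, -)

# The same grouped pattern written as a BRE (GNU grep assumed).
gnu_bre=$(printf '%s\n' "$animals" | grep '^\(cat\|dog\)s$' | paste -sd, -)

printf 'grouped=%s\n' "$grouped"      # cats,dogs
printf 'ungrouped=%s\n' "$ungrouped"  # dogs,cat
printf 'gnu_bre=%s\n' "$gnu_bre"      # cats,dogs
```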

 


 

Escape sequences and special expressions

 

Common to BRE and ERE

 

Symbol  Explanation          Example of use
\n      Line feed character  line\n → line break after “line”
\t      Tab character        column\t → tab after “column”

* GNU sed interprets these escapes. grep without -P does not treat \t as a tab.

 

PCRE only (grep -P)

 

Symbol  Explanation                                 Example of use
\d      One digit; equivalent to [0-9]              \d{3} → a 3-digit number
\D      One non-digit; equivalent to [^0-9]         \D+ → one or more non-digit characters
\w      Word character; equivalent to [a-zA-Z0-9_]  \w+ → one or more word characters
\W      Non-word character                          \W+ → one or more non-word characters
\s      Whitespace character                        \s+ → one or more whitespace characters
\S      Non-whitespace character                    \S+ → one or more non-whitespace characters
\b      Word boundary                               \bcat\b → only the word “cat”
\B      Non-word boundary                           \Bcat\B → “cat” in the middle of a word

 

* \d and \D are features of Perl Compatible Regular Expressions (PCRE) and, with grep, can only be used with grep -P. GNU grep and GNU sed also accept \w, \W, \s, \S, \b, and \B without -P as GNU extensions, but these are not part of POSIX.
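
Here is a small sketch of the shorthand classes with grep -P. It assumes a GNU grep built with PCRE support, and the sample line is made up.

```shell
# Made-up sample line.
log='id=007 user=alice_01 status=OK'

# \d{3}: exactly three digits in a row ("01" is too short to match).
three_digits=$(printf '%s\n' "$log" | grep -oP '\d{3}')

# \b\w+\b: whole "words" made of letters, digits, and underscores.
words=$(printf '%s\n' "$log" | grep -oP '\b\w+\b' | paste -sd, -)

# \S+: runs of non-whitespace characters, i.e. the space-separated fields.
fields=$(printf '%s\n' "$log" | grep -oP '\S+' | paste -sd, -)

printf 'three_digits=%s\n' "$three_digits"  # 007
printf 'words=%s\n' "$words"                # id,007,user,alice_01,status,OK
printf 'fields=%s\n' "$fields"              # id=007,user=alice_01,status=OK
```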

 


 

Character code specification

 

Symbol  Explanation                                           Usage example
\ooo    Character with the given octal code (3 digits)        \101 → A
\xhh    Character with the given hexadecimal code (2 digits)  \x41 → A

 


 

Back references

 

Basic regular expressions (BRE)

 

sed 's/\(abc\)\1/XYZ/g'  # abcabc → XYZ

 

Extended regular expressions (ERE)

 

sed -E 's/(abc)\1/XYZ/g'  # abcabc → XYZ

 

Symbol  Explanation                                                   Example of use
\1, \2  Refers to the text matched by an earlier parenthesized group  \(abc\)\1 → abcabc (BRE)
                                                                      (abc)\1 → abcabc (ERE)

 

* Named capture groups, written (?<name>pattern), can only be used with PCRE.
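
Back references are handy both inside the pattern and in the replacement. The sketch below removes doubled words and swaps the two sides of a key=value pair. The sample text is made up, and the \b word boundary assumes GNU sed.

```shell
# Made-up sample text with accidentally doubled words.
text='this is is a test test of of sed'

# ERE: \1 in the pattern must match the same text that group 1 matched.
dedup_ere=$(printf '%s\n' "$text" | sed -E 's/\b([a-z]+) \1\b/\1/g')

# BRE: the same idea with escaped parentheses and no "+" operator.
dedup_bre=$(printf '%s\n' "$text" | sed 's/\b\([a-z][a-z]*\) \1\b/\1/g')

# In the replacement, \1 and \2 insert the captured text in any order.
swapped=$(printf 'width=80\n' | sed -E 's/([a-z]+)=([0-9]+)/\2=\1/')

printf 'dedup_ere=%s\n' "$dedup_ere"  # this is a test of sed
printf 'dedup_bre=%s\n' "$dedup_bre"  # this is a test of sed
printf 'swapped=%s\n' "$swapped"      # 80=width
```

Without \b, the "is" at the end of "this" followed by " is" would also count as a doubled word, so the boundary matters here.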

 


 

Lookahead and lookbehind (PCRE only: grep -P)

 

Symbol        Explanation          Example of use
(?=pattern)   Positive lookahead   foo(?=bar) → “foo” in “foobar”
(?!pattern)   Negative lookahead   foo(?!bar) → “foo” not followed by “bar”
(?<=pattern)  Positive lookbehind  (?<=foo)bar → “bar” in “foobar”
(?<!pattern)  Negative lookbehind  (?<!foo)bar → “bar” not preceded by “foo”
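
Lookarounds are mostly useful with grep -o, because the surrounding context is checked but not printed. A minimal sketch, assuming a GNU grep built with PCRE support and using a made-up sample line:

```shell
# Made-up sample line of key=value pairs.
line='price=100 tax=8 price_old=120'

# Positive lookbehind: digits that come right after "price=".
price=$(printf '%s\n' "$line" | grep -oP '(?<=price=)\d+')

# Positive lookahead: word characters that are followed by "=" (the keys).
keys=$(printf '%s\n' "$line" | grep -oP '\w+(?==)' | paste -sd, -)

# Negative lookbehind: whole numbers that do not come right after "price=".
others=$(printf '%s\n' "$line" | grep -oP '(?<!price=)\b\d+' | paste -sd, -)

# Negative lookahead: words starting with "foo" where "bar" does not follow.
not_bar=$(printf 'foobar foobaz\n' | grep -oP 'foo(?!bar)\w*')

printf 'price=%s\n' "$price"      # 100
printf 'keys=%s\n' "$keys"        # price,tax,price_old
printf 'others=%s\n' "$others"    # 8,120
printf 'not_bar=%s\n' "$not_bar"  # foobaz
```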

 


 

Other special syntax (PCRE)

 

Symbol       Explanation          Example of use
(?:pattern)  Non-capturing group  (?:abc)+ → one or more repetitions of “abc”
(?i)         Case-insensitive     (?i)abc → ABC, abc, Abc, etc.

 


 

sed replacement command flags

 

The following flags can be used with the sed replacement command s/search/replacement/flags.

 

Flag    Description                                Example
g       Replaces all matches in each line          sed 's/old/new/g'
p       Prints the line if a replacement was made  sed -n 's/old/new/p'
i or I  Ignores case (GNU extension)               sed 's/old/new/gi'
number  Replaces only the nth match                sed 's/old/new/2' (second match only)
w file  Writes replaced lines to a file            sed 's/old/new/w out.txt'

 

Flags can be combined, as in sed 's/old/new/gp'.
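
The effect of each flag is easiest to see on one short input. The following is a sketch with made-up sample text. The I flag on the last command assumes GNU sed.

```shell
# Made-up sample line with three matches.
line='old old old'

no_flag=$(printf '%s\n' "$line" | sed 's/old/new/')       # first match only
global=$(printf '%s\n' "$line" | sed 's/old/new/g')       # every match
second_only=$(printf '%s\n' "$line" | sed 's/old/new/2')  # second match only

# -n suppresses normal output, so p prints only the lines that were changed.
changed=$(printf '%s\n' 'keep me' 'old line' | sed -n 's/old/new/p')

# Case-insensitive matching combined with g (GNU sed assumed).
any_case=$(printf 'Old OLD old\n' | sed 's/old/new/gI')

printf 'no_flag=%s\n' "$no_flag"          # new old old
printf 'global=%s\n' "$global"            # new new new
printf 'second_only=%s\n' "$second_only"  # old new old
printf 'changed=%s\n' "$changed"          # new line
printf 'any_case=%s\n' "$any_case"        # new new new
```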

 


 

sed command options

 

Option    Description                                       Example of use
-e        Specifies a script (can be given multiple times)  sed -e 's/a/b/' -e 's/c/d/'
-i        Edits the file directly (GNU sed)                 sed -i 's/old/new/g' file.txt
-E or -r  Uses extended regular expressions                 sed -E 's/(abc)+/XYZ/g'
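
A short sketch of these options on throwaway input. The sample text is made up, the temporary file comes from mktemp, and the -i call uses GNU sed syntax (BSD sed expects a backup suffix argument after -i).

```shell
# -e: several scripts are applied in order to each line.
multi=$(printf 'a c\n' | sed -e 's/a/b/' -e 's/c/d/')

# -E: extended syntax, so ( ) and + need no backslashes.
extended=$(printf 'abcabc!\n' | sed -E 's/(abc)+/XYZ/')

# -i: edit a file in place (GNU sed syntax assumed); use a temp file to be safe.
tmp=$(mktemp)
printf 'old\n' > "$tmp"
sed -i 's/old/new/' "$tmp"
inplace=$(cat "$tmp")
rm -f "$tmp"

printf 'multi=%s\n' "$multi"        # b d
printf 'extended=%s\n' "$extended"  # XYZ!
printf 'inplace=%s\n' "$inplace"    # new
```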

 


 

Differences in regular expressions for different Linux tools

 

Tool             Regular expression type                       Option    Notes
grep             Basic Regular Expressions (BRE)               Default   +, ?, {}, |, () must be escaped with a backslash
grep -E / egrep  Extended Regular Expressions (ERE)            -E        +, ?, {}, |, () can be used directly
grep -P          Perl Compatible Regular Expressions (PCRE)    -P        \d, \w, \s, lookahead, lookbehind, etc. can be used (experimental feature)
sed -E           Extended Regular Expressions (ERE)            -E or -r  +, ?, {}, |, () can be used directly
awk              Extended Regular Expressions (ERE)            Default   AWK-specific syntax and limitations
perl             Perl regular expressions (the model for PCRE) Default   The most powerful and versatile
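
Because grep -E, sed -E, and awk all use ERE, the same pattern can usually be moved between them unchanged. A minimal sketch with made-up log lines:

```shell
# Made-up log lines.
input=$(printf '%s\n' 'error: disk full' 'info: ok' 'warning: low memory')

# The same ERE, /^(error|warning):/, in three tools.
with_grep=$(printf '%s\n' "$input" | grep -E '^(error|warning):')
with_sed=$(printf '%s\n' "$input" | sed -nE '/^(error|warning):/p')
with_awk=$(printf '%s\n' "$input" | awk '/^(error|warning):/')

# Each variable now holds the "error" and "warning" lines, in input order.
printf '%s\n' "$with_grep"
```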

 


 

Notes

 

1. Difference between basic regular expressions (BRE) and extended regular expressions (ERE)

 

In basic regular expressions, grouping and intervals are written with backslashes, \( \) and \{ \}, and GNU tools additionally accept \+, \?, and \| as extensions. In extended regular expressions, +, ?, |, (), and {} can be used directly.

 

# BRE (basic regular expressions)
grep 'a\{5\}' file.txt          # matches aaaaa
grep 'aa\+' file.txt            # matches aa, aaa, aaaa... (GNU extension)

# ERE (extended regular expressions)
grep -E 'a{5}' file.txt         # matches aaaaa
grep -E 'aa+' file.txt          # matches aa, aaa, aaaa...

 

2. Using PCRE (Perl Compatible Regular Expressions)

 

In GNU grep, the -P option enables PCRE, which provides shorthand character classes such as \d, \w, and \s as well as lookahead and lookbehind. However, grep -P only works when grep was built with PCRE support, and older GNU grep documentation describes it as experimental, so it may not be available in all environments.

 

# PCRE examples
grep -P '\d{3}-\d{4}' file.txt  # phone numbers such as 123-4567
grep -P '\w+@\w+\.\w+' file.txt # email addresses (simplified pattern)

 

3. Basic and extended regular expressions in sed

 

sed uses basic regular expressions by default. Extended regular expressions are enabled with the -E option (-r in some environments).

 

# BRE (\+ is a GNU extension)
sed 's/\(abc\)\+/XYZ/g' file.txt

# ERE
sed -E 's/(abc)+/XYZ/g' file.txt

 


 

Practical examples

 

Search with grep

 

# Basic search
grep 'error' log.txt

# Multiple patterns with extended regular expressions
grep -E 'error|warning|fatal' log.txt

# Show line numbers
grep -n 'error' log.txt

# Ignore case
grep -i 'error' log.txt

# Lines containing only digits (PCRE)
grep -P '^\d+$' file.txt
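
To try these options without a real log file, here is a sketch that feeds made-up log lines through a pipe. It also shows -c (count matching lines) and an ERE alternative to the PCRE digits-only pattern for environments where -P is unavailable.

```shell
# Made-up log lines.
log=$(printf '%s\n' 'INFO start' 'error: disk full' 'Warning: high temp' 'ERROR: network down' '12345')

# -n prefixes each matching line with its line number.
numbered=$(printf '%s\n' "$log" | grep -n 'error')

# -i ignores case and -c prints the number of matching lines instead of the lines.
error_count=$(printf '%s\n' "$log" | grep -ic 'error')
problem_count=$(printf '%s\n' "$log" | grep -Eic 'error|warning')

# ERE equivalent of grep -P '^\d+$'.
digits_only=$(printf '%s\n' "$log" | grep -E '^[0-9]+$')

printf 'numbered=%s\n' "$numbered"            # 2:error: disk full
printf 'error_count=%s\n' "$error_count"      # 2
printf 'problem_count=%s\n' "$problem_count"  # 3
printf 'digits_only=%s\n' "$digits_only"      # 12345
```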

 

Replacement with sed

 

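# Basic replacement (first match in each line)
sed 's/old/new/' file.txt

# Replace all matches
sed 's/old/new/g' file.txt

# Edit the file directly
sed -i 's/old/new/g' file.txt

# Use extended regular expressions (YYYY-MM-DD → DD/MM/YYYY)
sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/g' file.txt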
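
The date command above can be checked on sample input as follows. The dates are made up, and the second variant is only a sketch of an alternative: it uses # as the delimiter so the slashes in the replacement need no escaping.

```shell
# Made-up lines containing dates in YYYY-MM-DD form.
dates=$(printf '%s\n' 'released 2024-01-31' 'from 2023-12-01 to 2024-02-29')

# The command from the article: three capture groups, emitted in reverse order.
swapped=$(printf '%s\n' "$dates" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/g')

# Same substitution with "#" as the delimiter instead of "/".
swapped_alt=$(printf '%s\n' "$dates" | sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#g')

printf '%s\n' "$swapped"
# released 31/01/2024
# from 01/12/2023 to 29/02/2024
```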

 


 

Summary

 

Keep in mind that Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE) escape special characters differently. Unicode-related features are not supported by every tool and are mainly available in tools that use Perl Compatible Regular Expressions (PCRE).
Supported features also vary with the Linux distribution and tool version, so in the end it is important to use regular expressions that suit your environment.

 

I tend to forget things that I don’t use often, and it often takes time to find them, so I keep them as notes.