ABAP Regular Expressions: Pattern Search and Replace

Usage Options

FIND … REGEX – Search pattern in string
REPLACE … REGEX – Replace pattern
matches() – Check if string matches pattern
cl_abap_regex / cl_abap_matcher – Object-oriented API

Regex Syntax Overview

Character	Meaning	Example
`.`	Any character	`a.c` -> abc, aXc
`*`	0 or more	`ab*c` -> ac, abc, abbc
`+`	1 or more	`ab+c` -> abc, abbc
`?`	0 or 1	`ab?c` -> ac, abc
`^`	Start	`^Hello`
`$`	End	`World$`
`[abc]`	Character class	`[aeiou]` -> vowels
`[^abc]`	Negated class	`[^0-9]` -> non-digits
`[a-z]`	Range	`[A-Za-z]` -> letters
`\d`	Digit [0-9]	`\d{4}` -> 4 digits
`\w`	Word character [a-zA-Z0-9_]	`\w+`
`\s`	Whitespace	`\s+` -> spaces
`\b`	Word boundary	`\bword\b`
`{n}`	Exactly n times	`a{3}` -> aaa
`{n,m}`	n to m times	`a{2,4}` -> aa, aaa, aaaa
`(...)`	Group	`(ab)+` -> ab, abab
`|`	Or	`cat|dog`
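
To make the table above concrete, here is a minimal sketch that checks a few of its rows with the built-in matches() function, which succeeds only if the entire string matches. The sample values are illustrative and not from a real application. ASSERT is used only to state the expected result of each check.

```abap
" Quantifiers
ASSERT matches( val = 'abbc' regex = 'ab*c' ).      " * : zero or more b
ASSERT matches( val = 'ac' regex = 'ab?c' ).        " ? : b is optional
ASSERT NOT matches( val = 'ac' regex = 'ab+c' ).    " + : at least one b required

" Character classes and counted repetition
ASSERT matches( val = '2024' regex = '\d{4}' ).
ASSERT NOT matches( val = 'aaaaa' regex = 'a{2,4}' ).  " five a's exceed the maximum

" Grouping and alternation
ASSERT matches( val = 'abab' regex = '(ab)+' ).
ASSERT matches( val = 'dog' regex = 'cat|dog' ).
```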

Examples

1. FIND with REGEX

DATA: lv_text TYPE string VALUE 'Order 12345 from 2024-11-15'.

" Find number
FIND REGEX '\d+' IN lv_text MATCH OFFSET DATA(lv_offset)
                            MATCH LENGTH DATA(lv_length).

IF sy-subrc = 0.
  DATA(lv_number) = substring( val = lv_text off = lv_offset len = lv_length ).
  WRITE: / 'Found:', lv_number.  " 12345
ENDIF.

2. Find All Matches (FIND ALL OCCURRENCES)

DATA: lv_text TYPE string VALUE 'Tel: 030-12345, Fax: 040-67890, Mobile: 0170-9876543'.

" Find all phone numbers
FIND ALL OCCURRENCES OF REGEX '\d{3,4}-\d+'
  IN lv_text
  RESULTS DATA(lt_results).

LOOP AT lt_results INTO DATA(ls_result).
  DATA(lv_phone) = substring( val = lv_text
                              off = ls_result-offset
                              len = ls_result-length ).
  WRITE: / 'Phone:', lv_phone.
ENDLOOP.

" Output:
" Phone: 030-12345
" Phone: 040-67890
" Phone: 0170-9876543

3. Extract Groups (Submatches)

DATA: lv_date TYPE string VALUE 'Date: 2024-11-15'.

" Extract date with groups
FIND REGEX '(\d{4})-(\d{2})-(\d{2})'
  IN lv_date
  SUBMATCHES DATA(lv_year) DATA(lv_month) DATA(lv_day).

IF sy-subrc = 0.
  WRITE: / 'Year:', lv_year.    " 2024
  WRITE: / 'Month:', lv_month.  " 11
  WRITE: / 'Day:', lv_day.      " 15
ENDIF.

4. REPLACE with REGEX

DATA: lv_text TYPE string VALUE 'Price: 123.45 EUR, Discount: 10.00 EUR'.

" Replace numbers with XXX
REPLACE ALL OCCURRENCES OF REGEX '\d+\.?\d*'
  IN lv_text WITH 'XXX'.

WRITE: / lv_text.  " Price: XXX EUR, Discount: XXX EUR

5. Backreferences

DATA: lv_text TYPE string VALUE 'The the cat sits on on the roof.'.

" Remove duplicate words (backreference \1)
" IGNORING CASE lets "The the" count as a repetition
REPLACE ALL OCCURRENCES OF REGEX '\b(\w+)\s+\1\b'
  IN lv_text WITH '$1' IGNORING CASE.

WRITE: / lv_text.  " The cat sits on the roof.

6. matches() – Check If Pattern Matches

DATA: lv_email TYPE string VALUE '[email protected]'.

" Simple email validation
IF matches( val = lv_email regex = '^\w+@\w+\.\w+$' ).
  WRITE: / 'Valid email'.
ELSE.
  WRITE: / 'Invalid email'.
ENDIF.

" Phone number check, result stored as a boolean
DATA: lv_phone TYPE string VALUE '+49-170-1234567'.

DATA(lv_valid_phone) = xsdbool(
  matches( val = lv_phone regex = '^\+?\d{2,3}-\d{2,4}-\d{4,}$' )
).

7. contains() with Regex

DATA: lv_text TYPE string VALUE 'Order No. 12345 has been shipped'.

" Contains number?
IF contains( val = lv_text regex = '\d+' ).
  WRITE: / 'Text contains numbers'.
ENDIF.

" Starts with pattern
IF contains( val = lv_text regex = '^Order' ).
  WRITE: / 'This is an order'.
ENDIF.

8. count() with Regex

DATA: lv_text TYPE string VALUE 'a1b2c3d4e5'.

" Count of digits
DATA(lv_digit_count) = count( val = lv_text regex = '\d' ).
WRITE: / 'Digit count:', lv_digit_count.  " 5

" Count of words
DATA: lv_sentence TYPE string VALUE 'This is an example sentence with words'.
DATA(lv_word_count) = count( val = lv_sentence regex = '\b\w+\b' ).
WRITE: / 'Word count:', lv_word_count.  " 7

9. cl_abap_regex – Object-oriented

DATA: lv_text TYPE string VALUE 'Name: Max Mustermann, Age: 30'.

" Create regex object
DATA(lo_regex) = cl_abap_regex=>create_pcre( pattern = '(\w+):\s*(\S+)' ).

" Create matcher
DATA(lo_matcher) = lo_regex->create_matcher( text = lv_text ).

" Iterate through all matches
WHILE lo_matcher->find_next( ).
  DATA(lv_full)  = lo_matcher->get_submatch( 0 ).  " index 0 = entire match
  DATA(lv_key)   = lo_matcher->get_submatch( 1 ).
  DATA(lv_value) = lo_matcher->get_submatch( 2 ).

  WRITE: / 'Match:', lv_full.
  WRITE: / '  Key:', lv_key, 'Value:', lv_value.
ENDWHILE.

" Output:
" Match: Name: Max
"   Key: Name Value: Max
" Match: Age: 30
"   Key: Age Value: 30

10. Practical: Extract Emails

DATA: lv_text TYPE string VALUE
  'Contact: [email protected], [email protected] or [email protected]'.

DATA: lt_emails TYPE string_table.

" Find all emails
FIND ALL OCCURRENCES OF REGEX '[\w.+-]+@[\w.-]+\.\w{2,}'
  IN lv_text
  RESULTS DATA(lt_results).

LOOP AT lt_results INTO DATA(ls_result).
  APPEND substring( val = lv_text
                    off = ls_result-offset
                    len = ls_result-length ) TO lt_emails.
ENDLOOP.

LOOP AT lt_emails INTO DATA(lv_email).
  WRITE: / lv_email.
ENDLOOP.

" Output:
" [email protected]
" [email protected]
" [email protected]

11. Practical: Clean Data

" Normalize phone number
DATA: lv_phone TYPE string VALUE '+49 (0) 170 / 123 45 67'.

" Remove all non-digits except +
REPLACE ALL OCCURRENCES OF REGEX '\(0\)' IN lv_phone WITH ''.  " drop the optional "(0)" first
REPLACE ALL OCCURRENCES OF REGEX '[^\d+]' IN lv_phone WITH ''.

WRITE: / lv_phone.  " +491701234567

" Reduce multiple spaces
DATA: lv_text TYPE string VALUE 'Too   many    spaces   here'.

" Use a string literal (backquotes): the trailing blank in ' ' would be ignored
REPLACE ALL OCCURRENCES OF REGEX '\s{2,}' IN lv_text WITH ` `.

WRITE: / lv_text.  " Too many spaces here

12. Practical: Validations

" Validate IBAN (simplified)
DATA: lv_iban TYPE string VALUE 'DE89370400440532013000'.

IF matches( val = lv_iban regex = '^[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}[A-Z0-9]{0,16}$' ).
  WRITE: / 'Valid IBAN format'.
ENDIF.

" Validate ZIP code (Germany)
DATA: lv_plz TYPE string VALUE '12345'.

IF matches( val = lv_plz regex = '^\d{5}$' ).
  WRITE: / 'Valid German ZIP code'.
ENDIF.

" Validate date (YYYY-MM-DD)
DATA: lv_date TYPE string VALUE '2024-11-15'.

IF matches( val = lv_date regex = '^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$' ).
  WRITE: / 'Valid date format'.
ENDIF.

13. Practical: Parsing

" Parse log line
DATA: lv_log TYPE string VALUE '2024-11-15 10:30:45 [ERROR] Database connection failed'.

FIND REGEX '^(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})\s+\[(\w+)\]\s+(.+)$'
  IN lv_log
  SUBMATCHES DATA(lv_date) DATA(lv_time) DATA(lv_level) DATA(lv_message).

IF sy-subrc = 0.
  WRITE: / 'Date:', lv_date.
  WRITE: / 'Time:', lv_time.
  WRITE: / 'Level:', lv_level.
  WRITE: / 'Message:', lv_message.
ENDIF.

14. Practical: CSV Parsing

DATA: lv_csv TYPE string VALUE 'Max;Mustermann;30;Berlin'.
DATA: lt_fields TYPE string_table.

" Split by semicolon (alternative to SPLIT; note that '[^;]+' skips empty fields)
FIND ALL OCCURRENCES OF REGEX '[^;]+' IN lv_csv RESULTS DATA(lt_matches).

LOOP AT lt_matches INTO DATA(ls_match).
  APPEND substring( val = lv_csv off = ls_match-offset len = ls_match-length )
    TO lt_fields.
ENDLOOP.

" Or simpler with SPLIT:
SPLIT lv_csv AT ';' INTO TABLE lt_fields.

15. Case-Insensitive Search

DATA: lv_text TYPE string VALUE 'ABAP is great, abap is cool'.

" Case-insensitive with IGNORING CASE
" (in PCRE patterns, a leading (?i) has the same effect)
FIND ALL OCCURRENCES OF REGEX 'abap'
  IN lv_text
  IGNORING CASE
  MATCH COUNT DATA(lv_count).

WRITE: / 'Found:', lv_count, 'times'.  " 2

16. Escape Function

DATA: lv_search TYPE string VALUE 'a.b*c?'.

" Escape special characters for literal search
DATA(lv_escaped) = escape( val = lv_search format = cl_abap_format=>e_regex ).

WRITE: / 'Escaped:', lv_escaped.  " a\.b\*c\?

" Now lv_escaped can be used in REGEX
DATA: lv_text TYPE string VALUE 'Test a.b*c? end'.

FIND REGEX lv_escaped IN lv_text.
IF sy-subrc = 0.
  WRITE: / 'Found!'.
ENDIF.

Common Regex Patterns

Purpose	Pattern
Number	`\d+` or `[0-9]+`
Decimal number	`\d+\.?\d*`
Word	`\w+` or `[A-Za-z]+`
Email (simple)	`[\w.+-]+@[\w.-]+\.\w{2,}`
URL	`https?://[\w./%-]+`
Date (YYYY-MM-DD)	`\d{4}-\d{2}-\d{2}`
ZIP code (DE)	`\d{5}`
Phone (DE)	`(\+49|0)\d{2,4}[-/]?\d+`
IP address	`\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}`
Remove whitespace	`\s+` -> `''` (replace with an empty string)
HTML tags	`<[^>]+>`
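
As an illustration of applying one of these patterns, the following sketch strips tags from a short HTML fragment with the built-in replace() function. The variable names and the sample markup are made up for this example, and the pattern is a simplification that will not handle every real-world HTML document.

```abap
DATA(lv_html) = `<p>Hello <b>World</b></p>`.

" occ = 0 replaces every occurrence; with = `` inserts nothing
DATA(lv_plain) = replace( val   = lv_html
                          regex = '<[^>]+>'
                          with  = ``
                          occ   = 0 ).

WRITE: / lv_plain.  " Hello World
```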

FIND/REPLACE Options

FIND REGEX pattern IN text
  [ IGNORING CASE ]           " Ignore case
  [ MATCH OFFSET off ]        " Start position of match
  [ MATCH LENGTH len ]        " Length of match
  [ MATCH COUNT cnt ]         " Number of matches
  [ SUBMATCHES s1 s2 ... ]    " Group contents
  [ RESULTS result_tab ].     " All matches as table
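
Several of these options can be combined in one statement. The sketch below, with made-up sample text and variable names, uses IGNORING CASE, MATCH COUNT and RESULTS together and reads the group content from the SUBMATCHES component of each result row.

```abap
DATA(lv_text) = `Error E100, error e200, ERROR E300`.

FIND ALL OCCURRENCES OF REGEX 'error\s+(e\d+)'
  IN lv_text
  IGNORING CASE
  MATCH COUNT DATA(lv_count)
  RESULTS DATA(lt_results).

WRITE: / 'Matches:', lv_count.  " 3

LOOP AT lt_results INTO DATA(ls_result).
  " Row 1 of SUBMATCHES holds offset and length of the first group
  READ TABLE ls_result-submatches INTO DATA(ls_sub) INDEX 1.
  IF sy-subrc = 0.
    DATA(lv_code) = substring( val = lv_text
                               off = ls_sub-offset
                               len = ls_sub-length ).
    WRITE: / 'Code:', lv_code.
  ENDIF.
ENDLOOP.
```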

Important Notes / Best Practice

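PCRE syntax (Perl-compatible) is available via cl_abap_regex=>create_pcre().
The REGEX addition of FIND and REPLACE uses POSIX-style syntax.
Use escape() to escape special characters before using arbitrary text as a pattern.
Use groups (...) together with SUBMATCHES to extract parts of a match.
For case-insensitive search, use IGNORING CASE; in PCRE patterns, (?i) at the beginning has the same effect.
Performance: create the regex once with cl_abap_regex and reuse it when matching repeatedly.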
matches() checks if entire string matches the pattern.
contains( ... regex = ...) checks if pattern is contained.
\d, \w, \s are shorthand for character classes.
Test regex with online tools (regex101.com) before implementing.
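
The note on compiled regexes can be sketched as follows: the pattern is compiled once outside the loop and only a new matcher is created per line. The invoice format, the table contents and the variable names are assumptions for this example, and create_pcre() requires a release that supports PCRE.

```abap
DATA(lt_lines) = VALUE string_table( ( `INV-1001` ) ( `n/a` ) ( `INV-2002` ) ).

" Compile the pattern once, outside the loop
DATA(lo_regex) = cl_abap_regex=>create_pcre( pattern = '^INV-(\d+)$' ).

LOOP AT lt_lines INTO DATA(lv_line).
  " A matcher is bound to one text, so create one per line
  DATA(lo_matcher) = lo_regex->create_matcher( text = lv_line ).
  IF lo_matcher->match( ).
    DATA(lv_no) = lo_matcher->get_submatch( 1 ).
    WRITE: / 'Invoice number:', lv_no.
  ENDIF.
ENDLOOP.
```

Only the two lines that start with INV- produce output; the line `n/a` does not match and is skipped.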