PHP Conference Japan 2024

preg_replace

(PHP 4, PHP 5, PHP 7, PHP 8)

preg_replacePerform a regular expression search and replace

Description

preg_replace(
    string|array $pattern,
    string|array $replacement,
    string|array $subject,
    int $limit = -1,
    int &$count = null
): string|array|null

Searches subject for matches to pattern and replaces them with replacement.

To match an exact string, rather than a pattern, consider using str_replace() or str_ireplace() instead of this function.

Parameters

pattern

The pattern to search for. It can be either a string or an array with strings.

Several PCRE modifiers are also available.

replacement

The string or an array with strings to replace. If this parameter is a string and the pattern parameter is an array, all patterns will be replaced by that string. If both pattern and replacement parameters are arrays, each pattern will be replaced by the replacement counterpart. If there are fewer elements in the replacement array than in the pattern array, any extra patterns will be replaced by an empty string.

replacement may contain references of the form \n or $n, with the latter form being the preferred one. Every such reference will be replaced by the text captured by the n'th parenthesized pattern. n can be from 0 to 99, and \0 or $0 refers to the text matched by the whole pattern. Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern. Note that backslashes in string literals may require to be escaped.

When working with a replacement pattern where a backreference is immediately followed by another number (i.e.: placing a literal number immediately after a matched pattern), you cannot use the familiar \1 notation for your backreference. \11, for example, would confuse preg_replace() since it does not know whether you want the \1 backreference followed by a literal 1, or the \11 backreference followed by nothing. In this case the solution is to use ${1}1. This creates an isolated $1 backreference, leaving the 1 as a literal.

When using the deprecated e modifier, this function escapes some characters (namely ', ", \ and NULL) in the strings that replace the backreferences. This is done to ensure that no syntax errors arise from backreference usage with either single or double quotes (e.g. 'strlen(\'$1\')+strlen("$2")'). Make sure you are aware of PHP's string syntax to know exactly how the interpreted string will look.

subject

The string or an array with strings to search and replace.

If subject is an array, then the search and replace is performed on every entry of subject, and the return value is an array as well.

If the subject array is associative, keys will be preserved in the returned value.

limit

The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).

count

If specified, this variable will be filled with the number of replacements done.

Return Values

preg_replace() returns an array if the subject parameter is an array, or a string otherwise.

If matches are found, the new subject will be returned, otherwise subject will be returned unchanged or null if an error occurred.

Errors/Exceptions

Using the "\e" modifier is an error; an E_WARNING is emitted in this case.

If the regex pattern passed does not compile to a valid regex, an E_WARNING is emitted.

Examples

Example #1 Using backreferences followed by numeric literals

<?php
$string
= 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo
preg_replace($pattern, $replacement, $string);
?>

The above example will output:

April1,2003

Example #2 Using indexed arrays with preg_replace()

<?php
$string
= 'The quick brown fox jumps over the lazy dog.';
$patterns = array();
$patterns[0] = '/quick/';
$patterns[1] = '/brown/';
$patterns[2] = '/fox/';
$replacements = array();
$replacements[2] = 'bear';
$replacements[1] = 'black';
$replacements[0] = 'slow';
echo
preg_replace($patterns, $replacements, $string);
?>

The above example will output:

The bear black slow jumps over the lazy dog.

By ksorting patterns and replacements, we should get what we wanted.

<?php
ksort
($patterns);
ksort($replacements);
echo
preg_replace($patterns, $replacements, $string);
?>

The above example will output:

The slow black bear jumps over the lazy dog.

Example #3 Replacing several values

<?php
$patterns
= array ('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/',
'/^\s*{(\w+)}\s*=/');
$replace = array ('\3/\4/\1\2', '$\1 =');
echo
preg_replace($patterns, $replace, '{startDate} = 1999-5-27');
?>

The above example will output:

$startDate = 5/27/1999

Example #4 Strip whitespace

This example strips excess whitespace from a string.

<?php
$str
= 'foo o';
$str = preg_replace('/\s\s+/', ' ', $str);
// This will be 'foo o' now
echo $str;
?>

Example #5 Using the count parameter

<?php
$count
= 0;

echo
preg_replace(array('/\d/', '/\s/'), '*', 'xp 4 to', -1 , $count);
echo
$count; //3
?>

The above example will output:

xp***to
3

Notes

Note:

When using arrays with pattern and replacement, the keys are processed in the order they appear in the array. This is not necessarily the same as the numerical index order. If you use indexes to identify which pattern should be replaced by which replacement, you should perform a ksort() on each array prior to calling preg_replace().

Note:

When both pattern and replacement are arrays, matching rules will operate sequentially. That is, the second pattern/replacement pair will operate on the string that results from the first pattern/replacement pair, not the original string. If you want to simulate replacements operating in parallel, such as swapping two values, replace one pattern by an intermediary placeholder, then in a later pair replace that intermediary placeholder with the desired replacement.

<?php
$p
= array('/a/', '/b/', '/c/');
$r = array('b', 'c', 'd');
print_r(preg_replace($p, $r, 'a'));
// prints d
?>

See Also

add a note

User Contributed Notes 10 notes

up
781
arkani at iol dot pt
15 years ago
Because i search a lot 4 this:

The following should be escaped if you are trying to match that character

\ ^ . $ | ( ) [ ]
* + ? { } ,

Special Character Definitions
\ Quote the next metacharacter
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation
() Grouping
[] Character class
* Match 0 or more times
+ Match 1 or more times
? Match 1 or 0 times
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match at least n but not more than m times
More Special Character Stuff
\t tab (HT, TAB)
\n newline (LF, NL)
\r return (CR)
\f form feed (FF)
\a alarm (bell) (BEL)
\e escape (think troff) (ESC)
\033 octal char (think of a PDP-11)
\x1B hex char
\c[ control char
\l lowercase next char (think vi)
\u uppercase next char (think vi)
\L lowercase till \E (think vi)
\U uppercase till \E (think vi)
\E end case modification (think vi)
\Q quote (disable) pattern metacharacters till \E
Even More Special Characters
\w Match a "word" character (alphanumeric plus "_")
\W Match a non-word character
\s Match a whitespace character
\S Match a non-whitespace character
\d Match a digit character
\D Match a non-digit character
\b Match a word boundary
\B Match a non-(word boundary)
\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end
\z Match only at end of string
\G Match only where previous m//g left off (works only with /g)
up
3
Anonymous
8 months ago
You can only use numeric backreferences in the replacement string, but not named ones:
<?php
echo preg_replace('#(\d+)#', '\1 $1 ${1}', '123');
// 123 123 123
echo preg_replace('#(?<digits>\d+)#', '\digits $digits ${digits}', '123');
// \digits $digits ${digits}
?>

To use named backreferences, you have to use preg_replace_callback:
<?php
echo preg_replace_callback('#(?<digits>\d+)#', function( $m ){
return
"$m[1] $m[digits] {$m['digits']}";
},
'123');
// 123 123 123

echo preg_replace_callback('#(?<digits>\d+)#', fn($m) => "$m[1] $m[digits] {$m['digits']}", '123');
// 123 123 123
?>

See https://bugs.php.net/bug.php?id=81469
up
4
nik at rolls dot cc
11 years ago
To split Pascal/CamelCase into Title Case (for example, converting descriptive class names for use in human-readable frontends), you can use the below function:

<?php
function expandCamelCase($source) {
return
preg_replace('/(?<!^)([A-Z][a-z]|(?<=[a-z])[^a-z]|(?<=[A-Z])[0-9_])/', ' $1', $source);
}
?>

Before:
ExpandCamelCaseAPIDescriptorPHP5_3_4Version3_21Beta
After:
Expand Camel Case API Descriptor PHP 5_3_4 Version 3_21 Beta
up
2
ismith at nojunk dot motorola dot com
17 years ago
Be aware that when using the "/u" modifier, if your input text contains any bad UTF-8 code sequences, then preg_replace will return an empty string, regardless of whether there were any matches.

This is due to the PCRE library returning an error code if the string contains bad UTF-8.
up
2
sternkinder at gmail dot com
17 years ago
From what I can see, the problem is, that if you go straight and substitute all 'A's wit 'T's you can't tell for sure which 'T's to substitute with 'A's afterwards. This can be for instance solved by simply replacing all 'A's by another character (for instance '_' or whatever you like), then replacing all 'T's by 'A's, and then replacing all '_'s (or whatever character you chose) by 'A's:

<?php
$dna
= "AGTCTGCCCTAG";
echo
str_replace(array("A","G","C","T","_","-"), array("_","-","G","A","T","C"), $dna); //output will be TCAGACGGGATC
?>

Although I don't know how transliteration in perl works (though I remember that is kind of similar to the UNIX command "tr") I would suggest following function for "switching" single chars:

<?php
function switch_chars($subject,$switch_table,$unused_char="_") {
foreach (
$switch_table as $_1 => $_2 ) {
$subject = str_replace($_1,$unused_char,$subject);
$subject = str_replace($_2,$_1,$subject);
$subject = str_replace($unused_char,$_2,$subject);
}
return
$subject;
}

echo
switch_chars("AGTCTGCCCTAG", array("A"=>"T","G"=>"C")); //output will be TCAGACGGGATC
?>
up
2
php-comments-REMOVE dot ME at dotancohen dot com
16 years ago
Below is a function for converting Hebrew final characters to their
normal equivelants should they appear in the middle of a word.
The /b argument does not treat Hebrew letters as part of a word,
so I had to work around that limitation.

<?php

$text
="עברית מבולגנת";

function
hebrewNotWordEndSwitch ($from, $to, $text) {
$text=
preg_replace('/'.$from.'([א-ת])/u','$2'.$to.'$1',$text);
return
$text;
}

do {
$text_before=$text;
$text=hebrewNotWordEndSwitch("ך","כ",$text);
$text=hebrewNotWordEndSwitch("ם","מ",$text);
$text=hebrewNotWordEndSwitch("ן","נ",$text);
$text=hebrewNotWordEndSwitch("ף","פ",$text);
$text=hebrewNotWordEndSwitch("ץ","צ",$text);
} while (
$text_before!=$text );

print
$text; // עברית מסודרת!

?>

The do-while is necessary for multiple instances of letters, such
as "אנני" which would start off as "אןןי". Note that there's still the
problem of acronyms with gershiim but that's not a difficult one
to solve. The code is in use at http://gibberish.co.il which you can
use to translate wrongly-encoded Hebrew, transliterize, and some
other Hebrew-related functions.

To ensure that there will be no regular characters at the end of a
word, just convert all regular characters to their final forms, then
run this function. Enjoy!
up
-1
me at perochak dot com
13 years ago
If you would like to remove a tag along with the text inside it then use the following code.

<?php
preg_replace
('/(<tag>.+?)+(<\/tag>)/i', '', $string);
?>

example
<?php $string='<span class="normalprice">55 PKR</span>'; ?>

<?php
$string
= preg_replace('/(<span class="normalprice">.+?)+(<\/span>)/i', '', $string);
?>

This will results a null or empty string.

<?php
$string
='My String <span class="normalprice">55 PKR</span>';

$string = preg_replace('/(<span class="normalprice">.+?)+(<\/span>)/i', '', $string);
?>

This will results a " My String"
up
-2
bublifuk at mailinator dot com
6 years ago
A delimiter can be any ASCII non-alphanumeric, non-backslash, non-whitespace character: !"#$%&'*+,./:;=?@^_`|~- and ({[<>]})
up
-2
razvan_bc at yahoo dot com
2 years ago
How to replace all comments inside code without remove crln = \r\n or cr \r each line?

<?php
$txt_target
=<<<t1
this;// dsdsds
nope

/*
ok
*/
is;huge
/*text bla*/
/*bla*/

t1;

/*
=======================================================================
expected result:
=======================================================================
this;
nope

is;huge
=======================================================================
visualizing in a hex viewer .. to_check_with_a_hex_viewer.txt ...
t h i s ; LF TAB n o p e CR LF CR LF i s ; h u g e CR LF
74 68 69 73 3b 0a 09 6e 6f 70 65 0d 0a 0d 0a 69 73 3b 68 75 67 65 0d 0a
I used F3 (viewer + options 3: hex) in mythical TOTAL COMMANDER!
=======================================================================
*/

echo '<hr><pre>';
echo
$txt_target;
echo
'</pre>';

// a single line '//' comments
$txt_target = preg_replace('![ \t]*//.*[ \t]*!', '', $txt_target);

// /* comment */
$txt_target = preg_replace('/\/\*([^\/]*)\*\/(\s+)/smi', '', $txt_target);
echo
'<hr><pre>';
echo
$txt_target;
echo
'</pre><hr>';

file_put_contents('to_check_with_a_hex_viewer.txt',$txt_target);

?>
up
-4
mail at johanvandemerwe dot nl
5 years ago
Sample for replacing bracketed short-codes

The used short-codes are purely used for educational purposes for they could be shorter as in 'italic' to 'i' or 'bold' to 'b'.

Sample text
----
This sample shows how to have [italic]italic[/italic], [bold]bold[/bold] and [underline]underlined[/underline] and [strikethrough]striked[/striketrhough] text.

with this function:

<?php
function textDecoration($html)
{
$patterns = [
'/\[(italic)\].*?\[\/\1\] ?/',
'/\[(bold)\].*?\[\/\1\] ?/',
'/\[(underline)\].*?\[\/\1\] ?/'
];

$replacements = [
'<i>$1</i>',
'<strong>$1</strong>',
'<u>$1</u>'
];

return
preg_replace($patterns, $replacements, $html);
}

$html = textDecoration($html);

echo
$html; // or return
?>

results in:
----
This sample shows how to have <i>italic</i>, <b>bold</b> and <u>underlined</u> and [strikethrough]striked[/striketrhough] text.

Notice!
There is no [strikethrough]striked[/striketrhough] fallback in the patterns and replacements array
To Top