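To avoid double encoding, ENT_HTML5 must be specified in the flags in addition to double_encode=false.
The reason is that, contrary to the documentation, double_encode=false does NOT unconditionally and globally prevent double encoding of all existing entities. Crucially, it skips double encoding ONLY for character entities that are explicitly valid for the selected document type!
Since ENT_HTML5 references the most extensive list of character entities, it is the setting that is most lenient towards existing character entities.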
<?php
declare(strict_types=1);
$text = 'ampersand(&amp;), double quote(&quot;), single quote(&apos;), less than(&lt;), greater than(&gt;), numeric entities(&#x26;&#x22;&#x27;&#x3C;&#x3E;), HTML 5 entities(&plus;&comma;&excl;&dollar;&lpar;&ncedil;&euro;)';
$result3 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_SUBSTITUTE, 'UTF-8', false );
$result4 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_XML1 | ENT_SUBSTITUTE, 'UTF-8', false );
$result5 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_XHTML | ENT_SUBSTITUTE, 'UTF-8', false );
$result6 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_HTML5 | ENT_SUBSTITUTE, 'UTF-8', false );
echo "<br />\r\nHTML 4.01:<br />\r\n", $result3,
"<br />\r\nXML 1:<br />\r\n", $result4,
"<br />\r\nXHTML:<br />\r\n", $result5,
"<br />\r\nHTML 5:<br />\r\n", $result6, "<br />\r\n";
?>
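will produce:
HTML 4.01 (does NOT recognize the single quote, but does recognize the Euro):
ampersand(&amp;), double quote(&quot;), single quote(&amp;apos;), less than(&lt;), greater than(&gt;), numeric entities(&#x26;&#x22;&#x27;&#x3C;&#x3E;), HTML 5 entities(&amp;plus;&amp;comma;&amp;excl;&amp;dollar;&amp;lpar;&amp;ncedil;&euro;)
XML 1 (recognizes the single quote, but NOT the Euro):
ampersand(&amp;), double quote(&quot;), single quote(&apos;), less than(&lt;), greater than(&gt;), numeric entities(&#x26;&#x22;&#x27;&#x3C;&#x3E;), HTML 5 entities(&amp;plus;&amp;comma;&amp;excl;&amp;dollar;&amp;lpar;&amp;ncedil;&amp;euro;)
XHTML (recognizes both the single quote and the Euro):
ampersand(&amp;), double quote(&quot;), single quote(&apos;), less than(&lt;), greater than(&gt;), numeric entities(&#x26;&#x22;&#x27;&#x3C;&#x3E;), HTML 5 entities(&amp;plus;&amp;comma;&amp;excl;&amp;dollar;&amp;lpar;&amp;ncedil;&euro;)
HTML 5 (recognizes all valid character entities):
ampersand(&amp;), double quote(&quot;), single quote(&apos;), less than(&lt;), greater than(&gt;), numeric entities(&#x26;&#x22;&#x27;&#x3C;&#x3E;), HTML 5 entities(&plus;&comma;&excl;&dollar;&lpar;&ncedil;&euro;)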
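
One way to apply this in practice is to wrap the call in a small helper so that ENT_HTML5 is never forgotten. The following is only a minimal sketch: the function name escape_once() and its default of ENT_QUOTES are assumptions made for illustration, not part of PHP or of the example above.

```php
<?php
declare(strict_types=1);

/**
 * Escape text for HTML output while leaving already-present entities alone.
 * NOTE: escape_once() is a hypothetical helper name, not a PHP built-in.
 */
function escape_once(string $text, int $quoteFlags = ENT_QUOTES): string
{
    // ENT_HTML5 selects the widest named-entity table, so double_encode=false
    // leaves the largest set of existing entities untouched.
    return htmlspecialchars($text, $quoteFlags | ENT_HTML5 | ENT_SUBSTITUTE, 'UTF-8', false);
}

// A bare "&" and the tag are escaped; &apos; and &euro; are left as they are.
echo escape_once('Tom & Jerry &apos;n&apos; 10&euro; <b>'), "\n";
```

Text that must be escaped literally (for example, showing the string "&amp;" to the reader) should not go through such a helper; in that case the default double_encode=true is the appropriate setting.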