必须除了 double_encode=false 之外还指定 ENT_HTML5 以避免双重编码。
原因是,与文档相反,double_encode=false 不会无条件且全局地防止所有现有实体的双重编码。至关重要的是,它只会跳过对那些对所选文档类型显式有效的字符实体的双重编码!
由于 ENT_HTML5 参考了最广泛的字符实体列表,因此它是唯一一个对现有字符实体最宽松的设置。
<?php
declare(strict_types=1);
$text = 'ampersand(&), double quote("), single quote('), less than(<), greater than(>), numeric entities(&"'<>), HTML 5 entities(+,!$(ņ€)';
$result3 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_SUBSTITUTE, 'UTF-8', false );
$result4 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_XML1 | ENT_SUBSTITUTE, 'UTF-8', false );
$result5 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_XHTML | ENT_SUBSTITUTE, 'UTF-8', false );
$result6 = htmlspecialchars( $text, ENT_NOQUOTES | ENT_HTML5 | ENT_SUBSTITUTE, 'UTF-8', false );
echo "<br />\r\nHTML 4.01:<br />\r\n", $result3,
"<br />\r\nXML 1:<br />\r\n", $result4,
"<br />\r\nXHTML:<br />\r\n", $result5,
"<br />\r\nHTML 5:<br />\r\n", $result6, "<br />\r\n";
?>
将生成
HTML 4.01(不会识别单引号,但识别欧元符号)
ampersand(&), double quote("), single quote('), less than(<), greater than(>), numeric entities(&"'<>), HTML 5 entities(+,!$(ņ€)
XML 1(会识别单引号,但不会识别欧元符号)
ampersand(&), double quote("), single quote('), less than(<), greater than(>), numeric entities(&"'<>), HTML 5 entities(+,!$(ņ€)
XHTML(识别单引号和欧元符号)
ampersand(&), double quote("), single quote('), less than(<), greater than(>), numeric entities(&"'<>), HTML 5 entities(+,!$(ņ€)
HTML 5(识别“所有”有效的字符实体)
ampersand(&), double quote("), single quote('), less than(<), greater than(>), numeric entities(&"'<>), HTML 5 entities(+,!$(ņ€)