mb_decode_numericentity

(PHP 4 >= 4.0.6, PHP 5, PHP 7, PHP 8)

mb_decode_numericentity — 将 HTML 数字字符串引用解码为字符

描述

mb_decode_numericentity(string $string, array $map, ?string $encoding = null): string

将指定块中string string 的数字字符串引用转换为字符。

参数

string: 要解码的string。
map: map 是一个 array，指定要转换的代码区域。
encoding: encoding 参数是字符编码。如果省略或为 null，则使用内部字符编码值。
is_hex: 此参数未使用。

返回值

转换后的string。

错误/异常

如果 map 不是 int 列表，则抛出 ValueError 异常。

变更日志

版本	描述
8.4.0	如果 `map` 不是 int 列表，mb_decode_numericentity() 现在会抛出 ValueError 异常。
8.0.0	`encoding` 现在可以为空。

示例

示例 #1 map 示例

<?php
$convmap = array (
 int start_code1, int end_code1, int offset1, int mask1,
 int start_code2, int end_code2, int offset2, int mask2,
 ........
 int start_codeN, int end_codeN, int offsetN, int maskN );
// 指定 start_codeN 和 end_codeN 的 Unicode 值
// 向值添加 offsetN 并与 maskN 进行按位“与”运算，
// 然后将值转换为数字字符串引用。
?>

示例 #2 map 示例转义 JavaScript 字符串

<?php
function escape_javascript_string($str) {
 $map = [
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,0,0, // 49
 0,0,0,0,0,0,0,0,1,1,
 1,1,1,1,1,0,0,0,0,0,
 0,0,0,0,0,0,0,0,0,0,
 0,0,0,0,0,0,0,0,0,0,
 0,1,1,1,1,1,1,0,0,0, // 99
 0,0,0,0,0,0,0,0,0,0,
 0,0,0,0,0,0,0,0,0,0,
 0,0,0,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1, // 149
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1, // 199
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1,
 1,1,1,1,1,1,1,1,1,1, // 249
 1,1,1,1,1,1,1, // 255
 ];
 // Char encoding is UTF-8
 $mblen = mb_strlen($str, 'UTF-8');
 $utf32 = bin2hex(mb_convert_encoding($str, 'UTF-32', 'UTF-8'));
 for ($i=0, $encoded=''; $i < $mblen; $i++) {
 $u = substr($utf32, $i*8, 8);
 $v = base_convert($u, 16, 10);
 if ($v < 256 && $map[$v]) {
 $encoded .= '\\x'.substr($u, 6,2);
 } else if ($v == 2028) {
 $encoded .= '\\u2028';
 } else if ($v == 2029) {
 $encoded .= '\\u2029';
 } else {
 $encoded .= mb_convert_encoding(hex2bin($u), 'UTF-8', 'UTF-32');
 }
 }
 return $encoded;
}
 
// Test data
$convmap = [ 0x0, 0xffff, 0, 0xffff ];
$msg = '';
for ($i=0; $i < 1000; $i++) {
 // chr() cannot generate correct UTF-8 data larger value than 128, use mb_decode_numericentity().
 $msg .= mb_decode_numericentity('&#'.$i.';', $convmap, 'UTF-8');
}
 
// var_dump($msg);
var_dump(escape_javascript_string($msg));

参见

mb_encode_numericentity() - 将字符编码为 HTML 数字字符串引用

发现问题？

了解如何改进此页面 • 提交拉取请求 • 报告错误

＋添加备注

用户贡献笔记 4 条笔记

上移

下移

abderrahmanekaddour dot aissat at gmail dot com ¶

2 年前

<?php

// 以下文档取决于对 php mbr 代码源的理解
// 首先为了优化 php 的工作
// 字符串必须包含“&” 否则 php 不会尝试解码。
// 对于 map：int start_codeN，int end_codeN，int offsetN，int maskN
// 实体必须在 [start_codeN，end_codeN] 范围内，如果实体大于或小于
// mb_decode_numericentity 将忽略解码过程并按原样返回 $string。
// 在 php 的后期版本中，$map：“必须具有 4 个元素的倍数”

$map = [ 0x0, 0xFFFF, 0, 0];
echo mb_decode_numericentity('&#109;', $map ); // 结果 "m"
// 如果 offsetN = 1 结果 "l"；您增加的十进制数越多，它使用的 OR 运算符就越多。
$map_2 = [ 0x0, 0xFFFF, 60, 0];
echo mb_decode_numericentity('&#109;', $map_2 ); // 解码 (&#49;) 结果： "1"

// 检查结果的实体引用： https://cs.stanford.edu/people/miles/iso8859.html#ISO

?>

上移

下移

donovan at conduit it ¶

18 年前

请注意，目前 mb_decode_numericentity() 似乎仅适用于十进制实体，而不适用于十六进制实体。这一事实为我的调试节省了大量时间。

对于需要转换十六进制实体的用户，请首先尝试使用 preg_replace() 和 hexdec() 函数的组合将它们全部转换为十进制实体。

上移

下移

dev at glossword info ¶

21 年前

仅供日常使用的两个很棒的函数

/* 将任何 HTML 实体转换为字符 */
function my_numeric2character($t)
{
$convmap = array(0x0, 0x2FFFF, 0, 0xFFFF);
return mb_decode_numericentity($t, $convmap, 'UTF-8');
}
/* 将任何字符转换为 HTML 实体 */
function my_character2numeric($t)
{
$convmap = array(0x0, 0x2FFFF, 0, 0xFFFF);
return mb_encode_numericentity($t, $convmap, 'UTF-8');
}
print my_numeric2character('&#8217; &#7936; &#226;');
print my_character2numeric(' ? ? ');

上移

下移

-2

fernandosilveira at yahoo dot com dot br ¶

4 年前

小心！
除了将数字实体翻译成指定目标编码的字符外，此函数还会将输入字符串中的每个字符编码到指定的 target encodin，即使字符位于转换映射定义的范围之外。