php – Proper way to decode incoming email subject (UTF-8)

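I'm trying to pipe my incoming mail to a PHP script so I can store it in a database, among other things. I'm using the MIME E-mail message parser (registration required) class, although I don't think that matters.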

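I have a problem with email subjects. When the title is in English it works fine, but if the subject uses non-Latin characters I get something like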

=?UTF-8?B?2KLYstmF2KfbjNi0?=

for a title like
یکدوسه

I decode the subject like this:

$subject = str_replace('=?UTF-8?B?','',$subject);
$subject = str_replace('?=','',$subject);
$subject = base64_decode($subject);

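It works fine for short subjects of 10-15 characters, but with a longer title I end up with only half of the original title, followed by some garbled characters.

If the title is even longer, say 30 characters, I get nothing. Am I doing this right?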

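Even though this is almost a year old, I found it while facing a similar problem.

I'm not sure why you're getting odd characters, but perhaps you are trying to display them somewhere your charset is not supported.

Here's some code I wrote that should handle everything except the charset conversion, which is a large problem that many libraries handle much better (PHP's MB library, for instance).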

class mail {
    /**
     * If you change one of these, please check the other for fixes as well
     *
     * @const Pattern to match RFC 2047 charset encodings in mail headers
     */
    const rfc2047header = '/=\?([^ ?]+)\?([BQbq])\?([^ ?]+)\?=/';

    const rfc2047header_spaces = '/(=\?[^ ?]+\?[BQbq]\?[^ ?]+\?=)\s+(=\?[^ ?]+\?[BQbq]\?[^ ?]+\?=)/';

    /**
     * http://www.rfc-archive.org/getrfc.PHP?rfc=2047
     *
     * =?<charset>?<encoding>?<data>?=
     *
     * @param string $header
     */
    public static function is_encoded_header($header) {
        // e.g. =?utf-8?q?Re=3a=20Support=3a=204D09EE9A=20=2d=20Re=3a=20Support=3a=204D078032=20=2d=20wordpress=20Plugin?=
        // e.g. =?utf-8?q?wordpress=20Plugin?=
        return preg_match(self::rfc2047header,$header) !== 0;
    }

    public static function header_charsets($header) {
        $matches = null;
        if (!preg_match_all(self::rfc2047header,$header,$matches,PREG_PATTERN_ORDER)) {
            return array();
        }
        return array_map('strtoupper',$matches[1]);
    }

    public static function decode_header($header) {
        $matches = null;

        /* Repair instances where two encodings are together and separated by a space (strip the spaces) */
        $header = preg_replace(self::rfc2047header_spaces,"$1$2",$header);

        /* Now see if any encodings exist and match them */
        if (!preg_match_all(self::rfc2047header,$header,$matches,PREG_SET_ORDER)) {
            return $header;
        }
        foreach ($matches as $header_match) {
            list($match,$charset,$encoding,$data) = $header_match;
            $encoding = strtoupper($encoding);
            switch ($encoding) {
                case 'B':
                    $data = base64_decode($data);
                    break;
                case 'Q':
                    $data = quoted_printable_decode(str_replace("_"," ",$data));
                    break;
                default:
                    throw new Exception("preg_match_all is busted: didn't find B or Q in encoding $header");
            }
            // This part needs to handle every charset
            switch (strtoupper($charset)) {
                case "UTF-8":
                    break;
                default:
                    /* Here's where you should handle other character sets! */
                    throw new Exception("Unknown charset in header - time to write some code.");
            }
            $header = str_replace($match,$data,$header);
        }
        return $header;
    }
}
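
The default branch of the charset switch is where other character sets would be converted. A minimal sketch of that step, assuming the mbstring extension is loaded; the helper name to_utf8 is hypothetical and not part of the class above:

```php
<?php
// Hypothetical helper: convert already-decoded header bytes to UTF-8.
// Assumes the mbstring extension is available.
function to_utf8(string $data, string $charset): string {
    $charset = strtoupper($charset);
    if ($charset === 'UTF-8') {
        return $data; // nothing to convert
    }
    // Only attempt charsets that mbstring reports as supported
    // (aliases that mb_list_encodings() does not list are rejected here).
    $known = array_map('strtoupper', mb_list_encodings());
    if (!in_array($charset, $known, true)) {
        throw new Exception("Unsupported charset in header: $charset");
    }
    return mb_convert_encoding($data, 'UTF-8', $charset);
}
```

Inside decode_header, the default branch could then call $data = to_utf8($data, $charset); instead of throwing straight away.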

When the class above is run and the result is displayed in a browser as UTF-8, the output is:

آزمایش

You would run it like this:

$decoded = mail::decode_header("=?UTF-8?B?2KLYstmF2KfbjNi0?=");
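
If the iconv extension is available, PHP's built-in iconv_mime_decode may be enough on its own. RFC 2047 limits each encoded-word to 75 characters, so longer subjects arrive as several encoded-words separated by whitespace, and that could be part of why stripping the markers by hand breaks on long titles. A rough sketch; the helper name decode_subject and the two-part sample header are assumptions made for illustration:

```php
<?php
// Hypothetical helper: decode an RFC 2047 subject with the iconv extension.
function decode_subject(string $raw): string {
    // Mode 0 is the default; 'UTF-8' is the charset we want the result in.
    $decoded = iconv_mime_decode($raw, 0, 'UTF-8');
    // On malformed input iconv_mime_decode returns false; fall back to the raw header.
    return $decoded === false ? $raw : $decoded;
}

// Sample header made of two encoded-words, as a long subject would be folded.
$raw = "=?UTF-8?B?2KLYstmF2KfbjNi0?= =?UTF-8?B?2KLYstmF2KfbjNi0?=";
echo decode_subject($raw), "\n";
```

Whitespace between adjacent encoded-words is dropped during decoding, so the two parts are joined without a space.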
