Kevin's Java Life

Have a cup of coffee, and life becomes calm and fresh

pageEncoding, contentType, and More

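Not long ago I ran into garbled characters while using Sitemesh and, by chance, learned about the pageEncoding attribute. I found some material on it through Google and am recording it here.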

Description of contentType in the SCWCD Exam Study Kit
The contentType attribute specifies the MIME type and character encoding of the
output. The default value of the MIME type is text/html; the default value of the
character encoding is ISO-8859-1. The MIME type and character encoding are
separated by a semicolon, as shown here:
<%@ page contentType="text/html;charset=ISO-8859-1" %>

This is equivalent to writing the following line in a servlet:
response.setContentType("text/html;charset=ISO-8859-1");

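Does this mean that if a SetCharacterEncodingFilter is applied to every request, the <%@ page contentType %> directive no longer has to be added to each JSP page? Probably not: the example SetCharacterEncodingFilter shipped with Tomcat only calls request.setCharacterEncoding, which affects how request parameters are decoded, whereas contentType governs the response.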

pageEncoding
The pageEncoding attribute specifies the character encoding of the JSP page. The
default value is ISO-8859-1. The following line illustrates the syntax:

<%@ page pageEncoding="ISO-8859-1" %>


Description of pageEncoding in the JSP 2.0 Spec, section JSP.4.1

Describes the character encoding for the JSP page.

For JSP pages in standard syntax, the page character encoding is determined
from the following sources:

• A JSP configuration element page-encoding value whose URL pattern matches
the page.

• The pageEncoding attribute of the page directive of the page. It is a translation-
time error to name different encodings in the pageEncoding attribute of
the page directive of a JSP page and in a JSP configuration element whose
URL pattern matches the page.

• The charset value of the contentType attribute of the page directive. This is
used to determine the page character encoding if neither a JSP configuration
element page-encoding nor the pageEncoding attribute are provided.

• If none of the above is provided, ISO-8859-1 is used as the default character
encoding.
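
The lookup order quoted above can be sketched in plain Java. This is only an illustration of the rule, not how any container implements it: the class PageEncodingResolver and its methods resolve and charsetOf are hypothetical names, the three arguments stand in for values a container would read from web.xml and the page directive, and comparing encoding names case-insensitively is a simplification (charset aliases and quoted charset values are not handled).

```java
import java.util.Locale;

// Hypothetical helper illustrating the JSP.4.1 lookup order for standard-syntax pages.
final class PageEncodingResolver {
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Extracts the charset parameter from a contentType value such as
    // "text/html;charset=UTF-8"; returns null when no charset is present.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // configPageEncoding: <page-encoding> of a matching JSP configuration element, or null
    // pageEncodingAttr:   pageEncoding attribute of the page directive, or null
    // contentType:        contentType attribute of the page directive, or null
    static String resolve(String configPageEncoding, String pageEncodingAttr, String contentType) {
        if (configPageEncoding != null && pageEncodingAttr != null
                && !configPageEncoding.equalsIgnoreCase(pageEncodingAttr)) {
            // The spec treats this conflict as a translation-time error.
            throw new IllegalStateException("pageEncoding conflicts with page-encoding configuration");
        }
        if (configPageEncoding != null) {
            return configPageEncoding;
        }
        if (pageEncodingAttr != null) {
            return pageEncodingAttr;
        }
        String charset = charsetOf(contentType);
        return charset != null ? charset : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "UTF-8", "text/html;charset=GBK")); // UTF-8
        System.out.println(resolve(null, null, "text/html; charset=GBK"));   // GBK
        System.out.println(resolve(null, null, "text/html"));                // ISO-8859-1
    }
}
```

Note that the charset in contentType is consulted only as a fallback: once pageEncoding or a page-encoding configuration element is present, it alone decides how the page source is read.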

What does "the character encoding for the JSP page" mean?

Put this way it becomes clear: pageEncoding is the encoding used when the JSP is translated into *_jsp.java.

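To understand garbled-character problems in JSP, the most important thing is to understand how a JSP is compiled and how its output is produced.
1. The JSP is "translated" into *_jsp.java. At this point the JSP compiler (JSPC) reads the JSP file according to pageEncoding (note: reads), then translates it into Java source (.java) that is uniformly encoded in UTF-8. If pageEncoding is set incorrectly, the Chinese text is already garbled at this stage.

2. The Java source is compiled into Java bytecode. At this point javac compiles the UTF-8 encoded Java source into binary code (.class), in which string constants are likewise stored in a UTF-8-based format.

3. Tomcat or another application server loads and executes the Java bytecode, and writes the result (the HTML page) using the character set specified by contentType.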
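
Steps 1 and 3 can be imitated with nothing but the JDK's charset conversions. The sketch below is a simulation rather than container code: JspEncodingDemo, readPage, and writeResponse are hypothetical names, and a byte array stands in for the JSP file on disk. It shows that a wrong pageEncoding corrupts the text on the way in, while a contentType charset that cannot represent the characters loses them on the way out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical simulation of the read (step 1) and write (step 3) stages.
final class JspEncodingDemo {

    // Step 1: the translator decodes the JSP file's bytes using pageEncoding.
    static String readPage(byte[] fileBytes, Charset pageEncoding) {
        return new String(fileBytes, pageEncoding);
    }

    // Step 3: the container encodes the response text using the contentType charset.
    static byte[] writeResponse(String text, Charset responseCharset) {
        return text.getBytes(responseCharset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters of "Chinese text"
        byte[] fileBytes = original.getBytes(StandardCharsets.UTF_8); // file saved as UTF-8

        // Correct pageEncoding: the text survives translation.
        String ok = readPage(fileBytes, StandardCharsets.UTF_8);
        System.out.println(ok.equals(original)); // true

        // Wrong pageEncoding: each of the 6 bytes becomes its own character.
        String garbled = readPage(fileBytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(original)); // false
        System.out.println(garbled.length());         // 6

        // Correct read, but a response charset that cannot encode the text:
        // unmappable characters are replaced with '?'.
        byte[] out = writeResponse(ok, StandardCharsets.ISO_8859_1);
        System.out.println(new String(out, StandardCharsets.ISO_8859_1)); // ??
    }
}
```

In other words, both settings have to be right: pageEncoding must match how the file was actually saved, and the contentType charset must be able to represent the characters being written.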

With this flow in mind, the root cause of garbled characters in JSP pages should be clear.

The text above draws on:
1. JavaWorld@TW: The page directive: contentType vs. pageEncoding
2. Matrix: Solutions for garbled Chinese text in JSP, databases, and Apache

posted on 2005-10-19 17:41 by Kevin. Category: Programming Tips

