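In Java development, garbled Chinese characters are a common problem when redirecting to a URL. If you look closely while browsing, you will often see hexadecimal escape sequences inside URLs, for example: http://xxx.com/s?w=%e7%bc

In Java, the URLEncoder and URLDecoder classes in the java.net package perform this encoding and decoding of URL parameters.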

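1. URLDecoder (decoding)

The source code describes it as follows:

Utility class for HTML form decoding. This class contains static methods for decoding a String from the <CODE>application/x-www-form-urlencoded</CODE> MIME format.

In other words, this is a utility class for decoding HTML form data; it contains static methods that decode a string.

The source provides two decoding methods:

(1) Decoding with the platform default encoding (this overload is deprecated, because its result depends on the platform)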

    public static String decode(String s) {
        String str = null;
        try {
            str = decode(s, dfltEncName);
        } catch (UnsupportedEncodingException e) {
            // The system should always have the platform default
        }
        return str;
    }
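Why the choice of encoding matters can be seen in a small sketch. It decodes the same escape sequence twice, once as UTF-8 and once as ISO-8859-1, using the two-argument `decode` method. The class `DecodeCharsetDemo` and the helper `decodeWith` are made-up names for illustration, and the Chinese text is written with `\u` escapes so that the source file encoding does not affect the result.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeCharsetDemo {
    // Hypothetical helper: decodes percent-escapes using the named charset.
    static String decodeWith(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the two characters U+7F16 U+7801
        String encoded = "%e7%bc%96%e7%a0%81";

        String right = decodeWith(encoded, "UTF-8");       // bytes read as UTF-8
        String wrong = decodeWith(encoded, "ISO-8859-1");  // same bytes read as Latin-1

        System.out.println(right.equals("\u7f16\u7801"));  // true
        System.out.println(wrong.length());                // 6: one char per byte, garbled
    }
}
```

The escape sequence only records bytes. Which characters those bytes stand for is decided entirely by the charset passed to the decoder, which is why relying on the platform default can produce garbled text on a differently configured machine.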

(2) Decoding with a specified encoding

    public static String decode(String s, String enc)
            throws UnsupportedEncodingException {
        boolean needToChange = false;
        int numChars = s.length();
        StringBuffer sb = new StringBuffer(numChars > 500 ? numChars / 2 : numChars);
        int i = 0;

        if (enc.length() == 0) {
            throw new UnsupportedEncodingException("URLDecoder: empty string enc parameter");
        }

        char c;
        byte[] bytes = null;
        while (i < numChars) {
            c = s.charAt(i);
            switch (c) {
            case '+':
                sb.append(' ');
                i++;
                needToChange = true;
                break;
            case '%':
                /*
                 * Starting with this instance of %, process all
                 * consecutive substrings of the form %xy. Each
                 * substring %xy will yield a byte. Convert all
                 * consecutive  bytes obtained this way to whatever
                 * character(s) they represent in the provided
                 * encoding.
                 */
                try {
                    // (numChars-i)/3 is an upper bound for the number
                    // of remaining bytes
                    if (bytes == null)
                        bytes = new byte[(numChars-i)/3];
                    int pos = 0;

                    while ( ((i+2) < numChars) && (c=='%')) {
                        int v = Integer.parseInt(s.substring(i+1,i+3),16);
                        if (v < 0)
                            throw new IllegalArgumentException("URLDecoder: Illegal hex characters in escape (%) pattern - negative value");
                        bytes[pos++] = (byte) v;
                        i+= 3;
                        if (i < numChars)
                            c = s.charAt(i);
                    }

                    // A trailing, incomplete byte encoding such as
                    // "%x" will cause an exception to be thrown
                    if ((i < numChars) && (c=='%'))
                        throw new IllegalArgumentException("URLDecoder: Incomplete trailing escape (%) pattern");

                    sb.append(new String(bytes, 0, pos, enc));
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException("URLDecoder: Illegal hex characters in escape (%) pattern - "
                            + e.getMessage());
                }
                needToChange = true;
                break;
            default:
                sb.append(c);
                i++;
                break;
            }
        }

        return (needToChange? sb.toString() : s);
    }
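The branches of this method can be exercised with a few inputs. The following is a minimal sketch, not part of the JDK: `DecodeEdgeCases` and `tryDecode` are hypothetical names, and the helper simply turns the `IllegalArgumentException` thrown for malformed escapes into a readable marker string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeEdgeCases {
    // Hypothetical helper: returns the decoded text, or a marker if the input is rejected.
    static String tryDecode(String input) {
        try {
            return URLDecoder.decode(input, "UTF-8");
        } catch (IllegalArgumentException e) {
            // Thrown for incomplete or non-hexadecimal %xy escapes.
            return "rejected: " + e.getMessage();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDecode("a+b%21")); // '+' becomes a space, %21 becomes '!'
        System.out.println(tryDecode("%e7%b"));  // incomplete trailing escape is rejected
        System.out.println(tryDecode("%zz"));    // non-hexadecimal digits are rejected
    }
}
```

Callers that decode untrusted input therefore need to handle `IllegalArgumentException` in addition to the checked `UnsupportedEncodingException`.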

Appendix: decoding rules

 * <ul>
 * <li>The alphanumeric characters "{@code a}" through
 *     "{@code z}", "{@code A}" through
 *     "{@code Z}" and "{@code 0}"
 *     through "{@code 9}" remain the same.
 * <li>The special characters "{@code .}",
 *     "{@code -}", "{@code *}", and
 *     "{@code _}" remain the same.
 * <li>The plus sign "{@code +}" is converted into a
 *     space character " ".
 * <li>A sequence of the form "<i>{@code %xy}</i>" will be
 *     treated as representing a byte where <i>xy</i> is the two-digit
 *     hexadecimal representation of the 8 bits. Then, all substrings
 *     that contain one or more of these byte sequences consecutively
 *     will be replaced by the character(s) whose encoding would result
 *     in those consecutive bytes.
 *     The encoding scheme used to decode these characters may be specified,
 *     or if unspecified, the default encoding of the platform
 *     will be used.
 * </ul>

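In plain terms:

The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
The special characters ".", "-", "*" and "_" remain the same.
The plus sign "+" is converted into a space character " ".
A sequence of the form "%xy" is treated as one byte, where xy is the two-digit hexadecimal representation of its 8 bits. Every run of one or more consecutive sequences of this kind is then replaced by the character or characters whose encoding produces those bytes. The encoding used for this step can be specified; if it is not, the platform default encoding is used.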
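One practical consequence of these rules: an escaped "&" or "=" arrives as "%26" or "%3D", so a query string should be split into pairs first and decoded afterwards. The sketch below illustrates that order. `QueryParser` and `parse` are hypothetical names, UTF-8 is assumed, and a repeated key simply overwrites the earlier value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParser {
    // Hypothetical helper: splits "k1=v1&k2=v2" on the raw separators,
    // then decodes each key and value as UTF-8.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query.isEmpty()) {
            return params;
        }
        try {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(key, "UTF-8"),
                           URLDecoder.decode(value, "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("w=%e7%bc%96%e7%a0%81&q=a%26b+c");
        System.out.println(params.get("q"));                        // a&b c
        System.out.println(params.get("w").equals("\u7f16\u7801")); // true
    }
}
```

Decoding the whole string before splitting would turn "%26" back into "&" and break the value of `q` into two parameters.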

Example:

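    // requires: import java.net.URLDecoder;
    public static void main(String[] args) throws Exception {
        String encodedString = "%e7%bc%96%e7%a0%81%e6%a0%bc%e5%bc%8f";
        // Decode the UTF-8 percent-escapes back into text.
        String decoded = URLDecoder.decode(encodedString, "UTF-8");
        System.out.println(decoded); // prints 编码格式
    }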

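2. URLEncoder (encoding)

The source code describes it as follows:

Utility class for HTML form encoding. This class contains static methods for converting a String to the <CODE>application/x-www-form-urlencoded</CODE> MIME format.

In other words, this is a utility class for encoding HTML form data; it contains static methods that encode a string.

The source provides two encoding methods:

(1) Encoding with the platform default encoding (this overload is deprecated, because its result depends on the platform)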

    public static String encode(String s) {
        String str = null;
        try {
            str = encode(s, dfltEncName);
        } catch (UnsupportedEncodingException e) {
            // The system should always have the platform default
        }
        return str;
    }

(2) Encoding with a specified encoding

    public static String encode(String s, String enc)
            throws UnsupportedEncodingException {
        boolean needToChange = false;
        StringBuffer out = new StringBuffer(s.length());
        Charset charset;
        CharArrayWriter charArrayWriter = new CharArrayWriter();

        if (enc == null)
            throw new NullPointerException("charsetName");

        try {
            charset = Charset.forName(enc);
        } catch (IllegalCharsetNameException e) {
            throw new UnsupportedEncodingException(enc);
        } catch (UnsupportedCharsetException e) {
            throw new UnsupportedEncodingException(enc);
        }

        for (int i = 0; i < s.length();) {
            int c = (int) s.charAt(i);
            //System.out.println("Examining character: " + c);
            if (dontNeedEncoding.get(c)) {
                if (c == ' ') {
                    c = '+';
                    needToChange = true;
                }
                //System.out.println("Storing: " + c);
                out.append((char)c);
                i++;
            } else {
                // convert to external encoding before hex conversion
                do {
                    charArrayWriter.write(c);
                    /*
                     * If this character represents the start of a Unicode
                     * surrogate pair, then pass in two characters. It's not
                     * clear what should be done if a bytes reserved in the
                     * surrogate pairs range occurs outside of a legal
                     * surrogate pair. For now, just treat it as if it were
                     * any other character.
                     */
                    if (c >= 0xD800 && c <= 0xDBFF) {
                        /*
                          System.out.println(Integer.toHexString(c)
                          + " is high surrogate");
                        */
                        if ( (i+1) < s.length()) {
                            int d = (int) s.charAt(i+1);
                            /*
                              System.out.println("\tExamining "
                              + Integer.toHexString(d));
                            */
                            if (d >= 0xDC00 && d <= 0xDFFF) {
                                /*
                                  System.out.println("\t"
                                  + Integer.toHexString(d)
                                  + " is low surrogate");
                                */
                                charArrayWriter.write(d);
                                i++;
                            }
                        }
                    }
                    i++;
                } while (i < s.length() && !dontNeedEncoding.get((c = (int) s.charAt(i))));

                charArrayWriter.flush();
                String str = new String(charArrayWriter.toCharArray());
                byte[] ba = str.getBytes(charset);
                for (int j = 0; j < ba.length; j++) {
                    out.append('%');
                    char ch = Character.forDigit((ba[j] >> 4) & 0xF, 16);
                    // converting to use uppercase letter as part of
                    // the hex value if ch is a letter.
                    if (Character.isLetter(ch)) {
                        ch -= caseDiff;
                    }
                    out.append(ch);
                    ch = Character.forDigit(ba[j] & 0xF, 16);
                    if (Character.isLetter(ch)) {
                        ch -= caseDiff;
                    }
                    out.append(ch);
                }
                charArrayWriter.reset();
                needToChange = true;
            }
        }

        return (needToChange? out.toString() : s);
    }
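Two details of this method are easy to miss: a surrogate pair is passed to the charset as one unit, and the hexadecimal digits are written in upper case. The sketch below shows both, assuming UTF-8. `EncodeSurrogateDemo` and `encodeUtf8` are hypothetical names, and the characters are written with `\u` escapes to keep the source file ASCII-only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodeSurrogateDemo {
    // Hypothetical helper: form-encodes a string as UTF-8.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        // U+1F600 is one code point stored as a high and a low surrogate.
        String emoji = "\uD83D\uDE00";
        System.out.println(encodeUtf8(emoji));    // %F0%9F%98%80 (one 4-byte sequence)

        // U+7F16 needs three bytes in UTF-8; note the upper-case hex digits.
        System.out.println(encodeUtf8("\u7f16")); // %E7%BC%96
    }
}
```

The decoder accepts both upper-case and lower-case hex digits, so "%e7%bc%96" and "%E7%BC%96" decode to the same character.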

Appendix: encoding rules

 * <ul>
 * <li>The alphanumeric characters "{@code a}" through
 *     "{@code z}", "{@code A}" through
 *     "{@code Z}" and "{@code 0}"
 *     through "{@code 9}" remain the same.
 * <li>The special characters "{@code .}",
 *     "{@code -}", "{@code *}", and
 *     "{@code _}" remain the same.
 * <li>The space character " " is
 *     converted into a plus sign "{@code +}".
 * <li>All other characters are unsafe and are first converted into
 *     one or more bytes using some encoding scheme. Then each byte is
 *     represented by the 3-character string
 *     "<i>{@code %xy}</i>", where <i>xy</i> is the
 *     two-digit hexadecimal representation of the byte.
 *     The recommended encoding scheme to use is UTF-8. However,
 *     for compatibility reasons, if an encoding is not specified,
 *     then the default encoding of the platform is used.
 * </ul>

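In plain terms:

The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
The special characters ".", "-", "*" and "_" remain the same.
The space character " " is converted into a plus sign "+".
All other characters are unsafe, so they are first converted into one or more bytes using some encoding scheme. Each byte is then written as the 3-character string "%xy", where xy is the two-digit hexadecimal representation of that byte. The recommended encoding scheme is UTF-8. For compatibility reasons, however, the platform default encoding is used if none is specified.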
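Each of these rules can be checked with a one-line input. This is a minimal sketch assuming UTF-8. `EncodeRulesDemo` and `encodeUtf8` are hypothetical names chosen for the illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodeRulesDemo {
    // Hypothetical helper: form-encodes a string as UTF-8.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeUtf8("aZ09"));  // aZ09   letters and digits are unchanged
        System.out.println(encodeUtf8(".-*_"));  // .-*_   the four safe symbols are unchanged
        System.out.println(encodeUtf8("a b"));   // a+b    space becomes a plus sign
        System.out.println(encodeUtf8("a+b"));   // a%2Bb  a literal plus must be escaped
        System.out.println(encodeUtf8("~/"));    // %7E%2F everything else becomes %xy
    }
}
```

Because a literal "+" is escaped as "%2B", every "+" in the encoder's output stands for a space, which is what lets the decoder reverse the mapping without ambiguity.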

Example:

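    // requires: import java.net.URLEncoder;
    public static void main(String[] args) throws Exception {
        String original = "编码格式";
        // Encode the text as UTF-8 bytes written in %xy form.
        String encoded = URLEncoder.encode(original, "UTF-8");
        System.out.println(encoded); // prints %E7%BC%96%E7%A0%81%E6%A0%BC%E5%BC%8F
    }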
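In practice the encoder is applied to each parameter name and value separately, never to the whole URL, so that the "?", "&" and "=" separators stay intact. The following sketch builds a query string that way. `QueryBuilder` and `toQuery` are hypothetical names, UTF-8 is assumed, and the host is the placeholder used at the top of this article.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryBuilder {
    // Hypothetical helper: joins parameters as "k1=v1&k2=v2",
    // encoding every key and value as UTF-8.
    static String toQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 should always be available", ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("w", "\u7f16\u7801"); // the two characters U+7F16 U+7801
        params.put("q", "a&b c");
        // prints http://xxx.com/s?w=%E7%BC%96%E7%A0%81&q=a%26b+c
        System.out.println("http://xxx.com/s?" + toQuery(params));
    }
}
```

Encoding the complete URL in one call would also escape ":" and "/", which is why the encoding is kept at the level of individual parameters here.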
