Contents
- [1. Problem description](#1. Problem description)
- [2. Cause](#2. Cause)
  - [2.1 Encoding](#2.1 Encoding)
  - [2.2 Decoding](#2.2 Decoding)
- [3. Solution](#3. Solution)
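1. Problem description
I wrote an endpoint a while ago and called it from Apifox, passing a phone number with a +86 prefix as a parameter. By the time the value reached the server, the + had turned into a space.
The request is received by a Java endpoint.
2. Cause
2.1 Encoding
When a request is sent, the parameters in the URL have to be URL-encoded; in Java this is what URLEncoder does. Its Javadoc spells out the encoding rules: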
```java
/**
 * Utility class for HTML form encoding. This class contains static methods
 * for converting a String to the <CODE>application/x-www-form-urlencoded</CODE> MIME
 * format. For more information about HTML form encoding, consult the HTML
 * <A HREF="http://www.w3.org/TR/html4/">specification</A>.
 *
 * <p>
 * When encoding a String, the following rules apply:
 *
 * <ul>
 * <li>The alphanumeric characters "{@code a}" through
 *     "{@code z}", "{@code A}" through
 *     "{@code Z}" and "{@code 0}"
 *     through "{@code 9}" remain the same.
 * <li>The special characters "{@code .}",
 *     "{@code -}", "{@code *}", and
 *     "{@code _}" remain the same.
 * <li>The space character " " is
 *     converted into a plus sign "{@code +}".
 * <li>All other characters are unsafe and are first converted into
 *     one or more bytes using some encoding scheme. Then each byte is
 *     represented by the 3-character string
 *     "<i>{@code %xy}</i>", where <i>xy</i> is the
 *     two-digit hexadecimal representation of the byte.
 *     The recommended encoding scheme to use is UTF-8. However,
 *     for compatibility reasons, if an encoding is not specified,
 *     then the default encoding of the platform is used.
 * </ul>
 *
 * <p>
 * For example using UTF-8 as the encoding scheme the string "The
 * string ü@foo-bar" would get converted to
 * "The+string+%C3%BC%40foo-bar" because in UTF-8 the character
 * ü is encoded as two bytes C3 (hex) and BC (hex), and the
 * character @ is encoded as one byte 40 (hex).
 *
 * @author  Herb Jellinek
 * @since   JDK1.0
 */
public class URLEncoder {
    // ...
}
```
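Asking GPT to summarize it gives the following:
- Characters that are left unchanged:
  - Letters (a-z, A-Z) and digits (0-9).
  - The special characters . (dot), - (hyphen), * (asterisk) and _ (underscore).
- Spaces:
  - The space character is converted into a plus sign (+).
- Other characters:
  - All other characters are considered "unsafe". They are first converted into one or more bytes using some encoding scheme (such as UTF-8), and each byte is then written as %xy, where xy is the two-digit hexadecimal representation of that byte.
For example, with UTF-8 the string "The string ü@foo-bar" is converted to "The+string+%C3%BC%40foo-bar", because:
- The character ü is encoded in UTF-8 as two bytes, C3 and BC.
- The character @ is encoded as one byte, 40.
Note that a literal + falls under "other characters", so the encoder turns it into %2B.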
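To check these rules against the real class, here is a minimal sketch that runs the Javadoc example through java.net.URLEncoder. The class name EncodeDemo and its helper method are made up for illustration, and the Charset overload of encode used here assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original project.
final class EncodeDemo {

    // Encodes a single value as application/x-www-form-urlencoded using UTF-8.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00FC is the character ü, written as an escape so the source file encoding does not matter.
        System.out.println(encode("The string \u00FC@foo-bar")); // The+string+%C3%BC%40foo-bar
        System.out.println(encode(".-*_aA0"));                   // .-*_aA0 (unchanged)
        System.out.println(encode("+"));                         // %2B
    }
}
```

The last line is the one that matters for the phone number problem: a literal + is never left as-is by the encoder, so a raw + in a URL can only mean an encoded space.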
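2.2 Decoding
Once the parameters are encoded, the request can be sent to the server. A URL typed directly into Postman is treated as if it were already encoded, so the server URL-decodes it on receipt in exactly the same way. How is a URL decoded? The answer is again in the JDK source, this time in URLDecoder: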
```java
/**
 * Utility class for HTML form decoding. This class contains static methods
 * for decoding a String from the <CODE>application/x-www-form-urlencoded</CODE>
 * MIME format.
 * <p>
 * The conversion process is the reverse of that used by the URLEncoder class. It is assumed
 * that all characters in the encoded string are one of the following:
 * "{@code a}" through "{@code z}",
 * "{@code A}" through "{@code Z}",
 * "{@code 0}" through "{@code 9}", and
 * "{@code -}", "{@code _}",
 * "{@code .}", and "{@code *}". The
 * character "{@code %}" is allowed but is interpreted
 * as the start of a special escaped sequence.
 * <p>
 * The following rules are applied in the conversion:
 *
 * <ul>
 * <li>The alphanumeric characters "{@code a}" through
 *     "{@code z}", "{@code A}" through
 *     "{@code Z}" and "{@code 0}"
 *     through "{@code 9}" remain the same.
 * <li>The special characters "{@code .}",
 *     "{@code -}", "{@code *}", and
 *     "{@code _}" remain the same.
 * <li>The plus sign "{@code +}" is converted into a
 *     space character " " .
 * <li>A sequence of the form "<i>{@code %xy}</i>" will be
 *     treated as representing a byte where <i>xy</i> is the two-digit
 *     hexadecimal representation of the 8 bits. Then, all substrings
 *     that contain one or more of these byte sequences consecutively
 *     will be replaced by the character(s) whose encoding would result
 *     in those consecutive bytes.
 *     The encoding scheme used to decode these characters may be specified,
 *     or if unspecified, the default encoding of the platform will be used.
 * </ul>
 * <p>
 * There are two possible ways in which this decoder could deal with
 * illegal strings. It could either leave illegal characters alone or
 * it could throw an {@link java.lang.IllegalArgumentException}.
 * Which approach the decoder takes is left to the
 * implementation.
 *
 * @author  Mark Chamness
 * @author  Michael McCloskey
 * @since   1.2
 */
public class URLDecoder {
    // ...
}
```
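As before, here is GPT's summary:
- Letters and digits: the letters a to z and A to Z and the digits 0 to 9 remain the same.
- Special characters: the dot (.), hyphen (-), asterisk (*) and underscore (_) remain the same.
- Plus sign: the plus sign (+) is converted into a space character.
- Percent-encoding: a sequence of the form "%xy" is treated as one byte, where xy is the two-digit hexadecimal representation of that byte. Runs of consecutive such sequences are replaced by the character(s) those bytes encode. The character encoding can be specified; if it is not, the platform default is used.
For example, take the request http://localhost:8080/demo/getName?name=.-*_aA0+%2B
On the server, the received value is ".-*_aA0 +". The + has been decoded into a space, while %2B has been decoded into +, because the ASCII code of + is 2B in hexadecimal.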
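The same result can be reproduced without a server by calling java.net.URLDecoder on the raw query value. This is only a sketch: DecodeDemo is a made-up class name, the phone number is a placeholder, and the Charset overload of decode assumes Java 10 or later. A web framework may use its own decoder, but the form-decoding rules it applies to query parameters are the ones described above.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original project.
final class DecodeDemo {

    // Decodes a single application/x-www-form-urlencoded value using UTF-8.
    static String decode(String rawValue) {
        return URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The query value from the example request above.
        System.out.println("[" + decode(".-*_aA0+%2B") + "]");    // [.-*_aA0 +]

        // A placeholder phone number sent with a raw + in the URL:
        // the + is read as an encoded space.
        System.out.println("[" + decode("+8613800000000") + "]"); // [ 8613800000000]
    }
}
```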
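3. Solution
Since the + is being decoded into a space, one option is to keep it out of the URL altogether: send a POST request and put the parameter in the request body (for example as JSON). A value in the body is not URL-decoded, so the + arrives unchanged.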
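```java
// DemoController.java
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RequestMapping("/demo")
@RestController
public class DemoController {
    // POST, so the parameter travels in the request body instead of the URL.
    @PostMapping("/getName")
    public void reqDemo(@RequestBody DataDemo dataDemo) {
        System.out.println(dataDemo.getName());
    }
}

// DataDemo.java
import lombok.Getter;
import lombok.Setter;

@Getter
@Setter
public class DataDemo {
    private String name;
}
```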
With this change, the value printed on the server keeps its + sign.
There is also a second option, which is the one mentioned above: keep the parameter in the URL but send %2B in place of the +.
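When the URL is built in code rather than typed by hand, there is no need to write %2B manually: encoding the parameter value before appending it produces it automatically. The sketch below assumes Java 10 or later; the class PhoneQueryDemo, its buildUrl helper and the phone number are hypothetical, and the base URL is the one from the example request above.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original project.
final class PhoneQueryDemo {

    // Appends a single URL-encoded "name" parameter to the given base URL.
    static String buildUrl(String baseUrl, String name) {
        return baseUrl + "?name=" + URLEncoder.encode(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = buildUrl("http://localhost:8080/demo/getName", "+8613800000000");
        System.out.println(url); // http://localhost:8080/demo/getName?name=%2B8613800000000

        // Simulate what the server does with the query value: the + survives the round trip.
        String rawValue = url.substring(url.indexOf('=') + 1);
        System.out.println(URLDecoder.decode(rawValue, StandardCharsets.UTF_8)); // +8613800000000
    }
}
```

Only the parameter value is passed through URLEncoder here. Encoding the whole URL would also escape characters such as : and /, which would break the address itself.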
If you spot any mistakes, corrections are welcome!