Unzipping ZIP files in Java: fixing garbled Chinese file names (GBK and UTF-8)


Tool used: zip4j

GitHub: zip4j

Version: 2.2.8

Maven: net.lingala.zip4j:zip4j:2.2.8







1. Files compressed on Windows with tools such as WinRAR, HaoZip, KuaiZip, or Baidu Compression


2. ZIP files compressed on Linux, macOS, and similar systems
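
To see why these two groups of files behave differently, here is a minimal sketch using only the JDK. It assumes, as the title of this post suggests, that the first group stores entry names in GBK and the second in UTF-8, and that the running JDK ships the GBK charset. The class name FileNameEncodingDemo and the sample name are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FileNameEncodingDemo {

    /** Decodes raw entry-name bytes with the given charset. */
    static String decode(byte[] rawName, Charset charset) {
        return new String(rawName, charset);
    }

    public static void main(String[] args) {
        // Assumption: the running JDK provides the GBK charset.
        Charset gbk = Charset.forName("GBK");
        // Sample name: two Chinese characters (U+4E2D U+6587) followed by ".txt".
        String original = "\u4e2d\u6587.txt";

        // Bytes a GBK-based tool would store as the entry name.
        byte[] storedAsGbk = original.getBytes(gbk);
        // Bytes a UTF-8 based tool would store for the same name.
        byte[] storedAsUtf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the name.
        System.out.println(original.equals(decode(storedAsGbk, gbk)));
        System.out.println(original.equals(decode(storedAsUtf8, StandardCharsets.UTF_8)));

        // Decoding with the wrong charset yields garbled text.
        System.out.println(decode(storedAsGbk, StandardCharsets.UTF_8));
        System.out.println(decode(storedAsUtf8, gbk));
    }
}
```

The point of the sketch is that the bytes themselves are fine. The name only looks garbled when the reader guesses the wrong charset, so a fix has to pick the charset per archive or per entry.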




ZipFile zip = new ZipFile(dest);






Reading the ZIP format specification, we find the Info-ZIP Unicode Path Extra Field (0x7075).





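import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.List;
import net.lingala.zip4j.ZipFile;
import net.lingala.zip4j.exception.ZipException;
import net.lingala.zip4j.model.ExtraDataRecord;
import net.lingala.zip4j.model.FileHeader;
import org.springframework.web.multipart.MultipartFile;

String extractAll(MultipartFile file) throws Exception {
    // `path` is the temporary working directory; its definition is not shown in this post.
    if (!new File(path).mkdirs()) {
        return "Upload failed: could not create the temporary folder";
    }
    File dest = new File(path + "/" + file.getOriginalFilename());
    // Assumed step (not shown in this post): save the upload so that zip4j can open it.
    file.transferTo(dest);

    /* Extract */
    try {
        ZipFile zip = new ZipFile(dest);
        System.out.println("begin unpack zip file....");
        // Assumed loop (not shown in this post): `v` is the FileHeader of each entry.
        for (FileHeader v : zip.getFileHeaders()) {
            // Prefer the UTF-8 name from the Unicode Path extra field when it exists.
            String extractedFile = getFileNameFromExtraData(v);
            try {
                zip.extractFile(v, path, extractedFile);
            } catch (ZipException e) {
                System.out.println("Extraction failed: " + extractedFile);
                continue;
            }
            System.out.println("Extracted: " + extractedFile);
        }
        System.out.println("unpack zip file success");
    } catch (ZipException e) {
        return "Extraction failed";
    }
    return "success";
}

public static String getFileNameFromExtraData(FileHeader fileHeader) {
    // zip4j may return null when an entry has no extra data records.
    List<ExtraDataRecord> records = fileHeader.getExtraDataRecords();
    if (records != null) {
        for (ExtraDataRecord extraDataRecord : records) {
            long identifier = extraDataRecord.getHeader();
            if (identifier == 0x7075) {
                byte[] bytes = extraDataRecord.getData();
                // Payload layout: 1-byte version, 4-byte CRC32 of the header name, then the UTF-8 name.
                if (bytes == null || bytes.length < 5 || bytes[0] != 1) {
                    continue; // malformed or unknown version: the spec says to ignore the field
                }
                System.out.println("Using fileHeader.getExtraDataRecords()");
                // Note: the CRC32 in bytes 1..4 is not verified here.
                return new String(bytes, 5, bytes.length - 5, StandardCharsets.UTF_8);
            }
        }
    }
    // No usable Unicode Path field: fall back to the name stored in the header.
    return fileHeader.getFileName();
}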


Third party mappings commonly used are:

0x07c8 Macintosh

0x2605 ZipIt Macintosh

0x2705 ZipIt Macintosh 1.3.5+

0x2805 ZipIt Macintosh 1.3.5+

0x334d Info-ZIP Macintosh

0x4341 Acorn/SparkFS

0x4453 Windows NT security descriptor (binary ACL)

0x4704 VM/CMS

0x470f MVS

0x4b46 FWKCS MD5 (see below)

0x4c41 OS/2 access control list (text ACL)

0x4d49 Info-ZIP OpenVMS

0x4f4c Xceed original location extra field

0x5356 AOS/VS (ACL)

0x5455 extended timestamp

0x554e Xceed unicode extra field

0x5855 Info-ZIP UNIX (original, also OS/2, NT, etc)

0x6375 Info-ZIP Unicode Comment Extra Field

0x6542 BeOS/BeBox

0x7075 Info-ZIP Unicode Path Extra Field

0x756e ASi UNIX

0x7855 Info-ZIP UNIX (new)

0xa220 Microsoft Open Packaging Growth Hint

0xfd4a SMS/QDOS
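
Each of these IDs tags one record inside an entry's extra field. In the ZIP format, the extra field is a sequence of records, each made of a 2-byte header ID, a 2-byte data size (both little-endian), and the data itself. The following is a minimal sketch of scanning such a block for one ID, using only the JDK. The class and method names are hypothetical, and the raw bytes are hand-built for the demo (in practice they could come from, for example, java.util.zip.ZipEntry.getExtra()).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ExtraFieldScanner {

    /**
     * Returns the data of the first record with the given header ID,
     * or null if the block holds no such record or is truncated.
     */
    static byte[] findRecord(byte[] extra, int headerId) {
        if (extra == null) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;   // header ID, e.g. 0x7075
            int size = buf.getShort() & 0xFFFF; // data size in bytes
            if (size > buf.remaining()) {
                return null; // truncated record: stop scanning
            }
            if (id == headerId) {
                byte[] data = new byte[size];
                buf.get(data);
                return data;
            }
            buf.position(buf.position() + size); // skip to the next record
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] extra = {
            0x55, 0x54, 0x01, 0x00, 0x00,       // ID 0x5455, 1 data byte
            0x75, 0x70, 0x02, 0x00, 0x01, 0x02  // ID 0x7075, 2 dummy data bytes
        };
        System.out.println(Arrays.toString(findRecord(extra, 0x7075))); // [1, 2]
    }
}
```

zip4j does this splitting for you and exposes the result through fileHeader.getExtraDataRecords(), so the sketch is only meant to show what sits behind that call.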

-Info-ZIP Unicode Path Extra Field (0x7075):

Stores the UTF-8 version of the file name field as stored in the local header and central directory header. (Last Revision 20070912)

        Value         Size        Description
        -----         ----        -----------
(UPath) 0x7075        Short       tag for this extra block type ("up")
        TSize         Short       total data size for this block
        Version       1 byte      version of this extra field, currently 1
        NameCRC32     4 bytes     File Name Field CRC32 Checksum
        UnicodeName   Variable    UTF-8 version of the entry File Name

Currently Version is set to the number 1. If there is a need to change this field, the version will be incremented. Changes may not be backward compatible so this extra field should not be used if the version is not recognized.

The NameCRC32 is the standard zip CRC32 checksum of the File Name field in the header. This is used to verify that the header File Name field has not changed since the Unicode Path extra field was created. This can happen if a utility renames the File Name but does not update the UTF-8 path extra field. If the CRC check fails, this UTF-8 Path Extra Field should be ignored and the File Name field in the header should be used instead.

The UnicodeName is the UTF-8 version of the contents of the File Name field in the header. As UnicodeName is defined to be UTF-8, no UTF-8 byte order mark (BOM) is used. The length of this field is determined by subtracting the size of the previous fields from TSize. If both the File Name and Comment fields are UTF-8, the new General Purpose Bit Flag, bit 11 (Language encoding flag (EFS)), can be used to indicate that both the header File Name and Comment fields are UTF-8 and, in this case, the Unicode Path and Unicode Comment extra fields are not needed and should not be created. Note that, for backward compatibility, bit 11 should only be used if the native character set of the paths and comments being zipped up are already in UTF-8. It is expected that the same file name storage method, either general purpose bit 11 or extra fields, be used in both the Local and Central Directory Header for a file.
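
The version and CRC rules quoted above can be sketched with the JDK alone. This is an illustration under assumptions, not what zip4j does internally: the class UnicodePathField and its methods are hypothetical, the caller is assumed to have the raw (undecoded) bytes of the header File Name field, and the demo assumes the JDK ships the GBK charset.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class UnicodePathField {

    /**
     * Returns the UTF-8 name from a 0x7075 payload, or null when the payload
     * should be ignored: too short, unknown version, or CRC mismatch.
     */
    static String parse(byte[] payload, byte[] rawHeaderName) {
        if (payload == null || payload.length < 5) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(payload).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.get() != 1) {
            return null; // version not recognized
        }
        long storedCrc = buf.getInt() & 0xFFFFFFFFL;
        CRC32 crc = new CRC32();
        crc.update(rawHeaderName);
        if (crc.getValue() != storedCrc) {
            return null; // header name changed after the field was written
        }
        return new String(payload, 5, payload.length - 5, StandardCharsets.UTF_8);
    }

    /** Builds a version 1 payload; used here only to feed the demo. */
    static byte[] build(String name, byte[] rawHeaderName) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(rawHeaderName);
        return ByteBuffer.allocate(5 + utf8.length).order(ByteOrder.LITTLE_ENDIAN)
                .put((byte) 1).putInt((int) crc.getValue()).put(utf8).array();
    }

    public static void main(String[] args) {
        // Sample name: two Chinese characters (U+4E2D U+6587) followed by ".txt".
        String name = "\u4e2d\u6587.txt";
        byte[] rawHeaderName = name.getBytes(Charset.forName("GBK"));
        byte[] payload = build(name, rawHeaderName);

        // Matching header name: the UTF-8 name is accepted.
        System.out.println(name.equals(parse(payload, rawHeaderName)));
        // Header name renamed without updating the field: the field is ignored.
        System.out.println(parse(payload, "renamed.txt".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

getFileNameFromExtraData in this post skips the CRC step. Whether that matters depends on whether your archives may have been renamed by tools that leave the extra field untouched.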


unzip not correct with cjk filename. #45

Garbled chinese character #73

