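Apache POI XXE Vulnerability Analysis and Mitigation Guide
0x01 Overview
Apache POI is a widely used Java library for reading and writing Microsoft Office documents, and applications typically rely on it to import and export Word and Excel files. OOXML documents such as .docx and .xlsx are ZIP archives whose parts are mostly XML, so any part that is parsed with an insecurely configured XML parser can expose the application to XML External Entity (XXE) injection. This article analyzes two representative XXE vulnerabilities in Apache POI, CVE-2014-3529 and CVE-2019-12415, covering for each the root cause, reproduction steps, analysis, and fix.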
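Because an OOXML package is a ZIP archive, its XML parts can be enumerated with the JDK alone, which is a convenient way to see what a parser will be asked to read. The following is a minimal sketch and not part of the original write-up: the class name OoxmlPartLister and the in-memory sample package are hypothetical, and the sample only mimics the layout of an .xlsx file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class OoxmlPartLister {

    /** Returns the names of all XML-like parts (.xml, .rels) in a ZIP-based package. */
    public static List<String> listXmlParts(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.endsWith(".xml") || name.endsWith(".rels")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    /** Builds a tiny in-memory ZIP that mimics an .xlsx layout (illustrative only). */
    public static byte[] buildSamplePackage() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        String[] names = {"[Content_Types].xml", "xl/workbook.xml", "xl/media/image1.png"};
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Lists the XML parts of the sample package; the .png entry is skipped.
        System.out.println(listXmlParts(new ByteArrayInputStream(buildSamplePackage())));
    }
}
```

Every name this returns is a potential XML parsing site, which is why the vulnerabilities below can be triggered from several different parts of the same file.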
0x02 CVE-2014-3529 Analysis
Affected versions
Apache POI versions before 3.10.1 are affected by this XXE vulnerability.
Setting up the test environment
Test code:
import org.apache.poi.EncryptedDocumentException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

import java.io.FileInputStream;
import java.io.IOException;

public class CVE20143529 {
    public static void main(String[] args) throws IOException, EncryptedDocumentException, InvalidFormatException {
        // Opening the workbook parses the XML parts of test.xlsx, which is where the XXE is triggered.
        Workbook wb1 = WorkbookFactory.create(new FileInputStream("test.xlsx"));
        Sheet sheet = wb1.getSheetAt(0);
        System.out.println(sheet.getLastRowNum());
    }
}
pom.xml dependency configuration:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.apache.poi</groupId>
    <artifactId>xxe</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>3.10-FINAL</version>
        </dependency>
    </dependencies>
</project>
Reproducing the vulnerability
- Add an XXE payload to any one of the following parts of the Excel file:
  - [Content_Types].xml
  - xl/workbook.xml
  - xl/worksheets/sheet1.xml
Placing the payload in [Content_Types].xml is recommended.
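For a lab setup, the modified part has to be written back into the archive. One way to do that without unzipping by hand is to copy the package entry by entry and substitute the bytes of a single part. This is only a sketch under assumptions: ZipEntryRewriter and its methods are hypothetical helpers, the replacement content is supplied by the caller, and the example does not preserve entry metadata such as timestamps.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryRewriter {

    /** Copies a ZIP package, writing newContent in place of the entry named targetName. */
    public static byte[] replaceEntry(byte[] pkg, String targetName, byte[] newContent) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(pkg));
             ZipOutputStream zout = new ZipOutputStream(out)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                zout.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(targetName)) {
                    zout.write(newContent);   // substitute the edited part
                } else {
                    copy(zin, zout);          // keep every other part unchanged
                }
                zout.closeEntry();
            }
        }
        return out.toByteArray();
    }

    /** Returns the bytes of the named entry, or null if the package has no such entry. */
    public static byte[] readEntry(byte[] pkg, String name) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    copy(zin, out);
                    return out.toByteArray();
                }
            }
        }
        return null;
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Usage: java ZipEntryRewriter <in.xlsx> <out.xlsx> <part name> <file with replacement XML>
    public static void main(String[] args) throws IOException {
        byte[] pkg = Files.readAllBytes(Paths.get(args[0]));
        byte[] replacement = Files.readAllBytes(Paths.get(args[3]));
        Files.write(Paths.get(args[1]), replaceEntry(pkg, args[2], replacement));
    }
}
```

Working on a copy keeps the original test file intact, so the same workbook can be reused when comparing vulnerable and patched POI versions.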
Analysis
- The call stack starts at WorkbookFactory.create().
- It enters OPCPackage.open():
    public static OPCPackage open(InputStream in) throws InvalidFormatException, IOException {
        OPCPackage pack = new ZipPackage(in, PackageAccess.READ_WRITE);
        if (pack.partList == null) {
            pack.getParts();
        }
        return pack;
    }
- A ZipPackage is created to read the input stream:
    ZipPackage(InputStream in, PackageAccess access) throws IOException {
        super(access);
        this.zipArchive = new ZipInputStreamZipEntrySource(new ZipInputStream(in));
    }
- getParts() is called (excerpt):
    public ArrayList<PackagePart> getParts() throws InvalidFormatException {
        this.throwExceptionIfWriteOnly();
        if (this.partList == null) {
            boolean hasCorePropertiesPart = false;
            boolean needCorePropertiesPart = true;
            PackagePart[] parts = this.getPartsImpl();
- Key point: getPartsImpl() processes the [Content_Types].xml file.
- ZipContentTypeManager delegates to its parent class ContentTypeManager to handle the XML:
    public ZipContentTypeManager(InputStream in, OPCPackage pkg) throws InvalidFormatException {
        super(in, pkg);
    }
- The XXE is finally triggered in ContentTypeManager.parseContentTypesFile(), which in the vulnerable version reads the document with xmlReader.read(in) (excerpt):
    private void parseContentTypesFile(InputStream in) throws InvalidFormatException {
        try {
            Document xmlContentTypetDoc = xmlReader.read(in);
Full call stack:
parseContentTypesFile:377, ContentTypeManager (org.apache.poi.openxml4j.opc.internal)
<init>:105, ContentTypeManager (org.apache.poi.openxml4j.opc.internal)
<init>:56, ZipContentTypeManager (org.apache.poi.openxml4j.opc.internal)
getPartsImpl:188, ZipPackage (org.apache.poi.openxml4j.opc)
getParts:665, OPCPackage (org.apache.poi.openxml4j.opc)
open:274, OPCPackage (org.apache.poi.openxml4j.opc)
create:79, WorkbookFactory (org.apache.poi.ss.usermodel)
main:12, CVE20143529
Fix
The fix replaces xmlReader.read(in) with SAXHelper.readSAXDocument(in):
    private void parseContentTypesFile(InputStream in) throws InvalidFormatException {
        try {
            Document xmlContentTypetDoc = SAXHelper.readSAXDocument(in);
Restrictions against XXE were added in org.apache.poi.util.SAXHelper.
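The internals of SAXHelper are not shown here, so the following is not POI's code. It is a rough illustration of one common kind of restriction, written against the JDK's DocumentBuilder rather than the reader POI uses: an EntityResolver that answers every external entity lookup with an empty document, so nothing is fetched. The class name EmptyEntityResolverDemo and the sample entity URL are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EmptyEntityResolverDemo {

    /** Parses xml and returns the text of the root element, with external entities resolved to "". */
    public static String parseRootText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Answer every external entity lookup with an empty source instead of
        // letting the parser open the system identifier (file, http, ...).
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE r [<!ENTITY x SYSTEM \"file:///nonexistent/secret.txt\">]>"
                + "<r>a&x;b</r>";
        // The entity expands to nothing, so only the literal text around it remains.
        System.out.println(parseRootText(xml));
    }
}
```

The trade-off of this approach is that documents which reference external entities still parse, only with those entities emptied; rejecting DOCTYPE declarations outright is stricter and may be preferable when no legitimate input needs them.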
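0x03 CVE-2019-12415 Analysis
Affected versions
Apache POI 4.1.0 and earlier are affected: an XXE vulnerability exists when the XSSFExportToXml tool is used to convert a user-supplied Microsoft Excel document.
Setting up the test environment
Test code: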
import org.apache.poi.EncryptedDocumentException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.xssf.extractor.XSSFExportToXml;
import org.apache.poi.xssf.usermodel.XSSFMap;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.xml.sax.SAXException;

import javax.xml.transform.TransformerException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class PoiXxe {
    public static void main(String[] args) throws IOException, EncryptedDocumentException, InvalidFormatException, TransformerException, SAXException {
        XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(new File("/Users/l1nk3r/Desktop/CustomXMLMappings.xlsx")));
        for (XSSFMap map : wb.getCustomXMLMappings()) {
            XSSFExportToXml exporter = new XSSFExportToXml(map);
            // The second argument must be true: validation is what leads into isValid() and the schema loading.
            exporter.exportToXML(System.out, true);
        }
    }
}
pom.xml dependency configuration:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.apache.poi</groupId>
    <artifactId>xxe</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.1.0</version>
        </dependency>
    </dependencies>
</project>
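Reproducing the vulnerability
- Edit the CustomXMLMappings/xl/xmlMaps.xml file, that is, xl/xmlMaps.xml inside the unpacked CustomXMLMappings.xlsx.
- Add the following XXE payload:
<xsd:redefine schemaLocation="http://127.0.0.1:8080/"></xsd:redefine>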
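Analysis
- Key point 1: XSDHandler#constructTrees extracts the out-of-band address from the PoC.
- Key point 2: XSDHandler#resolveSchema hands that address to getSchemaDocument.
- Finally, XMLEntityManager#setupCurrentEntity issues the HTTP request.
Debugging tip: when analyzing an XXE vulnerability, set a breakpoint at the point where the JDK's own XMLEntityManager#setupCurrentEntity issues the HTTP request and trigger it with an out-of-band (OOB) payload. The breakpoint then exposes the full call stack that leads to the trigger.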
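The OOB approach needs something listening at the address used in the payload so that the outbound request can be observed. A minimal sketch of such a listener, using the JDK's built-in com.sun.net.httpserver.HttpServer, might look like this; the class name OobListener is hypothetical, and port 8080 in main simply mirrors the payload address used above.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class OobListener {
    private final HttpServer server;
    private final List<String> requests = new CopyOnWriteArrayList<>();

    /** Binds to 127.0.0.1 on the given port; pass 0 to let the OS pick a free port. */
    public OobListener(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/", exchange -> {
            // Record the method and URI of every inbound request, then reply with no body.
            requests.add(exchange.getRequestMethod() + " " + exchange.getRequestURI());
            exchange.sendResponseHeaders(204, -1);
            exchange.close();
        });
    }

    public void start() { server.start(); }

    public void stop() { server.stop(0); }

    public int port() { return server.getAddress().getPort(); }

    public List<String> requests() { return requests; }

    public static void main(String[] args) throws IOException {
        OobListener listener = new OobListener(8080);
        listener.start();
        System.out.println("Listening on 127.0.0.1:" + listener.port());
    }
}
```

With the listener running, a recorded request confirms that the parser reached out, and the breakpoint in setupCurrentEntity shows which code path caused it.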
Full call stack:
setupCurrentEntity:619, XMLEntityManager (com.sun.org.apache.xerces.internal.impl)
determineDocVersion:189, XMLVersionDetector (com.sun.org.apache.xerces.internal.impl)
parse:582, SchemaParsingConfig (com.sun.org.apache.xerces.internal.impl.xs.opti)
parse:685, SchemaParsingConfig (com.sun.org.apache.xerces.internal.impl.xs.opti)
parse:530, SchemaDOMParser (com.sun.org.apache.xerces.internal.impl.xs.opti)
getSchemaDocument:2175, XSDHandler (com.sun.org.apache.xerces.internal.impl.xs.traversers)
resolveSchema:2096, XSDHandler (com.sun.org.apache.xerces.internal.impl.xs.traversers)
constructTrees:1100, XSDHandler (com.sun.org.apache.xerces.internal.impl.xs.traversers)
parseSchema:620, XSDHandler (com.sun.org.apache.xerces.internal.impl.xs.traversers)
loadSchema:617, XMLSchemaLoader (com.sun.org.apache.xerces.internal.impl.xs)
loadGrammar:575, XMLSchemaLoader (com.sun.org.apache.xerces.internal.impl.xs)
loadGrammar:541, XMLSchemaLoader (com.sun.org.apache.xerces.internal.impl.xs)
newSchema:255, XMLSchemaFactory (com.sun.org.apache.xerces.internal.jaxp.validation)
newSchema:638, SchemaFactory (javax.xml.validation)
isValid:249, XSSFExportToXml (org.apache.poi.xssf.extractor)
exportToXML:211, XSSFExportToXml (org.apache.poi.xssf.extractor)
exportToXML:105, XSSFExportToXml (org.apache.poi.xssf.extractor)
main:20, PoiXxe
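Fix
The fix enables the secure-processing feature:
trySetFeature(factory, "http://javax.xml.XMLConstants/feature/secure-processing", true);
The key change shows up in SecuritySupport#checkAccess, which returns null when access is allowed and the name of the rejected protocol otherwise:
- Unpatched version: allowedProtocols is all, accessAny is all, and checkAccess returns null.
- Patched version: allowedProtocols is empty, accessAny is all, and checkAccess returns http.
In XSDHandler#getSchemaDocument, an error is then thrown because fetching the schema over http is not allowed.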
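The effect of the restriction can be observed with the JDK alone, without POI. The sketch below compiles a schema that contains the same xsd:redefine element after enabling secure processing and clearing the external-access properties, and reports whether compilation was refused. It is an illustration under assumptions: RestrictedSchemaDemo is a hypothetical class, and setting ACCESS_EXTERNAL_SCHEMA and ACCESS_EXTERNAL_DTD explicitly is an extra step added here, not something the POI patch is stated to do.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class RestrictedSchemaDemo {

    /** A schema whose xsd:redefine points at an external http location, as in the PoC. */
    public static final String XSD =
            "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xsd:redefine schemaLocation=\"http://127.0.0.1:8080/\"/>"
          + "</xsd:schema>";

    /** Returns true if the schema is rejected once external schema access is disabled. */
    public static boolean isExternalSchemaBlocked(String xsd) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Empty string means "no protocol is allowed" for external schemas and DTDs.
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return false; // schema compiled, so the external reference was not refused
        } catch (SAXException e) {
            return true;  // the parser refused to follow the schemaLocation
        }
    }

    public static void main(String[] args) throws SAXException {
        System.out.println("blocked: " + isExternalSchemaBlocked(XSD));
    }
}
```

Because the access check happens before any connection is opened, the restricted factory fails fast and no request reaches 127.0.0.1:8080.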
0x04 Mitigation Recommendations
- Upgrade Apache POI:
  - For CVE-2014-3529, upgrade to 3.10.1 or later.
  - For CVE-2019-12415, upgrade to 4.1.1 or later.
- Secure coding practice:
    // Use a hardened XML parser configuration
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
- Input validation:
  - Strictly validate Excel and Word files uploaded by users.
  - Restrict file size and file type.
- Runtime protection:
    System.setProperty("javax.xml.accessExternalSchema", "file, jar:file");
    System.setProperty("javax.xml.accessExternalDTD", "file, jar:file");
- Network protection:
  - Restrict outbound connections from the application server.
  - Use network firewall rules to block unnecessary outbound requests.
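As one possible form of the input validation mentioned above, an upload can be pre-screened for DOCTYPE declarations before it is handed to POI. The sketch below is a coarse heuristic under stated assumptions: DoctypePrecheck and its 10 MB per-part limit are hypothetical, the check only looks at .xml and .rels parts, and it matches the literal byte sequence "<!DOCTYPE", so parts in other encodings (for example UTF-16) would slip past it. It complements parser hardening and upgrades rather than replacing them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class DoctypePrecheck {

    // Hypothetical per-part cap so that a crafted archive cannot exhaust memory.
    private static final int MAX_PART_BYTES = 10 * 1024 * 1024;

    /** Returns true if any .xml or .rels part of the package contains a DOCTYPE declaration. */
    public static boolean containsDoctype(InputStream pkg) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(pkg)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName().toLowerCase(Locale.ROOT);
                if (!(name.endsWith(".xml") || name.endsWith(".rels"))) {
                    continue;
                }
                // ISO-8859-1 maps bytes 1:1 to chars, so ASCII markers survive any 8-bit encoding.
                String text = new String(readPart(zip), StandardCharsets.ISO_8859_1);
                if (text.contains("<!DOCTYPE")) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Reads the current ZIP entry fully, failing if it exceeds MAX_PART_BYTES. */
    private static byte[] readPart(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            if (out.size() > MAX_PART_BYTES) {
                throw new IOException("XML part exceeds size limit");
            }
        }
        return out.toByteArray();
    }
}
```

A caller would reject the upload when containsDoctype returns true or throws, since ordinary Office-generated packages are not expected to carry DOCTYPE declarations in these parts.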
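0x05 Summary
When Apache POI processes Office documents, an improperly configured XML parser can lead to XXE vulnerabilities. The two vulnerabilities analyzed here show that:
- XXE vulnerabilities tend to appear wherever XML is parsed, especially when XML files inside an archive are processed.
- Fixes usually involve enabling the secure-processing feature or restricting external entity resolution.
- When debugging XXE, starting from the low-level HTTP request in the JDK and working backwards through the call stack is an effective approach.
- Upgrading the component promptly is the most effective protection.
In real-world development, avoid old component versions with known vulnerabilities and follow secure coding practices when handling user-uploaded files.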