FOFA Crawler and Source-Code-Leak PoC: A Combined Usage Tutorial

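I. Tool Overview

This document shows how to combine a FOFA crawler with a source-code-leak PoC tool for automated security testing, covering the full workflow from target discovery to vulnerability verification.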

II. FOFA Crawler Implementation

1. Prerequisites

A valid FOFA account cookie (_fofapro_ars_session)
A pool of proxy IPs (optional, used to avoid being banned)

2. Core Code Walkthrough

import base64
import random
import time
import urllib.parse

import requests
from lxml import etree

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36",
}

# Proxy list (maintain and refresh it yourself). requests matches proxies by
# lowercase scheme; the "https" entries assume each proxy can tunnel HTTPS.
proxylist = [
    {'http': 'http://112.84.54.35:9999', 'https': 'http://112.84.54.35:9999'},
    {'http': 'http://175.44.109.144:9999', 'https': 'http://175.44.109.144:9999'},
    {'http': 'http://125.108.119.23:9000', 'https': 'http://125.108.119.23:9000'}
]

def loadpage(url, begin, end):
    for page in range(begin, end + 1):
        print("Crawling page " + str(page) + ":")
        fullurl = url + "&page=" + str(page)
        proxy = random.choice(proxylist)  # pick a proxy for this request
        response = requests.get(fullurl, headers=headers, proxies=proxy, timeout=10).text
        html = etree.HTML(response)
        # Extract the result links from the page
        req = html.xpath('//div[@class="fl box-sizing"]/div[@class="re-domain"]/a[@target="_blank"]/@href')
        if req:
            with open('url.txt', "a+") as f:
                f.write('\n'.join(req) + "\n")
        print("Page " + str(page) + " done\n")
        time.sleep(5)  # pause between pages

if __name__ == '__main__':
    q = input('Enter the search query, e.g. app="xxx" && country="CN": ')
    begin = int(input("Enter the first page (minimum 1): "))
    end = int(input("Enter the last page (maximum 5): "))
    cookie = input("Enter your Cookie: ")
    cookies = '_fofapro_ars_session=' + cookie + ';result_per_page=20'
    headers['cookie'] = cookies
    url = "https://fofa.so/result?"
    key = urllib.parse.urlencode({"q": q})
    key2 = base64.b64encode(q.encode('utf-8')).decode("utf-8")
    # Base64 output can contain '+', '/' and '=', so percent-encode it
    url = url + key + "&qbase64=" + urllib.parse.quote(key2, safe='')
    loadpage(url, begin, end)
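
The URL-building step of the script can be pulled out into a small function so it can be checked offline. The following is a minimal sketch, assuming the same `q`, `qbase64` and `page` parameters used by the script above; `build_fofa_url` is a hypothetical helper name, and percent-encoding the Base64 value is a defensive assumption rather than a documented FOFA requirement.

```python
import base64
import urllib.parse

BASE_URL = "https://fofa.so/result?"  # same endpoint as the script above


def build_fofa_url(query, page):
    """Return the result-page URL for a query (hypothetical helper)."""
    q = urllib.parse.urlencode({"q": query})
    qbase64 = base64.b64encode(query.encode("utf-8")).decode("utf-8")
    # Base64 output may contain '+', '/' and '=', so percent-encode it too.
    encoded = urllib.parse.quote(qbase64, safe="")
    return BASE_URL + q + "&qbase64=" + encoded + "&page=" + str(page)
```

Calling `build_fofa_url('app="xxx"', 1)` yields a URL whose `q` and `qbase64` parameters both decode back to the original query.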

3. Usage Steps

Log in to FOFA and copy the Cookie
Prepare the search query (e.g. app="xxx" && country="CN")
Run the script and enter the parameters
The results are saved to url.txt

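III. Source-Code-Leak PoC Implementation

1. Prerequisites

Prepare a web.txt file containing common source-code-leak paths, one per line

Example paths (partial):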

/.git/HEAD
/.svn/entries
/CVS/Entries
/WEB-INF/web.xml
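
Before the paths are joined to target URLs, it helps to normalize them so that blank lines, duplicates and missing leading slashes do not produce malformed requests. The following is a minimal sketch; `load_paths` and `candidate_urls` are hypothetical helper names, and skipping lines that start with `#` is an assumption about how web.txt might be annotated.

```python
def load_paths(lines):
    """Normalize raw lines from web.txt into a de-duplicated path list."""
    paths = []
    for line in lines:
        p = line.strip()
        if not p or p.startswith("#"):  # assumed comment convention
            continue
        if not p.startswith("/"):
            p = "/" + p
        if p not in paths:
            paths.append(p)
    return paths


def candidate_urls(base, paths):
    """Join one base URL with every path, avoiding a double slash."""
    base = base.strip().rstrip("/")
    return [base + p for p in paths]
```

For example, `candidate_urls("http://example.com/", load_paths(open("web.txt")))` would produce one URL per listed path.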

2. Core Code Walkthrough

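import requests
import time

# Load the candidate paths once instead of re-reading web.txt for every target
with open("web.txt", 'r') as f:
    paths = [line.strip() for line in f if line.strip()]

# Request each URL once; print the result and save it in the same pass
with open("url.txt", 'r') as temp, open('write.txt', 'w') as w:
    for url in temp:
        url = url.strip().rstrip('/')
        if not url:
            continue
        for path in paths:
            u = url + path
            try:
                r = requests.get(u, timeout=10)
            except requests.RequestException as e:
                print("URL: " + u + " request failed: %s" % e)
                continue
            line = "URL: " + u + " status: %d" % r.status_code
            print(line)
            w.write(line + '\n')
            time.sleep(2)  # pause between requests to avoid being blocked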

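3. Usage Steps

Make sure url.txt and web.txt are ready
Run the PoC script
The results are saved to write.txt
Analyze the status codes (200 usually suggests a leak, but some sites return 200 for custom error pages, so check the response content as well)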
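
Because a 200 status alone does not prove a leak, one way to reduce false positives is to compare the response body with the known format of the file being requested. The following is a minimal sketch, assuming `body` holds the raw response bytes (for example `r.content` from requests); `looks_like_leak` is a hypothetical helper, and it covers only a few paths whose signatures are well known.

```python
import re


def looks_like_leak(path, body):
    """Return True if the body matches the expected format for this path."""
    if path.endswith("/.git/HEAD"):
        text = body.decode("utf-8", errors="replace").strip()
        # HEAD holds either a symbolic ref or a 40-character commit hash.
        if text.startswith("ref: refs/"):
            return True
        return re.fullmatch(r"[0-9a-f]{40}", text) is not None
    if path.endswith("/.git/config"):
        return b"[core]" in body
    if path.endswith("/.svn/wc.db"):
        return body.startswith(b"SQLite format 3\x00")  # SQLite file header
    if path.endswith("/.DS_Store"):
        return body[4:8] == b"Bud1"  # magic bytes after the 4-byte prefix
    if path.endswith("/WEB-INF/web.xml"):
        return b"<web-app" in body
    return False  # no signature known here; review the response manually
```

A `False` result only means that no signature matched, so paths without a rule still need manual review.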

IV. Single-Target Version

import requests
import time

url = ''  # target URL, without a trailing slash

with open("web.txt", 'r') as f:
    paths = [line.strip() for line in f if line.strip()]

# Request each path once; print the result and save it in the same pass
with open('write easy.txt', 'w') as w:
    for path in paths:
        u = url + path
        try:
            r = requests.get(u, timeout=10)
        except requests.RequestException as e:
            print("URL: " + u + " request failed: %s" % e)
            continue
        line = "URL: " + u + " status: %d" % r.status_code
        print(line)
        w.write(line + '\n')
        time.sleep(2)  # pause between requests

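V. Notes

Legality: use these tools only within an authorized scope
Request rate: set time.sleep() appropriately to avoid affecting the target
Proxies: a proxy pool is recommended to avoid IP bans
Result verification: every automated finding must be verified manually
FOFA limits: free accounts can crawl only the first 5 pages of results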
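
The authorization requirement can also be enforced in code by filtering url.txt against an explicit allowlist before any request is sent. The following is a minimal sketch; `in_scope`, `filter_targets` and the allowlist itself are hypothetical and would come from the written authorization for the engagement.

```python
from urllib.parse import urlsplit


def in_scope(url, allowed_hosts):
    """True if the URL's host is an allowed host or one of its subdomains."""
    host = (urlsplit(url.strip()).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


def filter_targets(urls, allowed_hosts):
    """Keep only non-empty URLs whose host is covered by the allowlist."""
    allowed = {h.lower() for h in allowed_hosts}
    return [u.strip() for u in urls if u.strip() and in_scope(u, allowed)]
```

For example, `filter_targets(open("url.txt"), ["example.com"])` drops every crawled URL that does not belong to example.com.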

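VI. Suggested Improvements

Add exception handling
Use multithreading or asynchronous requests to improve throughput
Add a more complete list of source-code-leak paths
Automatically recognize directories such as .git
Automatically classify the results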
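
Two of these suggestions, concurrent requests and automatic classification, can be sketched together with the standard library. This is a minimal illustration under stated assumptions: `scan` and `classify` are hypothetical names, `fetch` is any caller-supplied function that returns a status code for a URL, and the group labels are arbitrary.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def scan(urls, fetch, workers=5):
    """Call fetch(url) concurrently; a failed call is recorded as None."""
    def safe_fetch(u):
        try:
            return fetch(u)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(safe_fetch, urls))  # map keeps input order
    return list(zip(urls, statuses))


def classify(results):
    """Group (url, status) pairs into coarse buckets for manual review."""
    groups = defaultdict(list)
    for url, status in results:
        if status is None:
            groups["error"].append(url)
        elif status == 200:
            groups["candidate"].append(url)
        elif status in (401, 403):
            groups["restricted"].append(url)
        else:
            groups["other"].append(url)
    return dict(groups)
```

With requests, `fetch` could be `lambda u: requests.get(u, timeout=10).status_code`. Concurrency raises the request rate against the target, so the worker count should stay small and within what the authorization allows.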

VII. Common Source-Code-Leak Paths

Suggested starting points for a complete path list:

Git leaks: /.git/config, /.git/HEAD
SVN leaks: /.svn/entries, /.svn/wc.db
DS_Store: /.DS_Store
Configuration files: /WEB-INF/web.xml, /config.xml
Backup files: /index.php.bak, /.bak