Researchers from the University of Illinois Urbana-Champaign have introduced a framework called Hierarchical Planning and Task-Specific Agents (HPTSA) in their paper “Teams of LLM Agents can Exploit Zero-Day Vulnerabilities.” The approach pairs a planning agent with specialized subagents, built on GPT-4, LangChain, and LangGraph, to autonomously hack websites by exploiting zero-day vulnerabilities. Impressively, HPTSA achieved a 53% success rate on a benchmark of 15 real-world vulnerabilities, outperforming single-agent approaches by up to 4.5 times.
The approach reduces the cost of exploiting vulnerabilities by 75% compared to human pentesters. The implications are botlicious: attackers can exploit vulnerabilities without building or buying exploit code, while defenders gain “auto-hack-me-brah” bots for pen-testing, especially for recently disclosed vulnerabilities not yet covered by existing tools. That would further shrink the window between vulnerability disclosure and widespread exploitation, putting vulnerability management teams under more strain to smash the upgrade button faster.
Although the vulnerabilities tested came from lesser-known products and open-source projects, the potential of LLM bot farms is compelling. With further refinement, the approach could significantly strengthen both offensive and defensive security capabilities.
Ref:
https://arxiv.org/pdf/2406.01637