Introduction

This page gives a quick introduction to SWIDE. Details can be found in our CCS'24 paper.

Abstract

Web attacks, a primary vector for system breaches, pose a significant challenge within the cybersecurity landscape. The growing intensity of web attack attempts has led to "alert fatigue", where enterprises are inundated by excessive alerts. Although extensive research is being conducted on automated methods for detecting web attacks, it remains an open problem to identify whether the attacks are successful. Towards this end, we present SWIDE (Successful Web Injection Detection Engine), an engine to pinpoint successful web injection attacks (e.g., PHP command injection, SQL injection). This enables enterprises to focus exclusively on those crucial threats. Our methodology builds on two insights: first, while attackers tend to apply payload obfuscation techniques to evade detection, all successful web injection attacks must comply with the programming language syntax to be executable; second, these attacks inevitably produce observable effects, such as returning execution results or creating backdoors for future access by the attacker. Consequently, we leverage advanced syntactic and semantic analysis to 1) detect malicious syntax features in obfuscated payloads and 2) perform semantic analysis of the payload to recover the intention of the attack. With a two-stage design, namely, attack identification and confirmation mechanisms, SWIDE can accurately identify successful attacks, even amidst intricate obfuscations. Unlike proof-of-concept studies, SWIDE has been deployed and validated in real-world environments through collaborations with a cybersecurity firm. Serving 5,045 enterprise users, our system identifies that roughly 15% of enterprises have suffered from successful attacks on a weekly basis, an alarmingly high rate. Moreover, we perform a detailed analysis of six months’ data and discover 60 zero-day vulnerabilities exploited in the wild, including 12 high-risk ones acknowledged by relevant authorities. These findings underscore the practical effectiveness of SWIDE.

A Motivating Example

The PHP code injection vulnerability below stems from the unchecked integration of user input into dynamically evaluated code. Once its multi-layer obfuscation is decoded, the example attack eventually executes the payload printf("hacked").

[Figure: Motivating Example]

If the system is vulnerable and the injected command executes successfully, we expect to observe the keyword hacked in the corresponding response. Ideally, a detection system sitting on-path that can predict the consequence of the attack payload and match the evidence in the response can accurately report whether the attack succeeded. However, this is challenging for two reasons:

The complexity and dynamic nature of web programming languages give attackers great flexibility for obfuscation and make attack payloads difficult to detect.
Predicting the consequence (printing the string hacked in this example) requires simulating the payload execution while still achieving high throughput.
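The idea of predicting a payload's consequence and matching it against the response can be illustrated with a minimal Python sketch. This is not SWIDE's implementation: the obfuscation layers (URL-encoding over nested Base64), the helper names `peel_layers`, `predict_output`, and `attack_succeeded`, and the regex that only understands a printf/print/echo of one string literal are all assumptions made for illustration.

```python
import base64
import binascii
import re
from typing import Optional
from urllib.parse import quote, unquote

# Hypothetical pattern: recognizes only printf/print/echo of a single string literal.
_PRINT_CALL = re.compile(r"""\b(?:printf|print|echo)\s*\(?\s*(["'])(.*?)\1""")


def peel_layers(payload: str, max_depth: int = 5) -> str:
    """Undo URL-encoding and Base64 repeatedly until the text stops changing."""
    current = payload
    for _ in range(max_depth):
        decoded = unquote(current)
        try:
            # validate=True rejects text that is not strictly Base64.
            decoded = base64.b64decode(decoded, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            pass  # not a Base64 layer; keep the URL-decoded text
        if decoded == current:
            break
        current = decoded
    return current


def predict_output(code: str) -> Optional[str]:
    """Return the string the code would print, or None if it is not recognized."""
    match = _PRINT_CALL.search(code)
    return match.group(2) if match else None


def attack_succeeded(payload: str, response_body: str) -> bool:
    """Report success only when the predicted output appears in the response."""
    expected = predict_output(peel_layers(payload))
    return expected is not None and expected in response_body


# Assumed demo payload: printf("hacked"); wrapped in two Base64 layers, then URL-encoded.
inner = 'printf("hacked");'
demo_payload = quote(base64.b64encode(base64.b64encode(inner.encode())).decode())

print(attack_succeeded(demo_payload, "<html>hacked</html>"))
print(attack_succeeded(demo_payload, "<html>403 Forbidden</html>"))
```

A real engine cannot rely on a fixed list of encodings or a single regex; the sketch only shows why the decoded payload and the response must be examined together.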

SWIDE: Multi-stage Semantic-aware Detection System

[Figure: SWIDE Architecture]

SWIDE addresses the above challenges and achieves effective detection of successful attacks with:

Multi-stage detection architecture: fast on-path screening of suspicious packets plus accurate cloud analysis.
Payload identification via syntax analysis: by applying a novel partial code parsing algorithm, SWIDE can identify payloads accurately.
Exploit intention recovery and consequence matching through lightweight simulation of payload execution.
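How the three components above might fit together can be sketched in Python. Everything here is a hypothetical stand-in rather than SWIDE's actual design: the `Exchange` type, the regex used as the cheap on-path screen, the balanced-parenthesis scan standing in for partial code parsing, the injected `predict` callable standing in for lightweight execution simulation, and the verdict labels.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Exchange:
    """One HTTP request/response pair, reduced to the fields the sketch needs."""
    request_body: str
    response_body: str


# Stage 1 (assumed): a cheap screen that flags anything resembling a function call.
_CALL_LIKE = re.compile(r"[A-Za-z_]\w*\s*\(")


def screen(exchange: Exchange) -> bool:
    return bool(_CALL_LIKE.search(exchange.request_body))


def extract_call_fragment(text: str) -> Optional[str]:
    """Toy stand-in for partial parsing: first call with balanced parentheses.

    Parentheses inside string literals are not handled; a real parser would be.
    """
    for match in _CALL_LIKE.finditer(text):
        depth = 0
        for i in range(match.end() - 1, len(text)):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    return text[match.start(): i + 1]
    return None


def classify(exchange: Exchange,
             predict: Callable[[str], Optional[str]]) -> str:
    """Stage 2 (assumed): identify the payload, then confirm it in the response."""
    if not screen(exchange):
        return "benign"
    fragment = extract_call_fragment(exchange.request_body)
    if fragment is None:
        return "benign"
    expected = predict(fragment)
    if expected is not None and expected in exchange.response_body:
        return "successful"
    return "attempted"


def toy_predict(fragment: str) -> Optional[str]:
    """Hypothetical simulator: knows only printf("...") with a literal argument."""
    match = re.fullmatch(r'printf\("([^"]*)"\)', fragment)
    return match.group(1) if match else None


print(classify(Exchange('id=1;printf("hacked");//', "page hacked"), toy_predict))
print(classify(Exchange('id=1;printf("hacked");//', "403 Forbidden"), toy_predict))
print(classify(Exchange("q=hello", "results for hello"), toy_predict))
```

Keeping the screen cheap and passing the simulator in as a callable mirrors the split between fast on-path filtering and heavier analysis, though where SWIDE actually draws that boundary is described in the paper, not here.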