Most modern malware download attacks occur via the browser, typically due to social engineering and drive-by downloads. In this paper, we study the “origin” of malware download attacks experienced by real network users, with the objective of improving malware download defenses. Specifically, we study the web paths followed by users who eventually fall victim to different types of malware downloads. To this end, we propose a novel incident investigation system, named WebWitness. Our system targets two main goals: 1) automatically trace back and label the sequence of events (e.g., visited web pages) preceding malware downloads, to highlight how users reach attack pages on the web; and 2) leverage these automatically labeled in-the-wild malware download paths to better understand current attack trends, and to develop more effective defenses.
We deployed WebWitness on a large academic network for a period of ten months, where we collected and categorized thousands of live malicious download paths. An analysis of this labeled data allowed us to design a new defense against drive-by downloads that rely on injecting malicious content into (hacked) legitimate web pages. For example, we show that by leveraging the incident investigation information output by WebWitness, we can decrease the infection rate for this type of drive-by download by almost a factor of six, on average, compared to existing URL blacklisting approaches.