As the web grows in size, web automation techniques help both legitimate operators and malicious actors. While legitimate operators use automated tools to index Internet content and provide additional services, malicious actors leverage automation tools to scan for vulnerable web applications and launch scam campaigns. This thesis proposal explores and measures automated web activities, focusing on activities that are controlled by malicious actors.
First, we perform a large-scale study of web bot behavior through automatically deployed honeysites. This work demonstrates that automated web activities can be studied and characterized by using honeypot technology and then identifying behavioral patterns in the captured requests. Among other findings, we discover that more than 50\% of bots conduct malicious web activities such as brute-forcing, fingerprinting web applications, and exploiting web vulnerabilities.
Second, we analyze how web vulnerability scanners work and to what extent unwanted vulnerability scanning can be detected and blocked. To this end, we implement a testbed for identifying automated web vulnerability scanners and design a machine-learning-based detection system to protect against malicious scanning activities. Through this study, we discover tool-specific and type-specific scanner behaviors that are absent from regular users and can be captured by supervised classification algorithms.
Finally, we conduct a large-scale measurement of cryptocurrency scam websites. We develop a tool that identifies these scams in the wild, using Certificate Transparency as its source of suspicious URLs. Through a six-month study, we identify thousands of giveaway scam domains and use that dataset to perform the first quantitative analysis of stolen funds using the public blockchains of the abused cryptocurrencies. We find that attackers are stealing the equivalent of tens of millions of dollars, with clear signs of underlying automation in how they set up new scam pages and pivot across cryptocurrencies.