PhD Thesis Defense, Phong Hoang, 'Tackling Online Surveillance and Censorship with Empirical Network Measurement'

Wednesday, December 1, 2021 - 2:00pm to 3:00pm
Contact for Zoom info.
Event Description: 


With the Internet having become an indispensable means of communication in modern society, censorship and surveillance in cyberspace are becoming more prevalent. Malicious actors around the world, ranging from nation states to private organizations, increasingly use technology not only to control the free flow of information, but also to eavesdrop on Internet users' online activities.

In this dissertation, we tackle the twin problems of Internet censorship and online surveillance. We show that empirical network measurement, when conducted at scale and in a longitudinal manner, is an important and effective approach to gaining insights into (1) censors' blocking behaviors and (2) key characteristics of anti-censorship and privacy-enhancing technologies. These insights can then be used not only to aid the development of effective censorship circumvention tools, but also to help related stakeholders make informed decisions that maximize the privacy benefits of privacy-enhancing technologies.

We first focus on measuring Internet censorship, beginning with an empirical study of the I2P anonymity network that sheds light on important properties of the network, including population, churn rate, router type, and the geographic distribution of I2P peers. Using the collected network data, we examine the blocking resistance of I2P against a censor that wants to prevent access to I2P using address-based blocking techniques. Despite the decentralized characteristics of I2P, we discover that a censor can block more than 95% of the peer IP addresses known by a stable I2P client by operating only 10 routers in the network. We then develop an opportunistic censorship measurement infrastructure built on top of a network of distributed VPN servers run by volunteers. Using this measurement platform, we expose numerous censorship regimes (e.g., China, Iran, Oman, Qatar, and Kuwait) where access to I2P is blocked by various techniques, including domain-based blocking, network packet injection, and block pages.
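
As a rough illustration of the address-based blocking measurement described above, the following Python sketch computes what fraction of a client's known peer addresses also appear in the combined view of a few censor-operated routers. It is only a sketch under stated assumptions: the function name, the data layout, and the example addresses (taken from documentation ranges) are hypothetical and are not drawn from the dissertation.

```python
def blocking_coverage(client_peers, censor_views):
    """Return the fraction of the client's known peer IPs that a censor could block.

    client_peers: iterable of peer IP strings known by one client (hypothetical input).
    censor_views: list of sets, one per censor-operated router, each holding the
                  peer IPs that router has learned (hypothetical input).
    """
    known = set(client_peers)
    if not known:
        return 0.0
    # The censor's blocklist is the union of everything its routers observed.
    blocklist = set().union(*censor_views)
    return len(known & blocklist) / len(known)


# Illustrative data only; these are not measurements.
client = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}
routers = [{"192.0.2.1", "192.0.2.2"}, {"192.0.2.2", "192.0.2.3"}]
coverage = blocking_coverage(client, routers)
```

In this toy input the two routers together know three of the client's four peers, so the coverage is 0.75. Adding routers can only grow the union, which is why a small number of well-placed routers may be enough for high coverage.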

Of the discovered censors, China is the most notorious, having developed an advanced network filtering system commonly known as the Great Firewall (GFW). Continuing the same line of work on measuring Internet censorship, we build GFWatch, a large-scale, longitudinal measurement platform capable of testing hundreds of millions of domains daily, enabling continuous monitoring of the GFW’s DNS filtering behavior. During a nine-month period in 2020, GFWatch tested an average of 411M domains per day and detected a total of 311K domains censored by the GFW’s DNS filter. Using data from GFWatch, we study the impact of GFW blocking on the global DNS ecosystem and find 77K censored domains whose DNS resource records are polluted in popular public DNS resolvers, such as Google and Cloudflare. Based on the insights gained from running GFWatch, we propose strategies for detecting poisoned responses, which can be used to (1) sanitize poisoned DNS records from the caches of public DNS resolvers, and (2) assist in the development of circumvention tools that bypass the GFW’s DNS censorship.

We then focus on measuring and improving the privacy benefits provided by new domain name encryption technologies. When browsing the Internet today, users leak information about the domains they visit via DNS queries and via the Server Name Indication (SNI) extension of TLS. Recent domain name encryption proposals to ameliorate this issue include DNS over HTTPS/TLS (DoH/DoT) and Encrypted Client Hello (ECH). Although the security benefit of these technologies is clear, their positive impact on user privacy is weakened by the still-exposed IP address information.

We introduce an IP-based website fingerprinting technique that allows a network-level observer to identify, at scale, the website a user visits. Observing solely destination IP addresses, we successfully identify 84% of the more than 200K websites studied. The accuracy rate increases to 92% for popular websites, and to 95% for popular and sensitive websites. We also evaluate the robustness of the generated fingerprints over time and demonstrate that they still identify about 70% of the tested websites after two months. We conclude by discussing strategies for website owners and hosting providers to hinder IP-based website fingerprinting and maximize the privacy benefits offered by DoT/DoH and ECH.
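
To make the idea of identifying a website from destination IP addresses alone more concrete, the sketch below matches a set of observed IPs against a small table of per-site IP sets using Jaccard similarity. This is an assumption-laden simplification, not the fingerprinting technique from the dissertation: the similarity measure, the function name, the site names, and the addresses are all hypothetical.

```python
def identify_website(observed_ips, fingerprints):
    """Return the site whose IP set best overlaps the observed IPs, or None.

    observed_ips: iterable of destination IPs seen for one page load (hypothetical).
    fingerprints: dict mapping a site name to the set of IPs contacted when
                  loading that site (hypothetical, pre-collected data).
    Jaccard similarity is used here purely as an illustrative matching rule.
    """
    observed = set(observed_ips)
    best_site, best_score = None, 0.0
    for site, ips in fingerprints.items():
        union = observed | ips
        score = len(observed & ips) / len(union) if union else 0.0
        if score > best_score:
            best_site, best_score = site, score
    return best_site


# Illustrative data only, using documentation address ranges.
fingerprints = {
    "news.example": {"198.51.100.10", "198.51.100.11", "203.0.113.5"},
    "shop.example": {"198.51.100.10", "203.0.113.77"},
}
guess = identify_website({"198.51.100.10", "203.0.113.5"}, fingerprints)
```

Here the observed addresses overlap more with the first entry, so `guess` is "news.example". Note that the address shared by both sites does not distinguish them on its own, which hints at why co-hosting many sites behind the same IPs can hinder this kind of identification.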

Finally, given that visibility into plaintext domain information is lost with the introduction of domain name encryption protocols, it is important to investigate whether and how different Internet filtering systems interfere with the network traffic of these protocols. We thus conduct a measurement study to verify the accessibility of DoT/DoH and ESNI and to investigate whether these protocols are tampered with by any network providers (e.g., for censorship). We find evidence of blocking efforts against domain name encryption technologies in several countries, including China, Russia, and Saudi Arabia. On the bright side, we discover that domain name encryption can help unblock more than 55% of censored domains in China and more than 95% in other countries where DNS-based filtering is heavily employed. Our findings will hopefully provide useful insights for related stakeholders (e.g., Internet companies and standardization bodies) to make informed decisions while rolling out new domain name encryption technologies to end users in different regions of the world.
