Authors:
(1) Diwen Xue, University of Michigan;
(2) Reethika Ramesh, University of Michigan;
(3) Arham Jain, University of Michigan;
(4) Michalis Kallitsis, Merit Network, Inc.;
(5) J. Alex Halderman, University of Michigan;
(6) Jedidiah R. Crandall, Arizona State University/Breakpointing Bad;
(7) Roya Ensafi, University of Michigan.
3 Challenges in Real-world VPN Detection
4 Adversary Model and Deployment
5 Ethics, Privacy, and Responsible Disclosure
6 Identifying Fingerprintable Features and 6.1 Opcode-based Fingerprinting
6.3 Active Server Fingerprinting
6.4 Constructing Filters and Probers
7 Fine-tuning for Deployment and 7.1 ACK Fingerprint Thresholds
7.2 Choice of Observation Window N
7.4 Server Churn for Asynchronous Probing
7.5 Probe UDP and Obfuscated OpenVPN Servers
9 Evaluation & Findings and 9.1 Results for control VPN flows
12 Acknowledgement and References
We set out to explore whether an ISP or censor can fingerprint OpenVPN connections at scale without significant collateral damage. Adopting the viewpoint of an adversarial ISP, we deploy our framework inside Merit, as shown in Figure 2. Our evaluation is two-fold: we generate control vanilla and obfuscated flows with commercial VPN providers and attempt to identify them from the vantage point of a network intermediary; we also process other traffic passing through our Monitoring Station in order to estimate the false positive rate of our framework.
We set up our framework on a 16-core server (Monitoring Station) inside Merit with two mirroring interfaces that have an aggregate bandwidth of 20 Gbps. Due to the large traffic volume, we optimize our deployment with PF_RING [38] in order to improve the packet processing speed. We employ PF_RING in zero-copy mode and spread the traffic load across a Zeek cluster of 15 workers. Nonetheless, due to limited CPU resources, we sample only 12.5% (one in eight) of all TCP and UDP flows arriving at the network interfaces in order to minimize the effect of packet loss. The sampling is based on IP pairs, so that both directions of a flow are selected or dropped together. With these settings, we are able to operate with an end-to-end packet loss rate under 3%. Even though we process only a fraction of all traffic, our Filter still handles over 15 terabytes of traffic from over 2 billion flows on an average day on a single server. Processing all traffic without sampling would also be feasible through parallelism or faster CPUs.
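The IP-pair sampling described above can be illustrated with a minimal sketch. The deployment's actual sampling mechanism is not specified here, so everything in this example is an assumption made for illustration: the use of SHA-256, the names `ip_pair_key` and `is_sampled`, and the choice to keep one of eight hash buckets to approximate the 12.5% rate.

```python
import hashlib
import ipaddress

# Assumption: keeping 1 of 8 hash buckets approximates the 12.5% sampling rate.
SAMPLE_BUCKETS = 8


def ip_pair_key(ip_a: str, ip_b: str) -> bytes:
    """Build a key that is identical for (a, b) and (b, a)."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    # Sort on (version, integer value) so IPv4 and IPv6 addresses can be mixed
    # and the order does not depend on which endpoint is the source.
    low, high = sorted((a, b), key=lambda ip: (ip.version, int(ip)))
    return low.packed + high.packed


def is_sampled(src_ip: str, dst_ip: str, buckets: int = SAMPLE_BUCKETS) -> bool:
    """Return True if packets between this IP pair should be processed."""
    digest = hashlib.sha256(ip_pair_key(src_ip, dst_ip)).digest()
    return int.from_bytes(digest[:8], "big") % buckets == 0


if __name__ == "__main__":
    # Both directions of the same conversation receive the same decision.
    print(is_sampled("10.0.0.5", "198.51.100.7"))
    print(is_sampled("198.51.100.7", "10.0.0.5"))
```

Because the key is built from the sorted address pair rather than from the (source, destination) order, upstream and downstream packets of a connection always fall into the same bucket, which is the property the sampling relies on.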
Next, we set up Probers on two dedicated measurement machines, each provisioned with 10 IPv4 addresses and 1 IPv6 address. At the end of each day during the evaluation, the Probers fetch filtering logs from the Monitoring Station. For each target, we run Masscan [25] against the /29 subnet that contains the target IP, across all TCP ports (1-65535). We follow up on each discovered open port by running our probing scheme, and endpoints confirmed through probing are recorded for manual analysis.
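As a small illustration of the subnet expansion step, the sketch below derives the /29 network containing a flagged IPv4 address and assembles a Masscan invocation for it. The helper names `probing_subnet` and `masscan_command` are hypothetical, the example address comes from a documentation range, and the command is only built as an argument list, not executed; any rate limiting or output options a real deployment would need are omitted because they are not stated in the text.

```python
import ipaddress


def probing_subnet(target_ip: str, prefix_len: int = 29) -> ipaddress.IPv4Network:
    """Return the IPv4 network of the given prefix length that contains target_ip."""
    # strict=False masks off the host bits instead of raising an error.
    return ipaddress.IPv4Network(f"{target_ip}/{prefix_len}", strict=False)


def masscan_command(target_ip: str) -> list:
    """Build (but do not run) a Masscan command covering all TCP ports of the /29."""
    subnet = probing_subnet(target_ip)
    return ["masscan", "-p1-65535", str(subnet)]


if __name__ == "__main__":
    net = probing_subnet("192.0.2.77")
    print(net)                # 192.0.2.72/29
    print(net.num_addresses)  # 8
    print(" ".join(masscan_command("192.0.2.77")))
```

A /29 covers eight addresses, so each flagged server expands to a scan of eight neighbouring IPs across the full TCP port range.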
To select VPN services for evaluation, we first generate a list of “top” VPN services ranked by popularity. We combine 80 providers, most of which are paid premium VPN services, from top VPN recommendation sites based on previous work [42]; they are listed in Appendix Table 4. Next, we search the websites of these providers for terms such as “Obfuscation”, “Stealth”, or “Camouflage Mode”, and include providers that offer at least one obfuscated VPN configuration. In total, we find 24 providers offering obfuscated services. For each provider, we test vanilla OpenVPN as well as every obfuscation configuration offered. If TCP and UDP modes are both available, we test them separately. This yields 81 configurations, 41 of which are obfuscated.
We configure the Client Station inside Merit to act as a VPN client. Both upstream and downstream traffic of the Client Station go through the router that mirrors traffic to the Monitoring Station. In addition, we exclude this server from our random sampling so that all traffic to/from this server will be analyzed. On the client, we run an automated script to generate control traffic for our evaluation. For each iteration, we start the VPN client application and connect to the “default / recommended” server using Pywinauto [41]. After a random wait of 20 to 180 seconds, we confirm that the VPN tunnel is active and generate random browsing traffic with Selenium [45] by sending requests to a random website from the Alexa top 500. Finally, we disconnect from the VPN server and wait for 180 seconds before proceeding to the next iteration. For each VPN configuration, we repeat the process 50 times and collect packet captures for reference.
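The per-configuration iteration schedule can be sketched as a simple loop. This is an illustrative outline, not the script used in the evaluation: the callables `connect`, `tunnel_is_up`, `browse`, and `disconnect` are hypothetical stand-ins for the Pywinauto and Selenium steps, and skipping the browsing step when the tunnel check fails is an assumption, since the handling of failed connections is not described.

```python
import random
import time
from typing import Callable, List, Optional, Sequence


def run_control_iterations(
    connect: Callable[[], None],
    tunnel_is_up: Callable[[], bool],
    browse: Callable[[str], None],
    disconnect: Callable[[], None],
    sites: Sequence[str],
    iterations: int = 50,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Run the connect / wait / browse / disconnect cycle and return visited sites."""
    rng = rng or random.Random()
    visited = []
    for _ in range(iterations):
        connect()                       # start the VPN client and connect
        sleep(rng.uniform(20, 180))     # random wait of 20 to 180 seconds
        if tunnel_is_up():              # confirm the tunnel before browsing
            site = rng.choice(sites)    # one random site from the provided list
            browse(site)
            visited.append(site)
        # Assumption: on a failed tunnel check we still disconnect and cool down.
        disconnect()
        sleep(180)                      # fixed pause before the next iteration
    return visited
```

Passing `sleep` and `rng` as parameters keeps the schedule testable without real delays; in a real run the defaults would apply, and `sites` would hold the list of top-ranked websites.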
This paper is available on arXiv under the CC BY 4.0 DEED license.