Authors:
(1) Diwen Xue, University of Michigan;
(2) Reethika Ramesh, University of Michigan;
(3) Arham Jain, University of Michigan;
(4) Michalis Kallitsis, Merit Network, Inc.;
(5) J. Alex Halderman, University of Michigan;
(6) Jedidiah R. Crandall, Arizona State University/Breakpointing Bad;
(7) Roya Ensafi, University of Michigan.
3 Challenges in Real-world VPN Detection
4 Adversary Model and Deployment
5 Ethics, Privacy, and Responsible Disclosure
6 Identifying Fingerprintable Features and 6.1 Opcode-based Fingerprinting
6.3 Active Server Fingerprinting
6.4 Constructing Filters and Probers
7 Fine-tuning for Deployment and 7.1 ACK Fingerprint Thresholds
7.2 Choice of Observation Window N
7.4 Server Churn for Asynchronous Probing
7.5 Probe UDP and Obfuscated OpenVPN Servers
9 Evaluation & Findings and 9.1 Results for control VPN flows
12 Acknowledgement and References
Our Filter performs both opcode-based and ACK-based fingerprinting and flags a flow if at least one fingerprint matches. The two fingerprints are designed to be complementary: both are effective against vanilla OpenVPN, and each targets a specific subset of obfuscations. The opcode fingerprint works against XOR-based obfuscations that behave like Vigenère ciphers, i.e., they always encrypt the same plaintext opcode at the same position to the same ciphertext byte. The ACK fingerprint targets tunneling-based obfuscations that lack random padding and preserve the 1:1 correspondence between the original and obfuscated packet streams. Combining the two features maximizes our fingerprinting coverage, as we discovered that obfuscation strategies can vary considerably even within the same provider (§ 9). Table 5 in the Appendix shows the effectiveness of each feature against each commercial VPN service we tested. For flows flagged by the Filter, the Prober then carries out the active probing scheme to further reduce false positives.
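To illustrate why a Vigenère-like XOR leaves the opcode fingerprint intact, the following minimal Python sketch obfuscates two packets that share an opcode byte and shows that the obfuscated first bytes still agree. The key, the packet bodies, and the helper names (`opcode_byte`, `xor_obfuscate`, `flag_flow`) are hypothetical, and the sketch assumes the key restarts at the beginning of every packet; it is not the obfuscation used by any particular provider.

```python
from itertools import cycle

# OpenVPN packs a 5-bit opcode and a 3-bit key ID into a single byte.
P_CONTROL_V1 = 4  # opcode value of a control-channel packet


def opcode_byte(opcode: int, key_id: int = 0) -> int:
    """Build the opcode/key-ID byte that precedes an OpenVPN packet body."""
    return (opcode << 3) | key_id


def xor_obfuscate(payload: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; the key restarts at offset 0 for every packet (assumption)."""
    return bytes(b ^ k for b, k in zip(payload, cycle(key)))


def flag_flow(opcode_match: bool, ack_match: bool) -> bool:
    """Flag a flow if at least one of the two fingerprints matches."""
    return opcode_match or ack_match


key = bytes([0x5A, 0xC3, 0x19])  # hypothetical obfuscation key
pkt_a = bytes([opcode_byte(P_CONTROL_V1)]) + b"first body"
pkt_b = bytes([opcode_byte(P_CONTROL_V1)]) + b"another body"
obf_a = xor_obfuscate(pkt_a, key)
obf_b = xor_obfuscate(pkt_b, key)

# The obfuscated opcode differs from the plaintext one, but it is identical
# across packets, so comparisons between opcode positions still succeed.
print(hex(pkt_a[0]), hex(obf_a[0]), hex(obf_b[0]))
```

Because the fingerprint only compares opcode bytes with one another, it never needs to recover the key or the plaintext opcode values.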
We implement the Filter in Zeek [75], an open-source network monitoring tool. The evaluation logic for both opcode-based and ACK-based fingerprinting is simple: each requires only several dozen integer comparisons (bounded by the observation window) and a small amount of per-flow state. We implement the Prober in Go [31]. We believe that both components can be easily deployed by any ISP or censor.
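The small per-flow footprint can be sketched in a few lines of Python. This is not the Zeek implementation: the flow key, the window size, the distinct-value thresholds, and the simplified matching rule (the first two opcode bytes are treated as the client and server resets and must not reappear later in the window) are all assumptions made for illustration. Payloads are assumed to start at the opcode byte.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical flow key: (src IP, src port, dst IP, dst port).
FlowKey = Tuple[str, int, str, int]


@dataclass
class FlowState:
    """Per-flow state: the first byte of each packet seen in the window."""
    first_bytes: List[int] = field(default_factory=list)


class OpcodeFilter:
    def __init__(self, window: int = 100, min_distinct: int = 4,
                 max_distinct: int = 10) -> None:
        # All three parameters are illustrative placeholders.
        self.window = window
        self.min_distinct = min_distinct
        self.max_distinct = max_distinct
        self.flows: Dict[FlowKey, FlowState] = {}

    def observe(self, key: FlowKey, payload: bytes) -> None:
        """Record one byte per packet until the observation window is full."""
        if not payload:
            return
        state = self.flows.setdefault(key, FlowState())
        if len(state.first_bytes) < self.window:
            state.first_bytes.append(payload[0])

    def matches(self, key: FlowKey) -> bool:
        """Simplified rule: reset bytes never recur and few distinct values appear."""
        state = self.flows.get(key)
        if state is None or len(state.first_bytes) < 2:
            return False
        seen = state.first_bytes
        resets = set(seen[:2])  # assumed client reset and server reset
        if any(b in resets for b in seen[2:]):
            return False
        return self.min_distinct <= len(set(seen)) <= self.max_distinct
```

Each packet costs one append, and the final decision is a bounded number of integer comparisons over at most `window` stored bytes, which is consistent with the cost described above.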
This paper is available on arXiv under the CC BY 4.0 DEED license.