Technical overview of Awala VPN
This is a high-level, technical overview of the protocol behind Awala VPN. It’s aimed at security auditors, human rights organisations, and the Relaycorp team. We assume the reader has a good command of cryptography and networking (e.g. TLS, Deep Packet Inspection).
This is not a protocol specification. We intend to formalise the protocol after launch.
Introduction
The Awala VPN protocol is specifically designed to circumvent the world’s most sophisticated censorship system, the Great Firewall of China (GFW), and gradually target the equivalent in other countries. Circumventing censorship safely is the only goal of this project. We are not trying to bypass geo-restrictions from streaming services or match the degree of anonymity of Tor.
The protocol circumvents censorship by making the traffic look like innocuous web browsing. The idea of using websites as tunnels for proxy or VPN traffic has been around for a few years, and has been studied academically under the name HTTPT. Tor’s WebTunnel bridge and many other projects use WebSockets, but the emerging WebTransport API and MASQUE are also promising candidates.
What makes Awala VPN innovative is its approach to censorship resistance:
- Tunnel sharding to prevent adversaries from enumerating all tunnels, even if they obtain legitimate subscriptions.
- Third-party tunnels running inside real websites with organic traffic.
- Mainstream web browser impersonation by using the same TLS implementation as Google Chrome (BoringSSL).
In countries with sophisticated censorship systems like China, the service will be premium-only to prevent tunnel enumeration. In countries with less sophisticated censorship, we may offer a subsidised free tier with dedicated shards (subject to external funding). To maximise impact, we plan to offer licensed deployments to vetted commercial VPN providers, and enable self-hosting for non-commercial use.
This work is based on a proof of concept of the protocol, which was successfully tested in China.
Limitations of existing solutions
The GFW already detects some HTTP-based circumvention services by exploiting certain vulnerabilities in their obfuscation strategies. Other vulnerabilities, whilst not yet exploited, could conceivably be used in the future. Such vulnerabilities fall into at least one of these areas:
- Active probing: The websites used as tunnels (AKA bridges) don’t usually look like the kind of websites that would drive the traffic observed (in terms of volume and patterns). In fact, they generally look like trivial static websites with little or no search engine traffic, and their domain names are often newly registered.
- Enumeration: It’s relatively easy to enumerate all the tunnels, especially if the service is free or freemium (like Tor). In the case of premium VPNs, they have to maintain every tunnel, so they have an incentive to keep the number of tunnels down to minimise overhead and maximise profit, but this comes at the cost of an adversary needing just one or a few premium accounts to discover the entire pool.
- Fingerprinting: Even though the client is indeed connecting to a website over TLS, it generally behaves differently from a browser, for example during the TLS handshake.
As of December 2024, we do not have reason to believe that the GFW’s active probing mechanism considers the architecture of the website (e.g. whether it’s static) or the estimated volume of traffic. However, these capabilities are technically feasible today. To future-proof against such advancements, we intend to pay third-party website operators to host tunnels.
Architecture
Awala VPN adopts a multi-hop architecture with exactly three middleboxes between the VPN client and the target server:
- The tunnel, which is the TLS website to which the client connects directly. This is comparable to an entry node in Tor.
- The gateway shield, or simply shield, which is responsible for protecting and load-balancing the gateways, as well as performing operations that need not be done by the gateway (e.g. DNS resolution).
- The gateway, which is the actual VPN server that relays traffic to the target server. It’s comparable to an exit node in Tor.
The client uses end-to-end (E2E) encrypted WebSocket messages to communicate with the gateway shield, preventing tunnel operators from viewing or tampering with the traffic (see client-shield protocol). Additionally, the IP packets are E2E encrypted between the client and the gateway to minimise the exposure of such data, since the shield doesn’t need it (following the principle of least privilege; see client-gateway protocol). Consequently, IP packets are E2E encrypted twice in an onion routing fashion: initially between the client and the gateway, and the resulting ciphertext is then encrypted between the client and the shield.
To minimise latency, the servers in the middlebox trio must be geographically close: in the same country (for smaller countries like Singapore), in neighbouring countries (like Japan and South Korea), or within the same region of a large country (like the US West Coast). For legal and security reasons, no middleboxes will be located in any jurisdiction likely to be hostile towards Awala (see eligible jurisdictions).
The overall network is controlled by the orchestrator.
Client
The client is the Awala app that the user installs on their device. The app’s role as a private gateway in the Awala network is orthogonal to the VPN protocol, and therefore outside the scope of this document.
To mitigate TLS client fingerprinting, the client will use BoringSSL, via a library like rquest, to mimic Google Chrome during the TLS handshake. Google Chrome is not only the most popular browser in the world, but also the most popular browser in China, accounting for roughly half of the market share in 2023.
Tunnel
A tunnel is a website that acts as a mere reverse proxy for the shield at a random path.
For example, the website https://example.com could host a tunnel under https://example.com/<random-path> with the following configuration, assuming the website is running Nginx:
location /<random-path> {
    proxy_pass https://singapore.vpn-shields.awala.app;
    # Enable WebSockets
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    access_log off;
}
This is all that’s needed to configure a tunnel.
Relaycorp will operate all or most tunnels at launch, gradually shifting traffic to partner-operated tunnels that meet strict requirements to protect user privacy. In the medium term, we expect our first-party tunnels to be systematically blocked, forcing all traffic through third-party tunnels and requiring us to cap subscription data usage for sustainable operation.
Note that this server can’t be TLS fingerprinted as a tunnel because it runs third-party software.
Shield
The gateway shield, or simply shield, is a regionally-distributed server that routes VPN traffic to gateways, protects gateways from protocol-layer DDoS attacks using a reverse proxy like Caddy, protects gateways from application-layer DDoS attacks with Despacito, enforces usage quotas, and resolves DNS records.
The shield is implemented as a standalone server, instead of its functionality being part of the gateway, to ensure clients maintain consistent Internet-facing IP addresses by routing their traffic to their previously assigned gateway, even when connecting through different tunnels.
Each tunnel is assigned to its nearest shield.
For example, if the only shields in Asia-Pacific are in Singapore, tunnels in Seoul and Tokyo would connect to https://singapore.vpn-shields.awala.app.
To mitigate DDoS attacks, shields will have catchment areas, which are regions that are geographically close to the shield and amongst the eligible jurisdictions. For example, shields in Singapore may allow connections from Australia, Japan, and South Korea, but not from China (ineligible jurisdiction) or the Americas (outside catchment area).
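The catchment check described above can be sketched as follows. This is only an illustration: the `Shield` class, the `may_connect` function, and the country lists are hypothetical, and we assume the connecting IP address has already been mapped to an ISO 3166-1 alpha-2 country code by some other component.

```python
from dataclasses import dataclass

# Hypothetical policy data; the real list is defined by the eligible jurisdictions.
INELIGIBLE_JURISDICTIONS = frozenset({"CN"})


@dataclass(frozen=True)
class Shield:
    name: str
    catchment_area: frozenset  # ISO 3166-1 alpha-2 country codes


def may_connect(shield: Shield, country_code: str) -> bool:
    """Return True if a tunnel in the given country may connect to the shield."""
    code = country_code.upper()
    if code in INELIGIBLE_JURISDICTIONS:
        return False  # Ineligible jurisdiction, regardless of proximity
    return code in shield.catchment_area  # Must also be geographically close


# Assumed catchment area for the Singapore example in the text.
singapore = Shield("singapore", frozenset({"SG", "AU", "JP", "KR"}))
```

With this data, `may_connect(singapore, "JP")` is allowed, whereas `"CN"` is rejected as an ineligible jurisdiction and `"BR"` is rejected for being outside the catchment area.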
As the middleware between the client and the gateway, the shield has two interfaces:
- The server in the client-shield connection via the tunnel. The server will only allow incoming connections from IP addresses in supported jurisdictions, in order to mitigate DDoS attacks and prevent tunnels from being hosted in unsupported jurisdictions.
- The WebSockets client in the shield-gateway connection, which will be used to relay E2E encrypted IP packets between the gateway and the client. Consequently, the shield is responsible for allocating a gateway to the client.
To learn more about this server and inspect its source code, visit the GitHub project AwalaApp/vpn-shield.
Gateway
A gateway is effectively a VPN server, as it’s responsible for relaying the packets between the VPN client and the target server. This is the component that handles Network Address Translation (NAT).
The WebSockets server in the gateway, which is used to communicate with the shield, isn’t accessible from the public Internet. Instead, only shields with whitelisted IP addresses can connect to this WebSockets server, limiting DDoS attacks to volumetric attacks only.
Shields and gateways are co-located in regional clusters (e.g., Singapore) to minimise latency.
To learn more about this server and inspect its source code, visit the GitHub project AwalaApp/vpn-gateway.
Orchestrator
The orchestrator is a centralised app that directs all the clients and middleboxes, without playing an active role in the transmission of packets, so it can’t be deemed a middlebox.
Its primary responsibilities are to shard the network, and manage the pool of tunnels.
The orchestrator won’t be accessible from the public Internet. Instead, it will only be accessible from the subscription management system (to issue registration codes and allocate shards), and the shield (to issue Despacito client certificates).
Connection protocols
Client-shield protocol
The client and the shield will communicate over an E2E encrypted channel using Diffie-Hellman (DH) with X25519 keys. The client should trigger the rotation of both key pairs every 60 minutes or when the client starts. The shield should refuse requests with key pairs older than 60 minutes.
To prevent malicious tunnels from replaying requests or responses, each HTTP request should include a monotonically increasing sequence number, which the shield will verify before processing the request and include in the subsequent response. The client should verify the sequence number in the response matches the expected sequence number. This requirement also applies to the HTTP requests to bootstrap WebSockets connections. The sequence number is tied to the client’s key pair, and will be reset to zero upon key rotation.
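The sequence number checks above can be sketched as follows. This is a minimal illustration, not the shield's implementation: the class and function names are hypothetical, and we assume that gaps in the sequence are tolerated as long as each number is strictly greater than the last one accepted for the same client key.

```python
class ShieldSequenceTracker:
    """Tracks the last accepted HTTP request sequence number per client key."""

    def __init__(self):
        self._last_seen = {}  # client key identifier -> last accepted sequence

    def accept(self, client_key_id: bytes, sequence: int) -> bool:
        """Accept the request only if its sequence number has not been seen."""
        last = self._last_seen.get(client_key_id, -1)
        if sequence <= last:
            return False  # Replayed or stale request
        self._last_seen[client_key_id] = sequence
        return True

    def reset(self, client_key_id: bytes) -> None:
        """Forget the sequence on key rotation, so it restarts from zero."""
        self._last_seen.pop(client_key_id, None)


def response_matches(sent_sequence: int, echoed_sequence: int) -> bool:
    """Client-side check that the response echoes the request's sequence."""
    return sent_sequence == echoed_sequence
```

The shield would call `accept()` after decrypting the request and before processing it, and the client would call `response_matches()` on every response.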
To mitigate traffic analysis, HTTP requests and responses will be padded to a random size.
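One possible padding scheme is sketched below. The framing (a 4-byte length prefix followed by random bytes) and the `max_padding` default are assumptions for illustration only; the padding would need to be applied before encryption so that the length prefix isn't visible to the tunnel.

```python
import secrets
import struct


def pad(payload: bytes, max_padding: int = 256) -> bytes:
    """Prefix the payload with its length and append random-length padding."""
    padding_length = secrets.randbelow(max_padding + 1)  # 0..max_padding
    header = struct.pack("!I", len(payload))  # 4-byte big-endian length
    return header + payload + secrets.token_bytes(padding_length)


def unpad(padded: bytes) -> bytes:
    """Recover the original payload, discarding the trailing padding."""
    if len(padded) < 4:
        raise ValueError("Padded message is too short")
    (length,) = struct.unpack("!I", padded[:4])
    payload = padded[4:4 + length]
    if len(payload) != length:
        raise ValueError("Padded message is truncated")
    return payload
```

The recipient recovers the payload with `unpad()` after decryption, without needing to know how much padding was added.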
Every request from the client is to be signed with the client’s Despacito certificate, except for the first request in the client registration.
Web browsing simulation
Before opening a WebSockets connection to relay IP packets, the client will make an initial request via the tunnel that simulates the retrieval of a web page, mimicking the behaviour of Google Chrome for obfuscation purposes. Once a response is received, the client will emulate the retrieval of assets like images and JS files, and then open a WebSockets connection to relay IP packets.
The pattern exhibited by the connection should be stable by tunnel, but different across tunnels. See the web browsing simulation operations on the shield app to learn more.
WebSockets connection
After simulating the retrieval of a webpage and its assets, and rotating the key pair if necessary, the client will open a WebSockets connection to relay IP packets. This will require opening a new TLS connection with an ALPN of http/1.1 if the previous connection used a different HTTP version, since the tunnel’s WebSockets handshake uses the HTTP/1.1 upgrade mechanism.
Both the client and the shield will maintain their encryption keys for the duration of the connection, even if a key rotation takes place before the connection is closed.
To prevent malicious tunnels from replaying WebSockets messages (within the same connection or across different connections), the plaintext of each encrypted message must include the HTTP request sequence number and a monotonically increasing sequence number. The client and the shield will maintain separate monotonically increasing sequences for the messages they send. Upon successful decryption, the recipient must verify both values, rejecting any message with a different HTTP request sequence number, or a monotonically increasing sequence number less than or equal to the last processed message.
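The recipient-side verification can be sketched as follows. The `InboundChannel` name and its fields are hypothetical, and we assume both values have already been extracted from the decrypted plaintext.

```python
from dataclasses import dataclass


@dataclass
class InboundChannel:
    """Replay checks for messages received over one WebSockets connection."""

    # Sequence number of the HTTP request that bootstrapped this connection
    http_request_sequence: int
    # Sequence number of the last message processed from the peer
    last_message_sequence: int = -1

    def verify(self, http_request_sequence: int, message_sequence: int) -> bool:
        """Return True if the decrypted message is fresh for this connection."""
        if http_request_sequence != self.http_request_sequence:
            return False  # Message was replayed from a different connection
        if message_sequence <= self.last_message_sequence:
            return False  # Message was replayed within this connection
        self.last_message_sequence = message_sequence
        return True
```

Since the client and the shield maintain separate sequences for the messages they send, each side would keep one such object for the messages it receives.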
The WebSocket connection will also carry control messages, besides the E2E encrypted IP packets. For example, the shield may periodically notify the client about how much data its subscription has left.
To mitigate traffic analysis on the WebSockets connection, the client and the shield will obfuscate it by:
- Exchanging bidirectional noise frames of random sizes and at random intervals throughout the lifespan of the connection. The patterns should be consistent for any given tunnel — that is, censors won’t observe widely different patterns across connections to the same website.
- Padding each packet to a random size, and/or batching packets.
- Having the shield delay the first message from the gateway, as this handshake in the client-gateway protocol could be used to fingerprint the connection. During this delay, the client and the shield may exchange noise frames.
Each connection has a maximum time-to-live (TTL) of 60 seconds to mitigate the effects of TCP meltdown, with the actual TTL varying by tunnel to maintain consistent patterns across clients. To avoid disruption, the client may maintain multiple concurrent connections with the same tunnel or different tunnels.
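One way to obtain a TTL that varies by tunnel but stays the same for every client of that tunnel is to derive it deterministically from a tunnel-specific value. The sketch below hashes the tunnel URL; this derivation and the 20-second lower bound are assumptions for illustration, and in practice the value could just as well be assigned by the shield or the orchestrator.

```python
import hashlib

MAX_TTL_SECONDS = 60  # Upper bound stated in the text
MIN_TTL_SECONDS = 20  # Hypothetical lower bound


def connection_ttl(tunnel_url: str) -> int:
    """Derive a stable, tunnel-specific TTL within the allowed range."""
    digest = hashlib.sha256(tunnel_url.encode("utf-8")).digest()
    span = MAX_TTL_SECONDS - MIN_TTL_SECONDS + 1
    return MIN_TTL_SECONDS + int.from_bytes(digest[:4], "big") % span
```

Every client calling `connection_ttl()` with the same tunnel URL gets the same value, so connections to a given website exhibit a consistent lifespan.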
Key rotation
The key rotation process involves the client initiating a request encrypted with the shield’s current DH public key. This request contains the client’s newly generated DH public key in its payload. Upon receiving this request, the shield generates and returns its own new DH public key, completing the bilateral key rotation. Each shield’s key pair is strictly bound to a single client.
To tolerate delays in the communication, both the client and the shield will keep their old keys to decrypt incoming messages for up to 60 seconds.
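The grace period for old keys can be sketched with a small key ring. The `KeyRing` class is hypothetical and treats keys as opaque byte strings; the clock is injectable so the behaviour can be tested without waiting.

```python
import time

GRACE_SECONDS = 60  # How long the previous key remains usable for decryption


class KeyRing:
    """Holds the current key plus the previous one during the grace period."""

    def __init__(self, current_key: bytes, clock=time.monotonic):
        self._clock = clock
        self._current = current_key
        self._previous = None
        self._previous_expiry = 0.0

    def rotate(self, new_key: bytes) -> None:
        """Make new_key current, keeping the old key for the grace period."""
        self._previous = self._current
        self._previous_expiry = self._clock() + GRACE_SECONDS
        self._current = new_key

    def decryption_keys(self) -> list:
        """Keys to try on incoming messages, most recent first."""
        keys = [self._current]
        if self._previous is not None and self._clock() < self._previous_expiry:
            keys.append(self._previous)
        return keys
```

Outgoing messages would always use the current key, whilst incoming messages would be tried against each key returned by `decryption_keys()`.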
DNS resolution
The client will make an HTTP request to the shield for each DNS record that needs to be resolved.
Shield-gateway protocol
Whenever a client starts a packet relay connection, the shield will start the shield-gateway protocol by selecting the gateway that will take over the packet relay. This is done by reusing an existing session between a client and a gateway, or allocating a new gateway to the client.
Once the gateway is selected, the shield will relay E2E encrypted IP packets bidirectionally between the client and the gateway, until either the client or the gateway closes the connection.
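The gateway selection can be sketched as follows, combining session reuse with the 5-minute idle timeout described under the client-gateway protocol. The `SessionTable` class is hypothetical, round-robin allocation is an assumption (the actual allocation strategy isn't specified here), and a real implementation would refresh the session on packet activity rather than only on lookup.

```python
import time

SESSION_IDLE_SECONDS = 5 * 60  # Sessions expire after 5 minutes without use


class SessionTable:
    """Maps clients to gateways, reusing live sessions where possible."""

    def __init__(self, gateways, clock=time.monotonic):
        self._gateways = list(gateways)
        self._clock = clock
        self._sessions = {}  # client id -> (gateway, time of last use)
        self._next_index = 0

    def gateway_for(self, client_id: str) -> str:
        """Return the client's existing gateway, or allocate a new one."""
        now = self._clock()
        session = self._sessions.get(client_id)
        if session is not None and now - session[1] < SESSION_IDLE_SECONDS:
            gateway = session[0]  # Reuse the live session
        else:
            # Round-robin allocation (an assumption for this sketch)
            gateway = self._gateways[self._next_index % len(self._gateways)]
            self._next_index += 1
        self._sessions[client_id] = (gateway, now)
        return gateway
```

Because the lookup is keyed by client rather than by tunnel, a client that reconnects through a different tunnel within the idle window keeps the same gateway and, therefore, the same Internet-facing IP address.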
Client-gateway protocol
Once the shield establishes a connection between the client and the gateway, the gateway will start the client-gateway protocol by sending the client its IPv4 and IPv6 addresses (e.g. 10.0.102.0/30, fd00:1234::2:0/127).
The client will be ready to send E2E encrypted IP packets once it receives its IP addresses.
The session between a client and its gateway does not end when the connection via the shield is closed. The session persists until it has been idle for at least 5 minutes. This enables the client to resume the session from a different tunnel at any time.
Since this is the actual VPN protocol, it’s important to consider why we didn’t just obfuscate an existing VPN protocol.
Subscription model
Each client must have an active subscription, and multiple clients can share the same subscription.
Subscriptions are designed for privacy: they’re ephemeral (non-renewable) and can be purchased anonymously, depending on the payment method. This protects users against data breaches and targeted law enforcement requests.
Each subscription is tied to a specific country (e.g. China) for performance reasons, so it can be assigned to a shard with nearby tunnels.
We do not offer a free tier, as it could be abused by an adversary for enumeration purposes. We may, however, give away subscriptions at our discretion.
Sharding
To mitigate enumeration attacks on the tunnel pool, each subscription and each tunnel is assigned to a shard, and each shard has multiple tunnels and multiple subscriptions. Any client can use any tunnel in its subscription’s shard, but it won’t learn about the existence of tunnels in other shards. Consequently, an adversary that purchases a legitimate subscription will only be able to enumerate the tunnels in its shard, and therefore won’t be able to correlate the traffic of different users.
When a tunnel is blocked by the GFW or equivalent, the entire shard will be deemed compromised in that country, and two smaller shards will be created to replace it. Each new shard will inherit half of the subscriptions of the original shard, and it will have a new pool of tunnels. If a second-generation shard is also compromised, no more shards will be created, and users will be required to purchase a new subscription to continue using the service.
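The shard replacement rule can be sketched as follows. The `Shard` class and the function name are hypothetical, and the way subscriptions are divided (first half and second half of a list) is an assumption, since the text only states that each new shard inherits half of them.

```python
from dataclasses import dataclass

MAX_GENERATION = 2  # Second-generation shards are not replaced


@dataclass
class Shard:
    subscriptions: list
    tunnels: list
    generation: int = 1


def split_compromised_shard(shard: Shard, tunnels_a: list, tunnels_b: list) -> list:
    """Replace a compromised shard with two smaller ones, if still allowed."""
    if shard.generation >= MAX_GENERATION:
        return []  # No replacement: users must purchase a new subscription
    middle = len(shard.subscriptions) // 2
    next_generation = shard.generation + 1
    return [
        Shard(shard.subscriptions[:middle], list(tunnels_a), next_generation),
        Shard(shard.subscriptions[middle:], list(tunnels_b), next_generation),
    ]
```

Each replacement shard receives a fresh pool of tunnels, so none of the tunnels known to the adversary in the original shard carry over.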
User onboarding
App distribution
Unfortunately, we expect most users to have to side-load the app, as app stores tend to block VPN apps at the request of the authorities. Consequently, we’ll distribute the app in two ways:
- By linking to the latest GitHub release from the download page, which will require users to use another VPN (or an existing Awala VPN subscription) to download the app when GitHub is blocked.
- Through single-use download URLs served by our tunnels, in case GitHub rate limits our downloads. Each URL will expire after 24 hours or after the first successful download.
Subscription purchase
To purchase a subscription, users must access the subscription management system through the Awala website or app. If the purchaser is in a censored region, they must use an existing VPN to access the system, or make the purchase from outside the censored region. Following the purchase, the orchestrator will issue a single-use client registration code, and a signature of said code, so that the client can register itself.
If the purchase is made within the Awala app, using a VPN or outside a censored region, the registration will be handled automatically.
If the purchase is made from the website, the orchestrator will produce the following ClientRegistration ASN.1 structure, and encapsulate it in a VeraId SignatureBundle signed by [email protected]:
ClientRegistration ::= SEQUENCE {
    registrationCode  [0] OCTET STRING,
    shieldPublicKey   [1] ShieldPublicKey,
    tunnelURL         [2] VisibleString
}
ShieldPublicKey ::= SEQUENCE {
    key         [0] SubjectPublicKeyInfo,  -- From the X.509 specification
    identifier  [1] OCTET STRING
}
Where:
- registrationCode is the single-use registration code.
- shieldPublicKey.key is an ephemeral DH public key of the shield.
- shieldPublicKey.identifier is the unique identifier of the shield’s public key.
- tunnelURL is the URL of one of the tunnels in the allocated shard.
The resulting VeraId SignatureBundle will span around 4 KiB, due to the overhead of the MemberIdBundle (for [email protected]), which is embedded in the SignatureBundle.
To facilitate the distribution of this information to the client, the person who made the purchase will receive this information as a deep link in the form https://awala.app/vc#<version><registrationParameters>, where <version> is currently the constant 1, and <registrationParameters> is the Base64-encoded DER representation of the VeraId SignatureBundle.
The deep link should open the Awala app on the receiving device if installed.
Note that the parameters are encoded in the URL fragment to avoid sending them to the server, if the URL is opened on a web browser.
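The deep link format can be sketched as follows. The function names are hypothetical, and we assume the standard Base64 alphabet with padding, since the text doesn't say whether a URL-safe variant is used. The input is treated as opaque bytes standing in for the DER-encoded SignatureBundle.

```python
import base64
from urllib.parse import urlsplit

DEEP_LINK_PREFIX = "https://awala.app/vc#"
VERSION = "1"  # Currently a constant, per the text


def build_deep_link(signature_bundle_der: bytes) -> str:
    """Encode the DER-serialised SignatureBundle in the URL fragment."""
    parameters = base64.b64encode(signature_bundle_der).decode("ascii")
    return DEEP_LINK_PREFIX + VERSION + parameters


def parse_deep_link(url: str) -> bytes:
    """Extract the DER-serialised SignatureBundle from a deep link."""
    parts = urlsplit(url)
    if (parts.scheme, parts.netloc, parts.path) != ("https", "awala.app", "/vc"):
        raise ValueError("Not an Awala VPN deep link")
    if not parts.fragment.startswith(VERSION):
        raise ValueError("Unsupported deep link version")
    encoded = parts.fragment[len(VERSION):]
    return base64.b64decode(encoded, validate=True)
```

The client would still have to verify the VeraId signature on the decoded bundle before trusting any of its contents.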
In cases where the size of the URL is problematic, or the sharing of any URL is undesirable, the user will get the option to download the parameters as an Awala VPN configuration file with the extension .avpn. This file is analogous to an OpenVPN® .ovpn configuration file.
Note that we use VeraId to establish secure initial communication between the client and shield without relying on public keys hard-coded in the client. This enables the client to encrypt sensitive information with the correct shield public key from the very first interaction, whilst maintaining key rotation flexibility.
Client registration
The client will be able to register itself with the shield, and claim its Despacito certificate, using the following protocol spanning two HTTP requests (to mitigate replay attacks from malicious tunnels):
- Client: Send RegistrationRequest message:
  - Generate a nonce.
  - Generate a DH key pair.
  - Create a message containing the nonce, the registration code, the client’s DH public key, and the identifier of the client’s DH public key (provided by the orchestrator).
  - Encrypt the message with the shield’s DH public key (provided by the orchestrator).
  - Send the ciphertext to the shield, along with the identifier of the shield’s public key.
- Shield: Process RegistrationRequest and send RegistrationConfirmationRequest:
  - Look up the DH key pair by the provided identifier, or abort if not found.
  - Decrypt the ciphertext with the specified DH private key, or abort if this fails.
  - Verify the registration code is valid and unused, or abort if invalid/used. (Do not yet expire the registration code.)
  - Generate a nonce.
  - Store the nonce, and the respective registration code, in the database with a 5-minute expiry.
  - Create a message containing the client’s nonce and the shield’s nonce.
  - Encrypt the message with the client’s DH public key.
  - Send the encrypted message to the client.
- Client: Process RegistrationConfirmationRequest and send RegistrationConfirmation:
  - Decrypt the ciphertext with the client’s DH private key, or abort if this fails.
  - Verify the client’s nonce sent by the shield, and abort if it doesn’t match.
  - Generate a Despacito key pair.
  - Create a message containing the shield’s nonce, the registration code, and the Despacito public key.
  - Sign the message with the Despacito private key.
  - Encrypt the signed message with the shield’s DH public key.
  - Send the encrypted message to the shield.
- Shield: Process RegistrationConfirmation and send RegistrationComplete:
  - Decrypt the ciphertext with the shield’s DH private key, or abort if this fails.
  - Verify the digital signature of the decrypted message, or abort if invalid.
  - Verify the shield’s nonce, and abort if it doesn’t match.
  - Delete the shield’s nonce from the database.
  - Mark the registration code as used.
  - Issue Despacito certificate using the client’s provided public key.
  - Create a message containing the Despacito certificate.
  - Encrypt the message with the client’s DH public key.
  - Send the encrypted message to the client.
- Client: Process RegistrationComplete:
  - Decrypt the ciphertext with the client’s DH private key, or abort if this fails.
  - Verify the Despacito certificate contains the public key originally sent, or abort if they don’t match.
  - Store the Despacito certificate.
Note that if the client fails to receive the RegistrationComplete message, the process will have to be started from scratch with a new registration code.
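The shield’s bookkeeping of registration codes and nonces across the two requests can be sketched as follows. All encryption, signature verification and certificate issuance are deliberately omitted; the `RegistrationLedger` class is hypothetical, and an in-memory dictionary stands in for the database.

```python
import hmac
import secrets
import time

NONCE_TTL_SECONDS = 5 * 60  # 5-minute expiry for stored nonces


class RegistrationLedger:
    """Shield-side state for registration codes and nonces (crypto omitted)."""

    def __init__(self, valid_codes, clock=time.monotonic):
        self._unused_codes = set(valid_codes)
        self._pending = {}  # shield nonce -> (registration code, expiry time)
        self._clock = clock

    def start(self, registration_code: bytes) -> bytes:
        """Handle RegistrationRequest: check the code and issue a nonce."""
        if registration_code not in self._unused_codes:
            raise ValueError("Invalid or used registration code")
        nonce = secrets.token_bytes(16)
        expiry = self._clock() + NONCE_TTL_SECONDS
        self._pending[nonce] = (registration_code, expiry)
        return nonce  # The code is not yet marked as used

    def confirm(self, shield_nonce: bytes, registration_code: bytes) -> None:
        """Handle RegistrationConfirmation: consume the nonce and the code."""
        entry = self._pending.get(shield_nonce)
        if entry is None:
            raise ValueError("Unknown shield nonce")
        expected_code, expiry = entry
        if self._clock() >= expiry:
            del self._pending[shield_nonce]
            raise ValueError("Expired shield nonce")
        if not hmac.compare_digest(expected_code, registration_code):
            raise ValueError("Nonce was issued for another registration code")
        if expected_code not in self._unused_codes:
            raise ValueError("Registration code already used")
        del self._pending[shield_nonce]  # Delete the nonce
        self._unused_codes.discard(expected_code)  # Mark the code as used
```

Because the code is only marked as used in `confirm()`, a tunnel that replays the first request gains nothing: it cannot produce a valid confirmation without the shield’s nonce, which is only ever sent encrypted to the client.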
To register an additional client, an authenticated client must request a VeraId-signed ClientRegistration message from the orchestrator. This request may be refused if the subscription has already registered the maximum number of clients.
VeraId usage
VeraId operations must be done in the context of the service 1.3.6.1.4.1.58708.3.0, defined as the awala_vpn_service Object Identifier (OID) below:
awala_vpn_service OBJECT IDENTIFIER ::= { iso(1) identified-organization(3)
    dod(6) internet(1) private(4) enterprise(1)
    relaycorp(58708) awala_vpn(3) veraid_service(0) }
Additionally, the maximum TTL for a VeraId SignatureBundle is 24 hours.
Threat model
(This document covers the threat model of Awala VPN, omitting any considerations for the physical and cyber security of Relaycorp as a company.)
We’re going after nation-state actors, and they will come for us. To stave off attacks as a tiny team, we must offload as much as possible to well-resourced providers, minimise what we know about our stakeholders (e.g. users, partners), and minimise the level of trust required from us and our partners.
Adversaries
For the purposes of this threat model, we are splitting our nation-state adversaries into national firewall operators (e.g. the Cyberspace Administration of China), and state-backed Advanced Persistent Threats (APTs; e.g. Flax Typhoon).
Adversary | Goals | Capabilities | Resources |
---|---|---|---|
National firewall operator | Block service | Traffic analysis, active probing, connection blocking | Historical traffic data, government agencies |
State-backed APT | Disrupt service, damage reputation | Infrastructure infiltration, social engineering, operating tunnels, running modified VPN clients with valid subscriptions | Booter services, significant funding, government agencies |
Financially-motivated cybercriminal | Ransom, sell tunnelling metadata | Infrastructure infiltration, social engineering, operating tunnels, running modified VPN clients with valid subscriptions | Booter services |
Government agencies in operating jurisdictions | Surveillance | Legal authority to request data, advanced surveillance infrastructure | Legal frameworks, intelligence agencies |
Attack vectors
Long-running connection disruption
All VPNs use long-running connections that can be blocked or terminated. In our case, censors may interfere with long-running connections to websites.
- Impact: Service disruption for users when connections are terminated.
- Attempt likelihood: High.
- Attack method:
  - Termination of connections to websites after a given timeout.
  - Blocking TLS connections using an ALPN of http/1.1, if the server is known to support newer HTTP versions, as that would indicate a WebSockets connection.
- Mitigations:
  - Encourage users to use Awala network apps like Letro so that we can use regular HTTP requests with short-lived connections, leveraging the delay-tolerant nature of the technology.
- Residual risks:
  - Migration to delay-tolerant networks is a major barrier.
Tunnel identification via traffic analysis
An adversary may identify tunnels by analysing the mismatch between a website’s apparent purpose/content and its traffic patterns.
- Impact: Identification and blocking of tunnels, potential investigation of tunnel operators.
- Attempt likelihood: High, given the sophistication of traffic analysis and active probing by national firewalls.
- Attack method: Detection of traffic patterns inconsistent with the website’s apparent purpose.
- Mitigations:
  - Prefer tunnels that could plausibly explain their traffic patterns.
  - Rate-limit traffic per tunnel to maintain plausible traffic patterns.
  - Distribute traffic across more tunnels when volume increases.
- Residual risks:
  - Tunnels may still exhibit suspicious traffic patterns.
DDoS attacks against shield
- Impact: Regional service degradation or unavailability.
- Attempt likelihood: High, given the shield’s critical role and visibility.
- Attack method: Application-level attacks.
- Mitigations:
  - Hosting providers that offer volumetric attack protection, such as BGP-based solutions.
  - Mainstream reverse proxies (e.g. Caddy) for protocol-level DDoS protection.
  - Despacito integration for application-level DDoS protection.
  - Rate limiting at load balancer level.
- Residual risks:
  - The service may still be degraded with a large enough attack volume.
  - Increased infrastructure costs during attacks.
Tampered client distribution
An adversary may distribute a tampered, malicious version of the client that looks like the legitimate one.
- Impact: User privacy compromise, service abuse, reputational damage.
- Attempt likelihood: High, considering the need to side-load the client in many cases.
- Attack method:
  - Distribution of modified APKs through unofficial channels.
  - Social engineering to convince users to install compromised versions.
- Mitigations:
  - Clear documentation of official distribution channels.
  - In-app warnings when sideloading is detected.
- Residual risks:
  - Users may still install compromised versions of the client.
DDoS attacks against orchestrator
- Impact: Users may be prevented from signing up for the service.
- Attempt likelihood: High.
- Attack method: Volumetric, protocol- and application-layer attacks against orchestrator infrastructure.
- Mitigations:
  - Cloudflare reverse proxy for volumetric and protocol-level DDoS protection.
  - Despacito integration with rate limiting per IP address for application-level DDoS protection.
  - Anomaly detection.
- Residual risks:
  - Increased costs for the app running the orchestrator.
  - The service may still be degraded with a large enough attack volume.
Tunnel enumeration via data breach
An adversary may try to enumerate all tunnels via a data breach of the shield or the orchestrator.
- Impact: Enumeration of all tunnels.
- Attempt likelihood: High, given the high-value target.
- Attack method: Infrastructure breach, insider threats, social engineering.
- Mitigations:
  - Minimise the number of places where the tunnel addresses are stored.
  - Encrypt the tunnel addresses at rest.
  - Share the tunnel addresses on a strict need-to-know basis.
  - Avoid logging the tunnel addresses.
  - Periodic security audits.
  - Anomaly detection.
- Residual risks:
  - A successful breach could still expose tunnel information.
VPN client detection via eavesdropping app
An adversary may distribute an eavesdropping app that can detect the presence of a VPN client, such as China’s National Anti-Fraud Center app.
- Impact: Individual user identification and potential legal consequences.
- Attempt likelihood: High where such apps are widely used.
- Attack method: System-level monitoring of installed apps or network activity.
- Mitigations:
  - Prompt user to uninstall the VPN client or the eavesdropping app, when such an app is detected.
  - Documentation warning users about this risk.
- Residual risks:
  - Users may be required to keep eavesdropping apps installed.
  - New eavesdropping apps may emerge that are harder to detect.
Shard-level traffic correlation via tunnel analysis
Shards might be identified if their websites represent a large fraction of the traffic from a few client IP addresses.
- Impact: Identification and blocking of tunnels, compromising entire shards.
- Attempt likelihood: Medium.
- Attack method: Identification of websites when they represent a large fraction of traffic from few IP addresses.
- Mitigations:
  - Split tunnelling to skip VPN for certain apps and websites.
  - Allocating sufficient tunnels to each shard.
- Residual risks:
  - Tunnels might still be identified if they represent a significant portion of traffic from a small number of clients.
DDoS attacks against gateways
- Impact: Service degradation or unavailability for clients connected to affected gateway, or other gateways in the same data centre.
- Attempt likelihood: Medium.
- Attack method: Volumetric attacks against gateway infrastructure.
- Mitigations:
  - Whitelisting IP addresses of shields in the same region to avoid protocol- and application-layer attacks.
  - Hosting providers that offer volumetric attack protection.
- Residual risks:
  - The service may still be degraded with a large enough attack volume.
  - Hosting provider may prevent us from using their services going forward.
Tunnel operator compromise
- Impact:
  - Client IP address exposure.
  - Sybil attacks to throttle or disable the service.
- Attempt likelihood: Medium, considering our vetting process.
- Attack method:
  - Operator coercion.
  - Infrastructure breach.
  - Adversary signing up as an operator.
- Mitigations:
  - Strict operator vetting to ensure they fully operate within eligible jurisdictions.
  - Remind operators about the no-logging policy.
  - Legal action against operators that don’t comply.
- Residual risks:
  - Operators could still log IP addresses.
  - Legal compulsion might force logging.
Tunnel enumeration via compromised clients
An adversary may modify the open source client and pay for a legitimate subscription to discover the tunnels in its shard.
- Impact: Discovery of tunnels in a shard, leading to targeted blocking.
- Attempt likelihood: Low, given the cost of the service.
- Attack method: Modified client that logs tunnel URLs.
- Mitigations:
- No free tier.
- Each shard should contain up to 5% of the active subscriptions.
- Each tunnel should belong to zero or one shard.
- Residual risks:
- Adversaries with sufficient financial resources could still enumerate tunnels by subscribing multiple times.
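The two sharding constraints above can be expressed as a simple check. This is a sketch under assumptions: the data structures and the `validate_allocation` name are hypothetical, and the real orchestrator may enforce these rules differently.

```python
from collections import Counter


def validate_allocation(subscription_shards, tunnel_shards, max_share_percent=5):
    """Return a list of violations of the two sharding constraints.

    `subscription_shards` maps each active subscription id to its shard id.
    `tunnel_shards` is an iterable of (tunnel_url, shard_id) assignments;
    unassigned tunnels are simply absent.
    """
    violations = []

    # Constraint 1: no shard holds more than 5% of active subscriptions.
    total = len(subscription_shards)
    counts = Counter(subscription_shards.values())
    for shard_id, count in sorted(counts.items()):
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if count * 100 > max_share_percent * total:
            violations.append(f"shard {shard_id} holds {count}/{total} subscriptions")

    # Constraint 2: each tunnel belongs to at most one shard.
    seen = {}
    for tunnel_url, shard_id in tunnel_shards:
        previous = seen.setdefault(tunnel_url, shard_id)
        if previous != shard_id:
            violations.append(f"tunnel {tunnel_url} is in shards {previous} and {shard_id}")

    return violations


# Hypothetical allocation: 100 subscriptions spread evenly over 20 shards.
subscriptions = {sub_id: sub_id % 20 for sub_id in range(100)}
tunnels = [("https://a.example/", 0), ("https://b.example/", 1)]
print(validate_allocation(subscriptions, tunnels))
```

A 5% cap implies at least 20 shards, so an adversary holding a single subscription can discover the tunnels of at most one of them.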
DDoS attacks against tunnels
The censor may launch DDoS attacks against the tunnels as a punitive measure, given that they could simply block the website used as a tunnel.
- Impact: Third parties may be reluctant to host tunnels.
- Attempt likelihood: Low.
- Attack method: Volumetric, protocol- or application-layer attacks against tunnel websites.
- Mitigations:
- Promote the DDoS Report amongst tunnel operators.
- Residual risks:
- Prospective tunnel operators may still not be able or willing to adopt recommendations from the DDoS Report.
Targeted surveillance requests
- Impact: Link specific users to their traffic.
- Attempt likelihood: Low, given our criteria for eligible jurisdictions and the target audience.
- Attack method: Warrants targeting specific users/traffic, gag orders.
- Mitigations:
- Ephemeral subscriptions with minimal personally-identifiable information.
- Anonymous payment options.
- No logging of personally-identifiable information on the shield and gateways.
- Tunnels operated by independent third parties with a no-logging policy.
- E2E encryption between the client and the gateway, and between the client and the shield.
- Publicly-available source code for the client and all the server-side components.
- Periodic independent audits of all components.
- Residual risks:
- Payment information could be traced.
- Traffic patterns might identify users.
- Real-time monitoring could be court-ordered.
Mass surveillance requests
- Impact: Link a large segment of users to their traffic.
- Attempt likelihood: Low, given our criteria for eligible jurisdictions and the target audience.
- Attack method: Legal orders requiring logging/monitoring of all traffic, gag orders.
- Mitigations:
- Ephemeral subscriptions with minimal personally-identifiable information.
- Anonymous payment options.
- No logging of personally-identifiable information on the shield and gateways.
- Tunnels operated by independent third parties with a no-logging policy.
- E2E encryption between the client and the gateway, and between the client and the shield.
- Publicly-available source code for the client and all the server-side components.
- Periodic independent audits of all components.
- Residual risks:
- Hosting providers might facilitate mass surveillance requests.
User identification via data breach
An adversary may try to identify users via a data breach of the subscription database or our payment provider.
- Impact: Compromise of user privacy, potential legal consequences, reputational damage.
- Attempt likelihood: Low, given our minimal data collection.
- Attack method:
- Payment tracking.
- Social engineering.
- Mitigations:
- Anonymous payment options.
- No logging or storage of personally-identifiable information on our systems (only on the payment provider’s systems, for fiat transactions).
- Residual risks:
- Payment methods may still be traceable.
Performance considerations
Using WebSockets, a TCP-based protocol, to tunnel network traffic introduces performance issues:
- TCP meltdown when tunnelling TCP connections. This is the main reason why we limit the TTL of client-shield connections.
- Head-of-line blocking when tunnelling UDP packets, as well as unnecessary overhead from adding reliability to UDP traffic.
Migrating to MASQUE would solve these issues.
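The head-of-line blocking issue can be illustrated with a toy model, where the arrival times are made up for illustration. When UDP packets travel over a single TCP stream, no packet can be delivered before all earlier ones have arrived, so one retransmission delays everything queued behind it.

```python
from itertools import accumulate


def in_order_delivery_times(arrival_times_ms):
    """Delivery times when each packet must wait for all earlier packets,
    as happens to datagrams carried over a single TCP stream."""
    return list(accumulate(arrival_times_ms, max))


def head_of_line_delay(arrival_times_ms):
    """Extra delay per packet compared with delivering each packet on arrival."""
    delivered = in_order_delivery_times(arrival_times_ms)
    return [d - a for d, a in zip(delivered, arrival_times_ms)]


# Hypothetical arrival times: packet 2 was lost and retransmitted.
arrivals_ms = [10, 20, 230, 40, 50]
print(head_of_line_delay(arrivals_ms))  # [0, 0, 0, 190, 180]
```

With unordered, unreliable delivery, packets 3 and 4 would reach the application as soon as they arrive, and the lost packet would simply be dropped, as UDP applications already expect.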
Ethical considerations
Collateral damage
The holy grail of Internet censorship circumvention is to make it prohibitively expensive for the censor to block the circumvention service. In practice, the best we can do is to make it very expensive, which effectively means daring the censor to block the circumvention service at the expense of some collateral damage.
If we succeed in making Awala VPN connections truly indistinguishable from regular Web browsing, we may expect the following collateral damage:
- Long-running connections to websites could be blocked or terminated, regardless of how likely the website is to be a tunnel.
- Censors could use TLS client fingerprinting to block Google Chrome, forcing people to use sanctioned browsers.
- Google may make it technically or legally difficult for us to continue to use BoringSSL to impersonate Chrome.
- Censors could migrate their firewalls from a denylist approach, where all websites are accessible by default, to an allowlist approach, where only sanctioned websites are accessible.
- Censors could sever ties with the global Internet.
Affordability
Whilst charging for the VPN conflicts with Awala’s mission of universal uncensored communication, it’s necessary to prevent enumeration attacks on tunnels. A free tier in countries with sophisticated firewalls would require novel abuse prevention methods beyond payments or phone verification. Until then, we can only work to minimise costs through scale and subsidies.
Having said that, we will ensure that even non-paying users benefit from our work by:
- Licensing the client under an open source licence.
- Licensing the shield, gateway, and orchestrator under a fair source licence that allows for non-commercial use and eventually converts to open source. We may also license these server-side components to commercial VPN providers, who may offer the service more affordably thanks to better economies of scale.
- Packaging the shield and gateway as a single app, to make it easier for individuals to deploy for non-commercial purposes. This app could even run on the same server as the tunnel.
- Open sourcing two key components of our obfuscation protocol, so that other circumvention technologies can benefit from our work:
- The library that makes the client behave like Google Chrome, which wraps BoringSSL and implements the obfuscation protocol in the client-shield connection.
- The tooling we use to generate and maintain our own tunnels.
Environmental impact
All proxy-based circumvention services have an environmental impact, as they require additional server-side resources, client-side processing, and bandwidth for the proxying functionality alone. In our case, we also need additional computing resources to mitigate traffic analysis in the client-shield connection (e.g. noise frames, padding).
Unfortunately, our ability to offset the environmental impact of the service is limited:
- We will certainly optimise our algorithms to avoid wasting CPU cycles and bandwidth, but these efforts are unlikely to yield any meaningful impact.
- We have no control over the efficiency of the devices used by users and tunnels, or the kind of energy used to power them.
- None of our server-side infrastructure providers are listed on the Green Web Directory, although some do have some sustainability commitments:
- The orchestrator will run on Google Cloud Platform, which invests heavily in carbon-free energy and efficiency improvements.
- The gateways and shields will sadly be hosted on virtual and bare metal servers from various providers with unclear sustainability commitments, like DigitalOcean.
Amazon Web Services is listed on the Green Web Directory and could serve most of our needs, but it would have made the service unviable due to extortionate bandwidth costs.
We are considering some form of carbon offsetting that wouldn’t be mere greenwashing. Suggestions are most welcome!
Alternatives considered
Peer-to-peer tunnelling
We decided against peer-to-peer (P2P) tunnelling, like Snowflake, for the following reasons:
- P2P applications, such as those using WebRTC, are nowhere near as widespread as web browsing, so the collateral damage of blocking them is minimal, and therefore more tolerable to the censors.
- Without a vetting process like the one we have for the tunnels, a sufficiently-resourced adversary could launch successful Sybil attacks.
Obfuscating existing VPN protocols
In principle, we’re only interested in the tunnelling aspect. The underlying VPN protocol, whether it’s OpenVPN® or WireGuard®, should be irrelevant. In practice, however, the current implementations of OpenVPN® and WireGuard® would prove exceptionally problematic in production.
Neither OpenVPN® nor WireGuard® servers natively support client-side interfaces based on WebSockets, so we would’ve had to implement and/or integrate another piece of middleware (e.g. wstunnel) to bridge the two. This would’ve added significant complexity and costs in production, and reduced performance.
If we were to do any kind of advanced integration with the VPN protocol, we would’ve only considered WireGuard®, given its simplicity, but much to our regret, it wasn’t a viable option. We would’ve faced the same challenges that led Cloudflare to create their own implementation from scratch. Unfortunately, Cloudflare have abandoned their implementation, and whilst a fork has emerged recently, it’s too soon to tell if it will be reliable enough for production.
Future improvements
HTTP/3 support
We should switch to MASQUE once mainstream web servers add HTTP/3 support to their reverse proxies.
Post-quantum cryptography
We should migrate to post-quantum cryptography algorithms (e.g. PQXDH).
Cross-site request mimicking
For example, mimicking the retrieval of an analytics script (e.g. Google Analytics) used by the main website.
MASQUE migration
MASQUE’s powerful proxying capabilities could solve the performance issues with TCP. However, this requires reverse proxies to support HTTP/3 on both their client-facing and upstream interfaces. We’d also need to ensure our HTTP/3 traffic wouldn’t stand out from regular web traffic.
Alternative tunnelling methods
Future versions of the protocol may support additional tunnelling methods. By baking tunnelling into the protocol, the gateway can automatically support any type of tunnelling method that the client uses, whilst preserving the E2E encryption between the client and the gateway. Alternative methods could include:
- Video calls, preferably using an E2E encrypted platform like Zoom. The client would use a virtual webcam (see pyvirtualcam) to send packets, and a video decoder for the incoming packets. The tunnel would decode the video stream from the client and send the packets to the gateway, and encode the packets from the gateway and send them to the client.
- Email (SMTP and IMAP). Once configured with the credentials, the client and the tunnel would exchange packets via email with no user intervention. Packets would be batched for efficiency.
Naturally, the methods above would be used sparingly to avoid detection and because of their extremely low throughput. It should also be considered whether the method would breach the terms of the underlying service.
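As a rough sketch of how packet batching over email could work, the example below frames packets with a length prefix and carries the batch as a binary message body. The framing format, function names, addresses and subject line are all hypothetical; nothing here reflects a decided design, and it does not attempt to make the messages look innocuous.

```python
import struct
from email import message_from_bytes, policy
from email.message import EmailMessage


def batch_packets(packets):
    """Concatenate packets, each prefixed with its length as a 2-byte
    big-endian integer (so each packet is limited to 65,535 bytes)."""
    return b"".join(struct.pack("!H", len(p)) + p for p in packets)


def unbatch_packets(blob):
    """Reverse `batch_packets`."""
    packets, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack_from("!H", blob, offset)
        offset += 2
        packets.append(blob[offset:offset + length])
        offset += length
    return packets


def build_email(packets, sender, recipient):
    """Wrap a batch of packets in an email with a base64-encoded binary body."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Batch"  # Placeholder subject
    message.set_content(
        batch_packets(packets), maintype="application", subtype="octet-stream"
    )
    return message


def extract_packets(raw_email):
    """Parse a serialised email and recover the batched packets."""
    message = message_from_bytes(raw_email, policy=policy.default)
    return unbatch_packets(message.get_content())


# Round trip with hypothetical packet payloads.
packets = [b"\x45\x00abc", b"", b"\xff" * 1500]
raw = bytes(build_email(packets, "client@example.org", "tunnel@example.org"))
print(extract_packets(raw) == packets)
```

The client would submit the serialised message over SMTP and the tunnel would retrieve it over IMAP, or vice versa. The payload would already be encrypted end-to-end between the client and the gateway, as described above.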
Questions or feedback
We welcome any questions or feedback about this technical overview. Please join our community!
Changelog
We only list significant changes here.
- 25th November 2024: Initial version published, and feedback requested.
- 3rd December 2024:
- Removed the shield’s dependency on Cloudflare since VPN traffic violates their terms of service, even if clients don’t connect to Cloudflare servers.
- Acknowledged performance issues.