Technical overview of Awala VPN

This is a high-level, technical overview of the protocol behind Awala VPN. It’s aimed at security auditors, human rights organisations, and the Relaycorp team. We assume the reader has a good command of cryptography and networking (e.g. TLS, Deep Packet Inspection).

This is not a protocol specification. We intend to formalise the protocol after launch.

Introduction

The Awala VPN protocol is specifically designed to circumvent the world’s most sophisticated censorship system, the Great Firewall of China (GFW), and gradually target the equivalent in other countries. Circumventing censorship safely is the only goal of this project; we are not trying to bypass geo-restrictions from streaming services or match the degree of anonymity of Tor.

The protocol circumvents censorship by making the traffic look like innocuous web browsing. The idea of using websites as fronts for proxy or VPN traffic has been around for a few years, and has been studied academically under the name HTTPT. Tor’s WebTunnel bridge and many other projects use WebSockets, but the emerging WebTransport API and MASQUE are also promising candidates.

We started this work because existing circumvention services use obfuscation strategies that are vulnerable to detection by the GFW in at least one of these areas:

The protocol aims to mitigate the obfuscation-related weaknesses above, whilst being easy to use, and ensuring a degree of privacy and security adequate for at-risk users (e.g. journalists, activists).

This work is based on an early proof of concept of the protocol, which was successfully tested in China.

Architecture

Awala VPN architecture

Awala VPN adopts a multi-hop architecture with exactly three middleboxes between the VPN client and the target server: a tunnel, the gateway shield and a gateway, each described below.

The client uses end-to-end (E2E) encrypted WebSocket messages to communicate with the gateway shield, preventing tunnel operators from viewing or tampering with the traffic (see client-shield protocol). Additionally, the IP packets are E2E encrypted between the client and the gateway to minimise the exposure of such data, since the shield doesn’t need it (following the principle of least privilege; see client-gateway protocol). Consequently, IP packets are E2E encrypted twice in an onion routing fashion: initially between the client and the gateway, and the resulting ciphertext is then encrypted between the client and the shield.

To minimise latency, the servers in the middlebox trio must be geographically close: in the same country (for smaller countries like Singapore), in neighbouring countries (like Japan and South Korea), or within the same region of a large country (like the US West Coast). For legal and security reasons, no middleboxes will be located in any jurisdiction likely to be hostile towards Awala (see eligible jurisdictions).

The overall network is controlled by the orchestrator.

Client

The client is the Awala app that the user installs on their device. The app’s role as a private gateway in the Awala network is orthogonal to the VPN protocol, and therefore outside the scope of this document.

To mitigate TLS client fingerprinting, the client will use BoringSSL, via a library like rquest, to mimic Google Chrome during the TLS handshake. Google Chrome is not only the most popular browser in the world, but also the most popular browser in China, accounting for roughly half of the market share in 2023.

Tunnel

A tunnel is a website that acts as a mere reverse proxy for the shield at a random path. For example, the website https://example.com could host a tunnel under https://example.com/<random-path> with the following configuration, assuming the website is running Nginx:

location /<random-path> {
    proxy_pass https://vpn.awala.app/tunnel;

    # Enable WebSockets
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    access_log off;
}

This is all that’s needed to configure a tunnel.

Most tunnels will eventually be run by Relaycorp partners, who must meet strict requirements to protect the privacy of Awala VPN users (e.g. no logging of client IP addresses). Relaycorp will run tunnels to bootstrap the pool of tunnels at launch, and to mitigate the risk of a sybil attack until the pool is sufficiently large.

Note that the tunnel’s web server can’t be identified as a tunnel through TLS fingerprinting, because it runs third-party software (e.g. Nginx).

Shield

The gateway shield, or simply shield, is a globally-distributed server powered by Cloudflare Workers to minimise latency with the tunnels. The shield is responsible for routing VPN traffic to a gateway, protecting the gateway from DDoS attacks, enforcing usage quotas, and resolving DNS records, amongst other things.

The shield is implemented as a standalone server, instead of being part of the gateway, so that it can run as a Despacito proxy on the edge, and thus mitigate DDoS attacks and enforce usage quotas more effectively. A VPN server like the Awala VPN gateway uses long-running processes, stable IP addresses and elevated networking privileges, which aren’t available in a serverless environment like Cloudflare Workers, so the gateway has to be deployed to a more traditional server.

The shield provides several services to the client: allocation of gateway, DNS resolution, and brokering the connection with the gateway. As the middleware between the client and the gateway, the shield has two interfaces:

To learn more about this server and inspect its source code, visit the GitHub project AwalaApp/vpn-shield.

Gateway

A gateway is effectively a VPN server, as it’s responsible for relaying the packets between the VPN client and the target server. This is the component that handles Network Address Translation (NAT).

The WebSockets server in the gateway, which is used to communicate with the shield, isn’t accessible from the public Internet. Instead, the connection between the gateway and the shield will be done via a Cloudflare Tunnel to avoid unauthorised access, and drastically reduce the DDoS attack surface to volumetric attacks only.

To learn more about this server and inspect its source code, visit the GitHub project AwalaApp/vpn-gateway.

Orchestrator

The orchestrator is a centralised app that directs all the clients and middleboxes, without playing an active role in the transmission of packets, so it can’t be deemed a middlebox.

Its primary responsibilities are to shard the network, and manage the pool of tunnels.

The orchestrator won’t be accessible from the public Internet. Instead, it will only be accessible from the subscription management system (to issue registration codes and allocate shards), and the shield (to issue Despacito client certificates).

Connection protocols

Client-shield protocol

The client and the shield will communicate over an E2E encrypted channel using Diffie-Hellman (DH) with X25519 keys. The client should trigger the rotation of both key pairs every 60 minutes or when the client starts. The shield should refuse requests with key pairs older than 60 minutes.

To prevent malicious tunnels from replaying requests or responses, each HTTP request should include a monotonically increasing sequence number, which the shield will verify before processing the request and include in the subsequent response. The client should verify the sequence number in the response matches the expected sequence number. This requirement also applies to the HTTP requests to bootstrap WebSockets connections. The sequence number is tied to the client’s key pair, and will be reset to zero upon key rotation.
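A minimal Python sketch of the shield-side check described above. The class and method names are hypothetical. The sketch also assumes that any number greater than the last accepted one is valid, as the text doesn't say whether gaps in the sequence are tolerated.

```python
class SequenceVerifier:
    """Tracks the last sequence number accepted for one client key pair."""

    def __init__(self) -> None:
        self.last_seen = -1  # Nothing accepted yet; the first valid number is 0

    def accept(self, sequence_number: int) -> bool:
        """Return True and record the number if it is newer than any seen so far."""
        if sequence_number <= self.last_seen:
            return False  # Replayed or stale request
        self.last_seen = sequence_number
        return True

    def reset(self) -> None:
        """Start over after key rotation, as the sequence is tied to the key pair."""
        self.last_seen = -1
```

The client would apply the mirror-image check to the sequence number echoed in each response.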

To mitigate traffic analysis, HTTP requests and responses will be padded to a random size.
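One way to do such padding is sketched below in Python. The 4-byte length prefix, the MAX_PADDING bound and the function names are assumptions made for illustration; the text doesn't specify the padding format or how the random size is chosen. The sketch also assumes the padding is applied to the plaintext before encryption, so that the length prefix isn't visible to tunnels.

```python
import secrets
import struct

MAX_PADDING = 1024  # Hypothetical upper bound on the padding length, in bytes


def pad(payload: bytes) -> bytes:
    """Prefix the payload with its length and append random-length padding."""
    padding = secrets.token_bytes(secrets.randbelow(MAX_PADDING + 1))
    return struct.pack(">I", len(payload)) + payload + padding


def unpad(padded: bytes) -> bytes:
    """Recover the payload by reading the 4-byte big-endian length prefix."""
    if len(padded) < 4:
        raise ValueError("Padded message is too short")
    (length,) = struct.unpack(">I", padded[:4])
    if length > len(padded) - 4:
        raise ValueError("Length prefix exceeds message size")
    return padded[4 : 4 + length]
```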

Every request from the client is to be signed with the client’s Despacito certificate, except for the first request in the client registration.

Web browsing simulation

Before opening a WebSockets connection to relay IP packets, the client will make an initial request via the tunnel that simulates the retrieval of a web page, mimicking the behaviour of Google Chrome for obfuscation purposes. Once a response is received, the client will emulate the retrieval of assets like images and JS files, and then open the WebSockets connection.

The pattern exhibited by the connection should be stable by tunnel, but different across tunnels. See the web browsing simulation operations on the shield app to learn more.

WebSockets connection

After simulating the retrieval of a webpage and its assets, and rotating the key pair if necessary, the client will open a WebSockets connection to relay IP packets. This will require opening a new TLS connection with an ALPN of http/1.1 if the previous connection used a different HTTP version, since the WebSockets handshake used here relies on the HTTP/1.1 Upgrade mechanism.

Both the client and the shield will maintain their encryption keys for the duration of the connection, even if a key rotation takes place before the connection is closed.

To prevent malicious tunnels from replaying WebSockets messages (within the same connection or across different connections), the plaintext of each encrypted message must include the HTTP request sequence number and a monotonically increasing sequence number. The client and the shield will maintain separate monotonically increasing sequences for the messages they send. Upon successful decryption, the recipient must verify both values, rejecting any message with a different HTTP request sequence number, or a monotonically increasing sequence number less than or equal to the last processed message.
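The verification above can be sketched in Python as follows, under the assumption that both values are available as integers once the message has been decrypted. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InboundMessageGuard:
    """Replay check for messages received on one WebSockets connection."""

    http_sequence_number: int  # Sequence number of the HTTP request that opened the connection
    last_message_number: int = -1  # Nothing processed yet

    def accept(self, http_sequence_number: int, message_number: int) -> bool:
        """Validate the two values found in a successfully decrypted message."""
        if http_sequence_number != self.http_sequence_number:
            return False  # Message was produced for a different connection
        if message_number <= self.last_message_number:
            return False  # Replayed or out-of-order message
        self.last_message_number = message_number
        return True
```

Since the client and the shield keep separate sequences, each side would hold one such guard for the messages it receives.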

The WebSocket connection will also carry control messages, besides the E2E encrypted IP packets. For example, the shield may periodically notify the client about how much data its subscription has left.

To mitigate traffic analysis on the WebSockets connection, the client and the shield will obfuscate it by:

Key rotation

The key rotation process involves the client initiating a request encrypted with the shield’s current DH public key. This request contains the client’s newly generated DH public key in its payload. Upon receiving this request, the shield generates and returns its own new DH public key, completing the bilateral key rotation. Each shield’s key pair is strictly bound to a single client.

To tolerate delays in the communication, both the client and the shield will keep their old keys to decrypt incoming messages for up to 60 seconds after a rotation.
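The grace period could be modelled as below. This is only a sketch: keys are treated as opaque bytes rather than real X25519 keys, the class name is hypothetical, and the caller is assumed to try each candidate key in turn when decrypting.

```python
import time
from typing import List, Optional

GRACE_PERIOD_SECONDS = 60  # From the text: old keys remain usable for up to 60 seconds


class DecryptionKeyRing:
    """Holds the current private key and, briefly, the one it replaced."""

    def __init__(self, current_key: bytes) -> None:
        self.current_key = current_key
        self.previous_key: Optional[bytes] = None
        self.rotated_at: Optional[float] = None

    def rotate(self, new_key: bytes, now: Optional[float] = None) -> None:
        """Install a new key, retaining the old one for the grace period."""
        self.previous_key = self.current_key
        self.current_key = new_key
        self.rotated_at = time.monotonic() if now is None else now

    def candidate_keys(self, now: Optional[float] = None) -> List[bytes]:
        """Return the keys that may be used to decrypt an incoming message."""
        now = time.monotonic() if now is None else now
        keys = [self.current_key]
        if self.previous_key is not None and now - self.rotated_at <= GRACE_PERIOD_SECONDS:
            keys.append(self.previous_key)
        return keys
```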

DNS resolution

The client will make an HTTP request to the shield for each DNS record that needs to be resolved.

Shield-gateway protocol

Whenever a client starts a packet relay connection, the shield will start the shield-gateway protocol by selecting the gateway that will take over the packet relay. This is done by reusing an existing session between a client and a gateway, or allocating a new gateway to the client.

Once the gateway is selected, the shield will relay E2E encrypted IP packets bidirectionally between the client and the gateway, until either the client or the gateway closes the connection.

Client-gateway protocol

Once the shield establishes a connection between the client and the gateway, the gateway will start the client-gateway protocol by sending the client its IPv4 and IPv6 addresses (e.g. 10.0.102.0/30, fd00:1234::2:0/127). The client will be ready to send E2E encrypted IP packets once it receives its IP addresses.

The session between a client and its gateway does not end when the connection via the shield is closed. Instead, the session persists until it has been idle for at least 5 minutes. This enables the client to resume the session from a different tunnel at any time.
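A rough Python sketch of this session lifetime rule follows. The registry, its method names and the use of string identifiers for clients and gateways are assumptions for illustration; the text doesn't say where this state lives.

```python
from typing import Dict, Optional, Tuple

IDLE_TIMEOUT_SECONDS = 5 * 60  # From the text: sessions end after 5 minutes without use


class SessionRegistry:
    """Maps each client to its gateway while the session remains active."""

    def __init__(self) -> None:
        # Client id -> (gateway id, time of last use)
        self._sessions: Dict[str, Tuple[str, float]] = {}

    def touch(self, client_id: str, gateway_id: str, now: float) -> None:
        """Record that the session was used at the given time."""
        self._sessions[client_id] = (gateway_id, now)

    def resume(self, client_id: str, now: float) -> Optional[str]:
        """Return the client's gateway if the session is still alive, else None."""
        session = self._sessions.get(client_id)
        if session is None:
            return None
        gateway_id, last_used = session
        if now - last_used >= IDLE_TIMEOUT_SECONDS:
            del self._sessions[client_id]  # Expired: a new gateway must be allocated
            return None
        self._sessions[client_id] = (gateway_id, now)
        return gateway_id
```

A `None` result corresponds to the case where a new gateway has to be allocated to the client.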

Since this is the actual VPN protocol, it’s important to consider why we didn’t just obfuscate an existing VPN protocol.

Subscription model

Each client must have an active subscription, and multiple clients can share the same subscription.

Subscriptions are designed for privacy: they’re ephemeral (non-renewable) and can be purchased anonymously, depending on the payment method. This protects users against data breaches and targeted law enforcement requests.

Each subscription is tied to a specific country (e.g. China) for performance reasons, so it can be assigned to a shard with nearby tunnels.

We do not offer a free tier, as it could be abused by an adversary for enumeration purposes. We may, however, give away subscriptions at our discretion.

Sharding

To mitigate enumeration attacks on the tunnel pool, each subscription and each tunnel is assigned to a shard, and each shard has multiple tunnels and multiple subscriptions. Any client can use any tunnel in its subscription’s shard, but it won’t learn about the existence of tunnels in other shards. Consequently, an adversary that purchases a legitimate subscription will only be able to enumerate the tunnels in its shard, and therefore won’t be able to correlate the traffic of different users.

When a tunnel is blocked by the GFW or equivalent, the entire shard will be deemed compromised in that country, and two smaller shards will be created to replace it. Each new shard will inherit half of the subscriptions of the original shard, and it will have a new pool of tunnels. If a second-generation shard is also compromised, no more shards will be created, and users will be required to purchase a new subscription to continue using the service.
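The replacement rule above can be sketched in Python as follows. How subscriptions are divided between the two new shards (here, by list position) and how the new tunnel pools are chosen aren't specified in the text, so both are assumptions, as are all the names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_GENERATION = 2  # From the text: second-generation shards are not replaced


@dataclass
class Shard:
    subscriptions: List[str]
    tunnels: List[str]
    generation: int = 1


def replace_compromised_shard(
    shard: Shard, new_tunnel_pools: Tuple[List[str], List[str]]
) -> Optional[Tuple[Shard, Shard]]:
    """Split a compromised shard in two, or return None if it can't be replaced."""
    if shard.generation >= MAX_GENERATION:
        return None  # Affected users need to purchase a new subscription
    middle = len(shard.subscriptions) // 2
    first_pool, second_pool = new_tunnel_pools
    return (
        Shard(shard.subscriptions[:middle], first_pool, shard.generation + 1),
        Shard(shard.subscriptions[middle:], second_pool, shard.generation + 1),
    )
```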

User onboarding

App distribution

Unfortunately, we expect most users to have to side-load the app, as app stores tend to block VPN apps at the request of the authorities. Consequently, we’ll distribute the app in two ways:

  1. By linking to the latest GitHub release from the download page, which will require users to use another VPN (or an existing Awala VPN subscription) to download the app when GitHub is blocked.
  2. Through single-use download URLs served by our tunnels, in case GitHub rate limits our downloads. Each URL will expire after 24 hours or after the first successful download.
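The expiry rules for the single-use URLs could be enforced along these lines. This is a sketch under assumptions: the token format, the in-memory storage and the names are hypothetical. It also consumes the token when the download is requested, whereas the text ties expiry to the first successful download.

```python
import secrets
from typing import Dict

TOKEN_TTL_SECONDS = 24 * 60 * 60  # From the text: URLs expire after 24 hours


class DownloadTokens:
    """Issues and redeems single-use download tokens."""

    def __init__(self) -> None:
        self._expiry_by_token: Dict[str, float] = {}

    def issue(self, now: float) -> str:
        """Create an unguessable token to embed in the download URL."""
        token = secrets.token_urlsafe(32)
        self._expiry_by_token[token] = now + TOKEN_TTL_SECONDS
        return token

    def redeem(self, token: str, now: float) -> bool:
        """Consume the token; True only on its first use before expiry."""
        expiry = self._expiry_by_token.pop(token, None)
        return expiry is not None and now < expiry
```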

Subscription purchase

To purchase a subscription, users must access the subscription management system through the Awala website or app. If the purchaser is in a censored region, they must use an existing VPN to access the system, or make the purchase from outside the censored region. Following the purchase, the orchestrator will issue a single-use client registration code, and a signature of said code, so that the client can register itself.

If the purchase is made within the Awala app, using a VPN or outside a censored region, the registration will be handled automatically.

If the purchase is made from the website, the orchestrator will produce the following ClientRegistration ASN.1 structure, and encapsulate it in a VeraId SignatureBundle signed by [email protected]:

ClientRegistration ::= SEQUENCE {
    registrationCode [0] OCTET STRING,
    shieldPublicKey  [1] ShieldPublicKey,
    tunnelURL        [2] VisibleString
}

ShieldPublicKey ::= SEQUENCE {
    key        [0] SubjectPublicKeyInfo, -- From the X.509 specification
    identifier [1] OCTET STRING
}

Where,

The resulting VeraId SignatureBundle will span around 4 KiB, due to the overhead of the MemberIdBundle (for [email protected]), which is embedded in the SignatureBundle.

To facilitate the distribution of this information to the client, the person who made the purchase will receive this information as a deep link in the form https://awala.app/vc#<version><registrationParameters>, where <version> is currently the constant 1, and <registrationParameters> is the Base64-encoded DER representation of the VeraId SignatureBundle. The deep link should open the Awala app on the receiving device if installed. Note that the parameters are encoded in the URL fragment to avoid sending them to the server if the URL is opened in a web browser.
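As an illustration, the deep link format could be produced and parsed as follows in Python. The function names are hypothetical, and the sketch assumes the standard Base64 alphabet and a single-character version, neither of which is confirmed by the text.

```python
import base64
from urllib.parse import urlsplit

DEEP_LINK_PREFIX = "https://awala.app/vc#"
VERSION = "1"


def build_deep_link(signature_bundle_der: bytes) -> str:
    """Encode a DER-serialised SignatureBundle as a deep link."""
    parameters = base64.b64encode(signature_bundle_der).decode("ascii")
    return f"{DEEP_LINK_PREFIX}{VERSION}{parameters}"


def parse_deep_link(url: str) -> bytes:
    """Extract the DER-serialised SignatureBundle from a deep link."""
    parts = urlsplit(url)
    if (parts.scheme, parts.netloc, parts.path) != ("https", "awala.app", "/vc"):
        raise ValueError("Not an Awala VPN deep link")
    if not parts.fragment.startswith(VERSION):
        raise ValueError("Unsupported deep link version")
    # Everything after the version character is the Base64 payload
    return base64.b64decode(parts.fragment[len(VERSION):], validate=True)
```

The output of `parse_deep_link` would still have to be verified as a VeraId SignatureBundle before any of its contents are trusted.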

In cases where the size of the URL is problematic, or the sharing of any URL is undesirable, the user will get the option to download the parameters as an Awala VPN configuration file with the extension .avpn. This file is analogous to an OpenVPN® .ovpn configuration file.

Note that we use VeraId to establish secure initial communication between the client and shield without relying on public keys hard-coded in the client. This enables the client to encrypt sensitive information with the correct shield public key from the very first interaction, whilst maintaining key rotation flexibility.

Client registration

The client will be able to register itself with the shield, and claim its Despacito certificate, using the following protocol spanning two HTTP requests (to mitigate replay attacks from malicious tunnels):

  1. Client: Send RegistrationRequest message:

    1. Generate a nonce.
    2. Generate a DH key pair.
    3. Create a message containing the nonce, the registration code, the client’s DH public key, and the identifier of the client’s DH public key (provided by the orchestrator).
    4. Encrypt the message with the shield’s DH public key (provided by the orchestrator).
    5. Send the ciphertext to the shield, along with the identifier of the shield’s public key.
  2. Shield: Process RegistrationRequest and send RegistrationConfirmationRequest:

    1. Look up the DH key pair by the provided identifier, or abort if not found.
    2. Decrypt the ciphertext with the specified DH private key, or abort if this fails.
    3. Verify the registration code is valid and unused, or abort if invalid/used. (Do not yet expire the registration code.)
    4. Generate a nonce.
    5. Store the nonce, and the respective registration code, in the database with a 5-minute expiry.
    6. Create a message containing the client’s nonce and the shield’s nonce.
    7. Encrypt the message with the client’s DH public key.
    8. Send the encrypted message to the client.
  3. Client: Process RegistrationConfirmationRequest and send RegistrationConfirmation:

    1. Decrypt the ciphertext with the client’s DH private key, or abort if this fails.
    2. Verify the client’s nonce sent by the shield, and abort if it doesn’t match.
    3. Generate a Despacito key pair.
    4. Create a message containing the shield’s nonce, the registration code, and the Despacito public key.
    5. Sign the message with the Despacito private key.
    6. Encrypt the signed message with the shield’s DH public key.
    7. Send the encrypted message to the shield.
  4. Shield: Process RegistrationConfirmation and send RegistrationComplete:

    1. Decrypt the ciphertext with the shield’s DH private key, or abort if this fails.
    2. Verify the digital signature of the decrypted message, or abort if invalid.
    3. Verify the shield’s nonce, and abort if it doesn’t match.
    4. Delete the shield’s nonce from the database.
    5. Mark the registration code as used.
    6. Issue Despacito certificate using the client’s provided public key.
    7. Create a message containing the Despacito certificate.
    8. Encrypt the message with the client’s DH public key.
    9. Send the encrypted message to the client.
  5. Client: Process RegistrationComplete:

    1. Decrypt the ciphertext with the client’s DH private key, or abort if this fails.
    2. Verify the Despacito certificate contains the public key originally sent, or abort if they don’t match.
    3. Store the Despacito certificate.

    Note that if the client fails to receive the RegistrationComplete message, the process will have to be started from scratch with a new registration code.
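The shield-side bookkeeping in steps 2 and 4 can be sketched as below. Encryption, signature verification and certificate issuance are deliberately left out, and the class, the 16-byte nonce and the in-memory storage are assumptions made for illustration.

```python
import secrets
from typing import Dict, Set, Tuple

NONCE_TTL_SECONDS = 5 * 60  # From the text: pending nonces expire after 5 minutes


class RegistrationLedger:
    """Shield-side state for the two-request registration handshake."""

    def __init__(self, valid_codes: Set[str]) -> None:
        self._unused_codes = set(valid_codes)
        # Shield nonce -> (registration code, expiry time)
        self._pending: Dict[bytes, Tuple[str, float]] = {}

    def begin(self, registration_code: str, now: float) -> bytes:
        """Handle RegistrationRequest: check the code and issue a shield nonce."""
        if registration_code not in self._unused_codes:
            raise ValueError("Invalid or used registration code")
        nonce = secrets.token_bytes(16)
        self._pending[nonce] = (registration_code, now + NONCE_TTL_SECONDS)
        return nonce  # The code is deliberately not marked as used yet

    def confirm(self, shield_nonce: bytes, registration_code: str, now: float) -> None:
        """Handle RegistrationConfirmation: consume the nonce and the code."""
        code, expiry = self._pending.pop(shield_nonce, (None, 0.0))
        if code != registration_code or now >= expiry:
            raise ValueError("Unknown, mismatched or expired nonce")
        if registration_code not in self._unused_codes:
            raise ValueError("Registration code already used")
        self._unused_codes.remove(registration_code)
```

Deferring the expiry of the registration code until `confirm` mirrors the protocol: a request replayed by a malicious tunnel can obtain a fresh nonce, but it can't complete the registration without the client's keys.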

To register an additional client, an authenticated client must request a VeraId-signed ClientRegistration message from the orchestrator. This request may be refused if the subscription has already registered the maximum number of clients.

VeraId usage

VeraId operations must be done in the context of the service 1.3.6.1.4.1.58708.3.0, defined as the awala_vpn_service Object Identifier (OID) below:

awala_vpn_service OBJECT IDENTIFIER ::= { iso(1) identified-organization(3)
    dod(6) internet(1) private(4) enterprise(1)
    relaycorp(58708) awala_vpn(3) veraid_service(0) }

Additionally, the maximum time-to-live for a VeraId SignatureBundle is 24 hours.
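For reference, the service OID above can be DER-encoded with a few lines of Python. The function below is a generic sketch of the standard OBJECT IDENTIFIER encoding (short-form lengths only), not code taken from VeraId or Awala.

```python
def encode_oid(dotted: str) -> bytes:
    """DER-encode an OBJECT IDENTIFIER given in dotted-decimal notation."""
    arcs = [int(arc) for arc in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("An OID needs at least two arcs")
    # The first two arcs share a single subidentifier: 40 * first + second
    subidentifiers = [40 * arcs[0] + arcs[1]] + arcs[2:]
    content = bytearray()
    for value in subidentifiers:
        # Base-128 encoding, most significant group first; every byte except
        # the last one of each subidentifier has its high bit set
        groups = [value & 0x7F]
        value >>= 7
        while value:
            groups.append((value & 0x7F) | 0x80)
            value >>= 7
        content.extend(reversed(groups))
    if len(content) > 127:
        raise ValueError("Long-form lengths are not handled in this sketch")
    return bytes([0x06, len(content)]) + bytes(content)


print(encode_oid("1.3.6.1.4.1.58708.3.0").hex())  # 060a2b0601040183ca540300
```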

Threat model

(This document covers the threat model of Awala VPN, omitting any considerations for the physical and cyber security of Relaycorp as a company.)

We’re going after nation-state actors, and they will come for us. To stave off attacks as a tiny team, we must offload as much as possible to well-resourced providers, minimise what we know about our stakeholders (e.g. users, partners), and minimise the level of trust required from us and our partners.

Adversaries

For the purposes of this threat model, we are splitting our nation-state adversaries into national firewall operators (e.g. the Cyberspace Administration of China), and state-backed Advanced Persistent Threats (APTs; e.g. Flax Typhoon).

Adversary | Goals | Capabilities | Resources
National firewall operator | Block service | Traffic analysis, active probing, connection blocking | Historical traffic data, government agencies
State-backed APT | Disrupt service, damage reputation | Infrastructure infiltration, social engineering, operating tunnels, running modified VPN clients with valid subscriptions | Booter services, significant funding, government agencies
Financially-motivated cybercriminal | Ransom, sell tunnelling metadata | Infrastructure infiltration, social engineering, operating tunnels, running modified VPN clients with valid subscriptions | Booter services
Government agencies in operating jurisdictions | Surveillance | Legal authority to request data, advanced surveillance infrastructure | Legal frameworks, intelligence agencies

Attack vectors

Long-running connection disruption

All VPNs use long-running connections that can be blocked or terminated. In our case, censors may interfere with long-running connections to websites.

Tunnel identification via traffic analysis

An adversary may identify tunnels by analysing the mismatch between a website’s apparent purpose/content and its traffic patterns.

DDoS attacks against shield

Tampered client distribution

An adversary may distribute a tampered version of the client that looks like the legitimate one but is actually malicious.

DDoS attacks against orchestrator

Tunnel enumeration via data breach

An adversary may try to enumerate all tunnels via a data breach of the shield or the orchestrator.

VPN client detection via eavesdropping app

An adversary may distribute an eavesdropping app that can detect the presence of a VPN client, such as China’s National Anti-Fraud Center app.

Shard-level traffic correlation via tunnel analysis

Shards might be identified if their websites represent a large fraction of the traffic from a few client IP addresses.

DDoS attacks against gateways

Tunnel operator compromise

Tunnel enumeration via compromised clients

An adversary may modify the open source client and pay for a legitimate subscription to discover the tunnels in its shard.

DDoS attacks against tunnels

The censor may launch DDoS attacks against the tunnels as a punitive measure rather than as a means of blocking, since they could simply block the website used as a tunnel.

Targeted surveillance requests

Mass surveillance requests

User identification via data breach

An adversary may try to identify users via a data breach of the subscription database or our payment provider.

Ethical considerations

Collateral damage

The holy grail of Internet censorship circumvention is to make it prohibitively expensive for the censor to block the circumvention service. In practice, the best we can do is to make it very expensive, which effectively means daring the censor to block the circumvention service at the expense of some collateral damage.

If we succeed in making Awala VPN connections truly indistinguishable from regular Web browsing, we may expect the following collateral damage:

Affordability

Offering this VPN as a premium service is at odds with Awala’s goal of “providing all human beings with uncensored and timely communication”, no matter how much we charge. Unfortunately, charging for the service is an essential barrier to mitigate enumeration attacks on the tunnel pool; we may be able to reduce the cost of the service over time thanks to better economies of scale and/or subsidies, but we may never be able to offer a free tier.

Having said that, we will ensure that even non-paying users benefit from our work by:

Environmental impact

All proxy-based circumvention services have an environmental impact, as they require additional server-side resources, client-side processing, and bandwidth for the proxying functionality alone. In our case, we also need additional computing resources to mitigate traffic analysis in the client-shield connection (e.g. noise frames, padding).

Unfortunately, our ability to offset the environmental impact of the service is limited:

We are considering some form of carbon offsetting that wouldn’t be mere greenwashing. Suggestions are most welcome!

Alternatives considered

Peer-to-peer tunnelling

We decided against peer-to-peer (P2P) tunnelling, like Snowflake, for the following reasons:

Obfuscating existing VPN protocols

In principle, we’re only interested in the tunnelling aspect. The underlying VPN protocol, whether it’s OpenVPN® or WireGuard®, should be irrelevant. In practice, however, the current implementations of OpenVPN® and WireGuard® would prove exceptionally problematic in production.

Neither OpenVPN® nor WireGuard® servers natively support client-side interfaces based on WebSockets, so we would’ve had to implement and/or integrate another middleware (e.g. wstunnel) to bridge the two. This would’ve added significant complexity and costs in production, and reduced performance.

If we were to do any kind of advanced integration with the VPN protocol, we would’ve only considered WireGuard®, given its simplicity, but much to our regret, it wasn’t a viable option. We would’ve faced the same challenges that led Cloudflare to create their own implementation from scratch. Unfortunately, Cloudflare have abandoned their implementation, and whilst a fork has emerged recently, it’s too soon to tell if it will be reliable enough for production.

Future improvements

HTTP/3 support

We should switch to MASQUE once mainstream web servers add HTTP/3 support to their reverse proxies.

Post-quantum cryptography

We should migrate to post-quantum cryptography algorithms (e.g. PQXDH).

Cross-site request mimicking

The client could also mimic the cross-site requests that the tunnel’s website would trigger; for example, the retrieval of an analytics script (e.g. Google Analytics) used by the main website.

Alternative tunnelling methods

Future versions of the protocol may support additional tunnelling methods. By baking tunnelling into the protocol, the gateway can automatically support any type of tunnelling method that the client uses, whilst preserving the E2E encryption between the client and the gateway. Alternative methods could include:

Naturally, the methods above would be used sparingly to avoid detection and because of their extremely low throughput. It should also be considered whether the method would breach the terms of the underlying service.

Questions or feedback

We welcome any questions or feedback about this technical overview. Please join our community!