A Peek Into BlueSky's AT Protocol Helped Me Understand Why It Needs to Exist

by Laszlo Fazekas, January 11th, 2025

Too Long; Didn't Read

Discover BlueSky, the federated alternative to Twitter, powered by the innovative AT Protocol. Learn about its decentralized structure, repository system, DID integration, and unique features that prioritize user freedom and authenticity. Explore how BlueSky could shape the future of social media.


In 2022, Elon Musk acquired Twitter, which has since been rebranded as X. This part of the story is widely known. However, fewer people know that a project to decentralize the platform was taking shape within Twitter. This project, named BlueSky, was launched in 2019. By 2021, it was spun off into the Bluesky Social Benefit Corporation, allowing it to continue independently from Twitter. In some ways, BlueSky can be considered just as much a successor to Twitter as X.


X is a heavily for-profit service, offering paid verification and premium features. In contrast, BlueSky is a completely open system built on open protocols. Like the Fediverse, it consists of a multitude of independent nodes.


Since the BlueSky team was not satisfied with the ActivityPub standard, they developed their own protocol, called the AT Protocol.


The core element of the protocol is the repository, which is similar to a database. This is where posts, likes, and all other data are stored. Every user (or any other entity) has a repository. The repository contains collections (similar to database tables), and the collections contain records in a key-value format. Like IPFS, the repository addresses data by content: each record has a CID, which is a content-based hash. Whenever a user modifies anything in the repository, a new commit hash is generated from the data (similar to a Git commit). Even a single-bit change results in a different commit hash. The owner of the repository digitally signs this commit hash after every change, thereby authenticating the whole repository.
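The idea of a commit hash that changes with every edit can be illustrated with a deliberately simplified sketch. It is only a toy model: it hashes canonical JSON with SHA-256, whereas the real protocol uses its own binary encoding, CIDs, and a tree structure, and it leaves out the signature entirely. The function names `record_hash` and `commit_hash` are made up for this illustration.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Content hash of one record (a toy stand-in for a CID)."""
    # Canonical JSON so that key order does not affect the hash.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def commit_hash(repo: dict) -> str:
    """Hash over every collection/key/record-hash entry in the repository."""
    h = hashlib.sha256()
    for collection in sorted(repo):
        for key in sorted(repo[collection]):
            entry = f"{collection}/{key}:{record_hash(repo[collection][key])}\n"
            h.update(entry.encode("utf-8"))
    return h.hexdigest()


# A repository with one collection holding one record.
repo = {"app.bsky.feed.post": {"3le3esbhaek2l": {"text": "hello"}}}
before = commit_hash(repo)

# Changing a single character in one record changes the commit hash.
repo["app.bsky.feed.post"]["3le3esbhaek2l"]["text"] = "hello!"
after = commit_hash(repo)

print(before != after)  # True
```

In the real system the owner would then sign the new commit hash, so anyone holding the public key can check that the repository is authentic.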


The advantage of this solution is that the entire repository or its parts can be freely transferred between systems while maintaining the ability for any system to easily verify the authenticity of the data.


Users can host their repository on a Personal Data Server (PDS) of their choice. In this sense, the PDS functions like a database server. Through the PDS, users can modify their repository and make it accessible to others. Beyond this, the PDS provides a wide range of additional services. It allows access to other users' data, retrieves feeds, and more. Essentially, the PDS serves as a fully functional social media node, enabling users to connect to the network.


Since the PDS is merely the system that runs the repository, users have the freedom to move their repository between different PDSs or even operate their own PDS. This flexibility is what provides the system with its freedom.


The content of the repositories stored on PDSs is monitored by feed generators, which create feeds based on specific criteria. On other social media platforms (e.g., Facebook), there is usually only one feed, but in the case of BlueSky, users are free to choose which feed they want to see posts from. If a user feels that a feed generator is not showing posts relevant to their interests, is censoring posts they want to see, or is attempting to manipulate them, they can simply switch to a different feed provider.
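To make the role of a feed generator more concrete, here is a toy sketch of one selection rule: keep posts that contain a keyword and return them newest first. As far as I can tell, real feed generators answer with a "skeleton" that lists only post URIs and leave it to another service to fill in the details; the sketch imitates that shape. The function `keyword_feed`, the sample posts, and the `did:example:...` identifiers are invented for illustration.

```python
def keyword_feed(posts: list, keyword: str, limit: int = 10) -> dict:
    """Return a feed skeleton: URIs of matching posts, newest first."""
    matches = [p for p in posts if keyword.lower() in p["text"].lower()]
    # ISO 8601 timestamps in the same format sort correctly as strings.
    matches.sort(key=lambda p: p["createdAt"], reverse=True)
    return {"feed": [{"post": p["uri"]} for p in matches[:limit]]}


# Hypothetical sample data; these are not real posts.
posts = [
    {"uri": "at://did:example:alice/app.bsky.feed.post/1",
     "text": "Ethereum metadata service", "createdAt": "2024-12-01T10:00:00.000Z"},
    {"uri": "at://did:example:bob/app.bsky.feed.post/2",
     "text": "Cooking pasta tonight", "createdAt": "2024-12-02T10:00:00.000Z"},
    {"uri": "at://did:example:carol/app.bsky.feed.post/3",
     "text": "New ethereum tooling", "createdAt": "2024-12-03T10:00:00.000Z"},
]

skeleton = keyword_feed(posts, "ethereum")
print([item["post"] for item in skeleton["feed"]])
```

Swapping feeds then simply means asking a different generator, with a different rule, for its list of URIs.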


This loose network of systems and the ease of transferring repositories provide complete freedom without sacrificing efficiency. For example, the client doesn't need to gather posts from multiple sources, as this is handled by the feed generator and the PDS.


Now that we understand the system's structure and main components, let's dive a bit deeper and see how the protocol works.


Every user (and every other entity) is assigned a unique decentralized identifier (DID). This DID is associated with the repository and the key pair used by the user to sign repository commits. Since DIDs are difficult to remember, users are identified by domain names, which the system translates into DIDs.


For example, my username is thebojda.bsky.social. The DID can be linked to this domain name in two ways: either by publishing it in a DNS TXT record for the domain under the appropriate key, or simply through a well-known URI. With the second method, the DID can be accessed at this URL:


https://thebojda.bsky.social/.well-known/atproto-did


My DID is: did:plc:4x7rynvskplz54p5pofj3jxa
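The two lookup methods can be sketched in a few lines of Python. The well-known URL pattern comes straight from the example above. The `_atproto.` prefix for the TXT record name is my understanding of the convention, not something stated in this article, so treat it as an assumption. The helper names are made up, and the network call is wrapped in a function so nothing is fetched until you call it.

```python
import urllib.request


def well_known_url(handle: str) -> str:
    """URL where a handle publishes its DID over HTTPS."""
    return f"https://{handle}/.well-known/atproto-did"


def dns_txt_name(handle: str) -> str:
    """DNS name whose TXT record is expected to carry the DID.

    Assumption: the record lives under the "_atproto" subdomain.
    """
    return f"_atproto.{handle}"


def resolve_handle_https(handle: str, timeout: float = 10.0) -> str:
    """Fetch the DID via the well-known URI (performs a network request)."""
    with urllib.request.urlopen(well_known_url(handle), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


print(well_known_url("thebojda.bsky.social"))
print(dns_txt_name("thebojda.bsky.social"))
```

Calling `resolve_handle_https("thebojda.bsky.social")` should return the DID shown above, provided the handle still points to the same account.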


The AT Protocol supports both plc and web types of DIDs. A web DID is a simple URL, while the plc DID is a custom standard of the AT Protocol, generated from the public key and some additional data (those interested in a deeper dive can read more about it here).


Each DID is associated with a DID document, which contains the public key linked to the DID and the URL of the PDS where the repository is hosted.


A DID can be resolved like this:


https://plc.directory/did:plc:4x7rynvskplz54p5pofj3jxa


The DID document looks like this:


{
    "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/multikey/v1",
        "https://w3id.org/security/suites/secp256k1-2019/v1"
    ],
    "id": "did:plc:4x7rynvskplz54p5pofj3jxa",
    "alsoKnownAs": [
        "at://thebojda.bsky.social"
    ],
    "verificationMethod": [
        {
            "id": "did:plc:4x7rynvskplz54p5pofj3jxa#atproto",
            "type": "Multikey",
            "controller": "did:plc:4x7rynvskplz54p5pofj3jxa",
            "publicKeyMultibase": "zQ3shaNKzE66K1Kr3dmbnDwXWHh6v4nUcBmpEaK7bVktKTwfh"
        }
    ],
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://fibercap.us-west.host.bsky.network"
        }
    ]
}


In this document, the PDS URL must be updated if someone moves to a new PDS. For example, my current PDS is accessible at https://fibercap.us-west.host.bsky.network.
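Reading the PDS address and the signing key out of a DID document takes only a short helper. The sketch below works on a trimmed copy of the document shown above; the function names `pds_endpoint` and `signing_key` are my own, and matching on the "#atproto" suffix and the "AtprotoPersonalDataServer" type simply mirrors what appears in that document.

```python
from typing import Optional


def pds_endpoint(did_doc: dict) -> Optional[str]:
    """Return the URL of the PDS that hosts the repository, if listed."""
    for service in did_doc.get("service", []):
        if service.get("type") == "AtprotoPersonalDataServer":
            return service.get("serviceEndpoint")
    return None


def signing_key(did_doc: dict) -> Optional[str]:
    """Return the multibase-encoded public key used to sign commits."""
    for method in did_doc.get("verificationMethod", []):
        if method.get("id", "").endswith("#atproto"):
            return method.get("publicKeyMultibase")
    return None


# Trimmed copy of the DID document from the article.
did_doc = {
    "id": "did:plc:4x7rynvskplz54p5pofj3jxa",
    "verificationMethod": [
        {
            "id": "did:plc:4x7rynvskplz54p5pofj3jxa#atproto",
            "type": "Multikey",
            "publicKeyMultibase": "zQ3shaNKzE66K1Kr3dmbnDwXWHh6v4nUcBmpEaK7bVktKTwfh",
        }
    ],
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://fibercap.us-west.host.bsky.network",
        }
    ],
}

print(pds_endpoint(did_doc))
print(signing_key(did_doc))
```

A client that always looks up the endpoint this way keeps working after the user moves to another PDS, because only the DID document has to change.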


Communication with the PDS is done through XRPC, a simple HTTP/JSON-based protocol. Each call has a reverse DNS-style name. For example, if I want to fetch my entire repository, I can do it like this:


https://fibercap.us-west.host.bsky.network/xrpc/com.atproto.sync.getRepo?did=did:plc:4x7rynvskplz54p5pofj3jxa


The com.atproto.sync.getRepo method is used to query the repository, and it has a did parameter.
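Because XRPC calls are plain HTTP requests, a tiny helper is enough to build them. The sketch below assembles the URL from a host, a method name, and query parameters, and offers a second function that performs the GET request. The names `xrpc_url` and `xrpc_get` are hypothetical. The second function returns raw bytes because, as far as I know, getRepo answers with a binary archive rather than JSON.

```python
import urllib.request
from urllib.parse import urlencode


def xrpc_url(host: str, method: str, **params) -> str:
    """Build an XRPC query URL: <host>/xrpc/<method>?<params>."""
    # Keep ":" unescaped so DIDs stay readable in the URL.
    query = urlencode(params, safe=":")
    return f"{host}/xrpc/{method}" + (f"?{query}" if query else "")


def xrpc_get(host: str, method: str, timeout: float = 30.0, **params) -> bytes:
    """Perform the call and return the raw response body (network request)."""
    with urllib.request.urlopen(xrpc_url(host, method, **params), timeout=timeout) as resp:
        return resp.read()


url = xrpc_url(
    "https://fibercap.us-west.host.bsky.network",
    "com.atproto.sync.getRepo",
    did="did:plc:4x7rynvskplz54p5pofj3jxa",
)
print(url)
```

The printed URL is identical to the one shown above; passing the same arguments to `xrpc_get` would download the repository.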


The BlueSky team developed a JSON-based descriptive language called Lexicon to define the API and data structures. It is similar to JSON Schema and can be used to generate type-safe interfaces, for example, for TypeScript, which simplifies the implementation of the protocol.


The following call can be used to fetch my last 10 posts:


https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed?actor=thebojda.bsky.social&limit=10


The result looks like this:


{
    "feed": [
        {
            "post": {
                "uri": "at://did:plc:4x7rynvskplz54p5pofj3jxa/app.bsky.feed.post/3le3esbhaek2l",
                "cid": "bafyreihagjnwrkakkajkaighz6kyqww3wznvbdcqpnl4wbfnpwrr2xwmmi",
                "author": {
                    "did": "did:plc:4x7rynvskplz54p5pofj3jxa",
                    "handle": "thebojda.bsky.social",
                    "displayName": "Laszlo Fazekas",
                    "avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:4x7rynvskplz54p5pofj3jxa/bafkreibxd4mx7rehgkc77diautavpcota6jotzkaagp4zv2t6w3a52n7sq@jpeg",
                    "labels": [],
                    "createdAt": "2024-11-24T01:36:39.146Z"
                },
                "record": {
                    "$type": "app.bsky.feed.post",
                    "createdAt": "2024-12-24T21:20:56.891Z",
                    "embed": {
                        "$type": "app.bsky.embed.external",
                        "external": {
                            "description": "MyETHMeta is a decentralized metadata service for Ethereum accounts. It is something like Gravatar. There are no backend servers, you fully own your data.",
                            "thumb": {
                                "$type": "blob",
                                "ref": {
                                    "$link": "bafkreifoezhkdtesuhkaoqt3yml44geozu6ckg2cfffg5t5omob2bdjmo4"
                                },
                                "mimeType": "image/jpeg",
                                "size": 669185
                            },
                            "title": "MyETHMeta v2 – Some Improvements on the Gravatar for Your Ethereum Account | HackerNoon",
                            "uri": "https://hackernoon.com/myethmeta-v2-some-improvements-on-the-gravatar-for-your-ethereum-account"
                        }
                    },
                    "facets": [
                        {
                            "features": [
                                {
                                    "$type": "app.bsky.richtext.facet#link",
                                    "uri": "https://hackernoon.com/myethmeta-v2-some-improvements-on-the-gravatar-for-your-ethereum-account"
                                }
                            ],
                            "index": {
                                "byteEnd": 270,
                                "byteStart": 240
                            }
                        }
                    ],
                    "langs": [
                        "en"
                    ],
                    "reply": {
                        "parent": {
                            "cid": "bafyreieipk3kgwonq3h62wyadauplzgcpdgcg6pxp2776oeldcstgybwha",
                            "uri": "at://did:plc:4x7rynvskplz54p5pofj3jxa/app.bsky.feed.post/3le3es7kzmc2l"
                        },
                        "root": {
                            "cid": "bafyreieipk3kgwonq3h62wyadauplzgcpdgcg6pxp2776oeldcstgybwha",
                            "uri": "at://did:plc:4x7rynvskplz54p5pofj3jxa/app.bsky.feed.post/3le3es7kzmc2l"
                        }
                    },
                    "text": "MyETHMeta is a decentralized metadata service for Ethereum accounts. It is something like Gravatar, but here the metadata and your profile picture is assigned to your Ethereum address. There are no backend servers, you fully own your data. hackernoon.com/myethmeta-v2..."
                },
                "embed": {
                    "$type": "app.bsky.embed.external#view",
                    "external": {
                        "uri": "https://hackernoon.com/myethmeta-v2-some-improvements-on-the-gravatar-for-your-ethereum-account",
                        "title": "MyETHMeta v2 – Some Improvements on the Gravatar for Your Ethereum Account | HackerNoon",
                        "description": "MyETHMeta is a decentralized metadata service for Ethereum accounts. It is something like Gravatar. There are no backend servers, you fully own your data.",
                        "thumb": "https://cdn.bsky.app/img/feed_thumbnail/plain/did:plc:4x7rynvskplz54p5pofj3jxa/bafkreifoezhkdtesuhkaoqt3yml44geozu6ckg2cfffg5t5omob2bdjmo4@jpeg"
                    }
                },
                "replyCount": 0,
                "repostCount": 0,
                "likeCount": 0,
                "quoteCount": 0,
                "indexedAt": "2024-12-24T21:21:02.466Z",
                "labels": []
            },
            "reply": {}
        },
        {},
        {},
        {},
        {},
        {},
        {},
        {},
        {}
    ]
}
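A response of this shape is easy to walk in code. The following sketch pulls the URI and text of each post out of a heavily trimmed copy of the result above; `summarize_feed` is a name I made up, and the empty objects stand in for the entries that were left out of the listing.

```python
def summarize_feed(response: dict) -> list:
    """Return (uri, text) pairs for every post in a getAuthorFeed response."""
    summary = []
    for item in response.get("feed", []):
        post = item.get("post")
        if not post:
            # Skip the empty placeholder entries.
            continue
        summary.append((post["uri"], post["record"].get("text", "")))
    return summary


# Trimmed copy of the response shown above.
response = {
    "feed": [
        {
            "post": {
                "uri": "at://did:plc:4x7rynvskplz54p5pofj3jxa/app.bsky.feed.post/3le3esbhaek2l",
                "cid": "bafyreihagjnwrkakkajkaighz6kyqww3wznvbdcqpnl4wbfnpwrr2xwmmi",
                "record": {
                    "$type": "app.bsky.feed.post",
                    "text": "MyETHMeta is a decentralized metadata service for Ethereum accounts.",
                },
            },
            "reply": {},
        },
        {},
        {},
    ]
}

for uri, text in summarize_feed(response):
    print(uri, "->", text)
```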


Each element contains a post and, if the post is a reply, the context it replies to. As mentioned earlier, every post is a record in the repository and has its own unique identifier. Each post can be assigned a unique URI, which consists of the DID, the collection name, and the post ID. In the example above, the URI looks like this: at://did:plc:4x7rynvskplz54p5pofj3jxa/app.bsky.feed.post/3le3esbhaek2l


This means the post is located in the repository associated with my DID, within the app.bsky.feed.post collection and its ID is 3le3esbhaek2l.
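Splitting such a URI into its three parts can be sketched as follows. The parser is intentionally minimal and only handles the full three-part form used in this article; `AtUri` and `parse_at_uri` are illustrative names, and "rkey" is just my shorthand for the record's ID.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    did: str         # whose repository the record lives in
    collection: str  # which collection inside that repository
    rkey: str        # the record's ID within the collection


def parse_at_uri(uri: str) -> AtUri:
    """Split at://<did>/<collection>/<record id> into its parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an at:// URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected did/collection/record id in {uri!r}")
    return AtUri(*parts)


parsed = parse_at_uri(
    "at://did:plc:4x7rynvskplz54p5pofj3jxa/app.bsky.feed.post/3le3esbhaek2l"
)
print(parsed.did)         # did:plc:4x7rynvskplz54p5pofj3jxa
print(parsed.collection)  # app.bsky.feed.post
print(parsed.rkey)        # 3le3esbhaek2l
```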


In addition to the URI, the response includes the post's CID, which is a hash generated from the record's content. Because the repository's commit hash is derived from the CIDs of its records, changing any record also changes the commit hash.
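A CID is more than a bare hash: the string also encodes which format the content is in and which hash function was used. The sketch below unpacks the CIDs that appear in this article using only the standard library. It is a minimal decoder under stated assumptions: it handles only base32 strings starting with "b", version 1, single-byte header fields, and SHA-256 digests, and the codec table covers just the two values seen here. The function name `inspect_cid` is my own.

```python
import base64

# Only the two content codecs that appear in this article.
CODECS = {0x55: "raw", 0x71: "dag-cbor"}
SHA2_256 = 0x12


def inspect_cid(cid: str) -> dict:
    """Decode a base32 CIDv1 string into version, codec, and digest."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 CIDs (prefix 'b')")
    body = cid[1:].upper()
    # Python's decoder expects uppercase input padded to a multiple of 8.
    raw = base64.b32decode(body + "=" * (-len(body) % 8))
    version, codec, hash_code, digest_len = raw[0], raw[1], raw[2], raw[3]
    digest = raw[4:]
    if version != 1 or hash_code != SHA2_256 or digest_len != len(digest):
        raise ValueError("unexpected CID layout for this sketch")
    return {
        "version": version,
        "codec": CODECS.get(codec, hex(codec)),
        "digest": digest.hex(),
    }


post_cid = inspect_cid("bafyreihagjnwrkakkajkaighz6kyqww3wznvbdcqpnl4wbfnpwrr2xwmmi")
blob_cid = inspect_cid("bafkreifoezhkdtesuhkaoqt3yml44geozu6ckg2cfffg5t5omob2bdjmo4")
print(post_cid["codec"])  # dag-cbor
print(blob_cid["codec"])  # raw
```

This also explains why record CIDs in the response start with "bafyrei" while image CIDs start with "bafkrei": the prefix reflects the content codec.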


Another noteworthy aspect is the thumb section, which refers to an image associated with the post. This is a blob-type object that does not belong to any collection. The system stores large files (such as images, videos, etc.) as blobs, which can be referenced in individual records (e.g., posts) using their hash (CID).
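The blob references inside a record can be collected with a small recursive walk. The sketch below looks for objects whose "$type" is "blob", as in the thumb section above, and builds a download URL for each one. The URL uses the com.atproto.sync.getBlob method with did and cid parameters; that method is not mentioned in this article, so treat it as my assumption about the API. Both function names are hypothetical.

```python
def find_blob_cids(node) -> list:
    """Recursively collect the CIDs of all blob references in a record."""
    found = []
    if isinstance(node, dict):
        if node.get("$type") == "blob" and "ref" in node:
            found.append(node["ref"]["$link"])
        else:
            for value in node.values():
                found.extend(find_blob_cids(value))
    elif isinstance(node, list):
        for value in node:
            found.extend(find_blob_cids(value))
    return found


def blob_url(pds: str, did: str, cid: str) -> str:
    """URL for downloading a blob from a PDS (assumed getBlob method)."""
    return f"{pds}/xrpc/com.atproto.sync.getBlob?did={did}&cid={cid}"


# Trimmed copy of the record from the response above.
record = {
    "$type": "app.bsky.feed.post",
    "embed": {
        "$type": "app.bsky.embed.external",
        "external": {
            "thumb": {
                "$type": "blob",
                "ref": {"$link": "bafkreifoezhkdtesuhkaoqt3yml44geozu6ckg2cfffg5t5omob2bdjmo4"},
                "mimeType": "image/jpeg",
                "size": 669185,
            },
        },
    },
}

for cid in find_blob_cids(record):
    print(blob_url("https://fibercap.us-west.host.bsky.network",
                   "did:plc:4x7rynvskplz54p5pofj3jxa", cid))
```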


For those interested in a deeper look at the structure of the repository and the records, the go-repo-export tool can be quite useful. With this small program, you can download the entire user repository and extract the collections and their records into a directory in JSON format. This allows you to see exactly how BlueSky stores the data.


Another good source of information is Chrome DevTools. On the https://bsky.app website, you can clearly see the API calls and how the client side communicates with the PDS.


And, of course, there's the official documentation. The AT Protocol and BlueSky have excellent documentation, and on GitHub, you can find examples as well as the source code for both the client and server sides.


When I first read about the AT Protocol, my first question was why we need yet another federated protocol alongside ActivityPub. However, the AT Protocol has features that fully justify its existence. The protocol is, of course, not perfect and will likely undergo significant development. For instance, if we want to build a truly censorship-resistant network, it cannot rely on a centralized domain name system. A blockchain-based naming system, which is genuinely censorship-resistant, would be valuable (I might even write a proposal on this).


BlueSky currently has an estimated 20–30 million users, which pales in comparison to the user base of X (Twitter), but it's still significant. As for what the future holds, no one can say for sure. The federated network is a major advantage, and there are intense debates about whether it’s too much power for one entity to own a global communication platform like Twitter or Facebook. I wouldn’t completely rule out the possibility that BlueSky could one day surpass its competitors. With good feed generators and effective community building, this is achievable. In any case, it’s worth paying attention to this project and understanding how it works.