M Baas

I am an E&E engineering PhD student at Stellenbosch University. I post about deep learning, electronics, and other things I find interesting.

14 February 2022

Introduction to Bitcoin address formats

by Matthew Baas

An abridged taxonomy of the major bitcoin address formats and versions.

TL;DR: bitcoin (BTC) has been through several versions, and will doubtless go through more in the future. Different major bitcoin versions have different address formats. This post aims to document the BTC address types encountered in common use as of Feb 2022, and is targeted at those who have a minimal understanding of bitcoin.

I will be assuming you have a basic idea of what bitcoin is, and just about nothing else :). For those already well experienced in BTC and its technical components, this might be of less use to you.

Update 2022-03-20: changed name of wrapped segwit to script hash addresses (BIP-13). This is done to better indicate that this address format can encode the hash of any arbitrary script, not just wrapping segregated witness.

1. Bitcoin overview

Before the taxonomy, a brief reminder of the key parts of bitcoin is appropriate.

The Bitcoin (BTC or just btc) blockchain is fundamentally a record of transactions between btc addresses. By looking at all transactions involving a given BTC address, we can determine the balance of that address. This means that an amount of bitcoin is owned by a BTC address on the BTC blockchain.

However, a single ‘wallet’ in common software wallets these days can correspond to many different addresses. The total value of that wallet is the sum of balances of all the addresses contained in that wallet (just like how in real life a single wallet can have multiple cards in it).
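As a rough sketch of the idea above, here is how one could tally address and wallet balances from a list of transactions. The transaction layout, the address names, and the function names are all made up for illustration. Real bitcoin nodes track unspent outputs rather than simple (address, amount) pairs, and fees are ignored here.

```python
from collections import defaultdict

# Hypothetical, simplified transactions. Each one lists (address, amount in
# satoshis) pairs that are spent (inputs) and paid out (outputs).
transactions = [
    {"inputs": [], "outputs": [("addr_A", 50_000)]},
    {"inputs": [("addr_A", 50_000)],
     "outputs": [("addr_B", 30_000), ("addr_A", 20_000)]},  # 20_000 is change
    {"inputs": [], "outputs": [("addr_C", 10_000)]},
]


def address_balances(txs):
    """Replay every transaction and keep a running balance per address."""
    balances = defaultdict(int)
    for tx in txs:
        for address, amount in tx["inputs"]:
            balances[address] -= amount
        for address, amount in tx["outputs"]:
            balances[address] += amount
    return dict(balances)


def wallet_balance(txs, wallet_addresses):
    """A wallet's value is the sum of the balances of its addresses."""
    balances = address_balances(txs)
    return sum(balances.get(address, 0) for address in wallet_addresses)


balances = address_balances(transactions)
wallet_total = wallet_balance(transactions, {"addr_A", "addr_C"})
print(balances)
print(wallet_total)
```

Amounts are kept as integer satoshis to avoid floating point rounding.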

Changes to bitcoin: BIPs

The original version of the bitcoin protocol and software released over a decade ago has undergone significant upgrades and bug fixes. Upgrades or improvements to bitcoin are formally proposed as Bitcoin Improvement Proposals (BIPs). They have a formal format, a formal lifecycle, and are typically reviewed many times by multiple people before miners consider adopting the BIP.

For example, BIP-32 is an upgrade for introducing a feature called Hierarchical Deterministic Wallets. Some BIPs are minor bugfixes and usability improvements, while others are fairly large updates which introduce swathes of new functionality. Each BIP is only ‘active’ (i.e. in effect on the main BTC blockchain) if the majority of miners agree to run software that implements that BIP. As of Jan 2022 there are 43 BIPs which are in effect – the ‘final’ state of an accepted BIP.

Unique to BTC: all BIPs should be backwards compatible. That is to say, the very first bitcoins should still be spendable using the original methods, and the main functionality that worked in previous versions must still work in the latest version.

Other cryptocurrencies often do not have this feature: failing to update wallet software to the latest version of the cryptocurrency’s protocol can render your funds unusable. Bitcoin's design guideline has its benefits and drawbacks. You can always be confident that you can spend your funds, even if it is with your own hand-written wallet software from a decade ago. However, this comes at the cost of major version bloat, where all future BIPs and wallet software code must have special hooks and workarounds to make sure both the latest and all previous versions work as intended.

The nature of BTC improvements

The major BIPs have changed the BTC blockchain: it still contains a list of transactions, but the information that can be included in a transaction has expanded substantially.

Transactions can have many receiving and sending addresses within them, along with other metadata. They can also require approval from the owners of multiple BTC addresses, and support various other functionality, e.g. only being spendable after a certain amount of time.

However, since all BIPs are backwards compatible, new addresses created using software implementing newer BIPs have different forms to let BTC nodes and miners know that the address and its format corresponds to the newer BTC protocol version. This is necessary so that old addresses are not handled as if they support newer features – part of ensuring backwards compatibility.

2. BTC address taxonomy

Note: the example addresses used below are just examples I grabbed off the internet. They are not mine and I don’t know where they come from. DO NOT SEND ANY FUNDS TO THESE EXAMPLE ADDRESSES.

Here is a list of the types of addresses you will commonly see while using bitcoin:

| Address version | Example | Description | Payment type |
|---|---|---|---|
| Legacy | 15e15hWo6CShMgbAfo8c2Ykj4C6BLq6Not | Oldest bitcoin version. Always start with a 1. | P2PKH |
| Script hash addresses (BIP-13) | 35PBEaofpUeH8VnnNSorM1QZsadrZoQp4N | 2nd major address version. Always start with a 3. | P2SH |
| Native Segwit | bc1q42lja79elem0anu8q8s3h2n687re9jax556pcc | 3rd major address version. Always start with bc1q. Current standard. | P2WPKH |
| Lightning Network | lnbc2500u1pvjluezsp5zyg3zyg3zyg3zyg3zyg3zyg3zyg3zyg3zyg3zyg3zyg3zyg3zygspp5qqqsyqcyq5rqwzqfqqqsyqcyq5rqwzqfqqqsyqcyq5rqwzqfqypqdq5xysxxatsyp3k7enxv4jsxqzpu9qrsgquk0rl77nj30yxdy8j9vdx85fkpmdla2087ne0xh8nhedh8w27kyke0lp53ut353s06fv3qfegext0eh0ymjpf39tuven09sam30g4vgpfna3rh | BTC’s 2nd layer off-chain payment protocol. Always start with lnbc. | LN |
| Taproot (segwit v1) | bc1pmzfrwwndsqmk5yh69yjr5lfgfg4ev8c0tsc06e | 4th major address version. Always start with bc1p. Upcoming standard. | P2TR |
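The prefix rules in the table are simple enough to sketch in code. The function name and the returned labels below are my own, and the length check for telling P2WPKH from P2WSH assumes mainnet segwit v0 addresses (42 characters for a 20-byte key hash, 62 for a 32-byte script hash). This only inspects the prefix and length; it does not verify checksums, so it cannot tell you whether an address is actually valid.

```python
def classify_address(address: str) -> str:
    """Guess the address version from its prefix (no checksum validation)."""
    lowered = address.lower()
    if lowered.startswith("lnbc"):
        return "Lightning Network (LN)"
    if lowered.startswith("bc1q"):
        # Assumption: 62 characters means a 32-byte script hash (P2WSH),
        # anything else is treated as a 20-byte key hash (P2WPKH).
        if len(address) == 62:
            return "Native Segwit (P2WSH)"
        return "Native Segwit (P2WPKH)"
    if lowered.startswith("bc1p"):
        return "Taproot (P2TR)"
    if address.startswith("1"):
        return "Legacy (P2PKH)"
    if address.startswith("3"):
        return "Script hash (P2SH)"
    return "Unknown"


examples = [
    "15e15hWo6CShMgbAfo8c2Ykj4C6BLq6Not",
    "35PBEaofpUeH8VnnNSorM1QZsadrZoQp4N",
    "bc1q42lja79elem0anu8q8s3h2n687re9jax556pcc",
    "bc1pmzfrwwndsqmk5yh69yjr5lfgfg4ev8c0tsc06e",
]
for example in examples:
    print(example, "->", classify_address(example))
```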

I will now give a brief overview of the different types of wallets associated with each major address version, except for lightning network, since that is not on the main btc blockchain and I don’t know enough about it. Also, what follows is my current best understanding of each address technology, and may not be fully correct from here onwards. For the best information on them, consult the source BIPs on the bitcoin github, and the bitcoin node software.

2.1 Legacy

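The legacy address is made from a pair of (private key, public key). The address is simply a hash of the public key (SHA-256 followed by RIPEMD-160), encoded as text together with a version prefix and a checksum (Base58Check). The private key is not part of the hash; it is only needed later to spend. The result is something like 15e15hWo6CShMgbAfo8c2Ykj4C6BLq6Not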

This is why legacy payments are also referred to as Pay-to-Public-Key-Hash (P2PKH), as you are literally paying to a hash of the public key of the target wallet.

You can spend from the address so long as you can prove (using cryptography) that you have the private key corresponding to the address (hashed public key).
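To make the "hash of the public key" part a bit more concrete, here is a minimal sketch of the final encoding step for a legacy address. It starts from the 20-byte public key hash (assumed to be precomputed as RIPEMD-160 of SHA-256 of the public key, since RIPEMD-160 is not available in every Python build) and applies Base58Check with the mainnet P2PKH version byte 0x00. The function names are my own.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: int, payload: bytes) -> str:
    """Prefix a version byte, append a 4-byte checksum, and encode in base 58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded

    # Each leading zero byte is written as a leading '1'. This is why the
    # 0x00 version byte makes legacy addresses start with a 1.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def p2pkh_address(pubkey_hash: bytes) -> str:
    """Encode a 20-byte public key hash as a mainnet legacy address."""
    if len(pubkey_hash) != 20:
        raise ValueError("expected a 20-byte public key hash")
    return base58check_encode(0x00, pubkey_hash)


# Example 20-byte public key hash, written as hex.
pubkey_hash = bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18")
address = p2pkh_address(pubkey_hash)
print(address)
```

The checksum is what lets wallet software reject most mistyped addresses before any funds are sent.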

2.2 Script hash addresses (aka wrapped segwit)

Script hash addresses (defined in BIP-13), sometimes known as wrapped segwit addresses, are made, very roughly speaking, from a tuple (private key, public key, script). The address is the hash of a script that involves certain spending conditions.

Such spending conditions can be simple: e.g. showing the private key associated with this public key allows you to spend this bitcoin.

Or they can be complex: e.g. showing the private key associated with this public key allows you to spend this bitcoin after 27 days if you also reveal a predetermined secret number.

The script of these conditions is then hashed to obtain the address (the private key is not used in the hash), e.g. 35PBEaofpUeH8VnnNSorM1QZsadrZoQp4N . This is why script hash addresses (aka wrapped segwit) are known as Pay-to-Script-Hash (P2SH). To spend from an address you must have the private key, reveal the script, and satisfy the requirements of the script.
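The commit-then-reveal idea behind P2SH can be sketched with a toy model. To be clear, this is not Bitcoin Script: the `ToyScript` class, its text serialization, the plain SHA-256 commitment, and the `signature_ok` flag standing in for real signature verification are all simplifications I made up for illustration. Real P2SH hashes the serialized script with SHA-256 followed by RIPEMD-160. The example mirrors the '27 days plus a secret number' condition above.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyScript:
    """Hypothetical stand-in for a script with a time lock and a secret."""
    pubkey_hex: str
    min_age_days: int
    secret_hash_hex: str  # SHA-256 of the predetermined secret number

    def serialize(self) -> bytes:
        return f"{self.pubkey_hex}|{self.min_age_days}|{self.secret_hash_hex}".encode()


def script_hash(script: ToyScript) -> str:
    # Toy commitment only; real P2SH uses RIPEMD160(SHA256(script)).
    return hashlib.sha256(script.serialize()).hexdigest()


def can_spend(committed_hash: str, revealed_script: ToyScript,
              age_days: int, secret: int, signature_ok: bool) -> bool:
    """The spender reveals the script, then has to satisfy its conditions."""
    if script_hash(revealed_script) != committed_hash:
        return False  # the revealed script is not the one the address commits to
    secret_hash = hashlib.sha256(str(secret).encode()).hexdigest()
    return (signature_ok
            and age_days >= revealed_script.min_age_days
            and secret_hash == revealed_script.secret_hash_hex)


script = ToyScript(
    pubkey_hex="02" + "ab" * 32,  # placeholder bytes, not a real public key
    min_age_days=27,
    secret_hash_hex=hashlib.sha256(b"424242").hexdigest(),
)
commitment = script_hash(script)  # plays the role of the address

print(can_spend(commitment, script, age_days=30, secret=424242, signature_ok=True))
print(can_spend(commitment, script, age_days=10, secret=424242, signature_ok=True))
```

The point is that the address only commits to the hash of the script, so the conditions themselves stay hidden until the funds are spent.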

2.3 Native Segwit

Wallets in this version are defined, again very roughly speaking, by a tuple (seed phrase, pass phrase, tree structure, script). To get an address, we essentially derive a key pair from the seed phrase, pass phrase and a particular path within the tree structure, and then hash the resulting public key, providing us with a hashed public key to send BTC to. When transactions are broadcast to the blockchain, the data that proves you are allowed to spend (signatures and, if one is used, the script itself) is included in a separate part of the transaction called the ‘witness’. Spending from any segwit or newer address requires providing a witness that satisfies the conditions committed to by the address.

Hence, we call it Pay-to-Witness-Public-Key-Hash (P2WPKH): the address encodes a hash of the public key, and the proof needed to spend from it lives in the witness. If a script is used (e.g. for multi-sig wallets), the address instead encodes a hash of the script, and this is known as Pay-to-Witness-Script-Hash (P2WSH). The seed and pass phrase in the tuple above can also instead be specified by an extended public and extended private key, and internally the seed and pass phrase are used to generate the extended public and private keys in wallet software.

For example, given a seed phrase, a pass phrase and a path in the tree structure, the derived address is the output of some series of cryptographic functions that takes these items as input, yielding an address like bc1q42lja79elem0anu8q8s3h2n687re9jax556pcc
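The last of those cryptographic functions, turning a hash into the bc1q... text form, is the Bech32 encoding from BIP-173. Below is a minimal sketch of it for segwit v0 addresses. It assumes the witness program (the 20-byte public key hash for P2WPKH, or the 32-byte script hash for P2WSH) has already been computed, and the function names are my own. Taproot addresses use a slightly different checksum constant (Bech32m), which is not covered here.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    """Bech32 checksum polynomial from BIP-173."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def create_checksum(hrp, data):
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    return [(pm >> (5 * (5 - i))) & 31 for i in range(6)]


def to_5bit(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, zero-padding the final group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def segwit_v0_address(witness_program: bytes, hrp: str = "bc") -> str:
    """Encode a 20-byte (P2WPKH) or 32-byte (P2WSH) program as bc1q..."""
    if len(witness_program) not in (20, 32):
        raise ValueError("segwit v0 programs are 20 or 32 bytes long")
    data = [0] + to_5bit(witness_program)  # witness version 0 encodes as 'q'
    combined = data + create_checksum(hrp, data)
    return hrp + "1" + "".join(CHARSET[d] for d in combined)


# Public key hash taken from the BIP-173 example vectors.
pubkey_hash = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
print(segwit_v0_address(pubkey_hash))
```

The "bc" prefix marks mainnet, the "1" is just a separator, and the "q" that follows is the witness version, which is why all native segwit v0 addresses start with bc1q.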

2.4 Taproot

With taproot, a now released but not yet widely used version of the BTC protocol, addresses can be formulated in significantly more ways. Concretely, like native segwit, a wallet can consist of a seed phrase and a pass phrase. These are used to generate an extended public and private key, which are used to derive the addresses at arbitrary paths in a hierarchical deterministic wallet.

However, now with taproot, there is one more thing that can be added to generate an address: taptweaks. A taptweak, fundamentally a natural number, is added at an intermediary step to the native segwit tuple to yield a new public key and thus address. Arbitrary bitcoin scripts can then be encoded into a taptweak and thus into an address. This, combined with the script and metadata added to the ‘witness’ part of any transaction that spends from the address, provides the necessary functionality for various new taproot features.

These taptweaks have some special mathematical properties that allow for various interesting functionality, such as having a binary tree of different scripts committed to the same address, allowing one to spend from that address if they can satisfy a script at some path in the tree. The Schnorr signatures that arrived with taproot also enable key aggregation schemes such as musig, which allow multi-sig wallets to be constructed as what is essentially a single combined public key, thereby making multi-sig wallets indistinguishable from regular wallets on the blockchain.
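To show what 'adding a number to a public key' means, here is a small pure-Python sketch of the tweak on the secp256k1 curve. The tagged hash with the "TapTweak" tag follows my reading of BIP-341, but several things are simplified and should be treated as assumptions: the private key and the script commitment are made-up placeholder values, the tweak is reduced modulo the group order instead of being rejected when too large, and the even-y key normalization that real taproot requires is skipped entirely.

```python
import hashlib

# secp256k1 curve parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    y = (lam * (a[0] - x) - a[1]) % P
    return (x, y)


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def tagged_hash(tag: str, msg: bytes) -> bytes:
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()


def taptweak(internal_key, script_commitment: bytes) -> int:
    """Turn (internal key, script commitment) into a number (simplified)."""
    x_bytes = internal_key[0].to_bytes(32, "big")
    digest = tagged_hash("TapTweak", x_bytes + script_commitment)
    return int.from_bytes(digest, "big") % N


# Placeholder values for illustration only.
private_key = int.from_bytes(hashlib.sha256(b"hypothetical private key").digest(), "big") % N
script_commitment = hashlib.sha256(b"hypothetical script tree root").digest()

internal_key = point_mul(private_key, G)
tweak = taptweak(internal_key, script_commitment)
output_key = point_add(internal_key, point_mul(tweak, G))  # Q = P + t*G

# Whoever knows the private key can still sign for the tweaked key:
print(output_key == point_mul((private_key + tweak) % N, G))
```

The address then encodes the tweaked key. From the outside it looks like any other public key, yet it commits to the scripts that went into the tweak.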

Example

Let us look at a single example of a transaction I found from browsing blockchain.com. Below is an example of an actual address clearly used by some whale or exchange:

[Image: blockchain address example]

We can see that the address – from its format – is a native segwit (segwit v0) address using a non-trivial script in the witness (P2WSH). Recalling that BTC is fundamentally owned by an address and not a wallet, we can also observe the whale nature of the address: tallying all its transactions yields the final balance belonging to this address at over 2930 BTC!

Next, let's look at a transaction it is in:

[Image: blockchain address example]

In this example transaction, the segwit address above is sending funds to six output BTC addresses from various versions. I have highlighted the version of each address with colors as used in the table earlier for clarity. Such a transaction highlights how backwards compatible and interoperable BTC is – a single transaction can involve inputs and outputs from differing versions of the BTC protocol, all without problem.

Summary

I hope you found this post valuable, and as always, if you spot things I am mistaken on, please get in contact with me via the About page. I will continue to update the list above if/when new major BTC versions and address formats are released.

And while there are many other BTC-related address formats (e.g. addresses for bitcoin cash, BSV…), here I restricted focus to only the address types you will see in common use on the main bitcoin blockchain. I have also avoided going into detail on how the addresses are constructed and the different sub-parts of addresses. If you are interested I recommend digging further, as it is a rather abyssal rabbit hole.

Thanks for reading!

tags: bitcoin - cryptocurrency - summary