Due to its pseudonymous nature and decentralized infrastructure, bitcoin has been exploited on darknet marketplaces, which facilitate the trading of a myriad of illegal products and services, including illicit drugs, stolen personal data, weapons, hacking tools, and more. The history of bitcoin transactions is recorded on a public ledger known as the blockchain. Although the real-world identity behind a bitcoin address remains obfuscated, addresses can occasionally be grouped together and linked to their owner(s) by means of behavior patterns and publicly available online information obtained from off-chain sources.
Blockchain-based analysis of common behavioral patterns, including the one-time change and common spending heuristics, is used for bitcoin clustering in the form of unique votes for the association of bitcoin addresses, whereas off-chain data is usually used to confirm the obtained results. A recently published paper presents an analysis of bitcoin transactions and addresses associated with darknet marketplaces and proposes a novel approach that can be used to identify bitcoin addresses used by vendors and buyers on cryptomarkets. The approach is based on address clustering, which is the process of predicting ownership of multiple addresses by a single user. The accuracy of the results was further improved via a voting-based approach that utilizes multiple labels of addresses owned by the same user. Let’s take a look at the method presented in this paper and its effectiveness in identifying bitcoin transactions and addresses associated with illegal activities taking place on darknet marketplaces.
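Before turning to the method itself, the two building blocks named above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation. It assumes that the common spending heuristic treats all input addresses of one transaction as belonging to the same user, and that "voting" means taking the most frequent label among the labeled addresses of a cluster. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


class UnionFind:
    """Disjoint-set structure used to merge addresses into clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_by_common_spending(tx_inputs):
    """Group addresses that appear together as inputs of one transaction.

    tx_inputs is an iterable of lists, one list of input addresses per
    transaction (an assumed, simplified view of the blockchain).
    """
    uf = UnionFind()
    for inputs in tx_inputs:
        if not inputs:  # e.g. coinbase transactions have no input address
            continue
        first = inputs[0]
        uf.find(first)  # register single-input transactions too
        for other in inputs[1:]:
            uf.union(first, other)

    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    return list(clusters.values())


def vote_label(cluster, address_labels):
    """Return the most common label among the labeled addresses of a cluster."""
    votes = Counter(address_labels[a] for a in cluster if a in address_labels)
    return votes.most_common(1)[0][0] if votes else None
```

With this sketch, two transactions spending from ["A", "B"] and ["B", "C"] place A, B, and C in one cluster, and the cluster inherits whichever label most of its labeled addresses carry.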
Approach used to deanonymize darknet users via their bitcoin transactions:
To illustrate the approach, let’s consider Adam, a privacy-savvy Tor user, in the following hypothetical scenario:
1 – Adam has a social network account (e.g. Twitter) and purchases bitcoin using a traditional browser that is also used to access his Twitter account. The purchased bitcoin is saved in bitcoin address A.
2 – Adam uses the Tor browser to access a darknet marketplace and makes a payment to a vendor’s address P to purchase one of the products listed there.
Even though the first step involves non-anonymous browsing, Adam expects that his step 2 actions will be anonymous given that he is using Tor and bitcoin. The A->P bitcoin transaction leaks some data that can be used by an adversary, e.g. Judy, to link Adam to a darknet marketplace as follows:
1 – Judy regularly crawls public.com to check publicly stored user profiles.
2 – Judy regularly crawls Tor hidden services and stores accessible pages.
3 – Judy regularly parses crawled data, looking for bitcoin addresses.
4 – Judy regularly parses the blockchain looking for transactions between users and addresses associated with darknet marketplaces or other onion hidden services.
5 – Judy finds Adam’s bitcoin address A and manages to link it to his Twitter account via data from public.com.
6 – Judy finds the bitcoin address P via data from private.onion.
7 – Judy finds the bitcoin transaction A -> P and thus identifies that Adam spent bitcoin on a darknet marketplace.
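Judy's final steps amount to a join between two label sets over the transaction history. The following Python sketch illustrates the idea under simplifying assumptions: transactions are already parsed into lists of input and output addresses, and the two dictionaries of labels were built from the crawls described above. The `Tx` type, the function name, and the sample values are hypothetical.

```python
from typing import NamedTuple


class Tx(NamedTuple):
    txid: str
    inputs: list   # addresses that funded the transaction
    outputs: list  # addresses that received the funds


def find_links(transactions, public_labels, onion_labels):
    """Return (identity, hidden_service, txid) for every transaction that
    moves funds from a publicly labeled address to an onion-labeled one.

    public_labels maps address -> online identity (e.g. a Twitter handle).
    onion_labels maps address -> hidden service where the address was posted.
    """
    links = []
    for tx in transactions:
        for src in tx.inputs:
            if src not in public_labels:
                continue
            for dst in tx.outputs:
                if dst in onion_labels:
                    links.append((public_labels[src], onion_labels[dst], tx.txid))
    return links


# Toy data mirroring the scenario: Adam's address A pays vendor address P.
transactions = [
    Tx("tx1", ["A"], ["P", "A_change"]),
    Tx("tx2", ["X"], ["Y"]),
]
public_labels = {"A": "@adam"}         # harvested from public.com
onion_labels = {"P": "private.onion"}  # harvested from the hidden service

links = find_links(transactions, public_labels, onion_labels)
```

In practice the lookup would run on address clusters instead of single addresses, so that a payment from any address in Adam's cluster would produce the same link.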
Method of data collection:
The following outlines how the online identities of Tor darknet marketplaces, other onion hidden services, and public bitcoin addresses were collected.
Tor hidden services and marketplaces (voting based approach):
Tor hidden services cannot be indexed by traditional search engines, yet they can be found via indexing services, e.g. Ahmia, which is a Clearnet website. Other darknet search engines can be used but have to be accessed via the Tor browser. These search engines can crawl the landing pages or the websites of a large number of onion domains. Some Tor hidden services publicly post their bitcoin addresses on their landing pages in order to receive payments. These addresses can be harvested by simply downloading these pages and searching them for bitcoin addresses by means of regular expressions, as a bitcoin address is a base-58 encoded identifier of 26–35 alphanumeric characters.
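As an illustration of the harvesting step, the sketch below combines a regular expression with a Base58Check checksum test to discard strings that merely look like addresses. It is an assumption-laden example and not the pattern used in the paper: it only covers legacy addresses starting with 1 or 3, which fits the pre-2016 data discussed here, and the function names are hypothetical.

```python
import hashlib
import re

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Legacy address candidates: a leading 1 or 3 followed by 25-34 Base58
# characters (26-35 in total). Base58 omits 0, O, I, and l.
ADDRESS_RE = re.compile(r"\b[13][1-9A-HJ-NP-Za-km-z]{25,34}\b")


def is_valid_base58check(address):
    """Check the 4-byte double-SHA256 checksum of a legacy address."""
    number = 0
    for char in address:
        index = BASE58_ALPHABET.find(char)
        if index < 0:
            return False
        number = number * 58 + index
    try:
        # 1 version byte + 20-byte hash + 4-byte checksum
        raw = number.to_bytes(25, "big")
    except OverflowError:
        return False
    # Each leading '1' encodes exactly one leading zero byte.
    leading_ones = len(address) - len(address.lstrip("1"))
    leading_zero_bytes = len(raw) - len(raw.lstrip(b"\x00"))
    if leading_ones != leading_zero_bytes:
        return False
    payload, checksum = raw[:-4], raw[-4:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4] == checksum


def extract_addresses(page_text):
    """Return the set of checksum-valid addresses found in a downloaded page."""
    return {m for m in ADDRESS_RE.findall(page_text) if is_valid_base58check(m)}
```

The checksum test matters because a bare regular expression also matches random alphanumeric strings of the right length, such as session tokens or file hashes embedded in a page.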
Identity flexibility is one of the main factors that brings darknet vendors and buyers online, and it is also what makes it feasible to trace them via a voting-based approach. First of all, most darknet users use fake IDs online. These fake IDs can reveal their online activities in one way or another. For example, on Silk Road, vendor Donagal used the aliases “XX” and “Xanax King”, while the bitcoin seller Faiella used the alias “BTCKing”. Furthermore, more than 80% of darknet users use more than two aliases. For instance, the owner of Silk Road, Ross Ulbricht, used the aliases “Dread Pirate Roberts”, “Silk Road”, and “DPR”. It is possible to deanonymize a cryptomarket user by finding links between their multiple aliases. In fact, one of the main mistakes that led to the arrest of Ulbricht, the mastermind behind Silk Road, was that he posted data on his LinkedIn profile about his multimillion dollar darknet marketplace.
To identify long-term bitcoin addresses used by Tor hidden services, the authors of the paper analyzed datasets from mid-2015. The analysis showed that very few Tor hidden services publicly posted their bitcoin addresses on their web pages. The study therefore focused on the period when posting long-term bitcoin addresses was a relatively common practice (2010-2016). Automated and manual searching yielded 105 bitcoin addresses that could be linked to Tor hidden services. Consequently, any bitcoin address that sends funds to one of these 105 addresses indicates that its owner engaged in some form of activity on the Tor network.
On the other hand, bitcoin users occasionally post their addresses on social networks for various reasons, including receiving tips, offering various services, or expressing pride in being part of the community. Online exposure of bitcoin addresses can put their owners at risk of transaction tracing and address clustering. Moreover, some users reveal private personal information along with their bitcoin addresses, which makes it even easier to deanonymize them.
Collection of Twitter data:
Bitcoin addresses as well as the associated online identities of their owners can be harvested via means of crawling and parsing of their profiles or via the social network’s native API. In this study, the addresses and online user identities were collected via Twitter.
The study used Twitter Decahose stream data that had previously been harvested between December 11th, 2013, and December 30th, 2014. Decahose delivers a 10% real-time random sample of all publicly posted tweets via a streaming connection. This dataset was chosen because the goal was to find bitcoin addresses associated with Twitter users, which could be linked to Tor hidden services’ bitcoin addresses. It is worth mentioning that before 2016, it was a relatively common practice for users and Tor hidden services to publicly post their long-term bitcoin addresses. Collectively, data collection yielded 10TB of JSON files that included 5 billion tweets. In addition to its textual content, every tweet includes its author’s public profile information, which often included the user’s bitcoin address. To obtain tweets that include bitcoin addresses, the whole dataset was scanned and the tweets that included relevant data were kept, resulting in 509,173 tweets. Thereafter, another pass was run on the matched tweets to associate them with unique bitcoin addresses. From 509,173 matched tweets, 4,183 unique bitcoin addresses and online identities were identified, where an identity’s address appeared in 165 different tweets.
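The two passes described above can be sketched as follows. This is a hedged illustration and not the study's pipeline: it assumes one JSON-encoded tweet per line, with the tweet body in `text` and the author's profile under `user.screen_name` and `user.description`, and it uses a simple pattern for legacy addresses without checksum validation. The function names are hypothetical.

```python
import json
import re
from collections import Counter, defaultdict

# Legacy address candidates (leading 1 or 3, 26-35 Base58 characters).
ADDRESS_RE = re.compile(r"\b[13][1-9A-HJ-NP-Za-km-z]{25,34}\b")


def matching_tweets(lines):
    """Pass 1: keep only tweets whose text or author profile contains an
    address-like string, yielding (screen_name, addresses) pairs."""
    for line in lines:
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        if not isinstance(tweet, dict):
            continue
        user = tweet.get("user") or {}
        haystack = " ".join([tweet.get("text") or "", user.get("description") or ""])
        found = set(ADDRESS_RE.findall(haystack))
        if found:
            yield user.get("screen_name"), found


def addresses_by_identity(matches):
    """Pass 2: map each unique address to the identities that posted it,
    counting the number of matched tweets per identity."""
    table = defaultdict(Counter)
    for screen_name, addresses in matches:
        for address in addresses:
            table[address][screen_name] += 1
    return dict(table)
```

Because the first pass is a generator, the sketch streams through the input line by line and never holds the full dataset in memory, which matters for a corpus of this size.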
Implications of the results of the study:
This study shows that using bitcoin for transactions on darknet marketplaces, or other Tor hidden services, can leak data that can be used to deanonymize users. This represents a considerable threat to darknet users, since they actively rely on Tor to protect their anonymity when transacting on darknet marketplaces. Deanonymization is possible due to the insufficient retroactive security of bitcoin’s pseudonymity infrastructure. In particular, by analyzing historical blockchain transactions, an adversary can deanonymize users who posted their bitcoin addresses on social networks and associate them with darknet marketplaces and Tor hidden services that posted their addresses on their onion pages.
The experiments conducted in this study successfully associated many Twitter users with various Tor hidden services, including darknet marketplaces. By utilizing information obtained from their public social network profiles, the study showed that the anonymity of darknet users could be compromised. The results of this study have one major implication: bitcoin addresses must be changed frequently, as long-term addresses are the weak link in the transaction process and can be exploited to deanonymize users and link them to illegal activities taking place on the Tor network.