Why criminals can't hide behind Bitcoin

The blockchain doesn't store IP addresses. To obtain the IP address of someone sending or receiving bitcoins, you would have to either observe the activity of the network very carefully or track them down by some other means.

Some of the Bitcoin clients relay the message to the monitoring client in the first time segment. These clients form a subset of the Bitcoin clients to which the monitoring client is connected at the time of the transaction. Only active Bitcoin clients are connected to the network, and not all clients are active at the examined moment. The darker the subset, the higher the probability that a client in that subset is the originator of the transaction.

Before the transaction, no information is known, so the best estimate we can make is that each Bitcoin client has an equal probability of being the originator of the transaction, resulting in a uniform probability distribution among the active clients (left side of Fig 4). After the transaction, each Bitcoin client in the first time segment can be either the real originator of the transaction or a client relaying it via several hops. Furthermore, the real originator can also be among the rest of the network, not connected to our monitoring client.

On the other hand, based on the previous arguments, we presume that clients not relaying the transaction in the first time segment are certainly not the originators of the transaction.

A Bayesian approach to identify Bitcoin users.

Thus, the probability of the first time segment clients increases, while the connected clients not belonging to the first time segment will have zero probability (right side of Fig 4). Still nothing is known about the clients not connected to the monitoring client, so their probabilities do not change. Also, clients belonging to the same subset cannot be distinguished from one another.

Let us calculate the probabilities of being the originator for clients in each set. The Roman font type notations of Fig 4 are used for the sets. C denotes the event that the monitoring client is connected to the originator of the transaction, O denotes the event that the originator relays the message to the monitoring client in the first time segment, and F denotes the event that a randomly chosen client from the first time segment is actually the originator of the transaction.

Using these notations, we have that. If the monitoring client is connected to the originator, the originator is going to inform the monitoring client in the first time segment. At this time, all of the first time segment clients have the same probability of being the originator. Let us apply the law of total probability to P(F). The above formula gives the probability assigned to the first time segment clients.
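The reasoning above can be sketched numerically. The following is a minimal illustration, not the authors' formula: it assumes a uniform prior so that P(C) is the number of connected clients divided by the number of active clients, takes P(O|C) = 1 as stated in the text, and takes P(F|O) as one over the number of first time segment clients. The function name, parameter names and example counts are hypothetical.

```python
import math


def originator_probabilities(n_active, n_connected, n_first):
    """Per-client originator probability for each of the three client sets.

    Assumptions (not taken verbatim from the source):
      P(C)   = n_connected / n_active   (uniform prior over active clients)
      P(O|C) = 1                        (a connected originator relays first)
      P(F|O) = 1 / n_first              (first-segment clients are equally likely)
    """
    if not (0 < n_first <= n_connected <= n_active):
        raise ValueError("expected 0 < n_first <= n_connected <= n_active")
    p_c = n_connected / n_active
    p_o_given_c = 1.0
    p_f_given_o = 1.0 / n_first
    # Law of total probability; the term for "originator not in the
    # first segment" contributes nothing to P(F).
    p_f = p_f_given_o * p_o_given_c * p_c
    return {
        "first_segment": p_f,               # raised above the prior
        "connected_not_first": 0.0,         # ruled out
        "not_connected": 1.0 / n_active,    # unchanged prior
    }


# Hypothetical counts, chosen only for illustration.
probs = originator_probabilities(n_active=6000, n_connected=300, n_first=10)
total = probs["first_segment"] * 10 + probs["not_connected"] * (6000 - 300)
```

Under these assumptions the probabilities over all active clients still sum to one, which is a quick consistency check on the three cases described in the text.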

The connected clients not belonging to the first time segment have zero probability. So far we have considered only one monitoring client. If there are more monitoring clients, the above-mentioned sets are defined separately for each of them, and then the union of the corresponding sets is determined, i. Using this method, monitoring Bitcoin clients do not need to be synchronized in time. If time synchronization among monitoring clients were achieved, we could further limit the F set of first time segment clients to those that broadcast the transaction within t_1 time after any of our monitoring clients first received the transaction.

In our experiments, achieving reliable time synchronization was not possible, so the union of sets was used as described.
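As a rough sketch of the union-of-sets approach, each monitoring client can compute its first time segment against its own clock, and the resulting sets are then merged. The data layout (a dict from peer IP to local receive timestamp), the function names and the window length `t1` are assumptions made for this example only.

```python
def first_segment(observations, t1):
    """Peers that relayed the transaction within t1 seconds of the first relay.

    observations: dict mapping peer IP -> timestamp on THIS monitor's clock.
    Only differences between timestamps of the same monitor are used, so the
    monitors do not need synchronized clocks.
    """
    if not observations:
        return set()
    first_seen = min(observations.values())
    return {ip for ip, ts in observations.items() if ts - first_seen <= t1}


def combine_monitors(per_monitor_obs, per_monitor_connected, t1):
    """Union the first-segment sets and the connected sets over all monitors."""
    first_union, connected_union = set(), set()
    for obs in per_monitor_obs:
        first_union |= first_segment(obs, t1)
    for connected in per_monitor_connected:
        connected_union |= set(connected)
    return first_union, connected_union


# Two hypothetical monitors whose clocks are far apart.
obs_a = {"10.0.0.1": 100.0, "10.0.0.2": 100.4, "10.0.0.3": 103.0}
obs_b = {"10.0.0.3": 5000.2, "10.0.0.4": 5000.9, "10.0.0.5": 5004.0}
conn_a = {"10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.6"}
conn_b = {"10.0.0.3", "10.0.0.4", "10.0.0.5"}

first, connected = combine_monitors([obs_a, obs_b], [conn_a, conn_b], t1=1.0)
ruled_out = connected - first  # connected somewhere, never in a first segment
```

In this example the peer 10.0.0.3 is late for the first monitor but early for the second, so the union keeps it among the first time segment clients.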

We note that the set of active clients at a given time, A, is not straightforward to estimate even with a large number of monitoring clients. To do that, we would need to perform an active network discovery over the peer-to-peer network of Bitcoin clients. Instead of implementing this functionality ourselves, we relied on the Bitnodes.

The actual set is not required for the calculations; only the size of the set at the time of each transaction is considered. The next task is to group the Bitcoin addresses according to the users who own them. After this, every transaction can be assigned to a user by looking at the source Bitcoin addresses of the transaction.

To group addresses, we exploit the fact that Bitcoin addresses appearing on the input side of the same transaction typically belong to the same user. This assumption is employed widely in the literature as well [5, 6, 9, 21]. The process is demonstrated in Fig 5. The left side of the figure shows the transactions and the input Bitcoin addresses from which the bitcoins are sent. The input addresses of a single transaction belong to the same user. When a Bitcoin address appears in different transactions (marked red and bold), all Bitcoin addresses of those transactions can be merged and assigned to the same user.

Although Bitcoin users are encouraged to generate a new Bitcoin address after every transaction they make, which makes the above grouping less effective [22], most users do not follow this guideline [23, 24].

Fig 5. Grouping of Bitcoin addresses: The left side shows three transactions and the input Bitcoin addresses of these transactions, while the right side indicates how these Bitcoin addresses are grouped.

A transaction belongs to the user that owns its input Bitcoin address(es). From the message propagation, it can be determined how likely each client is to be the originator of a transaction.
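The common-input grouping described above is commonly implemented with a union-find (disjoint-set) structure. The sketch below is one possible implementation under that assumption, not the authors' code; the transaction IDs and addresses are placeholders.

```python
def group_addresses(transactions):
    """Group addresses that co-occur as inputs of the same transaction.

    transactions: dict mapping a transaction ID -> list of input addresses.
    Returns a list of address sets, one set per presumed user.
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions.values():
        for addr in inputs:
            find(addr)                # register single-input addresses too
        for addr in inputs[1:]:
            union(inputs[0], addr)    # all inputs of one tx share an owner

    groups = {}
    for addr in list(parent):
        groups.setdefault(find(addr), set()).add(addr)
    return list(groups.values())


# Placeholder data: A2 appears in two transactions and links them.
txs = {
    "tx1": ["A1", "A2"],
    "tx2": ["A2", "A3"],
    "tx3": ["A4"],
}
groups = group_addresses(txs)
```

Here tx1 and tx2 are attributed to one user because they share the address A2, while tx3 forms a group of its own.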

So far we considered the transactions independently from each other. According to our assumptions, the transactions belonging to a single user were created by a few originator clients. This means that these transactions provide probabilities for the same set of originator clients.

The originator clients can be identified more efficiently by combining the probabilities belonging to these transactions, thus obtaining a more decisive result. This can be calculated by the naive Bayes classifier method [25]. Table 1 shows the transactions (denoted by tx) created by a single user. The transactions assign probabilities to the clients (IP addresses), which indicate the likelihood that the client is the originator of the transaction. P(IP_i | tx_j) denotes the probability that the address IP_i created the transaction tx_j.

If the ratio of the connected clients is small, the individual probabilities in the table are also low. The probabilities of an IP address related to the different transactions can be combined by naive Bayes classification, resulting in a row of combined probabilities. This shows how likely it is that each IP address belongs to the examined user. For each transaction, there can be at most one IP address in the originator class. On the other hand, as a user can use multiple IP addresses to create Bitcoin transactions, after combining multiple transactions, more than one IP address can be in the originator class in the final result.

It is assumed that Bitcoin users can be identified by a limited number of IP addresses they use when connected to the Bitcoin network. If this does not hold, i. We note that the invalidity of this assumption for some users does not result in false pairings of IP addresses and users: only those users for whom the assumption holds will be identified.

Furthermore, previous work showed that an active malicious attacker can prevent the usage of the TOR network by connecting to the TOR network as well and sending malformed Bitcoin messages via the TOR exit nodes [8, 26]. This kind of attack would result in users being unable to connect to the Bitcoin network via TOR. In the current work, however, we limit our analysis to regular users, i.

By applying the naive Bayes classifier (see Appendix 7 for the detailed derivation), the combined probability of an IP address IP_i belonging to the originator class C_o is given by. The number of active clients A varies across the transactions, as they occur at different times. We note that the naive Bayes classification can only be applied if the transactions provide conditionally independent probabilities.
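To illustrate how per-transaction probabilities might be combined, the sketch below uses the generic odds form of naive Bayes for two classes (originator or not). It is not the derivation from the paper's appendix: it assumes conditional independence between transactions, a per-transaction prior of 1/A_j, and an overall prior equal to the mean of those priors, which is a choice made only for this example. All names and numbers are hypothetical.

```python
import math


def combine_naive_bayes(posteriors, priors):
    """Combine per-transaction originator probabilities for one IP address.

    posteriors[j]: probability that this IP created transaction j.
    priors[j]:     prior for transaction j, assumed here to be 1 / A_j.
    Generic two-class naive Bayes in odds form:
        odds = prior_odds * prod_j (posterior_odds_j / prior_odds_j)
    """
    if not posteriors or len(posteriors) != len(priors):
        raise ValueError("need equally long, non-empty lists")
    prior = sum(priors) / len(priors)      # assumption: mean prior overall
    log_odds = math.log(prior / (1.0 - prior))
    for p, pi in zip(posteriors, priors):
        if p <= 0.0:
            return 0.0                     # ruled out by one transaction
        if p >= 1.0:
            return 1.0
        log_odds += math.log(p / (1.0 - p)) - math.log(pi / (1.0 - pi))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical numbers: 6000 active clients, posterior 0.005 per transaction.
one_tx = combine_naive_bayes([0.005], [1 / 6000])
three_tx = combine_naive_bayes([0.005] * 3, [1 / 6000] * 3)
excluded = combine_naive_bayes([0.005, 0.0], [1 / 6000] * 2)
```

With a single transaction the combined value equals the individual probability, and repeated weak evidence for the same IP address raises it, which matches the intuition in the text that combining transactions gives a more decisive result.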

Otherwise, the dependencies between the transactions should be determined [28]. During the data collection campaign, we used our modified Bitcoin clients to connect to the network and monitor information about transactions relayed by connected clients. As the program code is open-source, it was straightforward to implement a monitoring client. These messages contain the bit hash code of the transactions which are relayed.

Using this hash code, the Bitcoin addresses, the amount of Bitcoin sent and other information of interest can then be looked up in the blockchain. In order to monitor as large a part of the Bitcoin network as possible, the modified Bitcoin clients were installed simultaneously on computers located in different parts of the world, and all of these recorded the observed traffic during the campaign. Bitcoin clients behind firewalls usually do not allow incoming connections, i.

By using a large number of monitoring clients, it is more likely that Bitcoin clients behind firewalls initiate connections to some of our monitoring clients when they enter the network. We installed the monitoring clients on computers that are part of PlanetLab, a system maintained for network communication research. During this period million records were obtained, in which transactions and IP addresses were identified. The collected data was imported into an SQL database server. To calculate the probabilities described above, the total number of active clients needs to be determined.

From the Bitnodes.

All data used in the analysis is made publicly available by the Bitcoin users, as required by the Bitcoin protocol. Collecting data at the level of network traffic possibly allows linking Bitcoin addresses to the IP addresses of Bitcoin users. No personally identifiable information besides the IP address was collected about users, and no attempt was made to link IP addresses to actual people besides establishing coarse-grained geographic location. In the shared data, IP addresses were replaced with random identifiers to prevent connecting the transactions with individuals based on other IP address related information.

When calculating the combined probability of each IP address belonging to a specific user, the question arises of when a pairing should be accepted. As more than one IP address can be used by each user and one IP address can be used by several users, no restriction of this kind is made. A pairing is accepted if its probability is higher than 0. This means that the IP address of interest has at least 0.

Fig 6 shows the distribution of the probabilities of the accepted pairings. It can be seen that the vast majority of the probabilities are above 0. Two peaks can be observed in the figure, one with a maximum at 0. The first peak is due to usual clients that initiate a relatively small number of transactions. We speculate that the other peak consists of servers offering wallet services, i.