This paper presents preliminary research into the BTC Lightning Network. By modelling transaction reuse and mapping the reuse of aligned addresses as Lightning channels are opened and closed, it was possible to correlate amounts across separate nodes. It is demonstrated that many users of the Lightning Network failed to take adequate steps to protect the privacy of their information, despite the system being developed to increase anonymity.
The BTC Lightning Network is promoted as a scaling solution for the BTC Core network. This offshoot of the original Bitcoin protocol is premised on utilising an external system outside the blockchain to create anonymity (Tikhomirov et al., 2020). Yet, many individuals using the BTC Core network aggregate payments into single addresses, rather than following what was recommended in the Bitcoin white paper: “a new key pair should be used for each transaction to keep them from being linked to a common owner” (Wright, 2008, p. 6).
The Lightning network has internal channels that are more difficult to model, as they do not record transactions in the way the Bitcoin network does. Consequently, the network modelling must be conducted at the opening and closing of Lightning channels, through the statistical correlation of values between known or matched addresses, and through a more complex analysis of channels that have opened or closed. This exercise is based on capturing data on the Lightning network between 2020 and 2022. Such information is publicly available, and the analysis examines the behaviour of individuals using common payment addresses.
Multiple scholars have looked at the reuse of addresses in blockchain-based systems to analyse the relationship of transactions between various parties for different reasons (Jourdan et al., 2019). For example, some researchers have extended the analysis of address transfers into creating entity recognition algorithms that model the characteristics of transactions across the blockchain (Wang, 2023). Other researchers have extended the common reuse of addresses into the development of machine learning algorithms that promise to capture information concerning transactions.
In each instance, the ability to model transaction paths is premised on misusing the transaction system and erroneously treating Bitcoin addresses as accounts (Wu et al., 2021). This research is premised on the same flaw, as it appears in the use of the BTC Lightning network. If we presume that individuals will fail to implement the privacy control of not reusing addresses, then it may be possible to correlate transactions between the opening and closing of payment channels associated with the Lightning network.
The Lightning network has been introduced as a “layer 2” system to increase the anonymity of conducting transactions using the BTC Core network. For instance, Sguanci et al. (2021, p. 16) contend that “privacy guaranteed by Lightning Network is quite robust.” This introduction of anonymity changes the nature of Bitcoin, moving the desired goal from privacy towards anonymity. This distinction provides avenues for money laundering and the abuse of the system, which are far more limited when all transactions are publicly recorded on the blockchain. Consequently, this research is designed to analyse whether correlations can be found between addresses and the opening and closing of channels over time.
Pearson’s Product-Moment Correlation test attempts to draw a line of best fit through the data available for two variables. The statistic is utilised where results plotted on a scatter graph indicate that a linear relationship may exist, and it measures the strength and direction of that linear relationship. The statistic, the correlation coefficient, ranges from a minimum of -1, a perfect negative correlation, to a maximum of +1, a perfect positive correlation. The statistic returns 0 where no linear correlation exists (Liu, 2019).
Puth et al. (2014) note that the two main assumptions associated with this statistic are that individuals in the sample are statistically independent and that the population forms a bivariate normal distribution. Zinzendoff Okwonu et al. (2020) contend that there is a need to ‘robustify’ the Pearson Product-Moment Correlation Coefficient, as it can break down when there are outliers or substantially influential observations. Consequently, this statistic should not be selected without first testing the assumptions and checking for outliers.
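The coefficient described above can be sketched in a few lines of Python. This is a minimal illustration using made-up numbers, not the data or the software used in this study; the function name `pearson_r` and the sample values are assumptions introduced for the example. It also shows how a single outlier can reverse the sign of the coefficient, which is the kind of breakdown the robustness literature warns about.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data (not from the study).
xs = [1, 2, 3, 4, 5]
clean = [2, 4, 6, 8, 10]         # perfectly linear in xs
with_outlier = [2, 4, 6, 8, -10]  # same data with one extreme value

r_clean = pearson_r(xs, clean)           # perfect positive correlation
r_outlier = pearson_r(xs, with_outlier)  # the single outlier makes r negative
```

When both variables are coded 0 or 1, as in the data set analysed here, the same formula yields what is commonly called the phi coefficient.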
Is there a statistically significant relationship between address matching in channel size and economically significant amounts being transferred (greater than 0.1 BTC)?
Null Hypothesis (H0): there is no statistically significant relationship between the number of BTC transferred and the matching of known addresses.
Alternative Hypothesis (H1): A statistically significant relationship can be found to exist between the number of BTC transferred and the matching of known addresses.
This research is a preliminary study of the transactions in the Lightning network. The analysis is conducted over only a single month, and has been limited to channels opened and closed within the same time frame. Therefore, the correlation of values over longer periods may vary or be more significant. The research presented in this paper is a preliminary investigation into the actions of individuals using the BTC Lightning network, and demonstrates a lack of understanding associated with the concept of not reusing addresses. Additionally, the cost of opening a Lightning channel may alter consumer behaviour, leading those on the network to use less favourable forms of transaction exchange.
When the cost of opening a Lightning channel is taken to include the BTC fees (Shang et al., 2023), it may alter consumer behaviour. The correlation noted in this paper may be causally related to reactions associated with the fee payment and economic cost. Yet, such activity does not explain the continued reuse of addresses even when large amounts are transferred. Consequently, further study should be conducted to analyse the changes in user behaviour under a variable input fee, with a comparison to low-fee transactions, including micropayments of under one US cent.
The information associated with the Bitcoin blockchain is publicly available. This information has been collected using methodologies developed and implemented by the author, which are associated with measuring network interconnectivity and capturing transactions (Javarone & Wright, 2018a, 2018b). In addition, information collection concerning the BTC Core network is extended using network modelling techniques derived from similar methods. The collection of data from the Lightning network is described by Zabka et al. (2021). In addition, Herrera-Joancomartí et al. (2019) document the method used in determining changes to Lightning Network channel balances and capturing when they are updated.
The data set captures information related to channel updates and matches it with information from the public BTC blockchain. From this information, a smaller data set has been extracted. When Lightning channels are noted to open or close, and the amounts on those channels match payment addresses from the same entity (using the techniques referred to earlier in this paper for deanonymising transactions), the value is recorded as a 1, indicating that the input and output addresses belong to the same entity. Where the addresses cannot be matched to the same entity, the value is recorded as a 0.
Next, the input and output values for opening and closing channels were recorded. Where the total value of the transaction opening a channel on the Lightning network was over 0.1 BTC, a 1 was recorded in the database for the input, and likewise for the output where the transaction closing the channel was over 0.1 BTC. Where the value was under 0.1 BTC, a 0 was recorded. The information was limited to February 2023, and records for which no entity information was captured were excluded. A total of 363 transactions with reused and referenced BTC addresses were extracted for the analysis.
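The coding scheme described above can be sketched as follows. This is a hedged illustration only: the record layout, the names `ChannelRecord` and `encode`, and the sample values are hypothetical and do not reproduce the author's collection pipeline. Amounts are held as integer satoshis (1 BTC = 100,000,000 satoshis) to avoid floating-point comparisons, and a value of exactly 0.1 BTC is coded as 0, which is an assumption because the text only specifies "over" and "under".

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SATOSHI_PER_BTC = 100_000_000
THRESHOLD_SAT = SATOSHI_PER_BTC // 10  # 0.1 BTC expressed in satoshis

@dataclass
class ChannelRecord:
    """Hypothetical record for one channel opened and closed in the month."""
    same_entity: Optional[bool]  # None where no entity information was captured
    funding_sat: int             # value of the transaction opening the channel
    closing_sat: int             # value of the transaction closing the channel

def encode(record: ChannelRecord) -> Optional[Tuple[int, int, int]]:
    """Return (match, input_over, output_over) as 0/1 flags, or None if excluded."""
    if record.same_entity is None:
        return None  # excluded: entity information missing
    return (
        int(record.same_entity),
        int(record.funding_sat > THRESHOLD_SAT),
        int(record.closing_sat > THRESHOLD_SAT),
    )

# Made-up sample records, for illustration only.
records = [
    ChannelRecord(True, 25_000_000, 24_900_000),
    ChannelRecord(False, 5_000_000, 4_000_000),
    ChannelRecord(None, 50_000_000, 50_000_000),
    ChannelRecord(True, 12_000_000, 9_500_000),
]
rows = [row for row in map(encode, records) if row is not None]
```

Each retained row then holds the three binary variables summarised in the descriptive statistics.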
Table 1. Descriptive statistics
|Variable|Mean|Std. Deviation|N|
|---|---|---|---|
|Do In and Out channels match|.44|.496|363|
|Is the Input over 0.1 BTC|.45|.498|363|
|Is the Output over 0.1 BTC|.42|.494|363|
The descriptive statistics of the data set are summarised in Table 1. Where payment channels were opened and closed in the same month, the information was recorded where addresses matched the conditions noted above. For example, if an address was used to fund a channel opened on 1 February 2023, and the channel was closed to the same address on 28 February 2023, this information is captured in the data set isolated for use in this paper.
Among the transactions where BTC addresses had previously been reused on the blockchain, 44% of the channels opened and closed within the same month had matching input and output addresses. Across the same 363 transactions, 45% of input values were funded using over 0.1 BTC, and 42% of output values sent more than 0.1 BTC to an address. The average price recorded during the month was approximately US$20,000 – $21,000 (Bitcoin GBP (BTC-GBP) Price History & Historical Data – Yahoo Finance, n.d.). Hence, a value of 0.1 BTC represents more than US$2,000, and nearly half of the transactions exceeded this amount.
A one-tailed Pearson Product-Moment correlation test (α = .05) was conducted to assess whether a statistically significant relationship exists between BTC address reuse and the transacting of values over 0.1 BTC. Because the relationship was expected to run in a single direction, with address reuse accompanying larger transfers, a one-tailed test was warranted. In addition, the assumptions of normality, linearity, and homoscedasticity were evaluated, with no significant violations noted.
The results were significant when measured against transactions that open Lightning channels (p = .020), but were just outside the level of significance (p = .071) for the data associated with closing Lightning channels (Table 2). A weak but statistically significant positive correlation exists between the variables of input and output channels matching and the input to a Lightning channel being over 0.1 BTC, r = .107, n = 363, p = .020. The relationship is therefore not strong, with only a limited Pearson correlation, but it provides evidence that even in the one-month period, some individuals reuse addresses when sending large values.
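As a rough plausibility check on the reported figures, the one-tailed p-value can be approximated from r and n alone. The sketch below is not the SPSS procedure: it converts r to a t statistic with n − 2 degrees of freedom and then, as a stated simplification, uses the standard normal distribution in place of the t distribution, which is a close approximation at 361 degrees of freedom. The function name is an assumption introduced for the example.

```python
import math

def one_tailed_p_from_r(r: float, n: int):
    """Approximate one-tailed p-value for a positive Pearson correlation.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2), then a normal approximation
    to the t distribution (reasonable only for large n).
    """
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    # Upper-tail probability of a standard normal variable at t.
    p = 0.5 * math.erfc(t / math.sqrt(2.0))
    return t, p

# Values reported in the text for the channel-opening correlation.
t_stat, p_value = one_tailed_p_from_r(0.107, 363)
# t_stat is a little above 2 and p_value is approximately 0.02.
```

Under this approximation, a correlation of .107 across 363 observations sits just inside the .05 significance level, even though the effect itself is small.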
The correlation statistic is used in measuring the relationship between two variables, whereas regression provides a method to determine how one variable affects another. The primary difference is that correlation focuses on examining the strength of the relationship between the variables and whether the variables move together. In this statistic, the X and Y axes may be interchanged, and the result returned by the statistic is a single value (Zou et al., 2003).
Alternatively, regression is focused on measuring the effect of one variable on another, and is used to model how a dependent variable responds to an independent variable. The X and Y variables may not be interchanged, and the result is shown as a linear representation. The use of regression analysis focuses on determining whether a dependent variable will change and may be predicted based on a known value of an independent variable, and then measuring the mathematical relationship between them. A larger-scale analysis could be conducted to understand other consumer behaviour and the use of privacy or anonymity techniques.
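The distinction between the two statistics can be illustrated with a short sketch. The data are invented for the example and the helper names are assumptions; the point is only that swapping X and Y leaves the correlation unchanged, while the least-squares slope depends on which variable is treated as dependent.

```python
import math

def _sums(xs, ys):
    """Return (sxy, sxx, syy): sums of products of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

def correlation(xs, ys):
    """Pearson correlation: symmetric in its two arguments."""
    sxy, sxx, syy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def slope(independent, dependent):
    """Least-squares slope when regressing `dependent` on `independent`."""
    sxy, sxx, _ = _sums(independent, dependent)
    return sxy / sxx

# Hypothetical illustration data (not from the study).
xs = [1, 2, 3, 4, 5]
ys = [2, 6, 4, 10, 8]

r_xy = correlation(xs, ys)   # same value as correlation(ys, xs)
r_yx = correlation(ys, xs)
slope_y_on_x = slope(xs, ys)  # change in y per unit of x
slope_x_on_y = slope(ys, xs)  # change in x per unit of y: a different number
```

For this toy data the two slopes differ by a factor of four, while the correlation is identical in both directions.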
Additionally, a prediction model can be constructed (Table 3), demonstrating that where individuals open channels with over 0.1 BTC, the same individuals will close the channel with similarly large amounts. While this model is significant (p = .050, F = 3.026, df = 2, 360), the predictive power is low (R² = .017, the value implied by the reported F statistic and degrees of freedom).
Additionally, a simple ANOVA test presented in Table 4 shows that there is significant evidence (p = .050, F = 3.026, df = 2, 360) to assert that where individuals have reused addresses in constructing channels, there is a behavioural likelihood that they will be sending and receiving amounts over 0.1 BTC. This type of use enables the analysis of user behaviour.
The evidence presented in this research is consistent with individuals using the BTC Lightning network while failing to understand the privacy principles of Bitcoin, or, at least, not caring about them. Likewise, the evidence demonstrates that people who use the BTC Lightning network to send and receive economically significant amounts of money are not taking action to protect the privacy of their transactions, and thereby allow for the public analysis of such transactions.
Tikhomirov et al. (2020) note that the purpose of the BTC Lightning network is to increase anonymity and privacy. Poon and Dryja (2015) contended that the Lightning network would provide the ability to send and receive micropayments. Yet, the evidence provided here shows that a significant number of individuals use the Lightning network to transact in large-value payments (> USD 2000) while failing to adequately anonymise transactions, allowing even small-scale research projects to detect examples of address reuse and the opening and closing of channels to the same addresses.
Extending this project, it would be feasible to collect address-reuse statistics and integrate them with data from the public blockchain that incorporates the modelling of “dust transactions” (Loporchio et al., 2023) in an aggregated transaction analysis that is designed to consolidate and track exchanges made by the same entity (Li et al., 2020). The existing project neither aggregated a large number of addresses using techniques such as dust tracking nor covered an extended period of time. It is hypothesised that extending such research to analyse addresses over a longer time frame, and across a wider range of addressing information, could lead to a system that would enable the tracking of larger transactions as they go in and out of the Lightning network.
Bitcoin GBP (BTC-GBP) price history & historical data – Yahoo Finance. (n.d.). Retrieved 10 March 2023, from https://uk.finance.yahoo.com/quote/BTC-GBP/history/
Herrera-Joancomartí, J., Navarro-Arribas, G., Ranchal-Pedrosa, A., Pérez-Solà, C., & Garcia-Alfaro, J. (2019). On the Difficulty of Hiding the Balance of Lightning Network Channels. Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, 602–612. https://doi.org/10.1145/3321705.3329812
Javarone, M. A., & Wright, C. S. (2018a). From Bitcoin to Bitcoin Cash: A network analysis. Proceedings of the 1st Workshop on Cryptocurrencies and blockchains for Distributed Systems, 77–81. https://doi.org/10.1145/3211933.3211947
Javarone, M. A., & Wright, C. S. (2018b). Modeling a Double-Spending Detection System for the Bitcoin Network. ArXiv:1809.07678 [Physics]. http://arxiv.org/abs/1809.07678
Jourdan, M., Blandin, S., Wynter, L., & Deshpande, P. (2019). A probabilistic model of the Bitcoin blockchain. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 0–0. https://doi.org/10.1109/CVPRW.2019.00337
Li, Z., Li, J., Zheng, Y., & Dong, B. (2020). Biteye: A System for Tracking Bitcoin Transactions. 2020 Information Communication Technologies Conference (ICTC), 318–322. https://doi.org/10.1109/ICTC49638.2020.9123286
Liu, X. S. (2019). A probabilistic explanation of Pearson’s correlation. Teaching Statistics, 41(3), 115–117. https://doi.org/10.1111/test.12204
Loporchio, M., Bernasconi, A., Di Francesco Maesa, D., & Ricci, L. (2023). An Analysis of Bitcoin Dust Through Authenticated Queries. In H. Cherifi, R. N. Mantegna, L. M. Rocha, C. Cherifi, & S. Micciche (Eds.), Complex Networks and Their Applications XI (pp. 495–508). Springer International Publishing. https://doi.org/10.1007/978-3-031-21131-7_39
Poon, J., & Dryja, T. (2015). Lightning network. https://www.BitcoinLightning.com/wp-content/uploads/2018/03/Lightning-network-paper.pdf
Puth, M.-T., Neuhäuser, M., & Ruxton, G. D. (2014). Effective use of Pearson’s product–moment correlation coefficient. Animal Behaviour, 93, 183–189. https://doi.org/10.1016/j.anbehav.2014.05.003
Sguanci, C., Spatafora, R., & Vergani, A. M. (2021). Layer 2 blockchain Scaling: A Survey. ArXiv:2107.10881 [Cs]. http://arxiv.org/abs/2107.10881
Shang, G., Ilk, N., & Fan, S. (2023). Need for speed, but how much does it cost? Unpacking the fee‐speed relationship in Bitcoin transactions. Journal of Operations Management, 69(1), 102–126. https://doi.org/10.1002/joom.1202
Tikhomirov, S., Moreno-Sanchez, P., & Maffei, M. (2020). A Quantitative Analysis of Security, Anonymity and Scalability for the Lightning Network. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 387–396. https://doi.org/10.1109/EuroSPW51379.2020.00059
Wang, Y. (2023). Entity recognition algorithm and transaction characteristics analysis of Bitcoin blockchain. In X. Ye & X. Ben (Eds.), Third International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022) (p. 115). SPIE. https://doi.org/10.1117/12.2661154
Wright, C. S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3440802
Wu, J., Liu, J., Zhao, Y., & Zheng, Z. (2021). Analysis of cryptocurrency transactions from a network perspective: An overview. Journal of Network and Computer Applications, 190, 103139. https://doi.org/10.1016/j.jnca.2021.103139
Zabka, P., Förster, K.-T., Schmid, S., & Decker, C. (2021). Node Classification and Geographical Analysis of the Lightning Cryptocurrency Network. Proceedings of the 22nd International Conference on Distributed Computing and Networking, 126–135. https://doi.org/10.1145/3427796.3427837
Zinzendoff Okwonu, F., Laro Asaju, B., & Irimisose Arunaye, F. (2020). Breakdown Analysis of Pearson Correlation Coefficient and Robust Correlation Methods. IOP Conference Series: Materials Science and Engineering, 917(1), 012065. https://doi.org/10.1088/1757-899X/917/1/012065
Zou, K. H., Tuncali, K., & Silverman, S. G. (2003). Correlation and Simple Linear Regression. Radiology, 227(3), 617–628. https://doi.org/10.1148/radiol.2273011499
The following information was captured as output using SPSS 27.