Dr Craig Wright
DBA, Walden University
S1 – Operational Definitions
When studying scalability in a blockchain, it is essential to establish clear operational definitions to ensure consistent and precise measurement of relevant factors. Yet Walch (2017) contends that the fluid and contested language surrounding blockchain technology is itself a source of difficulty. More specifically, Walch asserts that the terminology used in the blockchain ecosystem is often imprecise, overlapping, and inconsistent, and that different terms are used interchangeably, adding to the confusion.
This study will argue that this language barrier makes it difficult for regulators to accurately understand and assess the technology, potentially leading to flawed decisions and inconsistent regulation across jurisdictions. Moreover, developers and others within the blockchain industry routinely overstate the benefits of the technology while understating its risks. As Walch (2020) highlights in a later paper, the unclear vocabulary around blockchain technology can make it easier for proponents to exaggerate its capabilities and benefits while downplaying potential risks and downsides. This situation is compounded by the interdisciplinary nature of blockchain technology, which may make regulators hesitant to challenge industry claims because they lack the relevant expertise.
Misleading terms, like “full node”, could contribute to misunderstandings and misconceptions about the functioning and capabilities of nodes within a blockchain network. As such, it is essential to define these terms within the paper. The following operational definitions are therefore presented for consideration:
- Transaction Throughput: This refers to the number of transactions the blockchain network processes within a given time frame. It is essential to define the specific unit of time (e.g., transactions per second, transactions per minute) to measure the scalability of the network accurately.
- Confirmation Time: It represents the time a transaction takes to be confirmed and added to the blockchain. This definition should include whether it refers to the time taken for a transaction to be included in a block or the time for a certain number of blocks to be added on top of the block containing the transaction.
- Block Size: It defines the maximum allowable size of a block in the blockchain. This can be measured in terms of bytes or other relevant units. The block size plays a crucial role in determining the scalability of the network since it affects the number of transactions that can be included in each block.
- Network Latency: This refers to the time delay experienced in propagating information across the blockchain network. Network latency can impact the overall performance and scalability of the network; thus, it should be defined and measured consistently.
- Node Count: It represents the total number of active nodes participating in the blockchain network. The number of nodes can significantly affect the network’s scalability, and defining the exact criteria for determining active nodes is essential.
- Consensus Mechanism: It refers to the specific algorithm or protocol used by the blockchain network to achieve consensus among nodes. The consensus mechanism can impact scalability, and its operational definition should include details about the specific algorithm used and any associated parameters.
- Computational Power: It defines the processing capabilities of individual nodes in the blockchain network. Computational power can influence the speed at which transactions are validated and added to the blockchain. Therefore, the operational definition should include the specific metric used to measure computational power, such as the hash rate or processing speed.
- Scalability Metric: This encompasses the specific metric or criteria used to evaluate the scalability of the blockchain network. It could be transaction throughput, confirmation time, or any other measurable factor determining the network’s ability to handle increased transaction volume.
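To show how the first two definitions above could be turned into measurements, the following is a minimal sketch in Python. It is an illustration under stated assumptions rather than the instrument used in this study: the `Block` record, its field names, and both function names are hypothetical, and the confirmation-time function makes explicit the choice between "included in a block" (one confirmation) and "n blocks deep" that the definition calls for.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical block record; the field names are illustrative only."""
    height: int
    timestamp: int   # seconds since epoch, as reported in the block
    tx_count: int    # number of transactions in the block


def throughput_tps(blocks: List[Block]) -> float:
    """Transactions per second over the interval spanned by the blocks.

    The lowest block only marks the start of the interval, so its
    transactions are excluded from the count.
    """
    if len(blocks) < 2:
        raise ValueError("need at least two blocks to span an interval")
    ordered = sorted(blocks, key=lambda b: b.height)
    elapsed = ordered[-1].timestamp - ordered[0].timestamp
    # Block timestamps are miner-reported and need not be strictly increasing.
    if elapsed <= 0:
        raise ValueError("blocks do not span a positive time interval")
    return sum(b.tx_count for b in ordered[1:]) / elapsed


def confirmation_time(broadcast_ts: int, blocks: List[Block],
                      included_height: int, confirmations: int = 1) -> int:
    """Seconds from broadcast until the transaction has `confirmations`
    confirmations: 1 means inclusion in a block, n means n - 1 further
    blocks have been added on top of the including block."""
    if confirmations < 1:
        raise ValueError("confirmations must be at least 1")
    target_height = included_height + confirmations - 1
    by_height = {b.height: b for b in blocks}
    if target_height not in by_height:
        raise ValueError("not enough blocks for the requested confirmations")
    return by_height[target_height].timestamp - broadcast_ts
```

Stating the unit (seconds) and the confirmation depth as explicit parameters keeps results comparable across data sets, which is the purpose of an operational definition.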
In computer science, a node is a fundamental concept in various data structures and network systems (Trifa & Khemakhem, 2014). The specific definition of a node can vary depending on the context, but generally, a node refers to an individual element or object within a larger structure or network. The general usage of the term overlaps significantly with its usage in a particular field such as blockchain. Here are a few standard definitions of nodes in different computer science domains:
- Data Structures: In data structures like linked lists, trees, or graphs, a node represents an individual element or unit of data within the structure. Each node typically contains a value or data payload and one or more references or pointers to other nodes in the structure. Nodes are interconnected to form the underlying structure, enabling efficient data storage and manipulation.
- Networks: In networking, a node refers to any device or entity that can send, receive, or forward data over a network. This can include computers, servers, routers, switches, or any other network-enabled device. Each node in a network has a unique address or identifier and plays a role in the transmission and routing of data packets within the network.
- Graph Theory: In graph theory, a node (also called a vertex) represents a discrete object or entity within a graph. A graph consists of a set of nodes and edges that connect pairs of nodes. Nodes can represent various entities, such as individuals, cities, or web pages, while edges denote relationships or connections between the nodes.
- Distributed Systems: In distributed systems, a node refers to a computing device or server that participates in a distributed network or system. Each node typically has its processing capabilities, storage, and communication capabilities. Nodes collaborate and communicate with each other to perform tasks, share data, and provide services in a decentralized manner.
It’s important to note that the exact definition and characteristics of a node can vary depending on the specific application or system being discussed. Nevertheless, the concept of a node serves as a foundational building block in computer science, enabling data representation, organization, and manipulation and facilitating communication and coordination within networks and distributed systems.
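The data-structure and graph-theory senses of "node" described above can be sketched in a few lines of Python. This is an illustrative structure only; the class and method names are hypothetical and are not drawn from any tool used in this study.

```python
from typing import Dict, Hashable, Set


class Graph:
    """Undirected graph stored as an adjacency map: node -> set of neighbours."""

    def __init__(self) -> None:
        self.adj: Dict[Hashable, Set[Hashable]] = {}

    def add_edge(self, a: Hashable, b: Hashable) -> None:
        """Add an edge, creating either node if it does not exist yet."""
        self.adj.setdefault(a, set()).add(b)
        self.adj.setdefault(b, set()).add(a)

    def neighbours(self, node: Hashable) -> Set[Hashable]:
        """Nodes directly connected to `node` (empty set if unknown)."""
        return set(self.adj.get(node, set()))

    def degree(self, node: Hashable) -> int:
        """Number of edges incident to `node`."""
        return len(self.adj.get(node, set()))
```

In this representation each key is a node (vertex) and each set holds its references to other nodes, which mirrors both the pointer-based and the vertex-and-edge definitions given in the list.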
Section 5 of the Bitcoin whitepaper titled “Network” provides insights into the operational definitions of nodes in the Bitcoin network. Here are the critical descriptions to consider when studying nodes in a blockchain network, particularly referencing the concepts described in the Bitcoin whitepaper (Wright, 2008):
- Archive Nodes: Archive nodes are computers or devices that maintain a complete copy of the entire blockchain. These nodes do not validate and verify transactions and blocks. While these have been falsely referred to as “full nodes”, the only activity these engage in is storing and propagating a limited subset of the transaction history. In the Bitcoin network, archive nodes are promoted as maintaining the integrity of the blockchain and participating in the consensus mechanism. However, the only nodes that validate and verify transactions are those defined within Section 5 of the whitepaper, also called mining nodes.
- Mining Nodes: Mining nodes are the only systems that could correctly be called full nodes, as they engage in the mining process, where they compete to solve computationally intensive puzzles to add new blocks to the blockchain. Mining nodes validate transactions and create new blocks containing validated transactions. They contribute computational power to the network and are responsible for securing and extending the blockchain.
- Lightweight (SPV) Nodes: Simplified Payment Verification (SPV) nodes, also known as lightweight nodes, do not store the entire blockchain but rely on full nodes for transaction verification. These nodes maintain a limited set of data, typically storing only the block headers, and use Merkle proofs to verify the inclusion of transactions within specific blocks. SPV nodes provide a lighter-weight option for users who do not require the entire transaction history.
- Network Connectivity: This operational definition refers to the ability of a node to connect and communicate with other nodes in the network. Nodes must establish and maintain network connections to exchange information, propagate transactions and blocks, and participate in the consensus process. Network connectivity can be measured by the number of links a node has or the quality of its connections.
- Consensus Participation: This definition encompasses the active involvement of nodes in the consensus mechanism of the blockchain network. In the Bitcoin network, nodes participate in the consensus process by following the proof-of-work algorithm, contributing computational power to mine new blocks, and validating transactions. The level of participation can be assessed based on the computational resources dedicated to mining or the frequency of validation and propagation of transactions.
- Node Diversity: It refers to the variety of node types and their distribution within the network. This operational definition considers the presence of full nodes, mining nodes, SPV nodes, and other specialized nodes. Node diversity can influence the decentralization and resilience of the network, as different types of nodes contribute unique functionalities and help maintain a distributed ecosystem.
By considering these operational definitions of nodes, researchers can accurately describe and analyze the characteristics, roles, and interactions of nodes within a blockchain network, particularly concerning the concepts outlined in the Bitcoin whitepaper. In addition, these definitions help understand the node architecture, network dynamics, and overall functioning of the blockchain system.
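The Merkle-proof check mentioned in the SPV definition above can be illustrated with a short Python sketch. It assumes the double SHA-256 pairing and the duplication of the last hash on odd-sized levels used by Bitcoin, but it deliberately omits byte-order conventions and block-header parsing, so it is a conceptual sketch and not a wire-compatible implementation. The function names are hypothetical.

```python
import hashlib
from typing import List, Tuple

# One proof step: (sibling hash, True if the sibling sits on the right)
ProofStep = Tuple[bytes, bool]


def sha256d(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_and_proof(leaves: List[bytes],
                          index: int) -> Tuple[bytes, List[ProofStep]]:
    """Build the tree over `leaves` and return (root, proof for leaves[index])."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    if not 0 <= index < len(leaves):
        raise IndexError("leaf index out of range")
    level = list(leaves)
    proof: List[ProofStep] = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # duplicate the last hash on odd levels
        sibling = index ^ 1              # partner within the pair
        proof.append((level[sibling], sibling > index))
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(leaf: bytes, proof: List[ProofStep], root: bytes) -> bool:
    """Recompute the path from `leaf` to the root using only the proof."""
    current = leaf
    for sibling, sibling_on_right in proof:
        if sibling_on_right:
            current = sha256d(current + sibling)
        else:
            current = sha256d(sibling + current)
    return current == root
```

The verifier needs only the leaf, the sibling hashes along one path, and the root taken from a block header, which is why a lightweight node can check inclusion without holding the full transaction history.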
Baran (1964) discusses the concept of distributed communications networks. In this work, the author lays the foundation for the idea of decentralized networks by proposing a distributed network architecture that can withstand disruptions and failures. Baran presents the concept of a network consisting of nodes connected in a mesh-like structure. This distributed or decentralized network architecture aims to provide robust and resilient communication by allowing messages to be routed through multiple paths rather than relying on a central authority or a single point of failure.
As a way of defining decentralization, the concept first presented by Baran (1964) establishes the principles of a decentralized network by advocating for redundancy, fault tolerance, and the absence of a central control node. This work has significantly influenced the development of decentralized systems and forms the basis for further research and advancements in the field. However, with the widespread alternative uses of the term “decentralization” (Walch, 2017) and the resulting different interpretations, which then depend upon the context and specific applications within computer science, it becomes necessary to precisely define this term in analyzing blockchain technology.
Therefore, while Baran’s (1964) paper is foundational in the field of distributed networks, a comprehensive definition of decentralization requires examining a broader range of literature and research when this is being applied to Bitcoin. By establishing clear operational explanations for these factors, researchers can ensure consistency and comparability in their study of scalability in a blockchain network. In addition, these definitions will help in designing experiments, collecting data, and analyzing results accurately.
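Baran's (1964) contrast between a centralized network and a distributed mesh can be made concrete with a small Python sketch. The two toy topologies and the metric used here (the share of surviving nodes that remain in the largest connected component after a node is removed) are illustrative assumptions, not the measurement design of this study.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Set, Tuple

Adjacency = Dict[Hashable, Set[Hashable]]


def build(edges: Iterable[Tuple[Hashable, Hashable]]) -> Adjacency:
    """Build an undirected adjacency map from an edge list."""
    adj: Adjacency = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def largest_component_fraction(adj: Adjacency,
                               removed: Iterable[Hashable] = ()) -> float:
    """Share of surviving nodes that sit in the largest connected component."""
    alive = set(adj) - set(removed)
    if not alive:
        return 0.0
    seen: Set[Hashable] = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search restricted to surviving nodes
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best / len(alive)


# Centralized: node 0 is the hub for six spokes.
star = build((0, i) for i in range(1, 7))
# Distributed: seven nodes, each linked to the next two around a ring.
mesh = build([(i, (i + 1) % 7) for i in range(7)] +
             [(i, (i + 2) % 7) for i in range(7)])
```

Removing node 0 from `star` leaves six isolated nodes, whereas removing node 0 from `mesh` leaves the remaining six nodes connected through alternative paths, which is the redundancy property Baran describes.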
S1 – Assumptions, Limitations, and Delimitations
In this section, we discuss the assumptions, limitations, and delimitations associated with the large-scale doctoral project aimed at measuring the centrality, interconnection, connectivity, and resilience of the Bitcoin network. By acknowledging these factors, we ensure transparency and provide a comprehensive understanding of the scope and potential impact of the research findings.
- Stability of the Bitcoin Protocol:
We assume that the underlying Bitcoin protocol and network architecture remain relatively stable during the research period. However, any significant changes or updates to the protocol may influence the network’s structure and metrics, potentially impacting the validity of the findings.
- Availability of Data:
It is assumed that sufficient data and information about the Bitcoin network are available for analysis. The project relies on accessible data sources that provide relevant network data, node information, and connectivity details. However, the availability and quality of such data may vary, potentially impacting the accuracy and reliability of the research.
- Accurate Representation of Network Topology:
We assume that the chosen methods and tools for measuring the network’s centrality, interconnection, connectivity, and resilience can accurately represent its topology. The analysis assumes that the collected data effectively captures the network’s structure and connections.
- Validity of Metrics and Methodologies:
The project assumes that the selected metrics and methodologies for measuring centrality, interconnection, connectivity, and resilience are appropriate and valid for evaluating the Bitcoin network. Furthermore, the metrics chosen should align with established theoretical frameworks and demonstrate relevance to the research objectives.
- Data Availability and Completeness:
One limitation is data availability. Comprehensive and real-time data on the Bitcoin network might not be easily accessible. Researchers may have to rely on publicly available data sources, which may not capture the entire network or provide up-to-date information. This limitation could affect the comprehensiveness and accuracy of the analysis.
- Data Accuracy and Sampling Bias:
The accuracy and completeness of the obtained data from various sources may vary. Inaccurate or incomplete data could introduce bias and affect the reliability of the research findings. Additionally, the selection of nodes for analysis may introduce sampling bias, potentially limiting the generalizability of the results to the entire Bitcoin network.
- Network Visibility:
Not all network nodes may be visible or known to the researchers. For example, some nodes may choose to operate privately or remain hidden, impacting the accuracy of measurements and analysis. In addition, the lack of complete visibility could limit the researcher’s ability to capture the entire network’s characteristics.
- Network Dynamics:
The Bitcoin network is dynamic, with nodes joining or leaving the network, and network connections changing over time. The research captures a specific snapshot of the network, and the findings may not fully represent the network’s behavior over an extended period. Long-term network dynamics may require further investigation for a comprehensive understanding.
- External Factors:
The research may not consider or account for external factors influencing the network’s centrality, interconnection, connectivity, and resilience. For example, regulatory changes, technological advancements, or network attacks might impact the network’s behavior and metrics. These external influences are beyond the scope of the current research.
- Funding Constraints:
The availability of funding resources may impact the scope and scale of the research. Limitations in funding could restrict the depth and breadth of the data analysis, which may influence the extent of the conclusions drawn from the research findings.
- Focus on Bitcoin Network:
The research focuses on the Bitcoin network and its centrality, interconnection, connectivity, and resilience. Other blockchain networks or cryptocurrencies are beyond the scope of this study. Therefore, the findings may not directly apply to other networks or ecosystems.
- Time Period:
The study is limited to a specific time period, and the analysis captures the state of the Bitcoin network within that timeframe. Network dynamics, metrics, and characteristics may evolve over time, and the research findings may therefore not reflect future or historical network behavior.
- Network Layer:
The research primarily focuses on analyzing the Bitcoin network at the protocol layer. While the network’s application layer and associated services and applications may impact the network’s behavior, they are not explicitly examined in this study.
- Methodological Approach:
The research adopts specific methodologies and analytical techniques to measure the centrality, interconnection, connectivity, and resilience of the Bitcoin network. Alternative approaches or methods may yield different results, but they are not explored within the scope of this study.
- External Factors:
The research excludes the examination of external factors influencing the Bitcoin network’s characteristics. Economic conditions, legal and regulatory changes, or social attitudes toward cryptocurrencies are not directly addressed. These factors could potentially impact the network’s behavior and metrics but are beyond the scope of this study.
While the research aims to provide insights into the Bitcoin network’s characteristics, the findings may not be universally applicable to all nodes or participants within the network. In addition, variations in node configurations, geographic distribution, and operational strategies may impact the generalizability of the research findings to the entire network.
- Limited Scope of Resilience:
The investigation of network resilience is limited to specific metrics and indicators related to the network’s ability to withstand disruptions or attacks. As a result, the research does not comprehensively assess all potential threats or vulnerabilities the Bitcoin network might face.
The delimitations outlined above clarify the specific boundaries and scope of the doctoral research project. Furthermore, recognizing these delimitations allows for a more focused investigation and interpretation of the findings within the defined parameters. In a research scenario where the researcher also happens to be the creator of the original Bitcoin system, it is essential to acknowledge the potential for bias due to the researcher’s personal views and involvement in the system’s development.
The researcher’s intimate knowledge and perspective as the creator may influence the interpretations and conclusions regarding the Bitcoin network’s centrality, interconnection, and resilience. Addressing this bias openly and transparently is crucial to ensure the research maintains objectivity and rigor. By disclosing the role and potential biases, the researcher allows readers and reviewers to critically evaluate the research findings in light of the researcher’s perspective as creator. This transparency enables a more nuanced understanding of the research and encourages independent verification and validation of the results by other researchers in the field.
By acknowledging the assumptions and limitations of the doctoral project, we ensure transparency and promote a comprehensive understanding of the research’s scope and potential impact. In addition, these considerations provide a foundation for interpreting and contextualizing the findings and guiding future investigations in the field.
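The assumptions above refer to metrics of centrality and connectivity without fixing their form. As one possible reading, and not necessarily the metrics this study adopts, the Python sketch below computes degree centrality and closeness centrality on an adjacency map. The function names are hypothetical, and the handling of disconnected graphs (closeness computed over reachable nodes only) is an illustrative choice.

```python
from collections import deque
from typing import Dict, Hashable, Set

Adjacency = Dict[Hashable, Set[Hashable]]


def degree_centrality(adj: Adjacency) -> Dict[Hashable, float]:
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbs) / (n - 1) for node, nbs in adj.items()}


def closeness_centrality(adj: Adjacency, source: Hashable) -> float:
    """Number of reachable nodes divided by the sum of shortest-path
    distances to them; equals (n - 1) / total distance when connected."""
    dist = {source: 0}
    queue = deque([source])
    while queue:                      # breadth-first search from `source`
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```

On a three-node path a-b-c, the middle node scores 1.0 on both measures while the end nodes score 0.5 (degree) and 2/3 (closeness). Both values depend on which nodes and links were observed, which is why the data-completeness and visibility limitations listed above bear directly on the validity of such metrics.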
S1 – Transition Statement
This study has been developed to critically examine the Bitcoin network’s centrality, the interconnection between network nodes, connectivity, and resilience using quantitative and verifiable data that can be independently peer-reviewed and validated, in line with the principles of the scientific method. It is essential to acknowledge that the Bitcoin network, being a public network, may introduce biases in defining specific outcomes, such as privacy, anonymity, and the contrasting goals of traceability and untraceability within the cryptocurrency landscape. These definitions are often subject to philosophical discussions and varying perspectives.
Additionally, this study recognizes the need to address scalability challenges in the context of Bitcoin as a monetary payment system. As the network grows and adoption increases, it becomes crucial to assess the network’s ability to handle larger transaction volumes while maintaining its core principles of decentralization, security, and efficiency. By analyzing quantitative data and utilizing established scientific methodologies, this research aims to contribute to understanding scaling issues within the Bitcoin network and their implications for its long-term viability as a reliable payment system.
S2 – Population and Sampling
When analyzing the scaling and node distribution of a blockchain-based application, the population involved refers to the entire network of nodes participating in the blockchain network. In a blockchain, nodes are individual computers or devices that maintain a copy of the distributed ledger and participate in the consensus mechanism to validate and verify transactions.
The population in this context includes all the nodes within the blockchain network, regardless of their geographic location, size, or computational power. Each node contributes to the overall security and decentralization of the network by maintaining a copy of the blockchain and participating in the validation process. Sampling, on the other hand, involves selecting a subset of nodes from the population for analysis. Sampling aims to gain insights into the characteristics, performance, or behavior of the overall network by studying a representative subset (Campbell et al., 2020).
When analyzing scaling in a blockchain-based application, sampling can be helpful in studying the performance of the network under different transaction loads. By selecting a subset of nodes and observing their behavior during periods of high transaction volume, researchers or developers can infer the scalability of the entire network. This approach allows for more efficient analysis as it can be computationally expensive to analyze the whole population of nodes.
Similarly, when examining node distribution, sampling can help understand the geographic distribution, computational capabilities, or connectivity patterns of the nodes in the network. Researchers can extrapolate information about the broader population by selecting a sample of nodes and analyzing their attributes. It’s important to note that the sampling methodology should be carefully designed to ensure the sample is representative and avoids biases. Factors such as node type (e.g., “full nodes”, mining nodes), geographic location, network connectivity, and computational power should be considered when selecting the sample.
In summary, the population involved in sampling a blockchain-based application when analyzing scaling and node distribution refers to the entire network of nodes participating in the blockchain network. Sampling allows for more efficient analysis by selecting a subset of nodes to gain insights into the characteristics, performance, and behavior of the overall network.
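One way to draw a sample that reflects the factors listed above (node type, location, and so on) is stratified random sampling, sketched below in Python. This is a swapped-in illustration, not necessarily the sampling design this study adopts: the function name, the stratum labels, and the rule of taking at least one node per stratum are all assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_sample(nodes: Sequence[Tuple[Hashable, str]],
                      fraction: float, seed: int = 0) -> List[Hashable]:
    """Draw roughly `fraction` of the nodes from each stratum.

    `nodes` holds (node_id, stratum_label) pairs. Every stratum
    contributes at least one node so that small groups are not lost.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)         # fixed seed keeps the draw reproducible
    strata: Dict[str, List[Hashable]] = defaultdict(list)
    for node_id, label in nodes:
        strata[label].append(node_id)
    sample: List[Hashable] = []
    for label in sorted(strata):      # fixed order for reproducibility
        members = strata[label]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample
```

Sampling within each stratum prevents a numerically dominant node type from crowding out smaller groups, and recording the seed allows other researchers to reproduce the exact draw.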
References
Baran, P. (1964). On Distributed Communications Networks. IEEE Transactions on Communications, 12(1), 1–9. https://doi.org/10.1109/TCOM.1964.1088883
Campbell, S., Greenwood, M., Prior, S., Shearer, T., Walkem, K., Young, S., Bywaters, D., & Walker, K. (2020). Purposive sampling: Complex or simple? Research case examples. Journal of Research in Nursing, 25(8), 652–661. https://doi.org/10.1177/1744987120927206
Trifa, Z., & Khemakhem, M. (2014). Sybil Nodes as a Mitigation Strategy Against Sybil Attack. Procedia Computer Science, 32, 1135–1140. https://doi.org/10.1016/j.procs.2014.05.544
Walch, A. (2017). Blockchain’s Treacherous Vocabulary: One More Challenge for Regulators. 9.
Walch, A. (2020). Deconstructing ‘Decentralization’: Exploring the Core Claim of Crypto Systems. In Papers.ssrn.com. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3326244
Wright, C. S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3440802