This paper was written because of the massive popularity of blockchain and the hype that comes with any new technology. Many people seem to believe it is a fantastic finding: “A breakthrough of modern technology”, “The future of technology”, “Everything should be a blockchain!” However, blockchains come with many issues that are not being considered. The clearest reason for the success of blockchain is incentivization. In the cryptocurrency space, more versions of money allow for different types of transactions, such as pay-per-second transactions (assuming the chain the currency is on can handle it). In reality, there may be some amazing uses for blockchain that have not been built yet, but this paper was created so that people don’t burn their budgets on a dream that is not actually well suited to a blockchain.
This paper outlines the following:
- How blockchains grow
- How blockchains propagate
- Legal considerations of data
- How one chain may split into two
- Why blockchains are trusted
- More effective solutions to some problems
- Security considerations
- Life expectancy of blockchains
Note: If a subject is already known, or isn’t of current interest, you may be able to skip to a different section, but many of these subjects are interconnected.
Hashes of Blocks
Here is an image file’s sha256sum (algorithm) hash:
(Anyone can calculate the hash value; hashing isn’t hard, and good security professionals compare hashes every time they obtain a file from the internet.) So, if you find the image file “linuxmint-18.3-kde-64bit.iso”, obtain a SHA-256 hash of it (get a program to calculate it for you), and the value you get is the same, then you can be confident that (given today’s hardware capabilities) it is VERY likely to be the EXACT SAME file that was used for the above hash.
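As a minimal sketch of what such a program does, the Python below computes a file's SHA-256 hash and compares it with a published value. The function names are made up for this illustration, and the file path and published hash are whatever you supply; only the standard hashlib module is used.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that a large image file does not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """True if the file's digest equals the published one.
    Surrounding whitespace and letter case in the published value are ignored."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

Calling matches_published_hash with the downloaded file and the hash shown on the download page gives a yes/no answer to "is this the same file?".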
That integrity is the basis of the “block” in blockchain (nonces are talked about in section “Process: Growth”).
If the value “ABCD” is hashed, a value is output. If the value is made lowercase, “abcd”, then the hash value becomes something ENTIRELY different. The same goes for any data in the blocks; if anything in the block of information changes, the hash ENTIRELY changes.
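The “ABCD” versus “abcd” comparison can be tried directly. This is a small sketch using Python's standard hashlib; the helper name sha256_hex is only for illustration.

```python
import hashlib


def sha256_hex(text):
    """Hash a text string with SHA-256 and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


upper = sha256_hex("ABCD")
lower = sha256_hex("abcd")

# Count how many of the 64 hex digits differ between the two hashes.
differing = sum(1 for a, b in zip(upper, lower) if a != b)

print(upper)
print(lower)
print(differing, "of", len(upper), "hex digits differ")
```

Only the letter case of the input changed, yet the two printed hashes have nothing visibly in common.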
So, what is the “Chain”?
Blocks are chained together by taking the hash of the previous block and putting it into the data of the current block, before hashing. This way the previously calculated block is verifiably the same. If data or the hash of the previous block is changed, then the current block changes too. One way of saying this is “breaking the chain”. Want to see how the chain is connecting blocks? Use this MIT Blockchain Tool.
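The chaining described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular blockchain serializes its blocks: the block layout (a dict with "data" and "previous_hash") and the all-zero hash for the first block are assumptions made for the example.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(records):
    """Create blocks where each one stores the hash of the block before it."""
    chain = []
    previous = "0" * 64  # assumed placeholder for the very first block
    for record in records:
        block = {"data": record, "previous_hash": previous}
        chain.append(block)
        previous = block_hash(block)
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose stored previous_hash
    no longer matches the block before it, or None if the chain is intact."""
    for i in range(1, len(chain)):
        if chain[i]["previous_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["first", "second", "third"])
print(first_broken_link(chain))   # None: nothing has been altered

chain[0]["data"] = "tampered"
print(first_broken_link(chain))   # 1: block 1 no longer matches block 0
```

Changing anything in an earlier block changes its hash, so the next block's stored value stops matching. That mismatch is the “broken chain”.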
The blockchain’s immutability (non-changeability, unalterability, insert favorite word for can’t be changed) has led to the name “ledger”; blockchains are similar to accounting records in that they are never supposed to be changed. Not only are ledgers supposed to be permanent, but they hold PUBLIC information. ANYONE can read what data is on the blockchain at ANY time.
One problem for blockchains is the Right to be Forgotten and the GDPR. The EU requires personal information to be removable, as people have “The Right to be Forgotten”, and the GDPR restricts personal data from leaving the EU unless conditions such as the consent of the individual are met. Why does this matter? Most of the time the objective is for a blockchain to be a globally used system. Even when not intended, it often becomes one. Developers thus need to be careful that no data put on the chain can be considered a breach of privacy, unless storing private information is the deliberate purpose of the blockchain. Once put on the chain, the private information can NEVER be removed, or else you break the chain… rendering the use of a blockchain and the accuracy of its data moot.
Workaround: The data on a blockchain says WHERE the data is, not WHAT the data is…. If you use the workaround, why did you bother using a blockchain? The blockchain is then only used to store where data is. This adds a hoop to a basic database server, rendering it less efficient than a simple database (how much less is covered in section “Process: Growth”). It removes the purpose of using a blockchain entirely, and it even creates more processing work for the computer, making the delivery of data slower than a simple database!
Let’s assume the blockchain gives users the option to transact some amount of data, or even a file. Bitcoin, for example, allows people from around the world to embed arbitrary data in their transactions, with blocks historically limited to about 1 MB, and there is no restriction on what that data is. People can put pretty much anything in it: links, portions of photos, other hashes, whatever fits. What do criminals want to put in as their portion of data? People have posted links to child pornography sites (Article).
Once law enforcement finds it, the sites can be taken down, and thus the link can no longer be used to find the content. This does not prevent it from happening again, though. Some of the damage criminals have done is irreversible, such as images of child pornography stored on the chain itself (Article).
Again, when the chain only holds a link, the data at the linked site can be removed; but if the chain holds the content itself rather than a link to it, the content is permanent and is distributed to everyone who uses the blockchain. For anyone thinking of making a blockchain, this is a possible legal issue.
Decentralization of the System
The blockchain has a ledger that is not owned, maintained, or held by a single person; rather, it is owned/maintained/held by MANY people. Servers are run on computers to take, look at, and add/verify data on the blockchain. For many, the procedure is “Go to this website if you want to see the data.” If you wanted to do more than simply visit the website, you could store some of the data yourself, possibly helping spread the blockchain faster and further, if you decide to set up a computer to do so. This type of computer is called a “node”.
The nodes are the owners and maintainers of a blockchain’s ledger. Nodes are where information is obtained from and stored. The beauty of blockchain is that everyone is able to store and spread the blockchain’s information, without the need for a central trusted entity. This capability is called decentralization.
Short version: Nodes store the blockchain’s data. Nodes can be run by anyone. Ownership available to anyone is called decentralization.
This is why many blockchains are open-source: “If I can’t tell what I am running, if I can’t tell it isn’t malware, why should I run this blockchain, when the standard is that I can see every bit of code to know what I am processing!?” This aspect has its own implications, which are generally positive for the public and some of the time positive for businesses. If you would like to know what those are, look for a debate between open source and closed source, like this one.
(PLEASE look at multiple; most are biased toward their own side.)
A blockchain system is developed, but now it needs computers/processing power to calculate the hashes of the blockchain.
This is because the calculations are non-reversible and so complex that only a computer can do the hashing in a short enough time. It is possible to see how a hash is calculated by hand, but it is unrealistic to think a person could calculate enough SHA-256 hashes for a blockchain; thus, computers do it. Whose computers, though? The “miners’” computers. They take the most recently transacted data and the previous block’s hash to create a hash for the next block. For any given blockchain, calculating the hash can be made more or less difficult (described in “Stunting Growth”).
Problem: The more data there is, the more data has to be hashed. The more data that needs hashing, the longer the calculations take, and the more processing power is needed. Limited answer: Decrease the difficulty of hashing the blocks. Answer: Limit the number of transactions that can be done per block.
Problem: What if someone finds the hash for the next block, but as it is being distributed to the other nodes/users, someone else also finds a valid next block and starts spreading it in an area that hasn’t yet been told the first one was found? Who claims ownership of the block? And if there is a reward system, who claims the reward? This separation is called a “fork”. Currently used answers:
- Whichever fork everyone agrees to use
- Whichever fork is longest
- Whichever fork reaches a certain length beyond the split off first.
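The “longest fork wins” answer, and what it costs, can be sketched as follows. This is a toy model built on assumptions: each fork is just a list of the blocks added after the split, each block is a list of transaction IDs, and the function names are invented for the example.

```python
def choose_fork(forks):
    """Pick the longest fork. On a tie, max() keeps the one listed first."""
    return max(forks, key=len)


def dropped_transactions(forks, winner):
    """List transactions that appear only in losing forks,
    i.e. the ones that vanish when those forks are discarded."""
    kept = {tx for block in winner for tx in block}
    lost = []
    for fork in forks:
        if fork is winner:
            continue
        for block in fork:
            for tx in block:
                if tx not in kept and tx not in lost:
                    lost.append(tx)
    return lost


# Two forks that grew from the same split point.
fork_a = [["tx1", "tx2"], ["tx3"]]   # two blocks after the split
fork_b = [["tx1", "tx4"]]            # one block after the split

winner = choose_fork([fork_a, fork_b])
print(winner is fork_a)                                # True
print(dropped_transactions([fork_a, fork_b], winner))  # ['tx4']
```

Here tx4 was recorded only on the shorter fork, so it disappears along with that fork unless something re-submits it to the winning chain.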
Problem with the answers: Each fork will have its own transaction data, based on which fork people gamble will be the successful one. Where does the information in the dropped forks go? Answers: It either disappears, as if it didn’t happen… or some system is applied that transitions the dropped fork’s data into the surviving chain (I don’t understand such systems well enough to be qualified to explain them). If this happens, some miners might have spent money that they did not know they would not have…. So now what?
Let’s take this problem into an idea I once heard from a presenter at a blockchain conference: a blockchain for setting up and recording times of flight departures and arrivals. Single point of trust? No, two flight towers can be separate entities, and maybe entirely separate airports, eventually. Transactions: Time stamping when flights have arrived or departed. Miners: Airport companies, maybe even fliers. In theory, maybe this would work under perpetual, ideal conditions.
The problems for this system are as follows:
Question 1: What happens when the connection between the towers is dropped?
They both keep having more flight arrivals, delays, and late departures, so more transactions occur and more blocks are built and added. Now these two towers have entirely different content on their chains. The fork problem above means that one tower at the same airport is treated as having accurate data, and the other’s data must be agreed to be false. The discrepancy is that both may have accurate data, but the fact that they have different data means that one must be completely thrown out. Airports can’t afford this waste of resources: either the missing data is not stored anywhere and they are at a loss, or the accurate data is known and added to the other chain, and the heavy processing for a blockchain has to be repeated.
Question 2: What happens if flight information changes?
Flights can be delayed several times and eventually even be canceled, but the blockchain wouldn’t care. You can’t change it. “-Okay, okay, let’s use the blockchain that stores where the data is stored…!” The problems with that approach make blockchains pointless. (Discussed in section: So, what is the “Chain”?)
Question 3: Why not just use a different system?
A CRDT (Conflict-free Replicated Data Type) is a data structure that can resolve this issue more efficiently and without this problem, assuming the data is properly set up. If we change the use case to patching software or delivering information, Darcs is an older version control system built around a theory of patches that allows patches to be applied out of order.
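To make the comparison concrete, here is a minimal sketch of one simple kind of CRDT, a last-writer-wins map, applied to the flight tower example. The record layout (timestamp, tower ID, status) and the flight numbers are assumptions invented for this illustration; real CRDT libraries are considerably more careful about clocks and deletions.

```python
def merge(left, right):
    """Merge two towers' flight records.

    Each value is a tuple (timestamp, tower_id, status). For every flight,
    the entry with the larger tuple wins, so the newest timestamp is kept
    and the tower ID breaks ties. Merging in either order gives the same
    result, and merging twice changes nothing.
    """
    merged = dict(left)
    for flight, entry in right.items():
        if flight not in merged or entry > merged[flight]:
            merged[flight] = entry
    return merged


# Records written while the two towers could not reach each other.
tower_a = {"AB123": (100, "A", "departed")}
tower_b = {"AB123": (130, "B", "delayed"), "CD456": (110, "B", "arrived")}

# After the connection returns, both towers merge and end up identical.
print(merge(tower_a, tower_b) == merge(tower_b, tower_a))   # True
print(merge(tower_a, tower_b))
```

Unlike the fork case, neither tower's data has to be thrown out wholesale: each flight keeps its most recent update, and later changes such as a delay simply replace earlier ones.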
In short: What happens to the miners when a chain forks? Either they keep their earned money because they were on the winning chain, or the above happens because they were on a losing chain, OR the chain permanently forks and we have two versions of the same blockchain.
Security and This Permanent Data
Ethereum (cryptocurrency) had data that was “public” in an object-oriented programming sense. This means that anyone could affect it whenever they wanted to… This was the “who has how much money” part of the code. So, to prevent theft, Ethereum forked to a more secure version. If you wanted the more secure version of a blockchain, you would have to start/join a new blockchain, meaning you lose all the transactions and rewards that previously existed. Needless to say, not everyone moved to the new chain. The data couldn’t be transferred unless everyone agreed to stop doing transactions long enough for it all to be recalculated on the new chain, although… this would take about as long as the lifespan of the chain. That won’t happen if the chain has lived for months; people need to be using it for it to be an adopted blockchain. This is a larger-scale version of the forking problem.
Assume there are many transactions in general; for example, many degrees/certifications being given to people. Degrees may poof out of existence all of a sudden if the granting entity just chose the wrong fork. Resolution: It can be reassigned to a new chain. “Just add it to the other!”, one might say. This means that the users of the chain want blocks that are as big as possible, with as many transactions as possible.
Remember the earlier problem about miners wanting to process as little data as possible, so they could make as many blocks as possible? Paradox: Miners want as few transactions as possible. Users want as many transactions as possible. The creator will want more blocks from miners, making the chain grow and making the data permanent. If there are no users on the chain, there is no data to hash; then there is no work for the miners, and the blockchain isn’t being used anyway.
One blockchain did this differently. In EOS.io, instead of competing with each other, the miners work together to obtain the next block. One application that used this system was a social media platform; there, the incentive to mine is to allow more posts on the chain. The fact that these systems work together to find hashes lets it scale to greater transaction speeds (reportedly around 50,000 transactions per second at the time of writing).
Takeaway: What incentive is there for miners to mine (hash) data? How many transactions do you have time to do? (See: Comparison of Bitcoin, Ethereum, PayPal, and Visa transaction rates.) Would it be better to make a system/application, of which every college runs its own version, that is a public database of degrees or certifications? The writers and owners of the databases would be the colleges, or better yet, each student could own a personal database. A blockchain may not be ideal, but the system can still be decentralized. If not currency, how can you make people work together if it needs to scale more?
For those of you who would like to know the more complex details, here is how random numbers matter and how the difficulty of mining is determined. If not, you can skip this section. Use this to visualize it and try it yourself, if desired: MIT block tool
Blocks are made harder to mine by requiring the hash to start with some value (usually a run of zeros) before it is considered a valid block. To produce such a hash, the block needs a “nonce”: extra data whose only purpose is to change the hash value. The nonce is often a randomly generated value; where it comes from is up to the miner. It also shouldn’t matter too much, as long as the values tried don’t repeat. Miners are also unlikely to share this information; they are working against each other for the reward of mining blocks, after all. People start from random nonces, rather than counting from the ground up, because someone who starts from one and climbs from there would likely be repeating the attempts of another person, and so would be behind that person in chances for success. One idea to keep in mind is that modern blockchains increase the difficulty of mining as time progresses. This results in even slower growth as time goes on, and it can compensate for the advancement of technology.
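The nonce search can be sketched as below. This is a toy version under stated assumptions: the block is just a text string with the nonce appended, difficulty means the number of leading zero hex digits, and the nonce simply counts upward from a starting value (pass a random start_nonce to mimic a random starting point). Real blockchains use their own block formats and difficulty rules.

```python
import hashlib


def mine(block_data, difficulty=3, start_nonce=0):
    """Try nonces until the block's hash starts with `difficulty` zeros.

    Returns the successful nonce and the resulting hex digest.
    Each extra zero makes the search about 16 times longer on average.
    """
    prefix = "0" * difficulty
    nonce = start_nonce
    while True:
        candidate = f"{block_data}{nonce}".encode("utf-8")
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("example block data", difficulty=3)
print(nonce, digest)
```

Verifying the result takes a single hash of the data plus the nonce, while finding it took thousands of attempts. That imbalance is what makes mined blocks expensive to produce but cheap for everyone else to check.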
The Trusted, Trust-less System
Blockchains are full of permanent data that people trust is accurate. Blockchains prevent anyone from changing data into false information. Blockchains are only worth doing if there is no single point of trust. If only one person can make transactions/write on the chain, who can prove that person won’t lie in the transaction? This person is in full control of the data; even though everyone has it, one person has all the power. Thus a blockchain where only one person can be trusted to post accurate data, to make accurate software/hardware, or even to hash accurate information can’t be trusted by anyone but the person in control. If there is one point of trust, then the system may as well have been created in a server or database that is simply owned by the trusted entity. The system would be faster, the owner would have the necessary control of the data, and one can still prevent the removal or changing of data. The same argument applies to businesses.
So, to ensure the use case is viable, and to prove that it has a purpose being a blockchain rather than some other decentralized system, verify that nowhere along the path of creation is there a “single point of trust”. Blockchains have no data or permission that only one entity is allowed to use.
That is one point that people may call “the beauty of blockchain”. I agree that “Blockchain is an adversarial system. It is a system that people who would take every chance to ruin and demoralize each other will come to trust and agree on this system. People from Israel, Iran, America, and North Korea will come to agreement every second about data on a blockchain. They might fight every day, but in the blockchain they agree on and trust in the truth of the data every second of its existence.”
That is what makes blockchain beautiful, but it also makes it very difficult (if not impossible) to justify using over other systems (if not impossible to use, due to a lack of compatible needs).
Looking at the information that would get stored in a blockchain is also an interesting situation. People who desire privacy may get it. There are possible methods to obtain complete privacy, but these are not usually implemented in currently used cryptocurrencies. The data on the blockchain, combined with outside information about your account, is usually enough to trace the currency back to where it came from. Law enforcement needs this to check that transactions are legal. People who value privacy hate it because it removes anonymity. Looking at the previous example, a system for degrees is intended to have a public presence and public data. Not everyone will want it all to be public data: look back at the GDPR regulations, and keep in mind that some people leave degrees off their resumes because they would appear over-qualified while actually lacking the needed knowledge. Some people may not want their place of education posted, as it is a place to gather more information on people. If high school diplomas are included too, then users need to remember not to use their school as a security question. More examples would be found with time.
In short, people may not want their data on the chain, but once you put it on, now what? Redo the chain and spend a lot of time and money on processing? This issue was not made to get easier over time, either. According to the white paper from Satoshi Nakamoto, who is sometimes looked at as the creator of blockchain, hardware would get more efficient, but the chain would also increase its difficulty to compensate. (Also mentioned in section “Stunting Growth”.) Maybe it can be changed to be more efficient, but cryptography gets stronger, and thus more expensive to compute, over time. (See: Current difficulty for Bitcoin.) The speed of processing a blockchain will not likely change, and if it does, the chain is likely to be vulnerable to collision attacks (talked about in the final section, “Life Expectancy of the Chain”).
Life Expectancy of the Chain
Blockchains have an inevitable and expected end of life. These systems work because hashes are still difficult enough to forge. These hashes work well, but they have a limit in size. Once the algorithm becomes obsolete, the chain becomes vulnerable, because general hardware is then good enough to forge hashes easily.
When calculating hashes, one CAN obtain the same value from more than one input. If someone finds a second piece of data with the same hash as a block on the chain, then the data in the chain can be swapped for that second piece of data without breaking the chain. These vulnerabilities are called collisions, as two calculations have collided into the same hash value. The ledger’s immutability becomes void if one is found. Once again, someone can change data, and who can trust the system’s data accuracy now? The likelihood of this happening depends on the complexity of the hashing algorithm and the number of bits in its output. If it is simple and uses a smaller hash (fewer bytes), calculations will result in the same hash (thus collide) more often. Thus the largest hash possible is desired to ensure the longest life of the chain. This increases hash calculation time for miners, which means fewer transactions can be done per block mined. And now we have returned to the paradox of transaction sizes.
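The effect of hash size on collisions can be demonstrated by deliberately weakening a hash. The sketch below keeps only the first few bits of SHA-256 (an artificial stand-in for a small, obsolete hash, not an attack on SHA-256 itself) and searches for two different inputs with the same truncated value. The function names and the choice of counting numbers as inputs are assumptions for the illustration.

```python
import hashlib
from itertools import count


def truncated_hash(data, bits):
    """Keep only the first `bits` bits of the SHA-256 digest of `data`."""
    value = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return value >> (256 - bits)


def find_collision(bits):
    """Hash the inputs b"0", b"1", b"2", ... until two different inputs
    share the same truncated hash. Returns both inputs and the number
    of inputs tried."""
    seen = {}
    for i in count():
        message = str(i).encode("utf-8")
        h = truncated_hash(message, bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message


for bits in (8, 16, 24):
    first, second, tries = find_collision(bits)
    print(bits, "bits:", first, "and", second, "collide after", tries, "inputs")
```

Each added bit makes a collision harder to stumble on, which is why a full 256-bit hash is out of reach for this kind of search on today’s hardware while a 16-bit one falls almost instantly.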
None of this is to say that blockchains have no uses. Timestamping events is one use that may even be ideal when implemented with a blockchain. The problem is that timestamping may not be interesting to enough people to make it successful, and even if it is successfully made, it will still fail if it isn’t set up properly.
Many people and businesses use blockchains today. Hopefully, a fuller picture of what is necessary, and of what better systems exist, will be realized as a result of writing this. Today, blockchain is a buzzword, but hopefully people will see that there are ALMOST always better solutions to their problems than blockchains.
More information that may not have been mentioned: