Tech Perk: A Qtum Developer Introduces You to Blockchain



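    This article explains how a blockchain works and how it transfers value securely, and describes in detail the underlying mechanics of cryptocurrencies that use the unspent transaction output (UTXO) model, such as Bitcoin and Qtum.
    The other way of managing funds on a blockchain, the account model, is not covered here.
    First, in case you know nothing about blockchains or Bitcoin, here are definitions of some common Bitcoin terms.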
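    *One-way hash function (or just "hash function"): a cryptographic algorithm that converts input data of arbitrary length into a fixed-length "hash" or "digest". Hashing is one-way: given only the output digest, it is computationally infeasible to recover the original input. Bitcoin relies on SHA256, the most common hash algorithm; others include SHA3, RIPEMD160 and scrypt.
    *Public key cryptography: a cryptographic mechanism by which a "private key" can be converted into a "public key" and used to prove ownership of the private key without revealing it. Data can also be encrypted with the public key so that only the holder of the matching private key can decrypt it. In Bitcoin, key pairs are used to sign transactions: the signature data and the public key together prove that the creator of the transaction holds the corresponding private key.
    *Merkle tree: a tree data structure that uses one-way hashing to combine multiple pieces of data, so that no input to the tree can be modified without changing the Merkle root hash.
    *UTXO: unspent transaction output, that is, an output of a transaction that has not yet been spent.
    *Block: the smallest verifiable and unforgeable unit on the blockchain. It contains transactions as well as various data that prove its consensus.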
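    Next, let's look at how transactions work. A Bitcoin transaction resembles cashing a check at a bank. When you want to spend an "output" of a transaction, you must spend all of it, just as you cannot cash only part of a check. However, Bitcoin has no equivalent of cash or bank accounts. So to send funds you first "cash the check", then "output" a new check to the intended address, plus another check back to yourself that holds the remainder for later spending.
    In Bitcoin, a transaction spends one or more UTXOs and creates the corresponding new UTXOs at the receiving addresses. The UTXOs being spent are called "vins", and the newly created UTXOs are called "vouts". Once a UTXO is used by a transaction it is destroyed; its history remains visible on the blockchain.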
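    There is one more problem: a bank check carries a name that identifies the payee. Bitcoin UTXOs use public key cryptography instead. Technically, a UTXO is considered spendable only when the script it contains returns "true". Take the simplest possible script as an example: [pubKey] OP_CHECKSIG
    This is the now obsolete "pay-to-pubkey" transaction [1], the first standard Bitcoin transaction type. Here [pubKey] is the public key data, whose corresponding private key is held by the owner. Publishing the public key does not compromise security.
    Bitcoin's scripting language is stack-based. Imagine a pile of paper documents: you write the public key on a sheet and put it on top of the pile. The other part of the script is OP_CHECKSIG. This operation takes two sheets off the pile from the top down: one holds the public key, and the other holds the cryptographic signature that OP_CHECKSIG checks.
    This may sound confusing, but bear with me. OP_CHECKSIG takes two values from the stack, yet the script appears to contain only one value, pubKey. This is where the vin becomes important. The vout script corresponds to the payee field on a check, and the vin corresponds to the signature on the back, proving that the spender really is the intended payee. In Bitcoin, a script is not executed until the output is spent. When it is spent, the vin script runs first, and the resulting data is moved from the vin stack to the vout stack. So in actual execution the script looks like:
    [signature from vin] [pubKey] OP_CHECKSIG
    The vout script can be seen as a challenge, and the vin as the response to it. With a vin that supplies the signature and attempts to spend the funds, the script can be executed. If the signature and public key are valid, OP_CHECKSIG returns "true" and the UTXO is spent successfully.
    So in any transaction, each vin refers to a UTXO from a previous transaction and provides the answer that makes that UTXO's script return true. An invalid signature or similar makes the script return "false", and the transaction is invalid. As mentioned, a UTXO cannot be partially spent; it is spent completely or left untouched. For example, if a UTXO holds 10 tokens and you want to send 7 to Bob, you create a transaction that spends the whole 10-token UTXO and creates two outputs: one to Bob (using his public key) and one back to yourself (making sure you can supply the "answer" needed to spend it). The second output is the change, and the address it is sent to is called a "change address". A wallet typically holds up to hundreds of public and private keys, and its "total balance" is the sum of all outputs the wallet can spend.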
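    As noted above, pay-to-pubkey transactions are obsolete. One of the main reasons is that public keys are large. For example:
    04e70a02f5af48a1989bf630d92523c9d14c45c75f7d1b998e962bff6ff9995fc5bdb44f1793b37495d80324acba7c8f537caaf8432b8d47987313060cc82d8a93
    A standard Bitcoin address, by contrast, looks like 17L2iLaPvbCMaEieYZwYb18T4H2cq1vz2g. It is significantly shorter for two reasons: it is Base58-encoded, and it contains not the full public key but only a hash of it. This kind of address is associated with "pay-to-pubkeyhash" (P2PKH) transactions, whose script looks like this:
    OP_DUP OP_HASH160 [pubKeyHash] OP_EQUALVERIFY OP_CHECKSIG
    I won't explain the full execution of this script here (the Bitcoin Wiki [2] does). In short, this transaction type became so widely used for two reasons: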

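    1. The hash of a public key is only 160 bits, far shorter than the public key itself. This makes addresses easier to exchange, though writing one out by hand remains impractical;
    2. Should a vulnerability be found in the public key cryptography Bitcoin uses, exposing only the hash of the public key is safer (though this requires a different public key/address for every transaction).
      Now for Base58 addresses. A public key hash (pubkeyhash) stored on the blockchain looks like "456a16722e70016e0d3dcde67840d6df0b59e8ec". Exchanging addresses over a network as raw numbers like this is risky, because a single typo can result in lost funds. In addition, this form cannot distinguish a pubkeyhash transaction from a newer pay-to-scripthash transaction. The Base58 address format therefore consists of: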
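    • Version number
    • Data (a public key hash, a script hash, or other data)
    • Checksum
      The version number identifies the data: in Bitcoin, version byte 0x00 marks a pubkeyhash (such addresses start with "1") and version byte 0x05 marks a scripthash (such addresses start with "3"). The checksum catches typos. A checksum takes data of arbitrary length and outputs a short fixed-length digest used to check the integrity of the data; it resembles a "non-secure" version of a cryptographic hash and has more restricted applications.
      These pieces are combined and then Base58-encoded. Simply put, Base58 turns data into text using 58 different characters, much as binary represents data with only 0 and 1. For example, the ASCII text "testing" is "5QqG6hNRBU" in Base58.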
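      We now have a way to exchange tokens using transactions and scripts, and understand how funds are stored. One problem remains: how can you be sure that a transaction output you receive has not been spent twice? This is where the blockchain itself comes in; its design and operation are what prevent double spends.
      A Bitcoin block header contains the following data: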
    • Version
    • Hash of the previous block header
    • Merkle root hash of all transactions in the block
    • Time of creation
    • Difficulty
    • Nonce
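      The complete transactions (and even witness data) are recorded in the body of the block.
      Because each block includes a reference to the previous block, the history cannot be modified surreptitiously. Modifying a past block changes its block hash and thus breaks the "chain" of block hashes.
      Bitcoin uses the Proof of Work (PoW) consensus mechanism. How it works in detail will be explained in another article. In short, PoW requires miners to perform a certain amount of computational work to solve a difficult puzzle. The first miner to solve it receives a reward, and the new block is added to the blockchain. How much work is required is determined by the "difficulty" parameter defined in the block.
      The network difficulty is not fixed; it is adjusted over the lifetime of the blockchain according to the total hash power. Bitcoin has a "block target time" of 10 minutes, and the difficulty is adjusted automatically by consensus to hit that target. For example, if blocks are being produced every 5 minutes, faster than the target, the difficulty increases so that miners must do more computation to solve the puzzle, and vice versa. In practice, Bitcoin adjusts the difficulty about once every two weeks (every 2016 blocks).
      Note that difficulty is usually quoted as a number such as 14,484.16. That form is only for human readability. The blockchain itself represents the difficulty as a 256-bit number, the target. If the 256-bit block hash is less than this target, the block is valid and can be added to the chain.
      In PoW, only the block header is used by the consensus mechanism, while the Merkle root hash makes it possible to validate the transactions in the block and to ensure that every transaction has been received.
      Once a block has been created, its transactions are considered "permanent". Apart from the occasional "orphan block", the only way to double spend a UTXO is to completely replace the block containing the spending transaction. Because each block references the blocks before it, the chance of such an attack succeeding falls exponentially as more blocks are added, and the attacker must redo the proof of work for every one of those blocks. This is why many Bitcoin services wait for 3 to 6 "confirmations" before treating a transaction as final.
      One last question remains: where do the tokens come from? The answer is "mining". When mining, a miner adds a "coinbase" transaction to the block. This transaction has no inputs and outputs a set number of tokens (12.5 bitcoins at the time of writing). All tokens in the system originate from coinbase transactions; without tokens there would be no transactions, and the system would have no purpose.
      You should now understand how a blockchain transfers value securely, how difficulty and additional "confirmations" guard against double-spend attacks, and how Bitcoin, Qtum and other UTXO-based cryptocurrencies work at the protocol level.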
      More information:
      1: https://en.bitcoin.it/wiki/Script#Obsolete_pay-to-pubkey_transaction
      2: https://en.bitcoin.it/wiki/Script#Standard_Transaction_to_Bitcoin_address_.28pay-to-pubkey-hash.29
      The English version follows.
      Today I'd like to introduce the basics of how a blockchain works, and how it keeps track of assets in a secure manner. I will be covering the UTXO model, as it is used by Bitcoin and Qtum. There is another way of managing funds on the blockchain called the account model, but it will not be covered here.
      First I'd like to give some definitions in case you do not know anything about Bitcoin.
    • One-way hash (or just "hash") - A cryptographic algorithm that converts an arbitrary amount of data into a fixed-length "digest". It does this in such a way that, given only the digest, it is computationally infeasible to determine what the input data was; furthermore, the digest cannot be predicted from the input without actually computing the hash. The most common example is SHA256, which is used extensively in Bitcoin; others include SHA3, RIPEMD160 and scrypt.
    • Public key cryptography - A cryptographic mechanism by which a "private" secret key can be converted into a "public" key and used to prove ownership of the private key without giving away the secret. Additionally it is possible to encrypt data using the public key so that only the person holding the private key can decrypt it. In Bitcoin this is commonly used to sign transactions. It is possible to prove that the creator of the transaction owns the secret private key by using only the signature data and the public key.
    • Merkle tree - A tree data structure that uses one-way hashing to hold multiple pieces of data, so that no data in the input of the tree can be modified without changing the final value, the Merkle root hash.
    • UTXO - Unspent Transaction Output, an unspent vout from a transaction
    • Block - The smallest verifiable and unforgeable unit on the blockchain. It contains various data to prove its consensus, as well as transactions.
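The hash and Merkle tree definitions above can be made concrete with a minimal sketch. It assumes Bitcoin's conventions of double SHA-256 and of pairing the last hash with itself on odd-sized levels; the transaction strings are made-up placeholders, not real serialized transactions.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin applies to transactions and block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(items: list[bytes]) -> bytes:
    """Hash every item, then hash pairs level by level until one root remains."""
    if not items:
        raise ValueError("a Merkle tree needs at least one item")
    level = [sha256d(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd level: pair the last hash with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder "transactions" (hypothetical data for illustration only).
transactions = [b"alice pays bob", b"bob pays carol", b"carol pays dave"]
root = merkle_root(transactions)

tampered = [b"alice pays bob", b"bob pays MALLORY", b"carol pays dave"]
print(root.hex())
print(merkle_root(tampered).hex())  # a different root, so the change is detectable
```

Changing any single input changes the root, which is why a block header only needs to commit to the root to commit to every transaction.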
      So, let's talk about how transactions work in this. Transactions in Bitcoin resemble a cashier's check in some ways. When you want to spend an "output" of a transaction you must spend the entire thing. It's similar to how you can't walk into the bank and say "I want to cash half of this check". However, in this model there is no equivalent of cash or bank accounts. So in order to send money anywhere you must "cash" a check written out to you, and "output" from that cashing process a check to your intended destination, and another check back to yourself.
      This "cashing process" is actually a transaction in Bitcoin. In a transaction you spend 1 or more "checks" (actually known as UTXOs) and create 1 or more UTXOs to new destinations from those spent funds. The UTXOs you spend in a transaction are called "vins", and the new UTXOs you create are called "vouts". Once a UTXO is spent by a transaction it can be considered gone and destroyed. You can see its history in the blockchain, but there is nothing more that can be done with it.
      So, one problem in our system so far is that checks are normally written out to names, such as "Jordan Earls". Of course, anyone on the internet can claim to be any name. This is where we introduce public key cryptography and programming into UTXOs. In Bitcoin, each UTXO contains a script, a small computer program, and the UTXO is only spendable if you can make that script end by saying "True". Let's look at the simplest possible script that does something useful:
      [pubKey] OP_CHECKSIG
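      This is the "pay-to-pubkey" script [1], the first standard Bitcoin transaction type. [pubKey] is the data of a public key whose private key is held by the recipient; publishing the public key does not compromise security. Bitcoin's scripting language is stack-based. Imagine a pile of paper: the script writes the public key on a sheet and puts it on top of the pile. OP_CHECKSIG then takes the top two sheets off the pile, one holding the public key and the other holding the signature to check against it.
      This is confusing now though. OP_CHECKSIG takes 2 values from the stack (also known as arguments), but our script appears to only have 1 value, pubKey. This is where the vin portion becomes important. You can imagine the vout script as the "pay to" field on a check, and the vin script as the place you sign on the back, proving that you are indeed the intended party from the "pay to" field. In Bitcoin, a script is not executed until its output is spent. When it is spent, the vin script is executed first, and the resulting data is moved from the vin stack onto the vout stack. So in actual execution, the script looks like: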
      [signature from vin] [pubKey] OP_CHECKSIG
      One could consider the vout script as a challenge, and the vin as the answer given to the vout to satisfy it. Anyway, now that we have a vin providing the signature and attempting to spend these funds, we can actually execute the script. If the signature and public key are valid, then OP_CHECKSIG will push "true" on the stack, resulting in the UTXO being successfully spent.
      So in a transaction, each vin specifies a previous UTXO, and provides an answer that causes the UTXO's script to return "true". If an invalid signature or similar is used, then the script will return "false" and the transaction will not be valid. It is not possible to partially spend a UTXO. It must be completely spent or left untouched. This means that if you have a UTXO worth 10 tokens, and you want to send 7 tokens to Bob, then you must make a transaction that spends this 10 token UTXO, and creates 2 outputs: one output to Bob (using his public key), and one output back to yourself (ensuring that you can provide an "answer" to the vout to spend it successfully). This second output back to yourself is the change, and the address it pays to is called a "change address". Each wallet typically holds dozens or even hundreds of public and private keys. The wallet's "total balance" is the total value of all outputs that can be spent by the wallet.
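The 10-token example can be sketched as a tiny UTXO set. This is a simplified model, not Bitcoin's data structures: the `owner` string stands in for a locking script, and `apply_transaction` and `balance` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    owner: str   # stands in for the locking script (hypothetical simplification)
    amount: int


# The UTXO set maps (txid, output index) to an unspent output.
utxos: dict[tuple[str, int], Output] = {("tx0", 0): Output("alice", 10)}


def apply_transaction(txid: str, vins: list[tuple[str, int]], vouts: list[Output]) -> None:
    """Spend every referenced UTXO in full and create the new outputs."""
    if len(set(vins)) != len(vins):
        raise ValueError("the same output is listed twice")
    missing = [ref for ref in vins if ref not in utxos]
    if missing:
        raise ValueError(f"unknown or already spent outputs: {missing}")
    spent = sum(utxos[ref].amount for ref in vins)
    created = sum(out.amount for out in vouts)
    if created > spent:
        raise ValueError("outputs exceed inputs")
    for ref in vins:
        del utxos[ref]                 # a spent UTXO is gone for good
    for index, out in enumerate(vouts):
        utxos[(txid, index)] = out


def balance(owner: str) -> int:
    """A wallet balance is just the sum of the outputs it can spend."""
    return sum(out.amount for out in utxos.values() if out.owner == owner)


# Alice sends 7 of her 10 tokens to Bob and 3 back to herself as change.
apply_transaction("tx1", [("tx0", 0)], [Output("bob", 7), Output("alice", 3)])
print(balance("alice"), balance("bob"))  # 3 7
```

Trying to spend `("tx0", 0)` a second time raises `ValueError`, because the output no longer exists in the set.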
      However, you might note that we said earlier that the pubkey transaction is obsolete. There are many reasons for this, but a big one is that public keys are fairly large. Take this example:
      04e70a02f5af48a1989bf630d92523c9d14c45c75f7d1b998e962bff6ff9995fc5bdb44f1793b37495d80324acba7c8f537caaf8432b8d47987313060cc82d8a93
      Meanwhile, a standard Bitcoin address looks like 17L2iLaPvbCMaEieYZwYb18T4H2cq1vz2g. It is significantly shorter for 2 reasons. First, it is "base-58 encoded", and second, it does not contain all the data for a public key, but rather just a hash of the public key. This address is associated with "pay-to-pubkeyhash" transactions. Their script looks like this:
      OP_DUP OP_HASH160 [pubKeyHash] OP_EQUALVERIFY OP_CHECKSIG
      And the way to spend it involves two pieces of data rather than one:
      [signature] [pubKey]
      The full public key must be included when spending these outputs because a hash cannot be reversed to recover the full public key data. Furthermore, checking a signature in a public key cryptography system requires the full public key; it cannot work from just a hash of it.
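Both scripts shown so far can be traced with a toy stack machine. This is a sketch under loud assumptions: it supports only the four opcodes used here, and signature checking is injected as a hypothetical callback (`fake_check_sig`) because Python's standard library has no secp256k1 ECDSA. The key and signature bytes are placeholders.

```python
import hashlib
from typing import Callable

Script = list  # items are either bytes (data pushes) or str (opcode names)


def hash160(data: bytes) -> bytes:
    """RIPEMD160(SHA256(data)), the hash behind OP_HASH160."""
    sha = hashlib.sha256(data).digest()
    try:
        return hashlib.new("ripemd160", sha).digest()
    except ValueError:
        # Some OpenSSL builds no longer ship RIPEMD160. Fall back to a truncated
        # SHA-256 so the demo still runs; this is NOT what Bitcoin computes.
        return hashlib.sha256(sha).digest()[:20]


def run(vin: Script, vout: Script, check_sig: Callable[[bytes, bytes], bool]) -> bool:
    """Execute the vin script, then the vout script, on one shared stack."""
    stack: list[bytes] = []
    for item in vin + vout:
        if isinstance(item, bytes):
            stack.append(item)                      # data push
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(hash160(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False                        # hash mismatch: script fails
        elif item == "OP_CHECKSIG":
            pubkey, signature = stack.pop(), stack.pop()
            stack.append(b"\x01" if check_sig(signature, pubkey) else b"")
        else:
            raise ValueError(f"unsupported opcode {item}")
    return bool(stack) and stack[-1] != b""


# Hypothetical stand-in for ECDSA: only this exact pair "verifies".
PUBKEY, GOOD_SIG = b"alice-pubkey", b"alice-signature"


def fake_check_sig(signature: bytes, pubkey: bytes) -> bool:
    return (signature, pubkey) == (GOOD_SIG, PUBKEY)


p2pk = [PUBKEY, "OP_CHECKSIG"]
p2pkh = ["OP_DUP", "OP_HASH160", hash160(PUBKEY), "OP_EQUALVERIFY", "OP_CHECKSIG"]

print(run([GOOD_SIG], p2pk, fake_check_sig))            # True
print(run([GOOD_SIG, PUBKEY], p2pkh, fake_check_sig))   # True
print(run([b"forged", PUBKEY], p2pkh, fake_check_sig))  # False
```

In the pay-to-pubkeyhash case the vin has to push the full public key so that OP_DUP and OP_HASH160 can compare its hash with the one stored in the vout before the signature is checked.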
      I won't cover the complete execution of this script (the Bitcoin Wiki [2] does though), but the basic gist of why this transaction has become the most popular transaction is:
    1. The hash of a public key is only 160 bits, whereas a public key is much larger. This makes it much easier to pass around and makes hand-writing an address tedious but possible.
    2. In case of a vulnerability in the public key cryptography used in Bitcoin, only putting the hash of the public key provides more security (though it requires using different public keys/addresses for every single transaction to benefit from this)

    Now to get back to what a Base58 address is. The actual pubkeyhash stored on the blockchain might be 456a16722e70016e0d3dcde67840d6df0b59e8ec. It is troublesome to pass a raw number like this around on the internet however because a typo or mistake could result in permanently lost funds. Additionally, it is impossible to determine if this data is for a pubkeyhash transaction, or a new pay-to-scripthash transaction. The Base58 address format thus consists of the following:

    • Version number
    • Data (in this case a public key hash, but could also be a script hash or any other data)
    • Checksum
      The version number solves the problem of identifying the data. In Bitcoin, version byte 0x00 is used for pubkeyhash (these addresses begin with "1"), while version byte 0x05 is used for scripthash (these addresses begin with "3"). And finally the checksum solves the typo problem. A checksum takes data of arbitrary length and outputs a small digest. A checksum is similar to a non-secure version of a cryptographic hash, and has more restricted applications. This checksum digest can be checked to ensure that none of the data was changed accidentally.
      With this data together, it is then encoded into base58. Base58 is just a way of turning data into a text representation that uses 58 different characters, similar to how base-2 is binary (using only 1 and 0). Base58 can be used outside of Bitcoin as well; for example, "testing" (as ASCII data) in base58 is 5QqG6hNRBU.
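The address format described above (version, data, checksum, then base58) can be sketched in a few lines. The sketch assumes Bitcoin's Base58Check conventions: the checksum is the first 4 bytes of a double SHA-256 over version plus data, and each leading zero byte is written as a leading "1". The helper names are illustrative.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Plain base58: treat the bytes as one big number and write it in base 58."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def base58check(version: int, payload: bytes) -> str:
    """Version byte + payload + 4-byte checksum, then base58."""
    body = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(body).digest()).digest()[:4]
    return b58encode(body + checksum)


print(b58encode(b"testing"))  # 5QqG6hNRBU

pubkey_hash = bytes.fromhex("456a16722e70016e0d3dcde67840d6df0b59e8ec")
print(base58check(0x00, pubkey_hash))  # pubkeyhash address, starts with "1"
print(base58check(0x05, pubkey_hash))  # same bytes as a scripthash address, starts with "3"
```

The alphabet leaves out 0, O, I and l, which are easy to confuse when an address is read or typed by a person.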
      Finally, we have a reasonable way of exchanging tokens using transactions and scripts, and understand how money is stored. However, we face a problem. When someone sends you a transaction output, how can you be sure that their vins for that transaction only use unspent outputs? This is where the concept of the blockchain becomes important. The blockchain at its core is what prevents double-spends, that is, spending the same transaction output more than once (presumably sending it to different parties).
      A block in Bitcoin has a header. The header contains the following:
    • Version
    • Previous block header hash
    • Merkle root hash of all transactions in the block
    • Time of creation
    • Difficulty
    • Nonce
      The body of the block is complete transactions (and eventually witnesses as well, but that's another topic).
      Because each block includes a reference to the previous block, it is impossible to modify a previous block surreptitiously. To modify a previous block would change the block hash, and thus break the "chain" made of block hashes.
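A minimal sketch of the header and the hash chain, assuming Bitcoin's 80-byte little-endian header layout and double SHA-256. The Merkle roots are placeholders produced by a hypothetical `fake_root` helper, and the timestamps and `bits` values are arbitrary example numbers.

```python
import hashlib
import struct
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BlockHeader:
    version: int
    prev_hash: bytes      # 32 bytes, hash of the previous block header
    merkle_root: bytes    # 32 bytes
    time: int
    bits: int             # compact form of the difficulty target
    nonce: int

    def serialize(self) -> bytes:
        # Six fields packed little-endian into 80 bytes.
        return struct.pack("<I32s32sIII", self.version, self.prev_hash,
                           self.merkle_root, self.time, self.bits, self.nonce)

    def block_hash(self) -> bytes:
        data = self.serialize()
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def fake_root(label: bytes) -> bytes:
    """Placeholder Merkle root for the demo (hypothetical helper)."""
    return hashlib.sha256(label).digest()


def is_linked(parent: BlockHeader, child: BlockHeader) -> bool:
    return child.prev_hash == parent.block_hash()


genesis = BlockHeader(1, bytes(32), fake_root(b"block 0"), 1493950000, 0x1D00FFFF, 0)
block1 = BlockHeader(1, genesis.block_hash(), fake_root(b"block 1"), 1493950600, 0x1D00FFFF, 0)
print(is_linked(genesis, block1))   # True

# Rewriting history changes the parent's hash, so the child no longer points at it.
forged = replace(genesis, merkle_root=fake_root(b"rewritten history"))
print(is_linked(forged, block1))    # False
```

To make the forged block stick, an attacker would have to rebuild `block1` and every later block as well, each with its own valid proof of work.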
      Bitcoin uses the Proof of Work (PoW) consensus system. This will be explained more in a later article, but basically it is a system which requires participants (miners) in the block creation process to put in a certain amount of computational work to solve a difficult puzzle. The first miner to solve the puzzle gets a reward and their created block is added to the network's blockchain. How much work must be done is controlled by the "difficulty" specified in the block.
      The difficulty in the network adjusts throughout the lifetime of the blockchain. Bitcoin specifically has a "block target time" of 10 minutes. The difficulty is adjusted automatically (by consensus) to try to hit this target time. So, if many blocks have been created only 5 minutes apart, then this would indicate the network is making blocks too quickly. As a result the difficulty would be adjusted automatically to become more difficult and require more computation to find a block. And it of course works in the opposite manner too: if the network is creating blocks too slowly, then the difficulty will decrease. Bitcoin specifically only adjusts the difficulty about once every 2 weeks (every 2016 blocks).
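The adjustment can be sketched as a single formula, assuming Bitcoin's rules: a retarget every 2016 blocks, a 10 minute spacing, and a change limited to a factor of 4 per adjustment. The difficulty is expressed here as a target number that a block hash must stay below, so a smaller target means harder mining. The sketch ignores the network-wide maximum target.

```python
RETARGET_INTERVAL = 2016                 # blocks between adjustments
TARGET_SPACING = 10 * 60                 # seconds per block
EXPECTED_TIMESPAN = RETARGET_INTERVAL * TARGET_SPACING   # two weeks in seconds


def next_target(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks really took."""
    # Limit a single adjustment to a factor of 4 in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


old = 1_000_000
# Blocks arrived every 5 minutes: the target halves, so mining gets twice as hard.
print(next_target(old, RETARGET_INTERVAL * 5 * 60))    # 500000
# Blocks arrived every 20 minutes: the target doubles, so mining gets easier.
print(next_target(old, RETARGET_INTERVAL * 20 * 60))   # 2000000
```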
      Note that difficulty is commonly referred to as a number like 14,484.16. However, this is "only for humans". The blockchain itself stores a compact representation of a 256-bit number, the target. If the block hash (which is also a 256-bit number) is less than this 256-bit target, then it is considered to have met the difficulty and is a valid PoW block.
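A sketch of how the compact form relates to the human-readable number, and of the proof-of-work loop itself. It assumes Bitcoin's compact "bits" encoding (one exponent byte and a three-byte mantissa) and its convention of reading the hash as a little-endian integer; Bitcoin's actual check also accepts a hash equal to the target. The header bytes and the very easy target are made up so the loop finishes quickly.

```python
import hashlib


def bits_to_target(bits: int) -> int:
    """Expand the compact 32-bit form into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent >= 3:
        return mantissa << (8 * (exponent - 3))
    return mantissa >> (8 * (3 - exponent))


MAX_TARGET = bits_to_target(0x1D00FFFF)   # the easiest target, difficulty 1


def difficulty(bits: int) -> float:
    """The human-readable number: how many times harder than difficulty 1."""
    return MAX_TARGET / bits_to_target(bits)


def meets_target(header: bytes, target: int) -> bool:
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little") <= target


def mine(header_prefix: bytes, target: int) -> int:
    """Try nonces until the header hash falls under the target."""
    for nonce in range(2**32):
        if meets_target(header_prefix + nonce.to_bytes(4, "little"), target):
            return nonce
    raise RuntimeError("no nonce found; a real miner would change the header and retry")


print(difficulty(0x1D00FFFF))   # 1.0

easy_target = 1 << 244          # toy target: about 1 hash in 4096 succeeds
nonce = mine(b"toy header", easy_target)
print(nonce)
```

Halving the target doubles the expected number of hashes a miner must try, which is all that "more difficult" means here.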
      In PoW, only the block header is actually used for the consensus mechanism. The merkle root hash ensures that despite this, it is possible to validate every transaction in the body of the block, as well as ensure that every transaction has been received.
      Once a block has been created, its transactions can be mostly considered permanent. The only way to "double spend" a UTXO is to replace the block in which the spending transaction took place. This can happen naturally in some cases (known as orphan blocks), but as more blocks are built on top of the transaction-containing block, it becomes exponentially less likely. Replacing the block maliciously also gets harder, because the attacker must redo the proof of work for that block and for every block built on top of it.
      This is why many services that accept Bitcoin wait for 3 or 6 confirmations (blocks placed on top of the transaction containing block). It becomes incredibly unlikely that the blockchain could be broken and those funds spent by another transaction.
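How quickly the risk falls with each confirmation can be estimated with the calculation from section 11 of the Bitcoin whitepaper, which is not part of this article. The sketch below assumes that model: an attacker with a fraction `q` of the network's hash power tries to catch up from `z` blocks behind. The 10% figure is a hypothetical example.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Chance that an attacker with hash power share q ever overtakes z confirmations."""
    p = 1.0 - q                      # honest share of the hash power
    if q >= p:
        return 1.0                   # a majority attacker always catches up eventually
    lam = z * (q / p)                # expected attacker progress while z blocks are mined
    total = 0.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total += poisson * (1.0 - (q / p) ** (z - k))
    return 1.0 - total


for confirmations in (0, 1, 3, 6):
    chance = attacker_success_probability(0.10, confirmations)
    print(confirmations, f"{chance:.7f}")
```

With a 10% attacker the probability drops by roughly a factor of four with each additional confirmation, which is the reasoning behind waiting for several blocks.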
      We have only one remaining problem. Where do the tokens initially come from? They come from a process called mining. As part of mining, a miner adds a special transaction called a "coinbase" transaction. This transaction has no inputs, and is allowed to have outputs worth a set amount (12.5 bitcoins at the time of writing). This coinbase transaction is where all of the tokens in circulation actually come from. Without tokens there would be no transactions to create, and thus nothing to be done.
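The "set amount" is fixed by the protocol rather than chosen by the miner. A minimal sketch, assuming Bitcoin's schedule of a 50 bitcoin starting subsidy that halves every 210,000 blocks (a detail this article does not go into):

```python
COIN = 100_000_000            # one bitcoin in satoshis, the smallest unit
HALVING_INTERVAL = 210_000    # blocks between halvings


def block_subsidy(height: int) -> int:
    """New coins (in satoshis) a coinbase transaction may create at this block height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0              # the subsidy has been shifted away entirely
    return (50 * COIN) >> halvings


print(block_subsidy(0) / COIN)         # 50.0
print(block_subsidy(420_000) / COIN)   # 12.5
```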
      Now we have a functioning blockchain that is capable of holding its value securely, ensuring that double spends are extremely difficult to execute (and increasingly so with more confirmations). You should now know enough to understand how Bitcoin, Qtum, and other UTXO cryptocurrencies really work at the protocol level and can begin to look into more advanced topics on the blockchain.
      More information:
      1: https://en.bitcoin.it/wiki/Script#Obsolete_pay-to-pubkey_transaction
      2: https://en.bitcoin.it/wiki/Script#Standard_Transaction_to_Bitcoin_address_.28pay-to-pubkey-hash.29
