Ethereum EVM Source Code Analysis: Data Structures

Overall Structure of the EVM Code

Directory layout of the EVM-related source code:

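~/go-ethereum-master/core/vm# tree
.
├── analysis.go                     // analyzes contract bytecode and marks valid jump destinations (JUMPDEST)
├── analysis_test.go
├── common.go                       // shared helper functions
├── contract.go                     // smart contract data structure
├── contracts.go                    // precompiled contracts
├── contracts_test.go
├── doc.go
├── eips.go                         // implementations of some EIPs
├── errors.go                       // execution-time errors
├── evm.go                          // the executor; exposes the external interface
├── gas.go                          // call gas calculation; gas cost tiers for basic instructions
├── gas_table.go                    // gas calculation functions for individual instructions
├── gas_table_test.go
├── gen_structlog.go
├── instructions.go                 // execution functions for the instructions
├── instructions_test.go
├── interface.go                    // StateDB interface and CallContext, the basic interface for EVM calling conventions
├── interpreter.go                  // the interpreter; core of execution
├── intpool.go                      // pool of big.Int values that speeds up big.Int allocation (helper object)
├── intpool_test.go
├── int_pool_verifier_empty.go
├── int_pool_verifier.go
├── jump_table.go                   // table mapping each instruction to its operation (execution, gas cost, validation)
├── logger.go                       // logger and Tracer (helper objects)
├── logger_json.go
├── logger_test.go
├── memory.go                       // memory model and its access functions
├── memory_table.go                 // EVM memory table: computes the memory size an instruction needs
├── opcodes.go                      // EVM instruction set
├── stack.go                        // stack and its methods
└── stack_table.go                  // stack helper functions such as minStack and maxStack

~/go-ethereum-master/core# tree
.
├── evm.go                          // functions used by the EVM
├── gaspool.go                      // GasPool tracks the gas available in a block while transactions execute
├── state_processor.go              // processes state transitions
├── state_transition.go             // state transition model
└── types                           // core data structures
    ├── block.go                    // block, blockHeader
    ├── log.go                      // log
    ├── receipt.go                  // receipt
    ├── transaction.go              // transaction, message
    └── transaction_signing.go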

The figure here is an overall structure diagram of the EVM module found online; parts of it are out of date. As of 2020-02-25 there are already seven versions of the EVM instruction set, and the fields of operation have also changed somewhat.
core/vm/jump_table.go

// Instruction sets: seven are defined below, one for each of seven Ethereum versions.
var (
	frontierInstructionSet         = newFrontierInstructionSet()
	homesteadInstructionSet        = newHomesteadInstructionSet()
	tangerineWhistleInstructionSet = newTangerineWhistleInstructionSet()
	spuriousDragonInstructionSet   = newSpuriousDragonInstructionSet()
	byzantiumInstructionSet        = newByzantiumInstructionSet()
	constantinopleInstructionSet   = newConstantinopleInstructionSet()
	istanbulInstructionSet         = newIstanbulInstructionSet()
)

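The EVM code structure is simpler than one might expect. The core objects are the interpreter (Interpreter), the interpreter's configuration options (Config), the execution context (Context) that supplies the EVM with auxiliary information, and the EVM database (StateDB) used for full state queries.
As the figure above shows, the EVM runs smart contracts through the interpreter, and the interpreter depends on the core structure in Config: JumpTable [256]operation. The index into JumpTable is the opcode, and the operation object at JumpTable[opCode] stores the instruction's execution logic, gas calculation function, stack validation limits, memory size function, and a few flags.
Different Ethereum versions use different JumpTables. Only the initializer of frontierInstructionSet builds the operation objects for the basic instructions; every later version patches the one before it. It first generates the previous version's instruction set, then applies some EIPs to add its own instructions or modify existing ones. For example, the latest version, Istanbul:
core/vm/jump_table.go

// newIstanbulInstructionSet returns the frontier, homestead
// byzantium, constantinople and petersburg instructions.
// It first builds the instruction set of the previous version, Constantinople, then applies some EIPs.
func newIstanbulInstructionSet() JumpTable {
	instructionSet := newConstantinopleInstructionSet()

	enable1344(&instructionSet) // ChainID opcode - https://eips.ethereum.org/EIPS/eip-1344
	enable1884(&instructionSet) // Reprice reader opcodes - https://eips.ethereum.org/EIPS/eip-1884
	enable2200(&instructionSet) // Net metered SSTORE - https://eips.ethereum.org/EIPS/eip-2200

	return instructionSet
}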
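The layering described above can be sketched in a few lines. This is a simplified illustration, not the go-ethereum code: the reduced operation type, the names newFrontierSet and newHomesteadSet, and the opSTOP/opDELEGATECALL constants are assumptions made for the example. It relies only on the fact that DELEGATECALL (0xf4) was introduced with Homestead.

```go
package main

import "fmt"

// operation is a reduced stand-in for the operation type in core/vm;
// only the fields needed for this illustration are kept.
type operation struct {
	name  string
	valid bool
}

// JumpTable is indexed directly by the opcode byte.
type JumpTable [256]operation

const (
	opSTOP         = 0x00
	opDELEGATECALL = 0xf4
)

// newFrontierSet builds the base table from scratch.
func newFrontierSet() JumpTable {
	var jt JumpTable
	jt[opSTOP] = operation{name: "STOP", valid: true}
	return jt
}

// newHomesteadSet starts from the previous table and patches it,
// mirroring how each fork extends the one before it.
func newHomesteadSet() JumpTable {
	jt := newFrontierSet() // arrays are values in Go, so this is an independent copy
	jt[opDELEGATECALL] = operation{name: "DELEGATECALL", valid: true}
	return jt
}

func main() {
	frontier, homestead := newFrontierSet(), newHomesteadSet()
	fmt.Println(frontier[opDELEGATECALL].valid, homestead[opDELEGATECALL].valid)
}
```

Because the table is a fixed-size array indexed by opcode, looking up an instruction is a single bounds-free index operation, and an unset slot is recognisable by valid being false.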

Contract

The EVM is the runtime environment for smart contracts, so it is worth understanding the structure of a contract and its most important methods.
core/vm/contract.go

// ContractRef is a reference to the contract's backing object
type ContractRef interface {
	Address() common.Address // returns the contract address
}

// Contract represents an ethereum contract in the state database. It contains
// the contract code, calling arguments. Contract implements ContractRef
type Contract struct {
	// CallerAddress is the result of the caller which initialised this
	// contract. However when the "call method" is delegated this value
	// needs to be initialised to that of the caller's caller.
	CallerAddress common.Address
	caller        ContractRef // the caller
	self          ContractRef // the contract itself

	// Aggregated JUMPDEST analysis. For each byte of contract bytecode it records whether
	// the byte is an instruction or plain data; only an instruction can be a jump destination.
	jumpdests map[common.Hash]bitvec // Aggregated result of JUMPDEST analysis.
	// JUMPDEST analysis of this contract's own code, cached locally rather than in the caller's context.
	analysis bitvec // Locally cached result of JUMPDEST analysis

	Code     []byte          // code
	CodeHash common.Hash     // code hash
	CodeAddr *common.Address // code address
	Input    []byte          // input arguments of the call

	Gas   uint64   // amount of gas
	value *big.Int // value carried with the call, e.g. the amount of a transaction
}

Constructor

core/vm/contract.go

// NewContract returns a new contract environment for the execution of EVM.
func NewContract(caller ContractRef, object ContractRef, value *big.Int, gas uint64) *Contract {
	// Initialise the Contract object.
	c := &Contract{CallerAddress: caller.Address(), caller: caller, self: object}

	// Type-assert the ContractRef interface to the concrete *Contract type.
	// ok is true when the caller is itself a Contract, and false otherwise.
	if parent, ok := caller.(*Contract); ok {
		// Reuse JUMPDEST analysis from parent context if available.
		c.jumpdests = parent.jumpdests
	} else {
		// Otherwise initialise a fresh jumpdests map.
		c.jumpdests = make(map[common.Hash]bitvec)
	}

	// Gas should be a pointer so it can safely be reduced through the run
	// This pointer will be off the state transition
	c.Gas = gas
	// ensures a value is set
	c.value = value

	return c // return the contract pointer
}

Methods

core/vm/contract.go

// Methods of Contract.
// validJumpdest reports whether the jump destination is valid.
func (c *Contract) validJumpdest(dest *big.Int) bool {
	udest := dest.Uint64() // convert the destination to uint64
	// PC cannot go beyond len(code) and certainly can't be bigger than 63bits.
	// Don't bother checking for JUMPDEST in that case.
	if dest.BitLen() >= 63 || udest >= uint64(len(c.Code)) {
		return false
	}
	// Only JUMPDESTs allowed for destinations
	if OpCode(c.Code[udest]) != JUMPDEST {
		return false
	}
	// The code below checks that the destination byte is an instruction rather than plain data.
	// Do we have a contract hash already?
	if c.CodeHash != (common.Hash{}) { // not the empty hash
		// Does parent context have the analysis?
		// NewContract set c.jumpdests = parent.jumpdests. Go maps are reference types,
		// so c.jumpdests and parent.jumpdests refer to the same underlying map.
		analysis, exist := c.jumpdests[c.CodeHash] // look up the entry
		if !exist {
			// Do the analysis and save in parent context
			// We do not need to store it in c.analysis
			analysis = codeBitmap(c.Code)
			// Because the map is shared, writing to c.jumpdests also updates
			// parent.jumpdests; entries are keyed by each contract's code hash.
			c.jumpdests[c.CodeHash] = analysis
		}
		// Check whether the destination lies in a code segment and return the result.
		return analysis.codeSegment(udest)
	}
	// We don't have the code hash, most likely a piece of initcode not already
	// in state trie. In that case, we do an analysis, and save it locally, so
	// we don't have to recalculate it for every JUMP instruction in the execution
	// However, we don't save it within the parent context
	// This usually happens during contract creation, before the contract is written to the state database.
	if c.analysis == nil {
		c.analysis = codeBitmap(c.Code)
	}
	return c.analysis.codeSegment(udest)
}

// AsDelegate sets the contract to be a delegate call and returns the current
// contract (for chaining calls)
func (c *Contract) AsDelegate() *Contract {
	// NOTE: caller must, at all times be a contract. It should never happen
	// that caller is something other than a Contract.
	parent := c.caller.(*Contract)         // the caller's Contract object
	c.CallerAddress = parent.CallerAddress // set the caller address to that of the caller's caller
	c.value = parent.value                 // likewise for the value
	return c                               // return the current contract
}
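To make the "instruction versus plain data" check in validJumpdest concrete, here is a minimal sketch of the same idea. It is an illustration under assumptions: codeMap and the op* constants are names invented for this example, and a []bool is used in place of go-ethereum's bitvec for readability. The point is that a 0x5b byte inside the immediate data of a PUSH instruction is not a valid jump destination.

```go
package main

import "fmt"

const (
	opJUMPDEST = 0x5b
	opPUSH1    = 0x60
	opPUSH32   = 0x7f
)

// codeMap marks, for each byte of code, whether it is an instruction (true)
// or immediate data belonging to a PUSHn instruction (false).
func codeMap(code []byte) []bool {
	isCode := make([]bool, len(code))
	for pc := 0; pc < len(code); {
		isCode[pc] = true
		op := code[pc]
		pc++
		if op >= opPUSH1 && op <= opPUSH32 {
			pc += int(op-opPUSH1) + 1 // skip the n bytes of push data
		}
	}
	return isCode
}

// validJumpdest reports whether dest points at a JUMPDEST opcode that is
// a real instruction and not push data.
func validJumpdest(code []byte, dest uint64) bool {
	if dest >= uint64(len(code)) || code[dest] != opJUMPDEST {
		return false
	}
	return codeMap(code)[dest]
}

func main() {
	// PUSH1 0x5b; JUMPDEST
	code := []byte{0x60, 0x5b, 0x5b}
	fmt.Println(validJumpdest(code, 1), validJumpdest(code, 2))
}
```

This sketch recomputes the map on every call; the real code avoids that by caching the analysis per code hash in jumpdests, or locally in analysis when no hash is available.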

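How Contract Is Used in the EVM

A Transaction is converted into a Message before being passed to the EVM. When EVM.Call or EVM.Create is invoked, the Message is converted into a Contract object for subsequent execution. As the accompanying figure shows, the contract code is fetched from the corresponding address in the state database and loaded into the Contract object.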

EVM

core/vm/evm.go

// EVM is the Ethereum Virtual Machine base object and provides
// the necessary tools to run a contract on the given state with
// the provided context. It should be noted that any error
// generated through any of the calls should be considered a
// revert-state-and-consume-all-gas operation, no checks on
// specific errors should ever be performed. The interpreter makes
// sure that any errors generated are to be considered faulty code.
//
// The EVM should never be reused and is not thread safe.
type EVM struct {
	// Context provides auxiliary blockchain related information:
	// functions and data for accessing the current chain and mining environment.
	Context
	// StateDB gives access to the underlying state
	StateDB StateDB
	// Depth is the current call stack
	depth int

	// chainConfig contains information about the current chain
	chainConfig *params.ChainConfig
	// chain rules contains the chain rules for the current epoch
	chainRules params.Rules
	// virtual machine configuration options used to initialise the
	// evm.
	vmConfig Config
	// global (to this context) ethereum virtual machine
	// used throughout the execution of the tx.
	interpreters []Interpreter
	interpreter  Interpreter
	// abort is used to abort the EVM calling operations
	// NOTE: must be set atomically
	abort int32
	// callGasTemp holds the gas available for the current call. This is needed because the
	// available gas is calculated in gasCall* according to the 63/64 rule and later
	// applied in opCall*.
	// In other words, it is the gas the child contract may use once the parent's
	// own costs (memory expansion and so on) have been deducted.
	callGasTemp uint64
}
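The 63/64 rule mentioned in the callGasTemp comment can be illustrated with a small calculation. This is a sketch under assumptions: childGas is a hypothetical helper, not a go-ethereum function, and it ignores the base cost that the real gas functions deduct before applying the rule.

```go
package main

import "fmt"

// childGas returns the gas forwarded to a nested call under the 63/64 rule:
// the caller keeps at least 1/64 of its available gas, so at most
// available - available/64 can be forwarded, capped by the requested amount.
func childGas(available, requested uint64) uint64 {
	maxForward := available - available/64
	if requested > maxForward {
		return maxForward
	}
	return requested
}

func main() {
	fmt.Println(childGas(6400, 10000)) // 6300: the request exceeds the cap
	fmt.Println(childGas(6400, 100))   // 100: the request fits
}
```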

Constructor

core/vm/evm.go

// NewEVM returns a new EVM. The returned EVM is not thread safe and should
// only ever be used *once*.
func NewEVM(ctx Context, statedb StateDB, chainConfig *params.ChainConfig, vmConfig Config) *EVM {
	// Initialise the fields.
	evm := &EVM{
		Context:      ctx,
		StateDB:      statedb,
		vmConfig:     vmConfig,
		chainConfig:  chainConfig,
		chainRules:   chainConfig.Rules(ctx.BlockNumber),
		interpreters: make([]Interpreter, 0, 1), // length 0, capacity 1
	}

	if chainConfig.IsEWASM(ctx.BlockNumber) {
		// to be implemented by EVM-C and Wagon PRs.
		// The commented-out code would add a further interpreter to the list,
		// either EVMVCInterpreter or EWASMInterpreter.
		// if vmConfig.EWASMInterpreter != "" {
		//  extIntOpts := strings.Split(vmConfig.EWASMInterpreter, ":")
		//  path := extIntOpts[0]
		//  options := []string{}
		//  if len(extIntOpts) > 1 {
		//    options = extIntOpts[1..]
		//  }
		//  evm.interpreters = append(evm.interpreters, NewEVMVCInterpreter(evm, vmConfig, options))
		// } else {
		//   evm.interpreters = append(evm.interpreters, NewEWASMInterpreter(evm, vmConfig))
		// }
		panic("No supported ewasm interpreter yet.")
	}

	// vmConfig.EVMInterpreter will be used by EVM-C, it won't be checked here
	// as we always want to have the built-in EVM as the failover option.
	evm.interpreters = append(evm.interpreters, NewEVMInterpreter(evm, vmConfig))
	// So far only one interpreter has been added; the current source has a single
	// implementation, EVMInterpreter.
	evm.interpreter = evm.interpreters[0]

	return evm
}

This function creates a new virtual machine object: it initialises the EVM fields, then calls NewEVMInterpreter to create an interpreter object and appends it to the interpreter list. There is currently only one interpreter implementation, EVMInterpreter; the commented-out code outlines how the next-generation ewasm interpreter would be added. Note that the vmConfig argument normally arrives without vmConfig.JumpTable filled in; the table is populated in NewEVMInterpreter.

Context

core/vm/evm.go

// Context provides the EVM with auxiliary information. Once provided
// it shouldn't be modified.
type Context struct {
	// CanTransfer returns whether the account contains
	// sufficient ether to transfer the value
	CanTransfer CanTransferFunc
	// Transfer transfers ether from one account to the other
	Transfer TransferFunc
	// GetHash returns the hash corresponding to n (the n-th block in the chain)
	GetHash GetHashFunc

	// Message information
	Origin   common.Address // Provides information for ORIGIN (the sender's address)
	GasPrice *big.Int       // Provides information for GASPRICE

	// Block information
	Coinbase    common.Address // Provides information for COINBASE (the beneficiary, usually the miner's address)
	GasLimit    uint64         // Provides information for GASLIMIT (gas limit of the block; miners can vote to adjust it)
	BlockNumber *big.Int       // Provides information for NUMBER
	Time        *big.Int       // Provides information for TIME
	Difficulty  *big.Int       // Provides information for DIFFICULTY (difficulty of the current mining puzzle)
}

Constructor

The constructor determines the beneficiary (the author that packaged the block), then fills in each field.
core/evm.go

// NewEVMContext creates a new context for use in the EVM.
func NewEVMContext(msg Message, header *types.Header, chain ChainContext, author *common.Address) vm.Context {
	// If we don't have an explicit author (i.e. not mining), extract from the header
	var beneficiary common.Address
	if author == nil {
		beneficiary, _ = chain.Engine().Author(header) // Ignore error, we're past header validation
	} else {
		beneficiary = *author
	}
	return vm.Context{
		CanTransfer: CanTransfer,
		Transfer:    Transfer,
		GetHash:     GetHashFn(header, chain),
		Origin:      msg.From(),
		Coinbase:    beneficiary,
		BlockNumber: new(big.Int).Set(header.Number),
		Time:        new(big.Int).SetUint64(header.Time),
		Difficulty:  new(big.Int).Set(header.Difficulty),
		GasLimit:    header.GasLimit,
		GasPrice:    new(big.Int).Set(msg.GasPrice()),
	}
}

StateDB

core/vm/interface.go

// StateDB is an EVM database for full state querying.
type StateDB interface {
	CreateAccount(common.Address)

	SubBalance(common.Address, *big.Int)
	AddBalance(common.Address, *big.Int)
	GetBalance(common.Address) *big.Int

	GetNonce(common.Address) uint64
	SetNonce(common.Address, uint64)

	GetCodeHash(common.Address) common.Hash
	GetCode(common.Address) []byte
	SetCode(common.Address, []byte)
	GetCodeSize(common.Address) int

	AddRefund(uint64)
	SubRefund(uint64)
	GetRefund() uint64

	GetCommittedState(common.Address, common.Hash) common.Hash
	GetState(common.Address, common.Hash) common.Hash
	SetState(common.Address, common.Hash, common.Hash)

	Suicide(common.Address) bool
	HasSuicided(common.Address) bool

	// Exist reports whether the given account exists in state.
	// Notably this should also return true for suicided accounts.
	Exist(common.Address) bool
	// Empty returns whether the given account is empty. Empty
	// is defined according to EIP161 (balance = nonce = code = 0).
	Empty(common.Address) bool

	RevertToSnapshot(int)
	Snapshot() int

	AddLog(*types.Log)
	AddPreimage(common.Hash, []byte)

	ForEachStorage(common.Address, func(common.Hash, common.Hash) bool) error
}

core/state/statedb.go

// StateDBs within the ethereum protocol are used to store anything
// within the merkle trie. StateDBs take care of caching and storing
// nested states. It's the general query interface to retrieve:
// * Contracts
// * Accounts
type StateDB struct {
	db   Database // the backing database
	trie Trie     // main account trie

	// This map holds 'live' objects, which will get modified while processing a state transition.
	stateObjects        map[common.Address]*stateObject
	stateObjectsPending map[common.Address]struct{} // State objects finalized but not yet written to the trie
	stateObjectsDirty   map[common.Address]struct{} // State objects modified in the current execution

	// DB error.
	// State objects are used by the consensus core and VM which are
	// unable to deal with database-level errors. Any error that occurs
	// during a database read is memoized here and will eventually be returned
	// by StateDB.Commit.
	dbErr error

	// The refund counter, also used by state transitioning.
	refund uint64

	thash, bhash common.Hash                  // current transaction hash and block hash
	txIndex      int                          // index of the current transaction
	logs         map[common.Hash][]*types.Log // logs, keyed by transaction hash
	logSize      uint                         // number of logs

	preimages map[common.Hash][]byte // SHA3 preimages: maps each hash computed by the EVM to its original bytes

	// Journal of state modifications. This is the backbone of
	// Snapshot and RevertToSnapshot.
	journal        *journal
	validRevisions []revision
	nextRevisionId int

	// Measurements gathered during execution for debugging purposes
	AccountReads   time.Duration
	AccountHashes  time.Duration
	AccountUpdates time.Duration
	AccountCommits time.Duration
	StorageReads   time.Duration
	StorageHashes  time.Duration
	StorageUpdates time.Duration
	StorageCommits time.Duration
}
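The journal is described above as the backbone of Snapshot and RevertToSnapshot. The following sketch shows the general technique of journaling undo actions. It is an illustration only: the ledger type, its string addresses, and the use of closures are assumptions made for this example. The real StateDB records typed journal entries and maps revision ids to journal positions through validRevisions, whereas this sketch uses the journal length directly as the snapshot id.

```go
package main

import "fmt"

// ledger is a hypothetical, much-reduced stand-in for StateDB.
type ledger struct {
	balances map[string]int64
	journal  []func() // undo actions, oldest first
}

func newLedger() *ledger {
	return &ledger{balances: make(map[string]int64)}
}

// AddBalance changes a balance and records how to undo the change.
func (l *ledger) AddBalance(addr string, amount int64) {
	prev, existed := l.balances[addr]
	l.journal = append(l.journal, func() {
		if existed {
			l.balances[addr] = prev
		} else {
			delete(l.balances, addr)
		}
	})
	l.balances[addr] = prev + amount
}

// Snapshot returns an identifier for the current state.
func (l *ledger) Snapshot() int {
	return len(l.journal)
}

// RevertToSnapshot undoes, newest first, every change made after the snapshot.
func (l *ledger) RevertToSnapshot(id int) {
	for i := len(l.journal) - 1; i >= id; i-- {
		l.journal[i]()
	}
	l.journal = l.journal[:id]
}

func main() {
	l := newLedger()
	l.AddBalance("alice", 10)
	snap := l.Snapshot()
	l.AddBalance("alice", 5)
	l.AddBalance("bob", 1)
	l.RevertToSnapshot(snap)
	fmt.Println(l.balances["alice"], len(l.balances))
}
```

Recording undo actions rather than copying the whole state keeps a snapshot cheap, which matters because nested calls take one on every call frame.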

Constructor

core/state/statedb.go

// Constructor of StateDB.
// Typical usage: statedb, _ := state.New(common.Hash{}, state.NewDatabase(db))

// Create a new state from a given trie.
func New(root common.Hash, db Database) (*StateDB, error) {
	tr, err := db.OpenTrie(root)
	if err != nil {
		return nil, err
	}
	return &StateDB{
		db:                  db,
		trie:                tr,
		stateObjects:        make(map[common.Address]*stateObject),
		stateObjectsPending: make(map[common.Address]struct{}),
		stateObjectsDirty:   make(map[common.Address]struct{}),
		logs:                make(map[common.Hash][]*types.Log),
		preimages:           make(map[common.Hash][]byte),
		journal:             newJournal(),
	}, nil
}

Config

core/vm/interpreter.go

// Config are the configuration options for the Interpreter
type Config struct {
	Debug                   bool   // Enables debugging
	Tracer                  Tracer // Opcode logger
	NoRecursion             bool   // Disables call, callcode, delegate call and create
	EnablePreimageRecording bool   // Enables recording of SHA3/keccak preimages

	// Each time the interpreter fetches an instruction to execute, it looks up
	// the instruction's operation object in JumpTable.
	JumpTable [256]operation // EVM instruction table, automatically populated if unset

	EWASMInterpreter string // External EWASM interpreter options
	EVMInterpreter   string // External EVM interpreter options

	ExtraEips []int // Additional EIPS that are to be enabled
}

core/vm/jump_table.go

// operation holds the functions one instruction needs; each operation corresponds to one instruction.
// It stores the instruction's execution logic, gas cost, stack validation limits, memory size function, and so on.
type operation struct {
	// execute is the operation function
	execute     executionFunc
	constantGas uint64  // fixed gas cost
	dynamicGas  gasFunc // function that computes the dynamic gas cost
	// minStack tells how many stack items are required
	minStack int
	// maxStack specifies the max length the stack can have for this operation
	// to not overflow the stack.
	// That is, as long as the stack is no longer than this, the operation cannot overflow it.
	maxStack int

	// memorySize returns the memory size required for the operation
	memorySize memorySizeFunc

	halts   bool // indicates whether the operation should halt further execution
	jumps   bool // indicates whether the program counter should not increment (a jump sets pc to its target instead)
	writes  bool // determines whether this a state modifying operation (writes to StateDB)
	valid   bool // indication whether the retrieved operation is valid and known
	reverts bool // determines whether the operation reverts state (implicitly halts)
	returns bool // determines whether the operations sets the return data content
}

Interpreter

core/vm/interpreter.go

// Interpreter is used to run Ethereum based contracts and will utilise the
// passed environment to query external sources for state information.
// The Interpreter will run the byte code VM based on the passed
// configuration.
type Interpreter interface {
	// Run loops and evaluates the contract's code with the given input data and returns
	// the return byte-slice and an error if one occurred.
	Run(contract *Contract, input []byte, static bool) ([]byte, error)
	// CanRun tells if the contract, passed as an argument, can be
	// run by the current interpreter. This is meant so that the
	// caller can do something like:
	//
	// ```golang
	// for _, interpreter := range interpreters {
	//   if interpreter.CanRun(contract.code) {
	//     interpreter.Run(contract.code, input)
	//   }
	// }
	// ```
	CanRun([]byte) bool
}

// EVMInterpreter represents an EVM interpreter
// It implements the Interpreter interface.
type EVMInterpreter struct {
	evm *EVM
	cfg Config

	intPool *intPool

	hasher    keccakState // Keccak256 hasher instance shared across opcodes
	hasherBuf common.Hash // Keccak256 hasher result array shared across opcodes

	readOnly   bool   // Whether to throw on stateful modifications
	returnData []byte // Last CALL's return data for subsequent reuse
}

Constructor

If cfg.JumpTable has not been set, the constructor first initialises it with the instruction set for the active version, then builds and returns the EVMInterpreter object.
core/vm/interpreter.go

// Constructor.
// NewEVMInterpreter returns a new instance of the Interpreter.
func NewEVMInterpreter(evm *EVM, cfg Config) *EVMInterpreter {
	// We use the STOP instruction whether to see
	// the jump table was initialised. If it was not
	// we'll set the default jump table.
	if !cfg.JumpTable[STOP].valid {
		var jt JumpTable
		switch {
		case evm.chainRules.IsIstanbul:
			jt = istanbulInstructionSet
		case evm.chainRules.IsConstantinople:
			jt = constantinopleInstructionSet
		case evm.chainRules.IsByzantium:
			jt = byzantiumInstructionSet
		case evm.chainRules.IsEIP158:
			jt = spuriousDragonInstructionSet
		case evm.chainRules.IsEIP150:
			jt = tangerineWhistleInstructionSet
		case evm.chainRules.IsHomestead:
			jt = homesteadInstructionSet
		default:
			jt = frontierInstructionSet
		}
		for i, eip := range cfg.ExtraEips {
			if err := EnableEIP(eip, &jt); err != nil {
				// Disable it, so caller can check if it's activated or not
				cfg.ExtraEips = append(cfg.ExtraEips[:i], cfg.ExtraEips[i+1:]...)
				log.Error("EIP activation failed", "eip", eip, "error", err)
			}
		}
		cfg.JumpTable = jt
	}

	return &EVMInterpreter{
		evm: evm,
		cfg: cfg,
	}
}

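The Input Data Structure

The Contract Application Binary Interface (ABI) is the standard way to interact with contracts in the Ethereum ecosystem, both from outside the blockchain and for contract-to-contract interaction. Data is encoded according to its type. The encoding is not self-describing, so a schema is required in order to decode it.
We assume that the interface functions of a contract are strongly typed, known at compile time, and static. We also assume that all contracts have the interface definitions of any contracts they call available at compile time. [7]

Function Selector and Argument Encoding

The first four bytes of the Input of a function call specify the function to be called. They are the first (left, high-order in big-endian) four bytes of the Keccak-256 (SHA-3) hash of the function signature. The function signature is the function name followed by the parenthesised list of parameter types, separated by single commas with no spaces. The return type is not part of the signature. The encoded arguments follow from the fifth byte onward. Given the contract: [7]

pragma solidity >=0.4.16 <0.7.0;

contract Foo {
    function bar(bytes3[2] memory) public pure {}
    function baz(uint32 x, bool y) public pure returns (bool r) { r = x > 32 || y; }
    function sam(bytes memory, bool, uint[] memory) public pure {}
}

If we wanted to call baz with the arguments 69 and true, we would pass 68 bytes of Input, which break down as follows:

0xcdcd77c0: the function selector, also called the method ID. The signature of baz is baz(uint32,bool); hashing it and taking the first 4 bytes gives the selector: KeccakHash("baz(uint32,bool)")[0:4] => 0xcdcd77c0
0x0000000000000000000000000000000000000000000000000000000000000045: the first argument, a uint32; the value 69 is padded to 32 bytes.
0x0000000000000000000000000000000000000000000000000000000000000001: the second argument, the boolean true, padded to 32 bytes.

So the Input sent when calling the Foo contract is: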

0xcdcd77c000000000000000000000000000000000000000000000000000000000000000450000000000000000000000000000000000000000000000000000000000000001
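The 68-byte layout above can be reproduced with the Go standard library. This is a minimal sketch under assumptions: encodeBazCall is a hypothetical helper written for this example, and the selector 0xcdcd77c0 is copied from the text instead of being computed, because Keccak-256 is not part of the standard library.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// encodeBazCall builds the call data for baz(uint32,bool):
// a 4-byte selector followed by two 32-byte argument words.
func encodeBazCall(x uint32, y bool) []byte {
	data := make([]byte, 4+32+32)
	copy(data[0:4], []byte{0xcd, 0xcd, 0x77, 0xc0}) // selector taken from the text

	// A uint32 is stored big-endian in the low-order (rightmost) 4 bytes of its word.
	binary.BigEndian.PutUint32(data[4+28:4+32], x)

	// A bool is encoded as 0 or 1 in the last byte of its word.
	if y {
		data[4+32+31] = 1
	}
	return data
}

func main() {
	input := encodeBazCall(69, true)
	fmt.Println(len(input))
	fmt.Println("0x" + hex.EncodeToString(input))
}
```

Running it prints the length 68 and a hex string that should match the Input shown above.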

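The function selector identifies which method of the contract we want to call; together with the argument data it is encoded into the transaction's data field. The Input is handed to the interpreter along with the contract code, and its contents come from the transaction's data. Parsing the selector and the arguments is not done by the EVM itself but by code that the contract compiler inserts at compile time.

When a smart contract is compiled, the compiler automatically prepends function-selection logic to the generated bytecode: it first uses CALLDATALOAD to push the 4-byte signature onto the stack, then compares it in turn with each function in the contract, and on a match uses JUMPI to jump to that function's code and continue executing.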
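The selection logic can be mimicked in ordinary Go to show its shape. This is only an analogy under assumptions: dispatch and errNoMatch are hypothetical names, the real logic is emitted as EVM bytecode by the compiler, and only the selector for baz is taken from the text (the selectors for bar and sam are omitted here).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errNoMatch = errors.New("no matching function selector")

// dispatch reads the first 4 bytes of the input and compares them
// against the known selectors, one after another.
func dispatch(input []byte) (string, error) {
	if len(input) < 4 {
		return "", errNoMatch
	}
	selector := binary.BigEndian.Uint32(input[:4])
	switch selector {
	case 0xcdcd77c0: // baz(uint32,bool), selector from the text
		return "baz", nil
	default:
		// In compiled bytecode this is where the fallback handling would run.
		return "", errNoMatch
	}
}

func main() {
	name, err := dispatch([]byte{0xcd, 0xcd, 0x77, 0xc0, 0x00})
	fmt.Println(name, err)
}
```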

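Data-Loading Instructions

CALLDATALOAD: loads input data onto the Stack
CALLDATACOPY: copies input data into Memory
CODECOPY: copies the current contract's code into Memory
EXTCODECOPY: copies an external contract's code into Memory

The operations these instructions perform are shown in the figure below: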
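As a concrete illustration of the first instruction, the sketch below models what CALLDATALOAD does: it reads one 32-byte word from the input at a given offset and pads with zeros on the right when the input is too short. The function name calldataLoad is an assumption made for this example, and the word is returned as a value here whereas the EVM pushes it onto the stack.

```go
package main

import "fmt"

// calldataLoad returns the 32-byte word starting at offset in input.
// Bytes past the end of the input read as zero.
func calldataLoad(input []byte, offset uint64) [32]byte {
	var word [32]byte
	if offset < uint64(len(input)) {
		// copy stops at the shorter of the two slices, leaving the rest zero.
		copy(word[:], input[offset:])
	}
	return word
}

func main() {
	input := []byte{0xcd, 0xcd, 0x77, 0xc0, 0x05, 0x06}
	w := calldataLoad(input, 4)
	fmt.Printf("%x\n", w)
}
```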

Appendix A

The Stack Structure and Its Operations

core/vm/stack.go

// Stack is an object for basic stack operations. Items popped to the stack are
// expected to be changed and modified. stack does not take care of adding newly
// initialised objects.
type Stack struct {
	data []*big.Int // a slice of pointers: the stack stores pointers, not values
}

func newstack() *Stack {
	return &Stack{data: make([]*big.Int, 0, 1024)} // initial length 0, capacity 1024
}

//----------- Stack methods ----------------

// Data returns the underlying big.Int array.
func (st *Stack) Data() []*big.Int {
	return st.data
}

func (st *Stack) push(d *big.Int) {
	// NOTE push limit (1024) is checked in baseCheck
	//stackItem := new(big.Int).Set(d)
	//st.data = append(st.data, stackItem)
	st.data = append(st.data, d) // the end of the slice is the top of the stack
}

// pushN pushes several items at once.
func (st *Stack) pushN(ds ...*big.Int) {
	st.data = append(st.data, ds...)
}

// pop removes and returns the top item.
func (st *Stack) pop() (ret *big.Int) {
	ret = st.data[len(st.data)-1]      // the popped item
	st.data = st.data[:len(st.data)-1] // stack depth decreases by one
	return
}

// len returns the stack length (depth).
func (st *Stack) len() int {
	return len(st.data)
}

// swap exchanges the n-th item from the top with the top item.
func (st *Stack) swap(n int) {
	st.data[st.len()-n], st.data[st.len()-1] = st.data[st.len()-1], st.data[st.len()-n]
}

// dup copies the n-th item from the top and pushes the copy.
func (st *Stack) dup(pool *intPool, n int) {
	st.push(pool.get().Set(st.data[st.len()-n]))
}

// peek returns the top item without popping it.
func (st *Stack) peek() *big.Int {
	return st.data[st.len()-1]
}

// Back returns the n'th item in stack
func (st *Stack) Back(n int) *big.Int {
	return st.data[st.len()-n-1]
}
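The index conventions of swap, dup and Back are easy to mix up: in the listing above, swap and dup count from 1 at the top of the stack, while Back counts from 0. The sketch below replays the same index arithmetic on a stack of uint64 values. The stack type and its uint64 element type are assumptions made to keep the example short; the real Stack holds *big.Int and dup draws its copy from an intPool.

```go
package main

import "fmt"

// stack mirrors the index arithmetic of core/vm's Stack on plain uint64 values.
type stack struct {
	data []uint64
}

func (s *stack) push(v uint64) { s.data = append(s.data, v) }

// swap exchanges the n-th item from the top (1 = top) with the top item.
func (s *stack) swap(n int) {
	top := len(s.data) - 1
	s.data[len(s.data)-n], s.data[top] = s.data[top], s.data[len(s.data)-n]
}

// dup pushes a copy of the n-th item from the top (1 = top).
func (s *stack) dup(n int) {
	s.push(s.data[len(s.data)-n])
}

// back returns the n-th item from the top, counting from 0.
func (s *stack) back(n int) uint64 {
	return s.data[len(s.data)-n-1]
}

func main() {
	s := &stack{}
	s.push(1)
	s.push(2)
	s.push(3)
	s.swap(2)           // exchanges the top two items
	s.dup(3)            // copies the third item from the top
	fmt.Println(s.data) // [1 3 2 1]
	fmt.Println(s.back(0), s.back(1))
}
```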

core/vm/stack_table.go

// Stack helper functions.

// maxStack specifies the max length the stack can have for this operation
// to not overflow the stack.
// That is, as long as the stack is no longer than this, the operation cannot overflow it.
// Parameters: pop is the number of pops the operation performs; push is the number of pushes.
func maxStack(pop, push int) int {
	return int(params.StackLimit) + pop - push
}

// minStack tells how many stack items are required
// Parameters: pops is the number of pops the operation performs; push is the number of pushes.
// The number of items required is simply the number of pops.
func minStack(pops, push int) int {
	return pops
}
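A short sketch shows how these two bounds can be used to validate the stack before an instruction runs. The checkStack function and the two error values are assumptions made for this example; stackLimit mirrors params.StackLimit, which is 1024.

```go
package main

import (
	"errors"
	"fmt"
)

const stackLimit = 1024 // mirrors params.StackLimit

var (
	errUnderflow = errors.New("stack underflow")
	errOverflow  = errors.New("stack overflow")
)

func minStack(pops, push int) int { return pops }
func maxStack(pop, push int) int  { return stackLimit + pop - push }

// checkStack reports whether an operation that pops and pushes the given
// number of items may run on a stack of length stackLen.
func checkStack(stackLen, pops, pushes int) error {
	if stackLen < minStack(pops, pushes) {
		return errUnderflow
	}
	if stackLen > maxStack(pops, pushes) {
		return errOverflow
	}
	return nil
}

func main() {
	fmt.Println(checkStack(1, 2, 1))    // an ADD-like operation needs two items
	fmt.Println(checkStack(1024, 0, 1)) // a PUSH-like operation on a full stack
	fmt.Println(checkStack(1024, 2, 1)) // an ADD-like operation on a full stack is fine
}
```

Precomputing both bounds per instruction means the interpreter needs only two integer comparisons against the current stack length before executing it.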

The Memory Structure and Its Operations

core/vm/memory.go

// Memory implements a simple memory model for the ethereum virtual machine.
type Memory struct {
	store       []byte // byte slice
	lastGasCost uint64 // gas already paid for the allocated memory; used to price further expansion
}

// NewMemory returns a new memory model.
func NewMemory() *Memory {
	return &Memory{}
}

// Set sets offset + size to value
// It writes value into the region [offset, offset+size).
func (m *Memory) Set(offset, size uint64, value []byte) {
	// It's possible the offset is greater than 0 and size equals 0. This is because
	// the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
	if size > 0 {
		// length of store may never be less than offset + size.
		// The store should be resized PRIOR to setting the memory
		// If offset+size exceeds len(store), the memory was never resized for this write.
		if offset+size > uint64(len(m.store)) {
			panic("invalid memory: store empty")
		}
		copy(m.store[offset:offset+size], value)
	}
}

// Set32 sets the 32 bytes starting at offset to the value of val, left-padded with zeroes to
// 32 bytes.
func (m *Memory) Set32(offset uint64, val *big.Int) {
	// length of store may never be less than offset + size.
	// The store should be resized PRIOR to setting the memory
	if offset+32 > uint64(len(m.store)) {
		panic("invalid memory: store empty")
	}
	// Zero the memory area
	copy(m.store[offset:offset+32], []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0})
	// Fill in relevant bits
	math.ReadBits(val, m.store[offset:offset+32])
}

// Resize resizes the memory to size
func (m *Memory) Resize(size uint64) {
	if uint64(m.Len()) < size {
		m.store = append(m.store, make([]byte, size-uint64(m.Len()))...)
	}
}

// GetCopy returns offset + size as a new slice
// It copies the contents of memory[offset:offset+size] and returns the copy.
func (m *Memory) GetCopy(offset, size int64) (cpy []byte) {
	if size == 0 {
		return nil
	}

	if len(m.store) > int(offset) {
		cpy = make([]byte, size)
		copy(cpy, m.store[offset:offset+size])

		return
	}

	return
}

// GetPtr returns the offset + size
// The returned slice aliases the memory region instead of copying it.
func (m *Memory) GetPtr(offset, size int64) []byte {
	if size == 0 {
		return nil
	}

	if len(m.store) > int(offset) {
		return m.store[offset : offset+size]
	}

	return nil
}

// Len returns the length of the backing slice
func (m *Memory) Len() int {
	return len(m.store)
}

// Data returns the backing slice
func (m *Memory) Data() []byte {
	return m.store
}
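The "resize first, then write" contract of Memory can be seen in a small stand-alone sketch. It is an illustration under assumptions: the lower-case memory type and the storeUint64 wrapper are names invented here, and the standard library's big.Int.FillBytes (available since Go 1.15) stands in for the zeroing plus math.ReadBits pair used in the listing above.

```go
package main

import (
	"fmt"
	"math/big"
)

// memory is a reduced stand-in for core/vm's Memory.
type memory struct {
	store []byte
}

// resize grows the store to size bytes; it never shrinks it.
func (m *memory) resize(size uint64) {
	if uint64(len(m.store)) < size {
		m.store = append(m.store, make([]byte, size-uint64(len(m.store)))...)
	}
}

// set32 writes val as a 32-byte big-endian word at offset, left-padded with zeros.
// The caller must have resized the store beforehand.
func (m *memory) set32(offset uint64, val *big.Int) {
	if offset+32 > uint64(len(m.store)) {
		panic("invalid memory: store empty")
	}
	// FillBytes zero-extends on the left and panics if val needs more than 32 bytes.
	val.FillBytes(m.store[offset : offset+32])
}

// storeUint64 is a convenience wrapper for this example.
func (m *memory) storeUint64(offset, v uint64) {
	m.set32(offset, new(big.Int).SetUint64(v))
}

func main() {
	m := &memory{}
	m.resize(64)
	m.storeUint64(32, 0x45)
	fmt.Println(len(m.store), m.store[63]) // 64 69
}
```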

The intPool Structure and Its Operations

intPool is a pool of big.Int values capped at roughly 256 entries (poolLimit). It speeds up big.Int allocation by avoiding the cost of repeatedly creating and destroying big.Int objects.
core/vm/intpool.go

var checkVal = big.NewInt(-42) // sentinel written into pooled values when pool verification is enabled; the choice of -42 appears arbitrary

const poolLimit = 256

// intPool is a pool of big integers that
// can be reused for all big.Int operations.
type intPool struct {
	pool *Stack // the pool is kept as a stack
}

// newIntPool is the constructor of intPool.
func newIntPool() *intPool {
	return &intPool{pool: newstack()}
}

// get retrieves a big int from the pool, allocating one if the pool is empty.
// Note, the returned int's value is arbitrary and will not be zeroed!
func (p *intPool) get() *big.Int {
	if p.pool.len() > 0 { // not empty: take one from the pool
		return p.pool.pop()
	}
	return new(big.Int) // empty: allocate a new big.Int, which is slower
}

// getZero retrieves a big int from the pool, setting it to zero or allocating
// a new one if the pool is empty.
func (p *intPool) getZero() *big.Int {
	if p.pool.len() > 0 {
		return p.pool.pop().SetUint64(0)
	}
	return new(big.Int)
}

// put returns an allocated big int to the pool to be later reused by get calls.
// Note, the values as saved as is; neither put nor get zeroes the ints out!
func (p *intPool) put(is ...*big.Int) {
	if len(p.pool.data) > poolLimit { // the pool is full
		return
	}
	for _, i := range is {
		// verifyPool is a build flag. Pool verification makes sure the integrity
		// of the integer pool by comparing values to a default value.
		if verifyPool {
			i.Set(checkVal)
		}
		p.pool.push(i)
	}
}

// The intPool pool's default capacity
const poolDefaultCap = 25

// intPoolPool manages a pool of intPools.
type intPoolPool struct {
	pools []*intPool // slice of intPools
	lock  sync.Mutex // mutex
}

// Initialisation.
var poolOfIntPools = &intPoolPool{
	pools: make([]*intPool, 0, poolDefaultCap),
}

// get is looking for an available pool to return.
func (ipp *intPoolPool) get() *intPool {
	ipp.lock.Lock()         // lock
	defer ipp.lock.Unlock() // unlock on return

	if len(poolOfIntPools.pools) > 0 {
		ip := ipp.pools[len(ipp.pools)-1]        // take a pool
		ipp.pools = ipp.pools[:len(ipp.pools)-1] // remove it from the pool of pools
		return ip
	}
	return newIntPool() // none available: create a new intPool
}

// put a pool that has been allocated with get.
func (ipp *intPoolPool) put(ip *intPool) {
	ipp.lock.Lock()
	defer ipp.lock.Unlock()

	if len(ipp.pools) < cap(ipp.pools) {
		ipp.pools = append(ipp.pools, ip)
	}
}
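The warning that get does not zero its result is worth seeing in action. The sketch below is a reduced model under assumptions: it uses a plain slice in place of the Stack-backed pool and omits the verifyPool branch. It demonstrates that a value returned with put comes back from get unchanged.

```go
package main

import (
	"fmt"
	"math/big"
)

const poolLimit = 256

// intPool is a reduced model of core/vm's intPool backed by a plain slice.
type intPool struct {
	pool []*big.Int
}

// get reuses a pooled big.Int if one is available, otherwise allocates.
// The returned value is whatever the previous user left in it.
func (p *intPool) get() *big.Int {
	if n := len(p.pool); n > 0 {
		v := p.pool[n-1]
		p.pool = p.pool[:n-1]
		return v
	}
	return new(big.Int)
}

// put hands values back for later reuse without zeroing them.
func (p *intPool) put(vs ...*big.Int) {
	if len(p.pool) > poolLimit {
		return
	}
	p.pool = append(p.pool, vs...)
}

func main() {
	p := &intPool{}
	a := p.get().SetInt64(7)
	p.put(a)
	b := p.get()
	fmt.Println(a == b, b.Int64()) // true 7
}
```

A caller that needs a clean value therefore has to overwrite the result, for example with Set, or use a zeroing variant such as getZero in the listing above.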

References

1. Ethereum Yellow Paper. ETHEREUM: A SECURE DECENTRALISED GENERALISED TRANSACTION LEDGER. https://ethereum.github.io/yellowpaper/paper.pdf
2. Ethereum White Paper. A Next-Generation Smart Contract and Decentralized Application Platform. https://github.com/ethereum/wiki/wiki/White-Paper
3. Ethereum EVM Illustrated. https://github.com/takenobu-hs/ethereum-evm-illustrated
4. Go Ethereum Code Analysis. https://github.com/ZtesoftCS/go-ethereum-code-analysis
5. 以太坊源码解析：evm (Ethereum Source Code Analysis: evm). https://yangzhe.me/2019/08/12/ethereum-evm/
6. 以太坊 - 深入浅出虚拟机 (Ethereum: The Virtual Machine Made Easy). https://learnblockchain.cn/2019/04/09/easy-evm/
7. Contract ABI Specification. https://solidity.readthedocs.io/en/v0.5.10/abi-spec.html?highlight=selector#function-selector
8. 认识以太坊智能合约 (Getting to Know Ethereum Smart Contracts). https://yangzhe.me/2019/08/01/ethereum-cognition-and-deployment/#%E8%B0%83%E7%94%A8%E5%90%88%E7%BA%A6
