Wasm Introduction (Part 2): Instruction Set & Operand Stack
Written by the CoinEx Chain lab, this article is the second in the Wasm Introduction series and covers the instruction set and the operand stack. CoinEx Chain is the world’s first public chain exclusively designed for DEX, and will also include a Smart Chain supporting smart contracts and a Privacy Chain protecting users’ privacy.
The previous article introduced the WebAssembly (hereinafter referred to as Wasm) binary format. Here we introduce the Wasm instruction set, the operand stack, and some of the instructions.
Wasm Instruction Set
Like real machine code, the code in a Wasm binary file is composed of instructions. Each Wasm instruction likewise contains two pieces of information: the opcode and the operands. A Wasm opcode is fixed at one byte, so it can represent at most 256 instructions, the same as Java bytecode. The Wasm 1.0 specification defines a total of 172 instructions, which can be divided into 5 categories by function:
- Control Instructions, a total of 13.
- Parametric Instructions, a total of 2.
- Variable Instructions, a total of 5.
- Memory Instructions, a total of 25.
- Numeric Instructions, a total of 127.
Nearly three quarters of the defined instructions (127 of 172) are numeric instructions. To make code easier for humans to read and write, the Wasm specification also defines a mnemonic for each opcode. For example, the mnemonic of the opcode 0x41
is i32.const
. The following is a schematic diagram of the opcode distribution of the defined instructions (the grey area represents undefined opcodes):
Some instructions need to carry extra information, which is encoded immediately after the opcode and is called static immediate arguments (hereinafter referred to as immediate arguments). Take the i32.const
instruction as an example. The opcode 0x41
is followed by an encoded 32-bit integer. In the following examples, we will use a schematic diagram similar to the following to represent the encoded instructions:
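As a rough illustration of this layout, the following Python sketch decodes a single i32.const instruction from raw bytes. It relies on the fact that Wasm encodes integer immediates in LEB128 format (signed LEB128 in the case of i32.const). The function names `decode_sleb128` and `decode_i32_const` are hypothetical helpers made up for this example, and the sketch skips the range and length checks a real decoder would need.

```python
def decode_sleb128(data, pos):
    """Decode a signed LEB128 integer starting at data[pos].

    Returns (value, next_pos). Hypothetical helper, no bounds checking.
    """
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        # The low 7 bits of each byte carry payload, least significant first.
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            if byte & 0x40:   # sign bit of the last byte set: sign-extend
                result -= 1 << shift
            return result, pos


def decode_i32_const(code, pos=0):
    """Decode one i32.const instruction: opcode 0x41 plus its immediate."""
    opcode = code[pos]
    if opcode != 0x41:
        raise ValueError(f"not i32.const: opcode 0x{opcode:02X}")
    value, next_pos = decode_sleb128(code, pos + 1)
    return "i32.const", value, next_pos


if __name__ == "__main__":
    print(decode_i32_const(bytes([0x41, 0x2A])))        # immediate 42
    print(decode_i32_const(bytes([0x41, 0x80, 0x01])))  # immediate 128
```

Because the immediate has a variable length, the decoder returns the position of the next instruction along with the decoded value.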
Like stack machines such as the JVM, most Wasm instructions use an operand stack (hereinafter referred to as the stack). These instructions pop one or more numbers from the top of the stack, perform calculations, and then push the results onto the stack. The top stack elements operated on by an instruction are called dynamic operands (hereinafter referred to as operands). In the following examples, we will use a schematic diagram similar to the following to indicate the state of the stack before and after the instruction is executed (small arrows indicate pop or push operations):
Parametric and numeric instructions operate only on the stack and their behavior is relatively simple, so they are the ones introduced in this article. The other instructions will be introduced in subsequent articles.
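To make the pop, compute, push cycle concrete, here is a minimal Python sketch of an operand stack. The `execute` function and the tuple-based instruction format are our own assumptions for illustration and do not come from any Wasm runtime. i32.add (opcode 0x6A) serves only as a sample numeric instruction, and i32 values are modeled as unsigned 32-bit patterns.

```python
MASK32 = 0xFFFFFFFF  # i32 values are stored as 32-bit patterns


def execute(instructions):
    """Run a list of (mnemonic, *immediates) tuples; return the final stack."""
    stack = []
    for mnemonic, *immediates in instructions:
        if mnemonic == "i32.const":
            # The immediate argument is pushed onto the stack.
            stack.append(immediates[0] & MASK32)
        elif mnemonic == "i32.add":
            # Two dynamic operands are popped, the result is pushed.
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & MASK32)
        else:
            raise NotImplementedError(mnemonic)
    return stack


if __name__ == "__main__":
    program = [("i32.const", 1), ("i32.const", 2), ("i32.add",)]
    print(execute(program))  # the stack holds the single result 3
```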
Parametric Instructions
There are two parametric instructions: drop
(whose opcode is 0x1A
) and select
(whose opcode is 0x1B
).
drop
The drop
instruction pops an operand of any type from the top of the stack and discards it. The drop
instruction has no immediate arguments. Its schematic diagram is shown below:
select
The select
instruction successively pops 3 operands from the top of the stack. If the first popped operand is zero, the second popped operand is pushed back onto the stack; otherwise, the third popped operand is pushed back. The select
instruction does not have immediate arguments either. Here is a diagram of it:
Note that the operand at the top of the stack must be of type i32
, and the remaining two operands must have the same type. When we need to emphasize the specific type of an operand, we will mark the type in parentheses in the diagram. drop
and select
are special: they are the only two instructions that do not fully constrain the types of their operands. For all other instructions, every operand type is fixed.
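The behavior of the two parametric instructions can be sketched in a few lines of Python, modeling the stack as a plain list whose last element is the top. The function names `drop` and `select` simply mirror the mnemonics and are illustrative only. No type checking is done here, whereas a real Wasm implementation validates operand types before execution.

```python
def drop(stack):
    """drop (0x1A): pop one operand of any type and discard it."""
    stack.pop()


def select(stack):
    """select (0x1B): pop three operands and push one of the lower two back."""
    condition = stack.pop()  # first popped: the i32 on top of the stack
    second = stack.pop()     # second popped
    third = stack.pop()      # third popped
    stack.append(second if condition == 0 else third)


if __name__ == "__main__":
    s = [10, 20, 1]   # bottom ... top; the condition is non-zero
    select(s)
    print(s)          # [10]: the third popped operand is kept

    s = [10, 20, 0]   # the condition is zero
    select(s)
    print(s)          # [20]: the second popped operand is kept
```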
Numeric Instructions
Numeric instructions can be divided into four groups of i32
, i64
, f32
, and f64
according to the operand type. The instructions in each group can be further divided into five subgroups according to the operation:
- Constant Instructions
- Test Instructions
- Comparison Instructions
- Arithmetic Instructions
  - Unary Arithmetic Instructions
  - Binary Arithmetic Instructions
- Conversion Instructions
Apart from the constant instructions, numeric instructions have no immediate arguments.
Constant Instructions
The constant instruction pushes the immediate argument to the top of the stack. Take the i32.const
instruction (opcode 0x41
) as an example. The following is its diagram:
There are four constant instructions, and the other three are: i64.const
(opcode 0x42
), f32.const
(opcode 0x43
), and f64.const
(opcode 0x44
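). A naming rule for the Wasm opcode mnemonics is easy to spot: for constant, arithmetic, and conversion instructions, if the type of the top stack element after the instruction executes is t
, then the mnemonic starts with t.
. The test and comparison instructions described below are the exception: their mnemonics start with the operand type, while the result is always of type i32.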
Test Instructions
The test instruction pops an operand from the top of the stack and tests whether it is 0. If it is, it pushes 1 (as i32
) onto the stack; otherwise it pushes 0 (as i32
) onto the stack. There are only two test instructions: i32.eqz
(opcode 0x45
) and i64.eqz
(opcode 0x50
). Take the i64.eqz
instruction as an example. Here is a diagram of it:
As you can see, the result of the test instruction is actually a boolean value, but Wasm does not define a bool
type and uses the i32
type to represent it instead (1 means true, and 0 means false).
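A test instruction is small enough to sketch directly. The Python function below, whose name `i64_eqz` is only illustrative, models the stack as a list and shows that the result pushed is an i32 holding 0 or 1, regardless of the operand being an i64.

```python
def i64_eqz(stack):
    """i64.eqz (0x50): pop an i64, push 1 if it was zero, otherwise 0."""
    value = stack.pop()
    stack.append(1 if value == 0 else 0)  # the result is an i32 "boolean"


if __name__ == "__main__":
    s = [0]
    i64_eqz(s)
    print(s)  # [1]

    s = [123]
    i64_eqz(s)
    print(s)  # [0]
```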
Comparison Instructions
The comparison instruction pops two operands of the same type from the top of the stack, compares them, and then pushes the result onto the stack. Like the test instruction, the result of the comparison instruction is also a boolean value (i.e., i32
type). Take the i64.lt_s
instruction (opcode 0x53
) as an example. The following is its diagram:
In addition to less than (lt), there are comparison instructions for equal (eq), not equal (ne), greater than (gt), less than or equal (le), and greater than or equal (ge); they are not introduced one by one here. Note that for some instructions that operate on integers, the mnemonic must indicate how to interpret the operands: as signed numbers (mnemonic ends with _s
) or as unsigned numbers (mnemonic ends with _u
). These instructions usually appear in pairs, such as the i64.lt_s
instruction in the above example, and the corresponding i64.lt_u
instruction (opcode 0x54
).
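The difference between the signed and unsigned variants is easiest to see when both are applied to the same bit pattern. In the Python sketch below, i64 values are stored on the stack as unsigned 64-bit patterns, which is a modeling assumption of this example. The helper `to_signed64` and the function names are hypothetical.

```python
def to_signed64(bits):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def i64_lt_s(stack):
    """i64.lt_s (0x53): compare the two operands as signed numbers."""
    b = stack.pop()  # top of the stack is the right-hand operand
    a = stack.pop()
    stack.append(1 if to_signed64(a) < to_signed64(b) else 0)


def i64_lt_u(stack):
    """i64.lt_u (0x54): compare the two operands as unsigned numbers."""
    b = stack.pop()
    a = stack.pop()
    stack.append(1 if a < b else 0)


if __name__ == "__main__":
    all_ones = 0xFFFFFFFFFFFFFFFF  # -1 if signed, 2**64 - 1 if unsigned

    s = [all_ones, 1]
    i64_lt_s(s)
    print(s)  # [1]: -1 < 1

    s = [all_ones, 1]
    i64_lt_u(s)
    print(s)  # [0]: 2**64 - 1 is not less than 1
```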
Unary Arithmetic Instructions
Unary arithmetic instructions pop an operand from the top of the stack, perform calculations, and then push the result to the top of the stack. Take the f32.neg
(opcode 0x8C
) instruction as an example. The following is a schematic diagram of it:
Binary Arithmetic Instructions
Binary arithmetic instructions pop two operands from the top of the stack, perform calculations, and then push the result to the top of the stack. Take the f32.sub
instruction (opcode 0x93
) as an example. Its schematic diagram is shown below:
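Both kinds of arithmetic instruction follow the same pattern, as the Python sketch below suggests for f32.neg and f32.sub. Python floats are 64-bit, so the hypothetical helper `to_f32` rounds results to 32-bit precision with the standard `struct` module. This is an approximation for illustration that ignores details such as NaN payloads and results too large for 32 bits, for which `struct.pack` raises an error instead of producing an infinity.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_neg(stack):
    """f32.neg (0x8C): unary, pops one operand and pushes its negation."""
    stack.append(-stack.pop())


def f32_sub(stack):
    """f32.sub (0x93): binary, pops two operands and pushes a - b."""
    b = stack.pop()  # top of the stack is the right-hand operand
    a = stack.pop()
    stack.append(to_f32(a - b))


if __name__ == "__main__":
    s = [1.5]
    f32_neg(s)
    print(s)  # [-1.5]

    s = [5.0, 1.5]
    f32_sub(s)
    print(s)  # [3.5]
```

Note the operand order in the binary case: the operand on top of the stack is the right-hand side of the subtraction.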
Conversion Instructions
The conversion instruction pops an operand from the top of the stack, performs type conversion, and then pushes the result to the top of the stack. If the type of the operand before conversion is t
, the type after conversion is t'
, and the conversion operation is conv
, then the mnemonic of the instruction is t'.conv_t
. Take the i32.wrap_i64
(opcode 0xA7
) instruction as an example. The following is a schematic diagram of it:
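For i32.wrap_i64, the conversion keeps the low 32 bits of the operand and discards the rest. A minimal Python sketch, assuming integers are stored on the stack as unsigned bit patterns and using an illustrative function name:

```python
def i32_wrap_i64(stack):
    """i32.wrap_i64 (0xA7): pop an i64, push its low 32 bits as an i32."""
    value = stack.pop()
    stack.append(value & 0xFFFFFFFF)


if __name__ == "__main__":
    s = [0x1_0000_002A]  # an i64 with bit 32 set
    i32_wrap_i64(s)
    print(s)  # [42]: the upper 32 bits are gone
```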
There are too many comparison, arithmetic, and conversion instructions to introduce one by one in this article. Please refer to the Wasm specification for the complete list.