Ethereum 1 Research topics

Flat data layout for storing the state

For decoupling merkle tree from the data storage. The result of this research is benchmarks and documentation on how to deal with various issues when changing from trie-based storage model to flat storage model. Examples of such issues:

This documentation should serve as a guide for any implemention team who wishes to migrate their implementations to the flat storage model.

Stateless Ethereum

For letting most nodes in the network to use less resources to process blocks. The results of this research are prototypes, benchmarks, documentation and specification describing how to implement stateless processing. Implementation would include three main parts:

  1. Block witness producer
  2. Block witness relayer
  3. Block witness consumer

Stateless clients is one approach to improve performance of Ethereum client implementations while processing blocks of transactions. Specifically, it seeks to ease the increasing burden that is the state size. It does so by removing the need to download, or implicitely construct and maintain the state, for most of the participants in the Ethereum network. The requirement to access to the state is removed by introducing another type of data packets (existing data packet types are, for example, blocks and transactions) to be gossipped around the p2p network. We call this data packets “block witnesses”. For each block we have one corresponding block witness. The two main properties that block witnesses have:

  1. It is possible to verify efficiently that the block witness is indeed constructed from the correct version of the Ethereum state.
  2. Block witness has all the required information to make it possible to execute the corresponding block.

More details can be found here:

Current vision of the project

To make stateless clients a reality and get it implemented in all Ethereum 1 clients, these large steps seem to be necessary, up to the test-net launch:

  1. Revisiting of the prototype (currently based on Turbo-Geth). Previous version of the prototype, that was used to produced data in the medium article showed above, had a different encoding for the structure than what is currently proposed (based on the “multiproofs” section here:
  2. Refresh the data. Using updated prototype, produce new data, and add non-smoothed charts (previously published charts were moving average over 1024 block to make them smoother, but they potentially concealed some extremes)
  3. Estimate savings in the block witness size from switching to binary tries It is a conjecture that switching to binary tries would reduce the size of the block witnesses by the factor of 3-4. Also, will merkle-hashing of contract byte codes reduce the size of the block witnesses.
  4. Extend to the prototype to generate block witness for the past blocks ranges. Witnesses may be produced not just for a single block, but for a range of blocks. This may make more sense for the past blocks, to speed up the sync process. If block witness is 1Mb per block, then for a million blocks, the cumulative witness size would be 1Tb. However, if we produced one witness per 1024 blocks, the ratio “cumulative witness size/cumulative block size” will go down. The goal of this step is to find the optimal value of this ratio and access whether it can speed up syncing by producing and making available witnesses for all past blocks.
  5. Visualisation in the prototype. Some initial steps have been taken (e.g. visualisations for the presentation at the STARKWare sessions). This needs to be extended to provide interactive visualisations of state and block witnesses, to help extend the circle of researches and developers dealing with the stateless clients.
  6. Prototype to support semi-stateless. Semi-stateless approach described here:, but the data gathering was still hard computationally. Data needs to be collected and published on how the block witness size reduced with the degree of statefulness, and how much memory various degrees of statefulness would require.
  7. Analysis of potential adversarial behavior. Using the prototype developed, make an assessment of how much gas would an adversary need to pay to inflate the size of the block witness. This analysis will be the input to the gas cost adjustment that introducing stateless client would require.
  8. Block witness specification. Formal (implementation-independent) description of how block witness can be produced, how to encode/decode them into/from the wire packets, how to execute transactions on a block witness.
  9. Partial block witness specification. This includes the case of semi-stateless block witnesses, tailored for a specified degree of statefulness.
  10. New sub-protocol specification. Specify a new, optional sub-protocol for gossiping block witnesses (full, partial, and for range of blocks) around the network.
  11. Gas cost specification. Based on the analysis of the adversarial behaviour, specify how the transaction senders will be charged for the production of block witnesses.
  12. Test-net. Construct a test-net supporting the block witnesses. Specification from previous steps would allow any client implementation to catch up and join.

Decoupling two functions of gas

For less painful repricing of opcodes. Gas is split into two components - one for metering, another for restricing recursion and callbacks. The result of this research is specification, prototype implementation, reference tests, that can be used to propose an EIP.

Formal system to specify semantics of Ethereum state transition

To agree on the most suitable way to specify semantics and use it to propose changes in Ethereum more formally.
Another important usage of such formal system is symbolic execution. A good formal system makes such symbolic execution easy.
Specification of the state handling, for example, is separate from the EVM spec and the definitions of transactions and blocks. It abstracts them away by the means of read sets and change sets, for example.

Current idea (proposed by Martin Lundfall) is to use the notation similar to what
is used in Web Assembly, which is very clever system, and is as follows:

  1. State of the system, of configuration, is described as a tuple of functions (for accounts, storage etc.), and an instruction sequence.
  2. Interestingly, the stack does not have a separate representation and is instead represented as a part of the instruction sequence. To achieve that, a special const instruction carrying a value is used.
  3. Most operations will be represented as a reduction rule that involve pattern that has to be matched inside the sequence of instruction, and an optional guard, which is formulated as a logical formula. Guards let us choose between multiple rules applicable to the same configuration, and allows us computing the effect of the operation on parts of the state.
  4. For our purposes, we can try to represent a gas counter (or two) a part of the instruction sequence too (with special instruction gas_counter, so we can have different reduction rules that apply depending on whether there is sufficient gas, as well as properly decrease the counter on usage.
  5. Things like execution frames are trickier. We need an ability to scrap them completely when the frame is aborted or we perform RETURN. For that reason we might need to introduce frame boundary “instructions”, frame_begin, frame_id_1 … , frame_begin, frame_id_2 …, frame_end, frame_id_2, …, frame_end, frame_id_1. In this way, pattern matching can recognise where frames begin and end so they can be eliminated. Frame ids are required to ensure that any reduction rule that removes the entire frame can specify in the guard that frame id at the start of the frame matches the frame id at the end of the frame. And, the frame needs to have its own frame id present locally.

One of the challenges is how to model the sub-states. Complexity with the substates is that the state modifications made in an execution frame (substate) could be reverted or “committed” at the end of the frame. However, the opcodes are allowed to reach beyond the substate and that can potentially trigger cascading reads.
To solve this, we can employ certain level of non-determinism. When the instructions like CALL executes, and spawns a new frame between instruction sequence and the stack, it is also allowed to copy an arbitrary part of the enclosing frame’s state into its local readset. When the execution proceeds, reading from beyond the readset results in the failure of evaluation. However, with sufficiently correct readset the evaluation will proceed as required. This is where non-determinism is. Symbolic execution engines will be able to deal with it pretty easily by actually implementing cascading reads, but modelling cascading reads as a reduction rule would require some advanced higher-level functions.
Developing this idea, we would also have local writeset, which is getting populated during the frame’s execution, and then either gets merged into the encompassing frame’s writeset, or discarded.

When jumping instructions are executed, this leads to the replacement of the entire code in the current frame with the slice starting from the jump location.

In preamble, the following types would need to be defined:

  1. Boolean
  2. Integer (up to 2^256)
  3. Lists
  4. Mapping

List will need a “drop” operation defined, like
l[index:], which is part of l with index number of items in the front dropped, or invalid_list if there aren’t enough items. It will be used for modelling jump instructions.

Mapping type will need a boolean function “submap” that tells if one mapping is a sub-mapping of another m1<=m2. It will be used to check that the readset (non-deterministic) of the newly spawned frame is correct. Also, mappings should allow “holes”, i.e. values that are intented to be deleted. Using them, we can define “merge”, which applies writeset generated by an execution frame to the writeset of the enclosing frame.

The general composition of the string that is getting reduced during the execution is this:

(1 stack1 [configuration1] instructions1 )1

When instruction CALL (or similar) is encountered,
it spawns another exection frame within the first one:

(1 stack1 c1 c2 c3 c4 c5 c6 [configuration1] CALL instructions1 )1
(1 stack1 [configuration1 - gas]
   (2 stack2-empty [configuration2 + readset]
   instructions of the called contract
instructions1 )1

Until this new execution frame completes, it will prevent any reduction rules in the first frame to proceed, because most reduction rules will have this kind of left hand side:

operand1 operand2 .. operand_n [configuration] OPCODE

Some reduction rules will include frame boundaries like (1 and )1 and some wildcards (* - matching any number of objects). Note that (1 will need to be two separate objects, so that we can have a guard to make sure that the frame id that we are matching corresponds to the inner-most frame. For example, revert will be something like this:

[configuration]( frame_id * mem_start mem_end [frame_id, mem] REVERT * frame_id )
configuration' = configuration + mem[mem_start:mem_end]

It is not very formal yet, but just to give an idea.

The next two challenges are:

Memory of an execution frame is modelled as a list of bytes + an integer specifying the starting address of the memory. Ending address can be always computed as starting address + length of the list. When memory is expanded, we need might need multiple guards for the cases where it is expanded upwards or downwards, with overlapping and non-overlapping “allocation”. When the memory is read onto the stack and writting from the stack, we need to have functions defined that turn certain number of bytes into the 0…2^256-1 integer.
In order to support the limitation of the stack depth (1024), there needs to be a short notation for matching large number of objects, so we do something like this:

( frame_id *{1024} [frame_id] PUSH1 x * ) frame_id ==>

In the above, the entire execution frame gets remove by the reduction rule. Unfortunately, there is nowhere to put the exception, but it can still be part of the configuration for the purposes of semantics and symbolic execution tools.

State syncronisation (snapshot, moving snapshot)

Model for what state syncronisation is, describe existing and proposed methods using this model. Ideally, such model would allow describing fast sync, beam sync, firehose etc., illuminate their differences and perhaps derive the optimal sync algorithm. Such model would also use specification for the state handling that abstracts away EVM and transactions, but requires a notion of blocks as a form of temporality.

Storing state outside of Ethereum network

Swarm, Cloudflare? How do we split the state data so that there is not too much churn of the chunks.

WASM for implementing Generalised Elliptic Curve precompiles

Assess whether Wasm interpreter with long-arithmetics subroutings would be acceptable substitute for precompiles in EIP-1962

WASM for implementing Stateless Ethereum

To potentially shorten the delivery time for the Stateless Ethereum

STARK proofs for Stateless Ethereum

Is it viable to use STARK proofs to reduce bandwidth requirement of Stateless Ethereum? Intermediate hashes and structure would be removed from the block witnesses, and replaced by 2 STARK proofs per block - one for pre-witness, another - for post-witness.

Useful link for implementing Keccak: (ARCHITECTURES FOR ARITHMETIC OPERATIONS IN GF(2M)
CURVE CRYPTOSYSTEMS). Looks like we would need normal basis representation, because cyclic shift is simply squaring.

Transaction pool improvements to accomodate larger transactions

To prepare implementations for applications that require larger transaction (carrying Merkle proofs), mostly due to off-chain data storage and processing.

Chain pruning

To develop set of rules and techniques for enabling partial storage of blocks, headers, and receipts.

Conformance testing to use Ethereum more natively

Instead of using implementation-specific drivers to execute conformance tests, or JSON RPC (retesteth), utilise p2p protocols to submit transactions, mine test blocks, and syncronise the state for assertions.