After years of research, development and testing, Ethereum will transition from proof of work to proof of stake in the coming months. Instead of “miners” using computational energy to process transactions, “validators” will lock up, or stake, their assets in the network in return for ETH rewards. The upshot is increased security and a much smaller environmental footprint for the decentralized network.

Danny Ryan is an Ethereum Foundation (EF) researcher helping to coordinate the network upgrade, known as the Merge. It’s part of a larger constellation of upgrades, once referred to as Ethereum 2.0, aimed at making the network more secure, sustainable and scalable.

Ryan joined Future to talk about the Merge. In Part I of our conversation, below, he explains the decision to temporarily prioritize security and sustainability over scalability, how the upgrade enables liquid stakers and other emerging actors, and why Ethereum doesn’t take a day off.

In Part II, he talks about the features users will likely see in subsequent upgrades, whether on-chain voting could be used for future upgrade decisions, and why shadow forks are the way forward.


Two out of three: Security and sustainability

FUTURE: What is the Merge designed to accomplish?

DANNY RYAN: Abstractly, when I think about the things we’re trying to do to and for Ethereum at the layer-one protocol over the next handful of years, we’re trying to make it more secure, sustainable, and scalable — the three S’s — while still being decentralized (which can mean a lot of things, but multidimensional decentralization).

Layer one (L1)

A layer one is a blockchain that can process transactions without relying on another network. Examples include Bitcoin, Ethereum, and Solana.

The Merge accomplishes two of those things. The Merge is to help make Ethereum more secure. That’s an argument that people will have maybe until the end of time — that proof of stake is more secure than proof of work, or vice versa. But based on our research, understanding of these systems, understanding of types of attacks and things like that, generally the Ethereum community and researchers make a claim that proof of stake is more secure than proof of work.

[With regard to] sustainability, proof of work, to do its cryptoeconomic magic, burns a ton of energy. Proof of stake, due to its cryptoeconomic magic, does not. So we’re achieving something like 99.9, 99.95, 99.98% energy reduction depending on your napkin math, but nonetheless incredibly substantial.

[If Ethereum stayed on proof of work and] the price of ETH doubles, the new equilibrium of mining power on the Ethereum platform would double eventually. And in the proof-of-stake world, [if] the price of ETH doubles, the equilibrium of the number of nodes on the network doesn’t really change. There might be 10,000 nodes on the network. There might even be 100,000 nodes on the network. But it’s going to be 100 middle schools’ or 1,000 middle schools’ worth of energy consumption — not, like, Argentina or whatever.
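The equilibrium argument above can be put into napkin-math form. The sketch below is illustrative only: the function names, the assumption that miners spend a fixed share of block rewards on electricity, and every input number (ETH price, daily issuance, electricity cost, node count, watts per node) are hypothetical placeholders, not measurements of Ethereum.

```python
def pow_equilibrium_watts(eth_price_usd, eth_issued_per_day, usd_per_kwh, energy_share):
    """Average power draw at which miners' daily electricity bill equals
    `energy_share` of the daily block reward (a simplifying assumption)."""
    daily_reward_usd = eth_price_usd * eth_issued_per_day
    daily_kwh = energy_share * daily_reward_usd / usd_per_kwh
    return daily_kwh * 1000 / 24  # kWh per day -> average watts


def pos_watts(node_count, watts_per_node):
    """Power draw under proof of stake: driven by node count, not by ETH price."""
    return node_count * watts_per_node


# Hypothetical inputs, chosen only to show the shape of the relationship.
price, issuance, tariff, share = 2_000.0, 13_000.0, 0.05, 0.25
nodes, node_watts = 10_000, 100.0

pow_now = pow_equilibrium_watts(price, issuance, tariff, share)
pow_doubled = pow_equilibrium_watts(2 * price, issuance, tariff, share)
pos_now = pos_watts(nodes, node_watts)

print(f"PoW power after the price doubles: {pow_doubled / pow_now:.1f}x")
print(f"PoS power after the price doubles: still {pos_now:,.0f} W")
print(f"Reduction vs. PoW under these inputs: {1 - pos_now / pow_now:.2%}")
```

The point of the model is the shape, not the numbers: price appears in the proof-of-work formula and is absent from the proof-of-stake one.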

The Ethereum white paper says, “In the future, it is likely that Ethereum will switch to a proof-of-stake model for security, reducing the issuance requirement to somewhere between zero and 0.05X per year.” You mentioned not just security but sustainability. At what point did sustainability become as big a factor as security?

In the white paper, I don’t know if that’s touched on. But in some early Ethereum.org blog posts and just even in the world back then, in 2013 and 2014, the linear relationship between asset price and energy consumed on proof-of-work networks was very much known. I would say that when the Ethereum community began to be less insular and [started] onboarding non-crypto-native people into interesting applications, specifically in the art and NFT world, the energy component of this definitely came into the limelight because [of] increases in the ETH price, which increase the total mining power. Getting the limelight from different communities that had all sorts of different value alignments, that definitely became a more front-and-center component. But I would say that the “waste” of burning energy to demonstrate the cryptoeconomics in proof of work is something we’ve long known about; addressing it has definitely been a goal for quite a while.

The third S: Scalability

A lot of people have gotten ahead of themselves and looked forward to the things that the Merge is going to lay the groundwork for, such as lower fees, less congestion, and more. But at its most basic …

That’s that third S — scalability. And we don’t get that out of the gate with the Merge. We do lay the foundation, as you said.

So at this point, with just the move to proof of stake and no sharding until a later upgrade, we don’t have that third S. Where do things currently stand with scalability?

I like to be a bit tongue-in-cheek: Block times will be 12 seconds instead of an average 13 and a half seconds, but the gas limit will stay the same. So 10% scalability gain at the Merge. Take it or leave it. 
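The arithmetic behind that quip is simple to check. A minimal sketch, assuming the gas limit per block is unchanged so that gas processed per second scales inversely with block time; the function names and the gas-limit value are illustrative placeholders.

```python
def gas_per_second(gas_limit, block_time_s):
    """Gas the chain can process per second at a given block time."""
    return gas_limit / block_time_s


def throughput_gain(old_block_time_s, new_block_time_s):
    """Relative increase in gas per second when only the block time changes."""
    return old_block_time_s / new_block_time_s - 1


GAS_LIMIT = 30_000_000  # placeholder value; the ratio below does not depend on it

before = gas_per_second(GAS_LIMIT, 13.5)
after = gas_per_second(GAS_LIMIT, 12.0)

print(f"{throughput_gain(13.5, 12.0):.1%}")  # 12.5%
```

Under these figures the gain works out to 12.5%, in the same ballpark as the rounded 10% quoted above.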

That’s not the kind of scalability gains that we’re looking for, really. But scalable, more-sophisticated consensus mechanisms that can come to consensus on more are actually hard to construct in proof of work. There are some attempts to do things like sharding [the planned scaling mechanism for Ethereum] and other things in proof-of-work protocols, but you end up simulating a proof-of-stake protocol inside of a proof-of-work protocol. So I would say that [proof of stake] is a requisite foundation for future scalability upgrades.

Additionally, there is a scalability path happening in parallel to the Merge through layer-two constructions [using] rollups. There are paths that are actually online, and that people are beginning to adopt more and more, that give you 10-100x scalability of the current Ethereum platform with no changes. And future scalability upgrades to the layer-one platform would complement this and multiply it. So the nice thing is that, although from layer one we’re targeting these first two S’s, security and sustainability, in parallel we’re getting scalability through layer-two constructions, which are buying us time and are meeting many of the needs. Over time, we can complement that through more scale at layer one. (See Part II of our conversation for how L1 sharding can provide additional scale.)

If you’re relying on layer-two solutions (protocols that sit atop Ethereum to increase throughput) for a certain degree of scalability, what are the security considerations in that?

It’s really easy to construct insecure layer twos, first and foremost. We believe that the most general-purpose secure constructions are these rollups: optimistic and [zero knowledge, or] ZK. And one of the crucial components of this is that you publish transaction data, or some sort of state transition data in certain ZK constructions, on-chain, so you utilize the data availability of the chain. And that does limit the amount of scalability at the end of the day.
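To see why publishing data on-chain caps rollup throughput, consider a rough upper bound. This is a minimal sketch under stated assumptions: the function name and all three inputs (bytes of L1 block space a rollup can use, bytes each rollup transaction occupies once compressed, and block time) are hypothetical and do not describe any particular rollup.

```python
def max_rollup_tps(data_bytes_per_block, bytes_per_rollup_tx, block_time_s):
    """Upper bound on rollup transactions per second when every transaction
    must leave `bytes_per_rollup_tx` bytes of data on the L1 chain."""
    txs_per_block = data_bytes_per_block / bytes_per_rollup_tx
    return txs_per_block / block_time_s


# Hypothetical inputs: 120 kB of usable data per block, 100 bytes per tx, 12 s blocks.
baseline = max_rollup_tps(120_000, 100, 12)

# Doubling the L1 data capacity doubles the ceiling; halving the tx size does the same.
more_data = max_rollup_tps(240_000, 100, 12)
smaller_txs = max_rollup_tps(120_000, 50, 12)

print(baseline, more_data, smaller_txs)
```

The ceiling moves only when the chain offers more data space or the rollup publishes fewer bytes per transaction, which is why skipping publication altogether is the tempting shortcut.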

Layer two (L2)

L2s are technologies built atop an L1 that assist with scalability.

Rollups

Rollups process transactions off the main network before bundling them together and sending them back to the L1 network.

Sometimes people look at that and go, “Well, let’s just not do that. We’ll essentially do a rollup but we won’t publish the data, and we can, like, do side construction.” So all of a sudden, the incentive to get more scale is also the incentive to potentially cut corners on some of these layer-two constructions. Thus, I think one of the security concerns here is that it’s very difficult to understand the tradeoffs. If you have a pure L2 that doesn’t cut corners, then you inherit the security of Ethereum. But if you have an L2 rollup that’s like, “Well, we’re pretty much a rollup,” then you not only don’t inherit the security of Ethereum, but the threat profile worsens by many orders of magnitude as those corners are cut.

I think it’s very difficult for a consumer to look at L2 “A” and L2 “B” and understand that L2 A is, like, 1,000 times more secure than L2 B — especially when language is unclear, especially when it’s hard to see what’s actually going on. L2Beat is this independent third party that’s trying to just catalog this information so we can better understand the security tradeoffs here. But nonetheless, that’s certainly an issue when you have L2s that aren’t quite really what they say they are. 

Another issue would be complexity. L1 has a certain risk profile in relation to the types of bugs that might be introduced, the complexity of the software and things. And so when you make an L2, you’re taking that and then you’re adding a bunch of complexity. You’re adding this whole derivative system and so there’s risk there, insecurity. 

And then I would also say there’s a desire and a need to keep these L2-derivative systems upgradable. It’s hard for me to construct an L2 that can’t ever upgrade if I assume that L1 might upgrade. That’s where the need comes in. And there’s also a desire. I think many people constructing L2s want to get them out the door, but they also want to enhance the feature set over time. So there’s also a desire to upgrade these systems over time. Because of that, there are also potential security risks. So what are the upgrade models? Is it upgradable by, like, three dudes and they have to sign a message? Is it upgradable by a DAO? Is that safe? Is it upgradable instantly? Or does it give you like a year of lead time?… And there’s a whole spectrum of design here. The theoretically perfect L2 inherits the security of Ethereum. There are a lot of different things that qualify that statement, though.

MEV, liquid staking, and the evolving Ethereum ecosystem

With the move to proof of stake as well as the infrastructure and incentive changes that come from the Merge, what sort of new actors or project types do you see coming to the fore?

Certainly, in with the validators, out with the miners. So that’s a shift in actor. We believe there’ll be easily an order of magnitude more distinct validating entities than there were mining entities, which I think is good.

In parallel over the past couple of years, the MEV (miner extractable value or maximal extractable value) space has created a few different actors. This is kind of independent of the Merge, though. There are now entities that specialize in searching [and] trying to find optimal configurations of blocks. Then there are intermediaries in there that help combine searchers into valuable blocks and then sell them essentially to miners or validators. So there’s this whole extra protocol construction of different actors that are playing this MEV game, which apparently, seemingly, is very high value, high stakes. That’s kind of independent, although there are things that the L1 protocol can probably do to make that whole construction in reality safer. (To hear more about how Ethereum can address MEV at the L1 level, read the second part of our conversation.) 

So there’s those actors. I would say staking derivatives are very interesting. There are many different versions of this, but essentially: When you’re staking, that has a certain risk profile — somebody is staking for you or you’re doing it yourself. And then there’s some representation of that underlying staked asset, which maybe you can trade or maybe you can bring into smart-contract world and bring into DeFi and things like that. 

I know LIDO is probably the most popular. There’s a handful of them, and there’s a bunch that are also up-and-coming. So there’s a lot of different players in relation to that. There are DeFi entities getting involved kind of closer into the staking world. There are DAOs governing staked derivatives, there are consortiums governing staked derivatives, there’s all sorts of fun stuff that shakes out of that world.

Right, and there was some discussion about whether LIDO, which stakes a lot of ETH to the beacon chain on users’ behalf, was hitting the max of what was good for a decentralized network.

I wrote a piece called The Risks of LSD — liquid staking derivatives. Maybe I mentioned LIDO as just an example. Some people assume that you can construct these things in ways that do not have the same kind of centralization concerns that you would if it was the single operator accumulating certain key thresholds. I make an argument in that piece that that is not the case — that you do get substantial risk when you pass one-third, one-half, and two-thirds. And that for some reason, because of the derivative nature here, we don’t acknowledge those risks quite the same. Thus, the market seems to be demanding to exceed those thresholds. 

So I make the claim that if I’m a staking derivative, DAO, or controller or whatever, it’s probably in my best interest not to exceed those thresholds because of the risk that it induces for my protocol and for my users. And I make the claim that [for] users, it’s not actually in their best interest, even though liquidity begets liquidity and being involved with a highly liquid staking derivative can have its benefits — that the risks begin to exceed such benefits. So my claim is: Let’s not not pay attention to the risks because the benefits are so great, and let’s wise up or else something bad probably will happen and then the market will probably get wiser.
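The thresholds Ryan names can be checked mechanically for any staking pool. A minimal sketch, assuming "passing" a threshold means strictly exceeding it; the function name and the stake figures are hypothetical, and the code only reports which thresholds are crossed, not what each one implies.

```python
from fractions import Fraction

# The three thresholds named in the interview.
THRESHOLDS = (Fraction(1, 3), Fraction(1, 2), Fraction(2, 3))


def thresholds_passed(controlled_stake, total_stake):
    """Return the thresholds that one entity's share of total stake strictly exceeds."""
    share = Fraction(controlled_stake, total_stake)
    return [t for t in THRESHOLDS if share > t]


# Hypothetical pool sizes out of 10 units of total stake.
for controlled in (3, 4, 6, 7):
    passed = [str(t) for t in thresholds_passed(controlled, 10)]
    print(f"{controlled}/10 of stake passes: {passed or 'none'}")
```

Exact fractions are used instead of floats so that a share of exactly one-third is not misreported as having crossed the line.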

[Editor’s note: In June 2022, LIDO holders voted down a governance proposal to explore setting limits on the amount of ETH staked through the platform.]

Some of the security gains, from my understanding, are that you’re going to get increased decentralization because it’s going to become easier to participate — not necessarily as a staker, but as a non-block-producing node. How much of the security gains are from increased user participation, and how much are attributable to other factors?

You probably get some sort of decentralization gain because proof of work and proof of stake require posting some sort of particular collateral, and it’s much easier to get the collateral for proof of stake because of the open markets to buy ETH. So it’s much easier for many participants to participate with the same edge in terms of access to that capital. Whereas in proof of work, the capital required is highly specialized machinery, you know, ASICs or GPUs.

Long story short, I think there are gains in decentralization and I think there are gains due to the type of cryptoeconomic capital — making it a bit more egalitarian, reducing the economies of scale. 

But a lot of what my claim would be [is] in the actual way the protocol is constructed: In proof of work, pretty much we can just reward. So if you do a good job, you end up making money. If you do a bad job, there’s opportunity costs. But if you explicitly attack, you don’t really lose anything. Whereas in proof of stake, if you do a good job, you make money. You do a bad job — you know, you’re offline, things like that — you stand to lose some money. And if you do explicitly nefarious things like contradict yourself and try to create reorgs and two different chains, you can lose tons of money. You can lose all of your money, depending on the extent of what’s detected.

Because the asset is in the protocol — the staked ETH — that asset can be destroyed. It’s kind of akin to: the protocol cannot burn somebody’s mining farm down if they tried to attack the chain, but the protocol can burn the staked ETH if they try to attack the chain. Not only do we get the rewards, but we can have punishments, so the security margin on the capital that is staked can be much higher. That’s the [explanation] for a lot of why we say it’s more secure.
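The difference between reward-only and reward-plus-punishment incentives can be written as a toy payoff table. This is a deliberately simplified sketch, not Ethereum's actual reward or slashing rules: the function names, the three behaviors, and every number are hypothetical, and costs outside the protocol (hardware, electricity) are ignored.

```python
def pow_payoff(behavior, reward):
    """In-protocol payoff under proof of work: the protocol can only grant
    or withhold rewards, so the worst case is earning nothing."""
    if behavior not in ("honest", "offline", "attack"):
        raise ValueError(f"unknown behavior: {behavior}")
    return reward if behavior == "honest" else 0.0


def pos_payoff(behavior, reward, stake, offline_penalty_rate, slash_fraction):
    """In-protocol payoff under proof of stake: the staked asset lives in the
    protocol, so it can be reduced or destroyed."""
    if behavior == "honest":
        return reward
    if behavior == "offline":
        return -offline_penalty_rate * stake
    if behavior == "attack":
        return -slash_fraction * stake  # up to the whole stake
    raise ValueError(f"unknown behavior: {behavior}")


# Hypothetical numbers: reward of 1 unit, stake of 32 units,
# 1% penalty for being offline, full slashing for a detected attack.
for behavior in ("honest", "offline", "attack"):
    print(behavior, pow_payoff(behavior, 1.0), pos_payoff(behavior, 1.0, 32.0, 0.01, 1.0))
```

In this toy model the proof-of-work column never goes below zero, while the proof-of-stake column does, which is the asymmetry described above.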

Decentralization, access to the asset required, reduced economies of scale, and other stuff like that help as well.

Doing it live

This entire upgrade is being done without any pause to transactions. And the Ethereum.org website states: “Ethereum does not have downtime.” Why was this such an important consideration? Why not just take a day, advertise in advance, and make the swap?

For one, I don’t know how much that will reduce the complexity. At the end of the day, we still have to coordinate on something, and we still have to agree where the end is and where to start. And once you have to do that, a day is probably not sufficient time to coordinate. 

If you actually wanted to do that — to stop, then everyone upgrades their nodes and then it starts again — I would say three days minimum, probably more like a week in terms of actually having success and coordinating. Maybe if you really give lead time [and] everyone knows it is going to happen, it could be 48 or 72 hours. I don’t think it would be just a day. 

So then the question is: What’s lost in that day? Probably a lot. I know the DeFi bros would be quite mad. It is a functioning economy. There’s a lot going on with Ethereum all day, every day. There’s an expectation that it’s up. And that’s the expectation that we’re trying to keep.

Again, I don’t know, maybe you can reduce the complexity by around 20% if you don’t do it live, but that’s probably not worth the losses of being offline for three days — both in terms of real numbers of the transaction activity on those days but also in terms of what people expect out of Ethereum. I think we would shatter that a little bit, but I don’t know. It’s the way it will be done unless there’s a concerted miner attack beforehand, and I don’t think it adds too much complexity. There was a pretty clear path on how to do it that way, so I think it made sense.

Read the second part of our conversation with Danny Ryan to learn what types of upgrades developers want to see after the move to proof of stake.

This interview has been edited and condensed.