From the outset, one of the big challenges OpenRelay has faced has been the management of Ethereum clients. In general, the challenges are:
- Syncing the blockchain takes hours at best, and often days.
- Losing peers can put you out of sync.
- Processing transactions requires a lot of CPU, RAM, and disk I/O.
- If your application needs to make more RPC requests than a single node can handle, all of the above must be replicated to expand capacity.
In the world of databases, these problems are solved through streaming replication. Rather than running a single database server, database administrators run a master and replicas. Read requests can be routed to the replicas to provide additional capacity, while write operations go to the master and get replicated out. Replicas generally avoid a lot of the complex operations the master has to perform, and just write to disk based on the master’s instruction. If a master ever fails, a replica can be promoted to master with minimal disruption.
The Ether Cattle Initiative brings streaming replication to Ethereum clients. In a nutshell:
- Systems administrators run a master server, a Kafka cluster to capture write operations, and as many replicas as necessary to meet capacity requirements.
- The master maintains peer-to-peer connections and validates incoming blocks & transactions.
- As the master writes to disk, it logs its write operations to a Kafka topic.
- The replica servers subscribe to the Kafka topic and apply those write operations to their own disks.
- The replica server then serves RPC requests.
- Any transactions sent to the replica are routed back to the master though a separate Kafka topic to be broadcast to the network.
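The replication loop described above can be sketched in Go. This is a minimal illustration, not the Ether Cattle implementation: a buffered channel stands in for the Kafka topic, and the write operations are plain key/value pairs rather than Geth's actual database records.

```go
package main

import "fmt"

// WriteOp is a single key/value write the master performs on its own
// database. In the real system these records are serialized onto a Kafka
// topic; here a Go channel plays the role of the topic.
type WriteOp struct {
	Key   string
	Value []byte
}

// Replica mirrors the master's database by applying the write operations
// it consumes from the log, in order.
type Replica struct {
	db map[string][]byte
}

func NewReplica() *Replica {
	return &Replica{db: make(map[string][]byte)}
}

// Apply consumes write operations until the log is closed.
func (r *Replica) Apply(log <-chan WriteOp) {
	for op := range log {
		r.db[op.Key] = op.Value
	}
}

// Get serves a read (an RPC request, in the real system) straight from
// the replica's mirrored database, with no peer-to-peer involvement.
func (r *Replica) Get(key string) []byte {
	return r.db[key]
}

func main() {
	log := make(chan WriteOp, 10) // stand-in for the Kafka topic

	// The master validates blocks, writes to its own disk, and logs
	// each write operation to the topic.
	log <- WriteOp{Key: "header:0x01", Value: []byte("block 1 header")}
	close(log)

	replica := NewReplica()
	replica.Apply(log)
	fmt.Printf("%s\n", replica.Get("header:0x01"))
}
```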
This design is documented in more detail in our Architecture Document; we have also put out a YouTube video on the subject.
Ether Cattle master servers are standard Geth nodes, but have been extended to
log all write operations to a Kafka topic. This means our changes stay clear of
any consensus-critical parts of the Geth node, and it is extremely unlikely that
a master would behave incorrectly in terms of normal node operations. Masters do
need to run with the --gcmode=archive flag to ensure that replicas get state
changes immediately, instead of only when the state gets flushed to disk on a
periodic basis.
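As a rough sketch, launching a master might look like the following. Only --gcmode=archive comes from the text above; the Kafka-related flag names and values here are placeholders, not necessarily the flags the Ether Cattle fork actually defines.

```shell
# Illustrative master launch. --gcmode=archive keeps state changes flowing
# to the write log immediately; the kafka flags below are placeholders.
geth --gcmode=archive \
     --kafka.broker=kafka.internal:9092 \
     --kafka.topic=geth-write-ops
```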
Ether Cattle replicas, however, are more complex. Replicas do not have the normal peer-to-peer behaviors of an Ethereum node; they just have a disk that mirrors the master. In a normal node, details like the latest block number are updated based on peer-to-peer operations, but in replicas those details are only available from the disk. As a result, our replicas had to re-implement a lot of logic to serve RPC requests directly from disk, without the dependency on peer-to-peer processes.
We have an OpenRelay testing environment that operates off replica nodes rather than conventional Ethereum nodes. We have been able to verify that all of the functionality OpenRelay depends on works correctly in our replicas. Now we want to reach out to other dApp developers to see that replicas function correctly for as many use cases as possible.
Because our replicas run behind a load balancer, event log subscriptions are
unreliable. If you create an event subscription with one replica, and subsequent
requests go to a separate replica, the new replica will be unable to serve your
request because it doesn’t know about the subscription. If your application
uses a client library with a polling Filter Provider, you can use that to
simulate event subscriptions with a load balanced backend. Other languages can
also simulate the behavior, but not quite as easily. We may eventually develop
a way to support event subscriptions with load balanced replicas, but given the
ease of the work-arounds it’s not currently a high priority.
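For illustration, the polling work-around can be sketched as follows. The types and function names are invented for this sketch; a real version would issue eth_getLogs and eth_blockNumber JSON-RPC calls against the load-balanced endpoint. Because every poll is a stateless request, it does not matter which replica answers it.

```go
package main

import (
	"fmt"
	"time"
)

// Log is a minimal stand-in for an Ethereum event log entry.
type Log struct {
	BlockNumber uint64
	Data        string
}

// fetchLogs abstracts an eth_getLogs call over an inclusive block range.
type fetchLogs func(from, to uint64) []Log

// pollFilter simulates an event subscription by repeatedly asking for
// logs in the blocks it has not yet seen.
func pollFilter(fetch fetchLogs, latest func() uint64, rounds int, interval time.Duration) []Log {
	var out []Log
	next := uint64(0)
	for i := 0; i < rounds; i++ {
		head := latest()
		if head >= next {
			out = append(out, fetch(next, head)...)
			next = head + 1
		}
		time.Sleep(interval)
	}
	return out
}

func main() {
	// A fake chain with one log per block, in place of a live endpoint.
	chain := []Log{{1, "Transfer"}, {2, "Approval"}, {3, "Transfer"}}
	head := uint64(0)
	fetch := func(from, to uint64) []Log {
		var logs []Log
		for _, l := range chain {
			if l.BlockNumber >= from && l.BlockNumber <= to {
				logs = append(logs, l)
			}
		}
		return logs
	}
	latest := func() uint64 { head++; return head } // one new block per poll
	got := pollFilter(fetch, latest, 3, time.Millisecond)
	fmt.Println(len(got), "logs collected")
}
```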
How You Can Help
If you have a dApp you’d be open to testing against Ether Cattle replicas, we have a couple of options.
First, we are hosting a public Goerli RPC server at https://goerli-rpc.openrelay.xyz/. If your dApp runs on Goerli, we encourage you to point it at our RPC server and check that everything works as expected. We are also trying to build a list of dApps that support Goerli, and would appreciate a pull request at github.com/openrelayxyz/goerli-dapp-list
To get Goerli working with Metamask, use
these instructions, with
https://goerli-rpc.openrelay.xyz as the network RPC URL.
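A quick way to smoke-test the endpoint is a plain JSON-RPC call; any standard Ethereum tooling pointed at the same URL works the same way:

```shell
# Standard eth_blockNumber call against the public Goerli replica endpoint.
curl -s -X POST https://goerli-rpc.openrelay.xyz \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
```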
If your dApp doesn’t run on Goerli, we also have a mainnet endpoint available, but we are not publishing it just yet. If you are open to helping test your dApp against our mainnet endpoint, reach out to me directly at austin.roberts[at]openrelay.xyz and I’ll get you the endpoint URL.
We plan to leave these endpoints up through the end of April, 2019.
If you run into issues with Ether Cattle replicas, please report them to our GitHub repository. Note that at this time we are running minimal infrastructure for the purposes of testing the behavior of our RPC servers; we do not have these endpoints deployed in a highly available configuration. We are monitoring for gateway errors, so you don’t need to file bug reports when the endpoints go down.
If you have questions that you don’t think warrant a bug report, you can also reach out to us on Gitter.
The work OpenRelay has done on the Ether Cattle Initiative has been possible (in part) thanks to a 0x Ecosystem Development Grant. The work on Ether Cattle will always be open source, and we hope to contribute it back to the Go Ethereum project once it’s stable.