This is part of a five-part series on Rivet's transition to Cardinal. If you are unfamiliar with Cardinal, we suggest starting with the introduction.
At a systems level, the architecture for Cardinal looks similar to that of EtherCattle:
Master nodes sit at the top, connecting to peers, processing blocks, establishing consensus, and sending their data through Kafka to replica nodes and the Flume index that can handle users’ RPC requests. But when you look at what’s going on inside each of these components, the picture starts to look pretty different.
While EtherCattle used the same codebase for both masters and replicas, Cardinal draws a hard line between the two.
Initially, Cardinal Masters will be based on PluGeth, the extensible Geth fork we've built to extract the data needed by Cardinal. In the near future, we will be adding Nethermind support to Cardinal, allowing us to support a wider variety of chains and giving us resilience against bugs in any single client.
Where EtherCattle worked by streaming all database writes over Kafka (thus tightly coupling the Master and Replica implementations around the database schema), Cardinal defines a streaming protocol called Cardinal Streams. Cardinal Streams helps manage out-of-order delivery of messages, as well as deduplication of messages, allowing us to make more efficient use of Kafka than EtherCattle ever could.
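To illustrate the kind of bookkeeping a protocol like this has to do, here's a minimal sketch in Go of a buffer that delivers messages in order exactly once. The types and names are ours for illustration, not Cardinal Streams' actual API:

```go
package main

import "fmt"

// Message is a hypothetical stream message keyed by a sequence number.
// Cardinal Streams' real wire format is not shown here; this sketch only
// demonstrates the ordering/deduplication technique.
type Message struct {
	Seq     uint64
	Payload string
}

// ReorderBuffer delivers messages in sequence order exactly once,
// buffering out-of-order arrivals and dropping duplicates.
type ReorderBuffer struct {
	next    uint64             // next sequence number to deliver
	pending map[uint64]Message // out-of-order messages awaiting delivery
}

func NewReorderBuffer(start uint64) *ReorderBuffer {
	return &ReorderBuffer{next: start, pending: make(map[uint64]Message)}
}

// Receive accepts one message and returns any messages now deliverable in order.
func (b *ReorderBuffer) Receive(m Message) []Message {
	if m.Seq < b.next {
		return nil // duplicate of an already-delivered message: drop it
	}
	if _, dup := b.pending[m.Seq]; dup {
		return nil // duplicate of a buffered message: drop it
	}
	b.pending[m.Seq] = m

	var ready []Message
	for {
		next, ok := b.pending[b.next]
		if !ok {
			break
		}
		delete(b.pending, b.next)
		ready = append(ready, next)
		b.next++
	}
	return ready
}

func main() {
	buf := NewReorderBuffer(1)
	// Messages arrive out of order, with one duplicate.
	for _, m := range []Message{{2, "b"}, {2, "b"}, {1, "a"}, {3, "c"}} {
		for _, d := range buf.Receive(m) {
			fmt.Println(d.Seq, d.Payload) // prints 1 a, 2 b, 3 c in order
		}
	}
}
```

Because duplicates are dropped and gaps are buffered rather than treated as errors, producers are free to resend messages cheaply, which is what lets Cardinal use Kafka more efficiently than a scheme that assumes exactly-once, in-order delivery.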
In the future, we hope to add support for additional transports in Cardinal Streams. For example, we might add IPC and Websocket transports, allowing Cardinal replicas to sync directly from Cardinal masters without Kafka brokers in the middle. We could also add transports like AWS Kinesis or GCP's Datastream.
Additionally, Cardinal Streams is designed to allow consumers to pull from multiple Cardinal streams. A single Cardinal replica can already pull from multiple Kafka brokers (increasing high availability), and as we add new transports replicas could pull from multiple different types of sources at the same time.
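The reason multi-source consumption falls out naturally is that deduplication makes overlapping streams safe to merge. A toy sketch (our own names, not Cardinal's API) of two partially overlapping brokers combining into one complete stream:

```go
package main

import (
	"fmt"
	"sort"
)

// MergeSources unions message keys from several sources, keeping one copy
// of each. Because duplicates are discarded, each source only needs to be
// *eventually* complete in aggregate — any single source can lag or drop
// messages as long as some source delivers them.
func MergeSources(sources ...[]string) []string {
	seen := make(map[string]bool)
	var merged []string
	for _, src := range sources {
		for _, key := range src {
			if !seen[key] {
				seen[key] = true
				merged = append(merged, key)
			}
		}
	}
	sort.Strings(merged)
	return merged
}

func main() {
	brokerA := []string{"block-100", "block-101", "block-103"} // missing 102
	brokerB := []string{"block-101", "block-102", "block-103"} // missing 100
	fmt.Println(MergeSources(brokerA, brokerB))
	// prints [block-100 block-101 block-102 block-103]
}
```

Neither broker alone has every message, but the merged view is complete, which is what buys the high availability described above.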
Cardinal Storage is designed as the destination of data from Cardinal Streams. Cardinal Storage is currently backed by BadgerDB, but we are also working on implementations with BoltDB and LMDB, with an eye towards supporting some networked databases in the future.
Cardinal Storage is designed to track blockchain data in a fairly chain-agnostic way. It can be configured to make a specific number of historic blocks queryable, and retains enough data to support reorgs as far back as necessary for a given chain.
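One way to support reorgs is to keep each block's writes as a separate layer and roll layers back to a common ancestor when the chain reorganizes. The sketch below is our illustration of that technique, not Cardinal Storage's actual implementation (all names are hypothetical):

```go
package main

import (
	"errors"
	"fmt"
)

// layer holds the state writes introduced by one block.
type layer struct {
	number uint64
	hash   string
	writes map[string]string
}

// Storage keeps a flattened base plus one layer per recent block, bounded
// by the reorg depth it is configured to retain.
type Storage struct {
	base     map[string]string
	layers   []layer // oldest first
	maxDepth int
}

func NewStorage(maxDepth int) *Storage {
	return &Storage{base: make(map[string]string), maxDepth: maxDepth}
}

// AddBlock records one block's writes as a new layer. Layers older than the
// retained reorg depth are considered final and flattened into the base.
func (s *Storage) AddBlock(number uint64, hash string, writes map[string]string) {
	s.layers = append(s.layers, layer{number, hash, writes})
	if len(s.layers) > s.maxDepth {
		for k, v := range s.layers[0].writes {
			s.base[k] = v
		}
		s.layers = s.layers[1:]
	}
}

// Get returns the latest value for key, searching newest layers first.
func (s *Storage) Get(key string) (string, bool) {
	for i := len(s.layers) - 1; i >= 0; i-- {
		if v, ok := s.layers[i].writes[key]; ok {
			return v, true
		}
	}
	v, ok := s.base[key]
	return v, ok
}

// Rollback discards layers until blockHash is the head, handling a reorg.
func (s *Storage) Rollback(blockHash string) error {
	for i := len(s.layers) - 1; i >= 0; i-- {
		if s.layers[i].hash == blockHash {
			s.layers = s.layers[:i+1]
			return nil
		}
	}
	return errors.New("ancestor not retained: reorg deeper than configured depth")
}

func main() {
	s := NewStorage(64)
	s.AddBlock(100, "0xaaa", map[string]string{"balance:alice": "10"})
	s.AddBlock(101, "0xbbb", map[string]string{"balance:alice": "7"})
	v, _ := s.Get("balance:alice")
	fmt.Println("head value:", v) // 7
	s.Rollback("0xaaa")           // block 101 was reorged out
	v, _ = s.Get("balance:alice")
	fmt.Println("after rollback:", v) // 10
}
```

Note the trade-off the post describes: the deeper the retained reorg depth and the more historic blocks kept queryable, the more layers must be stored before flattening.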
In the future, we plan to build an archival version of Cardinal Storage that would make the entire history of the chain queryable, though this would obviously require additional space, and may not perform as well as an implementation supporting only current data.
Cardinal RPC is a framework for writing web3 compatible RPC calls in a clear, concise manner. Cardinal RPC currently supports HTTP as a transport, but in the future will likely add Websockets, IPC, and possibly other transports.
Cardinal RPC also makes it easy for applications to report metadata alongside results, such as the blockhash a query ran at and the computational overhead of a query (e.g. gas).
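To give a feel for the idea, here's a minimal sketch of a handler registry whose handlers return a result together with that kind of metadata. Everything here — the type names, the `Meta` fields, the stubbed handler — is our illustration, not Cardinal RPC's actual API:

```go
package main

import (
	"errors"
	"fmt"
)

// Meta carries per-call metadata back to the framework alongside the result.
type Meta struct {
	BlockHash string // hash of the block the query was evaluated against
	Compute   uint64 // computational overhead of the query, analogous to gas
}

// Handler is an RPC method implementation returning result plus metadata.
type Handler func(params []string) (result string, meta Meta, err error)

// Server dispatches named Web3-style methods to registered handlers.
type Server struct {
	methods map[string]Handler
}

func NewServer() *Server { return &Server{methods: make(map[string]Handler)} }

func (s *Server) Register(method string, h Handler) { s.methods[method] = h }

// Call looks up a method and returns its result and metadata together,
// so a transport layer could attach the metadata to the response.
func (s *Server) Call(method string, params []string) (string, Meta, error) {
	h, ok := s.methods[method]
	if !ok {
		return "", Meta{}, errors.New("method not found: " + method)
	}
	return h(params)
}

func main() {
	s := NewServer()
	s.Register("eth_getBalance", func(params []string) (string, Meta, error) {
		// A stubbed handler: a real one would consult state storage.
		return "0x1bc16d674ec80000", Meta{BlockHash: "0xabc123", Compute: 2100}, nil
	})
	result, meta, _ := s.Call("eth_getBalance", []string{"0xf00d", "latest"})
	fmt.Println(result, meta.BlockHash, meta.Compute)
}
```

Returning metadata as a first-class value, rather than logging it out-of-band, is what makes it cheap for every method to report consistently.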
Forked from Geth, Cardinal EVM combines Cardinal Streams, Cardinal Storage, and Cardinal RPC to make the blockchain's state queryable. It supports eth_getBalance and the numerous other Web3 RPC methods that require Ethereum state data.
Notably, Cardinal EVM does not keep block history, transactions, receipts, logs, etc. It specializes in recent state data, and delegates historic block data to Flume.
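That division of labor amounts to routing each Web3 method to the component that holds its data. A sketch of such a routing table — the specific method assignments are our reading of the split described here, not a definitive list:

```go
package main

import "fmt"

// backends maps Web3 methods to the component that serves them:
// recent-state queries go to Cardinal EVM, while historic blocks,
// transactions, receipts, and logs go to Flume.
var backends = map[string]string{
	// state-dependent methods -> Cardinal EVM
	"eth_getBalance": "evm",
	"eth_call":       "evm",
	"eth_getCode":    "evm",
	// history and log methods -> Flume
	"eth_getLogs":               "flume",
	"eth_getBlockByNumber":      "flume",
	"eth_getTransactionReceipt": "flume",
}

// Route returns which component serves a given Web3 method.
func Route(method string) string {
	if b, ok := backends[method]; ok {
		return b
	}
	return "unsupported"
}

func main() {
	for _, m := range []string{"eth_getBalance", "eth_getLogs"} {
		fmt.Println(m, "->", Route(m))
	}
}
```

Because each component only stores what its methods need, neither has to carry the other's data, which is what keeps the stack lean.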
Flume is the last element of the Cardinal stack, and a holdover from EtherCattle. It started out as a log index offering higher-performance eth_getLogs queries than Geth can, but grew to offer blocks, transactions, receipts, and mempool access as well. Between Flume and Cardinal EVM, all Web3 RPC calls can be served with high efficiency.
Read about PluGeth.