Back in February we wrote about how Geth stores data, and some of the challenges inherent to getting the data that you need.

At ETH Denver 2020, the OpenRelay team set to work on Flume, an open source log indexer that makes this information easier to access. It’s taken a while to get everything ready for production, but Flume is finally here.

The Problem

Certain types of data are most easily obtained by looking at log events, but that data is not efficiently indexed in a conventional Ethereum client.

For example, if you wanted to know all of the events where the address 0xff3fbe056a5261e9e9d13b0e6e32c1d13f306884 had received Embiggen, you could run a getLogs query like this:

web3.eth.getLogs({fromBlock: 0, toBlock: "latest", address: "0xdde19c145c1ee51b48f7a28e8df125da0cc440be", topics: ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef", null, "0x000000000000000000000000ff3fbe056a5261e9e9d13b0e6e32c1d13f306884"]})
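
To make the shape of that filter easier to see, here is a minimal sketch of how it could be assembled. The first topic is the hash of the ERC20 Transfer(address,address,uint256) event signature, the second is the sender (null matches any sender), and the third is the recipient, left-padded to 32 bytes. The helper names addressToTopic and buildReceivedFilter are hypothetical and are not part of web3.js or Flume.

```javascript
// keccak256("Transfer(address,address,uint256)"), the first topic in the query above.
const TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";

// Indexed address arguments are stored as 32-byte topics, so a 20-byte
// address is left-padded with 12 zero bytes (24 hex characters).
function addressToTopic(address) {
  const hex = address.toLowerCase().replace(/^0x/, "");
  if (!/^[0-9a-f]{40}$/.test(hex)) {
    throw new Error("expected a 20-byte hex address: " + address);
  }
  return "0x" + hex.padStart(64, "0");
}

// Hypothetical helper: builds a getLogs filter for Transfer events of
// `tokenAddress` whose recipient is `recipient`.
function buildReceivedFilter(tokenAddress, recipient) {
  return {
    fromBlock: 0,
    toBlock: "latest",
    address: tokenAddress,
    // topics[0] = event signature, topics[1] = sender (null matches any),
    // topics[2] = recipient.
    topics: [TRANSFER_TOPIC, null, addressToTopic(recipient)],
  };
}
```

The resulting object can be passed straight to web3.eth.getLogs, for example web3.eth.getLogs(buildReceivedFilter(token, account)).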

If you’re using a centralized service provider like Infura, or a wallet like MetaMask that defaults to using Infura, this will return in under a second with all of the results that match the query. But if you run this against a conventional Geth node, it will take several minutes.

On one hand, this is a nice feature. Providers like Infura use indexes to retrieve the information you want without having to scan through lots of blocks. They still comply with the standard API, and they get you the data you need very quickly. On the other hand, developers can easily build an application that they think their users can run against their own nodes when, by relying on queries like the one above, they have actually built in a dependency on a centralized provider.

The Solution

At OpenRelay, a big part of our mission is to make Ethereum nodes operationally manageable through open source tools. We developed Flume to enable developers to run their own log indexing servers, allowing dapps and their users to get rapid responses about log information without relying on centralized providers.

Like the rest of our Ether Cattle Initiative, Flume will always be open source under the AGPL for those who prefer to host their own node infrastructure.

If you prefer not to self-host, or your team lacks the resources or expertise to do so efficiently at scale, we are pleased to announce that Flume is available on Rivet.cloud. With Rivet, all you have to do to use Flume is make an eth_getLogs call, and Flume will handle the request.

Additional Features

Now that we have a big index of Ethereum log data, we can do more with it than just serve eth_getLogs requests.

For now, we’ve added two new RPC calls:

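flume_erc20ByAccount takes an account address and returns the ERC20 token contracts that the account has received. Example: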

curl http://flumehost/ --data '{"id": 1, "method": "flume_erc20ByAccount", "params": ["0x0a65659b64573628ff7f90226b5a8bcbd3abf075"]}'
{"jsonrpc":"","id":1,"result":{"items":["0x0027449bf0887ca3e431d263ffdefb244d95b555", "0xdde19c145c1ee51b48f7a28e8df125da0cc440be"]}}

Note: This API will return any ERC20 tokens the address has ever received; it is possible that those tokens are not currently present in the wallet.
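
The same call can be made from JavaScript. The following is only a sketch: it mirrors the request body from the curl example, uses the fetch function built into Node 18+ and browsers, and assumes the server accepts a JSON content type and reports failures in the usual JSON-RPC error field. The names flumeRequest and erc20ByAccount are hypothetical helpers, and http://flumehost/ remains a placeholder for wherever your Flume instance is served.

```javascript
// Builds the JSON-RPC request body in the same shape as the curl example.
function flumeRequest(method, params, id = 1) {
  return JSON.stringify({ id, method, params });
}

// Hypothetical helper: asks a Flume server which ERC20 tokens `account`
// has received. `url` is the address of the Flume host.
async function erc20ByAccount(url, account) {
  const response = await fetch(url, {
    method: "POST",
    // Assumption: the server accepts (or ignores) a JSON content type.
    headers: { "Content-Type": "application/json" },
    body: flumeRequest("flume_erc20ByAccount", [account]),
  });
  const payload = await response.json();
  // Assumption: errors follow the usual JSON-RPC { error: { message } } shape.
  if (payload.error) {
    throw new Error(payload.error.message);
  }
  return payload.result.items;
}
```

Calling erc20ByAccount("http://flumehost/", "0x0a65659b64573628ff7f90226b5a8bcbd3abf075") would resolve to the items array shown in the response above.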

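flume_erc20Holders takes a token contract address and returns the accounts that have received that token. Example: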

curl http://flumehost/ --data '{"id": 1, "method": "flume_erc20Holders", "params": ["0xdde19c145c1ee51b48f7a28e8df125da0cc440be"]}'
{"jsonrpc":"","id":1,"result":{"items":["0xaa461d363125ad5ce27b3941ed6a2b1cf2c7cdf3","0x08409de58f3ad94c5e2c53dbe60ae01be472a820","0x0a65659b64573628ff7f90226b5a8bcbd3abf075","0x18e4ff99ee82f4a38292f1a5d5b2951a5d2a6f2d",["..."]]}}

Note: This API will return any accounts that have ever received the token; it is possible that some of those accounts no longer hold it.
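
Because both calls report addresses that have ever received a token rather than current holdings, one way to narrow the results is to check each balance with a standard eth_call against the token's balanceOf(address) function. The sketch below only builds the call parameters and interprets the result; the helper names balanceOfCall and isNonZeroBalance are hypothetical, and sending the eth_call to a node is left to whichever client you already use.

```javascript
// 4-byte selector for the standard ERC20 balanceOf(address) function.
const BALANCE_OF_SELECTOR = "0x70a08231";

// Hypothetical helper: builds the params array for an eth_call that reads
// `holder`'s current balance of `token` at the latest block.
function balanceOfCall(token, holder) {
  const hex = holder.toLowerCase().replace(/^0x/, "");
  if (!/^[0-9a-f]{40}$/.test(hex)) {
    throw new Error("expected a 20-byte hex address: " + holder);
  }
  // The address argument is ABI-encoded as a 32-byte, left-padded word.
  const data = BALANCE_OF_SELECTOR + hex.padStart(64, "0");
  return [{ to: token, data }, "latest"];
}

// eth_call returns the balance as a hex string; an empty "0x" result
// (for example, from an address with no contract code) is treated as zero.
function isNonZeroBalance(resultHex) {
  if (resultHex === "0x") {
    return false;
  }
  return BigInt(resultHex) !== 0n;
}
```

An account from flume_erc20Holders would then be kept only if isNonZeroBalance returns true for the eth_call result produced with balanceOfCall(token, account).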

In the future we may add APIs that include current balances along with the list of tokens received. If you have other ideas for ways to slice and dice log data, open an issue on GitHub (or even better, a pull request) and we’d be happy to entertain your ideas.


If you’re new to Rivet and you’d like to try out Flume for yourself, it takes less than a minute to get a free Rivet endpoint. Head on over to rivet.cloud/BUIDL to get started.