Tips and Tricks for Developing Substreams

Published on
October 25, 2024

Developing a new Substreams requires an understanding of general blockchain technology, familiarity with the specific chain you are developing on, and the ability to write Rust code. But don’t worry, you don’t have to be an expert in any of those fields! Thanks to the composability of Substreams, you can reuse modules written by other developers and adapt them to your needs.

The following recommendations will help you build efficient and reusable Substreams.

Learn about the Substreams development cycle

The overall development cycle of Substreams, also covered in the accompanying video, is as follows:

  • Within a Rust program, define which blockchain data you need and the transformations you’d like to apply to that data.
  • With just a single CLI command, wrap your Rust program into a Substreams package.
  • Your Substreams package is sent to the Substreams provider of your choice, who will then return the requested data with the transformations already applied.
  • The data is now populated into the data sink of your choice. This could be a Postgres database, a subgraph, MongoDB, or wherever else you need it!

Get familiar with the Substreams components

To run a Substreams, you need several pieces working together, such as the manifest, the different Protobuf definitions and the module handlers.

  • The manifest (substreams.yaml)
    The manifest is a YAML file that works as a configuration file, where you can declare the structure of your Substreams and all the configurations needed.
  • The Protobuf schemas
    Substreams uses one or several Protobuf files to define the data model (i.e. what data you want to extract from the blockchain and what data you want to expose).
  • The modules
    A module is a small piece of Rust code defining the transformations of the blockchain data. A module handler is the actual Rust function.

Use the Substreams tools

As the Substreams ecosystem grows, more tools have become available to help you in building your Substreams.

  • substreams init
    The Substreams CLI includes a really useful tool that allows you to initialize a Substreams project. If you provide a contract address, the substreams init command will auto-generate the corresponding Rust code to track all the events of the given contract.
    [Figure: the speed of running `substreams init`]
  • Substreams Registry (substreams.dev)
    Although building new Substreams is fun, it is not always necessary. There are already many open source Substreams available that may fit your needs! The Substreams Registry found at substreams.dev contains popular ready-to-consume Substreams that you can use yourself, or build on top of!

Learn the basics of Rust

If you want to develop a new Substreams, you need to know the basics of Rust. But don’t worry! You don’t have to be an expert. If you have previously worked with C++ or Java, you will find Rust similar enough, and the learning curve will probably be gentler.

The StreamingFast team recommends following the official Rust By Example tutorial, which covers almost everything you should know about Rust. You will also find some resources available in the Substreams documentation.

Avoid using the full block in several modules

One of the most powerful features of Substreams is that you can access the full block data of a chain. For example, a module handler can receive an Ethereum full block as its input parameter.

Passing the full block as a parameter requires running complex internal decoding functions, which can slow down your Substreams.

If you use several modules, the StreamingFast team recommends passing the full block only in the first module. You can extract all the necessary information from the full block in the first module and use that specific data in downstream modules. This reduces the amount of data shared across modules, while drastically speeding up your Substreams.
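
The recommendation above can be sketched in plain Rust. This is only an illustration under stated assumptions: `Block`, `Transaction` and `Transfer` are hypothetical stand-in types (a real module would use the Protobuf-generated block type of the chain), and the function names `map_transfers` and `total_transferred` are made up for this example. The point is the shape of the pipeline: only the first function sees the full block, and everything downstream works on a small extracted record.

```rust
// Stand-in types for illustration only; a real Substreams module would use
// the Protobuf-generated block type for the chain instead of these structs.

/// Hypothetical, heavily simplified "full block".
struct Block {
    number: u64,
    transactions: Vec<Transaction>,
}

/// Hypothetical, heavily simplified transaction.
struct Transaction {
    from: String,
    to: String,
    value: u64,
    input: Vec<u8>, // call data; empty for a plain value transfer
}

/// Compact record that downstream modules actually consume.
#[derive(Debug, Clone, PartialEq)]
struct Transfer {
    block_number: u64,
    from: String,
    to: String,
    value: u64,
}

/// First module: the only place that receives the full block.
/// It keeps plain value transfers (non-zero value, empty call data).
fn map_transfers(block: &Block) -> Vec<Transfer> {
    block
        .transactions
        .iter()
        .filter(|tx| tx.value > 0 && tx.input.is_empty())
        .map(|tx| Transfer {
            block_number: block.number,
            from: tx.from.clone(),
            to: tx.to.clone(),
            value: tx.value,
        })
        .collect()
}

/// Downstream module: works on the extracted transfers, never on the block.
fn total_transferred(transfers: &[Transfer]) -> u64 {
    transfers.iter().map(|t| t.value).sum()
}
```

In this sketch, any further downstream step would take `&[Transfer]` as its input, so the cost of handling the full block is paid once, in `map_transfers`.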

Batch RPC calls (eth_calls)

Sometimes, you need to make RPC calls (eth_calls) to read real-time data from a specific smart contract. You should avoid making these calls as much as possible because they are unreliable (the endpoint might be down or return unpredictable errors), slow down your Substreams (RPC requests take more time to process) and increase the amount of bytes consumed (you are making external requests).

However, if you do need to make RPC calls, you should try to make them in batches. This means sending several eth_calls in the same request, which reduces the number of network round trips and therefore the overall latency.
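
To make the idea of a batch concrete, here is a minimal sketch of what a batched request looks like at the JSON-RPC level: a single JSON array holding one `eth_call` request object per call. The `EthCall` type and both function names are hypothetical and exist only for this illustration. Inside a real Substreams module you would normally rely on the RPC helpers of the Substreams Rust crates rather than building request bodies by hand.

```rust
/// One read-only contract call (hypothetical helper type for this sketch).
struct EthCall {
    to: String,   // contract address, hex-encoded with a 0x prefix
    data: String, // ABI-encoded call data, hex-encoded with a 0x prefix
}

/// Builds a single JSON-RPC `eth_call` request object against the latest block.
fn eth_call_request(id: usize, call: &EthCall) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"eth_call","params":[{{"to":"{}","data":"{}"}},"latest"]}}"#,
        id, call.to, call.data
    )
}

/// Builds a JSON-RPC batch body: one JSON array with a request per call,
/// so that all calls travel in the same network request.
fn batch_request(calls: &[EthCall]) -> String {
    let requests: Vec<String> = calls
        .iter()
        .enumerate()
        .map(|(i, call)| eth_call_request(i + 1, call)) // ids start at 1
        .collect();
    format!("[{}]", requests.join(","))
}
```

Each request carries its own `id`, which is how the responses in the returned array can be matched back to the calls that produced them.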

Composability: split functionalities across different Substreams

You should prioritize small and concise Substreams over big and complex Substreams. Having small and independent Substreams allows you to only include the pieces of data that you really need, making your Substreams reusable and highly composable.

At the same time, when you run a Substreams, its data is cached in the servers of the Substreams provider. Running small and independent Substreams allows you to have cached data ready to use in larger Substreams, shortening your dev cycles as you build bigger and better things!

So are you feeling ready to start building your first Substreams? Head on over to substreams.dev to see what’s already available for you to use, and be sure to join the StreamingFast Discord server to ask any questions that may come up. Keep these helpful tips in mind as you get coding!