Use Case
You want to archive Base Mainnet block data as partitioned Parquet files in Amazon S3. Once stored, you can query the data with tools like AWS Athena, Apache Spark, or DuckDB without running any database infrastructure.
Pipeline Configuration
Create a new pipeline
In the GoldRush Platform, navigate to Manage Pipelines and click Create Pipeline. Name it `block-archive`.
Configure the Object Storage Destination
Select Object Storage as the destination type. Enter your S3 credentials and configure the file format:
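A destination config for this step might look like the sketch below. Only `provider` and `endpoint` are named elsewhere in this guide; every other field name, and all the values, are illustrative assumptions — check the GoldRush Platform docs for the exact schema.

```yaml
# Hypothetical destination config; field names other than `provider`
# are assumptions, not the confirmed schema.
destination:
  type: object_storage
  provider: s3              # gcs and r2 are also supported
  bucket: my-block-archive  # your S3 bucket
  prefix: base-blocks/
  format: parquet
  compression: zstd         # or snappy / gzip (see Compression Options)
  partition_by: day         # or hour
  batch_size: 10000
```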
Select Your Source
Choose Base Mainnet as the chain and Blocks as the data type. This streams block headers and metadata from every Base block.
File Layout
Once running, files appear in S3 under your configured prefix, partitioned by date (or by hour, depending on your partition setting), with one Parquet file written per batch.
Query with DuckDB
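As a sketch, a DuckDB session over these files might look like the following. The bucket name, prefix, partition column, and block columns such as `gas_used` are assumptions — substitute the names from your own pipeline.

```sql
-- One-time setup: enable S3 access and register credentials
INSTALL httpfs;
LOAD httpfs;
CREATE SECRET (TYPE S3, KEY_ID 'AKIA...', SECRET '...', REGION 'us-east-1');

-- Hive-style partition columns (here assumed to be `date`) are
-- exposed as regular columns when hive_partitioning is enabled
SELECT date, COUNT(*) AS blocks, AVG(gas_used) AS avg_gas
FROM read_parquet('s3://my-block-archive/base-blocks/**/*.parquet',
                  hive_partitioning = true)
GROUP BY date
ORDER BY date;
```

Because the files are partitioned, a `WHERE date = ...` filter lets DuckDB skip reading partitions outside the range entirely.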
Queries run directly against the Parquet files in S3; nothing is loaded into a database first.
Compression Options
| Format | Characteristics | Best For |
|---|---|---|
| Parquet + Snappy | Fast reads, moderate compression | Interactive queries (Athena, DuckDB) |
| Parquet + Zstd | Higher compression ratio | Long-term archival, storage cost optimization |
| JSON + Gzip | Human-readable, widely compatible | Debugging, simple consumers |
Production Tips
- Batch size: larger batches (50,000+) produce fewer, larger files, which improves query performance. Smaller batches (1,000-5,000) reduce latency to S3.
- Partition by day for archival workloads; use hourly partitions if you need finer granularity for time-range queries.
- GCS and R2: change `provider` to `gcs` or `r2` and update credentials accordingly. R2 requires an `endpoint` field.
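For R2, the swap might look like the fragment below. Only `provider` and `endpoint` are named in this guide; the other fields and the endpoint placeholder are assumptions (R2's S3-compatible endpoint takes the form `https://<account-id>.r2.cloudflarestorage.com`).

```yaml
# Hypothetical R2 variant; field names other than `provider` and
# `endpoint` are assumptions.
destination:
  type: object_storage
  provider: r2
  endpoint: https://<account-id>.r2.cloudflarestorage.com
  bucket: my-block-archive
  format: parquet
```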