Before you can query the Bluesky Jetstream data, you need temporary credentials. Run the following command to obtain them:
duckdb -c "$(curl -s https://skyfirehose.com/bootstrap.sql)"
This will print a SQL statement (CREATE SECRET s3secret (...);) which contains the temporary credentials. You need to copy this statement for the next step.
The secret from step 1 is valid for 15 minutes. Start your local DuckDB instance with the command below, then paste the copied statement into it and execute it. Once the statement has run, you can start accessing the Bluesky Jetstream data.
duckdb
The following statement attaches the remote database to your local DuckDB session so that you can query it.
ATTACH 'https://skyfirehose.com/database' AS bluesky;
You can inspect the schema of the remote database with the query below.
SELECT * FROM bluesky.schema;
The schema contains five tables: jetstream, likes, follows, posts, and reports.
All of them are partitioned by event_dt and event_hour. For example, the query below counts the likes recorded in a single hour:
SELECT count(*) FROM bluesky.likes WHERE event_dt = '2024-11-18' and event_hour = '12';
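The same partition columns can also drive a per-hour breakdown. The sketch below is illustrative rather than part of the official instructions: it assumes the bluesky database is attached and the secret is still valid, and it only uses the table and column names mentioned above (likes, event_dt, event_hour). The DESCRIBE statement is standard DuckDB and lists a table's columns, but what it returns for this remote database is not shown here.

```sql
-- List the columns of one table (output depends on the remote schema).
DESCRIBE bluesky.likes;

-- Count likes per hour for a single day.
-- Filtering on event_dt keeps the query to that day's partitions.
SELECT event_hour, count(*) AS like_count
FROM bluesky.likes
WHERE event_dt = '2024-11-18'
GROUP BY event_hour
ORDER BY event_hour;
```

If the queries start failing with an authorization error, the 15-minute secret has probably expired; repeat the first step and execute the new CREATE SECRET statement.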