Danny Teo Yong Song

Management Consultant | Software Engineer | Data Enthusiast | Tech Explorer

Olist E-Commerce Data Pipeline (BigQuery)

Completed Mar 2026

An ELT pipeline that loads the Brazilian E-Commerce dataset (Olist) into Google BigQuery and builds a star schema with dbt-bigquery. Dagster orchestrates the run, which ends with data quality checks.

Overview

The pipeline ingests the Olist CSV files into raw BigQuery tables, then uses dbt to build staging views and a dimensional data warehouse (marts). Ingestion can go directly from local files to BigQuery or, optionally, through Google Cloud Storage. A Dagster job runs the full flow in order: ingest → dbt run → dbt test → data quality checks. Analysis is done in a Jupyter notebook that queries BigQuery with pandas_gbq.
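The ordering of that flow can be illustrated with a minimal plain-Python sketch. It is not the project's Dagster job: the step names, `run_flow`, and `make_default_steps` are hypothetical, and each step is a placeholder callable rather than a real load job or dbt command. The sketch also assumes that a failed step should stop the steps after it, which the page does not state.

```python
from typing import Callable, List, Tuple

# Hypothetical step type: a name plus a callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_flow(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first one that reports failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, step in steps:
        if not step():
            break  # assumed behavior: later steps depend on earlier ones
        completed.append(name)
    return completed


def make_default_steps() -> List[Step]:
    # Each lambda stands in for real work (a load job, a dbt command, a check).
    return [
        ("ingest", lambda: True),
        ("dbt_run", lambda: True),
        ("dbt_test", lambda: True),
        ("data_quality", lambda: True),
    ]


if __name__ == "__main__":
    print(run_flow(make_default_steps()))
```

In an orchestrator such as Dagster, the same dependency order would normally be expressed through the job's op or asset graph rather than a hand-written loop.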

Schema

Tech & Tools

Links
