Write Your First SQL ETL Pipeline (Part II)
How to create and load an aggregate table for your GCP usage using ETL principles and SQL commands.
Writing A SQL ETL Pipeline: Desired Output
Although Python gets more love for building ETL/ELT/EL pipelines, SQL can be just as efficient for a recurring load job.
This is part II of a two-part series. If you’re at all lost, please see part I.
When creating a new pipeline, whether in SQL or Python, I find it especially helpful to have an idea of what my output should be.
Luckily, I included the desired output in part I.
I’ll re-share it here for your review:


Desired SQL ETL pipeline output.
Having written queries that pull specific fields like serviceDescription and perform transformations like extracting the year from the invoice (invoiceYear), I want to focus on one column that requires an extra step.
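For readers who want something concrete, here is a minimal sketch of the kind of query described above: select a field, derive the year from an invoice value, and aggregate. Everything specific in it is an assumption for illustration only: the `billing_export` table, its `service_description`, `invoice_month` and `cost` columns, the `YYYYMM` format of the invoice value, and the `totalCost` output column are hypothetical and are not taken from part I. SQLite (through Python's built-in `sqlite3` module) stands in for the warehouse so the query can run locally; the SQL dialect in a real GCP setup will differ.

```python
import sqlite3

# Hypothetical stand-in for a flattened billing export. The table name,
# column names, and sample rows are assumptions, not the schema from part I.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billing_export ("
    "service_description TEXT, invoice_month TEXT, cost REAL)"
)
conn.executemany(
    "INSERT INTO billing_export VALUES (?, ?, ?)",
    [
        ("BigQuery", "202301", 12.50),
        ("BigQuery", "202302", 7.50),
        ("Cloud Storage", "202301", 3.25),
    ],
)

# Pull one field as-is, derive invoiceYear from the first four characters
# of the assumed YYYYMM invoice value, and sum the (assumed) cost column.
AGGREGATE_SQL = """
SELECT
    service_description                          AS serviceDescription,
    CAST(SUBSTR(invoice_month, 1, 4) AS INTEGER) AS invoiceYear,
    SUM(cost)                                    AS totalCost
FROM billing_export
GROUP BY service_description, SUBSTR(invoice_month, 1, 4)
ORDER BY serviceDescription, invoiceYear
"""

rows = conn.execute(AGGREGATE_SQL).fetchall()
for row in rows:
    print(row)
```

The `GROUP BY` repeats the year expression rather than relying on the alias, since alias support in `GROUP BY` varies between SQL dialects.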