The Subtle Power Of SQL’s Overlooked DML Function
One function you gloss over has the power to save you hours of development time — and preserve data accuracy.
Like many 90s-2000s kids, I’ve been watching the miniseries Quiet on Set, the behind-the-scenes chronicle of misdeeds on Nickelodeon sets, with a mixture of horror, disbelief and morbid curiosity. The series intersperses interviews with clips from the shows that featured the stars being interviewed. In the midst of the darkness, one of the funnier revelations was seeing, in those old clips, what we as child viewers believed qualified as profanity. One word stood out among the nonsensical rest: “Crud.”
Aside from being a helpful substitute for adult language, CRUD is one of the first acronyms every SQL student learns, shortly after DML (data manipulation language) and DDL (data definition language).
If you’re unfamiliar, CRUD stands for the four basic operations that can be performed on most databases:
- Create
- Read
- Update
- Delete
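As a rough illustration, each of the four operations maps onto a SQL statement. The sketch below uses Python's built-in sqlite3 module with an in-memory database so it can run anywhere; the `shows` table and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database; the table and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT)")

# Create: INSERT adds new rows.
conn.executemany(
    "INSERT INTO shows (id, title) VALUES (?, ?)",
    [(1, "Show A"), (2, "Show B")],
)

# Read: SELECT returns rows without changing them.
rows_after_create = conn.execute(
    "SELECT id, title FROM shows ORDER BY id"
).fetchall()

# Update: UPDATE changes values in rows that already exist.
conn.execute("UPDATE shows SET title = ? WHERE id = ?", ("Show A (Remastered)", 1))

# Delete: DELETE removes the rows that match a condition.
conn.execute("DELETE FROM shows WHERE id = ?", (2,))

rows_final = conn.execute("SELECT id, title FROM shows ORDER BY id").fetchall()
print(rows_after_create)
print(rows_final)
```

Only Update touches an existing row in place; the other three add, return or remove whole rows.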
As a data engineer, my work typically falls into the Create, Read and Delete categories. To deliver data in a timely manner, it’s impractical for a data pipeline to simply drop and re-create an entire table day-over-day. So we use DELETE statements to delete data corresponding with certain date windows, like 7, 14 or even 30 days in the past.
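A minimal sketch of that delete-and-reload pattern, assuming a 7-day window. SQLite stands in for the warehouse here, and the `daily_sales` table, its columns and the sample rows are all hypothetical. The run date is fixed so the result is reproducible.

```python
import sqlite3
from datetime import date, timedelta

WINDOW_DAYS = 7                 # assumed lookback window
run_date = date(2024, 4, 30)    # fixed so the example is reproducible
cutoff = (run_date - timedelta(days=WINDOW_DAYS)).isoformat()  # '2024-04-23'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (event_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2024-04-01", 10), ("2024-04-25", 20), ("2024-04-29", 30)],
)
conn.commit()

# Fresh rows for the window, as if re-pulled from the source system.
fresh_rows = [("2024-04-25", 25), ("2024-04-29", 30), ("2024-04-30", 40)]

# One transaction: the delete and the reload succeed or fail together.
with conn:
    # ISO-8601 date strings sort chronologically, so >= works on TEXT here.
    conn.execute("DELETE FROM daily_sales WHERE event_date >= ?", (cutoff,))
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", fresh_rows)

result = conn.execute(
    "SELECT event_date, amount FROM daily_sales ORDER BY event_date"
).fetchall()
print(result)
```

Rows older than the cutoff are left alone, while everything inside the window is replaced with the latest pull. That is why a late correction (the 2024-04-25 amount changing from 20 to 25) lands without rebuilding the whole table.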
To “edit” existing tables, I might use a CREATE OR REPLACE TABLE statement to do things like change a schema or add partitioning and/or clustering to make storage more cost-effective.
And, of course, Read operations come into play whenever I create views, which query their source tables inside the GCP environment and pick up new data as those tables load on their specified schedule.
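A standard view stores a query rather than data, so reading it after a source table loads returns the new rows with no extra step. Here is a small sketch of that behavior, again using SQLite as a stand-in; the `orders` table and `daily_totals` view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount INTEGER)")

# The view is just a saved SELECT over the source table.
conn.execute(
    "CREATE VIEW daily_totals AS "
    "SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date"
)

def read_view():
    """Read the view; the underlying query runs at read time."""
    return conn.execute(
        "SELECT order_date, total FROM daily_totals ORDER BY order_date"
    ).fetchall()

conn.execute("INSERT INTO orders VALUES ('2024-04-29', 10)")
before_load = read_view()

# Simulate the next scheduled load into the source table.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2024-04-29", 5), ("2024-04-30", 7)],
)
after_load = read_view()

print(before_load)
print(after_load)
```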
Sandwiched in the middle of these operations is one I (and possibly you) barely use.