Why Do My Data Engineering Requests Take Forever?
How data engineers can set realistic development expectations and respond to impatient stakeholders.
A risk-averse approach to “flipping the switch” from test tables to production tables, featuring a subtle BigQuery SQL function.
Covering GitHub versioning, CI/CD pipeline development and scheduling jobs within Google Cloud Platform.
Leverage BigQuery SQL table metadata to deduplicate, partition and delete data — all using only one word.
Distinguishing between Google Cloud Platform’s authentication process and a typical API’s emphasizes the need for secure credential storage.
Set up a virtual environment, install Python & pip and run Python scripts in a Google Cloud Compute Engine virtual machine.
How to use Python to read multi-page PDFs and transform unstructured data, then use SQL to format the final result in BigQuery.
Whether afraid or stuck in old habits, new engineers often fail to ask important probing questions; here is how devs can think critically.
Leverage the Python Google Cloud Storage and BigQuery APIs to bulk download, transform and upload CSV files in < 1 minute.
Avoid GCP 401 errors — and security concerns — by passing project credentials into your Docker image the right way.
Building time-sensitive data pipelines is a challenge; luckily, there is a way to build localized pipelines with 3 lines of Python.
4 thankless tasks effective data engineers must do — especially before Q4 and the holiday season.