Set Up A Virtual Environment In A Compute Engine VM In 5 Min.
Set up a virtual environment, install Python & pip and run Python scripts in a Google Cloud Compute Engine virtual machine.
“So how do I see the VM code? Is there, like, a window for that?” I’m sure the engineers who trained me found my question about Google Cloud Compute Engine instances overly basic and, possibly, concerning in its naïveté.
For context: One of my core responsibilities as a data engineer is responding to downtime alerts and debugging malfunctioning pipelines. While we use several tools to build pipelines, our core 3 are:
- Cloud Functions
- Airflow DAGs
- Compute Engine VMs
By the time I asked that question, I had considerable exposure to, and debugging experience with, both Cloud Functions and Airflow.
But VMs, themselves abstractions for an operating system, remained, well, abstract.
Unlike Cloud Functions or Airflow tasks, VMs don’t openly display their source code, so for the longest time they were a black box.
In my 3 years as an engineer, I’ve found that the best way to demystify a technical concept is to get hands-on and start at the beginning.
So, to help you better understand Compute Engine VMs, I’m going to walk through provisioning an instance with Python — from setting up an environment to running an actual script.
While I’ve gained significant experience creating, troubleshooting and contributing to code running within VMs, only recently have I had to provision one with a Python environment.
Note: The following walkthrough covers provisioning a VM with code, not the initial creation or setup, as Google Cloud provides robust and detailed documentation on that process.
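To make the goal concrete, here is a minimal sketch of the "create an environment, then run a script in it" flow using Python's standard-library `venv` module. It is an illustration under assumptions, not necessarily the exact method used later in this walkthrough: the helper names `create_venv`, `run_script` and `demo` are hypothetical, and the script it runs is a throwaway that only checks whether it is executing inside a virtual environment.

```python
import os
import subprocess
import sys
import tempfile
import venv
from pathlib import Path


def create_venv(env_dir: Path, with_pip: bool = False) -> Path:
    """Create a virtual environment and return the path to its interpreter.

    with_pip defaults to False here so the sketch runs without network
    access or extra OS packages; on a real VM you would likely pass True.
    """
    # Mirror the `python -m venv` default: symlinks everywhere except Windows.
    venv.create(env_dir, with_pip=with_pip, symlinks=(os.name != "nt"))
    # Windows places executables under Scripts/, Linux and macOS under bin/.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def run_script(python_path: Path, script_path: Path) -> str:
    """Run a script with the environment's interpreter and return its stdout."""
    result = subprocess.run(
        [str(python_path), str(script_path)],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return result.stdout.strip()


def demo() -> str:
    """Create a throwaway environment and run a tiny script inside it."""
    with tempfile.TemporaryDirectory() as tmp:
        env_python = create_venv(Path(tmp) / "env")
        script = Path(tmp) / "hello.py"
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        script.write_text("import sys\nprint(sys.prefix != sys.base_prefix)\n")
        return run_script(env_python, script)


if __name__ == "__main__":
    print(demo())
```

Calling the environment's interpreter by path, as `run_script` does, avoids having to "activate" the environment first, which is handy for scripts launched non-interactively on a VM. On Debian-based images, creating an environment with pip typically also requires the `python3-venv` package to be installed.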
Why I Needed To Provision A Compute Engine VM With Python
The Scenario
One of my peers needed to run a significant backfill. Typically, we have a dedicated instance, already set up with Python, that team members are welcome to use during off-hours to give time-consuming backfills an extra compute boost. Unfortunately, this instance was reserved for a higher-priority task and wouldn’t be available until the next sprint.