Data Science and Data Engineering Blog

DATA SCIENCE WARRIOR

Use GCP with Tesla V100 in Colab

Colab is a great tool for machine learning prototyping: it provides free access to a GPU and can significantly speed up training. However, there are many occasions when you need extra computational power to train your deep learning models. As of July 2020, the best available GPU on GCP is the Tesla V100. Below I explain how to access a Tesla V100 from your Colab account using GCP and cut your training time.

Prerequisites:

  1. A GCP account.
  2. The Google Cloud SDK installed on your machine.
  3. Access to GPUs in your GCP project. New projects often start with no GPU quota, so you may need to request a quota increase first.
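A quick way to confirm that the Cloud SDK is available from the terminal is sketched below. The helper names `have_cmd` and `check_gcloud` are made up for this example; `gcloud compute accelerator-types list` is a standard Cloud SDK command, and the zone in the comment is simply the one used later in this post.

```shell
# Hypothetical helper: succeeds if the given command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether the Cloud SDK is installed, without failing hard.
check_gcloud() {
  if have_cmd gcloud; then
    echo "gcloud found"
  else
    echo "gcloud not found: install the Google Cloud SDK first"
  fi
}

check_gcloud

# Once gcloud is installed and you are logged in, this lists the GPU types
# offered in a zone (the V100 appears as nvidia-tesla-v100):
#   gcloud compute accelerator-types list --filter="zone:us-central1-a"
```

Note that this only shows which GPU types a zone offers; it does not tell you whether your project has quota for them.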

Open Terminal (Mac) and execute the following commands:

gcloud auth login 
gcloud config set project PROJECT_ID

Note: replace PROJECT_ID with your own project ID.

export IMAGE_FAMILY="tf-1-15-cu100"    
export ZONE="us-central1-a"
export INSTANCE_NAME="deeplearning"

Many different images are available on GCP, including Deep Learning images for PyTorch and TensorFlow. You can browse them here: https://cloud.google.com/ai-platform/deep-learning-vm/docs/images#listing-versions
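The image families can also be listed from the terminal. The `gcloud compute images list` call below follows the Deep Learning VM documentation as I understand it; the wrapper function, the `GCLOUD` override variable, and the keyword filter are conveniences assumed for this sketch.

```shell
# GCLOUD can be overridden (e.g. GCLOUD=echo) to preview the command
# without calling Google Cloud. This variable is an assumption of the sketch.
GCLOUD="${GCLOUD:-gcloud}"

# List Deep Learning VM image families. An optional keyword such as
# "tf-1" or "pytorch" narrows the output; with no keyword, all are shown.
list_dl_image_families() {
  "$GCLOUD" compute images list \
    --project deeplearning-platform-release \
    --no-standard-images \
    --format="value(family)" | grep -- "${1:-}" | sort -u
}

# Example usage (requires gcloud and an authenticated account):
#   list_dl_image_families tf-1
```

Any family name printed this way should be usable as the IMAGE_FAMILY value above.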

gcloud compute instances create $INSTANCE_NAME \
--zone=$ZONE \
--image-family=$IMAGE_FAMILY \
--image-project=deeplearning-platform-release \
--maintenance-policy=TERMINATE \
--accelerator="type=nvidia-tesla-v100,count=1" \
--metadata="install-nvidia-driver=True" \
--preemptible
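Once the create command returns, it is worth checking that the instance is up and that the driver can see the GPU. The sketch below wraps two standard gcloud commands; the function names and the `GCLOUD` override are assumptions for illustration, and the defaults mirror the variables exported earlier. Driver installation can take a few minutes on first boot, so the GPU check may fail if you run it right away.

```shell
# Defaults mirror the variables exported earlier in this post.
GCLOUD="${GCLOUD:-gcloud}"
ZONE="${ZONE:-us-central1-a}"
INSTANCE_NAME="${INSTANCE_NAME:-deeplearning}"

# Print the instance status (RUNNING once the VM is up).
instance_status() {
  "$GCLOUD" compute instances describe "$INSTANCE_NAME" \
    --zone="$ZONE" --format="value(status)"
}

# Run nvidia-smi on the instance to confirm the driver sees the V100.
remote_gpu_check() {
  "$GCLOUD" compute ssh "$INSTANCE_NAME" --zone="$ZONE" --command="nvidia-smi"
}

# Example usage (requires gcloud and an authenticated account):
#   instance_status
#   remote_gpu_check
```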

The current setup uses a preemptible instance, which is significantly cheaper (around USD 0.75 per hour), but the instance lasts at most 24 hours and GCP can stop it earlier. Remove the ‘--preemptible’ flag if you want to run your model longer, but remember that a regular V100 costs around USD 1.81 per hour. Please use the GCP pricing calculator to get more accurate estimates.
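For a rough feel of the numbers before opening the calculator, here is a small sketch that multiplies hours by an hourly rate. It uses the approximate GPU rates quoted above, so treat the results as ballpark figures only; the function names and the `GCLOUD` override are made up for this example. It also includes a helper to stop the instance, since a stopped instance is no longer billed for the GPU (its disk still is).

```shell
# Defaults mirror the variables exported earlier in this post.
GCLOUD="${GCLOUD:-gcloud}"
ZONE="${ZONE:-us-central1-a}"
INSTANCE_NAME="${INSTANCE_NAME:-deeplearning}"

# Estimate GPU cost: hours * hourly rate, rounded to cents.
estimate_cost() {
  awk -v hours="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", hours * rate }'
}

# Rates are the approximate per-hour figures quoted above.
echo "8 h preemptible: USD $(estimate_cost 8 0.75)"
echo "8 h regular:     USD $(estimate_cost 8 1.81)"

# Stop the instance when you are done so the GPU is no longer billed.
stop_instance() {
  "$GCLOUD" compute instances stop "$INSTANCE_NAME" --zone="$ZONE"
}

# Example usage (requires gcloud and an authenticated account):
#   stop_instance
```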

The following command connects to the instance over SSH and forwards port 8081 from the instance to your machine.

gcloud compute ssh --zone us-central1-a deeplearning -- -L 8081:localhost:8081

If you get an error related to port 22, go to GCP -> VPC network -> Firewall -> default-allow-ssh, click Edit, and change Targets to “All instances in the network”. (It worked for me.)
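Before changing the rule, it can help to look at what it currently allows. `gcloud compute firewall-rules describe` is a standard command; the function name and the `GCLOUD` override are just for this sketch, and I am assuming your rule is still called default-allow-ssh.

```shell
# GCLOUD can be overridden (e.g. GCLOUD=echo) to preview the command.
GCLOUD="${GCLOUD:-gcloud}"

# Show which sources, ports, and targets the default SSH rule covers.
show_ssh_rule() {
  "$GCLOUD" compute firewall-rules describe default-allow-ssh \
    --format="yaml(name,sourceRanges,targetTags,allowed)"
}

# Example usage (requires gcloud and an authenticated account):
#   show_ssh_rule
```

If the output lists target tags that your instance does not carry, the rule does not apply to it, which would be consistent with the port 22 error.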

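Once connected, run the following on the instance to install the jupyter_http_over_ws extension and start Jupyter:

# Quote the requirement so the shell does not treat ">=" as a redirect.
pip install --upgrade "jupyter_http_over_ws>=0.0.1a3" && \
jupyter serverextension enable --py jupyter_http_over_ws

# Start Jupyter on port 8081 and allow connections from Colab.
jupyter notebook \
--NotebookApp.allow_origin='https://colab.research.google.com' \
--port=8081 \
--NotebookApp.port_retries=0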

You should see a link in the terminal that looks something like http://localhost:8081/?token=aa04a0d1ab47f04a40bcb2a4b7a73abc392bc (your token will differ). Copy it.
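If the link has scrolled out of view, it can be pulled out of Jupyter's output with a small filter. This is a minimal sketch: `extract_notebook_url` is a made-up helper, and the sample line is invented to match the format Jupyter prints.

```shell
# Pull the first tokenized localhost URL out of text on stdin.
extract_notebook_url() {
  grep -o 'http://localhost:[0-9]*/?token=[0-9a-f]*' | head -n 1
}

# Example with a made-up line in the format Jupyter prints:
echo "http://localhost:8081/?token=abc123 :: /home/user" | extract_notebook_url

# On the instance, in a second SSH session, the same helper can read
# the list of running servers:
#   jupyter notebook list | extract_notebook_url
```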

Open Colab, click “Connect” -> “Connect to local runtime”, and paste your link there.

Now you should have Colab connected to a GCP instance with a Tesla V100.

Note: Currently this setup doesn’t work with Google Drive, so you need to move your data to a GCP bucket.
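One way to move the data is with `gsutil`, which ships with the Cloud SDK. The sketch below is only an illustration: the bucket name is hypothetical (bucket names are global, so you need your own), and the function names and the `GSUTIL` override are assumptions of this example.

```shell
# GSUTIL can be overridden (e.g. GSUTIL=echo) to preview the commands.
GSUTIL="${GSUTIL:-gsutil}"
# Hypothetical bucket name; replace it with your own.
BUCKET="${BUCKET:-gs://my-colab-training-data}"

# Run on your local machine: create the bucket and upload a data folder.
# If the bucket already exists, "mb" reports an error and the copy still runs.
upload_data() {
  "$GSUTIL" mb "$BUCKET"
  "$GSUTIL" -m cp -r "$1" "$BUCKET/"
}

# Run on the GCP instance: copy the folder into the current directory.
download_data() {
  "$GSUTIL" -m cp -r "$BUCKET/$1" .
}

# Example usage:
#   upload_data ./data      # on your machine
#   download_data data      # on the instance
```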

Enjoy the ultimate power 🙂

Happy Deep Learning.
