Ways to import CSV files in Google Colab

Colab (short for Colaboratory) is Google’s free platform for writing and running Python code. It is a cloud service based on Jupyter Notebook that lets us train machine learning models directly in the cloud at no cost. Google Colab does whatever your Jupyter Notebook does and a bit more: for example, you can use GPUs and TPUs for free. Some of Google Colab’s advantages include quick setup and real-time sharing of notebooks between users.

However, loading a CSV file requires a few extra lines of code. In this article, we will discuss three different ways to load a CSV file and store it in a pandas dataframe: from the local drive, from GitHub, and from Google Drive. To get started, sign in to your Google Account, go to “https://colab.research.google.com”, and click on “New Notebook”.


Ways to import CSV

Load data from local drive

To upload a file from the local drive, write the following code in a cell and run it:

from google.colab import files
uploaded = files.upload()

Running the cell displays a file-upload prompt.

Click on “Choose Files”, then select the CSV file on your local drive to upload it. Then write the following code snippet to import it into a pandas dataframe, replacing file.csv with the name of the file you uploaded.

import pandas as pd
import io
df = pd.read_csv(io.BytesIO(uploaded['file.csv']))
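Since files.upload() returns a dictionary that maps each uploaded filename to its raw bytes, the filename does not have to be hard-coded. The following is a minimal sketch of that idea; the helper name frames_from_upload and the sample bytes are illustrative assumptions, and the uploaded dictionary here is only a stand-in so the snippet also runs outside Colab.

```python
import io

import pandas as pd


def frames_from_upload(uploaded):
    """Return {filename: DataFrame} for every .csv entry in an upload dict.

    `uploaded` is expected to look like the result of files.upload():
    a dict mapping filename -> file contents as bytes.
    """
    return {
        name: pd.read_csv(io.BytesIO(content))
        for name, content in uploaded.items()
        if name.lower().endswith(".csv")
    }


# Stand-in for the dict returned by files.upload() in Colab (hypothetical data).
uploaded = {"file.csv": b"a,b\n1,2\n3,4\n"}

frames = frames_from_upload(uploaded)
df = frames["file.csv"]
print(df.shape)
```

In Colab, the stand-in dictionary would be replaced by the value returned from files.upload().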



From Github

This is the easiest way to load a CSV file in Colab. Go to the dataset in your GitHub repository and click on “View Raw”. Copy the link to the raw dataset and pass it as a parameter to read_csv() in pandas to get the dataframe.

url = 'copied_raw_github_link'
df = pd.read_csv(url)
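If you only have the regular GitHub page link for a file rather than the raw link, the raw link can usually be derived from it. The sketch below assumes the common github.com/user/repo/blob/branch/path layout; the helper name to_raw_github_url and the example repository are hypothetical.

```python
from urllib.parse import urlparse


def to_raw_github_url(url):
    """Convert a github.com 'blob' page URL to its raw.githubusercontent.com form.

    URLs that do not match the expected layout are returned unchanged.
    """
    parts = urlparse(url)
    if parts.netloc != "github.com":
        return url
    segments = parts.path.strip("/").split("/")
    # Expected layout: <user>/<repo>/blob/<branch>/<path to file>
    if len(segments) < 5 or segments[2] != "blob":
        return url
    user, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


# Hypothetical repository, used only to show the transformation.
page_url = "https://github.com/some-user/some-repo/blob/main/data/file.csv"
raw_url = to_raw_github_url(page_url)
print(raw_url)
```

The resulting raw_url can then be passed to pd.read_csv() in the same way as a copied raw link.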


From your Google drive

We can import datasets stored on Google Drive in two ways:

1. Using PyDrive
This is the most complex of the methods for importing datasets. We first need to install the PyDrive library with pip, the Python package installer, and then execute the following.

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)



Click on the link that appears to authorize access to your Drive. You will see a screen with “Google Cloud SDK wants to access your Google Account” at the top. After you grant permission, copy the verification code and paste it into the box in Colab.

Now, go to the CSV file in your Drive, get its shareable link, and store it in a string variable in Colab (named link in the code below). Then, to load this file into a dataframe, run the following code.

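import pandas as pd
# Extract the file ID from the shareable link stored in `link`
file_id = link.split("/")[-2]
downloaded = drive.CreateFile({'id': file_id})
# Download the file into the Colab runtime, then read it with pandas
downloaded.GetContentFile('xclara.csv')
df = pd.read_csv('xclara.csv')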
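The split("/")[-2] expression above relies on the shareable link having the form .../file/d/<id>/view. A slightly more defensive sketch is shown below; it assumes two link layouts (the /file/d/<id>/ path form and the older open?id=<id> query form), and the helper name drive_file_id and the placeholder IDs are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def drive_file_id(link):
    """Extract the file ID from a Google Drive shareable link.

    Handles links shaped like .../file/d/<id>/view and ...open?id=<id>;
    raises ValueError for anything else.
    """
    parts = urlparse(link)
    segments = parts.path.strip("/").split("/")
    # Path form: the ID is the segment right after "d".
    if "d" in segments:
        idx = segments.index("d")
        if idx + 1 < len(segments):
            return segments[idx + 1]
    # Query form: the ID is carried in the "id" query parameter.
    query_id = parse_qs(parts.query).get("id")
    if query_id:
        return query_id[0]
    raise ValueError(f"Could not find a file ID in {link!r}")


# Placeholder links; FILE_ID_PLACEHOLDER stands in for a real file ID.
link = "https://drive.google.com/file/d/FILE_ID_PLACEHOLDER/view?usp=sharing"
print(drive_file_id(link))
```

The returned value can be used as the 'id' entry passed to drive.CreateFile().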


2. Mounting the drive
This method is simpler and cleaner than the one above.

  • Create a folder in your Google Drive.
  • Upload the CSV file to this folder.
  • Write the following code in your Colab Notebook:
from google.colab import drive
drive.mount('/content/drive')


Just like with the previous method, the commands will bring you to a Google authentication step. Complete the verification as in the last method. Now, in the notebook, open the File menu at the top left and click on Locate in Drive to find your data. Then copy the path of the CSV file into a variable in your notebook, and read the file using read_csv().

path = "copied path"
df_bonus = pd.read_csv(path)

Now, to read the file, run the following code.

import pandas as pd
df = pd.read_csv("file_path")
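A mistyped path, or a Drive that has not been mounted yet, makes read_csv() fail. One possible way to get a clearer message is to check the path first, as in the sketch below. The helper name read_csv_checked is hypothetical, and the temporary file only stands in for a CSV on a mounted Drive so the example can run anywhere.

```python
import os
import tempfile

import pandas as pd


def read_csv_checked(path):
    """Read a CSV with pandas, failing early with a hint if the path is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"No file at {path!r}. Is Drive mounted and the path copied correctly?"
        )
    return pd.read_csv(path)


# Demo with a temporary file standing in for a CSV on a mounted Drive.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w") as fh:
        fh.write("x,y\n1,2\n")
    df_demo = read_csv_checked(demo_path)

print(df_demo.shape)
```

In a notebook, the copied Drive path would be passed to read_csv_checked() in place of demo_path.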


Last Updated on November 8, 2021 by admin
