How to Read a CSV or Excel File in Python with Pandas

Pandas, a powerful library for handling and analyzing data, makes working with data in Python much easier. One of the first steps in any data project is loading your dataset into a format you can work with: the pandas DataFrame.

A DataFrame is like a spreadsheet inside Python. It has rows and columns, and each column can hold different types of data such as numbers, text, or dates.

Step 1: Install and Import Pandas

If you don’t already have pandas installed, open your terminal or command prompt and run the two commands below. The second one installs openpyxl, which pandas needs to read .xlsx Excel files:

pip install pandas
pip install openpyxl

Now import it in your script or notebook:

import pandas as pd

Step 2: Reading a CSV File

CSV (comma-separated values) files are a common way to store data. To load a CSV file into a DataFrame:

df = pd.read_csv("sample.csv")
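
If you don’t have a CSV file handy, here is a minimal sketch for trying read_csv without one. It builds a small CSV in memory with io.StringIO; the column names and values are made up for illustration and are not from a real dataset.

```python
import io

import pandas as pd

# Hypothetical CSV content, standing in for a real sample.csv on disk.
csv_text = """country,population,year
Canada,38000000,2021
Japan,125700000,2021
Kenya,53000000,2021
"""

# read_csv accepts a file path or any file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # data type pandas inferred for each column
```

The first line of the text becomes the column names, and pandas infers a data type for each column from the values it finds.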

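👉 If your file is stored elsewhere on your computer, use the full file path. On Windows, backslashes (\) inside an ordinary Python string can be read as escape characters, so either replace them with forward slashes (/) or put an r before the opening quote to make a raw string. This example uses forward slashes: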

df = pd.read_csv("C:/Users/YourName/Documents/sample.csv")
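
Another way to avoid slash problems is to build the path with pathlib from Python’s standard library. The sketch below is only an illustration: it writes a tiny made-up CSV into a temporary folder (standing in for a real location such as your Documents folder) and reads it back.

```python
import tempfile
from pathlib import Path

import pandas as pd

# A temporary folder stands in for a real folder on your computer.
with tempfile.TemporaryDirectory() as tmp_dir:
    # The / operator joins path parts with the right separator for your system.
    csv_path = Path(tmp_dir) / "sample.csv"

    # Hypothetical file content, written only so there is something to read.
    csv_path.write_text("name,score\nAda,90\nLin,85\n", encoding="utf-8")

    # read_csv accepts a Path object as well as a plain string.
    df = pd.read_csv(csv_path)

print(df)
```

With your own file, you would skip the temporary folder and point Path at the real location instead.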

Step 3: Reading an Excel File

For Excel files, the process is the same, but you call read_excel instead. By default it loads the first sheet in the workbook:

df = pd.read_excel("sample.xlsx")
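
If your workbook has more than one sheet, you can ask for a specific one with the sheet_name parameter. Here is a rough sketch, assuming openpyxl is installed: it first writes a small made-up workbook to a temporary folder so there is a file to read. The sheet name "Visits" and the data are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up data, used only to create a workbook for this example.
visits = pd.DataFrame({"city": ["Lima", "Oslo"], "visits": [12, 7]})

with tempfile.TemporaryDirectory() as tmp_dir:
    xlsx_path = Path(tmp_dir) / "sample.xlsx"

    # Write the workbook with a named sheet (this step needs openpyxl).
    visits.to_excel(xlsx_path, sheet_name="Visits", index=False)

    # Read that sheet back by name instead of relying on the default first sheet.
    df = pd.read_excel(xlsx_path, sheet_name="Visits")

print(df)
```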

Step 4: Naming Your DataFrame

While many tutorials use df, it’s better to give your DataFrame a name related to your data:

population_data = pd.read_csv("population_by_country.csv")

This makes your code easier to read later. A good practice is to add comments for each step. This will help you remember what you are doing and allow others who may be looking at your code to understand what you did and why.

Step 5: Inspecting Your Data

Once you’ve loaded the data, you’ll want to take a quick look. Here are three quick ways to get an overview of the data:

print(df.head())      # First 5 rows
print(df.columns)     # Column names
df.info()             # Summary of rows, columns, and data types (prints by itself)
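
To see these on a tiny example, here is a sketch that uses a small DataFrame built by hand rather than one loaded from a file. The column names and values are made up for illustration.

```python
import pandas as pd

# A small hand-made DataFrame, standing in for data loaded from a file.
df = pd.DataFrame(
    {
        "country": ["Canada", "Japan", "Kenya"],
        "population": [38000000, 125700000, 53000000],
    }
)

first_rows = df.head(2)            # head(n) returns the first n rows
column_names = list(df.columns)    # column labels as a plain Python list

print(first_rows)
print(column_names)
print(df.shape)                    # (number of rows, number of columns)

# info() prints its summary directly and returns None,
# so it does not need to be wrapped in print().
df.info()
```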

Quick Recap

  • Use pd.read_csv() for CSV files.

  • Use pd.read_excel() for Excel files.

  • Always check your data with .head(), .columns, and .info().

✅ Now that you know how to load data, the next step is learning how to select specific columns and rows inside your DataFrame.

👉 Read the next tutorial: Selecting Columns and Rows in a Pandas DataFrame

FWD EDITORS

We’re a team of data enthusiasts and storytellers. Our goal is to share stories we find interesting in hopes of inspiring others to incorporate data and data visualizations in the stories they create.
