How to transform wide tables to long tables using Pandas
You’ve probably encountered datasets that look like giant spreadsheets—lots of columns, each representing a different category, year, or variable. These are called wide tables, and while they might be easy for humans to read, they’re often harder for computers (and analysts) to work with. That’s why one of the most common data preparation steps is to transform a wide table into a long table—also known as pivoting or melting the data.
What’s the difference?
To illustrate the difference, let’s look at the GDP per capita growth dataset from the World Bank. By default, the World Bank datasets are organized as wide tables. Each variable or time period gets its own column, as shown in the screenshot below.
Meanwhile, in a long format, each row represents a single observation. The screenshot below shows the same dataset transformed to a long format, with select columns from the original file.
Why go long?
Easier analysis and visualization
Most tools and libraries—like pandas in Python, R’s tidyverse, or visualization tools like Tableau and Power BI—expect data in a long format. For example, if you want to make a line chart showing GDP over time by country, the long format makes it easy to plot “Year” on the x-axis, “Value” on the y-axis, and “Country” as the color grouping.Simpler filtering and grouping
In a long format, you can easily filter data by year, category, or group without writing complex code. Summarizing or aggregating data by group (e.g., average GDP per decade) also becomes more straightforward. This format is also easier to create pivot tables in Excel and Google Sheets.Scalable and flexible
Adding new years or variables doesn’t require creating more columns. Instead, you just add more rows. This makes your dataset cleaner and more consistent over time.Better compatibility with machine learning models
Models and statistical functions usually require data where each row is an observation and each column is a feature. Long tables aligns naturally with this structure.
How to transform your data using pandas.melt() function
In Python’s pandas library, you can easily reshape a wide table into a long one using the pd.melt()
function. Here’s the syntax of the function.
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
Parameter Breakdown
frame
(DataFrame, required)
The pandas DataFrame you want to reshape from wide to long format.id_vars
(list or scalar, optional)
Columns that you want to keep fixed — they act as identifiers for each observation (e.g., “Country”, “Date”, “Product”).Think of these as the columns that will not be unpivoted.
value_vars
(list or scalar, optional)
Columns you want to unpivot (i.e., turn from columns into rows).
If you don’t specify this, pandas will melt all columns not inid_vars
.var_name
(str, optional)
The name of the new column that will hold the former column headers (e.g., “Year”, “Category”).value_name
(str, default = "value")
The name of the new column that will hold the values (e.g., “Sales”, “Population”).col_level
(int or str, optional)
If you have a MultiIndex column header, you can specify which level to melt.ignore_index
(bool, default = True)
Whether to reset the index in the resulting DataFrame.
If set toFalse
, the original index values are retained.
How it works
pd.melt()
takes multiple columns and "unpivots" them into two new columns:
one for the variable names (
var_name
),and one for their corresponding values (
value_name
).
Essentially, it converts a wide table with many columns into a long, tidy table with fewer columns and more rows.
Example of dataset transformation
Input
import pandas as pd # Sample wide-format data data = { 'Country': ['Canada', 'USA'], '2020': [1.6, 2.1], '2021': [2.3, 3.2], '2022': [3.0, 2.8] } df = pd.DataFrame(data) print("Wide format:\n", df) # Transform to long format using pd.melt() long_df = pd.melt( df, id_vars=['Country'], # Columns to keep fixed var_name='Year', # Name for the new "variable" column value_name='GDP_Growth' # Name for the new "value" column ) print("\nLong format:\n", long_df)
Output
Wide format: Country 2020 2021 2022 0 Canada 1.6 2.3 3.0 1 USA 2.1 3.2 2.8 Long format: Country Year GDP_Growth 0 Canada 2020 1.6 1 USA 2020 2.1 2 Canada 2021 2.3 3 USA 2021 3.2 4 Canada 2022 3.0 5 USA 2022 2.8
Final thought
Wide tables are great for visual inspection, but long tables are made for analysis. Converting your dataset to long format using pd.melt()
opens the door to easier plotting, cleaner aggregations, and smoother integration with data visualization tools.
Whenever your data feels “too wide to handle,” try melting it — you’ll be surprised how much simpler your workflow becomes.