Sorting and Ranking Data in Pandas
When analyzing data, it’s not enough to look at a table of numbers. You often want to see patterns, identify leaders and laggards, or compare values quickly. For example:
Which country has the largest population?
Which product sold the most last month?
Which student scored the highest in a class?
Sorting and ranking help answer these questions.
Sorting reorders your rows based on the values in one or more columns, so the highest or lowest values appear first. This makes it easy to identify top-performing items or see trends at a glance.
Ranking assigns a position to each row based on its value, letting you compare items numerically (1st, 2nd, 3rd, etc.) even if the values themselves aren’t unique.
Together, sorting and ranking are essential skills for beginners because they help you organize data meaningfully and prepare it for visualization, reporting, or further analysis.
In this tutorial, you’ll learn how to:
Sort a DataFrame by a single column.
Sort by multiple columns at once.
Rank rows to see their relative position.
Understand why and when to use sorting and ranking in real-world scenarios.
Step 1: Example Dataset
Let’s start with a small example dataset:
import pandas as pd
data = {
"Country": ["Canada", "USA", "Mexico", "UK", "Germany"],
"Continent": ["North America", "North America", "North America", "Europe", "Europe"],
"Population (millions)": [38, 331, 128, 67, 83]
}
df = pd.DataFrame(data)
print(df)
Country | Continent | Population (millions) |
---|---|---|
Canada | North America | 38 |
USA | North America | 331 |
Mexico | North America | 128 |
UK | Europe | 67 |
Germany | Europe | 83 |
Step 2: Sorting by a Single Column
Sorting is the simplest way to reorder your data to make comparisons easier.
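# Sort by population in ascending order (the default)
# The column name must match the DataFrame exactly, including "(millions)"
sorted_df = df.sort_values("Population (millions)")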
print(sorted_df)
Output:
Country | Continent | Population (millions) |
---|---|---|
Canada | North America | 38 |
UK | Europe | 67 |
Germany | Europe | 83 |
Mexico | North America | 128 |
USA | North America | 331 |
✅ Notice how the smallest population is now at the top, and the largest at the bottom.
If you want the largest values first, use ascending=False:
sorted_df = df.sort_values("Population (millions)", ascending=False)
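A descending sort is often paired with a "top N" selection. The snippet below is a minimal sketch of that pattern using the same example data; the variable name top3 is only illustrative. It also shows that sort_values returns a new DataFrame and leaves the original row order untouched.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Canada", "USA", "Mexico", "UK", "Germany"],
    "Continent": ["North America", "North America", "North America", "Europe", "Europe"],
    "Population (millions)": [38, 331, 128, 67, 83],
})

# Sort largest-first, keep the first three rows, and renumber the index from 0
top3 = (
    df.sort_values("Population (millions)", ascending=False)
      .head(3)
      .reset_index(drop=True)
)
print(top3)

# df itself is unchanged: its first row is still Canada
print(df.head(1))
```

Calling reset_index(drop=True) is optional. Without it, the rows keep their original index labels, which can be handy if you need to trace them back to the unsorted table.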
Step 3: Sorting by Multiple Columns
Sometimes, you need to sort by more than one column. For example, you might want to sort by Continent first and then by Population within each continent:
sorted_df = df.sort_values(
    ["Continent", "Population (millions)"],
    ascending=[True, False]  # Continent ascending, population descending
)
print(sorted_df)
Country | Continent | Population (millions) |
---|---|---|
Germany | Europe | 83 |
UK | Europe | 67 |
USA | North America | 331 |
Mexico | North America | 128 |
Canada | North America | 38 |
✅ This groups the rows by Continent, then orders countries by population within each group.
Step 4: Ranking Data
Ranking assigns a numerical position to each row based on its values. This is useful when you want to see relative standings, like a leaderboard.
# Rank 1 goes to the largest population because ascending=False
df["PopRank"] = df["Population (millions)"].rank(ascending=False)
print(df)
Country | Continent | Population (millions) | PopRank |
---|---|---|---|
Canada | North America | 38 | 5.0 |
USA | North America | 331 | 1.0 |
Mexico | North America | 128 | 2.0 |
UK | Europe | 67 | 4.0 |
Germany | Europe | 83 | 3.0 |
✅ The highest population (USA) is ranked 1, Mexico 2, and so on. The ranks are floating-point numbers because, by default, tied values each receive the average of the positions they span (method="average").
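The example dataset has no ties, so the tie-handling behavior is easier to see on a small, made-up table. The sketch below uses hypothetical student scores (the names and values are invented for illustration) and compares the default method="average" with two other options that rank() accepts, "min" and "dense".

```python
import pandas as pd

# Hypothetical data: Ben and Cara share the same score
scores = pd.DataFrame({
    "Student": ["Ana", "Ben", "Cara", "Dev"],
    "Score": [90, 85, 85, 70],
})

# Default: tied rows get the average of the positions they occupy (2 and 3 -> 2.5)
scores["RankAverage"] = scores["Score"].rank(ascending=False)

# "min": tied rows share the lowest position, and the next rank is skipped
scores["RankMin"] = scores["Score"].rank(ascending=False, method="min")

# "dense": tied rows share a position, and no rank is skipped afterwards
scores["RankDense"] = scores["Score"].rank(ascending=False, method="dense")

print(scores)
```

For a leaderboard, "min" gives the familiar 1, 2, 2, 4 pattern, while "dense" gives 1, 2, 2, 3. Which one fits depends on how you want to report ties.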
Step 5: Why Sorting and Ranking Matter
Sorting and ranking are foundational analysis tools:
Sorting helps you quickly identify top or bottom performers.
Ranking gives a clear numerical comparison across rows.
Combined with filtering, these operations let you narrow, organize, and interpret your data efficiently.
Real-world examples:
Sort products by sales to see the best sellers.
Rank students by test scores to determine class standings.
Compare populations by continent to analyze global trends.
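As one possible way to approach the last example, ranks can be computed separately inside each continent. This goes slightly beyond the steps above: it assumes you are comfortable with groupby, and the column name RankInContinent is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Canada", "USA", "Mexico", "UK", "Germany"],
    "Continent": ["North America", "North America", "North America", "Europe", "Europe"],
    "Population (millions)": [38, 331, 128, 67, 83],
})

# Rank each country against the other countries on the same continent
df["RankInContinent"] = (
    df.groupby("Continent")["Population (millions)"].rank(ascending=False)
)

# Sort so each continent's rows sit together, ordered by their within-group rank
print(df.sort_values(["Continent", "RankInContinent"]))
```

Here Germany and the USA both get rank 1, because each is the most populous country within its own continent in this small sample.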
Step 6: Best Practices for Beginners
Always check your column names and data types before sorting.
Use meaningful names for your DataFrames, like population_df, instead of df.
Combine filtering + sorting to focus on specific subsets (e.g., countries in North America with a population above 100 million).
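The practices above can be sketched together in a few lines. This is a minimal illustration on the tutorial's example data, not a required workflow; the names population_df and large_na are just suggestions.

```python
import pandas as pd

population_df = pd.DataFrame({
    "Country": ["Canada", "USA", "Mexico", "UK", "Germany"],
    "Continent": ["North America", "North America", "North America", "Europe", "Europe"],
    "Population (millions)": [38, 331, 128, 67, 83],
})

# Check the exact column names and data types before sorting
print(population_df.columns.tolist())
print(population_df.dtypes)

# Filter first (North America, population above 100 million), then sort largest-first
is_north_america = population_df["Continent"] == "North America"
is_large = population_df["Population (millions)"] > 100
large_na = population_df[is_north_america & is_large].sort_values(
    "Population (millions)", ascending=False
)
print(large_na)
```

Checking the column names first would catch a mismatch such as writing "Population" when the column is actually called "Population (millions)", which otherwise raises a KeyError.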
✅ Next Steps: After sorting and ranking, you may notice your column names aren’t clear or some columns aren’t in the right format. Cleaning your columns and fixing data types is the next step.
👉 Read the next tutorial: Renaming Columns and Changing Data Types in Pandas