What is open data and why should you use it?

Open Data has become one of the most important resources for analysts, researchers, journalists, and everyday problem-solvers. Governments, agencies, and public organizations around the world publish thousands of datasets for anyone to download—free of charge and with permissive licensing. Whether you’re exploring crime trends, studying environmental change, or building a local app that maps bus arrivals, Open Data is often the easiest way to get reliable, authoritative information.

Our team relies on publicly available data to create the charts and maps in our stories. Among the most useful resources we’ve come across are Open Data portals, which contain datasets published by different levels of government and by international organizations such as the World Bank, the OECD, and the United Nations. We’ve compiled a list of sources to help you find data for your next project. Before we dive into the sources, let’s explore what open data is, why it is made available, and who uses it.

What is open data?

Open data is digital data that anyone can freely use, modify, and share for any purpose, including commercial purposes. It is usually released by:

  • National, provincial/state, and municipal governments

  • Public agencies (e.g., transit, health, environment, economic development)

  • International organizations (e.g., World Bank, UN, OECD)

  • Academic or non-profit institutions

To be considered “open,” data typically must:

  1. Be free to access

  2. Be machine-readable

  3. Use a license allowing reuse (e.g., Open Government License, Creative Commons)

The goal of Open Data is to improve transparency, innovation, public engagement, and evidence-based decision-making.

By making valuable datasets publicly available, governments and organizations such as the City of Toronto, the World Bank, and the OECD enable people to explore and analyze data, which leads to better decision-making and problem-solving. From weather reports and traffic data to crime statistics and socio-economic indicators, the volume of open data available is staggering.

Why is this data available?

Public institutions collect enormous amounts of information every day—about transportation, health, finance, demographics, the environment, and more. Historically, much of this information sat in internal systems, accessible only to government staff or available to the public only through lengthy formal requests. Open Data shifts this model by treating data as a public asset that should benefit everyone. By making datasets available proactively, governments enable the public, businesses, researchers, and community groups to extract value from information that would otherwise remain locked away. Open Data serves multiple public and economic purposes:

Promote Transparency & Accountability

Governments release open data to build trust with the public and demonstrate how decisions are made. By publishing datasets on budgets, crime statistics, public health, or procurement, institutions make it possible for anyone to verify claims or analyze how resources are allocated. For example, the City of New York publishes thousands of datasets through NYC Open Data, allowing residents to explore everything from 311 service requests to school performance. This transparency helps reduce misinformation, encourages evidence-based debate, and gives citizens a clearer picture of how public agencies are performing.

Support Economic Growth & Innovation

Open data fuels entrepreneurship by giving companies and developers access to information that would otherwise be expensive or difficult to obtain. Startups often rely on demographic, transportation, or environmental data to identify market opportunities, refine business strategies, or build new digital tools. A well-known example is the use of transit data from Transport for London, which led to hundreds of apps that help commuters plan trips, check delays, and navigate the city more efficiently. These innovations—enabled by freely available data—create economic value that far surpasses the cost of maintaining the open data program itself.

Improve Public Services

When governments share data openly, they also create opportunities to improve the quality and effectiveness of public services. Agencies can combine their own data with external analyses performed by researchers, nonprofits, and private companies to identify gaps or inefficiencies. For example, open transit and mobility data published by the European Union Open Data Portal has helped cities across Europe assess congestion patterns and improve public transportation planning. Citizen-built tools—like visualizations of road safety hotspots—often highlight issues faster than traditional reporting channels, giving agencies clearer signals on where to intervene.

Enable Data-Driven Policymaking

Open data ensures that policymakers are not the only ones with access to critical information. Researchers, analysts, and advocacy groups can use the same datasets to evaluate whether policies are working and propose alternative approaches. For example, public health data released by Health Canada has been instrumental for academics studying regional health outcomes, prescription trends, and environmental health risks. Because the underlying information is accessible to all stakeholders, policy debates become more evidence-based, collaborative, and rigorous.

Enhance Civic Engagement

Civic groups, neighbourhood associations, and nonprofits rely on open data to participate more effectively in public conversations. Access to datasets on air quality, housing affordability, or infrastructure spending gives citizens the ability to advocate for changes that matter to their communities. In many cities, open data portals have enabled residents to create tools that visualize local issues—such as mapping potholes, tracking tree-planting initiatives, or illustrating how long it takes for public works crews to respond to service requests. These community-built tools often influence local priorities and strengthen dialogue between citizens and government agencies.

Reduce Administrative Burden

Publishing data openly also reduces the workload of government employees. Instead of responding to repeated information requests from journalists, researchers, or the public, agencies can direct people to datasets that are already available online. This helps staff spend less time retrieving files or preparing custom reports and more time on strategic work. A good example is the widespread adoption of open budget dashboards, such as those published through Data.gov, which provide immediate answers to many financial inquiries that would otherwise require manual processing. Over time, this reduces redundancy and improves efficiency across government departments.

Why People Use Open Data

Open Data is valuable because it allows anyone—regardless of their background or technical skill—to explore information that was once locked away in government systems. People use open data for different reasons, whether it’s to answer a research question, build a new tool, understand trends in their community, or make more informed decisions at work. By providing free access to high-quality datasets, Open Data portals give analysts, developers, journalists, businesses, students, and community groups the ability to work with the same information that public institutions rely on internally. This levels the playing field and helps turn raw numbers into insights that benefit individuals, organizations, and entire communities.

For Data Analysts & Researchers

Data analysts and researchers use open data because it provides a reliable foundation for analysis, modeling, and evidence-based insights. Because many open datasets come directly from government agencies, they are often collected consistently and accompanied by documentation and metadata, although quality varies by publisher. This makes them well suited to exploring trends, evaluating policy impacts, and building dashboards or forecasting models. Analysts working in the private sector use this information to supplement their internal datasets, validate assumptions, or identify external factors that may influence business performance. For students and academics, open data provides authentic datasets for practicing real-world analytical skills without licensing barriers.

For Developers

Developers rely on open data—especially APIs and geographic datasets—to build apps, tools, and digital services that solve everyday problems. Transit, traffic, and weather data are commonly used to power applications that provide real-time updates to millions of users. Open geospatial files (like GeoJSON or shapefiles) help developers build interactive maps, routing tools, and visualizations. With unrestricted access, developers can experiment, prototype, and test ideas without negotiating contracts or paying for proprietary data. This lowers the barrier to innovation and enables small teams or independent builders to create tools that have significant public value.

For Journalists

Journalists use open data to investigate issues, fact-check public claims, and bring transparency to government decisions. Because datasets like crime statistics, budgets, and procurement records are authoritative and time-stamped, they provide an evidence-based foundation for reporting. Open data allows journalists to identify patterns, highlight discrepancies, and tell data-driven stories that shape public understanding. In an era where misinformation spreads quickly, open data also acts as a neutral reference point journalists can use to verify information and provide context for readers.

For Businesses

Businesses, large and small, use open data to inform strategy, understand local markets, and identify new opportunities. Demographic data helps companies choose store locations; traffic volumes help logistics companies plan routes; building permits help real estate developers forecast growth; and environmental data helps firms evaluate operational risks. Because open data is free, small businesses and startups can access the same foundational information as large enterprises, which levels the playing field. Many firms also integrate open government APIs into their own products, creating value-added services without needing to build the underlying data infrastructure themselves.

For Students & Educators

Students and educators use open data because it provides real, high-quality datasets for learning data skills in a practical context. Instead of working with fictional samples or outdated worksheets, students can analyze real census data, study climate trends, or build dashboards using official government statistics. This helps learners develop analytical thinking, research methodology, and data literacy using the same types of datasets they’ll encounter in professional settings. Educators also use open data to design assignments that reflect real-world questions, making learning more engaging and directly transferable to the workplace.

For Community Groups

Community groups use open data to advocate for improvements in neighbourhoods and public services. Crime data may support campaigns for safer street design; transit wait times may justify route enhancements; environmental monitoring can highlight pollution hotspots; and service request data can identify inequality in how quickly issues are resolved across neighbourhoods. Open data empowers residents with the same information governments use internally, helping communities hold institutions accountable and propose solutions backed by evidence rather than anecdotes. This allows grassroots groups to participate meaningfully in civic decision-making.

Open data sources

Our team has compiled a list of 54 sources, grouped into four regions (Americas, Asia-Pacific, Europe, and Global), that provide free, publicly available datasets you can use for personal or business purposes. Most of the datasets are released under an open data license. We’ve organized the sources by region and topic and included an overview of the file formats available for download on each website.

We’ll continue to add sources to the list, so check back often. If there is a source you’d recommend adding, please let us know in the comments section below.

The file formats you’ll encounter most often are:

  • CSV: A comma-separated values (CSV) file stores tabular data (numbers and text) in plain text, where each line typically represents one data record. CSV files can be opened with a spreadsheet application such as Excel, Google Sheets, or OpenOffice Calc.

  • XLS/XLSX: An XLS or XLSX file is a spreadsheet created by Microsoft Excel or exported by another spreadsheet program, such as OpenOffice Calc or Google Sheets.

  • JSON: A JSON file stores simple data structures and objects in JavaScript Object Notation (JSON), a standard data interchange format. Because it is a text file, you can open it in any text editor or in a code editor such as Visual Studio Code.

  • GeoJSON: A GeoJSON file is a JSON file that stores geographical data such as points, lines, and polygons. It is a common and compact format for exchanging spatial data between applications, and it can be opened with geospatial software such as ArcGIS or QGIS.

  • XML: An XML (Extensible Markup Language) file is a text-based data file. You can view one in any text editor or web browser; for more comfortable editing, use a dedicated XML editor such as Microsoft XML Notepad.

  • SHP: A shapefile (SHP) stores a collection of geographic features, such as streets, points of interest, and zip code boundaries, as point, line, or area features. You can open shapefiles with mapping software such as QGIS or ArcGIS.

  • KML: A KML file stores geographic modeling information in the Keyhole Markup Language (KML), including placemarks, points, lines, polygons, and images. KML files can be opened with Google Earth and other mapping software.
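To make the difference between the text-based formats concrete, here is a minimal Python sketch that reads a CSV table and a GeoJSON file using only the standard library. The sample content, station names, and function names are made up for illustration; a real download from a portal would be read from a file rather than from an in-memory string.

```python
import csv
import io
import json

# Hypothetical sample content; a real portal download would be read from a file.
CSV_TEXT = "station,riders\nUnion,1200\nKing,800\n"
GEOJSON_TEXT = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-79.38, 43.65]},
            "properties": {"name": "Union"},
        }
    ],
})


def read_csv_records(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_geojson_points(text):
    """Return (name, longitude, latitude) for each Point feature."""
    collection = json.loads(text)
    points = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] == "Point":
            # GeoJSON stores coordinates as [longitude, latitude].
            lon, lat = geometry["coordinates"][:2]
            points.append((feature["properties"].get("name"), lon, lat))
    return points


records = read_csv_records(CSV_TEXT)
points = read_geojson_points(GEOJSON_TEXT)
```

Note that every CSV value arrives as text, so numeric columns need to be converted before you do any arithmetic on them. Binary formats such as XLSX and SHP generally need a third-party library or dedicated software.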

Benefits of open data

Open data offers numerous benefits that extend to various sectors and stakeholders. Some of the key benefits include:

Transparency and Accountability: Open data enhances transparency in government activities, corporate practices, and other sectors. It holds institutions accountable for their actions and decisions by providing accessible information to the public.

Informed Decision-Making: Open data empowers individuals, organizations, and governments to make well-informed decisions based on accurate and up-to-date information. It aids in evidence-based policy formulation and strategic planning.

Research and Innovation: Researchers can access diverse datasets to conduct studies, validate hypotheses, and generate new insights. Open data fuels innovation by enabling collaboration and the development of new ideas.

Public Services Improvement: Governments can use open data to identify areas for improvement in public services, leading to enhanced service delivery, resource allocation, and responsiveness to citizens' needs.

Citizen Empowerment: Open data provides citizens with the tools to understand their communities, engage in civic discussions, and participate in public affairs. This promotes active citizenship and democratic engagement.

Challenges of working with Open Data

When working with open data, you may face several challenges. The quality of the information depends largely on the organization providing it. For example, organizations like the World Bank, IMF, and Statistics Canada offer datasets in structured formats that are easy to download and use without additional processing.

In contrast, the Ontario Government’s open data portal includes data from various departments with little consistency between datasets. This makes it difficult to combine information from multiple sources.

When using open data, carefully examine each dataset to check its completeness and accuracy. Review any available notes or metadata for details on how the data was collected, the time periods covered, and the last update date. This information helps you understand the dataset’s quality.
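One quick way to gauge completeness is to count the missing cells in each column before you start any analysis. The sketch below does this with Python's standard library. The column names, the sample rows, and the list of placeholder values treated as "missing" are all assumptions for illustration, so adjust them to match the dataset you are inspecting.

```python
import csv
import io

# Hypothetical extract with gaps; the column names are made up for illustration.
SAMPLE = "ward,population,median_income\n1,52000,61000\n2,,58000\n3,47000,\n4,N/A,\n"

# Assumed placeholders that publishers sometimes use for missing values.
MISSING_MARKERS = {"", "n/a", "na", "null", "-"}


def missing_counts(text):
    """Return the number of data rows and a per-column count of missing cells."""
    reader = csv.DictReader(io.StringIO(text))
    counts = {name: 0 for name in reader.fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            # Short rows yield None for absent cells, so treat None as empty.
            value = (row[name] or "").strip().lower()
            if value in MISSING_MARKERS:
                counts[name] += 1
    return total, counts


total_rows, gaps = missing_counts(SAMPLE)
```

A column with a high share of gaps is a signal to read the metadata closely, since the gaps may reflect suppressed values, a change in collection method, or a period the dataset does not cover.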

Many open data portals do not use consistent formats. Differences in column names, date formats, naming conventions, and number displays are common, even within datasets on similar topics. As a result, you often need to clean and standardize data before it can be used.
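The kind of cleanup described above can be sketched as a few small helper functions. This is only an illustration under stated assumptions: the helper names are hypothetical, the list of date layouts is a guess at what a portal might use, and the number parser assumes commas and spaces are thousands separators, which is not true for every locale.

```python
import re
from datetime import datetime

# Assumed date layouts; extend this list for the portals you actually use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")


def normalize_column(name):
    """Convert a header such as 'Total Riders (2023)' to 'total_riders_2023'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def normalize_date(value):
    """Return an ISO 8601 date string, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_number(value):
    """Parse numbers written like '1,234.5' or '1 234.5' into a float.

    Assumes commas and spaces are thousands separators.
    """
    cleaned = value.replace(",", "").replace(" ", "")
    return float(cleaned)
```

Dates such as 05/03/2023 are ambiguous between day-first and month-first conventions. The sketch assumes day-first, so confirm the convention in the dataset's metadata before relying on the result.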

Ultimately, how consistently datasets are organized depends on the releasing organization and its data management practices.

In summary, open data promotes transparency, encourages innovation, empowers citizens, and supports positive changes in society, the economy, and the environment. If you have suggestions for additional sources to include on our list, please share them in the comments. Thank you!

 
 

FWD EDITORS

We’re a team of data enthusiasts and storytellers. Our goal is to share stories we find interesting in hopes of inspiring others to incorporate data and data visualizations in the stories they create.
