Top 19 'Story-Finding' Free Public Datasets to Master for Aspiring Data Journalists in 2025
In the world of data journalism, the story is everything. It's the beating heart behind the spreadsheets, the human narrative hidden within the rows and columns. But before you can craft that compelling, data-driven story, you need the most crucial ingredient: the data itself. For aspiring journalists in 2025, the challenge isn't a lack of information—it's navigating the vast ocean of public data to find the islands of insight.
Knowing where to look is half the battle. The best data journalists aren't just masters of Python or Tableau; they are digital archivists, modern-day explorers who know which portals hold the most valuable, story-rich information. They understand that a dataset isn't just a collection of numbers, but a snapshot of our society, our economy, our health, and our planet. Mastering these sources is the first giant leap toward producing impactful, award-winning work.
This guide is your treasure map. We've curated 19 of the most powerful, free, and publicly available datasets that are brimming with untold stories. Whether you're interested in local crime, global economic trends, or the future of our climate, this list will serve as your launchpad. Let's dive in and uncover the data that will define the journalism of tomorrow.
1. The World Bank Open Data
The World Bank is a go-to source for global development data. It provides a massive collection of time-series data on everything from GDP and inflation to poverty rates, health indicators, and access to education across hundreds of countries. Its datasets are clean, well-documented, and often span several decades, making them perfect for stories about long-term trends.
This is the place to ask big-picture questions. How has female labor force participation changed in Southeast Asia over the last 30 years? Is there a correlation between a country's investment in renewable energy and its economic growth? The World Bank's data allows you to compare and contrast nations, revealing global patterns and regional outliers that make for powerful, context-rich journalism.
Story-Finding Tip: Use the World Development Indicators (WDI) to compare a specific metric, like "Access to electricity (% of population)," between two neighboring but economically different countries. Visualizing the divergence over time can tell a potent story about infrastructure, policy, and inequality.
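The comparison in the tip above can be sketched in a few lines of Python. This is a minimal illustration, not World Bank code: the country names and percentages are invented placeholders, and the helper name gap_by_year is hypothetical. In practice you would load the real series from a WDI CSV download.

```python
# Illustrative only: these values are invented placeholders, not World Bank figures.
# Replace them with the "Access to electricity (% of population)" series
# downloaded from the World Development Indicators.
access_pct = {
    "Country A": {2000: 40.0, 2010: 55.0, 2020: 80.0},
    "Country B": {2000: 38.0, 2010: 42.0, 2020: 50.0},
}

def gap_by_year(series, first, second):
    """Return {year: first minus second} for years present in both series."""
    shared_years = sorted(set(series[first]) & set(series[second]))
    return {year: round(series[first][year] - series[second][year], 1)
            for year in shared_years}

gaps = gap_by_year(access_pct, "Country A", "Country B")
for year, gap in gaps.items():
    print(f"{year}: gap of {gap} percentage points")
```

A gap that widens steadily is the "divergence" worth charting. A gap that stays flat suggests the two countries are moving together, which can be a story of its own.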
2. ProPublica Data Store
Created by journalists, for journalists. ProPublica, a leader in investigative reporting, makes many of the datasets from its award-winning stories available to the public for free. These aren't raw, messy government files; they are often cleaned, processed, and structured specifically for journalistic inquiry.
Here you'll find data on topics like political ad spending, lobbying, healthcare discrepancies, and corporate accountability. Because these datasets have already been used for major investigations, they serve a dual purpose: you can find your own unique angles within them, or you can study them to understand how top-tier journalists structure their data-driven projects. This is a principle Goh Ling Yong champions: data journalism isn't just about finding data; it's about structuring it for maximum impact.
Story-Finding Tip: Explore the "Dollars for Docs" dataset to investigate payments from pharmaceutical companies to doctors in your local area. Are there patterns? Do the top-paid doctors prescribe certain drugs more often? This can be a powerful local story.
3. U.S. Census Bureau
The U.S. Census Bureau is the foundation of American demographic data. It’s far more than just a once-a-decade population count. The American Community Survey (ACS) provides incredibly detailed annual data on income, housing, education, ancestry, and language spoken at home, down to the neighborhood (census tract) level.
This is your ultimate tool for stories about community change. You can track gentrification by looking at shifts in median income and rent prices in specific neighborhoods. You can analyze educational attainment disparities across different racial groups in your city. The granularity of Census data allows for hyperlocal stories that resonate deeply with your audience.
Story-Finding Tip: Use the 5-Year ACS estimates to compare commute times in your city's downtown core versus its suburbs over the last decade. Has remote work changed the landscape? Is public transit usage shifting? This data holds the answers.
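ACS figures are survey estimates, and each one is published with a margin of error. Before reporting that commute times "rose" or "fell," it helps to check whether the change is larger than the sampling noise. The sketch below uses the common approach of combining the two margins of error. The numbers are invented placeholders and the function name is hypothetical. It also assumes the two estimates come from 5-year periods that do not overlap.

```python
import math

def change_is_significant(est_old, moe_old, est_new, moe_new):
    """Compare two ACS estimates using their published margins of error.

    The margin of error of a difference is approximated as the square root
    of the sum of the squared margins; the change is treated as significant
    when it is larger than that combined margin.
    """
    diff = est_new - est_old
    moe_diff = math.sqrt(moe_old ** 2 + moe_new ** 2)
    return diff, moe_diff, abs(diff) > moe_diff

# Invented placeholder values: mean commute in minutes for one tract,
# taken from two non-overlapping 5-year periods.
diff, moe_diff, significant = change_is_significant(27.5, 1.5, 31.0, 2.0)
print(f"change: {diff:+.1f} min, margin: ±{moe_diff:.1f}, significant: {significant}")
```

If the change is smaller than the combined margin, the honest framing is that the data cannot distinguish it from no change at all.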
4. Our World in Data
If you need to add historical context to any global story, Our World in Data is your secret weapon. Based at the University of Oxford, it brings together data from hundreds of sources to create clear, interactive visualizations on a vast range of topics—from global poverty and disease to CO2 emissions and political systems.
What makes it unique is its focus on the "long view." You can track literacy rates not just for 10 years, but for two centuries. This allows you to frame today's headlines within a much broader historical context, adding depth and perspective to your reporting. Every chart is downloadable as a CSV file, giving you the raw data behind their beautiful visualizations.
Story-Finding Tip: Use the data on child mortality to create a timeline for a specific country. Juxtapose it with major historical events in that country—like the introduction of a vaccine program or the end of a conflict—to tell a powerful story about public health progress.
5. FiveThirtyEight's Data
Similar to ProPublica, the data-driven news site FiveThirtyEight generously shares the data and code behind many of its articles. Their repository on GitHub is a goldmine for aspiring data journalists, covering everything from politics and sports to science and culture.
The datasets are often quirky, timely, and perfectly formatted for analysis. You might find data on every Marvel movie, polling averages for an upcoming election, or an analysis of flight safety records. Studying their data is like getting a behind-the-scenes look at how a professional data journalism team thinks, questions, and explores a topic.
Story-Finding Tip: Dive into their "Hate Crimes" dataset, which collates state-level figures from the FBI and the Southern Poverty Law Center. Instead of just looking at the national picture, look up your state and compare its rate of reported hate crimes before and after a major national event to see if there's a local impact.
6. FBI Crime Data Explorer
For any story touching on crime and justice in the United States, the FBI's Crime Data Explorer (CDE) is the authoritative source. It provides data on violent and property crimes reported by law enforcement agencies across the country. You can explore trends over time, compare cities, and break down data by specific offense types.
While it has limitations (it only includes crimes reported to the police), it's an essential starting point. You can investigate whether crime rates in your city are actually rising or falling, and whether the trend matches public perception. You can also explore "clearance rates" (the percentage of reported crimes that police close, usually through an arrest) to ask questions about police effectiveness.
Story-Finding Tip: Compare the clearance rates for homicide in your city versus the national average. If your city's rate is significantly lower, it could be the starting point for a deep investigation into police resources, community trust, and investigative practices.
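The arithmetic behind that comparison is simple enough to sketch. The counts below are invented placeholders, not FBI figures, and clearance_rate is a hypothetical helper name.

```python
def clearance_rate(cleared, reported):
    """Percentage of reported offenses that were cleared."""
    if reported == 0:
        return None  # no reported offenses, so the rate is undefined
    return 100.0 * cleared / reported

# Invented placeholder counts, not FBI figures.
city = clearance_rate(cleared=18, reported=60)
national = clearance_rate(cleared=500, reported=1000)
print(f"city: {city:.1f}%  national: {national:.1f}%  gap: {city - national:+.1f} points")
```

One caveat when reading the result: a case cleared this year may involve an offense reported in an earlier year, so a single year's rate for a small city can swing widely. Comparing several years at once is usually safer.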
7. Google Dataset Search
Think of this as a search engine specifically for datasets. Google Dataset Search indexes freely available data from thousands of repositories across the web, including government portals, academic institutions, and journalistic organizations.
This is the perfect tool to use when you have a specific topic in mind but don't know where the data might live. Searching for "electric vehicle adoption rates by state" or "farmland prices in the Midwest" can surface datasets you never would have found otherwise. It’s an essential tool for the discovery phase of any data project.
Story-Finding Tip: Use it for discovery. Try a broad search like "public park usage" and see what comes up. You might find a niche academic study or a city's open data portal with records that could fuel a story about urban green spaces.
8. World Health Organization (WHO) Global Health Observatory
The WHO's data portal is the definitive source for global health statistics. It covers everything from life expectancy and immunization rates to data on specific diseases like malaria and tuberculosis, as well as risk factors like tobacco use and obesity.
This dataset is crucial for putting health news into a global context. When a new disease emerges, you can use the WHO's data to compare its mortality rate to other, more established diseases. You can track a country's progress toward health-related Sustainable Development Goals or compare healthcare spending and outcomes across different regions.
Story-Finding Tip: Track the reported measles immunization coverage in a specific region over the past 15 years. Is there a dip? Does that dip correlate with the rise of anti-vaccination movements or a period of political instability? This can reveal a hidden public health crisis.
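Spotting the dip described above can start with a year-over-year difference. The following is a rough sketch with invented placeholder percentages, not WHO data. The function name is hypothetical.

```python
# Invented placeholder coverage values (% of children immunized), not WHO data.
coverage = {2010: 92, 2011: 93, 2012: 91, 2013: 84, 2014: 85, 2015: 90}

def largest_drop(series):
    """Return (year, change) for the steepest year-over-year decline, or None."""
    years = sorted(series)
    changes = [(later, series[later] - series[earlier])
               for earlier, later in zip(years, years[1:])]
    declines = [c for c in changes if c[1] < 0]
    return min(declines, key=lambda c: c[1]) if declines else None

print(largest_drop(coverage))
```

The year it returns is where the reporting begins, not where it ends. The drop still has to be matched against events on the ground before any cause is suggested.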
9. International Monetary Fund (IMF) Data
While the World Bank focuses on development, the IMF is the authority on the global financial system. Its datasets are invaluable for stories about national economies, government finance, international trade, and financial stability.
Here you can find detailed data on government debt, inflation rates, exchange rates, and a country's balance of payments. Are you writing about a country facing a debt crisis? The IMF data can show how large its debt is relative to its economy and how quickly it has grown. Reporting on inflation? You can compare your country's rate to those of its trading partners. This is dense, expert-level data that can power sophisticated economic journalism.
Story-Finding Tip: Use the IMF's Government Finance Statistics to track a country's public spending on education versus military expenditures over time. A shift in priorities can be a powerful political story.
10. NASA Earthdata
For any story about our planet, NASA's Earth Observing System Data and Information System (EOSDIS) is an incredible resource. It provides a vast archive of satellite data on climate change, air quality, land use, wildfires, and sea levels.
While some of this data is highly technical, many of its tools and portals (like NASA Worldview) make it accessible for journalists to find and visualize changes over time. You can literally show, not just tell, the effects of deforestation in the Amazon or the shrinking of a major lake. This is visual storytelling at its most powerful.
Story-Finding Tip: Use the satellite imagery to find before-and-after shots of an area affected by a major flood or wildfire. Then, use the underlying data to quantify the damage—how many square kilometers of forest were burned? How many homes were in the inundated area?
11. OpenCorporates
This is the largest open database of corporate data in the world. It provides information on millions of companies, including their jurisdiction, incorporation date, status, and, crucially, their directors and officers.
OpenCorporates is the starting point for almost any story involving corporate accountability or "following the money." You can investigate shell companies, map out the business interests of a politician's donors, or uncover networks of companies run by the same group of individuals. It's a foundational tool for investigative journalists.
Story-Finding Tip: Pick a prominent local business person or politician and search for their name in the "Officers" search. You may uncover directorships or companies they are associated with that are not widely known, leading to stories about potential conflicts of interest.
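Officer names are rarely spelled the same way twice, so a search like this usually needs some light name cleanup before records can be grouped. Below is a minimal sketch, assuming you have already exported officer records into a list of dictionaries. The field names, people, and companies are all fictional, and this does not use any OpenCorporates API.

```python
from collections import defaultdict

# Invented placeholder records; the names and companies are fictional.
officer_records = [
    {"officer": "Jane Q. Doe", "company": "Alpha Holdings Ltd"},
    {"officer": "JANE Q DOE",  "company": "Beta Ventures LLC"},
    {"officer": "John Roe",    "company": "Alpha Holdings Ltd"},
]

def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace for rough matching."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def companies_by_officer(records):
    """Group company names under each normalized officer name."""
    grouped = defaultdict(set)
    for rec in records:
        grouped[normalize(rec["officer"])].add(rec["company"])
    return {name: sorted(companies) for name, companies in grouped.items()}

links = companies_by_officer(officer_records)
multi = {name: cos for name, cos in links.items() if len(cos) > 1}
print(multi)
```

A matching name is a lead, not proof. Two records with the same name can belong to different people, so confirm identity through addresses, dates, or filings before publishing.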
12. Pew Research Center
For stories about public opinion, social trends, and demographic shifts, the Pew Research Center is an unparalleled resource. They conduct rigorous, non-partisan polling on a vast array of topics, from politics and technology to religion and media habits, and they make their raw datasets available for free.
This allows you to go beyond their excellent reports and conduct your own analysis. You can explore how different age groups or political affiliations view a specific issue. Their data provides the "why" behind the headlines, grounding your stories in the actual beliefs and behaviors of the public.
Story-Finding Tip: Download a dataset on a topic like "Trust in Media." Instead of just looking at the top-line numbers, segment the data by age. How does a 22-year-old's media consumption and trust differ from a 65-year-old's? This can be a fascinating story about a generational divide.
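Survey files usually ship with a weight for each respondent, and segmenting by age without applying it can distort the result. Here is a rough sketch of a weighted breakdown. The rows are invented placeholders and the field names (age_group, trusts_media, weight) are hypothetical, so check the dataset's codebook for the real ones.

```python
from collections import defaultdict

# Invented placeholder rows. Real survey files typically include a weight
# column; the field names used here are hypothetical.
responses = [
    {"age_group": "18-29", "trusts_media": True,  "weight": 1.0},
    {"age_group": "18-29", "trusts_media": False, "weight": 3.0},
    {"age_group": "65+",   "trusts_media": True,  "weight": 2.0},
    {"age_group": "65+",   "trusts_media": False, "weight": 2.0},
]

def weighted_share(rows, group_key, answer_key, weight_key="weight"):
    """Weighted percentage answering True to answer_key within each group."""
    yes = defaultdict(float)
    total = defaultdict(float)
    for row in rows:
        total[row[group_key]] += row[weight_key]
        if row[answer_key]:
            yes[row[group_key]] += row[weight_key]
    return {group: 100.0 * yes[group] / total[group] for group in total}

shares = weighted_share(responses, "age_group", "trusts_media")
print(shares)
```

In this toy data, half of each age group's respondents say yes, but the weighted share for the younger group is much lower. That gap between raw counts and weighted shares is why the weight column matters.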
13. Data.gov
This is the central clearinghouse for open data from the U.S. federal government. It's a massive portal that aggregates over 200,000 datasets from various agencies, covering topics as diverse as aviation safety, food inspections, and federal contracts.
The sheer scale can be intimidating, but its search and filtering tools are powerful. This is the place to look for data on government operations and accountability. You can analyze federal spending in your district, look up safety records for local infrastructure, or explore environmental compliance data for nearby factories.
Story-Finding Tip: Search for federal assistance award data (historically published as the "Federal Assistance Award Data System," or FAADS, and now available through USAspending.gov) to see where grant money is flowing. Filter by your state or city to see which organizations are receiving the most federal funding and for what purpose.
14. UN Comtrade Database
For any story involving international trade, the United Nations Comtrade database is the gold standard. It contains detailed import and export statistics for goods, reported by over 200 countries, with data going back decades.
You can use it to track the flow of specific goods—from smartphones to soybeans—between any two countries. This is essential for reporting on trade wars, supply chain disruptions, or the economic impact of international sanctions. It allows you to quantify the real-world effects of global economic policy.
Story-Finding Tip: Pick a major product your country is known for exporting (e.g., coffee from Colombia, cars from Germany). Use UN Comtrade to see how the list of top importing countries for that product has changed over the last 20 years. This can tell a story about shifting geopolitical and economic alliances.
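Once the trade values are downloaded, comparing the top buyers in two years is a short ranking exercise. The sketch below uses invented placeholder values rather than UN Comtrade figures, and the function and variable names are hypothetical.

```python
# Invented placeholder trade values (USD millions), not UN Comtrade figures.
imports_by_partner = {
    2004: {"Country A": 900, "Country B": 700, "Country C": 300, "Country D": 100},
    2024: {"Country A": 600, "Country B": 200, "Country C": 800, "Country D": 750},
}

def top_partners(values, n=3):
    """Names of the n partners with the largest trade value, largest first."""
    return sorted(values, key=values.get, reverse=True)[:n]

early = top_partners(imports_by_partner[2004])
late = top_partners(imports_by_partner[2024])
newcomers = [p for p in late if p not in early]
dropped = [p for p in early if p not in late]
print("2004:", early)
print("2024:", late)
print("entered top 3:", newcomers, "left top 3:", dropped)
```

The countries that enter or leave the top of the list are the natural starting points for interviews and further reporting.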
15. The Marshall Project - The Record
Focused on the U.S. criminal justice system, The Marshall Project is another non-profit news organization that produces incredible data journalism. "The Record" is their collection of datasets on everything from prison populations and parole rates to police misconduct.
Like ProPublica, these datasets are often cleaned and ready for analysis. They provide the raw material for stories that shed light on the inequities and complexities of the criminal justice system. Exploring this data can help you move beyond anecdotal evidence to report on systemic issues.
Story-Finding Tip: Explore their "Testify" project, which examines court records from Cuyahoga County, Ohio. You can analyze trends: What are the most common charges? Do outcomes vary from one judge to another?
16. Kaggle Datasets
While Kaggle is primarily known as a platform for data science competitions, its Datasets section is a treasure trove of fascinating and often well-documented data. You'll find a mix of classic datasets, user-submitted data, and data from major organizations.
The key advantage of Kaggle is its community. Datasets often come with public "notebooks" where other users have already started exploring the data, providing inspiration and code you can learn from. It’s a great place to practice your skills on interesting, real-world data, from Netflix viewing habits to data on global UFO sightings.
Story-Finding Tip: Find a dataset on a topic you're passionate about, like "Board Game Geek Rankings" or "Spotify's Top Hits." Practice your analysis and visualization skills here on a low-stakes, fun topic. This "training" will prepare you for more serious journalistic projects.
17. International Consortium of Investigative Journalists (ICIJ) Data
The ICIJ is the organization behind massive, global investigations like the Panama Papers, Paradise Papers, and Pandora Papers. While they don't release all the raw data for privacy reasons, they often release structured data on the networks of companies and individuals involved.
Exploring this data is a masterclass in understanding complex offshore financial networks. Even if you don't use it for a direct story, studying how this data is structured can help you understand how to "follow the money" in your own investigations. It’s more for inspiration and learning than for finding a quick local story, but it’s invaluable for any serious investigative journalist.
Story-Finding Tip: Use the Offshore Leaks Database to search for names of prominent individuals or companies from your country. A hit could be the starting point for a major investigation into tax avoidance or hidden wealth.
18. CDC WONDER
The Centers for Disease Control and Prevention's Wide-ranging Online Data for Epidemiologic Research (WONDER) is a portal to an incredible amount of U.S. public health data. Its most powerful feature is the detailed mortality data, which allows you to query the underlying causes of death by year, state, age, race, and gender.
This is the definitive source for stories about public health trends. You can track the rise of "deaths of despair" (suicide, overdose) in your state, compare cancer mortality rates across different demographic groups, or analyze the impact of a public health intervention over time. The level of detail is extraordinary.
Story-Finding Tip: Use the Compressed Mortality File to compare the leading causes of death for 18- to 25-year-olds in your state in the most recent year available versus 20 years earlier. The shift in causes can tell a dramatic story about public health, societal changes, and emerging threats.
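Raw death counts from two different decades are not directly comparable, because the population changes too. Converting counts to a rate per 100,000 people is the usual first step. The numbers in this sketch are invented placeholders, not CDC figures.

```python
def rate_per_100k(count, population):
    """Crude rate: events per 100,000 people."""
    return 100_000 * count / population

# Invented placeholder numbers, not CDC figures.
then = rate_per_100k(count=120, population=800_000)
now = rate_per_100k(count=210, population=1_000_000)
pct_change = 100 * (now - then) / then
print(f"then: {then:.1f}  now: {now:.1f}  change: {pct_change:+.1f}%")
```

In this made-up example the count rose 75 percent while the rate rose 40 percent. Reporting the count alone would overstate the change.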
19. Your Local City/State Open Data Portal
Last but certainly not least, some of the most impactful stories are found right in your own backyard. Many cities and states now have their own open data portals, offering everything from restaurant health inspection scores and 311 service requests to city employee salaries and building permits.
This is where you can find data that directly affects your readers' daily lives. Is the city filling potholes faster in wealthier neighborhoods? Are certain restaurants consistently failing health inspections? Who are the highest-paid public employees in your town? These portals are full of stories waiting to be told, and reporting built on them consistently ranks among the highest-impact work in local data journalism.
Story-Finding Tip: Start simple. Download the 311 service request data for your city. Use a pivot table to find the most common complaints by neighborhood. This simple analysis can reveal disparities in city services and provide the foundation for a compelling local story.
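If you would rather script the pivot table than build it in a spreadsheet, the same count can be sketched with Python's standard library. The records and field names below are invented placeholders. Real 311 exports use different column names from city to city, and you would typically load them with csv.DictReader.

```python
from collections import Counter

# Invented placeholder records. Field names are hypothetical and will
# differ from one city's portal to another.
requests = [
    {"neighborhood": "Northside", "complaint": "Pothole"},
    {"neighborhood": "Northside", "complaint": "Pothole"},
    {"neighborhood": "Northside", "complaint": "Noise"},
    {"neighborhood": "Riverside", "complaint": "Noise"},
    {"neighborhood": "Riverside", "complaint": "Noise"},
    {"neighborhood": "Riverside", "complaint": "Graffiti"},
]

def top_complaint_by_neighborhood(rows):
    """Return {neighborhood: (most common complaint, count)}."""
    counts = {}
    for row in rows:
        counts.setdefault(row["neighborhood"], Counter())[row["complaint"]] += 1
    return {hood: counter.most_common(1)[0] for hood, counter in counts.items()}

print(top_complaint_by_neighborhood(requests))
```

Raw counts favor large neighborhoods, so dividing by population before comparing areas is usually worth the extra step.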
Your Story Awaits
This list is just the beginning. The world of open data is vast and constantly expanding. The true skill of a data journalist in 2025 isn't memorizing every possible source, but developing a curious mindset and the persistence to hunt for the right data to answer a compelling question.
Your journey starts now. Pick one dataset from this list that sparks your interest. Don't worry about writing the perfect story right away. Just open the file. Explore its columns. Ask it questions. See what patterns emerge. The most powerful, data-driven journalism begins with a single step: a simple curiosity to look at the numbers and wonder what story they're trying to tell.
What's the first dataset you're going to explore? Do you have a favorite source that we missed? Share your thoughts and ideas in the comments below!
About the Author
Goh Ling Yong is a content creator and digital strategist sharing insights across various topics.