COMMONSENSE NOTES // By IDRIS B. ODUNEWU
The Future of Open Data in the Age of AI: Safeguarding Public Assets Amid Growing Private Sector Demands
AI offers immense potential, but that potential must be realized within a framework that protects the public’s right to its own information. The open data movement must evolve to meet this new challenge—not retreat from it.
The proliferation of artificial intelligence (AI) technologies over the past decade has ushered in new possibilities across virtually every sector. From healthcare and finance to transportation and climate modeling, AI has demonstrated the transformative potential of data-driven innovation. However, with this rapid growth comes a new and underappreciated threat to a foundational pillar of democratic governance and public innovation: open government data.
As a former member of the U.S. Government’s Data.gov team, I spent four years supporting open government data and witnessing firsthand the societal value it can offer. These datasets have catalyzed startups, empowered civil society, fueled academic research, and enabled transparency in government decision-making. But as private AI companies increasingly seek exclusive access to vast data reservoirs to feed their models, there is a growing concern that government agencies may succumb to financial, logistical, or political pressures to monetize these assets. If left unaddressed, this trend could erode the open data movement and compromise the public’s right to access its own data.
The AI Data Dilemma
Training, fine-tuning, and updating AI models requires massive and continuous streams of data. While some data can be generated synthetically or scraped from the web, much of the most valuable and structured data originates from government sources. These include environmental records, transportation statistics, census data, health outcomes, satellite imagery, education performance, and more.
Private firms recognize the value of these datasets not just for building products but also for establishing competitive moats. The data’s credibility, comprehensiveness, and longitudinal nature make it especially appealing for training AI systems with real-world applicability. As a result, there is an increasing temptation for governments to treat these assets not as public goods, but as monetizable resources.
Examples of Commercialization Pressures
There are already warning signs. Some government agencies around the world have begun entering into exclusive or restricted-access agreements with private firms. For instance, some weather and geospatial agencies have licensed data to commercial platforms, creating tiered access or delaying the release of full datasets. In certain sectors, APIs that were once free have become fee-based, with premium access options available to corporations while limited or outdated datasets remain accessible to the public.
These models may seem financially prudent, especially in an era of budgetary constraints, but they risk undermining decades of progress in open data policy. Worse, they can entrench inequalities in access: large corporations can afford premium access, while small businesses, nonprofits, journalists, and academics are left behind.