Under the hood of Canada Spends with Brendan Samek
9th December 2025
I talked to Brendan Samek about Canada Spends, a project from Build Canada that makes Canadian government financial data accessible and explorable using a combination of Datasette, a neat custom frontend, Ruby ingestion scripts, sqlite-utils and pieces of LLM-powered PDF extraction.
Here’s the video on YouTube.
Sections within that video:
- 02:57 Data sources and the PDF problem
- 05:51 Crowdsourcing financial data across Canada
- 07:27 Datasette demo: Search and facets
- 12:33 Behind the scenes: Ingestion code
- 17:24 Data quality horror stories
- 20:46 Using Gemini to extract PDF data
- 25:24 Why SQLite is perfect for data distribution
Build Canada and Canada Spends
Build Canada is a volunteer-driven non-profit that launched in February 2025—here’s some background information on the organization, which has a strong pro-entrepreneurship and pro-technology angle.
Canada Spends is their project to make Canadian government financial data more accessible and explorable. It includes a tax sources and sinks visualizer and a searchable database of government contracts, plus a collection of tools covering financial data from different levels of government.
Datasette for data exploration
The project maintains a Datasette instance at api.canadasbuilding.com containing the data they have gathered and processed from multiple data sources. It currently holds more than 2 million rows, plus a combined search index across a denormalized copy of that data.
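To give a feel for what search and facets look like from the API side, here is a minimal sketch of building a URL for Datasette's JSON API. The `_search`, `_facet` and `_size` query string parameters are standard Datasette features, but the host, database name (`spending`), table name (`contracts`) and facet column (`department`) below are placeholders I made up for illustration, not the real Canada Spends schema. `_search` also only works on tables that have full-text search configured.

```python
from urllib.parse import urlencode


def datasette_search_url(base_url, database, table, query, facet=None, size=10):
    """Build a Datasette JSON API URL for a full-text search on one table.

    The database, table and facet names are supplied by the caller; the
    values used in the example below are hypothetical.
    """
    params = {"_search": query, "_size": size}
    if facet is not None:
        # Ask Datasette to include facet counts for this column
        params["_facet"] = facet
    return f"{base_url.rstrip('/')}/{database}/{table}.json?{urlencode(params)}"


# Hypothetical host, database, table and column names
url = datasette_search_url(
    "https://datasette.example.com",
    "spending",
    "contracts",
    "consulting services",
    facet="department",
)
print(url)
```

Fetching that URL with any HTTP client returns JSON containing the matching rows along with the requested facet counts.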

Processing PDFs
The highest quality government financial data comes from the audited financial statements that every Canadian government department is required to publish. As is so often the case with government data, these are usually published as PDFs.
Brendan has been using Gemini to help extract data from those PDFs. Since this is accounting data the numbers can be summed and cross-checked to help validate the LLM didn’t make any obvious mistakes.
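The cross-checking idea can be sketched in a few lines of Python. This is not Brendan's actual validation code, just an illustration of the principle: sum the line items the model extracted and compare the result against the total printed in the statement. The function name, the tolerance and the sample figures are all invented for the example.

```python
from decimal import Decimal


def check_statement(line_items, reported_total, tolerance=Decimal("0.01")):
    """Compare the sum of extracted line items against the reported total.

    line_items is a list of (label, amount) pairs, with amounts as strings
    so they can be parsed exactly as Decimal values.
    Returns (ok, difference), where difference is computed minus reported.
    """
    computed = sum((Decimal(amount) for _, amount in line_items), Decimal("0"))
    difference = computed - Decimal(reported_total)
    return abs(difference) <= tolerance, difference


# Invented sample figures, standing in for values extracted from a PDF
extracted = [
    ("Salaries and benefits", "1250000.00"),
    ("Professional services", "310500.50"),
    ("Transfer payments", "2045000.00"),
]

ok, diff = check_statement(extracted, "3605500.50")
print(ok, diff)

# A total that does not match flags the extraction for manual review
bad_ok, bad_diff = check_statement(extracted, "3605000.50")
print(bad_ok, bad_diff)
```

Using `Decimal` rather than floats avoids rounding noise, so any difference that shows up points at a real extraction problem (or a mistake in the source document).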
Further reading
- datasette.io, the official website for Datasette
- sqlite-utils.datasette.io for more on sqlite-utils
- Canada Spends
- BuildCanada/CanadaSpends on GitHub