Simon Willison’s Weblog

35 items tagged “sql”

github/gh-ost: Thoughts on Foreign Keys? The biggest challenge I’ve seen with foreign key constraints at scale (at least with MySQL) is how they conflict with online schema migrations using tools like pt-online-schema-change or GitHub’s gh-ost. This is a good explanation of the issue by Shlomi Noach, one of the gh-ost maintainers. # 19th June 2018, 4:12 pm

Half of the time when companies say they need “AI” what they really need is a SELECT clause with GROUP BY.

Mat Velloso # 1st June 2018, 2:35 pm

System-Versioned Tables in MariaDB (via) Fascinating new feature from the SQL:2011 standard that’s getting its first working implementation in MariaDB 10.3.4. “ALTER TABLE products ADD SYSTEM VERSIONING;” causes the database to store every change made to that table—then you can run “SELECT * FROM products FOR SYSTEM_TIME AS OF TIMESTAMP @t1;” to query the data as of a specific point in time. I’ve tried all manner of horrible mechanisms for achieving this in the past, having it baked into the database itself would be fantastic. # 25th April 2018, 2:34 pm

How to rewrite your SQL queries in Pandas, and more (via) I still haven’t fully internalized the idioms needed to manipulate DataFrames in pandas. This tutorial helps a great deal—it shows the Pandas equivalents for a host of common SQL queries. # 19th April 2018, 6:34 pm

Use The Index, Luke! Paging Through Results (via) The best explanation of keyset pagination I’ve seen. Keyset pagination is where instead of using OFFSET/LIMIT to return the next page of results you instead track the last seen value in the column you sort by and then return the next X results that follow it. This allows you to paginate to arbitrarily deep offsets within a table, whereas OFFSET/LIMIT requires first iterating across all preceding rows and tends to stop working well after the first few thousand results. # 29th March 2018, 5:30 pm

User-defined Order in SQL (via) This is a fun intellectual exercise: how can one efficiently implement a user-defined order in a SQL table? The obvious initial approach is to have an integer position column, but this means every subsequent row must be updated when an item changes position. Joe “begriffs” Nelson explores some clever alternatives, including floating point or decimal positions (allowing new items to be inserted at a midpoint between existing positions) and a new custom rational number type he buiIt as a PostgreSQL extension. # 21st March 2018, 2:07 pm

SQLite Query Language: WITH clause. SQLite’s documentation on recursive CTEs starts out with some nice clear examples of tree traversal using a WITH statement, then gets into graphs, then goes way off the deep end with a Mandelbrot Set query and a query that can solve Soduku puzzles (“in less than 300 milliseconds on a modern workstation”). # 26th November 2017, 7:23 am

What are the key insights in mastering SQL queries?

You may find this article useful (despite the list-o-matic name): 10 Easy Steps to a Complete Understanding of SQL—I’ve been using SQL for years but I found that some of the concepts explained there helped firm up my fundamental understanding of how to use it effectively.

[... 64 words]

What tools and techniques are used for relational database version control (structure and data)?

The term you are looking for is database migrations (sometimes called database change scripts).

[... 308 words]

Appending the request URL to SQL statements in Django. A clever frame-walking monkey-patch which pulls the most recent HttpRequest object out of the Python stack and adds the current request.path to each SQL query as an SQL comment, so you can see it in debugging tools such as slow query logs and the PostgreSQL “select * from pg_stat_activity” query. # 2nd June 2010, 9:09 am

The answers to your Security Questions are case sensitive and cannot contain special characters like an apostrophe, or the words “insert,” “delete,” “drop,” “update,” “null,” or “select.”

Sacramento Credit Union # 14th May 2010, 12:40 am

grant XXX on * ? (via) PostgreSQL doesn’t have a way to say “this user is allowed to select/update/etc on all tables in database X”. That kind of sucks. UPDATE: This is fixed in PostgreSQL 9, see the comments. # 16th March 2010, 6:26 pm

django-batch-select (via) A smart attempt at solving select_related for many-to-many relationships in Django. Add a custom manager to your model and call e.g. Entry.objects.all()[:10].batch_select(“tags”) to execute two queries—one pulling back the first ten entries and another using an “IN” query against the tags table to pull back all of the tags for those entries in one go. # 23rd November 2009, 4:19 pm

When I worked at Amazon.com we had a deeply-ingrained hatred for all of the SQL databases in our systems. Now, we knew perfectly well how to scale them through partitioning and other means. But making them highly available was another matter. Replication and failover give you basic reliability, but it’s very limited and inflexible compared to a real distributed datastore with master-master replication, partition tolerance, consensus and/or eventual consistency, or other availability-oriented features.

Matt Brubeck # 4th October 2009, 9:50 am

Mandelbrot set in PostgreSQL. Surprisingly short SQL statement that produces an ASCII art Mandelbrot set. # 13th August 2009, 2:23 pm

SQL pie chart. Generating ASCII art pie charts using the world’s scariest MySQL SELECT statement. # 13th August 2009, 1:04 pm

YQL: INSERT INTO internet. insert into twitter.status (status,username,password) values (“Playing with INSERT, UPDATE and DELETE in YQL”, “twitterusername”,“twitterpassword”) # 8th July 2009, 8:19 pm

With YQL Execute, the Internet becomes your database. This is nuts (in a good way). Yahoo!’s intriguing universal SQL-style XML/JSONP web service interface now supports JavaScript as a kind of stored procedure language, meaning you can use JavaScript and E4X to screen-scrape web pages, then query the results with YQL. # 29th April 2009, 10:50 pm

python-sqlparse (via) Python library for re-identing SQL statements. This could make debugging Django’s generated SQL a whole lot easier. You can try the library out using an App Engine hosted application (complete with an API). # 28th April 2009, 8:25 pm

peeping into memcached. “Peep uses ptrace to freeze a running memcached server, dump the internal key metadata, and return the server to a running state”—you can then load the resulting data in to MySQL using LOAD LOCAL INFILE and analyse it using standard SQL queries. # 20th April 2009, 6:35 pm

Sphinx 0.9.9-rc2 is out. Interesting new feature: the Sphinx search server now supports the MySQL binary protocol, so you can talk to it using a regular MySQL client library and fire off search queries using SELECT syntax and the new SphinxQL query language. # 8th April 2009, 1:59 pm

YQL opens up 3rd-party web service table definitions to developers. This really is astonishingly clever: you can create an XML file telling Yahoo!’s YQL service how to map an arbitrary API to YQL tables, then make SQL-style queries against it (including joins against other APIs). Another neat trick: doing a SQL “in” query causes API requests to be run in parallel and recombined before being returned to you. # 9th February 2009, 9:08 pm

Amazon SimpleDB—Now With Select. So now all three of Yahoo!, Amazon and Google have invented their own SQL-like languages (YQL, SimpleDB and GQL)—though it looks like Yahoo!’s is the only one that attempts to provide joins. # 18th December 2008, 8:59 am

YQL—converting the web to JSON with mock SQL. YQL just got a whole lot more interesting to me—I had no idea they were exposing an HTML and RSS scraping tool over a JSONP API in addition to all of the Yahoo! web service methods. # 13th December 2008, 9:39 am

Secrets of the Django ORM. An undocumented (and unsupported) method of poking a Django QuerySet’s internal query to add group_by and having clauses to a SQL query. # 8th November 2008, 11:49 pm

Mass Attack FAQ. Thousands of IIS Web servers have been infected with an automated mass XSS attack, not through a specific IIS vulnerability but using a universal XSS SQL query that targets SQL Server and modifies every text field to add the attack JavaScript. If an app has even a single SQL injection hole (and many do) it is likely to be compromised. # 26th April 2008, 9:12 am

IronPython, MS SQL, and PEP 249. How Dino Viehland got Django’s ORM to talk to the .NET database layer. # 19th March 2008, 9:46 am

Queryset Implementation. Malcolm explains the work that has gone in to the queryset-refactor branch. Executive summary: Python’s ORM is probably a lot better at SQL than you are. # 19th March 2008, 9:43 am

django-mptt (via) Jonathan Buchanan’s simple utility for performing Modified Preorder Tree Traversal (efficient tree operations in SQL) on Django models. # 29th December 2007, 11:33 am

A Visual Explanation of SQL Joins. It turns out Venn diagrams are an excellent way of illustrating joins. # 12th October 2007, 9:42 am