85 items tagged “databases”
2013
What are the key insights in mastering SQL queries?
You may find this article useful (despite the list-o-matic name): 10 Easy Steps to a Complete Understanding of SQL—I’ve been using SQL for years but I found that some of the concepts explained there helped firm up my fundamental understanding of how to use it effectively.
[... 64 words]NoSQL: What is the “best” solution for storing high volumes of structured data?
On the right setup, PostgreSQL can handle petabytes. There are also commercial vendors such as Greenplum that offer data warehouse solutions built on a modified version of PostgreSQL.
[... 80 words]How was FriendFeed’s schema less db faster than pure MySQL?
The principle reason they switched to a schemaless DB was to work around the challenges of having to make schemes changes in MySQL, which can lock the table and take hours if bit days to complete in large tables.
[... 115 words]What is way that android connect to Oracle database?
As a general rule it’s not a good idea to allow mobile devices to connect directly to a server-side database, as it’s an invitation to hackers to figure out what’s going on and then connect to the database themselves for nefarious reasons.
[... 105 words]Should I store markdown instead of HTML into database fields?
You should store the exact format that was entered by the user.
[... 95 words]How does a web page interact with a server to parse a dynamic JSON file?
If you’re only dealing with 60 records there’s no need to add a full database. I’ve actually hand coded a 50 record JSON file before and it was fine- use an editor with good JSON support (I like Sublime Text 2) and it’s pretty easy to hand write.
[... 103 words]Where can I find an updated DB of countries, states and cities?
This is a surprisingly complicated question. The first thing you might want to ask yourself is “what’s a country”—how do you deal with places on this List of states with limited recognition for example?
[... 182 words]2012
What tools and techniques are used for relational database version control (structure and data)?
The term you are looking for is database migrations (sometimes called database change scripts).
[... 308 words]What are XML feed best practices?
It sounds like you’re pretty much screwed already, if you’re dealing with companies that still think FTPing XML around is a sensible thing to do.
[... 364 words]Benchmarks for scalability in NoSQL systems?
NoSQL systems are enormously varied which makes it hard (and not particularly constructive) to benchmark them against each other. How would you compare the performance of Redis, an in-memory data structure server, with Cassandra, a distributed redundant column store?
[... 78 words]2011
Why does Django still not have support for multiple joins?
I don’t fully understand the question, but if you’re talking about doing a single join across multiple tables the Django ORM handles that just fine. Let’s say you want to get every BlogEntry written by a User who belongs to the Group with the name “admins”:
[... 67 words]Is a relational database with many-to-many relationships difficult to develop into a web app?
Many to Many tables can be a bit of a pain to deal with using regular SQL, but a good ORM can abstract away any potential complexity almost entirely. I find using the Django ORM means I’m much less likely to shy away from a design that involves a many-to-many relationship because I know it won’t increase the complexity of the application. I imagine the Rails ORM has the same effect.
[... 91 words]2010
What are people’s experiences using Memcached?
That it’s so obviously a good idea (and works so well) that you’d be crazy not to use it. As far as I’m concerned, it’s part of the default stack for any web application.
[... 46 words]Where can I find a database of the cities in the United States, their populations, and square miles?
On Freebase: http://www.freebase.com/view/use...—if you sign up for a Freebase account you can further filter this report to include areas.
[... 46 words]A Gentle Introduction to CouchDB for Relational Practitioners. By “High Performance MySQL” author Baron Schwartz—a smart, concise overview that touches pretty much everything that’s interesting about CouchDB. # 22nd September 2010, 9:51 pm
How do Solr, Lucene, Sphinx and Searchify compare?
Lucene is a Java library for creating and searching through a full text index. If you want to make use of it, you’ll need to write your own Java code that integrates with it.
[... 109 words]What is the largest production deployment of CouchDB for online use?
The BBC have a pretty big CouchDB cluster, which they use mostly as a replicated key-value store. It’s used by their new identity platform which includes customisation features for iPlayer.
[... 47 words]PostgreSQL 9.0 Beta 1 Now Available. With asynchronous streaming replication. # 5th May 2010, 2:36 pm
grant XXX on * ? (via) PostgreSQL doesn’t have a way to say “this user is allowed to select/update/etc on all tables in database X”. That kind of sucks. UPDATE: This is fixed in PostgreSQL 9, see the comments. # 16th March 2010, 6:26 pm
Johnny Cache. Clever twist on ORM-level caching for Django. Johnny Cache (great name) monkey-patches Django’s QuerySet classes and caches the result of every single SELECT query in memcached with an infinite expiry time. The cache key includes a “generation” ID for each dependent database table, and the generation is changed every single time a table is updated. For apps with infrequent writes, this strategy should work really well—but if a popular table is being updated constantly the cache will be all but useless. Impressively, the system is transaction-aware—cache entries created during a transaction are held in local memory and only pushed to memcached should the transaction complete successfully. # 28th February 2010, 10:55 pm
Redis Virtual Memory: the story and the code. Fascinating overview of the virtual memory feature coming to Redis 2.0, which will remove the requirement that all Redis data fit in RAM. Keys still stay in RAM, but rarely accessed values will be swapped to disk. 16 GB of RAM will be enough to hold 100 million keys, each with a value as large as you like. # 9th February 2010, 3:59 pm
FleetDB (via) Yet Another Key-Value Store: Schema-free, JSON protocol, everything cached in RAM, append-only log for durability, multi-record transactions... but what’s really interesting about this one is that it’s written in Clojure and takes full advantage of that language’s concurrency primitives. The prefix operators used by the select API hint at its Lisp heritage. # 5th January 2010, 11:21 am
2009
Django | Multiple Databases. Russell just checked in the final patch developed from Alex Gaynor’s Summer of Code project to add multiple database support to Django. I’d link to the 21,000 line changeset but it crashed our Trac, so here’s the documentation instead. # 22nd December 2009, 5:22 pm
Another leak, the worst so far (via) “Arweena, a spokes-elf for Santa Claus, admitted a few hours ago that the database posted at WikiLeaks yesterday is indeed the comprehensive 2009 list of which kids have been naughty, and which were nice.” The first comment is great too. # 22nd December 2009, 10:42 am
PostgreSQL 8.5 alpha 2 is out. “P.S. If you’re wondering about Hot Standby and Synchronous Replication, they’re still under heavy development and still (at this point) expected to be in 8.5.”—Hot Standby is PostgreSQL-speak for MySQL-style master/slave replication for scaling your reads. # 28th October 2009, 9:02 am
MySQL Connector/Python. A pure Python implementation of the MySQL client/server protocol, meaning you can talk to a MySQL server from Python without needing to first install the MySQL client libraries (which often requires compiling from source). # 2nd October 2009, 2:16 pm
Why we migrated from MySQL to MongoDB. Includes some useful information on MongoDB’s limitations—for example, running many different collections can waste disk space and repairing large datasets or bulk deleting many rows can block and lock the database for the duration of the operation. # 27th July 2009, 10:49 am
Keyspace. Yet Another Key-Value Store—this one focuses on high availability, with one server in the cluster serving as master (and handling all writes), and the paxos algorithm handling replication and ensuring a new master can be elected should the existing master become unavailable. Clients can chose to make dirty reads against replicated servers or clean reads by talking directly to the master. Underlying storage is BerkeleyDB, and the authors claim 100,000 writes/second. Released under the AGPL. # 16th July 2009, 10:30 am
Using Mongo for Real-Time Analytics. MongoDB supports an “upsert” query, which when combined with the $inc operator can cause counter fields to be incremented if they exist and created otherwise. This makes it a great fit for real-time analytics applications (one increment per page view), something that regular relational databases aren’t particularly good at. # 30th June 2009, 7:28 pm