Simon Willison’s Weblog

43 items tagged “ec2”

2024

DSQL Vignette: Reads and Compute. Marc Brooker is one of the engineers behind AWS's new Aurora DSQL horizontally scalable database. Here he shares all sorts of interesting details about how it works under the hood.

The system is built around the principle of separating storage from compute: storage uses S3, while compute runs in Firecracker:

Each transaction inside DSQL runs in a customized Postgres engine inside a Firecracker MicroVM, dedicated to your database. When you connect to DSQL, we make sure there are enough of these MicroVMs to serve your load, and scale up dynamically if needed. We add MicroVMs in the AZs and regions your connections are coming from, keeping your SQL query processor engine as close to your client as possible to optimize for latency.

We opted to use PostgreSQL here because of its pedigree, modularity, extensibility, and performance. We’re not using any of the storage or transaction processing parts of PostgreSQL, but are using the SQL engine, an adapted version of the planner and optimizer, and the client protocol implementation.

The system then provides strong repeatable-read transaction isolation using MVCC and EC2's high precision clocks, enabling reads "as of time X" including against nearby read replicas.
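To make "reads as of time X" concrete, here is a minimal sketch of the general MVCC idea in Python. It is not DSQL's implementation: the `VersionedStore` class, its method names and the integer timestamps are all made up for illustration. Every write keeps its commit timestamp, and a read picks the newest version committed at or before the requested time.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedStore:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self):
        # key -> list of (commit_ts, value), kept sorted by commit_ts
        self._versions = defaultdict(list)

    def write(self, key, value, commit_ts):
        """Record a new version instead of overwriting the old one."""
        versions = self._versions[key]
        versions.append((commit_ts, value))
        versions.sort(key=lambda pair: pair[0])

    def read_as_of(self, key, ts):
        """Return the newest value committed at or before ts, else None."""
        versions = self._versions.get(key, [])
        timestamps = [commit_ts for commit_ts, _ in versions]
        index = bisect_right(timestamps, ts)
        return versions[index - 1][1] if index else None


store = VersionedStore()
store.write("balance", 100, commit_ts=10)
store.write("balance", 250, commit_ts=20)
snapshot_value = store.read_as_of("balance", 15)  # sees the version from ts=10
```

Because a reader only needs a timestamp and the version history, any replica that holds the versions up to time X can answer the read, which is why accurate clocks matter so much in this kind of design.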

The storage layer supports index scans, which means the compute layer can push down some operations, allowing it to load only the subset of rows it needs and reducing round-trips that are affected by speed-of-light latency.

The overall approach here is disaggregation: we’ve taken each of the critical components of an OLTP database and made it a dedicated service. Each of those services is independently horizontally scalable, most of them are shared-nothing, and each can make the design choices that is most optimal in its domain.

# 6th December 2024, 5:12 pm / firecracker, aws, scaling, s3, postgresql, architecture, ec2, databases

2011

The excess capacity story is a myth. It was never a matter of selling excess capacity, actually within 2 months after launch AWS would have already burned through the excess Amazon.com capacity. Amazon Web Services was always considered a business by itself, with the expectation that it could even grow as big as the Amazon.com retail operation.

Werner Vogels

# 5th January 2011, 3:13 pm / amazon, amazon-web-services, ec2, s3, recovered

2010

Bees with machine guns! Low-cost, distributed load-testing using EC2. Great name for a useful project—Bees with machine guns is a Fabric script which fires up a bunch of EC2 instances, uses them to load test a website and then spins them back down again.

# 27th October 2010, 11:04 pm / ec2, fabric, load-testing, performance, scaling, recovered

What’s powering the Content API? The new Guardian Content API runs on Solr, scaled using EC2 and Solr replication and with a Scala web service layer sitting between Solr and the API’s end users.

# 24th May 2010, 2:08 pm / apis, contentapi, ec2, guardian, openplatform, scala, scaling, solr, recovered

Automate EC2 Instance Setup with user-data Scripts (via) I knew about EC2’s user-data feature—what I didn’t know is that the Alestic and Canonical images are configured so that if the user-data starts with #! the instance will automatically execute it as a shell script as soon as it boots up (after networking has been configured).
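A rough sketch of the convention in Python, assuming nothing beyond the "#!" rule described above. The function names are hypothetical and this is not the code those images actually run at boot; it only shows how a payload could be assembled and how the check might look.

```python
def runs_as_script(user_data: str) -> bool:
    """True if the payload begins with the '#!' marker, in which case
    an image configured this way would execute it on first boot."""
    return user_data.startswith("#!")


def build_user_data(commands, interpreter="/bin/bash"):
    """Assemble a user-data payload from a list of shell commands.
    The default interpreter path is an assumption."""
    return "\n".join([f"#!{interpreter}", *commands]) + "\n"


payload = build_user_data(["apt-get update", "apt-get install -y nginx"])
will_execute = runs_as_script(payload)            # starts with "#!"
plain_data = runs_as_script('{"role": "web"}')    # ordinary data, left alone
```

The nice property is that the same user-data field still works for plain configuration data: anything that does not start with "#!" is just passed through to the instance.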

# 11th March 2010, 12:31 pm / sysadmin, deployment, ec2, userdata

Since we moved to EC2, the number of unique users has gone up 50%, and pageviews are up more than 100%. To support this growth, we have added 30% more ram and 50% more CPU, yet because of Amazon's constant price reductions, we are actually paying less per month now than when we started.

Jeremy from Reddit

# 7th January 2010, 10:10 pm / reddit, ec2, amazon, pricing, cloud-computing

2009

Guardian iPhone app. Released today, ad-free, £2.39 for the application, has an excellent offline mode. I helped build the backend web service, which is a Django app running on EC2.

# 14th December 2009, 1:29 pm / guardian, ec2, django, iphone, python

4store Amazon Machine Image. Instructions for firing up an EC2 AMI running the recently released 4store high performance triple store and loading in 1.14 billion statements collected by crawling the semantic web.

# 1st November 2009, 12:12 pm / semanticweb, semweb, 4store, triplestore, ec2, ami

MySQL backups with EBS snapshots. Assaf Arkin’s 45-line Ruby script shows how to lock tables / XFS freeze / create an EBS snapshot / unfreeze and unlock, with hourly snapshots preserved for the past 24 hours and daily snapshots for the past week. Is an EBS snapshot enough to restore your data to somewhere other than EC2 though?
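The retention rule is the part that is easy to get subtly wrong, so here is a sketch of it in Python rather than Ruby. This is my own reading of the policy, not Assaf's script: `snapshots_to_keep` is a hypothetical helper, and treating "daily" as "the earliest snapshot of each calendar day" is an assumption.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshot_times, now):
    """Keep every snapshot from the last 24 hours, plus the earliest
    snapshot of each calendar day within the last 7 days."""
    keep = set()
    first_of_day = {}
    for taken in sorted(snapshot_times):
        age = now - taken
        if timedelta(0) <= age <= timedelta(hours=24):
            keep.add(taken)
        if timedelta(0) <= age <= timedelta(days=7):
            # sorted order means the first one seen per date is the earliest
            first_of_day.setdefault(taken.date(), taken)
    keep.update(first_of_day.values())
    return keep


now = datetime(2009, 10, 13, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(24 * 10)]  # ten days of snapshots
keep = snapshots_to_keep(hourly, now)
expired = [taken for taken in hourly if taken not in keep]   # candidates for deletion
```

Computing the set to keep, and deleting everything else, tends to be safer than computing the set to delete directly: a bug then leaves you with too many snapshots rather than too few.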

# 13th October 2009, 12:34 pm / assaf-arkin, ruby, ec2, mysql, ebs, cloud, backups

OpenStreetMap Rendering Database. Amazon have added an OpenStreetMap snapshot as a public data set, thanks to some smart prompting by Jeremy Dunck.

# 10th October 2009, 1:05 pm / amazon, ec2, s3, publicdatasets, openstreetmap, mapping, jeremy-dunck

Tile Drawer (via) The most inspired use of EC2 I’ve seen yet: center a map on an area, pick a Cascadenik stylesheet URL (or write and link to your own) and Tile Drawer gives you an Amazon EC2 AMI and a short JSON snippet. Launch the AMI with the JSON as the “user data” parameter and you get your own OpenStreetMap tile rendering server, which self-configures on startup and starts rendering and serving tiles using your custom design.

# 26th August 2009, 9:32 am / openstreetmap, ec2, amazon, michal-migurski, cascadenik, mapnik, cloud-computing, json, userdata, mapping

Introducing Amazon Virtual Private Cloud (VPC). Amazon now let you create a network of private EC2 instances completely isolated from the internet and the rest of the EC2 cloud, then link them back to your home network via a VPN.

# 26th August 2009, 8:42 am / vpn, amazon, virtualprivatecloud, ec2

Memcached 1.4.0 released. The big new feature is the (optional) binary protocol, which enables other features such as CAS-everywhere and efficient client-side replication. Maintainer Dustin Sallings has also released some useful-sounding EC2 instances which automatically assign nearly all of their RAM to memcached on launch and shouldn’t need any further configuration.

# 17th July 2009, 10:26 pm / memcached, dustin-sallings, binary, cas, ec2, ami, caching, performance, scaling

Investigate your MP’s expenses. Launched today, this is the project that has been keeping me ultra-busy for the past week—we’re crowdsourcing the analysis of the 700,000+ scanned MP expenses documents released this morning. It’s the Guardian’s first live Django-powered application, and also the first time we’ve hosted something on EC2.

# 18th June 2009, 11:16 pm / django, ec2, guardian, mpexpenses, projects, python, crowdsourcing

EC2: Creating an Image. Here’s the easier way of creating your own AMI: start with a running instance in EC2, then customise it to fit your purposes and create a new bundle (and then AMI) using the ec2-bundle-vol command.

# 19th May 2009, 7:50 pm / ec2, ami, amazon, cloud-computing

HOWTO Building a self-bundling Debian AMI. Not as terrifying as you would have thought. Also contains some neat hints as to how some of the more magical parts of EC2 work (like the way your SSH public key automatically ends up in /root/.ssh/authorized_keys).

# 19th May 2009, 7:49 pm / ec2, debian, ami, cloud-computing, amazon

aws—simple access to Amazon EC2 and S3. The best command line client I’ve found for EC2 and S3. “aws put --progress my-bucket-name/large-file.tar.gz large-file.tar.gz” is particularly useful for uploading large files to S3. Written in Perl (with no dependencies), shelling out to curl to do the heavy lifting.

# 19th May 2009, 11:38 am / curl, perl, aws, amazon-web-services, ec2, s3, commandline, tools, tim-kay

New Features for EC2: Elastic Load Balancing, Auto Scaling, and Amazon CloudWatch. EC2 now fulfils the promise of “magic scaling in the cloud” out of the box—CloudWatch monitors performance of your EC2 instances without needing to install any monitoring software, Auto Scaling allows you to configure “scaling triggers” which start up new instances based on information from CloudWatch, and Elastic Load Balancing balances requests across all available instances.
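To illustrate how the three pieces fit together, here is a toy model in Python. It deliberately does not use the AWS API: the function name, the CPU thresholds and the instance limits are all invented for the example. A monitored metric feeds a scaling trigger, and requests are then spread round-robin across whatever instances exist.

```python
from itertools import cycle


def desired_instance_count(current, cpu_percent, scale_up_at=70.0,
                           scale_down_at=30.0, minimum=1, maximum=10):
    """Toy scaling trigger: add an instance when average CPU is high,
    remove one when it is low, and stay within the configured limits."""
    if cpu_percent > scale_up_at:
        target = current + 1
    elif cpu_percent < scale_down_at:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


# The monitoring step reports 85% CPU, so the trigger asks for a third instance.
count = desired_instance_count(current=2, cpu_percent=85.0)
instances = [f"instance-{n}" for n in range(count)]

# The load balancing step: hand out requests to instances in rotation.
balancer = cycle(instances)
assignments = [next(balancer) for _ in range(4)]
```

The gap between the two thresholds is what stops the system from flapping, starting and stopping instances every time the metric wobbles around a single cut-off.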

# 18th May 2009, 10:07 am / cloudwatch, amazon, ec2, elasticloadbalancing, autoscaling, scaling, cloud-computing

Ubuntu brings advanced Screen features to the masses. Ubuntu 9.04’s screen-profiles package adds a taskbar to screen and emulates the gnome panel. You can even add a widget showing the cost of your current EC2 session.

# 28th April 2009, 9:52 pm / screen, ubuntu, ec2, linux

Experiences deploying a large-scale infrastructure in Amazon EC2. “At OpenX we recently completed a large-scale deployment of one of our server farms to Amazon EC2. Here are some lessons learned from that experience.”

# 10th April 2009, 9:43 am / openx, amazonec2, ec2, amazon, scaling, griggheorghiu

Amazon Elastic MapReduce (via) Hadoop as a service. Basically a web-based GUI around Hadoop: you could roll this yourself on EC2, but for a small markup on regular EC2 prices you get to avoid the extra work of setting everything up. Data processing scripts can be written in Java, Ruby, Perl, Python, PHP, R, or C++ and are loaded into S3 before firing off the job.

# 2nd April 2009, 10:25 am / cloud-computing, hadoop, amazon-web-services, amazon, mapreduce, ec2, s3

maps from scratch. An idea whose time has come: using EC2 AMIs for tutorial sessions to give everyone a pre-configured environment.

# 15th March 2009, 1:20 pm / tutorials, ec2, cloud-computing, michal-migurski, mapping

Introducing the Karmic Koala, our mascot for Ubuntu 9.10 (via) Ubuntu 9.10 will have a strong focus on cloud computing, including tools for easily creating EC2 AMIs and Eucalyptus, an open-source system for running an EC2-compatible cloud in your own data centre.

# 21st February 2009, 5:19 pm / ubuntu, ec2, cloud-computing, eucalyptus, mark-shuttleworth, linux, karmickoala

Load Balancing in Amazon EC2 with HAProxy. Solid tutorial introduction to HAProxy.

# 5th February 2009, 11:12 pm / ec2, haproxy, load-balancing, griggheorghiu

Manage Amazon EC2 With New Web-Based AWS Management Console. Finally! I’m amazed it took Amazon so long to do this. Managing EC2 instances from a custom Firefox extension was pretty bizarre. It’s a very nice interface, built on top of YUI. Unfortunately you still have to manage your entire virtual server farm using a single shared Amazon account.

# 9th January 2009, 9:34 am / amazon, aws, ec2, cloud-computing, yui, javascript

2008

How Tarsnap uses Amazon Web Services (via) Useful case study, including some thoughts on SimpleDB.

# 14th December 2008, 7:35 pm / simpledb, tarsnap, amazon-web-services, aws, s3, ec2, cloud-computing

Ubuntu and Debian AMIs for Amazon EC2. Exactly what it says on the tin.

# 8th December 2008, 6:04 pm / amazonec2, ec2, ubuntu, debian, linux, amis

Amazon SimpleDB a complete flop? Terry asks if anyone is actually using SimpleDB (related Google searches indicate not, and I’ve personally not heard of anyone using it despite plenty of usage of S3 and EC2). One factor might be that lock-in to EC2 and S3 is pretty small, but if you rely on SimpleDB you’ll need to rewrite your entire application to escape.

# 2nd December 2008, 10:17 am / ec2, s3, amazon-web-services, simpledb, terry-jones, lockin, cloud

Trying out Windows on EC2. Phillip Pearson provides the missing documentation.

# 24th October 2008, 9:57 am / windows, ec2, phillip-pearson, amazonaws, cloud-computing

Windows Server and SQL Server on EC2 (via) Launched today, the pricing includes rental of the Windows license. Regular Windows is 25% to 50% more expensive than Linux, but SQL Server comes in at a hefty $1.10 per hour, which is $9636 per year (nearly three times as much as a Linux server running an open source database).
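The yearly figure comes from running the instance around the clock for a 365-day year. A quick check in Python, where the helper name is mine and only the $1.10 hourly rate comes from the pricing above:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def yearly_cost(hourly_rate_dollars):
    """Cost of running one instance continuously for a year."""
    return round(hourly_rate_dollars * HOURS_PER_YEAR, 2)


sql_server_per_year = yearly_cost(1.10)  # 9636.0
```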

# 23rd October 2008, 3:54 pm / open-source, cloud-computing, ec2, pricing, sqlserver, windows