43 items tagged “ec2”
2024
DSQL Vignette: Reads and Compute. Marc Brooker is one of the engineers behind AWS's new Aurora DSQL horizontally scalable database. Here he shares all sorts of interesting details about how it works under the hood.
The system is built around the principle of separating storage from compute: storage uses S3, while compute runs in Firecracker:
Each transaction inside DSQL runs in a customized Postgres engine inside a Firecracker MicroVM, dedicated to your database. When you connect to DSQL, we make sure there are enough of these MicroVMs to serve your load, and scale up dynamically if needed. We add MicroVMs in the AZs and regions your connections are coming from, keeping your SQL query processor engine as close to your client as possible to optimize for latency.
We opted to use PostgreSQL here because of its pedigree, modularity, extensibility, and performance. We’re not using any of the storage or transaction processing parts of PostgreSQL, but are using the SQL engine, an adapted version of the planner and optimizer, and the client protocol implementation.
The system then provides strong repeatable-read transaction isolation using MVCC and EC2's high-precision clocks, enabling reads "as of time X", including against nearby read replicas.
The storage layer supports index scans, which means the compute layer can push down some operations, allowing it to load only the subset of rows it needs and reducing the round-trips that are affected by speed-of-light latency.
The overall approach here is disaggregation: we’ve taken each of the critical components of an OLTP database and made it a dedicated service. Each of those services is independently horizontally scalable, most of them are shared-nothing, and each can make the design choices that is most optimal in its domain.
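To make the "as of time X" idea concrete, here is a toy sketch of a multi-version read. It is not DSQL's implementation: the `VersionedStore` class, its method names and the integer timestamps (standing in for a precise clock) are all my own assumptions, chosen only to show how keeping every committed version lets a reader pick the one visible at a given time.

```python
import bisect


class VersionedStore:
    """Toy multi-version store: every write is kept, tagged with its commit time."""

    def __init__(self):
        self._commit_times = {}  # key -> sorted list of commit timestamps
        self._values = {}        # key -> values, parallel to _commit_times

    def write(self, key, value, commit_time):
        # Insert in timestamp order so reads can binary-search.
        times = self._commit_times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        i = bisect.bisect_right(times, commit_time)
        times.insert(i, commit_time)
        values.insert(i, value)

    def read_as_of(self, key, read_time):
        """Return the newest value committed at or before read_time, else None."""
        times = self._commit_times.get(key, [])
        i = bisect.bisect_right(times, read_time)
        return self._values[key][i - 1] if i else None


store = VersionedStore()
store.write("balance", 100, commit_time=10)
store.write("balance", 80, commit_time=20)

print(store.read_as_of("balance", 15))  # 100: the later write is not visible yet
print(store.read_as_of("balance", 25))  # 80
print(store.read_as_of("balance", 5))   # None: nothing committed by then
```

Because a read only needs a timestamp and the version history, any replica holding that history can in principle answer it, which is why well-synchronised clocks matter so much here.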
2011
The excess capacity story is a myth. It was never a matter of selling excess capacity, actually within 2 months after launch AWS would have already burned through the excess Amazon.com capacity. Amazon Web Services was always considered a business by itself, with the expectation that it could even grow as big as the Amazon.com retail operation.
2010
Bees with machine guns! Low-cost, distributed load-testing using EC2. Great name for a useful project—Bees with machine guns is a Fabric script which fires up a bunch of EC2 instances, uses them to load test a website and then spins them back down again.
What’s powering the Content API? The new Guardian Content API runs on Solr, scaled using EC2 and Solr replication and with a Scala web service layer sitting between Solr and the API’s end users.
Automate EC2 Instance Setup with user-data Scripts (via) I knew about EC2’s user-data feature—what I didn’t know is that the Alestic and Canonical images are configured so that if the user-data starts with #! the instance will automatically execute it as a shell script as soon as it boots up (after networking has been configured).
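A minimal sketch of that rule, assuming nothing beyond what is described above: the payload, the log path and the `would_run_as_script` helper are hypothetical, and the check only mirrors the "starts with #!" convention rather than reproducing what the images actually do at boot.

```python
# Hypothetical user-data payload: a shell script the image would run at first boot.
USER_DATA = """#!/bin/bash
echo "booted at $(date)" >> /var/log/first-boot.log
"""


def would_run_as_script(user_data: str) -> bool:
    """Mirror the rule described above: only payloads starting with '#!' are executed."""
    return user_data.startswith("#!")


print(would_run_as_script(USER_DATA))           # True
print(would_run_as_script('{"role": "web"}'))   # False: treated as plain data
```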
Since we moved to EC2, the number of unique users has gone up 50%, and pageviews are up more than 100%. To support this growth, we have added 30% more ram and 50% more CPU, yet because of Amazon's constant price reductions, we are actually paying less per month now than when we started.
2009
Guardian iPhone app. Released today, ad-free, £2.39 for the application, has an excellent offline mode. I helped build the backend web service, which is a Django app running on EC2.
4store Amazon Machine Image. Instructions for firing up an EC2 AMI running the recently released 4store high performance triple store and loading in 1.14 billion statements collected by crawling the semantic web.
MySQL backups with EBS snapshots. Assaf Arkin’s 45-line Ruby script shows how to lock tables / XFS freeze / create an EBS snapshot / unfreeze and unlock, with hourly snapshots preserved for the past 24 hours and daily snapshots for the past week. Is an EBS snapshot enough to restore your data to somewhere other than EC2, though?
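The retention rule is the part worth sketching. This is not Arkin's Ruby script and it makes no AWS calls; it is my own Python approximation, with a hypothetical `snapshots_to_keep` function, and it assumes "daily" means the earliest snapshot of each calendar day.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshot_times, now):
    """Return the set of snapshot timestamps to retain.

    Keeps every snapshot from the last 24 hours, plus the earliest
    snapshot of each calendar day within the last 7 days.
    """
    keep = set()
    first_of_day = {}
    for ts in sorted(snapshot_times):
        age = now - ts
        if age < timedelta(0):
            continue  # ignore timestamps in the future
        if age <= timedelta(hours=24):
            keep.add(ts)
        if age <= timedelta(days=7):
            # sorted() guarantees the first one seen per day is the earliest
            first_of_day.setdefault(ts.date(), ts)
    keep.update(first_of_day.values())
    return keep


now = datetime(2009, 1, 10, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(240)]  # ten days of hourly snapshots
keep = snapshots_to_keep(hourly, now)
expired = [ts for ts in hourly if ts not in keep]
print(len(keep), "kept,", len(expired), "eligible for deletion")
```

Anything not in the returned set is a candidate for deletion, so the same function can drive both the hourly and the daily pruning.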
OpenStreetMap Rendering Database. Amazon have added an OpenStreetMap snapshot as a public data set, thanks to some smart prompting by Jeremy Dunck.
Tile Drawer (via) The most inspired use of EC2 I’ve seen yet: center a map on an area, pick a Cascadenik stylesheet URL (or write and link to your own) and Tile Drawer gives you an Amazon EC2 AMI and a short JSON snippet. Launch the AMI with the JSON as the “user data” parameter and you get your own OpenStreetMap tile rendering server, which self-configures on startup and starts rendering and serving tiles using your custom design.
Introducing Amazon Virtual Private Cloud (VPC). Amazon now let you create a network of private EC2 instances completely isolated from the internet and the rest of the EC2 cloud, then link them back to your home network via a VPN.
Memcached 1.4.0 released. The big new feature is the (optional) binary protocol, which enables other features such as CAS-everywhere and efficient client-side replication. Maintainer Dustin Sallings has also released some useful-sounding EC2 instances which automatically assign nearly all of their RAM to memcached on launch and shouldn’t need any further configuration.
Investigate your MP’s expenses. Launched today, this is the project that has been keeping me ultra-busy for the past week—we’re crowdsourcing the analysis of the 700,000+ scanned MP expenses documents released this morning. It’s the Guardian’s first live Django-powered application, and also the first time we’ve hosted something on EC2.
EC2: Creating an Image. Here’s the easier way of creating your own AMI: start with a running instance in EC2, then customise it to fit your purposes and create a new bundle (and then AMI) using the ec2-bundle-vol command.
HOWTO Building a self-bundling Debian AMI. Not as terrifying as you would have thought. Also contains some neat hints as to how some of the more magical parts of EC2 work (like the way your SSH public key automatically ends up in /root/.ssh/authorized_keys).
aws—simple access to Amazon EC2 and S3. The best command line client I’ve found for EC2 and S3. “aws put --progress my-bucket-name/large-file.tar.gz large-file.tar.gz” is particularly useful for uploading large files to S3. Written in Perl (with no dependencies), shelling out to curl to do the heavy lifting.
New Features for EC2: Elastic Load Balancing, Auto Scaling, and Amazon CloudWatch. EC2 now fulfils the promise of “magic scaling in the cloud” out of the box—CloudWatch monitors performance of your EC2 instances without needing to install any monitoring software, Auto Scaling allows you to configure “scaling triggers” which start up new instances based on information from CloudWatch, and Elastic Load Balancing balances requests across all available instances.
Ubuntu brings advanced Screen features to the masses. Ubuntu 9.04’s screen-profiles package adds a taskbar to screen and emulates the gnome panel. You can even add a widget showing the cost of your current EC2 session.
Experiences deploying a large-scale infrastructure in Amazon EC2. “At OpenX we recently completed a large-scale deployment of one of our server farms to Amazon EC2. Here are some lessons learned from that experience.”
Amazon Elastic MapReduce (via) Hadoop as a service. Basically a web-based GUI around Hadoop: you could roll this yourself on EC2, but for a small markup on regular EC2 prices you get to avoid the extra work of setting everything up. Data processing scripts can be written in Java, Ruby, Perl, Python, PHP, R, or C++ and are loaded into S3 before firing off the job.
maps from scratch. An idea whose time has come: using EC2 AMIs for tutorial sessions to give everyone a pre-configured environment.
Introducing the Karmic Koala, our mascot for Ubuntu 9.10 (via) Ubuntu 9.10 will have a strong focus on cloud computing, including tools for easily creating EC2 AMIs and Eucalyptus, an open-source system for running an EC2-compatible cloud in your own data centre.
Load Balancing in Amazon EC2 with HAProxy. Solid tutorial introduction to HAProxy.
Manage Amazon EC2 With New Web-Based AWS Management Console. Finally! I’m amazed it took Amazon so long to do this. Managing EC2 instances from a custom Firefox extension was pretty bizarre. It’s a very nice interface, built on top of YUI. Unfortunately you still have to manage your entire virtual server farm using a single shared Amazon account.
2008
How Tarsnap uses Amazon Web Services (via) Useful case study, including some thoughts on SimpleDB.
Ubuntu and Debian AMIs for Amazon EC2. Exactly what it says on the tin.
Amazon SimpleDB a complete flop? Terry asks if anyone is actually using SimpleDB (related Google searches indicate not, and I’ve personally not heard of anyone using it despite plenty of usage of S3 and EC2). One factor might be that lock-in to EC2 and S3 is pretty small, but if you rely on SimpleDB you’ll need to rewrite your entire application to escape.
Trying out Windows on EC2. Phillip Pearson provides the missing documentation.
Windows Server and SQL Server on EC2 (via) Launched today, the pricing includes rental of the Windows license. Regular Windows is 25% to 50% more expensive than Linux, but SQL Server comes in at a hefty $1.10 per hour, which is $9636 per year (nearly three times as much as a Linux server running an open source database).
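The yearly figure is just the hourly rate multiplied out over a non-leap year of continuous running. A quick check, with variable names of my own choosing:

```python
# Check the yearly cost of the SQL Server instance quoted above.
hourly_rate_cents = 110      # $1.10 per hour, from the pricing above
hours_per_year = 24 * 365    # assumes a non-leap year, running non-stop

yearly_cost_dollars = hourly_rate_cents * hours_per_year / 100
print(f"${yearly_cost_dollars:,.0f} per year")  # $9,636 per year
```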