96 items tagged “http”
2010
Side-Channel Leaks in Web Applications. Interesting new security research. SSL web connections encrypt the content but an attacker can still see the size of the HTTP requests going back and forward—which can be enough to extract significant pieces of information, especially in applications that make a lot of Ajax requests.
Node.js, redis, and resque (via) Paul Gross has been experimenting with Node.js proxies for allowing web applications to be upgraded without missing any requests. Here he places all incoming HTTP requests in a redis queue, then has his backend Rails servers consume requests from the queue and push the responses back on to a queue for Node to deliver. When the backend application is upgraded, requests remain in the queue and users see a few seconds of delay before their request is handled. It’s not production ready yet (POST requests aren’t handled, for example) but it’s a very interesting approach.
Elastic Search (via) Solr has competition! Like Solr, Elastic Search provides a RESTful JSON HTTP interface to Lucene. The focus here is on distribution, auto-sharding and high availability. It’s even easier to get started with than Solr, partly due to the focus on providing a schema-less document store, but it’s currently missing out on a bunch of useful Solr features (a web interface and faceting are the two that stand out). The high availability features look particularly interesting. UPDATE: I was incorrect, basic faceted queries are already supported.
2009
HTTP + Politics = ? Mark Nottingham ponders the technical implications of Australia’s decision to apply a filter to all internet traffic. Australia is large enough (and far enough away from the northern hemisphere) that the speed of light is a performance issue, but filtering technologies play extremely poorly with optimisation technologies such as HTTP pipelining and Google’s SPDY proposal.
Real time online activity monitor example with node.js and WebSocket. A neat exploration of Node.js—first hooking a “tail -f” process up to an HTTP push stream, then combining that with HTML 5 WebSockets to achieve reliable streaming.
Language Detection: A Witch’s Brew? The Flickr team make the case for using the Accept-Language header over IP detection to pick a site’s language, with a simple UI for switching languages in case you get it wrong. They’ve been using this for two and a half years without any significant problems.
Node.js is genuinely exciting
I gave a talk on Friday at Full Frontal, a new one day JavaScript conference in my home town of Brighton. I ended up throwing away my intended topic (JSONP, APIs and cross-domain security) three days before the event in favour of a technology which first crossed my radar less than two weeks ago.
[... 2,025 words]SPDY: The Web, Only Faster. Alex Russell explains the benefits of Google’s SPDF proposal (a protocol that upgrades HTTP)—including header compression, multiplexing, the ability to send additional resources such as images and stylesheets down without needing the data:uri hack and Comet support built in to the core assumptions of the protocol.
node.js. “Evented I/O for V8 JavaScript”—a JavaScript environment built on top of the super-fast V8 engine which provides event-based IO functionality for building highly concurrent TCP and HTTP servers. The API design is superb—everything is achieved using JavaScript events and callbacks (even regular file IO) and the small standard library ships with comprehensive support for HTTP and DNS. Overall it’s very similar to Twisted and friends, but JavaScript’s anonymous function syntax feels more natural than the Python equivalent. It compiles cleanly on Snow Leopard. Definitely a project to watch.
Traffic Server. Mark Nottingham explains the release of Traffic Server, a new Apache Incubator open source project donated by Yahoo! using code originally developed at Inktomi around a decade ago. Traffic Server is a HTTP proxy/cache, similar to Squid and Varnish (though Traffic Server acts as both a forward and reverse proxy, whereas Varnish only handles reverse).
High-end Varnish-tuning. Tuning the Varnish HTTP cache to serve 27K requests/second on a single core 2.2GHz Opteron.
cloud-crowd. New parallel processing worker/job queue system with a strikingly elegant architecture. The central server is an HTTP server that manages job requests, which are farmed out to a number of node HTTP servers which fork off worker processes to do the work. All communication is webhook-style JSON, and the servers are implemented in Sinatra and Thin using a tiny amount of code. The web-based monitoring interface is simply beautiful, using canvas to display graphs showing the system’s overall activity.
PostBin. Handy debugging tool for webhooks—create a TinyURL-style URL, then see a log of any POST requests made to that address.
We experimented with different async DB approaches, but settled on synchronous at FriendFeed because generally if our DB queries were backlogging our requests, our backends couldn't scale to the load anyway. Things that were slow enough were abstracted to separate backend services which we fetched asynchronously via the async HTTP module.
rather baffling finding: POST requests, made via the XMLHTTP object, send header and body data in separate tcp/ip packets [and therefore,] xmlhttp GET performs better when sending small amounts of data than an xmlhttp POST
PubSub-over-Webhooks with RabbitHub. RabbitMQ, the Erlang-powered AMQP message queue, is growing an HTTP interface based on webhooks and PubSubHubBub.
The Resource Expert Droid. Like the HTML Validator but for your server’s HTTP headers—extremely useful.
Facebook Usernames and OpenID
Today’s launch of Facebook Usernames provides an obvious and exciting opportunity for Facebook to become an OpenID provider. Facebook have clearly demonstrated their interest in becoming the key online identity for their users, and the new usernames feature is their acknowledgement that URL-based identities are an important component of that, no doubt driven in part by Twitter making usernames trendy again.
[... 760 words]Styling buttons to look like links. Nat has a neat trick for styling submit buttons to look like regular links—so there’s absolutely no excuse for using a “delete” link when you should be using a POST request.
A rev=“canonical” HTTP Header. Chris Shiflett proposes optionally exposing rev=canonical information in an HTTP header, thus allowing sites to discover shorter URLs using just a HEAD request and removing the need to parse HTML. The pingback specification also uses this shortcut.
Concurrence. Exciting: a Python framework for “creating massively concurrent network applications” (the tutorial benchmarks a Hello World web server at over 8,000 requests a second). It’s implemented on top of libevent using pyrex, can run on either Stackless Python or Greenlets from the py library and ships with a WSGI server, an HTTP client and a DBAPI 2.0 compliant MySQL driver.
django-springsteen and Distributed Search. Will Larson’s Django search library currently just talks to Yahoo! BOSS, but is designed to be extensible for other external search services. Interestingly, it uses threads to fire off several HTTP requests in parallel from within the Django view.
Tokyo Cabinet: Beyond Key-Value Store. Useful overview of Yet Another Scalable Key Value Store. Interesting points: multiple backends (hash table, B-Tree, in memory, on disk), a “table” engine which enables more advanced queries, a network server that supports HTTP, memcached or its own binary protocol and the ability to extend the engine with Lua scripts.
Pragmatism, purity and JSON content types
I started a conversation about this on Twitter the other day, but Twitter is a horrible place to have an archived discussion so I’m going to try again here.
[... 555 words]Ehy IE8, I Can Has Some Clickjacking Protection? (via) IE8 has built-in protection against clickjacking, but it’s opt-in (with a custom HTTP header) and IE only. It turns out the usual defence against clickjacking (using framebusting JavaScript) doesn’t work in IE as it can be worked around with a security=“restricted” attribute on an iframe.
2008
ETags And Modification Times In Django. Part of Malcolm’s series of tutorials on implementing advanced HTTP concepts in Django.
ptth (Reverse HTTP) implementation in a browser using Long Poll COMET. Donovan Preston experiments with the cleverly named idea of ptth, where servers send HTTP requests to clients.
Response Splitting Risk. Important reminder that you should always ensure strings used in HTTP headers don’t contain newlines.
Versioning REST Web Services. Peter Williams suggests using a vendor MIME media type in the Accept header to specify a required API version, because embedding the API version in the URL itself leads to a single resource ending up with many different URLs, one for each API version.
Robust Defenses for Cross-Site Request Forgery [PDF]. Fascinating report which introduces the “login CSRF” attack, where an attacker uses CSRF to log a user in to a site (e.g. PayPal) using the attacker’s credentials, then waits for them to submit sensitive information or bind the account to their credit card. The paper also includes an in-depth study of potential protection measures, including research that shows that 3-11% of HTTP requests to a popular ad network have had their referer header stripped. Around 0.05%-0.10% of requests have custom HTTP headers such as X-Requested-By stripped.