What are some strategies for scaling sites & infrastructure so global response times are relatively close to US response times?
You need to run your application in multiple data centers around the world, partitioned such that an incoming HTTP request can be completely serviced by a single data center. Then you use global DNS load balancing to direct users to the data center that is closest to them.
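As a rough illustration of the routing decision, here is a minimal sketch that picks the geographically nearest data center for a client. The data center names and coordinates are hypothetical, and plain great-circle distance is an assumption made for simplicity: real DNS load balancers usually work from the resolver's IP location or measured latency rather than the user's exact position.

```python
import math

# Hypothetical data centers: name -> (latitude, longitude) in degrees.
DATA_CENTERS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_data_center(client_location, healthy=None):
    """Return the name of the closest data center to the client.

    `healthy` optionally restricts the choice to data centers that are
    currently passing health checks (a hypothetical input here).
    """
    candidates = healthy if healthy is not None else DATA_CENTERS.keys()
    return min(
        candidates,
        key=lambda name: great_circle_km(client_location, DATA_CENTERS[name]),
    )
```

For example, `nearest_data_center((48.85, 2.35))` (roughly Paris) picks `"eu-west"`, and passing `healthy={"us-east", "ap-southeast"}` shows how traffic would fall back to another region if the closest one were down.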
Building an application like this is extremely difficult, because you need to synchronise data between your data centers and avoid inconsistencies if those data centers lose connectivity with each other. Most startups avoid doing this until they have large engineering teams and can afford to hire people who have done this before.
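To make the consistency problem concrete, here is a small sketch of two data centers that both accept writes to the same record while disconnected and then reconcile. The last-write-wins merge is only one simple strategy chosen for illustration, and the key names and timestamps are made up.

```python
def merge_lww(a, b):
    """Merge two replicas using last-write-wins.

    Each replica maps key -> (timestamp, value). For every key, the entry
    with the newest timestamp is kept and the other is discarded.
    """
    merged = dict(a)
    for key, (ts, value) in b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


# Both data centers start with the same record (hypothetical data).
us = {"cart:42": (100, ["book"])}
eu = dict(us)

# The link between them goes down and each side accepts a write.
us["cart:42"] = (105, ["book", "pen"])
eu["cart:42"] = (107, ["book", "lamp"])

# After connectivity returns, the replicas are reconciled.
merged = merge_lww(us, eu)
```

After the merge, the EU write wins and the pen added in the US data center is silently lost. Avoiding that kind of outcome is where most of the engineering effort goes.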
If you only want to speed up read-only traffic, this is much easier to achieve: you can work with a CDN/edge caching company such as Fastly (http://fastly.com/), Akamai or Amazon CloudFront, which run their own servers around the world and can cache your content for you. This can dramatically speed up your site for international visitors.
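A CDN generally decides what it may cache from the `Cache-Control` header your origin sends. Here is a minimal sketch of a helper that picks that header per response. The `/static/` path prefix, the `personalized` flag and the specific lifetimes are assumptions for illustration, not recommendations from any particular provider.

```python
def cache_headers(path, personalized=False):
    """Return HTTP caching headers for a response (illustrative policy)."""
    if personalized:
        # Per-user responses must not be stored by shared caches such as a CDN.
        return {"Cache-Control": "private, no-store"}
    if path.startswith("/static/"):
        # Assumes static assets have versioned file names, so they can be
        # cached for a year by both browsers and the CDN.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Shared pages: browsers revalidate each time, the CDN may serve a
    # cached copy for up to 60 seconds (s-maxage applies to shared caches).
    return {"Cache-Control": "public, max-age=0, s-maxage=60"}
```

With a policy like this, an international visitor fetching `/static/app.js` or a public page is usually served from a nearby edge server, while anything marked `personalized` still goes back to your origin.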