What are some strategies for scaling sites & infrastructure so global response times are relatively close to US response times?
My answer to "What are some strategies for scaling sites & infrastructure so global response times are relatively close to US response times?" on Quora
You need to run your application in multiple data centers around the world, partitioned such that an incoming HTTP request can be completely serviced by a single data center. Then you use global DNS load balancing to direct users to the data center that is closest to them.
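As a rough illustration of the routing decision a global DNS load balancer makes, here is a minimal sketch that picks the geographically closest data center for a user. The data center names and coordinates are made up for the example, and real services usually decide based on measured latency or the location of the user's DNS resolver rather than plain distance.

```python
import math

# Hypothetical data centers: name -> (latitude, longitude) in degrees.
DATA_CENTERS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_data_center(user_location, healthy=None):
    """Return the name of the closest data center.

    `healthy` optionally restricts the choice to data centers that are
    currently up, which is how a load balancer would fail over.
    """
    candidates = healthy if healthy is not None else DATA_CENTERS.keys()
    return min(
        candidates,
        key=lambda name: haversine_km(user_location, DATA_CENTERS[name]),
    )


london = (51.5, -0.13)
print(nearest_data_center(london))
print(nearest_data_center(london, healthy={"us-east", "ap-southeast"}))
```

The second call shows the failover case: if the nearest data center is unavailable, the user is sent to the next closest one, which only works if that data center can service the whole request on its own.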
Building an application like this is extremely difficult because you need to synchronise data between your data centers and avoid inconsistencies if they lose connectivity with each other. Most startups avoid doing this until they have large engineering teams and can afford to hire people who have done this before.
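To make the inconsistency problem concrete, here is a small sketch of what can go wrong when two data centers accept writes while disconnected. It assumes a "last write wins" merge, which is only one possible reconciliation strategy and is used here purely for illustration; the record layout and timestamps are invented.

```python
def last_write_wins(a, b):
    """Merge two (timestamp, value) records by keeping the newer one."""
    return a if a[0] >= b[0] else b


# Both data centers start from the same record, then lose connectivity.
us = (100, {"email": "old@example.com", "plan": "free"})
eu = us

# During the outage, each side accepts a different update to the record.
us = (101, {**us[1], "plan": "paid"})
eu = (102, {**eu[1], "email": "new@example.com"})

# When connectivity returns, the newer record replaces the older one whole.
merged = last_write_wins(us, eu)
print(merged)
```

The merged record keeps the new email address but silently drops the plan upgrade that was written in the US data center. Avoiding or repairing that kind of lost update is a large part of what makes multi-data-center applications hard.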
If you only want to speed up read-only traffic, this kind of thing is easier to achieve: you can work with a CDN/edge caching company such as http://fastly.com/, Akamai or Amazon CloudFront, which run their own servers around the world and can cache your content for you. This can dramatically speed up your site for international visitors.
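Edge caches generally decide what they may store from the standard HTTP `Cache-Control` response header. The sketch below shows one way an application might choose that header; the function name and the default lifetimes are assumptions for the example, and each CDN also has its own configuration that can override these values.

```python
def cache_headers(is_public, edge_ttl_seconds=3600, browser_ttl_seconds=60):
    """Build response headers telling caches how to treat a response.

    `s-maxage` applies to shared caches such as a CDN edge server, while
    `max-age` applies to the visitor's own browser cache.
    """
    if not is_public:
        # Per-user or otherwise sensitive responses must not be stored
        # by any cache.
        return {"Cache-Control": "private, no-store"}
    return {
        "Cache-Control": (
            f"public, max-age={browser_ttl_seconds}, "
            f"s-maxage={edge_ttl_seconds}"
        )
    }


print(cache_headers(is_public=True))
print(cache_headers(is_public=False))
```

Read-only pages that look the same for every visitor can use the public variant and be served from an edge server near the user, while anything personalised still goes back to your own servers.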