Simon Willison’s Weblog

Entries tagged aws


sqlite-comprehend: run AWS entity extraction against content in a SQLite database

I built a new tool this week: sqlite-comprehend, which passes text from a SQLite database through the AWS Comprehend entity extraction service and stores the returned entities.

[... 1146 words]
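
The storage half of that workflow can be sketched without calling AWS at all. The example below is a minimal sketch, not the actual sqlite-comprehend implementation: the `entities` table, its column names and the `store_entities` helper are hypothetical, and the sample response is hand-written to mimic the shape of a Comprehend `DetectEntities` result (an `Entities` list whose items carry `Text`, `Type`, `Score`, `BeginOffset` and `EndOffset`).

```python
import sqlite3


def store_entities(conn, document_id, response):
    """Store the entities from one Comprehend-style response.

    Hypothetical helper: the table layout here is an assumption,
    not the schema sqlite-comprehend really uses.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS entities (
            document_id INTEGER,
            text TEXT,
            type TEXT,
            score REAL,
            begin_offset INTEGER,
            end_offset INTEGER
        )
        """
    )
    rows = [
        (
            document_id,
            entity["Text"],
            entity["Type"],
            entity["Score"],
            entity["BeginOffset"],
            entity["EndOffset"],
        )
        for entity in response.get("Entities", [])
    ]
    conn.executemany("INSERT INTO entities VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hand-written sample standing in for a real API response (assumption).
sample_response = {
    "Entities": [
        {"Text": "Simon Willison", "Type": "PERSON", "Score": 0.99,
         "BeginOffset": 0, "EndOffset": 14},
        {"Text": "AWS", "Type": "ORGANIZATION", "Score": 0.95,
         "BeginOffset": 32, "EndOffset": 35},
    ]
}

conn = sqlite3.connect(":memory:")
stored = store_entities(conn, 1, sample_response)
```

In a real run the response would come from the Comprehend service rather than a literal dictionary; keeping the storage step separate makes it easy to exercise without credentials.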

s3-ocr: Extract text from PDF files stored in an S3 bucket

I’ve released s3-ocr, a new tool that runs Amazon’s Textract OCR text extraction against PDF files in an S3 bucket, then writes the resulting text out to a SQLite database with full-text search configured so you can run searches against the extracted data.

[... 1493 words]
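
The "SQLite database with full-text search configured" part can be illustrated on its own. This is a minimal sketch under stated assumptions, not how s3-ocr is actually written: the `pages_fts` table, its columns and the `search` helper are hypothetical, the OCR text is made up, and the Textract step is skipped entirely. It also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per PDF, with only the text column indexed.
conn.execute("CREATE VIRTUAL TABLE pages_fts USING fts5(path UNINDEXED, text)")

# Made-up strings standing in for text returned by an OCR service.
extracted = [
    ("reports/budget.pdf", "Annual budget for the city council"),
    ("reports/minutes.pdf", "Minutes of the planning meeting"),
]
conn.executemany("INSERT INTO pages_fts (path, text) VALUES (?, ?)", extracted)
conn.commit()


def search(conn, query):
    """Return the paths of documents whose text matches the FTS5 query."""
    cursor = conn.execute(
        "SELECT path FROM pages_fts WHERE pages_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in cursor]


results = search(conn, "budget")
```

Marking `path` as `UNINDEXED` keeps file names out of the index, so a query only matches words found in the extracted text.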

Weeknotes: git-history, created for a Git scraping workshop

My main project this week was a 90-minute workshop I delivered about Git scraping at Coda.Br 2021, a Brazilian data journalism conference, on Friday. This inspired the creation of a brand new tool, git-history, plus smaller improvements to a range of other projects.

[... 1239 words]

Weeknotes: SpatiaLite 5, Datasette on Azure, more CDC vaccination history

This week I got SpatiaLite 5 working in the Datasette Docker image, improved the CDC vaccination history git scraper, figured out Datasette on Azure and we closed on a new home!

[... 986 words]

Using AWS as my cloud, what is left for me to work on? Is it enough for me to just write the HTML+CSS code and programming language code (Python)? Or do I still have to work with MySQL and backend stuff? I am pretty new at programming, so I hope it i...

Using a cloud server platform like Amazon EC2 unfortunately will not protect you from needing to understand basic server administration. It’s not that different from running your own physical server, except that if you screw up the configuration it’s much easier to throw everything away and start from scratch.

[... 134 words]

What kind of website can be run on AWS for 10, 100, 1 thousand, 10 thousand, 100 thousand, 1 million dollars per month?

“But is there a simple way to say that for $10 per month you can run a website on AWS that has X unique users and Y data transfer...”

[... 166 words]

For a Django application, deployed on Heroku, what are my options for storing user-uploaded media files?

S3 is really a no-brainer for this: it’s extremely inexpensive, very easy to integrate with and unbelievably reliable. It’s so cheap that it will be practically free for testing purposes (expect to spend pennies a month on it).

[... 88 words]