Which is the best open source tool to populate my database with test data for my load test?
11th February 2012
My answer to Which is the best open source tool to populate my database with test data for my load test? on Quora
I’ve seen tools that do this, but to be honest it’s very simple to write your own script for this (especially if you’re using an ORM). The other benefit to writing your own script for this is that you’ll have a much better chance of accurately representing your expected data, sizes etc.
A couple of techniques that are pretty useful: Build up lists of common first names and last names, then generate user names by picking a random first name and a random last name. Build a utility function that generates 6 letter random strings, then generate email addresses as random-6-letter-string@random-domain. For relationships, one technique is to populate one table, then pull all of the primary keys out in to a list and pick them at random from that list when creating other records. You might want to bias that selection towards some records to get more of a realistic bell-curve rather than a purely random selection.
There are libraries that can help with this (e.g. built-in routines for generating fake email addresses etc). If you’re using Ruby, http://faker.rubyforge.org/ is worth a look (a port of Data::Faker from Perl). There’s a Python port here: https://github.com/threadsafelab...
More recent articles
- Gemini 2.0 Flash: An outstanding multi-modal LLM with a sci-fi streaming mode - 11th December 2024
- ChatGPT Canvas can make API requests now, but it's complicated - 10th December 2024
- I can now run a GPT-4 class model on my laptop - 9th December 2024