<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: design-thinking</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/design-thinking.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2019-09-20T20:13:01+00:00</updated><author><name>Simon Willison</name></author><entry><title>Weeknotes: Design thinking for journalists, genome-to-sqlite, datasette-atom</title><link href="https://simonwillison.net/2019/Sep/20/weeknotes-design-thinking-genome-sqlite/#atom-tag" rel="alternate"/><published>2019-09-20T20:13:01+00:00</published><updated>2019-09-20T20:13:01+00:00</updated><id>https://simonwillison.net/2019/Sep/20/weeknotes-design-thinking-genome-sqlite/#atom-tag</id><summary type="html">
    &lt;p&gt;I haven’t had much time for code this week: we’ve had a full five-day workshop at JSK with &lt;a href="https://twitter.com/tranosaurus"&gt;Tran Ha&lt;/a&gt; (a JSK alum) learning how to apply &lt;a href="https://www.ideou.com/pages/design-thinking"&gt;Design Thinking&lt;/a&gt; to our fellowship projects and, more generally, to challenges facing journalism.&lt;/p&gt;
&lt;p&gt;I’ve used aspects of design thinking in building software products, but I’d never really thought about how it could be applied outside of digital product design. It’s been really interesting - especially seeing the other fellows (who, unlike me, are generally not planning to build software during their fellowship) start to apply it to a much wider and more interesting range of problems.&lt;/p&gt;
&lt;p&gt;I’ve been commuting in to Stanford on the Caltrain, which did give me a bit of time to work on some code.&lt;/p&gt;
&lt;h3 id="genome-to-sqlite"&gt;genome-to-sqlite&lt;/h3&gt;
&lt;p&gt;I’m continuing to build out a &lt;a href="https://github.com/dogsheep"&gt;family of tools&lt;/a&gt; for personal analytics, where my principal goal is to reclaim the data that various internet companies have collected about me and pull it into a local SQLite database so I can analyze, visualize and generally have fun with it.&lt;/p&gt;
&lt;p&gt;A few years ago I shared my DNA with &lt;a href="https://www.23andme.com/"&gt;23andMe&lt;/a&gt;. I don’t think I’d make the decision to do that today: it’s incredibly personal data, and the horror stories about people making unpleasant discoveries about their family trees keep on building. But since I’ve done it, I decided to see if I could extract some of that data…&lt;/p&gt;
&lt;p&gt;… and it turns out they let you &lt;a href="https://you.23andme.com/tools/data/download/"&gt;download your entire genome&lt;/a&gt;! You can export it as a zipped up TSV file - mine decompresses to 15MB of data (which feels a little small - I know little about genetics, but I’m presuming that’s because the genome they record and share is just the interesting known genetic markers, not the entire DNA sequence - UPDATE: &lt;a href="https://customercare.23andme.com/hc/en-us/articles/202904600-Difference-Between-DNA-Genotyping-Sequencing"&gt;confirmed&lt;/a&gt;, thanks &lt;a href="https://twitter.com/laurencerowe/status/1175151249567576064"&gt;@laurencerowe&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;So I wrote a quick utility, &lt;a href="https://github.com/dogsheep/genome-to-sqlite"&gt;genome-to-sqlite&lt;/a&gt;, which loads the TSV file (directly from the zip or a file you’ve already extracted) and writes it to a simple SQLite table. Load it into Datasette and you can even facet by chromosome, which is exciting!&lt;/p&gt;
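&lt;p&gt;The load-then-facet idea can be sketched with nothing but the Python standard library. This is a rough illustration, not the code inside genome-to-sqlite: the &lt;code&gt;load_genome&lt;/code&gt; and &lt;code&gt;facet_by_chromosome&lt;/code&gt; helpers, the &lt;code&gt;genome&lt;/code&gt; table name, its four columns and the sample rows are all assumptions made for the example.&lt;/p&gt;

```python
import csv
import io
import sqlite3

# Tiny made-up sample in the layout assumed for the export file:
# comment lines start with "#", then tab-separated rsid, chromosome,
# position and genotype values.
SAMPLE_TSV = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\tCT\n"
    "rs0000003\tX\t3000\tGG\n"
)


def load_genome(db, tsv_file):
    """Copy every non-comment row of tsv_file into a 'genome' table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS genome ("
        "rsid TEXT PRIMARY KEY, chromosome TEXT, "
        "position INTEGER, genotype TEXT)"
    )
    # Skip the comment header, then let csv split the tab-separated fields.
    data_lines = (line for line in tsv_file if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    db.executemany("INSERT OR REPLACE INTO genome VALUES (?, ?, ?, ?)", rows)
    db.commit()


def facet_by_chromosome(db):
    """Return (chromosome, row count) pairs, the kind of count a facet shows."""
    return db.execute(
        "SELECT chromosome, count(*) FROM genome "
        "GROUP BY chromosome ORDER BY chromosome"
    ).fetchall()


db = sqlite3.connect(":memory:")
load_genome(db, io.StringIO(SAMPLE_TSV))
print(facet_by_chromosome(db))
```

&lt;p&gt;With a real export you would pass an open file handle (or one opened from the zip with the &lt;code&gt;zipfile&lt;/code&gt; module) in place of the &lt;code&gt;io.StringIO&lt;/code&gt; sample.&lt;/p&gt;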
&lt;p&gt;This is where my knowledge runs out. I’m confident someone with more insight than me could construct some interesting SQL queries against this - maybe one that determines if you are &lt;a href="https://www.snpedia.com/index.php/Rs1805007"&gt;likely to have red hair&lt;/a&gt;? - so I’m hoping someone will step in and provide a few examples.&lt;/p&gt;
&lt;p&gt;I filed &lt;a href="https://github.com/dogsheep/genome-to-sqlite/issues/1"&gt;a help wanted issue&lt;/a&gt; on GitHub. I also put &lt;a href="https://twitter.com/simonw/status/1174712746157326336"&gt;a request out on Twitter&lt;/a&gt; for an UPDATE statement that could turn me into a dinosaur.&lt;/p&gt;
&lt;h3&gt;datasette-atom&lt;/h3&gt;
&lt;p&gt;This is very much a work-in-progress right now: &lt;a href="https://github.com/simonw/datasette-atom"&gt;datasette-atom&lt;/a&gt; will be a Datasette plugin that adds &lt;code&gt;.atom&lt;/code&gt; as an output format (using the &lt;a href="https://datasette.readthedocs.io/en/stable/plugins.html#register-output-renderer-datasette"&gt;register_output_renderer plugin hook&lt;/a&gt; contributed &lt;a href="https://github.com/simonw/datasette/pull/441"&gt;by Russ Garrett&lt;/a&gt; a few months ago).&lt;/p&gt;
&lt;p&gt;The aim is to allow people to subscribe to the output of a query in their feed reader (and potentially through that via email and other mechanisms) - particularly important for databases which are being updated over time.&lt;/p&gt;
&lt;p&gt;It’s a slightly tricky plugin to design because valid Atom feed entries require a globally unique ID, a title and an “updated” date - and not all SQL queries produce obvious candidates for these values. As such, I’m going to &lt;a href="https://github.com/simonw/datasette-atom/issues/2"&gt;have the plugin prompt the user&lt;/a&gt; for those fields and then persist them in the feed URL that you subscribe to.&lt;/p&gt;
&lt;p&gt;This also means you won’t be able to generate an Atom feed for a query that doesn’t return at least one datetime column. I think I’m OK with that.&lt;/p&gt;
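&lt;p&gt;To make the constraint concrete, here is a minimal sketch of turning one query row into an Atom entry and refusing rows that lack an ID, title or updated value. It is not the plugin’s implementation: the &lt;code&gt;row_to_entry&lt;/code&gt; helper and the &lt;code&gt;atom_id&lt;/code&gt;, &lt;code&gt;atom_title&lt;/code&gt; and &lt;code&gt;atom_updated&lt;/code&gt; column names are hypothetical, standing in for whatever the user picks.&lt;/p&gt;

```python
import sqlite3
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def row_to_entry(row, id_column, title_column, updated_column):
    """Build an Atom entry element from one row (hypothetical helper).

    The three column names stand in for the choices the user would make;
    a row lacking any of them cannot become a valid entry.
    """
    entry = ET.Element("{%s}entry" % ATOM_NS)
    for tag, column in (
        ("id", id_column),
        ("title", title_column),
        ("updated", updated_column),
    ):
        if column not in row.keys() or row[column] is None:
            raise ValueError("query must return a value for " + column)
        ET.SubElement(entry, "{%s}%s" % (ATOM_NS, tag)).text = str(row[column])
    return entry


db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # lets us look up values by column name
row = db.execute(
    "SELECT 'tag:example.com,2019:1' AS atom_id, "
    "'First item' AS atom_title, "
    "'2019-09-20T20:13:01+00:00' AS atom_updated"
).fetchone()

# Serialize with Atom as the default namespace rather than an ns0 prefix.
ET.register_namespace("", ATOM_NS)
entry = row_to_entry(row, "atom_id", "atom_title", "atom_updated")
print(ET.tostring(entry, encoding="unicode"))
```

&lt;p&gt;Passing a column name that the query does not return raises &lt;code&gt;ValueError&lt;/code&gt;, which mirrors the idea that a query without a usable date column cannot produce a feed.&lt;/p&gt;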
&lt;h3&gt;github-to-sqlite&lt;/h3&gt;
&lt;p&gt;I &lt;a href="https://github.com/dogsheep/github-to-sqlite/releases/tag/0.4"&gt;released one new feature&lt;/a&gt; for &lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;github-to-sqlite&lt;/a&gt; this week: the &lt;code&gt;github-to-sqlite repos github.db&lt;/code&gt; command, which populates a database table of all of the repositories available to the authenticated user. Or use &lt;code&gt;github-to-sqlite repos github.db dogsheep&lt;/code&gt; to pull the repos owned by a specific user or organization.&lt;/p&gt;
&lt;p&gt;The command configures a SQLite full-text search index against the repo titles and descriptions, so if you have a lot of GitHub repos (I somehow have nearly 300!) you can search through them and use Datasette to facet them against different properties.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;github-to-sqlite&lt;/code&gt; currently has two other useful subcommands: &lt;code&gt;starred&lt;/code&gt; fetches details of every repository a user has starred, and &lt;code&gt;issues&lt;/code&gt; pulls details of the issues (but sadly not yet their comment threads) attached to a repository.&lt;/p&gt;
&lt;h3&gt;Books&lt;/h3&gt;
&lt;p&gt;I’m trying to spend more time reading books - so I’m going to start including book stuff in my weeknotes in the hope of keeping myself on track.&lt;/p&gt;
&lt;p&gt;I acquired two new books this week:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://abookapart.com/products/just-enough-research"&gt;Just Enough Research&lt;/a&gt; by Erika Hall (recommended by Tom Coates and Tran Ha), because I need to spend the next few months interviewing as many journalists (and other project stakeholders) as possible to ensure I am solving the right problems for them.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://producingoss.com/"&gt;Producing Open Source Software&lt;/a&gt; by Karl Fogel, because my &lt;a href="https://simonwillison.net/2019/Sep/10/jsk-fellowship/"&gt;fellowship goal&lt;/a&gt; is to build a thriving open source ecosystem around tooling for data journalism and this book looks like it covers a lot of the topics I need to really do a good job of that.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Next step: actually read them! Hopefully I’ll have some notes to share in next week’s update.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/genetics"&gt;genetics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/reading"&gt;reading&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsk"&gt;jsk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-thinking"&gt;design-thinking&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="genetics"/><category term="projects"/><category term="reading"/><category term="sqlite"/><category term="datasette"/><category term="jsk"/><category term="weeknotes"/><category term="design-thinking"/></entry></feed>