Simon Willison’s Weblog

The Google Browser

Anil Dash suggests Google should start sponsoring the Mozilla project, and use it as a basis for releasing their own browser. He makes a very good case:

Firebird is, finally, a usable browser, and damn close to being the best in the world, if it isn’t already. Google’s shown the ability to get an installable client onto millions of desktops around the world. And they have a user experience focus that would nicely shore up the critical weakness that’s dogged Mozilla from day one. If the goal is now organizing and presenting information instead of just being the best search engine, then a browser client focused on information retrieval, search, and management is a great first step. And I’d give them better than even odds at being able to grow that application into a full microcontent client if they were so inclined.

The Google toolbar is a runaway success, but could a Google browser be nearly as popular? Google seem to be in the ideal position to launch a browser: they are one of the most popular and trusted brands on the internet, and have a reputation for usability which fits brilliantly with the focus of the Firebird browser. Mozilla advocates such as myself have long bemoaned the fact that far better browsers exist which the IE-using public are completely unaware of. Google have the marketing coverage and the influence to help them discover the alternatives.

What’s in it for Google? Anil suggests built-in hooks to Google’s services and APIs, evolving into a fully fledged microcontent client. I think the biggest advantage is the huge boost a well-promoted alternative browser would give to the overall health of the internet. Without competition to drive it forward IE has stagnated, and the web has stopped moving forward. Introduce “Firebird, Google edition” to the mix and things suddenly get interesting again.

Oh, and just think of the cool things Google could do with XUL.

Further Thoughts (updated 11:27am)

With Microsoft’s recent announcements that they plan to compete seriously with Google in the search market, this idea becomes even more relevant. Microsoft have a history of using their dominance in one area to win market share in another (they are after all a convicted monopoly). If they’re planning a big push on Microsoft Search you can bet they’ll use Internet Explorer to help them get it; it already defaults to searching MSN if you enter words straight into the location bar. If IE retains its market dominance, Google will be competing on Microsoft’s turf, and MS don’t have a very good history of playing fair. With their own cross-platform browser, Google will be in a far stronger tactical position.

This is The Google Browser by Simon Willison, posted on 17th July 2003.

25 comments

  1. On the other hand, Google aren't very keen on standards. The AdSense code they require people to put into their pages would make a whole load of pages invalid. Their homepage doesn't even validate. They aren't a member of the W3C (why? Even Microsoft are!), and they don't take advantage of semantic elements very well.

    I've said it before, and I'll say it again: Google are in a unique position to promote well-structured semantic markup to the masses, as "Search Engine Optimization" is a large market, where people are willing to change their whole coding practices based on a few hints here and there from the big search engines. From what I see, Google would benefit from a cleaner web too, but they just seem complacent about the issue. This doesn't really sit well with the Mozilla standards aspect (although, their recent redesign of mozilla.org didn't validate at first, dunno if it's still broken).

    PS: Simon, I know you validate comments, so wouldn't it be easy to validate your own input when posting as well? ;)

    Jim - 17th July 2003 11:52 - #

  2. Good points about their lax approach to standards; maybe they should hire Eric Meyer and get some evangelising going.

    I used to validate my posts with a client-side JavaScript XML parser, but it broke when I switched to application/xhtml+xml. I should really switch the input form around to check for XML validity before adding things to the database. Normally I can fix validation errors manually within a few seconds of posting something but today I'm on a 'net connection with a 10 minute lag on some page loads :/

    Simon Willison - 17th July 2003 12:05 - #
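The server-side check described in the comment above (test for XML validity before anything reaches the database) could look roughly like the sketch below. It is not the weblog's actual code: the function name `is_well_formed_fragment` and the `<div>` wrapper are assumptions made for illustration, and it only tests well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a dummy root.

    Hypothetical helper: a post or comment is usually a fragment (several
    paragraphs, bare text), so it is wrapped in a single root element first.
    """
    wrapped = f"<div>{fragment}</div>"
    try:
        ET.fromstring(wrapped)
    except ET.ParseError:
        # Unclosed tags, unquoted attributes, undefined entities, etc.
        return False
    return True
```

A form handler would call this before the insert and show the text back to the author if it returns False. Note that named HTML entities such as `&nbsp;` are rejected by this parser because no DTD is loaded, so a real implementation would need to decide how to handle them.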

  3. The biggest boost ever for 'web standards' would be if Google gave an extra PageRank point or two to pages using valid markup. Overnight, the commercial web would start taking the issue seriously.

    Of course, it's not Google's responsibility and they'd get little or nothing from the work they'd have to put in, but we can dream.

    Matt Round - 17th July 2003 14:06 - #

  4. What do Google gain from encouraging semantic well-formed markup? As far as I can see, doing this would just make it easier for their competitors to parse and analyse webpages and the data in them, a problem google is making lots of money from having solved already.

    Their homepage doesn't validate because doing so would make it heavier, without any additional benefits in browser support or forward compatibility. The page is served millions of times a day, they measure changes to it in bytes. Just adding quotes to attributes would use more bandwidth in a day than most blogs use in a year.

    People seem to forget Google are a business. They aren't going to do anything unless it provides some kind of benefit to them.

    somemonkey - 17th July 2003 14:22 - #

  5. Grabbing knowledge from the web is by no means a solved problem, Google are not massively ahead of their competitors in this regard. Yes, it would benefit their competitors, but it would also benefit them. Also, by planning it out ahead of time, they get a head start on everyone else.

    Valid HTML alone would help them out. How much of their cluster is dedicated to parsing all that tag soup? A regular syntax is more efficient when applied to many documents at once, you get an economy of scale, of sorts.

    I don't believe for one second that the reason Google aren't serving valid HTML for their own site is down to bandwidth. Look at their source code. You have things like <body bgcolor=#ffffff ...> and <style> elements on every page. In no way is that the mark of technical people who want to save on bandwidth. Factor that crud out into a stylesheet, and make the stylesheet very cachable. Bingo, you've just saved quite a bit of bandwidth, at the expense of retrieving an extra object on the first pageview (which, with the popularity of Google and shared caches, they will hardly feel at all).

    Due to the intrinsically dynamic nature of their pageviews, embedding style into their pages directly is a completely boneheaded move for them to make if they are concerned about their bandwidth usage. I'm assuming that they aren't boneheaded, so I can only conclude that they aren't bothered about bandwidth. And if they measure out every byte, why so many linefeeds?

    Jim - 17th July 2003 15:36 - #

  6. If Google releases a Google browser based on Mozilla, they will be in a very competitive position business-wise, and they have the Mozilla community doing the work to make the browser better each day. Meaning, as Microsoft stops giving away IE, Google is providing everyone with a browser that does everything as they would want it. Regarding Google's tag soup, perhaps their system makes it difficult for them to clean markup output, making it easier and faster for them to just add more code and styles, but we all know that such a solution only provides more problems in the long run.

    markku seguerra - 17th July 2003 17:13 - #

  7. I agree that there is a competitive advantage in having a browser that lots of people use. But how will they use it? Netscape originally had something that fed back your surfing habits to them (smartsurf?). Whilst this could be an interesting tactic for Google to use (for instance, what about that distributed crawler project that got some publicity about six months ago?), there have been privacy concerns raised about Google before, and it might just spark the row off all over again.

    Another thought: have Google been planning this already? Their web services API certainly lends itself to "smart client" uses (and the XML cuts down on bandwidth compared with the HTML!). What data can they use from clients, and how can they provide a better experience to surfers than they do at the moment with HTML?

    Jim - 17th July 2003 17:58 - #

  8. Some code from Google's homepage:

    <style><!-- body,td,a,p,.h{font-family:arial,sans-serif;} .h{font-size: 20px;} .q{text-decoration:none; color:#0000cc;} //--> </style> <body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onLoad=sf()>

    (the code above would be a lot clearer if I could use <br> or <pre> tags)

    If we moved the font and background colours into a stylesheet, we'd have:

    <style><!-- body,td,a,p,.h{font-family:arial,sans-serif;background:#ffffff;color:#000000} .h{font-size: 20px;} .q{text-decoration:none; color:#0000cc;} a{color:#0000cc} a:active{color:#ff0000} a:visited{color:#551a8b} //--> </style> <body onLoad=sf()>

    The first is 229 bytes, the second 261. The first also colors the text in every browser it's possible to color text in, the second doesn't.

    If we then move the stylesheet into a separate file, then the theory goes that the one file can be cached more easily, at the expense of adding another 20 or 30 bytes to the HTML to specify the URL and MIME type. This would be a worthwhile change, except the stylesheet is only applicable to one page, and the styles change as often as the page does. Even if they don't change that often, a 304 response (not changed since last request) which the server or web cache sends out will weigh as much, if not more, than the 211 byte stylesheet. So all we've achieved is forcing the browser to request 3 objects from the server rather than 2. When the files you're dealing with weigh such a small amount, the size of the HTTP headers and the time taken to create a connection, and a web server process to serve it, start to have a big effect on download speeds.

    Premature optimization is a bad thing. Before you make an optimization, you should work out where the bottlenecks are, and how to solve them. You then need to benchmark your changes and ensure they actually improve performance. The idea of optimizing a page by using css, and putting the css into a separate cacheable stylesheet is a good one. It makes sense in almost all situations. Google's homepage isn't one of them.

    Having said that, yes, they can lose 6 line breaks, two or three spaces and the // at the end of the style tag for a combined saving of 10 bytes. Maybe the Google engineers laughing at us having this discussion will get round to removing them at some point in the future?

    somemonkey - 17th July 2003 18:29 - #
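For readers unfamiliar with the revalidation step being argued about, here is a minimal sketch of how a server might answer a conditional request for a stylesheet. HTTP's "not changed since last request" status is 304 Not Modified. The function name `serve_stylesheet`, the `STYLESHEET` contents, and the timestamps are all hypothetical; nothing here describes how Google's servers actually behave.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple

# Hypothetical stylesheet body, for illustration only.
STYLESHEET = b"body,td,a,p,.h{font-family:arial,sans-serif}"


def serve_stylesheet(last_modified: datetime,
                     if_modified_since: Optional[str]) -> Tuple[int, bytes]:
    """Return (status, body) for a request that may carry If-Modified-Since."""
    if if_modified_since is not None:
        try:
            client_time = parsedate_to_datetime(if_modified_since)
            if last_modified <= client_time:
                # The client's copy is current: headers only, no body.
                return 304, b""
        except (TypeError, ValueError):
            # Unparseable or timezone-less date: fall through to a full reply.
            pass
    return 200, STYLESHEET


# Example: a cache revalidating with the exact timestamp it was given earlier.
modified = datetime(2003, 7, 17, 12, 0, tzinfo=timezone.utc)
status, body = serve_stylesheet(modified, format_datetime(modified, usegmt=True))
```

The 304 reply carries no body, but it still costs a round trip plus response headers, which is the overhead the comment above is weighing against the size of the stylesheet itself.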

  9. If we moved the font and background colours into a stylesheet...

    You mean into a <style> element. Moving them into a stylesheet is what I was focussing on.

    The first also colors the text in every browser it's possible to color text in, the second doesn't.

    An acceptable loss: Google has never been about a snazzy look, and a couple of browsers getting their default colours (which are likely the same anyway) is hardly a catastrophe.

    ...the stylesheet is only applicable to one page, and the styles change as often as the page does

    Not true at all. The styles hardly ever change; the HTML, on the other hand, is constantly changing. If you mean that the styles are not consistent between pages, I would also say this is incorrect. The vast majority of the pageviews will be search results and hits to the homepage - two pages only. The styles will be consistent not only for numerous searches per user, but consistent across all users. This is why factoring out the styling so that it can be shared independently is effective.

    Even if they don't change that often, a 304 response (not changed since last request) which the server or web cache sends out will weigh as much, if not more, than the 211 byte stylesheet.

    Who cares about 304s? They only come into play when you have to validate your cached object. There's no need to do so on anywhere near a regular basis given proper caching instructions, and even when you do, a single public cache can validate a cached stylesheet for numerous individuals.

    So all we've achieved is forcing the browser to request 3 objects from the server rather than 2.

    What we've achieved:

    • When a user requests a page, they get a smaller HTML document.
    • If the user has visited Google any time recently, the end.
    • If anybody using their shared cache has visited Google any time recently, they will retrieve the stylesheet from that cache.
    • When the stylesheet eventually expires, shared caches have to perform a single revalidation (answered with a 304) to refresh their copy. They can continue serving that copy to many users, for many weeks, until it expires again.
    • In the rare cases where shared caches are not used, and the user hasn't got a cached copy, Google has to actually send out its own copy, resulting in an equivalent transfer size.

    So really, the worst-case scenario is rare, and is about as bad as what is happening right now. I mean, how many people actually don't use their ISP's cache right now? How often would the Google stylesheet actually change? How often would it actually be chucked away by public caches for not being popular enough?

    Premature optimization is a bad thing.

    Oh, I completely agree with this. But static styles and dynamic content, and the effect on bandwidth is a well-understood problem, and the Google website doesn't undergo changes to style very often.

    Maybe the google engineers laughing at us having this discussion will get round to removing them at some point in the future?

    Probably. My position is that they aren't too fussed about outgoing bandwidth (surely they are swamped with incoming data?).

    Jim - 17th July 2003 19:13 - #
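The disagreement in comments 8 and 9 comes down to arithmetic, so a rough model may help. The sketch below compares the average bytes the origin sends per pageview with inline styles against an external, cacheable stylesheet. The 211-byte stylesheet and the roughly 30-byte link come from the thread; the 300-byte header overhead, the 3000-byte page size, and every function name are assumptions made for illustration. The model also ignores 304 revalidation traffic, which would add roughly the header overhead for each revalidation.

```python
def bytes_inline(html_bytes: int, style_bytes: int) -> float:
    """Bytes sent per pageview when the styles are embedded in every page."""
    return float(html_bytes + style_bytes)


def bytes_external(html_bytes: int, style_bytes: int, link_bytes: int,
                   overhead_bytes: int, miss_rate: float) -> float:
    """Average bytes sent per pageview with a separate stylesheet.

    miss_rate is the fraction of pageviews where no browser or shared cache
    has a usable copy, so the origin must send the stylesheet plus headers.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return html_bytes + link_bytes + miss_rate * (style_bytes + overhead_bytes)


def break_even_miss_rate(style_bytes: int, link_bytes: int,
                         overhead_bytes: int) -> float:
    """Miss rate at which both approaches send the same number of bytes."""
    return (style_bytes - link_bytes) / (style_bytes + overhead_bytes)


STYLE_BYTES = 211      # stylesheet size quoted in the thread
LINK_BYTES = 30        # upper end of the "20 or 30 bytes" quoted in the thread
OVERHEAD_BYTES = 300   # assumed response-header size, not from the thread
HTML_BYTES = 3000      # hypothetical page size; it cancels out of the comparison

rate = break_even_miss_rate(STYLE_BYTES, LINK_BYTES, OVERHEAD_BYTES)
```

Under these assumed numbers the external stylesheet sends fewer bytes whenever fewer than about 35% of pageviews need a fresh copy from the origin. Which side of that line Google's traffic falls on is exactly what the two commenters disagree about, and the model says nothing about connection latency, which is the other half of the objection in comment 8.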

  10. It's not Google that needs to fund Mozilla: It's Microsoft. What with the Massachusetts lawsuit, why, it'd be a perfect way of undercutting the argument against them.

    (yes, that's a joke)

    Adam Rice - 17th July 2003 20:52 - #

  11. Responding to Matt Round's comment:

    The biggest boost ever for 'web standards' would be if Google gave an extra PageRank point or two to pages using valid markup. Overnight, the commercial web would start taking the issue seriously.

    In a way, Google already does this; validity doesn't score you any points, but IIRC organized, structural markup does (i.e., delineate sections with real headings, etc.).

    By the way, Simon, your comment validator needs some work; blockquote is essentially unusable at the moment (i.e., I can't emphasize text within a blockquote, and for some reason it was throwing "raw character data not allowed in tag blockquote" on perfectly ordinary text I copied and pasted).

    James - 18th July 2003 01:12 - #

  12. I don't think this would be a good thing at all. As Tim Berners Lee has pointed out, one organisation having a monopoly on a particular "layer" of Internet infrastructure is ok. It's when a company starts to take a vertical slice through the layers that things become worrying and choice is diminished. See: Microsoft.

    Foobar - 18th July 2003 10:32 - #

  13. Perhaps Google contributing in sponsoring the Mozilla project would not be too bad an idea. Albeit it would depend upon what "form" the sponsorship resembled as to whether it would be beneficial to the ethos of the Mozilla Foundation, open source philosophy and the web.

    Robert Wellock - 18th July 2003 14:05 - #

  14. You could also save some space by using three-digit RGB notation for the CSS colors (i.e., #00c; instead of #0000cc;). And, while it's true that valid markup could add to file size in some cases (adding quotes, etc.), it's possible that could be offset by the time the browser doesn't have to spend figuring out how to render the spaghetti code.

    kirkaracha - 19th July 2003 00:46 - #
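The three-digit shorthand suggested above only applies when each of the three colour pairs repeats a digit. A small sketch of that rule follows; the name `shorten_hex_colors` is made up for illustration, and the function should only be run over CSS text, since the shorthand is a CSS notation and is not a safe substitute inside HTML attributes such as `bgcolor`.

```python
import re

# Six hex digits where each pair is a doubled digit, e.g. #0000cc or #ffffff.
# The lookahead avoids touching longer hex runs.
_DOUBLED_HEX = re.compile(
    r"#([0-9a-fA-F])\1([0-9a-fA-F])\2([0-9a-fA-F])\3(?![0-9a-fA-F])"
)


def shorten_hex_colors(css: str) -> str:
    """Rewrite #rrggbb as #rgb wherever the shorthand is exactly equivalent."""
    return _DOUBLED_HEX.sub(r"#\1\2\3", css)


shortened = shorten_hex_colors(".q{text-decoration:none; color:#0000cc;}")
```

Each colour shortened this way saves three bytes. Colours such as #551a8b, the visited-link colour in the homepage snippet, have no three-digit equivalent and are left alone.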

  15. Why does the google homepage have the google logo inside a one row one cell table?

    <table border=0 cellspacing=0 cellpadding=0><tr><td><img ...></td></tr></table>

    That's 70 bytes they can save right there.

    eric scheid - 19th July 2003 04:30 - #
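The 70-byte figure in the comment above can be checked by counting the wrapper markup around the image, assuming one byte per character and no whitespace between the tags. The variable names below are arbitrary.

```python
# The markup that surrounds the <img ...> tag in the quoted snippet.
opening = "<table border=0 cellspacing=0 cellpadding=0><tr><td>"
closing = "</td></tr></table>"

# All characters are ASCII, so one character is one byte on the wire.
wrapper_bytes = len((opening + closing).encode("ascii"))
print(wrapper_bytes)  # prints 70
```

So removing the single-cell table saves 70 bytes per homepage view, before any compression is taken into account.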

  16. While you're waiting for Google to maybe embrace Mozilla or Firebird, try the Googlebar extension (http://googlebar.mozdev.org).

    Neil Parks - 22nd July 2003 15:03 - #

  17. Now, everyone go along and email the google team suggesting all these cool changes to suggestions@google.com :)

    Kevin - 23rd July 2003 17:30 - #

  18. I made a bunch of changes to make Google's home page smaller, including removing the table around the logo: smallergoogle.

    Jesse Ruderman - 8th September 2003 07:25 - #

  19. Bug 226572 - Google branded Mozilla browser

    Robin - 24th November 2003 09:49 - #

  20. Come on, everyone! The only reason relevant and long-standing websites have been bumped from the high spots is because Google wants to raise the share value prior to floating the company. If your living relies on Google searches, and you disappear from the search, your only option is to pay for AdWords (Pay Per Click). This is a self-generating fraudulent money maker. The more people who are forced to do this, the more they will have to bid against each other to ever be seen on AdWords. They deny this? Then let them show that there has not been a (dramatic) increase in profits from this PPC scam. Everybody is in panic. If there were any solidarity people would switch search engines. That would change their way of thinking and give you back control of your own business. Google can say whatever they want but the point is, if your site has all the relevant info for a given subject, has no faults in its construction and, above all, still appears in the same position in the minor engines, what other reason could there be? Can someone tell me?

    Murphy - 16th February 2004 15:52 - #

  21. It seems to me that if that many companies were really getting screwed-over by Google, this would have become a larger story than it is right now. What I think is that a few people are bugging out because other pages have become higher ranked than they are, due to changes in their content, so they automatically jump the gun, and assume that Google is automagically screwing them specifically over.

    Carson - 22nd May 2004 21:00 - #

  22. I enjoyed your blogs, but you have a tendency to misuse some grammar. The prime examples are your use of company names in a plural context. Comments such as "Google have the marketing coverage ..." are horribly wrong. Any company, although made up of many people, is a singular expression. It should be "Google has the marketing coverage " or another example from your Google blog is "and MS don't have a very...". That should be "and MS doesn't have a very..." You read like you put great thought and research into your comments, but then it gets overshadowed a bit by the poor use of grammar. John Dvorak has a comment that AOL is "dumbing" down the Internet. In my opinion, failure to adhere to the proper use of our language for the sake of expediency is equally damaging, not only to the Internet, but to our culture and society as well. And I didn't do well in English in high school. Go figure.

    Bryan - 31st July 2004 17:50 - #

  23. (Totally off-topic) Actually, Bryan, using plural verbs for collective nouns (such as "company", or more specifically "Google" and "Microsoft") is pretty common usage in British English.

    Jan! - 2nd August 2004 12:03 - #

  24. Check out my extension for Firefox that uses the Google API to open search results in multiple tabs: http://lookahead.mozdev.org/installation.html

    Rintoul - 21st September 2004 19:20 - #

  25. Who needs GBrowser when you can have Gsuite, or, better, Google OS! Eye-watch the future at http://grooan.com/futurefeeds/index.php?p=14 (Google Suite) or http://grooan.com/futurefeeds/index.php?p=15 (Google OS)

    Luca - 22nd September 2004 07:38 - #

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2003/07/17/theGoogleBrowser