Simon Willison’s Weblog

getElementsBySelector()

Inspired by Andy, I decided to have a crack at something I’ve been thinking about trying for a long time. document.getElementsBySelector is a JavaScript function which takes a standard CSS-style selector and returns an array of element objects from the document that match that selector. For example:

document.getElementsBySelector('div#main p a.external')

This will return an array containing all of the links that have 'external' in their class attribute and are contained inside a paragraph which is itself contained inside a div with its id attribute set to 'main'.
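
As a rough illustration of how such a selector breaks down, here is a minimal sketch that splits a selector into one descriptor per space-separated token. The parseSelector name and the shape of the returned objects are assumptions made for this example; they are not taken from the original script. Like the function as described in the comments below, it only understands tag, #id and .class tokens joined by the descendant combinator, and it does not handle a token that combines an id and a class.

```javascript
// Hypothetical helper, not the original script: turn a selector such as
// 'div#main p a.external' into one descriptor per space-separated token.
function parseSelector(selector) {
  return selector.split(/\s+/).filter(Boolean).map(function (token) {
    var part = { tag: null, id: null, className: null };
    var hash = token.indexOf('#');
    var dot = token.indexOf('.');
    if (hash > -1) {
      // tag#id or #id
      part.tag = token.slice(0, hash) || null;
      part.id = token.slice(hash + 1);
    } else if (dot > -1) {
      // tag.class or .class
      part.tag = token.slice(0, dot) || null;
      part.className = token.slice(dot + 1);
    } else {
      // bare tag name
      part.tag = token;
    }
    return part;
  });
}

var parts = parseSelector('div#main p a.external');
console.log(parts[0]); // { tag: 'div', id: 'main', className: null }
console.log(parts[1]); // { tag: 'p', id: null, className: null }
console.log(parts[2]); // { tag: 'a', id: null, className: 'external' }
```

Each descriptor would then narrow the set of candidate elements in turn, starting from the whole document.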

So far I’ve only tested it on Phoenix but it seems to work as intended for the small number of test cases I’ve tried. If you spot any bugs please let me know. I’m about to fire up a Windows PC and see how much it breaks in IE...

Update: I’ve put together a demo page showing the function in action. It works fine in IE 6.

This is getElementsBySelector() by Simon Willison, posted on 25th March 2003.

25 comments

  1. Works fine in Mozilla 1.3 as well. As I suspect you already knew, it chokes and dies in Netscape 4.x, due to a lack of getElementsByTagName. *shrug* I had it open anyway.

    gilmae - 25th March 2003 22:10 - #

  2. Grr... I was gonna do that! Oh well, never mind. Well done.

    Andrew Hayward - 26th March 2003 01:19 - #

  3. Did it fail gracefully or noisily? I've updated the script so that it quietly returns an empty array if document.getElementsByTagName doesn't exist.

    Simon Willison - 26th March 2003 01:21 - #

  4. When I tried it in n4.x earlier it did nothing except inform me of a javascript error (the 'Type javascript:' message in the status bar). Now it pops an empty alert dialog, which seems to indicate an empty array being returned.

    gilmae - 26th March 2003 06:57 - #

  5. Further testing: The test page works perfectly in Opera 7, and almost works perfectly in Opera 6.11 and Konqueror on Linux. With a bit of poking around I'm sure I could get it to work on those browsers as well.

    Simon Willison - 26th March 2003 08:56 - #

  6. Simon, this is an excellent idea. Thanks for bringing that to the table. Please see http://daniel.glazman.free.fr/weblog/newarchive/2003_03_23_glazblogarc.html#s91402243 -- Daniel, editor of the CSS OM in W3C's CSS Working Group

    Daniel Glazman - 26th March 2003 09:53 - #

  7. Nice implementation! You should make it clear, though, that only element, class, and ID selectors, and only the descendant combinator, are implemented. Any plans to extend this to support more selectors, such as those found in the CSS3 Selectors module? How serious are you about bugs found in this code? I would be happy to give it a proper work out, but I don't really want to do so unless you really intend to fix bugs that I find. Drop me an e-mail if you do.

    Ian Hickson - 26th March 2003 19:35 - #

  8. I saw that the code between the class section and element section was nearly identical, and started to add a separate getElementsByClassSelector function to reuse that bit. Context was the issue that stopped me. document.getElementsBySelector can assume "document" as the initial context, but as long as we're hacking up useful DOM interfaces, it might be nice to have Node.getElementBy* functions, and referencing "this" instead of "document".

    Jeremy Dunck - 26th March 2003 21:23 - #

  9. Nice work. Have you put it to work yet?

    Eliminating whitespace between selectors results in failure. The following examples use your demo.

    This doesn't work:

    javascript:alert(document.getElementsBySelector("body>#foo"));

    But this does:

    javascript:alert(document.getElementsBySelector("body > #foo"));

    And this selector text works (moz only):

    javascript:void(document.styleSheets[0].insertRule("body>#foo {background-color: oldlace;}", 3));

    Adding multiple class selectors (e.g. ".foo.bar" -- match elements that have both class foo and class bar) would be yet another challenge.

    I'd rather use getElementsBySelector instead of document.getElementsBySelector (no expando).

    You'd have to enhance this:

    // Split selector in to tokens

    var tokens = selector.split(' ');

    Removing expando would be the first thing I would do. There's lots of other improvements that could be made, but perhaps those improvements would affect the "geek factor" more than the usefulness.

    For instance, you could use returnedCollection = []; and then return returnedCollection at the end. This would give you the ability to eliminate all of the return new Array() statements in the middle of the function.

    It all depends how much you need to use it, I guess. I've written a number of much simpler functions, though I've never needed anything this comprehensive. In case I do, it's good to know it's already done.

    Garrett Smith - 5th August 2003 22:29 - #

  11. Your demo seems to work in Safari 1.0 (v85.5), except for the '.blog' selector, which seems to return an empty array. However, my installation of Safari has been acting somewhat strange lately, so others may not even have this problem. The demo also works great in Internet Explorer 5.2.2 on Mac OS X. Great work!!

    Már Örlygsson - 3rd December 2003 13:25 - #

  12. The problem in Safari lies with the fact that the special TagName case "*" is not supported for the getElementsByTagName() function. This makes your getAllChildren() function fail. :-( According to http://dhtmlkitchen.com/experiment/safari/index.html there is no workaround for this bug.

    Már Örlygsson - 5th December 2003 10:21 - #

  13. Interesting stuff. I simply wanted to say that 'a[href$="org/"]' actually works in Opera 7.23 and Opera 7.50, so you can remove or change the "fails in Opera 7" message behind it. ;)

    Graste - 27th May 2004 15:25 - #

  14. Hello, I've just ported the getElementsBySelector function from JavaScript to PHP5. I would like to use it to create PDF files from XHTML source using DOM and the FPDF library. This kind of rule does not seem to work correctly: myelement#myid.myclass (e.g. div#debug.Enabled)

    Sébastien Cramatte - 17th October 2004 21:51 - #

  15. Thanks for the great work! I made a couple of additions to your code for my own work. Specifically, I have made it recurse to handle multiple selectors (e.g. "P.sideBar, P.mainContent") and to correctly "AND" multiple classes instead of "OR"ing them (e.g. "A.sideBar.selected" will only match A tags which have both the "sideBar" and "selected" classes assigned to them). I'll be glad to share the code with you if you are interested. Also, I would like your permission to share the modified code I made (attributing and acknowledging your code of course :).

    Kingsley Joseph - 11th July 2005 22:55 - #

  16. Works great in Firefox 1.06. Thanks for the awesome work you did.

    matt - 12th August 2005 12:17 - #

  17. This is beautiful, of course, and I've been using it. I'm trying to hack it to permit the setting of an element "starting point", so that the selector will operate within that context. This means passing in an argument that will overwrite the original value of "currentContext", which is "document". Unfortunately, it's not working yet, and while I'll continue working on it, I'd expect the author of the original code to know the best way to go about it. So... I guess I'm saying that this is a feature request. :) Thank you for this function. Great work!

    Kramer - 7th December 2005 23:49 - #

  18. Great work on the script, extremely handy!

    I've come across one small bug while using it though. If you use a selector that references an invalid ID paired with a tag name (e.g. body#profile, when #profile doesn't exist), a script error occurs; yet using #profile alone works fine.

    The source of the error is line 89:

    if (tagName && element.nodeName.toLowerCase() != tagName) {

    It's finding the tagName (body), but element is undefined due to the ID being invalid. A slight change to the above if statement fixes the problem.

    if (!element || (tagName && element.nodeName.toLowerCase() != tagName)) {

    Maybe somebody will find that useful. Thanks again for the great code!

    James Gregory - 24th March 2006 11:26 - #

  19. I've been toying with this hot little number at work the past few days, and I've found a little bug.

    When you use hyphens in a class or ID, getElementsBySelector() ignores the hyphen and everything after it.

    HTML:
    <div id="container">
      <ul class="nav">
        <li class="nav-home">Home</li>
      </ul>
    </div>
    
    Javascript:
    var found = document.getElementsBySelector('#container .nav');
    
    What I expect is that found will contain the UL element, which it does. But it also contains the LI element, even though it has a different (and valid per W3C spec) class that doesn't match the selector I asked for.

    Brett - 5th April 2006 19:33 - #

  20. Lots of thanks for this jewel! This solved my last problems dividing JS from HTML templates in programming...

    Frankie - 18th May 2006 14:55 - #

  22. I am also seeing the hyphen bug that Brett is reporting, two comments above. I'm looking into a fix...

    Nagu - 26th May 2006 17:42 - #

  24. How would you use this to change, say, the "color" attribute of all the (li class="friends") nodes? Thanks - I think this will solve a lot of my problems!

    sagel - 30th June 2006 18:50 - #

  25. I came across the same problem mentioned above (.nav-home matching .nav). The problem is that the regular expression uses word boundaries to test the class name (and '-' is a non-word character, so a word boundary falls right before it). The fix is very simple: just replace

    new RegExp('\\b'+className+'\\b')

    with

    new RegExp('(\\s|^)'+className+'(\\s|$)')

    and

    new RegExp('\\b'+attrValue+'\\b')

    with

    new RegExp('(\\s|^)'+attrValue+'(\\s|$)')

    I have also changed the class match loop so that the regular expression is created before the loop and reused for each iteration. I haven't compared performance differences, but it should at least create less garbage that needs to be collected. Also, it seems that the tokens should be split by a whitespace regex (/\s+/), not a single space as in the code (' '). I may be wrong, but I think the spec allows for any amount of whitespace between 'tokens'.

    Jeremy - 14th July 2006 14:35 - #
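
The hyphen problem Brett reported and the fix Jeremy proposes can be sketched in isolation, without a browser. The helper names below (matchesWithWordBoundary, matchesWithWhitespace, tokenize) are made up for this illustration and do not appear in the original script. The sketch also assumes class names contain no regular-expression special characters and that the selector has no leading or trailing whitespace.

```javascript
// Illustrative helpers only; these names do not appear in the original script.

// Word-boundary approach: '-' is a non-word character, so \b matches
// between 'nav' and '-home' and 'nav' wrongly matches 'nav-home'.
function matchesWithWordBoundary(classAttr, className) {
  return new RegExp('\\b' + className + '\\b').test(classAttr);
}

// Suggested fix: require whitespace or a string edge on both sides.
function matchesWithWhitespace(classAttr, className) {
  return new RegExp('(\\s|^)' + className + '(\\s|$)').test(classAttr);
}

// Splitting on runs of whitespace avoids empty tokens from repeated spaces.
function tokenize(selector) {
  return selector.split(/\s+/);
}

console.log(matchesWithWordBoundary('nav-home', 'nav')); // true (the reported bug)
console.log(matchesWithWhitespace('nav-home', 'nav'));   // false
console.log(matchesWithWhitespace('main nav', 'nav'));   // true
console.log(tokenize('#container   .nav'));              // [ '#container', '.nav' ]
```

Building the regular expression once, outside any loop over candidate elements, is the reuse Jeremy describes.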

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2003/03/25/getElementsBySelector
