My objective was to find a site search tool that is free, or at least inexpensive, and that could be readily integrated with the rest of the site, with output suitable for styling with CSS. Until now, as a stopgap, I’d used a Google search form, but Google has the disadvantage that you can’t reindex on demand. I’ve also used Picosearch before on another site, but, as with Google, the results are generated by an external server, so there is limited control over how they look.
The first one I tried was Fluid Dynamics Search Engine (FDSE). It’s quite a nifty little Perl-based tool, but from my point of view it has one major drawback: the format of the HTML it generates is hard-coded into the script and is entirely table-based. It can be restyled up to a point, but it’s not very flexible.
Integration ought to be easier with a PHP-based tool, so I consulted Google and built up a shortlist of Site Search Pro, PhpDig, FastFind, Sphider and WrenSoft Zoom.
Site Search Pro and FastFind only have online demos, which meant their integration features could not be tested. On that basis I eliminated them immediately.
Several forums and blogs mentioned that PhpDig is very slow at indexing, so I decided not to bother with it, at least until I’d tried the other options.
I downloaded Sphider and followed its relatively straightforward installation instructions. Although there were no problems in getting it working, I was unimpressed with its indexing because it took no account of the BASE element, which meant it generated incorrect URLs. That made it useless for my purposes.
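For anyone wondering what taking account of the BASE element involves, here is a minimal sketch in Python. It is not Sphider's code or anyone else's; the class and function names (LinkCollector, extract_links) and the example.com URLs are made up for illustration. The idea is simply that relative links should be resolved against the BASE href when the page has one, and against the page's own URL otherwise.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect link targets and resolve them against <base href> if present.

    Hypothetical helper for illustration only.
    """

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url  # fall back to the page's own URL
        self.base_seen = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and not self.base_seen and attrs.get("href"):
            # Only the first BASE element counts; its href may itself be relative.
            self.base_url = urljoin(self.base_url, attrs["href"])
            self.base_seen = True
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def resolved_links(self):
        # Resolve at the end so the BASE applies to every link on the page.
        return [urljoin(self.base_url, href) for href in self.hrefs]


def extract_links(page_url, html):
    """Return absolute URLs for all <a href> links found in html."""
    parser = LinkCollector(page_url)
    parser.feed(html)
    parser.close()
    return parser.resolved_links()
```

An indexer that skips the BASE step would resolve "page.html" against the URL of the page being crawled, which is exactly how incorrect URLs end up in the index on a site that relies on BASE.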
Finally, I tested Zoom. It’s a sophisticated product, but for small sites (up to 50 pages) it’s free. It’s well worth getting to grips with its abundance of features because, as I found, with some imagination there’s usually a way to do what you want.
To run the indexer you need a Windows PC, but the actual search will run on most servers, as long as they’ll execute either PHP, ASP, or Perl scripts. You can also use it to index static files, useful if you are supplying documentation on CD for instance.
The indexer generates a set of binary files and the scripts to search them. The HTML generated by the scripts is fairly clean, with DIV and CLASS hooks for styling, so while, as with FDSE, you can’t actually customize the HTML itself, in this case it doesn’t really matter.
Integrating the Zoom PHP script into my site was quite straightforward, although I found I had to set an undocumented variable called $LinkBackURL to cope with the URL rewriting used by my code. I also had to insert some of the proprietary comments that Zoom uses to control its indexing, to ensure that common elements like menubars were ignored.
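The general idea behind comment-controlled indexing can be sketched as follows. This is only an illustration in Python: the marker names below are hypothetical, not Zoom's actual comment syntax, and the function name indexable_html is made up. Anything between a start marker and an end marker is dropped before the page text is indexed.

```python
import re

# Hypothetical marker comments; a real indexer defines its own syntax.
START = "<!--NOINDEX-START-->"
END = "<!--NOINDEX-END-->"

# Non-greedy match so each start marker pairs with the nearest end marker.
_SKIP = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)


def indexable_html(html):
    """Return html with every marked region removed."""
    return _SKIP.sub("", html)
```

Wrapping the menubar in the page template with such a pair of comments keeps its text out of the index on every page. In this sketch, a start marker with no matching end marker is left alone rather than swallowing the rest of the page.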
Once I had it working, I started playing with the features and found that it can highlight the search terms in the results. Not only that, but with the help of some supplied JavaScript, when you follow the link to a result, the target page will also show highlights and automatically scroll to the first one. Very cool. Another notable feature is that it’ll suggest alternative spellings for terms when the number of results falls below a configurable threshold.
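Search term highlighting of this kind boils down to wrapping each match in a styled element. Here is a rough sketch in Python, not Zoom's implementation; the function name highlight and the CSS class name are assumptions chosen for the example.

```python
import html
import re


def highlight(text, terms, css_class="highlight"):
    """Return text as HTML with each search term wrapped in a <span>.

    Matching is case-insensitive and works on substrings, not whole words.
    """
    terms = [t for t in terms if t]
    if not terms:
        return html.escape(text)
    # Longest terms first so "searching" wins over "search" where both match.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile("(" + alternatives + ")", re.IGNORECASE)
    out = []
    # With one capturing group, re.split puts matches at the odd indexes.
    for i, part in enumerate(pattern.split(text)):
        if i % 2 == 1:
            out.append(
                '<span class="%s">%s</span>' % (css_class, html.escape(part))
            )
        else:
            out.append(html.escape(part))
    return "".join(out)
```

Splitting the raw text first and escaping each piece afterwards avoids matching a term inside an HTML entity such as &amp;. The span can then be styled with CSS like any other element.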
Unfortunately the highlight script fails in IE 5.01 (one of the several browsers I use for cross-browser testing), but there is a geeky workaround (supplied on request) which causes a quiet failure rather than the rude message which pops up if you run IE with debugging on. I’ve sent it to the Zoom people at Wrensoft and hopefully they’ll incorporate it into their script.
In conclusion, if you are looking for an integrated site search tool, I reckon Zoom should definitely go on your list. One caveat: I’ve no idea how it performs on very large sites, though its per-page performance was very fast throughout my testing today.