Forum Moderators: incrediBILL
Now a large site I'm working on wants a Site Search, and I'm hoping to get some input on how to do it well. We might look into the Google search box, but I'm not at all sure that will completely serve the purpose.
Here are some of my concerns:
1. Search Box on the Page versus Having a Link
Even on a big site, I'm uncomfortable with inviting people to dive for the search function. If a visitor tries the search and gets no result, they may well leave. So I much prefer solid IA, with good copy and navigation cues. It seems to me that a link to the search function would result in better site stickiness while still offering the desired functionality.
2. Monitoring Search Quality
It seems to me that the searches need to be monitored very closely by someone who really knows the site well. Then the right keywords can be added to the special meta tags as needed, and ideas for new content can get passed to the right member of the development team. Is anyone doing this kind of thing? How many dedicated man-hours does it take per week?
3. Typos, Synonyms, etc.
One search function we're considering would use proprietary meta tags. The idea is that descriptions and keywords designed for the search engines don't need to be used for site search as well. I like this idea a lot. Am I missing any downside?
4. Indexing Message Board Content
In addition to several thousand content pages, we're aiming for 40 or more message boards over the next year. In addition to WebmasterWorld, does anyone have experience at this? How frequently do you build the index? What server load considerations are there? (We're running Apache and Tomcat with load sharing over four boxes right now.)
My paranoia was heightened this week by using the site search on Compaq, and getting results that were totally unusable. 52 identical titles and descriptions, even though each one was a separate document. I figure if they can foul it up, then I certainly can. I welcome other areas of input -- anything at all. I've never waded into anything this big before, and I'm hoping to avoid a fiasco.
For typos, synonyms and those extra words that you may not have thought of, I would steer clear of proprietary tags if you're also interested in search engines. It's better just to adjust the content to match. Search engine logs can also be very useful for identifying new products or services that should be offered, or just variations on existing products or services.
If you're going to set up 40 message boards and expect things to be added often (so you need a fresh search index), then it may be worth considering a board server with built-in search. If you're likely to be generating static pages, then "If-Modified-Since" requests and 304 "Not Modified" responses may be enough, i.e. just fetch the pages that have changed, provided your Web server and search engine are able to negotiate dates. It really depends on how much time you can dedicate.
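To make the conditional-fetch idea concrete, here is a minimal sketch in Python of an indexer that asks for a page only if it has changed since the last crawl. The function names are made up for illustration, and it assumes the Web server honours If-Modified-Since and answers 304 "Not Modified" for unchanged pages.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def conditional_headers(last_fetch: datetime) -> dict:
    """Build the If-Modified-Since header from the time of the previous crawl."""
    # HTTP dates are always expressed in GMT.
    stamp = format_datetime(last_fetch.astimezone(timezone.utc), usegmt=True)
    return {"If-Modified-Since": stamp}


def fetch_if_changed(url: str, last_fetch: datetime):
    """Return the page body if it changed since last_fetch, otherwise None."""
    request = Request(url, headers=conditional_headers(last_fetch))
    try:
        with urlopen(request) as response:
            return response.read()  # 200: page changed, re-index it
    except HTTPError as err:
        if err.code == 304:  # Not Modified: keep the copy already indexed
            return None
        raise
```

With this shape the nightly run only re-indexes pages for which fetch_if_changed returns a body, so server load scales with the number of changed pages rather than the size of the site.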
Tedster, I know you own a copy of Nielsen's 'Homepage Usability' and he insists that you have a search box on the home page.
Information Architecture for the World Wide Web (L. Rosenfeld) has a whole chapter on search pages; it's well worth reading before embarking on an ambitious site search.
I too have been reluctant to add a search box to our homepage until I'm confident that the search is working really well. Keep us updated as to how you get on!
(edited by: joshie76 at 5:36 pm (utc) on Feb. 27, 2002)
> For typos, synonyms and those extra words that you may not have thought of, I would steer clear of proprietary tags if you're also interested in search engines.
ciml, could you expand a bit on that? Do you mean that there could be a problem with search engines if we use proprietary meta tags?
One of many reasons that I am considering this is that there will be scholarly papers and such, where we really can't change the content, but searchers may easily use alternate spellings. Much of the terminology is Sanskrit and Hindi, and there are a lot of alternate English transliterations involved. I will be creating content pages with those alternate spellings, but I also want the main scholarship documents to be returned on our site search, even if the searcher types "Sakti" when the author wrote "Shakti".
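One way to get that behaviour without touching the papers themselves is to fold every known transliteration into a single canonical form, both when indexing and when searching. The sketch below is only an illustration in Python: the variant table and the function names are hypothetical, and a real table would be maintained by someone who knows the terminology.

```python
# Hypothetical variant table: alternate transliteration -> canonical spelling.
# Only the Sakti/Shakti pair comes from this thread; the other is illustrative.
VARIANTS = {
    "sakti": "shakti",
    "siva": "shiva",
}


def normalize(term: str) -> str:
    """Lower-case a term and map known variants to the canonical spelling."""
    t = term.lower()
    return VARIANTS.get(t, t)


def build_index(docs: dict) -> dict:
    """Map each normalized word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            key = normalize(word.strip('.,;:"()'))
            index.setdefault(key, set()).add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of documents containing every (normalized) query word."""
    hits = [index.get(normalize(w), set()) for w in query.split()]
    return set.intersection(*hits) if hits else set()
```

Because the same normalize step runs on both sides, a search for "Sakti" and a search for "Shakti" land on the same index entry, and the original documents stay unedited.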
> ...Nielsen's 'Homepage Usability' and he insists that you have a search box on the home page. <
Right, Josh. This is one place where I don't see eye to eye with Jakob. I want search available as a tool, but that empty box is just too enticing a toy, IMO. I've seen it limit stickiness several times. Placing a link in the same position on every page appeals to me very much.
(edited by: tedster at 5:31 pm (utc) on Feb. 28, 2002)
And the slight bit of research I did at the time:
One client site seemed to benefit from the search feature while the other, a site devoted entirely to research and encryption technology showed some odd results. From the logs, only 21% of the phrases and keywords searched for could be absolutely identified as topical, 51% were a bit of a stretch and 28% were totally off the mark.
Stemming also proved to be an issue. A search for algo can pull up Al Gore, etc.
The tech site in question has a huge database of encryption papers, algorithms, mathematical expressions etc.
Custom search boxes for each topic with explicit instructions were then implemented on the encryption site. This site has now gone private and I can no longer access any of the stats as my contract with them expired. The other pertinent details have to be excluded due to the NDA, but the search boxes did seem to result in a lot of 404-surfer leaves results before they were changed.
1. Link or Box
I run two separate search facilities: one external and one internal to the site. I don't link to internal pages in the internal engine because excessive linking to the results will bring about the same SERPs, which could be considered spam.
The search engine I'm using gives you records of all searches, and many of them are spelling mistakes of my keywords. Many of my visitors are in high school, which may be a factor in this, but it is still an important one to consider. The anti-climax with the search stats I have is that they do not differentiate SE referrers from other links. This means many searches are inevitably similar to my targeted keywords.
3. Keywords etc
The one I'm using (again) allows you to have titles in the search results matching the query, i.e. linking to SERPs with a term "oatmeal cookies" puts that title in your results page. This is excellent if you choose to link to results.
4. Message Boards
There is another thread running about them just now. I'm just wondering how quickly they grow in file size, alongside your concerns about them, tedster.
just my 2 euro
1. I understand your reluctance to put a search box on every page. If you are happy with the quality of your search then it makes sense, because it makes it easier for the site visitors to search. If you're not happy with your search then you should probably improve the quality of your search rather than make it more difficult to search. A recent usability report (http://world.std.com/~uieweb/what_they_want.htm) found that people were less likely to find stuff on a site if they used search! This is a reflection of the poor quality and poor implementation of most searches - Compaq was a great example.
2. Many search products offer reports that help you monitor search quality. Your customer should be able to get lists of the top search terms, the top results from search, the search terms with no results and the search terms with poor results. This last one is important: one of our customers has over 100 searches per day for a particular search term, but only 8 of those searches generate clicks on search results. Although they have results for this search term, this shows they don't have very good results and they need to do something about it. If these reports are generated automatically then it doesn't take many man-hours per week to monitor the search quality. I spend about half an hour a day looking at these types of reports for our customers and letting them know if I see anything that needs their attention.
3. Your meta-tag idea sounds good. Some search engines will automatically generate search suggestions. One of the problems with manually doing this is that even for a reasonably small site there will be a huge variety of search terms. One of our customers had over 30,000 unique search queries over two months. This was on the support area for one of their software products. Trying to manually keyword typos can be a large job. A more efficient way is to monitor the most popular search terms with no results. Common misspellings that aren't indexed will show up in those reports.
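As a rough illustration of points 2 and 3, the Python sketch below builds the top-terms, no-results and poor-results lists from a search log, then proposes a likely correction for each failed term. The SearchEvent record, the thresholds and the function names are all assumptions made for this example; a packaged search product would have its own log format and report layout.

```python
from collections import Counter, namedtuple
from difflib import get_close_matches

# Hypothetical log record: the term typed, how many results came back,
# and whether the visitor clicked any of them.
SearchEvent = namedtuple("SearchEvent", "term result_count clicked")


def search_reports(events, min_searches=5, poor_ctr=0.10):
    """Summarize a search log into top terms, no-result terms and poor terms."""
    searches, clicks, no_results = Counter(), Counter(), Counter()
    for e in events:
        term = e.term.strip().lower()
        searches[term] += 1
        if e.result_count == 0:
            no_results[term] += 1
        elif e.clicked:
            clicks[term] += 1
    # "Poor" terms return results, but visitors rarely click any of them.
    poor = {
        term: clicks[term] / n
        for term, n in searches.items()
        if n >= min_searches
        and term not in no_results
        and clicks[term] / n < poor_ctr
    }
    return {
        "top_terms": searches.most_common(10),
        "no_results": no_results.most_common(10),
        "poor_results": poor,
    }


def suggest_corrections(failed_terms, indexed_terms, cutoff=0.8):
    """Pair each failed term with its closest indexed term, if one is close."""
    suggestions = {}
    for term in failed_terms:
        match = get_close_matches(term, indexed_terms, n=1, cutoff=cutoff)
        if match:
            suggestions[term] = match[0]
    return suggestions
```

A term searched 100 times with only 8 clicks shows up in poor_results with a click rate of 0.08, and the no_results list feeds straight into suggest_corrections, so only the misspellings that people actually type need reviewing.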
I hope this helps.
The client in this case is a not-for-profit who will have a volunteer monitoring search results, so this aspect will be handled...and we definitely will have a solution that generates the kinds of reports you mention. It's helpful to have you list the three types; I would probably have overlooked the "poor" results category.
I've been thinking that one of the roughest areas in the search results that I've seen is the description of each result. So many times they are nearly inscrutable.
I'm wondering about the idea of having each page author create a short abstract of their material, and then having the search function return that abstract instead of keeping a large meta tag on the page or extracting Google-like snippets. We could easily add this into the content management workflow.
Is this approach something that is pre-packaged into products? Does it help?
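For what it's worth, the abstract idea is simple to prototype. The Python sketch below is only an assumption about how it might work: it prefers the author's abstract and falls back to a snippet around the first match when no abstract exists. The page dictionary with "abstract" and "body" keys is hypothetical, standing in for whatever the content management system stores.

```python
def result_description(page: dict, query: str, width: int = 160) -> str:
    """Return the author's abstract if present, else a snippet near the match."""
    abstract = page.get("abstract", "").strip()
    if abstract:
        return abstract
    body = page.get("body", "")
    pos = body.lower().find(query.lower())
    # Center the snippet on the first match, or start at the top if none.
    start = max(pos - width // 2, 0) if pos >= 0 else 0
    return body[start:start + width].strip()
```

The fallback matters during rollout, since older pages will not have abstracts until their authors get around to writing them.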
I'd like to log the terms people search for on the site. Does anyone know of a good way to catch and then pass (redirect) the queries to Google? Ideal would be if they got logged in the server's access log as a request (e.g. GET /cgi-bin/search?q=foo+bar) so I could use a standard log analyzer to extract them.
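If the search form points at a local script first, the query lands in the access log before the visitor is redirected. Here is a minimal Python sketch of both halves, under some assumptions: the log is in the usual Apache common or combined format, the script lives at /cgi-bin/search as in the example above, and the redirect target uses Google's site: operator. The function names are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlencode, urlsplit

# Matches the request part of an access log line, e.g. "GET /path?q=x HTTP/1.0"
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def queries_from_log(lines, path="/cgi-bin/search"):
    """Count the q= search terms requested at the given path."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        url = urlsplit(match.group(1))
        if url.path != path:
            continue
        # parse_qs decodes "+" and %xx escapes, so foo+bar becomes "foo bar".
        for q in parse_qs(url.query).get("q", []):
            counts[q.strip().lower()] += 1
    return counts


def google_redirect_url(query: str, site: str) -> str:
    """Build the URL the local script would redirect to (assumed format)."""
    return "https://www.google.com/search?" + urlencode({"q": f"{query} site:{site}"})
```

The local script itself then only has to read q, send a redirect to google_redirect_url(q, your_domain), and let the Web server do the logging.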
Shaun, I like the idea of tracking clicks from SERPs to find low-click through phrases. Very slick.
But it sounds like some of your concerns have more to do with the quality of the meta data or the results set than with whether you have a search link or box. That gets into issues of proper tagging, using a search engine that doesn't index toolbars, etc.
I prefer putting search boxes on every page. We've found that some users like to browse and some use the search engine as their main navigational tool - for such people, it might be annoying putting the search function one more click away. But supporting that assumption really means making sure the underlying documents are set up to return relevant results.
The search function we have planned for our site is limited in what it can deliver. It is meant to be used only for particular types of searches. The site structure itself is easily navigable for most searches. It's only when the navigation type of search is exhausted or inappropriate that we expect the user to turn to the search function. Then, on its own page, we can explain the limitations and the best way to use the search function as well as the site to find information.
It's not completely done yet. It still has some bugs. One, to be exact (from what I've seen so far). If anyone would like to work on it to use it for their site, just sticky mail me and I'll send it on to ya.
I'm not fond of a search box on the page. Give them a link and let them find it when all else fails.
There are quite a few hours involved. If you have someone with an eye for search engine optimization building the pages, then you should be ok and not have to tweak metas or anything for search quality. A good search engine should be able to match your quality.
The problem will be with the generic nature of the search queries by the user. Someone types in "webmaster" on the site search engine here several times a day. 20 meg of indexed text about "webmaster", and the engine is supposed to pick the best one out of 110k msgs?
That situation is going to occur with most site engines. After all, it's the one-topic nature of all sites. Relevancy is very difficult in that "keyword loaded" situation.
>3 any downside
Depends on if you are serving the same pages to the se's? I wouldn't want to optimize specifically for the site search engine and for the net engines too. I do it one time well, and then adjust the dials and knobs on the ranking algo of the local site engine.
Servers can get overloaded very quickly (notice, I don't host the site search here). I rebuild the index nightly. It can be automated, but I like to do it by hand to watch what gets indexed.
I read an article (tried to find it) that said 75% of all site search engine usage ends in frustration and session abandonment.
One of my clients wanted a search function built in but was very particular about it. Long story short, we had to use their SQL database and enter every keyword that could possibly apply to each item in their inventory, plus misspellings, abbreviations, alternative spellings, etc. Every time they add a new item, I have to enter new keywords for it. The search works pretty well, but is so time consuming it can't be cost effective.