Forum Library, Charter, Moderators: incrediBILL

HTML Forum

    
Site Search - how to make it a total asset
tedster




msg:586762
 1:47 pm on Feb 26, 2002 (gmt 0)

For many people having a Site Search box on the page seems like a given, but there are pitfalls that have caused me to remove it in several past cases. The ones that I do have are pretty rudimentary.

Now a large site I'm working on wants a Site Search, and I'm hoping to get some input on how to do it well. We might look into the Google search box, but I'm not at all sure that will completely serve the purpose.

Here are some of my concerns:

1. Search Box on the Page versus Having a Link
Even on a big site, I'm uncomfortable with inviting people to dive for the search function. If a visitor tries the search and gets no result, they may well leave. So I much prefer solid IA, with good copy and navigation cues. It seems to me that a link to the search function would result in better site stickiness while still offering the desired functionality.

2. Monitoring Search Quality
It seems to me that the searches need to be monitored very closely by someone who really knows the site well. Then the right keywords can be added to the special meta tags as needed, and ideas for new content can get passed to the right member of the development team. Is anyone doing this kind of thing? How many dedicated man-hours does it take per week?

3. Typos, Synonyms, etc.
One search function we're considering would use proprietary meta tags. The idea is that descriptions and keywords designed for the search engines don't need to be used for site search as well. I like this idea a lot. Am I missing any downside?

4. Indexing Message Board Content
In addition to several thousand content pages, we're aiming for 40 or more message boards over the next year. In addition to WebmasterWorld, does anyone have experience at this? How frequently do you build the index? What server load considerations are there? (We're running Apache and Tomcat with load sharing over four boxes right now.)

My paranoia was heightened this week by using the site search on Compaq, and getting results that were totally unusable. 52 identical titles and descriptions, even though each one was a separate document. I figure if they can foul it up, then I certainly can. I welcome other areas of input -- anything at all. I've never waded into anything this big before, and I'm hoping to avoid a fiasco.

 

ciml




msg:586763
 6:11 pm on Feb 26, 2002 (gmt 0)

Personally I'd use a search link. Anyone who knows what IA stands for should be able to make a generally usable site. ;)

For typos, synonyms and those extra words that you may not have thought of, I would steer clear of proprietary tags if you're also interested in search engines. It's better just to adjust the content to match. Search engine logs can also be very useful for identifying new products or services that should be offered, or just variations on existing products or services.

If you're going to set up 40 message boards and expect things to be added often (so need a fresh search) then it may be worth considering using a board server with built-in search. If you're likely to be generating static pages then "If-Modified-Since" and 304 "Not Modified" headers may be enough, i.e. just fetch the pages that have changed, if your Web server and search engine are able to negotiate dates. It really depends on how much time you can dedicate.
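As a rough sketch of the conditional-fetch idea, assuming the indexer remembers when it last crawled: it sends that date in an If-Modified-Since header and skips any page the server answers with 304. The server here is a toy stand-in, and the function names and page paths are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_headers(last_indexed):
    """Headers the indexer sends so unchanged pages are not re-downloaded."""
    # last_indexed must be an aware UTC datetime for usegmt=True.
    return {"If-Modified-Since": format_datetime(last_indexed, usegmt=True)}


def server_status(page_modified, request_headers):
    """Toy server: 304 if the page is unchanged since the given date, else 200."""
    since = request_headers.get("If-Modified-Since")
    if since is not None and page_modified <= parsedate_to_datetime(since):
        return 304  # Not Modified: the indexer keeps its existing copy
    return 200  # changed (or no date sent): the indexer fetches the page again


def pages_to_reindex(pages, last_indexed):
    """pages maps a path to its last-modified time; returns paths that changed."""
    headers = conditional_headers(last_indexed)
    return [path for path, modified in pages.items()
            if server_status(modified, headers) == 200]
```

With a real crawler the status code would come from the HTTP response rather than from `server_status`, but the decision rule is the same.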

Calum

joshie76




msg:586764
 5:32 pm on Feb 27, 2002 (gmt 0)

>> Search Box on the Page versus Having a Link

Tedster, I know you own a copy of Nielsen's 'Homepage Usability' and he insists that you have a search box on the home page.

Information Architecture for The World Wide Web (L. Rosenfeld) has a whole chapter on Search pages; it's well worth reading before embarking on an ambitious site search.

I too have been reluctant to add a search box to our homepage until I'm confident that the search is working really well. Keep us updated as to how you get on!

(edited by: joshie76 at 5:36 pm (utc) on Feb. 27, 2002)

tedster




msg:586765
 11:21 pm on Feb 27, 2002 (gmt 0)

Thanks for the tip on the Rosenfeld book. I will check into that.

> For typo's, synonyms and those extra words that you may not have thought of, I would steer clear of proprietary tags if you're also interested in search engines.

ciml, could you expand a bit on that? Do you mean that there could be a problem with search engines if we use proprietary meta tags?

One of many reasons that I am considering this is that there will be scholarly papers and such, where we really can't change the content - but searchers may easily use alternate spellings. Much of the terminology is Sanskrit and Hindi, and there are a lot of alternate English transliterations involved. I will be creating content pages with those alternate spellings, but I also want the main scholarship documents to be returned on our site search, even if they say "Sakti" when the author wrote "Shakti".
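One way to picture the alternate-spelling problem, as a minimal sketch rather than any particular product's feature: map every known variant to one canonical form before comparing query terms with document terms. The variant list is hypothetical apart from the Sakti/Shakti pair mentioned above, and diacritics are ignored for simplicity.

```python
import re

# Hypothetical variant groups; a real list would be curated by someone who
# knows the material. Only the first pair comes from the discussion above.
VARIANT_GROUPS = [
    {"shakti", "sakti"},
    {"shiva", "siva"},
]


def build_lookup(groups):
    """Map every spelling in a group to one canonical spelling."""
    lookup = {}
    for group in groups:
        canonical = min(group)  # alphabetically first, so the choice is stable
        for spelling in group:
            lookup[spelling] = canonical
    return lookup


LOOKUP = build_lookup(VARIANT_GROUPS)


def normalize(text):
    """Lowercase, split into ASCII words, and replace variants with the canonical form."""
    return [LOOKUP.get(word, word) for word in re.findall(r"[a-z]+", text.lower())]


def matches(query, document):
    """True if every normalized query term appears in the normalized document."""
    doc_terms = set(normalize(document))
    return all(term in doc_terms for term in normalize(query))
```

Because the normalization is applied at search time, the scholarly documents themselves never need to be edited.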

> ...Nielsen's 'Homepage Usability' and he insists that you have a search box on the home page. <

Right, Josh. This is one place where I don't see eye to eye with Jakob. I want search available as a tool, but that empty box is just too enticing a toy, IMO. I've seen it limit stickiness several times. Placing a link in the same position on every page appeals to me very much.

(edited by: tedster at 5:31 pm (utc) on Feb. 28, 2002)

digitalghost




msg:586766
 11:44 pm on Feb 27, 2002 (gmt 0)

I dug this link up from a few months ago:

[pcworld.com...]

And the slight bit of research I did at the time:

One client site seemed to benefit from the search feature while the other, a site devoted entirely to research and encryption technology showed some odd results. From the logs, only 21% of the phrases and keywords searched for could be absolutely identified as topical, 51% were a bit of a stretch and 28% were totally off the mark.

Stemming also proved to be an issue. A search for algo can pull up Al Gore, etc.

The tech site in question has a huge database of encryption papers, algorithms, mathematical expressions etc.

Custom search boxes for each topic with explicit instructions were then implemented on the encryption site. This site has now gone private and I can no longer access any of the stats as my contract with them expired. The other pertinent details have to be excluded due to the NDA but the search boxes did seem to result in a lot of 404-surfer leaves results before they were changed.

DG

brotherhood of LAN




msg:586767
 12:21 am on Feb 28, 2002 (gmt 0)

With regard to people "diving for the search function", around 10% of all my page impressions are on searches, and the search facility is found on all pages.

1. Link or Box
I run two separate search facilities: one external and one internal to the site. I don't link to internal pages in the internal engine because excessive linking to the results will bring about the same SERPs, which could be considered spam.

2. Stats
The search engine I'm using gives you records of all searches, and many of them are misspellings of my keywords. Many of my visitors are at high school, which may be a factor in this, but it is still an important one to consider. The anti-climax with the search stats I have is that they do not differentiate SE referrers from other links. This means many searches are inevitably similar to my targeted keywords.

3. Keywords etc
The one I'm using (again) allows you to have titles in the search matching the query, i.e. linking to SERPs with a term "oatmeal cookies" puts that title in your results page. This is excellent if you choose to link to results.

4. Message Boards
There is another thread running about them just now. I'm just wondering how quickly they grow in file size, alongside your concerns about them, tedster.

just my 2 euro

shaunryan




msg:586768
 4:00 am on Feb 28, 2002 (gmt 0)

I work for a search company (SLI Systems) and thought some of my experience could be useful here.

1. I understand your reluctance to put a search box in every page. If you are happy with the quality of your search then it makes sense because it makes it easier for the site visitors to search. If you're not happy with your search then you should probably improve the quality of your search rather than make it more difficult to search. A recent usability report (http://world.std.com/~uieweb/what_they_want.htm) found that people were less likely to find stuff on a site if they used search! This is a reflection of the poor quality and poor implementation of most searches - Compaq was a great example.

2. Many search products offer reports that help you monitor search quality. Your customer should be able to get lists of the top search terms, the top results from search, the search terms with no results and the search terms with poor results. This last one is important: one of our customers has over 100 searches per day for a particular search term, but only 8 of those searches generate clicks on search results. Although they have results for this search term, this shows they don't have very good results and they need to do something about it. If these reports are generated automatically then it doesn't take many man-hours per week to monitor the search quality. I spend about half an hour/day looking at these types of reports for our customers and letting them know if I see anything that needs their attention.
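The "poor results" report can be sketched roughly as follows, assuming the search log records for each search whether any result was clicked. The log format, the thresholds and the function name are assumptions for illustration, not a description of any specific product.

```python
from collections import defaultdict


def low_click_terms(log, min_searches=10, max_ctr=0.10):
    """Return (term, searches, click_rate) rows for popular terms that rarely get clicks.

    log is an iterable of (term, clicked) pairs, where clicked is True when
    the searcher clicked at least one result.
    """
    searches = defaultdict(int)
    clicks = defaultdict(int)
    for term, clicked in log:
        searches[term] += 1
        if clicked:
            clicks[term] += 1

    report = []
    for term, count in searches.items():
        click_rate = clicks[term] / count
        if count >= min_searches and click_rate <= max_ctr:
            report.append((term, count, click_rate))
    # Most-searched problem terms first, since they affect the most visitors.
    return sorted(report, key=lambda row: row[1], reverse=True)
```

A term searched 100 times with only 8 clicks would appear in this report with a click rate of 0.08, while a term that is searched often and clicked often would not.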

3. Your meta-tag idea sounds good. Some search engines will automatically generate search suggestions. One of the problems with manually doing this is that even for a reasonably small site there will be a huge variety of search terms. One of our customers had over 30,000 unique search queries over two months. This was on the support area for one of their software products. Trying to manually keyword typos can be a large job. A more efficient way is to monitor the most popular search terms with no results. Common misspellings that aren't indexed will show up in those reports.
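A minimal sketch of the "most popular terms with no results" report, assuming each log entry pairs the query with the number of results returned (a hypothetical log format):

```python
from collections import Counter


def top_zero_result_terms(log, limit=10):
    """Return the most frequent queries that returned nothing.

    log is an iterable of (query, result_count) pairs. Queries are trimmed
    and lowercased so that trivial variations are counted together.
    """
    misses = Counter(
        query.strip().lower() for query, result_count in log if result_count == 0
    )
    return misses.most_common(limit)
```

The top of this list is where a common misspelling would surface, which keeps the manual keyword work limited to the terms people actually type.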

I hope this helps.
-Shaun

tedster




msg:586769
 6:14 am on Feb 28, 2002 (gmt 0)

Thanks Shaun. Welcome to the forum - it's good to have input from a search professional.

The client in this case is a not-for-profit who will have a volunteer monitoring search results, so this aspect will be handled...and we definitely will have a solution that generates the kinds of reports you mention. It's helpful to have you list the three types; I would probably have overlooked the "poor" results category.

I've been thinking that one of the roughest areas in the search results that I've seen is the description of each result. So many times they are nearly inscrutable.

I'm wondering about the idea of having each page author create a short abstract of their material, and then having the search function return that abstract instead of keeping a large meta tag on the page or extracting Google-like snippets. We could easily add this into the content management workflow.

Is this approach something that is pre-packaged into products? Does it help?

Slud




msg:586770
 2:42 pm on Feb 28, 2002 (gmt 0)

I use Google for the site search on a non-profit site I maintain (where ads appearing alongside the results aren't a problem).

I'd like to log the terms people search for on the site. Does anyone know of a good way to catch and then pass (redirect) the queries to Google? Ideal would be if they got logged in the server's access log as a request (e.g. GET /cgi-bin/search?q=foo+bar) so I could use a standard log analyzer to extract them.
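One possible shape for this, as a sketch only: point the search form at a small script on your own server that does nothing but redirect to Google. The request to the script lands in the access log as an ordinary GET with the query string, and the visitor ends up on Google's results. The WSGI wrapper and the `example.org` domain are assumptions; the script would need to be mounted at whatever path the form submits to.

```python
from urllib.parse import parse_qs, urlencode

SITE = "example.org"  # hypothetical domain; replace with the site being searched


def redirect_target(query_string):
    """Build the Google URL for a raw query string such as 'q=foo+bar'."""
    query = parse_qs(query_string).get("q", [""])[0]
    # The site: operator restricts results to pages on one domain.
    return "https://www.google.com/search?" + urlencode({"q": f"{query} site:{SITE}"})


def app(environ, start_response):
    """WSGI app: the web server logs the incoming request, then we redirect."""
    location = redirect_target(environ.get("QUERY_STRING", ""))
    start_response("302 Found", [("Location", location)])
    return [b""]
```

Since the script keeps no log of its own, a standard log analyzer can pull the terms straight from the server's access log.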

ciml




msg:586771
 5:16 pm on Feb 28, 2002 (gmt 0)

Sorry Tedster, I meant that if the major search engines are important to the site, then it's certainly not enough to use proprietary tags; I'd rather work at the pages to try to get the page rankable under the words, on my engine and elsewhere. I don't mean to imply that there's any harm per se.

Shaun, I like the idea of tracking clicks from SERPs to find low click-through phrases. Very slick.

agaffin




msg:586772
 6:02 pm on Feb 28, 2002 (gmt 0)

Proprietary tags are fine if your search engine can support them. One of the things I really like about Ultraseek (um, Inktomi Enterprise Search, or whatever they call it these days) is its ability to support custom meta tags. Another is a thesaurus function, which lets you match common misspellings with correct ones.

But it sounds like some of your concerns have more to do with the quality of the meta data or the results set than with whether you have a search link or box. That gets into issues of proper tagging, using a search engine that doesn't index toolbars, etc.

I prefer putting search boxes on every page. We've found that some users like to browse and some use the search engine as their main navigational tool - for such people, it might be annoying putting the search function one more click away. But supporting that assumption really means making sure the underlying documents are set up to return relevant results.

Alecto




msg:586773
 1:18 am on Mar 1, 2002 (gmt 0)

I'm with tedster on preferring the search function to be on its own page. Someone immediately using the search function and not getting the results they expect might move off the site just as immediately.

The search function we have planned for our site is limited in what it can deliver. It is meant to be used only for particular types of searches. The site structure itself is easily navigable for most searches. It's only when the navigation type of search is exhausted or inappropriate that we expect the user to turn to the search function. Then, on its own page, we can explain the limitations and the best way to use the search function as well as the site to find information.

circuitjump




msg:586774
 5:32 am on Mar 1, 2002 (gmt 0)

Not to change the subject all the way, but I made a search results page in JavaScript. The neat part about it is that it grabs the info from a db and puts it into a dimensional array; that way the server doesn't have to deal with all the requests all the time. It sends the info once when the search is made. After that, the JS page takes care of all the rest.

It's not completely done yet. It still has some bugs. One, to be exact (from what I've seen so far). If anyone would like to work on it to use it for their site, just sticky mail me and I'll send it on to ya.

Brett_Tabke




msg:586775
 11:44 am on Mar 1, 2002 (gmt 0)

>1

I'm not fond of a search box on the page. Give them a link and let them find it when all else fails.

>2

There are quite a few hours involved. If you have someone with an eye for search engine optimization building the pages, then you should be ok and not have to tweak metas or anything for search quality. A good search engine should be able to match your quality.

The problem will be with the generic nature of the search queries by the user. Someone types in "webmaster" on the site search engine here several times a day. 20 megs of indexed text about "webmaster", and the engine is supposed to pick the best one out of 110k msgs?

That situation is going to occur with most site engines. After all, it's the one-topic nature of all sites. Relevancy is very difficult in that "keyword loaded" situation.
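To illustrate why a site-wide keyword is so hard to rank on, here is a small sketch using inverse document frequency, a standard term-weighting idea (it is not described in this thread and is brought in only as an illustration). A term that appears in every document gets a weight of zero, so it cannot separate one page from another. The sample documents are made up.

```python
import math


def idf(term, documents):
    """Inverse document frequency: log(total documents / documents containing the term)."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0  # term appears nowhere, so there is nothing to rank
    return math.log(len(documents) / containing)


# Hypothetical mini-corpus for a single-topic site.
DOCS = [
    "webmaster tips for apache",
    "webmaster forum search",
    "webmaster tomcat setup",
]
```

Here `idf("webmaster", DOCS)` is 0.0 because the word is in all three documents, while a rarer word such as "tomcat" gets a positive weight.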

>3 any downside

Depends on whether you are serving the same pages to the SEs. I wouldn't want to optimize specifically for the site search engine and for the net engines too. I do it one time well, and then adjust the dials and knobs on the ranking algo of the local site engine.

>4

Servers can get overloaded very quickly (notice, I don't host the site search here). I rebuild the index nightly. It can be automated, but I like to do it by hand to watch what gets indexed.

I read an article (tried to find it) that said 75% of all site search engine usage ends in frustration and session abandonment.

Woz




msg:586776
 12:34 pm on Mar 1, 2002 (gmt 0)

I had an idea about a search javascript link and, as it is more a coding thing, started a new thread here [webmasterworld.com].

Onya
Woz

przero




msg:586777
 9:12 pm on Mar 8, 2002 (gmt 0)

Who wants to hear the worst horror story of developing a search box, ever?

One of my clients wanted a search function built in but was very particular about it. Long story short, we had to use their SQL database and enter every keyword that could possibly apply to each item in their inventory, plus misspellings, abbreviations, alternative spellings, etc. Every time they add a new item, I have to enter new keywords for it. The search works pretty well, but is so time-consuming it can't be cost-effective.
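For readers picturing the setup, a keyword table of this kind might look roughly like the sketch below. It uses SQLite in memory purely for illustration; the table names, columns and sample items are hypothetical and not the client's actual schema.

```python
import sqlite3


def build_db():
    """Create an in-memory database with an items table and a keywords table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE keywords (
            item_id INTEGER NOT NULL REFERENCES items(id),
            keyword TEXT NOT NULL
        );
        CREATE INDEX idx_keywords ON keywords(keyword);
    """)
    return db


def add_item(db, name, keywords):
    """Insert an item plus every keyword, misspelling or abbreviation entered for it."""
    cursor = db.execute("INSERT INTO items (name) VALUES (?)", (name,))
    db.executemany(
        "INSERT INTO keywords (item_id, keyword) VALUES (?, ?)",
        [(cursor.lastrowid, keyword.lower()) for keyword in keywords],
    )
    return cursor.lastrowid


def search(db, term):
    """Return names of items that have the exact (case-insensitive) keyword."""
    rows = db.execute(
        "SELECT DISTINCT items.name FROM items "
        "JOIN keywords ON keywords.item_id = items.id "
        "WHERE keywords.keyword = ? ORDER BY items.name",
        (term.strip().lower(),),
    )
    return [name for (name,) in rows]
```

The lookup is only as good as the keyword rows, which is exactly where the manual effort goes: every new item means another call like `add_item` with a hand-written list.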


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved