Forum Library, Charter, Moderators: bakedjake

Alternative Search Engines Forum

This 65 message thread spans 3 pages; this is page 3.
GigaBlast Part 3

 11:17 am on Mar 18, 2002 (gmt 0)

Continued from: [webmasterworld.com...]

Looks impressive so far indeed. I'm really curious about any increase/decrease in relevance, once there's a significant number of sites indexed.

A few things to note, most of which you probably know already:

  • Always respect robots.txt for all pages.

  • The spider needs to do some load balancing, so that it doesn't fetch too many pages from the same site in a short time. The recommended rate is about one page per minute per site (http://www.robotstxt.org/wc/robots.html).

  • Make sure that the images on your site are served with headers for last-modified date, size, and expiry date (Last-Modified, Content-Length, and Expires), so that the client can cache them. This will noticeably reduce the bandwidth requirements on your own system.

  • Only list one of www.example.com/ and www.example.com/index.html (or home|default with .htm|.asp|.php, etc.), at least if they contain the same text.

  • Cluster the results, so that one site can't dominate the SERPs for any keyword combination.

  • I'm sure there's a lot more work waiting for you... ;)
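To make a few of the points above concrete, here is a rough Python sketch of the robots.txt check, the per-site delay, the home page de-duplication and the result clustering. It is only an illustration under my own assumptions, not how Gigablast actually works: the names `ExampleBot`, `PolitenessClock`, `canonical_url` and `cluster_results` are made up for the example, the 60 second delay just mirrors the "one page per minute" figure, and the two-results-per-site cap is an arbitrary choice.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleBot"   # hypothetical crawler name
MIN_DELAY_SECONDS = 60.0    # assumed: about one page per minute per site

# Default-document names such as index.html, home.asp, default.php.
INDEX_PAGE = re.compile(r"/(index|home|default)\.(html?|asp|php)$", re.IGNORECASE)


def allowed(robots_txt, url):
    """Check a URL against robots.txt text that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


class PolitenessClock:
    """Remembers the last fetch time per host so one site is not hammered."""

    def __init__(self, min_delay=MIN_DELAY_SECONDS):
        self.min_delay = min_delay
        self.last_fetch = {}

    def wait_needed(self, url, now):
        """Seconds to wait before this URL's host may be fetched again."""
        last = self.last_fetch.get(urlsplit(url).netloc.lower())
        if last is None:
            return 0.0
        return max(0.0, self.min_delay - (now - last))

    def record_fetch(self, url, now):
        self.last_fetch[urlsplit(url).netloc.lower()] = now


def canonical_url(url):
    """Map /index.html and similar default documents onto the bare directory.

    A real spider should also confirm that both URLs return the same text
    before merging them; this sketch looks only at the URL.
    """
    parts = urlsplit(url)
    path = INDEX_PAGE.sub("/", parts.path) or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def cluster_results(ranked_urls, per_site=2):
    """Keep at most per_site results from each host, preserving rank order."""
    shown = defaultdict(int)
    kept = []
    for url in ranked_urls:
        host = urlsplit(url).netloc.lower()
        if shown[host] < per_site:
            shown[host] += 1
            kept.append(url)
    return kept
```

In a crawl loop you would call `allowed()` first, then sleep for whatever `wait_needed()` returns, fetch the page, and call `record_fetch()`. The clock takes the current time as an argument so the sketch can be tried without actually waiting.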


     5:09 pm on Mar 29, 2002 (gmt 0)

    It will probably reset many more times. It is just in pre-beta testing. Nothing is permanent at this stage. I'm sure we will all need to check it and resubmit when it comes out of testing.


     10:39 pm on Mar 30, 2002 (gmt 0)

    Hey, Gigablast has been 404 all afternoon!


     8:43 pm on Apr 1, 2002 (gmt 0)

    pyst - While clustering improves matters a lot, it can be very useful to spider deeply. Some sites don't have every topic on the site detailed on the home page, and sometimes a deeper page is really more relevant to a search.

    Not spidering sites deeply and paying attention only to the home pages just encourages people to get a different site for each product. Certainly, the more topics your home page covers, the less likely you are to rank well for the specific topics customers will look up. This is exactly what happens in Yahoo and Looksmart, which only pay attention to the homepage. You end up with whole categories full of one-off sites that are obvious domain spam.


     6:34 pm on Apr 2, 2002 (gmt 0)

    The site has been down for me for a couple of days now. Matt, are you still reading posts in the forum? I was just wondering whether there is some kind of major problem, or are you just doing some more fine tuning?


     12:09 am on Apr 3, 2002 (gmt 0)

    Let's go ahead and wrap this one up. Gigablast looks like a good new project. We wish you well.

    When you get it all tweaked and ready for a roll out - feel free to let us know and we'll have another go at it.



    All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
    WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
    © Webmaster World 1996-2014 all rights reserved