Paid Inclusion Engines and Topics Forum

New Spidering pattern?
jlara




msg:21787
 1:03 pm on Oct 24, 2000 (gmt 0)

I have just noticed something in my logs that is attracting my attention... Many spiders recently stopped going after the submitted URLs and instead go for the index.html. To make this post short and sweet I'll concentrate on AV.

You submit [somesub.somesite.com...]
Then that site gets listed.
Two weeks later AV sends a different spider and goes to
[somesub.somesite.com...]

This is a little surprising. If the index.html was optimized for another engine, or if there is a meta refresh, then this could be a problem.

Has this always been an AV pattern? I will watch my rankings and see if it causes a drop. This index.html spidering trend throws another wrench into SEO efforts, because no matter how well you make a page -- it is very difficult for it to match each search engine's algorithm.
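For anyone who wants to check their own logs for the same pattern, here is a minimal sketch. It assumes the server writes Apache-style Combined Log Format, and the sample lines, page names, and user-agent strings (`ExampleSpider/1.0` and so on) are made up for illustration, not taken from real AV traffic.

```python
import re
from collections import Counter

# Assumed layout (Combined Log Format):
# host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Paths treated as "the index page" in this sketch.
INDEX_PATHS = {"/", "/index.html"}


def index_hits_by_agent(lines):
    """Count requests for the index page, grouped by user-agent string."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") in INDEX_PATHS:
            hits[match.group("agent")] += 1
    return hits


# Hypothetical log lines: a submitted page is fetched first, the index later.
sample = [
    '10.0.0.1 - - [24/Oct/2000:13:03:00 +0000] "GET /doorway.html HTTP/1.0" 200 5120 "-" "ExampleSpider/1.0"',
    '10.0.0.2 - - [07/Nov/2000:09:15:00 +0000] "GET /index.html HTTP/1.0" 200 7340 "-" "ExampleSpider/2.0"',
    '10.0.0.3 - - [07/Nov/2000:09:16:00 +0000] "GET / HTTP/1.0" 200 7340 "-" "Mozilla/4.0"',
]

for agent, count in index_hits_by_agent(sample).most_common():
    print(agent, count)
```

Running this against a few weeks of logs should show which agents come back for the index page and how often; lines that do not fit the assumed format are simply skipped.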

 

rogerd




msg:21788
 2:22 pm on Oct 24, 2000 (gmt 0)

Jlara, I'd say it is pretty common SE behavior to go back to grab the index page of any site for which it has pages indexed. Although some are more aggressive than others, their objective is to capture the entire site (page and depth limits may exist, of course), as well as find links to other sites. AltaVista is employing some level of theme analysis, and for that to work it has to sample the entire site.

Many people submit only their home page or a hallway page and wait for the spider to find the rest of the pages on its own - "found" pages sometimes rank better than submitted pages. (This approach takes a lot of patience.)

If you have pages you don't want a particular SE to see, you should exclude them in your robots.txt file.
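As a rough sketch of what that exclusion could look like, the snippet below builds a small robots.txt and checks it with Python's standard `urllib.robotparser`. The user-agent name `Scooter` (commonly associated with AltaVista's crawler) and the `www.example.com` URLs and `doorway.html` page are assumptions for illustration; check your own logs for the actual agent name before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep one spider away from index.html,
# leave every other spider unrestricted.
ROBOTS_TXT = """\
User-agent: Scooter
Disallow: /index.html

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url):
    """Return True if the rules above let this agent fetch the URL."""
    return parser.can_fetch(agent, url)


# Placeholder URLs; substitute your own pages.
print(allowed("Scooter", "http://www.example.com/index.html"))
print(allowed("Scooter", "http://www.example.com/doorway.html"))
print(allowed("OtherBot", "http://www.example.com/index.html"))
```

Keep in mind that robots.txt is only a request: it works for spiders that choose to honor it, and it does nothing about a page that is already in an engine's database.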

Brett_Tabke




msg:21789
 8:36 am on Oct 25, 2000 (gmt 0)

Ya, if the index isn't in the db, Alta will grab it too. It usually takes longer than that though.
