Forum Moderators: phranque


Pages won't spider

         

Marcia

12:35 am on Sep 30, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I ran a few pages through the spider simulator, and it works just fine - very handy!

I ran the index page of a new site I'm starting work on through it (the page has been modified just slightly from the owner's original, with text links added) without a problem - different story with the other pages.

I haven't touched the other pages, which are a totally different design (the site owner did them in FrontPage), but they are VERY heavy in code, and the simulator will not spider them at all. This is what it's returning:

200 (return error code 0)

It seems to be picking up a bit, and stopping short when it hits the code for the page. What does this code mean? Is the code "choking" the spider, as I suspect?

I don't use FP (and won't), so I'm not sure I can do anything at all with these pages aside from changing the title and meta tags.

Any clues?

TIA, Marcia

DaveAtIFG

5:36 am on Sep 30, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>I ran a few pages through the spider simulator, and it works just fine - very handy!

I suspect you're talking about Brett's spider sim at Search Engine World [searchengineworld.com].

I'll bring your post to his attention and he'll likely appreciate the feedback. He's the guy to address your question since he wrote the sim. OK?

And from what you describe, it does sound like the FP code is crashing the sim.

Brett_Tabke

3:07 pm on Sep 30, 2000 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I have been working on that util on and off all week, testing some new HTML stripping code. I have also been setting it up for spider stopping. We've had people spider the sim spider for 50k, 30k, and 20k in one day before - I can't allow that. So, I've been working on adding the user's IP address to the agent string after xx page views. While all that has been going on, there have been some glitches here and there. I just have to address the spider problem or take the thing offline.
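
For readers wondering what "adding the user's IP address to the agent string after xx page views" might look like, here is a rough Python sketch. It is not the sim's actual code. The class name AgentTagger, the agent string "ExampleSimSpider/1.0", and the limit of 25 are all made up for illustration; the post does not give the real threshold.

```python
from collections import defaultdict

# Hypothetical values: the thread does not state the sim's real agent
# string or the real page-view threshold ("xx" in the post).
BASE_AGENT = "ExampleSimSpider/1.0"
PAGE_VIEW_LIMIT = 25


class AgentTagger:
    """Counts page views per client IP and, once a client goes over the
    limit, appends that IP to the agent string sent to the target site."""

    def __init__(self, base_agent=BASE_AGENT, limit=PAGE_VIEW_LIMIT):
        self.base_agent = base_agent
        self.limit = limit
        self.views = defaultdict(int)  # client IP -> page views so far

    def agent_for(self, client_ip):
        """Record one page view and return the agent string to use."""
        self.views[client_ip] += 1
        if self.views[client_ip] > self.limit:
            # Heavy user: expose the requester's IP to the site being
            # fetched, so its owner can see who is really behind the hits.
            return f"{self.base_agent} (requested by {client_ip})"
        return self.base_agent
```

With a limit of 2, the first two calls to agent_for("203.0.113.7") would return the plain agent string and the third would return it with the IP appended. A real tool would presumably also reset the counts each day, which this sketch leaves out.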

Marcia

1:44 am on Oct 1, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thank you, Dave. Brett, in this case I can conclude that it's the page itself that's the problem - it's a whopping 124K+ in size, PLUS a remotely hosted animated banner (which I believe is not doing any good) - PLUS, on top of all that, HumanClick.

I just tried two other pages from the site with no problem, but when running that page through again, came up with the same error - which in itself is actually helpful. In checking, it seems that the search engines cannot spider this particular page either.

In this case, it's more a matter of correcting site usability than rankings.

Thank you so much for providing all the useful tools and information!

stuntdubl

3:32 am on Sep 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is a very cool program, Brett. I hope you can work out the bugs, because it is a very useful utility for "not so seasoned" webmasters like myself.