Sim Spider

macrost

2:55 pm on Apr 20, 2005 (gmt 0)

10+ Year Member



Well, I tried the sim spider [searchengineworld.com] just a few minutes ago, and it only spidered half the page.
Not sure if anyone knew about this!

Brett_Tabke

6:11 am on Apr 21, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Working ok here. Sticky me the url.

merijn123

1:57 pm on Apr 21, 2005 (gmt 0)



Does your site contain JavaScript links? Maybe that's the reason the spider can't index your whole page. If so, don't forget that search engine spiders have the same problem. Make your links plain href links.
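A quick way to check a page for this is to list which anchors carry a plain href and which rely on script. The sketch below is only an illustration using Python's standard html.parser; the names LinkAudit and audit_links are made up for this example, and it says nothing about how the sim spider itself parses links.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Sorts <a> tags into plain href links and script-only links."""

    def __init__(self):
        super().__init__()
        self.crawlable = []    # href values a simple spider can follow
        self.script_only = []  # attribute dicts of anchors with no usable href

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # A valueless or missing href comes through as None, so fall back to "".
        href = (attr_map.get("href") or "").strip()
        if not href or href == "#" or href.lower().startswith("javascript:"):
            self.script_only.append(attr_map)
        else:
            self.crawlable.append(href)


def audit_links(html):
    """Return (crawlable hrefs, script-only anchors) for an HTML string."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.crawlable, parser.script_only
```

Feed it the page source; anything that lands in the second list is a link a basic spider probably cannot follow.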

macrost

10:28 am on Apr 23, 2005 (gmt 0)

10+ Year Member



Brett,
Stickied url.

Brett_Tabke

2:37 pm on Apr 23, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Ya, it's the page.

If the spider is choking on some of that, it is a hint that the search engines might be too.

Where to start looking for trouble:

- Get it close to validating (fix some of the scripting, get it escaped...). Currently 54 HTML errors, and many are nontrivial.
- Dump the 22 meta tags.
- Dump the repetitive and unnecessary metas. (I would get it down to 4 or fewer; the rest is seen as spam.)
- Strip out all that code-beautification spacing (5k savings, probably...).
- Preference: move some of that bitmap area data to the bottom of the page if possible (move the content UP in the code; you can do it without changing the look of the page).
- Preference: change some of that JS so that it is all on one line, and escape the data with quotes.
- Preference: either quote all your attribute values or quote none of them, but don't mix them (it can confuse HTML strippers when mismatched, as it did sim spider).
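Two of the checks above (meta tag count and mixed attribute quoting) are easy to script. The following is a rough sketch, not the sim spider's own logic: MarkupAudit, audit_markup, and the max_meta parameter are hypothetical names, the default of 4 simply mirrors the suggestion above, and the regex is a simplification that will miscount on badly broken tags.

```python
import re
from html.parser import HTMLParser

# Matches name=value pairs inside a raw start tag; group 1 is the value,
# including its surrounding quotes when present.
ATTR_RE = re.compile(r"""[\w:-]+\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)""")


class MarkupAudit(HTMLParser):
    """Counts <meta> tags and quoted vs. unquoted attribute values."""

    def __init__(self):
        super().__init__()
        self.meta_count = 0
        self.quoted = 0
        self.unquoted = 0

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            self.meta_count += 1
        # The parsed attrs have quotes stripped, so inspect the raw tag text.
        raw = self.get_starttag_text() or ""
        for match in ATTR_RE.finditer(raw):
            if match.group(1)[0] in "\"'":
                self.quoted += 1
            else:
                self.unquoted += 1


def audit_markup(html, max_meta=4):
    """Summarize meta tag count and attribute quoting style for a page."""
    parser = MarkupAudit()
    parser.feed(html)
    parser.close()
    return {
        "meta_count": parser.meta_count,
        "too_many_meta": parser.meta_count > max_meta,
        "mixed_quoting": parser.quoted > 0 and parser.unquoted > 0,
    }
```

It is no substitute for running the page through a validator, but it gives a fast yes/no on the two items that are easiest to fix.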

macrost

4:48 pm on Apr 23, 2005 (gmt 0)

10+ Year Member



Brett,
I was thinking that also. Unfortunately the client thinks he has the inside track and insists on keeping it that way. Oh well, we'll see!
:-)

On another note, I used a couple of other checkers and they pulled the page just fine. Just how quick is the sim spider? Most of that content is dynamic.