Our old pages are getting spidered

         

snook

1:17 pm on Apr 16, 2003 (gmt 0)

10+ Year Member



Google Deepbot and other spiders keep coming to our site and crawling our old query pages, and I can't figure out how. We have no links to those pages anywhere, not even from one old page to another.

A few weeks ago we set up mod_rewrite, and our new links are now off our index page.

How are Google and the other spiders finding and crawling these pages? From search engine results?

We want our new pages crawled. (The new pages are linked from the index page and are on our site map.)

marcs

3:49 am on Apr 17, 2003 (gmt 0)

10+ Year Member



How are Google and the other spiders finding and crawling these pages?

While you may no longer have links to those pages, other sites may.

I've noticed some pages which were removed years ago getting crawled. This is the only reason I can come up with. Someone is still linking to them.

If the pages which contain the link(s) are below PR4, it may be hard to locate them and ask their owners to change the URLs.

mcavic

4:07 am on Apr 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As long as the old urls are returning 404, and there are links to the new urls, I wouldn't worry about it - it'll work out eventually.

If someone clicks on a broken url, you should see the referer in the log, and you'll know if they're coming from a particular site rather than a search engine.
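To make the log check concrete, here is a minimal sketch that counts the referers on requests for old URLs. It assumes the server writes the Apache "combined" log format, and the `/search.cgi` prefix and the sample lines are made up for illustration, so adjust them to your own site.

```python
import re
from collections import Counter

# Apache "combined" format: ... "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def referers_for_old_urls(lines, old_prefix="/search.cgi"):
    """Count referers on requests whose path starts with old_prefix (hypothetical)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m.group("path").startswith(old_prefix):
            # "-" means no Referer header was sent, which is typical for spiders.
            counts[m.group("referer")] += 1
    return counts

# Made-up sample lines; in practice, pass an open access log file instead.
sample = [
    '1.2.3.4 - - [17/Apr/2003:04:07:00 +0000] "GET /search.cgi?id=1 HTTP/1.0" 404 512 "http://example.org/links.html" "Mozilla/4.0"',
    '5.6.7.8 - - [17/Apr/2003:04:08:00 +0000] "GET /search.cgi?id=2 HTTP/1.0" 404 512 "-" "Googlebot/2.1"',
    '9.9.9.9 - - [17/Apr/2003:04:09:00 +0000] "GET /index.html HTTP/1.0" 200 1024 "-" "Mozilla/4.0"',
]

for referer, hits in referers_for_old_urls(sample).most_common():
    print(hits, referer)
```

A real referer points at a page that still links to the old URL; a "-" entry is a request that arrived with no referer, as spider requests usually do.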

snook

5:05 am on Apr 17, 2003 (gmt 0)

10+ Year Member



These are active urls, since all of them are still in the search engine results, so I don't want to stop the traffic we are getting from them. (I would like it to be going to the new urls, but I realize that won't happen overnight.) I just can't see how the spiders are finding these as links, as they are loooong urls, and are in a session.

Doing a search via link: on any of the ones they're spidering produces no results.

I guess I won't worry about it, as mcavic suggests.
I will just look at it as doorway pages :)

mcavic

2:30 pm on Apr 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I will just look at it as doorway pages :)

Yes, absolutely. If the page returns either a 404 with a custom message, or a 301 redirect, then it acts like a doorway, but it's legitimate and can't be penalized.

On my site, when I changed all my links, I used a 301 redirect to get the spiders to update the urls. Once the spiders are done, I'll switch to a 404 with a link to get people to update their bookmarks.
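The two phases can be sketched roughly like this. It is only an illustration, not the actual setup on my site: the `OLD_TO_NEW` mapping, the paths and the `spiders_done` switch are all hypothetical, and on Apache the same thing would normally be done in the server config rather than in a script.

```python
# Hypothetical mapping from old query-style URLs to their rewritten replacements.
OLD_TO_NEW = {
    "/search.cgi?id=1": "/widgets/1.html",
    "/search.cgi?id=2": "/widgets/2.html",
}

def respond(path, spiders_done=False):
    """Return (status, headers, body) for a request to an old URL."""
    new_path = OLD_TO_NEW.get(path)
    if new_path is None:
        # Unknown URL: plain 404 in either phase.
        return 404, {}, "Not found."
    if not spiders_done:
        # Phase 1: permanent redirect so spiders replace the old URL with the new one.
        return 301, {"Location": new_path}, ""
    # Phase 2: custom 404 page that points visitors at the new address.
    body = (
        f'This page has moved to <a href="{new_path}">{new_path}</a>. '
        "Please update your bookmarks."
    )
    return 404, {"Content-Type": "text/html"}, body

print(respond("/search.cgi?id=1"))
print(respond("/search.cgi?id=1", spiders_done=True))
```

The switch is flipped by hand once the logs show the spiders requesting the new urls instead of the old ones.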