I've done a quick search, but can't find any reference to this. Googlebot has requested hundreds and hundreds of pages from our website with '%09%09%09%09' inserted into the querystring/URL.
At the moment, all of these pages fail. It would be possible for us to capture the requested page, remove all of the '%09' characters and return the resulting page - but we're worried Google might then see our site as having infinite pages.
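A minimal sketch of that clean-up idea, in Python purely for illustration. One way to avoid the "infinite pages" worry is to answer a tab-laden request with a permanent redirect to the cleaned URL, rather than serving the same content at every variant. The function names and the example URL are hypothetical, and this is an assumption about one reasonable approach, not something Google has confirmed:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_encoded_tabs(url: str) -> str:
    """Return url with every '%09' (URL-encoded tab) removed from its query string."""
    parts = urlsplit(url)
    cleaned_query = parts.query.replace("%09", "")
    return urlunsplit(parts._replace(query=cleaned_query))


def respond(url: str) -> tuple[int, str]:
    """Return a (status, location) pair for a requested URL.

    Hypothetical policy: a URL containing encoded tabs is never served
    directly. It is redirected (301) to the single clean URL, so only one
    address per page ever returns content.
    """
    cleaned = strip_encoded_tabs(url)
    if cleaned != url:
        return 301, cleaned
    return 200, url
```

For example, respond("/products.htm?id=12%09%09%09%09&cat=3") gives a 301 pointing at "/products.htm?id=12&cat=3", while the clean URL itself is served normally.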
Has anyone else had experience with this? Is there a best course of action?
Are your pages plain HTML pages?
We're running on a Microsoft server platform. From what I've read, caspita isn't, so I guess it's not particularly platform-related.
We do run a dynamic site, but we're actually returning .htm pages with querystrings. The %09%09...'s are inserted into the querystring. I'm thinking that it is either Googlebot making errors, or that Googlebot is using this as a technique to check that we don't return an infinite number of pages (i.e. we don't return a page for just any querystring; it has to follow specific rules).
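To illustrate what "has to follow specific rules" could look like, here is a rough Python sketch of a querystring check. The parameter names (id, cat) and their patterns are made up for the example; the real rules would be whatever the site actually accepts:

```python
import re
from urllib.parse import parse_qsl

# Hypothetical rules: each allowed parameter name maps to the pattern
# its (decoded) value must match in full.
ALLOWED_PARAMS = {
    "id": re.compile(r"[0-9]+"),
    "cat": re.compile(r"[0-9]+"),
}


def status_for_query(query: str) -> int:
    """Return 200 if every parameter follows the rules, otherwise 404."""
    seen = set()
    for name, value in parse_qsl(query, keep_blank_values=True):
        rule = ALLOWED_PARAMS.get(name)
        # Reject unknown names, repeated names, and malformed values.
        # parse_qsl decodes '%09' to a tab, which the digit pattern rejects.
        if rule is None or name in seen or not rule.fullmatch(value):
            return 404
        seen.add(name)
    return 200
```

With these rules, "id=12&cat=3" is accepted, while "id=12%09%09%09%09&cat=3" gets a 404 because the decoded value contains tabs.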
For now, unless we hear a reason otherwise, we're going to allow the site to error and return a 404 page when these pages are requested, rather than fix it.
Incidentally, we're 99% certain we have no links with (this many) tabs in the querystring from within our site, so we think Googlebot has just 'made up' these pages.
It would not surprise me one bit if Google tested dynamic/server-generated pages further than standard ones. No offense to you; I am sure that if Google does it, it does it for some reason, likely related to influencing search results, spam, or thereabouts.
I always take my old pages and leave them on my server, but I re-code them to redirect both the user and the search engines to the new page, as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
Then upload oldpage.htm and leave it in place. I have pages that I changed 3-4 years ago, but I leave them up because, for some reason, certain search engines and links have never updated, and from time to time a visitor is still sent to oldpage.htm (which now kindly and immediately redirects the visitor to newpage.htm).
And I think it's more user-friendly than a 404. There are SOME pages which I cannot redirect to the new page (because there IS no new page); some of those I redirect to the MAIN page, and the few that are left get a custom 404 with several links to the main parts of my site.
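The three-way policy above (redirect to the new page, fall back to the main page, otherwise a custom 404) can be sketched as a simple lookup. Note that this sketch uses a server-side 301 rather than a re-coded HTML page left on the server, which is a swapped-in technique. Apart from oldpage.htm and newpage.htm, the paths are hypothetical:

```python
# Hypothetical mapping of retired pages to their replacements.
# A page with no direct replacement points at the main page ("/").
REDIRECTS = {
    "/oldpage.htm": "/newpage.htm",
    "/discontinued.htm": "/",
}

# Hypothetical set of pages that currently exist.
LIVE_PAGES = {"/", "/newpage.htm"}

CUSTOM_404 = "/404.htm"  # assumed custom error page linking to the main sections


def resolve(path: str) -> tuple[int, str]:
    """Return (status, page to serve or redirect target) for a requested path."""
    if path in LIVE_PAGES:
        return 200, path
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, CUSTOM_404
```

So resolve("/oldpage.htm") sends the visitor to "/newpage.htm", and anything not listed anywhere falls through to the custom 404.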
I think you misunderstood me. I get the redirect stuff. I do it all the time. I was talking about getting the old pages out of the SERPs. I have pages in there from 2002 that haven't been accessed or viewed. I would like to get rid of those. I've tried just about everything. It's a G supplemental problem.