Forum Moderators: DixonJones & mademetop

%09 in Logfiles

Googlebot requesting strange pages.

     
10:27 am on Nov 30, 2005 (gmt 0)

New User

10+ Year Member

joined:Oct 6, 2004
posts:9
votes: 0


Hi

I've done a quick search, but can't find a reference to this. Googlebot has requested hundreds and hundreds of pages from our website with '%09%09%09%09' inserted into the query string/URL.

At the moment, all of these pages fail. It would be possible for us to capture the requested page, remove all of the '%09' characters and return the resulting page - but we're worried Google might then see our site as having infinite pages.

Has anyone else experience with this? Is there a best course of action?
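
One way to read the "strip and return" option is sketched below in Python (not the poster's actual server code). Instead of serving the cleaned page at the odd URL, the server answers with a 301 redirect to the cleaned URL, which should keep each page reachable at a single address. The function names and the example path are hypothetical.

```python
from typing import Tuple


def strip_encoded_tabs(request_uri: str) -> str:
    """Remove every percent-encoded tab (%09) from a raw request URI."""
    return request_uri.replace("%09", "")


def plan_response(request_uri: str) -> Tuple[int, str]:
    """Return (status, location) for a raw request URI.

    A URI containing %09 gets a 301 to its cleaned form; anything else
    is served normally (200) at the address that was requested.
    """
    cleaned = strip_encoded_tabs(request_uri)
    if cleaned != request_uri:
        return 301, cleaned
    return 200, request_uri


# Hypothetical request of the kind described in the post.
print(plan_response("/page.htm?id=12%09%09%09%09"))  # (301, '/page.htm?id=12')
```

Whether the cleaned URL is itself valid still has to be checked by the site's normal rules; this sketch only removes the tabs.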

6:42 pm on Dec 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 3, 2002
posts:894
votes: 0


%09 is the URL-encoded (percent-encoded) form of a tab character. I'm not sure why it would show up in a URL. I tried some experiments with plain-text HTML and could not reproduce it. It could be a parsing error in the code that generates one of your links, or it could be that G is broken for some reason.
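
To illustrate the encoding, here is a minimal Python sketch that decodes a query string containing `%09`. The URL and the `decode_query` helper are made up for illustration; `urlsplit` and `unquote` come from the standard library's `urllib.parse`.

```python
from urllib.parse import unquote, urlsplit


def decode_query(url: str) -> str:
    """Return the query string of `url` with percent-escapes decoded."""
    return unquote(urlsplit(url).query)


# Hypothetical URL with four encoded tabs appended to the query string.
url = "http://example.com/page.htm?id=12%09%09%09%09"
print(repr(decode_query(url)))  # prints 'id=12\t\t\t\t'
```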

Are your pages plain HTML pages?

6:45 pm on Dec 1, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Dec 1, 2003
posts:311
votes: 0


You may want to try this.

[webmasterworld.com...]

12:46 pm on Dec 2, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 3, 2002
posts:894
votes: 0


Your link describes the same problem. I wonder what causes it. What platforms and servers are you guys running? Linux/Apache?

2:39 pm on Dec 2, 2005 (gmt 0)

New User

10+ Year Member

joined:Oct 6, 2004
posts:9
votes: 0


Hi and thanks for the responses....

We're running on a Microsoft server platform. From what I've read, caspita isn't, so I guess it's not particularly platform-related.

We do run a dynamic site, but we're actually returning .htm pages with query strings. The %09%09...'s are inserted into the query string. I'm thinking that it is either a Googlebot error, or that Googlebot is using this as a technique to check that we don't return an infinite number of pages (i.e. we don't return a page for just any query string; it has to follow specific rules).

For now, unless we hear a reason otherwise, we're going to allow the site to error and return a 404 page when these pages are requested, rather than fix it.
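
The "specific rules, otherwise 404" behaviour described above can be sketched in Python as a whitelist check on the query string. The rule used here (a single numeric `id` parameter) and the function name are assumptions for illustration only; the site's real rules are not given in this thread.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rule: the query string must be exactly "id=<digits>".
VALID_QUERY = re.compile(r"id=\d+")


def status_for(request_uri: str) -> int:
    """Return 200 if the query string follows the rule, else 404."""
    query = urlsplit(request_uri).query
    return 200 if VALID_QUERY.fullmatch(query) else 404


print(status_for("/page.htm?id=12"))              # 200
print(status_for("/page.htm?id=12%09%09%09%09"))  # 404
```

Because the check matches the whole raw query string, any stray `%09` makes the request fail without special-casing tabs.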

Incidentally, we're 99% certain we have no links with (this many) tabs in the querystring from within our site, so we think Googlebot has just 'made up' these pages.

Interesting though.....

4:23 pm on Dec 2, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 3, 2002
posts:894
votes: 0


Yep,

I think returning 404 pages is the best solution. Theoretically, the pages should be dropped eventually. But I have been having a hard time lately with old pages still showing in the SERPs.

Good Luck!

3:19 am on Dec 7, 2005 (gmt 0)

Junior Member

joined:Mar 13, 2005
posts:174
votes: 0


webmaster99:

It would not surprise me one bit if Google tested dynamic/server-generated pages more thoroughly than standard ones. No offense to you; I am sure that if Google does it, it does it for a reason, likely related to influencing search results, spam, or thereabouts.

webdude:
I always leave my old pages on my server, but I re-code them to redirect both the user AND the search engines to the new page, as follows.
oldpage.htm code:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">

<html>
<head>
<meta name="robots"
content="noindex,follow">

<title></title>
<script type="text/javascript">
<!--
// Redirect immediately without adding oldpage.htm to the browser history.
location.replace('http://yourdomaine.com/newpage.htm');
//-->
</script>
</head>

<body>
</body>
</html>

Then upload oldpage.htm and leave it in place. I have pages that I changed 3-4 years ago and still leave up, because for some reason certain search engines and links have never updated, and from time to time a visitor is still sent to oldpage.htm (which now kindly and immediately redirects the visitor to newpage.htm).
And I think it's more user-friendly than a 404. There are SOME pages which I cannot redirect to a new page (because there IS no new page); some of those I redirect to the MAIN page, and the few that are left get a custom 404 with several links to the main parts of my site.
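
The three cases above (redirect to the new page, redirect to the main page, custom 404) can be sketched as a lookup table. Note that this Python sketch swaps in a server-side 301 redirect rather than the JavaScript redirect shown in oldpage.htm, and every path and name in it is hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical old-to-new mapping for pages that have a replacement.
MOVED = {"/oldpage.htm": "/newpage.htm"}
# Hypothetical old pages with no replacement that go to the main page.
SEND_TO_MAIN = {"/discontinued.htm"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, redirect target) for a requested old path."""
    if path in MOVED:
        return 301, MOVED[path]
    if path in SEND_TO_MAIN:
        return 301, "/"
    # Everything else falls through to the custom 404 page.
    return 404, None


print(resolve("/oldpage.htm"))  # (301, '/newpage.htm')
```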

Peace out

8:57 pm on Dec 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 3, 2002
posts:894
votes: 0


topsites

I think you misunderstood me. I get the redirect stuff. I do it all the time. I was talking about getting the old pages out of the SERPs. I have pages in there from 2002 that haven't been accessed or viewed. I would like to get rid of those. I've tried just about everything. It's a G supplemental problem.