Forum Library, Charter, Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

    
%09 in Logfiles
Googlebot requesting strange pages.
webmaster99




msg:893741
 10:27 am on Nov 30, 2005 (gmt 0)

Hi

I've done a quick search, but can't find a reference to this. Googlebot has requested hundreds and hundreds of pages from our website with '%09%09%09%09' inserted into the querystring/URL.

At the moment, all of these pages fail. It would be possible for us to capture the requested page, remove all of the '%09' characters and return the resulting page - but we're worried Google might then see our site as having infinite pages.
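A minimal sketch of the clean-up described above, assuming the raw requested URL is available as a string before any decoding. The function name `stripEncodedTabs` and the example URL are hypothetical and not taken from the original site.

```javascript
// Hypothetical helper: remove every percent-encoded tab (%09) from a raw URL.
function stripEncodedTabs(rawUrl) {
  return rawUrl.replace(/%09/g, "");
}

// Example URL is made up for illustration.
const requested = "/products.htm?id=12%09%09%09%09&cat=3";
const cleaned = stripEncodedTabs(requested);

console.log(cleaned); // "/products.htm?id=12&cat=3"
```

Whether to serve the cleaned page at all is a separate question; this only shows that the stripping itself is a one-line operation.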

Has anyone else experience with this? Is there a best course of action?

 

webdude




msg:893742
 6:42 pm on Dec 1, 2005 (gmt 0)

%09 is the URL-encoded (percent-encoded) form of a tab character. Not sure why this would be showing up in a URL. I tried some experiments in plain-text HTML and could not reproduce it. It could be a parsing error in your code related to how a link is generated. It could also be that G is broken for some reason.
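For anyone who wants to check the encoding themselves, here is a quick sketch using the standard JavaScript functions `encodeURIComponent` and `decodeURIComponent`. The variable names are only for illustration.

```javascript
// A tab is character code 9, so its percent-encoded form is %09.
const encodedTab = encodeURIComponent("\t");
const decodedTab = decodeURIComponent("%09");

console.log(encodedTab);               // "%09"
console.log(decodedTab === "\t");      // true
console.log(decodedTab.charCodeAt(0)); // 9
```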

Are your pages plain HTML pages?

caspita




msg:893743
 6:45 pm on Dec 1, 2005 (gmt 0)

You may want to try this.

[webmasterworld.com...]

webdude




msg:893744
 12:46 pm on Dec 2, 2005 (gmt 0)

Your link tells of the problem. I wonder what causes it. What platforms and servers are you guys running? Linux/Apache?

webmaster99




msg:893745
 2:39 pm on Dec 2, 2005 (gmt 0)

Hi and thanks for the responses....

We're running on a Microsoft server platform. caspita isn't, from what I've read, so I guess it's not particularly platform related.

We do run a dynamic site, but we're actually returning .htm pages with querystrings. The %09%09...'s are inserted into the querystring. I'm thinking that it is either Googlebot running into errors, or that Googlebot is using this as a technique to check we don't return an infinite number of pages (i.e. we don't return a page for any querystring, it has to follow specific rules).

For now, unless we hear a reason otherwise, we're going to allow the site to error and return a 404 page when these pages are requested, rather than fix it.
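To illustrate what "has to follow specific rules" could look like, here is a rough sketch that decides between a 200 and a 404 for a requested URL. The parameter names `id` and `cat`, the integer-only rule, and the function name `statusForRequest` are all assumptions made up for the example; the real site's rules are not described in this thread.

```javascript
// Hypothetical whitelist of query parameters the site understands.
const ALLOWED_PARAMS = new Set(["id", "cat"]);

// Return the HTTP status a request should get: 200 only when every
// parameter is known and its value is a plain integer, otherwise 404.
function statusForRequest(rawUrl) {
  const queryStart = rawUrl.indexOf("?");
  if (queryStart === -1) return 200; // no querystring to validate

  const query = rawUrl.slice(queryStart + 1);
  for (const pair of query.split("&")) {
    const [name, value = ""] = pair.split("=");
    if (!ALLOWED_PARAMS.has(name)) return 404;
    if (!/^\d+$/.test(value)) return 404; // rejects "12%09%09" and empty values
  }
  return 200;
}

console.log(statusForRequest("/products.htm?id=12&cat=3"));       // 200
console.log(statusForRequest("/products.htm?id=12%09%09%09%09")); // 404
```

With a strict check like this, any number of made-up querystrings still map to a single 404 response, which is the behaviour described above.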

Incidentally, we're 99% certain we have no links with (this many) tabs in the querystring from within our site, so we think Googlebot has just 'made up' these pages.

Interesting though.....

webdude




msg:893746
 4:23 pm on Dec 2, 2005 (gmt 0)

Yep,

I think returning 404 pages is the best solution. Theoretically, the pages should be dropped eventually. But I have been having a hard time with old pages still showing in the SERPs lately.

Good Luck!

topsites




msg:893747
 3:19 am on Dec 7, 2005 (gmt 0)

webmaster99:

It would not surprise me one bit if Google tested dynamic/server-generated pages more thoroughly than standard ones. No offense to you; I am sure that if Google does it, it does it for some reason, likely related to influencing search results / spam or thereabouts.

webdude:
I always take my old pages and leave them on my server but I re-code them to redirect the user AND the se's to the new page as follows:
oldpage.htm code:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">

<html>
<head>
<!-- Keep this old page out of the index but let crawlers follow its links -->
<meta name="robots" content="noindex,follow">

<title></title>
<script type="text/javascript">
// Send the visitor to the new page without adding a history entry
location.replace('http://yourdomaine.com/newpage.htm');
</script>
</head>

<body>
</body>
</html>

Then upload oldpage.htm and leave it in place. I have pages that changed 3-4 years ago, but I leave the old ones up because, for some reason, certain se's / links have never updated, and from time to time a visitor is still sent to oldpage.htm (which now kindly and immediately redirects the visitor to newpage.htm).
And I think it's more user-friendly than a 404. There are SOME pages which I cannot redirect to the new page (because there IS no new page); some of those I redirect to the MAIN page, and the few that are left get a custom 404 with several links to the main parts of my site.
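The three-way policy above (new page, main page, custom 404) can be sketched as a simple lookup. This is only an illustration: the paths, the table names `MOVED_PAGES` and `SEND_TO_MAIN`, and the function `resolveOldPage` are hypothetical, and the sketch only decides what to do, not how the redirect is delivered.

```javascript
// Hypothetical mapping from retired pages to their replacements.
const MOVED_PAGES = new Map([
  ["/oldpage.htm", "/newpage.htm"],
]);

// Hypothetical retired pages that have no replacement and go to the main page.
const SEND_TO_MAIN = new Set(["/old-promo.htm"]);

// Decide what to do with a request for an old page.
function resolveOldPage(path) {
  if (MOVED_PAGES.has(path)) {
    return { action: "redirect", target: MOVED_PAGES.get(path) };
  }
  if (SEND_TO_MAIN.has(path)) {
    return { action: "redirect", target: "/" };
  }
  // Everything else gets the custom 404 with links to the main sections.
  return { action: "custom404", target: null };
}

console.log(resolveOldPage("/oldpage.htm").target); // "/newpage.htm"
```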

Peace out

webdude




msg:893748
 8:57 pm on Dec 7, 2005 (gmt 0)

topsites

I think you misunderstood me. I get the redirect stuff. I do it all the time. I was talking about getting the old pages out of the SERPs. I have pages in there from 2002 that haven't been accessed or viewed. I would like to get rid of those. I've tried just about everything. It's a G supplemental problem.

