Home / Forums Index / WebmasterWorld / Website Analytics - Tracking and Logging
Forum Library, Charter, Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

    
%09 in Logfiles
Googlebot requesting strange pages.
webmaster99

10+ Year Member



 
Msg#: 3834 posted 10:27 am on Nov 30, 2005 (gmt 0)

Hi

I've done a quick search, but can't find reference to this. Googlebot has requested hundreds and hundreds of pages from our website with '%09%09%09%09' inserted into the querystring/URL.

At the moment, all of these pages fail. It would be possible for us to capture the requested page, remove all of the '%09' characters and return the resulting page - but we're worried Google might then see our site as having infinite pages.

Has anyone else experience with this? Is there a best course of action?
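The "remove all of the '%09' characters" idea described above could be sketched roughly as follows in Python. This is only an illustration, not what the site actually runs: the function name `strip_encoded_tabs` and the example URL are made up, and whether the cleaned URL should then be served, redirected to, or still answered with a 404 is a separate policy decision.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_encoded_tabs(url: str) -> str:
    """Return the URL with every percent-encoded tab (%09) removed.

    Hypothetical helper: only the path and the query string are touched.
    """
    parts = urlsplit(url)
    cleaned_path = parts.path.replace("%09", "")
    cleaned_query = parts.query.replace("%09", "")
    return urlunsplit(parts._replace(path=cleaned_path, query=cleaned_query))


# Made-up example of the kind of request described in the post.
requested = "/products.htm?id=12%09%09%09%09&cat=3"
print(strip_encoded_tabs(requested))
```

Because every tab-padded variant collapses to the same cleaned URL, the cleaned form can be compared against the set of URLs the site really has before deciding how to respond.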

 

webdude

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3834 posted 6:42 pm on Dec 1, 2005 (gmt 0)

%09 is the percent-encoded (URL-escaped) form of a tab character. I'm not sure why it would be showing up in a URL. I tried some experiments in plain-text HTML and could not reproduce it. It could be a parsing error in your code where a link is generated, or it could be that G is broken for some reason.

Are your pages plain HTML pages?
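To see the tab encoding for yourself, here is a small Python sketch using the standard library's `urllib.parse`. The query string and the helper name `count_encoded_tabs` are invented for illustration.

```python
from urllib.parse import quote, unquote


def count_encoded_tabs(query: str) -> int:
    """Return how many tab characters a percent-encoded string decodes to."""
    return unquote(query).count("\t")


raw_query = "id=12%09%09%09%09"  # hypothetical query string from a log line

print(repr(unquote(raw_query)))       # decoded form, tabs shown as \t
print(quote("\t"))                    # a literal tab encoded back to its escape
print(count_encoded_tabs(raw_query))  # number of tabs hidden in the query
```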

caspita

10+ Year Member



 
Msg#: 3834 posted 6:45 pm on Dec 1, 2005 (gmt 0)

You may want to try this.

[webmasterworld.com...]

webdude

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3834 posted 12:46 pm on Dec 2, 2005 (gmt 0)

Your link describes the same problem. I wonder what causes it. What platforms and servers are you guys running? Linux/Apache?

webmaster99

10+ Year Member



 
Msg#: 3834 posted 2:39 pm on Dec 2, 2005 (gmt 0)

Hi and thanks for the responses....

We're running on a Microsoft server platform. From what I've read, caspita isn't, so I guess it's not particularly platform-related.

We do run a dynamic site, but we're actually returning .htm pages with querystrings. The %09%09...'s are inserted into the querystring. I'm thinking that either Googlebot is making errors, or Googlebot is using this as a technique to check that we don't return an infinite number of pages (i.e. we don't return a page for just any querystring; it has to follow specific rules).

For now, unless we hear a reason otherwise, we're going to allow the site to error and return a 404 page when these pages are requested, rather than fix it.

Incidentally, we're 99% certain we have no links with (this many) tabs in the querystring from within our site, so we think Googlebot has just 'made up' these pages.

Interesting though.....
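One way to move from "99% certain" towards certain is to scan the site's own HTML output for links that contain a literal tab or an encoded %09. The following is a minimal Python sketch using the standard library's `html.parser`. The class and function names are hypothetical, and it only looks at `href` attributes on `<a>` tags, so links built by script would be missed.

```python
from html.parser import HTMLParser


class TabLinkFinder(HTMLParser):
    """Collect href values that contain a literal tab or an encoded %09."""

    def __init__(self):
        super().__init__()
        self.suspect_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and ("\t" in value or "%09" in value):
                self.suspect_links.append(value)


def find_tab_links(html_text: str) -> list:
    """Return every suspect href found in one page of HTML."""
    finder = TabLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.suspect_links


# Made-up page fragment: the second and third links would be flagged.
sample = (
    '<a href="ok.htm?id=1">a</a>'
    '<a href="bad.htm?id=2\t\t">b</a>'
    '<a href="enc.htm?id=3%09">c</a>'
)
print(find_tab_links(sample))
```

Running something like this over the generated pages (not just the templates) would show whether a stray tab in a link template is the source of the requests.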

webdude

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3834 posted 4:23 pm on Dec 2, 2005 (gmt 0)

Yep,

I think returning 404 pages is the best solution. Theoretically, the pages should be dropped eventually. But I have been having a hard time lately with old pages still showing in the SERPs.

Good Luck!

topsites



 
Msg#: 3834 posted 3:19 am on Dec 7, 2005 (gmt 0)

webmaster99:

It would not surprise me one bit if Google tested dynamic/server-generated pages more thoroughly than standard ones. No offense to you; I am sure that if Google does it, it does it for a reason, likely related to search result quality, spam, or thereabouts.

webdude:
I always leave my old pages on my server, but I re-code them to redirect both the user AND the search engines to the new page, as follows:
oldpage.htm code:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">

<html>
<head>
<meta name="robots" content="noindex,follow">

<title></title>
<script type="text/javascript">
// Send the visitor to the new page; replace() keeps oldpage.htm out of the browser history.
location.replace('http://yourdomaine.com/newpage.htm');
</script>
</head>

<body>
</body>
</html>

Then upload oldpage.htm and leave it in place. I have pages that I changed 3-4 years ago, but I leave them up because certain search engines and links have never updated, and from time to time a visitor is still sent to oldpage.htm (which now kindly and immediately redirects the visitor to newpage.htm).
I also think it's more user-friendly than a 404. There are SOME pages which I cannot redirect to a new page (because there IS no new page); some of those I redirect to the MAIN page, and the few that are left get a custom 404 with several links to the main parts of my site.

Peace out

webdude

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3834 posted 8:57 pm on Dec 7, 2005 (gmt 0)

topsites

I think you misunderstood me. I get the redirect stuff. I do it all the time. I was talking about getting the old pages out of the SERPs. I have pages in there from 2002 that haven't been accessed or viewed. I would like to get rid of those. I've tried just about everything. It's a G supplemental problem.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved