| 6:12 pm on Jan 25, 2004 (gmt 0)|
Not sure what the answer to your first problem is, but I know that referrer logs don't usually show the "?" in the URI. What you're seeing is common.
| 8:00 pm on Jan 25, 2004 (gmt 0)|
I never noticed before that the "?" was not in the logs.
| 8:13 pm on Jan 25, 2004 (gmt 0)|
Did Googlebot find a link off your site that was in the "wrong" format? (or is that a dumb question, maybe?)
I'm just preparing to get into my first URL rewrite project on a Windows server, so this is a vital issue for me.
| 8:42 pm on Jan 25, 2004 (gmt 0)|
|Did Googlebot find a link off your site that was in the "wrong" format? |
Not at all. All links are in the following format: www.site.com/FL/Miami/123456.asp
That is why I am so confused. I was under the impression that the rewrite filter would change things server side, and that Gbot would never see the page www.site.com/page.asp?state=FL&cityid=123456&city=Miami
Hopefully some of the isapi_rewrite gurus can shed some light on this issue.
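For readers following along, the mapping described in this post can be sketched in a few lines of Python. This is only an illustration of what a server-side rewrite does, not ISAPI_Rewrite syntax: the regular expression and the `rewrite` function are assumptions, and only the two URL shapes come from the post.

```python
import re
from urllib.parse import urlencode

# Assumed pattern for friendly URLs shaped like /FL/Miami/123456.asp
FRIENDLY = re.compile(r"^/([A-Za-z]{2})/([^/]+)/(\d+)\.asp$")


def rewrite(path):
    """Return the internal script URL for a friendly path, or the path unchanged."""
    match = FRIENDLY.match(path)
    if match is None:
        return path
    state, city, city_id = match.groups()
    # Parameter names and order follow the dynamic URL quoted in the post.
    query = urlencode({"state": state, "cityid": city_id, "city": city})
    return "/page.asp?" + query


print(rewrite("/FL/Miami/123456.asp"))
# prints /page.asp?state=FL&cityid=123456&city=Miami
```

The client only ever requests the friendly path. The second form exists only inside the server, which is why it can show up in server logs without ever being sent to Googlebot.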
| 8:55 pm on Jan 25, 2004 (gmt 0)|
Actually let me rephrase something, the links look like:
<a href="FL/Miami/123456.asp">Link Text</a>
I wonder if that is causing the problem. I am going to change them to:
<a href="http://www.site.com/FL/Miami/123456.asp">Link Text</a>
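A quick way to see how a client would resolve the two link styles is Python's `urljoin`, which follows the standard URL resolution rules. This is a sketch under assumptions: the `resolve` helper is hypothetical, and the `index.asp` page and the Tampa link are made-up examples. The point is that a relative href resolves against the directory of the page it appears on, so it behaves differently on a page that itself sits under a rewritten path.

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve an href the way a browser or crawler would for the given page."""
    return urljoin(page_url, href)


# Relative link on a page at the site root: resolves as intended.
print(resolve("http://www.site.com/index.asp", "FL/Miami/123456.asp"))
# prints http://www.site.com/FL/Miami/123456.asp

# The same style of link on a page under /FL/Miami/: the path doubles up.
print(resolve("http://www.site.com/FL/Miami/123456.asp", "FL/Tampa/654321.asp"))
# prints http://www.site.com/FL/Miami/FL/Tampa/654321.asp

# An absolute link resolves the same way no matter where it appears.
print(resolve("http://www.site.com/FL/Miami/123456.asp",
              "http://www.site.com/FL/Tampa/654321.asp"))
# prints http://www.site.com/FL/Tampa/654321.asp
```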
| 9:46 pm on Jan 25, 2004 (gmt 0)|
I meant to ask if there might be an inbound link to your pages, like on a forum or link partner somewhere, that was in the wrong format. However, I didn't state my question very well.
| 9:52 pm on Jan 25, 2004 (gmt 0)|
|I meant to ask if there might be an inbound link to your pages, like on a forum or link partner somewhere, that was in the wrong format |
No, these pages are a couple of weeks old, and other than me, no one knows of their existence.
| 9:05 pm on Jan 26, 2004 (gmt 0)|
Googlebot saw the correct links, but IIS is serving up page.asp with all the parameters, which is what it is recording in your log files. As long as there are no links anywhere to the unfriendly URIs, Google will never be the wiser.
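To check what the log actually recorded for Googlebot, something like the following Python sketch could be used. It assumes the W3C extended log format, where the path (`cs-uri-stem`) and the query string (`cs-uri-query`) are separate fields, which is also why no "?" appears in the log. The `requested_urls` function is hypothetical and the sample lines are invented for illustration, not taken from anyone's real log.

```python
def requested_urls(log_lines, agent_substring="Googlebot"):
    """Rebuild the logged URLs for one user agent from W3C extended log lines."""
    fields = []
    urls = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns used by the lines that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if agent_substring not in row.get("cs(User-Agent)", ""):
            continue
        stem = row.get("cs-uri-stem", "")
        query = row.get("cs-uri-query", "-")
        # A lone "-" means the request had no query string.
        urls.append(stem if query == "-" else stem + "?" + query)
    return urls


# Invented sample lines, for illustration only.
SAMPLE = [
    "#Fields: date time c-ip cs-method cs-uri-stem cs-uri-query sc-status cs(User-Agent)",
    "2004-01-25 18:01:07 192.0.2.1 GET /page.asp state=FL&cityid=123456&city=Miami 200 "
    "Googlebot/2.1+(+http://www.google.com/bot.html)",
    "2004-01-25 18:02:11 192.0.2.7 GET /index.asp - 200 Mozilla/4.0+(compatible;+MSIE+6.0)",
]

print(requested_urls(SAMPLE))
# prints ['/page.asp?state=FL&cityid=123456&city=Miami']
```

Note that this only shows what the server logged after rewriting, not necessarily what the crawler asked for.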
| 9:13 pm on Jan 26, 2004 (gmt 0)|
I'd also recommend that you get into the habit of using Absolute URIs in any rewriting routine. I've seen some ill effects occur when using Relative URIs.
Absolute = http*://www.example.com/sub/file.asp
Relative = /sub/file.asp
| 9:45 pm on Jan 26, 2004 (gmt 0)|
|Googlebot saw the correct links, but IIS is serving up page.asp with all the parameters, which is what it is recording in your log files. |
That makes a lot of sense, thanks.
I feel better now.
Thanks, I'll try that.
| 10:06 pm on Jan 26, 2004 (gmt 0)|
Just to be on the safe side, I would add a Disallow: in your robots.txt file...
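One way to sanity-check such a rule before deploying it is Python's standard `urllib.robotparser`. The rule below is an assumption, since the post does not say what to disallow: it blocks anything starting with /page.asp, which covers the query string URLs from earlier in the thread. The `allowed` helper is hypothetical, and real crawlers may interpret robots.txt slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: keep all robots away from the dynamic script URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page.asp
"""


def allowed(url, agent="Googlebot"):
    """Return True if the assumed robots.txt lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(allowed("http://www.site.com/page.asp?state=FL&cityid=123456&city=Miami"))
# prints False
print(allowed("http://www.site.com/FL/Miami/123456.asp"))
# prints True
```

The friendly URLs stay crawlable because the rule matches on the requested path, and the crawler never requests /page.asp when it follows the friendly links.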
| 10:24 pm on Jan 26, 2004 (gmt 0)|
Be cautious with this. I've got a site with a similar situation. I use a 404 processor rather than ISAPI_rewrite, but Google continues to index the dynamic URLs. Virtually all linkage is to the "static" URLs, hence the relative PR of the static pages should shove any duplicate content on query string URLs out. That hasn't happened yet, though.
PR is equally misleading. The static pages show decent PR but don't appear in results, while the dynamic pages have zero PR but show up well in many cases. "Spidering" the pages shows everything OK - good URLs, 200 OK headers, etc.
There are a few things I could try, but I'm afraid to get too aggressive lest I lose the decent traffic going to the site now.
I don't think there are many links to the query string pages, and it almost seems as if Google is crawling them from memory.
| 10:39 pm on Jan 26, 2004 (gmt 0)|
|I use a 404 processor rather than ISAPI_rewrite, but Google continues to index the dynamic URLs. |
Hmmm, I might be a little leery entrusting a URI rewrite routine to a 404 processor. But, knowing you, I'm going to assume that you've covered all your bases.
I would definitely look at each step involved with this method. Is it possible that the 404 processor is returning a 302 somewhere along the way?
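Checking each step can be done by requesting a URL without following redirects and looking at the raw status code. Below is a minimal Python sketch of that idea using `http.client`, which does not follow redirects on its own. The `head_status` and `describe` names are hypothetical, and the host and path you pass in are whatever you want to test.

```python
import http.client


def head_status(host, path, port=80):
    """Send one HEAD request and return (status code, Location header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path, headers={"User-Agent": "status-check-sketch"})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status):
    """Label the status codes discussed in this thread."""
    if status in (301, 302):
        return "redirect"
    if status == 404:
        return "not found"
    if status == 200:
        return "ok"
    return "other"


# Example use (needs network access, so it is left commented out):
# status, location = head_status("www.site.com", "/FL/Miami/123456.asp")
# print(status, describe(status), location)
```

If the first response for a friendly URL is anything other than a 200, a crawler may treat the page differently from what a browser shows.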
| 7:29 am on Jan 27, 2004 (gmt 0)|
I would think Roger's problem is related to the server returning a 404 and then possibly a 301 or 302 when the 404 page redirects to the real page. I went the 404 route for a while, but it always kind of seemed like a Mickey Mouse fix. Using one of the ISAPI filters is so much cleaner and more seamless.
| 5:01 pm on Feb 1, 2004 (gmt 0)|
Hmmm. I came looking for the answer to this isapi_rewrite problem, but it doesn't look like anyone has it.
In my system, I rewrite all URLs to a script file. Googlebot gets the correct pages but the server writes the rewritten filename (the script's name) to the log. I was hoping there might be an isapi-rewrite setting/switch or something to make the requested URL be written to the log. I want to know which URLs Googlebot is requesting.
Nobody got any ideas?
If there isn't a standard way of dealing with it, I suppose I could write the requested URLs to a text file from the script.
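The text-file fallback could look roughly like this, sketched in Python rather than in the poster's actual script language. Everything here is an assumption: the `log_request` name, the tab-separated layout, and the idea that the script can see the originally requested URL and the user agent (how it gets them depends on the rewrite filter and the server).

```python
from datetime import datetime, timezone


def log_request(log_path, requested_url, user_agent, needle="Googlebot"):
    """Append the requested URL to a text file when the user agent matches."""
    if needle.lower() not in user_agent.lower():
        return False
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    # One tab-separated line per request: time, URL, user agent.
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(f"{stamp}\t{requested_url}\t{user_agent}\n")
    return True
```

A call such as `log_request("googlebot.txt", "/FL/Miami/123456.asp", agent)` from the top of the script would then build up a list of the URLs the crawler asked for, independent of what the server log records.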
| 5:05 pm on Feb 1, 2004 (gmt 0)|
Found the answer! - I think.
Second item from the bottom is about the "U" flag which, apparently, causes the "U"nmangled (requested) URL to be written to the log.
I'm gonna test it.