Duplicate page titles message in WMT

? after index.shtml - how to redirect?

         

Play_Bach

7:17 pm on Mar 19, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For some reason Google's Webmaster Tools is giving me a Duplicate page titles message for my index page which is .shtml, not php. URLs like example.com/? are resolving to the home page even though nowhere on the site is there such a link - these are inbound links coming from Yahoo! of all places. I have no idea why Yahoo! is indexing my site like this, but I don't need a dozen links to my home page from them.

For example, URLs like these resolve to the home page.

.com/?
.com/?abc
.com/?1234

Does anybody know the redirect to serve a 404 for URLs like these, which append a ? after the .com/

Thanks!

g1smd

8:30 pm on Mar 19, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



404 response is "not found" and is never a redirect.

Redirects have a 30x status code.

You can return one code or the other.

Both are simple to do using a RewriteRule with a preceding RewriteCond looking at %{QUERY_STRING}.
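As a rough sketch of those two options, assuming the rules go in the site's root .htaccess with mod_rewrite enabled: the ^$ pattern (root URL only) and the www.example.com hostname are placeholders, not rules confirmed for this site. Pick one response or the other, not both.

```apache
RewriteEngine On

# Option 1: answer root requests that carry a query string with 404 Not Found.
# With a non-3xx status on the R flag the substitution is dropped, hence "-".
RewriteCond %{QUERY_STRING} .
RewriteRule ^$ - [R=404,L]

# Option 2 (alternative, left commented out): 301 redirect to the bare root URL.
# The trailing "?" on the target clears the query string.
# RewriteCond %{QUERY_STRING} .
# RewriteRule ^$ http://www.example.com/? [R=301,L]
```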

Play_Bach

9:39 pm on Mar 19, 2012 (gmt 0)




Thanks g1smd

Rather than mess things up flailing in the dark here, I looked around a bit and found the rule below. I'm hesitant to run it until I get a second opinion; I probably should have deferred to more experience in the first place and asked what the preferred solution is to a problem like mine. Anyhow... thanks again!

RewriteCond %{QUERY_STRING} .
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

g1smd

9:51 pm on Mar 19, 2012 (gmt 0)




That clears the query string from every URL request anywhere on your site.

That may well interfere with things if people log in to your site or if there are any forms present. In that case you'll need to exclude POST requests with another RewriteCond.
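A minimal sketch of that POST exclusion, assuming the site-wide rule quoted in this thread is kept as-is; www.example.com is the placeholder hostname used throughout. Both conditions must be true before the rule fires.

```apache
# Leave form submissions (POST requests) alone.
RewriteCond %{REQUEST_METHOD} !^POST$
# Only act on requests that have a non-empty query string.
RewriteCond %{QUERY_STRING} .
# Redirect to the same path with the query string removed.
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
```

This still strips query strings from GET requests everywhere on the site, so it only suits a site where no page relies on URL parameters.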

lucy24

10:01 pm on Mar 19, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



even though nowhere on the site is there such a link

Do you mean that nothing on your site has a query string? If so, your redirect is certainly one way to do it. The other way is to do nothing. If the links have no existence outside of Yahoo's fevered imagination, there's no reason you have to do anything about them at all.

Quick edit: Oops, it isn't just Yahoo in a vacuum, they've been discovered by g###. Missed that. Is there a way to set a "wild card" in gwt so it ignores all parameters?

Play_Bach

10:49 pm on Mar 19, 2012 (gmt 0)




> That clears the query string from every URL request anywhere on your site.

uh... no, that's not what I want. I do have php pages on the site which have query strings, just not the home page which is .shtml. I don't know if it's because of the other php pages that Yahoo! thinks there must also be an index.php, but in any event, the home page is the only one I'm having a problem with as far as WMT is concerned.

Thanks again!

g1smd

11:32 pm on Mar 19, 2012 (gmt 0)




Change the (.*) bit which means "everything" to a new pattern which matches only the URL paths that you do want to correct.

This will probably be something like
^(index\.shtml)?$
which matches requests for path / and /index.shtml alike.

[edited by: g1smd at 11:50 pm (utc) on Mar 19, 2012]

Play_Bach

11:49 pm on Mar 19, 2012 (gmt 0)




Thanks g1smd

like this?

RewriteCond %{QUERY_STRING} .
RewriteRule ^(index\.shtml)?$ http://www.example.com/$1? [R=301,L]

g1smd

11:51 pm on Mar 19, 2012 (gmt 0)




Remove $1 from the target URL, otherwise you'll be redirecting to a named index file.

You need to redirect only to / here.

Play_Bach

11:59 pm on Mar 19, 2012 (gmt 0)




Ok, thanks. So something like this?

RewriteCond %{QUERY_STRING} .
RewriteRule ^(index\.shtml)?$ http://www.example.com/? [R=301,L]

Also, I noticed on my other redirects the use of quotes and back slashes such as

RewriteCond %{HTTP_HOST} ^.*$
RewriteRule ^blog$ "http\:\/\/example\.blogspot\.com" [R=301,L]

Do I not need those?

Thanks again!

g1smd

12:03 am on Mar 20, 2012 (gmt 0)




You certainly do not need quotes or backslashes on the target URL.

When you redirect to the root page of a site, the target URL ends with a trailing slash.

Make sure your rules are in the right order, from most specific to most general. This avoids an unwanted multiple step redirection chain for some requests.
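To illustrate the ordering point, here is a hedged sketch. The blog and home-page rules come from this thread (the blog rule without its quotes and backslashes, and without its RewriteCond, which matches every hostname anyway). The final hostname rule is a hypothetical example of a "general" rule and may not exist on this site.

```apache
RewriteEngine On

# Specific: one named path, sent to an external site.
RewriteRule ^blog$ http://example.blogspot.com/ [R=301,L]

# Specific: home page requested with a query string.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(index\.shtml)?$ http://www.example.com/? [R=301,L]

# General (hypothetical): any request on the bare hostname goes to www.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

In this order a request for example.com/?abc reaches www.example.com/ in a single redirect. With the hostname rule first, the same request would take two hops: first to www.example.com/?abc, then to www.example.com/.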

Play_Bach

12:13 am on Mar 20, 2012 (gmt 0)




ok, well I screwed that up :-|
I tried this:

RewriteCond %{QUERY_STRING} .
RewriteRule ^(index\.shtml)?$ http://www.example.com/ [R=301,L]

Firefox spun around for a little and put up an error page

The page isn't redirecting properly

Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

This problem can sometimes be caused by disabling or refusing to accept cookies.

g1smd

12:21 am on Mar 20, 2012 (gmt 0)




Use the Live HTTP Headers extension for Firefox and you will see that it redirects to the same URL over and over in a loop until the browser gives up.

That URL has the original query string on the end.

Play_Bach

12:28 am on Mar 20, 2012 (gmt 0)




Right you are, thanks. Ok, so put the above code back in and fired up Safari with cache cleared, entered example.com/? in the address bar, hit return - no redirect. I'm stumped.

g1smd

12:58 am on Mar 20, 2012 (gmt 0)




"above code"

Which one?



Be aware that a request such as example.com/? doesn't contain a query string.
A query string will be the characters after the question mark.

[edited by: g1smd at 1:49 am (utc) on Mar 20, 2012]

Play_Bach

1:27 am on Mar 20, 2012 (gmt 0)




sorry g1smd, this code

RewriteCond %{QUERY_STRING} .
RewriteRule ^(index\.shtml)?$ http://www.example.com/ [R=301,L]

> Be aware that example.com/? doesn't contain a query string.

Ok, now I'm confused. What is that '?' doing?

Thanks again for the help!

[edited by: Play_Bach at 1:29 am (utc) on Mar 20, 2012]

g1smd

1:29 am on Mar 20, 2012 (gmt 0)




Test it again using a query string.

Be aware that a request such as example.com/? doesn't contain a query string.
A query string will be the characters after the question mark.

[edited by: g1smd at 1:31 am (utc) on Mar 20, 2012]

Play_Bach

1:30 am on Mar 20, 2012 (gmt 0)




sorry again, replies out of sync here

> Be aware that example.com/? doesn't contain a query string.

Ok, now I'm confused. What is that '?' doing in the URL?

g1smd

1:31 am on Mar 20, 2012 (gmt 0)




You had that in your first post.

Play_Bach

1:37 am on Mar 20, 2012 (gmt 0)




Ok, went into WMT and looked at the strings. You're right, no .com/? URLs.

Duplicate title URLs look like so:

/
/?abc
/?1234

So I went back and tried it again in Safari with everything cleared and copied over one of the URLs with a string after the ? from WMT. Same problem as in Firefox, too many redirects, Safari spun out and put up a warning page. Obviously I'm screwing something up here.

g1smd

1:48 am on Mar 20, 2012 (gmt 0)




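Notice that your code is not removing the query string. When the target URL has no ? of its own, mod_rewrite re-appends the original query string, so the redirected request matches the rule again and the redirect runs over and over.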

One of your rules above does have the correct code to remove query strings. Make that single fix to your current code.

Play_Bach

3:02 am on Mar 20, 2012 (gmt 0)




Ok, looks like I have some homework to do. Thanks again. :-)

lucy24

4:38 am on Mar 20, 2012 (gmt 0)




Input: Request that has query string:
RewriteCond %{QUERY_STRING} .


Output: Request that has no query string:
http://www.example.com/?


If you can get them both into the same post at the same time-- both of the above are quoted from you at different points in this thread --then your next step will be to get them both into the same mod_rewrite at the same time :)

g1smd

8:46 am on Mar 20, 2012 (gmt 0)




Yep. That's the fix.

Redirect requests with query strings and strip the query string from the target URL.
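Putting the two halves quoted by lucy24 into one ruleset gives something like the sketch below. It assumes the rules sit in the root .htaccess and that www.example.com stands in for the real hostname.

```apache
RewriteEngine On

# Input: home page (/ or /index.shtml) requested with a non-empty query string.
# Output: the bare root URL. The trailing "?" on the target stops mod_rewrite
# from re-appending the original query string, which is what caused the loop.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(index\.shtml)?$ http://www.example.com/? [R=301,L]
```

A request for /?abc or /index.shtml?1234 should get a 301 to http://www.example.com/, while /page.php?id=1 does not match the pattern and keeps its parameters. A bare /? has an empty query string, so the condition does not match it. On Apache 2.4 and later, the QSD flag (with no trailing ? on the target) is another way to discard the query string.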