Forum Moderators: Robert Charlton & goodroi


GWT duplicate title tag for url with trailing ?


MrBlack

7:14 pm on Mar 19, 2014 (gmt 0)

10+ Year Member



In my GWT account I have been flagged for duplicate title tags and descriptions for URLs with a trailing ?, e.g.
http://www.example.com/page-1/
and
http://www.example.com/page-1/?376462764


It appears that Googlebot was just testing to see what would happen if it tried the URL with a trailing ?, as there are no internal links to URLs with a trailing ?.

I know this will probably not impact rankings, but is it possible to return a 404 for the URLs with a trailing ?

Cheers

[edited by: brotherhood_of_LAN at 7:18 pm (utc) on Mar 19, 2014]
[edit reason] changed to example.com [/edit]

Mentat

7:23 pm on Mar 19, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Use a canonical tag.

mack

7:53 pm on Mar 19, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The canonical tag would be the best way to go about this. You could also, in theory, produce a 404 using .htaccess when the URL meets certain criteria (a trailing ?).
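For illustration, a minimal sketch of that second option, assuming Apache with mod_rewrite enabled and that you want any request carrying a query string (even an empty trailing ?) to return 404:

```apache
# THE_REQUEST holds the raw request line, so a literal "?"
# means a query string was sent, even an empty one.
RewriteEngine on
RewriteCond %{THE_REQUEST} \?
# "-" means no substitution; R=404 stops rewriting and
# returns a 404 response for the request.
RewriteRule .? - [R=404,L]
```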

For simplicity I would stick with Mentat's suggestion. This will inform Google and other search engines that even if there are two possible ways of arriving at a page, one is the preferred/default method.
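As a sketch, using the example.com URL from this thread, that would mean putting this in the <head> of the page so every variant points at the preferred URL:

```html
<!-- Served in the <head> of every version of the page,
     including /page-1/?376462764 -->
<link rel="canonical" href="http://www.example.com/page-1/" />
```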

Mack.

lucy24

8:52 pm on Mar 19, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You say "trailing ?", but isn't it really the same path with and without a query string? What do you see in the Parameters area of GWT?

MrBlack

9:59 pm on Mar 19, 2014 (gmt 0)

10+ Year Member



Under the Parameters tab it says there are no problems, so I do not need to configure them.

It just seems pointless to me to flag duplicate descriptions and titles when there are no links that I know of pointing to http://www.example.com/page-1/?s=6172dfa9937aee91206a1d612243588c

In fact, I have tested putting a trailing ? on the URL on a whole load of sites, and none returned a 404... all show the same content as without the trailing ?.

JD_Toims

10:11 pm on Mar 19, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It just seems pointless to me to flag duplicate descriptions and titles when there are no links that I know of pointing to http://www.example.com/page-1/?s=6172dfa9937aee91206a1d612243588c

No argument from me.

In fact, I have tested putting a trailing ? on the URL on a whole load of sites, and none returned a 404... all show the same content as without the trailing ?.

Not all sites do:

### .htaccess file ###

# THE_REQUEST is the raw request line, so matching a literal "?"
# catches any query string, including an empty trailing "?".
RewriteEngine on
RewriteCond %{THE_REQUEST} \?
# Redirect to the same path with the query string stripped --
# the trailing "?" in the substitution removes it.
RewriteRule .? http://www.example.com%{REQUEST_URI}? [R=301,L]

Besides keeping issues like this from happening with "silly" tools, it's a nice little security addition for sites that are actually dynamic but use static URLs: it makes it really tough to "inject" random variables into a script, because every query string sent by a browser or bot is automatically stripped via the redirect. That's especially true if you also validate URL characters, disallow POST requests except to pages that should accept them, and thoroughly scrub anything coming into those.

rainborick

4:46 am on Mar 20, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well, it's hardly pointless. Technically, the same path with and without a query string is two distinct URLs, and it's not uncommon for a site to serve different content for the same path depending on the presence and content of a query string. It's also common for such URLs to have the same <title> regardless of the query string, and I think everyone would agree that's not best practice. So all you're seeing in Webmaster Tools is Google's automated warning alerting you to the situation.

The problem URL you post in your second message looks a lot like a session ID. If your site uses sessions, you should use the 'URL Parameters' control in Webmaster Tools to get Googlebot to ignore it. Google generally handles session IDs pretty well on its own, but you can get things rolling faster by using the tool. If you don't use sessions, then the source is external, and you can waste a lot of time trying to get the link fixed, because it's probably from a scraper or spammer who couldn't care less about your site.

I'd suggest the rel="canonical" tag over a redirect in a situation like this. The canonical tag will reinforce Google's native processes and should repair the situation faster. With a broad redirect you'll have to do extra maintenance on your .htaccess file if you ever need to use a query string, and Googlebot will keep trying to crawl the problem URL for a very long time, which eats into your site's crawl budget.