

Changing the URLs in my Google sitemap after 301 redirects?

         

nestman

12:06 am on Aug 25, 2012 (gmt 0)

10+ Year Member



I currently use the following format in my sitemap:

example.com/important-widgets/1234


But to help with SEO, I now use this format to get to the same article:

example.com/important-widgets/1234/no-more-gizmo-problems-while-driving


My sitemap is generated dynamically with PHP, so it would be very easy for me to change the code so that the sitemap produces URLs in the new format. But Google may see each of these as a whole new URL and throw away the "link juice" that my URLs have built up over the last 8 years.

I use mod_rewrite in my .htaccess file so that people who attempt to visit the old URL format get redirected with a 301 to the new format. When that scenario happens, I also tell the landing page to use the canonical URL. However, if I change the format of the URLs in the sitemap, there really is no way to do a 301 redirect. Should I leave the old URL format in my sitemap or change it to the new format? Any suggestions would be greatly appreciated!

[edited by: Robert_Charlton at 2:17 am (utc) on Aug 25, 2012]
[edit reason] examplified and removed specifics [/edit]

g1smd

7:48 am on Aug 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your site should link to the canonical URLs.

Your sitemap should list the canonical URLs.

Your sitemap should list only URLs that return 200 OK status.

Google will revisit every URL they have ever seen forever, so every URL should return the correct 301, 404, 410, or 200 status code.

As your page loads, it should check that the long text part of the URL is exactly correct for the requested page number. If it is not, it should redirect to the correct URL.

Failure to do this leaves your site open to duplicate content issues and to malicious linking of the form
example.com/important-widgets/1234/do-not-buy-this-overpriced-unreliable-junk
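
A minimal sketch of that check, written in Python for illustration rather than the PHP mentioned in the thread. The `CANONICAL_SLUGS` table, the `PATH_RE` pattern and the `resolve` function are hypothetical names, and a real site would look the long text up in its own article store. It only shows the decision: answer 200 for the exact canonical path, 301 to the canonical path for any other long text (or none), and 404 for unknown page numbers.

```python
import re

# Hypothetical article store: page number -> canonical long text (slug).
CANONICAL_SLUGS = {
    1234: "no-more-gizmo-problems-while-driving",
}

# Matches /important-widgets/<number> with an optional /<long text> part.
PATH_RE = re.compile(r"^/important-widgets/(\d+)(?:/([^/]*))?/?$")


def resolve(path):
    """Return (status, location) for a requested path.

    location is the redirect target for a 301, otherwise None.
    """
    match = PATH_RE.match(path)
    if match is None:
        return 404, None

    article_id = int(match.group(1))
    slug = CANONICAL_SLUGS.get(article_id)
    if slug is None:
        # Unknown page number: nothing to redirect to.
        return 404, None

    canonical = f"/important-widgets/{article_id}/{slug}"
    if path == canonical:
        # Exactly the canonical URL: serve the content.
        return 200, None

    # Old short form, wrong long text, or trailing slash: redirect.
    return 301, canonical
```

With this in place, the old short URL and the "overpriced junk" URL above would both receive a 301 to the one canonical URL, so neither can be indexed as a separate copy of the page.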

nestman

12:10 am on Aug 28, 2012 (gmt 0)

10+ Year Member



Yes, my goal is to have my sitemap list the canonical URLs. However, if I start doing that today, Google will see the following as two different URLs:

example.com/important-widgets/1234
example.com/important-widgets/1234/no-more-gizmo-problems-while-driving

Maybe I should continue to use .htaccess to redirect the old URL to the new one. Then, once Google has indexed the new URLs, by following the 301 redirect on each old URL, I can edit my dynamic sitemap so that it starts using the canonical URLs. Am I correct in my thinking?

Thank you.

Andem

12:15 am on Aug 28, 2012 (gmt 0)

10+ Year Member Top Contributors Of The Month



Since you know PHP, I would suggest forgetting about redirecting with htaccess (mod_rewrite) and producing the 301/Location: headers in PHP instead. From what I understand, you can't accomplish what you've described just by using htaccess.

lucy24

5:02 am on Aug 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But to help with SEO, I now use this format to get to the same article:

example.com/important-widgets/1234/no-more-gizmo-problems-while-driving

Something wonky here. Do you mean that you are using this longer form as the visible URL, or simply that that's where the content lives?

Do you personally speak PHP, or is that just your ready-made sitemap code? You can easily detour to a PHP page to issue a 301 redirect, if you're more familiar with PHP than with Apache. Neither humans nor robots know the source of the redirect; they only know they've been redirected.

I can't think of any reason you would want to change from shorter to longer URL.

aakk9999

9:27 am on Aug 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I can't think of any reason you would want to change from shorter to longer URL

In this particular case, for a better click-through?

From an SEO perspective, I think any minuscule gain the new URLs may bring would perhaps be offset by the link juice lost via the 301.

g1smd

9:44 am on Aug 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you have two URLs, one of which returns content and "200 OK" status, and the other returns "301 Moved", there is NO duplicate content issue.

aakk9999

11:38 am on Aug 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@g1smd
If you have two URLs, one of which returns content and "200 OK" status, and the other returns "301 Moved", there is NO duplicate content issue.

This is correct, but as I understood the opening post, nestman is concerned about the timespan in which Google has picked up the new URL but has not yet re-crawled the old URL, which leaves Google with two URLs for the same content for a while. In my experience, Google is always much faster at spidering new URLs than at re-spidering existing ones.

@nestman
There is some speculation that if you keep the old URLs in the Sitemap for a while, you will "speed up" the process of Google re-crawling them. I am personally not in favour of this, because a Sitemap should only contain valid URLs that do not redirect and are not blocked by robots.

And importantly, leaving OLD URLs in sitemap will not preclude Google from seeing duplicate content between new and old URLs (which will exist until the old URL is re-spidered).

As g1smd said in his post further above, you should:
- link internally to new URLs (I presume this is what you are doing?)
- have 301 in place from old --> new URL (which I believe you have)
- and I would favour updating the Sitemap to list the new URLs, as that is a better indication that the site is properly maintained technically

With regard to the *temporary* duplicate content that will exist in the period where Google has spidered the new URL but not yet re-spidered the old URL, my experience is that, when executed well technically, this will not hurt the site. What I have found is that the old URL ranks (and the new one is ignored as a duplicate) until Google picks up the redirect, at which point the new URL ranks, often in the place of the old one.

And lastly, if many URLs are replaced and redirected, it may take Google some time to pick up all the redirects.

g1smd

11:59 am on Aug 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I never list URLs that either redirect or return an error message in an XML sitemap.

By dropping URLs from the sitemap, that should be a signal to Google to go find out the status of those URLs that were dropped - maybe they no longer exist or maybe they now redirect.
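
As a rough illustration of a sitemap that lists only the canonical URLs, here is a short Python sketch (the thread's sitemap is PHP, so this is a stand-in, not nestman's code). The `build_sitemap` function, the `articles` mapping and the `http://example.com` base with its scheme are assumptions made for the example. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(articles, base="http://example.com/important-widgets"):
    """Build sitemap XML listing one canonical URL per article.

    articles maps page number -> canonical long text (hypothetical data).
    The short /<number> form is never emitted, because it redirects.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for article_id, slug in sorted(articles.items()):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base}/{article_id}/{slug}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap({1234: "no-more-gizmo-problems-while-driving"}))
```

Because every `<loc>` is built from the same canonical form the pages themselves link to, the sitemap never contains a URL that answers with a 301.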

lucy24

5:04 pm on Aug 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not to mention that
:: ahem ::
OK, this is the g### forum. But you've only got one Sitemap. Unless you want to do some jiggery-pokery with rewriting different search engines to different physical files, which seems a bit risky to me. And That Other Search Engine says pretty emphatically that they "don't trust" sitemaps that lead to a lot of redirected or nonexistent files.

Once Google knows a file exists, they will remember it forever, whether it's on a sitemap or not.

nestman

11:06 pm on Sep 7, 2012 (gmt 0)

10+ Year Member



So the consensus among you SEO experts is that I should go ahead and start using the new longer format in the sitemap?