What to use - 410 or 301 or 404?

         

zehrila

10:23 am on Jun 4, 2011 (gmt 0)

10+ Year Member



After Panda I have been fixing internal issues, and I found a lot of problems with the site's URL structure. I first tried the canonical tag but failed to implement it correctly, so I picked an easy fix and 301'd the duplicate URLs to the homepage. Soon after that I got hit by a minus 30 penalty and some important pages tanked. This is the issue I have been trying to deal with:

Domain.com/Green-Widgets-1234.html <-- Desired structure with G of Green in caps

Domain.com/green-Widgets-1234.html <-- Not desired

After receiving the minus 30 penalty for 301-redirecting the lower-case URLs, I decided to return a 410 status code for those URLs instead. Now, after 3 days, I see some of the pages that got hit by the penalty back in the SERPs, but some other pages have disappeared from the SERPs altogether, and a site: search doesn't show any result for them. That makes me think Google first crawled the wrong version of the URL and then decided to drop both the wrong and the right URLs.

How come Google is dropping Domain.com/Green-Widgets-1234.html when it's showing the proper response code, and what could be a fix for this issue?

lucy24

3:19 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ouch. Are you talking about two different pages, or two different forms of user input? If you genuinely have two pages with the same name, you're pretty well stuck. But if you're talking about users who just aren't sure about their capitalization, you should never be redirecting in the first place. That's a job for a rewrite, and google will never know.

Possibly related aside: google itself seems to be experiencing some kind of capitalization glitch. I've recently found a few 404's in the form /lowercase.jpg >> file does not exist. Well, they're absolutely right it doesn't exist: it's /MixedCase.jpg and that's what all links say. (I was brought up to be rigorously case-sensitive.) So it's not my error, it's theirs.

zehrila

4:04 pm on Jun 4, 2011 (gmt 0)

10+ Year Member



Thanks for your input, lucy24. The issue is that Google is indexing the same page twice: once with a capital letter, Green-Widgets-1234.html, and once with lower case, green-Widgets-1234.html. It's basically one page.

g1smd

4:09 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Where you have traffic arriving from other sites that have used incorrect capitalisation for your URLs, the visitor should be redirected to the correctly-cased canonical URL so that you don't have Duplicate Content. However, I much prefer all-lower-case for URLs and usually 404 incorrectly cased requests for pages. For images this isn't quite so important, you could allow any case if you wanted.

It is never appropriate to mass-redirect to the root homepage. Those URLs should return "410 Gone" and the error message should link to potential new places the visitor might like to visit instead.
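
A minimal sketch of that kind of "410 Gone" response, in Python purely for illustration. The function name, the page wording and the suggestion list are all assumptions, not something taken from an actual site:

```python
from html import escape
from typing import List, Tuple


def gone_response(suggestions: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Build a (status line, HTML body) pair for a removed URL.

    `suggestions` is a hypothetical list of (url, title) pairs pointing
    at pages the visitor might want instead.
    """
    items = "\n".join(
        # Escape both values so stray characters cannot break the markup.
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in suggestions
    )
    body = (
        "<h1>This page has been removed</h1>\n"
        "<p>You may be looking for:</p>\n"
        f"<ul>\n{items}\n</ul>"
    )
    # The status stays 410; the links are for the human visitor only.
    return "410 Gone", body
```

The point is that the visitor still gets somewhere useful, while the status code tells crawlers the old URL is gone rather than moved.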

tedster

4:47 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google is indexing same url twice, one is with capital letter Green-Widgets-1234.html and second with lower case green-Widgets-1234.html

It may help you to understand the situation more clearly to appreciate that these really are two different URLs - the web IS case sensitive by design, even though Microsoft ignored that specification in many ways.

For this reason, I agree with g1smd most strongly. I use lower-case only in building a website and this bypasses the entire problem area.
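
To illustrate the "two different URLs" point, here is a small Python sketch. It assumes the usual URI rule that the scheme and host compare case-insensitively while the path and query are case-sensitive; the function name is made up for this example, and it ignores details such as default ports and percent-encoding:

```python
from urllib.parse import urlsplit


def same_url_by_case_rules(a: str, b: str) -> bool:
    """Compare two URLs: scheme and host ignore case, path and query do not."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (
        pa.scheme.lower() == pb.scheme.lower()
        and pa.netloc.lower() == pb.netloc.lower()
        and pa.path == pb.path      # case-sensitive
        and pa.query == pb.query    # case-sensitive
    )


# Host case does not matter, so these two are the same URL:
print(same_url_by_case_rules(
    "http://Domain.com/Green-Widgets-1234.html",
    "http://domain.com/Green-Widgets-1234.html",
))
# Path case does matter, so these two are different URLs:
print(same_url_by_case_rules(
    "http://domain.com/Green-Widgets-1234.html",
    "http://domain.com/green-Widgets-1234.html",
))
```

A server may choose to answer both paths with the same page, but a crawler has no way to know that in advance, which is how the duplicate ends up indexed.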

lucy24

5:07 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google is indexing same url twice, one is with capital letter Green-Widgets-1234.html and second with lower case green-Widgets-1234.html its one page basically

Ugh. Yes, in that case you do have to redirect, if only for google's sake. They figure out the 301 pretty fast. (Not so sure about 410. I recently looked up a randomly selected page in the raw logs and found that after 30 consecutive 410s they are still looking for the page. But this may be because I was dilatory in setting up the 410, so the same page has an earlier history of 404.)

zehrila

7:01 pm on Jun 4, 2011 (gmt 0)

10+ Year Member



Where you have traffic arriving from other sites that have used incorrect capitalisation for your URLs, the visitor should be redirected to the correctly-cased canonical URL so that you don't have Duplicate Content.


It's my internal URL structure that is poorly done. I did not bother in the beginning, but now that the site has started getting a lot of traffic, I decided to tackle this issue. It was harder to point the lower-case URL to the capitalized one, so I returned a 410 status code on the lower-case page. Now Google is dropping pages that I want to keep in the SERPs and that are loading fine.

For this reason, I agree with g1smd most strongly. I use lower-case only in building a website and this bypasses the entire problem area.


I agree too, that keeps a lot of errors off the site. Now what do you suggest: should I remove the 410 status code from the lower-case URLs so that further pages don't get deindexed? What other alternative solutions are there?

tedster

7:11 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would not use 404 or 410. Especially since you're already tangled up with Google, I'd say bite the bullet and 301 redirect all mixed case requests to a URL in this form:

example.com/green-widgets-1234.html

It will take a while (several weeks for the bulk of it, and then months for the whole pile) but Google will eventually show only all lower-case URLs in the SERPs. Don't worry too much about Webmaster Tools reports. Worry more about the main SERPs (not even site: operator searches).

g1smd

7:16 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The alternative solution is VERY simple.

Use lower-case URLs throughout your site.

At the TOP of your .htaccess file set up an internal rewrite that captures all requests that contain any upper-case characters. These requests are rewritten to a PHP script that fixes the URL and sends the correct 301 headers back.

The complete code for this was posted in the Apache sub-forum about a month ago.
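
For readers who just want the idea, here is a rough sketch of the "fix the URL and send a 301" step. It is written in Python rather than PHP and is not the code from the Apache sub-forum; the function name and return shape are assumptions made for this example:

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect(request_uri: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) if the path has upper-case characters, else None.

    `request_uri` is assumed to be a path plus optional query string,
    e.g. "/Green-Widgets-1234.html?ref=x".
    """
    parts = urlsplit(request_uri)
    lowered = parts.path.lower()
    if lowered == parts.path:
        return None  # already all lower-case: serve the page normally
    # Only the path is lower-cased; the query string is passed through as-is.
    target = urlunsplit(("", "", lowered, parts.query, ""))
    return 301, target


print(lowercase_redirect("/Green-Widgets-1234.html"))
print(lowercase_redirect("/green-widgets-1234.html"))
```

Every mixed-case variant goes to the single lower-case URL in one hop, so there are no redirect chains and nothing is ever pointed at the homepage.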

On another note, I prefer the ID number as the first item in the path and omit the .html extension too. With the ID first, truncated URLs, such as example.com/12345-this-great-produ or similar, can also be correctly redirected. Additionally, short "example.com/12345" URLs can be posted to Twitter and elsewhere and will still work (via an initial on-site redirect).

Just to be absolutely certain, your main page generation script MUST also check the "slug" text against the ID number and redirect to the correct URL if the "slug" text is incorrect for that ID.
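
A hedged sketch of that ID-first check, again in Python for illustration only. The SLUGS table stands in for whatever database lookup the real page script would do, and the URL pattern and function name are assumptions:

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical catalogue: ID number -> the one correct slug for that ID.
SLUGS: Dict[int, str] = {12345: "this-great-product"}

# Matches "/12345" or "/12345-anything" (ID first, optional slug after it).
ID_FIRST = re.compile(r"^/(\d+)(?:-([^/]*))?$")


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, redirect target or None) for an ID-first URL path."""
    match = ID_FIRST.match(path)
    if not match:
        return 404, None
    item_id = int(match.group(1))
    slug = SLUGS.get(item_id)
    if slug is None:
        return 404, None  # unknown ID
    canonical = f"/{item_id}-{slug}"
    if path == canonical:
        return 200, None  # exact match: serve the page
    # Truncated, mis-cased or missing slug: redirect to the correct URL.
    return 301, canonical


print(resolve("/12345-this-great-produ"))
print(resolve("/12345"))
```

Because the ID alone identifies the page, a truncated or short URL still lands on the right place, and only the exact canonical path ever returns 200.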

zehrila

7:39 pm on Jun 4, 2011 (gmt 0)

10+ Year Member



I'd say bite the bullet and 301 redirect all mixed case requests to a URL in this form:


Use lower-case URLs throughout your site.


This was the solution proposed by my developer and I liked it too, but since the old URL structure with caps was already in Google, picking up in the SERPs and getting traffic, I was scared to upset Google with lower-case redirects. All my deep links from blog posts and articles point to mixed-case URLs.

Now this is the summary of what has happened.

1: 301 redirected the URLs starting with lower case to the homepage. Resulted in a minus 30 penalty.

2: I noticed the penalty and made those pages return a 410 status code. Resulted in deindexation of URLs.

Surprisingly, there are some URLs which Google has not indexed twice, and those URLs are now back to their old ranks.

My questions are: wouldn't it be a bit too much to first do a 301, then a 410, and now 301 redirect all the URLs to lower case, especially when all the links point to mixed-case URLs?

Secondly, what if I point the lower-case URLs which Google is indexing to the desired mixed-case URLs?

tedster

7:43 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would NOT advise "desiring" any mixed case URLs whatsoever.

The more you try to do this, the more technically tangled you are likely to make the situation. It's hard enough to get the 301-to-lowercase thing right on a large development. That's why I said "bite the bullet". Using the 301 redirect will at least preserve most of your backlink power, eventually, when Google sorts it all out (which is going to take a while).

zehrila

8:11 pm on Jun 4, 2011 (gmt 0)

10+ Year Member



I will bite the bullet then. It's better to do it now than later, when site traffic and backlink count are higher.

g1smd

9:19 pm on Jun 4, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Short term pain. Long term gain. Long term ease of site management.