Yahoo Search Engine and Directory Forum

    
Is it risky behavior to use a 404 error page that redirects?
Is this a best practice?
jastra

5+ Year Member



 
Msg#: 3865722 posted 3:20 pm on Mar 8, 2009 (gmt 0)

I'm not an expert on redirects, so please take it easy on me!

Is a 404 error page that is set up to redirect all erroneous URLs to the home page the best way to do a 404 error page? That is, rather than the traditional 404 error page that says the usual "Sorry, you've typed in the wrong URL, or the page you entered no longer exists," etc. etc. and a few main links to the site?

Here's the problem: The client was victimized by a black hat SEO. The site used to have sneaky redirects that caused it to be banned from Yahoo's search results.

All the offending files were removed from the server a couple of years ago before I ever heard of the issue. But there are still backlinks out there coming to those deleted pages from other sites with the same type of pages that our client used to have. I found at least two backlinks using the same method, example.com/sneakyredirects/example-1.htm.

And under the current 404 redirect setup, those toxic backlinks are getting redirected to index.htm.

Here's the question:
If the current 404 setup automatically redirects all comers to the index page, including who knows how many toxic backlinks, then won't Yahoo treat the site as complicit in the black hat linking scheme because it redirects those backlinks?

Wouldn't it be safer to use a conventional 404 error page rather than this redirect? (And wouldn't that always be a best practice for all sites anyway?)

I do know that it's unlikely that Yahoo would penalize a victimized site for one-way black hat backlinks, as long as there's no reciprocating link.

Is this a case where Yahoo is just never going to forgive my client, or is there any hope that this issue can be fixed?

[edited by: martinibuster at 6:08 pm (utc) on Mar. 8, 2009]
[edit reason] example.com is reserved for examples. [/edit]

 

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3865722 posted 6:51 am on Mar 9, 2009 (gmt 0)

when you say "404 page" are you returning a 404 Not Found response?

how are you doing the "redirect"?
meta refresh?
time delay?

SteveWh

5+ Year Member



 
Msg#: 3865722 posted 7:04 am on Mar 9, 2009 (gmt 0)

jastra, your instincts are correct that a 404 page should never redirect anywhere.

You can put a link on the page that goes to the home page or a sitemap or wherever you want, but never redirect there automatically. The user should have to click the link to exit the page. Don't meta-refresh away from the 404 page, either. And also don't redirect to a custom 404 page that *says* "Page Not Found" but actually returns the page with a 200 result code.

What's important is that a Page Not Found error must return a response/status code of 404.

Search engines use 404 responses to determine what pages are on the site and what pages don't exist. If you redirect, it's like telling them that no matter what they ask for, they'll get a page; it's a sneaky way to try to get thousands or millions of nonexistent pages indexed. The search engines don't trust that behavior.
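To make the distinction concrete, here is a minimal Python sketch (not from the original posts; the function name and the wording of the verdicts are made up for illustration). It sorts the status code a server returns for a nonexistent URL into the cases described above: a real "not found" answer, a server-side redirect, or an error page served with a success code.

```python
from http import HTTPStatus


def classify_missing_page_response(status: int) -> str:
    """Describe how a server answered a request for a page that does not exist."""
    if status in (HTTPStatus.NOT_FOUND, HTTPStatus.GONE):
        # 404 or 410: the server tells crawlers the page is not there.
        return "ok: the server reports that the page does not exist"
    if 300 <= status < 400:
        # 301, 302, etc.: every bad URL is forwarded somewhere else.
        return "problem: the server redirects instead of reporting an error"
    if 200 <= status < 300:
        # A "Page Not Found" message delivered with a success code.
        return "problem: soft 404, an error page served with a success code"
    return "other: unexpected status for a missing page"
```

Note that this only looks at the status code. A page that returns 404 and then redirects in the browser (meta refresh or script) would still pass this check.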

lavazza

5+ Year Member



 
Msg#: 3865722 posted 8:08 am on Mar 9, 2009 (gmt 0)

Google has a free widget for such pages :)

[google.com...]

Enhance your custom 404 page

A 404 page is what a user sees when they try to reach a non-existent page on your site (because they've clicked on a broken link, the page has been deleted, or they've mistyped a URL).

While the standard 404 page can vary depending on your web host, it usually doesn't provide the user with much useful information, and users may just surf away from your site. Therefore, we recommend creating a custom 404 page that provides the user with more information about your site and its content. (You should still make sure that your webserver returns a 404 status code to users and spiders, so that search engines don't accidentally index your custom 404 page.)

The 404 widget is a quick and easy way to embed a search box on your custom 404 page and provide users with useful information designed to help them find the information they need. Where we can, we'll also suggest other ways for the user to find <snip/>


jastra

5+ Year Member



 
Msg#: 3865722 posted 2:56 pm on Mar 9, 2009 (gmt 0)

phranque, the site is hosted somewhere else and, unfortunately, I have no information other than it was called a redirect.

SteveWh, thanks. The redirecting idea did appear a bit sketchy to me. You said, "What's important is that a Page Not Found error must return a response/status code of 404." It doesn't sound as if this redirecting scheme does this.

lavazza, thanks for the tip!

mattur

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3865722 posted 5:28 pm on Mar 9, 2009 (gmt 0)

You can check the response status code using the "Live HTTP headers" extension for Firefox. Or download WFetch from MS.

Both will let you see the line in the response where it says "HTTP/1.1 404" or "HTTP/1.1 302" or "HTTP/1.1 301" etc.

See HTTP/1.1: Status Code Definitions:
[w3.org...]
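If neither tool is handy, the same check can be scripted. The following is a rough sketch using only Python's standard library; the function name `fetch_status` is an illustration, not something from the tools mentioned above. It sends one GET request and reports the status code, the reason phrase, and any Location header, without following redirects.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, str, Optional[str]]:
    """Return (status code, reason phrase, Location header) for one GET request.

    http.client does not follow redirects on its own, so a 301 or 302 is
    reported as-is rather than being replaced by the page it points to.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.reason, response.getheader("Location")
    finally:
        conn.close()


# Example call (placeholder URL, substitute a page you know is missing):
# print(fetch_status("http://www.example.com/sneakyredirects/example-1.htm"))
```

A 404 with no Location header is what you want to see for a deleted page. A 301 or 302 with a Location header means the server itself is redirecting.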

jastra

5+ Year Member



 
Msg#: 3865722 posted 6:21 pm on Mar 9, 2009 (gmt 0)

mattur, WFetch gave me this line among the response info:

HTTP/1.1 404 Object Not Found\r\n

So it looks as though they've set up a proper 404 error page after all? I can't make sense of the document code that followed that line.

mattur

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3865722 posted 7:18 pm on Mar 9, 2009 (gmt 0)

Yes, it seems the server is set up to return a 404 page with the correct status code.

But what happens then? Is the page redirecting (in the browser) to the homepage using javascript or meta refresh, or is it just displaying the homepage content?

(The code below the header response code in WFetch should be the HTML sent to the browser, with \r\n for new lines, \t for tabs etc).
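If the escaped output is hard to read, a few lines of Python can turn the visible \r\n and \t markers back into real line breaks and tabs. This is only a sketch: the function name is made up, and it assumes the text was copied out of WFetch with the markers shown as literal backslash sequences.

```python
def readable_wfetch_body(raw: str) -> str:
    # Replace the two-character markers (backslash + letter) that WFetch
    # displays with the actual whitespace characters they stand for.
    # Caveat: a page whose own source contains a literal backslash-t or
    # backslash-r-backslash-n would be converted as well.
    return raw.replace("\\r\\n", "\n").replace("\\t", "\t")


# Hypothetical sample in the style of the WFetch display.
sample = r"<html>\r\n\t<head>\r\n"
print(readable_wfetch_body(sample))
```

Paste the response body in place of the sample and the HTML should come out laid out the way the browser receives it.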

SteveWh

5+ Year Member



 
Msg#: 3865722 posted 8:12 pm on Mar 9, 2009 (gmt 0)

I realized after shutting my computer off last night that for your situation a 410 response code might be more efficient at accomplishing what you want for these individual pages. It tells the requester that the page is Gone (intentionally removed) from the server, not moved to a new location, and it shouldn't be requested again.

References:
[en.wikipedia.org...]
[httpd.apache.org...]

Here is sample code for .htaccess:

# You might already have these first 2 lines in your .htaccess
RewriteEngine On
RewriteBase /

# These lines send the 410 response.
# To handle a group of pages with similar names,
# use a more general regular expression on the PATHANDFILENAME line.
RewriteCond %{REQUEST_URI} ^/PATHANDFILENAME\.html$ [NC]
RewriteRule .* - [G,L]

When I collapsed one page into another and wanted the old page deindexed from search engines, it wasn't enough to just do a 301 redirect from the old page to the new or to return a 404 for the old page. They kept coming back for the page. When I sent the 410, Google and Yahoo each requested the page once, got the 410, and never requested it again.
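For anyone not on Apache, the decision the rewrite rule makes can be sketched in a few lines of Python. This is an illustration of the logic only, not a replacement for the .htaccess code above. The pattern is hypothetical, modeled on the example URL in the first post, and the function and variable names are made up.

```python
import re
from typing import Iterable

# Hypothetical pattern for the deliberately removed pages.
RETIRED_PATTERNS = [re.compile(r"^/sneakyredirects/.*\.htm$", re.IGNORECASE)]


def status_for_path(
    path: str,
    existing_paths: Iterable[str],
    retired_patterns: Iterable[re.Pattern] = RETIRED_PATTERNS,
) -> int:
    """Pick the status code to send for a requested path."""
    # Retired URLs are checked first, mirroring a rewrite rule that fires
    # whether or not a file happens to exist at that path.
    if any(pattern.match(path) for pattern in retired_patterns):
        return 410  # Gone: intentionally removed
    if path in set(existing_paths):
        return 200  # OK: the page exists
    return 404      # Not Found: never existed, or mistyped
```

So a request for /sneakyredirects/example-1.htm would get a 410, while an ordinary mistyped URL would still get the usual 404.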

[edited by: SteveWh at 8:19 pm (utc) on Mar. 9, 2009]

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3865722 posted 10:40 pm on Mar 9, 2009 (gmt 0)

do you see something like this in the head of the 404 document that is returned?
<META HTTP-EQUIV="refresh" content="0;URL=http://www.example.com/">

jastra

5+ Year Member



 
Msg#: 3865722 posted 3:40 pm on Mar 10, 2009 (gmt 0)

Thanks, all, for the replies. I'm working from home until tomorrow, and the copy of WFetch I downloaded on my home computer isn't working, so I'm not getting results. I'll look for the meta refresh when I get back. I'll have to have one of our programmers examine it, as I know little JavaScript.

SteveWh, the 410 response code makes good sense.

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3865722 posted 9:22 pm on Mar 10, 2009 (gmt 0)

it's not javascript - it's an html meta element in the document head.

H76: Using meta refresh to create an instant client-side redirect | Techniques for WCAG 2.0:
[w3.org...]

jastra

5+ Year Member



 
Msg#: 3865722 posted 3:27 pm on Mar 12, 2009 (gmt 0)

When looking at the WFetch data, there's no meta refresh code.

I'm unsure of the TOS or propriety of displaying the whole results, but here are some excerpts, if that will give a clue about how it works. Again, this is just what I picked out that looked to me to be relevant.

In the code below, example.com stands in for Microsoft's URL.

--------------------

Server: Microsoft-IIS/5.1\r\n

Content-Length: 4040\r\n

// in real bits, urls get returned to our script like this:\r\n
// res://shdocvw.dll/http_404.htm#http://www.DocURL.com/bar.htm \r\n
\r\n
\t//For testing use DocURL = "res://shdocvw.dll/http_404.htm#https://www.example.com/bar.htm"\r\n
\tDocURL = document.URL;\r\n
\t\t\r\n
\t//this is where the http or https will be, as found by searching for :// but skipping the res://\r\n
\tprotocolIndex=DocURL.indexOf("://",4);\r\n

mattur

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3865722 posted 3:41 pm on Mar 16, 2009 (gmt 0)

Jastra stickied me the site to have a look at. The 404 page is returning the correct 404 status code, but it has a JavaScript redirect in the HTML:

<SCRIPT LANGUAGE="JavaScript">
<!--
location.replace("/");
-->
</SCRIPT>
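Since a correct 404 status can still hide a browser-side redirect like this one, it can be worth scanning the body of the error page as well. Below is a rough Python sketch using the standard library's HTML parser. The function name and the regular expression are my own, and the pattern only covers the common forms (location.replace, location.assign, and assignments to location or location.href), so treat a clean result as a hint rather than proof.

```python
import re
from html.parser import HTMLParser
from typing import List

# Rough pattern for common JavaScript navigation calls and assignments.
JS_REDIRECT = re.compile(
    r"location\s*\.\s*(?:replace|assign)\s*\("  # location.replace("/")
    r"|location(?:\s*\.\s*href)?\s*=(?!=)",      # location = "...", location.href = "..."
    re.IGNORECASE,
)


class _RedirectFinder(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.findings: List[str] = []
        self._script_parts: List[str] = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("http-equiv", "").lower() == "refresh":
            self.findings.append("meta refresh")
        elif tag == "script":
            self._in_script = True
            self._script_parts = []

    def handle_data(self, data):
        if self._in_script:
            self._script_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self._in_script = False
            if JS_REDIRECT.search("".join(self._script_parts)):
                self.findings.append("javascript")


def find_client_side_redirects(html: str) -> List[str]:
    """List the kinds of client-side redirect found in an HTML document."""
    finder = _RedirectFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Run against the script block quoted above, this reports a "javascript" finding. A meta refresh tag in the head would be reported as "meta refresh".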

jastra

5+ Year Member



 
Msg#: 3865722 posted 3:48 pm on Mar 16, 2009 (gmt 0)

Thanks Mattur,

As you say, this code should be deleted and replaced with a more useful 404 error page. The redirect isn't a best practice.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved