|Link Checker - 301 > 302 > 302 > 200|
W3C Link Checker
Check links and anchors in Web pages or full Web sites
What exactly does this have to do with Link Development? Let's see what we can come up with.
Picture this: you've got this killer website, it's sticky, you're doing things right, and everything "appears" to be just fine.
Let's say it is time for annual maintenance and it is discovered that you've got a plethora of links that go through various redirects before reaching their final destination. If you've got pages that have a "lot of links", you may want to pay close attention to how many redirect sequences are taking place.
I was just running a few quality reports on a site and ran into all sorts of 301 > 302 > 200 recursions. I also ran into a few 301 > 302 > 302 > 200. While some of these may have been required due to the UA/IP, I believe many were oversights of the Webmasters.
One of the more common issues I ran into was the www vs non-www challenge. If any variables are being passed in that first redirect, things can get even trickier at the destination, so double- and triple-check what is going on with those recursive redirects.
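For the www vs non-www case, a rough way to spot variables that went missing along the way is to compare the query string of the first URL with the one you finally land on. This is only a sketch: the function names and the example.com URLs are made up for illustration, and it assumes you already know both the starting URL and the final one.

```python
from urllib.parse import parse_qs, urlsplit


def dropped_params(first_url, final_url):
    """Return the query variables whose values changed or vanished en route."""
    before = parse_qs(urlsplit(first_url).query, keep_blank_values=True)
    after = parse_qs(urlsplit(final_url).query, keep_blank_values=True)
    return sorted(name for name in before if before[name] != after.get(name))


def differs_only_by_www(url_a, url_b):
    """True when the two hosts are the same apart from a leading 'www.'."""
    def strip_www(host):
        return host[4:] if host.startswith("www.") else host

    host_a = urlsplit(url_a).hostname or ""
    host_b = urlsplit(url_b).hostname or ""
    return host_a != host_b and strip_www(host_a) == strip_www(host_b)


if __name__ == "__main__":
    start = "http://example.com/page?id=7&ref=x"      # hypothetical link
    end = "http://www.example.com/page?id=7"          # hypothetical destination
    print(differs_only_by_www(start, end), dropped_params(start, end))
```

Anything that shows up in the dropped list is worth a manual look at the redirect rule that handled the first hop.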
When you're doing all of your internal and external link development, do you pay special attention to the statuses being returned by "all" of your links? Are you performing regular maintenance to ensure that the redirect chains within your site are "reasonable" and do not take the bot through a third or even fourth recursion? If you've got anything more than that, I'd call that a Kiss. ;)
As an example, something like this
http://del.icio.us/post?*** is going to end up in a 301 > 302 > 302 > 200. It appears to be handled properly but why put the bot through that extra 301 > 302?
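If you want to see a chain like that hop by hop without a full link checker, something along these lines would do it. Treat it as a sketch under a few assumptions: the names `fetch_head`, `trace_chain` and `summarize` are mine, it uses HEAD requests (some servers answer HEAD differently from GET), and it gives up after an arbitrary 10 hops.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_head(url):
    """Send one HEAD request without following redirects.

    Returns (status, value of the Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_chain(url, fetch=fetch_head, max_hops=10):
    """Follow redirects one hop at a time; return a list of (url, status)."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops


def summarize(hops):
    """Render the chain in the '301 > 302 > 200' style used in this thread."""
    return " > ".join(str(status) for _, status in hops)
```

Calling `summarize(trace_chain(some_url))` on each link in a page gives you the status sequence to review; the `fetch` argument is there so you can swap in your own request function.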
Speaking of 302s, what is up with those these days? I mean, I understand their use and see them quite frequently when running reports. It's those second 302s that concern me. In layman's terms, you are telling the bot to...
Moved Permanently > Found > Found > OK
Now, what the server is returning at the > Found > Found levels can be somewhat confusing. You've got a 302, which is a temporary redirect. There are also the 303 (often treated as a 302) and the 307, which refine the redirection further. All relate to temporary redirects. So, what happens at this stage of the recursion?
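To keep the codes and their names straight, here is a small sketch that turns a status sequence into the reason phrases and labels each code. The `KIND` table and the function names are my own shorthand, not anything official; I have left 303 in its own bucket since it is only "often treated as" a 302.

```python
from http import HTTPStatus

# Shorthand labels; 308 is the permanent counterpart of 307.
KIND = {
    301: "permanent",
    308: "permanent",
    302: "temporary",
    307: "temporary",
    303: "see other",
}


def describe(chain):
    """Map status codes to their standard reason phrases."""
    return " > ".join(HTTPStatus(code).phrase for code in chain)


def kind(code):
    """Label a status code as permanent, temporary, see other, or neither."""
    return KIND.get(code, "not a redirect")


if __name__ == "__main__":
    chain = [301, 302, 302, 200]
    print(describe(chain))
    print([kind(code) for code in chain])
```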
301 > 302 > 302 > 200
Ya, this is a test, I think. ;)
just to further confuse things, some search engines will treat an undelayed 302 meta refresh as a 301.
Crib notes for the test:
Client requests A.html
Server responds 301 -> B.html
Client requests B.html
Server responds 302 -> C.html
Client requests C.html
Server responds 302 -> D.html
Client requests D.html
Server responds 200-OK and sends content of D.html
Client renders content. If the client is a search engine robot, it takes "B.html" as the URL-path to show in search results, since B.html was the destination URL-path of the 301-Moved Permanently redirect response to the request for A.html, and all of the subsequent redirect responses were 302-Found.
Note that a 303, although often treated as a 302, is more like a "Moved-No Comment" response. It intentionally says nothing about whether the content for the requested URL was moved temporarily or permanently.
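The crib notes can be written down as a small function. This is a sketch of the rule as described in this post, not a statement of how any particular search engine behaves: it assumes the robot keeps moving the listed URL forward while it sees 301s and stops at the first response that is anything else. The function name and the tuple layout are made up for the example.

```python
def url_to_list(hops):
    """Pick the URL a robot would list, per the crib notes.

    hops is a list of (requested_url, status, location) in request order;
    location is None for non-redirect responses.
    """
    listed = hops[0][0]
    for requested, status, location in hops:
        if status == 301 and requested == listed and location:
            listed = location  # permanent move: the new URL takes over
        else:
            break  # temporary redirect or final response: keep what we have
    return listed


if __name__ == "__main__":
    chain = [
        ("A.html", 301, "B.html"),
        ("B.html", 302, "C.html"),
        ("C.html", 302, "D.html"),
        ("D.html", 200, None),
    ]
    print(url_to_list(chain))  # B.html
```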
|If the client is a search engine robot, it takes "B.html" as the URL-path to show in search results, since B.html was the destination URL-path of the 301-Moved Permanently redirect response to the request for A.html, and all of the subsequent redirect responses were 302-Found. |
except for (maybe) if the 302 is an undelayed meta refresh as described above.
i know it's 3 years old, but...
SEO advice: discussing 302 redirects:
Redirection chains can be killers. They are best avoided.
Careful crafting of redirect and rewrite rules can minimise the problem.
There are a lot of sites with far from optimum solutions.
|There are a lot of sites with far from optimum solutions. |
that i believe is due to the limited mastery of various servers; i have seen enterprise site admins essentially employ 404s that forward articles to articles and everything else to site maps after major redesigns.
A meta-refresh is no kind of 302 redirect at all. It is a client-side kludge, and not a server redirect response.
Calling a meta-refresh a 30x redirect is harmful to clear use of terms.
But does their link checker tell you the links no longer belong to the original site you linked to?
That's where the whole thing falls apart: expired domains show hosting company pages or a registrar's domain-park page with a nice "200", or worse yet, the domain has fallen into a bad neighborhood or now hosts malware.
Superficial link checking just scratches the surface. Sites that have turned sour and don't want to be noticed return a perfectly fine 200 OK, so you have no clue why your pages' traffic went away when those pages are being red-flagged because of those links.
I maintain a site with 13,000+ outbound links in a niche area.
1. I use Xenu for basic maintenance and check the failures and 301/302 redirects, updating the links for any interdomain 301/302 redirects (e.g. example.com to example1.com). This runs every month.
2. I wrote a script that logs any HTML refreshes in the code along with the page title. This catches some of the ones that Xenu misses. This runs every month, about 2 weeks after #1.
3. I look at any questionable titles and then every blank title about every 3 months.
4. We call every listing on a 12-18 month cycle.
5. As we also have addresses to maintain, we use an NCOA update for the US addresses every 6 months. This catches more.
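A script along the lines of step 2 could look roughly like this. It is not the script described above, just a minimal sketch using Python's standard HTML parser; the class and function names are invented for the example, and it assumes you have already downloaded the page source as a string.

```python
import re
from html.parser import HTMLParser


def parse_refresh(content):
    """Split a refresh 'content' value into (delay in seconds, target URL)."""
    delay_part, _, rest = content.partition(";")
    try:
        delay = float(delay_part.strip())
    except ValueError:
        delay = None
    match = re.search(r"url\s*=\s*['\"]?([^'\"\s]+)", rest, re.IGNORECASE)
    return delay, (match.group(1) if match else None)


class RefreshFinder(HTMLParser):
    """Collect the page title and any <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.refreshes = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            self.refreshes.append(parse_refresh(attrs.get("content") or ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan(html):
    """Return (title, list of (delay, url)) for one page of HTML."""
    finder = RefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.title.strip(), finder.refreshes
```

Logging the result of `scan()` for each outbound link gives you the blank or questionable titles for step 3, and a delay of 0 flags the undelayed refreshes discussed earlier in the thread.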
|i have seen enterprise site admins essentially employ 404s that forward articles to articles and everything else to site maps after major redesigns. |
Ha. Ha. I have a former client that wanted a 404 for any page that needed redirection, regardless of reason: non-existing page, upgrade that required changing structure and moving pages. Made no difference. 404 everything and forward user to Home page with a 7 second delay. No loss was he:))
Xenu is a handy little tool and I like it a lot. The W3C link checker is really nice but maxes out at 150 pages:((