Redirecting old site with one rule

         

mumble

3:28 pm on Jun 28, 2011 (gmt 0)

10+ Year Member



Hi Guys

I recently redesigned a mate's site and am now trying to mend all the broken links from the old site to the new one, but there are 388 of them and I'm not really sure which pages they are for!

The old pages all start like www.example.com/spip1.82ds/etc/etc..... so I was wondering if there is a rule to redirect them all to the new home page, or another way round it. The rule I would be looking for is something like "redirect any page that starts www.example.com/spip1.82ds to www.example.com"

Cheers

Jools

lucy24

7:20 pm on Jun 28, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<begin impersonation of g1smd>
This question has been asked approximately 75,000 times in this forum, most recently two hours ago. Site search should turn them up.
</end impersonation>

<begin impersonation of myself>
For starters, bookmark this page [httpd.apache.org]. It generally contains the fix for everything that ails you.
</end impersonation>

g1smd

8:31 pm on Jun 28, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is generally a bad idea to mass-redirect a large number of pages to the root.

If they have truly gone, return 404 or 410 with a helpful human readable note.

mumble

2:12 pm on Jun 29, 2011 (gmt 0)

10+ Year Member



@lucy I have checked that site before, tried many ways and researched for many hours, but I am still having issues: I am a relative beginner and am not sure what language to use to specify the rule. Hence the question :-) What I need is the rule written out or shown for me (as asked in the original post), as my own research has gotten me nowhere :-(

@g1smd some have gone (like forum crap) others are still there though I have no idea which they link to.
Why is it a bad idea to redirect to the homepage? Surely better than a 404, plus I'm assuming google will still not like the humanified 404 whereas google would prefer to hit the home page and go from there.

Any more help gratefully appreciated.

Jools

g1smd

6:42 pm on Jun 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A redirect says "the content at the requested URL has moved; the same or very similar content is now at this new URL". In this case it would be a lie to say that the home page of the site satisfies that rule for the 2000 pages that would be redirected there. Using a mass redirect like that is a "signal of low technical quality" as far as search engines are concerned.
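For reference, here is a rough sketch of what the two kinds of redirect look like in a root .htaccess file using mod_alias. The individual old and new page names are made up for illustration (the thread never lists the real ones), and the scheme is assumed to be http. The first form is the honest one described above; the second is the one-rule version that was asked for, shown commented out so the difference is visible.

```apache
# Honest redirect: a known old page whose content now lives at a new URL.
# Both page names below are hypothetical examples.
Redirect 301 /spip1.82ds/about.html http://www.example.com/about/

# The literal "one rule" from the question: every URL under /spip1.82ds/
# goes to the home page. Left commented out because it tells search
# engines that every old page "moved" to the home page.
# RedirectMatch 301 ^/spip1\.82ds/ http://www.example.com/
```

Note that mod_alias directives are processed in the order they appear, so specific per-page lines need to come before any catch-all for the same folder.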

lucy24

8:55 pm on Jun 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are two different and unrelated things.

One is the message the server sends out, in this case a 404 or 410.

The other is what physically happens to the user who has requested a page that doesn't exist. Some sites do deal with it by sending them to the home page. (This user personally hates this approach. I don't know if anyone's ever collected reliable data on the Average User.)

If there's been massive revision so you're essentially starting over even though a lot of the same pages still exist (been there, done that) I'd go with a general 410 instead of a 404. It just seems a bit snarky to hit people with a 404 for a misspelled url* when the same url correctly spelled would end up as a 410 anyway ;)

If you use a Redirect to send a 410, it's done by redirecting them to nowhere: the directive takes the status and the old path but no target. (That is, physically they'll end up on your 410 page.) If you use a Rewrite, you can do just about anything.
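A minimal sketch of both approaches, assuming the rules sit in the .htaccess file at the site root and that everything under /spip1.82ds/ really is gone. These are alternatives: pick one, since mixing mod_alias and mod_rewrite rules for the same URLs makes the processing order hard to predict.

```apache
# Option 1, mod_alias: Redirect matches by path prefix, so this covers
# /spip1.82ds/ and everything below it. With "gone" (410) no target is given.
Redirect gone /spip1.82ds

# Option 2, mod_rewrite: in .htaccess the pattern is matched without the
# leading slash. "-" means no substitution; [G] sends 410 Gone.
# RewriteEngine On
# RewriteRule ^spip1\.82ds/ - [G]
```

Either way the visitor sees whatever page is configured as the 410 error document.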

In situations like this it makes most sense to use the same physical page for both 404 and 410. The lines in htaccess simply say ErrorDocument {4-whatever} and then the page, expressed as an absolute url. Then make a document that gives the appropriate information and includes links to the various parts of the site that the user might have been looking for. All URLs in that document have to be absolute rather than relative, because you don't know where the user intended to go.

Mine says
I’ve recently done a lot of housekeeping and rearranging in the Paintings area, so some pictures may not be where you expected to find them.

This is a brazen lie, but sounds nicer than "I wasn't thinking of search engines and didn't bother to keep records of which pages got renamed to what, so nothing is where it used to be". Accompanied by a string of rewrites and redirects that would make htaccess-knowledgeable people weep.

Another thing we don't have Hard Information on is how long it takes g### to assimilate a 410, especially if it has been preceded by a 404 for the same url. (I stopped counting at 50 for one url. That's assuming it goes by number of hits, not time elapsed.)


* Hasty edit from "page" to "url" after uneasy look in g1's direction.

g1smd

9:38 pm on Jun 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"The lines in htaccess simply say ErrorDocument {4-whatever} and then the page, expressed as an absolute URL."

The ErrorDocument directive must refer to the file to be served as a server-internal path (a local URL-path beginning with a slash) and not as a full URL. If you include a protocol and domain name in the ErrorDocument directive, the server will send a 302 redirect instead of the error status, as documented in the Apache manual.
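As a sketch of the difference, assuming a single shared error page saved at the site root under the hypothetical name missing.html:

```apache
# Correct: local URL-path with a leading slash. The server answers with the
# real 404 or 410 status and serves this page as the response body.
ErrorDocument 404 /missing.html
ErrorDocument 410 /missing.html

# Avoid: a full URL makes the server redirect the client to the page,
# so the original error status never reaches the browser or the crawler.
# ErrorDocument 404 http://www.example.com/missing.html
```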

lucy24

11:05 pm on Jun 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Oops. When I say "absolute" I mean "with leading slash", not full path. As opposed to relative, meaning without leading slash. ("On the second floor" as opposed to "Take the 299 exit" on one hand, or "Down the hall" on the other.)