Redirecting URLs and keeping google happy!

How do I redirect or rewrite my URLs and keep Google pleased?

         

netnerd

1:52 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



I have a database application on Apache, using PHP.

I want any query on the root of the site to be processed by a script and return a page based on that query.

So, if someone types in [mysitename.com...] , my server will take theirquery.htm as a string, use it to build a page from the database, and display it on [mysitename.com...]

I don't think it's wise to use a 404 error page to do the redirect, because if it returns a 404 then the site won't be spidered. I don't know whether a 301 redirect will still let Google keep the original URL, or whether it will cause Google to record the "file processing" URL instead.

Any ideas?

netnerd

1:59 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



Also - I'd like to know if there is a way to ensure that a "200" code is returned to the logs, instead of a 301 or 404, for the queries that I have set up in the script.

beowulfdk

2:47 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



Search for mod_rewrite tutorials :-)

In your .htaccess file you put something like:

# Map /anything.htm to the script; the dot is escaped so it only matches a literal "."
RewriteEngine on
RewriteRule ^([a-z0-9]+)\.htm$ /page.php?query=$1 [L]

Then whenever someone visits yourdomain.com/anything-limited-to-letters-and-numbers.htm it will be served by the page.php script, which uses the value of the query parameter (available as $_GET['query'], or as $query if register_globals is enabled) to determine the contents of the page. You can rewrite in any number of ways to create human-readable, fully database-driven sites.
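
A minimal sketch of what the receiving end of that rule might look like in page.php, assuming a current PHP version where the value is read from $_GET rather than from an auto-created $query global. The helper name clean_query is made up for illustration; it just re-checks that the value matches the same letters-and-digits pattern as the rewrite rule before the script does anything with it.

```php
<?php
// Hypothetical helper for page.php: returns the slug unchanged if it is made
// up only of lowercase letters and digits, or null for anything else.
function clean_query($raw): ?string
{
    if (!is_string($raw)) {
        return null;
    }
    // Same character class as the rewrite rule; \z rejects a trailing newline.
    return preg_match('/^[a-z0-9]+\z/', $raw) === 1 ? $raw : null;
}

// Read the parameter explicitly; it is null when the request carries none.
$query = clean_query($_GET['query'] ?? null);
```

Re-validating in the script is a belt-and-braces choice: page.php can still be requested directly with any query string, bypassing the rewrite pattern.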

netnerd

3:17 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



Will Google see all the pages that a user would come in on as being actual pages (200)?

beowulfdk

5:14 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



If the URL ends in htm or html, Google has no way of finding out whether the pages are in fact dynamic or static. Of course, converting a database-driven website into a static-looking website does not guarantee that Google indexes everything overnight; just how much (how deep) it indexes your site depends on a lot of things (e.g. page ranking, how well the site links internally, etc.)

netnerd

5:17 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



OK - so as long as I have nice big high-PR links going into it and keep a good link structure within the site, I am in business (provided I give .htm or .html endings to the pages which are created on the fly)?

jdMorgan

5:44 pm on Dec 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The code beowulfdk posted would not be 'visible' to search engines, since it simply rewrites the URL internal to the server. As far as any search engine can tell, the page *is* an .htm page.

Your question about 200-OK responses brings up another point... Make sure your script *does not* provide 200-OK responses to *all* queries. If you don't have something useful to serve, then have PHP write a 404 response header. Search engine spiders are very leery of sites where they *cannot* get a 404 - even by trying - because they then suspect that the URL-space of the server is infinite - that it will never return 404. So, they artificially limit the depth of spidering on such sites.

Jim

netnerd

5:52 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



Thanks Jim - a VERY good point. Now for my next question - how do I return a 404 using this setup?

RobbieD

5:54 pm on Dec 12, 2003 (gmt 0)

10+ Year Member



ErrorDocument 404 /404.html

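404.html can be called anything you want. Note that this only sets the page Apache serves when Apache itself generates the 404; a URL that is rewritten to an existing script still needs the script to send the 404 status.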

jdMorgan

6:14 pm on Dec 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>how do i return a 404 using this setup?

>> have php write a 404 response header

That's a PHP question - something I am singularly unqualified to answer.

I believe it's the header() function, something like header("HTTP/1.0 404 Not Found"), called before the script sends any output.

On a deeper level, your script has got to 'know' what queries it has an answer to. So the answer very much depends on what kind of site you've got, and is likely well beyond the scope of our little forum here.

On a database-driven site, you'd query the database and return a page if you found query-related data. Otherwise, you'd return a 404-Not Found response -- remember, the browser or spider thinks this is a real HTML page it's asking for.

Jim
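
For the PHP side of that last point, here is a rough sketch rather than a definitive implementation. It assumes PHP 7.1 or later (http_response_code() itself exists since 5.4; older versions would call header("HTTP/1.0 404 Not Found") instead). The PAGES array is a made-up stand-in for the database lookup, and resolve_page and send_page are hypothetical names, not anything from the original setup.

```php
<?php
// Hypothetical stand-in for the database: slug => page body.
// A real site would run a query here instead.
const PAGES = [
    'widgets' => '<h1>Widgets</h1>',
    'gadgets' => '<h1>Gadgets</h1>',
];

// Decide the response without sending anything, so the logic is easy to check.
// Returns [status code, body].
function resolve_page(?string $slug, array $pages): array
{
    if ($slug !== null && isset($pages[$slug])) {
        return [200, $pages[$slug]];
    }
    // Nothing useful to serve: answer 404 instead of a 200 with an empty page.
    return [404, '<h1>Not Found</h1>'];
}

// Set the status code first, then write the body.
function send_page(array $response): void
{
    [$status, $body] = $response;
    http_response_code($status); // must be called before any output
    echo $body;
}

// In page.php the two would be combined as:
// send_page(resolve_page($_GET['query'] ?? null, PAGES));
```

Keeping the lookup separate from the output means the 200-versus-404 decision can be tested without a web server, and the URL in the visitor's address bar stays the .htm one either way.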