About to launch new site, need help with file names

and how to ban SEs from indexing .php extension

         

walkman

7:29 pm on Mar 26, 2005 (gmt 0)



I'm about to go live with a site. It runs on PHP and the file names are of the form widget.php. My old site used URL rewriting, and the URLs were domain.com/widget-keyword. Now, I can change the script to link to /widget-keyword instead of widget.php; it only takes me about 4-5 minutes to do that sitewide. However, if someone links to widget.php, the page is still there, so I have to use a redirect, just in case.

Can I make almost all .php files (all but 2-3) respond to the /widget-keyword format? I know I can do it for all of them, but can I exclude a few, or give a few a different path...?

Also, can I safely ban Google and other bots from indexing all .php files to avoid any dupe issues, just in case? Keep in mind that the index page is index.php too, but I will not link to it; all links will be to domain.com/
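To make the question concrete, here is a rough sketch of the logic I have in mind, written in Python only to show the decision and not as the actual server setup. Every name in it (`EXCLUDED`, `SLUGS`, `redirect_target`, and the example `/admin.php` path) is made up for illustration: a request for an old .php path gets a permanent redirect to its clean URL unless the script is on the exclusion list.

```python
from typing import Optional

# Hypothetical data: scripts that should keep answering on their .php path,
# and a table mapping script names to their keyword URLs.
EXCLUDED = {"/index.php", "/admin.php"}
SLUGS = {"widget": "widget-keyword"}


def redirect_target(path: str) -> Optional[str]:
    """Return the clean URL a .php path should 301 to, or None to serve it as-is."""
    if not path.endswith(".php") or path in EXCLUDED:
        return None
    # Split "/dir/widget.php" into the directory part and the file name.
    prefix, filename = path.rsplit("/", 1)
    name = filename[: -len(".php")]
    slug = SLUGS.get(name)
    if slug is None:
        # Unknown script: leave it alone rather than guess a target.
        return None
    return prefix + "/" + slug
```

With this, `redirect_target("/widget.php")` gives `/widget-keyword`, while the excluded scripts and the clean URLs themselves return `None` and are served normally.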

thanks for any suggestions,

jorj

10:00 pm on Mar 26, 2005 (gmt 0)

10+ Year Member



Two months ago I ran a test: three identical pages, all linked from the homepage with the same link text but going to:

/dir/
/dir/file.html
/dir/file.php

Google went through all of them and indexed only
/dir/file.php

What can you say about that! In my opinion, as long as there are no parameters in the query string, any of the links above could have been indexed equally.

geekay

9:19 am on Mar 27, 2005 (gmt 0)

10+ Year Member



Can one also conclude from jorj's test that there will be no duplicate page problem in SEs if a site has a mixture of absolute and relative internal links pointing at the same page?

If a consistent linking structure (exclusively absolute or exclusively relative) is one day accomplished, are some pages likely to fall out of the SE index due to a duplicate content penalty?

jorj

9:37 am on Mar 27, 2005 (gmt 0)

10+ Year Member



I haven't tested relative vs absolute linking to the same page, but I can say that the homepage's links are almost all absolute and point to other domains.

You made one interesting observation: Google probably stops indexing pages whose content is similar on the same site. They probably have a few degrees of similarity, which is why some pages are treated as different, others as 'similar pages', and others are just not indexed at all.

If this is true, then I should start thinking about a tool to measure the 'distance' between pages of the same site, which will ultimately affect the number of pages Google will take into consideration.
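A minimal sketch of what such a tool could compute, assuming word shingles and Jaccard similarity as the measure. This is my own choice of technique, not anything Google is known to use, and the function names `shingles` and `distance` are made up for the example.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences (shingles) found in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def distance(a: str, b: str, k: int = 3) -> float:
    """Jaccard distance between two texts: 0.0 is identical, 1.0 is disjoint."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)
```

Running `distance` over the visible text of every pair of pages on a site would give a rough picture of which pages sit close together. Where to draw the line between 'different', 'similar' and 'not indexed' is a guess that would need testing.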