
Home / Forums Index / Code, Content, and Presentation / PHP Server Side Scripting
Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
Can I feed the spiders PHP in disguise?
Will this .htaccess trick affect spidering?
rycrostud




msg:1303545
 7:28 pm on Nov 27, 2002 (gmt 0)

As we all know, whether or not dynamic pages with .php/.asp/.cgi extensions are treated the same as plain .html files by the spidering search engines is a matter of constant debate.

I've recently discovered a very handy method of forcing your server to parse .html files as if they were .php by adding the following line to an .htaccess file:

AddType application/x-httpd-php .php .html

The result is that you can create .html files with embedded PHP scripts and they are parsed in exactly the same way as .php files.

What I want to know is: will a spidering engine such as Google or Inktomi be able to differentiate between a normal .html page and one that has been through the PHP interpreter? Both appear as plain HTML in the browser, but will this affect spidering?

FYI - none of these pages will have variable=value pairs passed in the URLs such as index.html?prod=23.

 

Slade




msg:1303546
 8:36 pm on Nov 27, 2002 (gmt 0)

It is possible that some header may be sent back with the page that could give it away.

The one thing I can think of is that the Last-Modified header won't be sent (by default) with your processed pages. That's a decent way to tell whether a page was preprocessed or static.

See "Are you using If Modified Since?" [webmasterworld.com] for some useful info.
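To make the header point concrete, here is a rough sketch of the kind of check a spider could run on a response. It is only an illustration: the function name `dynamic_page_hints` and the sample header values are made up for this example, and the X-Powered-By check assumes PHP's `expose_php` setting is left on. Nothing here is known to be what any search engine actually does.

```python
def dynamic_page_hints(headers):
    """Return a list of reasons a response looks script-generated.

    `headers` is a plain dict of response header names to values.
    Header names are compared case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    hints = []
    # Static files normally carry Last-Modified; PHP output does not
    # unless the script sets it explicitly.
    if "last-modified" not in lowered:
        hints.append("no Last-Modified header")
    # Assumption: expose_php is on, so PHP adds an X-Powered-By header.
    if "php" in lowered.get("x-powered-by", "").lower():
        hints.append("X-Powered-By mentions PHP")
    return hints


# Hypothetical header sets for a static page and a PHP-parsed page.
static_like = {
    "Content-Type": "text/html",
    "Last-Modified": "Wed, 27 Nov 2002 19:28:00 GMT",
}
php_like = {
    "Content-Type": "text/html",
    "X-Powered-By": "PHP/4.2.3",
}

print(dynamic_page_hints(static_like))  # []
print(dynamic_page_hints(php_like))     # ['no Last-Modified header', 'X-Powered-By mentions PHP']
```

Both hints go away if the script sends its own Last-Modified header and the X-Powered-By header is switched off, so a clean result proves nothing either way.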

dhdweb




msg:1303547
 11:12 pm on Nov 27, 2002 (gmt 0)

Google has no problem indexing a .php page!

rycrostud




msg:1303548
 10:15 am on Nov 28, 2002 (gmt 0)

Thanks for the responses. Most of my pages have .php extensions and so far I've not had any problems getting them indexed.

I was just wondering if some spiders might take two identical pages, one with .php and one with .html, and give the latter a better ranking.

My other concern was that the spider might detect that it was being served a dynamic page even though it had a .html extension, and penalise the page for it.

The info on the 304 response headers was good. Thanks, Slade.

Allen




msg:1303549
 12:02 pm on Nov 28, 2002 (gmt 0)

This method can kill a busy server though, as every .html page will now be passed through PHP.

Most search engines read PHP pages perfectly fine, since they get the HTML just as a web browser does (i.e. what you see in View Source is what the search engine spiders see).

Allen

rycrostud




msg:1303550
 1:53 pm on Nov 28, 2002 (gmt 0)

You can be quite specific about which pages are parsed by the PHP interpreter, since an .htaccess file only affects files in its own directory and the subdirectories below it.

But I hear what you're saying. Processing true HTML files when they don't contain any PHP is not only pointless but draining on server resources.

Should probably be used with care.
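One way to use it with care is to check first which .html files actually contain PHP, so that only their directories get the .htaccess line. The following is a rough sketch under some assumptions: the helper name `split_html_by_php` and the demo file names are invented for illustration, and it only looks for the full `<?php` opening tag, so files that use short tags (`<?`) would be missed.

```python
import tempfile
from pathlib import Path


def split_html_by_php(root):
    """Return (with_php, without_php): lists of .html paths under root.

    A file counts as "with PHP" if it contains the literal "<?php" tag.
    """
    with_php, without_php = [], []
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if "<?php" in text:
            with_php.append(path)
        else:
            without_php.append(path)
    return with_php, without_php


# Demo in a throwaway directory with two hypothetical pages.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "index.html").write_text(
    "<html><body><?php echo date('Y'); ?></body></html>", encoding="utf-8"
)
(demo_dir / "about.html").write_text(
    "<html><body>Plain HTML only.</body></html>", encoding="utf-8"
)

with_php, without_php = split_html_by_php(demo_dir)
print("needs PHP:", [p.name for p in with_php])
print("plain HTML:", [p.name for p in without_php])
```

Files in the second list gain nothing from being passed through PHP, so moving them to a directory without the AddType line avoids the wasted processing.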


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved