HTML, PHP and Google

Is this considered duplicate content?

         

duhboy

4:24 pm on Aug 6, 2004 (gmt 0)

10+ Year Member



Hello all,
I help out with a legit business site which started with html pages and then evolved to php pages. The original html pages are still on the site. The new php pages are very similar except, of course, for the php functionality. One could argue that there is duplicate content except for the mentioned function of the php.

Our visibility at Google dropped like crazy after Florida. It has not recovered despite solid effort. Yet MSN and Yahoo serve us quite well.

Would Google consider this "duplicate content" and penalize our site because of it? If so, should I consider getting rid of the html pages to reduce this possible risk? I realize that the other alternative is to change the content. This is lots of work to consider before knowing for sure.

Thanks for your help, Dboy.
PS. Is content still king at Google?

diamondgrl

10:52 pm on Aug 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



yes, that's duplicate content. except for the .php on the end, all google sees is the html code that php spits out. that html is presumably the same as your, well, html code.

so definitely definitely get rid of the old html files with a 301 permanent redirection.

internetheaven

8:07 am on Aug 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yet MSN and Yahoo serve us quite well.

Are MSN and Yahoo showing both the HTML and PHP pages and ranking both well?

semiprofessional

1:10 am on Aug 12, 2004 (gmt 0)

10+ Year Member



Have you thought about telling Apache to allow .html files to execute PHP implicitly? This is what I do so that all my URLs show .html, but underneath the source is actually PHP and it all works just fine.
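
For readers following along, a minimal sketch of what this might look like in an .htaccess file. It assumes Apache runs PHP as a module (mod_php) and that the host permits the override (AllowOverride FileInfo); the handler name is the conventional one but varies between setups, so check with your host before relying on it.

```apache
# Assumes mod_php; CGI/FastCGI setups typically need a different handler.
# Have Apache run files ending in .html (as well as .php) through PHP.
AddType application/x-httpd-php .php .html
```

With something like this in place, the visible URLs keep their .html extension while any PHP inside those files is still executed.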

Outdoor

3:04 am on Aug 12, 2004 (gmt 0)

10+ Year Member



Welcome to WW, semiprofessional.
Outdoor

diamondgrl

5:23 am on Aug 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



semiprofessional makes a good point. when i said definitely definitely get rid of your .html files and 301 them, what i probably should have said is definitely 301 one or the other, not both. and preferably 301 the .php files.

then after you have renamed all your .php files as .html, if you use apache, set up .htaccess to treat .html files as though they are php (i'm a windows iis guy, not apache, so i assume this is correct).

i'm not convinced that it matters whether you keep .php or .html but some believe that google might not like pages created in php as much because they are dynamic. it certainly can't hurt to go with .html files, and will eliminate endless worries you might have down the line about whether you should have used .php files.
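
The 301 step described above could be sketched in .htaccess roughly as follows. This assumes Apache with mod_alias enabled and that every .php page has an .html counterpart at the same path; if the names differ, each redirect would need its own line.

```apache
# Assumes mod_alias is enabled and each .php page has an .html twin
# at the same path.
# Permanently redirect any request for /something.php to /something.html.
RedirectMatch 301 ^/(.*)\.php$ /$1.html
```

Redirecting in one direction only leaves a single URL per page for search engines to index, which is the point of the advice.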

duhboy

12:34 am on Aug 25, 2004 (gmt 0)

10+ Year Member



Hello everyone,
Thank you for all of the input. I have been away from this forum for a while trying to complete other projects. While doing so I thought about my original question. Then it came to me.

Why not change a significant amount of text content on the html pages, leaving the PHP pages as they are? The graphics, table layout etc. would be the same, but the text would deliver the same message using completely different wording.

This would allow me to keep a larger number of pages on the site, modify some text for various keywords for testing purposes and avoid the site maintenance described above, which frankly speaking is above my skill level at this time.

Any further input is gratefully received. Just as a side note, is content still king at Google? Anyone have an opinion?
Take care and thanks again, Duhboy.

Patrick Taylor

1:10 am on Aug 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



set up .htaccess to treat .html files as though they are php

I haven't had it explained to me fully, so can't give chapter and verse, but I understand that when this is done with a large number of pages it raises a performance issue on the server, and that it's technically better to use the .php file extension where php is required.

Marcia

6:57 am on Aug 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Another welcome, duhboy. Yes, they're dups - check and see what Yahoo has crawled and picked up at this point. Some 301 or mod_rewrite will have to be used, but check all factors before the final decision and don't wait too long.

Thanks Patrick, I've just set up a couple of new sites now parsing HTML and using PHP includes - also started to parse HTML for PHP includes on another, older site. I'd better double-check before setting up anything further.

I know there's more processing involved than with ordinary HTML pages, but is it the fact that it's specially parsing the HTML files that puts the extra server load on, or is it the PHP itself?

Patrick Taylor

9:08 am on Aug 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Marcia, I don't know, sorry. I read somewhere that it only comes into play with large numbers of pages - what "large" is I don't know either, nor the precise effect on server, but it suggests the correct long term way is to go with the .php extension. I should know more, so I will do some research in the meantime.

There's a useful WW thread here: [webmasterworld.com...]

[edited by: Patrick_Taylor at 9:25 am (utc) on Aug. 25, 2004]

Amygdala

9:14 am on Aug 25, 2004 (gmt 0)

10+ Year Member



I would say that with modern servers, the extra load is negligible. I have all my pages .html and they are all php driven. No problem at all.

Also, I don't know where the idea comes from re it being technically better. In fact there are a number of reasons for doing it that go beyond SEO. One is to hide the technology behind your site.

duhboy, I have thousands of pages indexed using php with .html extensions. Before that some of them were .php. I simply 301'd all the .php URLs to the new .html ones in the crossover and everything was fine. No duplicate content, just a page permanently moved to the new location.

donovanh

9:20 am on Aug 25, 2004 (gmt 0)

10+ Year Member



Re: Extra load on server

I think the only consideration as far as extra load is concerned is that if you set Apache to handle all html files as php, then any html files not containing php would use slightly more resources than when they were simply served as html. The php parser would check the html file and find there is no php. That check would be the extra load on the server.

Quite negligible, suggest I would.
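
If that extra check is a concern, one possible way to avoid it is to send only the pages that actually contain PHP through the parser. A hedged sketch, again assuming mod_php and .htaccess overrides; the file names are hypothetical placeholders, not taken from anyone's site.

```apache
# Assumes mod_php; the file names are hypothetical placeholders.
# Only run the listed .html files through PHP; every other .html file
# is served as static HTML with no PHP parsing at all.
<FilesMatch "^(index|products|contact)\.html$">
    SetHandler application/x-httpd-php
</FilesMatch>
```

The trade-off is upkeep: the list has to be updated whenever PHP is added to another page.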