How does Google handle Dynamic PHP Pages?

         

Xenozenith

11:22 pm on Mar 3, 2003 (gmt 0)

10+ Year Member



I am utterly new to the art of getting one's website noticed by Google. I have enjoyed reading some of the tips on what are the right - and wrong - things to do.
Until about three days ago, I would never have imagined that there was such a thriving forum based on nothing other than Google listings and rankings!
But then again, when you are dealing with a true legend, I suppose that all roads now lead to Google.

It is obviously frustrating when you have a new site and want to get more traffic (targeted traffic). Ironically, the biggest reason I became more fascinated with Google is Overture's (rather silly, in my opinion) decision to raise its minimum bids to 10c with no notice. The specific terms I was planning on bidding on had only one or two other bidders, at 5 and 6 cents - I had no choice but to bid 10c. And it turned out not to be worth it. So I turned to Google.
Google's AdWords have been helping, but I feel that my site will be far more valuable in the eyes of users if it shows up in the search results and not as a "sponsor link".

I am hoping that the structure of my site (designed more for the users than for search engines) will be one that is friendly for Googlebot.
Thanks to the tips in this forum, I have added useful (outside) links on my pages, clearer navigation, and hopefully content that will attract the right search phrases.
I just wanted to thank those who are "in the know" for taking the time to share their experiences with others.
To GoogleGuy: Many thanks to you and your co-genii, for without Google there would be no way to easily find anything useful online. Thank you for being able to update every month, instead of once every six months or two years or so like the dinosaurs. Thanks for creating Freshbot, even though not even the Illuminati know how that thing works ;)

Onto the questions:

1) I don't have any HTML pages on my site. Every spiderable page is part of a series of directories (they go up to 3 deep), and each directory is run by a dynamic program to give up-to-the-second content updates. Does Google prefer HTML pages to directory pages?

2) All of the end-pages of my directory structure link to the appropriate Google directory for that particular page type (along with other applicable links, here and there). But some of the middle-pages only have links to home and to end-pages. Should I put outside links on these middle-pages?

3) If there is no referer, my directory programs do not add anything to sublinks (i.e., foo.com/Dir1/Dir2/), but if there is a referer, they will add a session id (i.e., bar.com/Dir1/Dir2/?sess=abcde). Does Googlebot ever have a "referer"?

4) Does Google dislike spaces - %20 - in URLs, such as foobar.com/Dir1/New Additions/ ...? Should I replace the spaces with dashes or underscores? Does it matter?

5) It seems that Freshbot came through last night and looked at all of my directories. But only one of them is up on Google with a "zzz.com site:www.zzz.com" query. (Two days ago, two of my directories were up on Google - different ones, and they both disappeared this morning.) Is this normal? Will the deep Googlebot pick them up if Freshbot can find them? Or should I be worried?

Anybody's input on any of these questions would be much appreciated!

-Xenozenith

Tartan75

11:47 pm on Mar 3, 2003 (gmt 0)

10+ Year Member



Xenozenith - welcome to WebmasterWorld

As you will see, I'm only new here myself, but here are my thoughts:
1) Google doesn't seem to have any problem with dynamic sites, provided they are cleanly written - my site is almost entirely dynamic, and all 1400 pages are listed, with the important ones ranking well.

4) I'm sure some of my pages have spaces in the URLs, without a problem - I don't know if it affects ranking, although I wouldn't have expected it to.

5) If your site has only been picked up by Freshbot, then expect this sort of activity. Things should settle down after the dance, once you've been deep-crawled.
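
On question 4, here is a minimal Python sketch of the two options being discussed: leaving the space in the directory name (which reaches the spider percent-encoded as %20) versus replacing it with a dash. The function names are made up for illustration, and nothing here says which form Google prefers; it only shows what each URL would look like.

```python
import re
from urllib.parse import quote


def encoded_path(path):
    """Percent-encode a path the way it travels in a request (space -> %20)."""
    # quote() leaves "/" untouched by default, so directory separators survive.
    return quote(path)


def dashed_path(path):
    """Hypothetical alternative: replace each run of whitespace with one dash."""
    return re.sub(r"\s+", "-", path.strip())


if __name__ == "__main__":
    original = "/Dir1/New Additions/"
    print(encoded_path(original))
    print(dashed_path(original))
```

If the directory names are renamed this way, the old spaced URLs would need to keep working (or redirect) so that existing links don't break.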

netguy

11:55 pm on Mar 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just watch out for the dynamic pages that are PHP. Google doesn't like them at all. My client decided to go it alone on a PHP cart and the site went from page 1 to never, never land in 2 days.

Tartan75

12:06 am on Mar 4, 2003 (gmt 0)

10+ Year Member



My dynamic pages are PHP - no reason to think that affects my rankings at all.

netguy

12:26 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tartan75 - RE: PHP Didn't work on Google: There was not a single keyword for any of the 200 products that came up after going to PHP. I hired a PHP consultant to do a work-around, but it was so SLOOOOW that we abandoned it and I just did a ton of static front-end html pages to drive into the PHP cart.

If you're using PHP and getting picked up in Google, you're lucky. If you go to google.com/webmasters/facts.html they explain what they work with, and PHP isn't one of them.

atadams

12:33 am on Mar 4, 2003 (gmt 0)

10+ Year Member



"If you're using PHP and getting picked up in Google, you're lucky. If you go to google.com/webmasters/facts.html they explain what they work with, and PHP isn't one of them."

Isn't PHP just like the other pre-processors? It's still just serving out HTML, isn't it?

netguy

12:40 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not a PHP expert. This is just one of my clients, whom I put online in 1996. Everything was working fine, with top positions in virtually every category, and then his new IT guy decided to get 'fancy' and go with PHP. A routine check a few days later showed EVERY product off Google. Since the client had already invested heavily in the database, and after trying other avenues to make it work, I went with the static front-end, and the client is happy again.

Tartan75

12:44 am on Mar 4, 2003 (gmt 0)

10+ Year Member



atadams - you're correct. The only possible thing that Google could see as any different from static HTML would be the URL, but even that can easily be made to look static.

<edit>assuming the PHP is nice and 'clean'</edit>

[edited by: Tartan75 at 12:48 am (utc) on Mar. 4, 2003]

netguy

12:55 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tartan75, you may be correct on the 'clean'... As I said, I personally don't know PHP, but my client learned a costly lesson. You may have a clever work-around, but I'll stick with what Google's own site says, and not recommend PHP as an avenue to get picked up by the search engines.

jatar_k

1:00 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I've never had any problems.

Google says not to use session ids in the URLs, and they say they have no problems spidering most file types and languages.

PHP is server side; it spits out straight HTML. If it doesn't get spidered, then the person who coded it is choking the spider with HTML or monster query strings in the URL. I have never had any problems with either a .php or an .html page serving PHP.

It's not the PHP, it's the programmer. ;)

netguy

1:04 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Where were all you guys when my client needed you......

Tartan75

1:13 am on Mar 4, 2003 (gmt 0)

10+ Year Member



netguy
I appreciate your client had problems. I didn't mean to contradict you, and I'm sorry you have had problems. I'm only self-taught (or taught by a friend who is self-taught), so I don't consider myself any expert.

I just wanted to stress that, based on my experience, I don't think anyone should be put off using PHP because of search engine rankings. Before I wrote sites in PHP, I checked to see if people advised against it for SEO (in the Google-approved sense), and the clear answer was that it should not be a problem.

Perhaps the lesson is that, when writing a dynamic site, regardless of the language, the programmer should be aware of good clean HTML and general SEO, i.e. they should always be careful to make sure the *output* looks 'natural' and clean.

jatar_k

1:18 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Well netguy, we were right here :)

Well said Tartan75, most of the programmers I have worked with have no clue about SEs or spiders. Most just want to make their coding easier. It never crosses their minds to consider how accessible the site is to spiders and how they will market the site later.

"But look, the site is so cool and fast"

Who cares? If no one can find it, then it really doesn't matter how cool it is or that you used all those fancy techniques.

The KISS method always worked best for me.

netguy

1:19 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would certainly agree. Unfortunately, the new IT guy hired "a friend of a friend" without considering the SEO ramifications. The guys knew how to (basically) do the database fine - but whatever else they did screwed up the search capabilities. Such a dramatic shift needs to be researched thoroughly, and if there is a point here, it is that anyone considering PHP should make sure they are dealing with a PHP programmer who knows that Google has to 'see' the pages - otherwise there are big problems.

Woz

1:31 am on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Split from [webmasterworld.com...]

Onya
Woz

atadams

2:07 am on Mar 4, 2003 (gmt 0)

10+ Year Member



There are issues with URL variables (the number of them and their names), lord knows I've had a couple, but I would imagine those would be the same for all pre-processors.
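
As a rough illustration of checking URL variables by count and by name, here is a small Python sketch. The threshold of two variables and the list of suspect names are assumptions picked for the example (the names are the session parameters mentioned elsewhere in this thread); they are not documented Google limits.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical values for illustration only, not published limits.
MAX_VARS = 2
SUSPECT_NAMES = ("sess", "phpsessid")


def query_variables(url):
    """Return the names of the query-string variables in url, in order."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return [name for name, _value in pairs]


def looks_risky(url, max_vars=MAX_VARS, suspect_names=SUSPECT_NAMES):
    """Flag URLs with many variables or with a session-style variable name."""
    names = query_variables(url)
    too_many = len(names) > max_vars
    suspect = any(name.lower() in suspect_names for name in names)
    return too_many or suspect


if __name__ == "__main__":
    for url in ("http://example.com/Dir1/Dir2/",
                "http://example.com/cart.php?cat=3&item=17&sess=abcde"):
        print(url, query_variables(url), looks_risky(url))
```

The same check applies whatever generates the page (PHP, Perl, or anything else), since it only looks at the URL.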

Xenozenith

10:00 am on Mar 4, 2003 (gmt 0)

10+ Year Member



Hi Tartan and all others,
Thanks for the feedback. I have gone to great lengths to try to ensure that there won't be any problems when the Google deep bot comes.
I noticed a few comments about PHP in the thread - I've never used the stuff, and there are no SSIs on my pages. There are also no cookies and not much JS to speak of. All of my pages are generated by Perl programs which string together template files and pieces.
Google has picked up some of my shorter .cgi URLs from my other sites. (But for some reason it puts a space in the display of these URLs when they match a search, although they still click through properly.)

-Xenozenith

ruserious

11:33 am on Mar 4, 2003 (gmt 0)

10+ Year Member



jatar_k basically said it: PHP is a server-side technology. If it is done right, the requesting client cannot know how the pages are produced on the server side, no matter whether they are static (HTML) files, PHP, Python, Perl, C++ (*G*) or whatever.

A good read is Common HTTP Implementation Problems [w3.org], especially 3.1, where it states that you should generally (i.e. for both dynamic and static content) serve without file extensions, so you are not dependent on technology. This is summed up in the often-heard advice: "Cool URIs don't change", but cool content does.

And yes, this is also good in Google's sense. I have two sites that are built entirely with PHP and have a .php extension (back then I didn't know better) and usually one or two (short) query strings. Both sites have a reasonable PR and several thousand pages indexed; they are well positioned in the SERPs and are frequently visited by Freshbot.

So my conclusion is: if you know what you're doing and how it works, then yes, PHP works great with users and search engines (including Google).

ade_uk

11:36 am on Mar 4, 2003 (gmt 0)

10+ Year Member



I have some nicely ranked PHP-Nuke sites :)

hetzeld

3:15 pm on Mar 4, 2003 (gmt 0)

10+ Year Member



Netguy,

What your client faced is a "less than clever" usage of PHP sessions. Any session tracking in the URL would get your pages out of the index as soon as Googlebot hits them.
This happened to me in October 2002 when I decided to use PHP sessions to track the visitor's language on a bilingual site.
It took less than a week to see all my pages except my main page dropped from the index. The reason the first page stays in the index is that PHP sessions aren't a problem for the first page hit on a site, but all links on it have a PHPSESSID var added.

A few ways to get around this:
- don't use sessions
- don't use sessions if the user agent is a spider
- use sessions, but make sure to disable trans-sid (the session.use_trans_sid setting), so that PHP won't add a PHPSESSID=... to your URLs if the visitor doesn't accept cookies - like spiders

To be on the safe side when using sessions, make sure all your links to pages (not to images) use full addresses (http://...) and not relative ones.
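
The second option in the list above (no session id when the user agent is a spider) could be sketched roughly as follows. This is Python rather than PHP, purely to show the logic; the marker strings, the function names, and the `sess` parameter name (borrowed from the example URL in the first post) are assumptions, not a real spider list or a real API.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical substrings used to recognise spiders; a real list would be longer.
SPIDER_MARKERS = ("googlebot", "spider", "crawler")


def is_spider(user_agent):
    """Guess whether a User-Agent header belongs to a search engine spider."""
    agent = (user_agent or "").lower()
    return any(marker in agent for marker in SPIDER_MARKERS)


def link_with_session(url, session_id, user_agent, param="sess"):
    """Append the session id to url for ordinary visitors only.

    Spiders (and visitors with no session) get the clean URL back unchanged.
    """
    if not session_id or is_spider(user_agent):
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    url = "http://foo.com/Dir1/Dir2/"
    print(link_with_session(url, "abcde", "Googlebot/2.1"))
    print(link_with_session(url, "abcde", "Mozilla/4.0 (compatible; MSIE 6.0)"))
```

The page content stays the same for everyone here; only the session parameter on the links differs.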

Once I had solved this, I was back in the index a month later :))

cheers,

Dan