Forum Moderators: open
It is obviously frustrating when you have a new site and want to get more traffic (targeted traffic). Ironically, the biggest reason I became more fascinated with Google is Overture's (rather silly, in my opinion) decision to raise its minimum bids to 10c with no notice. The specific terms I was planning on bidding on had only one or two other bidders, at 5 and 6 cents - I had no choice but to bid 10c. And it turned out not to be worth it. So I turned to Google.
Google's AdWords have been helping, but I feel that my site will be far more valuable in the eyes of users if it shows up in the search results and not as a "sponsored link".
I am hoping that the structure of my site (designed more for the users than for search engines) will be one that is friendly for Googlebot.
Thanks to the tips in this forum, I have added useful (outside) links on my pages, clearer navigation, and hopefully content that will attract the right search phrases.
I just wanted to thank those who are "in the know" for taking the time to share their experiences with others.
To GoogleGuy: Many thanks to you and your co-genii, for without Google there would be no way to easily find anything useful online. Thank you for being able to update every month, instead of once every six months or two years or so like the dinosaurs. Thanks for creating Freshbot, even though not even the Illuminati know how that thing works ;)
Onto the questions:
1) I don't have any HTML pages on my site. Every spiderable page is part of a series of directories (they go up to 3 deep), and each directory is run by a dynamic program to give up-to-the-second content updates. Does Google prefer HTML pages to directory pages?
2) All of the end-pages of my directory structure link to the appropriate Google directory for that particular page type (along with other applicable links, here and there). But some of the middle-pages only have links to home and to end-pages. Should I put outside links on these middle-pages?
3) If there is no referer, my directory programs do not add anything to sublinks (i.e., foo.com/Dir1/Dir2/), but if there is a referer, they will add a session ID (i.e., bar.com/Dir1/Dir2/?sess=abcde). Does Googlebot ever send a "Referer" header?
4) Does Google dislike spaces - %20 - in URLs? Such as foobar.com/Dir1/New Additions/ ...? Should I replace the spaces with dashes or underscores? Does it matter?
5) It seems that Freshbot came through last night and looked at all of my directories. But only one of them is up on Google with a "zzz.com site:www.zzz.com" query. (Two days ago, two of my directories were up on Google - different ones, and they both disappeared this morning.) Is this normal? Will the deep Googlebot pick them up if Freshbot can find them? Or should I be worried?
Anybody's input on any of these questions would be much appreciated!
-Xenozenith
As you will see, I'm only new here myself, but here are my thoughts:
1) Google doesn't seem to have any problem with dynamic sites, provided they are cleanly written. My site is nearly entirely dynamic, and all 1,400 pages are listed, with the important ones ranking well.
4) I'm sure some of my pages have spaces in the URLs without a problem. I don't know if it affects ranking, though I wouldn't have expected it to.
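For what it's worth, here's a quick Python sketch (the paths are made up) of why a space in a URL is at least ugly: it has to be percent-encoded on the wire, whereas a dash passes through untouched.

```python
from urllib.parse import quote

# A directory name with a space must be percent-encoded in a URL.
path = "/Dir1/New Additions/"
print(quote(path))  # -> /Dir1/New%20Additions/

# Replacing spaces with dashes avoids the %20 entirely.
clean = path.replace(" ", "-")
print(quote(clean))  # -> /Dir1/New-Additions/
```

Whether the encoded form actually hurts ranking is anyone's guess, but the dashed version is certainly easier on the eye in the SERPs.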
5) If your site has only been picked up by Freshbot, then expect this sort of activity. Things should settle down after the dance, once you're deep-crawled.
If you're using PHP and getting picked up in Google, you're lucky. If you go to google.com/webmasters/facts.html, they explain what they work with, and PHP isn't one of them.
Google says not to use session IDs in URLs, and they say they have no problems spidering most file types and languages.
PHP is server-side; it spits out straight HTML. If it doesn't get spidered, then the person who coded it is choking the spider with bad HTML or monster query strings in the URL. I have never had any problems with either a .php extension or a .html page serving PHP.
It's not the php, it's the programmer. ;)
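To make the "monster query strings" point concrete, here's a rough Python sketch of the kind of URL cleanup being suggested. The function name and example URLs are mine, not from anyone's actual site:

```python
def clean_url(script, params):
    """Map a query-string URL to a path-style URL.

    e.g. /products.php?cat=5&id=42 -> /products/5/42/
    """
    # Strip the directory part and the file extension from the script name.
    name = script.rsplit("/", 1)[-1].split(".")[0]
    # Build a clean, slash-separated path from the parameter values.
    return "/" + "/".join([name] + [str(v) for v in params.values()]) + "/"

print(clean_url("/products.php", {"cat": 5, "id": 42}))  # -> /products/5/42/
```

The server still runs the same PHP behind the scenes; the spider just sees a plain, static-looking path instead of a pile of `?` and `&`.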
I just wanted to stress that, based on my experience, I don't think anyone should be put off using PHP because of search engine rankings. Before I wrote sites in PHP, I checked to see whether people advised against it for SEO (in the Google-approved sense), and the clear answer was that it should not be a problem.
Perhaps the lesson is that, when writing a dynamic site, regardless of the language, the programmer should be aware of good clean HTML and general SEO, i.e. they should always be careful to make sure the *output* looks 'natural' and clean.
Well said, Tartan75. Most of the programmers I have worked with have no clue about search engines or spiders. Most just want to make their coding easier. It never crosses their minds to consider how accessible the site is to spiders, or how the site will be marketed later.
"But look the site is so cool and fast"
Who cares? If no one can find it, then it really doesn't matter how cool it is or that you used all those fancy techniques.
The KISS method always worked best for me.
A good read is Common HTTP Implementation Problems [w3.org]. Especially section 3.1, where it states that you should generally serve content (dynamic and static alike) without file extensions, so you are not tied to a particular technology. Summed up in the often-heard advice: "Cool URIs don't change", but cool content does.
And yes, this is also good in Google's sense. I have two sites that are built entirely with PHP and have a .php extension (back then I didn't know better), usually with one or two (short) query strings. Both sites have a reasonable PR and several thousand pages indexed; they are well positioned in the SERPs and are frequently visited by Freshbot.
So my conclusion is: if you know what you're doing and how it works, then yes, PHP works great with users and search engines (including Google).
What your client faced is a "less than clever" use of PHP sessions. Any session tracking of that kind will get your pages dropped from the index as soon as Googlebot hits them.
This happened to me in October 2002 when I decided to use PHP sessions to track the visitor's language on a bilingual site.
It took less than a week to see all my pages except the main page drop from the index. The reason the main page stayed in the index is that PHP sessions aren't a problem for the first page hit on a site; it's all the links therein that get a PHPSESSID variable added.
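That trans-sid link rewriting is easy to spot and undo once you know it's there. Here's a minimal sketch in Python of stripping the session parameter back out of a URL (the URL itself is made up):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_id(url, param="PHPSESSID"):
    """Remove a session-id query parameter from a URL, leaving the rest intact."""
    scheme, netloc, path, query, frag = urlsplit(url)
    pairs = [(k, v) for k, v in parse_qsl(query) if k != param]
    return urlunsplit((scheme, netloc, path, urlencode(pairs), frag))

url = "http://example.com/Dir1/Dir2/?PHPSESSID=abcde&page=2"
print(strip_session_id(url))  # -> http://example.com/Dir1/Dir2/?page=2
```

The real fix, of course, is to stop PHP from adding the parameter in the first place, as below.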
A few ways to go around this:
- don't use sessions
- don't use sessions if the user agent is a spider
- use session but make sure to disable trans-sid (so that PHP won't add a PHPSESSID=... to your URLs if the visitor doesn't accept cookies - like spiders)
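The second option above boils down to a user-agent check before starting the session. Here's a rough sketch of that logic in Python (the bot list is illustrative, not exhaustive, and the function name is mine):

```python
# Substrings that identify common crawler user agents (illustrative only).
BOT_SUBSTRINGS = ("googlebot", "slurp", "teoma", "msnbot")

def should_start_session(user_agent):
    """Return False for known spider user agents, True for ordinary visitors."""
    ua = user_agent.lower()
    return not any(bot in ua for bot in BOT_SUBSTRINGS)

print(should_start_session("Mozilla/4.0 (compatible; MSIE 6.0)"))            # True
print(should_start_session("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))  # False
```

For the third option, in PHP itself the trans-sid part is just the `session.use_trans_sid` ini setting, which you can switch off in php.ini or at runtime so URLs are never rewritten for cookie-less visitors like spiders.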
To be on the safe side when using sessions, make sure all your links to pages (not to images) are using full addresses (http://...) and not relative ones.
Once I solved this, I was back in the index a month later :))
cheers,
Dan