Only my front page is currently listed in Google. My question is: will Google spider PHP sites? If not, why not? :-)
All I ever see in the logs is Googlebot fetching the main page:
"GET / HTTP/1.0" 200 46011 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
and fetching robots.txt.
Thanks a lot (hope that made sense)
[edited by: Marcia at 12:12 am (utc) on July 26, 2003]
[edit reason] No individual site checks, please. [/edit]
The problem may be one of several things.
1) If the site is new, it may be too early for other pages to get spidered.
2) If the PR is low, Google may not want to spider the other pages.
3) If you use session codes in the URLs, Google may decide not to spider other pages in case it is a dead end and its bot gets stuck / lost and does not come home for dinner.
4) Some PHP-driven forums are not that cleverly designed SEO-wise. Some use the same damn title for every one of their dynamic pages. If Googlebot sees same-same, it says no-no.
My suggestions:
1) Get someone to move any on-page JS into separate files.
2) Get rid of any session codes.
3) Find a way to edit the title (and other meta areas, although these are nowhere near as important) of the dynamic pages, so that, for example, each page returns the subject of the posting as its title and heading.
4) Make sure you cross-link your internal pages as much as possible.
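Suggestion 3 could be sketched roughly as follows. This is only an illustration, not code from any particular forum package: the function name build_page_title and the site name used in the usage note are made up for the example.

```php
<?php
// Hypothetical helper (not part of any forum software): build a
// per-page <title> string from the subject of the posting.
function build_page_title(string $subject, string $siteName): string
{
    // Collapse runs of whitespace so the title stays on one line.
    $subject = trim(preg_replace('/\s+/', ' ', $subject));

    // Fall back to the bare site name when there is no subject.
    if ($subject === '') {
        return htmlspecialchars($siteName, ENT_QUOTES, 'UTF-8');
    }

    // Escape the result so a subject containing < or & cannot break the markup.
    return htmlspecialchars($subject . ' - ' . $siteName, ENT_QUOTES, 'UTF-8');
}
```

In a page template you would then print something like '<title>' . build_page_title($subject, 'Example Forum') . '</title>', so every topic gets its own title instead of one shared string.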
You said it is a PHP forum?
There is code you can put into your .htaccess file, along with files you put into your root, that will present all of your PHP pages as HTML just for Googlebot when it comes to the site. It also makes all pages appear just one level below the index rather than buried.
I use it on a site with thousands of posts and Googlebot loves the site. The results are wonderful.
Sticky me if you want more info.
What I did was add the following code to sessions.php:
global $SID, $HTTP_SERVER_VARS;
if (!empty($SID) && !eregi('sid=', $url) && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Googlebot') && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'email@example.com;'))
Apparently this will stop Google from being given session IDs. Unfortunately Googlebot hasn't been back since I added this, so I guess I just have to sit back and wait.
Thanks for your advice, guys.
You're using the phpBB forum software, so you should find plenty of information on their support forums. It's been discussed there, believe me. ;)
If you are already using the proper code, the cause of your trouble must lie elsewhere.
This is the proper code you should use:
global $SID, $HTTP_SERVER_VARS;
if (!empty($SID) && !eregi('sid=', $url)
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Googlebot')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'FAST-WebCrawler')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Slurp@inktomi')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Scooter')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'MicrosoftPrototypeCrawler'))
    $url .= ( ( strpos($url, '?') !== false ) ? ( ( $non_html_amp ) ? '&' : '&amp;' ) : '?' ) . $SID;
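For anyone on a newer PHP version, where eregi() and $HTTP_SERVER_VARS no longer exist, the same idea might look roughly like this. Treat it as a sketch, not phpBB's actual code: the names CRAWLER_AGENTS, is_crawler and append_sid_for are invented here, the crawler list is only an example, and the session string is assumed to look like "sid=...".

```php
<?php
// Hypothetical list of crawler user-agent substrings (an assumption,
// not an official list); extend it as needed.
const CRAWLER_AGENTS = ['Googlebot', 'FAST-WebCrawler', 'Slurp', 'Scooter'];

// Returns true when the user agent contains one of the substrings above.
function is_crawler(string $userAgent): bool
{
    foreach (CRAWLER_AGENTS as $needle) {
        if (stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Appends the session string to $url unless it is empty, already present,
// or the visitor looks like a crawler.
function append_sid_for(string $url, string $sid, string $userAgent, bool $nonHtmlAmp = false): string
{
    if ($sid === '' || stripos($url, 'sid=') !== false || is_crawler($userAgent)) {
        return $url;
    }

    // Use '?' for the first parameter, otherwise '&' (or '&amp;' for HTML output).
    $separator = (strpos($url, '?') !== false) ? ($nonHtmlAmp ? '&' : '&amp;') : '?';

    return $url . $separator . $sid;
}
```

In a real request you would pass in $_SERVER['HTTP_USER_AGENT'] (or an empty string when it is missing), so crawlers get clean URLs while ordinary visitors keep their sessions.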
You should also check the recent posts about session IDs and Google [webmasterworld.com]; many of the statements there should help reduce any paranoia. ;)