The first thing I did was type in the domain name (let's call it example.com) and click the link showing 'find web pages from this site'.
The index page / home page was the only one showing. I realize that it can take a while for a site to be fully indexed, so I decided to send the spider simulator to example.com to take a look.
Then there are about 13 links to 'about us' 'contact us' 'store news' etc, and the spider simulator shows the links that it found to those pages as "http://home.php/?com=go&nid=Contact/" (the domain is missing). The simulator shows a 500 error code for those.
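One possible cause of a link like "http://home.php/..." is an href that begins with two slashes, which a crawler reads as a host name rather than a path. That is only a guess, since the page source isn't shown in this thread, and the href values below are hypothetical. A minimal Python sketch using the standard library's urljoin shows how a spider would resolve each form against the home page:

```python
from urllib.parse import urljoin

BASE = "http://example.com/"  # the page the spider fetched

# Hypothetical href values; the site's real markup is not shown in the thread.
candidates = [
    "//home.php/?com=go&nid=Contact/",  # protocol-relative: "home.php" is read as the host
    "home.php/?com=go&nid=Contact/",    # relative path: resolved under example.com
    "/home.php/?com=go&nid=Contact/",   # root-relative path: also under example.com
]

def resolve_all(base, hrefs):
    """Resolve each href against the base URL the way a crawler would."""
    return [urljoin(base, h) for h in hrefs]

for href, url in zip(candidates, resolve_all(BASE, candidates)):
    print(href, "->", url)
```

Only the first form reproduces the domain-less URL the simulator reported; the other two resolve under example.com as intended.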
I brought this to the attention of the programmer, and he responded with this -- "the site loads content dynamically, so it has to be one page i.e. home.php and the subsequent content are variable conditionals added onto the home.php page."
Sure enough I go to the spider simulator again, add home.php to example.com, and now it lists the URLs properly with the domain name after the http://
That is one issue. Over the weekend it seems someone put a link up to example.com/home.php somewhere, but since it has the same content as the index page, I don't think that's a great idea in and of itself.
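The usual remedy for two URLs serving the same content is a permanent (301) redirect from one to the other. The sketch below only illustrates the decision logic, assuming that a bare /home.php with no query string is the duplicate of the index page; the function name is made up for illustration, and the real redirect would live in the server configuration or in home.php itself.

```python
from urllib.parse import urlsplit

def redirect_target(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed.

    Assumption: only a bare /home.php (no query string) duplicates the
    index page; home.php with parameters serves distinct content.
    """
    parts = urlsplit(url)
    if parts.path == "/home.php" and not parts.query:
        return f"{parts.scheme}://{parts.netloc}/"
    return None

print(redirect_target("http://example.com/home.php"))
print(redirect_target("http://example.com/home.php?com=go&nid=Contact/"))
```

With this rule, an inbound link to example.com/home.php passes its credit to the root URL, while the parameterized pages are left alone.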
I'm not comfortable with a site that can't be spidered from the default / index page that most people will link to.
The other big issue with the site is that, when you navigate through to a product category, there are images that link to the individual product pages. When I send the sim spider to the product category URLs, it returns this message:
"mysql_error(): supplied argument is not a valid MySQL-Link resource in /home/httpd/vhosts/example.com/httpdocs/phpFunctions_example_V1.php on line 166"
Can that type of link be spidered by Googlebot, and to a lesser extent the other bots out there?
It seems that there is wall after wall of defences within the site to prevent Googlebot from crawling it properly - would you folks agree with that assessment? I'm concerned that my findings might be met with some skepticism.
My concern is that I don't want to offer my services to put icing on the cake if the cake has serious issues that need resolving.
I welcome your thoughts and opinions!
There are a lot of sites out there that load content dynamically and that are very spiderable. I'm not familiar enough with mod_rewrite to know if a patch can be put in to write spider-friendly URLs in this case, but you might want to look into that. As for the use of pop-ups and open-new-window stuff for navigation, well, that seems like it just has to go. No spider can follow that stuff.
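To give a rough idea of what a spider-friendly URL scheme would do, here is a small Python sketch of the mapping a rewrite rule performs: a clean path on the outside, the existing home.php query string on the inside. The friendly path format (/Contact/) and the function name are assumptions for illustration; only the com=go and nid parameters come from the URLs quoted earlier in the thread, and an actual mod_rewrite rule would be written in Apache's own syntax.

```python
import re
from urllib.parse import urlencode

# Hypothetical friendly URL shape: a single path segment such as /Contact/
FRIENDLY = re.compile(r"^/([A-Za-z0-9_-]+)/?$")

def rewrite(path):
    """Map a friendly path to the internal home.php URL, or None if it doesn't match."""
    m = FRIENDLY.match(path)
    if m is None:
        return None
    return "/home.php?" + urlencode({"com": "go", "nid": m.group(1)})

print(rewrite("/Contact/"))
print(rewrite("/some/deeper/path"))
```

The spider only ever sees the clean path; the dynamic page keeps working unchanged behind it.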
The biggest problem I see here if you decide to take this on is that, after you find what can be done and plan a course of action, you're going to have to work very closely with the programmer. It's going to require a lot of tact. I'd be sure to present it as a problem-solving exercise (geeks love problems), rather than changing a "bad" design.
It sounds like this site needs a standards-based make-over.
With problems like this, if the competitors have clean sites, it's a serious concern.
They seem to have changed something now so that the links on the index page leading to 'contact us', 'about us', etc. are no longer causing the sim spider to see [home.php...]; the domain name is in there where it belongs now.
I could make an html sitemap to help bots find the categories, but..
That will be a big problem, as the content in those popups is what makes the site unique, and where I'd want targeted traffic to land.
I just now noticed that every single page (categories and products, contact us, index page) has the exact same contents in the <title> tag too. All of the pages have seriously stuffed meta description and keywords tags with identical content across the board. I can only imagine what that will add to the equation.
I just fixed another site where identical meta tags on every page caused a URL-only listing in Google for many of the pages (that site was dynamic but easily spiderable, and had unique <title> tags of course).
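A quick way to document the duplicate-title problem for the programmer is to group pages by their <title> text. The following is a minimal sketch using only Python's standard library; the page paths and HTML in the example are made up, and in practice the HTML would come from fetching each URL on the site.

```python
from collections import defaultdict
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())  # normalize whitespace

def duplicate_titles(pages):
    """pages maps URL -> HTML; returns each title shared by more than one URL."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[extract_title(html)].append(url)
    return {title: urls for title, urls in groups.items() if len(urls) > 1}

# Made-up sample pages for illustration only.
sample = {
    "/": "<html><head><title>Example Store</title></head></html>",
    "/category": "<html><head><title>Example Store</title></head></html>",
    "/contact": "<html><head><title>Contact Us</title></head></html>",
}
print(duplicate_titles(sample))
```

The same grouping could be applied to the meta description and keywords tags to show how much of each page's head section is identical.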
Fixing all this stuff, or directing others to, wasn't in the original mandate, so I'm not quite sure what I should do at this point.
Thanks again for your opinions guys, much appreciated.