I have a site that has a section that lets users view images. Let's say it's images of widgets. Before they can view the widgets, they have to pick which color. So I have 4 categories of widgets: red widgets, blue widgets, green widgets and white widgets.
So far so good.
Before they can view these widgets, I have set up a page that lets them choose which color. After choosing the color, the next page shows thumbs of the widgets in that category. If they click on a thumb, it pops open a window that shows the large image of the widget they clicked and also gives them the ability to scroll through the widgets in that category.
So far so good for the user.
Now, to keep track of the category that was clicked and the results from the database, I decided to assign the categories and the sorted results to variables. The category was just a straight variable and the results were in an array. I also stored the name of the image and other info. I also used these variables to display in the title of the page, description, keywords, etc. Fully dynamic.
This worked very well for users that were looking at the site. I tested it many times and there were no flaws that I could see. Everything was great.
It took a long time for googlebot to finally get to these pages and I waited patiently for them to show up in the SERPs. The page where you choose the category was in the SERPs from day one, but the thumbs page and the detail pop-up took a long time before they were listed.
Well those other pages finally got in the SERPs and I was checking them today. And did I learn a thing or three.
First off, googlebot will crawl a page just like a user would. The variables would load and it would follow to the thumbs page and ultimately follow to the pop-up which showed the individual scrolling page. Great, it worked fine.
BUT
The links in the SERPs that go directly to the thumbs page, and the links that go to the scrolling pop-up detail page, were a definite problem. You see, if you go directly to those pages and bypass the page where you choose the category, no variables are loaded. This gave me some pretty unexpected results. The links going to the thumbs pages would display ALL the images in the database because the query was searching on a null value. The links going to the individual pop-up pages were returning just the template with no title, image, name, etc.
I ended up redoing the whole app using post arguments instead. This way, the links in the SERPs will have the category as a post arg and the call to the database will know what to display.
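For anyone who wants to see the idea in code, here is a minimal sketch in Python (not the language or code of the original app, which isn't stated). It assumes the category travels in the URL's query string, since that is the only kind of argument a link in the SERPs can carry. The parameter name `category`, the `example.com` URL and both function names are made up for illustration.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def thumbs_url(base, category):
    """Build a self-contained link to the thumbs page.

    Everything the page needs is in the URL itself, so the link still works
    when someone (or a spider) lands on it without visiting the chooser page.
    """
    return f"{base}?{urlencode({'category': category})}"


def category_from_url(url):
    """Return the 'category' query-string value, or None if it is missing.

    parse_qs drops blank values by default, so '?category=' also yields None.
    """
    params = parse_qs(urlparse(url).query)
    values = params.get("category")
    return values[0] if values else None


link = thumbs_url("http://example.com/thumbs", "red")
print(link)
print(category_from_url(link))
print(category_from_url("http://example.com/thumbs"))
```

The point is simply that the page reads its state from the request it was given, not from something set up by an earlier page.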
So what have I learned?
It seems googlebot will act as a user and you can declare variables and whatnot. It will crawl just as a user would.
But if you are building an APP that relies heavily on variables, there is no way the links will work correctly in the SERPs.
As Holmes would say, "That's elementary, my dear Watson." But I learned this the hard way and decided I would share this so others don't go down the same path as me. Now I am going to have to go through all my apps to make sure there isn't other stuff I inadvertently screwed up by using this method.
Anyway, not sure if this has been mentioned, but decided to share.
Yes, very instructive.
It's important that the page works however you get to it, even directly from an SE and even with unexpected flow.
So at least have your variables default to sensible things and test your pages with cookies turned off so that variables can't be held, etc, etc.
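A rough sketch of what "default to sensible things" could look like, in Python with an in-memory SQLite table standing in for the real database. The table layout, the function names and the choice to return None (so the caller can send the visitor to the color chooser) are all assumptions for illustration, not the original app's design.

```python
import sqlite3

# The four colors mentioned in the thread; anything else is treated as missing.
KNOWN_CATEGORIES = ("red", "blue", "green", "white")


def resolve_category(raw):
    """Normalise a raw category value; return None if missing or unknown."""
    if raw is None:
        return None
    value = raw.strip().lower()
    return value if value in KNOWN_CATEGORIES else None


def images_for(conn, raw_category):
    """Return image names for a valid category, or None if there is none.

    The query is never run with a null category, so a direct hit on the
    page cannot fall through to "show every image in the database".
    """
    category = resolve_category(raw_category)
    if category is None:
        return None  # caller should redirect to the chooser page
    rows = conn.execute(
        "SELECT name FROM images WHERE category = ? ORDER BY name",
        (category,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?)",
    [("a.jpg", "red"), ("b.jpg", "red"), ("c.jpg", "blue")],
)
print(images_for(conn, "Red "))
print(images_for(conn, None))
```

Whether a missing category should redirect, show a default color or return an error page is a judgment call; the sketch only makes sure the decision is explicit.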
Thanks for sharing!
Rgds
DamonHD
It seems googlebot will act as a user and you can declare variables and whatnot.
You've used the terms "variables" and "arrays", but you don't specify exactly what you mean by that. Your variables may exist in your programming language's context, but they won't exist in an HTTP context. You need to use some kind of HTTP mechanism to maintain state.
I can assure you that Googlebot does not act "as [an other] user would". Googlebot does not accept cookies or provide HTTP_REFERER information, nor does it make POST requests.
How exactly are you maintaining state and passing variables between your pages?
I suspect your fundamental problem is not the use of variables and arrays, but rather the web development platform you are using. It sounds to me as if it is distancing you from what's actually happening on your web server. You have tried to write a web site in the same way as you would write a desktop application, and that just isn't going to work. You need to learn how HTTP works as well as how your programming environment works.
A good test for any site is to disable cookies, disable HTTP_REFERER and disable client-side scripting in your web browser, and then have a go at browsing your site. If you can't browse it properly, then it's unlikely that a web spider will be able to either.
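The same test can be roughed out in code. The sketch below, in Python, calls a small WSGI handler directly with a bare GET request: no cookie, no referer and no earlier page view. The handler `thumbs_app`, the helper `bare_get` and the `/choose-color` path are hypothetical stand-ins, not anything from the site being discussed.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def thumbs_app(environ, start_response):
    """Hypothetical thumbs page that depends only on the query string."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    category = params.get("category", [None])[0]
    if category is None:
        # No state in the request: send the visitor to the chooser page.
        start_response("302 Found", [("Location", "/choose-color")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [f"Thumbnails for {category} widgets".encode("utf-8")]


def bare_get(app, path, query=""):
    """Call a WSGI app the way a spider would: one stateless GET."""
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path, "QUERY_STRING": query}
    setup_testing_defaults(environ)
    # Be explicit that no cookie or referer accompanies the request.
    environ.pop("HTTP_COOKIE", None)
    environ.pop("HTTP_REFERER", None)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body.decode("utf-8")


print(bare_get(thumbs_app, "/thumbs", "category=red"))
print(bare_get(thumbs_app, "/thumbs"))
```

If a page only produces sensible output after some other page has been visited first, a check like this will show it straight away.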
As soon as the information was published about Google and Yahoo and those two flags, I put them on a bunch of pages.
Both search engines are happily spidering those pages.
I finally put in a check for the user agent and send them a 401. Ghezsh!