The pages were ready for the last update, but it appears we have had trouble being included in the index, and I thought the above might be the reason - i.e. too many variables. Any thoughts, anyone, please?
It is generally accepted that Google will parse and index dynamic URLs, but also that it restricts the number of dynamic URLs it will spider. How it restricts the numbers is not officially known, but the general consensus is that it is a combination of PR, depth, and the number of URL variables.
The reason, of course, is that one page could generate thousands of variations by pulling data out of a database, and hence the updates would be in danger of shifting from a lunar to a solar cycle - not something we want to see happening.
From my experience, I tend to build separate pages per product/category/whatever and hard-code as many variables into the page as possible, to reduce the number of variables after the ?. Your idea of a/b/c/d would do the same thing and would certainly be recommended.
However, I do have some high-PR pages that produce secondary iterations (page=2 etc.) to get all the data displayed, and these seem to be indexed down to 8 levels from a PR6 page.
So, to be honest, I would reduce anything that might hinder indexing if I could. It may be an idea to go back to the tech guys and have another talk with them.
For example, what reason did they give for not accepting your ideas, and were those reasons sufficient to risk losing sales?
Let us know how you get on.
I'd suggest that if you are going to go with dynamic pages (your first suggestion is best, though logistically it can be nightmarish for your web firm and cost you a nice chunk), you go with something like file.html?a=3033 and parse it into its unique elements on the server side. It's more load on the server, but more inviting for spiders (and it makes the link shorter to paste all over the web to help PR ;) ).

If each digit could become two or more digits, you could even bust it down into ?a=3-0-3-3. [I was originally going to use a pipe (|) there to delimit the variables, but I'd also stay away from odd-looking characters that the bots might barf on.]

Also make sure that SOMETHING shows up if the parameter is left off. Some useful spiders strip the ? and everything after it, and it's better to have "something" indexed than nothing at all.
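A minimal sketch of the server-side splitting described above, in Python. The field names (category, product, page, sort), their order, and the defaults are my own illustrative assumptions - the point is just that one compact parameter is pulled apart on the server, and that a default page is served when the parameter has been stripped off entirely.

```python
# Hypothetical sketch: expand a single compact query parameter (?a=3-0-3-3)
# into named fields on the server side. Field names and defaults are
# assumptions for illustration only.
from urllib.parse import urlparse, parse_qs

# Defaults used when a spider has stripped the '?' and everything after it,
# so that *something* is always rendered rather than an error page.
DEFAULT_FIELDS = {"category": 0, "product": 0, "page": 1, "sort": 0}


def parse_compact_param(url):
    """Split a ?a=3-0-3-3 style parameter into named fields."""
    qs = parse_qs(urlparse(url).query)
    raw = qs.get("a", [None])[0]
    fields = dict(DEFAULT_FIELDS)
    if raw is None:
        return fields  # parameter left off: serve the default page
    for name, value in zip(("category", "product", "page", "sort"), raw.split("-")):
        fields[name] = int(value)
    return fields


print(parse_compact_param("http://example.com/file.html?a=3-0-3-3"))
print(parse_compact_param("http://example.com/file.html"))  # stripped URL
```

The stripped-URL case is the important one: the fallback means the URL without any query string still maps to a real, indexable page.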
One odd thing I noticed about dynamic pages and PR when we were discussing it in another topic [webmasterworld.com] is that the following holds true in the toolbar. (I'm assuming that it "guesses" PR in the toolbar, and the general consensus seems to be that dynamic pages don't get any "real" PR at all, though they DO seem to get the effect of "structure trickle" from the main page.)
1) www.mysite.com/page.asp?a=1 has a PR of 4
2) www.mysite.com/page.asp has a PR of 3
3) www.mysite.com/page.asp?a=1&b=2 has a PR of 3
Note how the optimum seems to be a single variable - even better than no variable at all.
Hope some of that helps.
http://www.url.com/shop.cgi?category=4&product=6 etc etc
Not one single page was indexed, and our site is visited by Googlebot every day to check for changes.
As of today we changed the URLs to something like this:
I expect our 15,000 or so pages of shop content to now get indexed, and it should be happy days.
I would therefore warn against using dynamic URLs ('?' etc.) as you may not get indexed.
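The poster doesn't show the new URL format, but the general technique is to expose a static-looking path and recover the parameters on the server. The /shop/category-N/product-M.html layout below is purely my own hypothetical example, not the format actually used on that site:

```python
# Hedged sketch: map a spider-friendly path-style URL back to the parameters
# that shop.cgi?category=4&product=6 used to carry. The path layout is a
# hypothetical assumption for illustration.
import re

PATH_PATTERN = re.compile(r"^/shop/category-(\d+)/product-(\d+)\.html$")


def params_from_path(path):
    """Recover category/product numbers from a static-looking URL path."""
    m = PATH_PATTERN.match(path)
    if m is None:
        return None  # not a shop URL; let other handlers deal with it
    return {"category": int(m.group(1)), "product": int(m.group(2))}


def path_from_params(category, product):
    """Build the spider-friendly URL for a given category/product pair."""
    return f"/shop/category-{category}/product-{product}.html"


print(path_from_params(4, 6))
print(params_from_path("/shop/category-4/product-6.html"))
```

In practice this mapping was typically done in the web server itself (e.g. a rewrite rule) rather than in application code, but the round trip - build the clean URL for links, parse it back on request - is the same either way.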