www.domain.com/pageName.asp?id=4
Will Googlebot eat this page or spit it out?
This is my first attempt at dynamic content, and I want to know what others have learned. We do not need to get into details; I just want to know whether it will get indexed or not. All the other posts reference .php, and .asp is rarely mentioned.
Thanks,
jtoddv
I guess I'm doing something wrong, judging by what other people say. What I am trying at the moment is this: when a spider calls the page as plain page.asp, instead of giving a very short error message I now put up a menu of links to the pages. I won't know until December whether this makes any difference.
For example, "/yourprogram.asp" or "/yourprogram.cgi" will be indexed by Google, but if you include a query string then the page will not be indexed by Google; for example, "/yourprogram.asp?id=23444".
The reason behind this is simple. Spiders/Bots don't want to get stuck crawling in an infinite loop of pages! Most dynamic programs track some kind of session information. The session id, as it's sometimes called, is basically a random element. If this random element is included in the URL, then you 'could' end up with an infinite number of page URLs. Basically, the URL becomes random and not specific. Therefore, most Spiders/Bots will not crawl URLs that include a query string (the part after the question mark).
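To show why a session id in the URL is a problem, here is a rough Python sketch that strips it out so that every visit to the same page ends up with the same URL. The parameter names in SESSION_KEYS and the function name are assumptions for illustration only; check what your own application actually puts in the URL.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed session parameter names; adjust to whatever your application uses.
SESSION_KEYS = {"sid", "sessionid", "phpsessid"}

def strip_session(url):
    """Return the URL with any session parameters removed from the query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_KEYS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Two requests for pageName.asp?id=4 with different session ids come out as one URL, which is the "specific, not random" address a spider wants to see.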
I've seen a badly designed Spider crawl a site for days, stuck in an infinite loop! It drastically increased CPU usage!
The workaround for your dynamic pages is to use an Apache server to rewrite the URLs to be Spider/Bot friendly (rewrite rules). There are also other methods, such as "on-the-fly content generation" techniques. Good luck.
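To make the rewrite idea concrete, here is a rough Python sketch of the mapping a rewrite rule performs: a static-looking path is translated into the dynamic URL behind it. The /page/<id>.html scheme and the function name are made up for illustration; pageName.asp?id= comes from the original question. On a real Apache server this would be a mod_rewrite rule, not Python.

```python
import re

# Hypothetical spider-friendly URL scheme: /page/<digits>.html
FRIENDLY = re.compile(r"^/page/(\d+)\.html$")

def to_dynamic(path):
    """Map a static-looking path to the dynamic URL it stands for.

    Returns None when the path does not follow the friendly scheme.
    """
    match = FRIENDLY.match(path)
    if match is None:
        return None
    return "/pageName.asp?id=" + match.group(1)
```

The spider only ever sees /page/4.html, with no question mark, while the server quietly serves the dynamic page.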
We have concluded that it does depend on your PR (we have a PR7 site), and that anything with more than 2 parameters in the query string seems to be eaten by the spider but not indexed.
The best advice is to pass your parameters around in other ways, or to rewrite the URLs with one of the mod_rewrite-equivalent software tools for IIS.
Hope this helps
Thanks for your advice. I was desperately looking for asp?id=1 examples and I found them in links to your site from your profile site. When I looked at the map.asp?param1=123&param2=432 links I found that typing in map.asp gave some content. My current experiment is that my pages used to just say "Form Id not found". Maybe Google tries the page.asp link on its own, and if it produces barely anything it decides not to index it; if it produces enough to be worth indexing, it then tries again with the parameters from the links.
Previously if no formid was supplied they just wrote the message "FormID not supplied" - now they put out a full page (actually a menu of all the possible pages). It's as if Google checks the page without the parameter and if it finds a tiny error page it doesn't index it and doesn't index the pages with the parameter. So the moral of the story is: make sure you have some content when parameters aren't supplied.
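A minimal Python sketch of that idea, assuming a page.asp?formid=... style of link. The PAGES table, the formid parameter name in the links, and the function name are all hypothetical stand-ins; a real site would read the forms from its database.

```python
from html import escape

# Hypothetical page store; in practice this would come from the database.
PAGES = {"1": "Form one", "2": "Form two"}

def render(form_id=None):
    """Return HTML for one form, or a menu of all forms when no valid id is given."""
    if form_id in PAGES:
        return "<h1>%s</h1>" % escape(PAGES[form_id])
    # No id (or an unknown id): serve a full menu instead of a tiny error page.
    links = [
        '<li><a href="page.asp?formid=%s">%s</a></li>' % (key, escape(title))
        for key, title in sorted(PAGES.items())
    ]
    return "<ul>\n" + "\n".join(links) + "\n</ul>"
```

Calling render() with no argument gives the menu, so a spider that requests the bare page still finds content and links to follow.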
Surname.asp?Surname={addnamehere}
and many have been in the index for several months. Each page is different in content based on some database query parameters.
One thing I have heard is that if you have a static page listing the variable surnames, Google can follow them. But apparently, those links need to be there for a while. The longer they are there, the more likely they are to be indexed and the higher up they apparently go. Also if you, like me, have 100,000 possible pages driven off the one template, the depth of Google's spidering makes all the difference to your traffic. I may have 100,000 pages, but maybe 30,000 or 40,000 are actually in the index. I'm hopeful that over time more and more will make it in.
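For what it's worth, here is a rough Python sketch of generating that kind of static listing page. The Surname.asp?Surname= pattern is the one from my post; the function name and the sample names are just for illustration.

```python
from html import escape
from urllib.parse import quote

def surname_index(surnames):
    """Build a plain HTML list with one Surname.asp link per distinct surname."""
    items = []
    for name in sorted(set(surnames)):
        # URL-encode the surname so apostrophes, spaces, etc. stay valid in the link.
        href = "Surname.asp?Surname=" + quote(name)
        items.append('<li><a href="%s">%s</a></li>' % (escape(href), escape(name)))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Write the result out to a plain .html file and link to it from your home page, so the spider has an ordinary static page to start from.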
Apache Mod Rewrite