
Google News Archive Forum

Googlebot and ASP generated pages
Curious behavior, not doing a deep crawl.

 8:59 pm on Dec 2, 2002 (gmt 0)

I have noticed something curious about how Googlebot spiders my dynamically created pages, and I was hoping some of you folks out there might be able to shed a little light on the subject.

I have a database full of different types of products by different manufacturers. Let’s call them widgets. One page on my site lists the types of widgets we sell. That page links to ASP pages that show the widgets from each manufacturer. There are a great many widgets in each category, so I have split the listings into pages of about 16 widgets each.

The link to one of these pages looks like this:


OK, no problem there. In fact, Googlebot has gobbled up all of these pages, and every widget found on these pages appears #1 in a Google search. Not bad.

Now, remember, I have split the widgets into pages of about 16 each. So at the bottom of the above-mentioned page there is a link that says ‘CLICK HERE TO SEE MORE WIDGETS!’ and the link to the next page will look like this:


The ‘nav=16’ is how I wrote my code so that the ASP page will show products 16 through 32. At the end of that page there will be a ‘click here to see more widgets’ link that will say:


and so on.
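
The paging scheme described above can be sketched as follows. This is only an illustration in Python rather than the poster's ASP code, which is not shown in the thread. It assumes `nav` is a zero-based offset into the product list, and the names `PAGE_SIZE`, `page_of`, and `next_nav` are hypothetical.

```python
PAGE_SIZE = 16  # from the post: about 16 widgets per page


def page_of(products, nav=0):
    """Return the products shown for a given 'nav' offset."""
    return products[nav:nav + PAGE_SIZE]


def next_nav(products, nav=0):
    """Return the offset for the 'see more widgets' link, or None on the last page."""
    following = nav + PAGE_SIZE
    return following if following < len(products) else None


# 40 made-up widgets, purely for illustration.
products = [f"widget-{i}" for i in range(1, 41)]
first_page = page_of(products)           # base page, no nav parameter
second_page = page_of(products, nav=16)  # what a 'nav=16' link would show
```

Under this zero-based assumption, a 'nav=16' page starts at the 17th product, and each page's link carries the next multiple of 16 until the list runs out.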

Now, here is my dilemma. Google will NOT spider the pages whose URLs contain this ‘nav=’ parameter. Every one of the products on the base pages appears in the Google index with a high rank, but Googlebot will not spider anything deeper.

Does Googlebot have some sort of flag that pops up when it sees an ‘=(number)’ in a dynamic URL? Is there a sneaky way I could get around this? Do I have to rewrite my entire code? If I, say, used letters instead of numbers (nav=BB) and used ASP code to translate them into numbers, would Googlebot then spider these pages?

Any suggestions?



 3:50 am on Dec 3, 2002 (gmt 0)

I am having virtually the same problem.

Google only wants to go to a certain depth in my pages, then it won't go any further.

I guess GoogleGuy can shed some light on the matter.



 4:25 am on Dec 3, 2002 (gmt 0)

To avoid any issues, your URLs should be parsed so that they look something like this...


We're finalizing this right now on an ASP-generated database. We've already done it with one and are awaiting final results; we're going to give it 30 days or so before making a final decision. If you can get rid of all the unusual characters, you'll be much further ahead.

Do a site search here for the term "parsing urls" and you should find some valuable information on how to do it for both UNIX and Windows.
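
As a rough illustration of what "parsing URLs" means here, the sketch below reads the parameters out of the URL path instead of a query string. It is written in Python rather than ASP, and the path layout `/category/manufacturer/nav`, the `example.com` address, and the function name `parse_widget_path` are all assumptions made up for this example. The thread does not show the actual URL format used.

```python
from urllib.parse import urlsplit


def parse_widget_path(url):
    """Map a path such as /widgets/acme/16 to the parameters a script would need."""
    # Drop empty segments caused by leading or trailing slashes.
    parts = [p for p in urlsplit(url).path.split("/") if p]
    params = {"category": None, "manufacturer": None, "nav": 0}
    if len(parts) >= 1:
        params["category"] = parts[0]
    if len(parts) >= 2:
        params["manufacturer"] = parts[1]
    # Only accept a numeric third segment as the paging offset.
    if len(parts) >= 3 and parts[2].isdigit():
        params["nav"] = int(parts[2])
    return params


print(parse_widget_path("http://www.example.com/widgets/acme/16"))
```

The point of the design is that the address contains no "?", "&", or "=" characters, so a spider sees what looks like an ordinary static path.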


 3:18 pm on Dec 3, 2002 (gmt 0)

OK folks, here is my idea for getting around this problem without having to fool around with parsing urls.

Currently, to pass the nav value in for the ASP code to read, I do this:


Now, instead, I am going to try a little ASP voodoo. The URL will now look like this:


I am going to assume the "~" character (what is that called, anyway?) is a 'safe' symbol, since I have seen it in web page filenames. None of my product names contain the '~' symbol, so it will not cause mistakes that way. I do not think Google will flag a page for this symbol.

Now, in the ASP page itself, I will separate 'widgets~15' into two separate strings by using a couple of string functions.

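' Take everything after the "~" as nav (done first, while category still holds the full value).
nav = right(category, (len(category) - instr(category,"~")))
' Then keep only the part before the "~" as the category.
category = left(category, instr(category,"~")-1)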

Ta-da! I have separated that single variable into two separate variables:

Any comments on this? Think it will work?
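
For comparison, the same split can be sketched in Python. This is an illustration of the idea, not the poster's code. The function name `split_category` and the fallback to an offset of 0 are assumptions added for this example.

```python
def split_category(value):
    """Split a combined value such as 'widgets~15' into (category, nav)."""
    category, sep, nav = value.partition("~")
    # With no tilde, or a non-numeric suffix, fall back to the first page.
    if not sep or not nav.isdigit():
        return category, 0
    return category, int(nav)


print(split_category("widgets~15"))
print(split_category("widgets"))
```

The fallback matters because a base-page link carries no tilde at all. In the two VBScript lines above, InStr returns 0 in that case, so Left would be called with a length of -1, which VBScript rejects, so a similar check before splitting is probably worth adding there too.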


 5:58 pm on Dec 3, 2002 (gmt 0)

The ~ is referred to as a tilde.

any comments on this? think it will work?

I'm not too sure, but I don't think the tilde is an acceptable replacement. Parsing the URLs is no big deal once you get the hang of it. I dropped you a sticky mail with an example; were you able to surmise anything from that?

Also keep in mind that there are other search engines besides Google! Many of them cannot get past the question mark (?). I'd take the plunge now and parse the URLs; you'll be thankful you did in the long run.

P.S. Think about your users too. It's much easier to remember this...


It also looks much nicer in the SERPs. ;)


 6:09 pm on Dec 3, 2002 (gmt 0)

Is rewriting URLs in ASP only possible through add-on modules for the IIS server? Is there a way to do this with an ASP script?


 6:27 pm on Dec 3, 2002 (gmt 0)

Yes, it can be done with an ASP script. We just completed one a couple of months ago that my programmer customized. It took a little bit of time since this was his first large-scale parsing project, but it works like a charm.

We just installed IISRewrite on the server yesterday. We are experimenting with that to see if it is a viable alternative to customizing each ASP database. I'd much rather have the ability to do it at the server level instead of for each and every site.

There is no downloadable script that I'm aware of. There are a few software packages that you can install on the server and that is it. The rest of it is custom from the ground up.


 6:28 pm on Dec 3, 2002 (gmt 0)

I saw the links you sent, and I understand what you are saying, however, in order to be able to parse your urls, do you not require access to the actual configuration of the webserver?

Remember, I am renting hosting space on another company's system. I have only basic FTP access to modify my site.


 6:31 pm on Dec 3, 2002 (gmt 0)

I saw the links you sent, and I understand what you are saying, however, in order to be able to parse your urls, do you not require access to the actual configuration of the webserver?

Hmmm, I don't think so with that one because all of the parsing is done through statements at the tops of the main category pages. I'll check with my programmer, he comes in again tomorrow (Wednesday).

When I worked with him on this over the phone, he was accessing the main category and product templates and writing his code in there.


 7:32 pm on Dec 3, 2002 (gmt 0)

Does anybody know why some of my dynamic ASP pages are not showing the information from the title?

In the search results, it just shows the URL:

On the other hand, a URL like this one got indexed with the correct title.


I don't see any difference!


 7:49 pm on Dec 3, 2002 (gmt 0)

Hello eCommando, welcome to WebmasterWorld.

Are you including the record set or variable in the <title> of your template page?



 10:18 pm on Dec 3, 2002 (gmt 0)

Yes, my title tag has dynamic information in it. Google doesn't like dynamic info in the titles? I tried it with the spider simulator and all the text seemed to be fine.


 10:22 pm on Dec 3, 2002 (gmt 0)

Google doesn't like dynamic info in the titles?

Not at all. I'm just using a process of elimination to help you determine the problem.


 10:26 pm on Dec 3, 2002 (gmt 0)

For parsing my dynamic URLs I used the 404.asp file. This works perfectly. If a file or directory doesn't exist, the user is sent to the 404.asp file, which then reads the requested URL and shows the right page.
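
The 404-handler approach can be sketched roughly as below, in Python rather than ASP. It assumes the server hands the error page a string of the form `404;<original URL>`, which is how IIS custom error URLs have generally behaved, but check your own host. The function names, the `known_categories` set, and the `example.com` address are hypothetical.

```python
from urllib.parse import urlsplit


def requested_path_from_404(query_string):
    """Extract the originally requested path from a '404;<url>' string."""
    status, sep, original_url = query_string.partition(";")
    if status != "404" or not sep:
        return None
    return urlsplit(original_url).path


def route(query_string, known_categories):
    """Decide which page to show for a request that hit the 404 handler."""
    path = requested_path_from_404(query_string)
    if path is None:
        return ("not_found", None)
    parts = [p for p in path.split("/") if p]
    # A first path segment that matches a category maps to its listing page.
    if parts and parts[0] in known_categories:
        return ("category", parts[0])
    return ("not_found", None)


print(route("404;http://www.example.com:80/widgets/", {"widgets"}))
```

Because this needs only a custom error page pointed at a script, it may suit shared hosting where no rewrite module can be installed, provided the host lets you set the custom 404 page.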


 10:26 pm on Dec 3, 2002 (gmt 0)

Some of the pages (a small number) with the dynamic <Title> info got indexed with the dynamic info though.


 10:35 pm on Dec 3, 2002 (gmt 0)

It's a problem with your web server. The spider wouldn't be able to see your ASP code unless your web server barfed. Perhaps it was overloaded at the time Googlebot was crawling the site.


 10:44 pm on Dec 3, 2002 (gmt 0)

Hm... that's what I thought also -- "spider wouldn't be able to see your ASP code".

It takes some time to get the pages displayed because they have to parse through dynamic xml data from another site to get the Title.

But the thing is, Google still indexed the pages, just not with the title information.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved