
Google News Archive Forum

    
Googlebot and ASP generated pages
Curious behavior, not doing a deep crawl.
Weblamer




msg:138245
 8:59 pm on Dec 2, 2002 (gmt 0)

I have noticed a curious aspect when it comes to googlebot spidering my dynamically created pages, and I was hoping some of you folks out there might be able to shed a little light on the subject.

I have a database full of different types of products by different manufacturers. Let’s call them widgets. Now on a page on my site you can see the types of widgets we sell. This page contains links to asp pages that carry different widgets by different manufacturers. There are many many widgets in each category, so I have broken the pages up to about 16 widgets per page.

The link to one of these pages looks like this:

[wwWebmasterWorldebsite.com...]

Ok, no problem there. In fact, googlebot has gobbled up all of these pages, and every widget found on these pages appears #1 in a google search. Not Bad.

Now, remember, I have broken the widgets up to about 16 widgets per page. So at the bottom of the above mentioned page there is a link that says ‘CLICK HERE TO SEE MORE WIDGETS!’ and the link to the next page will look like this:

[wwWebmasterWorldebsite.com...]

The ‘nav=16’ parameter is how I wrote my code so the ASP page will show products 16 through 32. At the end of that page there will be a ‘click here to see more widgets’ link that will look like this:

[wwWebmasterWorldebsite.com...]

and so on.
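For anyone following along, the paging scheme described above can be sketched roughly as follows. This is only an illustration in Python, not the poster's ASP code: the `/widgets.asp` path, the `category` parameter name, and the helper names are all assumptions, since the real URLs are truncated in the post. It also assumes `nav` is a zero-based offset, so `nav=16` would start at the 17th widget.

```python
from urllib.parse import urlencode

PAGE_SIZE = 16  # the post mentions about 16 widgets per page


def page_of(products, nav=0, page_size=PAGE_SIZE):
    """Return the products for one page, starting at zero-based offset nav."""
    return products[nav:nav + page_size]


def next_link(base_url, category, nav, total, page_size=PAGE_SIZE):
    """Build the 'see more widgets' link, or return None on the last page."""
    next_nav = nav + page_size
    if next_nav >= total:
        return None
    # Hypothetical query-string layout; the real one is not shown in the thread.
    return base_url + "?" + urlencode({"category": category, "nav": next_nav})


products = [f"widget-{i}" for i in range(40)]
print(page_of(products, 16)[0])
print(next_link("/widgets.asp", "widgets", 16, len(products)))
```

Every page after the first is reachable only through a URL carrying the `nav` parameter, which is why those pages depend entirely on the crawler being willing to follow such links.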

Now, here is my dilemma. Google will NOT spider the pages whose URLs contain this ‘nav=’ parameter. Every one of the products from the base pages appears in the google index with a high rank, but googlebot will not spider anything deeper.

Does googlebot have some sort of flag pop up when it sees an ‘=(number)’ in a dynamic url? Is there a sneaky way I could get around this? Do I have to rewrite my entire code? If I, say, used letters instead of numbers (nav=BB) and used asp code to translate this into numbers, would googlebot then spider these pages?

Any Suggestions?

 

skirope




msg:138246
 3:50 am on Dec 3, 2002 (gmt 0)

I am having virtually the same problem.

Google only wants to go a certain depth in my pages, then it won't go any further.

I guess GoogleGuy can shed some light on the matter.

SkiRope

pageoneresults




msg:138247
 4:25 am on Dec 3, 2002 (gmt 0)

To avoid any issues, your urls should be parsed and they might look like this...

dynamicpage/32/widgets.htm

We're finalizing this right now on an asp generated database. We've already done it with one and are awaiting final results; we're going to give it 30 days or so before making a final decision. If you can get rid of all the unusual characters, you'll be much further ahead.

Do a site search here for the term "parsing urls" and you should find some valuable information on how to do it for both UNIX and Windows.
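To make the idea of a "parsed" URL concrete, here is a minimal sketch of turning a path like `dynamicpage/32/widgets.htm` back into the two values the script needs, and of building such a path for links. It is written in Python purely for illustration (the sites discussed here run ASP). The pattern, the function names, and the reading of `32` as the paging offset are all assumptions based on the single example path above.

```python
import re

# Hypothetical pattern modelled on "dynamicpage/32/widgets.htm":
# a fixed prefix, a numeric offset, then the category name.
PATH_PATTERN = re.compile(
    r"^/?dynamicpage/(?P<nav>\d+)/(?P<category>[A-Za-z0-9_-]+)\.htm$"
)


def parse_path(path):
    """Return (category, nav) for a matching path, or None if it does not match."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return None
    return match.group("category"), int(match.group("nav"))


def build_path(category, nav):
    """Inverse of parse_path: build the path to use in on-page links."""
    return f"dynamicpage/{nav}/{category}.htm"


print(parse_path("dynamicpage/32/widgets.htm"))
print(build_path("widgets", 48))
```

The point of the exercise is that the link contains no `?`, `&`, or `=` at all, so a crawler sees what looks like an ordinary static page.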

Weblamer




msg:138248
 3:18 pm on Dec 3, 2002 (gmt 0)

OK folks, here is my idea for getting around this problem without having to fool around with parsing urls.

currently, to stick the nav in for the asp code to read, I do this:

[wwWebmasterWorldebsite.com...]

now, instead, I am going to try a little ASP voodoo. The url will now look like this:

[wwWebmasterWorldebsite.com...]

I am going to assume the "~" character (what is that called anyways?) is going to be a 'safe' symbol, since I have seen that in web page filenames. None of my products have the '~' symbol in them, so it will not cause mistakes that way. I do not think google will flag a page for this symbol.

now, in the asp page itself, I will separate 'widgets~15' into two separate strings by using a couple of string functions.

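' nav = everything after the "~" (must run before category is overwritten)
nav = Right(category, Len(category) - InStr(category, "~"))
' category = everything before the "~"
category = Left(category, InStr(category, "~") - 1)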

ta-da! I have separated that single variable into two separate variables:
widgets
15
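One thing worth guarding against: the first page is linked without any offset, so the value may arrive with no "~" in it. In that case InStr returns 0 and the Left call is handed a length of -1, which VBScript rejects. Here is a rough sketch of the same split with that case handled, shown in Python only to illustrate the logic. The function name and the choice to fall back to offset 0 are assumptions, not part of the original code.

```python
def split_category(value, separator="~"):
    """Split 'widgets~15' into ('widgets', 15); default the offset to 0."""
    name, found, offset = value.partition(separator)
    if not found or not offset.isdecimal():
        # No separator, or a non-numeric offset: treat it as the first page.
        return name, 0
    return name, int(offset)


print(split_category("widgets~15"))
print(split_category("widgets"))
```

The same guard in ASP would amount to checking that InStr returned something greater than 0 before calling Left and Right.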

any comments on this? think it will work?

pageoneresults




msg:138249
 5:58 pm on Dec 3, 2002 (gmt 0)

The ~ is referred to as a tilde.

any comments on this? think it will work?

Not too sure, but I don't think the tilde is an acceptable replacement. Parsing the URLs is no big deal once you get the hang of it. I dropped you a sticky mail with an example; were you able to surmise anything from that?

Also keep in mind that there are other search engines besides Google! Many of them cannot get past the question mark (?). I'd take the plunge now and parse the urls; you'll be thankful you did in the long run.

P.S. Think about your users too. It's much easier to remember this...

products/widgets/500.htm

It also looks much nicer in the SERPs. ;)

sun818




msg:138250
 6:09 pm on Dec 3, 2002 (gmt 0)

Is rewriting URLs in .ASP only available as add-on modules to the IIS server? Is there a way to do this with ASP script?

pageoneresults




msg:138251
 6:27 pm on Dec 3, 2002 (gmt 0)

Yes, it can be done with an asp script. We just completed one a couple of months ago that my programmer customized. It took a little bit of time since this was his first large scale parsing project, but it works like a charm.

We just installed IISRewrite on the server yesterday. We are experimenting with that to see if it is a viable alternative to customizing each asp database. I'd much rather have the ability to do it at server level instead of for each and every site.

There is no downloadable script that I'm aware of. There are a few software packages that you can install on the server and that is it. The rest of it is custom from the ground up.

Weblamer




msg:138252
 6:28 pm on Dec 3, 2002 (gmt 0)

I saw the links you sent, and I understand what you are saying, however, in order to be able to parse your urls, do you not require access to the actual configuration of the webserver?

Remember.. I am renting hosting space on another company's system.. i have only basic FTP access to modify my site...

pageoneresults




msg:138253
 6:31 pm on Dec 3, 2002 (gmt 0)

I saw the links you sent, and I understand what you are saying, however, in order to be able to parse your urls, do you not require access to the actual configuration of the webserver?

Hmmm, I don't think so with that one because all of the parsing is done through statements at the tops of the main category pages. I'll check with my programmer, he comes in again tomorrow (Wednesday).

When I worked with him on this over the phone, he was accessing the main category and product templates and writing his code in there.

eCommando




msg:138254
 7:32 pm on Dec 3, 2002 (gmt 0)

Does anybody know why some of my dynamic asp pages are not showing the information from the Title?

On the search result, it just shows the URL:
[webservername.com...]

On the other hand, a url like this one got indexed with the correct title.

WEB SITE TITLE
[webservername.com...]

I don't see any difference!

pageoneresults




msg:138255
 7:49 pm on Dec 3, 2002 (gmt 0)

Hello eCommando, welcome to WebmasterWorld.

Are you including the record set or variable in the <title> of your template page?

<%=rstemp("recordset")%>

eCommando




msg:138256
 10:18 pm on Dec 3, 2002 (gmt 0)

Yes, my title tag has dynamic information in it. Google doesn't like dynamic info in the titles? I tried it with the spider simulator and all the text seemed to be fine.

pageoneresults




msg:138257
 10:22 pm on Dec 3, 2002 (gmt 0)

Google doesn't like dynamic info in the titles?

Not at all. Just using a process of elimination to help you narrow down the problem.

iJeep




msg:138258
 10:26 pm on Dec 3, 2002 (gmt 0)

For parsing my dynamic urls I used the 404.asp file. This works perfectly. If a file/directory doesn't exist, the user is sent to the 404.asp file, which then reads the info and shows the right page.
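A rough sketch of the logic such a handler needs, in Python for illustration only. It assumes the server hands the custom error page the original request in its query string in the form `404;http://host/path`, which is how IIS custom error URLs have generally behaved, but check your own server. The `products/<category>/<number>.htm` layout is borrowed from the example earlier in this thread, and reading the number as a paging offset, along with all the function names, is an assumption.

```python
from urllib.parse import urlsplit


def requested_path(query_string):
    """Extract the originally requested path from '404;http://host/path'."""
    status, sep, original_url = query_string.partition(";")
    if status != "404" or not sep:
        return None
    return urlsplit(original_url).path


def route(query_string, known_categories):
    """Map /products/widgets/32.htm to ('widgets', 32); None means a real 404."""
    path = requested_path(query_string)
    if path is None:
        return None
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "products" or not parts[2].endswith(".htm"):
        return None
    category, page = parts[1], parts[2][:-len(".htm")]
    if category not in known_categories or not page.isdecimal():
        return None
    return category, int(page)


print(route("404;http://www.example.com:80/products/widgets/32.htm", {"widgets"}))
```

When a path does map to a real page, the handler would presumably want to send a normal 200 status rather than the 404, and keep returning a genuine 404 for everything it cannot map, so that crawlers are not told the page is missing.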

eCommando




msg:138259
 10:26 pm on Dec 3, 2002 (gmt 0)

Some of the pages (a small number) with the dynamic <Title> info got indexed with the dynamic info though.

sun818




msg:138260
 10:35 pm on Dec 3, 2002 (gmt 0)

It's a problem with your web server. The spider wouldn't be able to see your ASP code unless your web server barfed. Perhaps it was overloaded at the time the GoogleBot was crawling the site.

eCommando




msg:138261
 10:44 pm on Dec 3, 2002 (gmt 0)

Hm... that's what I thought also -- "spider wouldn't be able to see your ASP code".

It takes some time to get the pages displayed because they have to parse through dynamic xml data from another site to get the Title.

But the thing is google still indexed the pages, just not with the title information.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved