Problems with Dynamic URL spidering

Issues with Googlebot 2.1

         

itssook

5:10 am on Apr 4, 2003 (gmt 0)

10+ Year Member



I'm asking about Googlebot 2.1 in particular, because that is what is currently crawling my site, or failing to. Would Googlebot have problems with either of the following two things?
1. Using cached content
2. Having URLs like website.com/viewpage.cfm/page/49.html

I read somewhere that these are supposed to be SES (search-engine-safe) URLs. Should this be fine, and should I just be patient and see whether Googlebot likes them? I've seen Scooter 3.2 all over my site, but Googlebot seems to stop at my default.cfm page. Any suggestions?
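
To make the SES URL idea concrete, here is a minimal sketch of how a path such as /viewpage.cfm/page/49.html can be read as a script name plus key/value pairs. The function name parse_ses_url and the treatment of a trailing ".html" as purely cosmetic are assumptions for illustration; this is not ColdFusion's own parsing logic.

```python
from urllib.parse import urlsplit


def parse_ses_url(url, script_suffix=".cfm"):
    """Split an SES-style URL into (script_path, params).

    Hypothetical helper: assumes everything after the script name
    is a sequence of /key/value pairs.
    """
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]

    # Find the segment that names the script (e.g. viewpage.cfm).
    for i, segment in enumerate(segments):
        if segment.endswith(script_suffix):
            script = "/" + "/".join(segments[: i + 1])
            extra = segments[i + 1:]
            break
    else:
        # No script segment found: nothing to decode.
        return path, {}

    # Assumption: a trailing ".html" only makes the URL look static.
    if extra and extra[-1].endswith(".html"):
        extra[-1] = extra[-1][: -len(".html")]

    # Pair up the remaining segments as key/value.
    params = dict(zip(extra[0::2], extra[1::2]))
    return script, params


script, params = parse_ses_url("http://website.com/viewpage.cfm/page/49.html")
print(script, params)
```

The point is that the crawler sees a plain path with no "?" or "&", while the application can still recover page=49.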

Thanks!

mcavic

5:26 am on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I expect that those URLs would be fine. I have some like website.com/viewpage.php/page and Googlebot crawled them. That was after the last update, so I don't know yet whether they're in the index.

What do you mean by cached content?

itssook

5:44 am on Apr 4, 2003 (gmt 0)

10+ Year Member



By cached content I mean using session management (specifically within ColdFusion): when a client comes to the site (I guess even Googlebot counts as a client), it is shown cached content. However, I've had some problems with the caching, so it might affect how Googlebot spiders my website. Any suggestions, anyone?

WebWalla

9:49 am on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you mean that your ColdFusion pages use session IDs, then that is indeed your problem. Google hates session IDs; try to eliminate them wherever possible.
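
As a rough illustration of what eliminating session IDs from URLs means, the sketch below removes session parameters from a link before it is written into a page. It assumes the identifiers travel as query parameters named CFID, CFTOKEN, or jsessionid; the function name strip_session_params is made up for this example, and your application may carry its session token differently.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; adjust to whatever your application emits.
SESSION_PARAMS = {"cfid", "cftoken", "jsessionid"}


def strip_session_params(url):
    """Return url with session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_session_params(
    "http://website.com/viewpage.cfm?page=49&CFID=123&CFTOKEN=456"
))
```

With links like this, every visit by a spider sees the same URL for the same page instead of a new one per session.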

itssook

11:01 am on Apr 4, 2003 (gmt 0)

10+ Year Member



Thanks, WebWalla. I changed my application settings, turning off client-side cookies and various other options. We'll see what happens when Googlebot returns tomorrow; hopefully this will appease the spider! If not, it's back to the drawing board.

Does anyone know of a form of session management in ColdFusion that Googlebot doesn't mind?

BGumble

4:32 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



Make sure you're using a valid form of redirection. For some pages I was using an old method (catching the 404 error) instead of mod_rewrite, and Google would not pick them up.
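
One plausible reason for the difference, offered as an assumption rather than a diagnosis: a page built inside a 404 error handler can still go out with a 404 status unless the handler overrides it, and a spider that receives 404 has little reason to index the page. The toy model below contrasts the two approaches by the status code the client would see. It is not a real server; the URL pattern and function names are invented for illustration.

```python
import re

# Hypothetical friendly-URL pattern: /page/49.html
FRIENDLY = re.compile(r"^/page/(\d+)\.html$")


def rewrite_approach(path):
    """Model of a rewrite rule: the URL is mapped internally
    before any error occurs, so the response status is 200."""
    match = FRIENDLY.match(path)
    if match:
        return 200, f"/viewpage.cfm?page={match.group(1)}"
    return 404, None


def error_handler_approach(path, override_status=False):
    """Model of catching the 404: the request fails first, then the
    error handler builds the page. Unless the handler explicitly
    resets the status, the client still receives 404."""
    match = FRIENDLY.match(path)
    if not match:
        return 404, None
    target = f"/viewpage.cfm?page={match.group(1)}"
    return (200 if override_status else 404), target


print(rewrite_approach("/page/49.html"))
print(error_handler_approach("/page/49.html"))
```

A browser user sees the same page either way; only the status code differs, which is what a spider acts on.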

khuntley

4:49 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



I have what is arguably the most dynamic content out there, since we use Miva Merchant. Most pages on the site used sessions and cookies, with many arguments in the URLs. I created a static mirror page for each dynamic one. Google usually returned the static pages near the top for their search terms, while the dynamic pages were nowhere to be found. I went so far as to block access to all dynamic content in robots.txt. I don't even deal with the dynamic-pages issue now.
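
For anyone wanting to check a setup like this, here is a small sketch that uses Python's standard urllib.robotparser to confirm which URLs a robots.txt blocks. The /dynamic/ and /static/ paths and the example.com host are placeholders, not the actual layout of a Miva Merchant store.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the dynamic area, leave the rest open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dynamic/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for url in (
    "http://example.com/dynamic/cart?item=1",
    "http://example.com/static/widgets.html",
):
    print(url, parser.can_fetch("Googlebot", url))
```

Running a check like this before deploying helps avoid accidentally blocking the static mirror pages along with the dynamic ones.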

Kevin

itssook

7:49 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



BGumble, can you elaborate on the valid form of redirection? What would an example of that be? And what does catching the 404 mean? Thanks a lot!