Session init redirection and its impacts on search engine bots

Advice needed

dnimrodx

7:08 pm on May 5, 2007 (gmt 0)

Hello,

I have a website that, at the start of every session, redirects to a page that checks whether the client browser supports session cookies and has JavaScript enabled. Once these checks have been performed, the page is not invoked again during that session.

What I would like to know is how such a redirection at session start affects web-crawling bots. Can they handle it well, or should I abandon this 'session check' altogether and take the more conventional approach (no checking)?

Thank you.

dataguy

12:54 am on May 7, 2007 (gmt 0)

Even though Googlebot has been reported to be able to read cookies, I don't think any crawler will keep the session from one request to the next, so chances are that every crawler request for a page of your site will be redirected, every time.

Doesn't sound like a good idea to me.

jdMorgan

1:08 am on May 7, 2007 (gmt 0)

Right. Never give a search engine spider a session unless you want big trouble.

Jim
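
One way to picture the advice above is as a small decision function that runs before any session is created. This is only a sketch: the function name and the three action strings are hypothetical, and how `is_crawler` gets determined is left to whatever detection the site uses. The page content served is the same in every case; only the session handling differs.

```python
def choose_action(is_crawler, has_session_cookie):
    """Decide how to answer a request on a site with a session-check page.

    Returns one of three hypothetical action names:
      "serve_without_session": render the page, create no session
      "serve":                 render the page using the existing session
      "redirect_to_check":     send the client to the cookie/JavaScript check
    """
    if is_crawler:
        # Crawlers generally do not return cookies, so redirecting them
        # would repeat on every single request.
        return "serve_without_session"
    if has_session_cookie:
        # The check has already run for this visitor's session.
        return "serve"
    return "redirect_to_check"
```

With this ordering, a crawler never reaches the redirect branch, while an ordinary first-time visitor still goes through the check once.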

dnimrodx

10:48 am on May 7, 2007 (gmt 0)

"Never give a search engine spider a session unless you want big trouble."

How exactly can I find out whether a visitor to my website is a search engine spider? Mind you, I've submitted my website to a number of different search engines, not just Google and Yahoo...

Thanks for all of your replies.

Dave

jdMorgan

12:19 pm on May 7, 2007 (gmt 0)

Simple method: Examine the User-agent header sent by the client in its HTTP requests to your server.
Better: Examine that 'claimed' User-agent, and verify that with a reverse-DNS lookup on the client's IP address to be sure the User-agent is a genuine one and not a spoof.

How you do this depends on what server you're running on and/or what language you're using to write your code.

If you're using an off-the-shelf software package, have a look through the documentation and/or the on-line FAQs. This is a common situation, and pre-made solutions may be readily available.

Jim
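
The two methods Jim describes (a User-agent check, then a reverse-DNS verification of the claimed crawler) could look roughly like the sketch below. It is written in Python for brevity; the same logic should port to PHP using `gethostbyaddr()` and `gethostbynamel()`. The `CRAWLER_DOMAINS` table and both function names are illustrative assumptions, not part of any library, and the hostname suffixes should be checked against each search engine's own documentation. The DNS lookups are passed in as parameters so the logic can be exercised without network access.

```python
import socket

# Hypothetical table: User-agent token -> hostname suffixes that the
# crawler's reverse-DNS name is expected to end with. Verify these
# against each search engine's published guidance before relying on them.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def claimed_crawler(user_agent):
    """Simple method: return the crawler token found in the User-agent, or None."""
    ua = (user_agent or "").lower()
    for token in CRAWLER_DOMAINS:
        if token in ua:
            return token
    return None


def is_verified_crawler(user_agent, ip,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Better method: confirm the claimed crawler with reverse and forward DNS."""
    token = claimed_crawler(user_agent)
    if token is None:
        return False
    try:
        # Step 1: reverse lookup of the client IP gives a hostname.
        host = reverse(ip)[0].lower().rstrip(".")
        # Step 2: the hostname must belong to the crawler's own domain.
        if not host.endswith(CRAWLER_DOMAINS[token]):
            return False
        # Step 3: the hostname must resolve back to the same IP, because
        # whoever controls an IP block can set its reverse record freely.
        return ip in forward(host)[2]
    except OSError:
        # No PTR record or failed lookup: treat the claim as unverified.
        return False
```

Because DNS lookups are slow, a real deployment would probably cache the result per IP address rather than repeat the lookups on every request.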

dnimrodx

12:24 pm on May 7, 2007 (gmt 0)

I'm using custom-made PHP scripts running under Apache v2 on Linux, so I don't think this will be a problem to implement. Thanks for the help, Jim.