Does Google's spider have a concept of an IIS session?

         

threecrans

3:01 pm on Aug 29, 2002 (gmt 0)

10+ Year Member



IIS sessions use cookies to associate a user with a session. When making multiple requests, does Google's spider (or any other spider) return this cookie with each request, thus maintaining the session state?

The reason I ask this is because I would like to take action (301 redirect to a url) within the Session_OnStart event (in global.asa). If the session is maintained and the bot makes multiple requests, I believe the Session_OnStart event will only fire on the first request...so only the first request will be properly redirected while the other requests are served a page.

Does anyone know how this is handled?

Dreamquick

3:07 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No, as far as I'm aware none of the big SEs currently handle/accept any type of cookies - including session cookies.

- Tony

Grumpus

3:19 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nope. The googlebot makes a new session each time it visits. And, even if it did hold session cookies, the next logical hit comes in from a different IP in most cases anyway - you'll rarely see a single IP (instance of Googlebot) move through the site in a normal manner.

I'm not sure exactly what you are trying to do, but I'd guess you could accomplish what you want to do by doing this... (This isn't code, but you can easily convert the train of thought to code...)

If LCase(Mid(user-agent,1,9)) = "googlebot" and Application("GoogleSession") = False then Application("GoogleSession") = True

If Application("GoogleSession") = True then do or don't redirect or whatever.

Unfortunately, you'd have to put this on every page (probably in an SSI so you can tweak it and have it update the whole site), but depending upon exactly what you are trying to do, something along these lines would work - Application variables live on the server and don't depend on cookies, so they work for bots too.
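A minimal sketch of that train of thought as a classic ASP include, assuming VBScript. The file name botcheck.asp is hypothetical, and matching on the substring "googlebot" is an assumption about how the bot identifies itself in its user agent:

```vbscript
<%
' botcheck.asp: hypothetical include file (the name is an assumption).
Dim userAgent, isGooglebot

' User-Agent header as sent by the client, lower-cased so the match
' does not depend on how the bot capitalizes its name.
userAgent = LCase(Request.ServerVariables("HTTP_USER_AGENT"))
isGooglebot = (InStr(userAgent, "googlebot") > 0)

If isGooglebot Then
    ' Application variables are shared by all visitors, so lock while writing.
    Application.Lock
    Application("GoogleSession") = True
    Application.Unlock

    ' Redirect (or don't) here, depending on what you are trying to do.
End If
%>
```

Each page would pull this in with a server-side include directive placed outside the script delimiters, so the check runs on every request whether or not a session already exists.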

Hope that at least gets you thinking, if not solving the problem altogether.

G.

threecrans

4:39 pm on Aug 29, 2002 (gmt 0)

10+ Year Member



Thanks Grumpus and Dreamquick!

Also Grumpus...thanks for the snippet but it actually is the other way around...but let me run the situation by you (this is somewhat long...my apologies).

As stated in the thread [webmasterworld.com], my staging server has recently been indexed. Since ALL of the staging server content is published to the main server, this resulted in duplicated content. One of the suggestions made by many in the thread is to 301 redirect to the appropriate server.

In IIS, I can set a global redirect for the website to a specified url (Properties -> Home Directory -> A redirection to a url). There are two problems however.

  1. I can't specify a page to redirect to dynamically. For example, if a request comes in for staging/page1.asp, I want to be able to redirect to main/page1.asp. As it stands now I can only specify one url, so ALL requests go to main/redirectedtraffic.asp.
  2. When you do this, the response from the server is "301 Error", not "301 Moved Permanently". I am worried about how Google and other bots will react to this.

So, my second solution was to intercept the request in Session_OnStart and set the response to "301 Moved Permanently". This is great unless the session persists. For example, if a request goes to staging/page1.asp, I can redirect to main/page1.asp. But if you keep the browser open, then request staging/page2.asp, you get staging/page2.asp. This is OK with me, as long as ALL spiders are properly redirected to main on every request...that was my original (yet misleading) question in this thread.

So if you (or anyone else) could give me your opinion for any or all of the following three questions it would be a great help:

  1. Are there any known issues with Google or other spiders "misinterpreting" a "301 Error" response from IIS?
  2. Is there any known way to redirect dynamically (i.e. redirect to the same page they requested but on a different server) in IIS outside of the Session_OnStart implementation described above?
  3. If I go with the Session_OnStart implementation described above, do you see any potential issues with spiders not being redirected on every request?

clickclick

5:02 pm on Aug 29, 2002 (gmt 0)

10+ Year Member



Why not change the ip and the name of the staging server?

ciml

5:06 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> if a request comes in for staging/page1.asp, I want to be able to redirect to main/page1.asp

I really think that's a server issue; it's a trivial thing to do on other Web servers, and it's the right thing to do.

> the response from the server is "301 Error", not "301 Moved Permanently".

I can't imagine any search robot paying attention to the text. The server software is quite wrong to do this in my opinion, but I'd eat my hat if it mattered in Google.

korkus2000

5:09 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use the Session_OnStart in my global.asa to redirect malicious spiders from my sites. I have not found any problems. You can also set up a ban in the manager dialogs by ip. This thread [webmasterworld.com] might give some insight. Another thread that might help is this one on IIS cloaking [webmasterworld.com].

korkus2000

5:24 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To answer your original question, you can use the global.asa because every googlebot hit will trigger a new session. It does not understand sessions or cookies.

If you want to give a 301 response instead of the 302 that Response.Redirect sends, use this AddHeader code:

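<%
' Send a permanent redirect instead of the default 302.
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.newdomain.com/newurl/"
' Stop processing so no page body follows the redirect headers.
Response.End
%>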
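For the staging-to-main case, the same header code could sit in Session_OnStart and build the target from the requested path. This is only a sketch under stated assumptions: the host name main.example.com is a placeholder, not a real server, and since Session_OnStart runs only when ASP starts a new session, it covers .asp requests from clients that don't return the session cookie, not static files.

```vbscript
<script language="VBScript" runat="Server">
' global.asa on the staging server (sketch only).
Sub Session_OnStart
    ' Placeholder host name; replace with the real main server.
    Const MAIN_HOST = "http://main.example.com"
    Dim target, query

    ' URL holds the path of the requested page, e.g. /page1.asp
    target = MAIN_HOST & Request.ServerVariables("URL")

    ' Carry the query string over, if there is one.
    query = Request.ServerVariables("QUERY_STRING")
    If query <> "" Then target = target & "?" & query

    ' Permanent redirect to the same page on the main server.
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", target
    Response.End
End Sub
</script>
```

With this in place, a cookieless request for /page1.asp on staging would be answered with a 301 whose Location header points at /page1.asp on the placeholder main host.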

threecrans

6:48 pm on Aug 29, 2002 (gmt 0)

10+ Year Member



Thanks...great answers, everyone. I feel much more confident now.

clickclick:
> Why not change the ip and the name of the staging server?

This was my original approach, but after getting some feedback from other users in the thread referenced above, it appeared that 301 was a more appropriate solution.