Verifying no session IDs in Google's index

Tonearm

6:13 pm on Aug 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My site will only generate session IDs in the URL if the user is not detected to be a robot, and the user does not accept cookies. Even so, I'd like to verify that Google hasn't indexed any of my pages with a session ID appended.

Is going through the pages listed for site:www.example.com the best way to do this?
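For reference, the link-generation rule described above can be sketched roughly as follows. This is only an illustration of the stated logic, not the site's actual code: the function names, the `sid` parameter name, and the list of crawler user-agent markers are all assumptions.

```python
# Hypothetical crawler markers; a real site would maintain its own detection list.
ROBOT_MARKERS = ("googlebot", "bingbot", "slurp")


def is_robot(user_agent: str) -> bool:
    """Return True if the user agent contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def build_link(path: str, session_id: str, user_agent: str,
               accepts_cookies: bool, param: str = "sid") -> str:
    """Append the session ID only for non-robot visitors who refuse cookies."""
    if is_robot(user_agent) or accepts_cookies:
        return path
    # Use "&" if the path already has a query string, otherwise start one.
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}{param}={session_id}"
```

Under this sketch, a crawler or a cookie-accepting browser always gets the clean URL, so a session ID can only reach the index if robot detection misses a crawler.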

Bones

6:31 pm on Aug 4, 2007 (gmt 0)

10+ Year Member



site:www.domain.com inurl:sid

Replace "sid" with whatever session ID parameter you're using, and that should work.

Tonearm

6:44 pm on Aug 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Unfortunately, that doesn't seem to be working even for terms that are obviously in my URLs.
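If inurl: is unreliable, one fallback is to collect the URLs by hand (for example, copied from the site: listing) and check them locally. A minimal sketch, assuming the URLs are already in a list and the parameter is named `sid`; the function name is made up for illustration:

```python
from urllib.parse import urlsplit, parse_qs


def urls_with_session_id(urls, param="sid"):
    """Return the URLs whose query string contains the given parameter name."""
    flagged = []
    for url in urls:
        query = urlsplit(url).query
        # parse_qs matches whole parameter names, so "side=1" does not match "sid".
        if param in parse_qs(query, keep_blank_values=True):
            flagged.append(url)
    return flagged
```

Matching on the parsed parameter name rather than a plain substring avoids false hits on URLs that merely contain the letters "sid" somewhere in the path.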

Tonearm

12:09 am on Aug 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Could there be a URL with a query string that is hurting me through duplicate content but isn't listed in a site: query?

g1smd

8:32 pm on Aug 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Unlikely.

londrum

8:38 pm on Aug 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



according to google's webmaster pages, placing this in your robots.txt will stop googlebot from indexing any pages with a ? in it

User-agent: Googlebot
Disallow: /*?*

maybe you could have a read up on their pages. they've got a couple of other pattern matching things on there which might help to stop them adding new ones to their index. (doesn't help you with removing the old ones though)
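To see which of your URLs a rule like that would catch, here is a rough sketch of the wildcard matching as Google describes it (`*` matches any run of characters, a trailing `$` anchors the end of the URL). Python's built-in `urllib.robotparser` does not handle these wildcards, so this uses a small hand-rolled matcher; the function names are made up, and it only approximates how googlebot actually evaluates rules.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Convert a robots.txt path rule with * and $ wildcards into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal pieces and join them with ".*" where the rule had "*".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str, rule: str = "/*?*") -> bool:
    """Return True if the path (including any query string) matches the rule."""
    return rule_to_regex(rule).match(path_and_query) is not None
```

With the default rule, `is_disallowed("/page.php?sid=123")` is true and `is_disallowed("/page.php")` is false, so only URLs carrying a query string are blocked.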

Tonearm

12:25 am on Aug 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not a bad idea londrum, but I think that technique still causes Google to include the URL in the index, just not the page itself.