Forum Moderators: Robert Charlton & goodroi

Removing indexed Session IDs

Removing currently indexed Session IDs from Google

Fortune

3:21 pm on Aug 4, 2005 (gmt 0)

10+ Year Member



Hi all,

One of my clients has the following problem with all of their sites.

Firstly, a brief background:

- The sites are +/- 5 years old.
- They are built on standard Microsoft ASP.
- The sites use session tracking to follow user behavior and to track marketing campaigns for their affiliates and publishers.

I know the Google guidelines say to stay away from session tracking. Obviously it's a bit late for that now, but I'm trying to find some way to rectify this problem.

Here is a full practical example of my problem:

Site: www.siteurl.com (not real URL for client confidentiality)

If I do a search for indexed pages of the URL in Google, returned results look similar to these:

www.siteurl.com/default.asp?btag=campaign_id_300
www.siteurl.com/products.asp?btag=campaign_id_412

(Essentially, the URLs above would appear on their affiliates' sites; each carries a unique tracking code so the affiliates are paid for their sales.)

As you can see, the above URLs include the query string for those pages. If I now search for just the URLs with no query strings, i.e. www.siteurl.com/default.asp, Google says it can't find those pages. I suspect Google indexed the pages with the query string first, then indexed the original pages without the query string, compared the two, found duplicates, and threw one version away. In my case, the majority of the main URLs were removed.

Now,

I have asked a few people, including Google, how I can stop them from indexing pages with session IDs. Google tells me I need to exclude these pages in the robots.txt file, but because it's an ASP site it uses a Global.asa file that sets up the session tracking. Global.asa is not a directory, so I can't exclude it from being indexed (it's a server-side file).

I'm thinking a solution could be the following:
In the Global.asa file, create an IF...THEN...ELSE statement that checks the visitor's user agent: IF it's Googlebot, THEN don't attach the session/tracking IDs, ELSE serve the pages as normal.
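The idea above could be sketched roughly like this in classic ASP. This is only an illustration, not tested code: the function name, parameter names, and the crawler list are all made up, and you would call such a helper wherever internal links are built rather than in Global.asa itself. Note also that serving different URLs to crawlers than to visitors edges toward cloaking, so tread carefully.

```asp
<%
' Hypothetical helper (all names invented for illustration):
' appends the btag tracking parameter only for ordinary visitors,
' so known crawlers see and index one clean URL per page.
Function TrackedUrl(path, campaignId)
    Dim ua
    ua = LCase(Request.ServerVariables("HTTP_USER_AGENT"))
    If InStr(ua, "googlebot") > 0 Or InStr(ua, "slurp") > 0 Then
        ' Crawlers get the plain URL, no query string.
        TrackedUrl = path
    Else
        TrackedUrl = path & "?btag=" & Server.URLEncode(campaignId)
    End If
End Function
%>
```

Every internal link would then be written as, e.g., `<a href="<%= TrackedUrl("/products.asp", "campaign_id_412") %>">`, so the tracking parameter never reaches the index in the first place.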

I'm not sure if the above sounds a little confusing to you all, but I am wondering if anybody has any suggestions.

Thanks all.

g1smd

10:28 pm on Aug 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Using Disallow: /products.asp?btag in robots.txt (paths are given relative to the site root, without the domain) will get rid of all pages with a tag on, allowing the plain products.asp to be indexed. However, the tagged pages may still appear as URL-only listings.
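If the same btag parameter appears on many different .asp pages, one wildcard rule can cover them all. Googlebot understands the * wildcard in robots.txt (it is a Google extension, not part of the original robots.txt standard, so other crawlers may ignore it):

```
User-agent: Googlebot
Disallow: /*?btag=
```

This blocks any URL on the site whose query string begins with btag=, while leaving the plain, query-free URLs crawlable.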

An alternative would be to link to www.siteurl.com/products.asp#btag=campaign_id_412, because everything after the # is thrown away by search engines.

I also always add <meta name="robots" content="noindex"> to all "print friendly" versions of pages in order to avoid duplicate content, and to avoid a searcher arriving directly at a page that immediately tries to print itself.
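For reference, that tag goes in the <head> of the print-friendly template; a minimal illustration (the title text is just a placeholder):

```html
<head>
  <title>Product details - printer friendly</title>
  <!-- Keep this duplicate view out of the index -->
  <meta name="robots" content="noindex">
</head>
```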