

Removing indexed session IDs

Removing currently indexed session IDs from Google

     
3:21 pm on Aug 4, 2005 (gmt 0)

New User

10+ Year Member

joined:Dec 20, 2004
posts:18
votes: 0


Hi all,

One of my clients has the following problem with all of their sites.

Firstly, a brief background:

- The sites are +/- 5 years old.
- They are built on standard Microsoft ASP.
- The sites make use of session tracking to track user behavior and the marketing campaigns of their affiliates and publishers.

I know the Google guidelines say to stay away from session tracking. Obviously it's a bit late now, but I'm trying to find some way to rectify this problem.

Here is a full practical example of my problem:

Site: www.siteurl.com (not real URL for client confidentiality)

If I do a search in Google for indexed pages of the site, the returned results look similar to these:

www.siteurl.com/default.asp?btag=campaign_id_300
www.siteurl.com/products.asp?btag=campaign_id_412

(Essentially, the URLs above would appear on their affiliates' sites; the unique tracking code ensures the affiliates are paid for their sales.)

As you can see, the above URLs include the query string for those pages. If I now search for just the URLs with no query strings, i.e. www.siteurl.com/default.asp, Google says it can't find those pages. I suspect Google indexed the pages with the query string first, then indexed the original pages without the query string, compared the two, found duplicates, and threw one version away. In my case, the majority of the main URLs were removed.

Now,

I have asked a few people, including Google, how to stop them from indexing pages with session IDs. Google tells me I need to exclude these pages in the robots.txt file, but because it's an ASP site it makes use of a Global.asp file that sets up the session tracking. This Global.asp file is not a directory, so I can't exclude it from being indexed (it's a server-side file).

I'm thinking a solution could be the following:
In the Global.asp file, create an IF/THEN/ELSE statement along the lines of: IF the visitor is Googlebot, THEN don't start session tracking, ELSE track the session and serve the pages as normal (see the sketch below).
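A minimal sketch of that idea, assuming the check lives in the Session_OnStart handler (in classic ASP this is normally the Global.asa file); the TrackCampaign session variable and the exact user-agent test are hypothetical:

<SCRIPT LANGUAGE="VBScript" RUNAT="Server">
Sub Session_OnStart
    ' Inspect the visitor's user-agent string
    Dim ua
    ua = LCase(Request.ServerVariables("HTTP_USER_AGENT"))

    If InStr(ua, "googlebot") > 0 Then
        ' Known crawler: skip the affiliate/session tracking
        Session("TrackCampaign") = False
    Else
        ' Normal visitor: remember the btag campaign parameter
        Session("TrackCampaign") = True
        Session("btag") = Request.QueryString("btag")
    End If
End Sub
</SCRIPT>

The sketch only toggles whether tracking is recorded; the pages themselves would still be served the same way to crawlers and to visitors.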

I'm not sure if the above sounds a little confusing to you all, but I am wondering if anybody has any suggestions...

Thanks all...

10:28 pm on Aug 5, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd - Top Contributor of All Time, 10+ Year Member, Top Contributors of the Month

joined:July 3, 2002
posts:18903
votes: 0


Using Disallow: /products.asp?btag in your robots.txt (the directive takes a URL path, not the full domain) will get rid of all pages with a tag on them, allowing the plain products.asp to be indexed. However, the tagged pages may still appear as URL-only listings.
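For a site like the one described above, the robots.txt entries might look something like this (one Disallow line per script that accepts the btag parameter; the paths are taken from the example URLs earlier in the thread):

User-agent: *
Disallow: /default.asp?btag
Disallow: /products.asp?btag

Robots.txt matching is by URL prefix, so each rule blocks every URL that starts with that path and query string, while leaving /default.asp and /products.asp themselves crawlable.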

An alternative would be to use www.siteurl.com/products.asp#btag=campaign_id_412, because everything after the # is thrown away by search engines.

I also always add <meta name="robots" content="noindex"> to all "print friendly" versions of pages in order to avoid duplicate content, and to avoid a searcher arriving directly at a page that immediately tries to print itself.
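A minimal classic-ASP sketch of that idea, assuming the "print friendly" rendering is selected with a hypothetical print=1 query string parameter:

<%
' Flag the print-friendly rendering of the page
Dim isPrintVersion
isPrintVersion = (Request.QueryString("print") = "1")
%>
<html>
<head>
<% If isPrintVersion Then %>
  <meta name="robots" content="noindex">
<% End If %>
  <title>Product page</title>
</head>

Only the print version carries the noindex tag, so the normal version of the page stays in the index.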