
Sitemaps, Meta Data, and robots.txt Forum

    
Removing "query-stringed" pages from Google
And other search engines using robots.txt
MatthewHSE




msg:1527934
 8:55 pm on May 24, 2004 (gmt 0)

I have a few directories on my site related to my content management system that I don't want indexed by Google or the other SEs. Not that there's any harm in having them indexed; it's just that they always redirect to my generic login page, which is thus indexed several times. I'd like to use robots.txt to exclude those directories since they serve no useful purpose to the SEs.

However, the directories are generated dynamically from time to time as modifications to the CMS are made. The only thing they'll have in common is the first eight letters of the directory name. After those first eight letters will be maybe eight or ten random numbers. (Bad for SEO, I know, but it's a third-party script...)

Is there any way to simply exclude those directories, categorically, using robots.txt? Remember I don't know what they'll be in the future, so I need to exclude them based only on the first eight characters of the directory name. Can that be done, and if so, how?

Also, given that these pages are already indexed, is there a safe way to remove them without affecting the rankings of my other pages?

Thanks,

Matthew

 

ZopeMaven




msg:1527935
 2:09 pm on May 25, 2004 (gmt 0)

Your best bet would be to create another URL element before the dynamic portion, like this:

Before: somesite.com/directory/jhg8668765

After: somesite.com/directory/dynamic/jhg8668765

Then you can ban (in robots.txt) access to somesite.com/directory/dynamic/
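
A minimal sketch of how that rule could be checked, using Python's standard urllib.robotparser. The first robots.txt text bans the somesite.com/directory/dynamic/ path from the example above. The second is an assumption added for comparison: Disallow values are matched as path prefixes, so a rule on just the shared first eight letters may also cover the generated directories without restructuring the URLs. The prefix "cmsadmin" and the helper name is_allowed are hypothetical; the thread never gives the real directory prefix.

```python
from urllib.robotparser import RobotFileParser

# Rule for the restructured layout suggested above.
RESTRUCTURED = """\
User-agent: *
Disallow: /directory/dynamic/
"""

# Assumption: "cmsadmin" is a made-up eight-letter prefix standing in for
# the real one, which the thread never states.
PREFIX_ONLY = """\
User-agent: *
Disallow: /cmsadmin
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Inside the banned directory, so this prints False.
    print(is_allowed(RESTRUCTURED, "http://somesite.com/directory/dynamic/jhg8668765"))
    # Outside it, so this prints True.
    print(is_allowed(RESTRUCTURED, "http://somesite.com/directory/index.html"))
    # Path starts with the disallowed prefix, so this prints False.
    print(is_allowed(PREFIX_ONLY, "http://somesite.com/cmsadmin1234567890/login"))
```

This only shows how a parser that follows the basic prefix-matching convention reads the rules; individual search engines may handle robots.txt differently, and blocking crawling does not by itself remove pages that are already indexed.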

HTH.

Sanenet




msg:1527936
 2:23 pm on May 25, 2004 (gmt 0)

Try putting the meta tag <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> on your redirect page (or on your login page). This will stop the spiders from either indexing or following the dynamic directory.

(Assuming your directory page uses an HTML redirect and not a 301/302?)
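
A rough sketch for confirming that a page actually carries the tag, using Python's standard html.parser. The sample page and the names RobotsMetaParser and robots_directives are made up for illustration. It only inspects HTML that is actually served, which is why the 301/302 question matters: a crawler that receives an HTTP redirect typically follows it and sees the destination page's HTML instead.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical login page carrying the tag from the post above.
LOGIN_PAGE = """<html><head>
<title>Login</title>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
</head><body>Please log in.</body></html>"""

if __name__ == "__main__":
    print(sorted(robots_directives(LOGIN_PAGE)))  # ['nofollow', 'noindex']
```

Note that a crawler has to be able to fetch the page to see the tag, so a page blocked in robots.txt would not have its meta tag read.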

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved