
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
How to block this type of URL
iCyborg




msg:3815765
 7:29 pm on Dec 29, 2008 (gmt 0)

Well, I want to block these URLs:

http://example.com/blog/page/6/?theme=xyz
http://example.com/blog/this-is-post/?theme=xyz

but I want to keep these URLs:
http://example.com/blog/page/6/
http://example.com/blog/this-is-post/

So basically I just want to block all pages whose URLs end with "?theme=xyz", as this is causing unnecessary content duplication.

 

iCyborg




msg:3815791
 8:16 pm on Dec 29, 2008 (gmt 0)

I want to prevent them from being indexed.

netchicken1




msg:3816080
 6:32 am on Dec 30, 2008 (gmt 0)

Disallow: theme

(I think)

iCyborg




msg:3816182
 12:03 pm on Dec 30, 2008 (gmt 0)

Shouldn't there be a * in there too?

firefoxin




msg:3817268
 7:52 am on Jan 1, 2009 (gmt 0)

How do I block this type of URL in robots.txt?

http://example.com/member.php?action=email&id=10039

[or]

http://example.com/comments.php?shownews=605&highlight=

[or]

http://example.com/member.php?action=list

g1smd




msg:3817342
 3:10 pm on Jan 1, 2009 (gmt 0)

First Question:

Disallow: /*theme

.
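To check what that rule catches, here is a minimal Python sketch of wildcard matching as Google documents it for robots.txt ("*" matches any run of characters, a trailing "$" anchors the end, anything else is a prefix match). The helper name rule_matches is made up for illustration, and both the /blog/my-theme-review/ path and the tighter /*?theme= rule are assumptions, not something suggested in this thread.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: does a Disallow pattern match this path + query?"""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Literal text is escaped; each '*' becomes '.*' (any run of characters).
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so an unanchored rule is a prefix match.
    return re.match(regex, path) is not None

paths = [
    "/blog/page/6/?theme=xyz",
    "/blog/this-is-post/?theme=xyz",
    "/blog/page/6/",
    "/blog/this-is-post/",
    "/blog/my-theme-review/",  # assumed post slug, not from the thread
]

# Print each path with the verdict of the broad rule and the tighter rule.
for path in paths:
    print(path, rule_matches("/*theme", path), rule_matches("/*?theme=", path))
```

Under these assumptions /*theme also catches a post whose slug merely contains "theme", while /*?theme= only catches URLs where theme is the first query parameter.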

Second Question:

Which URLs are similar to those, but do not need to be blocked?

Otherwise is this what you want?

Disallow: /member
Disallow: /comments

.
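Those two rules use plain prefix matching, so it may help to see how wide they reach. A rough Python sketch follows; the function name is_blocked and the last two paths are assumptions added for illustration.

```python
def is_blocked(path, disallow_rules):
    """Hypothetical helper: plain robots.txt prefix matching, no wildcards."""
    return any(path.startswith(rule) for rule in disallow_rules)

rules = ["/member", "/comments"]

paths = [
    "/member.php?action=email&id=10039",
    "/comments.php?shownews=605&highlight=",
    "/member.php?action=list",
    "/members/profile",   # assumed path: also starts with /member
    "/blog/comments",     # assumed path: the prefix must start at the root
]

for path in paths:
    print(path, is_blocked(path, rules))
```

Every URL whose path begins with /member or /comments is caught, including any comments.php URL you might want to keep, which is why the question about similar URLs matters.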

Be aware that Google will still list "blocked" URLs as URL-only entries in the SERPs.

firefoxin




msg:3817794
 12:47 pm on Jan 2, 2009 (gmt 0)

Thanks, g1smd.

/comments.php?id=3740&ocid=30562&replyid=0&catid=1 [remove]
/comments.php?id=3740&replyid=30574&catid=1 [remove]
/comments.php?shownews=3740 [OK]

I want to remove the first two lines; the third one is my primary link.

I think I must use this code:

User-agent: googlebot
Disallow: /*id
Disallow: /*replyid

Is that correct?

g1smd




msg:3818035
 7:56 pm on Jan 2, 2009 (gmt 0)

Your last suggestion is probably too wide. It will block anything with "id" or "replyid" anywhere in the URL or in the parameters. That's likely going to block some stuff that you don't want blocked. That is, "id" would block anything with "catid" and "ocid" and "docid" as well.

Do you know if the parameters ever appear in a different order?

Do you know if URLs with shownews ever have additional parameters, and whether you want to leave those unblocked?

Otherwise, I would do:

User-agent: *
Disallow: /comments.php?id=
Disallow: /comments.php?action=
Disallow: /comments.php?highlight=
Disallow: /comments.php?ocid=
Disallow: /comments.php?replyid=
Disallow: /comments.php?catid=

Maybe others too?

You need to be aware of every possible format that could be requested.
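A quick way to test the list above against known URLs is a small prefix check. This is only a sketch: is_blocked is a made-up helper, and the last path (shownews followed by replyid) is an assumed URL format, not one reported in the thread.

```python
RULES = [
    "/comments.php?id=",
    "/comments.php?action=",
    "/comments.php?highlight=",
    "/comments.php?ocid=",
    "/comments.php?replyid=",
    "/comments.php?catid=",
]

def is_blocked(path, disallow_rules=RULES):
    """Hypothetical helper: plain robots.txt prefix matching, no wildcards."""
    return any(path.startswith(rule) for rule in disallow_rules)

paths = [
    "/comments.php?id=3740&ocid=30562&replyid=0&catid=1",
    "/comments.php?id=3740&replyid=30574&catid=1",
    "/comments.php?shownews=3740",
    "/comments.php?shownews=3740&replyid=5",  # assumed format
]

for path in paths:
    print(path, is_blocked(path))
```

Each rule only fires when its parameter comes first after the "?", so a URL that leads with shownews stays crawlable even if a listed parameter follows it.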

You also need to be aware that with parameters in a different order you have a duplicate URL for the same content.
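One way to spot such duplicates in a crawl log or URL list is to compare URLs after sorting their query parameters. The sketch below uses Python's standard urllib.parse; the function name canonical_key and the reordered URL are assumptions for illustration.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

def canonical_key(url):
    """Hypothetical helper: same path + same parameters in any order -> same key."""
    parts = urlsplit(url)
    # keep_blank_values keeps parameters such as 'highlight=' with no value.
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    if not params:
        return parts.path
    return parts.path + "?" + urlencode(params)

a = "/comments.php?id=3740&replyid=30574&catid=1"
b = "/comments.php?catid=1&id=3740&replyid=30574"  # assumed reordering

print(canonical_key(a) == canonical_key(b))
```

Two URLs that produce the same key differ only in parameter order, so each ordering needs to be covered by a rule or redirected to one form.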

Jonesy




msg:3819060
 9:32 pm on Jan 4, 2009 (gmt 0)

Yes, the last suggestion is too wide (by far) and doubly redundant. :)
Referring to:

Disallow: /*id
Disallow: /*replyid

The "/*id" rule alone will disallow
/avoidupois, /ridiculous, /rancid, /recidivism,
/riddle, /typhoid, /zircofluoride, etc.,
and also anything that matches "/*replyid", which makes the second rule redundant.
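This is easy to confirm with a small Python sketch of Google-style wildcard matching ("*" matches any run of characters, a trailing "$" anchors the end). The helper name rule_matches is made up for illustration.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: does a Disallow pattern match this path + query?"""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Literal text is escaped; each '*' becomes '.*' (any run of characters).
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so an unanchored rule is a prefix match.
    return re.match(regex, path) is not None

words = ["/avoidupois", "/ridiculous", "/rancid", "/recidivism",
         "/riddle", "/typhoid", "/zircofluoride"]

# Every word in the list contains "id", so the broad rule catches them all.
print(all(rule_matches("/*id", w) for w in words))

# Anything caught by /*replyid is already caught by /*id.
path = "/comments.php?id=3740&replyid=30574&catid=1"
print(rule_matches("/*replyid", path), rule_matches("/*id", path))
```

Because "replyid" itself contains "id", no URL can match the second rule without already matching the first.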


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved