Forum Moderators: goodroi

How to block this type of URL

   
7:29 pm on Dec 29, 2008 (gmt 0)

5+ Year Member



Well, I want to block these URLs:

http://example.com/blog/page/6/?theme=xyz
http://example.com/blog/this-is-post/?theme=xyz

but I want to keep these URLs:
http://example.com/blog/page/6/
http://example.com/blog/this-is-post/

So basically I just want to block all pages whose URLs end with "?theme=xyz", as this is causing unnecessary content duplication.

8:16 pm on Dec 29, 2008 (gmt 0)

5+ Year Member



I want to prevent them from being indexed.

6:32 am on Dec 30, 2008 (gmt 0)

5+ Year Member



Disallow: theme

(I think)

12:03 pm on Dec 30, 2008 (gmt 0)

5+ Year Member



Shouldn't there be a * in there too?

7:52 am on Jan 1, 2009 (gmt 0)

5+ Year Member



How do I block this type of URL in robots.txt?

http://example.com/member.php?action=email&id=10039

[or]

http://example.com/comments.php?shownews=605&highlight=

[or]

http://example.com/member.php?action=list

3:10 pm on Jan 1, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



First Question:

Disallow: /*theme
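
To see what a wildcard rule like this would catch, here is a minimal Python sketch. It assumes the wildcard semantics that Google documents for robots.txt ("*" matches any run of characters, a trailing "$" anchors the end, and every rule is otherwise a prefix match). The helper names `rule_to_regex` and `is_blocked` and the "/blog/my-theme-post/" path are made up for illustration. Python's standard `urllib.robotparser` is not used because it does not handle these wildcards.

```python
import re

def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern into a regex (assumed wildcard semantics)."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with ".*" wherever the rule had "*".
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(rule_path, url_path):
    """True if the rule would match the URL path (path plus query string)."""
    return rule_to_regex(rule_path).match(url_path) is not None

paths = [
    "/blog/page/6/?theme=xyz",
    "/blog/this-is-post/?theme=xyz",
    "/blog/page/6/",
    "/blog/this-is-post/",
    "/blog/my-theme-post/",  # hypothetical post whose slug contains "theme"
]

broad = {p: is_blocked("/*theme", p) for p in paths}
tight = {p: is_blocked("/*?theme=", p) for p in paths}

for p in paths:
    print(p, "| /*theme:", broad[p], "| /*?theme=:", tight[p])
```

Under these assumptions both rules block the two "?theme=xyz" URLs and leave the clean URLs alone, but "/*theme" would also catch any path that merely contains the word "theme". A tighter pattern such as "/*?theme=" avoids that, provided "theme" is always the first parameter.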

Second Question:

Which URLs are similar to those, but do not need to be blocked?

Otherwise, is this what you want?

Disallow: /member
Disallow: /comments

Be aware that Google will still list "blocked" URLs as URL-only entries in the SERPs.

12:47 pm on Jan 2, 2009 (gmt 0)

5+ Year Member



thanks g1smd

/comments.php?id=3740&ocid=30562&replyid=0&catid=1 [remove]
/comments.php?id=3740&replyid=30574&catid=1 [remove]
/comments.php?shownews=3740 [OK]

I want to remove the first two URLs; the third one is my primary link.

I think I must use this code:

User-agent: googlebot
Disallow: /*id
Disallow: /*replyid

Is that right?

7:56 pm on Jan 2, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Your last suggestion is probably too wide. It will block anything with "id" or "replyid" anywhere in the URL or in the parameters. That's likely going to block some stuff that you don't want blocked. That is, "id" would block anything with "catid" and "ocid" and "docid" as well.

Do you know if the parameters ever appear in a different order?

Do you know if URLs with shownews ever have additional parameters? If so, are those URLs ones that you will not want to block?

Otherwise, I would do:

User-agent: *
Disallow: /comments.php?id=
Disallow: /comments.php?action=
Disallow: /comments.php?highlight=
Disallow: /comments.php?ocid=
Disallow: /comments.php?replyid=
Disallow: /comments.php?catid=

Maybe others too?

You need to be aware of every possible format that could be requested.

You also need to be aware that with parameters in a different order you have a duplicate URL for the same content.
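
A rough Python sketch of both points, for illustration only. The rules above contain no wildcards, so each one is treated here as a plain prefix match on the path plus query string. The reordered URL and the "shownews" URL with an extra parameter are hypothetical, as are the helper names `canonical_key` and `blocked`.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Disallow prefixes from the suggestion above.
rules = [
    "/comments.php?id=",
    "/comments.php?action=",
    "/comments.php?highlight=",
    "/comments.php?ocid=",
    "/comments.php?replyid=",
    "/comments.php?catid=",
]

def blocked(url_path):
    """Plain prefix match: only the first parameter can trigger a rule."""
    return any(url_path.startswith(rule) for rule in rules)

def canonical_key(url_path):
    """Sort the query parameters so reordered duplicates compare equal."""
    parts = urlsplit(url_path)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return parts.path + "?" + urlencode(params)

original = "/comments.php?id=3740&replyid=30574&catid=1"
reordered = "/comments.php?catid=1&id=3740&replyid=30574"   # hypothetical
primary = "/comments.php?shownews=3740"
extra = "/comments.php?shownews=3740&replyid=30574"         # hypothetical

same_content = canonical_key(original) == canonical_key(reordered)
status = {u: blocked(u) for u in (original, reordered, primary, extra)}

print("duplicate pair:", same_content)
for url, is_blocked in status.items():
    print(url, "blocked" if is_blocked else "allowed")
```

In this sketch the reordered URL is still caught, but only because "catid" happens to have its own rule. The hypothetical URL that starts with "shownews" and then carries "replyid" slips past every prefix, which is why every possible format needs to be known before the list can be trusted.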

9:32 pm on Jan 4, 2009 (gmt 0)

5+ Year Member



Yes, the last suggestion is too wide (by far) and doubly redundant. :)
Referring to:

Disallow: /*id
Disallow: /*replyid

the "/*id" rule will disallow
/avoidupois, /ridiculous, /rancid, /recidivism,
/riddle, /typhoid, /zircofluoride, and so on,
as well as anything that matches "/*replyid", which makes the second rule redundant.
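
A quick way to check this is a small Python sketch. It assumes that "*" matches any run of characters and that a rule otherwise matches as a prefix; the `matches` helper and the "/news.php?catid=4" URL are invented for the example.

```python
import re

def matches(rule_path, url_path):
    """Assumed semantics: "*" is any run of characters, the rest is a literal prefix."""
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.match(pattern, url_path) is not None

candidates = [
    "/comments.php?id=3740&ocid=30562&replyid=0&catid=1",
    "/comments.php?id=3740&replyid=30574&catid=1",
    "/comments.php?shownews=3740",
    "/ridiculous",
    "/typhoid",
    "/news.php?catid=4",  # hypothetical URL, blocked only by accident
]

blocked_by_id = [u for u in candidates if matches("/*id", u)]
blocked_by_replyid = [u for u in candidates if matches("/*replyid", u)]

# Everything "/*replyid" catches is already caught by "/*id".
redundant = set(blocked_by_replyid) <= set(blocked_by_id)

print("blocked by /*id:", blocked_by_id)
print("blocked by /*replyid:", blocked_by_replyid)
print("second rule redundant:", redundant)
```

Of the sample URLs, only the "shownews" one survives "/*id", and the "/*replyid" rule adds nothing that the first rule does not already block.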