Robots.txt disallow help

         

virtualreality

11:32 pm on Nov 23, 2009 (gmt 0)


How do I disallow dynamic URLs such as:

myforum-topic1?datecut=0&sortby=subject&order=asc
myforum-topic2?datecut=0&sortby=subject&order=asc
myforum-topic3?pid=10&mode=threaded

I want to block only the versions with a query string (the part after "?"), so the plain myforum-topic1 should still be indexed.

I can't use Disallow: /myforum-topic1?* because I have hundreds of topics, and that would mean adding a rule manually for each one.

I tried:

disallow /?* and disallow /*?* but when I run my sitemap generators, all URLs with "?" get indexed again.

dstiles

2:21 am on Nov 24, 2009 (gmt 0)


How about canonical tags in the web page header? If you build them dynamically, you can leave out any or all of the querystring fields.

virtualreality

2:47 am on Nov 24, 2009 (gmt 0)


And how do I do that?

tangor

2:57 am on Nov 24, 2009 (gmt 0)


You'll need to review the robots exclusion rules. I can't help much with dynamic URLs since I run none myself, but I do know that wildcard support in robots.txt ranges from limited to non-existent.
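
On the wildcard question: the original robots exclusion convention has no wildcards, but some major crawlers, Googlebot among them, document "*" (any run of characters) and a trailing "$" (end of URL) as extensions. Below is a minimal Python sketch of how such a crawler might evaluate Disallow: /*? against the URLs from the first post. The function names are hypothetical, and the sketch ignores Allow rules, rule precedence, and user-agent groups, so treat it as an illustration rather than a reference parser.

```python
import re


def rule_to_regex(rule_path):
    """Translate a wildcard-style robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the match
    at the end of the URL. Everything else is matched literally.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url_path, disallow_rules):
    """Return True if any Disallow pattern matches the start of url_path."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


# Hypothetical rule set: block every URL that contains a query string.
rules = ["/*?"]

for path in [
    "/myforum-topic1",
    "/myforum-topic1?datecut=0&sortby=subject&order=asc",
    "/myforum-topic3?pid=10&mode=threaded",
]:
    print(path, "->", "blocked" if is_disallowed(path, rules) else "allowed")
```

Note that robots.txt only affects crawlers that read it. A sitemap generator that ignores robots.txt, or that does not implement the wildcard extension, will keep listing the "?" URLs regardless of the rule.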

dstiles

9:46 pm on Nov 24, 2009 (gmt 0)


There's quite a lot of info on canonical tags in the Google section of this site.

Basically:
<link rel="canonical" href="http://www.example.com/page.ext" />

where page.ext is something like index.htm or aboutus.php, i.e. the base name of the page in which the canonical tag is inserted. Querystring parameters can be included if they are relevant to what the page shows.
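
To make "built dynamically" concrete, here is a rough Python sketch that derives the canonical tag from the requested URL by dropping the querystring, optionally keeping a whitelist of parameters. The function names and the keep_params argument are made up for illustration, and pairing the www.example.com host with the myforum-topic paths is an assumption. The forum software's real language and templating will differ.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, keep_params=()):
    """Return url with its fragment removed and only whitelisted
    querystring parameters kept (none by default)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep_params
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_tag(url, keep_params=()):
    """Build the <link rel="canonical"> element for the page's header."""
    href = escape(canonical_url(url, keep_params), quote=True)
    return '<link rel="canonical" href="%s" />' % href


# Sorting and date-cut parameters do not change the content, so drop them all.
print(canonical_tag(
    "http://www.example.com/myforum-topic1?datecut=0&sortby=subject&order=asc"
))

# Assumption: if pid selected genuinely different content, it could be kept.
print(canonical_tag(
    "http://www.example.com/myforum-topic3?pid=10&mode=threaded",
    keep_params=("pid",),
))
```

With this approach every sorted or filtered variant of a topic points back to the plain topic URL, which is the effect asked for in the first post without needing a robots.txt rule per topic.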

marketingguru

10:00 am on Nov 28, 2009 (gmt 0)



Robots.txt is very useful for setting Allow and Disallow rules.
rel=canonical is helpful too.

[edited by: bill at 10:23 am (utc) on Nov. 28, 2009]
[edit reason] No URL signatures [/edit]