Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
A bit of help if possible would be appreciated.
glennk




msg:3803547
 5:23 pm on Dec 9, 2008 (gmt 0)

First, I would like to ask about blocking wap2 pages on my forum. For now I am only really concerned about getting this right for Google.

Will this work?

User-agent: *
Disallow:/*?wap2

The URLs I want to block look like this:

http://www.example.com/forum/example-example-example-example/example-example-rights/?wap2

-------------------------------------------------------------

Second, could anyone in the know take a few minutes to look at my robots.txt and tell me if there is anything problematic in it?

User-agent: *
Disallow:/forum/?action*

User-agent: *
Disallow:/forum/index.php?action*

User-agent: *
Disallow:/forum/index.php?action=login

User-agent: *
Disallow:/forum/profile/?*

User-agent: *
Disallow:/forum/profile/*

User-agent: *
Disallow:/forum/index.php/profile/*

User-agent: *
Disallow:/forum/index.php?action=register

User-agent: *
Disallow:/forum/index.php?action=calendar

User-agent: *
Disallow:/forum/index.php?action=help

User-agent: *
Disallow:/forum/index.php?action=help;page=registering

User-agent: *
Disallow:/forum/index.php?action=help;page=loginout

User-agent: *
Disallow:/forum/index.php?action=help;page=profile

User-agent: *
Disallow:/forum/index.php?action=help;page=post

User-agent: *
Disallow:/forum/index.php?action=help;page=pm

User-agent: *
Disallow:/forum/index.php?action=help;page=searching

User-agent: *
Disallow:/forum/Themes

User-agent: *
Disallow:/forum/?action=login

User-agent: *
Disallow:/forum/?action=register

User-agent: *
Disallow:/forum/?action=calendar

User-agent: *
Disallow:/forum/?action=help

User-agent: *
Disallow:/forum/?action=help;page=registering

User-agent: *
Disallow:/forum/?action=help;page=loginout

User-agent: *
Disallow:/forum/?action=help;page=profile

User-agent: *
Disallow:/forum/?action=help;page=post

User-agent: *
Disallow:/forum/?action=help;page=pm

User-agent: *
Disallow:/forum/?action=help;page=searching

User-agent: *
Disallow:/forum/?action=stats

User-agent: *
Disallow:/forum/?action=who

User-agent: *
Disallow:/forum/?action=recent

User-agent: *
Disallow:/forum/?action=search

User-agent: *
Disallow:/forum/?action=profile

User-agent: *
Disallow:/*?action=printpage

User-agent: *
Disallow: /*?action=ignore

User-agent: *
Disallow:/*?sort=views

User-agent: *
Disallow:/forum/?sort=views

User-agent: *
Disallow:/*?sort=views

User-agent: *
Disallow:/*?sort=subject

User-agent: *
Disallow:/*?sort=starter

User-agent: *
Disallow:/*?sort=replies

User-agent: *
Disallow:/*?sort=last_post

User-agent: *
Disallow:/*?wap2

User-agent: *
Disallow:/kayak-fishing-reviews/tag/*

User-agent: *
Disallow:/kayak-fishing-reviews/tag/

User-agent: *
Disallow:/north-east-fishing-news/tag/*

User-agent: *
Disallow:/north-east-fishing-news/tag/

User-agent: *
Disallow:/fishing-tackle-reviews/tag/*

User-agent: *
Disallow:/fishing-tackle-reviews/tag/

User-agent: *
Disallow:/kayak-launch-sites/tag/*

User-agent: *
Disallow:/kayak-launch-sites/tag/

[edited by: engine at 5:50 pm (utc) on Dec. 9, 2008]
[edit reason] examplified [/edit]

 

jdMorgan




msg:3803555
 5:31 pm on Dec 9, 2008 (gmt 0)

You should group all Disallows for the same group of robots under one User-agent: directive.

In other words, you need only one "User-agent: *" line, followed by all of your "Disallow: /xyz" lines.
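
As a rough sketch of that layout, using three of the plain-prefix paths from the file above (the selection is arbitrary, and the remaining Disallow lines would follow in the same way):

```
User-agent: *
Disallow: /forum/Themes
Disallow: /forum/?action=login
Disallow: /forum/?action=register
```

The Disallow lines are kept together with no blank lines between them, because the original standard treats a blank line as the end of a record.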

Policy records addressed to a single robot, or to a specific sub-group of robots, should be placed before the "catch-all" policy record addressed to "*".

I would recommend not including proprietary Disallow formats, Crawl-Delay directives, or Allow directives in policy records addressed to robots which do not explicitly state that they support those extensions to the Standard for Robot Exclusion [robotstxt.org].
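
One possible way to lay that out, assuming Googlebot is the only robot you want to give wildcard patterns to (Google documents support for "*" in paths; the choice of rules here is illustrative and taken from the file above):

```
# Record for Googlebot only: wildcard patterns are an extension
User-agent: Googlebot
Disallow: /*?wap2
Disallow: /*?sort=views
Disallow: /forum/Themes

# Catch-all record: plain prefix matching only
User-agent: *
Disallow: /forum/Themes
```

A robot that finds a record addressed to it by name generally follows that record and skips the "*" record, so any rule you want Googlebot to obey has to be repeated in its own record. Under Google's wildcard matching, "/*?wap2" should match any URL path that contains "?wap2", including the example URL earlier in the thread.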

Jim

[edited by: jdMorgan at 5:33 pm (utc) on Dec. 9, 2008]

glennk




msg:3803569
 5:44 pm on Dec 9, 2008 (gmt 0)

Sorry, I really struggle with robots.txt; it has me beat. Will what I have work for Google, or will I need to change it? Also, will the wap2 URLs be blocked properly by that rule?

jdMorgan




msg:3803581
 6:04 pm on Dec 9, 2008 (gmt 0)

No, your format is invalid, as previously stated. It may not work at all, or it may confuse other major search engines. For a start, you need to put all of those Disallows under a single "User-agent: *" line.

There's not much wiggle room on this, so I'd suggest re-reading the Standard until it makes sense, and/or asking very specific questions here.

Jim

glennk




msg:3803810
 10:32 pm on Dec 9, 2008 (gmt 0)

Ok thanks,

Just looking at the first few, what is the correct syntax for what you suggest? Is the example below right, or do I need to separate them some other way?

User-agent: *
Disallow:/forum/?action*

Disallow:/forum/index.php?action*

Disallow:/forum/index.php?action=login

Disallow:/forum/profile/?*

Disallow:/forum/profile/*

glennk




msg:3805465
 8:24 pm on Dec 11, 2008 (gmt 0)

Is this right?

phranque




msg:3806706
 9:22 am on Dec 13, 2008 (gmt 0)

Yes, here is an example of what you want to do:
[webmasterworld.com...]
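
Since the linked example is truncated here, the following is only a sketch of how the snippet above might be tidied, not a copy of what the link shows. It assumes standard prefix matching, under which a trailing "*" adds nothing and a shorter prefix already covers the longer paths beneath it:

```
User-agent: *
Disallow: /forum/?action
Disallow: /forum/index.php?action
Disallow: /forum/profile/
```

Here "/forum/index.php?action" already covers "/forum/index.php?action=login", and "/forum/profile/" covers "/forum/profile/?*". The blank lines between the Disallow lines are removed because a strict parser may read a blank line as the end of the record.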

glennk




msg:3807357
 7:52 pm on Dec 14, 2008 (gmt 0)

Thanks phranque

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved