Forum Moderators: goodroi

Stop Spiders Crawling Message Board Edit

Cgi Factory Message Board

     
10:45 pm on May 22, 2004 (gmt 0)

New User

10+ Year Member

joined:Mar 2, 2004
posts:40
votes: 0


I run a message board; it is the one from cgi-factory.

I want search engines to index each message thread, but unfortunately there are 'edit' and 'delete' links on each message, and Google has followed each of these and indexed every page they produce. This is using huge amounts of bandwidth for no reason.

So I thought I could use robots.txt to stop it, but I am new to robots.txt so I don't really know how.

The message board pages i want indexed have the URL:

[www.mydomain.com/cgi-bin/mb/mesage_number.html]

When you click an edit link, you get the URL:

[www.mydomain.com/cgi-bin/mb/edit.pl?query_string]

Is there a way I can easily stop indexing of all the edit pages?

Another way I thought of was to add something in the <head> section to stop search engines following links from each of these pages. Is there such a command?

All messages share a master template, so it would be simple to add. However, this would only cover new messages, not all the old ones.

2:51 am on May 23, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0



User-agent: *
Disallow: /cgi-bin/mb/edit.pl

should do it.
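
One way to sanity-check a rule like this before uploading it is Python's standard urllib.robotparser module. The sketch below is only an illustration: the file name 123.html and the query string id=123 are made-up placeholders, and the standard-library parser's matching may differ in detail from how a particular crawler reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two lines suggested above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/mb/edit.pl
"""


def is_allowed(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: "123.html" and "id=123" are placeholders only.
    for url in (
        "http://www.mydomain.com/cgi-bin/mb/123.html",
        "http://www.mydomain.com/cgi-bin/mb/edit.pl?id=123",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

Because Disallow matches by path prefix, the single rule covers every edit.pl URL regardless of its query string, while the message pages stay crawlable.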

For the on-page solution:


<meta name="robots" content="index,nofollow">
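
For message pages that already exist as static files, the tag would have to be added to each file after the fact. A rough sketch of that step, assuming the old pages are plain HTML with an ordinary <head> tag; the function name add_robots_meta is made up for illustration and nothing here comes from the cgi-factory script itself.

```python
import re

# The tag from the post above.
ROBOTS_META = '<meta name="robots" content="index,nofollow">'

# Matches an opening <head> tag (with or without attributes), but not <header>.
_HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
# Detects an existing robots meta tag so the page is not tagged twice.
_HAS_ROBOTS = re.compile(r"<meta\s[^>]*name\s*=\s*[\"']robots[\"']", re.IGNORECASE)


def add_robots_meta(html: str) -> str:
    """Insert the robots meta tag right after <head>, if it is not there yet."""
    if _HAS_ROBOTS.search(html):
        return html  # already tagged, leave the page alone
    match = _HEAD_OPEN.search(html)
    if match is None:
        return html  # no <head> found, do not guess where the tag belongs
    return html[: match.end()] + "\n" + ROBOTS_META + html[match.end():]


if __name__ == "__main__":
    sample = "<html><head><title>Message 1</title></head><body>...</body></html>"
    print(add_robots_meta(sample))
```

Running the function a second time on its own output changes nothing, so it should be safe to apply to a directory of old pages more than once. Back up the files first.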

Jim

4:36 am on May 23, 2004 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 30, 2000
posts:497
votes: 0


Hi mn1dbp, I believe Googlebot ends up spidering the "bad referer" section of this script. If so, try this: open edit.pl and find $my_title="Bad referer"; above that line, delete &vheader; then find sub message_box and, below "print", add your new header... <head><meta name="robots" content="noindex,nofollow">...

5:06 am on May 23, 2004 (gmt 0)

Moderator from GB 

WebmasterWorld Administrator brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 30, 2002
posts:4842
votes: 1


If you're familiar with the script, you could also edit it so that the links are buttons. Google and most 'normal' bots won't be able to follow the buttons.

12:25 pm on May 23, 2004 (gmt 0)

New User

10+ Year Member

joined:Mar 2, 2004
posts:40
votes: 0


Many thanks for the replies.

I will report back on how I get on. To give an idea of the scale: there are about 400 legitimate pages on my site, including message threads, but Google currently reports over 2000 pages. So at least 1600 of them are simply the result of Googlebot following the edit and delete links on each message in every thread and indexing the resulting pages.

1:01 pm on May 23, 2004 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38047
votes: 11


This is a question for your forum software developer. They should have a solution in place for this very issue.

12:36 pm on May 25, 2004 (gmt 0)

New User

10+ Year Member

joined:Mar 2, 2004
posts:40
votes: 0


It's now up to 3,500 pages indexed!

If someone is looking at changing the script, I also have another problem with it: the old version used to allow ten messages per page, then start a new page for messages 11-20, etc.

This new version (v5.0), after the first 10 messages, creates a new page for each and every extra message, so #11 goes on page 2, #12 on page 3, etc.

I currently get around this by allowing unlimited messages per page, but this leads to gigantic pages that are not good for users.
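
For anyone who does look at the script, the paging the old version used comes down to one integer division. This is a sketch in Python rather than the script's own Perl, and page_for_message is a made-up name, not something taken from the board's code.

```python
def page_for_message(message_number: int, per_page: int = 10) -> int:
    """Return the 1-based page a message belongs on, with per_page messages per page."""
    if message_number < 1 or per_page < 1:
        raise ValueError("message_number and per_page must both be at least 1")
    # Messages 1..per_page go on page 1, the next per_page on page 2, and so on.
    return (message_number - 1) // per_page + 1


if __name__ == "__main__":
    for n in (1, 10, 11, 12, 20, 21):
        print(f"message #{n} -> page {page_for_message(n)}")
```

With ten messages per page, #11 and #12 both land on page 2, which is the old behaviour; the v5.0 behaviour described above would instead put #12 on page 3.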