Stop Spiders Crawling Message Board Edit

Cgi Factory Message Board

   
10:45 pm on May 22, 2004 (gmt 0)




I run a message board; it is the one from cgi-factory.

I want search engines to index each message thread, but unfortunately there are 'edit' and 'delete' links on each message. Google has followed every one of these and indexed each page they return, which uses a huge amount of bandwidth for no reason.

So I thought I could use robots.txt to stop it, but I am new to robots.txt and don't really know how.

The message board pages I want indexed have the URL:

[www.mydomain.com/cgi-bin/mb/mesage_number.html]

When you click an edit function, you get the URL:

[www.mydomain.com/cgi-bin/mb/edit.pl?query_string]

Is there a way I can easily stop indexing of all the edit pages?

Another way I thought of was to add something inside the <head> tag to stop search engines following links from each of these pages. Is there such a command?

All messages use a master template, so it would be simple to add. However, this would only cover new messages, not all the old ones.

2:51 am on May 23, 2004 (gmt 0)

jdmorgan




User-agent: *
Disallow: /cgi-bin/mb/edit.pl

should do it.
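
One way to sanity-check a rule like this before deploying it is to feed it to Python's standard `urllib.robotparser` and ask whether particular URLs would be allowed. This is only a sketch: the thread number `123` and the query string `id=123` are made-up placeholders (the thread does not give real values), and Python's parser does simple prefix matching, which may differ in details from how any particular search engine interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested above, exactly as posted.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/mb/edit.pl
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    return parser.can_fetch(agent, url)


# Hypothetical URLs following the patterns described in the question.
thread_url = "http://www.mydomain.com/cgi-bin/mb/123.html"
edit_url = "http://www.mydomain.com/cgi-bin/mb/edit.pl?id=123"

print("thread page allowed:", is_allowed(thread_url))
print("edit page allowed:  ", is_allowed(edit_url))
```

Because the Disallow value is matched as a path prefix, it covers every query string appended to edit.pl. The delete links would need their own Disallow line; their URL is not given in this thread.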

For the on-page solution:


<meta name="robots" content="index,nofollow">

Jim
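
In the tag above, "index" lets the page itself be indexed and "nofollow" asks robots not to follow the links on it. To confirm that the master template really emits the tag, a small check with Python's standard `html.parser` can help. This is a minimal sketch; the sample `page` string is an invented stand-in for a rendered message page, not output from the actual board.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "index,nofollow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Invented sample page standing in for a rendered message thread.
page = (
    '<html><head><meta name="robots" content="index,nofollow">'
    "</head><body>message text</body></html>"
)
print(sorted(robots_directives(page)))
```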

4:36 am on May 23, 2004 (gmt 0)




Hi mn1dbp, I believe Googlebot ends up spidering the "bad referer" section of this script. If so, try this: open edit.pl and find $my_title="Bad referer"; and delete the &vheader; above that line. Then find sub message_box and, below "print", add your new header: <head><meta name="robots" content="noindex,nofollow">...

5:06 am on May 23, 2004 (gmt 0)

brotherhood_of_lan



If you're familiar with the script, you could also edit it so that the edit and delete links are form buttons instead. Google and most 'normal' bots won't be able to follow the buttons.
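
As a rough illustration of the button idea, the helper below builds the HTML for a small form that submits to the edit script instead of linking to it. It is written in Python purely to show the markup; the real board is a Perl script, the field name `id` is a made-up placeholder for whatever the query string carries, and it assumes edit.pl would accept its parameters by POST, which the thread does not confirm.

```python
from html import escape


def edit_button(action, fields, label="Edit"):
    """Return HTML for a form button that replaces a plain <a href> link.

    `fields` maps hidden input names to values (hypothetical parameters).
    """
    hidden = "".join(
        f'<input type="hidden" name="{escape(name)}" value="{escape(value)}">'
        for name, value in fields.items()
    )
    return (
        f'<form method="post" action="{escape(action)}">'
        f"{hidden}"
        f'<input type="submit" value="{escape(label)}">'
        "</form>"
    )


# Hypothetical message id; the real query string is not shown in the thread.
button_html = edit_button("/cgi-bin/mb/edit.pl", {"id": "123"})
print(button_html)
```
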
12:25 pm on May 23, 2004 (gmt 0)




Many thanks for the replies.

I will report back on how I get on. To give an idea of the scale: there are about 400 legitimate pages on my site, including message threads, but Google currently reports over 2000 pages. So at least 1600 of them are simply Google following the edit and delete links on each message in every thread and indexing the resulting pages.

1:01 pm on May 23, 2004 (gmt 0)

brett_tabke



This is a question for your forum software developer. They should have a solution in place for this very issue.
12:36 pm on May 25, 2004 (gmt 0)




It's now up to 3,500 pages indexed!

If someone is looking at changing the script, I also have another problem with it: the old version used to allow ten messages per page, then start a new page for messages 11-20, and so on.

This new version (v5.0) creates a new page for each and every extra message after the first 10, so #11 goes on page 2, #12 on page 3, etc.

I currently get around this by allowing unlimited messages per page, but this leads to gigantic pages that are not good for users.

 
