Forum Moderators: mack


How to keep a spider/bot from following a link


Jon_King

12:43 pm on Jan 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a single link on a page to a PDF that I don't want indexed. I can't use robots.txt or meta nofollow because I want the page indexed, is there a 'proper' way to hide this link?

(I am aware of the new attribute, rel="nofollow" on hyperlinks, I don't think it is to be used for this situation.)

MatrixBrains

1:00 pm on Jan 29, 2005 (gmt 0)

10+ Year Member



I don't know whether this will work or not, but here is a suggestion:

Suppose <a href="somedoc.pdf">PDF</a> is the link

Try using this as -
<table width="100">
<tr>
<a href="somedoc.pdf">
<td>
PDF
</td>
</a>
</tr>
</table>

This should create a hidden link.

Regards,
MB

MatrixBrains

1:02 pm on Jan 29, 2005 (gmt 0)

10+ Year Member



Also note that when you right-click on the above link, the {open in a new window} option is not there.

When the visitor clicks on the cell containing the link, he is taken to the relevant page.

Regards,
MB

Lord Majestic

1:06 pm on Jan 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This should create a hidden link.

LOL -- this _may_ create a hidden link for browsers, but certainly not for robots! Very funny. JavaScript is probably the best bet, but using robots.txt should do the trick too, since you can disallow access to a single file there; better still, create a directory for all PDFs and disallow access to that.
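The robots.txt route could look like this -- a minimal sketch assuming the PDFs are gathered under a /pdf/ directory (the path is illustrative):

```
User-agent: *
Disallow: /pdf/
```

To block just the one file instead, a line like `Disallow: /somedoc.pdf` works the same way. Only well-behaved crawlers honor robots.txt, though.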

Jon_King

1:25 pm on Jan 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks LM,

That is a good solution for well behaved bots.

I can use this method in a pinch, but I am really looking for something more bulletproof, especially against the rogues and scrapers that don't obey robots.txt or nofollow.
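The JavaScript route Lord Majestic mentions above can be sketched like this -- a hypothetical example (the filename and element id are assumptions) that assembles the href at runtime, so the URL never appears verbatim in the static HTML and a scraper that doesn't execute JavaScript has nothing to follow:

```javascript
// Hypothetical sketch: build the link from pieces at runtime so the
// PDF's URL is not present as plain text in the page source.
var pieces = ['some', 'doc', '.pdf']; // assumed filename, split up

function buildPdfLink(parts) {
  // Join the fragments back into the real href and wrap it in a link.
  return '<a href="' + parts.join('') + '">PDF</a>';
}

var linkHtml = buildPdfLink(pieces);
// In the browser you would then inject it into a placeholder, e.g.:
// document.getElementById('pdf-slot').innerHTML = linkHtml;
```

Any bot that runs JavaScript (or a human reading the script) can still recover the URL, so this only deters the simplest scrapers.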

Lord Majestic

1:28 pm on Jan 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can try asking people to enter the numbers they see on a warped image before they can proceed to download the PDF. This should work against all bots, but won't work against a determined human attacker (and nothing on the public web will). Be warned, however, that people expect to download PDFs without having to go through this number-typing procedure, so I would advise watching your drop-offs on that page very carefully.
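The challenge/response logic behind such a gate can be sketched like this -- a hypothetical outline only (rendering the warped image itself, e.g. server-side, is omitted, and the function names are made up):

```javascript
// Hypothetical sketch of a "type the numbers you see" gate.
// The server would render `challenge` as a distorted image and only
// reveal the PDF URL after checkAnswer() succeeds.

function makeChallenge(length) {
  var digits = '';
  for (var i = 0; i < length; i++) {
    digits += Math.floor(Math.random() * 10); // one random digit 0-9
  }
  return digits;
}

function checkAnswer(expected, typed) {
  // Trim whitespace so a stray space doesn't fail a correct answer.
  return typed.trim() === expected;
}

var challenge = makeChallenge(5); // five random digits to render as an image
```

The key point is that the PDF link is served only after a successful check, so a bot that never sees (or solves) the image never sees the URL.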

Bonusbana

1:50 pm on Jan 29, 2005 (gmt 0)

10+ Year Member



How about:

<?php
if (!preg_match("#(google|slurp@inktomi|yahoo! slurp|msnbot)#si", $_SERVER['HTTP_USER_AGENT'])) {
echo "<a href=\"file.pdf\">pdf</a>";
}
?>

MatrixBrains

2:56 pm on Jan 29, 2005 (gmt 0)

10+ Year Member



Thanks Lord Majestic! (I got your point.)

Jon_King

4:23 pm on Jan 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Bonusbana, I have to keep it an HTML page. PHP won't work for me in this case.

I have found several software products that claim to hide affiliate links within an HTML file. Although I'm not using it for affiliate links, it should work for any link. They are typically called 'affiliate cloners' -- does anyone know how they do it, and whether they work?