Forum Moderators: martinibuster
Thanks.
Because search engine spiders don't read JavaScript, any method of linking that uses JavaScript is currently unspiderable.
One method is just to use
<script language="JavaScript" type="text/javascript">
document.write('<a href="http://www.example.com">example</a>');
</script>
Or you could use something like one of these:
<a href="#" onclick="window.location='http://www.example.com';">example</a>
<a href="javascript:location.href='http://www.example.com'">example</a>
Try using an external js file. At the moment it still works to hide links, but be prepared to change as spiders change.
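A minimal sketch of the external-file approach (the filename links.js and the helper function are hypothetical): the page references the file with `<script type="text/javascript" src="links.js"></script>`, and the file builds the anchor markup that document.write() emits in the browser:

```javascript
// links.js -- hypothetical external file referenced from the page.
// Builds the anchor markup; in the browser you would pass the result
// to document.write() so the link never appears in the page source.
function linkHtml(url, text) {
  return '<a href="' + url + '">' + text + '</a>';
}
// In the browser: document.write(linkHtml('http://www.example.com', 'example'));
```

Keeping the markup in a function makes it easy to reuse for several links, but note it does nothing extra to hide them once a bot starts fetching and scanning external JS files.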
pixel_juice,
>>Because search engine spiders don't read javascript
No longer true for Google. If you can read it, g-bot will read it. What g-bot does not yet do is parse js.
worker,
What might be "effective at preventing 'PR bleed'" will also be effective at making a site look like a dead end -- many links in, none out -- in general, SEs like to see sites that are well linked.
Jim
Both.
Google implemented reading JS for links a couple of months ago. It will read the JS and, if it sees something that looks like a link, it will try to follow it -- basically anything that has an anchor or an href. So in both of your examples above it will pull out the http*://www.example.com.
And yes, you can make it more difficult because g-bot will not parse the js (at least not yet). So if you write an expression that concatenates strings into a full url (however it works: 'www.' + 'example' + '.com' or something) you should (maybe) be okay (for now).
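A sketch of the concatenation idea above (function name hypothetical): the full URL never appears as a single literal, so a bot that only pattern-matches for href/http strings -- without actually parsing the JS -- has nothing to extract:

```javascript
// The full URL never exists as one string literal in the source,
// so a simple pattern-matcher scanning for "http://..." finds nothing.
function hiddenUrl() {
  var scheme = 'ht' + 'tp://';
  var host = 'www.' + 'example' + '.com';
  return scheme + host;
}
// In the browser you might then do: location.href = hiddenUrl();
```

Of course, any bot that starts evaluating JS (rather than just scanning it) would defeat this immediately, so treat it as a stopgap.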
And I don't know which ones offhand but I believe that a couple of other bots have been following js links for some time. Can anybody confirm either way?
Jim
I have built quite a nice site (so I feel) which utilises a DHTML menu; the spiders couldn't follow it... while a database dump of my products into a single directory (a single index.asp plus 2,000 product pages) got picked up within two weeks...
... now that I have written this post, it occurs to me that the DHTML menu, though it is written in JavaScript, has neither href nor www in its links.
E.g., " … Scanners","/ShopDisplayProducts.asp?id=291&cat=Scanners",,"Scanners ...",0
... which explains why it is not picked up.
Any ideas on what I could do to have a menu the spiders could follow?
ShopDisplayProducts.asp?id=291&cat=Scanners
>> Any ideas on what i could do to have a menu the spiders could follow?
The problem is the parameter 'id'. It suggests a session ID, and search engines don't like session IDs: following them would yield a lot of identical pages. So to get the spiders to follow your links, rename the parameter to 'code' or something similar.
ia_archiver (Alexa/Wayback Machine) attempts to follow JavaScript - it appears to interpret JavaScript in buttons, e.g. goxxx('123456'), as a possible link and will make a request for a /curr_dir/123456 page.
So even if the goxxx function is in an external JavaScript file, it still makes the attempt. I've not seen this behaviour from any other bots, though.
>> Google implemented reading js for links a couple of months ago. It will read the js and if it sees something that looks like a link it will try to follow it, basically anything that has an anchor or an href.
I agree. Matt Cutts said something along the same lines at Pubcon 4.
I still believe that g-bot can't follow the kind of coding I've mentioned above.
I fixed the DHTML menu problem by putting text-based links underneath the menu; the DHTML menu lives in another layer. This also caters for users/visitors with JavaScript disabled.
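The fallback can be sketched like this (the id, filename, and product URLs are hypothetical): the DHTML menu sits in its own positioned layer, while plain anchors below it stay visible to spiders and to visitors without JavaScript:

```html
<!-- DHTML menu lives in its own layer; spiders ignore the script -->
<div id="dhtmlMenu" style="position: absolute; top: 0; left: 0;">
  <script type="text/javascript" src="menu.js"></script>
</div>

<!-- plain text-based fallback links underneath the menu -->
<p>
  <a href="/ShopDisplayProducts.asp?cat=Scanners">Scanners</a>
  <a href="/ShopDisplayProducts.asp?cat=Printers">Printers</a>
</p>
```

The plain anchors give the spiders (and no-JS visitors) a crawlable path to every category, while JS-enabled visitors see the fancy menu layered on top.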
In regard to "id" meaning "session id", well, I think the bots need better algos. :)
Real session ids are much longer than 5 digits.
Another story, related to link development: g-bot also got lost (and it's not its fault) when hitting the "view cart" link. This triggered an error ("Cart is empty"), which is also a 302 redirect to the error-message page. I have now disabled that link (together with "checkout") whenever the cart is empty.
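A sketch of that fix (function name and cart/checkout paths are hypothetical): render the cart and checkout links only when the cart actually has items, so bots never follow them into the empty-cart 302:

```javascript
// Hypothetical helper: emit real anchors only for a non-empty cart,
// plain text otherwise, so spiders never hit the "Cart is empty"
// error page and its 302 redirect.
function cartLinks(itemCount) {
  if (itemCount > 0) {
    return '<a href="/ShopDisplayCart.asp">view cart</a> ' +
           '<a href="/ShopCheckout.asp">checkout</a>';
  }
  return 'view cart checkout'; // disabled: no anchors for bots to follow
}
```

The same test could of course be done server-side (e.g. in the ASP page) so the disabled state is in the HTML the bot actually receives.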