Forum Moderators: open
<a href="javascript:void(0)" onclick="window.open('http://www.somesite.com/');
return false;">
... but the URL opens in a new window, and I want an unspiderable link to open the new URL in the same window. Could anyone tell me if this is possible in JavaScript and what the code is?
Many thanks.
[webmasterworld.com...]
/claus
<a href='javascript:self.location.href="http://www.somesite.com/";'>
or
<a href='#' onclick='self.location.href="http://www.somesite.com/"; return false;'>
"...would the robots follow this link..."
Currently most/all robots don't parse javascript. But I recall a post from GoogleGuy to the effect of 'just because googlebot doesn't parse javascript now, don't assume that will always be the case; they are working on it'.
Of course, for well behaved bots, you could ask them not to spider it in your robots.txt file. And there is a monster thread somewhere here on the perfect htaccess file to ban badly behaved bots.
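To illustrate the robots.txt approach: if the JavaScript that builds the links lived in an external file under a directory of its own (the path below is purely illustrative), well-behaved bots could be asked to stay away from it with something like:

```
User-agent: *
Disallow: /js/
```

Keep in mind this is only a request honoured by compliant crawlers, not an enforcement mechanism - which is exactly where the htaccess approach for badly behaved bots comes in.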
Shawn
Regarding the thread that claus showed: one thing that can cause this may be that the javascript is visible to the bot. If it is not commented out, the bot can read it, just as a very old browser might display the raw code on screen. So I can understand why there were so many varied experiences in that thread; there are different ways to write the same thing in javascript...
Shawn
Msg #29 speaks of off-page javascript links, embedded in a file outside the page that is read. Of course these are another matter.
ShawnR:
>> one thing that can cause this may be that the javascript is visible to the bot, so if it is not commented out, the bot can read it
Even though text is commented out, it's still visible to the bot - the bot retrieves the whole html document with a GET request; there's no filtering out of JavaScript, styles, comments, or whatever. Everything you serve the Gbot gets read by it (perhaps, in some cases, only the first few hundred Kb; I'm not sure about that).
After retrieving the document, it is parsed. In this parsing process, the parsing rules may decide that comments, html code, javascript, or whatever should not be indexed. Right now it seems that comment text is ignored, but such rules may change over time, as they apparently have with javascript.
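The retrieval-versus-parsing distinction can be shown with a small sketch (plain JavaScript for illustration; this is not any actual bot's code). The commented-out link is present in the fetched HTML either way; whether it survives depends entirely on the parsing rule applied:

```javascript
// The fetched document always contains the commented-out link; only the
// parser decides whether to discard it. Two toy "parsing rules":
var html = '<p>visible text</p><!-- <a href="/hidden">hidden link</a> -->';

// Rule A: strip HTML comments before indexing
function stripComments(doc) {
  return doc.replace(/<!--[\s\S]*?-->/g, '');
}

// Rule B: index everything, comments included
function keepAll(doc) {
  return doc;
}

// Under rule A the hidden link disappears before indexing;
// under rule B it would be indexed like any other text.
```

Either rule is trivial for a crawler to implement, which is why "it's commented out" by itself guarantees nothing about whether the text gets indexed.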
>> I recall a post from GoogleGuy to the effect that 'just because googlebot doesn't parse javascript now
Msg #19 and #20 of that thread may imply this, but it's not the exact wording. GG speaks of "hoarding PR" as a general phenomenon that can be identified and even used as a scoring factor in itself. It is not specific to javascript; quote: 'You can try all sorts of stuff to "conserve PageRank"' (all sorts, msg #20).
Here's a <snip> Google cache image of #9 in the SERPs for "window.open". Please consider this: "These terms only appear in links pointing to this page: window open".
I'm not all that sure that this text can be taken at face value. But in this case it might be so.
This is not a page on JavaScript. It's not a tutorial or an explanation of the "window.open" method. Far from it. Here's the url of the page: <snip>
Try searching the page for "window". Nada. Then "View Source" -> Search for "window" -> #2 result says bingo! This site apparently uses a javascript dropdown for navigation.
You can conclude what you like. Personally, I think it proves that Google actually indexes parts of the text on a page that are not visible to a person using a browser.
It could also very well indicate that Google indexes JavaScript links.
Here's another link to the G cache: <snip>
It's #8 in the SERPs for "TABLE START". The Smithsonian Institution. It's not about tables, neither wooden ones nor the HTML kind. Do you see TABLE START anywhere on that page? Well, it's got a high PR and the TABLE html-tag is used 42 times (21 start + 21 end) on that page. HTML code is not visible to people looking at the page by means of a browser, but it's visible to Gbot.
I find it hard to believe that many people would use the link text "table start" when pointing to the SI. A "link:www.si.edu" search reveals that they have around 7,800 backlinks, but I did not find an anchor text of "table" in the ones I tried.
I have not yet found solid evidence for commented-out text. I have tried. It does not seem to be indexed, but I am still not 100% sure.
/claus
[edited by: claus at 1:56 pm (utc) on July 24, 2003]
Incidentally, the effect I want to achieve is to control the flow of Google PageRank within my own site, rather than to prevent any pages being indexed or to "hide" outgoing links. All the pages are linked in some way or another by pure HTML, but under my strategy my index page and a couple of others will have a much higher "raw" PageRank than some of the less important ones. That is the effect I want, but it of course depends on whether my JavaScript links are actually unspiderable.
Claus, I think we are agreeing with each other ;) My post was just explaining the possible mechanism, and why there are inconsistencies in the reported behaviour.
I've read the threads (previously and now), and I'm not convinced they demonstrate anything definitive, except to confirm that the mechanism described in your post above, and in mine above that, is feasible. Many posts in the Google News forum are just conjecture, and some of those you list admit to being that. Then again, many posts are absolute gold.
"...Heres a Coogle cache ..."
I'd really request you remove the urls. Posting urls or search terms which can identify specific sites is against the terms of service. At any rate, the issue this thread is addressing is not what is on those pages, but how google got to those pages. (How Google ranks a page's relevance to search terms is a big discussion that can't be covered by this thread, but yes, many factors are taken into account which are not visible, such as alt tags, file names, urls, etc.)
"...an example of how to write a JavaScript link that would not be followed by a robot, compared to other JavaScript link codes that would be spidered..."
Personally I don't think this is something you can rely on, although perhaps it is true for some bots now.
Three options / suggestions:
Following the last option, you don't have to worry, as the links that you don't want followed simply are not there, neither in html nor javascript.
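That last approach can be sketched like this (the file path, function name, and URLs are illustrative, not anything from the thread): the anchor never appears in the page's own HTML source, because an external script writes it in at load time, and that script's directory can be disallowed in robots.txt.

```javascript
// Placed in an external file, e.g. a hypothetical /js/nav.js
// (with /js/ disallowed in robots.txt for well-behaved bots).
// The page calls document.write() on the string this returns,
// so the <a> tag exists nowhere in the page's own source.
function buildLink(url, text) {
  return '<a href="' + url + '">' + text + '</a>';
}

// In the page itself:
//   <script src="/js/nav.js"></script>
//   <script>document.write(buildLink('http://www.somesite.com/', 'Some Site'));</script>
```

A bot fetching the page gets neither the link nor the javascript that builds it; a browser user sees a perfectly ordinary link.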
ShawnR:
>> I'm not convinced they demonstrate anything definitive, except
Yes, I agree; we'll have to watch developments before we conclude anything 100%, but there is evidence that G is including more than "the visible parts of a page". Comments are a high-risk zone for spam, so I'd be very careful about indexing those. On the other hand, the bot lives and thrives off links, so I'd go a long way to make it able to identify more of those.
>> I'd really request you remove the urls
- done, no problem :)
>> how google got to those pages
I think this speaks for itself, although I'm still uncertain that this sentence can always be trusted to mean exactly what it says: "These terms only appear in links pointing to this page: window open".
/claus
Oh, found this one as well: [webmasterworld.com...]
/claus