tedster

msg:4454712 | 6:58 pm on May 17, 2012 (gmt 0) |
The big deal here - and Google's urgency - is indexing AJAX content, I assume.
|
onebuyone

msg:4454714 | 7:05 pm on May 17, 2012 (gmt 0) |
Wasn't Googlebot adjusted to crawl FB comments and only them?
|
g1smd

msg:4454735 | 8:10 pm on May 17, 2012 (gmt 0) |
Google's bots added a quarter of a million quid's worth of products to the shopping basket of a site last week. They're now blocked.
|
Sgt_Kickaxe

msg:4454759 | 9:34 pm on May 17, 2012 (gmt 0) |
Google had switched to a visual method of recording page content a long time ago, before they launched page previews. They've been able to pick up textual comments loaded by javascript for a long time now. The real news is that they've begun trusting googlebot to dig deeper into code too. I would NOT be surprised if your site (or you as a webmaster) needs to pass a *sniff test* with their visual methods first, or have a trustworthy history, before googlebot opens up in your javascript with post requests.
|
Sgt_Kickaxe

msg:4454911 | 8:42 am on May 18, 2012 (gmt 0) |
g1smd, was googlebot cookied through at least part of the checkout process?
|
rustybrick

msg:4454970 | 11:21 am on May 18, 2012 (gmt 0) |
They were doing this 6+ months ago with all that chatter around Disqus comments and Facebook comments.
|
Planet13

msg:4455007 | 1:44 pm on May 18, 2012 (gmt 0) |
| Google's bots added a quarter of a million quid's worth of products to the shopping basket of a site last week. They're now blocked. |
| Huhh... on my shopping cart, I would get these mysterious instances where SEVERAL visitors would all arrive at the same time and add one product to the shopping cart and then leave. They would all be added at the same minute. At first I thought it was some competitor trying to deplete my inventory, but now it seems more likely that it could be googlebot (because they only order ONE of each item, while a competitor would order hundreds or thousands of an item to deplete the inventory).
|
dstiles

msg:4455104 | 7:51 pm on May 18, 2012 (gmt 0) |
if not real user with real web browser then
    do not display ANY forms
end if
Ditto javascript, css, whatever.
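A minimal Python sketch of the pseudocode above, assuming the server has already decided whether the client is a real browser. The function name `render_page`, the script path `/js/site.js`, and the form action `/basket/add` are made up for illustration and do not come from any site discussed here.

```python
from html import escape


def render_page(title: str, is_real_browser: bool) -> str:
    """Build a product page, leaving out forms and scripts for non-browsers."""
    parts = [f"<h1>{escape(title)}</h1>"]
    if is_real_browser:
        # Only real visitors receive the script tag and the basket form.
        # Both paths below are hypothetical placeholders.
        parts.append('<script src="/js/site.js"></script>')
        parts.append(
            '<form method="post" action="/basket/add">'
            '<button type="submit">Add to basket</button></form>'
        )
    return "\n".join(parts)
```

With this shape a crawler still gets the readable content but never sees a form it could submit, so it has nothing to add to a basket.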
|
koan

msg:4455211 | 4:14 am on May 19, 2012 (gmt 0) |
Would it run javascript code of a js file located in a directory blocked by robots.txt?
|
lucy24

msg:4455260 | 9:23 am on May 19, 2012 (gmt 0) |
| Would it run javascript code of a js file located in a directory blocked by robots.txt? |
| Preview definitely would if it could-- but then, Preview isn't a robot. So would the plainclothes bingbot.
|
dstiles

msg:4455392 | 7:42 pm on May 19, 2012 (gmt 0) |
koan - you cannot rely on any bot to obey robots.txt in all situations. They used to but, as lucy notes, preview and other plain-clothes bots can do anything. Detect IP ranges, UAs, headers, whatever either within an htaccess file or within the page itself (I would have thought webmasters should be doing that anyway to determine real visitors). What you do with the catch when you get it depends on you. I generally throw it back with a 403 or, in the case of JS, do not load it within the page.
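One way the UA and IP check with a 403 might look in Python, as a rough sketch only. The UA substrings are illustrative, the network `192.0.2.0/24` is a documentation placeholder rather than a real crawler range, and treating an empty User-Agent as a bot is an assumption; the helper names `looks_like_bot` and `status_for` are hypothetical.

```python
import ipaddress

# Illustrative substrings only; a real list would be maintained from your logs.
BOT_UA_TOKENS = ("googlebot", "bingbot", "google web preview")

# Placeholder network (reserved for documentation), NOT a real crawler range.
BOT_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def looks_like_bot(user_agent: str, remote_ip: str) -> bool:
    """Return True if the UA or the source IP matches a known-bot rule."""
    ua = user_agent.lower()
    # Assumption: a missing User-Agent header is treated as a bot.
    if not ua or any(token in ua for token in BOT_UA_TOKENS):
        return True
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in BOT_NETWORKS)


def status_for(user_agent: str, remote_ip: str) -> int:
    """Pick the HTTP status: 403 for suspected bots, 200 otherwise."""
    return 403 if looks_like_bot(user_agent, remote_ip) else 200
```

The IP check matters for the "plain-clothes" case: a bot sending a browser-like UA only gets caught if its address falls in a range you have listed.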
|
lucy24

msg:4455452 | 1:54 am on May 20, 2012 (gmt 0) |
| I generally throw it back with a 403 or, in the case of JS, do not load it within the page. |
| Do you mean that you include a bit in the js itself to detect the UA and/or IP and act accordingly? So the page gets a little bit fatter but you're shifting the work from your server to the visitor's computer?
|
Kendo

msg:4455508 | 6:10 am on May 20, 2012 (gmt 0) |
Well there goes another area of PRIVACY thanks to Google. A common practice to protect links and content from being indexed is to use JavaScript to write in the link. Lord, please forgive them for they do not know what they do... after all they are only criminally insane idiots.
|
tedster

msg:4455513 | 6:40 am on May 20, 2012 (gmt 0) |
The processing of javascript links should not be news to anyone who's been paying attention. It's been happening (and discussed here) for several years. To protect those javascripted links takes another step - Disallow googlebot from crawling your JS file, for instance. This change is about being able to crawl AJAX content, presumably without the clunky hash-bang workaround. The sky is not always falling ;)
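As a rough illustration of that Disallow step, the sketch below feeds a two-line robots.txt to Python's standard `urllib.robotparser` and asks what a compliant Googlebot would be allowed to fetch. The `/js/` directory and the helper name `googlebot_may_fetch` are assumptions for the example; the rule only describes what a crawler that honours robots.txt should do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt keeping Googlebot out of the script directory.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /js/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def googlebot_may_fetch(path: str) -> bool:
    """Return True if the rules above permit Googlebot to fetch the path."""
    return parser.can_fetch("Googlebot", path)
```

Here `googlebot_may_fetch("/js/links.js")` is refused while ordinary pages outside `/js/` remain crawlable.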
|
Staffa

msg:4455522 | 8:57 am on May 20, 2012 (gmt 0) |
| Disallow googlebot from crawling your JS file, for instance. |
| As if, if G has a mind to it, G will follow your robots.txt disallow. I have caught G more than once disregarding robots.txt rules.
|
dstiles

msg:4455669 | 8:47 pm on May 20, 2012 (gmt 0) |
Lucy - no, it's all part of the page processing. JS never gets sent to bots. Staffa - see my previous post re: not having to rely on the clunky and easily ignored robots.txt.
|
Staffa

msg:4455684 | 9:31 pm on May 20, 2012 (gmt 0) |
dstiles - I know, and I certainly do not rely on robots.txt. I'm just surprised that after all those years it's still suggested by tedster.
tedster - unless you are using Disallow in the broader sense and not necessarily via robots.txt, in which case "Disallow" threw me off track as it is so specifically associated with robots.txt.
|
tedster

msg:4455706 | 10:34 pm on May 20, 2012 (gmt 0) |
No - I did mean robots.txt. I've had good luck with it, although I have heard that others ran into trouble. Do you have any idea about what the differences might be? I've mostly used it to keep affiliate links from "being counted."
|
Staffa

msg:4455886 | 10:44 am on May 21, 2012 (gmt 0) |
I have no idea whatsoever, the sites are as plain as they come without any ads or other external input. The javascript and css are purely for visitors and disallowed in robots.txt and when G ignores this it gets whacked like any other rogue bot
|
lucy24

msg:4456011 | 4:25 pm on May 21, 2012 (gmt 0) |
| The javascript and css are purely for visitors and disallowed in robots.txt and when G ignores this it gets whacked like any other rogue bot |
| Do you have the googlebot itself reading and acting on javascript? See, I could have sworn I'd caught it myself. Many times. But I pored over logs and all I could find was Preview consistently misbehaving.
|
Staffa

msg:4456059 | 5:43 pm on May 21, 2012 (gmt 0) |
Yes, it's Gbot itself occasionally fetching css and js files. Preview and translate are blocked as standard ;o)
|
|