Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Google Makes AJAX Applications Crawlable

 2:21 pm on Mar 4, 2010 (gmt 0)

Google Makes AJAX Applications Crawlable [code.google.com]
This document outlines the steps that are necessary in order to make your AJAX application crawlable. Once you have fully understood each of these steps, it should not take you very long to actually make your application crawlable!

Briefly, the solution works as follows: the crawler finds a pretty AJAX URL (that is, a URL containing a #! hash fragment). It then requests the content for this URL from your server in a slightly modified form. Your web server returns the content in the form of an HTML snapshot, which is then processed by the crawler.
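For anyone wondering what "a slightly modified form" looks like in practice, here is a minimal sketch in Python. It assumes the mapping described in Google's proposal, where the state after #! is moved into an `_escaped_fragment_` query parameter. The function names `crawler_url` and `requested_state` and the example.com URLs are made up for illustration, and the choice to leave "=" unescaped is my own, not something stated in the excerpt above.

```python
from urllib.parse import parse_qs, quote, urlsplit, urlunsplit


def crawler_url(pretty_url: str) -> str:
    """Map a '#!' URL to the form a crawler would request from the server."""
    parts = urlsplit(pretty_url)
    if not parts.fragment.startswith("!"):
        # Ordinary in-page anchor (or no fragment): nothing to rewrite.
        return pretty_url
    # Percent-encode the state so it is safe inside a query string.
    # Leaving "=" readable is an illustrative choice (assumption).
    state = quote(parts.fragment[1:], safe="=")
    param = "_escaped_fragment_=" + state
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def requested_state(request_url: str):
    """Server side: recover the AJAX state the crawler is asking about."""
    query = urlsplit(request_url).query
    values = parse_qs(query, keep_blank_values=True).get("_escaped_fragment_")
    # None means a normal visitor request; a string means "send a snapshot".
    return values[0] if values else None


if __name__ == "__main__":
    ugly = crawler_url("http://example.com/page#!state=1")
    print(ugly)
    print(requested_state(ugly))
```

The server would then use the recovered state to decide which HTML snapshot to return. How the snapshot itself is produced is left out here, since that depends entirely on the application.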



 12:18 am on Mar 5, 2010 (gmt 0)

So the onus is on the site's owner to pay for extra work on the site, then. :(

What's wrong with modifying a standard XML reader as a bot? Surely THAT knows whether a file is XML or not? Don't the headers give a clue?

Have to say I'm glad I'm not coding XML. I tried to read Numbers pages on the IANA site yesterday using Firefox. All I got was garbage. But at least it told me at the top that it was XML.


 2:44 am on Mar 7, 2010 (gmt 0)

As I understand it, the issue is not really about reading XML. The issue is indexing a URL with a # mark as being different from the URL without the #. If your site needs this technology, then your various AJAX page states were already NOT getting indexed by the search engines, so this technology helps, and it is worth investing the resources to get more search traffic.

I think Google is correct here. The web needs a different way to identify a new AJAX call to the server for page content that wasn't served in the first download. Recycling the simple hash mark for that purpose has definitely been problematic. That's even more the case with the new interest in indexing the page fragments with direct links in the snippet.


 8:15 pm on Mar 7, 2010 (gmt 0)

Surely it's possible to determine the likelihood of a hash link on a page being in-page or external? If the page is XML it's likely to be the latter, surely? Follow the link and see what comes up: if it's not XML, drop it.

Or perhaps I'm being too simplistic. I come from an age when programming was logical and structured, before internet "gurus" got hold of the idea. :)

As to getting more search traffic: if the browser can't display it, surely it ain't much use? The punter (e.g. me) will go away and find a page that can be viewed.

Or is non-viewable an XML design fault? If so, I think IANA should be told.


 9:54 pm on Mar 7, 2010 (gmt 0)

You're right - a major part of the HTML/XHTML on the web is a far, far cry from anything you could call disciplined programming. If web pages were treated as true programming, then your browser would be blank almost all of the time today!

So the job of trying to surface relevant content from that soup is extremely challenging - especially with the amount of truly malicious JavaScript that is being served out there. No one would want that code to just run on their servers.

Also, a crawler is definitely not using the same kind of approach that a visual browser uses. Different goals, different end uses. I don't know if this #! approach that Google suggests will really fly or not, but it's not too bad as a first attempt.


 10:02 pm on Mar 7, 2010 (gmt 0)

The # on its own already has a specific meaning.

So, I applaud that site owners have a new way (using the #! combination) to signify content they do want crawled, rather than Google 'assuming' and just barging on in.
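To make the distinction concrete, here is a rough sketch of how a bot could tell the two apart from the URL alone, with no guessing about the target. The function name `fragment_kind` and the label strings are hypothetical, chosen only for this example. It is not a description of what Googlebot actually does.

```python
from urllib.parse import urlsplit


def fragment_kind(url: str) -> str:
    """Classify a link by its fragment, using only the URL text."""
    fragment = urlsplit(url).fragment
    if not fragment:
        return "none"            # no # at all, or a bare trailing #
    if fragment.startswith("!"):
        return "ajax-state"      # site owner opted in with #!
    return "in-page-anchor"      # plain #, keeps its existing meaning


if __name__ == "__main__":
    for link in ("http://example.com/a",
                 "http://example.com/a#top",
                 "http://example.com/a#!tab=2"):
        print(link, "->", fragment_kind(link))
```

The point is that the opt-in lives in the link itself, so the crawler never has to fetch anything to find out whether the owner wants that state indexed.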


 1:28 am on Mar 11, 2010 (gmt 0)

I like this. Very useful if designed thoughtfully.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved