I'm currently working as an SEO for a marketing company in Latin America, and the agency just got a HUGE new client for SEO and SEM (sponsored results).
When I examined the client's website, I noticed that the site was 100% Flash, but when I looked at Google's cache in the "text only version", I saw that there was actually content that spiders could read. After taking a look at the client's source and going to their office for a little chat, I learned that all the information is stored in an XML file. They use that XML to put content in the Flash "movie", and under the Flash (in another layer that is not visible to users) they write the same content in HTML for spiders.
I've asked a Google technician for help, but he only told me this:
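To make the setup concrete, here is a minimal sketch of how one XML source could feed the HTML layer, so that both versions always carry the same text. The XML layout (`site`, `page`, `paragraph`) and the function name `xml_to_html` are assumptions made up for illustration; the client's real schema is not shown in this thread.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML layout; the client's real schema is not shown in the thread.
SAMPLE_XML = """<site>
  <page title="About us">
    <paragraph>We sell coffee &amp; tea.</paragraph>
    <paragraph>Visit our store.</paragraph>
  </page>
</site>"""


def xml_to_html(xml_text: str) -> str:
    """Render every <page> as a heading followed by its paragraphs."""
    root = ET.fromstring(xml_text)
    parts = []
    for page in root.iter("page"):
        # Escape the text so that characters like & stay valid in HTML.
        parts.append(f"<h1>{escape(page.get('title', ''))}</h1>")
        for para in page.iter("paragraph"):
            parts.append(f"<p>{escape(para.text or '')}</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(xml_to_html(SAMPLE_XML))
```

Because the HTML is generated from the same file the Flash movie reads, the two versions cannot drift apart, which is the point the "same content" argument rests on.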
"This is our policy about Flash.
[google.com...] "
As far as I know this is not cloaking, because the content shown to spiders and the content shown to users is the same, except that users see the Flash and spiders see the HTML. Also, a user who has Flash disabled will get the HTML version.
Do you think this is penalized? I think it's not because this client is not "cloaking" anything, but a second opinion would be fine.
Thanks
Swf2html extracts text and links from a Macromedia Flash .SWF file, and returns the data to stdout or as an HTML document.
I would think the cloaked version is a bit cleaner than the extracted Macromedia data.
There really is no need to cloak the HTML version. I'd be offering that to my visitors as an alternative to the Flash version. Block the Flash from getting indexed (Google can read Flash files) and let the HTML version do its thing.
I'll try to find more information about that tool from macromedia, thanks.
Adobe - Player Licensing : Search Engine SDK FAQ
[adobe.com...]
Who should use the Macromedia Flash Search Engine SDK?
The Macromedia Flash Search Engine SDK is designed for search engine application engineering teams. Users of the SDK can add Flash file decompression, parsing, and indexing features to their server-based search applications.
We've been using similar techniques since the late '90s, and we've never had any trouble with the SEs.
There are a few different ways of doing this, all of which have been around for years. They all originally stem from the way frameset pages had to be modded so that they could be indexed with deep/pretty URLs. To accomplish that, the frameset was inserted via JavaScript, making it appear on top of an ordinary non-framed version of the page that was also coded into the page, using the exact same data pulled from the DB of course. Flash developers adopted these techniques quickly, as they also solved the 100% layout problems we had with Flash in the old days (1998/1999). Ever since XHTML/CSS became the norm (2002/2003), the Flash object has usually been placed within a replaceable DIV instead.
Good thing more people are starting to take note of these solutions these days. Although it's kind of funny to see how things seem to be rediscovered in ten-year intervals :)
Simple menus, splash pages, etc. can all be addressed using the attributes available for the elements you are using.
Rare?! I don't know about that. Depends where you look I guess :)
However, I see more and more hybrid sites utilizing these same concepts (as the embedded applications get more and more content-dependent). Maybe not so much on the Flash scene as within the AJAX community, where it has really become a major issue lately.
In response to Google being able to read SWF files: there's much confusion going around out there on this topic. Here's the deal: yes, Google can index SWFs, but only the text that is embedded within the published SWF, not text that is dynamically loaded on the client at runtime. As most Flash development today involves pulling data from a DB at runtime, Google will not be able to index/see that content. Going with a replaceable DIV, or similar, is therefore a must if you want to have your content indexed.
Guideline 11. Use W3C technologies and guidelines.
Checkpoint 11.4 If, after best efforts, you cannot create an accessible page, provide a link to an alternative page that uses W3C technologies, is accessible, has equivalent information (or functionality), and is updated as often as the inaccessible (original) page. [Priority 1]
I have been trying to work with a PHP-based page that uses user agent detection in an if/else statement.
-----------------------------------------------------------
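// userAgent is assumed to hold the request's User-Agent header.
// IndexOf returns -1 when there is no match, so test for >= 0.
string ua = userAgent.ToLower();
if (ua.IndexOf("googlebot") >= 0 ||
    ua.IndexOf("msnbot") >= 0 ||
    ua.IndexOf("slurp") >= 0 ||
    ua.IndexOf("ask jeeves/teoma") >= 0)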
{
// Insert code for Html layer here.
}
else
{
// Insert current code for flash layer here.
}
-----------------------------------------------------------
If the page is detected by a SE then the text variation displays. If not then the flash content is displayed.
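For comparison, here is a rough sketch of the same user agent check in Python, using the same four substrings as the snippet above. The names `BOT_MARKERS`, `is_search_bot` and `choose_layer` are made up for illustration, and real crawler user agents vary, so treat the list as an assumption rather than a complete one.

```python
# Substrings taken from the snippet above; real crawler user agents vary.
BOT_MARKERS = ("googlebot", "msnbot", "slurp", "ask jeeves/teoma")


def is_search_bot(user_agent: str) -> bool:
    """Return True if any marker occurs anywhere in the user agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def choose_layer(user_agent: str) -> str:
    """Pick which layer to render: 'html' for bots, 'flash' for everyone else."""
    return "html" if is_search_bot(user_agent) else "flash"


if __name__ == "__main__":
    print(choose_layer("Googlebot/2.1 (+http://www.google.com/bot.html)"))
    print(choose_layer("Mozilla/5.0 (Windows NT 10.0)"))
```

The `in` test also matches when the marker sits at the very start of the string, which matters for user agents that begin with the bot name.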
I am still trying to figure out whether the SEs will look at this as spamming or cloaking. If anyone has any experience with this, please let me know!
But always showing exactly the same content in both versions. That's where common sense HAS to appear.
If the content you show inside the flash (let's say SE's can't index it)
But, the SEs can index it. And, they get better at it as time goes by. So, it is up to the developer to make sure that the Flash version is blocked from getting indexed while the HTML version does its thing.
Or, you can get high tech with the process and start doing IP based content delivery. It works like a charm if you do it correctly. I don't want the bots inside my Flash content "guessing" at what is there. They still aren't smart enough to do that so "I" have to provide them with the path of least resistance, my HTML version whether it be public or bot only.
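As a rough idea of what IP based delivery involves, the sketch below checks the visitor's address against a list of crawler networks. The networks used here are reserved documentation ranges and are NOT real crawler addresses; the names `CRAWLER_NETWORKS`, `is_crawler_ip` and `pick_version` are hypothetical. A real setup would need a maintained list of crawler IPs, which this thread does not provide.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (RFC 5737).
# They are NOT real crawler addresses; a real list would need upkeep.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_crawler_ip(remote_addr: str) -> bool:
    """Return True if the address falls inside any listed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Anything that is not a valid IP is treated as a normal visitor.
        return False
    return any(addr in net for net in CRAWLER_NETWORKS)


def pick_version(remote_addr: str) -> str:
    """Serve the HTML version to listed crawlers, Flash to everyone else."""
    return "html" if is_crawler_ip(remote_addr) else "flash"


if __name__ == "__main__":
    print(pick_version("192.0.2.10"))
    print(pick_version("203.0.113.5"))
```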
In the ultimate scenario, the Flash developer would have taken this all into consideration and is detecting for the Flash Player. If not installed and/or not supported, the HTML version is served. It's that simple.
You may want to consider creating HTML copies of these Flash pages for our crawler. If you create HTML copies, please be sure to include a robots.txt file that disallows the Flash pages in order to ensure that our crawler doesn't recognize these pages as duplicate content.
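A robots.txt along those lines could look like the sketch below, which also checks the rules with Python's standard `urllib.robotparser`. It assumes the Flash pages live under a hypothetical `/flash/` directory and the HTML copies under `/html/`; neither path comes from this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumes the Flash pages live under a hypothetical /flash/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flash/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given agent is allowed to fetch the path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(may_crawl("/flash/intro.swf"))
    print(may_crawl("/html/intro.html"))
```

Keeping all Flash files in one directory makes the rule a simple path prefix, which every crawler that honors robots.txt should understand.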
Along with it, I also add a small text link at the bottom to the Flash version. This is for users who land on the page from the Google SERPs. Alternatively, someone suggested checking whether the user can view Flash and then redirecting to the Flash or HTML version accordingly. I have never tried this, as I am not a technical person.