Why is google still indexing my FLASH?

         

ichthyous

4:17 pm on Jun 14, 2004 (gmt 0)

Hi there...I have two Flash sites and they are built using various .SWF files linked together. Google has indexed each component individually and I am actually getting hits to them, which I don't want. I added this line to my robots.txt:

User-Agent: Googlebot
Disallow: /*.swf$

The swf files are still indexed though...is there some kind of command I can give to google to drop these files from the index? Thanks!

doc_z

8:23 pm on Jun 14, 2004 (gmt 0)

As far as I know the syntax [robotstxt.org] isn't correct. You can use a robots.txt validator [searchengineworld.com] to check it.

ichthyous

11:18 pm on Jun 14, 2004 (gmt 0)

It validates fine.

drbrain

11:28 pm on Jun 14, 2004 (gmt 0)

So you keep all your flash content in a directory called "/*.swf$"?

In other words, the Disallow line does not allow regular expressions.
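
For illustration, here is a minimal Python sketch of the difference being described. Under the original robots.txt convention a Disallow value is compared as a literal path prefix, so "/*.swf$" only matches URLs that really begin with those characters. Some crawlers, Googlebot among them, document an extension in which "*" matches any run of characters and a trailing "$" anchors the end of the URL. The function names and sample paths below are made up for this example; this is not any crawler's actual code.

```python
import re


def prefix_blocked(rule: str, path: str) -> bool:
    """Original convention: the Disallow value is a literal path prefix."""
    return path.startswith(rule)


def wildcard_blocked(rule: str, path: str) -> bool:
    """Wildcard extension: '*' matches any characters, trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with '.*' where the rule had '*'.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start of the path, like a prefix rule does.
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    rule = "/*.swf$"
    for path in ["/intro.swf", "/index.html", "/*.swf$/movie"]:
        print(path, prefix_blocked(rule, path), wildcard_blocked(rule, path))
```

Read literally, the rule blocks nothing on a normal site; read with the wildcard extension, it blocks every URL whose path ends in ".swf".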

ichthyous

12:11 am on Jun 15, 2004 (gmt 0)

No, the swf files are at the root level of the domain. I grabbed this code from this site. What I want is to tell Google not to index any .swf files on the site, i.e. to make all swf files off limits. Is this not the correct code? Thanks

ichthyous

6:16 pm on Jun 15, 2004 (gmt 0)

Does anyone know the proper syntax to accomplish this?

doc_z

9:13 pm on Jun 15, 2004 (gmt 0)

I would place the swf files in a directory and exclude this directory.

Also, you can ask the question in forum93 (robots.txt).

ichthyous

9:32 pm on Jun 15, 2004 (gmt 0)

Thanks for the advice. Because of the way the Flash sites are built, I can't put them in another directory.

sfxmystica

11:28 am on Jun 16, 2004 (gmt 0)

Google also suggests this alternative: using the 'Robots' META tag.

Try adding this to your HTML file:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

The INDEX directive specifies whether an indexing robot should index the page. The FOLLOW directive specifies whether a robot is to follow links on the page.
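
As a rough illustration of how a robot might read that tag, here is a small Python sketch using the standard html.parser module. The class and function names are invented for this example, and real crawlers may interpret the tag differently; the sketch only shows that the CONTENT value is a comma-separated list of directives attached to an HTML page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
    print(sorted(robots_directives(page)))
```

Because the tag lives in HTML markup, it applies to the page that contains it; a standalone .swf file has no place to carry it.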

Google also says in its FAQ: "In order to save bandwidth Googlebot only downloads the robots.txt file once a day or whenever we have fetched many pages from the server. So, it may take a while for Googlebot to learn of any changes that might have been made to your robots.txt file. Also, Googlebot is distributed on several machines. Each of these keeps its own record of your robots.txt file."

"The swf files are still indexed though...is there some kind of command I can give to google to drop these files from the index?"

From the Google FAQ:

<META NAME="ROBOTS" CONTENT="NOARCHIVE">

This tag will tell robots not to archive the page. Google will continue to index and follow links from the page, but will not present cached material to users.

Removing a page from Google's index:

".... Google's policy for removing a page from our index requires that we obtain the permission of that page's webmaster .... we will remove the offending page from our index. For more information on this process, please see [google.com...] "

You can get more details about all this and more here:
Google FAQs [google.com]

tombola

12:05 pm on Jun 16, 2004 (gmt 0)

"because of the way the flash sites are built I can't put them in another directory"

If you can rename all your .swf files, there is a solution.
Make all your .swf filenames start with the same letters (for example "swf", so "intro.swf" would become "swf-intro.swf").
Then you can add these lines to your robots.txt file:

User-Agent: Googlebot
Disallow: /swf

Make sure that only .swf files start with those letters, because the rule blocks every URL at the root whose path begins with "/swf".
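
One way to sanity-check a prefix rule like this before renaming anything is Python's standard urllib.robotparser module, which applies plain prefix matching. This is only a sketch: example.com and the file names (including swfobject.js) are hypothetical, and the module's answer is not guaranteed to match what Googlebot itself does.

```python
from urllib.robotparser import RobotFileParser

# The rules proposed above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-Agent: Googlebot
Disallow: /swf
"""


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Check a path against ROBOTS_TXT for the given user agent."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # example.com is a placeholder host; only the path matters here.
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in ["/swf-intro.swf", "/index.html", "/swfobject.js"]:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

The third path shows the caveat: any other file whose name happens to begin with "swf" is blocked as well.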

ichthyous

1:41 pm on Jun 16, 2004 (gmt 0)

The robots exclusion tag doesn't work because I need Google to archive the HTML page the main swf is embedded in. The problem is that Google is also spidering the swf files that are NOT embedded in HTML, and which I don't want to appear in the index. Also, renaming the swfs won't work as it would require all of the actionscript to be updated for the various swf files to talk to each other.