Google Crawling and Indexing of Frames

6 questions to help set the record straight

         

swerve

1:50 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



I have read many conflicting opinions about how Google handles frames. My questions below are an attempt to reach consensus on some of the issues surrounding frames. For the purposes of this thread, it would be great to limit the discussion to crawling, indexing, and PR effects; other "issues" with frames (usability, site management, orphaned pages, etc.) would be better left for a separate discussion.

1. What gets indexed, the frameset part or the <noframes> part (assuming both exist)?

2. Does Googlebot follow frame references (i.e. <frame src="http://....">)?

3. Does Googlebot follow links within the <noframes> part of the page?

4. If the answer is yes in 2 and/or 3 above, does the frameset page require a minimum PR level in order for Googlebot to follow such links?

5. Does PR pass through a "<frame src..." reference? (For example, if a frameset page has PR5, with a single frame reference and no <noframes> tags, does the frame source get the benefit of the PR5 "backlink"?)

6. What PR effect, if any, occurs as a result of "frame repositioning", i.e. scripts that ensure that a frame is displayed only within its parent frameset? Do such "redirects" pass all the PR to the frameset page (and none to the framed page)? Does it matter whether the script is client-side or server-side?

Receptional Andy

1:55 pm on Apr 25, 2003 (gmt 0)



My snappy answers (without any evidence ;))

1. Typically, the page that calls the frameset and also the pages referenced in the frameset.

2. It didn't use to, but it does now.

3. Yes.

4. Not in any special way just because it is a frames page.

5. I would say that it probably does, but this is extremely difficult to verify.

6. The same effect as with any other redirect. If it's JavaScript, Google won't even follow it; if it's server-side, then it depends on many other factors.

John_Caius

1:59 pm on Apr 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google has indexed my frameset but its cache for that page is completely blank. It has followed all the links (100ish) in the noframes content and cached them normally, although as orphan pages (I have no JavaScript wizardry).

swerve

2:07 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



Google has indexed my frameset but its cache for that page is completely blank.

John, I have a few pages that show the same, although looking at the source of the cached page does reveal the complete HTML of the frameset page (including the noframes part). Care to share the PR of your frameset page? Thanks.

Receptional Andy

2:07 pm on Apr 25, 2003 (gmt 0)



>>Google has indexed my frameset but its cache for that page is completely blank

The page will appear to be blank in the cache. Look at the source and you will see your frameset there. It looks blank because the frameset code has to come between the <head> section and any <body> tag, and Google's code at the top of the cached page breaks the frames code. It's nothing to do with PR, just simple HTML ;)

Condor12

2:12 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



Does Google penalise a site if the individual pages within the frameset check that they are in the frame and redirect to the frameset if not?

Receptional Andy

2:13 pm on Apr 25, 2003 (gmt 0)



>>Does Google penalise a site if the individual pages within the frameset check that they are in the frame and redirect to the frameset if not?

This question has already been asked and answered on this page. Use JavaScript and Google will not even know the redirect is there, because it doesn't try to read JavaScript (currently).

jjdesigns4u

2:48 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



So to avoid the framed pages becoming orphans in the listings, you guys put a script on each page that, when the page is loaded, makes sure the frameset is there...

Can I get a copy of this script from someone?

thanks!

Condor12

3:42 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



Hi jjdesigns4u,

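Try something like this:

// If this page was loaded on its own (no frames at the top level),
// send the browser to the frameset page instead.
if (top.frames.length == 0) {
  top.location.href = "theframepage.html";
}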

jjdesigns4u

3:47 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



thank you very much

jetboy_70

4:19 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



Condor12's code will load the frameset, but not the page the user was actually requesting. To do that you'd need to pass a reference from the calling page. There's a Dreamweaver extension called Framejammer (available from [macromedia.com...]) that accomplishes this, and a series of very useful articles at:

[tech.irt.org...]

I butchered these sources for the code I used on a site earlier this week. It's not very adaptable, but it is fairly compact. Here's the email that I send out to my colleagues:

When frames sites are indexed on search engines, the individual frames are listed on the results pages, and can be clicked through to by the user without the site loading in the navigation frames, resulting in orphaned pages, a stranded user and no business for the client. Combat the problem with this code:

Firstly, make sure your frames are named, and that you call the frameJammer() function using the onLoad event. This will usually be in your index.html page:

<frameset rows="50,50" onLoad="frameJammer()">
<frame name="topFrame" src="topframe.html">
<frame name="bottomFrame" src="bottomframe.html">
</frameset>

This example has 'topFrame' as the navigation frame, and 'bottomFrame' as the content frame. The orphaned pages that we would expect to be listed on search engines should actually be loaded into 'bottomFrame'. Also, you need to add the frameJammer() function into the <HEAD> section of index.html:

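function frameJammer()
{
  // The query string looks like "?/path/page.html~bottomFrame"
  var pageURL = location.search;
  if (pageURL.indexOf("~bottomFrame") != -1)
  {
    pageURL = pageURL.substring(1);             // drop the leading "?"
    var tilde = pageURL.lastIndexOf('~');
    var pagePath = pageURL.substring(0,tilde);  // page to load
    var pageFrame = pageURL.substring(tilde+1); // frame to load it into
    // Look the frame up by name rather than building a string for eval()
    top.frames[pageFrame].location.replace(pagePath);
  }
}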

This function assumes the frame 'bottomFrame' is being used. If you are using a different name, you will have to change the reference in the function (but leave the tilde intact).

In the <HEAD> section of every content page (i.e. page that should load into 'bottomFrame'):

if (window.name != 'bottomFrame' && !((self.innerHeight == 0) && (self.innerWidth == 0)))
  top.location.replace('index.html?' + location.pathname + location.search + '~bottomFrame');

Once again, 'bottomFrame' is referenced. Change if needed.

How does it work?

The code in the content page checks to see if it's loaded into 'bottomFrame' (i.e. correctly positioned in the appropriate frameset). If not, it calls the index page, and sends a reference to itself (path to the file and any parameters) and a reference to 'bottomFrame' for the frameJammer function to access.
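As a rough illustration of that redirect step, here is a minimal sketch with the browser objects factored out so it can be run anywhere. The names buildFramesetUrl and needsFrameset are made up for this example; 'index.html' and the '~' separator come from the code in this post. The sketch leaves out the extra innerHeight/innerWidth check from the original.

```javascript
// Hypothetical helper: builds the URL a stray content page redirects to.
// 'index.html' and the '~' separator follow the post; the name is an assumption.
function buildFramesetUrl(pathname, search, frameName) {
  return 'index.html?' + pathname + search + '~' + frameName;
}

// Hypothetical helper: true when the page is NOT sitting in the expected frame.
function needsFrameset(windowName, expectedFrame) {
  return windowName !== expectedFrame;
}

// Example values standing in for location.pathname and location.search.
console.log(buildFramesetUrl('/products/widget.html', '?id=7', 'bottomFrame'));
// index.html?/products/widget.html?id=7~bottomFrame
```

In a real page the arguments would come from location.pathname, location.search and window.name, and the result would be handed to top.location.replace().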

The frameJammer() function checks to see if it needs to do anything (is '~bottomFrame' in the query string of the URL?), and if so, splits the parameter string back into its component parts, and uses the parts to load the page into the frameset correctly.
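The splitting step can be sketched on its own, without touching the DOM. This is only an illustration: parseFrameRequest and its allow-list argument are assumptions made for this example, not part of the Framejammer extension or the code in this post.

```javascript
// Hypothetical helper: splits a query string such as
// '?/page.html?id=7~bottomFrame' into the page path and the target frame.
// Returns null when there is nothing to do or the frame is not on the allow-list.
function parseFrameRequest(search, allowedFrames) {
  var query = search.substring(1);            // drop the leading '?'
  var tilde = query.lastIndexOf('~');
  if (tilde === -1) {
    return null;                              // no frame marker: ordinary visit
  }
  var pagePath = query.substring(0, tilde);
  var pageFrame = query.substring(tilde + 1);
  if (allowedFrames.indexOf(pageFrame) === -1) {
    return null;                              // unknown frame name: ignore it
  }
  return { pagePath: pagePath, pageFrame: pageFrame };
}

var request = parseFrameRequest('?/products/widget.html?id=7~bottomFrame',
                                ['bottomFrame', 'topFrame']);
console.log(request.pagePath);  // /products/widget.html?id=7
console.log(request.pageFrame); // bottomFrame
```

In the frameset page, a non-null result could then be applied with top.frames[request.pageFrame].location.replace(request.pagePath). Checking the frame name against a fixed list is a design choice of this sketch; it also covers the case of trapping more than one frame.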

If you need to trap more than one frame, just expand the if test in the frameJammer() function:

if (pageURL.indexOf("~bottomFrame") != -1 || pageURL.indexOf("~topFrame") != -1)

DavidT

5:16 pm on Apr 25, 2003 (gmt 0)

10+ Year Member



Am wondering if there is any possibility that Google would take a dim view of the frame jammer function on a hand check, since it prevents users from viewing the cached version of pages, doesn't it?

DavidT

7:08 am on Apr 26, 2003 (gmt 0)

10+ Year Member



Just today I saw Googlebot seemingly calling the frame-jamming function, like this:

"GET /?internal-page.html~mainframe HTTP/1.0" 200 13249 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

1. internal-page.html is the wrong case for this page; it should be Internal-Page.html.
2. What it actually pulled was the index page; the 13249 bytes is the size of that page.

I have never seen Googlebot doing this on my site before.
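For anyone wanting to check their own logs for requests like this, here is a small sketch that pulls the fields out of a log fragment shaped like the one quoted above. It assumes an Apache-style access log, where the number after the status code is the response size in bytes; parseLogFragment is a made-up name for this example.

```javascript
// Hypothetical helper: extracts method, URL, status and size from a log
// fragment of the form: "GET /path HTTP/1.0" 200 13249 ...
function parseLogFragment(line) {
  var match = /"([A-Z]+) (\S+) (HTTP\/[\d.]+)" (\d{3}) (\d+|-)/.exec(line);
  if (match === null) {
    return null;                               // line is not in the assumed format
  }
  return {
    method: match[1],
    url: match[2],
    status: Number(match[4]),
    bytes: match[5] === '-' ? 0 : Number(match[5])
  };
}

var entry = parseLogFragment('"GET /?internal-page.html~mainframe HTTP/1.0" 200 13249 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"');
console.log(entry.url);    // /?internal-page.html~mainframe
console.log(entry.status); // 200
console.log(entry.bytes);  // 13249
```

A URL with a query string ending in '~' plus a frame name is the pattern to look for.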