
How to delay loading elements to increase page speed

without negatively impacting Google's ability to crawl the content


NickMNS

7:53 pm on Jun 30, 2017 (gmt 0)




I have a page that loads relatively fast considering all the content it contains. I want to add some links to the page, but determining which links to show requires an additional round trip to the database. These would be "similar content" type links. From the user's perspective, I doubt they would care whether the links load immediately or at some point after the initial page load (5 or 6 seconds later, let's say).

The problem is that half the point of showing those links is to allow Googlebot to crawl more pages of my site. So the question is: if I use AJAX to load the links after the initial page load, will Googlebot see and crawl those links?

The page is currently set up so that the graphics that appear below the fold are not generated until after the user scrolls. If a user comes to the page and bounces, they never see the graphics, and no resources or page load time are wasted on generating them. This keeps my page load time reasonable. But when I fetch the page in SC, the graphics do not appear, and I assume that Googlebot never sees them. For the graphics I am okay with that, as they are SVG objects that do not get indexed as images, and being graphics they do not really provide much content to the page from Google's perspective. All the graphics are described with text, and that text appears at page load, so the content that counts is indexed.
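(For illustration, a deferred setup like the one described could be wired up with a one-time scroll handler; this is a minimal jQuery sketch, and "#chart-placeholder" and "/chart-svg-url" are hypothetical names, not taken from the post.)

// Generate the below-the-fold SVG chart only after the first scroll.
$(window).one('scroll', function() {
    var chart = $('<object type="image/svg+xml"></object>')
        .attr('data', '/chart-svg-url');    // hypothetical endpoint returning the generated SVG
    $('#chart-placeholder').append(chart);  // hypothetical placeholder element below the fold
});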

So what is the solution here: take the hit and load the links with the page, or take a chance and load the links after "document ready"?

robzilla

9:05 pm on Jun 30, 2017 (gmt 0)




Googlebot doesn't render your page like a browser would, at least not during the crawling phase, so it would not see or index the content you add dynamically using Javascript.

Similar content type links tend to be relatively static (unless heavily personalized), so they're good candidates for caching. Have you looked into that?

How much would the extra query cost you?

NickMNS

9:35 pm on Jun 30, 2017 (gmt 0)




so it would not see or index the content you add dynamically using Javascript.

I'm not sure that is true. But come to think of it, this would be easily testable. I will add the links with an AJAX call at document ready and see if they appear with fetch as Google. The graphics I describe above do not appear because they require user action.

they're good candidates for caching.

At this stage of my project I am adding a lot of content, so the links are subject to change frequently; at some point in the future it may be an option to consider.

How much would the extra query cost you?

Not sure. As mentioned above, I am adding content to the db, so even if I test now, the query will certainly take longer in the future. But I haven't tested anything yet; I am just trying to plan the implementation.

robzilla

10:02 pm on Jun 30, 2017 (gmt 0)




I'm not sure that is true

Neither am I, after reading this article [searchengineland.com] from 2015 over at Search Engine Land.

But it's hard to say whether Google would value the dynamically inserted links (if it could indeed find them) the same as plain-text ones.

You will be introducing extra overhead when you dynamically fetch and insert the links, of course. It sounds like a premature optimization.

since they are graphics do not really provide much content to the page from Google's perspective

I think Google does value images, actually. Probably because users do, too. I've always seen having original images as a plus.

Have you considered using a sitemap to get Googlebot to crawl more pages?

NickMNS

11:30 pm on Jun 30, 2017 (gmt 0)




There is no doubt that images are valuable, but these are not standard images. They are interactive bar charts, written in SVG and placed on the page using the <object> tag. Google does not index these as images, and I am not sure what the code offers in terms of content value. (My understanding is that SVG is crawled and indexed as text content, but the amount of actual text is very limited.) As mentioned in the OP, for each chart I provide a written summary of its content, so I assume (and yes, I may be wrong) that the summary text is sufficient. Obviously, from the user's perspective the charts are much more interesting, and the hope is that this will lead to more shares and links. So there is value.

I have been struggling to get this site off the ground, but I think that is due mostly to crawlability and to the fact that I am still feeding the db with relevant data.

At any rate I am going to work on adding the links and then I will test it. I'll let you know how it turns out.

NickMNS

5:07 pm on Jul 1, 2017 (gmt 0)




I implemented the change on the page. Specifically,

// After the full page load, fetch the "similar content" links and inject them.
$(window).on('load', function() {
    $.post("/my-url",                     // endpoint that returns the links as JSON
        $("#form").serialize(),           // parameters identifying the current page
        function(j) {
            if (j.ok) {
                $("#added-links").html(j.data);  // j.data holds the ready-made link HTML
            }
        });
});

This adds links (typically one or two) to the page after it has fully loaded.
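(For reference, the callback above only assumes that /my-url answers with JSON of roughly this shape; the URLs and anchor text here are illustrative.)

{
    "ok": true,
    "data": "<a href=\"/similar-page-a\">Similar page A</a> <a href=\"/similar-page-b\">Similar page B</a>"
}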

The experiment

Methodology:
A group of pages was selected to meet the following criteria:
- The pages must use/benefit from the feature; that is, for each submitted page at least one other page exists that is linked from it, and the only link to that other page is the AJAX-injected link.
- None of the pages in the group was ever in the index or was ever submitted to the index in the past.
- There must be a reasonable expectation that, once submitted, the pages will be accepted into the index.
- The page to submit and the linked pages are similar, in that one could easily be substituted for the other with little impact, but sufficiently different that the pages would not constitute duplicate content. (Example based on an e-comm page: one page is for a specific model of a product and the other page is for a different model of the same product; both pages display the information and specifications for the product, but that information varies with the specifics of the model.)

The page to submit was tested for speed before and after the change with PageSpeed Insights and with the performance tab in Chrome (Opera) developer tools.

Only once the changes are implemented is the "page to submit" run through the Fetch and Render tool in GSC. The two rendered page views, "How Googlebot sees the page" and "How a visitor to your website would have seen the page", are compared to see whether the injected links appear in both. Then the page is submitted to the index using the "Indexing requested for URL and linked pages" option. Once it has been submitted, a site: search is done to see whether the pages in the group appear in the index.

Results:
There was no measurable impact on page speed: the PageSpeed Insights score did not change from before to after, and there was no change in load speed up to the DOMContentLoaded event when checking performance with the developer tools. Obviously, after the DOMContentLoaded event there is additional loading as the AJAX call for the links and the rendering of the links into the page are executed. In this case the AJAX call takes less than 50ms.

Google fetch and render shows the links in both views.

The page was submitted to the index, and after a short time (1 minute) a "site:" search was done: only the submitted page was found; the linked page was not included. The search was repeated 30 minutes later, and there was no change.

Conclusion:
Googlebot is able to find content that is loaded after the DOMContentLoaded event is triggered, as the links appear in both views of the Fetch and Render feature of GSC. But the linked content is not immediately indexed when submitted using the "Indexing requested for URL and linked pages" option. But there are some unanswered questions:

1- Will the linked content appear in the index given more time?
2- Is the fact that the content is linked after the DOMContentLoaded event the cause, or is linked content in general, even links that appear in the static HTML of the page, less likely to be indexed when submitted via the Fetch and Render tool?

One more aspect was not controlled for: there was no check to see whether including the links directly in the page would have made a measurable impact on page speed.

These are my findings and I would love to hear some feedback.

robzilla

11:16 pm on Jul 1, 2017 (gmt 0)




Googlebot is able to find content that is loaded after the DOMContentLoaded event is triggered, as the links appear in both views of the Fetch and Render feature of GSC.

This isn't how Googlebot crawls pages, though. The fetch & render tool will retrieve all page contents as if it were a browser with an empty cache. Googlebot tends to slowly crawl one resource at a time, and not necessarily in a logical order. The rendering/interpretation part probably doesn't take place simultaneously with fetching, but that's just my assumption.

My own experience with the Submit to Index option is that indexing usually happens within several minutes, at which point Googlebot won't even know, and apparently doesn't care, if the page is even linked to by any other pages.

If the AJAX request takes 50ms including latency, it's unlikely to influence your page rendering at this point. To keep it that way, you'll want to make sure you have the proper database indexes in place, and optionally add caching for when the dataset grows much larger.
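(As an illustration of the caching idea, here is a minimal JavaScript sketch assuming a Node-style backend; getSimilarLinksFromDb and the 10-minute TTL are hypothetical, since the thread never says what the server runs.)

// Cache the "similar links" query result per page so repeated requests skip the DB.
const cache = new Map();
const TTL_MS = 10 * 60 * 1000;  // hypothetical 10-minute lifetime

async function getSimilarLinks(pageId) {
    const hit = cache.get(pageId);
    if (hit && Date.now() - hit.time < TTL_MS) {
        return hit.links;                               // fresh enough: no DB round trip
    }
    const links = await getSimilarLinksFromDb(pageId);  // hypothetical expensive query
    cache.set(pageId, { links: links, time: Date.now() });
    return links;
}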

lucy24

1:33 am on Jul 2, 2017 (gmt 0)




The rendering/interpretation part probably doesn't take place simultaneously with fetching, but that's just my assumption.

Well, it can't be literally simultaneous, since crawling and interpretation (“rendering”) and indexing are all different processes performed by different computers in different places, all running their own programs.

When you do a Fetch and Render request, make a note of the time and check it later in your logs. You'll find two sets of requests: one from the Googlebot, one from a humanoid UA (currently something like Chrome/29).

“What the googlebot sees” is really a pretty meaningless phrase. It’s shorthand for “What a human would see if their browser were subject to the same robots.txt exclusions and access-control rules that govern the Googlebot”.

robzilla

11:13 am on Jul 2, 2017 (gmt 0)




Right, and that's the main purpose of the Fetch as Googlebot tool; it's no indication of the extent to which Googlebot is able to discover links added via Javascript, for example. As you say, that's a different process. You're basically just looking at a screen shot made by a headless browser.

And no, strictly speaking, fetching and rendering is never literally simultaneous :-)

lucy24

7:36 pm on Jul 2, 2017 (gmt 0)




Maybe a better word would be “concurrent”. That’s how a human browser works: It displays everything as soon as it becomes available. (Some years back, when I converted a bunch of old games into Javascript, one point of difficulty was that there’s no "delay" function: “Wait three seconds and then do suchandsuch.” You have to do convoluted workarounds like “Wait three seconds and then run this other function”--and then make sure nothing else is allowed to happen in the meantime that would mess with the output.)
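(The workaround described above is essentially setTimeout: schedule the follow-up work as a callback rather than blocking. A minimal sketch:)

// There is no blocking "wait three seconds" in JavaScript, so the
// follow-up work goes into a callback scheduled with setTimeout.
function doSuchAndSuch() {
    console.log('three seconds later');
}
setTimeout(doSuchAndSuch, 3000);  // everything after this line keeps running in the meantime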

NickMNS

2:24 am on Jul 4, 2017 (gmt 0)




This isn't how Googlebot crawls pages, though. The fetch & render tool will retrieve all page contents as if it were a browser with an empty cache. Googlebot tends to slowly crawl one resource at a time, and not necessarily in a logical order. The rendering/interpretation part probably doesn't take place simultaneously with fetching, but that's just my assumption.

So if what you are saying is true, and I am not really doubting it at this point, then what is the purpose of differentiating between "What Googlebot sees" and "What users see"?

My assumption/understanding was that if the Googlebot view in Fetch and Render shows the content, then Googlebot could crawl it. This appears not to be the case.

Now, what is ironic is that Googlebot is showing me a 405 error in SC for the URL used for the AJAX call in the script noted above ("/my-url"). It's a 405 since they are trying to GET a URL that is limited to POST.
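(One way to handle that crawl attempt is to answer GET on the endpoint explicitly. This is a sketch assuming an Express-style backend, since the thread never names the server stack; app, the body-parsing middleware, and buildSimilarLinksHtml are all assumptions.)

// POST is the real entry point used by the AJAX call above
// (assumes body-parsing middleware is configured).
app.post('/my-url', function (req, res) {
    res.json({ ok: true, data: buildSimilarLinksHtml(req.body) });  // hypothetical helper
});

// Answer GET attempts (e.g. from Googlebot) with an explicit 405 and an Allow header.
app.get('/my-url', function (req, res) {
    res.set('Allow', 'POST').sendStatus(405);
});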

lucy24

4:51 am on Jul 4, 2017 (gmt 0)




The fetch for “What a human sees” uses a different user-agent--its UA string does not contain the component “Googlebot”--and it is not subject to robots.txt. If you request a roboted-out page, only the humanoid fetch will take place ... and they won't display the result at all (possibly because they don't want to remind us that Google is perfectly capable of fetching anything it wants, at any time). This tells us something about how the “Fetch as Googlebot” function is coded at their end.

405 eh. Lately I've been seeing a fair number of 408s. It's a mistake--but it's not my mistake, so tough patootie.