Forum Moderators: Robert Charlton & goodroi


Can Google send traffic without providing Google the content?


NickMNS

6:09 pm on May 24, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Can Google send traffic to a website without that website giving Google access to the content?

This question stems from another thread whose premise is that Google is taking content from websites and providing answers directly to users, such that users no longer need to visit the website from which the data was collected. It would therefore seem logical that limiting the information provided to Google would prevent, or at least make more difficult, its use of that information to answer users directly.

So then the question is: will limiting the information provided to Google limit its ability to rank and understand the nature of the website?

Access to content could be limited by placing it behind some wall (a paywall or password protection), or by displaying it as graphics, visualizations, or some other format that is difficult for Google to crawl.
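For the paywall route specifically, Google documents structured data that declares which part of a page is behind the wall, so that the difference between what Googlebot and users see is not treated as cloaking. A sketch (the headline and CSS class are made up); note that under this scheme Google still expects to be served the full text, which is exactly the trade-off being discussed:

```html
<!-- Sketch of paywalled-content markup per Google's guidance.
     The headline and ".paywalled-section" selector are hypothetical. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article",
  "isAccessibleForFree": "False",
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": "False",
    "cssSelector": ".paywalled-section"
  }
}
</script>
```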


Here is the reference thread:
[webmasterworld.com...]

Wilburforce

1:26 am on May 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



will limiting the information provided to Google limit its ability to rank and understand the nature of the website?

Yes.

Under the current indexing model, Google cannot index - how would it? - material it cannot access.

Under the model proposed by the paper referenced in that thread, it would be unable to incorporate material it cannot access in any synthesis of sources.

Nobody is going to use your ingredients to bake a cake if you lock the pantry and hide the key.

NickMNS

2:14 am on May 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To further the cake analogy.
Nobody is going to use your ingredients to bake a cake if you lock the pantry and hide the key.

The goal is exactly that: you don't want Google to take your cake ingredients, but at the same time you would like Google to know of the ingredients in the pantry, say by putting them on a high shelf where it can see them but not reach them.

But that is not really what I mean. More concretely, say you take some data and derive useful information from it. You can display that information so it is easily and conveniently consumed by humans, for example in some kind of visualization or tool. The problem is that Googlebot can't consume the visualization, so you must also provide the information in a format Googlebot can consume: either text or, better yet (for Google), some schema markup. Now Google can not only know what your "visualization" is about and index it, but also take your information and use it for its own purposes, like creating its own "visualization". And if the information in question is factual, you don't even have a copyright claim.
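As a concrete illustration of that trade-off, here is the kind of schema markup that would make a derived figure machine-readable; everything in it is hypothetical:

```html
<!-- Hypothetical example: exposing derived data as schema.org markup.
     Once published this way, the facts are trivially machine-readable
     and reusable, which is exactly the dilemma described above. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Widget price index (derived)",
  "description": "Monthly index derived from raw widget listings",
  "variableMeasured": "price index"
}
</script>
```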

You do the work of processing the data and deriving useful information, but then, due to Google's dominance of internet search, you are forced to provide your product to them for free. What other option do you have, buy ads on Facebook?

So really the question becomes: what is the minimum amount of information you need to reveal to still be indexed by Google and rank for relevant searches?

And to make matters worse, you are not alone: your competitors may not share your strategy. They may feel that giving Google everything offers them a short- or medium-term benefit.

Catch-22.

iamlost

3:28 am on May 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This trajectory of Google hoarding rather than referring traffic, of transitioning from a search engine to an answer machine, is not new. Some of us, here on WebmasterWorld, have been speaking to this topic for over a decade. And a few of us have been building solutions.

Of course, just as no two sites or business models are the same, there is no cookie-cutter, one-size-fits-all solution. However, a perceptive person may perhaps find a way that fits them.

First, before I continue, let me get the mandatory overarching disclaimer out of the way: there is more to search traffic than Google, search traffic is typically the worst-converting traffic, and Google's is usually the worst of that. There, said it. Back to the question at hand.

From NickMNS’s linked post the bit that resonated with me was:

There is no longer a business case for creating linear or static informational websites, where one simply provides content, e.g. like a book. Instead, information will need to be provided to the user in a more interactive or dynamic way, for example as visualizations or customized content. This will make it impossible for Google to use the information to feed its systems and will make the website content more appealing to the user.

And from the above OP:

So then the question is: will limiting the information provided to Google limit its ability to rank and understand the nature of the website?

Just as Google has been down the personalisation rabbit hole, so too have some webdevs. For an insight into my journey see, from May 2017, Analytics is THE engine of change [webmasterworld.com], and especially the additional links in my second post in that thread.

The critical light bulb moment for me is summed up in the following:

Instead of thinking structure/semantics -> content -> presentation/behaviour, for each target also remove structure/semantics from the 'page', such that a given URL is totally amorphous. Think instead: context -> content -> structure/semantics -> presentation/behaviour.

What this means, in practice, is that for each page (URL) there is default content, typically text with graphics/images, that is served to first-time visitors and SE bots. Basically much like what everyone has published/is publishing, except of much higher quality/value, of course. :)

However, for identified return visitors, or at any time for visitors from select referrers, contextually different content may be shown.

Further, for all human traffic, additional content or variations on content (e.g. slideshows, videos, interviews) may be added to pages beyond the landing page.

SE bots get quality content that allows full indexing of those pages I want to share with them. Human visitors get at least that, often presented/combined in a personalised context, and frequently with additional rich content.

SE bots get fed the bog-standard site map; human visitors get to mix and match, go down other routes, and move back and over and all about the content in a highly variable way. It's all about how content is chunked, associated, and linked. And stomping on SE bots' toes as appropriate.

And it works. So far. Google search referred traffic has, with a couple short exceptions, been up YoY every month for two decades.

Google loves me, yes they do. Although they do know much is blocked from their crawl and disallowed from their public index and they do complain now and again. :)

Just a mo while I knock my wooden head...

Wilburforce

5:46 am on May 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The goal is exactly that: you don't want Google to take your cake ingredients, but at the same time you would like Google to know of the ingredients in the pantry, say by putting them on a high shelf where it can see them but not reach them.


Well, I suppose I have done this myself in a sense (and, ironically, it is my only page to have retained the #1 spot for relevant searches): I have a page in which users can enter values for common measurements in my sector, and JS formulae return useful related values. The page appears to bots as it does to visitors, but the JS itself is inaccessible to anyone. Thus anyone, including bots, asking a specific value-based question can get the correct (value) answer.

The formulae themselves are not secret (although some of them require quite arduous and complex iterations to perform manually), but as far as I know nobody else has bothered to encode them in JS. The JS was blocked from both bots and users not as a deterrent to Google, but to prevent plagiarism, to which the information section of my site is particularly prone.
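This pattern, static indexable HTML plus client-side computation, can be sketched minimally. The formula below is a hypothetical stand-in (Newton's method for a square root plays the part of an "arduous to perform manually" iteration); the real page's formulae are sector-specific:

```javascript
// Sketch of a client-side value tool in the spirit described above.
// Newton's method for a square root stands in for a sector-specific,
// tedious-by-hand iterative calculation (hypothetical example).

function derivedValue(x, tolerance = 1e-10) {
  if (x < 0) throw new RangeError("input must be non-negative");
  if (x === 0) return 0;
  let guess = x;
  // Iterate until the result is self-consistent within the tolerance.
  while (Math.abs(guess * guess - x) > tolerance) {
    guess = (guess + x / guess) / 2;
  }
  return guess;
}

// On the real page this would be wired to a form, e.g.:
//   input.addEventListener("change",
//     () => output.textContent = derivedValue(Number(input.value)));
// Bots see the same static form markup; the computed answer only
// exists after user interaction.
```

Serving the same markup to bots as to users, as noted above, keeps this sort of tool on the safe side of cloaking.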

There are probably other types of interactive content that can work in a similar way, but returning to the cake analogy, this is more like having a finished product in the pantry (a bottle of Magic Spice) on which the ingredients might be listed, but the method of preparation is not disclosed.

A caveat: my own example serves the same thing to Google as it does to users (and something like iamlost's example probably does, in the sense that googlebot doesn't bring any personal preferences to the page). Doing otherwise is risky.

NickMNS

6:03 pm on May 27, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a page in which users can enter values for common measurements in my sector, and JS formulae return useful related values. The page appears to bots as it does to visitors, but the JS is inaccessible to anyone. Thus anyone - including bots - asking a specific value-based question can get the correct (value) answer.


I have done the same in several instances. In one case, when I first launched the tool I got no traffic; it was as if the tool did not exist to Google. I then added a tabular representation of some of the tool's values, with the results in the table linked back to the tool. After implementing that, Google seemed to become aware of the tool and it began to get traffic. On another project I put together another tool but never added any additional descriptive content, and to this day that website gets no traffic. Now, I have neglected the latter site on several fronts, so I doubt one can draw a definite conclusion.
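The tabular fallback described here might look something like the following; the values, units, and URL are entirely made up:

```html
<!-- Static, crawlable excerpt of the tool's output. Each figure links
     back to the interactive tool; values and paths are hypothetical. -->
<table>
  <caption>Sample results (computed by the tool)</caption>
  <thead>
    <tr><th>Input</th><th>Derived value</th></tr>
  </thead>
  <tbody>
    <tr><td>10 units</td><td><a href="/tool?value=10">4.2</a></td></tr>
    <tr><td>25 units</td><td><a href="/tool?value=25">10.5</a></td></tr>
  </tbody>
</table>
```

The table gives Googlebot plain text to index and internal links to discover the tool through, without exposing the full computation.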

I have a third website, my main website, where I have also created such a tool: a comparison engine. It performs pretty well, but the "entities" being compared each have their own page on the site, and all the JS tool does is allow the user to compare several entities. I have serious doubts that I could get Google to rank the tool without the individual pages adding "content value", despite the fact that the content is essentially the same.

@iamlost,
What you describe is impressive and sounds elaborate, but I'm not sure it would apply to me. My content is more one-dimensional; I don't have much to offer in terms of:
eg slideshows, videos, interviews,


But once you have that kind of content, you can also move towards a subscription model, where you have less of a reliance on Google.

phranque

3:47 am on Jun 1, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



technically speaking google will index urls it can't crawl if it discovers something interesting through a link context, but i doubt you can optimize for organic search traffic without showing your content.
for one thing, they are unlikely to display a relevant description in the search snippet without seeing the content on the page and/or the meta description element.
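As a footnote to phranque's point: blocking crawling and blocking indexing are different levers, and a crawl-blocked URL can still be indexed (URL-only, with no snippet drawn from the page) if other pages link to it. A sketch, with hypothetical paths:

```text
# robots.txt sketch (paths are hypothetical).
# Disallow blocks crawling, not indexing: /tool/ may still appear in
# results via external links, just without a content-based snippet.
User-agent: Googlebot
Disallow: /tool/

# To keep a page out of the index entirely, it must stay crawlable
# so that a noindex directive can actually be seen:
#   <meta name="robots" content="noindex">
```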