New Google Algo Creates Original Content from Your Content

         

martinibuster

2:30 pm on May 17, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



This algorithm takes topics from Wikipedia, queries the search engine for documents, then outputs summaries as wholly original content.

Google AI has published a research paper on a new algorithm that can take documents from the web, summarize the content, then generate a 100% original Wikipedia page entry from it.

It uses technology similar to the summarization algo used to create a featured snippet. Then it uses another kind of algorithm to generate a paraphrased web page.

The algo uses Wikipedia topics as a starting point. For the paraphrased content, they tested the algorithm with two sets of source documents: one using web content and another using just the citations linked from Wikipedia.

The research paper does not discuss the ethical issues.

The full article is here:

[searchenginejournal.com...]

Travis

5:33 pm on May 17, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



In the near future, Google will drop its search engine and just provide the answers using these algorithms. Someone once mentioned how it would be the end of times if Google bought Wikipedia... see, they do not need to do this. And they can certainly produce even more content, even faster than Wikipedia.

almost a third of the summaries contain fake facts.

hum...

keyplyr

6:43 pm on May 17, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



They're not doing that now?

Travis

6:45 pm on May 17, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



They're not doing that now?

Not at this scale, so imagine what it will be like "later".

keyplyr

7:28 pm on May 17, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Rhetorical

engine

9:56 am on May 18, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If you want to read the research papers, here are two links.

Faithful to the Original: Fact Aware Neural Abstractive Summarization [arxiv.org...] (PDF)

GENERATING WIKIPEDIA BY SUMMARIZING LONG SEQUENCES [openreview.net...] (PDF)

Automation: It reminds me to double-check the information found on Google.

Travis

11:16 am on May 18, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



We might also think about what "original" content means. To me, rehashing and rewording others' content is not something we can call "original".

nomis5

5:59 pm on May 18, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In the end this will be self-defeating for Google. If they rehash several articles into one of their own, and then splash links at the top of the SERPs to funnel their users into selecting their article, what point is there in maintaining an independent website? And then where do Google get their source information from?

AI is clearly a force that has the potential to destroy original thought if Google is able to promote their own AI-created articles built from the content, hard work and original thought of others.

Maybe I'm a fool, but at the moment they won't do this to my website. My whole website depends on user input, which then alters the content accordingly.

The fact is though that Jo Public has no idea how manipulated he / she is by massive corporations like Google. And, until Jo Public realises this, and history shows that he / she won't until it's too late, we are fast approaching a Big Brother situation of massive proportions.

The attitude of "I want it now and I want it quick" at all costs will cost society dearly. But how to educate Jo Public, I have no idea.

martinibuster

6:23 am on May 19, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Imo, this fits neatly into their Voice Assistant program.

So rather than envision Google returning their own web page, think about it instead as someone asking the Computer an informational question then receiving a spoken answer.

Wringing hands over https and mobile friendliness is the past. Voice Assistant is the future. That's what you should be building for. This, imo, is part of that paradigm.

keyplyr

6:30 am on May 19, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Voice Assistant is the future. That's what you should be building for
As I've suggested many times.

graeme_p

2:55 pm on May 19, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



almost a third of the summaries contain fake facts.


Which will not prevent it being a commercial success - quite the contrary I would have thought.

what point is there in maintaining an independent website? And then where do Google get their source information from?


From non-commercial sources, and sources that do not make money directly from visitors. A lot of people will want to express their opinions, share their knowledge etc. and will be happy with small numbers of visitors.

EditorialGuy

11:34 am on May 20, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



And some news organizations are using "newsbots" to write stories:

[wired.com...]

In the near future, Google will drop its search engine and just provide the answers using these algorithms.

Only if they're extremely foolish. (Would *you* abandon a successful, multi-billion-dollar business?)

keyplyr

11:44 am on May 20, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Would *you* abandon a successful, multi-billion-dollar business?
Do I have to answer right away?

graeme_p

7:54 pm on May 20, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



We might also think about what "original" content means. To me, rehashing and rewording others' content is not something we can call "original".


Agreed. Is that not rather the point of the discussion?

engine

11:47 am on May 21, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Google's Danny Sullivan seemed to pour cold water on the idea of Google actually doing this right now.
We publish research papers all the time on many topics. This is from Google Brain, not Google Search. We've often said any Google research papers in general shouldn't be assumed to be something that's actually happening in search.


[twitter.com...]

Shaddows

1:24 pm on May 21, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That is a non-denial denial.

It is generally accepted that if there is no documented basis for Google doing something (patent, research paper), then they are not doing it. That is a hurdle that has now been overcome in the field of auto-spinning.

Philosophically, I'm not 100% sure if you can differentiate between a human-authored Wiki article and a spun one. I mean, Wiki is not supposed to be original research, but sourced content. Obviously, you can insist a human is involved creatively, or else it is not original, but that does not seem to be an intellectually satisfying way of defining "originality".

martinibuster

6:20 pm on May 21, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



There is nothing in the article to pour cold water on.

The article is a report on the algorithm itself. How it works, what it does, and toward the end, how it COULD fit into search.

The article never states that it is in use.
The article begins:
Google has published research of a new algorithm...


Then the article explains the algorithm, piece by piece.

I then discuss what will be summarized:

Is Google’s Algorithm Summarizing Your Content?
The research paper is silent on whether Google will show their own content created from your content.


I again clarified that there is no indication of when or if it will be used:
There is no word yet when or if Google will begin generating its own content from your content.


Danny is reciting their standard boilerplate that there is nothing to see and to move along.

But there is something to see.

LifeinAsia

6:48 pm on May 21, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Google AI has published a research paper
Does this mean that someone from the Google AI team published this, or that the algorithm itself published the research?

A few years ago, one wouldn't need to ask because the answer would be obvious- AI wasn't advanced enough to do that.

A few years from now, one won't need to ask because the answer will be obvious- humans will have devolved too much to do that...

[edited by: LifeinAsia at 7:47 pm (utc) on May 21, 2018]

EditorialGuy

7:01 pm on May 21, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



IMO, they set the bar pretty low with Wikipedia articles as the finished product.

engine

11:55 am on May 22, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The inference was that this was doable, and that brought out the comments which, mostly, imho, were correct.

The source of the data is the key, and if sites/webmasters and publishers take down the source the result will be a weak page, and unverified, too.

Shaddows

12:37 pm on May 22, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The source of the data is the key, and if sites/webmasters and publishers take down the source the result will be a weak page, and unverified, too.


But that's the same as Wikipedia. Wiki is supposed to be sourced; unsourced content is supposed to be removed.

What is the difference between human-devised sourced content and machine-devised sourced content, from the view of "originality"?

Additionally, I would expect G-pedia to be no worse than Wikipedia in reproducing factual errors. And probably a little less biased than Wiki has become (for example, the US culture wars have a clear winner in Wikipedia).

engine

1:33 pm on May 22, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I was generalising beyond Wikipedia.

Shepherd

10:04 am on May 31, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



IMO, they set the bar pretty low with Wikipedia articles as the finished product.

No need to set it higher when Google has spent the last 10+ years grooming the average searcher to see Wikipedia as the "best" (number 1 result in many searches) information available on the internet. Everything is moving along doubleplusgood.

I am seriously considering buying a printing press and storing it in an old barn somewhere...