New Google Algo Creates Original Content from Your Content

     
2:30 pm on May 17, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14727
votes: 428


This algorithm takes topics from Wikipedia, queries the search engine for documents, then outputs original summaries in the form of wholly original content.

Google AI has published a research paper on a new algorithm that can take documents from the web, summarize the content, then generate a 100% original Wikipedia page entry from it.

It uses technology similar to the summarization algo used to create a featured snippet. Then it uses another kind of algorithm to generate a paraphrased web page.

The algo uses Wikipedia topics as a starting point. For the paraphrased content, the researchers tested the algorithm on two sets of source documents: one drawn from web content and another using just the citations linked from Wikipedia.
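
To make the two-step idea concrete, here is a rough toy sketch of the first (extractive) step only: given a topic and some source paragraphs, pick the paragraphs most relevant to the topic. The TF-IDF scoring, the function names and the sample paragraphs are my own assumptions for illustration, not code from the paper. The second step, the model that writes the paraphrased text, is not shown; the sketch only assembles the input such a model might receive.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_paragraphs(topic, paragraphs, top_k=2):
    """Toy extractive step (an assumption, not the paper's code):
    score each paragraph by the summed TF-IDF weight of the topic's
    words and return the top_k highest-scoring paragraphs."""
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n_docs = len(docs)
    query = set(tokenize(topic))

    def idf(word):
        # Smoothed inverse document frequency over the paragraph set.
        df = sum(1 for d in docs if word in d)
        return math.log((1 + n_docs) / (1 + df)) + 1.0

    def score(doc):
        total = sum(doc.values()) or 1
        return sum((doc[w] / total) * idf(w) for w in query)

    order = sorted(range(n_docs), key=lambda i: score(docs[i]), reverse=True)
    return [paragraphs[i] for i in order[:top_k]]


def build_generator_input(topic, paragraphs, top_k=2):
    """Join the topic and the selected paragraphs into one string.
    A real system would pass this to a trained text generator,
    which is deliberately left out of this sketch."""
    selected = rank_paragraphs(topic, paragraphs, top_k)
    return topic + "\n" + "\n".join(selected)


# Hypothetical sample data, for illustration only.
paragraphs = [
    "The giraffe is a tall African mammal.",
    "Stock markets fell sharply on Monday.",
    "A giraffe eats leaves from acacia trees; the giraffe has a long neck.",
]
selected = rank_paragraphs("giraffe", paragraphs, top_k=2)
```

With the sample data, the off-topic stock market paragraph is dropped and the two giraffe paragraphs are kept as input for the (omitted) generation step.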

The research paper does not discuss the ethical issues.

The full article is here:

[searchenginejournal.com...]
5:33 pm on May 17, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 92


In the near future, Google will drop its search engine and just provide the answers using these algorithms. Someone once mentioned that it would be the end of times if Google bought Wikipedia... see, they do not need to do this. And they can certainly produce even more content, even faster, than Wikipedia.

almost a third of the summaries contain fake facts.

hum...
6:43 pm on May 17, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11485
votes: 692


They're not doing that now?
6:45 pm on May 17, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 92


They're not doing that now?

Not at this scale, so imagine what it will be like "later".
7:28 pm on May 17, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11485
votes: 692


Rhetorical
9:56 am on May 18, 2018 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:25274
votes: 690


If you want to read the research papers, here are two links.

Faithful to the Original: Fact Aware Neural Abstractive Summarization [arxiv.org...] (PDF)

GENERATING WIKIPEDIA BY SUMMARIZING LONG SEQUENCES [openreview.net...] (PDF)

Automation: It reminds me to double-check the information found on Google.
11:16 am on May 18, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 92


We might also think about what "original" content means. To me, rehashing and rewording others' content is not something we can call "original".
5:59 pm on May 18, 2018 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Apr 29, 2005
posts:2062
votes: 92


In the end this will be self-defeating for Google. If they rehash several articles into one of their own, and then splash links at the top of the SERPS to funnel their users into selecting their article, what point is there in maintaining an independent website? And then where do Google get their source information from?

AI clearly has the potential to destroy original thought if Google is able to promote their own AI-created articles built from the content, hard work and original thought of others.

Maybe I'm a fool, but at the moment they won't do this to my website. My whole website depends on user input which then alters the content according to the user input.

The fact is though that Jo Public has no idea how manipulated he / she is by massive corporations like Google. And, until Jo Public realises this, and history shows that he / she won't until it's too late, we are fast approaching a Big Brother situation of massive proportions.

The attitude of "I want it now and I want it quick" at all costs will cost society dearly. But how to educate Jo Public? I have no idea.
6:23 am on May 19, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14727
votes: 428


Imo, this fits neatly into their Voice Assistant program.

So rather than envision Google returning their own web page, think about it instead as someone asking the Computer an informational question then receiving a spoken answer.

Wringing hands over https and mobile friendliness is the past. Voice Assistant is the future. That's what you should be building for. This, imo, is part of that paradigm.
6:30 am on May 19, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11485
votes: 692


Voice Assistant is the future. That's what you should be building for
As I've suggested many times.
2:55 pm on May 19, 2018 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Nov 16, 2005
posts:2773
votes: 111


almost a third of the summaries contain fake facts.


Which will not prevent it being a commercial success - quite the contrary I would have thought.

what point is there in maintaining an independent website? And then where do Google get their source information from?


From non-commercial sources, and sources that do not make money directly from visitors. A lot of people will want to express their opinions, share their knowledge etc. and will be happy with small numbers of visitors.
11:34 am on May 20, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:June 28, 2013
posts:3204
votes: 626


And some news organizations are using "newsbots" to write stories:

[wired.com...]

In the near future, Google will drop its search engine and just provide the answers using these algorithms.

Only if they're extremely foolish. (Would *you* abandon a successful, multi-billion-dollar business?)
11:44 am on May 20, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11485
votes: 692


Would *you* abandon a successful, multi-billion-dollar business?
Do I have to answer right away?
7:54 pm on May 20, 2018 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Nov 16, 2005
posts:2773
votes: 111


We might also think about what "original" content means. To me, rehashing and rewording others' content is not something we can call "original".


Agreed, is that not rather the point of the discussion?
11:47 am on May 21, 2018 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:25274
votes: 690


Google's Danny Sullivan seemed to pour cold water on the idea of Google actually doing this right now.
We publish research papers all the time on many topics. This is from Google Brain, not Google Search. We've often said any Google research papers in general shouldn't be assumed to be something that's actually happening in search.


[twitter.com...]
1:24 pm on May 21, 2018 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Aug 11, 2008
posts:1642
votes: 233


That is a non-denial denial.

It is generally accepted that if there is no documented basis for Google doing something (patent, research paper), then they are not doing it. That is a hurdle that has now been overcome in the field of auto-spinning.

Philosophically, I'm not 100% sure you can differentiate between a human-authored Wiki article and a spun one. I mean, Wiki is not supposed to be original research, but sourced content. Obviously, you can insist that a human must be involved creatively, or else it is not original, but that does not seem to be an intellectually satisfying way of defining "originality".
6:20 pm on May 21, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14727
votes: 428


There is nothing in the article to pour cold water on.

The article is a report on the algorithm itself. How it works, what it does, and toward the end, how it COULD fit into search.

The article never states that it is in use.
The article begins:
Google has published research of a new algorithm...


Then the article explains the algorithm, piece by piece.

I then discuss what will be summarized:

Is Google’s Algorithm Summarizing Your Content?
The research paper is silent on whether Google will show their own content created from your content.


I again clarified that there is no indication of when or if it will be used:
There is no word yet when or if Google will begin generating its own content from your content.


Danny is reciting their standard boilerplate that there is nothing to see and to move along.

But there is something to see.
6:48 pm on May 21, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5753
votes: 120


Google AI has published a research paper
Does this mean that someone from the Google AI team published this, or that the algorithm itself published the research?

A few years ago, one wouldn't need to ask because the answer would be obvious: AI wasn't advanced enough to do that.

A few years from now, one won't need to ask because the answer will be obvious: humans will have devolved too much to do that...

[edited by: LifeinAsia at 7:47 pm (utc) on May 21, 2018]

7:01 pm on May 21, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:June 28, 2013
posts:3204
votes: 626


IMO, they set the bar pretty low with Wikipedia articles as the finished product.
11:55 am on May 22, 2018 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:25274
votes: 690


The inference was that this was doable, and that brought out the comments which, mostly, imho, were correct.

The source of the data is the key, and if sites/webmasters and publishers take down the source the result will be a weak page, and unverified, too.
12:37 pm on May 22, 2018 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Aug 11, 2008
posts:1642
votes: 233


The source of the data is the key, and if sites/webmasters and publishers take down the source the result will be a weak page, and unverified, too.


But that's the same as Wikipedia. Wiki is supposed to be sourced; unsourced content is supposed to be removed.

What is the difference between human-devised sourced content and machine-devised sourced content, from the view of "originality"?

Additionally, I would expect G-pedia to be no worse than Wikipedia in reproducing factual errors. And probably a little less biased than Wiki has become (for example, the US culture wars have a clear winner in Wikipedia).
1:33 pm on May 22, 2018 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:25274
votes: 690


I was generalising beyond Wikipedia.