Forum Moderators: Robert Charlton & goodroi

Google Direct Answers (Answer Box) Uses Forum Post As Content Source

     
7:00 pm on Sep 14, 2014 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Apr 30, 2008
posts:2630
votes: 191


I have just come across a Google Direct Answer / Answer Box / Knowledge box which contained one of the posts from Stack Overflow. Here is a screenshot of a Google search with the answer from the forum:

[i.imgur.com...]

We have discussed Google Direct Answers in the past in this thread:

Google Direct Answers: overrated threat?
13th of March 2014
http://www.webmasterworld.com/google/4653835.htm [webmasterworld.com]

However, this is the first time I have seen a random post from the (albeit respectable) forum being used as the source of a Direct Answer box.

Did I find it useful? Not at all, and here is why:

1) It completely missed what I was searching for

2) It zoomed in on a forum thread about one minuscule aspect of CSS width and px

3) The text framed in the box was just a small extract from a longer post by someone replying to the OP, and hence it felt out of context.

4) The answer picked seemed to be a random choice. It was not the one with the most endorsements; in fact, it had only one endorsement. It was not from the user with the most endorsements overall either.
8:56 pm on Sept 14, 2014 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Dec 11, 2013
posts:383
votes: 110


Google's algorithm should get it right in time. As a user, I'd prefer to see the whole answer (or at least the most upvoted one) in a collapsible/expandable box so that I don't have to spend time visiting the site at all. If I don't like the result, I could browse another website right on Google.
9:37 pm on Sept 14, 2014 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4511
votes: 349


I see they offer a teeny grey link for "Feedback" just below the answer box. I wonder if they get much feedback, or read it?
11:28 pm on Sept 14, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1378
votes: 18


a random post from the (albeit respectable) forum

Which seems to raise a question about the implications for a site ToS such as this:

23. You will not copy and retransmit any information out of these forums without first getting the permission of the original author of the message and a WebmasterWorld.com administrator.

It's not quite the same as a "snippet" encouraging a clickthrough.

...
1:50 am on Sept 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3476
votes: 781


Samizdata, Stack Overflow allows re-use or redistribution under the Creative Commons license:

[creativecommons.org...]

One can question whether Google's "direct answer" complies with the Creative Commons 3.0 attribution requirements (I didn't see a contributor's name in the screen shot), but Google's publication of the material seems legit.
2:06 am on Sept 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 5, 2009
posts:1730
votes: 387


Once financial gains become part of this discussion or debate, I'm sure it will all take care of itself. It's not just Google doing this, but when you own such a huge portion of the pie, it can and will be very predatory. I find the whole aspect of content/information scraping sickening and morally corrupt. Those are just my feelings, of course, and they aren't going to change this trend.

If I own a store I might get you in the door with a free offering, but ultimately I want you in my store to buy something. I'm not offering a freebie because I want you to be my friend. The longer you're in my store, the greater the chance you might find something you didn't know you wanted. You will be in my store and not the store across the street. Etc, etc. Is there a financial or market gain to be had from an answer box?

To the OP, your take is interesting, but it's a more intelligent analysis than 98% of the people who type something into Google search are capable of. I'm sure those boxed answers are as good as gold to most people, regardless of whether they are or not.
6:03 am on Sept 15, 2014 (gmt 0)

Full Member

10+ Year Member

joined:May 30, 2009
posts:234
votes: 7


I find the whole aspect of content/information scraping sickening and morally corrupt.


Someone should tell comScore and other companies that track the market share of search engines to stop including Google in those stats. They are no longer a search engine.
4:12 pm on Sept 16, 2014 (gmt 0)

Full Member

5+ Year Member

joined:Apr 26, 2012
posts:328
votes: 8


micklearn, Google has never aimed to specifically be a search engine. Their mission statement is "to organize the world’s information and make it universally accessible and useful." It just so happened that in 1998 that was the way things were organized and they thought they could do it better, and they did. Now times have changed and technology has advanced.

Where copyright is concerned, Google has long had an adversarial view of copyright. You can see this in everything they do, especially with YouTube and their book scanning service.

I think it's more than just not respecting other people's copyright. I think it's no accident that they have no copyright notices on their site (the main Google site, anyway), that it's somewhat difficult to find any kind of copyright notices in their terms of service (though they do exist for instructions on using the logo), and that they have behaved in a fairly non-litigious way toward people infringing on their copyright (leading to all sorts of clones that only make you think of Google). The same is true of their trademark.

There is a school of thought out there that copyright is a bad thing that discourages innovation and I wouldn't be surprised if Larry and Sergey subscribe to it.
4:51 pm on Sept 16, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 5, 2009
posts:1730
votes: 387


Regarding copyright, I think it's quite clear. It really only exists when it's manageable. In other words, if you have a platform such as YouTube and 10,000 videos from different people are uploaded on the same day for your copyrighted program, what'cha gonna do about it? The scale of the internet in my opinion makes copyright protection a joke. Personally if I own a corporation, I'm protecting my information and nobody is going to touch it unless I say so or I get compensated for its usage.
2:41 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3476
votes: 781


This passage from a page about MLA In-Line Citation Style from the Online Writing Lab at Purdue University may be helpful in understanding when or if sources should be cited. Boldface formatting is mine:

When a citation is not needed

Common sense and ethics should determine your need for documenting sources. You do not need to give sources for familiar proverbs, well-known quotations or common knowledge. Remember, this is a rhetorical choice, based on audience. If you're writing for an expert audience of a scholarly journal, for example, they'll have different expectations of what constitutes common knowledge.


As for copyright, it doesn't come into play here. Nobody has a copyright on Barack Obama's age, the meaning of "H2O," the height of the Empire State Building, or the capital of France. Such unadorned facts are in the public domain.
4:23 pm on Sept 17, 2014 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Dec 11, 2013
posts:383
votes: 110


"common knowledge"

From the Wikipedia definition:

Common knowledge is knowledge that is known by everyone or nearly everyone, usually with reference to the community in which the term is used. Common knowledge need not concern one specific subject, e.g., science or history. Rather, common knowledge can be about a broad range of subjects, including science, literature, history, entertainment etc. Often, common knowledge does not need to be cited. Common knowledge is distinct from general knowledge.

[en.wikipedia.org...]

In that case, if we take websites related to travel (for example), everything can be defined as common knowledge. In consequence, websites discussing travel destinations (including reviews?) should not expect a reference in the knowledge box. It actually applies to most industries.
6:19 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3476
votes: 781


Selen, Google's staff includes people from academic backgrounds, so I'm sure the search team is capable of figuring out what deserves a citation and what doesn't.

Site owners who aren't clear on what they should cite can find any number of academic style guides to help them.
6:40 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 4, 2001
posts: 1277
votes: 17


It's interesting that they didn't use the most upvoted answer but I'm not at all surprised that stackexchange qualifies as an answer source. As any developer can confirm, they're like the Wikipedia of coding.
6:41 pm on Sept 17, 2014 (gmt 0)

Full Member

10+ Year Member

joined:May 25, 2006
posts:300
votes: 36


Here's an interesting example...

A professor at a UK university carried out some research a while ago that was reported in a little-known but important scientific journal.

The research was picked up by various news agencies, who all reported the same headline, but as far as I can see none of them actually linked to the original source.

G now includes the principal fact from the report as 'knowledge', but the credit goes to the BBC as one of those who reported the story, not to the original researcher or the scientific journal...
7:11 pm on Sept 17, 2014 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Dec 11, 2013
posts:383
votes: 110


Common knowledge can be taken both in the global and in the local context. Looking from the travel perspective, all information about Paris (including restaurant or hotel reviews etc.) can be qualified as common knowledge because it could be common to people who live in Paris or travel to Paris. Information about making champagne is common to folks who live or work in Champagne, FR area. Information about investments or financial opportunities is common among people who work on Wall Street, etc.

So, again, webmasters should not expect a link to their websites if they don't offer anything original or UNcommon.
8:12 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3476
votes: 781


Looking from the travel perspective, all information about Paris (including restaurant or hotel reviews etc.) can be qualified as common knowledge because it could be common to people who live in Paris or travel to Paris. Information about making champagne is common to folks who live or work in Champagne, FR area. Information about investments or financial opportunities is common among people who work on Wall Street, etc.


It isn't quite that simple.

Take an article about the Eiffel Tower in Paris:

The facts about the tower's history, height, hours of operation, ticket prices, etc. aren't subject to copyright, and most of those facts (e.g., the tower's height or the year it was built) normally wouldn't be cited--whether by Google or in a guidebook or an encyclopedia.

The actual text of that article is a different story. It's protected by copyright, because the "representation" (i.e., the article's words and their arrangement) is governed by copyright law.

But the issue here isn't copyright, it's when and whether to do citations. Google gets to decide when to cite a source for a fact like the height of the Eiffel Tower or when the tower was built, just as you and I do. And that decision is likely to be based on accepted citation practices, not on whim.
8:50 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 5, 2009
posts:1730
votes: 387


Bah. One thing I do know for certain. An employee of Google, Bing, etc didn't enter in Obama's age into their system/page/answer box. Somebody did and guess what? It wasn't them. That's where it's indefensible to me. If those facts are in print, somebody printed it. That took some type of effort. I think where this is going is whether anyone who put anything online (including adding to Wikipedia) meant that it could be scraped by any corporation on earth and used to keep patrons within their own ecosystem. So I guess the only "sucker" is the person who helped assemble all the information in one tidy location for the corporations to feed off of in the spirit of free use.

[edited by: MrSavage at 8:53 pm (utc) on Sep 17, 2014]

8:52 pm on Sept 17, 2014 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Dec 11, 2013
posts:383
votes: 110


Google can easily rewrite/paraphrase content if needed (rewriting page title is a good example). They could also take relevant parts from multiple sources to display information.

From user perspective, the ideal situation is to stay on Google and get all information without visiting other (quite often slow-loading or hacked) websites. That is especially true because most visitors want to browse multiple sites in a short period of time.

Visitors go to Google for a reason and it should not be a surprise that Google's goal is to provide the best user experience, just like other webmasters do on their own websites.
9:28 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member aristotle is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 4, 2008
posts:3660
votes: 373


From user perspective, the ideal situation is to stay on Google and get all information without visiting other (quite often slow-loading or hacked) websites

That might be okay for the type of person who would be satisfied with a superficial and quite possibly inaccurate response to their query.
9:31 pm on Sept 17, 2014 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Dec 11, 2013
posts:383
votes: 110


What I meant is that it would be ideal to be able to browse full pages right on Google in a box (something like an iframe). That way the user can decide what to read, or when to stop reading and move on to another page, without having to visit the sites at all.
9:35 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 4, 2001
posts: 1277
votes: 17


An employee of Google, Bing, etc didn't enter in Obama's age into their system/page/answer box. Somebody did and guess what? It wasn't them.

So what's the solution? Force Google and Bing to hire people to manually enter answers? Or in other words, make them scrap answers entirely because doing it manually at that scale is impossible.

Take it a step further using StackExchange or other Q&A type sites as an example. The answers on those kinds of sites are entered by humans but the sites themselves are essentially using the crowd as a very sophisticated algorithm. Any of those humans can answer something by copying content from somewhere else. So now the site has not only exploited people as unpaid content creators, they are also hosting copyrighted content without consent. Should we restrict that too?

People yell about this stuff (invariably because they are content creators and feel personally, perhaps justifiably, affronted by scraping) without stopping to think about what those kinds of laws would do to the internet.

If we look at the internet as an unprecedented system for storing and transmitting human knowledge then not only is it beautiful, it's important to protect it. Not just protect what exists now, but also protect innovation so that it can evolve into things we can't imagine yet. Bots (scrapers included), and what they produce, are a very important part of the equation.

Now suppose that a major western country or two decides to ban or limit bots in some way as a result of copyright concerns. At that moment innovation by new players is dead. The big guys like Google and MS might be able to afford the lawyers to navigate the new legal landscape, which would start off complicated and likely get worse with each legislative session. But little guys with great new ideas would have no chance.

I think existing copyright laws are more than sufficient. I mean, what is the argument here? That Google and Bing shouldn't be able to grab snippets of your content because they do it with an automated system? Whereas a journalist or blogger is free to do it under fair use because they're pressing CTRL+C by hand? Or if it's about citations, the screenshot in the OP is an example of something far better, a direct link to the original content. Is there anyone who would turn down a link in the answer box for a popular question?
9:56 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3476
votes: 781


An employee of Google, Bing, etc didn't enter in Obama's age into their system/page/answer box. Somebody did and guess what? It wasn't them.


The data came from a public record that was created in Hawaii (or in Kenya, if you're Donald Trump).

So who should be credited? Obama's Mom? (Did she fill in the data for the birth certificate?) The clerk in Hawaii who typed up the birth certificate? The newspaper Linotype operator who set type for the birth announcement?

If you were writing an article about Barack Obama and you wanted to include his age, what source would you identify in a citation? (Mind you, if you were a professional journalist, you wouldn't even bother with a citation, because Obama's age--53--is a generally-known fact based on a public record.)
10:14 pm on Sept 17, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 5, 2009
posts:1730
votes: 387


One word. Scale. Two words. Scale. Use.

Laws currently can't even begin to grasp what's going on because they are from a mindset of scale.

It hasn't really come up yet, but Wikipedia was or is essentially a honey trap. Personally if I knew that my contributions were going to be hosted on a site other than Wikipedia? If my contribution is being used, I certainly didn't do it for a favor to Bing and Google.
1:55 am on Sept 18, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 11, 2007
posts:774
votes: 3


One thing I do know for certain. An employee of Google, Bing, etc didn't enter in Obama's age into their system/page/answer box. Somebody did and guess what? It wasn't them. That's where it's indefensible to me.


Hmmm, Google is not scraping Obama's age off of anyone's site. They have likely seen Obama's birth date listed on millions of HTML documents around the web, including various legal documents that exist on the web. So they trust that the listed birth date is correct and have likely added it to their Knowledge Graph database as a common fact about Obama.

So anytime someone queries Google for his age, they calculate the age based on the date of birth they have in their database for him. It is the derived age that is displayed.

The fact that his birthday can be found in so many places on the web would lead any reasonable person to believe it is common knowledge and therefore shouldn't really require a citation.
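
To make the "derived age" point concrete, here is a minimal sketch of how an age could be computed at query time from a stored date of birth. Nothing in this thread confirms how Google actually does it; the function name `derived_age` and the constant `STORED_BIRTH_DATE` are made up for illustration, and the birth date used (4 August 1961, Obama's publicly recorded date of birth) is not quoted anywhere in the thread.

```python
from datetime import date


def derived_age(birth_date: date, on_date: date) -> int:
    """Return the age in whole years on on_date (hypothetical helper)."""
    years = on_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Assumption: the only stored fact is the date of birth, not the age itself.
STORED_BIRTH_DATE = date(1961, 8, 4)

# Age as of the date of the posts above (17 September 2014).
print(derived_age(STORED_BIRTH_DATE, date(2014, 9, 17)))  # 53
```

The result agrees with the age of 53 quoted earlier in the thread. Storing the date rather than the age means the displayed number never has to be copied from any particular page; it simply changes on the birthday.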