Scraping Google News

Is this OK with Google?

         

chiyo

1:45 pm on Nov 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've noticed a growing number of scripts and RSS feeds that scrape the news.google site. The feed can then be used on your site and bingo, you have text-based links to headlines indexed by Google News, refreshed every 15 minutes or so. The links go directly to the original sources, bypassing Google completely.

I'm wondering whether this violates Google's TOS. Firstly, there is no attribution to Google for providing the service, and secondly, their server is being accessed every 15 minutes or so. It seems to me that it may be violating Google's admonishments against robots and automatons hitting their site.

The reason I ask is that I would like to use one of these feeds, but I'm not sure whether it is kosher with Google. I don't think these are using the Google API.

Thanks for your thoughts.
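
For anyone who wants a quick mechanical check on the "robots" side of the question, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt text, the `example.com` URLs and the `my-news-bot` user agent are all made up for illustration; they are not taken from Google or any real site. Passing a robots.txt check also says nothing about a site's terms of service, which have to be read separately.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example only.
# It is NOT a copy of any real site's rules.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /news
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/news/headlines", "http://example.com/about"):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, "my-news-bot", url) else "disallowed"
        print(url, "->", verdict)
```

Against a live site you would load that site's own robots.txt with `set_url()` and `read()` on the parser instead of passing in a string.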

Brad

2:30 pm on Nov 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



chiyo,

I suspect this violates the Google TOS. Google does not like unauthorized use of its web search (without paying for it), and I can only assume the same is true for their news page and search.

I would check out some of the other news aggregators out there. I know of at least one that is open source. Sticky me if you want info on that one.

Grumpus

2:34 pm on Nov 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Chiyo - doesn't the Google Web Services API cover the news section yet? I dunno, I haven't checked...

G.

chiyo

3:03 pm on Nov 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Brad and Grumpus. We use RSS and other feeds a lot to deliver news to our users. Moreover is no longer what it used to be, but we are using PHP scripts to deliver individual news feeds from RSS files. That works quite well, but we would really like a feed that combines many news sources in a certain topic area. As far as I know, there is not yet an open source aggregator script that does this efficiently with low loading times. You are basically stuck with one feed per RSS file/source.

I'm also not advanced enough to know how to work with Google's API!

Yep, I suspected that the authors of these Google News scripts may not know what they are doing, and that the scripts could be dangerous to sites that use them.
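
The "many sources combined into one topic feed" idea can be sketched roughly as below. This is only an illustration in Python rather than the PHP scripts mentioned above, and it makes some assumptions: the feeds follow the plain RSS 2.0 layout (`item` elements with `title`, `link` and `pubDate`), the XML has already been downloaded and cached as text, and the function names `parse_items` and `merge_feeds` are made up for the example. Items without a usable date are simply sorted last.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Fallback timestamp for items with a missing or unparseable pubDate,
# so that they sort after every dated item.
UNKNOWN_DATE = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_items(feed_xml, source):
    """Extract title, link and publication date from one RSS document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # nothing to link to, so skip the item
        raw_date = item.findtext("pubDate")
        try:
            published = parsedate_to_datetime(raw_date) if raw_date else UNKNOWN_DATE
        except (TypeError, ValueError):
            published = UNKNOWN_DATE
        if published.tzinfo is None:
            # Treat dates without a zone as UTC so all items are comparable.
            published = published.replace(tzinfo=timezone.utc)
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": link,
            "published": published,
            "source": source,
        })
    return items


def merge_feeds(feeds):
    """Merge {source_name: rss_xml_text} into one list, newest first.

    Duplicate links are dropped; the first source that carries a link wins.
    """
    seen_links = set()
    merged = []
    for source, feed_xml in feeds.items():
        for entry in parse_items(feed_xml, source):
            if entry["link"] in seen_links:
                continue
            seen_links.add(entry["link"])
            merged.append(entry)
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged
```

Because the merge works on cached text, the slow part (fetching each source) can run on a schedule in the background, and the page itself only has to read the merged list. That is one way around the loading-time problem, not the only one.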

kaz

4:08 pm on Nov 21, 2002 (gmt 0)

10+ Year Member



An interesting related article that just came out is located here:

ZDNet: Google, AOL take lead in Web services
http://zdnet.com.com/2100-1104-966546.html

Google is giving developers direct access to its search database, bypassing its Web site and allowing them to design their own ways to use the valuable technology.

RBuzz

12:28 pm on Nov 22, 2002 (gmt 0)

10+ Year Member



Alas, no, the Google API doesn't yet support Google News, so Google News scrapers do violate the Google TOS. I'm looking forward to Google's API expansion into other Google properties like Google News. You're right, you could get some excellent RSS feeds out of it.