
Google News Archive Forum

This 40 message thread spans 2 pages.
The cost of crawling...
Google's lost revenue may have slowed things down a bit
stcrim




msg:130197
 3:19 pm on Feb 29, 2004 (gmt 0)

There's a lot of talk about Google not crawling or not crawling much. And they are certainly not looking hard at new sites.

The answer may be as simple as cost. Now that Yahoo is doing their own thing, Google may have to conserve resources a bit.

-s-

 

ByronM




msg:130198
 4:34 pm on Feb 29, 2004 (gmt 0)

Where are you seeing this? I'm crawled every day and I only have a few PR4 sites.

Get inbound links, have an easy to read site and build traffic.

"If you build it, google will come"

My cache is rarely over 2 days old. I find it amazing how they can index so quickly!

allanp73




msg:130199
 4:35 pm on Feb 29, 2004 (gmt 0)

Does anyone have stats on how much a crawl costs? It is an automated system, what costs are there?

walkman




msg:130200
 4:41 pm on Feb 29, 2004 (gmt 0)

Come on. They use regular computers to do all their work. Modified, but normal computers. 1000 of those would be less than $500K for them. Google has plenty of money and they're constantly hiring, so I don't buy that.

stcrim




msg:130201
 5:34 pm on Feb 29, 2004 (gmt 0)

They have all but stopped crawling news sites over the past 2 weeks. And they have all but stopped looking at new pages in new directories on old sites.

[webmasterworld.com...]

And don't kid yourself about the cost of crawling and making use of the crawled pages...

-s-

darkroom




msg:130202
 5:41 pm on Feb 29, 2004 (gmt 0)

Yeah,
It certainly cannot be a cost issue. After all, the activity of crawling is directly related to the success/relevancy of Google's results... and especially now, Google would want to exploit any possibility available to remain on top of their competitors...

ByronM said,
"Where are you seeing this? I'm crawled every day and I only have a few PR4 sites.
Get inbound links, have an easy to read site and build traffic.
'If you build it, google will come'
My cache is rarely over 2 days old. I find it amazing how they can index so quickly!"

Very nice to hear that googlebot is still visiting your site regularly...However, it doesn't always seem to be the case...I've got a site with about 20,000 unique links with a PR6-7 with lots of unique content and all pages are google friendly...Googlebot is not crawling the site anymore and it now shows a 1 month old cache...

quotations




msg:130203
 6:29 pm on Feb 29, 2004 (gmt 0)

>They have all but stopped crawling news sites
>over the past 2 weeks.

Our latest Press Release (yesterday) was found by googlebot within one hour and included in their news feed immediately.

Most often it is the Mediapartners-Google/2.1 which is visiting but there was a major visitation last night and this morning from the 64.68 guys.
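
For anyone who wants to check their own logs the same way, here is a minimal sketch that tallies crawler visits. It assumes an access log in Apache combined format; the function names `classify` and `count_crawlers` are made up for illustration, and the "64.68." prefix is simply the address range mentioned above, not an authoritative list of Google addresses.

```python
import re
from collections import Counter

# Combined log format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

def classify(line):
    """Return a crawler label for one log line, or None if it does not parse."""
    m = LINE_RE.match(line)
    if not m:
        return None
    ip, agent = m.groups()
    if "Mediapartners-Google" in agent:
        return "mediapartners"
    if "Googlebot" in agent:
        return "googlebot"
    if ip.startswith("64.68."):
        # Same address range, but an agent string we do not recognise.
        return "64.68-other"
    return "other"

def count_crawlers(lines):
    """Count parsed log lines per crawler label; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] += 1
    return counts
```

Feeding it the lines of a day's log, for example `count_crawlers(open("access.log"))` with a hypothetical file name, gives a quick picture of whether the visits are from the regular crawler or the AdSense one.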

Rick_M




msg:130204
 6:44 pm on Feb 29, 2004 (gmt 0)

I would think the biggest cost would be bandwidth. I know how much bandwidth Google is pulling from my site - I would guess they don't get incoming bandwidth for free.

As for being crawled and indexed - I have seen what seem to be changes with my sites. I think it was around the time of Florida when pages that were picked up by Freshbot began to stick in the index. Over the past month, I've had a few pages picked up by freshbot, indexed for a few days, then dropped. Pre-Florida, this was the typical pattern, and back then, it wasn't until the big monthly update occurred that all of the crawled pages were included.

Another thing I've noticed is that I've actually had many new pages spidered over the past several days (not mediabot, which I thought is just for adsense?), but they are not showing up in the index, with or without fresh tags. Over the past few months, new pages would typically show up around 24 hours after being crawled - not so this last week.

I don't know what to make of this, or if there is any reason to make anything of it, but I figured I'd just share my experience.

FYI, the sites that are getting crawled have links from PR 6 sites.

BigDave




msg:130205
 7:08 pm on Feb 29, 2004 (gmt 0)

Almost every month there are times when googlebot seems to become less active, and all of a sudden there are threads on "what happened to googlebot?". The only difference is that there is a new straw to grasp at.

If it was a money issue, it would not have such an immediate effect.

All those servers of Google's that do the crawl are also general purpose computers. They are able to use them for other things when they feel the need. It just might be that they pulled a bunch of them for use fighting some evil or another that people complain about here from time to time.

Googlebot has definitely slowed down on my site, but it is still there. I'm not worried.

As for new sites, google still claims that you should expect them to be indexed in between one and two months. Y'all are just getting spoiled with freshbot lately.

div01




msg:130206
 7:39 pm on Feb 29, 2004 (gmt 0)

FWIW, Googlebot has been quite active this last week on most of my sites.

europeforvisitors




msg:130207
 7:50 pm on Feb 29, 2004 (gmt 0)

On my PR6 site, Google is crawling more than ever before, with an average of 1,500+ Googlebot hits per day in February. (That's about 38% of my pages on a typical day.)
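
As a rough check on what those figures imply, the sketch below back-calculates the site size. It assumes every Googlebot hit is for a distinct page, which the post does not state; `estimate_site_size` is a made-up helper name.

```python
def estimate_site_size(hits_per_day, share_of_pages):
    """Pages on the site, if hits_per_day distinct pages make up share_of_pages of it."""
    return hits_per_day / share_of_pages

# Figures from the post: 1,500 hits a day, about 38% of pages.
pages = estimate_site_size(1500, 0.38)
print(f"Estimated site size: about {pages:,.0f} pages")
```

That works out to a site of a little under 4,000 pages; repeat fetches of the same page would make the real coverage lower than 38%.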

metrostang




msg:130208
 8:04 pm on Feb 29, 2004 (gmt 0)

Agreed that Googlebot is as active as ever. He has been on my site of 17,000 pages constantly for the past month grabbing every page there. Problem is the only page in results that isn't a month old is the index. Almost every day it shows a fresh index from yesterday.

So far that's the only page that's fresh and many of the others show in the top ten results for their keywords. I keep thinking that every day some will start showing up, but nothing.

Maybe the cost isn't in crawling sites, but in updating the databases.

stcrim




msg:130209
 9:10 pm on Feb 29, 2004 (gmt 0)

Simple Question:

Has anyone put up a new site or new pages in a new directory in the last 14 days that has been crawled and included by Google?

Included meaning you can find them with some type of search on Google.

Please do not include news sites as they are spidered differently...

-s-

nileshkurhade




msg:130210
 9:19 pm on Feb 29, 2004 (gmt 0)

No probs here. Googlebot has been crawling the index page every day, as always, and has picked up 2 new pages.

zgb999




msg:130211
 9:46 pm on Feb 29, 2004 (gmt 0)

We are definitely seeing less fresh results in the last few weeks.

I assume it has to do with the 1 billion new pages that Google has to crawl. Their resources probably cannot handle as much fresh content anymore.

quotations




msg:130212
 9:57 pm on Feb 29, 2004 (gmt 0)

>Simple Question:
>Has anyone put up a new site or new pages
>in a new directory in the last 14 days that
>has been crawled and included by Google?

I put up two new sites on February 26th. Neither is related to news in any way.

The concept for the sites did not even exist on the morning of the 26th and they were fully deployed by 5:00 that night.

As of today, (Feb 29th) both of them are showing up in the SERPS for a variety of search terms and they have fresh tags for February 27th.

europeforvisitors




msg:130213
 10:10 pm on Feb 29, 2004 (gmt 0)

Has anyone put up a new site or new pages in a new directory in the last 14 days that has been crawled and included by Google?

I published 100+ pages about a city in Germany on February 26. Two pages had links from my home page, and both were in the Google index when I checked last night.

The remaining pages don't seem to be indexed yet, although Googlebot has been so active that I suspect they've been crawled. (I'm also waiting for some other new pages to get indexed; these are pages from about a week ago that don't have links from the home page.)

In general, I'd say that Google isn't adding "fresh" pages to the index as quickly as it did a while back, but the delays don't appear to be extreme. Anyway, this thread began with the premise that Google is "not crawling or not crawling much." That isn't true, at least for those of us who have seen Googlebot in our logs every day.

jchance




msg:130214
 10:22 pm on Feb 29, 2004 (gmt 0)

There are a bunch of financial articles out there that discuss the financial side of Yahoo dropping Google, and the amount of money Google was receiving from Yahoo was VERY small compared to what they are expected to be making from AdWords. Basically, Yahoo dropping them doesn't impact their bottom line.

As for them not indexing sites for the last two weeks, I feel that is due to the fact that the current algo is just a temporary "make everyone happy for a few weeks" algo. They wanted webmasters to be praising Google for bringing their site back from the dead while the whole Yahoo switch thing was starting up.

I think very soon we will be back to Florida/Austin results and the indexing will pick back up again.

darkroom




msg:130215
 10:26 pm on Feb 29, 2004 (gmt 0)

But then how does all this relate to one of my sites that isn't getting crawler activity anymore? It's been 1 year since the site was launched and this wasn't an issue until a little more than a week ago... the only thing I did was redesign my homepage, and I lost all freshbot activity... plus why would it suddenly show a month old cache?

rfgdxm1




msg:130216
 10:27 pm on Feb 29, 2004 (gmt 0)

>The answer may be as simple as cost. Now that Yahoo is doing their own thing, Google may have to conserve resources a bit.

I'd expect the reverse. With Yahoo no longer using Google, this means that Google is facing much greater competition than before. If Google went to a smaller index, this would mean more searcher dissatisfaction, and them switching to other SEs.

steveb




msg:130217
 10:36 pm on Feb 29, 2004 (gmt 0)

Google seems to be crawling more and deeper than ever before. Additionally, particularly in Google News, they pick up articles within an hour sometimes.

Where do these myths come from? Somebody has a site that doesn't get crawled, and they don't take a look at the serps where in the past several days we have been seeing fresh tags FROM THE SAME DAY. Yesterday at 6pm west coast time I had hundreds of Feb 28 fresh tags.

What Google seems to be doing less of is crawling huge (25,000+), poorly seo'ed sites, but aside from that Googlebot is far more active than usual for me... and the objective fact of the daily fresh tags is something that simply can't be ignored.

===

3:10 pacific, and seeing Feb 29 fresh tags

stcrim




msg:130218
 11:33 pm on Feb 29, 2004 (gmt 0)

My original post is the result of being the web developer for a lot of existing sites (that are doing very well in Google). We have added 4 or 5 new sites in the past two weeks, and many folders or additional content on some of the existing sites, that are not being spidered at all - or were spidered without inclusion...

This was not some wild question based on one site doing poorly.

-s-

namechangedtoprotect




msg:130219
 11:37 pm on Feb 29, 2004 (gmt 0)

Seeing plenty of crawls, but not fresh backlinks or PR. Also seeing Google have some problems grabbing the description they use.

steveb




msg:130220
 11:39 pm on Feb 29, 2004 (gmt 0)

But that isn't new at all. For many months Google might hit the top page of a new site (that has some good links), then wait awhile, then hit the next level, then do the whole shot a month after first finding it.

Expecting new sites to be fully crawled within two weeks is expecting a lot at best. Index pages within two weeks, yes I'd expect that, but anything more is very optimistic. And new directories on existing domains... that just depends on the linking. A page off an index page might get hit immediately, a directory four levels deep might not get picked up for a long time.

There is a huge difference between adding new pages to the index, and very actively crawling domains that googlebot already values.... and adding a directory to a PR7 site, and adding one to a PR3 site.

Becky




msg:130221
 12:26 am on Mar 1, 2004 (gmt 0)

I also have noticed the daily fresh tags. I have a PR6 site that is getting crawled daily.

stcrim




msg:130222
 1:01 am on Mar 1, 2004 (gmt 0)

Old stuff is doing just fine - the original post is about new stuff...

-s-

quotations




msg:130223
 1:23 am on Mar 1, 2004 (gmt 0)

>Old stuff is doing just fine -
>the original post is about new stuff...

... and I responded about new stuff.

Two brand new sites, non-existent before February 26th.

Two of them. Crawled the next day and included.

BigDave




msg:130224
 1:33 am on Mar 1, 2004 (gmt 0)

New deep content, in a new deep directory, uploaded last night, was crawled within hours off of my "what's new" type page that googlebot checks several times a day. That page is showing up as feb 28 as its fresh date.

The directory that contains that review was crawled today and has a feb 29 fresh date. I am assuming that it was crawled because of the cookie crumbs on the page.

Rick_M




msg:130225
 2:14 am on Mar 1, 2004 (gmt 0)

As I said in my post, I did have some fresh pages that were included right after they were crawled, but then dropped out and haven't come back. I've also had a lot of pages crawled the last few days that haven't shown up at all. And I have some medium PR pages that are getting freshed daily, but those are older pages. For those who had pages included with freshbot, I'm curious if they will stick or drop out after a day or two until the next update. We'll have to wait and see.

Trisha




msg:130226
 5:52 pm on Mar 1, 2004 (gmt 0)

I've had problems getting some new pages crawled also, on some sites anyway. I added a new section of pages on one site a few weeks ago, but they are still not crawled. Never seen this before. I think even on the old update schedule they would have been crawled by now. Googlebot is coming by though, just not taking the new pages.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About
© Webmaster World 1996-2014 all rights reserved