Google hits!

         

kosar

2:25 pm on Oct 25, 2004 (gmt 0)

10+ Year Member



60,000 Google hits so far this month; I have never seen anything like this. Anyone else having this experience?

mzadorian

12:21 am on Nov 5, 2004 (gmt 0)

10+ Year Member



I get spidered almost every other day, but since the Google dance on Sept 24th, all of my rankings on searches went from number 1 or 2 all the way down to 100. We have a PR of 4 and are doing great on Yahoo. Google traffic is very, very low. Does anyone know anything that could help me out, or is anyone experiencing the same thing?

Thanks.

Rick_M

2:01 am on Nov 5, 2004 (gmt 0)

10+ Year Member



When I was using a virtual private server, I installed a module on the server called "mod-throttle" that limited the number of pages fetched per second. If your server is getting hammered, it might be worth looking into.
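For anyone wondering what "limiting pages fetched per second" amounts to, here is a minimal Python sketch of a token-bucket limiter. It is not mod-throttle's code or configuration; the class name `TokenBucket` and its `rate`/`capacity` parameters are made up for illustration.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may be served now, False if it should be refused."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A server using something like this would typically answer with a "503 Service Unavailable" whenever `allow()` returns False, so a well-behaved crawler backs off and retries later.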

ccton

2:42 am on Nov 5, 2004 (gmt 0)

10+ Year Member



I've got a theory:

The partially indexed URL has another meaning: it is waiting to be updated with newly crawled data.

During the update, Google drops some pages and then puts the new data in.

Obviously a dance is taking place. My site's indexed page count goes: 238->102->243->238->132->243->299->132

Every hour is different.

ccton

2:44 am on Nov 5, 2004 (gmt 0)

10+ Year Member



When it went below 200, a lot of pages showed the URL only.

ccton

3:07 am on Nov 5, 2004 (gmt 0)

10+ Year Member



Five minutes ago it was 132; now it is 293.

jnmconsulting

3:13 am on Nov 5, 2004 (gmt 0)

10+ Year Member



This is what I got: on 10-31-04, 517 pages were crawled by "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" IP 66.249.66.199

I saw no change in the number of pages in Google and no changes in the SERPs. All hits came from the same IP address.

Then on 11-2-04, 2533 hits by "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" IP 66.249.65.165.

Now just about all of the 834 pages in Google show just the URL, no description... nothing but the URL.

I would like to think that there is something going on, I hope, because I have not seen movement in the SERPs for 3 months. In Sept I had 1200 pages indexed; about 25% of them had full descriptions and were cached. Now I'm down to 834 with no cache or description.

I had tried to find out how the bots work in one of the other threads; I don't remember which. It was my understanding that there is more than one type. One type just checks to make sure the page is there and grabs all links. Another indexes the page itself.

Can anyone shed any light on this for me?
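Per-day Googlebot counts like the ones quoted above can be tallied from an access log roughly as follows. This is only a sketch: it assumes the server writes Apache-style "combined" log lines, and the sample lines (paths, times, sizes) are invented for illustration. Only the user-agent string and the two IPs come from the post.

```python
import re
from collections import Counter

# Combined log format:
# ip ident user [timestamp] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

NEW_BOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical sample lines, made up for this example.
SAMPLE = [
    '66.249.66.199 - - [31/Oct/2004:10:00:00 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "%s"' % NEW_BOT,
    '66.249.66.199 - - [31/Oct/2004:10:00:02 +0000] "GET /b.html HTTP/1.1" 200 734 "-" "%s"' % NEW_BOT,
    '66.249.65.165 - - [02/Nov/2004:03:15:40 +0000] "GET /c.html HTTP/1.1" 200 901 "-" "%s"' % NEW_BOT,
    '10.0.0.7 - - [02/Nov/2004:03:16:00 +0000] "GET /c.html HTTP/1.1" 200 901 "-" "SomeBrowser/1.0"',
]


def googlebot_hits(lines):
    """Count requests per (day, ip) for user agents that mention Googlebot."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "Googlebot" in match.group("agent"):
            counts[(match.group("day"), match.group("ip"))] += 1
    return counts


for (day, ip), hits in sorted(googlebot_hits(SAMPLE).items()):
    print(day, ip, hits)
```

Matching on the user-agent string alone trusts whatever the client claims to be, so treat the counts as an estimate.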

kpaul

4:03 am on Nov 5, 2004 (gmt 0)

10+ Year Member



((I run a search engine, and besides your various algo's, the two most important things are the size of your index, and its freshness.))

I haven't read the whole thread yet, but I'm thinking they're increasing the size of their index...

jnmconsulting

4:19 am on Nov 5, 2004 (gmt 0)

10+ Year Member



Maybe this is not the best way to check, but I have been watching the index size by using the search term +the. This is what I get:

Results 1 - 100 of about 6,140,000,000 for +the

last month it was

Results 1 - 100 of about 5,370,000,000 for +the

Not sure if it means anything, but the G home page still shows:
©2004 Google - Searching 4,285,199,774 web pages
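To put those three figures side by side, here is a quick bit of arithmetic in Python. The helper name `percent_change` is just for this example, and the "about N results" counts are Google's own estimates, so the percentages are rough at best.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


last_month = 5_370_000_000   # "+the" result count last month
this_month = 6_140_000_000   # "+the" result count now
home_page = 4_285_199_774    # page count shown on the Google home page

print(f"month over month:      {percent_change(last_month, this_month):.1f}%")
print(f"versus home page count: {percent_change(home_page, this_month):.1f}%")
```

That works out to roughly 14% more results than last month, and about 43% more than the number still shown on the home page.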

sasha

6:20 am on Nov 5, 2004 (gmt 0)

10+ Year Member



All I know is that after crawling 150% of available pages of my site (some pages were indexed twice), the site went from being 10% indexed by Google to being 5% indexed and 3% of 'undigested' URLs.

To me, Google's business model is simple: kill the SERPs, boost the advertising revenue. Simple and efficient!

ccton

6:29 am on Nov 5, 2004 (gmt 0)

10+ Year Member



sasha, you're kidding here, right?

You are talking about evil stuff after Google's IPO.

ccton

6:33 am on Nov 5, 2004 (gmt 0)

10+ Year Member



I still believe in Google! It will go the right way, not for the money but for the people. This is what Google used to stand for; this is the Google we know.

bears5122

6:54 am on Nov 5, 2004 (gmt 0)

10+ Year Member



Google wants bad results. Just wait till Christmas season, it will get nuts again. Get used to seeing forum posts outranking shopping centers.

rfung

10:04 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



has anyone noticed an increase in revenue with all this spidering? This month so far has been pretty good for me.

sasha

8:39 pm on Nov 6, 2004 (gmt 0)

10+ Year Member



On Nov 1 Google crawled 105,000 pages on my site. On that date Google's index of my site (site:domain.com) was about 11,700 pages. This is a relatively new site and it has never been fully indexed by Google.

Now, as of today, the index of my site is 6,640 pages plus several thousand 'no-description, no-title' pages.

None of the pages that Google crawled on Nov 1 have appeared in the index, and what's more, it seems that a lot of previously indexed pages have either disappeared or become 'no-description, no-title' pages.

The pages that remained in Google's index have fresh tags on them from Nov 4 and Nov 5.

Did anybody have the same happen to them? Even assuming that there is a duplicate content penalty, what happened to 100,000 pages that were crawled? Why didn't they ever make it into Google's index?

esoteric

2:55 pm on Nov 7, 2004 (gmt 0)

10+ Year Member



Hey Sasha

I am having the same problem! Are you also using the eBay API or other API's for content?

robho

3:23 pm on Nov 7, 2004 (gmt 0)

10+ Year Member



> what happened to 100,000 pages that were crawled? Why didn't they ever make it into Google's index?

As far as I can see, the pages crawled by the new Mozilla Googlebot haven't made it to the current index. I had 50,000 pages crawled by it on Nov 1st, none of them are in the index. Maybe they are saving them for a new version.

On the same day I had about 3,000 pages crawled by the old bot, most of which are now in the index. On Nov 1st for a one month old site I had 18,000 pages listed, that's now rapidly increased to 65,000 so it's not the case for everybody that the page count is dropping.

sasha

5:29 pm on Nov 7, 2004 (gmt 0)

10+ Year Member



> Are you also using the eBay API or other API's for content?

Ours is a database driven coldfusion directory.

Obviously directories have lots of similar pages. However we have taken extensive steps to ensure that each page has sufficient 'unique' content (which is quite hard to do in a database driven format with 50,000-60,000 pages).

Perhaps the two things are unrelated: the dropping of pages from Google's index and the massive Google spidering (whose results have not been included in the index).

I also agree: the regular Googlebot has spidered only very few pages of the site since Nov 1, but all of those pages are already included in the index.

What really disturbs me about Google is this: if for whatever reason Googlebot does not successfully spider old pages during its deep crawl, they just drop those pages from the index, and one must wait several months to get re-crawled and re-indexed.

It seems that in their all-knowing heads, the millionaire PhDs at Google have decided that if their software does not get to the page FOR WHATEVER REASON - the page must not exist!

Google is the only search engine to do this. Yahoo tries for the longest time, sometimes 6-12 months, before they will drop previously indexed pages from their index.

itloc

11:38 pm on Nov 7, 2004 (gmt 0)

10+ Year Member



Well, I have 500'000+ pages on my new website, and the site has been online since last Monday. Guess what? Google has indexed 133 pages so far...

Very disappointing - it's a quality site - no doorway stuff or so... :-)

I would be happy if the bot would suck 300'000 pages from my site ... come on bot ... get it :-)

Regards

Roger

RedWolf

1:54 am on Nov 8, 2004 (gmt 0)

10+ Year Member



Ok, I have to ask...

How do you get a brand new site with 50,000 pages of quality content? Even if you got a new business's catalog up, that is a lot of pages. Or are these just rehashes of travel sites or directories that are a dime a hundred out there? Just the time alone to write 50,000 pages of quality unique content is staggering. That is about 200 pages written per business day for a year. Even a very active forum will take a while to get to that point, and you said this site is less than two weeks old.

itloc

2:08 am on Nov 8, 2004 (gmt 0)

10+ Year Member



Well, it IS possible

Yes, it is a directory, but I have to mention that I worked on this site for nearly two years. Full time. It nearly killed me ...

If you would like to have the URL, I can send it to you, and you may decide for yourself whether Google should index it or not :-))))

Roger

BillyS

3:10 am on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think itloc said the site had 500,000 pages, not 50,000.

That works out to 1,000 pages of content a day for 2 years (assuming 250 work days in a year).

Glad itloc is working full time on this one or I would feel badly. I can only pump out about 2 pages of content a day (part time of course...).

Back to writing!
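For the record, the pages-per-day figures tossed around in this thread check out. A tiny sketch (the function name is made up, and 250 work days a year is the assumption stated above):

```python
def pages_per_workday(total_pages, years, workdays_per_year=250):
    """Pages that must be written each work day to reach total_pages."""
    return total_pages / (years * workdays_per_year)


# RedWolf's figure: 50,000 pages in one year
print(pages_per_workday(50_000, years=1))
# The figure for itloc's site: 500,000 pages over two years
print(pages_per_workday(500_000, years=2))
```

Those come to 200 and 1,000 pages per work day respectively.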

itloc

3:26 am on Nov 8, 2004 (gmt 0)

10+ Year Member



Hey BillyS

I never said that I entered 500'000 pages by hand. I am crazy... but not that crazy... :-)

It's the functionality around the listings that generates these pages...

Roger

donovanh

1:03 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



I have a site that used to have over 150k pages indexed. In the last few weeks this has dropped to fewer than 40k, even though Googlebot has been hitting really hard, grabbing over 300k pages in the last week.

Is anyone seeing the results of this crawling appearing in the SERPs? It's pushing me toward my bandwidth limit as it is, so was wondering if it might be worth putting a delay into the robots.txt to temper the beast...

Lorel

4:41 pm on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't know about Googlebot as I have a Mac and so that software is either too costly or I don't have time to install it/learn it and I'm too busy building websites anyway.

However, referral visits to my web site have been increasing steadily for the last 2 months by 8-10% each week; 95% of that is Google traffic, and keyword rank is steadily rising in Google. I have never paid for clicks, etc. Some of the client shopping sites I manage are doubling or tripling referral visitors, probably Christmas shoppers for those, but I don't know what's causing the rise in visitors to my own site. I'm enjoying it, though (spending 2-3 hours per day writing estimates).

Lori

Dayo_UK

4:45 pm on Nov 8, 2004 (gmt 0)



> Is anyone seeing the results of this crawling appearing in the SERPs?

Nope. (Although pages have been added via other crawling - but not the massive crawl that took place a few days ago)

It has been said that Google does not do major updates nowadays. However, when/if all this data does get added to the index, it should cause a big shift!

itloc

5:02 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



Well,

If that is true, then I have just missed that, and that is kind of depressing.

However, I hope you're not right.

Roger

victor

5:14 pm on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> It's pushing me toward my bandwidth limit as it is, so was wondering if it might be worth putting a delay into the robots.txt to temper the beast...

Googlebot does not honor the crawl-delay parameter -- it's a useful, but non-standard, extension to the robots.txt standard.

I emailed google and told them to back off or be banned. I told them the highest acceptable crawl rate for my site.

They replied in a day that they'd adjusted the rate for the site. They are still crawling, but at an acceptable pace.

Odd that they had to tweak their crawler by hand. I thought they preferred to do everything by algorithm.
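For reference, this is roughly what the crawl-delay extension mentioned above looks like, and how a crawler that does honor it might read it. The sketch uses Python's standard `urllib.robotparser`; the robots.txt content, the 10-second delay, and the bot name "ExampleBot" are all made up. As noted, Googlebot ignores this directive, so it would only help with other crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content using the non-standard Crawl-delay extension.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler would wait this many seconds between requests.
delay = parser.crawl_delay("ExampleBot")
print("crawl delay:", delay)

# The usual allow/disallow rules are read from the same file.
print("may fetch /index.html:", parser.can_fetch("ExampleBot", "/index.html"))
print("may fetch /private/page.html:", parser.can_fetch("ExampleBot", "/private/page.html"))
```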

itloc

5:19 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



Hi Victor

May I mail them to set the rate UP for my website? :-) I am really nearly unlisted, and that is kind of depressing...

site:mydomain gives me 133 results ... no change since last Friday... :-(

Roger

jnmconsulting

8:07 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



"I emailed google and told them to back off or be banned. I told them the highest acceptable crawl rate for my site."

Well then, I'm sure they made the changes because you threatened to ban their bot from your site... That would scare me as well...

Just kidding BTW

GoogleGuy did stop by one of the posts and indicate that they did need to throttle it down a bit... for obvious reasons.

jnmconsulting

9:47 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



I have now noticed in my sector that the returned results for "blue things" have gone from 341,000 to 129,000. That is a big drop; however, my site has not moved at all. I have also seen 75% of the URLs that were shown using site:www.mysite.com drop from 1265 to 178.

This started the day after the new Gbot hammered my site for 9 hrs.

Frustrating...
