I have been getting hit hard by Googlebot too.
Recently (it's a new site) it has sucked up over 250 pages, even my forum.
The only other thing I'm wondering now is when am I gonna get a PR... jeesh
It seems that Gbot has settled down a little bit; spidering seems normal again. What do you think?
One of my sites has been hit every day for the last 5 days by multiple G bots; I see 12 different IP addys.
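For anyone who wants to put numbers on this from their own logs, here is a minimal sketch that counts hits and distinct IPs per day for a given user agent. It assumes the server writes Apache Combined Log Format (like the log lines pasted elsewhere in this thread) and that matching on the substring "Googlebot" is good enough; the function name and the pattern are illustrative, not taken from any tool.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed):
# ip ident user [day:time zone] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def summarize_bot_hits(lines, agent_substring="Googlebot"):
    """Return {day: (hit_count, distinct_ip_count)} for matching user agents."""
    hits = defaultdict(int)
    ips = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        # Skip lines that do not parse or belong to another user agent.
        if not m or agent_substring.lower() not in m.group("agent").lower():
            continue
        day = m.group("day")
        hits[day] += 1
        ips[day].add(m.group("ip"))
    return {day: (hits[day], len(ips[day])) for day in hits}
```

Feed it an open log file (for example `summarize_bot_hits(open("access.log"))`, where the file name is a placeholder) and compare the per-day counts over a week. Note that the user agent string can be faked, so this only counts what claims to be Googlebot.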
Gbot is taking a break... still spidering but not as much. I get about 200 hits on a 3-4000 page site.
Now I'm getting worried. The last time I saw a fresh tag for one of my websites was September 29th. I actually saw this tag appear on the 29th.
Now fresh tags are missing for me - two days running. I have more pages showing up in Google than ever before for this website when I query "site:www.widgets.com widgets". But I just checked my logs: no Googlebot for 2 days straight - ARGHHHHHHHHHHHHHHHHHHHH!
Thoughts? (no black hat, in case you're wondering)
Hi there all,
First I'd like to take a moment to introduce myself, as this is my very first post to your website. I'm Casey and live down Scottsdale, Arizona way.
I'm relatively new to web design (but not just-dropped-off-the-turnip-truck new) and will be putting my efforts toward a Native American Art website, so I'll be turning to this section a lot in my efforts to learn more about Google and how it operates, along with all the other valuable information on SEO to be found here. However, right now I feel rather inadequate with this vast amount of knowledge reeling by me at the turn of every page. I will try to find answers to my questions first by going to your Search Area, but hope you will bear with me if I ask some elementary questions now and again.
Welcome, Casey. :)
I just posted this over in the robots forum too (awaiting review). More stuff from Google (I think)
22.214.171.124 - - [02/Oct/2004:08:31:18 -0400] "GET /robots.txt HTTP/1.0" 200 484 "-" "stat (email@example.com)"
126.96.36.199 - - [02/Oct/2004:08:31:19 -0400] "GET / HTTP/1.0" 200 19826 "-" "stat firstname.lastname@example.org"
Anyone else see this one yet? I could not find anything on the web. Looks like it resolves to Speakeasy?
CustName: SFO BRIDGED CIRCUITS
Address: 440 Mission Court
NetRange: 188.8.131.52 - 184.108.40.206
CIDR: 220.127.116.11/32, 18.104.22.168/31, 22.214.171.124/30, 126.96.36.199/29, 188.8.131.52/28, 184.108.40.206/27, 220.127.116.11/26, 18.104.22.168/25
TechName: Stollar, Andreas
OrgTechName: Stollar, Andreas
# ARIN WHOIS database, last updated 2004-10-01 19:10
# Enter ? for additional hints on searching ARIN's WHOIS database
It's a non-Google bot that's owned by someone that happens to have a gmail account.
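A user agent string or an e-mail address in the log proves nothing about who runs a bot. One way to check is a reverse DNS lookup followed by a forward lookup that must point back to the same IP. The sketch below assumes that genuine Googlebot hosts resolve to names ending in googlebot.com or google.com; that suffix list comes from Google's later published guidance, not from anything in this thread, and the function name is made up for illustration.

```python
import socket

# Assumed suffixes for genuine Google crawler hostnames (not from this thread).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def looks_like_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a single IP address.

    The two lookup callables can be swapped out for testing; by default
    they use the system resolver.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False  # no PTR record or resolver failure
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the IP we started from.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

The forward lookup matters because anyone who controls the reverse zone for their own IP range can make it return a Google-looking hostname.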
Weird, I typically get what I guess you'd call fresh tags every 2-3 days, but I just checked and the cache for my home page has reverted to what was retrieved on 9/12/04.
I watch a SERP with about 30 results in it, several times per week (it is a result for some incorrect information that is printed on other sites that they have been asked to remove).
The SERP has been stable for months, except for the reduction from about 50 results where sites gradually comply with the removal request.
The SERP was rearranged a bit a few days ago, with several results dropping out even though they haven't yet made the requested change; but today there is a major change. The results have been almost turned upside down. Looks like Google has built and published a new index based on the massive spidering that they did a week or so ago.
I think Google and MSN are probably testing their new crawlers.
Possibly Google wanted to test and identify cloaking pages through the Mozilla user agent, with which it could follow more redirections as well as crawl other kinds of documents (e.g. Flash, MS Word docs, etc.).
MSN is about to demonstrate its search capabilities to a panel of people, so possibly they increased the spidering to apply their algorithm to as many pages as possible.
The other IP I saw on the previous page here is not from MSN or Google.
|Pass the Dutchie|
|Possibly Google wanted to test and identify cloaking pages |
Yeah, you could be right if these are all new IPs Gbot is using.
|Looks like Google has built and published a new index based on the massive spidering that they did a week or so ago. |
This is what I am seeing too. In my areas, it looks like the index has been rebuilt from the ground up. I still think this has to do with a cloaking and meta refresh/302 problem that G was having.
New IPs. New datacenters? New index? Don't know for sure yet.
Looks like a bunch of junk to me. Google just went out and tried to find some fresh pages to add to their index. It does not look like any ranking factor has been introduced as of yet. These results will change over the next few weeks as the weighting factors are applied.
Just to help clear up an earlier discussion in this thread on the number of servers at Google between Hanu and Lord Majestic, I just came across this pdf: - [research.ibm.com...]
Page 39 shows the number of servers and queries at various points between Nov-98 and Aug-02. Charting these figures in excel, and using the 200M queries/day stat from page 35, makes it look like there would be about 12,000 servers for search now + maybe a few for adwords/adsense/gmail/etc.
The last stat they give for number of servers in Aug-02 (handling 150M queries) is shown as >10,000, but looking at the data in Excel suggests it shouldn't be much over that unless they were hitting some scaling issues.
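As a rough cross-check on the figures above, here is a naive estimate that assumes servers scale strictly in proportion to queries, using only the two numbers quoted in this post (about 10,000 servers at 150M queries/day, and 200M queries/day now). The function name is made up for illustration, and this is not the charted fit over the full series from the PDF.

```python
def scale_servers(known_servers, known_queries, target_queries):
    """Proportional estimate: assumes servers grow linearly with query volume."""
    return known_servers * target_queries / known_queries

# Figures in millions of queries per day, as quoted in the post.
estimate = scale_servers(10_000, 150, 200)
print(round(estimate))  # 13333
```

Strict proportionality gives a somewhat higher number than the roughly 12,000 read off the chart, which would be consistent with each server handling more queries over time. Since the Aug-02 figure is given only as ">10,000", neither number should be taken as more than a ballpark.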
At Pubcon 6 there was a well-known Google employee who said the official figure that they admit to was 14,200 or 14,500, I can't remember now, but then gave his normal big beaming smile ;)
Now, us lot being on the outside of the Googleplex... well... the figures I have heard now exceed 100,000... who knows?
I run a very small, very niche search engine (for kids) which has under 500 listings. Gbot has performed 17,000 queries of my search results pages in the past week. You think they got the information they are after?
FRESH TAGS... I monitor the SERPs in my industry 5 times a day, and let me tell you, the tags are all fresh in the top 10 as of 5/10/04.
Are you running a forum or portal that assigns session ids? That can cause bots to freak out, b/c they think each new sid is a new page.
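To see how much of a crawl is the same page under different session ids, one option is to strip the session parameter from logged URLs and count what is left. This is only a sketch: the parameter names in `SESSION_PARAMS` are guesses at common ones (check what your own forum software uses), and the function names and example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed session parameter names; adjust to match your own software.
SESSION_PARAMS = {"sid", "phpsessid", "sessionid"}

def strip_session_id(url):
    """Return the URL with any session-id query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

def distinct_pages(urls):
    """Collapse URLs that differ only by session id into one entry each."""
    return {strip_session_id(u) for u in urls}

urls = [
    "http://www.example.com/viewtopic.php?t=42&sid=aaa111",
    "http://www.example.com/viewtopic.php?t=42&sid=bbb222",
    "http://www.example.com/viewtopic.php?t=43&sid=ccc333",
]
print(len(set(urls)), "raw URLs,", len(distinct_pages(urls)), "real pages")
```

If the raw count is far above the page count, the bot is most likely re-fetching the same content under fresh session ids.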
|Gbot has performed 17,000 queries of my search results pages in the past week. |
>Are you running a forum or portal that assigns session ids? <
Nope, just a querystring with the search term and page number...
I have gotten 18,000 hits over the last 4 days for an 800 page site.
I'm up to 34k hits this month alone from Googlebot, and almost a gig of bandwidth used. Google used to track 35 pages for me (as I said before); it then jumped to about 600, last night it was 900, today it's 750, so Google seems to be all over the board on its results lately. It's almost like things change on a minute by minute, search by search basis for me. (The GoogleDance Tool shows the same results for all datacenters, yet all datacenters change throughout the day.)
You know, traffic is picking up steadily, I think I will retract anything that sounded like a complaint and ask the Gbot to come back for more....
ds, what is the Google dance tool you are referring to?
|ds, what is the Google dance tool you are referring to? |
He is referring to a tool available on a popular SEO website that allows you to see SERPS from different Google datacenters side by side for comparison.
Go to google and type "google dance tool". The number one result should take you there. Hopefully it hasn't changed since this post time. ;)