My first post here....
I have been studying Google closely, and it was acting weird all day yesterday. I got searches that returned half the number of results they normally do, then my new pages were listed, MSN's page rank went down (hehehe), but now it is all back to where it was.
I love Google - just hope my pages end up a little better than they were, I had figured out what the page rank should be for one - and it was off by way too much.
Anyway, glad to be here - hope everyone's pages work out well (well not everyone, but everyone here :) ).
One other interesting note. There is an error (actually several) in the Google web directory in the adult section, and it is pretty severe (categories empty). I hope they fix it - but oddly enough, the category button showed the correct category even though Google shows it as empty. I don't want to post where, because it is an adult section, but check out the A & B sections of any of your categories that use the alpha bar - they might be empty too.
joined:Apr 13, 2001
This is a 10-day processing cycle, which is shorter than the 14-day cycle I've come to expect. The crawl was a bit shallower than the previous crawl, even after disregarding the fact that they skipped one of my sites entirely.
There is also evidence that Google is doing overlapping cycles. My site has been crawled extensively the last few days, and half-heartedly about five days earlier. It used to be that there was a single cycle, but now it looks like smaller, overlapping cycles.
For any site where Google was not able to make it through all the pages before stopping under the old crawl pattern, the question for this new pattern is obvious:
For these mini overlapping cycles, is Google starting from square zero each time, and crawling shallower, or are they continuing where they left off the last time?
Too early to tell, but I have a bad feeling that it's the former -- more frequent, but also more shallow.
joined:Apr 13, 2001
The only thing I had concluded based on the consistency of the October to April pattern was that they had to stop crawling and start processing all at once. Typically it would crawl for a week, and get downright feverish by the end of the week, and then stop cold and never come back to crawl until the cycle restarted three or four weeks later.
I speculated on the basis of this that they had to turn off the crawl so that those PCs could start processing the data. My site was deep enough so that I noticed the cutoff. Other sites might not notice, because Google would get through their site before it became time to stop.
I don't know what Google is doing now.
joined:Apr 13, 2001
where AAAA_BBBB_CCCC is a proper name.
The XXXXX is always the same.
The YYYY alternates between two cgi programs, but I've locked Google out of one of these by returning a "Server too busy" because all the names are covered with the other cgi program.
The AAAA_BBBB_CCCC is always changing.
Each page returned from the above link has from several to several hundred additional links on it in the same form, but with new names in the links.
Each of these also links to a page with from several to several hundred in the same form. The page itself is usually less than 50K bytes.
And so on, and so on. That's deep.
It would be possible to run out of names after 115,000 pages if: 1) Google got that far, and if: 2) Google could detect on the fly whether it already got that name, and if: 3) Google stopped asking for that second cgi program that repeats the name and always comes back "Server too busy" because I've locked them out of this search that returns a Java applet.
As it stands now, Google would actually have to get 230,000 pages to run out of names (115,000 names times two cgi programs), assuming it can detect and skip duplicates on the fly. Half of these would be "Server too busy."
With six crawlers working at once, I don't think Google can detect duplicates on the fly, because I don't think the crawlers are talking to each other much, if at all. So I suspect that it's getting the same name several times, and these get purged later into just one page for each name. Very inefficient.
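To make the "detect duplicates on the fly" idea concrete, here is a minimal sketch of a crawl loop that keeps one shared set of names already seen, so a name reached through either cgi program is fetched only once. This is only an illustration, not a description of how Google's crawlers work. The URL shape (`?name=...`), the `name_key` and `crawl_with_shared_seen` functions, and the `fetch_links` callback are all hypothetical.

```python
from collections import deque
from urllib.parse import urlsplit, parse_qs


def name_key(url):
    """Return the proper name carried in a link, ignoring which cgi program serves it.

    Assumes a hypothetical link shape such as .../cgi-bin/prog?name=AAAA_BBBB_CCCC.
    """
    query = parse_qs(urlsplit(url).query)
    return query.get("name", [""])[0].lower()


def crawl_with_shared_seen(start_urls, fetch_links, limit):
    """Breadth-first crawl that skips any link whose name was already fetched.

    fetch_links(url) is a caller-supplied function returning the links on a page.
    """
    seen = set()              # one set shared by the whole crawl
    fetched = []
    queue = deque(start_urls)
    while queue and len(fetched) < limit:
        url = queue.popleft()
        key = name_key(url)
        if key in seen:       # same name via the other cgi program, or a repeat
            continue
        seen.add(key)
        fetched.append(url)
        queue.extend(fetch_links(url))
    return fetched
```

If each of several crawlers kept its own private `seen` set instead, the same name could be fetched once per crawler and would have to be purged afterwards, which is the inefficiency suspected above.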
Usually I end up with from 20,000 to 40,000 useful pages in the index before it quits. By "useful," I mean a page that isn't merely "Server too busy." (Actually these "too busy" pages aren't entirely useless, because the name is in the link and folks hit on it. It's just Google's cache copy that's useless.)
In all, Google often ends up with lots of obscure names when they ought to be going after the least obscure names.
That's why I'd just like to send them a CD-ROM once a year, with the best data, all laid out and Linux-ready per their specs. No response from Larry Page on this, and it's been six months.
joined:Nov 11, 2000
I get weird results from Google today, May 23; sometimes my site shows up as it used to, ten minutes later it does not show at all, not even in all the pages Google lists. All gone. Even other sites I have that were not well-ranked, are sometimes now not there at all; other times they are. Any clues?
Did the page(s) that pointed to your site go away?
It could just be an update thing. There is a part of the Google update procedure that goes like this:
(Drop all child links - redo something or other - and then add them back in)
At one point or another, those on the bottom of the food chain are dropped so Google can do some sort of iterative process, and then they are added back in. Maybe that is what you are seeing.
I would be surprised if they totally got rid of your pages altogether after everything is said and done.
Here is a quote from one paper:
"Dangling links are simply links that point to any page with no outgoing links....Often these are ....simply pages we have not downloaded yet.....we simply remove them until all the PageRanks are calculated. After all the PageRanks are calculated, they can be added back in...."
This wasn't the quote I was looking for - but it is something like that.
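For anyone who wants to see what "remove them, calculate, add them back in" could look like, here is a rough sketch on a toy link graph. It is only a guess at the general shape and not Google's actual procedure. The function name, the damping value of 0.85, the fixed iteration count, and the way dropped pages get a score when they are added back are all assumptions made for illustration.

```python
def pagerank_without_dangling(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    out = {p: set(links.get(p, ())) for p in pages}

    # Step 1: repeatedly drop pages with no outgoing links into the remaining set.
    core = set(pages)
    removed = []
    while True:
        dangling = [p for p in core if not (out[p] & core)]
        if not dangling:
            break
        removed.append(dangling)
        core -= set(dangling)

    # Step 2: plain power iteration on the pages that are left.
    rank = {p: 1.0 / len(core) for p in core}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(core) for p in core}
        for p in core:
            targets = out[p] & core
            share = damping * rank[p] / len(targets)
            for t in targets:
                new[t] += share
        rank = new

    # Step 3 (assumed rule): add dropped pages back, last removed first,
    # scoring each from the pages that link to it and already have a rank.
    for batch in reversed(removed):
        for p in batch:
            rank[p] = damping * sum(
                rank[q] / len(out[q]) for q in pages if p in out[q] and q in rank
            )
    return rank
```

While step 2 is running, the dropped pages have no score at all, which would fit pages disappearing for a while and then coming back.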
Thanks for the info. Yes they dropped every page. 3 sites, and every page on those 3. I can tell because if I do a keyword search for the brand product I sell (say Black and Decker), other sites which I am familiar with that sell the same brand, or sites that are linked to me, show. But not my site(s). But then, as I say, about 10 minutes later I will show, and the other listings that show on that page when I am shown are completely different than the listings that show when I am not coming up.