I am talking about traffic of nearly 7k a day from Google, so it's a sizable decrease.
Looking for early answers on how we could check these things.
The theory on the trouble with calculating the homepage is interesting, but it doesn't affect my site. I am listed on other homepages, however, so if they dropped, it is possible that affected my site.
Of course, determining this without any PR update is almost impossible, so I can only wonder how Google figures out what they do. Do they have a special toolbar to show the real PR?
Anyway, looks like we are stuck with this drop for now.
Google exhibited clear signs of being broken in April and May, 2003. This was when the monthly crawl, PageRank calculation, and update dance were ended. They threw out an entire crawl and reverted to the previous month's index. Since then they have never returned to the old monthly cycle that had served them so well for about three years.
In June there was what I believe to be an insider leak of what happened. You can still find this on WebmasterWorld here [webmasterworld.com].
Ever since June 2003, no one has convincingly challenged that interpretation by re5earcher.
It was very clear to Google by then that Yahoo was gearing up to compete. Yahoo had already bought Inktomi, and Overture, Alltheweb, and Altavista were acquired in mid-2003. Microsoft was already crawling the web on an experimental basis.
In November we had Florida. Many sites were dropped. Google was forced to turn back the knob on Florida because of all the screaming.
Since Florida, I've been watching a few nonprofit sites with thousands of pages. The number of pages that are indexed by Google, as opposed to merely having the URL listed in Google, has declined across the board. Traffic from Google is at an all-time low on one nonprofit site I've been watching. Most of the pages are listed as URL-only. Dot-org sites like this were not affected by Florida, but they've been squeezed slowly for the last seven months.
My theory is that when Google was confronted with the 4-byte integer problem, they looked at several factors:
1. Microsoft and Yahoo were knocking on the door.
2. Google's profits from ads were extremely hot, and climbing, and Google had a window of opportunity to get really rich before the competition had time to kick in the door.
Google could either put their resources into their organic algorithms, which would take a year of planning and effort to expand to a 5-byte docID, or they could put the organic results on hold and maximize their profits from ads.
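To put numbers on the 4-byte versus 5-byte docID idea, here is a minimal sketch. It assumes the docID is a plain unsigned integer with every value usable, which is only the thread's theory and not anything Google has confirmed. The function name `docid_capacity` is made up for illustration.

```python
def docid_capacity(num_bytes: int) -> int:
    """Number of distinct IDs an unsigned integer of num_bytes bytes can hold.

    Assumption: the docID is a plain unsigned integer and every value is usable.
    """
    return 2 ** (8 * num_bytes)


four_byte = docid_capacity(4)            # ceiling for a 4-byte docID
five_byte = docid_capacity(5)            # ceiling for a 5-byte docID
growth_factor = five_byte // four_byte   # how many times more IDs 5 bytes allow

print(f"4-byte docID: {four_byte:,} documents")
print(f"5-byte docID: {five_byte:,} documents")
print(f"5 bytes give {growth_factor}x the room")
```

Under that assumption a 4-byte ID tops out at 4,294,967,296 documents, and one extra byte would give 256 times the room.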
With the IPO always a possibility, Google decided to go for the ads, and put the organic results on hold. The result is that for the last year or so, in order to get a page into Google, another page had to come out. The old method of calculating PageRank was abandoned in favor of "guessing" PageRank based on the parent directory, or on a small sampling instead of a recursive calculation involving the entire web. The fresh bot, which had been working since August 2002, was expanded so that it almost replaced the old monthly crawl cycle.
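For anyone who has not seen what a "recursive calculation" of PageRank looks like, here is a rough sketch of the textbook iterative version on a toy four-page web. This is not Google's production code. The damping factor of 0.85, the page names, and the handling of pages with no outbound links are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook iterative PageRank.

    links maps each page to the list of pages it links to.
    Assumption: every link target also appears as a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Page with no outlinks: spread its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: nothing links to "orphan".
toy_web = {
    "home": ["about", "download"],
    "about": ["home"],
    "download": ["home"],
    "orphan": ["home"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:10s} {score:.4f}")
```

The point is that each page's score depends on the scores of the pages linking to it, so the whole graph has to be iterated together. Estimating a page's rank from its parent directory or from a small sample would skip that step.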
Google has basically abandoned the main index in favor of cashing out. I think it's time for folks to stop trying to second-guess the logic behind which sites drop and which don't. Given the fact that the web is growing at a good clip, and Google is required to appear fresh or face ridicule, and given the fact that they are limited by their 4-byte docID problem, the bottom line is that innocent pages will get dropped all over the place. This is true even if you assume that Google is making a good-faith effort to restrict the dropped pages to those spammy ones that might deserve to be dropped.
Google's priorities are with ads. They've abandoned pure search. The fact that they can claim they're doing a good job on keeping spam out of the main index means that they've completely lost perspective about what's going on with the main index. Or, their priorities are clear in their own minds, but they have to keep spinning the myth of excellence in algorithmic search, which is how Google got on the map in the first place.
PLAYBOY [referring to ecommerce spam in the main index]: Playing cat and mouse like this, how can you be sure to stop them?
PAGE: We have a lot of people devoted to stopping them. We do a good job.
BRIN: People try new things all the time. By now, the people who succeed have to be very sophisticated. All the obvious or trivial things one might think of have been done many times, and we've dealt with them.
PAGE: It's going to get harder and harder to do these things. However, the benefits are obviously large, so some people will try to manipulate the results. Ultimately, it's not worth it. If you're spending time, trouble and money promoting your results, why not just buy advertising? We sell it, and it's effective. Use that instead. Advertising is more predictable and probably more effective.
This is interesting. It explains the great loss of traffic and in some cases sales. I bet those obscure, longer searches are the most profitable ones too.
Perhaps Google just decided to provide "certain" results for those profitable searches, that is, results matched only by AdWords, by eliminating the possibility of "contamination" from otherwise great free SERPs. If they can somehow steer the best traffic towards AdWords clients, they will be maximizing their overall gross in the end.
no one has convincingly challenged that interpretation by re5earcher.
Scarecrow, good points, and fairly self-evident given the completely static total pages indexed count over almost the last year now. I read that thread with great interest, and took especial note of the strongest arguments for and against the thesis, as well as who put those forward. The original posting satisfied a basic research theory axiom of being the simplest and most comprehensive model put out to explain a wide range of behaviors, and it still is. Sandbox, incomplete large site indexing, dropped sites, etc. ad nauseam: why look for complex explanations when simple ones do fine?
The biggest mystery to me is not the points you raised, but why Google doesn't just pretend to increase the indexed page count on the home search page, since no one would ever know the difference anyway.
it's just that in June 2003 Google stated it had: 4,294,967,296 pages
and now, August 2004 it says it has 4,285,199,774
not an inconsiderable amount of pages are out then!
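A quick check of the arithmetic on those two figures, using only the numbers quoted above. The variable names are just for illustration.

```python
# Page counts as quoted in the post above.
june_2003 = 4_294_967_296
august_2004 = 4_285_199_774

dropped = june_2003 - august_2004
percent = 100 * dropped / june_2003

print(f"Pages gone: {dropped:,}")
print(f"Share of the June 2003 figure: {percent:.2f}%")

# The June 2003 figure is exactly the number of values a 4-byte unsigned integer can hold.
print(june_2003 == 2 ** 32)
```

So the count is down by 9,767,522 pages, a bit under a quarter of one percent, and the June 2003 figure is exactly 2 to the power of 32.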
it's just that in June 2003 Google stated it had: 4,294,967,296 pages
Google's numbers have been screwy for almost a year now. They say I have more pages in particular directories than have ever existed in those directories. Not a few more, but usually about 50 to 100 percent more.
Many people whose sites have seen no impact whatsoever from this change were not even aware of an update.
I believe it is because they have good inbound links and did not rely on their site optimization (cross links inside their sites). Then, naturally, their rank did not change. I can see it in my own example. My site is rather old and has different topics (directories). The pages that had good (independent of me) inbound links have not changed their PR. The pages that got PR mainly from other pages in my site and from download sites got PR0. But among the pages that got PR0, one that has good outbound links to independent sites now has PR2.
If they simply devalue these techniques, I don't really care. As yet I have not really heard any reliable theories on what is going on with Google, so I am sitting tight.
Of course, really only Google knows. But I believe that they are clever and reasonable. So they indeed probably just devalue some techniques.
GoogleGuy himself just explained in another thread that Google's algorithms try to determine the home page
but then won't ALL index pages have something nice to say about themselves? And come on, you ought to.
I did not say that Google has a special low rank for the home page. Simply, if Google now relies more on what other sites say (inbound links with good PR), the index page should *in general* have a much lower rank than other site pages. For example, when I link to other sites I try to link to the page with specific information, relevant to the topic of my page. And (surprise!) as a rule it is not an index page. Others seem to do the same. When I now try to find the new part of my site in Google, I first find numerous download sites that point to me and, after them all, (you guessed right!) my download page and not the index one.
Links from other domains have only a very small higher likelihood of being unrelated than links within a domain. Google is not going to ignore the reality of the web, let alone play so obviously into the hands of template/duplicate spammers.
devaluing internal links and boosting external links, that will only play into the hands of people with a large network of sites.
I believe PR is involved here. I also believe that Google tries to detect spammers and some sites that by their nature just duplicate content (e.g. download sites). I believe that for template/duplicate spammers it is very difficult to have much unique content, so they will in general have low PageRank. And if one site tries to spread its good PR to unusually many sites (a network), it is probably rather easy to detect. This may be the reason why PR dropped for some SEOs and for the oldest sites. If an old site tries to spread its PR to unusually many dependent sites, it indeed looks like spam.
One theory is that perhaps Google went to crawl my site, and when it returned a dead link perhaps they took me out of the index. On the Google FAQ page they list web hosting problems like this as a potential factor for a site disappearing. Anyone out there with any other thoughts/suggestions?
[edited by: DaveAtIFG at 5:32 am (utc) on Aug. 18, 2004]
[edit reason] No URLs please [/edit]
I agree with you. Google should just come up with a bunch of good, but very different algos and just rotate them, weekly, daily, or hourly, as you suggested in message #489. Share the wealth, stop the excessive, obsessive, compulsive SEO and spamming. Finally, a SUGGESTION and not just whining or complaining. Just do it, Google! Seems like a simple and yet effective solution to this on-going problem and cat and mouse game.
Google should just come up with a bunch of good, but very different algos and just rotate them, weekly, daily, or hourly
Instead of Google SERPs being a contest, they would be a charity handout. Nothing wrong with that in principle, but it means Google wouldn't be a search engine any more, Yahoo etc. would take all the business, and the status quo would resume.