Why does the 'Google Lag' exist?

Trying to understand its purpose.

         

bakedjake

1:43 am on Sep 29, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I had some in-depth discussion this weekend with some friends about the sandbox. Every theory on how to beat it kept coming back to one central problem - no one is sure why it exists.

I feel very strongly that until we have a good grasp on why it exists, it will be very hard to beat.

I don't buy the explanation that it's intended to be a method of stopping spam. Why? One, it is doing too much collateral damage. Two, if you accept the 80/20 principle (20% of spammers are doing 80% of the spamming), and you realize that there are already multiple ways of beating the sandbox that all of those spammers are aware of, it doesn't make sense anymore.

So, why does the sandbox exist?

The most obvious effect of the sandbox is that it prevents new domains (not pages) from ranking for any relatively competitive term. So, start thinking like a search engine - what would be the benefit of this?

chrisnrae

2:09 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"That's a HUGE risk"

I think it would be riskier to have an entire legion of webmasters who have your number. SEO newbies and novices had learned how to game Google with ease. If you don't figure out a way to make your algo a mystery, you'll soon be dead in the water. IMO.

jnmconsulting

2:20 pm on Sep 29, 2004 (gmt 0)

10+ Year Member



Why would they spend time completely redesigning their algo and recreating the index when they arguably have the best search engine already?

Lots of companies and businesses have a tendency to become complacent; they lose the drive that got them to the top to begin with. A successful company will always be striving to be better.

A lot of top companies have hit a tree while looking in the rearview mirror at the competition.

That was a quote from Newsweek a while back.

sean

2:30 pm on Sep 29, 2004 (gmt 0)

10+ Year Member



I'm not the most technical webmaster, but Google looks like an engine that is within 0.002% of a docID limit.

Surely, Ye Olde Time Google, without constraints, would have come up with infinitely more elegant solutions.

I vote for the new index theory, guessing we'll see it early 2005.

jnmconsulting

2:39 pm on Sep 29, 2004 (gmt 0)

10+ Year Member



DocID limit!

I read a couple of articles on this subject, but I cannot find them now. My understanding was that with some minor changes they would be able to overcome the DocID limit that was initially set up in the beginning. In order to do this they have to reindex... looks like this is what they are doing. Does anyone know of any articles that explain in detail the DocID limitations?

ownerrim

3:52 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"i think that pagerank is almost dead in the water"

PageRank's importance may be minimized now, and may be minimized even more in the future, but I don't see it ever being dead. You have to have some fundamental way of doing the macro-sorting of webpages. Now, once you get beyond the big sort, you refine it further with relevancy criteria such as keywords, titles, links, anchor text.

If PageRank ever completely died, there'd be no way to do business startups on the web and get them going inside of a number of years (short of paying for advertising such as AdWords). Even info sites would take years to get noticed, particularly since they generally don't solicit links but get them organically over a period of years.

ownerrim

4:18 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Of course, any really serious mucking around with the manner of indexing might have AdSense and Google-revenue implications.

decaff

4:49 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



BakedJake...

My take on this process is that Google has been doing some serious "data mining" for the last couple of years as it develops out its algo and engine...and tracks all sectors of its SERPs...

Data mining is all about "trending" the data...looking for patterns and anomalies..
One consistent pattern would be ... that established Web sites with a history should show "normalized" behavior..which simply means..that these sites are addressing their respective "audience" through content initiatives...building link relationships (and these would typically occur through a one-to-one type process...not hundreds of new links suddenly showing up on the radar)..
and so on...

Today...when a new site that is trying to "suddenly" compete in a competitive area shows up in the SERPs with hundreds of inbound links, a huge number of content pages but no real history...this raises a red flag as far as Google is concerned...this would fall under the "anomalies" aspect of data mining..
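
To make the "anomaly" idea concrete, here is a minimal sketch of that kind of check. It is purely illustrative: nobody in this thread knows how Google actually does it, and the `SiteSnapshot` fields, the `looks_anomalous` function and all of the thresholds are hypothetical values picked for the example.

```python
from dataclasses import dataclass


@dataclass
class SiteSnapshot:
    """Hypothetical summary of what a crawler might know about a domain."""
    domain: str
    age_days: int        # days since the domain was first seen
    inbound_links: int   # inbound links discovered so far
    content_pages: int   # pages discovered so far


def looks_anomalous(site: SiteSnapshot,
                    min_age_days: int = 180,
                    max_links_per_day: float = 2.0,
                    max_pages_per_day: float = 5.0) -> bool:
    """Flag a young domain whose links or pages grew faster than the
    (assumed) rates that an established site would show."""
    # Domains with enough history are treated as "normalized" here.
    if site.age_days >= min_age_days:
        return False
    days = max(site.age_days, 1)  # avoid dividing by zero on day 0
    link_rate = site.inbound_links / days
    page_rate = site.content_pages / days
    return link_rate > max_links_per_day or page_rate > max_pages_per_day


new_site = SiteSnapshot("new.example", age_days=30,
                        inbound_links=400, content_pages=5000)
old_site = SiteSnapshot("old.example", age_days=2000,
                        inbound_links=400, content_pages=5000)
print(looks_anomalous(new_site), looks_anomalous(old_site))
```

The same link and page counts get flagged on the 30-day-old domain but not on the established one, which is the "no real history" red flag in a nutshell.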

It's a no-brainer for Google to see new sites and what they bring to the table...and also to track the competitive sectors and watch for this type of "SPAM" behavior...
or patterns to design responses to..

Google is simply trying to meter the process some with the "sandbox" initiative...and you can bet that the big established companies are "communicating" their needs to Google through "legislative type channels" (think Washington, D.C. and how Bills and Laws are developed and implemented through "lobbying")...

Google has to find a way to control this process as they continue to expand their global/multi-language reach or the SERPs will become a sea of useless information...

The collateral damage has to be an accepted factor in all of this...and yes...this means that sites that don't use apparent "SEO tactics" and simply want to address and serve their "visitor base" may be affected by these algo changes (and in some cases we know that Google will manually look into really aggressive situations)

The other argument is that Google is looking for ways to generate more revenue...so this "sandbox" process may be forcing some advertisers to step into the PPC thing while their "new" content is in the "silicon queue box"

(now back to my automated content generator and link spamming software...oooohaaahaaa! - just kidding)

DaveN

4:51 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Of course, any really serious mucking around with the manner of indexing might have adsense and google-revenue implications....

WHY!

Think about update Florida. If you searched for "real leather sofa"

#1 was a porn site and #3 was a porn site! I have lists upon lists of bad searches,

one company using subs took the top 300 slots, yes 300!

Did it affect AdWords - AdSense - or how long an engineer spends on his lunch break.... I think not

Google know they have holes in the index, so there are ways people can buck the system and earn a buck....they need to fix these holes. IMO every time they have plugged a hole they have made another one, but that's common when firefighting problems. As an ex-database programmer: sometimes all the fixes and patches cloud what you are trying to achieve... it just makes sense to get a clean DB and run a brand new algo on it. Can anyone remember what happened when they tried to integrate the OCR spider into the natural spider LOL?

DaveN

Midhurst

5:26 pm on Sep 29, 2004 (gmt 0)

10+ Year Member



Graywolf,
I've joined this thread rather late. You may have already answered this point:

"site wide links will push you into the sandbox if you're on the edge."

Why? Is it the loss of PR?

graywolf

5:44 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Midhurst, I dunno why, I just know it did for me. I have established sites with site wides that work just fine, but for new sites it seems to be like aiming a howitzer at your own penny loafers.
This 354-message thread spans 36 pages.