
Google SEO News and Discussion Forum

This 44 message thread spans 2 pages.
GoogleFox or FireBot?
Cross Breeding a browser and Spider?
Brett_Tabke




msg:738744
 3:51 pm on Jan 25, 2005 (gmt 0)

The news that Google has hired programmers who worked on the FireFox browser has generated many stories around the web. Ideas ranging from Google building a browser to Google working on a full-blown operating system have been suggested.

While only Google can tell us if that is true, I think we can explore one other alternative that fits better into Google's long-term search goals.

[blogs.forrester.com...]

(video files often hide where spiders can't find them, such as behind Flash animations, in frames, or behind JavaScript)

q: So how can Google and others find quality "non standard" content?
a: By using browser code itself to bust that stuff hiding behind frames and javascript.

A spider today can't execute JavaScript, but there is no telling what it will do in a few months. If Google can mind-meld GoogleBot to FireFox (GoogleFox, or FireBot?) [1], they might be able to increase the relevance of searches and find new content to mine.
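
To make the point concrete, here is a minimal sketch (Python, standard library only) of what a spider that reads markup but does not run scripts would see. The names `StaticTextExtractor` and `spider_text` and the sample page are made up for illustration; this is an assumption about how a simple non-rendering crawler behaves, not a description of GoogleBot.

```python
from html.parser import HTMLParser


class StaticTextExtractor(HTMLParser):
    """Collects page text from markup alone, without executing any script."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Script and style bodies are source code, not page text, so skip them.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def spider_text(html):
    """Return the text a non-rendering spider would index for this page."""
    parser = StaticTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the main content only exists after the script runs.
PAGE = """
<html><body>
<h1>Widget reviews</h1>
<script>document.write("<p>Full review text, written by script</p>");</script>
<noscript>Please enable JavaScript.</noscript>
</body></html>
"""

print(spider_text(PAGE))
```

The paragraph written by `document.write` never reaches the extracted text, because nothing ever executes the script. A bot built on a real browser engine would run it and see the paragraph, which is the whole argument for FireBot.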

Right now, Google is data rich but still data hungry. They have mountains of data to sift through from:

Toolbar Data: We all know just how much data is shuffled back to Google.

  • Referral data.
  • Search Data (keywords...phrases, bookmarks).
  • Time on site (between clicks).
  • Surfing Path. Where to click, what to click, when to click, and even how to click.

Search Data: Not only simple search engine usage data, but referral data, keyword data, search path data, gfx data, browser data, usage data...etc.

  • Usenet Data. What people talk about in public.
  • Gmail - email data. What people talk about in private.
  • Orkut - social network data. How people act in personal web environments.
  • Directory Data. Who thinks who is hot.
  • News Data. What the talking heads think.
  • Graphics data, shopping data, consultant data...whew - there is no end.
  • A dozen other data sets I've not even thought of yet - they have on hand.

What this all points to is that Google is amassing information on the overall search experience of web users. Only the major ISPs (such as AOL or MSN), or those involved in the big router and proxy operations (MCI/ATT/Cisco), can even begin to compare with the depth of data sets that Google has available. Google also has the huge computer farms to crunch that data in spectacularly speedy fashion. What sets Google apart from the likes of AOL is that Google knows about the web as a whole. The crawling and indexing of the web presents a composite picture of the web that those big ISPs can't see.

We can only try to imagine the decisions and analysis that could come out of that data. This fact is the single most underrated and underappreciated aspect of Google's business operations today. This fact alone justifies the $200+ valuations we are seeing.

What we also know about Google is that they have a ravenous appetite for data. Most of that data, from the user-experience side of the equation, comes via the venerable browser.

Ultimately, what Google is building is a leading-edge insight into the human experience of Cyberspace. No one else can compare to the mass of data Google has available for synthesis. Thus, by combining their current indexer/bot with a browser HTML rendering engine, they can get at some of that content they are currently barred from getting.

Q: 1 share of Google stock?
A: $204 today.

Q: Knowing how, when, where, and why people surf?
A: Priceless.

[1] You heard it here first, folks ;)

 

pmkpmk




msg:738745
 4:16 pm on Jan 25, 2005 (gmt 0)

I always wondered why Google doesn't implement a partner program with webmasters for the search results.

What I mean is that the webmasters sign up, get screened, get their honesty tested, sign a contract, and THEN these webmasters can SEND data regarding their sites to Google.

So if I subscribe to this "Google trusted webmaster" program, I can tell THEM when something on my site changes.

This would save them a lot of bandwidth and resources (which they can use to discover the "hidden" net), and at the same time I have the benefit of knowing exactly what data they have about my sites.
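
A rough sketch of what "telling them when something changes" could look like on the webmaster's side, in Python. Everything here is hypothetical: the `fingerprint` and `change_report` helpers and the report layout are my own illustration, and no such Google program or format exists as far as this thread knows.

```python
import hashlib


def fingerprint(content):
    """Stable digest of a page body, used only to detect that it changed."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def change_report(previous, current_pages):
    """Compare the site as it is now with what was reported last time.

    previous:      dict of url -> digest sent in the last report
    current_pages: dict of url -> page body as it is now
    Returns sorted URL lists under 'added', 'changed' and 'removed'.
    """
    current = {url: fingerprint(body) for url, body in current_pages.items()}
    return {
        "added": sorted(u for u in current if u not in previous),
        "changed": sorted(
            u for u in current if u in previous and current[u] != previous[u]
        ),
        "removed": sorted(u for u in previous if u not in current),
    }


# Hypothetical site state: one page edited, one new, one deleted.
last_report = {
    "/index.html": fingerprint("old home page"),
    "/about.html": fingerprint("about us"),
    "/old-promo.html": fingerprint("expired offer"),
}
site_now = {
    "/index.html": "new home page",
    "/about.html": "about us",
    "/news.html": "fresh content",
}

print(change_report(last_report, site_now))
```

The spider would then only need to fetch the URLs listed as added or changed, which is where the bandwidth saving would come from. The audits mentioned in the post would amount to spot-checking that the reported digests match what is really on the site.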

Of course the possibility of abuse is high, but this contract should have clauses on (quarterly?) audits as well as penalties for abuse.

But in the end I guess both sides would benefit.

lammert




msg:738746
 11:10 pm on Jan 25, 2005 (gmt 0)

Pmkpmk: a partner program for Google is not in their interest. They are there primarily for the searching crowd, not for the webmasters. If they had partners I would be very suspicious of the SERPS, never knowing whether the first entries were really the best ones or just partner listings.

They have enough information, as Brett already explained, to decide what is best for the user. Having a few "reliable" partners only creates problems when it comes to contract administration, screening etc. They might save a few % bandwidth, but I don't think they are interested in that.

Hiring a FireFox developer on the other hand gives them exactly what they need: knowledge of how flat data in HTML, PHP, SWF, MPG or whatever format is translated into the user experience. And by making that translation in their bot they will be miles in front of Yahoo and MSN (again).

They need such a major improvement. Just a few years ago almost all search engines used results from the Google technology. Yahoo now uses Inktomi and the beta version of MSN is live. Google knows what happened to Altavista in just a short period, and I am sure they don't want to go down the same road.

lammert




msg:738747
 11:26 pm on Jan 25, 2005 (gmt 0)

Those garbage sites and "meat search engines" as you call them are just what I am afraid of. These are run by people who (think they) know the secrets behind the Google system. And they will also be the first to sign up for a partner program if one existed. The average webmaster with his non-SE-optimized site and a lot of interesting content will never think about such a thing as a Google partnership.

On the other hand I agree with you that Google has to find a solution for the current garbage in the SERPs and I hope they find a way to do that. But for every site they remove from the SE, ten new ones are created. It seems that these garbage constructors have specialized software that can create these fake sites faster than Google can detect and delete them at the moment.

pmkpmk




msg:738748
 11:46 pm on Jan 25, 2005 (gmt 0)

Maybe it's like IE vs. Firefox. Because of the big marketshare, IE draws all the spyware-programmers to it.

Because of Google's marketshare, all the black-hat-"optimizers" are drawn to it. THIS could actually play out pretty well for MSN if they manage to keep the garbage out.

I have to agree that the "normal" webmaster would probably not sign up for a "Google trusted partner" program, especially when it comes with a fee. Even I would have a hard time explaining the benefits to my boss if the fee were substantial.

And I also have to agree that the SERP-spammers always seem to be one bit ahead with finding new techniques.

*sigh*

I'm not quite sure about my own loyalty here. I held up the Yahoo flag when everybody switched to Altavista. I held up the Altavista flag when everybody else was already jumping ship in favour of Google. I still believe in Google, but the first colleagues are moving to MSN. But I guess the SE which would provide good results AND keep eBay & Co out can very easily win my affection.

2by4




msg:738749
 12:20 am on Jan 26, 2005 (gmt 0)

Brett_Tabke, even though I might not always agree with what you say, in this case I think you have it exactly 100% right. I've seen this rumour mill on a Google browser for a while, but your explanation makes much more sense at every level, especially the need to be able to render CSS/JavaScript with no errors. It's been extremely easy to use CSS tricks to do virtual cloaking for a while now; so easy that I decided against doing it, since it's only a matter of time before Google does what you are saying they will do: integrate a very high-end rendering engine like Gecko into their system. Makes total sense. Now if they can just get back to actually indexing the web like they used to, giving up-to-date results, and not blocking 1 year of the web, we'd all be happy campers.

Clark




msg:738750
 4:19 am on Jan 26, 2005 (gmt 0)

Don't get me wrong, I agree w/ Brett that having a browser dev on staff is very good for the reasons he mentioned. I was always amazed at how long it took the browsers to catch onto the basic stuff like frames and javascript. I mean if the browser knows how to render it, the SE should know too.

decaff




msg:738751
 5:42 am on Jan 26, 2005 (gmt 0)

I think it goes way beyond "just the browser"...the Mozilla Framework is capable of "Rapid Application Development" and you can expect Google to look for new ways to use it for new market initiatives...

Chndru




msg:738752
 7:25 pm on Jan 26, 2005 (gmt 0)

[weblogs.mozillazine.org...]

Darin Fisher too.

zeus




msg:738753
 7:52 pm on Jan 26, 2005 (gmt 0)

Here's a little article [marketwatch.com] for your speculations on a browser and other things for Google.

Clark




msg:738754
 11:24 pm on Jan 26, 2005 (gmt 0)

Nice Dvorak article. Yup, totally agree. They are doing "Netscape" but the right way. How appropriate to hire Firefox developers to help them along the way. Integrating the browser with the myriad new stuff Google is doing will show what Internet Explorer has not done. This is such a no-brainer.

amznVibe




msg:738755
 5:59 am on Jan 27, 2005 (gmt 0)

Things you could do:
- The complete elimination of hidden text from *any* code based source.
- The interpretation of divs and css with 100% accuracy.
- Key word 'spamming' could all but be eliminated in all its forms.

That type of work would be a complete waste of time for a programmer of his talent.
They don't need to hire someone like him to do the above.

Google could have accomplished all three points in a heartbeat if they *really* wanted to, years ago. They pretty much ignore CSS, especially external stylesheets, and any SEO knows this. Might as well add javascript interpretation to such "improved spidering". I'm still surprised they don't index javascript text after all these years - cpu power has never been cheaper and it should be fairly easy to buy a javascript interpreter and hack it to their needs (I hope spammers never figure that out).

Their plans become even more obvious with the hiring of Darin Fisher.
Darin, a former IBM and Netscape employee, is a "module owner" for the Mozilla project
and is in charge of cookies and permissions, as well as Mozilla's networking library.

Clark




msg:738756
 7:02 pm on Jan 27, 2005 (gmt 0)

Maybe Google hired the FF developers for a different reason. Adsense is the cash cow for Google. If by some wild stretch Firefox takes 90% of the browser market within a year or two or five (doubtful, which I'll explain below), and the adblock extension becomes too popular, the Adsense cow will stop producing milk. So hire the developers to come up with a system that will get around adblock? I doubt it. More likely, they are building a browser and will compete head to head with MS on the desktop. They want to be masters of their own future. I bet they will even partner with hardware manufacturers and come up with PCs that surf the net, w/ their browser preinstalled, and no need to go near MS. By having their own browser, they can make sure no adblock will turn off their Adsense ads. For them, a browser is imperative.

Coming back to the FF 90% browser share, people are missing a lot of points in this war. FF is an admirable effort. But it can't touch what MS can do if it wants to. IE has stopped being developed because a rich Internet threatens MS's cash cow: Windows & Office. But the day that Firefox does for the Internet what MS killed lo many years ago, you can bet that MS's team will come up with top features to beat FF. True, FF won't die since it is open source, but it will have a rough time keeping up.

MS will attempt to do what I'm suggesting that Google is working on.

You can bet that a lot of changes are going to happen on the internet in the coming few years, with a big Google vs. MS war.

The fact that MS needs to kill innovation to keep the cash cow going, makes me think Google will "IBM" Microsoft. And the world turns around one more time.

(Is anyone still following?)

pmkpmk




msg:738757
 7:49 pm on Jan 27, 2005 (gmt 0)

Yes.

It will be interesting to come back to this thread 12 months down the road. And I love conspiracy theories...

In my country, there is a strong anti-Microsoft movement. Not only among computer-skilled professionals or prosumers, but especially in the public sector: city, county and even state authorities. Reasons: money & distrust. The public sector gets advised to switch to Firefox for security, and to Open Office for cost reduction. They even intend to bring Linux to the desktop, which I personally, with many years of Linux experience, very much doubt will work out.

We just had a vendor in for a training last week whose web interface for his application only works with IE. Since we sell a lot into the public sector, we told him he'd better have Firefox support by summer.

Google - in my country - still has an even better name than Mozilla (Godzilla? Noooo, MOzilla!). Even though I doubt that a Google-OS stands a chance to become popular, a Google-browser would have a huge, huge credibility even before it would be launched. Hey, whom do you like better? Bill Gates or Larry Page? See! And if Google would venture into text processing, databases (that's what they do on the backend anyways) and spreadsheets, their credibility would give them a huge chunk of the market willing at least to early adopt. Let's see when they hire the first core members from Open Office...

Clark




msg:738758
 7:56 pm on Jan 27, 2005 (gmt 0)

I didn't even think of that. What's a spreadsheet to Google? A word processor? No big deal. They integrate office features into a browser and BOOM. Haha, nice post.

pmkpmk




msg:738759
 7:58 pm on Jan 27, 2005 (gmt 0)

And again XUL comes into mind. Yes, you are absolutely right. This sounds even more probable than Open Office.

Clark




msg:738760
 8:01 pm on Jan 27, 2005 (gmt 0)

I still remember when WW was arguing about Google going into email. People were saying no waay. Google has a laser focus on search. And several of us, including Brett that time, were saying email was a no brainer once the IPO started happening. And just as that came to pass, so shall this. I bet money (in GOOG) on this.

pmkpmk




msg:738761
 8:10 pm on Jan 27, 2005 (gmt 0)

"You heard it first on WebmasterWorld"TM

Brett, you should probably get a trademark on that sentence...

gmiller




msg:738762
 10:23 pm on Jan 27, 2005 (gmt 0)

I wouldn't be so confident that Microsoft can hold on against Firefox. If they decided to restart development on a standalone browser today, how long would it take to get something designed, implemented, tested, and released? They may be able to prevent Firefox from reaching 90% marketshare, but by the time they started to seriously compete again, there'd be no chance of IE ever getting back to the 90% mark.

And keep in mind that, over time, they're going to have much more important battles to fight than whose free browser gets used the most. Can they crack the server OS market? They recently dumped their Itanium version of Windows because it's obvious they won't be able to crack the segments that Itanium is being marketed to, and now Solaris is being released as open source. Linux continues to grow. OpenOffice.org is growing. Why divert your efforts to giving things away when your core cash cows are under attack? Why waste time and money trying to regain a virtual monopoly that's gone forever, now?

And, ultimately, how much does browser quality matter, anyway? The Mozilla Application Suite never really took off because the tech media jumped on the Mozilla-bashing bandwagon and now Firefox is succeeding because the tech press is on the Firefox bandwagon. That's what it seems to come down to.

Clark




msg:738763
 6:53 am on Jan 28, 2005 (gmt 0)

Update from Blake Ross [blakeross.com...]

David Bruning




msg:738764
 5:19 pm on Feb 2, 2005 (gmt 0)

The "not being listed for a year in Google" is in reference to the theory that Google freezes new websites from ranking well for a significant period of time after first being indexed - possibly to keep spam sites from ruling the search engine results pages (SERPS).

It is usually called the Google sandbox or deepfreeze on these forums.

There are people who say it exists and people who say it doesn't.

My personal opinion is new commercial sites without significant link resources may take an unspecified period in order to have the same ranking power a non-commercial site would have.

Hopefully helps a bit :)

instinct




msg:738765
 1:16 am on Feb 3, 2005 (gmt 0)

Flowers, start here:

[webmasterworld.com ]

WebWalla




msg:738766
 6:15 pm on Feb 4, 2005 (gmt 0)

(GoogleFox, or FireBot?)

I prefer FoxBot by far :)

amznVibe




msg:738767
 6:51 pm on Feb 4, 2005 (gmt 0)

Am I hallucinating or did this thread/topic get recycled?

I'm sorry but there is no way a bot, even based on advanced browser code, will ever be able to determine if certain situations are deceptive on purpose, or if they are fancy css/javascript design.

A perfect example would include simulated popup DIVs that appear or disappear when a mouse is hovered over words (doable in both css and javascript). This is a perfectly legitimate feature. But how is a search engine bot, no matter how advanced, going to know that is for real and isn't cloaked text? It's impossible and can only be determined with human intuition.
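
Here is a minimal sketch of that dilemma in Python (standard library only). It is a naive hidden-text check that only looks at inline `style` attributes; the class `HiddenTextFinder`, the helper `hidden_text`, and both sample snippets are invented for illustration, and a real rendering engine would also have to resolve external stylesheets and scripts.

```python
import re
from html.parser import HTMLParser

# Inline style declarations that make an element invisible.
HIDDEN = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)


class HiddenTextFinder(HTMLParser):
    """Collects text inside elements whose inline style hides them."""

    VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag

    def __init__(self):
        super().__init__()
        self.hidden_text = []
        self._stack = []  # one flag per open element: does it hide its content?

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        style = dict(attrs).get("style") or ""
        self._stack.append(bool(HIDDEN.search(style)))

    def handle_endtag(self, tag):
        # Assumes well-formed nesting, which is fine for a sketch.
        if tag not in self.VOID and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if any(self._stack) and data.strip():
            self.hidden_text.append(data.strip())


def hidden_text(html):
    """Return the list of text fragments that are invisible on page load."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text


# A legitimate hover tooltip: hidden until the mouse moves over the word.
TOOLTIP = """
<span onmouseover="show('tip')">widget</span>
<div id="tip" style="display:none">A widget is a small gadget.</div>
"""

# Keyword stuffing: hidden forever, meant only for the spider.
STUFFING = """
<p>Welcome</p>
<div style="display:none">cheap widgets best widgets buy widgets</div>
"""

print(hidden_text(TOOLTIP))
print(hidden_text(STUFFING))
```

Both snippets get flagged in exactly the same way. The markup alone says "this text is invisible on load" and nothing about why, so a rule like this either penalizes the honest tooltip or lets the stuffing through.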

Again, Google did not have to wait until now to parse css and javascript via buying browser code and developers. I seriously doubt that is what they are doing. Google simply realizes that an advanced browser is the very key to the web and having one of their own empowers them to produce greater products. After developing the advanced code for gmail and "suggest" it's easy to come to that conclusion. There is also a possibility they need more powerful browser tools for their Adsense program, for their agents to review websites, etc.

Kirby




msg:738768
 7:00 pm on Feb 4, 2005 (gmt 0)

Ultimately, what Google is building is a leading edge insight into the human experience of Cyberspace. No one else can compare to the mass of data Google has available for synthesis.

Dead on, Brett. It's the human experience data that is so valuable. The other under-the-media-radar player that gets this is Barry Diller and IAC.

Knowledge is power. MS knows this, but they are playing catch up just to get to the point where Google is now. Google has the ability to jump even further ahead. I like a company that is this proactive and aggressive.

Google has a laser focus on search

Google has a laser focus on information.

Brett_Tabke




msg:738769
 8:40 pm on Feb 4, 2005 (gmt 0)

I split this off and put it on the homepage. This issue has been gnawing at me for the last week.

I have not come to any new conclusions. What has happened is that the implications have started to set in.

We need a name to refer to the Google "data set". While we have a name for the search database (STHK: Sum Total of Human Knowledge) we need a name for the data set as a whole.

Clark




msg:738770
 9:08 pm on Feb 4, 2005 (gmt 0)

You seem to think the laser focus was a long-term reality rather than a short-term goal/marketing tool. The idea behind that slogan was to differentiate google from altavista and yahoo. THOSE guys were trying to do too much. Shopping engine, email, news. WE on the other hand will have a very simple look and feature. Enter a keyword and click on search. But NOW that they have the money to do what the others are doing, and don't need to differentiate anymore, that slogan refers to the NEXT search engine wannabe. Their focus is all over the place, very unlaserlike. Ask them about it and I'm sure they'll say who cares.

But I do agree that they love to collect datasets. Brett, while your point about firebot makes sense, don't you think they'd want even MORE data than a mere toolbar? Just IMAGINE the data they'd get with millions of users using their BROWSER. The toolbar was just a preview.

By the way, with all this talk of Sandbox, while typing I just got a theory. They had no way around those sites that scraped content and slapped on adsense (or other affiliate program). There was a specific date that the app to do it became widespread (an assumption). They are building their index to sites around before that date, slowly adding in new sites with new unduped content. And trying to solve the problem. I haven't seen fresh results on google. Haven't seen great results either. But I also haven't seen much of those scraped sites either. This goes along w/ the Google is broken theory.

If this was mentioned somewhere in the thousands of sandbox threads, sorry to repeat it.

Iguana




msg:738771
 11:01 pm on Feb 4, 2005 (gmt 0)

I did write a long posting but thought twice about it.

The thing I have to say is Google? Weren't they the search engine of the early 2000s? If they haven't used their mass of collected data to stop their search results deteriorating then why would they succeed in doing anything else with it?

All the toolbars or browsers in the world won't stop the fact that their search is just not what it used to be. What is the point of indexing the parts of the web they haven't reached so far when they have failed to order what they can see?

Sorry, I'll get back to an Update thread where my misery and despair will find company. More likely I'll be on Teoma or Yahoo actually looking for interesting sites rather than epinions, Amazon, cduniverse, rateyourmusic, kelkoo.

blaze




msg:738772
 12:42 pm on Feb 5, 2005 (gmt 0)

They've been doing this for a long time.

Though, I agree, the firefox guy could have been hired to help improve their browser simulation.

johnser




msg:738773
 1:17 pm on Feb 5, 2005 (gmt 0)

Why do people hate IE so much? MS have made software that most of the (Western) world can easily use to do what they need to do. It also works (most of the time).

Would you really prefer a software world dominated by the techies who invented the SQL language "so that 20 yr old office secretaries would find it easy to use computers" as I believe they said in the early 70's? No thanks.

True, MS bullied their way past Netscape into the position they're in and have done no innovation with the program for years. So what?

Do you genuinely think most users know or care about support for css, javascript, divs etc? Don't think so.

What made Google what it is now? Simplicity & quality.
This is EXACTLY what IE has right now. I can view every site I like with it. It works 99% of the time. There are no ads.

I've Opera, Netscape & Firefox also installed - Why do I use IE? Because I know what to expect. I don't have the time to "learn" about other software so that I can have 2% improved work productivity. Is that a good time investment? Don't think so.

A comment was made on some WW thread recently that the internet is (rapidly becoming) a direct marketing channel. I've been at this stuff for years yet I'm still reeling from the simplicity of that comment. (Am clearly not too bright)

So if indeed the Internet is a direct marketing channel, then just as with the pizza flier you get through your door, its simplicity is what makes it effective - which IMHO is why any new browser must stay as basic as IE is.

The only reason I can see that Google are doing a browser is to tie all their ideas (domains, email, search etc) together.

Are they doing this to "organize the world's information"? No, they're doing it because it's the smart thing for them to do product-wise. Will regular users (& not WW regulars) go for it? Let's wait and see. My money's on IE.

Why? The masses are used to it and it (usually) works every time.

We are not in the mid-90's where only nerds like the people writing & reading this thread (like myself) had a PC at home. Millions use their PC now BECAUSE of the simplicity.

You buy a PC from Dell and it's working within an hour of arriving. There are no printer drivers, font faces, hardware drivers for your 56K modem etc. to install. It used to take me anywhere from 1-2 days min to set up a PC 10 years ago. (Wasn't too bright then either!)

People don't buy functionality - They buy benefits.
Is there a bigger browser benefit than something that works every time and that you don't have to think to use a la IE?

Johnser - (A cave-dwelling webmaster)

=====

Brett: "name for the data set as a whole"
Suggestion: "Googus"

("Goog" + the Latin word "totus" meaning entire, whole, complete)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved