Forum Moderators: bakedjake

GigaBlast Hits 1 billion Pages Indexed

     
4:30 am on Jan 7, 2005 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38048
votes: 12


Everyone's dark horse favorite search engine, Gigablast, hits 1 billion pages indexed:

[prnewswire.com...]

announced today its recent database expansion from 650 million webpages to
over 1 billion. The new index also includes a comprehensive refresh of
webpages from the previous index.
4:32 am on Jan 7, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 13, 2004
posts:428
votes: 0


Alright Gigablast! :-D And well done to Matt Wells.
4:34 am on Jan 7, 2005 (gmt 0)

Moderator

WebmasterWorld Administrator skibum is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Sept 20, 2000
posts:4469
votes: 1


Yea Baby! Go Matt!
5:04 am on Jan 7, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 7, 2004
posts:285
votes: 0


1,014,363,952 pages indexed - to be exact ;)
5:58 am on Jan 7, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 25, 2003
posts:404
votes: 0


And they seem to be the right 1 billion pages. Kinda reminds me of Google 2002. I think the problem of the future is not to index everything under the sun, but to index the stuff that is worth indexing.
6:32 am on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


At last! Something to take the place of the old Altavista. How Wells does it with a handful of PCs is just beyond me. -Larry.
6:40 am on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 25, 2001
posts:660
votes: 0


Good on you Matt!

Just think - if Gigablast can index 1 billion pages with just Matt - imagine what Matt could do with a few staff and more resources.....

7:28 am on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2002
posts:776
votes: 0


Yes, very nice - certainly gives something for those folks at MSN Beta to aim for :)
8:20 am on Jan 7, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 13, 2004
posts:428
votes: 0


Hahaha :D
10:15 am on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 20, 2002
posts:889
votes: 0


It's not just the number of pages, decent quality results, and good speed. It also has the Site search for your own site, and the custom search for you to define your own subset of websites to search. Now you can receive all of this back in XML and serve the results to your users (limited to 1000 searches per day). All of this seems better and easier to implement than partial equivalents from Google or Amazon.

If they can grow without losing their focus then they will be serious contenders in the search engine market soon.

11:18 am on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 11, 2002
posts:2024
votes: 0


Gigablast seems to suffer from index spamming too. Look at this:

Query for one of my important keywords and Gigablast's results:

#1 eBay Portal
#2 Meta-Search-Engine, showing eBay Portals as top results
#3 Meta-Search-Engine, showing eBay Portals as top results
#4 Meta-Search-Engine, showing eBay Portals as top results
#5 AdWords, Amazon and eBay offers
#6 75% on-topic site
#7 Auction site (non-eBay)
#8 my site

Same query on Google:

#1 On-topic end-user site
#2 my site
#3 On-topic end-user site
#4 On-topic competitor site
#5 On-topic competitor site
#6 slightly off-topic end-user site
#7 On-topic commercial services site
#8 On-topic competitor site

Ove

11:51 am on Jan 7, 2005 (gmt 0)

Senior Member from SE 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 24, 2001
posts:786
votes: 0


Well Done Matt

/Ove

11:53 am on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 16, 2002
posts:2010
votes: 0


All that even with the addurl page disabled for months ;)

Nice job though - still one of the cleaner, better engines and we all love to root for the underdog!

One of my year-old sites is still not indexed though...

12:05 pm on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 2, 2002
posts:1167
votes: 0


Kudos to Matt, not only for the milestone but for a system which understands that a 301 code means abc.htm will hereafter be found at folder/abc.htm, so it's time to index the latter and drop the former.

The others seem to continue to interpret 301 as "oh yeah, this place moved, didn't it, let's get out the map, ah, here we go, we shoulda turned left a block back. no worries, let's go there now. Nah, nuthin to change, nuthin' to write down. I'll just remember it in my head and be ready the next time (yeah, right!)."
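
To make the 301 point concrete, here is a minimal sketch of the behaviour described above: on a permanent redirect the index drops the old URL and queues the new one for indexing. This is only an illustration under stated assumptions, not Gigablast's actual code; the `Index` class, the `apply_fetch` method and the example.com URLs are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Index:
    """Toy URL index (hypothetical): maps each URL to its stored page text."""
    pages: Dict[str, str] = field(default_factory=dict)

    def apply_fetch(self, url: str, status: int,
                    location: Optional[str] = None,
                    body: str = "") -> List[str]:
        """Update the index from one fetch result; return URLs to crawl next."""
        if status == 301 and location:
            # Permanent move: forget the old URL and schedule the new one.
            self.pages.pop(url, None)
            return [location]
        if status == 200:
            # Normal page: store (or refresh) its content.
            self.pages[url] = body
        return []


# The old URL is already indexed, then answers with a 301.
index = Index({"http://example.com/abc.htm": "old copy"})
todo = index.apply_fetch("http://example.com/abc.htm", 301,
                         location="http://example.com/folder/abc.htm")

# Crawl whatever the redirect told us to fetch next.
for url in todo:
    index.apply_fetch(url, 200, body="new copy")
```

After this runs, only folder/abc.htm remains in the toy index, which is the "index the latter and drop the former" behaviour. An engine that merely follows the redirect on each visit would keep the old URL listed.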

12:52 pm on Jan 7, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:July 17, 2003
posts:560
votes: 0


So how does Gigablast work?
The SERPs look like a hybrid of Google and Yahoo with a few surprise results thrown in.
Seems there is a little more of a bias towards url text but not so much for title.
Some interesting, fairly good results.
If I was a standard web user I'd be fairly happy with the results bar the odd quirky SERPs.
1:21 pm on Jan 7, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 30, 2003
posts:143
votes: 0


Looks like a great new index - very fast with a refreshingly clean feel to it.

I'll definitely use this in the future.

2:41 pm on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 1, 2004
posts:3181
votes: 0


This is truly a testament to what can be done when someone puts their mind to it. Good Job!

By the way, only yesterday I was using GigaBlast after failing to find what I was looking for in Google. I'm not taking a shot at Google here, but at times the "drill down" layout of GigaBlast is very helpful. (And I did find what I was looking for...)

3:15 pm on Jan 7, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Oct 11, 2003
posts:255
votes: 0


Pretty good for a small search engine. Results appear to approach those of the major search engines.

Spent about five minutes looking around:

"widget tool box"

1. Spammy site
2. Spammy site same as #1 but different landing page.
3. #1 selling "widget tool box" site (ranked #1 by Google and Yahoo).
4. Spammy site
5. Spammy site

"country name"

1. Government Tourist Board (mainly tourism fluff and brochures)
2. Independent Country Portal (ranked #1 by Google and Yahoo).
3. Spammy site
4. Weird Wiki-type guerilla Indymedia site.
5. Spammy site.

The spammy sites mostly have inside pages named widget-tool-box.html

5:49 pm on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member rocknbil is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 28, 2004
posts:7999
votes: 0


Yeah I gave it a ten-minute run and all I got for results were keyword spam pages without real content, but it was a topic of vague focus.
6:46 pm on Jan 7, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Aug 11, 2004
posts:582
votes: 0


It's amazing to see this. The little guys still have room in the big game. I like the stripped down option, you can quickly visualize how the SE sees the site, with the keywords highlighted. Also, the keyword suggestor is a good idea, even though I don't get what the percentages represent yet. Could give keyword suggestions.

Finally, I hope this Matt dude leaves some room for exploration and testing from the webmasters' side. Basically, hopefully he doesn't drop you at the first sign of irregularity, but gives you the chance to re-try.

10:01 pm on Jan 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member whoisgregg is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 9, 2003
posts:3416
votes: 0


How Wells does it with a handful of PCs is just beyond me. -Larry

From the press release, it looks like Matt has outgrown his backroom cluster:

Gigablast will spider their websites in real-time at the rate of one page every five seconds. With multiple dedicated clusters, Gigablast can handle large amounts of DSS queries and webpages.

I bet the day after the G IPO, Matt had a dozen venture capitalists knocking on his door... ;)

<added>
Which is great for Matt, I've always enjoyed seeing how Gigablast progressed through the years. Glad to see his success. :D
</added>
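
For a sense of scale, the quoted rate of one page every five seconds can be turned into a daily figure with simple arithmetic. The sketch below assumes the rate applies per website and runs around the clock, which the press release excerpt does not spell out; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_day(seconds_per_page: float) -> int:
    """Whole pages fetched in 24 hours at a fixed delay between fetches."""
    return int(SECONDS_PER_DAY // seconds_per_page)


def days_to_crawl(page_count: int, seconds_per_page: float) -> float:
    """Days needed to fetch page_count pages at that fixed delay."""
    return page_count * seconds_per_page / SECONDS_PER_DAY


print(pages_per_day(5))  # 17280
```

So under those assumptions a single site would be spidered at about 17,000 pages per day.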

12:19 am on Jan 8, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 22, 2002
posts:902
votes: 0


Kinda reminds me of Google 2002.

Or 2003 and 2004 judging from that old standby: "miserable failure". :)

But yes, hats off to Matt and Gigablast.

11:52 am on Jan 8, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 11, 2002
posts:2024
votes: 0


@Hugene

I like the stripped down option, you can quickly visualize how the SE sees the site, with the keywords highlighted.

Few seem to know that this is possible in Google too, if Google has a cached snapshot of the page. In the heading frame it says:

This cached page may reference images which are no longer available. Click here for the cached text only.

And if you click on the link, it gives you exactly the representation of how Google sees it. Try it with a page which has a form on it. Very interesting...

2:31 pm on Jan 8, 2005 (gmt 0)

Senior Member from DE 

WebmasterWorld Senior Member 10+ Year Member

joined:May 25, 2002
posts:926
votes: 0


I personally like the results, the speed, the layout and the feeling. But some here saw it right already: the results are mixed with spam and cloaking, which is not sooo bad in English, but for other Roman languages, like German or Dutch, it looks spammed from top to bottom.

Nevertheless, MSN is suffering the same problem with other languages, and a bit of "snobbism" for the American development team can't be so wrong: the search engine war will first be decided in English.

(*fingers crossed*)

my 2 pennies,
P!

4:30 pm on Jan 8, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 5, 2002
posts:1713
votes: 0


And I am old enough to remember when he started out - the spidering used to get to 1000 or even 10,000 sites, then a glitch would kick in and next day he would be showing zero sites spidered.

Eventually he sorted the instability problems, and before you know it (well, a few years later) he has a billion spidered.

8:38 pm on Jan 8, 2005 (gmt 0)

Moderator

WebmasterWorld Administrator skibum is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Sept 20, 2000
posts:4469
votes: 1


Time to spit out a toolbar.
10:15 pm on Jan 16, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 11, 2002
posts:2024
votes: 0


Gigablast seems offline. Anybody else notice it?

Argglll. Now it works again. Probably maintenance of some sort.

6:31 am on Jan 17, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Maybe a circuit breaker tripped in the garage. That's easily fixed.
12:47 pm on Jan 19, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 16, 2002
posts:2010
votes: 0


Has anyone noticed their "super recall" feature? I kinda like it:
[gigablast.com...]
9:49 pm on Jan 24, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 11, 2002
posts:2024
votes: 0


On January 7th I wrote:
Query for one of my important keywords and Gigablast's results:

#1 eBay Portal
#2 Meta-Search-Engine, showing eBay Portals as top results
#3 Meta-Search-Engine, showing eBay Portals as top results
#4 Meta-Search-Engine, showing eBay Portals as top results
#5 AdWords, Amazon and eBay offers
#6 75% on-topic site
#7 Auction site (non-eBay)
#8 my site

Good news! Results just got better:

#1 eBay/AdSense "portal" (probably the one from Jan07 - I'm not sure)
#2 Slightly related news site
#3 Related site, offering a service related to the search term
#4 On-topic enduser site
#5 my site
#6 my site
#7 spammy SEO site
#8 On-topic enduser PDF file

Seems Gigablast got rid of all the so-called "Meta Search Engines". The right way to go (imho). I'd LOVE to see SERPS without ANY eBay and without ANY "metasearch" results!

This 34 message thread spans 2 pages: 34
 
