Forum Moderators: bakedjake

GigaBlast Hits 1 billion Pages Indexed

     

Brett_Tabke

4:30 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Everyone's dark-horse favorite search engine, Gigablast, hits 1 billion pages indexed:

[prnewswire.com...]

announced today its recent database expansion from 650 million webpages to
over 1 billion. The new index also includes a comprehensive refresh of
webpages from the previous index.

Teknorat

4:32 am on Jan 7, 2005 (gmt 0)

10+ Year Member



Alright Gigablast! :-D And well done to Matt Wells.

skibum

4:34 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Administrator skibum is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yea Baby! Go Matt!

christopher w

5:04 am on Jan 7, 2005 (gmt 0)

10+ Year Member



1,014,363,952 pages indexed - to be exact ;)

creepychris

5:58 am on Jan 7, 2005 (gmt 0)

10+ Year Member



And they seem to be the right 1 billion pages. Kinda reminds me of Google 2002. I think the problem of the future is not to index everything under the sun, but to index the stuff that is worth indexing.

larryhatch

6:32 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



At last! Something to take the place of the old Altavista. How Wells does it with a handful of PCs is just beyond me. -Larry.

Chris_D

6:40 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Good on you Matt!

Just think - if Gigablast can index 1 billion pages with just Matt - imagine what Matt could do with a few staff and more resources.....

figment88

7:28 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, very nice - certainly gives something for those folks at MSN Beta to aim for :)

Teknorat

8:20 am on Jan 7, 2005 (gmt 0)

10+ Year Member



Hahaha :D

Iguana

10:15 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not just the number of pages, decent quality results, and good speed. It also has the Site search for your own site, and the custom search for you to define your own subset of websites to search. Now you can receive all of this back in XML and serve the results to your users (limited to 1000 searches per day). All of this seems better and easier to implement than partial equivalents from Google or Amazon.

If they can grow without losing their focus then they will be serious contenders in the search engine market soon.
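For anyone wiring up the XML feed, here is a minimal sketch of the client-side bookkeeping, in Python. The only detail taken from the post is the cap of 1000 searches per day. The element names (`response`, `result`, `title`, `url`) and the `DailyQuota` and `parse_results` helpers are hypothetical, invented for illustration; check the real feed before relying on any of them.

```python
import xml.etree.ElementTree as ET
from datetime import date

DAILY_LIMIT = 1000  # the per-day cap mentioned in the post


class DailyQuota:
    """Counts searches per calendar day and refuses once the cap is hit."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_consume(self, today=None):
        today = today or date.today()
        if today != self.day:
            # New day: reset the counter.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def parse_results(xml_text):
    """Turn an XML response into a list of {title, url} dicts.

    The schema here is an assumption, not the documented feed format.
    """
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "url": item.findtext("url", default=""),
        }
        for item in root.iter("result")
    ]


# Hypothetical sample response, used only to exercise the parser.
SAMPLE = """<response>
  <result><title>Example page</title><url>http://example.com/</url></result>
  <result><title>Another page</title><url>http://example.org/a.html</url></result>
</response>"""
```

Calling `try_consume()` before each outgoing search keeps a site under the cap; when it returns `False` you could fall back to a cached result page or a plain link to the engine.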

pmkpmk

11:18 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Gigablast seems to suffer from index spamming too. Look at this:

Query for one of my important keywords and Gigablast's results:

#1 eBay Portal
#2 Meta-Search-Engine, showing eBay Portals as top results
#3 Meta-Search-Engine, showing eBay Portals as top results
#4 Meta-Search-Engine, showing eBay Portals as top results
#5 AdWords, Amazon and eBay offers
#6 75% on-topic site
#7 Auction site (non-eBay)
#8 my site

Same query on Google:

#1 On-topic end-user site
#2 my site
#3 On-topic end-user site
#4 On-topic competitor site
#5 On-topic competitor site
#6 slightly off-topic end-user site
#7 On-topic commercial services site
#8 On-topic competitor site

Ove

11:51 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well Done Matt

/Ove

amznVibe

11:53 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



All that even with the addurl page disabled for months ;)

Nice job though - still one of the cleaner, better engines and we all love to root for the underdog!

One of my year-old sites is still not indexed though...

kevinpate

12:05 pm on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Kudos to Matt, not only for the milestone but for a system which understands that a 301 code means abc.htm will hereafter be found at folder/abc.htm, so it's time to index the latter and drop the former.

The others seem to continue to interpret 301 as "oh yeah, this place moved, didn't it, let's get out the map, ah, here we go, we shoulda turned left a block back. no worries, let's go there now. Nah, nuthin to change, nuthin' to write down. I'll just remember it in my head and be ready the next time (yeah, right!)."
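The behaviour praised above can be sketched in a few lines of Python. This is a toy model and not Gigablast's code: the `index` dict, the `queue` list and the `apply_crawl_result` function are all hypothetical names made up for the example. The point is only that a 301 should change what is stored, not just where the crawler goes next.

```python
from urllib.parse import urljoin


def apply_crawl_result(index, queue, url, status, location=None):
    """Update a toy index (dict: URL -> content) after fetching `url`.

    On a 301 the old URL is dropped and the new one is scheduled,
    so the move is written down instead of rediscovered every crawl.
    """
    if status == 301 and location:
        # Location may be relative, so resolve it against the old URL.
        new_url = urljoin(url, location)
        index.pop(url, None)  # drop the former
        if new_url not in index and new_url not in queue:
            queue.append(new_url)  # index the latter on the next pass
        return "moved"
    return "unchanged"
```

For example, with `http://example.com/abc.htm` in the index, a 301 pointing at `folder/abc.htm` removes the old entry and queues `http://example.com/folder/abc.htm`.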

surfgatinho

12:52 pm on Jan 7, 2005 (gmt 0)

10+ Year Member



So how does Gigablast work?
The SERPs look like a hybrid of Google and Yahoo with a few surprise results thrown in.
Seems there is a little more of a bias towards URL text but not so much for title.
Some interesting, fairly good results.
If I were a standard web user I'd be fairly happy with the results, bar the odd quirky SERP.

chrisk999

1:21 pm on Jan 7, 2005 (gmt 0)

10+ Year Member



Looks like a great new index - very fast with a refreshingly clean feel to it.

I'll definitely use this in the future.

BillyS

2:41 pm on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member



This is truly a testament to what can be done when someone puts their mind to it. Good Job!

By the way, only yesterday I was using GigaBlast after failing to find what I was looking for in Google. I'm not taking a shot at Google here, but at times the "drill down" layout of GigaBlast is very helpful. (And I did find what I was looking for...)

mromero

3:15 pm on Jan 7, 2005 (gmt 0)

10+ Year Member



Pretty good for a small search engine. Results appear to approach those of the major search engines.

Spent about five minutes looking around:

"widget tool box"

1. Spammy site
2. Spammy site same as #1 but different landing page.
3. #1 selling "widget tool box" site (ranked #1 by Google and Yahoo).
4. Spammy site
5. Spammy site

"country name"

1. Government Tourist Board (mainly tourism fluff and brochures)
2. Independent Country Portal (ranked #1 by Google and Yahoo).
3. Spammy site
4. Weird Wiki-type guerilla Indymedia site.
5. Spammy site.

The spammy sites mostly have inside pages named widget-tool-box.html
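To make the file-name observation concrete, here is a rough Python sketch of a check that measures how much of a query appears in a URL's path. It is only an illustration of why a page named widget-tool-box.html would benefit from any bias towards URL text; it says nothing about how Gigablast actually ranks, and `url_tokens` and `url_match_fraction` are hypothetical helper names.

```python
import re
from urllib.parse import urlparse


def url_tokens(url):
    """Split the path of a URL into lowercase alphanumeric tokens."""
    path = urlparse(url).path.lower()
    return {t for t in re.split(r"[^a-z0-9]+", path) if t}


def url_match_fraction(query, url):
    """Fraction of query terms that appear as tokens in the URL path."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    tokens = url_tokens(url)
    return sum(term in tokens for term in terms) / len(terms)
```

A page at `/widget-tool-box.html` matches every term of the query "widget tool box", while a genuine shop page at `/products/42` matches none, which is the kind of gap a keyword-stuffed file name exploits.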

rocknbil

5:49 pm on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member rocknbil is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yeah I gave it a ten-minute run and all I got for results were keyword spam pages without real content, but it was a topic of vague focus.

Hugene

6:46 pm on Jan 7, 2005 (gmt 0)

10+ Year Member



It's amazing to see this. The little guys still have room in the big game. I like the stripped-down option: you can quickly visualize how the SE sees the site, with the keywords highlighted. Also, the keyword suggestor is a good idea, even though I don't get what the percentages represent yet. Could give keyword suggestions.

Finally, I hope this Matt dude leaves some room for exploration and testing from the webmasters' side. Basically, hopefully he doesn't drop you at the first sign of irregularity, but gives you the chance to re-try.

whoisgregg

10:01 pm on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member whoisgregg is a WebmasterWorld Top Contributor of All Time 10+ Year Member



How Wells does it with a handful of PCs is just beyond me. -Larry

From the press release, it looks like Matt has outgrown his backroom cluster:

Gigablast will spider their websites in real-time at the rate of one page every five seconds. With multiple dedicated clusters, Gigablast can handle large amounts of DSS queries and webpages.

I bet the day after the G IPO, Matt had a dozen venture capitalists knocking on his door... ;)

<added>
Which is great for Matt, I've always enjoyed seeing how Gigablast progressed through the years. Glad to see his success. :D
</added>

pleeker

12:19 am on Jan 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Kinda reminds me of Google 2002.

Or 2003 and 2004 judging from that old standby: "miserable failure". :)

But yes, hats off to Matt and Gigablast.

pmkpmk

11:52 am on Jan 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



@Hugene

I like the stripped down option, you can quickly visualize how the SE sees the site, with the keywords highlighted.

Few seem to know that this is possible in Google too, if Google has a cached snapshot of the page. In the heading frame it says:

This cached page may reference images which are no longer available. Click here for the cached text only.

And if you click on the link, it shows you exactly how Google sees the page. Try it with a page which has a form on it. Very interesting...

pontifex

2:31 pm on Jan 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I personally like the results, the speed, the layout and the feeling. But some here already saw it right: the results are mixed with spam and cloaking, which is not so bad in English, but for other Latin-alphabet languages, like German or Dutch, it looks spammed from top to bottom.

Nevertheless, MSN suffers from the same problem in other languages, and a bit of "snobbism" from the American development team can't be so wrong: the search engine war will be decided in English first.

(*fingers crossed*)

my 2 pennies,
P!

cornwall

4:30 pm on Jan 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



And I am old enough to remember when he started out - the spidering used to get to 1000 or even 10,000 sites, then a glitch would kick in and next day he would be showing zero sites spidered.

Eventually he sorted out the instability problems, and before you know it (well, a few years later) he has a billion spidered.

skibum

8:38 pm on Jan 8, 2005 (gmt 0)

WebmasterWorld Administrator skibum is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Time to spit out a toolbar.

pmkpmk

10:15 pm on Jan 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Gigablast seems offline. Anybody else notice it?

Argglll. Now it works again. Probably maintenance of some sort.

larryhatch

6:31 am on Jan 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe a circuit breaker tripped in the garage. That's easily fixed.

amznVibe

12:47 pm on Jan 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Has anyone noticed their "super recall" feature? I kinda like it:
[gigablast.com...]

pmkpmk

9:49 pm on Jan 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On January 7th I wrote:
Query for one of my important keywords and Gigablast's results:

#1 eBay Portal
#2 Meta-Search-Engine, showing eBay Portals as top results
#3 Meta-Search-Engine, showing eBay Portals as top results
#4 Meta-Search-Engine, showing eBay Portals as top results
#5 AdWords, Amazon and eBay offers
#6 75% on-topic site
#7 Auction site (non-eBay)
#8 my site

Good news! Results just got better:

#1 eBay/AdSense "portal" (probably the one from Jan07 - I'm not sure)
#2 Slightly related news site
#3 Related site, offering a service related to the search term
#4 On-topic enduser site
#5 my site
#6 my site
#7 spammy SEO site
#8 On-topic enduser PDF file

Seems Gigablast got rid of all so-called "Meta Search Engines". The right way to go (imho). I'd LOVE to see SERPs without ANY eBay and without ANY "metasearch" results!

