GigaBlast Part 3 - Alternative Search Engines forum at WebmasterWorld - WebmasterWorld

Forum Moderators: bakedjake

GigaBlast Part 3

bird

11:17 am on Mar 18, 2002 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Continued from: [webmasterworld.com...]

Looks impressive so far indeed. I'm really curious about any increase/decrease in relevance, once there's a significant number of sites indexed.

A few things to note, most of which you probably know already:

Always respect robots.txt for all pages.

The spider needs to do some load balancing, so that it doesn't fetch too many pages from the same site in a short time. The recommended rate is about one page per minute per site (http://www.robotstxt.org/wc/robots.html).

Make sure that the images on your site are served with headers for creation date, size, and expiry date, so that the client can cache them. This will noticeably reduce the bandwidth requirements on your own system.

Only list one of www.example.com/ and www.example.com/index.html (home|default.htm|asp|php, etc.), at least if they contain the same text.

Cluster the results, so that one site can't dominate the SERPs for any keyword combination.
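The per-site fetch rate mentioned above could be enforced with something like the following. This is only a minimal sketch, not how Gigablast does it: the class name `HostThrottle`, the 60 second default, and the idea of passing the current time in as `now` are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


class HostThrottle:
    """Tracks the earliest time each host may be fetched again (hypothetical helper)."""

    def __init__(self, min_delay_seconds=60.0):
        # Assumed default: about one page per minute per site.
        self.min_delay = min_delay_seconds
        self._next_allowed = {}  # host -> earliest allowed fetch time

    @staticmethod
    def _host(url):
        return urlsplit(url).netloc.lower()

    def wait_time(self, url, now):
        """Return how many seconds to wait before fetching url (0.0 if allowed now)."""
        earliest = self._next_allowed.get(self._host(url), 0.0)
        return max(0.0, earliest - now)

    def record_fetch(self, url, now):
        """Note that url was fetched at time now, blocking its host for min_delay."""
        self._next_allowed[self._host(url)] = now + self.min_delay
```

A crawler loop would call `wait_time(url, time.monotonic())` before each request and `record_fetch` after it, so that pages from other hosts can be fetched while one host is cooling down.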

I'm sure there's a lot more work waiting for you... ;)

mattdwells

6:17 pm on Mar 21, 2002 (gmt 0)

10+ Year Member

greetings,

If the spider wasn't obeying your site's robots.txt rules, that was a bug, but it should work now.
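For anyone who wants to check what a compliant spider should do with their rules, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not Gigablast's code; the sample rules, the `ExampleBot` user agent, and the helper name `make_checker` are placeholders for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample rules, assumed for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_checker(robots_txt, user_agent):
    """Parse robots.txt text and return a function that tests URLs against it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed


allowed = make_checker(ROBOTS_TXT, "ExampleBot")
index_ok = allowed("http://www.example.com/index.html")
private_ok = allowed("http://www.example.com/private/page.html")
```

With the sample rules, the index page is fetchable and anything under `/private/` is not.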

I've put together a page of logos and page designs that I'd like everyone to view. If you feel so moved as to give positive or constructive feedback, please don't hesitate!

http://www.gigablast.com/designs/designs.html

And thanks for all the comments so far, i've really found and fixed a lot of bugs!

truly yours,
matt

greektomi

6:50 pm on Mar 21, 2002 (gmt 0)

I guess I like the first one the best though I don't understand the purpose of reversing the first a in gig a blast.

I think the logo should be relatively small.

Two cents,

Greektomi

ceo

9:02 pm on Mar 21, 2002 (gmt 0)

I hope to get the time to make one and send it across.
I really didn't like any of those.

BTW: I think your page is decent. That's not to say the poster's page design was not good; it was pretty too, better actually, but I still feel you should stick with the fast-loading one.
Cheers,
RR

klaus

11:28 pm on Mar 22, 2002 (gmt 0)

10+ Year Member

Matt,
you're doing a great job. Congratulations.

One question to better understand the numbers:
You say:

the current hardware i have should hold somewhere between 200-250 million web pages

On the Gigablast about-page i read

scales to 200 billion full pages

Is my assumption right that 200 billion is the limit of what the software can handle? To reach it, you'd need "a little bit" more hardware, right?

cheers klaus

JayC

1:14 am on Mar 23, 2002 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

re: logo designs, I really think the "lightning bolt for an L" technique is seriously overused (maybe you should try a lightning bolt with a swoosh behind it!). And every time I see that in a gigablast logo I think of Jolt cola. And I've never even tasted Jolt.

Maybe it's for that reason that the only design on or linked from that page that I like is the second one; the red lettering with the gray and black in the background.

Craig_F

1:30 am on Mar 23, 2002 (gmt 0)

10+ Year Member

Of those submitted so far I actually like the Sticky Sauce version. It feels google-like, but a little cooler...dress it up by switching the black for your favorite color. Keep it simple...

DGBrown

7:57 am on Mar 23, 2002 (gmt 0)

10+ Year Member

I liked the second one. It's got that trendy feel while looking clean and sophisticated.

The first one is cool too, except the backwards "a" looks like another "b" that got blown up. If you left the "a" going the right direction, but tilted and lowered it slightly, it would probably look a lot better.

The lightning bolt ones look kinda cheesy, like they might appear on the box of a store brand knock-off cereal. (that was the first thing to come to mind when I saw them)

I hope that was helpful :)

- D.G.

FreeBee

12:53 pm on Mar 23, 2002 (gmt 0)

10+ Year Member

Matt - good job!

mattdwells

12:30 am on Mar 24, 2002 (gmt 0)

10+ Year Member

thanks for the comments, guys.

and, yes, gigablast does scale to 200 billion pages (200,000,000,000).
and, yes, i would need more hardware.
my current setup only goes to about 200-250 million, so i'd need about 1,000 times the hardware i have, which is actually very doable.
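The arithmetic behind that figure can be checked quickly. This is only a back-of-the-envelope sketch that assumes capacity grows linearly with hardware, which the post does not state; the constant and function names are made up for illustration.

```python
TARGET_PAGES = 200_000_000_000        # 200 billion pages, the stated software limit
CURRENT_CAPACITY_LOW = 200_000_000    # low end of the current setup's capacity
CURRENT_CAPACITY_HIGH = 250_000_000   # high end of the current setup's capacity


def scale_factor(target_pages, current_capacity):
    """How many times the current setup is needed, assuming linear scaling."""
    return target_pages / current_capacity


# The smaller the current capacity, the larger the multiple required.
factor_if_low = scale_factor(TARGET_PAGES, CURRENT_CAPACITY_LOW)
factor_if_high = scale_factor(TARGET_PAGES, CURRENT_CAPACITY_HIGH)
print(factor_if_low, factor_if_high)
```

So the multiple is 1,000 if the current setup holds 200 million pages, and 800 if it holds 250 million.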

matt

click watcher

12:40 am on Mar 24, 2002 (gmt 0)

hi matt great work

re logos...

stickysauce is best because it's the slickest/quickest looking

the lightning is a no no for me, nice work but doesn't suit a search engine.

the neon layout is good, but green = kiss of death i think (despite dmoz)
most shades of green are not appealing, plus the lightning bolt doesn't quite work.

BikeMan

9:11 am on Mar 24, 2002 (gmt 0)

10+ Year Member

Add URL shows that it is temporarily unavailable.

Any ideas when it's going to be up and running again?

BikeMan

9:14 am on Mar 24, 2002 (gmt 0)

10+ Year Member

Ooopppsss ... goofed on my first post.

Add URL is back up and running.

SmallTime

12:33 pm on Mar 24, 2002 (gmt 0)

10+ Year Member

2,295,520 pages. quick, seems to follow links well from the site I submitted, interesting.

pyst

10:38 pm on Mar 24, 2002 (gmt 0)

Gigablast looks good apart from:
a lousy looking logo graphic
an awful background colour

I don't mean to be nasty - just critical of something wrong that may seem like nothing but I think makes a difference.

Yep, it is only MY opinion. Take it or leave it. I still like the engine and its simplicity, much like Google's, and I give it a thumbs up.

Why is it so necessary for spiders to go past the index page? I can't think of many sites that need to be spidered so 'deeply'. All that does is clutter up engines with a multitude of pages, making it more difficult to find the others. I clicked on one of the recent searches and was presented with an entire page of links to different pages for ONE site - very UNimpressive.

Why don't spiders take notice of metatags - it really peeves me to make an effort to describe my sites and have it all ignored and something quite irrelevant (or less meaningful) placed in the description area.

Is there such a thing as a human edited search engine, as opposed to a human edited *laugh* directory like dmoz used to be?

Maybe this board needs a discussion on what a search engine should be. Maybe someone will read it and build a new engine that people will actually enjoy using 100%.

JayC

10:55 pm on Mar 24, 2002 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

pyst, you may have missed some of the earlier discussion in this thread and on the previous pages.

>>I clicked on one of the recent searches and was presented with an entire page of links to different pages for ONE site

Matt has mentioned that clustering is not yet being done, but it will be.
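For readers unfamiliar with the term, clustering here means capping how many results a single site can contribute to one results page. A minimal sketch of that idea follows; the function name `cluster_by_site` and the cap of two results per host are assumptions for illustration, not a description of how Gigablast will do it.

```python
from urllib.parse import urlsplit


def cluster_by_site(ranked_urls, max_per_site=2):
    """Keep at most max_per_site results per host, preserving rank order."""
    kept = []
    counts = {}  # host -> number of results already kept
    for url in ranked_urls:
        host = urlsplit(url).netloc.lower()
        if counts.get(host, 0) < max_per_site:
            kept.append(url)
            counts[host] = counts.get(host, 0) + 1
    return kept


results = [
    "http://www.example.com/a.html",
    "http://www.example.com/b.html",
    "http://www.example.com/c.html",
    "http://other.example.org/page.html",
]
clustered = cluster_by_site(results)
```

Real engines usually also offer a "more results from this site" link for the pages that were held back; that part is left out here.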

Regarding the logo and colors, you'll also see that he's asked for and received several suggested designs; likely that will be changed.

Remember that the site is still in the early development stages.

>>Why don't spiders take notice of metatags - it really peeves me to make an effort to describe my sites

Because while you may use them to accurately describe your sites, many other people have used them inaccurately to spam search engines.

greektomi

11:07 pm on Mar 24, 2002 (gmt 0)

I think you should add the advanced search option on all of the result pages instead of just on the main page.

My reasoning is that often after you make your first search you realize that you need to refine it somewhat...

2 more cents,
Greektomi

Tapolyai

11:22 pm on Mar 24, 2002 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Hmmm... Add a URL is down again...

JBoss008

11:23 pm on Mar 24, 2002 (gmt 0)

10+ Year Member

Matt,

Very nice work and great job.

Just want you to know this is the kind of ingenuity that is going to make the search engine world a whole new place to be.

Jason

pyst

3:52 am on Mar 25, 2002 (gmt 0)

ADD URL is up again.

I just did a search which made me wonder about the relevance of gigablast searches because I was given as many irrelevant results as valid ones. On closer inspection I noticed my search term seemed to be giving me results for another term.
The terms are 'gay pics' and 'tin cans'.
Any reason this might be happening? Apart from that, the search results were ok.

Something missing from the Gigablast results - the ability to pick a page 'deeper' in the pack like google has - a dozen or more pages you can pick from instead of just 'next' or 'previous'.

raceboat

3:50 pm on Mar 25, 2002 (gmt 0)

Just found this thread... So I thought I would give it a try.

I added the URL of my site. Within seconds it apparently had spidered my site (200 pages or so...) as well as a few hundred other related sites I have listed in my directory, judging from the "date spidered".

Pretty impressive!

pgsbs

10:13 pm on Mar 25, 2002 (gmt 0)

10+ Year Member

Strange... I added my url a couple of days ago after reading this thread. And now my site seems to have vanished from the SE. What's going on??

brotherhood of LAN

10:25 pm on Mar 25, 2002 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

pgsbs, he has been restarting the database from scratch. He had done it 3 times last I heard, probably many more times by now.

pgsbs

8:44 am on Mar 26, 2002 (gmt 0)

10+ Year Member

So I guess all the hope for a new good SE has vanished now?? Well Matt shouldn't give up on it. We need an extra, good working SE. Even with Google up and about.

top5jamaica

6:52 pm on Mar 26, 2002 (gmt 0)

10+ Year Member

hope this really becomes the next google. need some serious options out there along with google :-)

the logos/designs, i really wasn't impressed by any of them. i'm sure some better submissions will come in soon

Eathan

7:32 pm on Mar 26, 2002 (gmt 0)

10+ Year Member

Man, this sure brings back memories!

Ica

9:27 am on Mar 27, 2002 (gmt 0)

10+ Year Member

pyst:
I discovered that too. I think it's caused by an implicit "or" between the search terms. That means the query "gay pics" first returns all pages containing both "gay" and "pics", and then goes on to return pages containing either "gay" or "pics".
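The behaviour guessed at above (all-terms matches first, then any-term matches) can be sketched like this. It is only an illustration of the guess, not Gigablast's actual ranking; the function name `rank_and_then_or`, the whitespace tokenizing, and the sample pages are all assumptions.

```python
def rank_and_then_or(query, pages):
    """Return URLs matching every query term first, then URLs matching only some.

    pages maps url -> page text. Matching is on whole lowercase words.
    """
    terms = query.lower().split()
    if not terms:
        return []
    all_match, some_match = [], []
    for url, text in pages.items():
        words = set(text.lower().split())
        hits = sum(1 for term in terms if term in words)
        if hits == len(terms):
            all_match.append(url)      # the "and" results
        elif hits > 0:
            some_match.append(url)     # the "or" results, listed afterwards
    return all_match + some_match


pages = {
    "http://a.example.com/": "tin cans for sale",
    "http://b.example.com/": "tin roof repair",
    "http://c.example.com/": "boat racing news",
}
ranked = rank_and_then_or("tin cans", pages)
```

Once the pages matching both words run out, pages matching only one word start to appear, which would look like results for a different search.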

BikeMan

9:56 pm on Mar 27, 2002 (gmt 0)

10+ Year Member

Looks like the database is corrupted again. This time I found my site description leading to a porn site.

Hmmmm if it was the other way around traffic would have gone up.

Hope it's corrected before the site owner has a stroke.

nima

10:52 pm on Mar 27, 2002 (gmt 0)

10+ Year Member

Hi, I tried to submit my site but it says that it is temporarily down. Does anyone know when it will be back up?

Jacquie

12:40 am on Mar 28, 2002 (gmt 0)

What is going on with Gigablast?

raceboat

1:44 pm on Mar 29, 2002 (gmt 0)

The database appears to have been reset again...

This 65 message thread spans 3 pages.
