Forum Moderators: bakedjake

GigaBlast Part 3

     
11:17 am on Mar 18, 2002 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Aug 10, 2001
posts:1551
votes: 10


Continued from: [webmasterworld.com...]


Looks impressive so far indeed. I'm really curious about any increase/decrease in relevance, once there's a significant number of sites indexed.

A few things to note, most of which you probably know already:

  • Always respect robots.txt for all pages.

  • The spider needs to do some load balancing, so that it doesn't fetch too many pages from the same site in a short time. The recommended rate is about one page per minute per site (http://www.robotstxt.org/wc/robots.html)

  • Make sure that the images on your site are served with headers for creation date, size, and expiry date, so that the client can cache them. This will noticeably reduce the bandwidth requirements on your own system.

  • Only list one of www.example.com/ and www.example.com/index.html (home¦default.htm¦asp¦php, etc.), at least if they contain the same text.

  • Cluster the results, so that one site can't dominate the SERPs for any keyword combination.

  • I'm sure there's a lot more work waiting for you... ;)
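For what it's worth, the first two points in the list above (obeying robots.txt and spacing out fetches per site) can be sketched in a few lines of Python. This is only an illustration, not how Gigablast does it: the `PoliteScheduler` class, the `ExampleBot` user agent, and the 60-second default are assumptions made for the example. It uses the standard library's `urllib.robotparser` and expects the caller to download each robots.txt itself.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleBot"     # hypothetical crawler name
MIN_DELAY_SECONDS = 60.0      # assumed from "one page per minute per site"


class PoliteScheduler:
    """Tracks robots.txt rules and the last fetch time for each host."""

    def __init__(self, min_delay=MIN_DELAY_SECONDS, clock=time.monotonic):
        self.min_delay = min_delay
        self.clock = clock
        self.robots = {}       # host -> parsed robots.txt rules
        self.last_fetch = {}   # host -> time of the most recent fetch

    def set_robots(self, host, robots_txt):
        """Store the robots.txt body that was downloaded for a host."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.robots[host] = parser

    def allowed(self, url):
        """True only if rules for the host are loaded and permit the URL."""
        parser = self.robots.get(urlsplit(url).netloc)
        # No rules loaded yet: be conservative and refuse.
        return parser is not None and parser.can_fetch(USER_AGENT, url)

    def seconds_until_ready(self, url):
        """How long to wait before this URL's host may be fetched again."""
        last = self.last_fetch.get(urlsplit(url).netloc)
        if last is None:
            return 0.0
        return max(0.0, self.min_delay - (self.clock() - last))

    def record_fetch(self, url):
        """Remember that the URL's host was just fetched."""
        self.last_fetch[urlsplit(url).netloc] = self.clock()
```

A crawler loop would call `allowed()` before every request, sleep for `seconds_until_ready()`, and call `record_fetch()` right after the request goes out. Keying everything by host keeps one busy site from slowing down the rest of the crawl.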
    6:17 pm on Mar 21, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 16, 2002
    posts:65
    votes: 0


    greetings,

    If the spider wasn't obeying your site's robots.txt rules, that was a bug, but it should work now.

    I've put together a page of logos and page designs that I'd like everyone to view and if you feel so moved as to give positive or constructive feedback, please don't hesitate!

    http://www.gigablast.com/designs/designs.html

    And thanks for all the comments so far, i've really found and fixed a lot of bugs!

    truly yours,
    matt

    greektomi

    6:50 pm on Mar 21, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    I guess I like the first one the best though I don't understand the purpose of reversing the first a in gig a blast.

    I think the logo should be relatively small.

    Two cents,

    Greektomi

    ceo

    9:02 pm on Mar 21, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    I hope to get the time to make one and send it across.
    I really didn't like any of those.

    BTW: I think your page is decent. Not to say the poster's page design wasn't good; it was pretty too, better actually, but I still feel you should stick with the fast-loading one.
    Cheers,
    RR

    11:28 pm on Mar 22, 2002 (gmt 0)

    New User

    10+ Year Member

    joined:Feb 23, 2002
    posts:16
    votes: 0


    Matt,
    you're doing a great job. Congratulations.

    One question to better understand the numbers:
    You say:

    the current hardware i have should hold somewhere between 200-250 million web page

    On the Gigablast about-page i read

    scales to 200 billion full pages

    Is my assumption right that 200 billion is what the software can handle? To reach it, you need "a little bit" more hardware, right?

    cheers klaus

    1:14 am on Mar 23, 2002 (gmt 0)

    Senior Member

    WebmasterWorld Senior Member 10+ Year Member

    joined:Dec 5, 2001
    posts:724
    votes: 0


    re: logo designs, I really think the "lightning bolt for an L" technique is seriously overused (maybe you should try a lightning bolt with a swoosh behind it!). And every time I see that in a gigablast logo I think of Jolt cola. And I've never even tasted Jolt.

    Maybe it's for that reason that the only design on or linked from that page that I like is the second one; the red lettering with the gray and black in the background.

    1:30 am on Mar 23, 2002 (gmt 0)

    Preferred Member

    10+ Year Member

    joined:Nov 2, 2001
    posts:597
    votes: 0


    Of those submitted so far I actually like the Sticky Sauce version. It feels google-like, but a little cooler...dress it up by switching the black for your favorite color. Keep it simple...
    7:57 am on Mar 23, 2002 (gmt 0)

    New User

    10+ Year Member

    joined:Dec 18, 2001
    posts:18
    votes: 0


    I liked the second one. It's got that trendy feel while looking clean and sophisticated.

    The first one is cool too, except the backwards "a" looks like another "b" that got blown up. If you left the "a" going the right direction, but tilted and lowered it slightly, it would probably look a lot better.

    The lightning bolt ones look kinda cheesy, like they might appear on the box of a store brand knock-off cereal. (that was the first thing to come to mind when I saw them)

    I hope that was helpful :)

    - D.G.

    12:53 pm on Mar 23, 2002 (gmt 0)

    Full Member

    10+ Year Member

    joined:Feb 19, 2001
    posts:308
    votes: 0


    Matt - good job!
    12:30 am on Mar 24, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 16, 2002
    posts:65
    votes: 0


    thanks for the comments, guys.

    and, yes, gigablast does scale to 200 billion pages (200,000,000,000).
    and, yes, i would need more hardware.
    my current setup only goes to about 200-250 million, so i'd need about 1,000 times the machines i have, which is actually very doable.

    matt
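The scaling figure above is easy to sanity-check. Here is a quick sketch in Python that uses only the numbers quoted in this thread; the function name is made up for the example.

```python
def machines_multiplier(target_pages, current_capacity):
    """How many times the current hardware is needed, rounded up."""
    return -(-target_pages // current_capacity)  # ceiling division on ints


TARGET = 200_000_000_000  # 200 billion pages

# The current setup is quoted as holding 200-250 million pages.
for capacity in (200_000_000, 250_000_000):
    print(capacity, machines_multiplier(TARGET, capacity))
```

That works out to 1,000 times the hardware at 200 million pages and 800 times at 250 million.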

    12:40 am on Mar 24, 2002 (gmt 0)

    Preferred Member

    joined:July 3, 2001
    posts:387
    votes: 0


    hi matt great work

    re logos...

    stickysauce is best because it's slickest/quickest looking

    the lightning is a no no for me, nice work but doesn't suit a search engine.

    the neon layout is good, but green = kiss of death i think (despite dmoz)
    most shades of green are not appealing, plus the lightning bolt doesn't quite work.

    9:11 am on Mar 24, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 24, 2002
    posts:64
    votes: 0


    Add URL shows that it is temporarily unavailable.

    Any ideas when it's going to be up and running again?

    9:14 am on Mar 24, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 24, 2002
    posts:64
    votes: 0


    Ooopppsss ... goofed on my first post.

    Add URL is back up and running.

    12:33 pm on Mar 24, 2002 (gmt 0)

    Preferred Member

    10+ Year Member

    joined:Sept 20, 2001
    posts:478
    votes: 0


    2,295,520 pages. quick, seems to follow links well from the site I submitted, interesting.

    pyst

    10:38 pm on Mar 24, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    Gigablast looks good apart from:
    a lousy looking logo graphic
    an awful background colour

    I don't mean to be nasty - just critical of something wrong that may seem like nothing but I think makes a difference.

    Yep, it is only MY opinion. Take it or leave it. I still like the engine and its simplicity, just like Google's, and I give it a thumbs up.

    Why is it so necessary for spiders to go past the index page? I can't think of many sites that need to be spidered so 'deeply' - all they do is clutter up engines with a multitude of pages, making it more difficult to find the others. I clicked on one of the recent searches and was presented with an entire page of links to different pages for ONE site - very UNimpressive.

    Why don't spiders take notice of metatags? It really peeves me to make an effort to describe my sites and have it all ignored and something quite irrelevant (or less meaningful) placed in the description area.

    Is there such a thing as a human edited search engine, as opposed to a human edited *laugh* directory like dmoz used to be?

    Maybe this board needs a discussion on what a search engine should be. Maybe someone will read it and build a new engine that people will actually enjoy using 100%.

    10:55 pm on Mar 24, 2002 (gmt 0)

    Senior Member

    WebmasterWorld Senior Member 10+ Year Member

    joined:Dec 5, 2001
    posts:724
    votes: 0


    pyst, you may have missed some of the earlier discussion in this thread and on the previous pages.

    >>I clicked on one of the recent searches and was presented with an entire page of links to different pages for ONE site

    Matt has mentioned that clustering is not yet being done, but it will be.
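To show what clustering could look like, here is a minimal Python sketch. It is not Matt's implementation: the `cluster_by_site` name and the limit of two results per host are assumptions for the example. It walks an already-ranked list of URLs, keeps the first few from each host in place, and sets the rest aside so they could sit behind a "more results from this site" link.

```python
from urllib.parse import urlsplit


def cluster_by_site(ranked_urls, max_per_site=2):
    """Split ranked URLs into those shown and those held back per host.

    Returns (shown, hidden): shown keeps the original ranking order with
    at most max_per_site URLs per host; hidden maps host -> overflow URLs.
    """
    shown = []
    hidden = {}
    counts = {}
    for url in ranked_urls:
        host = urlsplit(url).netloc.lower()
        counts[host] = counts.get(host, 0) + 1
        if counts[host] <= max_per_site:
            shown.append(url)
        else:
            hidden.setdefault(host, []).append(url)
    return shown, hidden
```

Grouping by host name is the simplest possible definition of a "site". A real engine would probably also want to treat example.com and www.example.com as the same site.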

    Regarding the logo and colors, you'll also see that he's asked for and received several suggested designs; likely that will be changed.

    Remember that the site is still in the early development stages.

    >>Why don't spiders take notice of metatags - it really peeves me to make an effort to describe my sites

    Because while you may use them to accurately describe your sites, many other people have used them inaccurately to spam search engines.

    greektomi

    11:07 pm on Mar 24, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    I think you should add the advanced search option on all of the result pages instead of just on the main page.

    My reasoning is that often after you make your first search you realize that you need to refine it somewhat...

    2 more cents,
    Greektomi

    11:22 pm on Mar 24, 2002 (gmt 0)

    Senior Member

    WebmasterWorld Senior Member 10+ Year Member

    joined:Oct 10, 2001
    posts:731
    votes: 0


    Hmmm... Add a URL is down again...
    11:23 pm on Mar 24, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 10, 2002
    posts:64
    votes: 0


    Matt,

    Very nice work and great job.

    Just want you to know this is the kind of ingenuity that is going to make the search engine world a whole new place to be.

    Jason

    pyst

    3:52 am on Mar 25, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    ADD URL is up again.

    I just did a search which made me wonder about the relevance of gigablast searches because I was given as many irrelevant results as valid ones. On closer inspection I noticed my search term seemed to be giving me results for another term.
    The terms are 'gay pics' and 'tin cans'.
    Any reason this might be happening? Apart from that, however, the search results were ok.

    Something missing from the Gigablast results - the ability to pick a page 'deeper' in the pack like google has - a dozen or more pages you can pick from instead of just 'next' or 'previous'.

    raceboat

    3:50 pm on Mar 25, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    Just found this thread... So I thought I would give it a try.

    I added the URL of my site. Within seconds it apparently had spidered my site (200 pages or so...) as well as a few hundred other related sites I have listed in my directory, judging from the "date spidered".

    Pretty impressive!

    10:13 pm on Mar 25, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Oct 9, 2001
    posts:57
    votes: 0


    Strange... I added my url a couple of days ago after reading this thread. And now my site seems to have vanished from the SE. What's going on??
    10:25 pm on Mar 25, 2002 (gmt 0)

    Senior Member from GB 

    WebmasterWorld Senior Member brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

    joined:Jan 30, 2002
    posts:4843
    votes: 2


    pgsbs, he has been starting the database from scratch, he's done it 3 times last I heard, probably many more times now
    8:44 am on Mar 26, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Oct 9, 2001
    posts:57
    votes: 0


    So I guess all the hope for a new good SE has vanished now?? Well Matt shouldn't give up on it. We need an extra, good working SE. Even with Google up and about.
    6:52 pm on Mar 26, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 25, 2002
    posts:153
    votes: 0


    hope this really becomes the next google. need some serious options out there along with google :-)

    the logos/designs, i really wasn't impressed by any of them. i'm sure some better submissions will come in soon

    7:32 pm on Mar 26, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 16, 2004
    posts:74
    votes: 0


    Man, this sure brings back memories!

    Ica

    9:27 am on Mar 27, 2002 (gmt 0)

    New User

    10+ Year Member

    joined:May 30, 2001
    posts:24
    votes: 0


    pyst:
    I discovered that too. I think it's caused by an "or"-connection between the search terms. That means, the term "gay pics" will deliver all hits with "gay" and "pics" first, and then start delivering all pages with "gay" or "pics".
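The behaviour guessed at above (pages matching every term first, then pages matching only some) can be sketched as follows. This is just a guess at the logic, not Gigablast's actual ranking; the function name and the toy documents are invented for the example, and matching is on whole lowercase words only.

```python
def rank_and_before_or(query, documents):
    """Return doc ids: those containing all query terms first, then any.

    documents maps doc id -> text. Documents matching no term are dropped.
    """
    terms = query.lower().split()
    if not terms:
        return []
    all_terms = []   # documents containing every query term
    some_terms = []  # documents containing at least one, but not all
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        hits = sum(term in words for term in terms)
        if hits == len(terms):
            all_terms.append(doc_id)
        elif hits > 0:
            some_terms.append(doc_id)
    return all_terms + some_terms


docs = {
    "d1": "recycling tin cans at home",
    "d2": "tin roof repair",
    "d3": "garden tools",
    "d4": "cans of paint",
}
print(rank_and_before_or("tin cans", docs))
```

With only a few million pages indexed, the "all terms" group can run out quickly for a two-word query, which would explain seeing unrelated pages that share just one word.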
    9:56 pm on Mar 27, 2002 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Mar 24, 2002
    posts:64
    votes: 0


    Looks like the database is corrupted again. This time I found my site description leading to a porn site.

    Hmmmm if it was the other way around traffic would have gone up.

    Hope it's corrected before the site owner has a stroke.

    10:52 pm on Mar 27, 2002 (gmt 0)

    New User

    5+ Year Member

    joined:July 8, 2009
    posts:6
    votes: 0


    Hi, I tried to submit my site but it says that it is temporarily down. Does anyone know when it will be back up?

    Jacquie

    12:40 am on Mar 28, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    What is going on with Gigablast?

    raceboat

    1:44 pm on Mar 29, 2002 (gmt 0)

    Inactive Member
    Account Expired

     
     


    The database appears to have been reset again...