| 9:08 am on Jun 14, 2006 (gmt 0)|
Sites having keywords in URL get a boost....
| 4:53 pm on Jun 16, 2006 (gmt 0)|
How did you seed your engine?
| 7:17 pm on Jun 16, 2006 (gmt 0)|
The engine was seeded with results from MSN to get it started. The results are then put into a database and stored on the server. Each keyword has its own database, so it's not really a true search engine; it's more like a massive top 100 list that the users can control.
When new pages are added to the databases through the submission form, they get an extra push to the top of the results, giving new pages a chance to get traffic right away.
If the new pages are junk, people will eventually vote the bad or irrelevant pages to the bottom, and the good pages will rise to the top.
I also realized that there are a lot of relevant pages that will never get a chance to rank #1, so I added a random "Site of the Moment" so that every site occasionally ranks #1 regardless of votes.
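The per-keyword "top 100" model described above could be sketched roughly like this. This is only an illustration of the described behavior, not the actual scripts; the function names and the boost value are assumptions.

```python
import random

NEW_PAGE_BOOST = 10  # assumed head start given to fresh submissions

def submit_page(index, keyword, url):
    """Add a new page with a temporary boost so it starts near the top."""
    entries = index.setdefault(keyword, [])  # each keyword has its own list
    entries.append({"url": url, "votes": NEW_PAGE_BOOST})

def vote(index, keyword, url, up=True):
    """A user vote moves a page up or down within its keyword list."""
    for entry in index.get(keyword, []):
        if entry["url"] == url:
            entry["votes"] += 1 if up else -1

def results(index, keyword):
    """Top 100 by votes, with a random 'Site of the Moment' promoted to #1."""
    ranked = sorted(index.get(keyword, []),
                    key=lambda e: e["votes"], reverse=True)[:100]
    if len(ranked) > 1:
        pick = random.randrange(len(ranked))
        ranked.insert(0, ranked.pop(pick))  # random site gets the #1 slot
    return [e["url"] for e in ranked]
```

The key property is that ranking is just a sort over stored vote counts, so serving results is cheap compared with a crawling engine.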
| 3:52 am on Jun 19, 2006 (gmt 0)|
I'm assuming from the lack of interest that it's probably going to flop. But I have found and fixed a lot of bugs from all your searches, so thanks!
Maybe I could simply sell the scripts to the public? I'd really like to know what you all think of this idea.
Let me know...
| 4:01 am on Jun 19, 2006 (gmt 0)|
You'll need to put on your 'evil' hat and think about some of your 'voting' algorithms. For example, if your engine got very popular and people started competing for position, then keyword-site #2 could hire a whole bunch of low-paid 'submitters' for his own pages, and then keep them busy afterwards submitting spam reports on his #1 competitor's pages.
A sad fact these days is that you not only need an algorithm to rank pages, you need another one to fight the darker aspects of human nature... especially when those humans are competing for money.
| 4:30 am on Jun 19, 2006 (gmt 0)|
The spam reporting feature is hand managed, and once a falsely reported site has been checked, it can no longer be reported as spam.
So if a page is reported as spam but isn't, I simply set the page so it can never be reported as spam again. A competitor would be shooting himself in the foot.
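The hand-checked flow just described could be sketched like this. The data layout and names are hypothetical; the point is only the rule that a page cleared by a human check becomes immune to further reports.

```python
pages = {}  # url -> {"cleared": bool, "reported": bool}

def report_spam(url):
    """A user files a spam report; ignored if the page was already cleared."""
    page = pages.setdefault(url, {"cleared": False, "reported": False})
    if page["cleared"]:
        return False  # hand-checked and cleared: report is discarded
    page["reported"] = True
    return True

def review(url, is_spam):
    """A human reviews the report; a false report makes the page immune."""
    page = pages[url]
    page["reported"] = False
    page["cleared"] = not is_spam
```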
I realize that it will be a big job but for the most part, it will be the best way to sort through the spam.
I did notice that everyone who searched for their main keyword instantly voted their own site to the top of the results. I have checked those results, and most people (99%) deserve to be #1, simply because webmasters know best how the results should be ranked.
So far it has worked out exactly the way I designed it and the results that have been adjusted by webmasters are very good.
I also wanted to set it up so new pages rank well when submitted, but ultimately the best pages will always rank best, because every click-through also counts as a vote.
It would be very time consuming for a competitor to demote other pages enough for their own site to rank in the top 5 for a single keyword, and they would have to do it for many different keywords for it to be effective.
Either way, even though they are a competitor, their page may be more relevant than the competition's. I say let the people judge.
| 4:45 am on Jun 19, 2006 (gmt 0)|
|Sites that are reported as spam that are not really spam will not be able to be reported as spam once checked. So if a page is reported as spam but is not spam, I simply set the page so it can never be reported as spam again. So a competitor would be shooting himself in the foot.|
But wouldn't that give the site owner a license to spam the heck out of a site? All he'd have to do is report his own site as spam, wait for it to be checked and cleared, and then spam it out, knowing no one else could report it again.
Or did I misread or misunderstand that?
| 5:08 am on Jun 19, 2006 (gmt 0)|
Good point! I like to think that most people are honest, and it's not very hard to vote your own page to the top, spam or not, but there is definitely a small flaw that needs to be worked out.
I am beginning to see why Google is so secretive about everything. HAHAHA! I wanted to create a transparent engine with no secrets, but there are always those few who love to buck the system.
Either way, ken_b, I really appreciate your pointing this out to me. It's not a hard fix, but it is definitely a problem.
| 5:39 am on Jun 19, 2006 (gmt 0)|
They also take the risk that maybe nobody really checks the spam reports? HAHAHA! I am sure the spam reports will back up a long way!
| 11:40 pm on Jun 19, 2006 (gmt 0)|
Has anyone checked out the page submission process? Is it complicated or self explanatory?
Is it helpful or a pain in the butt?
| 8:11 pm on Jun 23, 2006 (gmt 0)|
I still haven't gotten any feedback on the idea. Do you think it's a good one? How about usability? Do you think it will be useful to people?
Are there any features you think it needs?
Is there anything that is confusing to you or possibly the user?
Please let me know.
| 10:58 pm on Jun 26, 2006 (gmt 0)|
Hi eyezshine, First of all don't give up. It is a hard road you travel, but it is good to see someone trying.
jdMorgan had a good point on the evil side, but it is much worse: top lists and voting scripts are spoofed all the time, a very common practice on the adult side of things, and those spoofing scripts and bots eat a lot of bandwidth.
ken_b brought up another good point, but another twist on the matter is that domains are bought all the time by spammers for the sole purpose of getting their good linkage.
I do like that you want to empower the user, and with that concept in mind, let me toss a couple of ideas by you.
How about grouped searches, i.e. a series of radio buttons that let the user choose where they want to search. The idea is that maybe I only want to search bulletin boards because I just want to see what others are saying about this or that. Or I only want to look at sales pages because I'm looking to buy, or review pages because I want to see how something ranks. Or then again, maybe I'm not sure exactly where I want to look, but I know I am not likely to find my answer in blogs or bulletin boards, etc.
Another helpful tool would be the ability to eliminate an entire domain from my search results, and/or the domains that link to it.
I am happy with your engine; found a new contact, hopefully a new co-conspirator.
| 3:53 am on Jun 27, 2006 (gmt 0)|
I have been thinking about this idea for years, and about 4 months ago I began putting all the ideas together and making them work.
My whole goal was to make an engine that doesn't use many server resources, so a few people can run it easily, but mostly to give users what they want, and give it to them fast.
All this feedback is helping a ton and has helped me find a lot of small bugs and flaws in my design. The engine may or may not compete with the big engines one day, but it does manage spam far better than they do.
If it does well, I will have to monetize it somehow to pay for the overhead. A contextual PPC model is simple enough to create but I would like to ride out the free results as long as possible.
Once I get the whole system on a cluster of servers, I may even try getting contracts with ISPs to provide search for them. That would pay for the overhead and keep the results clean, with no ads.
I definitely want to keep the engine private because I saw what happened to Google when they went public. It became all about the money, and they forgot about the little guy. Getting rich is great, but it's not everything. Helping others is everything.
| 4:03 am on Jun 27, 2006 (gmt 0)|
Conceptually it is a nice idea.
Suggestions:
- Make it open source
- Allow a site to apply for more than one keyword
- Auto-suggest keywords based on best fit with existing sites
- Get hold of a faster server - this one is ...slow...
- Get a designer in, pronto!
| 5:29 am on Jun 27, 2006 (gmt 0)|
<<Make it open source
I'm not sure how to do that.
<<Allow a site to apply for more than one keyword
In the future when I have more time I will work on that.
<<Auto-suggest keywords based on best fit with existing sites
I thought about that too, but I was trying to keep the submission process as simple as possible, because there are many webmasters out there who have no clue what they are doing.
<<Get hold of a faster server - this one is ...slow...
Once I work out the bugs in the scripts and the site gets more traffic, I plan on putting the site on a cluster of many servers with load balancing etc.
<<Get a designer in, pronto!
Come on.... That hurts! HAHAHA! I was going for cool calm green colors with a hint of blue to set the mood. The design will change eventually but for now it's the tropical theme. HAHAHA!
| 2:08 am on Jun 29, 2006 (gmt 0)|
Spam is my biggest problem when using a search engine. I run 5 directories and about 70% of what is submitted is some type of Adsense site - really useless. I am seeing more of these useless sites in my search results on all engines and it gets very frustrating. I really like your idea and am excited to see it grow! Please keep up the good work.....:-)
| 2:30 am on Jun 29, 2006 (gmt 0)|
It will be a hard road getting people to use a different search engine than Google, but maybe it'll catch on one day? Guess I need to start advertising.
I wanted to work out as many bugs as possible before I started advertising, but then again, now is better than later, I guess.
Once other webmasters begin seeing referrals from it, it may begin to catch on but that is a long way off it seems.
If it does catch on, it will be very frustrating for the spam websites, and I am sure they won't even have a chance once webmasters begin watching their keywords. I am sure there are more honest webmasters than black hats, which is what I am counting on.
So far, every spam submission I have received was definitely spam, and I haven't had one problem at all yet. The way it's set up in the back end makes it very easy to spot the good spam reports from the bad.
Another thing is, not all AdSense sites are spam. Websites with nothing but AdSense and ads are definitely spam and pointless pages. But informational and product pages, and even affiliate pages with AdSense on them, are fine with me, as long as the user finds what they want.
| 3:10 am on Jun 29, 2006 (gmt 0)|
I think you have a pretty decent concept. Certainly open to abuse... but everything is. Have you considered making the user vote automated?
For instance, if someone searches for "widgets", clicks on a result, but hits the back button within a given time frame (10 seconds, for example) and tries another link, it would be counted as a thumbs down. If they clicked on a result and didn't come back within, say, 5 minutes, it would be counted as a thumbs up. (You could keep the actual time frames "top secret" and even vary them from day to day.)
This would also inadvertently keep "good" MFA sites on the good side of the SERPs and gradually weed out the bad ones.
Just a thought.
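The dwell-time voting idea above could be sketched like this. The thresholds are the example values from the post, and the function is an assumption about how the signal might be scored, not part of the actual engine.

```python
BOUNCE_SECONDS = 10      # back within this window: thumbs down
SATISFIED_SECONDS = 300  # stayed away this long: thumbs up

def score_click(click_time, return_time=None):
    """Return +1, -1, or 0 (no signal) for a single click-through.

    Times are seconds; return_time is None if the user never came back.
    """
    if return_time is None:
        return +1  # never returned to the results page: assume satisfied
    dwell = return_time - click_time
    if dwell <= BOUNCE_SECONDS:
        return -1  # quick bounce: result probably wasn't relevant
    if dwell >= SATISFIED_SECONDS:
        return +1
    return 0  # ambiguous dwell time: cast no vote
```

Keeping the thresholds server-side (and varying them, as suggested) would make the signal harder to game than explicit vote buttons.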
| 5:29 am on Jun 29, 2006 (gmt 0)|
One of my problems is that the average "user" doesn't know what an MFA site is, so most would just click through, then click an AdSense link, maybe find what they wanted, and that would be a pointless vote. But at least sometimes they find what they need.
I am counting on webmasters who know what spam looks like to help organize the results. I will deal with the "black hats" as they pop up. I can easily block an entire domain from showing in the results by simply adding that domain to a database and then checking that database on every search.
But doing so would add to the time it takes to produce results. The method I am using now uses a lot less CPU.
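The domain block described above could be kept cheap by holding the blocklist in memory as a set, so the per-search check is a constant-time membership test per result. This is a sketch under that assumption; the names are illustrative.

```python
from urllib.parse import urlparse

blocked_domains = {"spam.example"}  # loaded once, e.g. from the database

def filter_results(urls):
    """Drop any result whose host is on the blocklist."""
    return [u for u in urls if urlparse(u).hostname not in blocked_domains]
```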
I sat here and organized the results on many keywords just playing around and it's kinda fun like a game.
My biggest problem is the submission process. I am afraid it will be too time consuming for webmasters to submit all their pages. I remember Altavista used to make you submit pages individually, just before they lost out to Google. But it does make it harder to build a million-page AdSense site and get it all indexed.
My thought on it is, if it's important to you then it must be worth submitting. Think about all the time you'll save by not having to buy or trade a million links to rank well. Submitting pages to the 100% targeted keyword you choose is a walk in the park compared to gaining link popularity! HAHA!
Plus there is no waiting. Pages are added immediately. Submit and forget. Dead links will be weeded out by the spam reporting feature. New sites will be able to rank well. The world will be a happier place. We'll all be able to find what we need, and life will be simple again.
I have a dream....... Where the users and the webmasters unite to create a search engine that will not fail.....
Ok I'm done being funny.
| 7:07 am on Jun 29, 2006 (gmt 0)|
Keep up the good work. New projects are always encouraged and I wish you all the best.
I would like to provide you true feedback. However I am short on time to suggest any ideas or improvements.
To be frank, I think it may be a long time before your project catches on, or for that matter, it may not work at all.
The reason is simple:
If you are going to depend on webmasters to submit their own sites, then you are certainly missing something.
I personally would never submit my pages manually to a search engine which I know wouldn't send me a good number of referrals. We all know that the bulk of referrals are sent by Google, Yahoo, and MSN. The rest of the search engines pale by comparison, and that includes askjeeves.com as well.
Even if I do submit, it would mostly be my home page. You wouldn't be able to build a huge database, and in turn it will mostly end up with dissatisfied users.
The voting system has many flaws and it is open to a massive scale abuse.
In the end, users want quick access to the information they are looking for.
| 7:32 am on Jun 29, 2006 (gmt 0)|
|The idea behind this engine is to let the users control the results instead of algorithms. Users can vote and demote pages in the rankings and eventually only the best pages will rise to the top of the results. |
Certainly on the somewhat crude test I did, it appears to be comparatively easy for a user to manipulate your voting results, to the point of making them meaningless :(
To put it bluntly, the idea does not stand a chance in a commercial world. Bot or click factories would work against you, if the site caught on, to ensure that the scum rose to the top.
| 2:10 pm on Jun 29, 2006 (gmt 0)|
Yes that was a very crude attempt. HAHAHA! You only clicked on one link? How is that going to mess things up?
I was hoping it was really messed up so I could study it and see how I could do something about it.
I do have a few bugs I still need to fix that I just found, so thanks for inadvertently showing me that. This is helping a lot!
| 2:17 pm on Jun 29, 2006 (gmt 0)|
Also, I want the scum to rise to the top, because that way it'll get reported as spam. Once reported, it'll never be ranked again, and they will never be able to submit it again.
| 4:07 pm on Jun 30, 2006 (gmt 0)|
You will have a problem on the voting, but your idea of getting a webmaster to specify exactly what the website is all about is a step in the right direction.
Our idea has been along these lines: you have, say, 100 points. You can use, for example, 40 points for prime keywords; you can use all 40 for 1 word, or 20 points each for 2 words, up to a maximum of 5 words at 8 points each.
You can then add up to a further 10 words totaling 30 points, then 30 words of description at 1 point each.
A site that is purely about 'widgets' may want to put 'all their widgets in one basket', but someone who does green widgets, brown widgets and dark brown widgets can decide accordingly.
If you are trying to provide every form of widget in the world, you'll have a tougher time.
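The point budget described above could be validated with something like the following. This is only an illustration of the proposed limits (40 points across up to 5 prime keywords, 30 points across up to 10 secondary words, 30 description words at 1 point each); the function and data shapes are assumptions.

```python
def valid_allocation(prime, secondary, description):
    """Check a submission against the 100-point budget.

    prime and secondary map word -> points; description is a word list.
    """
    if len(prime) > 5 or sum(prime.values()) > 40:
        return False  # prime keywords: at most 5 words, 40 points total
    if len(secondary) > 10 or sum(secondary.values()) > 30:
        return False  # secondary words: at most 10 words, 30 points total
    if len(description) > 30:
        return False  # description: at most 30 words at 1 point each
    return True
```

A scheme like this forces the submitter to spend a fixed budget, so emphasizing one keyword necessarily de-emphasizes the others.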
I think the biggest problem the majors have is to have an algo that gives 4 million results for what a surfer is searching for when in fact there may be only 10 to 100 real websites relating to their search. No one is going to search 4 million sites.
When someone searches they are looking for information that answers what they are searching for.
It is just 'so simple'. We started working on 'something' a few weeks back and so far it is looking great!
| 12:08 am on Jul 14, 2006 (gmt 0)|
So far everything is working out great! Thanks for your help and suggestions. I have fixed a lot of small bugs and problems and will be working on more features soon.
I haven't seen any problems with the voting yet, and the spam reports have all been perfect. It's amazing how much crap is in the results. There are only about 2 million keywords/phrases to clean up, which has been very easy so far because the spam reports have been very accurate.
I am sure there are more bugs I haven't found yet, so if you don't mind, please do some more searches and submissions. Thanks!