
Forum Moderators: incrediBILL & lawman


Let's try this for a month or three...

last recourse against rogue bots

     

Brett_Tabke

1:21 am on Nov 19, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



[webmasterworld.com...]

required login the real story here...
MSN and Yahoo bots were blocked in October. This blocks everyone else.

Robin_reala

1:29 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



no one is looking forward to the inevitable compatibility problems

Judging by the quality of support in forum83 that should really be an issue :)

claus

1:41 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You know, I've been saying this for years, literally - even several times in these forums: A good webmaster will want loyal revisiting users, not bots or random traffic.

So, if this site can work without SE's all the more power and respect to my fellow members, to Brett and to the mods throughout the years for that. Well done!

It's not the first site to achieve that status, but it's certainly one of the largest.

--
An internal site search is *badly needed* though, and a very good one at that. But that's apparently in the works, so I'll just have to be patient (like any good doctor will tell you to be).

lasko

1:53 pm on Nov 24, 2005 (gmt 0)

10+ Year Member




That's a bold move, Brett. I hope one day I can work on a web site that doesn't have to think about search engines :)

Just imagine forgetting search engines altogether; we could bang out sites left, right and center without the worry.

I must say I'm going to miss the Google search, especially in the PHP section for looking up previous posts and syntax tips.

Looking forward to helping test the new search function.

I only wish I used my Add Bookmark button more, never mind.

Hands up all those who would love to close the door on search engines!

I remember finding WW in Google for the very first time many years ago. When I searched Google for PHP answers etc., I got WW and co., and 99% of the time the answers.

WW is like a family on the web: you can go about your business each day knowing that the support and community are right there when you need them. That's why WW doesn't require Google or Yahoo any longer!

Good Luck Brett on the new search function!

rj87uk

2:01 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My view is he's doing what needs to be done. 'On yer sel, Brett.'

Play_Bach

2:02 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member play_bach is a WebmasterWorld Top Contributor of All Time 5+ Year Member



> > if bandwidth is a problem

> It's not - system load is.

> Sooner or later, you are going to kick the neighbors out of the house and build a fence.

OK, well my question still stands. How do eBay, Amazon, craigslist, Yahoo! or any of the other big portals deal with this bot problem? Somehow they are all able to be spidered by Google (et al) and also provide site search - anybody know?

Thanks

DaveN

2:08 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



System load: is this an inherent problem with BestBBS?

[edited by: DaveN at 2:13 pm (utc) on Nov. 24, 2005]

notsleepy

2:10 pm on Nov 24, 2005 (gmt 0)

10+ Year Member



>It's not - system load is.

Uhhhh. Load balancing? Dells are cheap.

lasko

2:13 pm on Nov 24, 2005 (gmt 0)

10+ Year Member



How do eBay, Amazon, craigslist, Yahoo! or any of the other big portals deal with this bot problem? Somehow they are all able to be spidered by Google (et al) and also provide site search - anybody know?

It's called a multi-million dollar investment in load balancing, servers, bandwidth, etc.

WW is a free forum with donations from regular supporters and I guess only 1 server that has to handle a huge load.

DaveN

2:15 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



But if it is server load, how are we going to handle a site search?

Surely that will bring the server to its knees.

DaveN

oddsod

2:21 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hands up all those who would love to close the door on search engines!

Exactly! If you're playing high risk games with SEs you want them around 'cause that's how you get your buzz. Everyone else would like to give up the addiction but won't admit it in public and certainly won't go the whole hog and ban all bots.

Lawnboyronmiller

2:21 pm on Nov 24, 2005 (gmt 0)

10+ Year Member



Yeah, just set up a server farm. I have a server farm of about 10 servers. You can set up a simple one and just use round-robin DNS.

if anything just Disallow forums /forum30/ (forum 1-XX) and you probably would have cut down robot load by 99% while still maintaining your homepage.
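A minimal sketch of that suggestion, assuming the forums live under paths like /forum30/ and /forum83/ (the second path and the bot name are illustrative, not the site's confirmed layout). It uses Python's standard urllib.robotparser to check what a well-behaved crawler would conclude from such a robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the forum paths are examples only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum30/
Disallow: /forum83/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up user agent used only for this check.
home_allowed = parser.can_fetch("ExampleBot", "/")
thread_allowed = parser.can_fetch("ExampleBot", "/forum30/1234.htm")
print(home_allowed, thread_allowed)
```

Note that robots.txt is only advisory: it reduces load from crawlers that honor it and does nothing against bots that ignore it.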

If you're gone for 6 months, that's pretty bad... and self-inflicted is even worse. One thing about PubCons is you see old friends you've made and you see fresh faces at each one...

Play_Bach

2:26 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member play_bach is a WebmasterWorld Top Contributor of All Time 5+ Year Member



> It's called a multi-million dollar investment in load balancing, servers, bandwidth etc.

But I thought Brett said the problem was "rogue bots," right? So how do the big portals deal with it? Does anybody know?

Thanks.

Brett_Tabke

3:14 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month




> yeah, just set up a server farm.
> i have a server farm of
> about 10 servers. You can set
> up a simple one, and just round-robin dns..

Load balancing is easy on the send - syncing writes is very difficult and requires software specific to your app to deal with. That software is started, but not complete here. I'd guess 50 files a minute are updated. Probably 8-10k files a day are changed. Somehow, the software has to sync those files across all the servers simultaneously.
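To make the write-sync problem concrete, here is a rough sketch that compares content hashes between a primary and a replica to find files that need pushing. It is not the in-progress software mentioned above; the function names and file paths are hypothetical, and it only detects drift after the fact rather than solving simultaneous writes:

```python
import hashlib


def digest(content: bytes) -> str:
    """Content hash used to detect changed files."""
    return hashlib.sha256(content).hexdigest()


def manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the hash of its current content."""
    return {path: digest(body) for path, body in files.items()}


def files_to_push(primary: dict[str, str], replica: dict[str, str]) -> list[str]:
    """Paths whose hash on the replica is missing or differs from the primary."""
    return sorted(path for path, h in primary.items() if replica.get(path) != h)


# In-memory stand-ins for two servers' thread files (illustrative paths).
primary_files = {"forum30/1.htm": b"post v2", "forum30/2.htm": b"post v1"}
replica_files = {"forum30/1.htm": b"post v1", "forum30/2.htm": b"post v1"}

stale = files_to_push(manifest(primary_files), manifest(replica_files))
print(stale)
```

At roughly 50 updates a minute, a comparison like this would have to run continuously, which is why the hard part is ordering and conflict handling rather than the copying itself.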

> how are we going to handle a site search?

Good point - yet to be seen. I am working under the theory that we'll do as we did before and put it on sew [searchengineworld.com] - or another server altogether. Also, plan B is aspseek, which isn't too bad load wise via a sql server on the same box (although, I think the results are pretty poor).

But ya know - of all the people that would have issue, or voice support for the action - yours is completely baffling Dave. You talk often about being so "black hat", eschewing scraper sites, loathing the engines, and ripping them of traffic, that you should be the first to support this!?

> So, if this site can work without SE's
> all the more power and respect to my fellow
> members, to Brett and to the mods throughout
> the years for that. Well done!

Claus - thanks!

> server load is

there is also the issue of ripped content... that is another independent story though that I don't think is prudent to discuss in public.

> System load: is this an inherent problem with BestBBS?

I believe it is the most system-friendly forum on the web.
/. was last heard to be running about 0.75 times our page views and uniques, but on 8 load-balanced servers. I.e., BestBBS is 10-15 times more efficient than / code. Which is pretty good considering I wrote the software to handle 1,000 members and 10,000 page views a day (multiply by 100 to get into the ballpark of where we were last week). Whoever freakin' believed we would have 600k files in just threads? Yeow...
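A back-of-the-envelope check of those figures, as a sketch only. It assumes WebmasterWorld runs on a single server (guessed earlier in the thread, not confirmed) and that "times 100" applies to daily page views:

```python
# Figures quoted in the post above.
designed_page_views_per_day = 10_000
growth_factor = 100
slashdot_relative_traffic = 0.75  # /. traffic as a fraction of this site's
slashdot_servers = 8

# Assumption, not stated by the author: this site runs on one server.
assumed_ww_servers = 1

# Rough current load implied by "take times 100".
estimated_page_views_per_day = designed_page_views_per_day * growth_factor

# Traffic handled per server, relative to this site's total traffic.
ww_per_server = 1.0 / assumed_ww_servers
slashdot_per_server = slashdot_relative_traffic / slashdot_servers

per_server_ratio = ww_per_server / slashdot_per_server
print(estimated_page_views_per_day, round(per_server_ratio, 1))
```

Under those assumptions the per-server ratio comes out a little under 11, which sits at the low end of the quoted 10-15 range.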

[edited by: Brett_Tabke at 3:20 pm (utc) on Nov. 24, 2005]

reseller

3:14 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Brett!

I see a revolution in what you have done :-)

Let's assume that everything goes as successfully as you have thought and planned, and WebmasterWorld grows and flourishes without being listed in search engines. I do hope that's exactly what's gonna happen.

Have you ever thought about what that means for the SEO industry?

In fact you are showing the owners of big sites that they don't need SEO and search engines to survive!

Leaving those famous 26 steps to the webmasters of mini-sites, small and medium sites.

And I can already see 100s of SEO specialists starting to polish their resumes :-)

Play_Bach

3:24 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member play_bach is a WebmasterWorld Top Contributor of All Time 5+ Year Member



> In fact you are showing the owners of big sites that they don't need SEO and search engines to survive!

Pretty early to be making that claim...

DaveN

3:29 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Brett said: But ya know - of all the people that would have issue, or voice support for the action - yours is completely baffling Dave. You talk often about being so "black hat", eschewing scraper sites, loathing the engines, and ripping them of traffic, that you should be the first to support this!?

lol, Maybe I'm getting soft in my old age :)

DaveN

incrediBILL

3:39 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



syncing writes is very difficult and requires software specific to your app to deal with.

OK, I could explain to you the concept of time-invariant data models and how you could easily sync writes seamlessly across multiple servers, as each server would always have its correct snapshot "in time" as they all update, but it only works if your data architecture is designed properly.
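One possible reading of that idea, sketched under the assumption that "time-invariant" means writes are append-only and readers pin a version. The VersionedStore class and its method names are hypothetical illustrations, not the poster's actual design:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionedStore:
    """Append-only store: a write never overwrites, it adds a new version."""

    _history: dict[str, list[tuple[int, str]]] = field(default_factory=dict)
    latest_version: int = 0

    def write(self, key: str, value: str) -> int:
        """Record a new version of `key` and return its version number."""
        self.latest_version += 1
        self._history.setdefault(key, []).append((self.latest_version, value))
        return self.latest_version

    def read_as_of(self, key: str, version: int) -> Optional[str]:
        """Return the newest value written at or before `version`."""
        result = None
        for written_at, value in self._history.get(key, []):
            if written_at > version:
                break
            result = value
        return result


store = VersionedStore()
v1 = store.write("thread/42", "first post")
v2 = store.write("thread/42", "first post (edited)")

# A server that has only caught up to v1 still serves a consistent snapshot.
print(store.read_as_of("thread/42", v1))
print(store.read_as_of("thread/42", v2))
```

Because old versions are never modified, a replica that lags behind serves slightly stale but internally consistent data instead of half-applied updates.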

Been there, done that, I'm not cheap ;)

ogletree

4:21 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member ogletree is a WebmasterWorld Top Contributor of All Time 10+ Year Member



If all the non-SEO forums are causing all the traffic, maybe you can just allow a few SEO forums to be indexed. I think that would cut down useless traffic quite a bit. I do understand the bad bot problem. A site this popular and mainstream can get away with all kinds of stuff that looks black hat but is not, because of the reasons behind it. Cloaking is not bad and is done by lots of big sites, including Google itself. I'm talking about allowing the top 3 spiders in by IP while everyone else has to log in. It is not to rank better; it is to cut expenses. Yeah, somebody will report you, but with stuff like that G will look at it and know why, especially if it is done publicly and becomes newsworthy. Even if they do ban you, how is that different from you banning them? Both have the same result. At least this way there is a chance of getting new quality visitors.
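A minimal sketch of the "top spiders in by IP, everyone else logs in" idea, using Python's standard ipaddress module. The network ranges are documentation-only placeholders (RFC 5737), not real crawler ranges, and requires_login is a hypothetical helper:

```python
import ipaddress

# Placeholder ranges from documentation address space; a real deployment
# would need the crawlers' actual, verified ranges.
ALLOWED_CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def requires_login(client_ip: str, logged_in: bool) -> bool:
    """True if this request should be redirected to the login page."""
    if logged_in:
        return False
    address = ipaddress.ip_address(client_ip)
    # Anonymous visitors pass only if they come from an allowed crawler range.
    return not any(address in network for network in ALLOWED_CRAWLER_NETWORKS)


print(requires_login("192.0.2.10", logged_in=False))
print(requires_login("10.0.0.1", logged_in=False))
```

Matching on IP rather than on the User-Agent header matters here, since a rogue bot can claim any user agent it likes.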

Of course I came to WebmasterWorld from the advice of a friend.

DaveN

4:29 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here is a thought... now the SEs can't get in here, the value of the content in here just went up through the roof. I mean, it's original content that the SEs CAN'T see; it's just begging to be lifted and dropped onto a scraper site...

hey Brett .. look back to my old evil blackhat self ;)

DaveN

Kirby

4:35 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Whatever you do Brett, I hope it works. I can tell you that it is somewhat baffling to me after talking with notsleepy, seth, oil and jatar_k outside the bar one night about a search function. What I kept hearing was that a search function was soooo difficult to pull off and with the exception of the Supporters forum, at least we could use Google, so this really sucks.

With regard to new members, I found this via word of mouth 3 years ago. With blogs, WW gets even more play, so that is probably a wash.

It's not like this place will fall apart in 60-90 days, but I hope you are testing this with a goal in mind. Not sure though, since you and DaveN seem to be on different pages, and I would have expected your mods to be in this with you from the start, or at least willing to give it a chance. Making decisions of this magnitude without your mods is foolish and arrogant, especially when you want this to outlive you. Without good mods it doesn't stand a chance, and without search, you will wear out your mods. Losing your mods from this community, both past and present, is not something you can afford.

I wonder if some people are just wanked that their profile will no longer pass PR? So it comes back to their inability to game Google through WebmasterWorld. Interesting... this must be what G feels like during an update. lol

This is the only thing that really irked me in this thread. We have good reasons to be wanked, so that was a cheap shot. Like many others, I haven't put anything in a profile. Up until last week at PubCon, and outside of my supporters registration, my ID was completely separate from my online nick.

Best of luck with this and Happy Thanksgiving.

DaveN

4:41 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Kirby, I used Google all the time to find stuff in WebmasterWorld... so that's why I'm wanked.

The first I knew this was happening is when a member MSN IMed me ..lol

DaveN

Kirby

4:47 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Dave, I figured as much, based on the conversations I had with other mods about search.

I used Google to find Suzy's CSS tutorial. It was bookmarked in my laptop, which someone stole. This is more than a community to me. It is a resource. It's like someone just took away my library card.

Stefan

4:50 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've been following this thread since it started and have become increasingly baffled. I understand entirely that WW can probably get away with this (try a search for "webmasterworld" in G), but "search engine addiction"? What on earth are the alternatives? Shall I spend billions putting billboards up all over the planet that have the site URLs on them (actually, I just need to cover the areas our tourists come from - call it half a billion)? To consider this approach further, beyond just how it would affect me - will all of the sites on the net also be putting up billboards, and will we soon run out of dry land on which to place them? If we eradicate SEs, we have to at least put up signs advertising the ODP. Of course, the users could just start guessing URLs and typing those in.

Best of luck to all those who decide to hide their sites from now on, but I'm going to leave mine out there where everyone can find them. (And if that's an addiction, what the heck - I also like a few beers every day, so this won't be the first one I'm dealing with.)

Play_Bach

4:57 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member play_bach is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Having a good site search is an asset - Google certainly provided that. I thought the reason Brett shut it down was because of "rogue bots" - which would seem to be a security issue, right? Somehow, I'm not following how WebmasterWorld with all its programmer talent couldn't be just as secure as eBay, Amazon or Yahoo! - perhaps "rogue bots" isn't the whole picture here...

jetboy

5:19 pm on Nov 24, 2005 (gmt 0)

10+ Year Member



Having just spent the best part of an hour trying to track down one of my own old posts without Google - hey, I've forgotten more than I know, and WebmasterWorld often helps to jog my memory - I've got to agree with Kirby's library card analogy. It hurts even more when it's your own books you're looking for!

AlexK

5:29 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I hope that your actions cause the SEs some pause for thought.

Brett_Tabke #150:

> if bandwidth is a problem
It's not - system load is.

Brett_Tabke #71:
slurp was so aggressive that it was too much load

Same problem on my site... Yahoo's reply says "didn't find an actual problem ... (use) crawl-delay directive". Of course, Yahoo is not alone in this - the G Mozilla-bot has actually brought other websites down due to an over-aggressive GET frequency. Very satisfying to think that your actions with this site may actually get this message home to them.
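For crawlers that ignore a crawl-delay hint, the server can enforce a request budget itself. The following is a sketch of a per-client sliding-window limiter; the class name, the limits, and the client label are all illustrative assumptions, not anything described in this thread:

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> timestamps of allowed hits

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over budget: the caller could answer with HTTP 429
        hits.append(now)
        return True


# Timestamps are passed in explicitly so the behaviour is easy to follow.
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
decisions = [limiter.allow("bot-a", t) for t in (0.0, 1.0, 2.0, 11.0)]
print(decisions)
```

Passing the clock in as an argument keeps the sketch deterministic; a live server would typically supply time.monotonic() instead.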

Brett_Tabke #67:

New site search engine is in alpha ... Not in any real big hurry for it

Brett_Tabke #69:
Less than 1 in 1k users use site search

You need to re-think your attitude to the importance of in-house search now. There was little need for it before, because it was covered by the SEs. This place is not just a community, it is also a resource.

It is now a resource whose history cannot be accessed.

Powdork

5:29 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Kind of funny.
There isn't any whining about how bad Google's results are in this thread. You don't know what you got until it's gone.

tigertom

5:35 pm on Nov 24, 2005 (gmt 0)

10+ Year Member



Off topic: I'm in the UK. You guys know what the word "#*$!" means over here, right? And I think the Brits invented it. The word, I mean.

Later: Heheh, WebmasterWorld does too now, it seems.

DaveN

5:37 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm in the UK too; #*$! is about the only bad word that's not on the filter lol

DaveN

5:38 pm on Nov 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



oops i guess not any more lol #*$!
This 223 message thread spans 8 pages: 223
 
