
Forum Moderators: bakedjake

Burf.co

Burf.com returning from the graveyard

     
6:10 am on Mar 26, 2018 (gmt 0)

New User

10+ Year Member

joined:Oct 29, 2005
posts: 37
votes: 1


Hi everyone, not been on the site for a long time! For anyone who remembers, I built Burf.com, a small search engine, about 10 years back.

Last month I decided to build a new engine, Burf.co. It's still in progress as I am trying to work out the right niche at the moment.

I have written most of it in Swift, which has been great fun. MongoDB for storage, which has given me some headaches.
6:47 am on Mar 26, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


Hi burf2000 and welcome back to WebmasterWorld.

What changes/improvements do you plan to make?

Will your index be all sites or genre specific? All file types?

And what will be the full User Agent string and IP range so we can recognize your bot when it comes around in our server logs?
1:36 pm on Mar 26, 2018 (gmt 0)

New User

10+ Year Member

joined:Oct 29, 2005
posts: 37
votes: 1


So this is a completely new project (Burf got sold and died as we know it). I am now a mobile dev and I really missed the fun of search engines, how websites work and connect, etc.

I think the whole privacy stuff has got me thinking, but I think DuckDuckGo has that covered? At the moment it indexes everything, to check the indexing code works. Indexing pages now seems a bit harder, with all sorts of characters in URLs and JavaScript-rendered pages.

I had a thought about category pages so people could search for just technical stuff, or news, etc. I know Google does an epic job; it's just interesting to learn about. This is not really a project about making money.
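For the "all sorts of characters in URLs" problem, a rough sketch of URL normalisation before indexing might look like the following. It is in Python rather than the Swift the engine is written in, and `normalize_url` is a hypothetical helper name, not anything from Burf itself. It assumes the crawler wants lowercase hosts, percent-encoded paths and queries, and no fragments.

```python
from urllib.parse import urlsplit, urlunsplit, quote


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` suitable for use as an index key."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    # Assumption: the URL carries no case-sensitive userinfo in the netloc.
    host = parts.netloc.lower()
    # '%' stays in the safe set so existing escapes are not double-encoded.
    path = quote(parts.path or "/", safe="/%:@!$&'()*+,;=~-._")
    query = quote(parts.query, safe="=&%:@!$'()*+,;/?~-._")
    # The fragment is dropped: it never reaches the server.
    return urlunsplit((scheme, host, path, query, ""))
```

With this, two spellings of the same address (different host case, raw versus escaped spaces) collapse to one key, which helps avoid indexing the same page twice.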

Burf Search Engine is the agent.
2:45 pm on Mar 26, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 101


@burf2000, I wish you all the best in your new project !
3:00 am on Mar 27, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


"Burf Search Engine is the agent."

That's the complete UA string?

If you plan to crawl from any of the large server farms (Amazon, Google, OVH, GoDaddy, LeaseWeb, RackSpace, etc.), a lot of sites will need to create exceptions to their existing block rules before they can allow you access.
7:23 am on Mar 27, 2018 (gmt 0)

New User

10+ Year Member

joined:Oct 29, 2005
posts: 37
votes: 1


OK what would you suggest?

I rewrote the indexer this morning as it's getting too slow at a fairly low number (5 mil). Reading up a lot. Yes, I am reinventing the wheel, but it's so much fun doing it.
8:12 am on Mar 27, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


From a webmaster point of view, we like to know:
• who is accessing our digital properties
• why you want it
• what you plan to do with it
• why we should allow your bot to do this.
Basically, will it benefit our interests as site owners.

So an appropriate UA might be:
Burf Search Engine; +https://example.com/info.html
or
BurfBot/1.0; +https://example.com/info.html

Where info.html is an accessible page on your server that explains the above-mentioned points.
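As a minimal sketch of sending such a UA string with each request (in Python, not the Swift the crawler is written in; `build_request` is a hypothetical helper, and the example.com info URL is the placeholder from the suggestion above):

```python
from urllib.request import Request

# Placeholder UA taken from the suggestion above; the info URL is not a real page.
USER_AGENT = "BurfBot/1.0; +https://example.com/info.html"


def build_request(url: str) -> Request:
    """Build a request that identifies the crawler on every fetch."""
    return Request(url, headers={"User-Agent": USER_AGENT})
```

The returned object can be passed to `urllib.request.urlopen`; keeping the string in one constant makes it easy to keep the server-log identity consistent.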

Requesting & supporting robots.txt is always a plus :)
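A rough sketch of a robots.txt check using Python's standard `urllib.robotparser` follows. The crawler itself is written in Swift, so this only illustrates the logic; the `BurfBot` token and the sample rules are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser


def make_checker(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Made-up rules: BurfBot may crawl everything except /private/,
# and every other bot is blocked from the whole site.
SAMPLE = """\
User-agent: BurfBot
Disallow: /private/

User-agent: *
Disallow: /
"""

checker = make_checker(SAMPLE)
allowed_home = checker.can_fetch("BurfBot", "https://example.com/index.html")
allowed_private = checker.can_fetch("BurfBot", "https://example.com/private/x")
```

The crawler would call `can_fetch` before each request and skip any URL for which it returns False.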
8:18 am on Mar 27, 2018 (gmt 0)

New User

10+ Year Member

joined:Oct 29, 2005
posts: 37
votes: 1


That's really useful information mate, I will put that on the to-do list as I think that will also get more people visiting the site to find out more.
11:50 pm on Mar 27, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 101


Also, make your crawler respect the robots.txt file.
6:08 am on Mar 29, 2018 (gmt 0)

New User

10+ Year Member

joined:Oct 29, 2005
posts: 37
votes: 1


Yeah that's also on my list to do, I think that may take me a little while to work out.
6:22 pm on Mar 29, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15705
votes: 812


Is it just me or ...

Does the conjunction of “burf” and “returning from” lead to unfortunate mental images?

I guess it’s just me.
9:53 am on Oct 17, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


If you have to ask, it probably is ;)

...and any derived humor is overwritten by the disgusting imagery.
12:24 pm on Oct 17, 2018 (gmt 0)

New User

10+ Year Member

joined:Oct 29, 2005
posts: 37
votes: 1


I think that's the urban dictionary definition for it mate.

As an update on Burf.co, I have actually started using the Common Crawl index.
3:42 pm on Oct 17, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:2091
votes: 370


Ahh, the nostalgia of "Submit site".