A Dipsie tidbit

Forum Moderators: bakedjake

tedster

9:04 pm on Mar 26, 2004 (gmt 0)

Some of us have openly wondered if Dipsie, the "coming soon" search engine, is anything but vaporware with a nice overlay of PR buzz. I just read an article that had the most concrete tidbit I have yet read - Dipsie's approach for indexing content from the dynamic pages in the "deep web".

In particular, crawlers are stymied by dynamic Web pages, which are customized as users choose various options, such as car color at Cars.com.
To counter that, Chicago-based Dipsie Inc. is developing software that promises to fill out Cars.com's simple online forms, which are based on multiple choice, though not the complex ones for the government's patent and trademark databases, which require typing in keywords. A public test version is expected by summer.
Mercury News [mercurynews.com]

digitalv

9:05 pm on Mar 26, 2004 (gmt 0)

More importantly, does anyone really care? :) The way the web is going, Google is the only thing that matters.

IMHO

angiolo

9:11 pm on Mar 26, 2004 (gmt 0)

> Google is the only thing that matters

Yahoo is important too!

pleeker

9:16 pm on Mar 26, 2004 (gmt 0)

Great find tedster. A smart crawler that can make simple choices a human would make to get to dynamic content. If it actually becomes reality, it'd be a step in the right direction.

GeekyChic

9:27 pm on Mar 26, 2004 (gmt 0)

Google and Yahoo might be the big dogs for now... but they'll only stay there as long as they keep up the pace and continue to improve their tech. Otherwise they, like many others, will fade away...

rfgdxm1

4:28 am on Mar 28, 2004 (gmt 0)

>Yahoo is important too!

And also MSN.

tedster

6:18 am on Mar 28, 2004 (gmt 0)

Yeah, but Google, Yahoo and MS Search have their own forums here - we set up this spot to talk about the little guys, right?

Even if a smaller start-up doesn't send us lots and lots of traffic, it will be fun to see small innovators do their thing. I'll bet we're FAR from maturity in the search space. I'll bet we have lots of surprises ahead of us. At least I hope so, because the current search landscape feels pretty dull to me. I mean, I didn't get involved with the web to spend my time planning out link tricks, you know? That's not what really turns me on.

Anyway, earlier press for Dipsie talked about launching with 10 billion pages - if this is their approach to finding deep content, it will be interesting to see if that makes a big difference or if it's basically a yawn.

sidyadav

11:54 am on Mar 31, 2004 (gmt 0)

hey guys, isn't it 2004 this year? isn't dipsie supposed to launch?

Sid

christopher

10:51 pm on Apr 6, 2004 (gmt 0)

As far as I know. Perhaps they've run into tech problems.

Be interesting to see if a press release has been posted anywhere.

christopher

10:56 pm on Apr 6, 2004 (gmt 0)

Just checked - and no press release usually means they have not officially launched yet.

Tick Tick Tick....

It does look quite nice though.

Liane

11:16 pm on Apr 6, 2004 (gmt 0)

Actually, if Dipsie proves itself as a reliable technology, my bet is that it will revolutionize search overnight and leave anyone not up with it in the dust.

I'm with Tedster ... I think search technology has a very long way to go and I'm looking forward to the improvements and challenges.

sidyadav

4:07 am on Apr 7, 2004 (gmt 0)

tedster [webmasterworld.com] is a long time WebmasterWorld administrator.
If you had read this thread thoroughly, you would've seen tedster's post (msg.#8).

Sid

jmccormac

4:44 am on Apr 7, 2004 (gmt 0)

I tend to be a bit cynical when I read the PR stuff about search engines and how they are going to revolutionise the market. Dipsie seems to be a classic example of this vapourware. This quote in particular is a dead giveaway:
"To counter that, Chicago-based Dipsie Inc. is developing software that promises to fill out Cars.com's simple online forms, which are based on multiple choice, though not the complex ones for the government's patent and trademark databases, which require typing in keywords. A public test version is expected by summer."

The same kind of thing can be achieved by just reading the O'Reilly Spidering Hacks book. Customising a spider so that it analyses a form and then fills out all possibilities and submits them is trivial. However, the damage that it can do to the website being spidered is considerable and could, in some cases, result in something approaching a Denial Of Service attack. Of course, most data on websites with form based searching is constantly updating. This means that the site has to be respidered frequently. Doing it wrong can mean an immediate ban.

The Dipsie quote is interesting because what it is describing already exists. It exists in the form of shopping comparison websites. It is nothing new or revolutionary. As for the claim of ten billion pages spidered - just where is Dipsie's spider? Has anyone ever seen it? Perhaps Dipsie is buying in data or will end up as just another Overture or Espotting SERPswamp.

The majority of the web is largely static, changing on a yearly basis. As such it can be spidered aperiodically since it is not being regularly updated. The key to preparing a good search engine index is in identifying the chronological types of sites being spidered before wasting time on spidering. Perhaps working on writing spiders and building search engines has made me a bit too cynical :) , but I think that Dipsie has nothing new and certainly nothing that GYM [1] could not squash.

Regards...jmcc
[1] The Google Yahoo Microsoft troika.
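
For readers wondering what "analyses a form and then fills out all possibilities" might look like in practice, here is a minimal sketch using only the Python standard library. It is not Dipsie's method or the code from the Spidering Hacks book; the names `SelectFormParser` and `candidate_urls` are made up for illustration. It assumes multiple-choice (`<select>`) fields only, skips POST forms because submitting those may write to the site's database, and caps the number of generated URLs so a form with many options does not turn into the flood of requests described above.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode, urljoin


class SelectFormParser(HTMLParser):
    """Collect <select> fields and their option values from GET forms only."""

    def __init__(self):
        super().__init__()
        self.forms = []      # list of {"action": str, "fields": {name: [values]}}
        self._form = None    # GET form currently being parsed, if any
        self._select = None  # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            method = (attrs.get("method") or "get").lower()
            # Ignore POST forms entirely: they may create database entries.
            if method == "get":
                self._form = {"action": attrs.get("action") or "", "fields": {}}
            else:
                self._form = None
        elif tag == "select" and self._form is not None and attrs.get("name"):
            self._select = attrs["name"]
            self._form["fields"][self._select] = []
        elif tag == "option" and self._select is not None and attrs.get("value"):
            self._form["fields"][self._select].append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
        elif tag == "form":
            if self._form is not None and self._form["fields"]:
                self.forms.append(self._form)
            self._form = None


def candidate_urls(page_url, html, limit=50):
    """Return up to `limit` GET URLs, one per combination of select options."""
    parser = SelectFormParser()
    parser.feed(html)
    urls = []
    for form in parser.forms:
        names = list(form["fields"])
        for combo in product(*(form["fields"][n] for n in names)):
            if len(urls) >= limit:
                return urls
            query = urlencode(list(zip(names, combo)))
            # Simplification: assumes the form action has no query string of its own.
            urls.append(urljoin(page_url, form["action"]) + "?" + query)
    return urls
```

The number of combinations is the product of the option counts, so even a few fields with a dozen options each produce thousands of URLs. That is why the cap matters, and why a spider doing this carelessly can look like an attack to the site being spidered.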

christopher

9:40 am on Apr 7, 2004 (gmt 0)

These Spiders/crawlers etc. How controllable are they exactly?

I mean would one be able to send it out to say just one engine?

It would then scan all sites 10'000'000 etc then return the results to dipsie, without us knowing about it.

Maybe that's why we haven't seen Dipsie's spider in our stats logs.

DoppyNL

9:53 am on Apr 7, 2004 (gmt 0)

The minute they start submitting forms automatically with a crawler, they will be in my robots.txt with a complete disallow + a complete ban (I know, they won't be able to read the robots.txt then....)

A crawler has no way of knowing what a form is used for and what the result may be of posting the form.
I don't want them filling in my forms, as this would result in entries in the database and who knows what.

Problems could be huge.

They expect to find more content when submitting forms?
A well-constructed website will allow crawlers to reach all content via normal links. So this would not be necessary.
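
As a rough illustration of the "complete disallow" idea, the sketch below checks a robots.txt against a crawler name with Python's standard `urllib.robotparser`. Dipsie's actual user-agent string does not appear in this thread, so `ExampleFormBot` is a placeholder, and the rules themselves are only an example.

```python
from urllib.robotparser import RobotFileParser

# Example rules: shut out one named crawler completely, and keep every
# other crawler away from the form-driven /search URLs.
# "ExampleFormBot" is a placeholder name, not a real crawler.
ROBOTS_TXT = """\
User-agent: ExampleFormBot
Disallow: /

User-agent: *
Disallow: /search
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleFormBot", "http://example.com/index.html"))
    print(allowed("SomeOtherBot", "http://example.com/index.html"))
    print(allowed("SomeOtherBot", "http://example.com/search?color=red"))
```

Keep in mind that robots.txt is advisory: it only stops crawlers that choose to obey it. The "complete ban" half has to be enforced on the server, for example by blocking the crawler's IP range or user-agent.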

jmccormac

10:50 pm on Apr 7, 2004 (gmt 0)

> These Spiders/crawlers etc. How controllable are they exactly? I mean would one be able to send it out to say just one engine?

It is possible to target just one site but it would seem like a blizzard to the webmaster. Normally a spider on a large run tends to randomise the URLs to be spidered so as not to put excessive load on the websites being spidered.

I heard about one airline fare comparison site that was banned from an airline's site for putting a high load on that site's servers. The data on this type of site is time sensitive and thus has to be frequently respidered.

> It would then scan all sites 10'000'000 etc then return the results to dipsie, without us knowing about it. Maybe that's why we haven't seen Dipsie's spider in our stats logs.

I'd be very surprised if anyone has seen Dipsie's spider. It has all the appearance of vaporware - loads of buzzwords, piles of public relations/press releases and no results. :)

Regards...jmcc
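
The point above about randomising the URLs on a large run can be sketched as follows. This is one simple way to do it, not a description of how any particular search engine orders its crawl; the function name `spread_by_host` is invented for the example. URLs are grouped by host, and each round takes at most one URL from every host in shuffled order, so requests to a single site are spread through the run instead of arriving as a blizzard.

```python
import random
from collections import defaultdict, deque
from urllib.parse import urlsplit


def spread_by_host(urls, seed=None):
    """Reorder urls so each host contributes at most one URL per round."""
    rng = random.Random(seed)
    queues = defaultdict(deque)
    for url in urls:
        queues[urlsplit(url).netloc].append(url)

    hosts = list(queues)
    ordered = []
    while hosts:
        rng.shuffle(hosts)  # vary the host order from round to round
        for host in hosts:
            ordered.append(queues[host].popleft())
        # Drop hosts whose queues are now empty.
        hosts = [h for h in hosts if queues[h]]
    return ordered
```

Once only one host has URLs left, its remaining URLs come out back to back, so a real spider would normally add a minimum delay per host on top of this ordering.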

sidyadav

4:19 am on Apr 8, 2004 (gmt 0)

> It would then scan all sites 10'000'000 etc then return the results to dipsie, without us knowing about it.

That's impossible. Any robot which obeys robots.txt and has a bot [dipsie.com] page is, of course, a real spider - which has a user-agent, IP etc.
And when a bot visits your website, it leaves behind its IP, referring URL, User-Agent etc. It's impossible for a robot which has a UA to visit your website and leave behind nothing.
That's one of the reasons I think Dipsie is a scam.

It promises to provide a HUGE index of 11 billion pages when it launches - and guess when its launch date is set for? 2004, which, if I'm not wrong, is this year. So for sure, if a search engine is promising 11 billion webpages, it's gotta be seen somewhere!

Sid
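
Anyone who wants to check their own logs for a spider can do it with a few lines of Python. The sketch below assumes the common Apache "combined" log format; the sample lines, the IP addresses (taken from the reserved documentation ranges) and the `ExampleBot` user-agent are all invented for illustration and are not real Dipsie traffic.

```python
import re

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [08/Apr/2004:04:19:00 +0000] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "ExampleBot/0.1 (+http://example.com/bot.html)"',
    '198.51.100.7 - - [08/Apr/2004:04:20:12 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "http://example.org/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]


def hits_for_agent(lines, needle):
    """Return (ip, time, request) for lines whose user-agent contains needle."""
    needle = needle.lower()
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle in match.group("agent").lower():
            hits.append((match.group("ip"), match.group("time"), match.group("request")))
    return hits


if __name__ == "__main__":
    for hit in hits_for_agent(SAMPLE, "examplebot"):
        print(hit)
```

A search like this only finds a crawler that identifies itself honestly. A spider sending a browser-like user-agent would not show up by name, although its IP address and request pattern would still be in the log.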

christopher

9:00 am on Apr 8, 2004 (gmt 0)

Sounds possible. But why would they go to the trouble of sticking up a 'PR' coming soon type page, claim this and that - then fail to deliver? They would be shooting themselves in the foot before they even got started.

If you're going to set up a scam, there are easier, less expensive ways of doing it.

But I hear ya.

It don't sound too clever to me. I reckon it's genuine, just they messed up on promises maybe? - probably due to delivery date pressures or something.

I think it's a scam - but more complex than would first appear, if it were as basic a scam as you suggest, people wouldn't fall for it, and it wouldn't even get started - therefore why attempt a non - starter if it's doomed for failure?

It's just like the rest, idea being to make a name for themselves, get as big as the net public will allow, then hope to be bought out by someone.

That's why most of the engines are created. You can usually tell if a new one will be successful, by its actual design work and services.

Most of these multiple engine searchers will fail - cos there's nothing special about them.

sidyadav

9:28 am on Apr 8, 2004 (gmt 0)

I'm occasionally in contact with one of Dipsie's reps, they said they're gonna launch an SEO service next week - I have to ask them if Dipsie is a scam or not..

They'll most probably say 'no'. (obviously)

Sid

jmccormac

9:38 am on Apr 8, 2004 (gmt 0)

I'm not yet convinced that it is a scam. At the moment it looks more like a clueless me-too venture, where some people who were completely unaware of the complexities of spidering the entire web and the 'deep' web, while at the same time providing millions of search results per day, decided that starting a search engine would be a good thing from a marketing point of view. Perhaps it would be, but running a world class search engine is not the same as issuing press releases about it to gullible technology journalists who are always looking for the next big thing or at least some copy to fill empty space.

One of the interviews with the main mover in Dipsie was so full of venture capital guff and search buzzwords that I nearly spilled my coffee from laughing. To anyone who does not work with search engines or indexing large datasets over networks for a living, it was very convincing. When you read between the lines and then start thinking of the bandwidth, storage and processing requirements for doing what Dipsie is supposed to be doing, it does not make sense.

Perhaps Dipsie will appear but as another recycler of Overture or some other search fodder provider. It may even overlay its own search algorithms over these results to provide a more streamlined approach. Using another engine's data would be more efficient. But would this mean that Dipsie is just another meta search engine rather than a genuine player?

Regards...jmcc

[edited by: jmccormac at 9:50 am (utc) on April 8, 2004]

jmccormac

9:45 am on Apr 8, 2004 (gmt 0)

> I'm occasionally in contact with one of Dipsie's reps, they said they're gonna launch an SEO service next week - I have to ask them if Dipsie is a scam or not..

A Search Engine Optimisation service Sidyadav?
Does this mean that they are getting out of the search engine business or just getting into it in a completely different way to actually running a search engine? ;)

Regards...jmcc

sidyadav

10:41 am on Apr 8, 2004 (gmt 0)

> A Search Engine Optimisation service Sidyadav?
> Does this mean that they are getting out of the search engine business or just getting into it in a completely different way to actually running a search engine? ;)

lol, I'm not sure, but that's what they said:
"We are launching the beta for our first product this week-- an SEO service."

Sid

christopher

12:09 pm on Apr 8, 2004 (gmt 0)

Well, even if you do know this rep, until I see otherwise, I'll take what you say about the launch date with a pinch of salt.

Reps don't own or hold managerial posts in companies - otherwise they would be directors and not reps!

So I doubt very much that what you've been told by some rep bloke, is the whole truth anyway.

And that's based on common sense.

People are way too quick to believe others on the web.

Someone sets themselves up as an expert - and people bow down to these charlatans.

Nothing more than enthusiastic speculators I'm afraid.

I'll judge this engine based on what I actually see with my own eyes, and not on secondary info, from some sales bloke that was passed on to someone that I don't even know from adam.

Sorry, but he and you could be saying anything - really.

So why would Dipsie tell its sales reps about the launch - then NOT post this launch date on their site or place press releases out to all media?

Na- it just don't work like that. This info you've been getting just sounds strange and without procedure or authority.

But if you are right, it should be interesting to see it in action - next week you say?

sidyadav

12:24 pm on Apr 8, 2004 (gmt 0)

No, by saying it was a rep, I meant a staff member or even the CEO. I'm not sure how you got the feeling that it was a sales bloke - as you put it?

Believe what you want, I'll sticky you the URL when it comes out.

Sid
PS, you are taking this rep business too seriously..

jmccormac

4:36 pm on Apr 13, 2004 (gmt 0)

No sign of this Dipsie SEO service yet? :) It is beginning to smell a bit fishy. (running out of marine metaphors.)

Regards...jmcc

christopher

7:03 pm on Apr 13, 2004 (gmt 0)

Isn't Dipsie a 'sinker' weight thing on the end of a fishing line lol

(another fishy joke)

Will it sink or float?

258cib

1:12 pm on Apr 29, 2004 (gmt 0)

CNET says yes, but there is no sign of it on their site. Helloooooooo?
[news.com.com...]
This is really lame.

christopher

1:58 pm on Apr 29, 2004 (gmt 0)

So what are they? It seems they are having trouble deciding whether they want to do SEO or Web Search.

I don't know about anyone else, but I'm confused already!

Yikes.

jmccormac

7:20 pm on Apr 29, 2004 (gmt 0)

Guess Dipsie doesn't know whether to sink or swim. :)

From years of reading press releases recycled as news (in what masquerades as the "technology press"), let's run the CNET article through the cynical editor's gobsh1te filter:

'competing consumer search engine'

- it searches for consumers to click on its PPC listings.

'uses natural language algorithms to assess the content of a Web page and render a slew of synonyms and antonyms likely to crop up in a Web search'

- Somebody browsing the page with dictionary.com and thesaurus.com open in other browser tabs.

It then feeds that page to search engines to help the site's position in results

- Whoa I thought this software was meant to improve the ranking of web pages in Google et al not submit them. I guess improving the PR from not indexed to 1 counts as an improvement.

"It uses our crawling technologies to get past barriers that have been around for last five to 10 years in search robot technologies," Wiener said.

- Yeah right! Nobody spots any spider - anywhere. Claims of indexing all possibilities on database backed websites, claims of world domination through superior software, missed deadlines, vapourware website, nothing more complex than the "hacks" in the O'Reilly Spidering Hacks book.

A classic example of search engine development by press release.

Regards...jmcc

christopher

9:04 pm on Apr 29, 2004 (gmt 0)

It's a shame, cos from a web design point of view, it could have great Public Relations potential.

Are they a big company? I mean I don't see any evidence that they are backed by anyone.

This 39 message thread spans 2 pages: 39
