New RDF Dump Available

senox

12:57 am on Jan 31, 2003 (gmt 0)

A new RDF Dump is available. The data is a week out of date, and does not include any catid tags. This will be fixed later.

[rdf.dmoz.org...]

Brett_Tabke

3:14 am on Jan 31, 2003 (gmt 0)

Very good news. Thanks for the tip off.

rjohara

3:20 am on Jan 31, 2003 (gmt 0)

Hooray! Congratulations and thanks to the DMOZ folks.

skibum

3:22 am on Jan 31, 2003 (gmt 0)

Down, but not out. Dmoz is back! Rock n' roll!

vmcknight

3:34 am on Jan 31, 2003 (gmt 0)

Not only that, but the editing and the internal forums are lightning-fast again! Best editing I've had in months!

xbase234

3:36 am on Jan 31, 2003 (gmt 0)

Excellent news - this should shake up the landscape a little bit.

fathom

3:40 am on Jan 31, 2003 (gmt 0)

Superb! Excellent! - I had no doubt that the team would prevail.

Keep up the excellent work - bigD! ;)

pageoneresults

3:50 am on Jan 31, 2003 (gmt 0)

senox, in all the excitement I think we missed a welcome. Well, welcome to WebmasterWorld!

That's some ground-shaking news, wouldn't you say? Thank you!

amznVibe

3:52 am on Jan 31, 2003 (gmt 0)

Awesome find! Hopefully Google will import in time for the February re-index! Any insights, GoogleGuy?

mack

3:53 am on Jan 31, 2003 (gmt 0)

I think this goes to show that the ODP is not in the greatest of hands with AOL.

The ODP needs funding. If there had been enough tech staff involved with the ODP, this would have been resolved a lot faster. The ODP is a valuable asset to the web in general, and I just don't think AOL takes it seriously enough.

steveb

4:27 am on Jan 31, 2003 (gmt 0)

"I think this goes to show that the ODP is not in the greatest of hands with AOL."

The RDF dump caused Steve Case and Ted Turner to lose their jobs. Once we got rid of those slackers, no problem....

fathom

4:38 am on Jan 31, 2003 (gmt 0)

Once we got rid of those slackers, no problem....

;)

senox

4:44 am on Jan 31, 2003 (gmt 0)

Thanks for the welcome. I see you morphed what initially was a reply to a new topic.

We're all very happy that a new RDF Dump is available. Please note that it's not yet a regular one, but this should hopefully be fixed soon. Thanks a lot to staff who worked very hard on that. :)

amznVibe

4:50 am on Jan 31, 2003 (gmt 0)

Thanks a lot to staff who worked very hard on that

Definitely thanks! I wonder if the staff in general, or per category, should have tip jars? I mean, many of us do make money off the enhanced listings we get from Google and the other directories out there that clone the DMOZ listings.

Or would this corrupt the volunteer system? If it could be set up anonymously somehow, I don't see too much harm...

rfgdxm1

4:55 am on Jan 31, 2003 (gmt 0)

From what I understand, those who need catid tags will have to wait for a new RDF dump that starts next week, which means it could be over a week before that RDF dump is finished.

steveb

6:21 am on Jan 31, 2003 (gmt 0)

What is the technogeek answer to:
Can Google use this new data for its directory? I would guess no, since thousands of categories have been changed, moved, or created.

Gringo

7:45 am on Jan 31, 2003 (gmt 0)

Is there any particular reason why this page [dmoz.org] isn't updated?

(Says RDFs NOT pushed)

Gringo

windharp

8:46 am on Jan 31, 2003 (gmt 0)

@Gringo:
Yes, because it's not a "complete" RDF but one without category IDs. The "normal" RDF generation might have failed again.

Napoleon

12:28 pm on Jan 31, 2003 (gmt 0)

>> Not only that, but the editing and the internal forums are lightning-fast again! <<

And the browsing.

The difference is amazing from here in the UK. Well done to all concerned.

dvduval

1:37 pm on Jan 31, 2003 (gmt 0)

I'm curious if they have it "figured out" now, such that a delay in the RDF dump will be unlikely in the future. Obviously it's mission critical they do this.

Congrats to the people who I'm sure worked hard to get this latest RDF dump to work.

victor

2:28 pm on Jan 31, 2003 (gmt 0)

I'm curious if they have it "figured out" now, such that a delay in the RDF dump will be unlikely in the future.

There are some recent upbeat comments from staff on the ODP forums to the effect that they've been working around hardware and resourcing problems, but they are now confident of success.

That's a pretty good sign, as is getting out the intermediate RDF.

SEOPTI

7:44 pm on Jan 31, 2003 (gmt 0)

So let's hope they will do the dump 2 times a year from now on.

AgedAthlete

10:35 pm on Jan 31, 2003 (gmt 0)

I've been a reader / lurker for over a year, but this is my first post here. It's a great forum and I have learned much.

I do have a question that perhaps those more experienced than I could help with.

I'd like to get and use the odp / dmoz category information but the files are so large that I cannot seem to get them.

I've got a high speed cable connection with a p3 1ghz and a half gig ram, yet those files (maybe it is the odp servers?) are so large (slow), they never complete the download.

Is there another way to get the data other than by browser? Is there an ftp location perhaps that I'm unaware of?

Thanks!

rogerd

10:45 pm on Jan 31, 2003 (gmt 0)

Welcome to WebmasterWorld, AgedAthlete! I don't have an answer about the ftp location, but what I can say is that dmoz.org availability has been pretty dicey lately. I did a few edits earlier, and it fluctuated from very zippy response times to timeouts/404s, often within seconds of each other. It could be that your downloads are experiencing some of this inconsistency.

AgedAthlete

11:00 pm on Jan 31, 2003 (gmt 0)

Thanks, Roger. Are you using the file listings at the URL at the top of this thread to try to access it via your browser?

I'm referring to any of the rdf file links shown in your browser and that url.

I consider myself a pretty advanced user; am not a programmer, but fairly experienced.

But on this one, I feel kind of stupid, as though there is some 'secret' location that everyone else uses but me!

victor

11:32 pm on Jan 31, 2003 (gmt 0)

As far as I know, there is no FTP address for the RDF.

That means you need a downloader that can handle big HTTP files and will restart at the point of failure when it crashes.

One possibility is wget:
[wget.org...]
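For anyone who would rather script it, here is a rough sketch of the "restart at point of failure" idea in Python. It is not how wget does it internally, just the same general approach: look at how many bytes are already on disk and ask the server for the rest with an HTTP Range header. The function name `resume_download` and the `opener` parameter are made up for this example, and it assumes the server honours Range requests (it falls back to a full download if it does not).

```python
import os
import urllib.request


def resume_download(url, dest, opener=urllib.request.urlopen, chunk_size=64 * 1024):
    """Download url to dest, continuing from whatever is already on disk.

    Hypothetical helper for illustration; returns the size of dest in bytes.
    """
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url)
    if offset:
        # Ask the server for only the bytes we do not have yet.
        request.add_header("Range", "bytes=%d-" % offset)
    with opener(request) as response:
        # 206 means the server honoured the Range header; anything else
        # (typically 200) means it is sending the whole file again.
        resumed = offset > 0 and response.status == 206
        mode = "ab" if resumed else "wb"
        with open(dest, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
    return os.path.getsize(dest)
```

If the connection drops partway, calling the function again with the same `dest` picks up where the partial file ends. One caveat: if the file on disk is already complete, a server will usually answer the Range request with an error (416), which `urlopen` raises as an exception, so a real script would want to catch that.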

AgedAthlete

3:19 am on Feb 1, 2003 (gmt 0)

I found the answer to my question.

If anyone cares:

Instead of left-clicking on the desired link, hover over it, right-click, and select 'save target as' from the menu (Windows of course).

The dmoz category file with catids is 379MB for those who care. That file does not contain the actual listed sites, links or descriptions, just the category data.

The file downloaded in under 10 minutes that way and IE6 did not attempt to open the link to the rdf/xml layout.

I look forward to posting more!

Thanks for all your contributions to the rest of our education!

Hollywood

5:08 am on Feb 1, 2003 (gmt 0)

I spoke with some people at DMOZ, and the word is that faster, newer servers will be online shortly.

I did not know about this but here it is...

[ch.dmoz.org...]

in case you have problems using searches.

All best

Godspeed