1) editor only ODP-Server: This is the "original" one where the public HTML is generated and the original copy is stored. It is not accessible by the public anymore for performance reasons (one server can't handle the amount of requests). Mentioned here just for completeness, no real relevance for you.
2) www.dmoz.org - A server farm of independent servers. Access is load-balanced, so you can't control which one you reach. They are synchronized from 1) by a scheme that should guarantee that none of these servers is more than 4-7 days behind. Since they are all standalone, you may see different states of the same category on successive requests. Anyway, as I said, none of them should be more than a week behind.
3) ch.dmoz.org / de.dmoz.org: Our "initial" mirrors, provided by external companies for free. Very useful since they usually perform very well. Fed by an rsync directly from 1). In theory they should never be more than 48 hours behind. At the moment, staff are running performance checks on everything else and have disabled the rsync because its influence on the measurements is too big. I don't know of any schedule for reactivation, sorry. At the moment (I didn't check this) I think they are some months behind...
4) All data users (like Google...) using RDF files: RDF files are still compiled once every 1-2 weeks. It is up to the data users how often they download the files. At the moment it looks like Google has some problems updating its data: Google search shows a different state of the Google directory than the directory itself.
I hope this helps to calm down some of you, at least you know the approx. delays to expect.
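For anyone who wants the delays above in one place, here is a rough sketch. The figures are only the bounds quoted in the list; the dictionary keys and the function name `should_be_visible` are made up for illustration, and the RDF entry covers only the compile interval, since data users download on their own schedule.

```python
from datetime import timedelta

# Stated worst-case lag per channel, taken from the list above.
# Keys are illustrative labels, not official names.
MAX_LAG = {
    "www.dmoz.org server farm": timedelta(days=7),
    "ch/de mirrors (rsync enabled)": timedelta(hours=48),
    "RDF compile interval": timedelta(weeks=2),
}

def should_be_visible(channel, elapsed):
    """Return True if a change made `elapsed` ago has passed the
    stated worst-case lag for `channel`."""
    return elapsed >= MAX_LAG[channel]
```

So `should_be_visible("www.dmoz.org server farm", timedelta(days=3))` is false: three days in, a missing change on the public site is still within the normal window.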
[edited by: choster at 6:40 pm (utc) on Oct. 9, 2003]
Now it's gone completely: it doesn't show in the search or the directory, and I never heard anything from dmoz. My site really grew and improved, traffic doubled to over 1200 unique visitors per day, and it is doing great otherwise. I have no clue why it may have dropped from the directory again :( Would it be OK to re-apply?
SN
It is a very painful process not knowing if you need to submit, trying to use their guidelines by only submitting one URL at a time.
If you are submitting multiple, different URLs for the same business you may be spamming -- I'm not saying you are, just that it's a possible interpretation from your wish to submit multiple URLs.
A business may have more than one website. The ODP is likely to list only one of them. A website almost certainly has more than one URL. The ODP is likely to list only the homepage.
(And yes, I know there are many examples of deeplinking (multiple URLs for the same business) in the ODP. But make sure that your business qualifies for that before flooding the suggestions pool with URLs.)
How are we supposed to deal with this issue? Wait and do nothing, it's very painful!
Two suggestions:
1. Become an editor. For any category....don't wait until "your" category is available: you want to be more active than that, right?
If there are twice as many active editors then:
a) the waiting time for everyone will be halved; and
b) the ODP will be more up-to-date so everyone using its data gains.
2. Start a campaign to educate (and, if necessary, forcibly re-educate in labor camps) spammers. About 50% of all URLs suggested to the ODP are spam. Get rid of them and the unreviewed pool is half the size, again halving the time for a listing.
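The arithmetic behind the two suggestions can be sketched as follows. This assumes, as a simplification not claimed in the post, that waiting time scales with the size of the unreviewed pool divided by the number of active editors; `relative_wait` and its parameters are hypothetical names.

```python
def relative_wait(editor_factor=1.0, spam_fraction=0.5, spam_removed=0.0):
    """Waiting time relative to today (1.0 = unchanged).

    editor_factor: multiple of the current number of active editors.
    spam_fraction: share of suggestions that are spam (about 50% per the post).
    spam_removed:  share of that spam that never reaches the pool (0.0 to 1.0).
    Assumes wait is proportional to pool size / editors.
    """
    pool = 1.0 - spam_fraction * spam_removed
    return pool / editor_factor

print(relative_wait(editor_factor=2.0))                    # twice the editors
print(relative_wait(spam_removed=1.0))                     # no spam at all
print(relative_wait(editor_factor=2.0, spam_removed=1.0))  # both together
```

Under that assumption either change alone halves the wait, and doing both would cut it to a quarter.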
So, basically, either sit back and comment that those volunteers aren't working hard enough for you, or join in.
Only one URL, one website - not duplication of content or name. I already applied to become an editor and was knocked back, was told basically the only reason I was becoming an editor is for self promotion which is not what the ODP is about.
I took this as a personal insult, I felt that the information I sent in did not get reviewed correctly.
I plainly stated that I could be an editor in many other categories. Not good, really. One thing I picked up on: it would be better to submit sites to directories or subcategories that are undeveloped. This does not apply to me; if anyone has any suggestions I'd like to hear from you.
I accept that the ODP does not give a damn about SEO or SEP, but at least have the time/decency to read the information submitted, and remember that not all PEOPLE are spammers.
It would be better to submit sites to directories or subcategories that are undeveloped. This does not apply to me; if anyone has any suggestions I'd like to hear from you.
With the ODP it's always best to submit the site to the most specific category possible. An editor will probably forward it there anyway. As for becoming an editor, the usual advice is to pick a small regional or non-commercial category with fewer than 50 sites and start there.
New applications are probably accepted in some commercial categories too, but the approach above likely gives the best chance of success.
If that is painful -- and there are no doubt many people for whom it is -- imagine what it would be like to review submittals for an hour, knowing that half of them won't have followed the submittal guidelines at all, and the other half won't have followed the editing guidelines for writing titles and descriptions. And you have to fix them all!
You really would not have enjoyed editing. Pick something worthwhile that you enjoy, or something in your neighborhood that needs doing so badly that you're willing to do it even though you hate it, just to have it done...and give your free time to that. Editing ODP is not the only good thing to do; it's not even the best possible thing to do.
As for getting listed, my primary site took 2 months to get listed, but another site that went into another category took less than 12 hours. Since editors watch a particular category, it really depends on how active they are.
Like somebody else said, if you're having trouble getting listed, try another category or pitch in and help.
-Chris
[edited by: skibum at 9:33 pm (utc) on Oct. 16, 2003]
[edit reason] no site specifics please [/edit]
Roughly speaking, once a week the ODP cuts (or tries to cut) an "RDF". This is used internally, to build various search indexes (including the public ones), and is publicly posted for anyone to load. The Google directory is built from this (whenever Google feels like it, which is once a month or less).
The publicly visible directory at dmoz.org runs on a cluster of servers, with a caching algorithm, and access back to the editors' database (not the RDF). When things are working correctly, no page should be more than two weeks out of date, but things weren't always working perfectly back before the new process.
Google search spiders the publicly visible category pages, for link pop purposes.
The "Official mirror" pages (in Europe) are IIRC supposed to update nightly.
The editors don't know exactly how or when any of this happens, except that the RDF has a timestamp so anyone can know when it was last cut. And ... we don't worry about it. It will happen when it happens. If the RDF is delayed a day, that just means 3000 sites will be published 6 days AHEAD of the schedule. Editing goes on, and what goes IN the pipeline must eventually come out at the other end.
And ... you shouldn't worry about it either. dmoz.org search will not be a major source of referrals. If the site is listed at dmoz.org....all the rest happens, mostly within the next six months.
Ahem, DMOZ has done an update every week for a very long time.
Google has had every opportunity to take and use every update as it was published. They live at rdf.dmoz.org/dmoz/ you know. Google has finally processed and used an update. Google has been the one doing nothing, not the ODP. The update only shows on a few datacentres so far, not all.
Well, laying aside moral issues: that isn't clear yet. I'm persuaded the ODP is in its way focused on its vision of "ordinary web users" -- Google has a different focus and a different vision, but "ordinary web user" is still in the crosshairs.
A lot of marketroids don't understand either of these projects -- can't see how just providing a good service is enough to attract users, and how not providing the best service won't keep them. They just see the "buzz" (some of it genuine, and yes, some of it artificial) and suppose that's what keeps these projects live. It isn't.
If you want to know what a buzz-driven company looks like, look at SCO. If you want to see a marketing-driven product, look at ... um, any Microsoft product. Technical excellence doesn't enter into them, even by the doggie door. THAT'S marketing. Google's behavior doesn't resemble either of those in any form or fashion.