| This 77 message thread spans 3 pages: 77 (  2 3 ) > > || |
|State of the ODP: 3.5 million in, 1.1 million unreviewed.|
I just looked up the exact numbers, and the above are the current figures for today. If anyone is wondering exactly why it sometimes takes so long to get sites approved, the above statistics should reveal why. I do have access to the complete breakdown, but I don't think the Powers That Be at the ODP would approve of me revealing this publicly. However, the most backlogged categories are Business, Computers and Shopping. No surprise there about Business and Shopping, although the high number of unrevieweds in Computers surprises me. The percentage of unrevieweds in Business as compared to the number of listed sites is staggeringly high. Of course a lot of those are no doubt spam. However, the editors have to slog through the spam to get to the sites that should be listed.
DMOZ should review the system it uses. I do not encourage or presently pay for any listings, but I think DMOZ should start charging for listings and offer a time guarantee. It is not acceptable in the world of microwaves and 3 Gig computers that so many websites are unreviewed.
I hope DMOZ looks at its business plan and charges for each listing, passes some of this down to the reviewers, whom it monitors more carefully, and gives a time guarantee.
Not sure that those unreviewed numbers tell the story considering I just deleted 12 submissions from one site the other day in a smaller size category and have found as many as 127 submissions and 109 from another site not that long ago (all "unwarranted" deeplinks).
Just thought we might want to look at the "real" picture ;)
The real picture is more than just numbers. How many of those 1.1 million are Server Not Found or 404? How many are duplicate submissions from aggressive submitters? How many are low-quality pages hosted for free somewhere? How many are affiliate spam or Work-at-home scams?
Who knows, but I bet it's a large percentage of what's in the queue.
Read my post above: "Of course a lot of those are no doubt spam. However, the editors have to slog through the spam to get to the sites that should be listed." I didn't mean to suggest that there were 1.1 million sites that should be added, merely that was the number of unrevieweds. I've deleted far more spam as an ODP editor than I have reviewed sites that even possibly fit that category. I'll presume that your cats were hit by that FX spammer that was submitting with a bot, like mine were being clobbered? The problem is that there is limited time the editors have. Having to spend most of your editing time dropping spam in the bit bucket means that it is that much longer before the legitimate submissions can be reviewed. Currently in my cat space, which has 311 sites listed, in the @ links that I can't edit there are 900 or so unrevieweds that are languishing in categories without editors I can't do anything about.
Personally, I think the ODP needs some new procedures to deal with cases of underworked editors like me. What immediately comes to mind is a new class of editor that fits my situation. Call it "stand-in editor". I'd propose that in cases like mine, I could apply for this status in any @ link that currently is without an editor and has a ton of greens. A stand-in editor would have full editor privs in that category (or, at least over all the reds and greens; although if the cat has a ton of unreviewed sites, I'd think the currently added ones aren't being checked up on either) until such time that someone applies for an editor via the usual procedure and is approved. The basis for approval would be a meta reviewing the applying editor's current cats to make sure they are ship shape, and the editor... The problem with the current system is that you have to submit 3 proposed sites to add to each cat, with appropriate title, description, etc.
The idea behind a stand-in editor is that they are just someone who is pitching in to help cats that are troubled by a large number of greens and reds. In an ideal world every cat would have an editor that is truly dedicated to it. However, for cats that have no editors and also have a backlog of reds and greens, I'd think my idea is better than just letting sites languish in unrevieweds.
I think that rfgdxm1 was just trying to say why it takes a long time in some cats to get listed.
I would agree that there is a lot of spam or 404's in the unreviewed number.
|Of course a lot of those are no doubt spam. However, the editors have to slog through the spam to get to the sites that should be listed. |
Whether or not a URL is 404 or spam, it still takes time to review and act on each unreviewed entry.
<added> Sort of like what rfgdxm1 just said! </added>
You're absolutely right kctipton. My guess is that if the ODP threw at me 100 randomly selected greens to review and edit, a significant percentage, quite possibly even most, I'd end up giving the thumbs down. Personally, I don't give a damn about the spam greens that are languishing. Unfortunately, someone's gotta slog through all the crud and drop it in the bit bucket to get to the sites that deserve to be added.
|The real picture is more than just numbers. How many of those 1.1 million are Server Not Found or 404? How many are duplicate submissions from aggressive submitters? How many are low-quality pages hosted for free somewhere? How many are affiliate spam or Work-at-home scams? |
I think it's still a staggering snapshot of the current state of affairs.
Even if only 1% of the sites in the queue are worthy of serious consideration, it still points to an enormous problem.
Worthy or not, those queues need to be tackled in an efficient manner.
Quite simply, the current system is not able to effectively support the submission load.
Instead of the usual "why ODP rocks" vs. "ODP sucks" debate, I'd love to hear suggestions for potential solutions... especially from editors themselves. (thank you rfgdxm1 for offering your own ideas)
And let's try to refrain from the defensive "it's fine just the way it is / deal with it" type responses.
A shared focus on creative solutions will yield more value than a "who's right, who's wrong" debate each and every time.
|there are 900 or so unrevieweds that are languishing in categories without editors I can't do anything about. |
Apply for greenbusting permissions in some of those categories.
|Worthy or not, those queues need to be tackled in an efficient manner. |
What gives you the impression that they're not? The number of unreviewed sites in the directory has been stable for a long time; it hasn't grown or significantly shrunk for more than a year now. In short, the number of sites being submitted daily is about the same as the number of sites being added. Also add to that the fact that an unknown percentage of all unreviewed sites are actually update requests.
Consider that even if 50% of all sites submitted are worthy of inclusion (a ridiculously high proportion), it means it takes on average 4.5 months for a site to be reviewed directory wide. (Take 550,000 sites, which is half of the 1.1 million, divide by 4,000 and then by 30 days.) If you drop the inclusion percentage down to say 30% (which is a more accurate figure), the average directory-wide review time falls below three months.
Thus the average site will be reviewed within three months' time. Naturally there's a great deal of variation depending on the popularity of the category. It also points out why applying to the correct category the first time is so important. If you screw up, then on average you wait three months for an initial review, then the editor moves your site, so you have to wait another three months on average for the second review.
Please don't get me wrong, I'm not trying to whitewash the submission process; however, it's completely misleading to rely on the 1.1 million figure as an indication of the efficiency of the review process.
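For anyone who wants to check the arithmetic in the post above, here is a rough sketch in Python. The 1.1 million queue size, the 50% and 30% shares, the 4,000 sites per day and the 30-day month are the figures quoted in this thread; the function name and its parameters are made up for illustration and are not anything the ODP publishes. It is the usual "backlog divided by throughput" estimate, which only holds if the queue really is stable, as the post says.

```python
def average_review_months(queue_size, share_counted, added_per_day, days_per_month=30):
    """Rough average wait: backlog divided by daily throughput, in months.

    queue_size      -- total unreviewed entries (1.1 million in this thread)
    share_counted   -- fraction of the queue treated as real submissions
    added_per_day   -- sites processed per day (4,000 is the figure quoted here)
    days_per_month  -- 30, as in the post
    """
    sites_counted = queue_size * share_counted
    days = sites_counted / added_per_day
    return days / days_per_month


# The two scenarios from the post: 50% and 30% of the 1.1 million queue.
for share in (0.5, 0.3):
    months = average_review_months(1_100_000, share, 4_000)
    print(f"{share:.0%} of the queue: about {months:.2f} months on average")
```

With the thread's numbers this gives a bit over four and a half months for the 50% case and a bit under three months for the 30% case, which matches the figures quoted. It is a directory-wide average only and says nothing about any single category.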
|What gives you the impression that they're not? |
Well, unfortunately, we know that far too many of the "worthy" submissions get chainsawed along with the "unworthy" when the queues get too big.
ODP can implement all the editor tools they want, but the bottom line is that a human being has to go through each and every one of the submissions to ascertain which should be included and which should be deleted. The only remedy for the problem is to add more editors.
I know that two years ago, only about 10% of applicants were accepted. I haven't read anywhere that that has changed.
>>we know that far too many of the "worthy" submissions get chainsawed along with the "unworthy" when the queues get too big. <<
We do? Who's doing that? I don't know anyone who does that, and I haven't come across a case of anything similar which didn't involve deliberate abuse by someone who wanted the competition to disappear.
Thank you for sharing that rafalk. Very sound points.
Clearly the number of sites in the queue does not suggest in any way how long they have been there.
If in fact the queue size has remained fairly constant, I see that the average site review time could be anywhere from instantaneous to 2 years and not change the size of the queue if the number of submissions kept pace.
So now I'm wondering about your methodology for guessing what that average time for review might be.
Is it sound to start with the percentage of sites worthy of inclusion as your base figure when an editor has to review many of the "unworthy" submissions to get to the worthy ones?
Also what are you basing the dividing factors of 4000 and 30 days on?
Or perhaps I'm just revealing how sluggish my grey matter is performing this evening. :)
>Apply for greenbusting permissions in some of those categories.
I've never looked into exactly what is involved in greenbusting. Would that allow me to at least blow away the spam? I'm actually tempted to throw in an application for Substance Abuse, which is an @ link in my cat space. Substance Abuse currently has 537 greens and 27 reds. Not only does Substance Abuse have no editor, neither does Addiction above it, and the only editors of that are way up at the top of Health. I am an editor at Zeal, and already have top level editing privs over all cats dealing with both recreational drug use and substance abuse. Because the Zeal taxonomy is better than the ODP [which is partially due to the fact that I redid much of the taxonomy in my cat space there. ;)], both recreational drug use and substance abuse are in the same branch. All I'd have to do is pick out the 3 best sites I have already added over at Zeal that aren't in the ODP, rewrite my descriptions so as not to plagiarize Zeal, and I should have a top notch editor app for Substance Abuse. Only problem is that likely it'd take the ODP months to even get around to reviewing my app. :( There are no editors in the subcats of Substance Abuse, which would mean if I got approved I'd be able to deal with all the greens and reds.
>If in fact the queue size has remained fairly constant, I see that the average site review time could be anywhere from instantaneous to 2 years and not change the size of the queue if the number of submissions kept pace.
It all depends on where you end up submitting it. In my cat space, it takes less than a week if the cat is one where I am sole editor, and possibly as long as several weeks if I have to wait and see if a lower editor logs in to deal with it. In a cat with no editor and a ton of greens, it could take many months before anyone even notices the submission. The reality is that there isn't one queue, but a large number of different queues in different areas of the directory. As an ODP editor it means nothing to me that there are 1.1 million greens at the moment. I'm the ODP editor equivalent of the Maytag repairman. I've got no greens or reds, and am waiting around for something to actually do.
Greenbuster permissions allow you to:
Perform any editing function in unreviewed
Edit sites in the Greenbusted queue
Add new sites to the category's greenbusted queue
Greenbusters can't edit reviewed listings or make changes that would affect the reviewed listings.
These (greenbusted) edits are reviewed and added to the "live" listings by the resident category editors or others who have permission to edit the category.
These edits are added to your editing log.
|Only problem is that likely it'd take the ODP months to even get around to reviewing my app |
Could be a matter of days.
I'm sure someone will correct me if I'm wrong
You're not in a position at the odp to do anything. You have no authority there. You don't rank. Why should I take your word on the state of the odp?
You previously posted asking people to sticky you about screwed up entries at the odp, and you stickied me back saying, "Oh, uh, you should tell a meta editor about that..."
Sorry, that's lame. Why the heck did you post asking people to sticky you if you aren't in a position to do anything?
And I thought, "DUH! This is impotent swaggering."
With all due respect, I'm sure you have your heart in the right place, but you don't have any rank at the odp so you should not solicit stickies from people if you are not in a position to do anything.
You don't have any rank in the ODP, so why should I believe you when you say that there are 33% unreviewed web sites?
Quite frankly, I can't recall you having anything positive to say about anybody. Be it Google or ODP.
My personal experience as an editor and as a submitter with the ODP does not concur with your statement.
[edited by: martinibuster at 8:25 am (utc) on Dec. 19, 2002]
rfgdxm1, this really isn't the place for discussions of how to get additional privileges at the ODP (that's what the internal ODP forums are for). I'd suggest you check out some of the threads there regarding Greenbusting and/or how to get more categories.
"I've got no greens or reds, and waiting around for something to actually do."
You can wash my car if you want.
One thing I'd just re-emphasize is that submitters need to understand that the wait in the queue is hard to get a handle on. If one queue is Wilt Chamberlain and the other is Mickey Rooney, it would be wrong to think queues average out to a six foot tall dude. Some categories have long waits because spammers bury them. Others have long waits because they are extremely obscure and no editor really pays attention to the area. Then some categories are updated literally every day.
Speaking generally about the ODP is a bit like the three blind guys who all touched different parts of an elephant and thus described completely different animals.
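To put some numbers on the point about averages, here is a small sketch in Python. Every figure in it is invented for illustration (the category labels and the waits in days are hypothetical, not ODP data); it only shows how a directory-wide mean can land nowhere near what any one submitter actually experiences.

```python
from statistics import mean, median

# Hypothetical waits in days for four kinds of category.
# None of these numbers come from the ODP; they are made up for the example.
waits_in_days = {
    "updated every day": 1,
    "sole active editor": 7,
    "buried by spammers": 180,
    "obscure, no editor": 700,
}

values = list(waits_in_days.values())
average_wait = mean(values)
typical_wait = median(values)

print(f"mean wait:   {average_wait} days")
print(f"median wait: {typical_wait} days")

# Show how far each category sits from the directory-wide mean.
for name, days in waits_in_days.items():
    print(f"{name:>20}: {days:>4} days ({days - average_wait:+} vs mean)")
```

In this made-up example no category is anywhere near the mean, so quoting a single average would describe none of the four queues.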
>You're not in a position at the odp to do anything. You have no authority there. You don't rank. Why should I take your word on the state of the odp?
And, where did I claim to have any authority beyond those categories that I edit? As far as the issue of abuse, if you or anyone else convinced me that there was abuse, as an ODP editor I would be compelled to report it. I have to assume that the ODP must get tons of complaints from people whining that the editor was engaging in abuse, when in fact all that happened was their worthless site wasn't added. As such, I'd have to figure an internal complaint from another editor about abuse would get moved right to the top of the pile. Only a clueless editor would make bogus or frivolous abuse complaints about other editors. Doing so is a good way to get on the **** list at the ODP.
Also martinibuster, there have been people around here posting allegations that those in charge of the ODP are corrupt, and intentionally allowing abuse. While I have no reason to believe this is true at all, I very much would be interested in hearing from people who are making these accusations to see if they have any merit. In fact, just recently someone brought to my attention a case involving a category where things looked suspicious. My best theory at the moment is that the category was left in a mess by a now absent editor, and the reason no new sites have been added is they are just buried in the queue. However, I do intend to watch that category and see if anything is done about this problem.
And as far as taking my word on the state of the ODP, I don't see anybody replying that what I said in the subject line of this thread are inaccurate numbers. One doesn't have to be high ranking at the ODP to be able to find out the number of unrevieweds. This isn't exactly a secret.
Pretty much "ditto" rfgdxm1. I think 95% of all whines against the odp are because some low quality site wasn't added. The problem is the 5% that are accurate. I think this is partly what laisha was referring to with those that get chainsawed - it's darn hard to figure out if that 5% is worth a darn.
>> One doesn't have to be high ranking at the ODP to be able to find out the number of unrevieweds. <<
No, but one has to be in a position to look at the bigger picture in order to know what that means.
If I see a bunch of dead flies on a windowsill, does that mean that the species is threatened, or that someone just sprayed?
Sites sit in unreviewed for a number of reasons, among them being:
1. Multiple submissions of the same site to several categories.
2. Submitted to inappropriate categories, causing a delay.
3. Spam, affiliate links, and other non-listable sites.
4. Once listed sites that have become unresponsive. Rather than simply deleting them, we often place them in unreviewed for a time, until they either come back up or we are able to locate the new URL (yes, we do look).
5. Suggested titles and descriptions are loaded with hype and inappropriate keywords, causing a delay in listing.
6. Some categories receive such a large volume of submissions that it is difficult to keep up with them, usually made more difficult by numbers 1-3 above.
Whenever I go into an overloaded category in order to help bring the numbers of unreviewed down to a more manageable level, I usually find that 90% (percentage pulled out of my butt, for effect) of them are unlistable for one reason or another, and that very few of them are a real asset to the directory.
In most of the ODP categories, new submissions are dealt with quickly. Many ODP editors not only manage submissions efficiently, but go out looking for useful sites to add to the categories that they manage. New editors are added every day, and abusive editors are removed as they are found.
What everyone needs to remember about the ODP is that it is a human directory with human editors (who have regular jobs as well).
Spam needs to be slowed down (very difficult) and more editors are needed. Also, maybe the submission form for editors should be shortened to encourage more applications.
Just my 2 cents...
I had to apply 3 times to my cat before I was accepted. I think reducing the length of the submission form would encourage more unsuitable applications.
This in turn would just create more work for editors who reviewed the applications.
I think, though, more existing editors need to be encouraged to sign up for greenbusting duties (I signed up this morning after reading this thread!).
I think that steveb makes a good point about the great variations between cats. Averages really don't mean all that much if the cat you have submitted to hasn't been edited in years.
I know that many cats are in great shape. My tiny cat is, and I'm sure most of the cats run by the diligent and helpful members here at WebmasterWorld are too. By contrast, a few cats that I submitted sites to months ago haven't had an editor in ages. The next level up doesn't have an editor, either, and the next level up has an editor that hasn't edited in over a month. You have to go to the top level cat before you find any active editors, and clearly these folks have a LOT of stuff to do. I think one of the cats I submitted to hasn't had a site added in two years.
This isn't a complaint - I'm sure if good editor apps were coming in, these neglected cats would be neglected no more. My point is that some of the webmaster complaints ARE justified - good performance in many areas of the directory doesn't mean that there aren't some black holes in there. If a webmaster's category of interest is one of them, he/she might well conclude that DMOZ is completely broken.
And for those WebmasterWorld members who aren't editors - give it a try. Clearly, many participants here have a good command of the language as well as a good nose for spotting spam. Adopting one small cat is not time-consuming, and if enough people do it we won't have all these backlog arguments.
|Doing so is a good way to get on the **** list at the ODP. |
**** list ... pink list? .... blue? ....
excuse frivolity, blame the time of year.
|Is it sound to start with the percentage of sites worthy of inclusion as your base figure when an editor has to review many of the "unworthy" submissions to get to the worthy ones? |
That's a very good point. An editor can eliminate a large majority of the "unworthy sites" without actually having to review them. That's because they're duplicate submissions, or have already been tagged as unsuitable sites. I would say that 50% would be a fair estimate of the actual number of sites an editor has to "review" - that is, actually looking through the site.
|Also what are you basing the dividing factors of 4000 and 30 days on? |
4000 is the average amount of sites that are added in a day to the ODP. Once you divide the amount of sites by this number you get the average review time in days. Dividing by 30 gives you the average review time in months. Sorry for not making this clearer earlier. :)
If anyone from Yahoo is reading, perhaps we could compare the number of paid editors with the number of paid-for and non-profit sites getting into the Yahoo index.
|We do? Who's doing that? I don't know anyone who does that, and I haven't come across a case of anything similar which didn't involve deliberate abuse by someone who wanted the competition to disappear. |
Yes, you do. Many (overburdened) editors do it, and quite likely completely by accident. Any line editor who gets a large amount of emails from submitters who claim they've submitted repeatedly and check into them can see what I am talking about. It's always been that way, and it has not changed.
This dialogue, however, could go on and on forever with me saying "yes it does," and you saying, "no, it doesn't." I could list urls until your eyes roll back in your head, but we all know that would be dangerous for those listings. So I won't belabor the point.
I would really like this question answered, though it seems to have been lost in the postings:
|I know that two years ago, only about 10% of applicants were accepted. I haven't read anywhere that that has changed. |