The term editing seems to be broad and even changing a single character in the title/description seems to be considered an act of editing. It is conceivable that many editors, who want to remain as editors for various reasons, would log in once every 4 months and make a single word change here or there just to fulfill the requirement for continuing their editor status.
I wonder if it is possible for the general public, or even the DMOZ editors, to get a chart or bar diagram showing a breakdown of editors by the number of sites they added during the past year that were not affiliated with them. A representative output could be:
#sites added ---- #editors
1 - 5 ---------3,167
6 - 10 ----------456
11 - 25 ---------111
26 - 100 ---------56
101 - 250 --------12
251 - 1000 --------7
1001+ -------------1
This will give us more understanding of how DMOZ works.
takagi
Yes, it is a made up table. I wrote "A representative output could be ..." ;)
However, I think that the actual number will be similar to ones I presented.
Moreover, categories are as different as editors are. Someone who edits in the Sanskrit lexicons category may spend six hours a day hunched over Copernic searching for new listings to add and find none. Someone in international dating classifieds could spend six hours deleting spam and not have time to add a single site. But an editor in, say, Tennessee Baptist churches might handle five submissions in five minutes, leaving a hundred more in unreviewed.
n(edits of the i-th most active editor) is proportional to c(log(i)).
For initial values:
Take 200,000 edits as the "most active editor's" stat.
Take 1 edit as the "50,000th most active editor's" stat.
Take 20% of edits as unique site adds, and another 5% as duplicate site adds (dual listings in Regional/Topical, Spanish/English, etc.)
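The interpolation described above can be sketched numerically. This is only a minimal sketch under one assumed reading of the formula, namely that edit counts fall linearly in the logarithm of an editor's rank, n(i) = a - b*ln(i). The thread does not pin down the exact functional form, and a power law would give very different numbers. The function names are hypothetical.

```python
import math

TOP_EDITS = 200_000      # edits of the most active editor (rank 1), from the post
LAST_RANK = 50_000       # rank of the least active editor considered, from the post
LAST_EDITS = 1           # edits at that rank, from the post
UNIQUE_ADD_SHARE = 0.20  # share of edits that are unique site adds, from the post

# Assumed model (not confirmed by the thread): n(i) = A - B * ln(i),
# with A and B fixed by the two anchor points above.
A = TOP_EDITS
B = (TOP_EDITS - LAST_EDITS) / math.log(LAST_RANK)


def estimated_edits(rank: int) -> float:
    """Estimated edits for the rank-th most active editor under the assumed model."""
    return A - B * math.log(rank)


def estimated_unique_adds(rank: int) -> float:
    """Estimated unique site adds, taken as a fixed share of estimated edits."""
    return UNIQUE_ADD_SHARE * estimated_edits(rank)


if __name__ == "__main__":
    for rank in (1, 100, 5_000, 50_000):
        print(rank, round(estimated_edits(rank)), round(estimated_unique_adds(rank)))
```

Swapping the two anchor points for other ranks only changes how A and B are solved; the rest of the sketch stays the same.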
It could be for internal use only so that when editors log on, they find this info flashing in front of their eyes. It's non-accusatory and might encourage some editors to do better.
John_Caius
I think I mentioned that there are about 10000 "active" editors.
hutcheson
Thanks for the suggestion. I would, however, want to eliminate outliers and take into account the unique distribution of #edits. Perhaps taking the numbers for the 100th most active editor and the 5000th most active editor would be better for interpolation/extrapolation.
60,000 is the total number of editors ever. IIRC active editors is something like less than a quarter of that. And "active" can mean as little as one edit in the last several months. Biggest problem with how you want to look at this is what choster brought up. If an editor only edits one or several very obscure ODP cats, he may be able to maintain them very well with just a few edits a year. Some topics don't have worthwhile new sites pop up frequently.
In fact, ODP is very active right now. Lots of new exciting things going on that will go public in the next few months.
You all have to remember that ODP is a volunteer effort. Sometimes, real life takes over and you don't have time to determine if the 500 sites in your queue are relevant or not.
In addition, the directory and its editors aim to make it the most comprehensive listing of the web. It's based on quality, not quantity. Good things come to those who wait.
Let's check my ODP "activity" today. I stopped RL work around 6 PM and since then
- went to the public forum to read submission status threads, spent some 30 minutes there
- went to the internal forums to read relevant threads, spent some 30 minutes there
- replied to 7 emails from submitters, 4 asking where their site could be listed, 2 asking why they haven't been listed yet, and one complaining that the editor in the "Google category where they are listed" didn't want to change the description of their site to an oh-so-wonderful keyword-stuffed one. Spent some 20 minutes there
- did some investigation on a case of alleged abuse, spent some 30 minutes there
- reviewed 3 new editor applications, spent 1 hour there
- while surfing the categories the applications were intended for, corrected 3 inappropriate descriptions
- prepared a brief report for an ongoing reorg in a small subtree, spent some 15 minutes
Total number of hours dedicated to ODP today: more than 3 hours
Total number of edits: 3
Number of new sites added: 0
Please (re)define "active".
I have earlier stated in msg #4
>You are right about other useful activities by editors such as deleting spam or even changing title/descriptions. We could have a table for that too. :)
Obviously you did useful work for the ODP. Actually the editors who are active are the ones most likely to be visiting such forums and therefore, there is a biased sample.
On the other hand, it is quite possible that there are editors who are fast asleep now with the alarm programmed to wake them up every 4 months so that they can log in ODP, change a word in one description, and then go back to sleep.
While computer-generated reports cannot do justice to many editors, at least the most obvious forms of negligence can be easily spotted. You, and other editors dealing with cats that have very few sites in existence, can ignore a report stating that you are in the 10th percentile of the sites-added ranking compared to the entire pool of editors. However, if the report also shows that editors of similar categories are scoring in the 25th percentile, it might prompt an editor to rethink their strategy. It's just feedback and does not mean that one is doing something wrong or right.
Looking at your approximate guesstimate breakdown:
edits ---------- editors
6 - 10 ----------456
11 - 25 ---------111
The 100th most active editor is certainly an editall, if not a meta, given that there are several hundred editors with higher permissions, who have typically amassed between 10 and 30 thousand edits each on top of all the non-site-reviewing things they do, so eloquently highlighted by ettore. Given that most senior editors have been around for two or three years, that's around 10000 edits a year for somewhere between the 100 and 500 most active editors, not the more conservative 6-25 edits per year in your model.
3 million sites listed in three years is a million sites a year. Given an average of say 5000 active editors throughout the history of the ODP, that gives an average of 200 site adds per editor per year. In the areas I edit, I probably add one in five sites that I review and that's probably fairly average. So your mean value is going to be around 1000 edits per editor per year.
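The back-of-the-envelope arithmetic above can be written out step by step. This is just a restatement of the figures in the post; the variable names are made up for illustration, and the inputs are the poster's own rough guesses.

```python
SITES_LISTED = 3_000_000  # total sites listed, from the post
YEARS = 3                 # period over which they were listed, from the post
ACTIVE_EDITORS = 5_000    # assumed average number of active editors, from the post
REVIEWS_PER_ADD = 5       # "one in five sites that I review" gets added

sites_per_year = SITES_LISTED // YEARS
adds_per_editor_per_year = sites_per_year // ACTIVE_EDITORS
edits_per_editor_per_year = adds_per_editor_per_year * REVIEWS_PER_ADD

print(sites_per_year)             # 1000000
print(adds_per_editor_per_year)   # 200
print(edits_per_editor_per_year)  # 1000
```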
Just my guesstimates. :)
Negligence? What are you talking about? Nothing in this topic suggests any negligence at all. An editor making one edit in four months either makes a microscopically positive contribution via that one edit, a microscopically negative one by screwing up a sentence, or has no effect (changing "is not" to "isn't").
You seem to not understand what is involved here. That editor isn't being negligent in the least. That editor isn't preventing another person from volunteering, or editing a specific category.
You seem to think there are only so many seats on a bus and some people are hogging them. That is not how it works. It's not even remotely close to how it works.
ROUGH STATISTICS
Rough statistics for sites added are trivially simple. Once a week, analyze the RDF (or spider the DMOZ site) and collect data on new sites added. You can in many cases, to a statistically significant level, assign edits to named editors:
This will undercount additions by metas, and overcount for some dormant editors who share cats, but it will give the "sort of rough statistics" that you can use to confirm or deny your "suspicion that out of about 60,000 editors maybe about 1000 current ones are 'really' dedicated."
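The weekly comparison described above could be sketched roughly as follows. Everything here is an assumption for illustration: it presumes the RDF dump (or a spider run) has already been parsed into plain dictionaries, and it uses one simple attribution rule, crediting a new URL only when its category names exactly one editor. The function name, the attribution rule, and the sample data are all hypothetical, not part of any ODP tool.

```python
from collections import Counter


def count_new_adds(previous, current, editors_by_category):
    """Count newly listed URLs per editor between two snapshots.

    previous, current: dict mapping category name -> set of listed URLs
    editors_by_category: dict mapping category name -> list of named editors
    A new URL is credited only when its category names exactly one editor;
    everything else is tallied under "(unattributed)".
    """
    adds = Counter()
    for category, urls in current.items():
        new_urls = urls - previous.get(category, set())
        if not new_urls:
            continue
        editors = editors_by_category.get(category, [])
        owner = editors[0] if len(editors) == 1 else "(unattributed)"
        adds[owner] += len(new_urls)
    return adds


if __name__ == "__main__":
    # Hypothetical snapshots taken one week apart.
    last_week = {"Arts/Example": {"http://a.example/"}, "Science/Example": set()}
    this_week = {
        "Arts/Example": {"http://a.example/", "http://b.example/"},
        "Science/Example": {"http://c.example/"},
    }
    editors = {"Arts/Example": ["editor_one"], "Science/Example": []}
    print(count_new_adds(last_week, this_week, editors))
```

Categories with no named editor, or with several, land in the unattributed bucket, which is where additions by metas would end up under this rule.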
The facts are out there, and we all know where they are. It just needs someone to do the work. And the best person to do that is the one who wants the results.
I suggest you undertake to publish the numbers here once a month for the next year. If ODPers complain that the numbers are oversimplified, or biased, or whatever, they will be able to suggest refinements, and you can reanalyze until all objections are met.
Good suggestion. I think I will look at the data and see what useful info can be extracted from it.
John_Caius
John, if we take averages, you are absolutely right. My figures are wrong. For example, even if one assigns the largest number to each editor in each grouping, it leaves hundreds of thousands of sites to just one poor editor. ;) However, the point I was trying to make was that while a few editors are likely adding tens of thousands of sites, many might be adding just a few, and seeing a chart showing where they stand might motivate them to improve their standings.
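That check can be made explicit. The sketch below gives every editor in each group of the made-up table the maximum of their range, then compares the total with the one-million-sites-per-year estimate from earlier in the thread. The variable names are chosen here for illustration only.

```python
# (upper bound of sites added, number of editors) for each group of the made-up table
GROUPS = [
    (5, 3167),
    (10, 456),
    (25, 111),
    (100, 56),
    (250, 12),
    (1000, 7),
]
SITES_PER_YEAR = 1_000_000  # estimate from the thread: 3 million sites in three years

# Most sites the first six groups could have added between them.
max_from_groups = sum(upper * editors for upper, editors in GROUPS)

# What would be left for the single editor in the "1001 and up" group.
left_for_top_editor = SITES_PER_YEAR - max_from_groups

print(max_from_groups)      # 38770
print(left_for_top_editor)  # 961230
```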
steveb
By negligence I didn't mean deliberate negligence. There are many small categories with very sincere editors and they sincerely believe that there are only say 5 sites in the world concerning that topic and they have found them all. To continue the editorship, they have to change a few things once every 4 months, which is a pain but tolerable. If those same editors are shown that compared to their peers they are lagging, not as criticism of their volunteer activity but just as feedback, some of them might talk to other editors in similar categories to discover better ways to find new sites. For example, let's say there is a category based on novelist Robert Lastname. Later, when the editor finds out that it is fruitful to look under "Bob Lastname" too, maybe more sites could be found. (This example is trivial.)
The reason why this kind of information for motivation probably wouldn't be employed in the ODP goes back to the volunteer principle - who would volunteer if they got named and shamed, or told that they had to do a minimum amount of editing? Yes, once every four months is a nominal amount but doing only this many edits would essentially rule an editor out of applying for a second category, restricting them to only an extremely small cat space, perhaps fifty sites in typically a not particularly commercial area. You need to clean up your cat before applying for another one, requiring perhaps 100-150 edits. The ODP staff and metas prefer to have a lot of people doing a little editing, plus a decent number doing a lot of editing, than having all the editing duties left just to the hard-core types.
If editors were being paid to work, as is the case at LS or Y!, then I think it's quite reasonable to kick people up the backside if they're not pulling their weight. However, objectivity perhaps decreases when there is a financial reward for listing sites. Why should I delete all the useless online pharmaceutical affiliate sales sites when I get paid 5c for each one I list?
There are specific procedures in place within the internal structure of the ODP, including category checks and suggestion for improvement by senior editors, new editor mentoring, a dedicated forum for New Editor questions, plenty of documentation on editing skills, editor-produced tools etc. that are easily available for new editors to learn new skills. One example is regular threads on how to deal with sites that are flagged up as not responding by the dmoz robot, Robozilla - advice on where to look to see whether the site has moved, does it have a Google cache, is there useful information on archive.org, is it part of a larger site that has just completed a re-org etc.
The principle is that the information is there if editors want to make use of it, however there's no obligation to do so, other than that your edits, however sparse, should be unbiased and guidelines-compliant.
Correct. I am in no way suggesting anything like that. It should be used only for self evaluation and could be easily ignored if found not applicable to one's situation. It could be a good tool for senior editors to evaluate where DMOZ is going and what can be done to improve that.
Anyway, I think I agree with victor that I should be able to present something concrete by using the data to show what I mean. Otherwise it is just empty air.
After all is said and done, my biggest concern is the long delays in acceptance reported by many. On the internet, projects have short half-lives. I have a site that got accepted by DMOZ quickly and, thanks to it and acceptance by one more directory, I am doing well on Google SERPs. I tried to get a few more links but was unsuccessful except for 2-3 minor ones. I would guess that the DMOZ listing and the other directory listing account for most of my ranking on the search engines.
However, if I had had to wait a year for a listing, I would have been nowhere unless I paid big bucks for "review" in some commercial directories. I think inordinate delays in listing can kill many projects because DMOZ is so important. (I also applied to GoGuides more than 4 months ago and am still awaiting review. Do I care? Very little.)
Go ahead and compile the stats, but don't expect them to be anything but a curiosity.
As explained previously this is wholly irrelevant and inappropriate thinking. Editors neither compete, nor is there any reason at all to compare them.
You seem to be wasting a lot of time on a non-useful line of thinking rather than doing something yourself. The only reason things remain undone in some areas is because people don't choose to volunteer to do them, whether they be existing editors or those who don't even bother to join up.
However, I'm not convinced that having stats showing editor activity would benefit the directory in any way. For instance, compiling stats on an IRC channel has the risk of people starting to post lots of nonsense to gain higher ranking in the stats. Naturally this will only lower the quality of the discussion and dmoz is all about quality, not quantity. The only slight benefit I see is that it would prove that work is in fact being done at dmoz.
Nonetheless, being a statfreak I would really like to see such stats. Just don't expect me to actually implement anything although I probably could manage to do something along the lines of my suggestion.
Really, is it anybody's business what happens within DMOZ except for the editors? I suggest you all go about your daily lives and leave DMOZ to those chosen to care for it.
I agree that much of the statistics could be meaningless because of difficulties in comparing different things. However, more transparency in info can only help DMOZ.
For example, lots of rumors, mostly unfounded and perhaps spread by webmasters who think that their sites have been unfairly treated, circulate about the years it takes to get one's site included, how some editors are just there to help their own interests and block their competitors, etc. There are some redress mechanisms, but inertia keeps people from using them.
To give an example: if a chart were made available showing the time it took for sites to get included, it would be difficult for false rumors to spread because DMOZ could clearly show the facts. For example, it could say, "look, there was one site that took 5 years to get included, but the median time is 7 weeks and the average time is 9 weeks, and the site that took 5 years was probably delayed because the cat editor was an ex of the person submitting the site."
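The difference between the median and the average in such a chart is easy to illustrate. The waiting times below are entirely hypothetical, invented only to show how a single five-year outlier drags the average up while leaving the median alone; they are not real DMOZ figures.

```python
import statistics

# Hypothetical waiting times in weeks; the last entry is the one
# site that waited roughly five years (about 260 weeks).
wait_weeks = [3, 4, 5, 6, 7, 7, 8, 9, 10, 260]

median_wait = statistics.median(wait_weeks)
mean_wait = statistics.mean(wait_weeks)
mean_without_outlier = statistics.mean(wait_weeks[:-1])

print("median:", median_wait)
print("mean:", mean_wait)
print("mean without the outlier:", mean_without_outlier)
```

This is why a published median would say more about the typical submission than an average would.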
Or the site was submitted to a neglected corner of the ODP, or the site was submitted to a very heavily spammed cat and got buried in all the spam, or the site got moved into a series of the wrong unreviewed queues due to editor error about guessing its topic, or...
There is just no way of knowing from these kinds of stats if there was likely abuse. This would take a case by case evaluation.
>I agree that much of the statistics could be meaningless because of difficulties in comparing different things. However, more transparency in info can only help DMOZ.
How so?
It seems to me that the original idea in this thread is based on the notion that DMOZ should do things to keep webmasters and SEO folks happy. That's just not the way it is. I'd suggest that DMOZ's main customer group is all of the sites that use its data for their own directory purposes, and the secondary audience is Joe Websearcher who uses DMOZ to find web sites.
And I don't think either of those customer groups would care about a chart showing "how DMOZ works" by providing editor activity data.
Just my opinion....
Thank you, kctipton, for the offer. You could select the category yourself, since you might know which category some complaints might refer to. If possible, select a category that many visitors to this forum might be interested in professionally. Thanks.
pleeker
Professional webmasters are also users, and while DMOZ should not cater to them exclusively, they should not be ignored: their strong interest means they provide good feedback.
As with everything that is not based on rigorous analysis, any perception of DMOZ, or even Google, is shaped by selection bias in the sample. For example, webmasters who could not get their sites included in DMOZ within a reasonable time are bound to be much more vocal about what they "really" think about DMOZ than the webmasters who are quietly getting multiple sites accepted within weeks.
The same goes for criticism of some of the editors. Rumors are spread by webmasters who think that their competitors, who are also the cat editors, are unfairly rejecting their sites or delaying the submissions.
It is my belief that DMOZ is not as bad as the perception created by the upset webmasters. Some sort of statistics, while they will probably be attacked by almost everyone for distorting "reality", will nevertheless help in countering some of the serious allegations and at the same time serve as a monitoring tool for DMOZ metas and senior editors. Just my opinion. :)