
Home / Forums Index / Google / Google News Archive

Google News Archive Forum

    
Google directory / ODP RDF Dump, When?!
Any news on the long overdue rdf dump?
webby2001




msg:182036
 11:36 pm on Jan 8, 2003 (gmt 0)

Well it's starting to get beyond a joke now with the ODP RDF dump software problems.
Google directory has not received a full rdf dump since September!

Now, I know Google currently crawls the ODP and still passes on PR from the ODP, but it is also clear that web sites in the Google directory do get an extra ranking/PR boost.

It surely must be as frustrating for ODP editors as for webmasters, and of course for the many small directories that rely on the dump for their own web pages (who are also left completely in the dark). I mean, the editors voluntarily, for the most part, put a lot of work into keeping the directory current and spam free, and yet the fruits of their efforts are never seen beyond the no doubt very few visitors who use the DMOZ directory. Their additions, amendments and spam booting have not been seen in Google since September. This must be demoralizing for them.

I actually think DMOZ is NOT the corrupt snake pit I have heard many times said about it, and I believe they do a thankless task very well for the most part. There are a few bad apples but I believe they get found out eventually and booted out.

I am really concerned that the ODP is going to lose their Google partnership, even though they are closely linked with PR calculation. It MUST be embarrassing for Google to have such an out of date, spam filled directory, surely?

Anyway, if anyone knows what on earth is going on please let us all know. I'm sure I'm not the only one who expected to see the rdf ready for this month, only to find yet another month gone by :-(

Alan

 

Marcia




msg:182037
 9:27 am on Jan 9, 2003 (gmt 0)

Alan, my guess is that if Google's search results are relevant and their directory has quality sites that satisfy the needs of the searchers, with both satisfying their quality standards, they might not be overly concerned.

It's more than likely much more of an agitant to webmasters whose sites have been delayed for inclusion in the Google Directory, but their day will come. It will happen. Some day.

Bobby_Davro




msg:182038
 9:52 am on Jan 9, 2003 (gmt 0)

The latest news is that they are in the process of compiling a new dump, with no known errors at this point. However, the last one produced previously unseen errors, so we may have the same problem again.

victor




msg:182039
 7:38 pm on Jan 9, 2003 (gmt 0)

Part of the problem is that the dumps take days to run, and the errors show up towards the end (they would, wouldn't they!?)

So as Bobby_Davro says, a new dump run started a couple of days ago (6-Jan). As far as I know it's still running.

This no-RDF is just as irritating to us editors as it is to those of you with websites you want showcased. It appears to me (I may be wrong about this) that the DMOZ editors' programs (not the public search) run at a lower priority than the RDF dump. Certainly, I've given up trying to edit several times this week because it's been so slow.

jdMorgan




msg:182040
 8:16 pm on Jan 9, 2003 (gmt 0)

I think that Google and the other successful search companies that use ODP should make an annual donation to DMOZ sufficient to cover the cost of hiring at least one other paid technical staff member to help the single beleaguered ODP engineer. As long as the funded positions are technical rather than editorial, there is no conflict of interest.

Damn the inter-corporate politics, brag as much as you want about it in the press, but just do it. Thank you for your kind consideration.

I have several small non-profit sites which show clearly that the ODP is the source of almost all page rank. It is by far the most authoritative directory on the planet. Without ODP, these small informative niche sites are sunk. Thank you, all ODP staff and editors.

MHO, YMMV,
Jim

rogerd




msg:182041
 9:31 pm on Jan 9, 2003 (gmt 0)

jd, I agree that something should be done - this whole situation brings to mind the Wizard of Oz, with one little guy behind the curtain. I can see a variety of solutions - better funding from its owner, a non-profit grant approach, payments from a few major users as you suggest, or even an expedited inclusion/PFI model. The thousands of volunteers, and millions of direct and indirect users, need more paid staff than one person. (Hats off to him/her, though, for keeping things going at all...)

arjan




msg:182042
 10:47 pm on Jan 9, 2003 (gmt 0)

Googleguy?

julinho




msg:182043
 12:16 pm on Jan 10, 2003 (gmt 0)

> Alan, my guess is that if Google's search results are relevant and their directory has quality sites that satisfy the needs of the searchers, with both satisfying their quality standards, they might not be overly concerned.
> It's more than likely much more of an agitant to webmasters whose sites have been delayed for inclusion in the Google Directory, but their day will come. It will happen. Some day.

Not just that, Marcia.
I have seen several (I stopped counting at around 30) expired domains (PR 6 and 7) which used to be informational and now point to adult sites.
Can you picture the situation? You are searching for a kindergarten, and you end up looking at adult stuff. It doesn't take many such cases to severely hurt the credibility of Google and DMOZ.
These sites owe their high PR to the hundreds of sites which mirror DMOZ. They are actually deleted from DMOZ quickly (competitors usually alert editors), but while there is no new RDF dump, they remain listed everywhere else (including the Google Directory), keep their high PR, and get lots of Google traffic every day.

I agree: some money should be spent by someone NOW to fix this problem, instead of spending lots more later to try to fix a bad image.

Marcia




msg:182044
 12:29 pm on Jan 10, 2003 (gmt 0)

I wonder, if a voluntary contribution were available, how many of those dozens of small independent sites that use the ODP data and make a profit from it would actually contribute.

webby2001




msg:182045
 12:38 pm on Jan 10, 2003 (gmt 0)

Some excellent points made so far.
What I am actually astounded to know is that there is just the one person(!) on the technical staff sorting this all out. I mean, it isn't as if ODP is owned by Joe Bloggs who runs it from his garage, for crying out loud. We are talking, if my memory serves me correctly, Netscape and I believe AOL here. As for the run taking days, I suspect that has more to do with lack of funding/hardware than the fact it actually has to take that long. I think Netscape, AOL, and dare I say it, Google should temporarily allocate one or two technical staff to get it sorted. The users, webmasters and editors, I'm sure, don't give a rat's a** about the politics; they just want the dang thing fixed :-/

Julinho makes the very valid point that the Google directory's results are not all relevant and spam free, and that it does not provide the editorial standards you would expect from Google. There is plenty of JavaScript redirection to porn sites, and I wouldn't mind betting there are sharks out there buying high-PR domains listed in the Google directory solely to redirect them to porn, casino, or other irrelevant web pages.

Let's hope this current run produces no errors!


Hagstrom




msg:182046
 2:05 pm on Jan 10, 2003 (gmt 0)

> Part of the problem is that the dumps take days to run, and the errors show up towards the end

I just don't see why they don't make partial dumps.

In my (limited) experience Google doesn't update the entire structure at once - e.g. they might refresh the so-called World category before the rest of the hierarchy.

victor




msg:182047
 4:01 pm on Jan 10, 2003 (gmt 0)

> I just don't see why they don't make partial dumps.

A lot of the discussion on the ODP internal forum about this is lamenting the difficulty of maintaining old, complex, legacy code. Just one person and a great heap of problems and priorities.

If they'd known they were going to be this successful, I'm sure they would have written it for partial dumps in the first place. As it is that is unlikely to happen at least short term.
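For anyone wondering what "partial dumps" might look like in practice, here is a minimal sketch and nothing more. It is not how the ODP software works: the sample listings, the `top_level` and `partial_dumps` helpers, and the simplified element names are all made up for illustration. The idea is just that each top-level branch is serialised on its own, so a bad record spoils only its own branch rather than the whole multi-day run.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical listing records: (category path, url, title).
LISTINGS = [
    ("Top/World/Deutsch/Computer", "http://example.de/", "Beispiel"),
    ("Top/Computers/Internet", "http://example.com/", "Example"),
    ("Top/Computers/Software", "http://example.org/", "Example Org"),
]


def top_level(category):
    """Return the branch directly under 'Top', e.g. 'World'."""
    parts = category.split("/")
    return parts[1] if len(parts) > 1 else parts[0]


def partial_dumps(listings):
    """Serialise each top-level branch separately.

    Returns (dumps, errors): dumps maps branch name to an XML string,
    errors maps branch name to the message of the failure that stopped it.
    """
    by_branch = defaultdict(list)
    for category, url, title in listings:
        by_branch[top_level(category)].append((category, url, title))

    dumps, errors = {}, {}
    for branch, rows in sorted(by_branch.items()):
        try:
            # Simplified, assumed element names; not the real dump schema.
            root = ET.Element("RDF")
            for category, url, title in rows:
                page = ET.SubElement(root, "ExternalPage", about=url)
                ET.SubElement(page, "Title").text = title
                ET.SubElement(page, "topic").text = category
            dumps[branch] = ET.tostring(root, encoding="unicode")
        except (TypeError, ValueError) as exc:
            # A record that cannot be serialised fails this branch only.
            errors[branch] = str(exc)
    return dumps, errors
```

Calling `partial_dumps(LISTINGS)` gives one XML string per branch. A consumer could then refresh the branches that succeeded and retry only the ones that failed, which is roughly what a per-category refresh would need.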

percentages




msg:182048
 4:24 pm on Jan 10, 2003 (gmt 0)

>So as Bobby_Davro says, a new dump run started a couple of days ago (6-Jan). As far as I know it's still running.

What are they using to do the processing, a TRS-80? There are only 3.8 million web sites listed, so why on earth does it take so long?

I guess the bottom line is that if Google cared a great deal about a new ODP dump it would either offer some $$$ or get a programmer at the Plex to write a crawler to produce the dump. I like the latter solution, I'm not sure AOL would use the $$$ wisely.

Maybe this explains the absence of GoogleGuy, he said he was going to be around less after the New Year, is he working on a crawler for the ODP? ;)

jdMorgan




msg:182049
 4:45 pm on Jan 10, 2003 (gmt 0)

I mentioned Google because we're in the Google News forum, not meaning to single them out. The whole search provider industry seems to have a dependence on ODP for "clean" results, as evidenced by what will happen if you submit your site to ODP and ODP only: If it is included in ODP, it will usually show up in all the majors after a couple of months.

So, obviously, ODP is a resource worthy of support by the industry - Using a PageRank-like analogy, all of these major search providers "vote" for ODP by using its results, directly or indirectly.

Regarding Marcia's suggestion, I also think a "Make a Donation" button at the upper right corner of each ODP page is a great idea, and have been known to click on them in the past.

Jim

tiguere




msg:182050
 4:54 pm on Jan 12, 2003 (gmt 0)

As far as I know, the RDF is STILL having problems. :-(

I agree with webby2001 and jdMorgan and Marcia.
This is getting to be a bad joke: the RDF dump that never finished. Some of the big companies which use the Dmoz DB SHOULD do something, contribute in some way, or make a donation of either hardware or advanced programming.

If just one of the big guys from Dmoz could read this thread and put a "Make a donation to Dmoz" somewhere in the results pages, I wouldn't be the only one to use it, and I'm pretty sure that a lot of webmasters would do the same.

Are the big Players going to let Dmoz die? And as Arjan previously said: ....
[everybody turns their heads and stares at Googleguy's face, and from the crowd somebody asks:]
Googleguy?

Napoleon




msg:182051
 5:07 pm on Jan 12, 2003 (gmt 0)

>> If just one of the big guys from Dmoz could read this thread and put a "Make a donation to Dmoz" somewhere in the results pages, I wouldn't be the only one to use it <<

Agreed.... and I reckon I could twist quite a few arms to get some others on board.

ODP is far too important to the net to be disrupted like this. The current custodians are frankly bringing themselves into disrepute by not serving it properly. If they won't give it the resources it deserves, they should pass it on to someone who will (and I reckon there would be plenty of takers).

Marval




msg:182052
 2:57 am on Jan 13, 2003 (gmt 0)

Julinho...you mention the expired PR6 and 7 domains that are now pointing to, being used for boosting rank on, or even becoming adult domains. The RDF will not take care of most of those, as most of them are still listed in the ODP as active sites. The only thing that will change that situation is the editors looking at their links and removing the expireds from the ODP, which I'm seeing is not happening in a lot of categories. There is a whole market out there for buying expireds in the ODP and getting them turned around before the deep crawl each month, which seems to be a booming business right now.

Wooden_Shoe




msg:182053
 10:20 pm on Jan 28, 2003 (gmt 0)

Believe me: today, Jan 28, it is still the four (4) m o n t h old (Sep 25 2002) RDF dump that is being used by the whole world. Nothing has popped up since then out of the hands of the two paid 'employees' of ODP. Maybe not their fault, but good old Netscape should have put a brake on the expansion or should have supplied the system with more people, newer software and/or hardware. It is a joke and has to come to an end or it will........

WebManager




msg:182054
 12:03 am on Jan 29, 2003 (gmt 0)

You just need patience.

Customers click rapidly - but databases update slowly.

It sorts itself out in the end if you're not trying to chase a quick solution.

Put up a great site, and give it time. Think slowly and act slower.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved