Google & The ODP

I think this is getting out of control

         

Chico_Loco

2:47 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know this has been discussed before but I think it deserves some more time.

If Google are trying so hard to have their database as "fresh" as can possibly be, why would they still use DMOZ as the source for their directory? It has been well over 3 months since the ODP has updated ANYTHING; it's absolutely absurd. As though the DMOZ wasn't already known for being a useless resource, this just makes it worse for them.

I can understand that they had problems when upgrading their servers etc., and that's fine, but holy god, 3 months? I personally could manage to completely redo anything that needs redoing in 3 months, and I'm just one guy, and probably not as good at IT as the guys they have, what on earth are they doing over there? Could they be on strike?

If the new hardware doesn't work, they should probably get a refund before that 12 month warranty expires, as it's fast approaching.

Am I wrong?

The Contractor

2:53 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If Google are trying so hard to have their database as "fresh" as can possibly be, why would they still use DMOZ as the source for their directory? It has been well over 3 months since the ODP has updated ANYTHING; it's absolutely absurd. As though the DMOZ wasn't already known for being a useless resource, this just makes it worse for them.

Uhmm... Google crawls the ODP daily so it doesn't rely on just the RDF dump.

I personally could manage to completely redo anything that needs redoing in 3 months, and I'm just one guy, and probably not as good at IT as the guys they have, what on earth are they doing over there? Could they be on strike?

Well maybe you should as there happens to be only one person that takes care of the whole backend now. I suggest anyone that states they can do so - do it!

My 2-cents
{edited} for my spelling ;)

heini

3:01 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




>Google crawls the ODP daily so it doesn't rely on just the RDF dump

Perhaps we should put up a huge banner on this forum's index page with just that text... ;)

The Contractor

3:08 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perhaps we should put up a huge banner on this forum's index page with just that text.

Or dmoz should. I don't understand why people don't realize that "most" directories and/or "news" sites along with many other sites get crawled daily... freshbot anyone ;)

kfander

4:37 pm on Dec 21, 2002 (gmt 0)

10+ Year Member



I've had a site added to the ODP since the last RDF dump, and it made it into Google as quickly as it would have before the RDF problems. This seems to be much ado about nothing, IMO.

Chico_Loco

4:43 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah but go scan the Google Directory, I'm finding quite a few 404's

europeforvisitors

5:04 pm on Dec 21, 2002 (gmt 0)



I've had a site added to the ODP since the last RDF dump, and it made it into Google as quickly as it would have before the RDF problems. This seems to be much ado about nothing, IMO.

No, it's much ado about an outdated directory.

Google may crawl the ODP daily, but that crawl isn't used to keep the Google Directory up to date. The Google Directory is assembled from the RDF data.

The Contractor

5:21 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google may crawl the ODP daily, but that crawl isn't used to keep the Google Directory up to date. The Google Directory is assembled from the RDF data.

And that affects who? The webmasters that check the little green bars next to the sites in their category ;)

Sorry, you will still be given credit for the link in dmoz whether you are in the directory/rdf dump or not. On the flip-side if you are in the directory and your site is 404 who is that affecting? I have seen many 404's that are not in the ODP at all - I fail to see the point?

Napoleon

6:24 pm on Dec 21, 2002 (gmt 0)



>> I fail to see the point? <<

I think the point is that some people just like moaning about DMOZ for the sake of it, whether there is foundation or not.

>>As though the DMOZ wasn't already known for being a useless resource<<

It's actually known as an outstanding resource old chap. The best directory on the web by far. That's why Google and thousands of portals use it.

It's free, it's enormous and tens of thousands of people put a hell of a lot of effort into maintaining it. They do a great job.

The problem is that it is so easy for other people to stand on the sidelines and throw stones at it for their own reasons.

>> Am I wrong? <<

Too right you are.

frontpage

8:46 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google crawls the ODP daily so it doesn't rely on just the RDF dump

I know this is definitely not true in all cases.

The google directory for one particular category lists 37 links while the same dmoz directory lists 40 links. It has been this way for months.

It is not a case of 404 or spam or even irrelevant links, the dmoz category simply never gets updated by google.

If this forum would allow examples to assist webmasters, I would illustrate several examples.
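The kind of comparison described above (37 links in one listing, 40 in the other) can be sketched as a set difference between the two category listings. This is only an illustrative sketch: the function name `compare_category` and the example URLs are hypothetical, and it assumes both listings have already been collected by hand as plain lists of URLs.

```python
def compare_category(dmoz_urls, google_urls):
    """Compare two listings of the same category (hypothetical helper).

    Returns the URLs present in one listing but not the other.
    """
    dmoz = set(dmoz_urls)
    google = set(google_urls)
    return {
        "missing_from_google": sorted(dmoz - google),
        "only_in_google": sorted(google - dmoz),
    }


# Made-up example data: three sites listed at DMOZ, two in the Google Directory.
dmoz_listing = ["a.example", "b.example", "c.example"]
google_listing = ["a.example", "b.example"]

report = compare_category(dmoz_listing, google_listing)
print(report["missing_from_google"])
```

An entry under `missing_from_google` would be a site added at DMOZ after the last RDF dump; an entry under `only_in_google` would be a site since removed from DMOZ.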

The Contractor

9:14 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know this is definitely not true in all cases.

The google directory for one particular category lists 37 links while the same dmoz directory lists 40 links. It has been this way for months.

I have never stated that Google updates its directory by crawling. I stated that it crawls ODP for new/changed listings ;)

Chico_Loco

10:15 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



And that affects who? The webmasters that check the little green bars next to the sites in their category

I'm not too worried about the PR given by the link on the DMOZ pages. I'm frustrated that that link won't be carried over to the Google Directory and I'm missing out on THAT PR, which after all will most likely be higher, as the directory pages in the Google Directory nearly always have a higher PR than their counterparts on the DMOZ site.

So Does this affect just me?

NO - It affects the entire Google database, think about it. Those old links that have transformed into spam and 404's are getting PR that they don't deserve (and therefore higher ranking), and those new sites (which Google strives so hard to find) don't get the best PR possible because they are missing out on the PR from the Google Directory pages, accordingly, these new sites don't rank as high, and are in my opinion receiving an unfair mini-penalty.

The Contractor

10:37 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Chico_Loco,

I still don't understand your logic. Your link in dmoz is counted whether it is in the Google Directory/RDF Dump or is picked up by crawling. As far as the sites that are 404 or switched to porn - how is this affecting any site?

If the #1 position for "widget shopping" is 404 or has switched to porn - how is this going to affect positions 2-2million? Do you feel you are losing customers/visitors to 404's or irrelevant sites?

The day that you worry about losing visitors to 404s or irrelevant sites is the day you need to hang up your webmaster's hat. I do not mean this in a smart-a## way.

kctipton

10:56 pm on Dec 21, 2002 (gmt 0)

10+ Year Member



Hey, if you want to make Google feel ashamed for using ODP then contact them directly. What has this thread accomplished other than demonstrating that ODP-bashing is popular? I don't believe Google will blush since they are far and away The Big Thing in searching. They're doing things right, not wrong.

cornwall

11:06 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What has this thread accomplished other than demonstrating that ODP-bashing is popular

Well, it is Christmas, isn't it. :)

rogerd

11:17 pm on Dec 21, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I think an out-of-date Google Directory IS a problem. I know that I like to use it for certain kinds of searches, and the integration with PageRank makes it far more valuable than the alpha-sorted directory visible at dmoz.org.

If the current owners of the ODP aren't willing to finance it well enough to operate, maybe they should explore spinning it off and seeking help from some major foundations to fund it as a non-profit. I really believe the ODP can be a great thing, and I try to do my tiny part to help it. (Hats off to those who REALLY put in the hours!) It's frustrating, though, when system performance is so slow as to discourage editing, when some cats haven't been updated in years, and when owners of sites in my little category write me to complain about the search function not working and other sites (e.g., Google) not having up to date info.

I think it's possible to discuss the current issues affecting DMOZ without bashing it. There ARE some real problems. Dismissing them or minimizing them won't help. I wish I could offer some solutions, but short of taking up a collection to hire more sys admins and programmers, I'm fresh out of ideas...

rafalk

11:43 pm on Dec 21, 2002 (gmt 0)

10+ Year Member



I think it's possible to discuss the current issues affecting DMOZ without bashing it.

I completely agree with you; however, the complaints that constantly appear on this forum are the ones that editors are powerless to address. This is why there's this sense of frustration on the part of ODP editors. Complain about abuse and you'll see it taken care of the same day. OTOH, if you complain about the RDF dump and how it's taking forever, well then we're just as powerless to change anything as you are.

frontpage

12:41 am on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Uhmm... Google crawls the ODP daily so it doesn't rely on just the RDF dump.

Later the same day.

I have never stated that Google updates its directory by crawling. I stated that it crawls ODP for new/changed listings

What exactly is Google doing? Updating or crawling for new/changed listings in ODP...what exactly are you trying to say?

Dante_Maure

12:57 am on Dec 22, 2002 (gmt 0)

10+ Year Member



What exactly is Google doing? Updating or crawling for new/changed listings in ODP...what exactly are you trying to say?

I believe what was being expressed is that Google crawls ODP for new/changed sites for inclusion in their main database (which drives the primary Google SERPs), while the RDF dump is what feeds the Google Directory.

gimmster

1:04 am on Dec 22, 2002 (gmt 0)

10+ Year Member



OK
If you use Google to do a *search* it will include listings that have been added to DMOZ lately because it has found those sites while crawling.

However, if you look in the Google *directory* those sites will not appear, because the Google directory is a version of the DMOZ RDF with modifications by Google.

The two are completely independent of each other, with the exception that a Google *search* will also return results from the Google *directory*.

Clear as mud? :)
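The two paths described above can be sketched as a toy model. Everything here is a labeled assumption for illustration: the function names, the site names, and the idea that each path is just a set of URLs. It is not a description of how Google actually builds either product.

```python
def build_search_index(live_odp_listings):
    """Toy model: the search index sees whatever a crawl of the live ODP pages finds."""
    return set(live_odp_listings)


def build_directory(rdf_dump_listings):
    """Toy model: the directory only sees what was in the last RDF dump."""
    return set(rdf_dump_listings)


# Hypothetical data: new-site.example was added to the ODP after the last dump.
live_odp = {"old-site.example", "new-site.example"}
last_rdf_dump = {"old-site.example"}

search_index = build_search_index(live_odp)
directory = build_directory(last_rdf_dump)

# Sites that are searchable but absent from the directory until the next dump.
print(sorted(search_index - directory))
```

Under these assumptions, a stalled dump leaves the directory frozen while the crawl-fed search index keeps picking up new listings.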

steveb

1:20 am on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"the Google Directory nearly always have a higher PR than their counterparts on the DMOZ site .."

Is this true for anyone? I've never seen them anything but exactly the same -- except now when DMOZ is higher in a few spots because Google is using the dump from when Dmoz seemed to be devalued a point a few updates ago.

The effect of the Google directory not being updated is the impact of one quality link. Sites listed in Dmoz the past couple months are artificially lower in pagerank and thus a bit less in search rank than they would be if the directory was updated. A few phantom sites get pagerank they shouldn't. Neither of these is good but it is certainly not a meaningful impact on more than 1% of the Internet.

Google crawls DMOZ all the time. New sites get a nice benefit out of being listed. Some folks just seem to not be able to comprehend the concept that the dump only affects the Google directory and the jillion little mirrors out there. That is something, but the dump is obviously not DMOZ's reason for being. The OPEN-directory itself is the main thing.
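The "impact of one quality link" point can be illustrated with the textbook PageRank iteration on a three-page toy graph. This is a sketch under stated assumptions: the graph, the page names, and the 0.85 damping factor are made up for the example, and the real ranking system is certainly more involved than this.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration on a small link graph.

    `links` maps each page to the list of pages it links to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical graph: the DMOZ category links to the new site in both cases;
# the Google Directory copy links to it only once the dump has been refreshed.
stale = {"dmoz_cat": ["new_site"], "google_dir_cat": [], "new_site": ["dmoz_cat"]}
updated = {"dmoz_cat": ["new_site"], "google_dir_cat": ["new_site"], "new_site": ["dmoz_cat"]}

print(pagerank(stale)["new_site"], pagerank(updated)["new_site"])
```

In this toy graph the new site scores slightly higher once the second directory page links to it, which matches the claim that the stale dump costs a listed site one link's worth of rank and not its whole standing.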

fathom

1:38 am on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"the Google Directory nearly always have a higher PR than their counterparts on the DMOZ site .."
Is this true for anyone? I've never seen them anything but exactly the same -- except now when DMOZ is higher in a few spots because Google is using the dump from when Dmoz seemed to be devalued a point a few updates ago.

In my particular case... most of DMOZ cats are higher, but I provide a backlink to DMOZ on all pages that correspond to that listing, for a few reasons:

1. quality of complementing content.

2. authority site backlinks

3. product/service listings -- comparative assessment: unique selling points, price, etc.

4. slight boost in PageRank

5. strategic planning

The effect of the Google directory not being updated is the impact of one quality link. Sites listed in Dmoz the past couple months are artificially lower in pagerank and thus a bit less in search rank than they would be if the directory was updated. A few phantom sites get pagerank they shouldn't. Neither of these is good but it is certainly not a meaningful impact on more than 1% of the Internet.

The backlinks to DMOZ help a lot here; Googlebot and "Minty Refresh" crawl through the link continuously.

Google crawls DMOZ all the time. New sites get a nice benefit out of being listed. Some folks just seem to not be able to comprehend the concept that the dump only affects the Google directory and the jillion little mirrors out there. That is something, but the dump is obviously not DMOZ's reason for being. The OPEN-directory itself is the main thing.

It does suck to wait, but I suspect the wait will be worth it.

The Contractor

3:34 am on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Dante_Maure,
Thank you, as that is exactly what I was stating ;)

kfander

6:16 am on Dec 22, 2002 (gmt 0)

10+ Year Member



>> No, it's much ado about an outdated directory. <<

While it's nice to see my site in the Google directory, I doubt that very many of my site visitors find me via that route. I do okay in the SERPs, and any problems that the ODP may be having with their RDF dumps doesn't seem to have adversely affected that, at least not in any large way. Perhaps I'd do better if the RDF dumps were working, but I have sites that aren't even in the ODP that do okay on Google SERPs so ... Don't get me wrong, I'd like to see it fixed too, but it's not terribly high on my list of frustrations, that's all.

vitaplease

7:11 am on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Chico_Loco is right, it is not in Google style to be so "stale".

Googleguy cannot help mentioning "minty fresh" in any thread that compliments Google on its up-to-dateness ;)

I just hope it's a sign that new spectacular things are going to happen, such as e.g. an option to opt in for categorisation [uk.altavista.com].

Pity, Google directory does not mention the number of listings per cat as DMOZ does. Makes the comparison easier.

chiyo

7:23 am on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google does not claim to be "minty fresh". It just uses freshness as a criterion for respidering some pages that change more quickly or have high PR, and perhaps for ranking new pages higher for a few days. There are no claims that it relates to ALL their listings, NOR to the directory or other specialist searches.

Chico_Loco

3:27 pm on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think most people in here have taken me up wrong, except for vitaplease.

What I'm trying to state simply is this:

If Google spends thousands of man-hours, terabytes of bandwidth, terabytes of storage, equating to hundreds of thousands of dollars (if not more) to maintain and boast their minty-fresh, monthly updated database, why on earth have a directory that is over 3 months old (will be 4 months on Jan 22nd) and goes essentially against everything they are trying to promote themselves as being?

As for the comment I made about whether or not the Google Directory pages have higher PR than their DMOZ counterparts, well obviously this can't be true in every instance, but it does seem to be a majority case, at least for those which I have seen.

I'm not bashing (or at least I don't intend to)... I commend those people who spend their voluntary time keeping this giant directory moving (except those with ulterior motives), I'm not bashing them at all... I'm just stating my feelings about a technical team which could either be managed better, or consist of somewhat better people (not that those on the DMOZ team aren't doing their best)...

Seems to be a lot like the parent company's product (Netscape Browser): slow, immobile, full of bugs. :):):):)

EliteWeb

4:40 pm on Dec 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As stated earlier, DMOZ has not been halted by the RDF dump delay (which should be resolved any day now). Just because the data has not been dumped doesn't mean that DMOZ isn't updated. Once I say "yes, add this link to DMOZ", it's there. Google then finds it however it may (by a crawl).

If you read about the DMOZ dump you will find the problem isn't only hardware related; the dump is also delayed because of errors in the database.

4 days till Christmas!

Dumpy

4:56 pm on Dec 22, 2002 (gmt 0)

10+ Year Member



I notice tags.new.html with 0 entries and a current date in the RDF Dump area of DMOZ. Does this indicate another FAILURE?

I'm sure everyone would agree that it is time for someone to publish SOMETHING on DMOZ concerning their failure to update and what EXACTLY is the plan to correct it, if any. The DMOZ operation is suffering from a lack of responsibility to its users. What happened to its "Social Contract"?

Dumpy

5:07 pm on Dec 22, 2002 (gmt 0)

10+ Year Member



Here is Netscape's Social Contract:

Are they in BREACH?

Our Social Contract with the Web Community

Netscape Communications Corporation hosts and administers the Open Directory Project (ODP), and has discretion over its content, use, and operation as described in the ODP's Terms of Use. The ODP is an Open Source inspired initiative created and maintained by a vast, global community of volunteer editors. The following is a social contract that we created to reflect Netscape's commitment to the Web community to keep the ODP a free and open resource. It has been inspired by, derived from, the Debian Social Contract.

1. The Open Directory Will Remain 100% Free

We promise to keep the distribution of ODP data, and the submission process to this data, entirely free. We will support our data users who choose to add propriety and revenue generating content, and other non-free value-added functionality upon versions of the ODP in which they download. In turn, data users agree to attribute use back to us per the free use license.

2. We Give Back to the Web Community

We license our content as free with attribution back to the ODP. We will make the most comprehensive, user-friendly directory possible, so the content and taxonomy will be widely used and distributed. We will do our best to list web sites in a fair and impartial manner, and consider all user requests and suggestions for improvement.

We will make every effort to build a high quality and comprehensive directory. We will make every effort to evaluate all sites submitted to the directory. However, we do not guarantee all submitted sites will get listed. We will be highly selective and judicious about sites we add, and how we organize them. Sites that we do not routinely list are outlined in our submission policies and editorial guidelines.

We will protect the ODP's intellectual property from infringement. Netscape owns the rights to the compilation of the Open Directory, as well as to individual contributions. However, we provide non-exclusive, royalty free rights back to our editors for their personal contributions, so, they can present their material to the greater Web community in other ways if desired.

3. We Don't Hide Our Official Editorial Policies

We will keep all official ODP editorial guidelines and policies open for public view at all times.

4. We Provide an Open Invitation to Join

We extend an open invitation to the general public to join to the ODP. Our community is a diverse group of subject experts and web aficionados. Our categories attempt to express the depth and breadth of human knowledge. We accept editors from all walks of life, and we attempt to represent all points of view. We will keep our application process open to anyone interested in joining. Each new editor application will be reviewed by a member of the ODP community.

Our application process is necessarily selective due to our commitment to building a quality resource. Not all applications will be accepted. We do not endorse any formula for the perfect application, however our intent is to accept applications that show fairness, impartiality, objectivity, and "fit" within our editorial guidelines and codes of conduct.

5. We Encourage a Self-Regulating Community

We foster a self-regulating community governed by community-driven standards. We encourage the community to regulate itself, and to provide the checks and balances needed to ensure that its members follow mutually accepted codes of conduct and editorial standards. We depend on the honesty and integrity of the volunteer editors to ensure the directory is high quality, user-friendly, and free of abuse.

6. Our Priorities are Our Data Users and the Community

We will be guided by the needs of our data users and the ODP editorial community. We will place their interests first in our priorities. We allow others to create value-added distributions containing ODP data and data from other commercial and noncommercial sources, subject to the terms of the free use license, without any fee from us.

7. Users Not Meeting The Free Use License

In order for the ODP to continue to flourish as a free and open resource, it is critical that our users comply with our free use license. We do not permit unattributed use of our data, and will request data users to place the attribution on their site or remove the data entirely if they wish not to comply. We consider unattributed use a legal infringement of the free use license, and contrary to the ODP's purpose as an Open Source inspired initiative.
