
Metadata: A comeback?

Gigablast will use geolocation metadata

         

Allergic

3:19 pm on Sep 18, 2003 (gmt 0)

10+ Year Member



Today Gigablast implemented support for geolocation metadata [gigablast.com].
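Based on the tag names discussed later in this thread, a page using Gigablast's geolocation tags would presumably look something like this (the exact names and accepted values are whatever Gigablast's announcement specifies; the values here are made up for illustration):

```html
<!-- Hypothetical sketch of Gigablast-style geolocation meta tags;
     tag names taken from this thread, values invented -->
<meta name="city" content="Albany">
<meta name="state" content="NY">
<meta name="zipcode" content="12207">
<meta name="country" content="USA">
<meta name="language" content="English">
<meta name="classification" content="widget repair">
```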

The Canadian government has also been using Dublin Core metadata [cio-dpi.gc.ca] for a few months now.

Will other engines follow, and will the semantic web finally get started?

jeremy goodrich

5:42 pm on Sep 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Those tags will never work long term; they're too open to fraud, misrepresentation, etc.

Is there a character limit to the "zip" meta tag, or can my nationally focused websites just put in a whole laundry list? ;)

Not to mention my wife is from a country that doesn't use 'zip codes' 'street numbers' or anything remotely like that - each house has a special name, and that is the address.

I guess the feature is for US users only. Even so, I suspect it won't work that well long term, though for people who classify their documents to the fullest extent possible, I suppose it will help their Gigablast rankings.

mattdwells

5:58 pm on Sep 18, 2003 (gmt 0)

10+ Year Member



I agree that the percentage of fraud will be high initially, but as the spammers get manually filtered out, the good guys will remain, and the percent of fraud will decrease over time. Yes, of course, misrepresentative pages will continue to be added to the index, but they will not be cumulative like the good pages.

I really don't see this as being a whole lot different from the fraud and spam I already have to deal with on a daily basis. A lot of pages contain random, unrelated words; now they'll contain random, unrelated zip codes. But as time goes on, the spam will be diluted. To further abate the problem, more popular/legitimate pages get higher scores for these new meta tags.

Matt Wells

jeremy goodrich

6:01 pm on Sep 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ok, that all makes sense. Thanks for clearing that up.

>>up to 400 million pages

Not too shabby, sounds like the index is getting up there in size.

Allergic

8:29 pm on Sep 18, 2003 (gmt 0)

10+ Year Member



I personally think they will come back, because big companies and government sites are starting to use internal search engines like the Google Appliance, and those engines have the option to take metadata into account. When you're looking for something on a site with over 300k pages, you've got to classify your information.

I also think it is pretty easy for an algorithm to look at the metadata and at the rest of the page to see whether the keywords match. We'll see in 2-3 years.

mole

8:48 pm on Sep 18, 2003 (gmt 0)

10+ Year Member



It would be really great if search engines took up the geographical bit and did at least *something* useful.

Even if it were only to allow, say sites which deal with locally-provided services to be classified by locality.

So if I want to look for a specialist widget repairer in North London for example, I don't get SERPs which include all sorts of dumb matches on Widget, or London, or North, or Repair.

Sites which provide regional, national, or even worldwide services could be classified as exactly that, so we know what we're getting.

Sites which are informational, not service / sales oriented need not be classified at all by locale.

Just some odd ideas; I don't know if anyone's thinking along the same lines.

mbauser2

5:06 am on Oct 4, 2003 (gmt 0)

10+ Year Member



Matt,

As far as metadata schemes go, that one's all screwed up. I'm not even sure where to start, so I'll start with the worst one:

classification -- I'm not even sure what you're doing with this one. Is it a categorization thing? If so, it needs a set of available categories, or it'll just become a useless synonym for "keywords".

zipcode -- Jeremy's got a point, and skipped over a bigger point: lots of countries use postal codes ("ZIP" is a U.S. abbreviation) with different schemes. If that attribute value is going to be useful internationally, the tag at least needs a scheme attribute [w3.org] to indicate which country's postal system is being referenced.

country -- Using unqualified (free-form) strings in the country field is like putting a "Kick Me" sign on your back. You're going to get dragged into annoying political debates like "Is this China or Taiwan?" and "How do you spell Corea?" Pass the buck to the ISO like everybody else does, and use the ISO 3166 codes [directory.google.com]. You'll save yourself a lot of grief in the long run.

language -- Ditto, sort of. Use the ISO 639 codes [unicode.org], combined with the ISO 3166 codes to identify national dialects. Better yet, lose the META tag and tell people to put a lang attribute [w3.org] on the HTML element. They're supposed to be doing that anyway.

city, state, country, zipcode (in general) -- Geotags [geotags.com], Syndic8 [syndic8.com], and GeoURL [geourl.org] already have geolocation metadata schemes, and most of them are better thought out. Maybe you should adapt one (or more) of those?
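Putting those suggestions together, a more robust version of the markup might look something like this (a sketch only: the scheme attribute follows the HTML 4.01 convention mbauser2 links to, the codes are ISO 3166 and ISO 639, and none of this is an actual Gigablast format):

```html
<!-- Sketch: ISO-coded variants of the same tags -->
<html lang="en-US">  <!-- ISO 639 language code + ISO 3166 country code -->
<meta name="country" scheme="ISO3166" content="US">
<meta name="language" scheme="ISO639" content="en">
<meta name="zipcode" scheme="US-ZIP" content="12207">
```

The lang attribute on the html element replaces the language meta tag entirely, as mbauser2 suggests; the scheme attribute makes the postal-code and country values unambiguous across borders.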

amznVibe

5:14 am on Oct 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a lot of fun with GeoURL.com; the guy who runs it has a great system. Simple decimal lat/long is the way to go.

Every few days I check within 35 miles of my town and find other local bloggers (and sometimes interesting businesses). The people who do it now seem to be a little bit more savvy so their sites are usually very interesting.
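For reference, the GeoURL-style markup being described is just a decimal lat/long pair in a meta tag; GeoURL reads the ICBM tag, and the GeoTags-style geo.position tag carries the same information (the coordinates below are made up):

```html
<!-- GeoURL-style location tags; coordinates invented for illustration -->
<meta name="ICBM" content="42.6526, -73.7562">
<meta name="geo.position" content="42.6526;-73.7562">
```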

mbauser2, thanks for all those links; I didn't know about Syndic8, and their page was very helpful!


amznVibe

5:38 am on Oct 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Gigablast also needs to add context to those meta tag names:

meta name="classification" is a bad approach

meta name="giga.classification" is a much better way to avoid stepping on other tags
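This mirrors what Dublin Core already does with its DC. prefix. A namespaced version of the tags would look like this (a sketch, not an actual Gigablast format; the DC.subject line shows the existing Dublin Core convention for comparison):

```html
<!-- Hypothetical "giga." namespaced tags, following the DC.* convention -->
<meta name="giga.classification" content="widget repair">
<meta name="giga.zipcode" content="12207">
<meta name="DC.subject" content="widget repair">  <!-- Dublin Core equivalent -->
```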

amznVibe

5:30 am on Oct 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ah, some hope for standards! The Dublin Core (can someone explain that name to me?) Metadata Initiative [dublincore.org] is holding its Dublin Core 2003 conference in Seattle [dc2003.ischool.washington.edu] right now. (IAslash.org has some blog links for live reviews, etc.)