
MetagerBot/0.8-dev (MetagerBot; http://metager.de; )

Note space before close paren


Pfui

1:20 am on Jul 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



kursix.rrzn.uni-hannover.de
MetagerBot/0.8-dev (MetagerBot; [metager.de;...] )
07/19 15:16:02 /robots.txt

(Yet Another uni class project.)

incrediBILL

11:55 pm on Jul 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's a meta search engine powered by Exalead which is why I didn't understand them needing a bot.

GaryK

10:26 pm on Jul 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



FWIW, the version I saw had two spaces before the close parenthesis.

On one site it did not request robots.txt and fetched a disallowed file. On another site it got into an endless loop of fetching robots.txt and then index.asp until it was eventually stopped by my abuse-detection script.
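The kind of abuse-detection script mentioned above usually boils down to a per-IP request counter over a sliding time window. A minimal sketch of that idea (the function name and thresholds are my own illustration, not the actual script):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60   # look-back window (illustrative threshold)
MAX_REQUESTS = 30     # max requests per IP within the window

_hits = defaultdict(deque)

def is_abusive(ip, now=None):
    """Return True once an IP exceeds MAX_REQUESTS in WINDOW_SECONDS."""
    now = time.time() if now is None else now
    q = _hits[ip]
    q.append(now)
    # Drop timestamps that have fallen out of the window.
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    return len(q) > MAX_REQUESTS
```

A bot stuck in a robots.txt/index.asp loop would trip a check like this within a minute or two.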

Maybe the bot is for some specialized service they provide. The Google and Babelfish translation services didn't do a good job translating the page.

[If only the spell checker could catch grammar mistakes I'd look like the genius I am! :)]

EDIT #2: My notes suggest it's a link checker for the website. The last time I saw a bot from that IP Addy it was: MetaGer-LinkChecker.

[edited by: GaryK at 10:31 pm (utc) on July 23, 2006]

thetrasher

11:21 am on Jul 24, 2006 (gmt 0)

10+ Year Member



Link check is described in their FAQ [metager.de] (#27).

There are three options:
- no link check (fast results) (default)
- test on existence and sort (most current first)
- test on existence and sort according to relevance
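The second and third options above amount to filtering out dead links and re-sorting whatever survives. A rough sketch of the "most current first" variant, assuming each checked result carries an existence flag and an optional Last-Modified timestamp (the field names are illustrative, not MetaGer's actual code):

```python
from datetime import datetime

def sort_most_current_first(results):
    """Keep only results that passed the existence check,
    newest modification date first; undated results go last."""
    alive = [r for r in results if r.get("exists")]
    return sorted(
        alive,
        key=lambda r: r.get("last_modified") or datetime.min,
        reverse=True,
    )
```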

GaryK

3:11 pm on Jul 24, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the info, thetrasher.

Pfui

6:09 pm on Jul 24, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's more than a link-checker, really -- first and foremost it searches other search engines (it's a metacrawler) -- so why does it need its own bot? From a Babelfish-translated [babelfish.altavista.com] version of the site:

MetaGer, the search across
German-language search engines

A service of the RRZN, Leibniz University of Hanover

Enter one or several search words:

powered by: [Exalead logo; www.exalead.de]

And from the FAQ:

1. What is MetaGer? MetaGer is a search engine that queries German-language (and, if desired, international) search services in parallel for the search words you enter and combines all the results. Such a thing is called a meta search engine.

The code given (for copy-pasting into one's site) also shows the various sources, from dMoz to Wikipedia.

I'm not clear on what FAQ #27 means from thetrasher's info (thanks!) and this translation --

27. Before results are displayed, does a check take place as to whether the found links actually exist? If you click the switch "test on existence and sort according to modification date", exactly that happens. During the check, the date of the last modification of a document is also determined (if it exists), so that the hits are additionally sorted by modification date.

However, MetaGer does not wait forever for the answer from a poorly reachable web server, but at most approx. 5 seconds. Below each hit you now receive information about its status: "status: unknown" on a timeout, or "status: exists (modification date)". Hits which, according to the web server, no longer exist (but are still registered with the search services) are no longer displayed.

-- but all of that, plus the Exalead (read: exabot) connection, plus the robots.txt call, almost sounds more like a web ping or HEAD-check than a link-checker per se.
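The behavior the translated FAQ describes -- a quick probe capped at roughly 5 seconds, "status: unknown" on a timeout, and 404s dropped from the results -- can be sketched like this. The fetch function is injected so the logic is testable without a network; all names here are illustrative guesses, not MetaGer's code:

```python
def check_link(url, fetch, timeout=5.0):
    """Probe a URL and classify it the way the translated FAQ describes.
    `fetch(url, timeout)` should return (status_code, last_modified)
    or raise TimeoutError for a slow/unreachable server."""
    try:
        status_code, last_modified = fetch(url, timeout)
    except TimeoutError:
        return {"url": url, "status": "unknown"}
    if status_code == 404:
        return None  # dead links are dropped from the SERP entirely
    return {"url": url, "status": "exists", "last_modified": last_modified}
```

An HTTP HEAD request would be enough to get the status code and Last-Modified header, which may be why the traffic looks like a HEAD-check rather than a full spider.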

And it's that Exalead connection that troubles me. E.g., here are the first results of the SERPs [rzserv.rrzn.uni-hannover.de] for "webmasterworld.com" (no quotes):

1 ) *** www.webmasterworld.com/ forum92/
QCheck:
[webmasterworld.com...]

* (found by exalead.de) (no description)

[webmasterworld.com...]

* (found by exalead.de) (no description)

Bottom line, for me, is that there's no reason for it to (also) spider my stuff as MetagerBot, and certainly not if it's using other search sites'/bots' SERPs, too. Show a link to a page found via another site's link, cool. But spider, or even link-check on its own? Nah.

thetrasher

12:26 pm on Jul 25, 2006 (gmt 0)

10+ Year Member



first and foremost it searches search engines (metacrawler) -- so why does it need its own bot?
Correct, MetaGer does not have a crawler or search index of its own; this (old) service uses other search engines - including Exalead (if selected). If desired, the results/links delivered by all selected search engines are checked before they are displayed. The bot checks whether the found sites really exist. 404 pages are not shown in the SERPs - only valid links are presented.

MetagerBot arrives when a user selects "check existence", searches for a phrase, and your website was found by the queried search engines. In that sense, it's a link-checker and nothing more.

EDIT: This bot also visits you if someone clicks the "QuickCheck" [mserv.rrzn.uni-hannover.de] link to test whether a found website really exists and contains the search phrase: [metager.de ]. It shows a snippet with up to 5 sentences containing the search phrase. That's really more than a simple link check.

Pfui, you can also search for "webmasterworld.com" without Exalead [jserv.rrzn.uni-hannover.de]. I think Exalead paid for that logo advertisement and for being searched by default. That's the only "connection".

Pfui

7:19 pm on Jul 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Danke, thetrasher. (Ich spreche nicht Deutsches:) Seems like [google.de...] would be easier to use, and easier on sites, too.