Forum Moderators: Robert Charlton & goodroi


canonical tag exploit... beware of the bot

         

minnapple

3:13 am on Feb 24, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Apparently there is a bot checking for errors in canonical tags.

The errors cause little harm UNTIL someone exploits them.

If they exploit the error, the site is toast.

This is all I am going to say on this subject, other than: check your tags closely.

[edited by: minnapple at 4:08 am (utc) on Feb 24, 2015]

brotherhood of LAN

3:29 am on Feb 24, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Very cryptic.

My wild guess is it's something to do with not using absolute URLs.

minnapple

3:36 am on Feb 24, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry, I would like to be more specific, but at the same time I do not want to plant any ideas in the wrong minds.

ganzojin

7:12 am on Feb 24, 2015 (gmt 0)

10+ Year Member



Do you mean doing negative SEO against a wrong canonical URL?

Shai

2:20 pm on Feb 24, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



minnapple wrote: "Apparently there is a bot checking for errors in canonical tags."


By a 'bot' I take it you mean an external naughty type spider... not Google.

rish3

4:22 pm on Feb 24, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I think the guess of "using a relative URL" is probably right, but with a specific type of error: an href that is missing its scheme, like this:

<link rel=canonical href="example.com/content.html" />

Google says it will treat this href as a relative path and interpret the mistake like so:

<link rel=canonical href="http://example.com/example.com/content.html" />

At that point it ignores the tag, because it points to a page that doesn't exist.

That creates an opening for someone to scrape the content and publish it with an absolute URL, maybe even one that ends in the same relative path:

<link rel=canonical href="http://scraper.net/example.com/content.html" />

Perhaps that creates the impression with the search engines that the scraped copy is the original content?
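
If the guess above is right, it is easy to check your own pages for it. Here is a minimal Python sketch, not a definitive tool: it pulls every rel=canonical href out of a page and resolves it against the page's URL with the standard library's urljoin, which follows the usual relative-URL rules. The names CanonicalFinder and check_canonical are made up for this example, and the assumption that a search engine resolves the href the same way comes from the guess in this thread, not from anything confirmed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.hrefs.append(attr["href"])


def check_canonical(page_url, html):
    """Return (href, resolved_url, is_absolute) for each canonical tag in html."""
    finder = CanonicalFinder()
    finder.feed(html)
    results = []
    for href in finder.hrefs:
        # An href without an http/https scheme is treated as relative to the page.
        is_absolute = urlparse(href).scheme in ("http", "https")
        results.append((href, urljoin(page_url, href), is_absolute))
    return results


if __name__ == "__main__":
    page = '<link rel="canonical" href="example.com/content.html">'
    for href, resolved, ok in check_canonical("http://example.com/content.html", page):
        status = "ok" if ok else "NOT ABSOLUTE"
        print(f"{status}: {href} resolves to {resolved}")
```

Anything flagged as not absolute is worth fixing by writing out the full URL, scheme included.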