Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Another way to get incoming links?
JohnRoy




msg:4290800
 4:10 pm on Apr 1, 2011 (gmt 0)

If a page displays a line of text that looks like a link, Google will try to crawl it even though it's not a live URL.

While analyzing Webmaster Tools crawl errors, in an effort to fix any 404s on a site, I noticed the following.

Example below.
example.com/search?query=good+nice+cute+hot+chocolate+clock

The directory would show a live link, a description, and the URL address.
If the URL is longer than about 80 characters, it would trim it and show example.com/article/good_nice_cute_hot_choco...

WMT shows that "good_nice_cute_hot_choco..." returned a 404.

Is that another way to target incoming links?

[edited by: tedster at 4:50 pm (utc) on Apr 1, 2011]
[edit reason] switch to example.com [/edit]

 

tedster




msg:4290883
 6:54 pm on Apr 1, 2011 (gmt 0)

Is this what you're saying - Google sees the alphanumeric string that is only a variable for your script and treats it like a URL in its own right?

aristotle




msg:4290888
 6:58 pm on Apr 1, 2011 (gmt 0)

What you're seeing might be auto-generated spam links that have been improperly formatted.

crobb305




msg:4290898
 7:04 pm on Apr 1, 2011 (gmt 0)

I have also noticed an upswing in 404s on urls where Gbot is requesting <hr> in place of an underscore. WMT is reporting only 1 discovery for those urls, so I checked the sources and they are linking correctly. I set up a 301 for those, but if the upward trend continues, I will be setting up 301s for everything that Google can't follow correctly.
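For anyone wanting to automate that kind of 301, here is a minimal sketch. It assumes the bad requests contain a literal `<hr>` (raw or percent-encoded) where the underscore should be; the post doesn't show the exact request format or how the redirect was actually set up. The function name `redirect_target` and the example paths are hypothetical.

```python
from urllib.parse import unquote


def redirect_target(requested_path, known_paths):
    """Return the path to 301 to, or None if no safe redirect exists.

    Assumption: the malformed request has "<hr>" where the real URL
    has an underscore. The request may arrive percent-encoded.
    """
    decoded = unquote(requested_path)          # "%3Chr%3E" -> "<hr>"
    candidate = decoded.replace("<hr>", "_")   # undo the assumed mangling
    # Only redirect when something changed and the result is a real page.
    if candidate != decoded and candidate in known_paths:
        return candidate
    return None


pages = {"/widgets/blue_clock"}  # hypothetical set of valid paths
print(redirect_target("/widgets/blue%3Chr%3Eclock", pages))
```

Checking the result against a list of known pages keeps the rule from redirecting to yet another 404.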

JohnRoy




msg:4290905
 7:08 pm on Apr 1, 2011 (gmt 0)

The URL I posted was edited/modified, but the original URL was from a site/directory that showed the following

1. <link>Click Here</link>
2. Description. Included about 1 or 2 lines of search engine scraped data.
3. www.address (not a link). If too long: www.beginning_of_address followed by "..."


Google tries to crawl item 3 (and reports 404 in WMT).

Yes, it's all because of auto-generated spam links, but they have been properly formatted.
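A minimal sketch of the trimming described in item 3, assuming the directory simply cuts the displayed address at a fixed length and appends "...". The function name `display_url`, the full example URL, and the 44-character limit are hypothetical; the limit is chosen only so the output matches the example earlier in the thread, where the poster mentions about 80 characters.

```python
def display_url(url, max_len=80):
    """Trim a URL for display, appending '...' when it is too long."""
    if len(url) <= max_len:
        return url
    return url[:max_len] + "..."


# Hypothetical full address; the real one was edited out of the thread.
full = "example.com/article/good_nice_cute_hot_chocolate_clock"
shown = display_url(full, max_len=44)
print(shown)
```

The displayed string is no longer a valid address, so any crawler that requests it as-is gets a 404 from the target site.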

crobb305




msg:4290907
 7:13 pm on Apr 1, 2011 (gmt 0)

I can't tell if you're saying that Gbot is following a link that isn't a link or incorrectly following live links that are properly formatted? I'm confused on that point, but I am seeing lots of 404s on correctly-formatted links being reported in my WMT. I have wondered if Googlebot can create links from text, for cases where people may type a website address in context but not make it an href (which also sounds like what you're reporting?)
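To illustrate the idea being asked about here, this is a rough sketch of how any crawler could pick URL-looking strings out of unlinked text. It is not Google's actual method, which nobody in this thread knows. The pattern `URL_LIKE` and the function `find_url_like_text` are made up for illustration.

```python
import re

# Hypothetical pattern: optional scheme and "www.", a domain ending in an
# alphabetic TLD, then an optional path running up to whitespace or a
# quote/angle bracket.
URL_LIKE = re.compile(
    r'\b(?:https?://)?(?:www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}'
    r'(?:/[^\s<>"]*)?',
    re.IGNORECASE,
)


def find_url_like_text(text):
    """Return every substring of plain text that looks like a URL."""
    return [m.group(0) for m in URL_LIKE.finditer(text)]


sample = "Visit example.com/article/good_nice_cute_hot_choco... for more"
print(find_url_like_text(sample))
```

Note that a naive pattern like this one swallows the trailing "..." as part of the path, which would produce exactly the kind of dead URL reported in WMT.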

tedster




msg:4290955
 8:23 pm on Apr 1, 2011 (gmt 0)

This information occurs on the other website, not yours - correct? That means you don't need to worry about the 404 messages. They are just there for your information in case you WANT to make some changes in how your server responds to those requests.

But you are also saying that, maybe, you can get Google "credit" for an inbound link (assuming your server replies with a 200 OK status) from unlinked text on another website that looks like a URL.

Have I got that right?

(My apology for any confusion caused by the edit,
but we will not ask people to visit and analyze other
specific sites, especially spammers and scrapers.)

Sgt_Kickaxe




msg:4291003
 9:49 pm on Apr 1, 2011 (gmt 0)

You get credit for a mention, the site was obviously talking about your site. Google never said a link had to be an href, in fact it can be anything they can quantify as "buzz".

How much value is there in that? Only Google knows. I wouldn't go pasting your address everywhere :-)

indyank




msg:4291109
 3:59 am on Apr 2, 2011 (gmt 0)

What? Since when did Google start crawling links that aren't clickable? Why do they want to do that?

We often turn links in the comment section into links that aren't clickable. Why should Google crawl them when the site on which they appear doesn't want them to?

It is a big joke if they do that.

indyank




msg:4291110
 4:05 am on Apr 2, 2011 (gmt 0)

Why is google not providing information to the owner of the site on which a broken link appears?

None of the webmaster tools (including Bing's) report broken links on the sites on which they appear. Why is that?

JohnRoy




msg:4291396
 4:59 am on Apr 3, 2011 (gmt 0)

"I can't tell if you're saying that Gbot is following a link that isn't a link or incorrectly following live links that are properly formatted?"
Sorry for the confusion. Gbot is following a link that isn't a link.

"Google never said a link had to be an href."
Google did not invent the internet. A link, up until now, at least to me, has meant an href.

"But you are also saying that, maybe, you can get Google 'credit' for an inbound link (assuming your server replies with a 200 OK status) from unlinked text on another website that looks like a URL. Have I got that right?"
Yep, that was the question I had in mind.


"Why is google not providing information to the owner of the site on which a broken link appears?"
(Disclaimer: I'm not a G spokesman.) Unless that site is hooked up with them, such as by using WMT, Google, like any other search engine or any other company, has no relationship with the site owner.
For your own site, you may wish to use a tool such as Xenu.

JohnRoy




msg:4291521
 6:00 pm on Apr 3, 2011 (gmt 0)

I checked that SE spammer/scraper site's "cached" version in Google. It's from an earlier date than the date in WMT, and yes, the live link is correctly formatted, while the incorrectly formatted text (which includes "...." and which Google treats as a link) is not a live link.
I just wanted to make sure that the site didn't fix their problem "after" Google visited. Now it's clear that Gbot is acting oddly.

tedster




msg:4291528
 6:10 pm on Apr 3, 2011 (gmt 0)

I just found this Google Help page that seems to be addressing this situation. In the SERP, it is dated as March 20, 2011, so it seems like it is pretty new, if the SERP date can be believed.

Unexpected 404 errors

In Crawl Errors, you may occasionally see 404 errors for URLs you don't believe exist on your own site or on the web. These unexpected URLs may be generated by Googlebot trying to follow links found in JavaScript, Flash files, or other embedded content... [examples follow]

...Google strives to detect these types of issues and resolve them so that they will disappear from Crawl Errors. In general, 404 errors won't impact your site's search performance, and you can safely ignore them if you're certain that the URLs should not exist on your site.

[google.com...]
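If you would rather redirect these requests than ignore them, one possible approach is to match the truncated 404 URL against your known URLs by prefix. This is only a sketch under assumptions: the function name `resolve_truncated` and the example paths are hypothetical, and it treats a reported URL as truncated only when it ends in "...".

```python
def resolve_truncated(url_404, known_urls):
    """Guess the intended URL for a 404 that ends in '...'.

    Returns the single known URL starting with the truncated prefix,
    or None when the URL is not truncated or the match is ambiguous.
    """
    if not url_404.endswith("..."):
        return None
    prefix = url_404.rstrip(".")  # drop every trailing dot, e.g. "...."
    matches = [u for u in known_urls if u.startswith(prefix)]
    # Redirect only when exactly one page could have been meant.
    return matches[0] if len(matches) == 1 else None


site_urls = [  # hypothetical list of valid paths on your own site
    "/article/good_nice_cute_hot_chocolate_clock",
    "/article/other_page",
]
print(resolve_truncated("/article/good_nice_cute_hot_choco...", site_urls))
```

Refusing to guess when several pages share the prefix avoids sending visitors to the wrong page.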

DanAbbamont




msg:4292976
 8:31 am on Apr 6, 2011 (gmt 0)

I was wondering how they were trying to follow javascript links but now it looks pretty simple. I doubt these pass any value, though. That would open up the door for a whole new generation of spam bots.

TheMadScientist




msg:4292985
 8:39 am on Apr 6, 2011 (gmt 0)

"why should google crawl them when the site on which they appear doesn't want them to?"

Because it's their Internet and they'll crawl if they want to?

Idk, they're getting annoying ... Really, their need to 'overdo' everything may be their biggest downfall, because to me, they really are getting just plain annoying in some ways, and imo if other site owners feel the same way it could be an issue for them in the long-run.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved