Forum Moderators: Robert Charlton & goodroi

Internal Links report in Google Search Console

         

nestman

6:54 pm on Feb 18, 2016 (gmt 0)

10+ Year Member



I have a link on my Internal Links report that I have been trying to filter out. It's a link for a lightbox popup, and Google sees over 6,000 instances of it on my site because of the many pages that I have. Here is what I have in my robots.txt:

Disallow: /*sendMessage.php*

This format works great for filtering out other pages that I don't want Google to see. I also have a rel='nofollow' included in the link for the popup window. So, does anyone know why Google still sees sendMessage.php? I have had these blocks in place since the first of February, and still the numbers are not coming down.

Thank you!

aakk9999

12:57 pm on Feb 19, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is a difference between Google seeing a link and Google crawling the page.

Robots.txt stops Google from crawling the page. It does not stop Google from indexing the URL, nor from reporting that other pages link to it.
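To make the crawl-only scope concrete, here is a minimal Python sketch of how a Google-style crawler might match that Disallow pattern against a URL path. It assumes the usual wildcard rules ('*' matches any run of characters, a trailing '$' anchors the end); the function names and the sample paths are made up for illustration. Python's built-in urllib.robotparser is not used because it does not interpret '*' wildcards this way.

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    Assumption: Google-style matching, where '*' matches any run of
    characters, a trailing '$' anchors the end of the URL, and the
    rest is a literal match starting at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)

def is_disallowed(path: str, disallow_pattern: str) -> bool:
    """True if the path matches the Disallow pattern (crawling blocked)."""
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None

rule = "/*sendMessage.php*"  # the rule quoted in the first post
for path in ("/sendMessage.php?id=42", "/popups/sendMessage.php", "/contact.php"):
    # Hypothetical paths; only the script name comes from the thread.
    print(path, "blocked from crawling:", is_disallowed(path, rule))
```

A match here only means the crawler will not fetch the URL. The links pointing at it are still found on the pages that are crawled, which is why the report keeps counting them.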

Rel nofollow stops Google from following this link from this page and from passing link juice through it. It should not be relied on to keep Google from crawling or indexing the page.

I would not worry about the lightbox popup URL being shown in Internal Links in Google Search Console.

JS_Harris

10:58 am on Mar 1, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Webmasters worry though, so if you must...
"I also have a rel='nofollow' included in the link for the popup window."

I'd go ahead and remove that. In fact, there is no real benefit to placing it on any internal link. It's like saying 'hey Google, I don't trust this page' while pointing at your own site. The old practice of PageRank sculpting was made obsolete when Google changed how it treats nofollow on internal links, so it's likely not doing anything for you anyway.

"robots.txt Disallow: /*sendMessage.php*"

I'd go ahead and remove that as well. You're telling Google not to crawl it, but they can index it anyway with the URL, a title and a description that says robots.txt blocked them from crawling. That's not ideal, and as long as the page is blocked Google can never see a noindex instruction on it.

Since you have an href link pointing to the lightbox and each lightbox is resolving to its own URI (unique parameters, I'm assuming), the file itself is being rendered. I would set up the site to return a noindex meta tag, i.e. <meta name="robots" content="noindex">, on any page with that parameter. I'm assuming sendMessage.php is pulling a generic template from somewhere; add the noindex meta tag within that template.

If that's not possible, another alternative is to send noindex as an HTTP response header on all lightbox popup URIs. Just make sure you only send it for the popup URI, not all URIs:
X-Robots-Tag: noindex
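
As a rough illustration of "only for the popup URI", here is a minimal Python WSGI sketch. It is not the poster's setup: the site in this thread runs PHP, where the same idea would presumably be a header() call at the top of sendMessage.php. The names add_noindex_for_popup, demo_app and headers_for are hypothetical, and matching on the end of the path is an assumption about how the popup URL looks.

```python
from wsgiref.util import setup_testing_defaults

POPUP_SCRIPT = "sendMessage.php"  # script name taken from the thread

def add_noindex_for_popup(app):
    """Wrap a WSGI app so only popup responses carry X-Robots-Tag: noindex."""
    def wrapped(environ, start_response):
        # Assumption: the popup is identified by the path ending in the script name.
        is_popup = environ.get("PATH_INFO", "").endswith(POPUP_SCRIPT)

        def patched_start_response(status, headers, exc_info=None):
            if is_popup:
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)
    return wrapped

def demo_app(environ, start_response):
    """Stand-in for the real site: returns an empty HTML page."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html></html>"]

def headers_for(path):
    """Run one fake request through the wrapped app and return its headers."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    list(add_noindex_for_popup(demo_app)(environ, start_response))
    return captured["headers"]

print(headers_for("/sendMessage.php"))  # includes X-Robots-Tag
print(headers_for("/index.php"))        # does not
```

Whichever way the header is sent, it only takes effect if Google is allowed to fetch the popup URL, so the robots.txt Disallow has to come out first.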