

Link check, including fragment identifier

What tool exists?

     
9:24 pm on Dec 13, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 25, 2004
posts:1001
votes: 47


I've split a number of web pages into more focused ones.
In doing so, I've probably broken some links on my other pages that used to point to this information.

For example:
The id="anchor1" used to exist on page1.
example.com/page1#anchor1

Now this information and anchor exist on page2:
example.com/page2#anchor1

Is there a link-checking tool that I can run on my website's pages to find every use of the now-broken link form:
example.com/page1#anchor1

I've tried some Chrome link checkers, but my impression was that they don't verify whether the fragment identifier actually exists on the target page.

I tried W3C's link checker, but got strange errors about whether the page I was checking even existed. Even their example URL returned a "document not found error," so evidently this tool doesn't work any longer.
10:18 pm on Dec 13, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11681
votes: 205


you must fetch and examine a document before you can determine the absence or presence of a document fragment identifier within that document.

the fact that an identical identifier is present in other documents on a site would typically be irrelevant.
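
To make the "fetch and examine" step concrete, here is a minimal Python sketch. It assumes the document has already been fetched and is available as a string; `AnchorCollector`, `fragment_exists`, and the two sample pages are hypothetical names made up for illustration, not part of any existing tool.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every attribute value that can act as a fragment target."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            # Any element's id is a target; <a name="..."> is the legacy form.
            if name == "id" or (tag == "a" and name == "name"):
                self.targets.add(value)


def fragment_exists(html_text, fragment):
    """Return True if the already-fetched document defines the fragment."""
    collector = AnchorCollector()
    collector.feed(html_text)
    return fragment in collector.targets


# Hypothetical page contents standing in for fetched documents.
page1 = '<h1 id="intro">Intro</h1>'
page2 = '<h2 id="anchor1">Moved section</h2>'

print(fragment_exists(page1, "anchor1"))  # False: page1 no longer has it
print(fragment_exists(page2, "anchor1"))  # True: it lives on page2 now
```

Note that the check is always against one specific document, which is why the same identifier showing up elsewhere on the site doesn't help.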
10:43 pm on Dec 13, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15636
votes: 795


I tried W3C's link checker, but got strange errors about whether the page I was checking even existed. Even their example URL returned a "document not found error," so evidently this tool doesn't work any longer.
Say what now? I've used it routinely for years, with no problems. You weren't trying to enter that whole blahblah#anchor string into the URL field, were you? It works from the other direction: give it a URL, and it will check all links found on that page, including fragments and supporting files.
3:52 am on Dec 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 25, 2004
posts:1001
votes: 47


When I Google "w3c link checker", Google serves up this URL:
[w3.org...]

That page's input box comes prepopulated with a w3.org URL.
I assume that URL is meant as an example to show how things work.
When I click the "Check" button on that page, it takes me to this URL:
[w3.org...]

The title of the page is "404 not found", and the page says "Document not found".

When I place a random wikipedia page, or my site's pages, in the URL box, I am transferred to the same "Document not found" page.

When on the initial w3.org page mentioned above, there is a link "back to the link checker". Clicking that also leads to the "Document not found" page.
3:57 am on Dec 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 25, 2004
posts:1001
votes: 47


phranque, I didn't really understand your answer. But if page1 initially had some content associated with an anchor, and that content and anchor have since been moved to a new page2, any links to page1 referencing that anchor will be useless because the content and anchor are no longer there. They're now on page2.

As a check, I was hoping to find a link checker that I could run all of my pages through, eventually, to see if any of them had a reference to page1's initial configuration that has now been changed. I have a large volume of such changes and need a way of identifying such problem links.
5:03 am on Dec 14, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11681
votes: 205


the first part of my answer is intended to indicate it isn't an easy problem - since you have to keep crawling pages until you find the fragment identifier on another page.

the second part of my answer indicates that you will have a decision to make when the same fragment identifier appears on multiple pages.
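
A rough Python sketch of both points, assuming the site's pages have already been crawled into a dict of URL to HTML text. `IdCollector`, `build_fragment_index`, `candidates_for`, and the sample pages are all hypothetical names for illustration only.

```python
from collections import defaultdict
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the id attributes defined in one document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def build_fragment_index(pages):
    """Map each fragment identifier to the set of page URLs that define it."""
    index = defaultdict(set)
    for url, html_text in pages.items():
        collector = IdCollector()
        collector.feed(html_text)
        for fragment in collector.ids:
            index[fragment].add(url)
    return index


def candidates_for(fragment, index):
    """Pages where a broken link's fragment could now live, sorted by URL."""
    return sorted(index.get(fragment, set()))


# Hypothetical crawl result: anchor1 has left page1 and appears on two pages.
pages = {
    "example.com/page1": '<h1 id="intro">Intro</h1>',
    "example.com/page2": '<h2 id="anchor1">Moved section</h2>',
    "example.com/page3": '<h2 id="anchor1">Unrelated section</h2>',
}
index = build_fragment_index(pages)
print(candidates_for("anchor1", index))
```

With one candidate the fix is mechanical. With two or more, as in this sample, a person has to decide which page the old link should point to.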
5:17 am on Dec 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:9712
votes: 925


generally, unique fragment identifiers are the best method, but if a fragment is duplicated, the urls should be different! Just keep it simple.
6:00 am on Dec 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15636
votes: 795


When I Google "w3c link checker"
Good grief. Who knows what the googlebot did to arrive at that URL. The correct one is:
[validator.w3.org...]
6:49 am on Dec 14, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14902
votes: 484


Xenu broken link checker?
4:45 pm on Dec 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 25, 2004
posts:1001
votes: 47


lucy24 - Thank you for the correct link to the w3c link checker.
That service does identify what I'm after. As an example, it seems to show things in the following form:

Status: 200 OK
Some of the links to this resource point to broken URI fragments (such as index.html#fragment).
Broken fragments:
https://example.com/page1#anchor1 (line 50)

Thank you too, martinibuster. I had forgotten about Xenu. I tried to use it in the past but got bogged down setting it up. I should try again.
5:45 pm on Dec 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15636
votes: 795


it seems to show things in the following form
Yes, that's the form for bad fragments: it means that it found the document itself--it might even be the page you're checking in the first place--but not the named fragment. And then, if both are on your own site, you get to decide whether to change the link or the fragment name. Generally it's a mix of both.
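
For anyone who wants to see that distinction spelled out, here is a rough Python sketch. It is not how the W3C checker is implemented; it assumes your pages are already loaded into a dict of absolute URL to HTML text, and `PageScanner`, `scan`, `check_links`, and the sample `site` are hypothetical names. Anything not in the dict, including external sites, gets reported as not found.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class PageScanner(HTMLParser):
    """Record the ids a page defines and the hrefs it links to."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if name == "id":
                self.ids.add(value)
            elif tag == "a" and name == "href":
                self.hrefs.append(value)


def scan(html_text):
    scanner = PageScanner()
    scanner.feed(html_text)
    return scanner


def check_links(pages):
    """Return (source, href, problem) for each link that fails either check."""
    scanned = {url: scan(html_text) for url, html_text in pages.items()}
    problems = []
    for source, page in scanned.items():
        for href in page.hrefs:
            # Resolve relative links, then split off the fragment.
            target, fragment = urldefrag(urljoin(source, href))
            if target not in scanned:
                problems.append((source, href, "document not found"))
            elif fragment and fragment not in scanned[target].ids:
                # The document exists, but the named fragment does not.
                problems.append((source, href, "broken fragment"))
    return problems


# Hypothetical site: anchor1 has moved from page1 to page2.
site = {
    "https://example.com/page1": '<p id="intro">Intro</p>',
    "https://example.com/page2": '<h2 id="anchor1">Moved</h2>',
    "https://example.com/index": (
        '<a href="page1#anchor1">old</a>'
        '<a href="page2#anchor1">new</a>'
        '<a href="page9">gone</a>'
    ),
}
for problem in check_links(site):
    print(problem)
```

The "broken fragment" entries are the ones where you then choose between changing the link and changing the fragment name.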

While you're in there, look at messages about permanent redirects. Any document with external links that was created more than a few months ago is almost certain to have something you listed as http that is now https.
5:15 am on Dec 15, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:9712
votes: 925


This thread made me look (with Xenu), and I was surprised to find 6 links that fall into that error. Even the best laid plans and all that happy stuff can be caught out by simply missing a change here, a change there.

Most of these came from "heat of the moment" fixes: trying to solve one problem (mostly solved) and generating a few others on the side that were not obvious.

Beauty is that one CAN fix this stuff. The Nit Picking. The pain in the assets...
 
