Find broken links on a website

An easy way to find out whether your website has broken links is to check it with specialized software.

LinkChecker

A free, open source program for this task is LinkChecker.

It recursively checks the given domain and also checks any URL that points outside the domain.

The program supports the following output formats: text, html, csv, gml, dot, gxml (GraphXML sitemap graph), xml, sql (logs the check result as an SQL script with INSERT commands), blacklist (logs only entries with invalid URLs).

The project is hosted on SourceForge: http://linkchecker.sourceforge.net/

To redirect the program output to a file in HTML format you can use this command:

linkchecker --output=html http://www.example.com/ > /path/result.html

To log the result in text format to the file output.txt:

linkchecker --file-output=text/output.txt http://example.com/
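When the check runs from a script or a cron job, it can help to act on the program's exit status as well as on the report. The following is a minimal sketch, not part of LinkChecker itself. It assumes the exit status convention described in the linkchecker man page (0 for no problems, 1 for invalid links or warnings, 2 for a program error). The function name check_links and the LINKCHECKER override variable are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Sketch: run LinkChecker, save an HTML report, and summarize the outcome.
# Assumption: exit status 0 = no problems, 1 = invalid links (or warnings)
# found, 2 = program error, as described in the linkchecker man page.
# LINKCHECKER is a hypothetical variable to point at a different binary.

check_links() {
    url=$1      # site to check
    report=$2   # file that receives the HTML report

    "${LINKCHECKER:-linkchecker}" --output=html "$url" > "$report"
    status=$?

    case $status in
        0) echo "OK: no broken links found" ;;
        1) echo "FAIL: broken links found, see $report" ;;
        *) echo "ERROR: link check did not complete (exit status $status)" ;;
    esac
    return "$status"
}
```

A call could look like: check_links http://www.example.com/ /path/result.html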

To install linkchecker on Ubuntu 9+, run the following command:

sudo apt-get install linkchecker

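To ignore images, exclude their URLs with --ignore-url. Quote the pattern so the shell does not strip the backslash:

linkchecker --file-output=text/output.txt --ignore-url='\.jpg$' http://example.com/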

Regular expressions are accepted, and the option can be specified multiple times:

linkchecker --file-output=text/output.txt --ignore-url='\.jpe?g$' --ignore-url='\.png$' --ignore-url='\.gif$' http://example.com/

The patterns use Python regular expression syntax, described at http://docs.python.org/howto/regex.html
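Before starting a long check, the ignore patterns can be previewed against a few sample URLs. The sketch below uses grep -E as a stand-in. It is not LinkChecker's own regex engine, and the assumption is that POSIX extended regular expressions agree with Python's syntax for simple patterns like these. The function name matches_ignore and the sample URLs are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the URLs that a set of ignore patterns would match.
# Assumption: grep -E behaves like Python's regex engine for these simple
# patterns. The single quotes keep the shell from touching the backslash.

matches_ignore() {
    # Read URLs from stdin, print those matching any of the three patterns.
    grep -E -e '\.jpe?g$' -e '\.png$' -e '\.gif$'
}

# Made-up sample URLs: only the three image URLs are printed.
printf '%s\n' \
    'http://example.com/index.html' \
    'http://example.com/photo.jpg' \
    'http://example.com/photo.jpeg' \
    'http://example.com/logo.png' \
    'http://example.com/jpg-gallery/' |
    matches_ignore
```

The last sample URL contains "jpg" but is not matched, because the pattern requires a literal dot before the extension and is anchored at the end of the URL.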

WebCheck

To check only links within the specified path and ignore robots.txt:

webcheck --base-only --ignore-robots http://www.example.com/checked-path

WebCheck project news: http://arthurdejong.org/webcheck/news.html