This is a sample document used to test URL extraction.

Double Leading Slash
Starting with question mark
Starting with hash mark
Starting with a leading slash
Relative to last URL segment
Absolute URL
HTML-encoded URL
HREF is on two lines, starting new line
xhref is a bad pattern
This title blah
No follow
Ampersands should be unescaped.
/addedTagNoAttribUrlInBody.html
/addedTagAttribUrlInBody.html
Phone Number
Email
URL with spaces
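
The link labels above name several kinds of href values (double leading slash, leading question mark, leading hash mark, leading slash, relative to the last URL segment, absolute, HTML-encoded). As a rough illustration of how such values might be resolved against the URL of the page they were found on, here is a minimal Python sketch using only the standard library. The base URL, the sample href values, and the `resolve` helper are all hypothetical and are not taken from this document; the extractor under test may behave differently.

```python
from html import unescape
from urllib.parse import urljoin

# Hypothetical URL of the page the links were extracted from.
BASE = "http://www.example.com/a/b/c.html"


def resolve(base: str, href: str) -> str:
    """Resolve a raw href attribute value against a base URL.

    Assumption: the href is still HTML-encoded (e.g. "&amp;") and may
    carry surrounding whitespace or line breaks from the markup.
    """
    cleaned = unescape(href).strip()
    return urljoin(base, cleaned)


# Hypothetical href values, one per kind of link named in the labels.
SAMPLES = {
    "Double leading slash": "//www.example.org/x.html",
    "Starting with question mark": "?p=2",
    "Starting with hash mark": "#frag",
    "Starting with a leading slash": "/x.html",
    "Relative to last URL segment": "x.html",
    "Absolute URL": "http://www.example.org/y.html",
    "HTML-encoded URL": "/x.html?a=1&amp;b=2",
}

if __name__ == "__main__":
    for label, href in SAMPLES.items():
        print(f"{label}: {href!r} -> {resolve(BASE, href)}")
```

The sketch leaves out the cases that are not plain URL resolution, such as the phone number and email links, the "xhref" pattern, and URLs containing spaces, since how those should be handled depends on the extractor being tested.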