NAME HTML::Normalize - HTML light weight cleanup VERSION Version 1.0004 SYNOPSIS my $norm = HTML::Normalize->new (); my $cleanHtml = $norm->cleanup (-html => $dirtyHtml); DESCRIPTION HTML::Normalize uses HTML::TreeBuilder to parse an HTML string then processes the resultant tree to clean up various structural issues in the original HTML. The result is then rendered using HTML::Element's as_HTML member or an internal renderer. Key structural clean ups fix tag soup ("foo" becomes "foo") and inline/block element nesting ("

foo

" becomes "

foo

"). "
" tags at the start or end of a link element are migrated out of the element. HTML::Normalize can also remove attributes set to default values and empty elements. For example a "" element would become and "" and "" would be removed if Verdana size 1 is set as the default font.