HTMLCleaner is an open source HTML parser. It lets us convert ill-formed
HTML into well-formed XML and, optionally, remove comments and similar clutter.
Using HTMLCleaner, we can clean HTML files from the internet or from the local
system and store the result in the local file system.
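To make "well-formed" concrete, here is a minimal sketch that uses the JDK's own XML parser (not HTMLCleaner) to test whether a piece of markup parses as XML. The class and method names are hypothetical, chosen only for illustration. Typical browser-tolerated HTML, such as an unclosed <br>, fails this check, which is the kind of input HTMLCleaner is meant to repair.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks XML well-formedness with the JDK parser only.
final class WellFormedCheck {

    // Returns true if the markup parses as XML, false if the parser rejects it.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Includes SAXParseException, raised for malformed markup.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>Hello<br>world</p>"));  // false
        System.out.println(isWellFormed("<p>Hello<br/>world</p>")); // true
    }
}
```

The same check could be pointed at a cleaned output file to confirm it parses, with the caveat that output containing HTML-only entities may still be rejected by a plain XML parser.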
We can include HTMLCleaner in any Maven project using the dependency below -
<dependency>
 <groupId>net.sourceforge.htmlcleaner</groupId>
 <artifactId>htmlcleaner</artifactId>
 <version>2.2</version>
</dependency>
A sample command to perform the cleanup is shown below -
mvn exec:java -Dexec.mainClass="org.htmlcleaner.CommandLine" -Dexec.args="src=C:\\Programming\\WorkSpace\\tempTestIndex.html dest=C:\\Programming\\WorkSpace\\abc.html outputtype=compact omitcomments=true"
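The command passes its options as space-separated key=value pairs inside -Dexec.args. As a rough sketch, the command string can be assembled from a map of options when the same cleanup has to be repeated for several files. The class name, method names, and file paths here are hypothetical; only the option names (src, dest, outputtype, omitcomments) and the main class come from the command above. The sketch assumes that no value contains spaces or quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the Maven command line shown above as a string.
final class CleanerCommand {

    // Joins the options as "key=value" pairs separated by single spaces.
    static String execArgs(Map<String, String> options) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map.Entry<String, String> option : options.entrySet()) {
            joiner.add(option.getKey() + "=" + option.getValue());
        }
        return joiner.toString();
    }

    // Wraps the joined options in the mvn exec:java invocation.
    static String mavenCommand(Map<String, String> options) {
        return "mvn exec:java"
                + " -Dexec.mainClass=\"org.htmlcleaner.CommandLine\""
                + " -Dexec.args=\"" + execArgs(options) + "\"";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("src", "in/index.html");   // hypothetical input path
        options.put("dest", "out/index.xml");  // hypothetical output path
        options.put("outputtype", "compact");
        options.put("omitcomments", "true");
        System.out.println(mavenCommand(options));
    }
}
```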
For a detailed list of the available options, refer to the link below -
http://htmlcleaner.sourceforge.net/commandlineuse.php
Reference - http://htmlcleaner.sourceforge.net/index.php
 