How To Remove All Elements On Text Level With Jsoup?

I'm working on a project where I'm only interested in the page layout, not the text. I'm currently having trouble getting rid of every text-level element. For example:

Solution 1:

You can select and remove all p, ul and li elements with standard selectors:

doc.select("p").remove();
doc.select("ul").remove();
doc.select("li").remove();
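
The three calls above can also be folded into one grouped selector. The following is a minimal sketch, assuming jsoup is on the classpath; the class name `LayoutOnly`, the method name `stripTextElements`, and the sample HTML are made up for illustration and are not part of the original answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class LayoutOnly {
    // Hypothetical helper: parses the HTML and removes every p, ul and li element.
    static Document stripTextElements(String html) {
        Document doc = Jsoup.parse(html);
        // A comma-separated selector matches any of the listed tags,
        // and remove() detaches every matched element from the tree.
        doc.select("p, ul, li").remove();
        return doc;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        String html = "<div><ul><li>menu item</li></ul></div>"
                + "<div><h3>Title</h3><p>some text</p></div>";
        System.out.println(stripTextElements(html).body().html());
    }
}
```

Elements that are not listed in the selector, such as the h3 here, stay in the document, so the list of tags has to cover everything you consider text level.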

Solution 2:

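I first found the tags that I wanted to get rid of, and then called empty() on their parent. Note that empty() removes all of the parent's child nodes, so siblings of the matched tag are removed as well.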

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class RemoveTextElements {
        public static void main(String[] args) {
            String html = "<div><ul><li>some menu item</li><li>some menu item</li><li>some menu item</li></ul></div><div><h3>Title of some text</h3><p></p><p>some text</p><ul><li>some other text</li><li>some other text</li><li>some other text</li></ul></div>";
            Document doc = Jsoup.parse(html);
            // select() returns a snapshot list, so the tree can be modified while iterating
            Elements elements = doc.body().select("h3, p");
            for (Element element : elements) {
                Element parent = element.parent();
                // parent may be null if an earlier empty() already detached this element
                if (parent != null) {
                    parent.empty();
                }
            }
            System.out.println(doc);
        }
    }
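
Because Solution 2 calls empty() on the parent, it also drops every sibling of the matched tag, which may or may not be what you want. The sketch below contrasts that with removing only the matched element. It assumes jsoup is on the classpath; the class name `EmptyVsRemove`, both method names, and the sample HTML are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class EmptyVsRemove {
    // Empties the parent of the first h3: all of that parent's children are removed.
    static Document emptyParent(String html) {
        Document doc = Jsoup.parse(html);
        Element h3 = doc.select("h3").first();
        if (h3 != null && h3.parent() != null) {
            h3.parent().empty();
        }
        return doc;
    }

    // Removes only the h3 elements themselves; their siblings stay in place.
    static Document removeOnly(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("h3").remove();
        return doc;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        String html = "<div><h3>Title</h3><span>sibling</span></div>";
        System.out.println(emptyParent(html).body().html());
        System.out.println(removeOnly(html).body().html());
    }
}
```

With emptyParent the span disappears together with the h3, while removeOnly leaves the span inside the div.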
