Why Is Lxml Closing This "ol" Tag When Parsing?
Here is some HTML:
- item
Solution 1:
I think neither HTML 4 nor HTML5 allows a ul element as a direct child of an ol element. Only li elements (plus, in HTML5, script-supporting elements such as script and template) can be direct children.
That might be why an HTML parser builds a tree that does not match the nesting in your input markup. Whether a "traditional" HTML 4 parser, which is probably what the libxml2 HTML parser underneath lxml implements, restructures the markup in the same way is something I don't remember, and I am not sure where to test it.
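One way to check this locally is to parse a small fragment with lxml and look at the tree it produces. The following is only a sketch: the sample markup is a hypothetical stand-in for the HTML in the question (the original snippet is not reproduced here), and `inspect_nesting` is a helper name made up for this example. The result may differ between lxml/libxml2 versions, so no particular output is claimed.

```python
from lxml import html

# Hypothetical sample: a <ul> placed directly inside an <ol>.
SAMPLE = "<ol><ul><li>item</li></ul></ol>"


def inspect_nesting(markup):
    """Parse markup with lxml's HTML parser and report what it built.

    Returns the serialized tree and a flag telling whether any <ul>
    is still a direct child of an <ol> after parsing.
    """
    # Wrap the fragment in a <div> so the whole result stays reachable
    # even if the parser turns it into several sibling elements.
    root = html.fragment_fromstring(markup, create_parent="div")
    serialized = html.tostring(root, encoding="unicode")
    ul_inside_ol = bool(root.xpath(".//ol/ul"))
    return serialized, ul_inside_ol


if __name__ == "__main__":
    serialized, ul_inside_ol = inspect_nesting(SAMPLE)
    print(serialized)
    print("ul is still a direct child of ol:", ul_inside_ol)
```

If the flag comes back false, the parser has moved the ul out of the ol, which would match the behaviour described in the question.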
While two HTML5 validators flag your ul as a disallowed child of ol, current browsers seem to preserve that nesting.
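The kind of check those validators perform can be approximated with Python's standard library. This is a rough sketch, not a real validator: it assumes every non-void element has an explicit end tag, it only looks at direct children of ol and ul, and the names `ListChildChecker` and `find_list_violations` are invented for this example.

```python
from html.parser import HTMLParser

# Elements HTML5 permits directly inside <ol>/<ul>:
# li plus the script-supporting elements.
ALLOWED_LIST_CHILDREN = {"li", "script", "template"}

# Void elements never get an end tag, so they are not pushed on the stack.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}


class ListChildChecker(HTMLParser):
    """Record elements written directly inside <ol>/<ul> that are not allowed there."""

    def __init__(self):
        super().__init__()
        self.stack = []       # open elements, exactly as written in the source
        self.violations = []  # (parent, child) pairs

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        if parent in ("ol", "ul") and tag not in ALLOWED_LIST_CHILDREN:
            self.violations.append((parent, tag))
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray end tags are ignored.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_list_violations(markup):
    checker = ListChildChecker()
    checker.feed(markup)
    checker.close()
    return checker.violations
```

For the hypothetical markup `<ol><ul><li>item</li></ul></ol>` this reports the pair `("ol", "ul")`, while `<ol><li><ul><li>item</li></ul></li></ol>`, where the inner list is wrapped in an li, reports nothing.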