
Parsing Financial Information From Html

This is my first attempt at learning to work with HTML in Visual Studio and C#. I am using the Html Agility Pack library to do the parsing. From this page I am attempting to pull out the number

Solution 1:

First of all, there's no need to get the body first; you can query the document directly for what you want. To find the value you're looking for, you could do this:

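// Requires: using System; using System.Linq; using HtmlAgilityPack;
// `document` is the HtmlDocument that has already been loaded.

// Find the first <td> whose trimmed text is exactly "Net Income".
HtmlNode tdNode = document.DocumentNode.DescendantNodes()
  .FirstOrDefault(n => n.Name == "td"
    && n.InnerText.Trim() == "Net Income");
if (tdNode != null)
{
  // The parent <tr> holds the label cell and the value cells.
  HtmlNode trNode = tdNode.ParentNode;
  foreach (HtmlNode node in trNode.DescendantNodes().Where(n => n.NodeType == HtmlNodeType.Element))
  {
    Console.WriteLine(node.InnerText.Trim());
  }
  // Output:
  // Net Income
  // 265.00
  // 298.00
  // 601.00
  // 672.00
  // 666.00
}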

Also note the Trim calls: they are needed because the InnerText of some elements contains newlines.
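If the goal is to work with the figures rather than print them, the same lookup can be followed by a parsing step. The following is only a minimal sketch under stated assumptions: the class name FinancialRowParser, the method ExtractRow, and the sample markup are all hypothetical, since the real page's HTML is not shown here. It trims each cell as described above and keeps only the cells that parse as numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class FinancialRowParser
{
    // Returns the numeric cells of the first row that contains a <td>
    // whose trimmed text equals `label`. Returns an empty list if not found.
    public static List<decimal> ExtractRow(string html, string label)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Query the whole document directly; no need to select <body> first.
        HtmlNode labelCell = document.DocumentNode
            .Descendants("td")
            .FirstOrDefault(td => HtmlEntity.DeEntitize(td.InnerText).Trim() == label);

        var values = new List<decimal>();
        if (labelCell == null)
        {
            return values;
        }

        // Only the direct <td> children of the row are inspected here.
        foreach (HtmlNode cell in labelCell.ParentNode.Elements("td"))
        {
            // Trim removes the newlines and spaces around the cell text.
            string text = HtmlEntity.DeEntitize(cell.InnerText).Trim();

            // AllowParentheses treats "(12.50)" as a negative number,
            // which is an assumption about how the page formats losses.
            if (decimal.TryParse(text,
                    NumberStyles.Number | NumberStyles.AllowParentheses,
                    CultureInfo.InvariantCulture,
                    out decimal value))
            {
                values.Add(value); // the label cell itself fails to parse and is skipped
            }
        }
        return values;
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the real page.
        const string html = @"<table>
  <tr><td>Revenue</td><td>1,200.00</td><td>1,350.00</td></tr>
  <tr><td>
    Net Income
  </td><td> 265.00 </td><td>298.00</td><td>601.00</td></tr>
</table>";

        foreach (decimal value in ExtractRow(html, "Net Income"))
        {
            Console.WriteLine(value.ToString("F2", CultureInfo.InvariantCulture));
        }
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps the result independent of the machine's regional settings; if the page uses a different decimal separator, a matching culture would have to be passed instead.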
