I converted the PDF to text and tried the substring method to get the data between two headings.
Substring code used: rdtext.IndexOf("heading A") - rdtext.IndexOf("heading B")
But this does not always work. Also, I am new to regex; is there any source I can use to get a basic idea?
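One likely reason the expression above fails: it subtracts the index of "heading B" from the index of "heading A", which is negative whenever heading A comes first. The length of the slice should instead be the index of heading B minus the position where heading A ends. Below is a minimal sketch of that idea, plus a regex variant that collects every occurrence. It is written in Python only for illustration; the function names `between_index` and `between_regex` and the sample text are hypothetical. The same logic should carry over to `IndexOf`/`Substring` and the regex class in .NET.

```python
import re

def between_index(text, start_heading, end_heading):
    """Return the text between the first pair of headings, or None if either is missing."""
    start = text.find(start_heading)
    if start == -1:
        return None
    start += len(start_heading)            # skip past the first heading itself
    end = text.find(end_heading, start)    # search only after the first heading
    if end == -1:
        return None
    return text[start:end].strip()

def between_regex(text, start_heading, end_heading):
    """Return the text between every heading pair, using a non-greedy regex."""
    # re.escape guards against headings that contain regex metacharacters.
    pattern = re.escape(start_heading) + r"(.*?)" + re.escape(end_heading)
    # re.DOTALL lets "." match newlines, since the data spans several lines.
    return [m.strip() for m in re.findall(pattern, text, flags=re.DOTALL)]

# Hypothetical sample standing in for the text extracted from the PDF.
sample = "heading A\nfoo 1\nbar 2\nheading B\nignored\nheading A\nbaz 3\nheading B\n"

print(between_index(sample, "heading A", "heading B"))
print(between_regex(sample, "heading A", "heading B"))
```

The index version only finds the first pair and returns nothing when a heading is missing on a page, which may be why the original approach works on some pages and not others. The regex version handles repeated heading pairs in one pass.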
Hi guys, I am stuck with something. I have a PDF with numerous pages, and each page has data that may vary from page to page. Each piece of data has a heading, and the headings are the only fixed thing in the PDF. The position keeps varying. Even if we scrape it, that may work for some pages, but not all of them. So I...