#1
Hello,

I'm trying to OCR a PDF in region mode. I keep the document sized to fit (it has to be sized to fit in Adobe Reader for development purposes) because I need to read a lot of data, for example the region containing the company name.
Example:
The actual PDF contains the name as: Americhem Inc.
I'm getting the name as: Amedchem Inc.

I tried the page segmentation and scale settings in various combinations, but I'm still not getting the data that is present in the PDF.

How can I read bold text? For example, BILL TO is read as BI.L TO.
 

cs.andras

Active Member
Staff member
#2
Hi,
OCR certainly has its limitations, especially when it comes to those characters. You might be stuck because of the font you are using. Is font smoothing turned off on the computer where the script is running? That might improve the results. Also, I see you tried to modify the scale. That is a good starting point, but don't be afraid to experiment with higher settings (24-32+) as well. I have had some success with this before.
 

cs.andras

Active Member
Staff member
#4
To be honest, I don't know what page segmentation is. It is probably a new function of Blue Prism's OCR capability. Sadly, I'm still using 4.2, which is due to be upgraded here at my company. Don't give up, someone ought to be able to help out.
 
#6
If it's a true (text-based) PDF, you can also use send keys to get the data and then extract the details you need with the InStr function.
Also, I'm not sure, but v6.2 has added some cognitive capabilities to BP (custom-made VBOs) which you could leverage for this purpose.
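
To illustrate the InStr idea outside of Blue Prism, here is a minimal Python sketch. It assumes the PDF text has already been copied into a string; `extract_after_label` and the sample text are hypothetical and only stand in for whatever your process actually captures.

```python
def extract_after_label(text, label, terminator="\n"):
    """Return the text that follows `label` up to `terminator`.

    Returns None when the label is not present in the text.
    """
    # str.find plays the role of InStr here, but it is 0-based
    # and returns -1 (not 0) when the label is not found.
    start = text.find(label)
    if start == -1:
        return None
    start += len(label)

    # Stop at the terminator, or at the end of the text if there is none.
    end = text.find(terminator, start)
    if end == -1:
        end = len(text)

    # Trim separators such as ": " that often follow a label.
    return text[start:end].strip(" :\t")


# Made-up sample standing in for text copied out of a true PDF.
sample_text = "INVOICE\nBILL TO: Americhem Inc.\nThank you for your business\n"
company = extract_after_label(sample_text, "BILL TO")
print(company)
```

Because the text comes from the PDF's own text layer rather than from OCR, names such as Americhem Inc. should arrive exactly as written, so a simple label search like this is usually enough.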
 