How To Start {OCR}

chitrang
03-03-2006, 11:56 PM
Hi guys,
I'm a newbie in this world. I've been given a project that includes OCR (optical character recognition). I have searched a lot about it, but I don't know how to start the OCR development. I chose VB 6.0 as my development language. Please help me out: send me any link or tutorial to start the development. Thanks for any help you can give.

anthony_n
03-04-2006, 01:11 AM
This is not going to be an easy project. First off, you will have to correct the skew and rotation of the picture, then try to clean it up before you can start with the OCR.
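
To give a rough idea of what the clean-up step might look like, here is a minimal sketch in Python (chosen only for brevity, not VB 6.0). It assumes the image is already loaded as rows of 0-255 grayscale values. The function names `binarize` and `despeckle` and the threshold of 128 are made up for illustration, and skew correction is not shown.

```python
# Hypothetical helpers for illustration; not taken from any OCR library.

def binarize(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255 ints) into 1 = ink, 0 = paper."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def despeckle(binary):
    """Clear ink pixels that have no ink among their 8 neighbours."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            # Count ink pixels around (x, y), staying inside the image.
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


gray = [
    [255, 255, 255, 255, 255],
    [255,  20,  30, 255, 255],
    [255,  25, 255, 255, 255],
    [255, 255, 255, 255,  10],  # isolated dark speck
]
clean = despeckle(binarize(gray))
```

A fixed threshold is the simplest choice; scans with uneven lighting would probably need something adaptive.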

Most OCR programs work by having a series of matrices, each relating to one character. A match occurs when a given percentage of a matrix matches. The percentage is usually up to the user to select, so they can fine-tune it for the job at hand.
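
The matrix-matching idea above could be sketched like this, in Python for brevity rather than VB 6.0. The 3x3 templates, the names `match_percent` and `recognise`, and the default of 80% are all assumptions for illustration; real character matrices would be much larger.

```python
# Hypothetical 3x3 templates for illustration only.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_percent(glyph, template):
    """Percentage of cells where the glyph and the template agree."""
    cells = [(g, t) for grow, trow in zip(glyph, template) for g, t in zip(grow, trow)]
    agree = sum(1 for g, t in cells if g == t)
    return 100.0 * agree / len(cells)


def recognise(glyph, min_percent=80.0):
    """Return the best-matching character, or None if below the chosen percentage."""
    best_char, best_score = None, 0.0
    for char, template in TEMPLATES.items():
        score = match_percent(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= min_percent else None


noisy_l = ["100", "110", "111"]  # an "L" with one stray pixel (8 of 9 cells agree)
loose = recognise(noisy_l)                      # accepted at the default 80%
strict = recognise(noisy_l, min_percent=95.0)   # rejected at 95%
```

Raising `min_percent` cuts down on wrong matches at the cost of more characters being left unrecognised, which is the fine-tuning trade-off mentioned above.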

Good luck with your project, but this is way too hard for your first one.

DougT
03-04-2006, 01:14 AM
Hi,

The best place to start is at the beginning.

I'd do some analysis of the problem you've been set:

(1) What output is going to be required?
(2) What inputs are you going to need?
(3) What processes do you need to convert the inputs to the outputs?

Once you've got answers to those questions you will have a better idea of what you've got to do and what you need in order to do it (i.e. the 'tool set'). In your case you will need, at least, a DLL or equivalent that will perform the necessary OCR processes. If you're processing image data from a scanner you will also need to know how to control the scanner (e.g. start the scan, collect the scanned data, etc.).

Then you can move on to the Design phase and actually think about the order in which things have to be done. Often you can split this into 4 steps:

(1) Input - accepting user input from the screen; the design of the User Interface.
(2) Validate - making sure that the user input is valid, in the correct context, and within any required ranges; reporting any errors back to the user for correction.
(3) Process - doing the actual processing: performing the scan, performing the OCR.
(4) Output - outputting the results back to the user. Into a file? Onto the screen? To the printer?
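
As a rough illustration of how those four steps might map onto code, here is a skeleton in Python (used only for brevity; the same shape would carry over to VB 6.0). Everything in it is an assumption: the function names, the placeholder list of accepted file types, and the choice to write the result to a text file. The OCR routine itself is passed in as `ocr_engine`, standing for whatever DLL or hand-written recogniser ends up being used.

```python
import os

SUPPORTED = (".tif", ".tiff", ".bmp")  # placeholder list of accepted image types


def validate(path):
    """Step 2: return an error message, or None if the input looks usable."""
    if not path.lower().endswith(SUPPORTED):
        return "Unsupported file type: " + path
    if not os.path.isfile(path):
        return "File not found: " + path
    return None


def process(path, ocr_engine):
    """Step 3: hand the image to whatever OCR routine is plugged in."""
    return ocr_engine(path)


def output(text, out_path):
    """Step 4: write the recognised text to a file."""
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(text)


def run(path, out_path, ocr_engine):
    """Steps 1-4 in order; 'path' stands in for the user's input from the screen."""
    error = validate(path)
    if error:
        return error  # report back to the user for correction
    output(process(path, ocr_engine), out_path)
    return None
```

Keeping the OCR routine behind a single function argument means the user interface, validation and output code can be written and tested before the recognition part exists.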

Once you've completed the design you'll be in a position to start to cut the code.

Regards
Doug

chitrang
03-04-2006, 03:36 AM
Thanks for taking an interest in my thread.
Thanks for advising me that it is going to be tough, but that does not matter.
I want to tell you that my input will be a TIFF file or a BMP file.
My output will be a text file.
As for the OCR algorithm, I understand it as: read the file pixel by pixel, split out the characters, and compare them to a database or put them into a neural network, then get the output from it.
Are all these the right procedures? Even if they are, I still can't start my project, because I don't know how to do all this stuff in VB. If you have any free DLL that I can use in my project, that would help me a lot.
And if not, if any of you has made this before, tell me and provide a useful link that I can refer to and start my work.
Once again, thank you for your interest.
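
The "read the file pixelwise, split the characters" step described in this post could look roughly like the following. It is written in Python for brevity rather than VB, and it assumes a single line of text already converted to 1 = ink, 0 = paper. The function name `split_characters` is made up, and splitting on empty columns is only one simple approach: it fails when neighbouring characters touch.

```python
def split_characters(binary):
    """Split a binarised text line (1 = ink) at columns that contain no ink."""
    width = len(binary[0])
    ink_in_column = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, has_ink in enumerate(ink_in_column):
        if has_ink and start is None:
            start = x                 # a character begins
        elif not has_ink and start is not None:
            spans.append((start, x))  # a character ends at the first empty column
            start = None
    if start is not None:
        spans.append((start, width))  # last character runs to the right edge
    return [[row[a:b] for row in binary] for a, b in spans]


line = [
    [1, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
]
glyphs = split_characters(line)  # two glyphs: columns 0-0 and columns 3-5
```

Each glyph that comes out would then be compared against the stored character database, or scaled to a fixed size and fed to a neural network.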

Flyguy
03-04-2006, 06:09 AM
Why not buy an OCR engine?

http://www.componentsource.com/features/xscanni/index.html

Considering the average price you have to pay for an already developed component, I would say that it makes no sense to write such a component yourself.

chitrang
03-04-2006, 06:28 AM
No, I want to develop it myself, so it is not worth buying. Sorry, but I want to develop it. If you can help me you are most welcome, but please do not give this type of suggestion.

DougT
03-04-2006, 06:53 AM
Hi,

Before you get too far into the project, be aware that there are quite a few free OCR packages already available (e.g. WCAR, Simple OCR). The quality may not be very high, but it means that your program must have some key differentiators which would make it worth someone buying it rather than using a free program.

So, in the research part of your project, apart from thinking about the technicalities of how to do it, you should also think about the features your program is going to have which the others do not, and hence give it a value.

I would expect that for an OCR program, differentiators might be the accuracy of the recognition, the number of different fonts it can recognise and perhaps the ability to recognise handwriting.

This is going to be a long, time-consuming, frustrating and educational project, certainly not one that I would expect a novice to be writing, especially if you're going to write the recognition part (and the neural network?) yourself.

However I admire your tenacity and wish you a lot of luck with it.

Regards
Doug

chitrang
03-04-2006, 07:23 AM
Hey, thanks very much.

mkaras
03-04-2006, 08:14 AM
This is interesting. You want to develop the OCR algorithm. On the other hand, you dismissed the suggestion to purchase an OCR engine. Yet you say you do not know where to start and want a link to some work from somebody who has already done this project. This does not add up in the maths world I grew up in.

chitrang
03-06-2006, 12:39 AM
Please guys, if you can't help me then it's OK, but at least do not discourage me. I'm still new to this world, but I have enough power to fight against it. Thanks, buddy, for the good positive and negative responses; they will add more and more to my confidence to do this project.
