Optical Character Recognition Resources

http://www.prima.cse.salford.ac.uk/tools/TesseractOCRToPAGE

See this: it is a tool for getting the page layout (in PAGE format) out of Tesseract.
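
Once the layout has been exported, it can be read back with a few lines of Python. This is only a minimal sketch, assuming the tool writes PAGE XML in which each TextRegion has a Coords element with a points attribute (as in the 2013 and later PAGE schema versions). The sample document and the helper names (local_name, text_regions) are hypothetical and hand-written for illustration, not actual tool output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of PAGE XML. Real output from the tool
# may use a different schema version and contain many more elements.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="scan.png" imageWidth="1000" imageHeight="1400">
    <TextRegion id="r0">
      <Coords points="100,100 900,100 900,300 100,300"/>
    </TextRegion>
    <TextRegion id="r1">
      <Coords points="100,400 900,400 900,1200 100,1200"/>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code ignores the schema version."""
    return tag.rsplit("}", 1)[-1]


def text_regions(xml_text):
    """Return (region id, (min_x, min_y, max_x, max_y)) for each TextRegion."""
    root = ET.fromstring(xml_text)
    regions = []
    for element in root.iter():
        if local_name(element.tag) != "TextRegion":
            continue
        # Look only at direct children so nested line/word Coords are skipped.
        coords = next((c for c in element if local_name(c.tag) == "Coords"), None)
        if coords is None or not coords.get("points"):
            continue
        # The points attribute is a space-separated list of "x,y" pairs.
        points = [tuple(int(v) for v in pair.split(","))
                  for pair in coords.get("points").split()]
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        regions.append((element.get("id"), (min(xs), min(ys), max(xs), max(ys))))
    return regions


if __name__ == "__main__":
    for region_id, box in text_regions(SAMPLE):
        print(region_id, box)
```

Matching on the local tag name instead of a fixed namespace is a deliberate choice here, since the namespace URI changes between PAGE schema versions.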