Logbook

(31/08/2016)
First, we tested on Image5 of the testbench. We interpolated the image and then passed it to Tesseract; the output is shown below. Interpolation increased the resolution and repaired broken characters to some extent. Compare it with the previous output of the same image.

Take It or Leave It

Be the one to always give and not expect anything in return. Give, no: to avoid being hurt, but to feel content with whet you can do yourself. to be independent. to be happy without needing anyone to give you happiness. Be like a breeze of change that inspires others to see their abilities without needing you to stay. Make people think. Make them wonder, and let them know that what you give is only based on what they are willing to take. Don't attribute your success based on whether you make a difference with everyone you meet but on the kind of difference you are willing to make. Accept that you have no ownership over people even if you do give them more than 3'0“ receive. The effort that you put into inspiring #- Othm to value their own selves, and to see the best in 'hdeS. gives them two choices: either to take it or leave

it. Whatever they choose is not a ' f °‘ . . . sign 0 your success f failure unless you believe that to be the use. "1

mm. 3

After that, we tried the same thing on Image6 of the testbench. The resolution of this image is already high, so if we increase it further, Tesseract produces output only for the middle part of the image and ignores some lines at the top and bottom. Tesseract internally reduces the resolution of an image if it is too high, so this may be the cause of the problem.
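The interpolation step can be sketched as follows. This is a minimal illustration only: the log does not say which interpolation method or tool was used, so bilinear interpolation with NumPy is an assumption, and `upscale_bilinear` is a hypothetical helper name.

```python
import numpy as np

def upscale_bilinear(img, factor):
    """Upscale a 2-D grayscale image by an integer factor (bilinear)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    new_h, new_w = h * factor, w * factor
    # Sample positions in source coordinates; corners map onto corners.
    ys = np.linspace(0, h - 1, new_h)
    xs = np.linspace(0, w - 1, new_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]   # vertical blend weights
    wx = (xs - x0)[None, :]   # horizontal blend weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

small = np.array([[0, 10], [20, 30]])
large = upscale_bilinear(small, 2)   # interpolation factor = 2
print(large.shape)
```

Since an already high-resolution input appears to be downscaled again inside Tesseract, the factor would probably have to be chosen per image rather than fixed.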

Input:

Text file of original-size image

Be Semidve

h inimcs me whm People nclually hehm

V” hm , hene. dun ozhns. 1- Amurcs In: that sommzijuu F" ..

him or herself due right .0 ulk am. In when m ., orhns ofdoing dung: wuhom asking for xhrix «.4. ..,-T; story. I: irritates me when peoplc gr: plenum u. M, .,,,\m fall ﬂaron Lhzir rm: in punuil ul rhdr dxnms. h mm, M whm people min In pleas: ma live up m we mrh ,...,...., sums, and ham. h ixrilares me whzn penpk are mun negalivr dun posxlivc ma spxvad lhux ncg:n\'1r}' enema Ihzy go. I: imulcs me when someone dots Dol mum. nae It heme; me when people ,e.d,;e mhm baud Ml that looks Iufox: me, even 3:! .0 know rhcm. h xmum me I-hm people wax for the hidden pun-pone bchu-ad crux)‘ m of Hndnas, llol lulizing nhn sometimes h put dou llol emu. h same. we when I we love given md hm hem“: .\Xxny whiny inim: me. mad 1 have Ina-nnd am me my m Hop lhax iuinn'on ‘s not by becoming immun: ha (ha: dungs. ha. nimble qunlily to he Able to recognize hea things u-hm win we um ma to lam (mm them. 1: you hue um semilrvm. ntvulct in go. lflhme thing: shame you, work oh covwiodng ylmndidlllyouwilldowhalyouuunmdlhnlyonmﬂm-‘Y dcIomII¢h.AlIh:endofIhed.|y.II:'x::JJhumzn.Ind nmdmuveunnottxtmdmxbelpialghands ﬁudwnhzn

EMT; itouhusueumwunhgmurrnd

yqehzhum

Text file of resized image with interpolation factor=2

better than others. It irritates me that so him or herself the right to talk down to others 0, K ‘- others of doing things without asking for their <ide us‘ 9, story. lt irritates me when people get pleasure In wing “hm fall ﬂat on their faces in pursuit of their dreams. It trnntcs tn- when people strive to please and live up to those with rnon¢‘_ status, and power. It irritates me when people are mm} negstive than positive and spread their negatixity I'l|crz\~g-g they go. It irrimes me when someone does not return a smile. It irritates me when people judge others based on then look before they even get to know them. It irntates me when people wait for the hidden purpose behind every act of kindness, not realizing that sometimes it just does not 6305!. It irritates me when I see love and not returned Many thing initnte me, and I have learned that the way to stop that imitation ‘u not by becoming immune to these things. It‘: I valuable quality to be able to recognize bad things when you see them and to learn from them. If you have that sensitivity.

Text file of image resized to 3 times the original size (interpolation factor = 3)

Be Sensitive

It irritates me when people actually believe better than others. It irritates me that so

him or herself the right to talk down to

smile. It irritates me when people judge others based on their

loolts before they even get to know them. It irritates me when

people wait for the hidden purpose behind every act of kindness, not realizing that sometimes it just does not exist. It irritates me when l see love given and not returned. Many things irritate me. and I have learned that the way to stop that irritation is not by becoming immune to these things. It's a valuable quality to be able to recognize bad things when you see them and to learn from them. If you have that sensitivity. never let it go. If these things irritate you. work on convincing

(18/08/2016)
Input 1:



Output 1

Procedure:
 * 1) img1 = histogram_equalization(input)
 * 2) img2 = AverageRow(img1)  // img2 is Output 1 of the line-detection experiment on (15/08/2016)
 * 3) edge = find_edge(img2)
 * 4) dilate = do_dilate(edge)  // morphological operation that thickens the edges; I used it to cover all text in the image so that the background can be subtracted
 * 5) J = imgaussfilt(img1,4) - imgaussfilt(img1,0.3);  // subtracts a lightly blurred image from a heavily blurred one so that the illumination becomes uniform
 * 6) Compare J with dilate and make the background in J white.
 * 7) Perform Otsu thresholding on J.
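The final thresholding step of this procedure can be sketched in NumPy as below. This is a generic Otsu implementation written for illustration, not the code used in the experiment; `otsu_threshold` is a hypothetical name, and the input is assumed to be an 8-bit grayscale image.

```python
import numpy as np

def otsu_threshold(gray):
    """Return Otsu's threshold t; pixels <= t form one class, > t the other."""
    values = np.asarray(gray, dtype=np.uint8).ravel()
    hist = np.bincount(values, minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of the lower class
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom   # between-class variance
    sigma_b[denom == 0] = 0.0                 # levels where one class is empty
    return int(np.argmax(sigma_b))

# Synthetic example: dark "text" (50) on a light "background" (200).
img = np.full((20, 20), 200, dtype=np.uint8)
img[5:15, 5:15] = 50
t = otsu_threshold(img)
binary = np.where(img > t, 255, 0).astype(np.uint8)
```

Otsu picks one global threshold from the histogram, which is why the earlier steps try to make the illumination uniform before it is applied.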

Input 2:



Output 2:



Input 3:



Output 3:



(18/08/2016)
Input 1:

Output 1:

Input 2:

Output 2:

Input 3:

Output 3:



(15/08/2016)
Input 1:

Output 1:

We computed the average intensity of each row and of the whole image, and then adjusted each row so that its average is close to the image average. This produced the output shown above.
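A minimal sketch of this row-averaging idea is given below. The log does not state how each row was adjusted, so the additive shift used here is an assumption (a multiplicative gain would be another option), and `equalize_row_means` is a hypothetical name.

```python
import numpy as np

def equalize_row_means(gray):
    """Shift every row so that its mean matches the mean of the whole image."""
    img = np.asarray(gray, dtype=np.float64)
    row_means = img.mean(axis=1, keepdims=True)   # one mean per row
    global_mean = img.mean()                      # mean of the whole image
    shifted = img + (global_mean - row_means)     # assumed: additive correction
    return np.clip(shifted, 0, 255).astype(np.uint8)

# Two rows with the same pattern but different brightness.
img = np.array([[10, 20, 30],
                [110, 120, 130]], dtype=np.uint8)
out = equalize_row_means(img)
print(out)
```

Clipping to the 0..255 range means that very bright or very dark rows may not reach the image average exactly.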

Output 2:

After that, we tried to find the line coordinates as you had suggested: we passed the Output 1 image to Tesseract to get boxes on it. However, Tesseract thresholds the image first and only then finds coordinates. The thresholding degrades the image, so coordinates are returned only for the portion where thresholding worked well. There is no way to bypass Otsu thresholding in Tesseract; only when the input is already a binarized image is the internal thresholding effectively bypassed.

Output 3:

After that, I tried a morphological operation that turns the surrounding background completely white. The result is shown in Output 3. The open question is again how to obtain a properly thresholded image: I tried Otsu and some other algorithms on this image, but it still does not work.

(11/08/2016)
Input 1:

Output 1:

Input 2:

Output 2:

Input 3:

Output 3:

Input 4:

Output 4:

(21/07/2016)
Input Image to tesseract

'''Output from Tesseract: image thresholded with the Otsu method.'''

'''Output from OpenCV-Python: image thresholded with the adaptive Gaussian method.'''

Conclusion: Adaptive thresholding is generally expected to be better than Otsu's thresholding, but in this comparison Otsu looks better. The reason is that Tesseract filters (pre-processes) the image before thresholding, whereas I fed the image directly to adaptive thresholding, so its output is poor compared to Otsu's method. We therefore need to find pre-processing steps to apply before adaptive thresholding.
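For reference, the idea behind adaptive thresholding can be sketched in plain NumPy. The experiment used OpenCV's adaptive Gaussian method; the version below swaps in an unweighted local mean computed with an integral image, so it is an approximation of the technique rather than the code that produced the image above. The function name and the `block` and `c` defaults are assumptions.

```python
import numpy as np

def adaptive_mean_threshold(gray, block=15, c=5):
    """Mark a pixel white if it is brighter than (local mean - c)."""
    img = np.asarray(gray, dtype=np.float64)
    h, w = img.shape
    r = block // 2
    k = 2 * r + 1                                   # effective (odd) window size
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading zero row and column.
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant")
    integral = integral.cumsum(axis=0).cumsum(axis=1)
    window_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                  - integral[k:k + h, :w] + integral[:h, :w])
    local_mean = window_sum / (k * k)
    return np.where(img > local_mean - c, 255, 0).astype(np.uint8)

# Uneven illumination (left-to-right gradient) with one dark "ink" pixel.
page = np.tile(np.linspace(100, 200, 64), (64, 1))
page[32, 32] = 0
result = adaptive_mean_threshold(page)
```

Because each pixel is compared with its own neighbourhood, the gradient does not affect the result, but noise inside a window does, which is consistent with the need for pre-processing noted above.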

(19/07/2016)
Input Image:

Output Images:





On English Image - (04/07/2016)
Input:

Output: Regions

Output: lines

Output: words

The bounding is not good in this last image, but the result can be improved by using a different page segmentation mode.

Conclusion: Tesseract's boxing is good enough, provided we use the correct page segmentation mode.

On Gujarati Image - (12/07/2016)
Input

Output

Input

Output

Conclusion: Boxing on Gujarati images is also good, but it sometimes misses the upper modifiers.

(29/06/2016)
Input:

output is xml file:  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">        This is a lot of 12 point <span class='ocrx_word' id='word_1_8' title='bbox 374 93 427 116; x_wconf 85'>text <span class='ocrx_word' id='word_1_9'  title='bbox 437 93 463 116; x_wconf 93'>to <span class='ocrx_word' id='word_1_10' title='bbox 474 93 526 116; x_wconf 90'>test <span  class='ocrx_word' id='word_1_11' title='bbox 536 92 580 116; x_wconf 87'>the <span class='ocr_line' id='line_1_2' title="bbox 36 126 618 157; baseline 0 -7; x_size 31; x_descenders 7; x_ascenders 6"><span class='ocrx_word' id='word_1_12' title='bbox 36 132 81 150; x_wconf 93'>ocr <span class='ocrx_word' id='word_1_13' title='bbox 91 126 160 150;  x_wconf 91'>code <span class='ocrx_word' id='word_1_14' title='bbox 172 126 223 150; x_wconf 94'>and <span class='ocrx_word'  id='word_1_15' title='bbox 236 132 286 150; x_wconf 88'>see <span class='ocrx_word' id='word_1_16' title='bbox 299 126 314 150; x_wconf  96'>if <span class='ocrx_word' id='word_1_17' title='bbox 325 126 339 150; x_wconf 88'>it <span class='ocrx_word' id='word_1_18'  title='bbox 348 126 433 150; x_wconf 90'>works <span class='ocrx_word' id='word_1_19' title='bbox 445 132 478 150; x_wconf 94'>on <span class='ocrx_word' id='word_1_20' title='bbox 500 126 529 150; x_wconf 91'>all <span class='ocrx_word' id='word_1_21' title='bbox 541 127 618 157; x_wconf 89'>types <span class='ocr_line' id='line_1_3' title="bbox 36 160 223 184; baseline 0 0; x_size 31.214842; x_descenders 7.2148418; x_ascenders 6"><span class='ocrx_word' id='word_1_22' title='bbox 36 160 64 184; x_wconf 91'>of <span class='ocrx_word' id='word_1_23' title='bbox 72 160 113 184; x_wconf 92'>file <span class='ocrx_word' id='word_1_24' title='bbox 123 160 223 184; x_wconf 88'>format. 
<p class='ocr_par' id='par_1_2' lang='eng' title="bbox 36 194 597 361"> <span class='ocr_line' id='line_1_4' title="bbox 36 194 585 225; baseline 0 -7; x_size 31; x_descenders 7; x_ascenders 6"><span class='ocrx_word' id='word_1_25' title='bbox 36 194 91 218; x_wconf 94'>The <span class='ocrx_word' id='word_1_26' title='bbox 102 194 177 224; x_wconf 90'> quick <span class='ocrx_word' id='word_1_27' title='bbox 189 194 274 218; x_wconf 91'>brown <span class='ocrx_word' id='word_1_28' title='bbox 287 194 339 225; x_wconf 90'>dog <span class='ocrx_word' id='word_1_29' title='bbox 348 194 456 225; x_wconf 91'>jumped <span class='ocrx_word' id='word_1_30' title='bbox 468 200 531 218; x_wconf 94'>over <span class='ocrx_word' id='word_1_31' title='bbox 540 194 585 218; x_wconf 87'>the <span class='ocr_line' id='line_1_5' title="bbox 37 228 585 259; baseline 0 -7; x_size 31; x_descenders 7; x_ascenders 6"><span class='ocrx_word' id='word_1_32' title='bbox 37 228 92 259; x_wconf 89'>lazy <span class='ocrx_word' id='word_1_33' title='bbox 103 228 153 252; x_wconf 91'>fox. <span class='ocrx_word' id='word_1_34' title='bbox 165 228 220 252; x_wconf 98'>The <span class='ocrx_word' id='word_1_35' title='bbox 232 228 307 258; x_wconf 91'>quick <span class='ocrx_word' id='word_1_36' title='bbox 319 228 404 252; x_wconf 93'>brown <span class='ocrx_word' id='word_1_37' title='bbox 417 228 468 259; x_wconf 93'>dog <span class='ocrx_word' id='word_1_38' title='bbox 478 228 585 259; x_wconf 92'>jumped <span class='ocr_line' id='line_1_6' title="bbox 36 262 597 293; baseline 0 -7; x_size 31; x_descenders 7; x_ascenders 6"><span class='ocrx_word' id='word_1_39' title='bbox 36 268 99 286; x_wconf 93'>over <span class='ocrx_word' id='word_1_40' title='bbox 109 262 153 286; x_wconf 90'>the <span class='ocrx_word' id='word_1_41' title='bbox 165 262 221 293; x_wconf 91'>lazy <span class='ocrx_word' id='word_1_42' title='bbox 231 262 281 286; x_wconf 93'>fox. 
<span class='ocrx_word' id='word_1_43' title='bbox 294 262 349 286; x_wconf 95'>The <span class='ocrx_word' id='word_1_44' title='bbox 360 262 435 292; x_wconf 90'>quick <span class='ocrx_word' id='word_1_45' title='bbox 447 262 532 286; x_wconf 91'>brown <span class='ocrx_word' id='word_1_46' title='bbox 545 262 597 293; x_wconf 90'>dog <span class='ocr_line' id='line_1_7' title="bbox 43 296 561 327; baseline 0 -7; x_size 31; x_descenders 7; x_ascenders 6"><span class='ocrx_word' id='word_1_47' title='bbox 43 296 150 327; x_wconf 91'>jumped <span class='ocrx_word' id='word_1_48' title='bbox 162 302 226 320; x_wconf 91'>over <span class='ocrx_word' id='word_1_49' title='bbox 235 296 279 320; x_wconf 94'>the <span class='ocrx_word' id='word_1_50' title='bbox 292 296 347 327; x_wconf 92'>lazy <span class='ocrx_word' id='word_1_51' title='bbox 357 296 407 320; x_wconf 91'>fox. <span class='ocrx_word' id='word_1_52' title='bbox 420 296 475 320; x_wconf 94'>The <span class='ocrx_word' id='word_1_53' title='bbox 486 296 561 326; x_wconf 91'>quick <span class='ocr_line' id='line_1_8' title="bbox 37 330 561 361; baseline 0 -7; x_size 31; x_descenders 7; x_ascenders 6"><span class='ocrx_word' id='word_1_54' title='bbox 37 330 122 354; x_wconf 91'>brown <span class='ocrx_word' id='word_1_55' title='bbox 135 330 187 361; x_wconf 90'>dog <span class='ocrx_word' id='word_1_56' title='bbox 196 330 304 361; x_wconf 91'>jumped <span class='ocrx_word' id='word_1_57' title='bbox 316 336 379 354; x_wconf 94'>over <span class='ocrx_word' id='word_1_58' title='bbox 388 330 433 354; x_wconf 94'>the <span class='ocrx_word' id='word_1_59' title='bbox 445 330 500 361; x_wconf 96'>lazy <span class='ocrx_word' id='word_1_60' title='bbox 511 330 561 354; x_wconf 91'>fox.
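The word boxes and confidences in output like the above can be pulled out with a small script. The sketch below uses only Python's `re` module and matches the flattened form shown in this log (closing tags missing); the pattern and the `parse_hocr_words` name are illustrative assumptions, and a complete hOCR file would be better handled with a real HTML parser.

```python
import re

# Matches one ocrx_word span as it appears in the output above.
WORD_RE = re.compile(
    r"<span\s+class='ocrx_word'\s+id='(?P<id>[^']+)'\s+"
    r"title='bbox\s+(?P<x0>\d+)\s+(?P<y0>\d+)\s+(?P<x1>\d+)\s+(?P<y1>\d+);\s*"
    r"x_wconf\s+(?P<conf>\d+)'>\s*(?P<text>[^<\s]+)"
)

def parse_hocr_words(hocr):
    """Return a list of dicts with id, bbox (x0, y0, x1, y1), conf and text."""
    words = []
    for m in WORD_RE.finditer(hocr):
        bbox = tuple(int(m.group(k)) for k in ("x0", "y0", "x1", "y1"))
        words.append({"id": m.group("id"), "bbox": bbox,
                      "conf": int(m.group("conf")), "text": m.group("text")})
    return words

# Two words copied from the output above (note the irregular spacing).
sample = ("<span class='ocrx_word' id='word_1_8' title='bbox 374 93 427 116; x_wconf 85'>text "
          "<span class='ocrx_word' id='word_1_9'  title='bbox 437 93 463 116; x_wconf 93'>to ")
words = parse_hocr_words(sample)
for w in words:
    print(w["id"], w["bbox"], w["conf"], w["text"])
```

The `bbox` values are pixel coordinates (left, top, right, bottom) and `x_wconf` is the per-word confidence, so the same loop could be used to draw boxes or to filter low-confidence words.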

Page layout analysis as output:

Thresholding Operation
Input to tesseract for thresholding.

Output of thresholding

Input:

Output of thresholding:

Input:

Output:

Input:

Output:

Conclusion: Tesseract's thresholding is good if the image is clear, but it performs poorly when the background has non-uniform lighting or noise. It needs improvement.

Experiments on Gujarati scanned image
I have converted many Gujarati images, as our goal is to make Tesseract better for the Gujarati language.

I tried some blurred images.

Input :

Output:

ગ્પ્જરૃદુત રૃદુષુ માગ્ અને. મક્દ્ન વભાગ ' તની મોનલાઇન યોર ત્તિવિદા નંબર -1૯ સને પ્પ-પ્ડ ગુજરાત રાજ્યના રાજ્યપાલ ના વતી કા પાલક ઈજનેર [મા-મ) વિભાગ. રાવપુરા પોલીસ સ્ટેશનની બાજુમાં. રાવપુરા, વડોદરા (શેન નં…૬૫૫૪૩0૮1ની કચેરીઐયી સરક્રારથ્રીમાં યોગ્ય. શ્રેણીમાં નોંધણી ધરાવતા ‘ચે.’ વર્ગ અને ઉપરના ર્ઘજારદારશ્રોઓ પાસેથી સમરસ શૈરટેલ વડોદરા ખાતે ફર્નીચરના કામ માટે ટેન્ડર ઓનલાઇન મંગાવવામાં આવે છે. કે જેની અંદાજીત રકમ રૂ.૫૦0.0૦ લાખ સુધી છે, સદર કામની ઓનલાઇન ટૅન્ડર ભરવાની છેલ્લી તામ્ડગ્-પ્-પ્ક સાંજના પ્૮:00 કલાક સુધી વેબસાઇટ

!! 8://|'|]1મ્.|1 [બ્ભાત્ત્ … પર ઉપલબ્ધ થશે. વધુ વિગતો માટે ઉપરોક્ત ક્ચૈક્રુનો સંપરું સાધવા વિનંતી. માહિનીં- પડો- ૧૦૯૭/૨૦૧ ૨૦૧૬

Some images with clean white backgrounds.

Input:

Output: 2015 માં શરૂ કરીને, આપણે શું હાંસલ કર્યું ની સપ્થ્રે આપણે ’કેવી રીતે’ તે હાંસલ કર્યું આ બનેલું માપન અને તેને પુરસ્કૃત કરવા માટે આપણી વૈશ્વિક કામગીરી વ્યવસ્થાપન પ્રક્રિયાને અમે સરેખિત કરી રહ્યા છીએ. આ પ્રક્રિયામાં ભાગ લેતા દરેક કર્મચારીને તેમના આ વર્ષના લક્ષ્ચોનો જ વિકાસ કરવા માટે નહિ પણ તેમના ભવિષ્ય માટેની વિકય્સ યોજનાઓ માટે પણ પૂછીને આપણે આપણી વ્યક્તિગત ક્ષમતાઓ તેમ જ આપણી કંપનીની તાકપ્તને મજબૂત બનાવશું. અને, જેમ આપણે સાથે મળીને આ પ્રકારના પરિવર્તન કરીશું, તેમ આપણે સ્પષ્ટપણે એક સંસ્કૃતિ અને એક કંપની ને પણ મજબૂત કરીશું કે જે યોગ્ય રીતે પરિણપ્મો આપવા પર કેન્તિત છે.

નવા અને અલગ તરીકપ્ઓમાં વૃદ્ધિ પામવા માટે એક કંપની તરીકે આપણે એક મહત્વના પ્રવાસ પર છીએ. આપણા દરેક દ્વારા કરપ્તા યોગદાન કરતાં આપણી વૃદ્ધિ ણૂંહરચના માટે બીજું… કંઇ મહત્વપૂર્ણ નથી. મને આશા છે કે તમે આપણી નવી કદમગીરી વ્યવસ્થાપન પ્રક્રિયાને ભવિષ્યની તમારી પોતાની વૃદ્ધિ યોજનાનો નિકાસ કરવાની એક તક તરીકે રવીકપ્રશૌ જેથી આપણે બધા સાથે મળીને સફળતા મેળવી શકીએ. કૃપા કરીને એયઆર અને તમારા મેનેજર પાસેથી આ નવી પ્રક્રિયા વિષે વધારે માહિતી અને પ્રશિક્ષણ તકો જાણી લો

Some with inclined text lines in images.

Input:

Output: ચરણુ ×…,× ચાંપીરૂંમૂછમરડી, ‘…નાગ'રૅક્રુ …ન્…… નામ [*જગાડિંર્થેમ્ક્ ઊઠે! ને …ર્પ્ ખ ળ વ‘ ત,…ર્ડ્સ …ષ્ઠા ઈંફ઼ ખારથ્રે ડુંફ઼ખાળક આવિચૈપ્.-જલ૦

બેઉ ’ ખળિચા બંયિ વળગિયા, કૃષ્ણે કાળીનાગ ’…નપ્યિયેય્;

સહસ્ર ફેંણેય્ ફૂંક્વૈ જૈમ ગગન ગાજૈ હાયિચૈમ્.--જલ૦ *

નાગણુ સૈપ્ ત્રિલાપ/ક્રૈ હૈ?, તાગનૈ ખરૂં દુઃખ આપશે; મથુરા નગરોમાં લઈ જશે, પછી નાગર્તુ * શીરપ્ કાપશે…-જલ૦

બે ક્સ્ ;તેડીર્જનિઃપ્યે, સ્વામીડૂ મૂ'ફેંમ્ અમારા કંથ'તે;

અમે અપરાધી કાંઈ ન સમજમાં, ન રેંમેપ્ળખ્ધા ભગવ'તનૈત્મ્જલબ્

‘થાળ ભરી રામ ચૈપ્તીડે,

શ્રીકૃષ્ણુનૈ રે વધાવિ’યેપ્; નરરૈ’ય્યાતા નાથ પાસેથી,

નાગ્ણિ નાગ છેપ્ડપ્યિચૈય્,-જલ૦

નાંરસિહ મહેતા

રું‘ન્ર્ડ્સત્_

૧. કૃષ્ણ યમુનાનાં જળમાં શા માટે પડ્યા હતા ૬ … '… ×… × … “ ***-. …ભી …… દ્રુતી !

I have tried many more images, which can be found here.

Conclusion: Tesseract's accuracy for the Gujarati language is very poor. Modifiers were often detected wrongly, and in noisy images there were errors even in detecting the basic letters. Images whose text lines are inclined by more than about 25-30 degrees cannot be processed by Tesseract; it reports an error such as "file is empty". For vertical text lines, the output is extremely poor.

High quality image
Output = STATEMENT OF GEORGE SOROS

BEFORE THE US. HOUSE OF REPRESENTATIVES COMMITTEE ON OVERSIGHT AND GOVERNMENT REFORM

NOVEMBER 13, 2008

Thank you Mr. Chairman and members of the Committee.

The salient feature of the current ﬁnancial crisis is that it was not caused by some external shock like OPEC raising the price of oil or a particular country or ﬁnancial institution defaulting. The crisis was generated by the financial system itself. This fact——that the defect was inherent in the system—contradicts the prevailing theory, which holds that ﬁnancial markets tend toward equilibrium and that deviations from the equilibrium either occur in a random manner or are caused by some sudden external event to which markets have difficulty adjusting. The severity and amplitude of the crisis provides convincing evidence that there is something fundamentally wrong with this prevailing theory and with the approach to market regulation that has gone with it. To understand what has happened, and what should be done to avoid such a catastrophic crisis in the future, will require a new way of thinking about how markets work.

Consider how the crisis has unfolded over the past eighteen months. The proximate cause is to be found in the housing bubble or more exactly in the excesses of the subprime mortgage market. The longer a double-digit rise in house prices lasted, the more lax the lending practices became. In the end, people could borrow 100 percent of inﬂated house prices with no money down. Insiders referred to subprime loans as ninja loans—no income, no job, no questions asked.

The excesses became evident aﬁer house prices peaked in 2006 and subprime mortgage lenders began declaring bankruptcy around March 2007. The problems reached crisis proportions in August 2007. The Federal Reserve and other ﬁnancial authorities had believed

that the subprime crisis was an isolated phenomenon that might cause losses of around $100

Low quality image


1'21me 0! [give I.

a: {he on m um, gn'K um um um“! mm; In mm an nm In lmld bung hm. hm w feel mm" mm. m mu un du mum". to be WE‘KPEIIJ‘HI. l0 Ix happy Wuhan: mam, :munt m ”v: mu inﬁrm: m hku: 1 hum of dung: am mgpm amen m m Ihur .wms Wuhan nxdmg pm m in) Mm ptnplr dunk mm um mun. ma 1n vhcm 1mm- um wh-r van gave u only hm: all m am 2-: wdlmg m uke Don't mnbule m... sumx Xanadu wlmhn ,m nuke a meme: \nlh («layout you mm bu an m: 1m aldﬂmm} u m M‘A‘bng in mm, .\<u'n zhl you luv: no o'lluslnp m‘ﬂ pmpl: :vm ﬂyuu as Dr: mu- , max: mm \m mm, Th: :[fon um ywu Fm Inn) mrpiuu * ulhm In Him Ibur own uhvuv wt] in in All but i hanﬁv pus them No chum: a'Lhu m uh n u: luv:

'* m”""!dmol:unma ormmmu- 5"“ “Mn-m helm: le mix mi“ ' 'l

my... 11



Be Sensitive

1e im'nm me whm people “man,- helm r

h x n bene. dun olhm 1- mum nee lhzl Sammie?” r’" ..

him or neneu an night In A“: dam. m Dlhm n, k orhm ofdoing dung: wuhom «king {or xhzix «Me .n- n, story. It mute: me when people 5:: plenum nu mm “M fall ﬂaron Lhzir rm: in punuil u! mean dxnms. Ir .nnnu ,r‘ when pedpre min In plea: and live up to dune mrh m, slams, .nd pom. 1e ixrimcs me whzn peopk an- n.4,, negaliw dun posm'vc .nd spmd men nepnnn» enema Ihzy go. 1: imulcs me when someone dots no! mum. smile. I: mink: me when people judge mhm baud on men looks Info" duty even 3:! lo know Khan. 1. Imum me when people wait for on hidden W bnhmd n-en- m of Hudneu, no! lulizing um mam“ in Just does no! em. n inium a: when l we love given .nd nm xemnd Mm; rung. ininm m. and 1 have leaned um [11: I'ly m “on um iuinn'on 'n w by becoming immun: to am. dungs. m . 'Ihlue Mr! :0 be nbl: to recognize bed things u'hm 3w we then: .nd n; lam (mm them. 1: you hue u... semim'lm min it go. "thug ming- him: you, work on ednn‘mn Willinmwﬂldowhuywanmdlhnlyonunm‘! dcmmuch.AIIheeudofvhedly.w'xeaﬂhmiﬂd Wumnmmmm Mum“

Manx-ch. iiothus xwillin mun-Id Mews-ex "u” 5

”um-n

Conclusion: Tesseract's accuracy is quite good for (English) images with little noise. For images with some noise, even though they remain human-readable, Tesseract cannot detect characters accurately and misinterprets them. Its accuracy drops when the scanned image is noisy, so it cannot be trusted for such images.