We’ll be working on a project at a government learning institution, and I was hired to manage it. There are large volumes of data to be converted and digitized into text. The material will be scanned and run through OCR (optical character recognition) to produce text. That text must then be turned into useful information: sorted, and perhaps made part of a database, etc. From what I've heard, the material is physical research documents (written works such as master's or PhD theses), which will be OCRed into text. Does anyone have experience with this? Can you share your journey or give some tips for us newcomers to the field?