Digitisation is the process of converting information from analogue or hard copy formats into a digital format.
Many multi-function devices (MFDs) can scan a flat image, and some can also apply Optical Character Recognition (OCR) so that the digital document is searchable. OCR can be unreliable if handwriting is present, if there are too many images, or even if the document has numerous columns. For a quick, simple scan an MFD can do the job, but for important business documents to be searchable, accessible and useful, they must be digitised in a professional manner. Below are five key considerations when digitising documents.
1. Accuracy of the whole document
Given the range of document sizes (e.g. A4, A3, folio), it is important to have hardware that can scan a range of sizes, not just conventional A4 or smaller. This ensures the accuracy of the scan, with no loss of text at the edges and no blurring. It is achieved with large format and other specialised scanners.
2. Character recognition
Documents may contain more than typed text. There may also be handwriting, images and graphs, all of which should be included in the scan for it to be a true representation of the hard copy. OCR handles the typed text, Intelligent Character Recognition (ICR) recognises handwritten characters and Optical Mark Recognition (OMR) recognises marks made on forms, such as ticked boxes. All of these should be captured in the digitisation process.
3. Useable formats
Digitisation output should be possible in a variety of formats such as PDF, JPEG, TIFF and GIF. This enables the documents to be repurposed as high definition images for presentations, high-quality printing and more. It is particularly relevant when dealing with medical documents, engineering drawings and similar.
4. Redaction to ensure privacy
It may also be necessary to remove sensitive or personal information from documents in the digitisation process. This maintains the privacy of the individual while keeping important information available, e.g. medical test results that can be used for research purposes. Redaction requires specialist software, which can also bring together disparate documents into a single accessible document.
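As a rough illustration of the idea, the sketch below masks a few kinds of personal detail in text that has already been extracted from a scan. The patterns, labels and sample text are all hypothetical, chosen only for this example; real redaction rules depend on the documents involved, and redacting the scanned image itself (rather than its extracted text) is the job of the specialist software mentioned above.

```python
import re

# Hypothetical patterns for illustration only; real rules depend on
# the documents being processed and the applicable privacy requirements.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{4}[ -]?\d{3}[ -]?\d{3}\b"),
    "DOB": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def redact(text):
    """Replace each match with a label such as [EMAIL REDACTED]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


# Made-up sample text standing in for OCR output.
sample = (
    "Patient DOB 04/11/1972, contact jo.bloggs@example.com "
    "or 0412 345 678. Cholesterol: 5.2 mmol/L."
)
print(redact(sample))
```

The personal details are masked while the test result itself is left in place, which is the balance between privacy and usefulness described above.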
5. It’s not just scanning!
Simply scanning documents and storing them on a computer drive somewhere provides a good backup of the hard copy, but what use is that to the organisation? Documents must have metadata and keywords applied and be filed in a proper structure. This ensures they can be located either by the metadata applied to the digital document or by the words and phrases found within it, which is made possible by OCR, ICR and OMR. The classification and indexing of documents requires knowledge and experience to get right.
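To make the point concrete, here is a minimal sketch of how applied metadata and recognised text can both feed a single search index. The Document class, the field names and the sample records are assumptions invented for this example, not a description of any particular document management system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical record layout, for illustration only.
    doc_id: str
    text: str  # text recovered by OCR/ICR
    metadata: dict = field(default_factory=dict)


def tokens(value):
    """Split a string into lowercase words, dropping edge punctuation."""
    words = (w.strip(".,;:()").lower() for w in value.split())
    return [w for w in words if w]


def build_index(docs):
    """Map each word in the text or metadata to the ids of documents containing it."""
    index = defaultdict(set)
    for doc in docs:
        for word in tokens(doc.text):
            index[word].add(doc.doc_id)
        for value in doc.metadata.values():
            for word in tokens(str(value)):
                index[word].add(doc.doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokens(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


# Made-up sample records.
docs = [
    Document("INV-001", "Invoice for scanner maintenance",
             {"type": "invoice", "year": "2021"}),
    Document("DRW-007", "Bridge elevation drawing revision B",
             {"type": "engineering drawing", "year": "2021"}),
]
index = build_index(docs)
print(search(index, "drawing 2021"))
```

A query such as "drawing 2021" matches one word from the recognised text and one from the metadata, which is why both need to be captured when the document is digitised.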
If your business documents are not being digitised with all these components in place, you are not getting the most out of the documents you hold. All of the above features enable documents to become data and ultimately knowledge.