Developing a Korean-alphabet OCR application (1)

Background of Development

Although Korean character OCR programs are used in many products, most of these OCR systems are developed in-house (not open source), perform poorly, or are very expensive. There are also few research papers discussing the performance of these OCR systems, which is very different from Chinese and English character generation problems.

While designing an evaluation method for a Korean handwriting GAN project, I had a hard time finding quality open-source Korean text OCR programs, so I decided to make one myself.

The structure of the Korean alphabet (Han-geul) differs from that of other writing systems. The full set contains a very large pool of characters (11,000+), but unlike Chinese characters, each Korean character can be split into 3 components: Chosung (initial consonant), Jungsung (medial vowel), and Jongsung (final consonant). There are roughly 20 to 30 possible letters for each component. Because of this property, developing a Korean-specialized OCR application may be considerably different from developing OCR for other languages.
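To make the three-component structure concrete, here is a minimal sketch that splits a precomposed syllable into its Chosung, Jungsung, and Jongsung using the arithmetic layout of the Unicode Hangul Syllables block (U+AC00 to U+D7A3). The function name `decompose` and the table names are my own for illustration, not part of any library.

```python
# Sketch: decompose a precomposed Hangul syllable via Unicode arithmetic.
# Names (decompose, CHOSUNG, ...) are illustrative, not from a library.

HANGUL_BASE = 0xAC00  # code point of the first syllable, '가'
HANGUL_LAST = 0xD7A3  # code point of the last syllable, '힣'

# Component tables in Unicode order.
CHOSUNG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")        # 19 initials
JUNGSUNG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")   # 21 medials
# 27 finals plus the empty string for "no final consonant" = 28 entries.
JONGSUNG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def decompose(syllable: str) -> tuple[str, str, str]:
    """Return (chosung, jungsung, jongsung) for one Hangul syllable."""
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"{syllable!r} is not a precomposed Hangul syllable")
    index = code - HANGUL_BASE
    # Syllables are ordered as initial-major, then medial, then final.
    cho = index // (len(JUNGSUNG) * len(JONGSUNG))
    jung = (index // len(JONGSUNG)) % len(JUNGSUNG)
    jong = index % len(JONGSUNG)
    return CHOSUNG[cho], JUNGSUNG[jung], JONGSUNG[jong]


if __name__ == "__main__":
    for ch in "한글":
        print(ch, decompose(ch))
```

The table sizes (19, 21, and 28 including "no final") multiply out to the 11,172 syllables in the block, which is where the 11,000+ figure comes from.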

Structure of the Korean alphabet Han-geul

Existing Korean OCR APIs/Projects

Google Tesseract OCR: Tesseract is an open-source OCR engine, originally developed at Hewlett-Packard and later sponsored by Google. This engine is very powerful for English character detection, but it struggles to recognize Korean strings.

Naver CLOVA OCR API: An OCR engine by NAVER. The API is very expensive: the basic plan costs $6,000 for 100,000 images (about $0.06 per image).

GitHub projects: parksunwoo, MijeongJeon, Wongi-Choi1014. The second project uses a very noisy data augmentation process that brings the data close to in-the-wild conditions and reaches 87% test accuracy. The third uses no noise, but shows 97% test accuracy.

Other OCR Programs: Many applications, such as Korean search engines, word processors, and converters, have OCR functionality embedded in the software.

A Google search for ‘Korean OCR’ doesn’t turn up many quality GitHub projects or research papers.

Project Objective

  • Compare multiple well-known deep learning model architectures for Korean OCR.
  • Test whether splitting characters into the three components benefits training speed/performance.
  • Compare performance against other applications and projects, and discuss improvements.
  • Provide an open-source Korean OCR pipeline.
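As a rough illustration of the second objective, the sketch below contrasts two ways of labeling a character for a classifier: one joint class over all syllables versus three small component labels. It only covers the label encoding, not a model. The function names (`joint_label`, `split_labels`, `merge_labels`) are hypothetical, and the component counts assume the Unicode Hangul Syllables block layout.

```python
# Sketch: joint vs. split label encodings for Hangul syllables.
# All names here are hypothetical and for illustration only.

HANGUL_BASE = 0xAC00                    # code point of '가'
N_CHO, N_JUNG, N_JONG = 19, 21, 28      # 28 includes "no final consonant"

JOINT_CLASSES = N_CHO * N_JUNG * N_JONG  # one softmax over every syllable
SPLIT_OUTPUTS = N_CHO + N_JUNG + N_JONG  # three small softmax heads in total


def joint_label(syllable: str) -> int:
    """Single class index in [0, JOINT_CLASSES) for one syllable."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < JOINT_CLASSES:
        raise ValueError(f"{syllable!r} is not a precomposed Hangul syllable")
    return index


def split_labels(syllable: str) -> tuple[int, int, int]:
    """Three class indices (chosung, jungsung, jongsung) for one syllable."""
    index = joint_label(syllable)
    return (index // (N_JUNG * N_JONG),
            (index // N_JONG) % N_JUNG,
            index % N_JONG)


def merge_labels(cho: int, jung: int, jong: int) -> str:
    """Rebuild the syllable from three predicted component indices."""
    if not (0 <= cho < N_CHO and 0 <= jung < N_JUNG and 0 <= jong < N_JONG):
        raise ValueError("component index out of range")
    return chr(HANGUL_BASE + (cho * N_JUNG + jung) * N_JONG + jong)


if __name__ == "__main__":
    print("joint classes:", JOINT_CLASSES)
    print("split outputs:", SPLIT_OUTPUTS)
    print(split_labels("한"), merge_labels(*split_labels("한")))
```

Under these assumptions, the split encoding needs far fewer output units than the joint one. Whether that actually helps training speed or accuracy is exactly what the comparison is meant to find out.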

Loves reading and writing about AI, DL💘. Passionate️ 🔥 about learning new technology. Contact me via LinkedIn: https://bit.ly/2VTkth7

Sieun Park
