[Story] Exploring the Hangul Font Standard and Its Surrounding Stories



Introduction

Do you know which characters are included—or missing—in the fonts you use? You may have experienced finding a font you really like and applying it to your work, only to discover that certain characters won’t appear or are replaced by a default font. This happens because each font supports a different character set, meaning the characters included in a font file can vary. According to the Unicode Standard—an industry standard designed to consistently represent all characters worldwide on computers—nearly 160,000 characters have been registered to date. Since no single font can support all of them, fonts are built by selecting only the characters needed for specific users and usage environments. The Sandoll font standard (SD-KR) that we researched is also a Hangul font standard composed of carefully selected characters, designed with the environment of modern Korean users in mind. So how did this research begin, and how was it carried out? Let’s take a closer look.


Unicode continues to collect characters in order to represent all written languages of the world.
Unicode website main page (Image source: unicode.org)



Font Standard History — From Hangul 2,350 to 11,172, and to 2,780

Have you ever seen descriptions such as “Hangul 2,350 / Latin 95 / Symbols 985” on the Sandoll Cloud website or other platforms? This information indicates a font’s character support range—in other words, which characters are included in the font and what it supports. Specifically, it means that the font contains 2,350 Hangul characters, 95 basic Latin characters including the alphabet, punctuation, and numerals, and 985 symbols*, such as numbers, shapes, pictograms, and unit symbols. So where does the number “2,350 Hangul characters” come from?

Modern Hangul can form a total of 11,172 characters by combining consonants** such as ‘ㄱ, ㄴ, ㄷ’ with vowels*** such as ‘ㅏ, ㅗ, ㅘ’, which together make up jamo****. In 1974, the Korean Standards Association established KS C 5601-1974 (Code for Information Interchange) to assign standardized codes for Hangul in computing systems. At the time, it adopted a compositional Hangul input method, which combined 51 jamo, meaning there was effectively no limitation on the number of Hangul characters that could be entered. In addition, codes were assigned for a standard set that included basic Latin and Roman symbols (34 characters) and Hangul-related symbols (13 characters). While this standard was designed with domestic usage environments in mind, it conflicted with the internationally used ASCII code, leading to issues with international compatibility.
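The figure of 11,172 follows from simple arithmetic: modern Hangul has 19 initial consonants, 21 vowels, and 27 final consonants plus the option of no final consonant, so 19 × 21 × 28 = 11,172. The sketch below illustrates this with the syllable arithmetic used by the Unicode Standard, which postdates the 1974 standard discussed here. The function name `compose` is our own and serves only as an illustration.

```python
# Jamo counts for modern Hangul syllables.
INITIALS = 19  # initial consonants
MEDIALS = 21   # vowels
FINALS = 28    # 27 final consonants + "no final consonant"

SYLLABLE_BASE = 0xAC00  # U+AC00 '가', the first precomposed syllable in Unicode


def compose(initial: int, medial: int, final: int = 0) -> str:
    """Return the precomposed syllable for zero-based jamo indices."""
    if not (0 <= initial < INITIALS and 0 <= medial < MEDIALS and 0 <= final < FINALS):
        raise ValueError("jamo index out of range")
    return chr(SYLLABLE_BASE + (initial * MEDIALS + medial) * FINALS + final)


total = INITIALS * MEDIALS * FINALS
print(total)               # 11172
print(compose(0, 0))       # 가 (ㄱ + ㅏ)
print(compose(18, 0, 4))   # 한 (ㅎ + ㅏ + ㄴ)
```

The last valid combination, `compose(18, 20, 27)`, lands on U+D7A3, the final code point of the Hangul Syllables block.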

* Symbols: In printing, symbols other than standard letterforms. For clarity and ease of communication, the terms yakmul, symbols, and special characters are unified under the term symbols in this article.

** Datja (consonant): A shortened term for consonant letters that produce sound when combined with vowels.

*** Holja (vowel): A shortened term for vowel letters that can produce sound on their own.

**** Jjokja (jamo): The basic units that make up individual Hangul characters.

***** Encoding: Character encoding. The process of converting characters into codes according to predefined rules, or the set of code-mapping rules established for this purpose.


Table specifying the 8-bit ISO codes for Roman characters and Hangul characters in KS C 5601-1974.
(Image source – standard.go.kr)


Table explaining the shapes, names, and code table positions of Roman character symbols in KS C 5601-1974.
(Image source – standard.go.kr)


After this, the Unicode Consortium began preparing an international industry standard, Unicode 1.0, using ISO 2022 (an ISO standard for character encoding), with the goal of assigning a unique code to each character so that different scripts could be represented consistently anywhere in the world. To secure the Hangul block (approximately 8,192 characters) within the two-byte code space allocated for CJK (Chinese, Japanese, and Korean) characters, Korea revised KS C 5601 in 1987 to conform to ISO 2022, issued the revision as KS C 5601-1987, and submitted it to the ISO (International Organization for Standardization).

Within this pool of roughly 8,000 code points, space had to be allocated not only for Hangul but also for Hanja, symbols, and characters from other languages used alongside Hangul. From the remaining slots, 2,350 characters were designated as precomposed Hangul. The selection of these 2,350 characters was based on research into frequently used Hangul characters conducted by institutions such as the Korea Research Institute of Standards and Science and KAIST. This research analyzed Hangul usage across printing houses and newspaper organizations, industrial sectors, as well as books and dictionaries.

The resulting set of “Hangul 2,350 characters” has survived to this day and has become the minimum standard specification for Hangul fonts.
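As a rough way to see which syllables fall inside or outside the 2,350-character set, the sketch below leans on Python's built-in `euc_kr` codec. It rests on an assumption about that codec rather than on the standard itself: to our knowledge, CPython encodes a syllable from the 2,350 set as two bytes and spells any other modern syllable as a longer jamo make-up sequence. The helper names `in_ks_x_1001` and `missing_from_2350` are hypothetical.

```python
def in_ks_x_1001(ch: str) -> bool:
    """True if `ch` gets a two-byte KS X 1001 code from Python's euc_kr codec."""
    try:
        encoded = ch.encode("euc_kr")
    except UnicodeEncodeError:
        return False  # not representable at all
    # Assumption: syllables outside the 2,350 set are emitted as an
    # 8-byte make-up sequence, so exactly 2 bytes means "in the set".
    return len(encoded) == 2


def missing_from_2350(text: str) -> list[str]:
    """List the Hangul syllables in `text` that fall outside the 2,350 set."""
    missing: list[str] = []
    for ch in text:
        is_syllable = "\uac00" <= ch <= "\ud7a3"
        if is_syllable and not in_ks_x_1001(ch) and ch not in missing:
            missing.append(ch)
    return missing


print(missing_from_2350("똠방각하"))  # ['똠']
```

A check like this can flag, before a font is chosen, the characters in a manuscript that a 2,350-character font would be unable to display.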







Explanation of the method used to select Hangul character units during the 1987 revision, from the commentary on KS X 1001:2004.
(Image source – standard.go.kr)


While the 2,350 Hangul characters were generally sufficient for everyday use, the standard faced criticism at the time of its adoption across many fields, including education, due to limitations such as the inability to represent dialects and loanwords, constraints on literary expression, and the lack of characters required for proper Korean orthography. As a result, in 1992, the Korean Standards Association adopted the combinational Hangul encoding widely used in the private sector and introduced it as a parallel standard under KS X 1001:1992*, incorporating all 11,172 Hangul characters into the specification.

In addition, as ISO 10646—an international standard capable of encoding a far larger number of characters—was standardized globally, all 11,172 Hangul characters were included in Unicode 2.0. Adobe’s character collection standard, Adobe-Korea 1-2, also adopted KS X 1001:1992, thereby supporting the full set of modern Hangul characters.

* KS X 1001:1992: With changes to the KS numbering system, the standard was renumbered as KS X 1001. It is also referred to as KS C 5601-1992.

Guidance on KS X 1001:1992 included in Adobe-Korea 1-2, from an Adobe technical note.
(Image source – adobe.com)


Although there were no longer technical limitations on inputting and outputting Hangul on computers, many font companies continued to produce fonts based on KS C 5601-1987, the first precomposed Hangul standard. As a result, issues such as the representation of dialects and loanwords persisted, and over time there were significant limitations in expressing newly emerging foreign words and neologisms. To address these inconveniences, font companies began adding characters as needed and developing fonts based on their own criteria. This ultimately led to a lack of compatibility between different fonts.

Recognizing this problem, the ahngraphics Typography Institute published a paper titled “A Proposal for Additional Characters in KS Code Precomposed Hangul” (Noh Minji, Yoon Mingoo, 2015), which aimed to propose a Hangul standard better suited to contemporary usage by reflecting changes in standard language and evolving linguistic habits. The study proposed adding 224 characters to the existing 2,350-character set, for a total of 2,574 characters, drawn from categories such as standard language (9 characters), abbreviations (39), dialects (41), inflected forms (67), and interjections, onomatopoeia, and neologisms (42).


Characters not included in the 2,350-character set.
When writing dialects, personal names, or foreign words, required characters may fail to display correctly.


Building on this research, the Korea Font Association conducted the “Hangul Code Standardization Study” in 2017 (Noh Minji, Shim Woojin, Lee Yongje), with support from Type Space and Sandoll. This study analyzed modern Korean usage frequency data collected by institutions such as the National Institute of the Korean Language, the Hangul Engineering and Information Retrieval Institute at Kookmin University, and the Research Institute of Korean Studies at Korea University, along with road-name addresses, family names, and educational vocabulary lists. Based on this analysis, a new Hangul standard consisting of 2,774 characters was announced.

At the same time, Adobe, which was preparing to improve Adobe-Korea 1-2 (not updated since 1998), sought to align its work with the Korea Font Association’s standard. Through collaboration between the Korea Font Association (research on Hangul and punctuation) and Sandoll (research on Hangul OpenType features), Adobe ultimately announced a new standard, Adobe-KR-9, in 2018. This standard included 2,780 Hangul characters, adding characters for foreign-language notation and intermediate characters that appear on screen during input.

In the following year, 2019, the Korea Font Association introduced KFA-HFCS 1.0, a standard that retained the 2,780 Hangul characters while reducing the range of symbols and foreign characters, with the aim of improving the font production environment for independent designers and lowering the total number of characters required for production. This is how the Hangul 2,780-character set—now widely used as a font standard—came into being.



Background of SD-KR — For Efficient Font Production

When we read a book, talk with a friend, or fill out documents at a local community office, how many symbols do we encounter? And how different are the types of symbols used in each of these situations? As you can see even in this text, there are punctuation marks that are essential for writing Hangul. In some contexts, strictly defined symbols are required, while in others, symbols are used more flexibly for lighter purposes. As with the Hangul examples discussed earlier, selecting and including only the symbols we actually need—out of tens of thousands available—is a necessary step in font production.

What we commonly refer to today as the “985 symbols” were defined in 1987, around the same time as the Hangul 2,350-character set. Although the standard explains that these symbols were “systematically included with consideration for frequency, usage, and opinions from various sectors in Korea,” criticism regarding the appropriateness of this composition has persisted. In “Problems of the Current KS Precomposed Hangul Code” (Korean Language Life, Fall 1989 issue, Kim Chung-hoe), the author argued that “among special characters, those that can be handled through character substitution should be boldly removed… It seems possible to eliminate more than 200 characters, including sequence symbols such as ‘(가), (나), (다)…’ and ‘(a), (b), (c)…’, unit symbols like ‘mm, cm, mg, kg’, and others such as ‘№, ㏇, ™, ㏂, ㏘, ℡’.”
(For the same reason, 「SD Greta Sans」—used in the body of this article—does not include symbols with low practical use, which is why you may notice some characters being replaced by system fonts.)


“Problems of the Current KS Precomposed Hangul Code,” published in the Fall 1989 issue of Korean Language Life.



Issues with symbol composition discussed in “(2) Problems of Foreign Characters and Symbols”


Among the 985 symbols are characters, such as kana, Greek, and Cyrillic letters, that cannot be used properly because the accompanying symbols or letters they require are missing. In addition, many characters are unsuitable for today’s PC environments, including box-drawing symbols and mathematical symbols that cannot actually be used in equations. Unit symbols are also rarely used, as Unicode recommends writing units by combining typable alphabetic characters rather than using precomposed symbols. As a result, even countries that use the Latin script tend not to rely on these characters.

Recommendation to use the combination of the degree sign ‘°’ (DEGREE SIGN) and an uppercase ‘C’ instead of the fullwidth unit symbol ‘℃’ (DEGREE CELSIUS) when indicating degrees Celsius.
(Image source – unicode.org)
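To illustrate how precomposed unit symbols map back to typable characters, the sketch below uses Unicode compatibility normalization (NFKC) from Python's standard `unicodedata` module. The code point ranges in `UNIT_RANGES` are our own assumption about where such symbols live (Letterlike Symbols and the squared Latin abbreviations in CJK Compatibility), and `spell_out_units` is a hypothetical helper. Normalization is applied only inside those ranges because NFKC would also alter other characters, such as Hangul compatibility jamo.

```python
import unicodedata

# Assumed ranges for illustration: Letterlike Symbols (℃, ℉, ...) and the
# squared Latin unit abbreviations in CJK Compatibility (㎜, ㎏, ...).
UNIT_RANGES = ((0x2100, 0x214F), (0x3380, 0x33DF))


def spell_out_units(text: str) -> str:
    """Replace precomposed unit symbols with their typable equivalents."""
    out = []
    for ch in text:
        if any(lo <= ord(ch) <= hi for lo, hi in UNIT_RANGES):
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)  # leave Hangul, jamo, and everything else untouched
    return "".join(out)


print(spell_out_units("25\u2103"))  # 25°C
print(spell_out_units("5\u338f"))   # 5kg
```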


Aside from the brief period when styles like so-called “Cyworld fonts,” which combined various special characters to form Hangul, were popular, most users rarely made use of these symbols. Nevertheless, because the standard was fixed at 985 symbols, many fonts were compelled to include them. In this process, fonts were often produced without a clear understanding of what each symbol was, how it should be used, or how it ought to be drawn. Instead, designers tended to follow the shapes found in existing fonts out of convention, simply to meet the required character count. This practice can lower the overall quality of a font and lead to confusion for users.

A representative example is the misuse of the symbols ‘<’ (LESS-THAN SIGN) and ‘>’ (GREATER-THAN SIGN), which are intended for numerical comparison. Because they are easy to type on the web, they are often used as substitutes for ‘〈’ (LEFT ANGLE BRACKET) and ‘〉’ (RIGHT ANGLE BRACKET), which are meant to denote titles or other textual elements, and are sometimes even drawn with the shapes of those brackets. Another issue arises when fullwidth symbols, which are meant to have uniform widths, are designed with inconsistent widths. Such problems prevent symbols from functioning properly when used for their original purposes.


Examples of incorrect symbol usage. Instead of using fullwidth unit symbols and fullwidth punctuation, direct input is recommended.
Angle brackets and inequality signs should not be confused.
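When two symbols look alike, their Unicode names and East Asian Width properties are a quick way to tell them apart. The following sketch uses Python's standard `unicodedata` module. The helper `describe` is a hypothetical name, and the code points are written as escapes so that the intended characters are unambiguous.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (East Asian Width)' for a single character."""
    width = unicodedata.east_asian_width(ch)
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({width})"


# Inequality signs versus the angle brackets used for titles.
for ch in ("<", "\u3008", ">", "\u3009"):
    print(ch, describe(ch))
# < U+003C LESS-THAN SIGN (Na)
# 〈 U+3008 LEFT ANGLE BRACKET (W)
# > U+003E GREATER-THAN SIGN (Na)
# 〉 U+3009 RIGHT ANGLE BRACKET (W)
```

The width class also hints at how a glyph should be drawn: `W` marks characters that are wide in East Asian typography, while `Na` marks narrow ones.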


To organize rarely used symbols and establish a symbol standard suited to modern usage environments, the National Hangeul Museum conducted two rounds of font research study groups in 2015 and 2016. Through this process, it published “A Proposal for Improving the Hangul Punctuation Code System” (Shim Woojin, 2016) and “A Survey of the Current Hangul Punctuation Code System and Proposals for Code Compatibility and Additions” (Noh Youngkwon, Kim Daekwon, 2016). These studies proposed a basic Hangul punctuation code system designed with user convenience in mind, and organized various issues and design directions related to Hangul punctuation. This research later fed into the studies conducted by the Korea Font Association and was applied to the Korea Font Association standard (KFA-HFCS 1.0). The results of this research were also largely reflected in the Adobe-KR-9 standard.

In fact, the primary reason Adobe sought to update its standard was to remove glyphs for symbols that were rarely used. As a result, symbols included in KS X 1001 were separated and distributed across Adobe-KR-0 (Supplement 0), 3, 4, 5, and 9 according to their importance and usage. Compared to the previous standard, in which all symbols were grouped together, this approach subdivided the ranges so that designers could selectively produce only what they needed. However, even with this change, it remained difficult for type designers to determine which ranges should be included in a font and which symbols were appropriate to draw. This ultimately led to inconsistent practices, where only selected symbols were produced depending on the range, forcing users to check font by font to see which characters were included—an ongoing inconvenience.

SD-KR was developed with a focus on addressing these issues. It aims to help font creators identify which symbols are most frequently used and work more efficiently, while also enabling users to more easily understand and choose fonts that suit their needs. To this end, research was conducted to provide designers with clear information about each character’s original purpose and how it should be drawn.



SD-KR Production Process

The research began by closely examining existing standards. We investigated why the characters included in the KS X 1001 standard were incorporated in the first place, what their intended uses are, and how they should be properly drawn. At the same time, we examined how symbols are classified in the Adobe-KR-9 standard and identified the limitations and inconveniences that arise when applying this classification approach.


The process of organizing all characters in Adobe-KR-9 into a table and identifying which characters Sandoll most frequently includes in its retail fonts.


Because symbols used across many different fields are intermingled, researching the origin, usage, and proper drawing of each symbol required gathering information scattered across various websites, books, and reference materials, piece by piece. Like language itself, characters often evolve over time, with their original purposes expanding depending on context, and in many cases their origins are debated, with multiple theories coexisting. Rather than forcing these findings into a single definition, we aimed to document as much of the collected information as possible and clearly note all sources. This process proved highly valuable later on when selecting which characters to include, as it helped us assess how broadly a character is used and how essential it is within specific fields.

A compiled and organized sheet collecting information on all symbols defined in KS X 1001.



Based on the materials collected in this way, we held discussions centered on characters that had caused issues in the process of font production or communication with clients, and proceeded to identify characters with high usage frequency. Through this process, we initially classified essential symbols and extended symbols, and based on these results, conducted a survey with Sandoll in-house designers and prior researchers.


A document prepared to discuss characters with identified issues together with Sandoll in-house designers.



Sample survey questions


By analyzing the results of two rounds of surveys together with Sandoll’s ongoing frequent-use character research, we ultimately classified the symbol standard into three tiers: 140 high-usage characters as ‘essential symbols’, 446 characters (including the essential symbols) as ‘extended symbols’, and 1,022 characters as ‘full symbols’ compatible with Adobe-KR-0 through 5. We then paired these tiers with the Hangul standard. The first specification, named ‘SD-KR 1’, contains 2,780 Hangul characters and 140 essential symbols; it offers high production efficiency and includes the characters essential to the modern Hangul typography environment. The second specification, named ‘SD-KR 2’, includes all 11,172 Hangul characters, enabling input of the full modern Hangul set, along with the 446 extended symbols required for more diverse typographic composition. With higher overall versatility, SD-KR 1 can be used for basic document writing or when specific words or short phrases are needed for titles, while SD-KR 2 can be applied when handling larger volumes of text.
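The two specifications can be pictured as a small data structure. The sketch below records only the counts stated above and shows one possible way to decide which specification a font meets. The class `Spec`, the function `highest_spec_met`, and the tiny character sets are all hypothetical stand-ins, since the real character lists are published on the SD-KR website rather than reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Spec:
    name: str
    hangul_count: int   # Hangul syllables required
    symbol_count: int   # symbols required

    @property
    def total(self) -> int:
        # Hangul plus symbols only; other scripts are not counted here.
        return self.hangul_count + self.symbol_count


# Counts as stated in the article.
SD_KR_1 = Spec("SD-KR 1", hangul_count=2_780, symbol_count=140)
SD_KR_2 = Spec("SD-KR 2", hangul_count=11_172, symbol_count=446)


def highest_spec_met(font_chars, requirements) -> Optional[str]:
    """Return the name of the largest spec whose required set the font covers.

    `requirements` is a list of (Spec, required characters), smallest first.
    """
    met = None
    for spec, required in requirements:
        if required <= font_chars:
            met = spec.name
    return met


# Toy character sets for illustration only, not the real SD-KR lists.
toy_requirements = [
    (SD_KR_1, frozenset("가나다.,")),
    (SD_KR_2, frozenset("가나다똠.,※")),
]
print(SD_KR_1.total, SD_KR_2.total)                              # 2920 11618
print(highest_spec_met(set("가나다라.,!"), toy_requirements))     # SD-KR 1
```

Because the extended symbols include the essential symbols and the 11,172 syllables include the 2,780, a font that satisfies the larger specification in this sketch also satisfies the smaller one.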



SD-KR Usage — How Will It Be Used Going Forward?

As a result of this research, we created a website so that anyone can access and review the findings. On the website’s ‘Learn More’ page, users can find an introduction to SD-KR along with the sources of reference materials used in the research. On the ‘Details’ page, users can check which characters are included in each specification, and practitioners can view the usage, origins, and recommended drawing methods of symbols, making the content practical for use in production. Finally, through the ‘Download’ page, we provide specification-based sample files and list filters that are useful when working in Glyphs.


Sandoll Type Lab SD-KR Website — Details page


In line with the original purpose of this research, we plan to provide clear and accurate information about font specifications on the Sandoll Cloud website so that more people can easily select fonts suited to their intended use and apply them to their work. For example, if a font currently used in a project is labeled ‘SD-KR 1’, and the replacement font is also labeled ‘SD-KR 1’, users can switch fonts without concern. In addition, on the font detail pages of the Sandoll Cloud website, users will be able to preview all glyphs included in a font before purchasing. This will allow users to directly check whether the characters they need are included and how they are designed prior to purchase. This update is currently in preparation, and we hope you will look forward to it.

Despite extensive internal review, we believe that in the process of collecting and sharing such a large volume of information, there may be inaccuracies or omissions in the data. If you find any errors or areas that require supplementation after reviewing the published website and this article, please share your feedback via the provided link. This research was conducted with the hope that many creators will be able to produce fonts more easily according to their intended use, and that more people will be able to select and use the fonts they want with greater ease. Just as previous studies served as a foundation for this work, we hope that this research will in turn become nourishment for future follow-up studies, and we conclude this article with that expectation.



References
http://standard.go.kr
http://unicode.org
http://adobe.com
http://koreafont.or.kr
http://typography-dictionary.kr
http://wikipedia.org
Kim Chung-hoe (1989). Problems of the Current KS Precomposed Hangul Code. Korean Language Life, Fall 1989 issue.
Noh Minji, Yoon Mingoo (2015). A Proposal for Additional Characters in KS Code Precomposed Hangul.
Shim Woojin (2016). A Proposal for Improving the Hangul Punctuation Code System.
Noh Youngkwon, Kim Daekwon (2016). A Survey of the Current Hangul Punctuation Code System and Proposals for Code Compatibility and Additions.