krm_pronunciations #

Overview and file formats #

The Phonetic Glosses in the Kanchi-in manuscript of the Ruiju Myōgishō (hereafter Myōgishō) include Fanqie spellings (反切), Similar sound notes (類音注, ruion-chū), and Kana glosses (仮名注, kana-chū), often accompanied by Tone marks (声点, shōten). Among databases of Sino-Japanese pronunciations, the “Database of Historical Sino-Japanese Readings” (abbreviated as DHSJR), developed by Professor Katō Daikaku and others, offers exceptionally rich content, and its specifications are published in detail. Accordingly, the HDIC project has decided to release its data in accordance with the DHSJR specifications.

The DHSJR defines a data structure with 23 column names.

To facilitate linkage with the Myōgishō data included in HDIC, it is necessary for HDIC to assign unique column names to its own data files and to establish Primary Keys and Foreign Keys for interoperability between HDIC’s internal data files.

For this purpose, pronunciation_id (音注ID) has been set as the Primary Key, and definition_seq_id (注文ID) as the Foreign Key.
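The key relationship can be illustrated with a minimal Python sketch. It assumes the ID shapes suggested by the column descriptions in this document: an entry ID of "F" plus five digits, a two-digit sequence suffix such as _01, and an optional variant suffix b, c, or d on pronunciation_id. The sample IDs, the row contents, and the helper names are hypothetical and are not taken from the released data.

```python
import re

# Assumed ID shapes, inferred from the column descriptions (not an official
# grammar): "F" + 5 digits + "_" + 2 digits, with an optional variant suffix
# b, c, or d on pronunciation_id. All sample IDs below are made up.
DEFINITION_SEQ_ID = re.compile(r"(F[0-9]{5})_([0-9]{2})")
PRONUNCIATION_ID = re.compile(r"(F[0-9]{5}_[0-9]{2})([bcd]?)")


def split_definition_seq_id(seq_id):
    """Split a definition_seq_id into (entry_id, sequence number)."""
    m = DEFINITION_SEQ_ID.fullmatch(seq_id)
    if m is None:
        raise ValueError(f"unexpected definition_seq_id: {seq_id!r}")
    return m.group(1), int(m.group(2))


def split_pronunciation_id(pron_id):
    """Split a pronunciation_id into (definition_seq_id, variant suffix)."""
    m = PRONUNCIATION_ID.fullmatch(pron_id)
    if m is None:
        raise ValueError(f"unexpected pronunciation_id: {pron_id!r}")
    return m.group(1), m.group(2)


def attach_notes(pronunciations, notes):
    """Look up each pronunciation row's parent note via the foreign key."""
    notes_by_id = {note["definition_seq_id"]: note for note in notes}
    return [
        {**row, "note": notes_by_id.get(row["definition_seq_id"])}
        for row in pronunciations
    ]


# Hypothetical rows; the real files carry many more columns.
notes = [
    {"definition_seq_id": "F00012_01", "definition_type_name": "Phonetic Gloss"},
]
pronunciations = [
    {"pronunciation_id": "F00012_01", "definition_seq_id": "F00012_01"},
    {"pronunciation_id": "F00012_01b", "definition_seq_id": "F00012_01"},
]

for row in attach_notes(pronunciations, notes):
    print(row["pronunciation_id"], "->", row["note"]["definition_type_name"])
```

Keeping the variant suffix out of definition_seq_id, as assumed here, lets several pronunciation rows point back to the same gloss.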

Since the Myōgishō features diverse formats for its Phonetic Glosses, a classification field named annotation_format (音注型) has been established to categorize them.

DHSJR uses Japanese column names, whereas HDIC uses English ones. The data files released here therefore adopt English column names, for convenience of data processing within HDIC.

Column name comparison #

The current draft, with English and Japanese explanations side-by-side, is as follows. The Japanese explanations are those stipulated by DHSJR. The English explanations are formulated to facilitate correspondence with HDIC. This is a provisional measure until official English explanations are released by DHSJR.

HDIC’s original column names are indicated in bold.

| DHSJR (Japanese) | HDIC (English) | Key | English Explanation | Japanese Explanation (from DHSJR) |
| --- | --- | --- | --- | --- |
| ID | dhsjr_id | | DHSJR unique ID for each single Hanzi (Chinese character) (integrated data only) | 単字ごとのユニークID(統合データのみ) |
| 音注ID | **pronunciation_id** | Primary Key | ID for each Phonetic Gloss. This is derived from definition_seq_id by extracting only those elements where the type (from definition_type_name in krm_notes) is Phonetic Gloss. Suffixes ‘b’, ‘c’, ‘d’ are appended for variant forms (the Japanese explanation gives the suffix as ‘x’; ‘b’, ‘c’, ‘d’ is correct). | 音注ID。kr_definition_sequence_idから、注文の種類が音注のものだけを取り出したもの。変異形を追加したものには末尾にxを付した。 |
| 注文ID | **definition_seq_id** | Foreign Key | An identifier for each component of the Definition (Original Glosses) or for the Headword itself within an Entry. It is formed by appending a sequential suffix (e.g., “_00” for the Headword, “_01”, “_02” for subsequent elements) to the corresponding entry_id. | 連番で与えられるFで始まる5桁の見出しの数値IDに加えて、見出しの下に記される注文の各要素を出現順に区分し、出現の順番に_01、_02のように追加したもの。見出しには_00を追加する。 |
| 資料番号 | material_id | | Material ID | 資料ID |
| 資料名 | material_name | | Name of the material | 資料の名称 |
| 資料内漢字番号 | material_character_index | | Sequential number of a Hanzi (Chinese character)’s appearance in the material | 漢字の資料内出現順の通し番号 |
| 資料内漢語番号 | material_word_index | | Sequential number of a Chinese word’s appearance in the material | 漢語の資料内出現順の通し番号 |
| 単字_見出し | character_headword | | Headword column for Hanzi (Chinese characters) with Phonetic Glosses | 音注が付された漢字の見出し列 |
| 単字_出現形 | character_form | | Hanzi (Chinese characters) that have Phonetic Glosses | 音注が付された漢字 |
| 漢語_見出し | word_headword | | Headword column of Chinese words containing Hanzi (Chinese characters) with Phonetic Glosses | 音注が付された漢字を含む漢語の見出し列 |
| 漢語_出現形 | word_form | | Chinese words containing Hanzi (Chinese characters) with Phonetic Glosses | 音注が付された漢字を含む漢語 |
| 漢語_alphabet | word_alpha | | Entered when there is an alphabetic representation of the Chinese word | 欧文による漢語の表記がある場合に入力されている。 |
| 語種 | word_type | | Indicates the word type when there are mixed-language words (e.g., hybrid Sino-Japanese words) | 混種語がある場合に、語種を示す。 |
| 漢語内位置 | word_position | | Position of the single Hanzi (Chinese character) within the Chinese word | 漢語内での単字の位置 |
| 単字長 | character_mora_count | | Number of morae for the single Hanzi (Chinese character) | 単字の拍数 |
| 声点 | tone_marks | | Tone marks for single Hanzi (Chinese characters), indicating Four Tones (平上去入), Six Tones (平平軽上去入軽入), and voicing (清濁). | 単字に対する四声(平上去入)、六声(平平軽上去入軽入)及び清濁。 |
| 声点型 | tone_pattern | | Combination of Tone marks for Chinese words. Hanzi (Chinese characters) without Tone marks are represented by a full-width asterisk (*). | 漢語に対する声点の組合せ。声点がない単字については*で表す。 |
| 仮名注 | kana_notes | | Kana glosses (仮名注) for Hanzi (Chinese characters), including kana-based fanqie. | 仮名表記による字音注(仮名反切を含む) |
| 仮名型 | kana_pattern | | Combination of Kana glosses for Chinese words. Hanzi (Chinese characters) without Kana glosses are represented by a full-width asterisk (*). | 漢語に対する仮名注の組合せ。仮名注がない単字については*で表す。 |
| 反切 | fanqie | | Fanqie spellings (反切) for single Hanzi (Chinese characters). | 単字に対する反切注 |
| 類音 | similar_sound | | Similar sound notes (類音注) for single Hanzi (Chinese characters). | 単字に対する類音注 |
| 音注型 | **annotation_format** | | Pattern of combined phonetic information (e.g., Kana glosses, Fanqie spellings, Similar sound notes, Tone marks). | 仮名注、反切、類音、声点などの複数の音注が組み合わさった形式のパターン。 |
| 節博士 | fushi_hakase | | Fushi-hakase notations (melodic or intonational markings) attached to musical materials such as Shōmyō (Buddhist chant). | 声明等音楽資料に付される博士譜など |
| その他 | other_phonetic_annotations | | Other types of Phonetic Glosses. | その他の音注 |
| 出現位置 | material_location | | Location of single Hanzi (Chinese characters) and Chinese words within the material. | 資料内の単字・漢語の所在 |
| 備考 | remarks_pronunciation | | Matters to be noted regarding these phonetic elements. | 注記すべき事柄 |
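When data arrives under the Japanese column names, the correspondence listed above can be applied mechanically. The following Python sketch shows one way to do so. The dictionary is transcribed from the column name comparison, while the `rename_row` helper and the sample row are hypothetical and serve only as illustration.

```python
# Japanese column name -> HDIC English column name, transcribed from the
# column name comparison in this document.
JAPANESE_TO_HDIC = {
    "ID": "dhsjr_id",
    "音注ID": "pronunciation_id",
    "注文ID": "definition_seq_id",
    "資料番号": "material_id",
    "資料名": "material_name",
    "資料内漢字番号": "material_character_index",
    "資料内漢語番号": "material_word_index",
    "単字_見出し": "character_headword",
    "単字_出現形": "character_form",
    "漢語_見出し": "word_headword",
    "漢語_出現形": "word_form",
    "漢語_alphabet": "word_alpha",
    "語種": "word_type",
    "漢語内位置": "word_position",
    "単字長": "character_mora_count",
    "声点": "tone_marks",
    "声点型": "tone_pattern",
    "仮名注": "kana_notes",
    "仮名型": "kana_pattern",
    "反切": "fanqie",
    "類音": "similar_sound",
    "音注型": "annotation_format",
    "節博士": "fushi_hakase",
    "その他": "other_phonetic_annotations",
    "出現位置": "material_location",
    "備考": "remarks_pronunciation",
}


def rename_row(row):
    """Return a copy of `row` with Japanese keys replaced by HDIC names.

    Keys that are not in the mapping are kept unchanged (a design choice of
    this sketch, so that unexpected columns are not silently dropped).
    """
    return {JAPANESE_TO_HDIC.get(key, key): value for key, value in row.items()}


# Hypothetical row with placeholder values, for illustration only.
sample = {"出現位置": "K0201474", "備考": ""}
print(rename_row(sample))
```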

The material_location is indicated in the format: K + Volume (2 digits) + Kazama Edition Page (3 digits) + Line (1 digit) + Segment (1 digit). For example, K0201474 indicates an appearance in Volume 2, Page 14, Line 7, Segment 4.
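The fixed-width layout lends itself to a small parser. The Python sketch below decodes a material_location string according to the format just described. The `MaterialLocation` type and the function name are hypothetical, and the code assumes every value follows the seven-digit layout exactly.

```python
import re
from typing import NamedTuple


class MaterialLocation(NamedTuple):
    volume: int
    page: int      # page in the Kazama edition
    line: int
    segment: int


# K + volume (2 digits) + page (3 digits) + line (1 digit) + segment (1 digit)
LOCATION = re.compile(r"K([0-9]{2})([0-9]{3})([0-9])([0-9])")


def parse_material_location(code):
    """Decode a material_location string such as "K0201474"."""
    m = LOCATION.fullmatch(code)
    if m is None:
        raise ValueError(f"unexpected material_location: {code!r}")
    volume, page, line, segment = (int(group) for group in m.groups())
    return MaterialLocation(volume, page, line, segment)


print(parse_material_location("K0201474"))
# MaterialLocation(volume=2, page=14, line=7, segment=4)
```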

This is currently under consideration in the case study “Linkage with DHSJR,” which should be consulted alongside this page.