krm_pronunciations #

Overview and file formats #

The Phonetic Glosses in the Kanchiin manuscript of the Ruiju Myōgishō (hereafter Myōgishō) include Fanqie spellings (反切), Similar sound notes (類音注, ruion-chū), and Kana glosses (仮名注, kana-chū), often accompanied by Tone marks (声点, shōten). Among databases of Sino-Japanese pronunciations, the “Database of Historical Sino-Japanese Readings” (abbreviated as DHSJR), developed by Professor Katō Daikaku and others, is exceptionally rich in content, and its specifications are publicly documented in detail. Accordingly, as part of the HDIC project, we have decided to release the Myōgishō pronunciation data in accordance with the DHSJR specifications.

The DHSJR defines a data structure with 23 column names.

To link this data with the Myōgishō data already included in HDIC, the HDIC data files need unique column names of their own, together with Primary Keys and Foreign Keys that connect HDIC’s internal data files to one another.

For this purpose, pronunciation_id (音注ID) has been set as the Primary Key, and definition_seq_id (注文ID) as the Foreign Key.
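The key relationship described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows are made-up placeholders, the notes table is assumed to carry `definition_seq_id` and `definition_type_name` columns, the type label `"Phonetic Gloss"` is assumed, and a variant row (suffix `b`) is assumed to share the `definition_seq_id` of its base row. The helper names `index_by_key` and `join_pronunciations` are hypothetical.

```python
# Placeholder rows only; real HDIC files carry more columns and real values.
notes = [
    {"definition_seq_id": "F00001_00", "definition_type_name": "Headword"},
    {"definition_seq_id": "F00001_01", "definition_type_name": "Phonetic Gloss"},
]
pronunciations = [
    {"pronunciation_id": "F00001_01", "definition_seq_id": "F00001_01"},
    # Assumed: a variant form keeps the definition_seq_id of its base row.
    {"pronunciation_id": "F00001_01b", "definition_seq_id": "F00001_01"},
]


def index_by_key(rows, key):
    """Build a lookup table on `key`, rejecting duplicate key values."""
    index = {}
    for row in rows:
        value = row[key]
        if value in index:
            raise ValueError(f"duplicate {key}: {value}")
        index[value] = row
    return index


def join_pronunciations(pronunciations, notes):
    """Attach the matching notes row to each pronunciation row."""
    # Primary Key check: every pronunciation_id must be unique.
    index_by_key(pronunciations, "pronunciation_id")
    notes_by_id = index_by_key(notes, "definition_seq_id")
    joined = []
    for row in pronunciations:
        # Foreign Key check: the referenced notes row must exist.
        note = notes_by_id.get(row["definition_seq_id"])
        if note is None:
            raise KeyError(f"no notes row for {row['definition_seq_id']}")
        joined.append({**note, **row})
    return joined


joined = join_pronunciations(pronunciations, notes)
```

Checking key uniqueness and referential integrity before joining makes broken links between files visible early, rather than silently dropping rows.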

Since the Myōgishō features diverse formats for its Phonetic Glosses, a classification field named annotation_format (音注型) has been established to categorize them.

DHSJR uses Japanese column names, whereas HDIC uses English ones. English column names have therefore been adopted here for convenience of data processing within HDIC.

Column name comparison #

The current draft, with English and Japanese explanations side-by-side, is as follows. The Japanese explanations are those stipulated by DHSJR. The English explanations are formulated to facilitate correspondence with HDIC. This is a provisional measure until official English explanations are released by DHSJR.

HDIC’s original column names (pronunciation_id, definition_seq_id, and annotation_format) are indicated in bold.

| DHSJR (Japanese) | HDIC (English) | Key | English Explanation | Japanese Explanation (from DHSJR) |
| --- | --- | --- | --- | --- |
| ID | dhsjr_id | | DHSJR unique ID for each single Hanzi (Chinese character) (integrated data only) | 単字ごとのユニークID(統合データのみ) |
| 音注ID | **pronunciation_id** | Primary Key | ID for each Phonetic Gloss. This is derived from definition_seq_id by extracting only those elements where the type (from definition_type_name in krm_notes) is Phonetic Gloss. Suffixes ‘b’, ‘c’, ‘d’ are appended for variant forms (the Japanese explanation gives ‘x’, but ‘b’, ‘c’, ‘d’ are the suffixes actually used). | 音注ID。kr_definition_sequence_idから、注文の種類が音注のものだけを取り出したもの。変異形を追加したものには末尾にxを付した。 |
| 注文ID | **definition_seq_id** | Foreign Key | An identifier for each component of the Definition (Original Glosses) or for the Headword itself within an Entry. It is formed by appending a sequential suffix (e.g., “_00” for the Headword, “_01”, “_02” for subsequent elements) to the corresponding entry_id. | 連番で与えられるFで始まる5桁の見出しの数値IDに加えて、見出しの下に記される注文の各要素を出現順に区分し、出現の順番に_01、_02のように追加したもの。見出しには_00を追加する。 |
| 資料番号 | material_id | | Material ID | 資料ID |
| 資料名 | material_name | | Name of the material | 資料の名称 |
| 資料内漢字番号 | material_character_index | | Sequential number of a Hanzi (Chinese character)’s appearance in the material | 漢字の資料内出現順の通し番号 |
| 資料内漢語番号 | material_word_index | | Sequential number of a Chinese word’s appearance in the material | 漢語の資料内出現順の通し番号 |
| 単字_見出し | character_headword | | Headword column for Hanzi (Chinese characters) with Phonetic Glosses | 音注が付された漢字の見出し列 |
| 単字_出現形 | character_form | | Hanzi (Chinese characters) that have Phonetic Glosses | 音注が付された漢字 |
| 漢語_見出し | word_headword | | Headword column of Chinese words containing Hanzi (Chinese characters) with Phonetic Glosses | 音注が付された漢字を含む漢語の見出し列 |
| 漢語_出現形 | word_form | | Chinese words containing Hanzi (Chinese characters) with Phonetic Glosses | 音注が付された漢字を含む漢語 |
| 漢語_alphabet | word_alpha | | Entered when there is an alphabetic representation of the Chinese word | 欧文による漢語の表記がある場合に入力されている。 |
| 語種 | word_type | | Indicates the word type when there are mixed-language words (e.g., hybrid Sino-Japanese words) | 混種語がある場合に、語種を示す。 |
| 漢語内位置 | word_position | | Position of the single Hanzi (Chinese character) within the Chinese word | 漢語内での単字の位置 |
| 単字長 | character_mora_count | | Number of morae for the single Hanzi (Chinese character) | 単字の拍数 |
| 声点 | tone_marks | | Tone marks for single Hanzi (Chinese characters), indicating Four Tones (平上去入), Six Tones (平平軽上去入軽入), and voicing (清濁). | 単字に対する四声(平上去入)、六声(平平軽上去入軽入)及び清濁。 |
| 声点型 | tone_pattern | | Combination of Tone marks for Chinese words. Hanzi (Chinese characters) without Tone marks are represented by a full-width asterisk (*). | 漢語に対する声点の組合せ。声点がない単字については*で表す。 |
| 仮名注 | kana_notes | | Kana glosses (仮名注) for Hanzi (Chinese characters), including kana-based fanqie. | 仮名表記による字音注(仮名反切を含む) |
| 仮名型 | kana_pattern | | Combination of Kana glosses for Chinese words. Hanzi (Chinese characters) without Kana glosses are represented by a full-width asterisk (*). | 漢語に対する仮名注の組合せ。仮名注がない単字については*で表す。 |
| 反切 | fanqie | | Fanqie spellings (反切) for single Hanzi (Chinese characters). | 単字に対する反切注 |
| 類音 | similar_sound | | Similar sound notes (類音注) for single Hanzi (Chinese characters). | 単字に対する類音注 |
| 音注型 | **annotation_format** | | Pattern of combined phonetic information (e.g., Kana glosses, Fanqie spellings, Similar sound notes, Tone marks). | 仮名注、反切、類音、声点などの複数の音注が組み合わさった形式のパターン。 |
| 節博士 | fushi_hakase | | Fushi-hakase notations (melodic or intonational markings) attached to musical materials such as Shōmyō (Buddhist chant). | 声明等音楽資料に付される博士譜など |
| その他 | other_phonetic_annotations | | Other types of Phonetic Glosses. | その他の音注 |
| 出現位置 | material_location | | Location of single Hanzi (Chinese characters) and Chinese words within the material. | 資料内の単字・漢語の所在 |
| 備考 | remarks_pronunciation | | Matters to be noted regarding these phonetic elements. | 注記すべき事柄 |
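For data processing, the column correspondence listed above can be held in a single lookup table. The following is a minimal sketch, not an official HDIC or DHSJR tool: the names `DHSJR_TO_HDIC`, `HDIC_ONLY`, and `rename_columns` are hypothetical, and it assumes each row is available as a dictionary keyed by the Japanese column names.

```python
# Japanese column label -> HDIC English column name, as listed in this document.
DHSJR_TO_HDIC = {
    "ID": "dhsjr_id",
    "音注ID": "pronunciation_id",
    "注文ID": "definition_seq_id",
    "資料番号": "material_id",
    "資料名": "material_name",
    "資料内漢字番号": "material_character_index",
    "資料内漢語番号": "material_word_index",
    "単字_見出し": "character_headword",
    "単字_出現形": "character_form",
    "漢語_見出し": "word_headword",
    "漢語_出現形": "word_form",
    "漢語_alphabet": "word_alpha",
    "語種": "word_type",
    "漢語内位置": "word_position",
    "単字長": "character_mora_count",
    "声点": "tone_marks",
    "声点型": "tone_pattern",
    "仮名注": "kana_notes",
    "仮名型": "kana_pattern",
    "反切": "fanqie",
    "類音": "similar_sound",
    "音注型": "annotation_format",
    "節博士": "fushi_hakase",
    "その他": "other_phonetic_annotations",
    "出現位置": "material_location",
    "備考": "remarks_pronunciation",
}

# Columns introduced by HDIC rather than taken from the DHSJR specification.
HDIC_ONLY = {"pronunciation_id", "definition_seq_id", "annotation_format"}


def rename_columns(row):
    """Return a copy of `row` with Japanese keys replaced by HDIC names.

    Keys that are not in the mapping are kept unchanged.
    """
    return {DHSJR_TO_HDIC.get(key, key): value for key, value in row.items()}


# Placeholder values only, to show the renaming.
example = rename_columns({"反切": "(fanqie text)", "声点": "(tone mark)"})
```

Removing the three HDIC-only names from the mapping leaves the 23 columns defined by DHSJR.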

The material_location is indicated in the format: K + Volume (2 digits) + Kazama Edition Page (3 digits) + Line (1 digit) + Segment (1 digit). For example, K0201474 indicates an appearance in Volume 2, Page 14, Line 7, Segment 4.
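The location format can be decoded mechanically. The sketch below is an illustration based only on the format stated above; the names `MaterialLocation`, `parse_material_location`, and `format_material_location` are hypothetical and not part of HDIC.

```python
import re
from typing import NamedTuple


class MaterialLocation(NamedTuple):
    volume: int
    page: int      # page in the Kazama edition
    line: int
    segment: int


# K + volume (2 digits) + page (3 digits) + line (1 digit) + segment (1 digit)
_LOCATION_RE = re.compile(r"K([0-9]{2})([0-9]{3})([0-9])([0-9])")


def parse_material_location(code: str) -> MaterialLocation:
    """Split a material_location string into its four numeric parts."""
    match = _LOCATION_RE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a valid material_location: {code!r}")
    return MaterialLocation(*(int(group) for group in match.groups()))


def format_material_location(loc: MaterialLocation) -> str:
    """Rebuild the string form from its parts (inverse of the parser)."""
    return f"K{loc.volume:02d}{loc.page:03d}{loc.line}{loc.segment}"


print(parse_material_location("K0201474"))
# MaterialLocation(volume=2, page=14, line=7, segment=4)
```

Using `fullmatch` rejects strings with missing or extra digits instead of silently misreading them.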

The linkage with DHSJR is still under consideration; see also the case study “Linkage with DHSJR.”