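After studying these settings for a long time, I finally have a rough understanding. This post gives the overall picture first so that you don't get lost.

My own setup principles are below for reference. Adding phrases or changing the dictionary forces the bin files to be rebuilt, which takes a long time, so settle speller/algebra and the dict first, then tune the patterns, secondary translators and reverse lookup.

I build two schemas: a raw (single-character) schema and a phrase schema.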

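Required features of the raw schema

Build head-and-tail lookup codes: speller table accd --> prism ab??

A vcode pattern dedicated to decoding ab?, ab??, ab??? (tags), showing the full code through reverse lookup

enable_completion: false # completion shows characters whose code is not fully typed yet, which conflicts with vcode

Add secondary dictionaries (pinyin, Cangjie 6, Cangjie 5) with their patterns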

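Phrase schema

Disable the speller conversion (the dict already encodes the phrases)

The vcode pattern calls the raw schema's bin files to look up single characters

enable_completion: true # show characters whose code is not fully typed yet

Everything else is unchanged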

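cangjie5.dict.yaml: weights are added by hand and set higher than those of 八股文 (ranked downward from about 180000). cangjie5_ext.dict.yaml: imports cangjie5 and other symbol tables, or enables 八股文 for word frequencies. The phrase-encoding method and the handling of duplicate codes are a matter of personal habit.

The benefit is that single characters and phrases do not fight for position. For single characters, first add disable_user_dict_for_patterns:, which keeps certain codes out of the user dictionary, so one- and two-letter codes keep their order.

Short codes are ordered by single-character frequency, and single characters come before phrases. (What affects this: the dict weights, the 八股文 weights, initial_quality:, which sets a translator's candidate priority, and single_char_filter.)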
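The raw-schema settings described above could be collected into a translator block along these lines. This is only a sketch under assumptions: the dictionary name cangjie5 and the initial_quality value are placeholders, not the author's actual file; the pattern is the one the author mentions later for one- and two-letter codes.

```yaml
# Sketch of the raw (single-character) schema's main translator.
# Assumed: dictionary name and initial_quality value are placeholders.
translator:
  dictionary: cangjie5
  enable_completion: false        # do not show characters for unfinished codes
  enable_user_dict: true
  disable_user_dict_for_patterns:
    - "^..?$"                     # one- and two-letter codes stay out of the user dictionary
  initial_quality: 1
```

For the phrase schema the same block would presumably set enable_completion: true instead.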

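How multi-scheme input and reverse lookup are implemented

1 Multi-scheme input: on the secondary translator set tag: abc, or drop tag altogether, in which case it translates abc-tagged strings by default.

tag: sets the tag this translator handles. Optional; if omitted it handles only abc

The segmentor works with the recognizer to mark tags. affix_segmentor and abc_translator are involved here
Tags are used in translator, reverse_lookup_filter and simplifier to delimit the scope of each (this sentence is a little confusing). A translator's tag means it accepts that one tag; reverse_lookup_filter and simplifier use tags: [ a, b, c ]. Read these entries of the schema reference together: segmentor (6 items), translator (23 items, see 19-23), reverse_lookup_filter (4 items, tags), simplifier (6 items, tags). The first two accept one tag, the last two accept several.
If extra_tags is not needed, the segmentor does not need separate configuration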
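A minimal sketch of the multi-scheme idea: a main Cangjie translator and a pinyin translator that both answer the default abc tag. The dictionary names and initial_quality values are assumptions chosen for illustration, and this is a fragment rather than a complete schema.

```yaml
# Sketch: two translators sharing the abc tag (fragment, not a full schema).
schema:
  dependencies:
    - luna_pinyin                 # so that luna_pinyin's bin files are built

engine:
  translators:
    - table_translator            # main translator, configured under `translator:`
    - script_translator@pinyin    # secondary translator, configured under `pinyin:`

translator:
  dictionary: cangjie5            # assumed main dictionary
  initial_quality: 1              # assumed value: main candidates first

pinyin:
  dictionary: luna_pinyin
  # no `tag:` here, so this translator handles the default abc tag
  initial_quality: 0.5            # assumed value: pinyin candidates after
```

Because neither translator sets tag:, no affix_segmentor is needed for this case.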

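extra_tags forwards. For example, to look up main-dictionary codes through several input methods, let the recognizer trigger a tag and use affix_segmentor@ to build a component that is neither a translator nor a filter and only forwards.

segmentor: create affix_segmentor@lookup, a segmentor that accepts the lookup tag and forwards it to several tags, so that each dictionary translates at the same time. The pattern triggers the lookup tag.

The candidates listed are then marked with the lookup tag, and filters/reverse_lookup/tags: [ ] sets which tags the filter picks up.

tag, tags and extra_tags each have their own context (as I understand it):

tag: on a translator, accept strings that carry this tag

extra_tags: combined with recognizer and affix_segmentor to arrange more complex features

recognizer/patterns triggers a tag

tag: on a segmentor, the tag of the segment it handles

Drawback: preedit_format and comment_format can overlap and become unclear

It is best to add initial_quality to set the order of candidates

ex:

translators/script_translator@pinyin/tag: abc # this translator translates strings tagged abc

pinyin:
  tag: abc

For the detailed reference see https://github.com/LEOYoon-Tsaw/Rime_collections/blob/master/Rime_description.md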

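1 Overall flow: it is best to have only one main translator, because at deployment it works with speller/algebra and the dict.yaml file to generate the bin files (table, prism, reverse).

Secondary dictionaries reuse bin files that have already been generated (schema/dependencies: [ cangjie5, luna_pinyin, bopomofo ] takes care of this, provided each schema.yaml is in place).

1 processors run first. This setting is an array, so order matters.

The tricky part is the order of recognizer, key_binder, speller and punctuator, which decides where the whole input method conflicts.

recognizer/patterns defines input patterns so that a stretch of input gets a tag; input that matches nothing is mostly tagged abc.

This exists to trigger a tag for reverse lookup or for calling a secondary dictionary.

2 speller: the input codes of this schema and the prism code settings

Notes on algebra (see the reference for the rest):

xform -- rewrite (original form not kept): converts phonetic spellings, e.g. terra_pinyin --> bopomofo, cangjie5 --> quick
derive -- derive (original form kept): adds short codes or new codes, e.g. cangjie5 --> cangjie5 + quick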
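The difference between the two operations can be sketched with a rule that builds a Quick-style code (first and last letter) from a Cangjie code. The regular expression is an illustrative assumption, not taken from any published schema.

```yaml
speller:
  algebra:
    # derive keeps the full code and adds the first-plus-last short code:
    # "owjr" stays valid and "or" is added.
    - derive/^(.).+(.)$/$1$2/
    # xform would replace the code instead, so only "or" would remain:
    # - xform/^(.).+(.)$/$1$2/
```

Codes of one or two letters do not match the pattern and are left as they are in both cases.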

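3 translator: see the reference. I am not yet familiar with automatic phrase building; as someone who mostly uses table_translator, I do not like candidates being reordered too automatically.

The text below is quoted from https://github.com/LEOYoon-Tsaw/Rime_collections/blob/master/Rime_description.md. I have added my own notes to the examples (in red in the original post), limited of course to what I know so far.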

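`Schema.yaml` reference

Before you start

# Rime schema
# encoding: utf-8

Metadata

name: display name of the schema (what appears in the schema menu, usually in Chinese)
schema_id: internal name of the schema; other configuration refers to the schema by this name. Usually made of letters, digits and underscores
author: inventor and maintainers. If you modify a schema, keep the original authors and append your own name
description: a short account of the schema's history, the source of its code table, its rules and so on
dependencies: other schemas this schema depends on (typically for reverse lookup, or when two or more schemas are mixed). They must be prepared beforehand and able to produce their bin files. Each dependency has its own files (name.schema.yaml, name.custom.yaml, dict.yaml)
version: version number; bump it before every release

Example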

schema:
  name: "蒼頡檢字法"
  schema_id: cangjie6
  author:
    - "發明人 朱邦復先生、沈紅蓮女士"
  dependencies:
    - luna_pinyin
    - jyutping
    - zyenpheng
  description: |
    第六代倉頡輸入法
    碼表由雪齋、惜緣和crazy4u整理
  version: 0.19

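Switches (these cause no real trouble so I won't go into them, and I have no other opencc uses yet. I set defaults in the schema and provide a custom file so that users can adapt them to their habits.)

There are usually these five:

ascii_mode switches between Chinese and ASCII input. Default 0 is Chinese, 1 is ASCII
full_shape switches between full-width and half-width. Note that with full-width on, Latin letters are full-width too. 0 is half-width, 1 is full-width
extended_charset switches the character set. 0 is the basic CJK set, 1 is the full CJK set
- available for table_translator only
simplification switches character conversion. As above, 0 is conversion off, 1 is conversion on
ascii_punct switches between Chinese and Western punctuation. 0 is Chinese punctuation, 1 is Western punctuation

Example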

switches:
  - name: ascii_mode
    reset: 0
    states: ["中文", "西文"]
  - name: full_shape
    states: ["半角", "全角"]
  - name: extended_charset
    states: ["通用", "增廣"]
  - name: simplification
    states: ["漢字", "汉字"]
  - name: ascii_punct
    states: ["句讀", "符號"]

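The option name can be chosen freely, and several conversion schemes can be added:

- name: zh_cn
  states: ["漢字", "汉字"]
  reset: 0

or

- options: [ zh_trad, zh_cn, zh_mars ]
  states:
    - 字型 → 漢字
    - 字型 → 汉字
    - 字型 → 䕼茡
  reset: 0

states: optional. Without it the switch exists but is not shown, and can still be operated by shortcut
reset: sets the default state (optional; without it the state is not reset to the default when switching windows)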
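Since the author provides a custom file for user preferences, a user-side override of a switch default might look like the sketch below. The file name assumes the schema_id cangjie6 from the earlier example, and the index @0 assumes ascii_mode is the first entry in switches, as in the example above.

```yaml
# cangjie6.custom.yaml (hypothetical user patch for the example schema)
patch:
  "switches/@0/reset": 1    # first switch (ascii_mode in the example): start in ASCII mode
```

The change takes effect after redeploying.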

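Engine

Components in bold can be configured in detail; those in italics are rarely used

The engine has four groups:

1. `processors` (recognizer, key_binder, speller and punctuator can conflict. I do not fully understand this yet; my current practice is to leave the order alone so that the patterns are intercepted first)

These components handle key events

ascii_composer handles ASCII mode and switching between Chinese and ASCII
recognizer works with matcher to handle input that matches specific rules, such as URLs and reverse-lookup tags
key_binder rebinds keys to other keys under given conditions, e.g. comma and period as paging keys, or shortcuts for switches
speller the spelling processor; accepts character keys and edits the input
punctuator the punctuation processor; maps single character keys directly to punctuation or text
selector the selection processor; handles the digit selection keys (which can be changed), up/down candidate movement and paging
navigator handles caret movement inside the input field
express_editor the editor; handles space, Enter to commit, and Backspace. Suits shape-based (decomposition) input
fluid_editor the sentence-style editor for schemas such as 【注音】 and 【語句流】 that split words with space and commit with Enter; replaces express_editor. Suits continuous typing committed with Enter (smart sentence input, pinyin)
chord_composer the chord processor for schemas that press several keys at once, such as 【宮保拼音】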
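2. `segmentors`

These components recognise different kinds of content, split the input into segments and tag them

ascii_segmentor marks ASCII segments (for example in ASCII mode); letters are committed directly
matcher works with recognizer to mark segments that match specific rules, such as URLs and reverse lookup, and adds the corresponding tag
abc_segmentor marks ordinary text segments and adds the abc tag
punct_segmentor marks punctuation segments (for typing punctuation) and adds the punct tag
fallback_segmentor marks any segment not otherwise marked
affix_segmentor user-defined tags. To build a complex input method you need to understand recognizer/patterns, tag, tags and extra_tags

- Several instances can be loaded, each followed by @ + tag name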

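3. `translators`

These components translate code segments of a given type into a list of candidates

echo_translator echoes the input code when there are no other candidates (the code can be committed with Shift+Enter). Without it, mistyped or unmatched input is cleared
punct_translator works with punct_segmentor to convert punctuation
table_translator the code-table translator for table-based schemas such as Cangjie and Wubi
- Several instances can be loaded, each followed by @ + translator name (e.g. cangjie, wubi). For multi-scheme input, leave tag unset or set it to abc so that the translator accepts the abc tag; alternatively set extra_tags on abc_segmentor (e.g. [ pinyin ]), see 2. segmentor
script_translator the script translator for syllable-based schemas such as pinyin and Jyutping. script_translator@cangjie with dictionary: cangjie5 also works; I do not know what the side effects are and do not use it this way
- Several instances can be loaded, each followed by @ + translator name (e.g. pinyin, jyutping)
reverse_lookup_translator the reverse-lookup translator; looks up codes through another encoding scheme (I do not use it; I use reverse_lookup_filter everywhere. If you do not want to spend time on this, most ready-made schemas do reverse lookup with this translator)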

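4. `filters`

These components filter the translation results

simplifier character conversion
uniquifier removes duplicate candidates, which may come from simplifier
cjk_minifier character-set filter (for script_translator, so that it supports the extended_charset switch)
reverse_lookup_filter reverse-lookup filter, a more flexible way to do reverse lookup; replaces reverse_lookup_translator since Rime 1.0
- Several instances can be loaded, each followed by @ + filter name (e.g. pinyin_lookup, jyutping_lookup)
single_char_filter single-character filter; when loaded it hides the phrases in the dictionary (table_translator only). In practice it only moves phrases further back

Example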

cangjie6.schema.yaml

engine:
  processors:
    - ascii_composer
    - recognizer
    - key_binder
    - speller
    - punctuator
    - selector
    - navigator
    - express_editor
  segmentors:
    - ascii_segmentor
    - matcher
    - affix_segmentor@pinyin
    - affix_segmentor@jyutping
    - affix_segmentor@pinyin_lookup
    - affix_segmentor@jyutping_lookup
    - affix_segmentor@reverse_lookup
    - abc_segmentor
    - punct_segmentor
    - fallback_segmentor
  translators:
    - punct_translator
    - table_translator
    - script_translator@pinyin
    - script_translator@jyutping
    - script_translator@pinyin_lookup
    - script_translator@jyutping_lookup
  filters:
    - simplifier@zh_simp
    - uniquifier
    - cjk_minifier
    - reverse_lookup_filter@middle_chinese
    - reverse_lookup_filter@pinyin_reverse_lookup
    - reverse_lookup_filter@jyutping_reverse_lookup

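Detailed configuration

For the regular expressions used in comment_format, preedit_format and speller/algebra, see "Perl regular expressions"

Every component shown in bold under Engine can be described in detail below, in the form:

name:
  branches: configurations

or

name:
  branches:
    - configurations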
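1. `speller`

alphabet: the keys used for input in this schema
initials: keys that can only start a code
finals: keys that can only end a code
delimiter: the separator between syllables
algebra: spelling algebra rules; the spellings computed from them go into the prism
max_code_length: maximum code length for shape-based codes; going beyond it commits the current candidate (number)
auto_select: automatic commit (true or false)
auto_select_pattern: auto-commit rule written as a regular expression; when the input string matches, the candidate is committed automatically
use_space: use space as an input code (true or false)

The speller's algebra operations are:

xform -- rewrite (original form not kept): phonetic conversion, after which the original code no longer yields characters, e.g. pinyin to the 大千 layout, Cangjie to Quick
derive -- derive (original form kept): adds a new code, e.g. Cangjie -> Cangjie + Quick
abbrev -- abbreviation (lower candidate priority than the two above)
fuzz -- fuzzy abbreviation (only used to form phrases, never yields single characters)
xlit -- transliterate (suited to many one-to-one replacements), similar to xform
erase -- delete

Example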

luna_pinyin.schema.yaml

speller:
  alphabet: zyxwvutsrqponmlkjihgfedcba
  delimiter: " '"
  algebra:
    - erase/^xx$/
    - abbrev/^([a-z]).+$/$1/
    - abbrev/^([zcs]h).+$/$1/
    - derive/^([nl])ve$/$1ue/
    - derive/^([jqxy])u/$1v/
    - derive/un$/uen/
    - derive/ui$/uei/
    - derive/iu$/iou/
    - derive/([aeiou])ng$/$1gn/
    - derive/([dtngkhrzcs])o(u|ng)$/$1o/
    - derive/ong$/on/
    - derive/ao$/oa/
    - derive/([iu])a(o|ng?)$/a$1$2/

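2. `segmentor`

The segmentor works with the recognizer to mark tags. affix_segmentor and abc_translator are involved here
Tags are used in translator, reverse_lookup_filter and simplifier to delimit the scope of each
If extra_tags is not needed, the segmentor does not need separate configuration (I have not fully worked this out; for now, whenever a tag fails to trigger I add an affix_segmentor first)

tag: sets its tag; it accepts strings carrying this tag
prefix: sets its prefix marker; optional, no prefix if omitted (strips the prefix that the pattern required)
suffix: sets its suffix marker; optional, no suffix if omitted (I could not get a suffix trigger to work, so I do not use it)
tips: the hint shown before input; optional, no hint if omitted
closing_tips: the hint shown when input ends; optional, no hint if omitted
extra_tags: adds other tags to the segments this segmentor marks (my understanding: used to trigger a translator, or sub segmentor --> translator. reverse_lookup_filter already has tags to receive with, so I cannot think how extra_tags would apply to a filter. For example: a pattern triggers test; testseg/tag: test; testseg/extra_tags: [ pinyin, abc ]; reverse_lookup_filter@test/tags: [ test ]. The characters translated by pinyin and the main dictionary carry the test tag, and the filter then does reverse lookup on the test-tagged characters)

When an affix_segmentor and a translator share a name, the two can be configured in one place; items 1-5 here correspond to items 19-23 below. abc_segmentor can only set extra_tags

Example

cangjie6.schema.yaml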

# Item 6, extra_tags, illustrated

engine:
  segmentors:
    - affix_segmentor@testseg
  translators:
    - table_translator
    - script_translator@pinyin
  filters:
    - simplifier                   # opencc traditional/simplified conversion
    - uniquifier                   # removes duplicate candidates
    - reverse_lookup_filter@test

recognizer:
  patterns:
    test: "PP[a-z;.]*$"     # PPabc --> PPabc, tag: test
    pinyin: "P[a-z;,]*$"    # Pabc --> Pabc, tag: pinyin

testseg:                    # handles PPabc, tag: test
  tag: test                 # accepts the test tag
  extra_tags: [ pinyin ]    # forwards to pinyin
  prefix: "PP"              # strips PP, then asks the pinyin dictionary to look up "abc"; any results are pushed to the candidate list with tag: test

pinyin:                     # handles Pabc, tag: pinyin
  tag: pinyin
  dictionary: luna_pinyin   # tag: pinyin, Pabc --> abc --> word --> candidate word + tag: pinyin
                            # tag: test, abc --> word --> candidate word + tag: test
  prefix: "P"               # strips P and looks up abc. Only the pinyin tag is handled here; for the test tag, testseg asks this translator for help and outputs the result itself

test:                       # reverse_lookup_filter for words with tag: test
  tags: [ test ]
  dictionary: cangjie6      # cangjie6.reverse.bin
  comment_format: []        # rules that reformat the looked-up code go here

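reverse_lookup:
  tag: reverse_lookup    # accepts the reverse_lookup tag and forwards to pinyin_lookup and jyutping_lookup; the candidates that come back carry the reverse_lookup tag
  prefix: "`"   # the trigger is stripped here and the string is handed to pinyin, which translates it directly without handling prefix/suffix again
  # e.g. `abc -> `abc (tag: reverse_lookup) -> abc -> translated by pinyin -> character (tag: reverse_lookup)
  # if the string is not stripped cleanly the code is not found and the raw string is returned, e.g. with a faulty pattern: `Aabc -> Aabc --> not found --> Aabc
  suffix: ";"
  tips: "【反查】"
  closing_tips: "【蒼頡】"
  extra_tags:
    - pinyin_lookup
    - jyutping_lookup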

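3. `translator`

Each schema has one main translator, listed in the engine without @ + translator name and configured under translator:. Items in bold below can be set in the main translator; the others can be set in secondary translators (named with @ + translator name)

enable_charset_filter: whether to enable character-set filtering (table_translator only; with cjk_minifier enabled it also applies to script_translator)
enable_encoder: whether to enable automatic phrase encoding (table_translator only)
encode_commit_history: whether to build phrases automatically from committed text (table_translator only)
max_phrase_length: maximum length of automatically built phrases (table_translator only)
enable_completion: show characters whose code has not been fully typed yet (table_translator only). Unfinished codes also appear among the candidates, which helps when working out a decomposition
sentence_over_completion: enable smart sentence composition even when there is no full-code match and only completion hints exist (table_translator only)
strict_spelling: works with the fuzz rule in speller; fuzzy codes are used only to form phrases (table_translator only)
disable_user_dict_for_patterns: keeps certain codes out of the user dictionary (this stops frequently used short codes from being reordered. I set /^..?$/ so that one- and two-letter codes stay put; user habits are still learned, but common short codes do not jump around)
enable_sentence: whether to enable automatic sentence composition
enable_user_dict: whether to enable the user dictionary (which records dynamic character/word frequencies and user phrases)
- The items above take true or false
dictionary: the dictionary file this translator reads
prism: the name of the prism file generated by this main translator's speller, or the prism used by this secondary translator (remember this one)
user_dict: name of the user dictionary
db_class: type of the user dictionary, tabledb (text) or userdb (binary)
preedit_format: custom format of the input code shown while composing (converts the typed characters, e.g. xlit|abcd|日月金木| ; xform/^(.*)$/倉-$1/ turns key codes into Cangjie radicals and puts 倉- in front as a reminder)
comment_format: custom format of the hint code. In a translator, complete characters get no hint while completion candidates show the remaining code, which this rule processes; in a reverse_lookup_filter a candidate carries several codes (character + jpk tta tta), which this rule processes
spelling_hints: candidates up to this many characters are annotated with full toned pinyin (script_translator only; useful with pinyin to display the spelling)
initial_quality: candidate priority of this translator (worth setting when two or more translators are used together)
tag: sets the tag this translator handles. Optional; if omitted it handles only abc
prefix: sets this translator's prefix marker; optional, no prefix if omitted
suffix: sets this translator's suffix marker; optional, no suffix if omitted
tips: the hint shown before input for this translator; optional, no hint if omitted
closing_tips: the hint shown when input ends for this translator; optional, no hint if omitted

Example

cangjie6.schema.yaml, Cangjie main translator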

translator:
  dictionary: cangjie6
  enable_charset_filter: true
  enable_sentence: true
  enable_encoder: true
  encode_commit_history: true
  max_phrase_length: 5
  preedit_format:
    - xform/^([a-z ])$/$1｜\U$1\E/
    - xform/(?<=[a-z])\s(?=[a-z])//
    - "xlit|ABCDEFGHIJKLMNOPQRSTUVWXYZ|日月金木水火土竹戈十大中一弓人心手口尸廿山女田止卜片|"
  comment_format: # processes the remaining code shown with completion candidates
    - "xlit|abcdefghijklmnopqrstuvwxyz~|日月金木水火土竹戈十大中一弓人心手口尸廿山女田止卜片・|"
  disable_user_dict_for_patterns:
    - "^z.$"
  initial_quality: 0.75

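cangjie6.schema.yaml, pinyin secondary translator

pinyin:
  tag: pinyin  # accepts strings tagged pinyin
  dictionary: luna_pinyin  # finds characters with luna_pinyin.table.bin and luna_pinyin.prism.bin
  enable_charset_filter: true
  prefix: 'P' # must match the recognizer pattern; stripped from the front of the string
  suffix: ';' # must match the recognizer pattern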
  preedit_format:
    - "xform/([nl])v/$1ü/"
    - "xform/([nl])ue/$1üe/"
    - "xform/([jqxy])v/$1u/"
  tips: "【漢拼】"
  closing_tips: "【蒼頡】"

pinyin_simp.schema.yaml, pinyin (simplified characters) main translator

translator:  # no tag, so it accepts the abc tag. As the main translator it builds luna_pinyin.table.bin, luna_pinyin_simp.prism.bin and luna_pinyin.reverse.bin
  dictionary: luna_pinyin
  prism: luna_pinyin_simp
  preedit_format:
    - xform/([nl])v/$1ü/
    - xform/([nl])ue/$1üe/
    - xform/([jqxy])v/$1u/

luna_pinyin.schema.yaml, Luna Pinyin custom phrases

custom_phrase: # this is a table_translator
  dictionary: ""
  user_dict: custom_phrase
  db_class: tabledb
  enable_sentence: false
  enable_completion: false
  initial_quality: 1

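4. `reverse_lookup_filter`

This filter must be attached to a translator; it does not affect how that translator works

tags: sets its scope
overwrite_comment: whether to overwrite other hints
dictionary: the code table that supplies the looked-up codes (name.reverse.bin)
comment_format: custom format of the hint code. A reverse_lookup_filter candidate carries several codes (character + jpk tta tta), and this rule processes them

Example

cangjie6.schema.yaml

pinyin_reverse_lookup: # name of this reverse-lookup filter
  tags: [ pinyin_lookup ] # attached to the translator for this tag: takes candidates tagged pinyin_lookup and uses the dictionary's reverse.bin to produce their codes
  overwrite_comment: true
  dictionary: cangjie6 # the looked-up codes are Cangjie codes (reverse.bin)
  comment_format:  # rewrites the looked-up code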
    - "xform/$/〕/"
    - "xform/^/〔/"
    - "xlit|abcdefghijklmnopqrstuvwxyz |日月金木水火土竹戈十大中一弓人心手口尸廿山女田止卜片、|"

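5. `simplifier`

option_name: the name of the corresponding option in switches
opencc_config: the conversion configuration file
- located in rime_dir/opencc/; the bundled configuration files include:
  1. Traditional to Simplified (default): t2s.json
  2. Traditional to Taiwan standard: t2tw.json
  3. Traditional to Hong Kong standard: t2hk.json
  4. Simplified to Traditional: s2t.json
tags: sets the scope of conversion
tips: whether to show the pre-conversion character as a hint: none (or omitted), char (single characters only), or all
show_in_comment: whether to show the conversion result only in the comment
excluded_types: excludes certain candidate types (usually reverse_lookup_translator) from conversion

Example

Adapted from luna_pinyin_kunki.schema

zh_tw:
  option_name: zh_tw
  opencc_config: t2tw.json
  tags: [ abc ] # abc corresponds to abc_segmentor
  tips: none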
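The example above only shows the filter's own block. How it ties to a switch and to the engine can be sketched as follows; the states labels are made up for illustration, and the fragment omits the rest of the schema.

```yaml
# Sketch: a simplifier instance wired to its switch (fragment; labels are hypothetical).
switches:
  - name: zh_tw                  # must equal option_name below
    reset: 0
    states: ["漢字", "臺灣字"]

engine:
  filters:
    - simplifier@zh_tw           # instance name points at the `zh_tw:` block
    - uniquifier                 # removes duplicates the conversion may produce

zh_tw:
  option_name: zh_tw
  opencc_config: t2tw.json
  tags: [ abc ]
  tips: none
```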

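6. `chord_composer`

Chording splits the keyboard into two halves, as if there were two keyboards. When keys on both sides are pressed together, the system treats the keys on one half as coming before those on the other, which yields the committed code

alphabet: the keys used for chording. Keys may be struck in any order; once they form a chord they are always arranged in alphabet order
algebra: spelling algebra that turns a chord code into a pinyin syllable
output_format: format applied when the chord is complete; appends the syllable separator
prompt_format: format applied while the chord is being pressed; adds square brackets

Example

combo_pinyin.schema.yaml

chord_composer:
  # alphabet: the keys used for chording
  # keys may be struck in any order; once they form a chord they are always arranged in alphabet order
  alphabet: "swxdecfrvgtbnjum ki,lo."
  # spelling algebra that turns a chord code into a pinyin syllable
  algebra:
    # first map the physical key characters to the pinyin letters of the 宮保拼音 layout
    - 'xlit|swxdecfrvgtbnjum ki,lo.|sczhlfgdbktpRiuVaNIUeoE|'
    # the rules below transform initials and finals according to the 宮保拼音 layout
    # combined initials
    - xform/^zf/zh/
    - xform/^cl/ch/
    - xform/^fb/m/
    - xform/^ld/n/
    - xform/^hg/r/
    ……
    # an initial used alone gets its implied final
    - xform/^([bpf])$/$1u/
    - xform/^([mdtnlgkh])$/$1e/
    - xform/^([mdtnlgkh])$/$1e/
    - xform/^([zcsr]h?)$/$1i/
  # format applied when the chord is complete: append the syllable separator
  output_format:
    - "xform/^([a-z]+)$/$1'/"
  # format applied while the chord is being pressed: add square brackets
  prompt_format:
    - "xform/^(.*)$/[$1]/"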

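7. Other

Includes recognizer, key_binder and punctuator. Punctuation, shortcuts, keys for picking the second and third candidates, special characters and the like are set here

import_preset: import from a shared external file
recognizer: contains patterns:, which works with the segmentor's prefix and suffix to split segments and assign tags
- The field before the colon can be a tag name defined by affix_segmentor@someTag, or one of the two built-in fields punct and reverse_lookup. Other fields do not invoke the input method engine; what is typed is what is output (e.g. url)
key_binder: contains bindings:, which sets functional shortcuts
- Each binding may contain: accept (the key actually pressed), send (the key to output), toggle (the switch to toggle) and when (the scope); use either send or toggle
  - toggle accepts the five switch names
  - when accepts:
```
paging	while paging
has_menu	while operating on candidates
composing	while operating on the input code
always	everywhere
```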
  - besides A-Za-z0-9, accept and send also take the following keys that physically exist on keyboards:
```
BackSpace	backspace
Tab	horizontal tab
Linefeed	line feed
Clear	clear
Return	Enter
Pause	pause
Sys_Req	print screen
Escape	escape
Delete	delete
Home	home
Left	left arrow
Up	up arrow
Right	right arrow
Down	down arrow
Prior, Page_Up	page up
Next, Page_Down	page down
End	end
Begin	begin
Shift_L	left Shift
Shift_R	right Shift
Control_L	left Ctrl
Control_R	right Ctrl
Meta_L	left Meta
Meta_R	right Meta
Alt_L	left Alt
Alt_R	right Alt
Super_L	left Super
Super_R	right Super
Hyper_L	left Hyper
Hyper_R	right Hyper
Caps_Lock	Caps Lock
Shift_Lock	Shift Lock
Scroll_Lock	Scroll Lock
Num_Lock	Num Lock
Select	select
Print	print
Execute	execute
Insert	insert
Undo	undo
Redo	redo
Menu	menu
Find	find
Cancel	cancel
Help	help
Break	break
```
    space exclam ! quotedbl " numbersign # dollar $ percent % ampersand & apostrophe ' parenleft ( parenright ) asterisk * plus + comma , minus - period . slash / colon : semicolon ; less < equal = greater > question ? at @ bracketleft [ backslash bracketright ] asciicircum ^ underscore _ grave ` braceleft { bar | braceright } asciitilde ~
    
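    KP_Space	keypad space
    KP_Tab	keypad horizontal tab
    KP_Enter	keypad Enter
    KP_Delete	keypad delete
    KP_Home	keypad home
    KP_Left	keypad left arrow
    KP_Up	keypad up arrow
    KP_Right	keypad right arrow
    KP_Down	keypad down arrow
    KP_Prior, KP_Page_Up	keypad page up
    KP_Next, KP_Page_Down	keypad page down
    KP_End	keypad end
    KP_Begin	keypad begin
    KP_Insert	keypad insert
    KP_Equal	keypad equals
    KP_Multiply	keypad multiply
    KP_Add	keypad plus
    KP_Subtract	keypad minus
    KP_Divide	keypad divide
    KP_Decimal	keypad decimal point
    KP_0	keypad 0
    KP_1	keypad 1
    KP_2	keypad 2
    KP_3	keypad 3
    KP_4	keypad 4
    KP_5	keypad 5
    KP_6	keypad 6
    KP_7	keypad 7
    KP_8	keypad 8
    KP_9	keypad 9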

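editor customises the editing keys (import_preset: is not supported). Key names are the same as for accept and send in key_binder/bindings; the available actions are:

confirm	commit the candidate
commit_comment	commit the candidate's comment
commit_raw_input	commit the raw input
commit_script_text	commit the transformed input
commit_composition	commit single characters in sentence-flow mode
revert	undo the last input
back	go back one character
back_syllable	go back one syllable
delete_candidate	delete the candidate
delete	delete forward
cancel	cancel the input
noop	no operation

punctuator: contains full_shape: and half_shape:, which control the symbols in full-width and half-width mode respectively, plus use_space: (space commits the candidate; true or false)
- Each punctuation entry can use commit (commit directly) or pair (alternate between two symbols); the default is menu mode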

Example

Adapted from cangjie6.schema.yaml

key_binder:
  import_preset: default
  bindings:
    - {accept: semicolon, send: 2, when: has_menu} # semicolon selects the second candidate
    - {accept: apostrophe, send: 3, when: has_menu} # apostrophe selects the third candidate
    - {accept: "Control+1", select: .next, when: always}
    - {accept: "Control+2", toggle: full_shape, when: always}
    - {accept: "Control+3", toggle: simplification, when: always}
    - {accept: "Control+4", toggle: extended_charset, when: always}

editor:
  bindings:
    Return: commit_comment

punctuator:
  import_preset: symbols
  half_shape:
    "'": {pair: ["「", "」"]} # first press gives 「, second press gives 」
    "(": ["〔", "［"] # pops up a menu
    .: {commit: "。"} # no menu, committed directly; highest priority

recognizer:
  import_preset: default
  patterns:
    email: "^[a-z][-_.0-9a-z]*@.*$"
    url: "^(www[.]|https?:|ftp:|mailto:).*$"
    reverse_lookup: "`[a-z]*;?$"
    pinyin_lookup: "`P[a-z]*;?$"
    jyutping_lookup: "`J[a-z]*;?$"
    pinyin: "(?<!`)P[a-z']*;?$"
    jyutping: "(?<!`)J[a-z']*;?$"
    punct: "/[a-z]*$" # works with the special-character input in symbols.yaml

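Other

Example

menu:
  alternative_select_labels: [ ①, ②, ③, ④, ⑤, ⑥, ⑦, ⑧, ⑨ ]  # custom candidate labels
  alternative_select_keys: ASDFGHJKL # needed when the code characters occupy the digit keys
  page_size: 5 # candidates per page

style:
  font_face: "HanaMinA, HanaMinB" # font (Weasel accepts exactly one font; Squirrel accepts several, later fonts filling in glyphs missing from earlier ones)
  font_point: 15 # font size
  label_format: '%s' # candidate label format
  horizontal: false # horizontal or vertical layout
  line_spacing: 1 # line spacing
  inline_preedit: true # show the input code inline

Rime also gives each schema a menu and some control over appearance
Usually menu is defined in default.yaml (or the user patch default.custom.yaml), and style in squirrel.yaml or weasel.yaml (or the user patches squirrel.custom.yaml or weasel.custom.yaml)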
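As a sketch of the user-patch route mentioned above, a menu setting can be overridden without touching default.yaml. The page size value here is only an example.

```yaml
# default.custom.yaml (user patch; the value is an example)
patch:
  "menu/page_size": 7    # candidates per page
```

A style setting would go into weasel.custom.yaml or squirrel.custom.yaml in the same patch: form, for instance with the key "style/horizontal".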

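`Dict.yaml` reference

Before you start

# Rime dict
# encoding: utf-8
(you can also note the dictionary's source, change history and so on here)

Metadata

name: internal dictionary name, i.e. the name the schema refers to; make sure it matches the file name
version: if you publish the dictionary, bump the version with every change

Example

name: "cangjie6.extended"
version: "0.1"

Configuration

sort: initial ordering of the dictionary, original or by_weight
use_preset_vocabulary: whether to import 「八股文」 (the preset vocabulary with character/word frequencies)
max_phrase_length: with use_preset_vocabulary:, the maximum length of imported phrases
min_phrase_weight: with use_preset_vocabulary:, the minimum frequency of imported phrases
columns: defines the Tab-separated columns of the code table: text, code, weight, stem (the code used for phrase encoding)
import_tables: loads other external code tables
encoder: phrase-encoding rules for shape-based codes
1. exclude_patterns: excludes certain codes from phrase encoding
2. rules: defined with length_equal: and length_in_range:. An uppercase letter gives the position of a character in the phrase; a lowercase letter gives the position of a code within the character named by the uppercase letter before it
3. tail_anchor: the stem contains a structure separator (Cangjie only)

Example

cangjie6.extended.dict.yaml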

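sort: by_weight
use_preset_vocabulary: false
import_tables:
  - cangjie6 # the single-character table is imported from cangjie6.dict.yaml
columns: # this dictionary is a pure phrase list: no codes, only text and frequency
  - text # character/phrase
  - weight # character/phrase frequency
encoder:
  exclude_patterns:
    - '^z.*$'
  rules:
    - length_equal: 2 # two-character phrases
      formula: "AaAzBaBbBz" # first and last code of character 1; first, second and last code of character 2
    - length_equal: 3 # three-character phrases
      formula: "AaAzBaYzZz" # first and last code of character 1, first and last code of character 2, last code of character 3
    - length_in_range: [4, 5] # four- and five-character phrases
      formula: "AaBzCaYzZz" # first code of character 1, last code of character 2, first code of character 3, last code of the second-to-last character, last code of the last character
  tail_anchor: "'"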

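Code table

Columns are separated by Tab and follow the order defined in columns:.

Example

cangjie6.dict.yaml

columns:
  - text # column 1: character/phrase
  - code # column 2: code
  - weight # column 3: character/phrase frequency
  - stem # column 4: code used for phrase encoding

cangjie6.dict.yaml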

個	owjr	246268	ow'jr
看	hqbu	245668
中	l	243881
呢	rsp	242970
來	doo	235101
嗎	rsqf	221092
爲	bhnf	211340
會	owfa	209844
她	vpd	204725
與	xyc	203975
給	vfor	193007
等	hgdi	183340
這	yymr	181787
用	bq	168934	b'q
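Putting the pieces of this section together, a complete dictionary file has a YAML header between `---` and `...`, followed by the Tab-separated table. The sketch below reuses three rows from the table above; the version string is a placeholder.

```yaml
# Rime dict
# encoding: utf-8

---
name: cangjie6
version: "0.1"          # placeholder
sort: by_weight
columns:
  - text
  - code
  - weight
  - stem
...

個	owjr	246268	ow'jr
看	hqbu	245668
中	l	243881
```

The file would be saved as cangjie6.dict.yaml so that name: matches the file name.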

shewerlu
