Accuracy comparison of deberta-base-japanese-wikipedia-ud-goeswith and UDPipe 2 | yasuoka's Diary

Diary entry by yasuoka, November 17, 2022, 23:29

Now that Universal Dependencies 2.11 has been released, I checked the LAS/MLAS/BLEX of deberta-base-japanese-wikipedia-ud-goeswith, a model for NINJAL long-unit words (国語研長単位). On Google Colaboratory it looks like this.

!pip install transformers
import os
# Fetch the UD_Japanese-GSDLUW treebank and copy out train/dev/test
url="https://github.com/UniversalDependencies/UD_Japanese-GSDLUW"
d=os.path.basename(url)
!test -d {d} || git clone --depth=1 {url}
# In an IPython "!" line, $$F is passed to the shell as $F
!for F in train dev test ; do cp {d}/*-$$F.conllu $$F.conllu ; done
# Fetch the CoNLL 2018 evaluation script
url="https://universaldependencies.org/conll18/conll18_ud_eval.py"
c=os.path.basename(url)
!test -f {c} || curl -LO {url}
from transformers import pipeline
nlp=pipeline(task="universal-dependencies",model="KoichiYasuoka/deberta-base-japanese-wikipedia-ud-goeswith",aggregation_strategy="simple",trust_remote_code=True)
with open("test.conllu","r",encoding="utf-8") as r:
  s=r.read()
# Parse the raw text of every test sentence and write the CoNLL-U output
with open("result-test.conllu","w",encoding="utf-8") as w:
  for t in s.split("\n"):
    if t.startswith("# text = "):
      w.write(nlp(t[9:]))
!python {c} test.conllu result-test.conllu
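The loop in the cell above feeds the parser only the raw sentence text taken from the `# text = ` comment lines of test.conllu, so the gold tokenization is never used. As a minimal sketch of that extraction step on its own (the function name `extract_texts` and the sample string are hypothetical, introduced here only for illustration):

```python
from typing import List

PREFIX = "# text = "

def extract_texts(conllu: str) -> List[str]:
    """Return the raw sentence text of every '# text = ' comment line."""
    return [line[len(PREFIX):]
            for line in conllu.split("\n")
            if line.startswith(PREFIX)]

# Hypothetical two-sentence CoNLL-U fragment (token lines abbreviated).
sample = (
    "# sent_id = 1\n"
    "# text = 国境の長いトンネル\n"
    "1\t国境\t_\n"
    "\n"
    "# sent_id = 2\n"
    "# text = 夜の底\n"
    "1\t夜\t_\n"
)
texts = extract_texts(sample)
print(texts)
```

Because only the raw text is passed in, the scores reported by conll18_ud_eval.py also reflect the model's own word segmentation.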

On my (Koichi Yasuoka's) machine, I got the following result.

LAS F1 Score: 88.06
MLAS Score: 77.79
BLEX Score: 0.00

BLEX comes out as 0.00 because the model does not output LEMMA. When I ran the same test with the UDPipe 2 model japanese-gsdluw-ud-2.10-220711, I got the following result.

LAS F1 Score: 85.15
MLAS Score: 75.91
BLEX Score: 76.57
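To put the two score reports side by side, here is a small sketch that parses the three lines and takes the per-metric difference. The helper `parse_scores` and the variable names are hypothetical, and the input strings are simply the two results quoted in this entry.

```python
import re

def parse_scores(report: str) -> dict:
    """Extract LAS/MLAS/BLEX values from a short evaluation report."""
    pattern = r"\b(LAS|MLAS|BLEX)(?: F1)? Score: ([0-9.]+)"
    return {m.group(1): float(m.group(2)) for m in re.finditer(pattern, report)}

deberta = parse_scores("LAS F1 Score: 88.06\nMLAS Score: 77.79\nBLEX Score: 0.00")
udpipe = parse_scores("LAS F1 Score: 85.15\nMLAS Score: 75.91\nBLEX Score: 76.57")

# Positive values mean the DeBERTa model scored higher than UDPipe 2.
diff = {k: round(deberta[k] - udpipe[k], 2) for k in deberta}
print(diff)
```

On these numbers the gap is 2.91 points in LAS and 1.88 points in MLAS in favor of the DeBERTa model, and 76.57 points in BLEX in favor of UDPipe 2.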

deberta-base-japanese-wikipedia-ud-goeswith gives somewhat better LAS and MLAS, but a BLEX of 0.00 still hurts. Now, what should I do to output LEMMA?
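As a quick diagnostic (not a fix), one can measure how many word lines of a CoNLL-U file actually carry a LEMMA. The sketch below assumes the standard 10-column CoNLL-U layout with LEMMA in the third column and `_` for an unspecified value. The function name `lemma_coverage` and the sample rows are hypothetical.

```python
def lemma_coverage(conllu: str) -> float:
    """Fraction of word lines whose LEMMA column (3rd field) is not '_'."""
    total = filled = 0
    for line in conllu.split("\n"):
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        total += 1
        if cols[2] != "_":
            filled += 1
    return filled / total if total else 0.0

# Hypothetical sample: the first word has a lemma, the second does not.
rows = [
    ["1", "夜", "夜", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["2", "の", "_", "ADP", "_", "_", "1", "case", "_", "_"],
]
sample = "# text = 夜の\n" + "\n".join("\t".join(r) for r in rows) + "\n"
coverage = lemma_coverage(sample)
print(coverage)
```

Running it over result-test.conllu and test.conllu would show whether the system output really leaves LEMMA empty where the gold file does not.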
