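A Patch, for the First Time in a While
The fht function has gotten reasonably fast as well, so I have rolled everything up into a patch. The patch is here.
Here are the encoding results for the usual 3-second starwars WAV file.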
systemsim % mysim spu 6 display statistics
SPU DD3.0
***
Total Cycle count                 981364963
Total Instruction count           643
Total CPI                         1526228.50
***
Performance Cycle count           979809189
Performance Instruction count     597527927 (541391140)
Performance CPI                   1.64 (1.81)

Branch instructions               30037952
Branch taken                      19331618
Branch not taken                  10706334

Hint instructions                 7303105
Hint hit                          13310919

Contention at LS between Load/Store and Prefetch 14136619

Single cycle                            339400092 ( 34.6%)
Dual cycle                              100995524 ( 10.3%)
Nop cycle                                21308403 (  2.2%)
Stall due to branch miss                133529380 ( 13.6%)
Stall due to prefetch miss                  94213 (  0.0%)
Stall due to dependency                 321072410 ( 32.8%)
Stall due to fp resource conflict               0 (  0.0%)
Stall due to waiting for hint target     14876990 (  1.5%)
Stall due to dp pipeline                  6265944 (  0.6%)
Channel stall cycle                      42266233 (  4.3%)
SPU Initialization cycle                        0 (  0.0%)
-----------------------------------------------------------------------
Total cycle                             979809189 (100.0%)

Stall cycles due to dependency on each pipelines
FX2      24787486 (  7.7% of all dependency stalls)
SHUF    108813277 ( 33.9% of all dependency stalls)
FX3      10420670 (  3.2% of all dependency stalls)
LS       94967457 ( 29.6% of all dependency stalls)
BR         242474 (  0.1% of all dependency stalls)
SPR           488 (  0.0% of all dependency stalls)
LNOP            0 (  0.0% of all dependency stalls)
NOP             0 (  0.0% of all dependency stalls)
FXB             0 (  0.0% of all dependency stalls)
FP6      60554752 ( 18.9% of all dependency stalls)
FP7      18592943 (  5.8% of all dependency stalls)
FPD       2692863 (  0.8% of all dependency stalls)

The number of used registers are 128, the used ratio is 100.00
dumped pipeline stats
systemsim %
979809189/3000000000=0.327[sec]
That works out to about 9.17x real time. No dramatic improvement to speak of. Oh well, it can't be helped.
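For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the calculation. It assumes a 3 GHz clock (the divisor used in the formula) and that the input is exactly 3 seconds of audio; the constant and function names are made up for illustration. Carried through without rounding, the ratio comes out at roughly 9.19x, while dividing by the rounded 0.327 gives the 9.17 quoted here.

```python
# Figures copied from the simulator statistics in this post.
PERF_CYCLES = 979_809_189
PERF_INSTRUCTIONS = 597_527_927

# Assumptions (not measured): a 3 GHz SPU clock, as implied by the
# divisor in the post, and an input of exactly 3 seconds of audio.
CLOCK_HZ = 3_000_000_000
AUDIO_SECONDS = 3.0


def encode_seconds(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count into wall-clock seconds at the given clock."""
    return cycles / clock_hz


def realtime_factor(audio_seconds, encode_secs):
    """How many times faster than real time the encode ran."""
    return audio_seconds / encode_secs


if __name__ == "__main__":
    t = encode_seconds(PERF_CYCLES)
    print(f"encode time: {t:.3f} s")
    print(f"speed: {realtime_factor(AUDIO_SECONDS, t):.2f}x real time")
    # Cycles per instruction, the same ratio the simulator reports as
    # "Performance CPI".
    print(f"CPI: {PERF_CYCLES / PERF_INSTRUCTIONS:.2f}")
```

Swapping in a different clock (for example, passing `clock_hz=3_200_000_000` to `encode_seconds`) shows how sensitive the speed figure is to that assumption.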