
cyber205's journal: BYTE BENCH @ Core i7 860

Journal by cyber205

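Recent versions of BYTEBENCH can measure a graphics score and can measure performance on multi-core CPUs.
Unfortunately, the source on GitHub produces errors with the latest GCC, and the script that summarizes the results is written in Perl, so it bothers me a little that it has drifted away from its simple earlier form.
It also seems to look at the LOCALE settings, so getting it to run easily without errors looks a bit tricky.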

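I ran the benchmark on a high-performance(?) 4C8T processor I have on hand, a Core i7.
This is a model whose performance has been lower than I expected, which bothers me, so I would like to compare it with another CPU of older design.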

------------------------------------------------------------------------
Benchmark_Run:_Fri_Jan_01_2021_23:31:12_-_23:59:12
8_CPUs_in_system;_running_1_parallel_copy_of_tests

Dhrystone_2_using_register_variables_______34109062.2_lps___(10.0_s,_7_samples)
Double-Precision_Whetstone_____________________5042.8_MWIPS_(9.4_s,_7_samples)
Execl_Throughput_______________________________4139.3_lps___(30.0_s,_2_samples)
File_Copy_1024_bufsize_2000_maxblocks________563127.3_KBps__(30.0_s,_2_samples)
File_Copy_256_bufsize_500_maxblocks__________142387.3_KBps__(30.0_s,_2_samples)
File_Copy_4096_bufsize_8000_maxblocks_______1486933.4_KBps__(30.0_s,_2_samples)
Pipe_Throughput______________________________866857.2_lps___(10.0_s,_7_samples)
Pipe-based_Context_Switching_________________192783.5_lps___(10.0_s,_7_samples)
Process_Creation_______________________________8070.5_lps___(30.0_s,_2_samples)
Shell_Scripts_(1_concurrent)___________________9789.3_lpm___(60.0_s,_2_samples)
Shell_Scripts_(8_concurrent)___________________2882.0_lpm___(60.0_s,_2_samples)
System_Call_Overhead_________________________745427.1_lps___(10.0_s,_7_samples)

System_Benchmarks_Index_Values_______________BASELINE_______RESULT____INDEX
Dhrystone_2_using_register_variables_________116700.0___34109062.2___2922.8
Double-Precision_Whetstone_______________________55.0_______5042.8____916.9
Execl_Throughput_________________________________43.0_______4139.3____962.6
File_Copy_1024_bufsize_2000_maxblocks__________3960.0_____563127.3___1422.0
File_Copy_256_bufsize_500_maxblocks____________1655.0_____142387.3____860.3
File_Copy_4096_bufsize_8000_maxblocks__________5800.0____1486933.4___2563.7
Pipe_Throughput_______________________________12440.0_____866857.2____696.8
Pipe-based_Context_Switching___________________4000.0_____192783.5____482.0
Process_Creation________________________________126.0_______8070.5____640.5
Shell_Scripts_(1_concurrent)_____________________42.4_______9789.3___2308.8
Shell_Scripts_(8_concurrent)______________________6.0_______2882.0___4803.4
System_Call_Overhead__________________________15000.0_____745427.1____497.0
___________________________________________________________________========
System_Benchmarks_Index_Score________________________________________1207.3

This value is the single-core, single-thread performance.
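As a rough cross-check of the table above: the INDEX column matches RESULT / BASELINE x 10, and the final score matches the geometric mean of the twelve index values. The sketch below recomputes both from the figures in the 1-copy run. The function and variable names are my own for illustration and are not taken from the benchmark's Perl script.

```python
from math import exp, log

# (BASELINE, RESULT) pairs copied from the 1-copy run above.
RUN_1_COPY = {
    "Dhrystone 2 using register variables": (116700.0, 34109062.2),
    "Double-Precision Whetstone": (55.0, 5042.8),
    "Execl Throughput": (43.0, 4139.3),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 563127.3),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 142387.3),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 1486933.4),
    "Pipe Throughput": (12440.0, 866857.2),
    "Pipe-based Context Switching": (4000.0, 192783.5),
    "Process Creation": (126.0, 8070.5),
    "Shell Scripts (1 concurrent)": (42.4, 9789.3),
    "Shell Scripts (8 concurrent)": (6.0, 2882.0),
    "System Call Overhead": (15000.0, 745427.1),
}


def index_value(baseline, result):
    """Per-test index: result scaled so the baseline machine scores 10."""
    return result / baseline * 10.0


def index_score(tests):
    """Overall score: geometric mean of the per-test index values."""
    logs = [log(index_value(b, r)) for b, r in tests.values()]
    return exp(sum(logs) / len(logs))


if __name__ == "__main__":
    for name, (baseline, result) in RUN_1_COPY.items():
        print(f"{name:40s} {index_value(baseline, result):8.1f}")
    print(f"{'System Benchmarks Index Score':40s} {index_score(RUN_1_COPY):8.1f}")
```

Because the score is a geometric mean, one very slow test (here Pipe-based Context Switching at 482.0) pulls the total down more than an arithmetic mean would.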
------------------------------------------------------------------------
Benchmark_Run:_Fri_Jan_01_2021_23:59:12_-_00:27:28
8_CPUs_in_system;_running_4_parallel_copies_of_tests

Dhrystone_2_using_register_variables_______90173078.1_lps___(10.0_s,_7_samples)
Double-Precision_Whetstone____________________15653.9_MWIPS_(10.6_s,_7_samples)
Execl_Throughput______________________________11701.4_lps___(30.0_s,_2_samples)
File_Copy_1024_bufsize_2000_maxblocks________792873.8_KBps__(30.0_s,_2_samples)
File_Copy_256_bufsize_500_maxblocks__________203522.1_KBps__(30.0_s,_2_samples)
File_Copy_4096_bufsize_8000_maxblocks_______2458298.5_KBps__(30.0_s,_2_samples)
Pipe_Throughput_____________________________2782591.3_lps___(10.0_s,_7_samples)
Pipe-based_Context_Switching_________________475336.1_lps___(10.0_s,_7_samples)
Process_Creation______________________________20834.4_lps___(30.0_s,_2_samples)
Shell_Scripts_(1_concurrent)__________________22122.9_lpm___(60.0_s,_2_samples)
Shell_Scripts_(8_concurrent)___________________3937.9_lpm___(60.0_s,_2_samples)
System_Call_Overhead________________________2427961.5_lps___(10.0_s,_7_samples)

System_Benchmarks_Index_Values_______________BASELINE_______RESULT____INDEX
Dhrystone_2_using_register_variables_________116700.0___90173078.1___7726.9
Double-Precision_Whetstone_______________________55.0______15653.9___2846.2
Execl_Throughput_________________________________43.0______11701.4___2721.3
File_Copy_1024_bufsize_2000_maxblocks__________3960.0_____792873.8___2002.2
File_Copy_256_bufsize_500_maxblocks____________1655.0_____203522.1___1229.7
File_Copy_4096_bufsize_8000_maxblocks__________5800.0____2458298.5___4238.4
Pipe_Throughput_______________________________12440.0____2782591.3___2236.8
Pipe-based_Context_Switching___________________4000.0_____475336.1___1188.3
Process_Creation________________________________126.0______20834.4___1653.5
Shell_Scripts_(1_concurrent)_____________________42.4______22122.9___5217.7
Shell_Scripts_(8_concurrent)______________________6.0_______3937.9___6563.1
System_Call_Overhead__________________________15000.0____2427961.5___1618.6
___________________________________________________________________========
System_Benchmarks_Index_Score________________________________________2703.1

Running four copies, nominally one per core, gives about 2.24 times the single-copy score (2703.1 / 1207.3).
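The ratio can be checked with a few lines of Python. This is only a sketch over the two index scores printed above; "efficiency" here is my own shorthand for speedup divided by the number of parallel copies, not something the benchmark reports.

```python
def speedup(single_score, parallel_score):
    """How many times higher the parallel score is than the single-copy score."""
    return parallel_score / single_score


def efficiency(single_score, parallel_score, copies):
    """Speedup per parallel copy; 1.0 would mean perfect linear scaling."""
    return speedup(single_score, parallel_score) / copies


if __name__ == "__main__":
    one_copy, four_copies = 1207.3, 2703.1  # index scores from the runs above
    print(f"speedup:    {speedup(one_copy, four_copies):.2f}x")
    print(f"efficiency: {efficiency(one_copy, four_copies, 4):.2f}")
```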
------------------------------------------------------------------------
Benchmark_Run:_Sat_Jan_02_2021_00:27:28_-_00:55:52
8_CPUs_in_system;_running_8_parallel_copies_of_tests

Dhrystone_2_using_register_variables______111751290.7_lps___(10.0_s,_7_samples)
Double-Precision_Whetstone____________________25548.7_MWIPS_(11.1_s,_7_samples)
Execl_Throughput______________________________15429.8_lps___(30.0_s,_2_samples)
File_Copy_1024_bufsize_2000_maxblocks________633169.5_KBps__(30.0_s,_2_samples)
File_Copy_256_bufsize_500_maxblocks__________166026.4_KBps__(30.0_s,_2_samples)
File_Copy_4096_bufsize_8000_maxblocks_______1978584.9_KBps__(30.0_s,_2_samples)
Pipe_Throughput_____________________________3580670.8_lps___(10.0_s,_7_samples)
Pipe-based_Context_Switching_________________750889.6_lps___(10.0_s,_7_samples)
Process_Creation______________________________32291.7_lps___(30.0_s,_2_samples)
Shell_Scripts_(1_concurrent)__________________25542.0_lpm___(60.0_s,_2_samples)
Shell_Scripts_(8_concurrent)___________________4007.0_lpm___(60.0_s,_2_samples)
System_Call_Overhead________________________3539051.9_lps___(10.0_s,_7_samples)

System_Benchmarks_Index_Values_______________BASELINE_______RESULT____INDEX
Dhrystone_2_using_register_variables_________116700.0__111751290.7___9575.9
Double-Precision_Whetstone_______________________55.0______25548.7___4645.2
Execl_Throughput_________________________________43.0______15429.8___3588.3
File_Copy_1024_bufsize_2000_maxblocks__________3960.0_____633169.5___1598.9
File_Copy_256_bufsize_500_maxblocks____________1655.0_____166026.4___1003.2
File_Copy_4096_bufsize_8000_maxblocks__________5800.0____1978584.9___3411.4
Pipe_Throughput_______________________________12440.0____3580670.8___2878.4
Pipe-based_Context_Switching___________________4000.0_____750889.6___1877.2
Process_Creation________________________________126.0______32291.7___2562.8
Shell_Scripts_(1_concurrent)_____________________42.4______25542.0___6024.1
Shell_Scripts_(8_concurrent)______________________6.0_______4007.0___6678.3
System_Call_Overhead__________________________15000.0____3539051.9___2359.4
___________________________________________________________________========
System_Benchmarks_Index_Score________________________________________3198.9

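Putting two threads on each of the four cores, so that all 8 visible CPUs are busy,
does raise performance further, to 2.65 times the single-copy score.
Peripheral I/O access is presumably hard to run in parallel, so only the CPU-core side is gaining here;
under those conditions, perhaps getting this far is respectable.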
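To see which tests actually scale, the per-test INDEX values of the 1-copy and 8-copy runs can be divided against each other. The sketch below does just that with the numbers copied from the two tables; the helper name `scaling` is hypothetical.

```python
# INDEX column values copied from the 1-copy and 8-copy runs in this entry.
INDEX_1_COPY = {
    "Dhrystone 2 using register variables": 2922.8,
    "Double-Precision Whetstone": 916.9,
    "Execl Throughput": 962.6,
    "File Copy 1024 bufsize 2000 maxblocks": 1422.0,
    "File Copy 256 bufsize 500 maxblocks": 860.3,
    "File Copy 4096 bufsize 8000 maxblocks": 2563.7,
    "Pipe Throughput": 696.8,
    "Pipe-based Context Switching": 482.0,
    "Process Creation": 640.5,
    "Shell Scripts (1 concurrent)": 2308.8,
    "Shell Scripts (8 concurrent)": 4803.4,
    "System Call Overhead": 497.0,
}
INDEX_8_COPIES = {
    "Dhrystone 2 using register variables": 9575.9,
    "Double-Precision Whetstone": 4645.2,
    "Execl Throughput": 3588.3,
    "File Copy 1024 bufsize 2000 maxblocks": 1598.9,
    "File Copy 256 bufsize 500 maxblocks": 1003.2,
    "File Copy 4096 bufsize 8000 maxblocks": 3411.4,
    "Pipe Throughput": 2878.4,
    "Pipe-based Context Switching": 1877.2,
    "Process Creation": 2562.8,
    "Shell Scripts (1 concurrent)": 6024.1,
    "Shell Scripts (8 concurrent)": 6678.3,
    "System Call Overhead": 2359.4,
}


def scaling(single, parallel):
    """Return (test name, parallel index / single index), best scaling first."""
    ratios = [(name, parallel[name] / single[name]) for name in single]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in scaling(INDEX_1_COPY, INDEX_8_COPIES):
        print(f"{name:40s} {ratio:5.2f}x")
```

On these numbers the three File Copy tests scale least (about 1.1x to 1.3x) and Double-Precision Whetstone scales most (about 5.1x), which fits the impression that the I/O side gains little from more parallel copies.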
