slaon's personal blog http://blog.sciencenet.cn/u/slaon


Fluent parallel core-count comparison

Views: 4999 | 2022-11-1 23:18 | Category: Research notes

Test platform: 2 x AMD EPYC 7742, Fluent 2020

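Case: mesh size 120,000, density-based solver, energy equation enabled, 161 iterations per run (the 20-core log below reports 330 iterations).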

Cores    Time per iteration [s/iter]
10       0.124
20       0.068
40       0.023
60       0.018
80       0.016

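Within the tested range, more cores means faster iterations: 0.124 s/iter on 10 cores drops to 0.016 s/iter on 80 cores. Most of the gain is reached by 40 cores; going from 40 to 80 cores only improves 0.023 s/iter to 0.016 s/iter, while the logs below show the message count per iteration growing from 2284 to 31017.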
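To put numbers on the trend, speedup and parallel efficiency can be computed from the table. This is a minimal sketch that assumes the 10-core run as the baseline, since the post gives no single-core time; the names `timings` and `scaling` are illustrative and not from the original.

```python
# Time per iteration [s/iter] from the table above, keyed by core count.
timings = {10: 0.124, 20: 0.068, 40: 0.023, 60: 0.018, 80: 0.016}


def scaling(timings, base_cores=10):
    """Return {cores: (speedup, efficiency)} relative to the base_cores run.

    Assumption: the base_cores run is the reference, because no
    single-core measurement is available.
    """
    base_time = timings[base_cores]
    result = {}
    for cores, t in sorted(timings.items()):
        speedup = base_time / t                      # times faster than the baseline
        efficiency = speedup / (cores / base_cores)  # 1.0 means ideal linear scaling
        result[cores] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for cores, (s, e) in scaling(timings).items():
        print(f"{cores:3d} cores: speedup {s:5.2f}x, efficiency {e:6.1%}")
```

With this baseline the 40- and 60-core runs come out above 1.0 efficiency. The timings are rounded to 1 ms and the runs last only a few seconds, so these ratios should be read as rough indications rather than precise scaling figures.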

However, if two or more Fluent instances run at the same time, they slow each other down.


Performance Timer for 161 iterations on 10 compute nodes
  Average wall-clock time per iteration:              0.124 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                         2284 messages
  Data transfer per iteration:                        2.981 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.035 sec (28.5%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.000 sec (0.2%)
  LE global matrix maximum size:                        62
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      226 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.002 sec (1.9%)

  Total wall-clock time:                             19.916 sec
Performance Timer for 330 iterations on 20 compute nodes
  Average wall-clock time per iteration:              0.068 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                         7687 messages
  Data transfer per iteration:                        5.265 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.020 sec (28.7%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.000 sec (0.6%)
  LE global matrix maximum size:                       192
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      212 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.30 updates
  Time-step wall-clock time per iteration:            0.001 sec (2.1%)

  Total wall-clock time:                             22.477 sec
Performance Timer for 161 iterations on 40 compute nodes
  Average wall-clock time per iteration:              0.023 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                        13744 messages
  Data transfer per iteration:                        7.996 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.008 sec (33.1%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.001 sec (4.0%)
  LE global matrix maximum size:                       602
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      200 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.001 sec (2.6%)

  Total wall-clock time:                              3.732 sec
Performance Timer for 161 iterations on 60 compute nodes
  Average wall-clock time per iteration:              0.018 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                        22381 messages
  Data transfer per iteration:                       10.313 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.006 sec (34.2%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.001 sec (5.6%)
  LE global matrix maximum size:                       612
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      197 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.001 sec (3.1%)

  Total wall-clock time:                              2.843 sec
Performance Timer for 161 iterations on 80 compute nodes
  Average wall-clock time per iteration:              0.016 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                        31017 messages
  Data transfer per iteration:                       12.356 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.007 sec (41.4%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.001 sec (7.4%)
  LE global matrix maximum size:                       623
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      199 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.001 sec (3.3%)

  Total wall-clock time:                              2.631 sec
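Comparing these blocks by eye is tedious, so the timer output can be parsed into records and tabulated. The following is a rough sketch that assumes the log layout is exactly as pasted above ("label: value unit"); `parse_timers`, `HEADER`, and `METRIC` are hypothetical names, and the "compute nodes" figure is stored as `processes`, which the post treats as the core count.

```python
import re

# Header of each block, e.g.
# "Performance Timer for 161 iterations on 10 compute nodes".
# \s* tolerates a missing space after "for".
HEADER = re.compile(r"Performance Timer for\s*(\d+) iterations on (\d+) compute nodes")
# Metric lines, e.g. "  Message count per iteration:     2284 messages".
# Captures the label before the first colon and the first number after it.
METRIC = re.compile(r"^\s*(.+?):\s+([\d.]+)")


def parse_timers(text):
    """Return a list of dicts, one per 'Performance Timer' block."""
    runs = []
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            runs.append({"iterations": int(header.group(1)),
                         "processes": int(header.group(2))})
            continue
        metric = METRIC.match(line)
        if metric and runs:
            # Attach the metric to the most recent block.
            runs[-1][metric.group(1)] = float(metric.group(2))
    return runs


# Excerpt of the 10-process block from the logs above.
sample = """\
Performance Timer for 161 iterations on 10 compute nodes
  Average wall-clock time per iteration:              0.124 sec
  Message count per iteration:                         2284 messages
  Total wall-clock time:                             19.916 sec
"""

if __name__ == "__main__":
    for run in parse_timers(sample):
        print(run["processes"],
              run["Average wall-clock time per iteration"],
              run["Message count per iteration"])
```

Feeding the full log text to `parse_timers` gives one record per run, which makes it easy to line up time per iteration against message count and data transfer as the core count rises.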





https://blog.sciencenet.cn/blog-531760-1361900.html
