Depth (d): Scaling network depth is the most common approach used by many ConvNets. The intuition is that a deeper ConvNet can capture richer and more complex features, and generalize well to new tasks. However, deeper networks are also more difficult to train due to the vanishing gradient problem (Zagoruyko & Komodakis, 2016). Although several techniques, such as skip connections (He et al., 2016) and batch normalization (Ioffe & Szegedy, 2015), alleviate the training problem, the accuracy gain of very deep networks diminishes: for example, ResNet-1000 has similar accuracy to ResNet-101 even though it has many more layers.
Width (w): Scaling network width is commonly used for small-size models. Wider networks tend to capture more fine-grained features and are easier to train. However, extremely wide but shallow networks tend to have difficulty capturing higher-level features.
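A corresponding sketch of width scaling (illustrative only; the rounding to multiples of 8 is a common hardware-friendly convention assumed here, not stated in the text): each per-stage channel count of a baseline configuration is multiplied by w.

```python
# Minimal sketch of width scaling (illustrative; rounding scheme is an assumption):
# every per-stage channel count in a baseline configuration is multiplied by w.
def scale_width(base_channels, w, divisor=8):
    c = int(base_channels * w + divisor / 2) // divisor * divisor  # round to multiple of divisor
    return max(divisor, c)

base_cfg = [32, 64, 128, 256]                         # baseline per-stage channels
wide_cfg = [scale_width(c, w=1.5) for c in base_cfg]  # -> [48, 96, 192, 384]
```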
Resolution (r): With higher resolution input images, ConvNets can potentially capture more fine-grained patterns.
Starting from 224x224 in early ConvNets, modern ConvNets tend to use 299x299 (Szegedy et al., 2016) or 331x331
(Zoph et al., 2018) for better accuracy. Recently, GPipe
(Huang et al., 2018) achieves state-of-the-art ImageNet accuracy with 480x480 resolution. Higher resolutions, such as
600x600, are also widely used in object detection ConvNets
(He et al., 2017; Lin et al., 2017). Figure 3 (right) shows the
results of scaling network resolutions, where indeed higher
resolutions improve accuracy, but the accuracy gain diminishes for very high resolutions (r = 1.0 denotes resolution
224x224 and r = 2.5 denotes resolution 560x560).
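The following sketch illustrates how resolution scaling can be applied to the input (the bilinear resize and the helper name are assumptions): the input batch is resized to round(r * 224), matching the convention that r = 1.0 denotes 224x224 and r = 2.5 denotes 560x560.

```python
# Minimal sketch of resolution scaling (illustrative; bilinear resizing is an assumption):
# the input batch is resized to round(r * 224), so r = 1.0 -> 224x224 and r = 2.5 -> 560x560.
import torch
import torch.nn.functional as F

def scale_resolution(images, r, base_size=224):
    size = round(r * base_size)
    return F.interpolate(images, size=(size, size), mode="bilinear", align_corners=False)

x = torch.randn(8, 3, 224, 224)     # a batch at the baseline resolution
x_hi = scale_resolution(x, r=2.5)   # -> shape (8, 3, 560, 560)
```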
Observation 1 – Scaling up any dimension of network width, depth, or resolution improves accuracy, but the accuracy gain diminishes for bigger models.
Observation 2 – In order to pursue better accuracy and efficiency, it is critical to balance all dimensions of network width, depth, and resolution during ConvNet scaling.
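To make Observation 2 concrete, the sketch below scales all three dimensions jointly instead of pushing a single dimension in isolation (the specific multipliers and rounding are illustrative assumptions, not a prescribed scaling rule).

```python
# Minimal sketch of balancing depth, width, and resolution together (illustrative only;
# the multipliers d, w, r below are arbitrary, not a prescribed scaling rule).
def scale_network(base_channels, base_repeats, d, w, r, base_size=224):
    channels = [max(8, int(c * w + 4) // 8 * 8) for c in base_channels]  # wider stages
    repeats = [max(1, round(d * n)) for n in base_repeats]               # deeper stages
    resolution = round(r * base_size)                                    # larger inputs
    return channels, repeats, resolution

# Modest, balanced scaling of a baseline rather than scaling one dimension alone:
print(scale_network([32, 64, 128, 256], [2, 3, 4, 2], d=1.2, w=1.1, r=1.15))
```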