Blog of 大工至善|大学至真 http://blog.sciencenet.cn/u/lcj2212916


[Repost] [Telecommunications] [2011.08] Soft MIMO Detection on Graphics Processing Units and a Performance Study of Iterative MIMO Decoding

1232 views | 2019-12-2 09:32 | Category: Research Notes | Source: Repost

This is a master's thesis from Texas A&M University (author: RICHEEK ARYA), 56 pages in total.

 

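This thesis presents an implementation of the single tree search algorithm for soft multiple-input multiple-output (MIMO) detection on graphics processing units (GPUs), and we compare its performance on different GPUs and a central processing unit (CPU). We also carry out a performance study of iterative decoding algorithms and show that the error rate can be reduced further by increasing the number of outer iterations. GPUs are devices designed specifically to accelerate graphics processing; they are massively parallel and can run thousands of threads simultaneously. Because of this processing power, there is growing interest in using them for scientific and general-purpose computing. Companies such as Nvidia and Advanced Micro Devices (AMD) have therefore begun to support general-purpose GPU (GPGPU) applications, and Nvidia introduced the Compute Unified Device Architecture (CUDA) for programming its GPUs.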

 

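Efforts are under way to develop a standard language for parallel computing that can be used across platforms. OpenCL is the first such language supported by all major GPU and CPU vendors. MIMO detectors have high computational complexity. We implemented a soft MIMO detector on GPUs and studied its throughput and latency. We show that, for the soft detection algorithm, a GPU can deliver a throughput of up to 4 Mbps, which is sufficient for most general-purpose tasks such as voice communication. Compared with the CPU, throughput increases by roughly 7x. We also compared two GPUs, one with low and one with high computational power. These comparisons show the effect of thread serialization on the algorithm: the execution-time curve of the lower-end GPU shows a slope of 1/2. To further improve error-rate performance, iterative decoding is employed, with a feedback path between the detector and the decoder. We explored these algorithms with an eye towards GPU implementation; better error-rate performance, however, comes at the cost of higher power consumption and greater latency. Through simulation, we show that the number of iterations needed to reach acceptable bit error rate (BER) and frame error rate (FER) performance can be predicted from the signal-to-noise ratio (SNR). Iterative decoding yields an SNR gain of about 1.5 dB as the number of outer iterations is increased from zero. To reduce complexity, the number of candidates the algorithm generates can be adjusted. We show that, for a 4x4 MIMO system using 16-QAM modulation, a candidate list of 128 is not sufficient for acceptable error-rate performance, whereas list sizes of 512 and 1024 perform comparably.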
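
To make the role of the candidate list concrete, here is a minimal Python sketch of max-log soft-output detection. It is not the thesis's single tree search: it enumerates every bit vector exhaustively and then truncates to the `list_size` best candidates, which only stands in for the list a tree search would produce. The 2x2 QPSK setup, the function names, the channel values, and the clipping value of 8.0 are all assumptions chosen for illustration.

```python
import itertools

SCALE = 2 ** -0.5  # normalises QPSK symbols to unit energy


def modulate(bits):
    """Map bit pairs to Gray-coded QPSK symbols (bit 0 -> +, bit 1 -> -)."""
    return [
        SCALE * complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1])
        for i in range(0, len(bits), 2)
    ]


def metric(y, H, s, noise_var):
    """Squared Euclidean distance ||y - H s||^2 scaled by the noise variance."""
    total = 0.0
    for row, y_i in zip(H, y):
        total += abs(y_i - sum(h * s_j for h, s_j in zip(row, s))) ** 2
    return total / noise_var


def soft_detect(y, H, noise_var, list_size=None, clip=8.0):
    """Return one max-log LLR per bit; positive means bit 0 is more likely."""
    n_bits = 2 * len(H[0])
    # Exhaustive enumeration (assumption): a tree search would avoid this.
    scored = sorted(
        (metric(y, H, modulate(bits), noise_var), bits)
        for bits in itertools.product((0, 1), repeat=n_bits)
    )
    candidates = scored[:list_size]  # list_size=None keeps every candidate
    llrs = []
    for k in range(n_bits):
        best0 = min((m for m, b in candidates if b[k] == 0), default=None)
        best1 = min((m for m, b in candidates if b[k] == 1), default=None)
        if best0 is None:      # no candidate with bit k = 0 survived
            llrs.append(-clip)
        elif best1 is None:    # no candidate with bit k = 1 survived
            llrs.append(clip)
        else:
            llrs.append(max(-clip, min(clip, best1 - best0)))
    return llrs


# Hypothetical 2x2 channel and a noise-free received vector.
H = [[1 + 0j, 0.5j], [0.3 + 0j, 1 - 0.2j]]
tx_bits = (0, 1, 1, 0)
s = modulate(tx_bits)
y = [sum(h * s_j for h, s_j in zip(row, s)) for row in H]
llrs = soft_detect(y, H, noise_var=0.5)
hard = tuple(0 if llr > 0 else 1 for llr in llrs)
short_list = soft_detect(y, H, noise_var=0.5, list_size=1)
```

When the list is too short, some bits have no surviving counter-hypothesis and their LLRs fall back to the clipping value, which is one way an undersized list can degrade the soft information passed to the decoder.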

 

In this thesis we have presented an implementation of the single tree search algorithm for soft Multi Input Multi Output (MIMO) detection on Graphics Processing Units (GPUs). We have compared its performance on different GPUs and a Central Processing Unit (CPU). We have also done a performance study of iterative decoding algorithms. We have shown that error rate performance can be further improved by increasing the number of outer iterations. GPUs are specialized devices designed to accelerate graphics processing. They are massively parallel devices which can run thousands of threads simultaneously. Because of their tremendous processing power, there is increasing interest in using them for scientific and general purpose computations. Hence companies like Nvidia and Advanced Micro Devices (AMD) have started to support General Purpose GPU (GPGPU) applications. Nvidia came up with the Compute Unified Device Architecture (CUDA) to program its GPUs. Efforts are being made to come up with a standard language for parallel computing that can be used across platforms. OpenCL is the first such language that is supported by all major GPU and CPU vendors. A MIMO detector has high computational complexity. We have implemented a soft MIMO detector on GPUs and studied its throughput and latency performance. We have shown that a GPU can give a throughput of up to 4 Mbps for a soft detection algorithm, which is more than sufficient for most general purpose tasks such as voice communication. Compared to the CPU, a throughput increase of ~7x is achieved. We also compared the performance of two GPUs, one with low computational power and one with high computational power. These comparisons show the effect of thread serialization on the algorithm, with the lower-end GPU's execution time curve showing a slope of 1/2. To further improve error rate performance, iterative decoding techniques are employed, in which a feedback path connects the detector and the decoder.
With an eye towards GPU implementation, we have explored these algorithms. Better error rate performance, however, comes at the price of higher power dissipation and more latency. By simulation we have shown that one can predict, based on the Signal to Noise Ratio (SNR), how many iterations are needed to reach acceptable Bit Error Rate (BER) and Frame Error Rate (FER) performance. Iterative decoding gives an SNR gain of ~1.5 dB when the number of outer iterations is increased from zero. To reduce complexity, one can adjust the number of candidates the algorithm can generate. We showed that, while a candidate list of 128 is not sufficient for acceptable error rate performance in a 4x4 MIMO system using the 16-QAM modulation scheme, performance is comparable for list sizes of 512 and 1024.
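
The feedback path between detector and decoder can be sketched as an exchange of extrinsic log-likelihood ratios (LLRs). The Python below is a toy illustration under stated assumptions, not the thesis's implementation: it uses BPSK on a hypothetical 2x2 real-valued channel, an exhaustive max-log detector, and a length-2 repetition code as a stand-in for the outer decoder. All function names and numbers are invented for the example, and it only shows the data flow; it does not attempt to reproduce the reported ~1.5 dB gain.

```python
import itertools


def symbols(bits):
    """BPSK mapping: bit 0 -> +1, bit 1 -> -1."""
    return [1 - 2 * b for b in bits]


def detector_extrinsic(y, H, noise_var, prior):
    """Max-log detector: a posteriori LLR minus the a priori LLR, per bit."""
    n = len(H[0])
    scored = []
    for bits in itertools.product((0, 1), repeat=n):
        s = symbols(bits)
        dist = sum(
            abs(y_i - sum(h * s_j for h, s_j in zip(row, s))) ** 2
            for row, y_i in zip(H, y)
        )
        # With L = ln P(b=0)/P(b=1), ln P(b) = const + (1 - 2b) * L / 2.
        log_prior = sum(s_j * l / 2 for s_j, l in zip(s, prior))
        scored.append((dist / noise_var - log_prior, bits))
    ext = []
    for k in range(n):
        best0 = min(m for m, b in scored if b[k] == 0)
        best1 = min(m for m, b in scored if b[k] == 1)
        ext.append(best1 - best0 - prior[k])  # remove what the decoder told us
    return ext


def repetition_decoder(ext):
    """Toy outer code (assumption): all coded bits repeat one info bit, so a
    bit's extrinsic LLR is the sum of the LLRs of all the other bits."""
    total = sum(ext)
    return [total - l for l in ext]


def iterate(y, H, noise_var, n_outer):
    """Run the detector once plus n_outer feedback passes; return the
    decision LLR of the info bit (positive means bit 0)."""
    prior = [0.0] * len(H[0])
    for _ in range(n_outer + 1):  # n_outer = 0 is a single detector pass
        ext = detector_extrinsic(y, H, noise_var, prior)
        prior = repetition_decoder(ext)
    return sum(ext)


# Hypothetical channel; y is the noise-free response to info bit 0.
H = [[1.0, 0.9], [0.9, 1.0]]
y = [1.9, 1.9]
llr_no_feedback = iterate(y, H, noise_var=1.0, n_outer=0)
llr_two_outer = iterate(y, H, noise_var=1.0, n_outer=2)
```

Each outer iteration repeats the full detector pass, which is consistent with the abstract's point that better error rate performance costs additional latency and power.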

 

Introduction

Background

GPU Implementation of Single Tree Search and Performance Study of Iterative Decoding

Conclusions and Future Work





https://blog.sciencenet.cn/blog-69686-1208423.html
