大工至善|大学至真 (shared posts) http://blog.sciencenet.cn/u/lcj2212916


[Repost] [Computer Science] [2016] [Source code included] Genomic Selection with Deep Neural Networks

1702 views | 2019-8-12 11:36 | Category: Research Notes | Source: Repost

This post presents the master's thesis of Riley Mitchell McDowell (Iowa State University, USA), 50 pages in total.

 

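Falling costs of DNA marker technology have produced large volumes of molecular data and made it economically feasible to generate dense genome-wide marker maps within breeding programs. The increase in data density and volume has driven further exploration of tools and techniques for analyzing these data to improve cultivars. Data science theory and application have seen a resurgence of research into techniques that detect or "learn" patterns in noisy data across a range of technical applications. Several variants of machine learning have been proposed for analyzing large DNA marker data sets to aid phenotype prediction and genomic selection.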

 

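Here we review the genomic prediction and machine learning literature. We apply deep learning techniques from machine learning research to six phenotype prediction tasks, all based on published reference data sets. Because regularization frequently improves the prediction accuracy of neural networks, we included regularization methods in the neural network models. The neural network models were compared with regularized Bayesian and linear regression techniques commonly used for phenotype prediction and genomic selection; on three of the phenotype prediction tasks, the regularized neural networks were the most accurate models. Surprisingly, for these data sets the depth of the network architecture did not affect the accuracy of the trained models. (Blogger's note: the final sentence of the abstract reads oddly, so readers are left to work through it themselves; it appears to concern the computation required by neural networks when GPUs are used.)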

 

Reduced costs for DNA marker technology have generated a huge amount of molecular data and made it economically feasible to generate dense genome-wide marker maps of lines in a breeding program. Increased data density and volume have driven an exploration of tools and techniques to analyze these data for cultivar improvement. Data science theory and application have experienced a resurgence of research into techniques to detect or "learn" patterns in noisy data in a variety of technical applications. Several variants of machine learning have been proposed for analyzing large DNA marker data sets to aid in phenotype prediction and genomic selection. Here, we present a review of the genomic prediction and machine learning literature. We apply deep learning techniques from machine learning research to six phenotypic prediction tasks using published reference datasets. Because regularization frequently improves neural network prediction accuracy, we included regularization methods in the neural network models. The neural network models are compared to a selection of regularized Bayesian and linear regression techniques commonly employed for phenotypic prediction and genomic selection. On three of the phenotype prediction tasks, regularized neural networks were the most accurate of the models evaluated. Surprisingly, for these data sets the depth of the network architecture did not affect the accuracy of the trained model. We also find that concerns about the computer processing time needed to train neural network models to perform well in genomic prediction tasks may not apply when Graphics Processing Units are used for model training.
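To make the comparison described in the abstract more concrete, here is a minimal sketch of an L2-regularized linear model set against a small L2-regularized neural network for phenotype prediction from marker data. It is not the thesis author's code. The data are simulated rather than taken from the published reference data sets, ridge regression stands in for the regularized linear baselines, the network has a single hidden layer, and accuracy is scored as the Pearson correlation between predicted and observed phenotypes on held-out lines. All function names, hyperparameters, and the -1/0/1 marker coding are assumptions made for illustration.

```python
import math
import random


def simulate(n_lines, n_markers, seed=0):
    """Toy data (an assumption): markers coded -1/0/1, additive effects plus noise."""
    rng = random.Random(seed)
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]
    X = [[rng.choice((-1.0, 0.0, 1.0)) for _ in range(n_markers)] for _ in range(n_lines)]
    y = [sum(e * x for e, x in zip(effects, row)) + rng.gauss(0.0, 1.0) for row in X]
    return X, y


def pearson(a, b):
    """Pearson correlation, used here as the prediction-accuracy measure."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    denom = math.sqrt(va * vb)
    return cov / denom if denom > 0 else 0.0


def fit_ridge(X, y, l2=0.1, lr=0.1, epochs=300):
    """Linear model with an L2 penalty on the weights, fitted by batch gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        # Residuals are computed once per epoch, before any weight is updated.
        err = [sum(wj * xj for wj, xj in zip(w, row)) + b - t for row, t in zip(X, y)]
        for j in range(p):
            w[j] -= lr * (sum(e * row[j] for e, row in zip(err, X)) / n + l2 * w[j])
        b -= lr * sum(err) / n
    return lambda row: sum(wj * xj for wj, xj in zip(w, row)) + b


def fit_mlp(X, y, hidden=4, l2=0.1, lr=0.05, epochs=200, seed=1):
    """One-hidden-layer tanh network with L2 weight decay, batch gradient descent."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    W1 = [[rng.gauss(0.0, 0.1) for _ in range(p)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden)]
    b2 = 0.0

    def forward(row):
        h = [math.tanh(sum(wj * xj for wj, xj in zip(W1[k], row)) + b1[k])
             for k in range(hidden)]
        return h, sum(w2[k] * h[k] for k in range(hidden)) + b2

    for _ in range(epochs):
        # Gradients start at the weight-decay term; biases are not penalized.
        gW1 = [[l2 * wj for wj in W1[k]] for k in range(hidden)]
        gb1 = [0.0] * hidden
        gw2 = [l2 * wk for wk in w2]
        gb2 = 0.0
        for row, t in zip(X, y):
            h, out = forward(row)
            e = (out - t) / n
            gb2 += e
            for k in range(hidden):
                gw2[k] += e * h[k]
                dh = e * w2[k] * (1.0 - h[k] ** 2)  # backpropagate through tanh
                gb1[k] += dh
                for j in range(p):
                    gW1[k][j] += dh * row[j]
        for k in range(hidden):
            w2[k] -= lr * gw2[k]
            b1[k] -= lr * gb1[k]
            for j in range(p):
                W1[k][j] -= lr * gW1[k][j]
        b2 -= lr * gb2
    return lambda row: forward(row)[1]


def compare(seed=0):
    """Train both models on 80 simulated lines and score them on 20 held-out lines."""
    X, y = simulate(100, 20, seed)
    X_tr, y_tr, X_te, y_te = X[:80], y[:80], X[80:], y[80:]
    models = {"ridge": fit_ridge(X_tr, y_tr), "mlp": fit_mlp(X_tr, y_tr)}
    return {name: pearson([f(row) for row in X_te], y_te) for name, f in models.items()}


if __name__ == "__main__":
    for name, r in compare().items():
        print(f"{name}: held-out correlation = {r:.3f}")
```

Because the simulated phenotypes are purely additive, the linear model is well matched to this toy data, so the sketch says nothing about which model wins on real data sets. It only illustrates how the two kinds of regularized model can be trained and scored on the same held-out split.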

  

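Introduction

1.1 Overview

1.2 Thesis Organization

1.3 Literature Review

Genomic Prediction with Deep Neural Networks

2.1 Abstract

2.2 Introduction

2.3 Materials and Methods

2.4 Results and Discussion

Conclusions

3.1 General Discussion

3.2 Future Research

Appendix: Raw Data

Appendix: Analysis Code

Appendix: Software Code for This Thesis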





https://blog.sciencenet.cn/blog-69686-1193337.html
