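Overview Titanic is practically a rite of passage for Kagglers. We use it as an example to walk through a complete machine learning analysis workflow. Step 1: Problem analysis For a description of the Titanic problem, see the official site; it is a basic binary classification problem. The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the T ...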
The Workflow Description Language (WDL) makes it straightforward to define analysis tasks, chain them together in workflows, and parallelize their execution. Different kinds of data call for different pipelines, tools, and parameters, so a suitable framework for pipelined data processing is essential. The Broad Institute is arguably the field's ...
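Build, Manage and Secure Your Apps Anywhere. Your Way. Standardized pipelines are a hallmark of industrial progress. Biology is still in the age of genomic discovery and faces large, messy datasets, so matching analysis pipelines are indispensable (I should have built my own pipeline toolkit long ago, but I missed the best window of the first wave, sigh…). Recently someone pushed Docker on me hard. As an open-source application container engine, it is small, portable, ...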
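These three metrics attempt to normalize for sequencing depth and gene length. Normalizing sequencing data is a necessary step in bioinformatics analysis, and the strategy can vary with the biological question or the technology used. For RNA-seq, common normalization methods include RPKM (Reads Per Kilobase per Million mapped reads) / FPKM (Fragments Per ...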
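
To make the idea of normalizing for both sequencing depth and gene length concrete, here is a minimal Python sketch of the RPKM calculation, following the expansion of the acronym given above. The function name `rpkm`, the gene names, and the counts are hypothetical, and the total number of mapped reads is assumed to be the sum of the counts in the table, which may not match how a real pipeline defines it.

```python
def rpkm(read_counts, gene_lengths_bp):
    """Compute RPKM per gene.

    read_counts: dict mapping gene name -> mapped read count
    gene_lengths_bp: dict mapping gene name -> gene length in base pairs

    Assumption: the total mapped reads equals the sum of all counts given.
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("no mapped reads; RPKM is undefined")

    result = {}
    for gene, count in read_counts.items():
        length_bp = gene_lengths_bp[gene]
        # reads / (length in kilobases) / (total reads in millions)
        # = count * 1e9 / (total_reads * length_bp)
        result[gene] = count * 1e9 / (total_reads * length_bp)
    return result


# Hypothetical toy data: geneB is three times as long as geneA and
# collects three times as many reads.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
values = rpkm(counts, lengths)
```

In this toy case the two genes end up with the same RPKM, because the extra reads on `geneB` are fully explained by its greater length. FPKM follows the same arithmetic with fragments counted in place of reads.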