The Cancer Genome Atlas (TCGA)
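Download the TCGA-Assembler software (this walkthrough uses TCGA-Assembler.2.0.6) and unzip it. My unzip path is D:\TCGA-Assembler. The folder should ideally sit directly under a drive's root directory, which avoids errors when the downloaded files are saved (see the figure below, taken from the user manual). Also copy the file curl.exe into the System32 folder inside the Windows folder on drive C.
R packages required: "RCurl", "rjson", "httr", "stringr", "HGNChelper". The complete code example further down also loads "readr" to read the ID file.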
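As a quick sanity check before going further, the sketch below reports which of the packages listed above are not yet installed and whether curl can be found on the search path. The helper name find_missing_packages is my own and is not part of TCGA-Assembler, and the install step is only attempted in an interactive session.

```r
# Hypothetical helper (not part of TCGA-Assembler): return the packages
# from `pkgs` that cannot be loaded in this R installation.
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required_pkgs <- c("RCurl", "rjson", "httr", "stringr", "HGNChelper", "readr")
missing_pkgs <- find_missing_packages(required_pkgs)

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Only try to install when a person is at the console.
  if (interactive()) install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}

# Sys.which() returns "" when the executable is not on the PATH.
curl_path <- Sys.which("curl")
if (nzchar(curl_path)) {
  message("curl found at: ", curl_path)
} else {
  message("curl not found; copy curl.exe into System32 as described above.")
}
```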
--------------------------------------------------------------------------------------------------------------------
DownloadMethylationData
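Usage
DownloadMethylationData(cancerType, assayPlatform = NULL, tissueType = NULL, saveFolderName = ".", outputFileName = "", inputPatientIDs = NULL)
Arguments
cancerType Cancer type, given by its abbreviation (for example "GBM")
assayPlatform Platform on which the data were measured (for example "methylation_450")
tissueType Tissue type. Specify it if you need a particular one; if it is left out, all tissue types are selected by default
saveFolderName Folder in which the data are saved
outputFileName Name of the data file
inputPatientIDs Sample IDs that you selected yourself on TCGA, usually of the form "TCGA-XX-XXXX"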
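Since inputPatientIDs expects IDs of the form "TCGA-XX-XXXX", a small format check can catch typos before a long download starts. The sketch below is only an illustration: is_patient_id is a hypothetical helper, and the pattern (capital letters or digits in both positions) is my assumption about that form. It is deliberately strict, so longer barcodes are flagged as well.

```r
# Hypothetical helper: TRUE for IDs shaped like "TCGA-XX-XXXX"
# (two characters, then four, each a capital letter or a digit).
is_patient_id <- function(ids) {
  grepl("^TCGA-[A-Z0-9]{2}-[A-Z0-9]{4}$", ids)
}

ids <- c("TCGA-EI-6884", "TCGA-AG-A01W", "tcga-ei-6884", "TCGA-EI-6884-01A")
ok <- is_patient_id(ids)
print(data.frame(id = ids, valid = ok))

# Keep only the well-formed IDs to pass as inputPatientIDs.
valid_ids <- ids[ok]
```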
--------------------------------------------------------------------------------------------------------------------
Complete code example
## Clear variables from the workspace -------------------------------------------
rm(list = ls())
## Set the working directory -----------------------------------------------------
homedir <- "D:\\TCGA-Assembler"
setwd(homedir)
## Load the TCGA-Assembler modules -----------------------------------------------
source("./Module_A.R")
source("./Module_B.R")
## Load the required R packages --------------------------------------------------
library(readr)
library(HGNChelper)
library(httr)
library(RCurl)
library(rjson)
library(stringr)
## Download data (example: GBM methylation) --------------------------------------
# Choose the ID file (.tsv) in a file dialog and take its "Case ID" column
Patient_ID <- read_tsv(file.choose())
vPatient_ID <- Patient_ID$`Case ID`
# The returned value is stored under a name that matches the cancer type (GBM)
filename_GBM_Methylation450 <- DownloadMethylationData(
  cancerType = "GBM",
  assayPlatform = "methylation_450",
  inputPatientIDs = vPatient_ID,
  saveFolderName = "./Methylation_450"
)
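Before running the script above, it can help to confirm that the working directory really contains the two module files, because setwd() and source() stop with an error otherwise. This is a minimal sketch; missing_modules is a hypothetical helper, and the path is simply the one used in this post.

```r
# Hypothetical helper: list the module files that are missing from `dir`.
missing_modules <- function(dir, modules = c("Module_A.R", "Module_B.R")) {
  modules[!file.exists(file.path(dir, modules))]
}

homedir <- "D:\\TCGA-Assembler"
absent <- missing_modules(homedir)
if (length(absent) > 0) {
  message("Not found in ", homedir, ": ", paste(absent, collapse = ", "))
} else {
  message("Both modules found; setwd() and source() should succeed.")
}
```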
--------------------------------------------------------------------------------------------------------------------
Notes on inputPatientIDs
filename_READ_Methylation450<- DownloadMethylationData(cancerType="READ",assayPlatform="methylation_450",saveFolderName="./ManualExampleData/RawData.TCGA-Assembler", inputPatientIDs=c("TCGA-EI-6884","TCGA-DC-5869","TCGA-G5-6572","TCGA-F5-6812","TCGA-AG-A01W","TCGA-AG-3731"))
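When you need a large number of IDs, it is better to download an ID file from the TCGA website and read it with read_tsv (as in the complete code example).
The figure below shows a downloaded ID file (.tsv format).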
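To make the ID-file route concrete, here is a small self-contained sketch that reads a tab-separated file and pulls out the "Case ID" column. It uses base R's read.delim rather than read_tsv so that it runs without readr. The file contents are a made-up stand-in (the "Project" column in particular is invented for illustration), and removing duplicates is my own precaution in case a case is listed more than once.

```r
# Build a tiny stand-in for the downloaded ID file. The IDs come from the
# example call above; a real file will have different columns and rows.
id_file <- tempfile(fileext = ".tsv")
writeLines(c(
  "Case ID\tProject",
  "TCGA-EI-6884\tTCGA-READ",
  "TCGA-DC-5869\tTCGA-READ",
  "TCGA-EI-6884\tTCGA-READ"
), id_file)

# check.names = FALSE keeps the header "Case ID" instead of "Case.ID".
Patient_ID <- read.delim(id_file, check.names = FALSE, stringsAsFactors = FALSE)

# Drop repeated cases so that each patient is requested once.
vPatient_ID <- unique(Patient_ID[["Case ID"]])
print(vPatient_ID)
```

The resulting character vector can be passed as inputPatientIDs in the same way as vPatient_ID in the complete code example.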
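The download looks as shown in the figure:
While files are downloading, a temporary folder named "tmp_YYYYMMDDhhmmss" is created. The parts of the name mean the following:
YYYY  MM     DD   hh    mm      ss
year  month  day  hour  minute  second
When the download finishes, the temporary folder is removed.
If the download is interrupted, the temporary folder is not removed automatically and has to be deleted by hand.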
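The manual cleanup after an interrupted download can be sketched as follows. find_leftover_tmp_dirs is a hypothetical helper, not a TCGA-Assembler function. I am assuming that the temporary folder sits directly inside the folder you scan and that the timestamp in its name is local time; neither is confirmed here. The demonstration works in a scratch directory so that nothing real is deleted.

```r
# Hypothetical helper: find folders named tmp_YYYYMMDDhhmmss directly
# inside `folder` and decode the timestamp carried in each name.
find_leftover_tmp_dirs <- function(folder = ".") {
  dirs <- list.dirs(folder, full.names = TRUE, recursive = FALSE)
  dirs <- dirs[grepl("^tmp_[0-9]{14}$", basename(dirs))]
  stamp <- sub("^tmp_", "", basename(dirs))
  data.frame(
    path = dirs,
    created = as.POSIXct(strptime(stamp, "%Y%m%d%H%M%S")),
    stringsAsFactors = FALSE
  )
}

# Demonstration in a scratch directory with one leftover folder and one
# ordinary data folder that must not be touched.
scratch <- file.path(tempdir(), "tmp_dir_demo")
dir.create(scratch, showWarnings = FALSE)
dir.create(file.path(scratch, "tmp_20241221220500"), showWarnings = FALSE)
dir.create(file.path(scratch, "Methylation_450"), showWarnings = FALSE)

leftovers <- find_leftover_tmp_dirs(scratch)
print(leftovers)

# Delete the leftovers; unlink() needs recursive = TRUE for folders.
unlink(leftovers$path, recursive = TRUE)
```

Review the printed paths before the unlink() call when pointing this at a real folder, since deletion cannot be undone.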