huzm的个人博客分享 http://blog.sciencenet.cn/u/huzm

博文

Big Data in Ecology

已有 3993 次阅读 2016-5-3 08:55 |系统分类:科研笔记

生态学研究进入大数据时代,我们需要具备哪些技能呢?下面链接是一个研究者在ESA年会上做的报告,有很重要的参考价值。
https://jabberwocky.weecology.org/2013/08/12/ignite-talk-big-data-in-ecology


Ignite Talk: Big Data in Ecology

Slides and script from Ethan White’s Ignite talk on Big Data in Ecology from Sandra Chung and Jacquelyn Gill‘s excellent ESA 2013 session on Sharing Makes Science Better. Slides are also archived on figshare.

Title slide

1.  I’m here to talk to you about the use of big data in ecology and to help motivate a lot of the great tools and approaches that other folks will talk about later in the session.

Photos of field work

2.  The definition of big is of course relative, and so when we talk about big data in ecology we typically mean big relative to our standard approaches based on observations and experiments conducted by single investigators or small teams.

Image of Microsoft Excel

3.  And for those of you who prefer a more precise definition, my friend Michael Weiser defines big data and ecoinformatics as involving anything that can’t be successfully opened in Microsoft Excel.

Map of Breeding Bird Survey

4.  Data can be of unusually large size in two ways. It can be inherently large, like citizen science efforts such as Breeding Bird Survey, where large amounts of data are collected in a consistent manner.

Images of Dryad, figshare, and Ecological Archives

5.  Or it can be large because it’s composed of a large number of small datasets that are compiled from sources like Dryad, figshare, and Ecological Archives to form useful compilation datasets for analysis.

Dataset logos

6.  We have increasing amounts of both kinds of data in ecology as a result of both major data collection efforts and an increased emphasis on sharing data.

Maps and quote about large scale ecology from NEON

7-8.  But what does this kind of data buy us. First, big data allows us to work at scales beyond those at which traditional approaches are typically feasible. This is critical because many of the most pressing issues in ecology including climate change, biodiversity, and invasive species operate at broad spatial and long temporal scales.

Map and results of general analysis

9-10.  Second, big data allows us to answer questions in general ways, so that we get the answer today instead of waiting a decade to gradually compile enough results to reach concensus. We can do this by testing theories using large amounts of data from across ecosystems and taxonomic groups, so that we know that our results are general, and not specific to a single system (e.g., White et al. 2012).

The most interesting man in the worlds says:I don't always analyze data, but when I do, I prefer a lot of it

11. This is the promise of big data in ecology, but realizing this potential is difficult because working with either truly big data or data compilations is inherently challenging, and we still lack sufficient data to answer many important questions.

Bullett points:1. Training, 2. Tools, 3. More data.

12. This means that if we are going to take full advantage of big data in ecology we need 3 things. Training in computational methods for ecologists, tools to make it easier to work with existing data, and more data.

Logos of groups running training initiatives

13. We need to train ecologists in the computational tools needed for working with big data, and there are an increasing number of efforts to do this including Software Carpentry (which I’m actively involved in) as well as training initiatives at many of the data and synthesis centers.

Logos for DataONE, Dryad, NEON, Morpho, and DataUP

14. We need systems for storing, distributing, and searching data like DataONE, Dryad, NEON‘s data portal, as well as the standardized metadata and associated tools that make finding data to answer a particular research question easier.

Screenshot of Ecological Data Wiki

15. We need crowd-sourced systems like the Ecological Data Wiki to allow us to work together on improving insufficient metadata and understanding what kinds of analyses are appropriate for different datasets and how to conduct them rigorously.

rOpenSci and EcoData Retriever logos

16. We need tools for quickly and easily accessing data like rOpenSci and the EcoData Retriever so that we can spend our time thinking and analyzing data rather than figuring out how to access it and restructure it.

Map of Life, GBIF, and EcoData Retriever logos

17. We also need systems that help turn small data into big data compilations, whether it be through centralized standardized databases like GBIF or tools that pull data together from disparate sources like Map of Life.

Screen shot of preprint, and Morpho, DataUP, and CC0 logos

18. And finally we we need to continue to share more and more data and share it in useful ways. With the good formats, standardized metadata, and open licenses that make it easy to work with.

Dataset logos

19. And so, what I would like to leave you with is that we live in an exciting time in ecology thanks to the generation of large amounts of data by citizen science projects, exciting federal efforts like NEON, and a shift in scientific culture towards sharing data openly.

River Ernest-White saying

20. If we can train ecologists to work with and combine existing tools in interesting ways, it will let us combine datasets spanning the surface of the globe and diversity of life to make meaningful predictions about ecological systems.









https://blog.sciencenet.cn/blog-2446337-974451.html

上一篇:陆地生态系统蒸散模拟与拆分模型代码共享,欢迎使用。
下一篇:如何提高模型对气候变化对生态系统影响的预测能力?
收藏 IP: 111.132.234.*| 热度|

0

该博文允许注册用户评论 请点击登录 评论 (0 个评论)

数据加载中...

Archiver|手机版|科学网 ( 京ICP备07017567号-12 )

GMT+8, 2024-6-2 00:12

Powered by ScienceNet.cn

Copyright © 2007- 中国科学报社

返回顶部