Complex Networks
(Click a dataset name for details; some datasets can only be accessed from an overseas IP address.)
1) AMiner Citation Network Dataset
3) DIMACS Road Networks Collection
5) NIST complex networks data collection
6) Network Repository with Interactive Exploratory Analysis Tools
7) Protein-protein interaction network
8) PyPI and Maven Dependency Network
12) Stanford Large Network Dataset Collection
13) The Laboratory for Web Algorithmics (UNIMI)
14) UCI Network Data Repository
15) UFL sparse matrix collection
Computer Networks
1) 3.5B Web Pages from CommonCrawl 2012
2) 53.5B Web clicks of 100K users in Indiana Univ.
4) CRAWDAD Wireless datasets from Dartmouth Univ.
7) CommonCrawl Web Data over 7 years
9) Internet-Wide Scan Data Repository
10) OONI: Open Observatory of Network Interference - Internet censorship data
11) Open Mobile Data by MobiPerf
12) The Peer-to-Peer Trace Archive - Real-world measurements play a key role [...]
13) Rapid7 Sonar Internet Scans
14) UCSD Network Telescope, IPv4 /8 net
Data Challenges
2) Challenges in Machine Learning
4) DrivenData Competitions for Social Good
5) ICWSM Data Challenge (since 2009)
8) Localytics Data Visualization Challenge
11) Telecom Italia Big Data Challenge
12) TravisTorrent Dataset - MSR'2017 Mining Challenge
13) TunedIT - Data mining & machine learning data sets, algorithms, challenges
Image Processing
1) 10k US Adult Faces Database
3) Adience Unfiltered faces for gender and age classification
4) Affective Image Classification
6) CADDY Underwater Stereo-Vision Dataset of divers' hand gestures - [...]
7) Caltech Pedestrian Detection Benchmark
8) Chars74K dataset - Character Recognition in Natural Images (both English [...]
9) Danbooru Tagged Anime Illustration Dataset - A large-scale anime image [...]
10) Face Recognition Benchmark
11) Flickr: 32 Class Brand Logos
12) GDXray - X-ray images for X-ray testing and Computer Vision
13) HumanEva Dataset - The HumanEva-I dataset contains 7 calibrated video [...]
14) ImageNet (in WordNet hierarchy)
16) International Affective Picture System, UFL
17) KITTI Vision Benchmark Suite
18) Labeled Information Library of Alexandria - Biology and Conservation - [...]
19) MNIST database of handwritten digits (60,000 training and 10,000 test examples)
20) Massive Visual Memory Stimuli, MIT
21) Open Images From Google - Pictures with segmentation masks for 2.8 [...]
23) The Action Similarity Labeling (ASLAN) Challenge
24) The Oxford-IIIT Pet Dataset
25) Violent-Flows - Crowd Violence / Non-violence Database and benchmark
Machine Learning
1) All-Age-Faces Dataset - Contains 13'322 Asian face images distributed [...]
2) Context-aware data sets from five domains
3) Delve Datasets for classification and regression
6) Keel Repository for classification, regression and time series
7) Labeled Faces in the Wild (LFW)
12) New Yorker caption contest ratings
13) RDataMining - "R and Data Mining" ebook data
14) Registered Meteorites on Earth
15) Restaurants Health Score Data in San Francisco
16) UCI Machine Learning Repository
17) Yahoo! Ratings and Classification Data
20) eBay Online Auctions (2012)
Natural Language
1) Automatic Keyphrase Extraction
2) Blizzard Challenge Speech - The speech + text data comes from [...]
4) CLiPS Stylometry Investigation Corpus
7) DBpedia - 4.58M things with 583M facts
9) Freebase of people, places, and things
10) German Political Speeches Corpus - Collection of political speeches from [...]
11) Google Books Ngrams (2.2TB)
12) Google MC-AFP - Generated based on the public available Gigaword dataset [...]
13) Google Web 5gram (1TB, 2006)
15) Hansards text chunks of Canadian Parliament
16) LJ Speech - Speech dataset consisting of 13,100 short audio clips of a [...]
17) Microsoft MAchine Reading COmprehension Dataset (or MS MARCO)
18) Machine Comprehension Test (MCTest) of text from Microsoft Research
19) Machine Translation of European languages
20) Making Sense of Microposts 2016 - Named Entity rEcognition and Linking
21) Multi-Domain Sentiment Dataset (version 2.0)
22) Noisy speech database for training speech enhancement algorithms and TTS [...]
24) POS/NER/Chunk annotated data
26) SMS Spam Collection in English
27) SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K articles)
28) Stanford Question Answering Dataset (SQuAD)
29) USENET postings corpus, 2005-2011
31) Webhose - News/Blogs in multiple languages
32) Wikidata - Wikipedia databases
33) Wikipedia Links data - 40 Million Entities in Context
34) WordNet databases and tools
35) WorldTree Corpus of Explanation Graphs for Elementary Science Questions - [...]
Neuroscience
3) Collaborative Research in Computational Neuroscience (CRCNS)
9) NeuroMorpho - NeuroMorpho.Org is a centrally curated inventory of [...]
Social Networks
1) 72 hours #gamergate Twitter Scrape
2) Ancestry.com Forum Dataset over 10 years
3) CMU Enron Email of 150 users
4) Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape
5) EDRM Enron Email of 151 users, hosted on S3
6) Facebook Data Scrape (2005)
7) Facebook Social Networks from LAW (since 2007)
8) Foursquare from UMN/Sarwat (2013)
9) GitHub Collaboration Archive
10) Google Scholar citation relations
11) High-Resolution Contact Networks from Wearable Sensors
12) Indie Map: social graph and crawl of top IndieWeb sites
15) Skytrax' Air Travel Reviews Dataset
17) SourceForge.net Research Data
18) Twitter Data for Online Reputation Management
19) Twitter Data for Sentiment Analysis
20) Twitter Graph of entire Twitter site
21) UNIMI/LAW Social Network Datasets
22) United States Congress Twitter Data - Daily datasets with tweets of 1100+ [...]
23) Yahoo! Graph and Social Data
24) YouTube Video Social Graph in 2007 and 2008
Transportation
2) Ford GoBike Data (formerly Bay Area Bike Share Data)
3) Bike Share Systems (BSS) collection
5) GeoLife GPS Trajectory from Microsoft Research
6) German train system by Deutsche Bahn
10) NYC Taxi Trip Data 2013 (FOIA/FOILed)
11) NYC Uber trip data April 2014 to September 2014
13) OpenFlights - airport, airline and route data
14) Philadelphia Bike Share Stations (JSON)
15) Plane Crash Database, since 1920
16) RITA Airline On-Time Performance data
17) RITA/BTS transport data collection (TranStat)
18) Toronto Bike Share Stations (JSON and GBFS files)
19) Transport for London (TFL)
20) Travel Tracker Survey (TTS) for Chicago
21) U.S. Bureau of Transportation Statistics (BTS)
22) U.S. Domestic Flights 1990 to 2009
23) U.S. Freight Analysis Framework since 2007
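For more content, follow IJAC:
1) IJAC official website: http://link.springer.com/journal/11633
2) LinkedIn: Int. J. of Automation and Computing
3) Sina Weibo: IJAC-国际自动化与计算杂志
4) Twitter: IJAC_Journal
5) Facebook: ijac journal
Comments and suggestions about the journal or its articles are welcome; leave a message or send a private message to the editor.
Article editor: 欧梨成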