Data Science Weekly
Data Science Datasets
A list of publicly available datasets
General
Amazon Public Data Sets
Public Data Sets on AWS: centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications
Wikipedia
Wikipedia offers free copies of all available content to interested users. These databases can be used for mirroring, personal use, informal backups, offline use or database queries
Freebase
A community-curated database of people, places and things
World Bank
DataBank is an analysis and visualization tool that contains collections of time series data on a variety of topics
Windows Azure Marketplace
Free datasets available through the Windows Azure Data Market, including academic data, speech recognition data, etc.
Machine Learning Repository
200+ datasets from the Center for Machine Learning and Intelligent Systems
Deep Learning Data Sets
Music, natural image, text, speech, face, and recommendation system datasets for benchmarking algorithms
Stanford Large Network Dataset Collection
A collection of about 50 large network datasets, ranging from tens of thousands of nodes and edges to tens of millions. It includes social networks, web graphs, road networks, internet networks, citation networks, collaboration networks, and communication networks.
Yahoo Datasets
Yahoo makes various types of data available to share, categorized into Ratings, Language, Graph, Advertising and Market Data, and Computing Systems, plus an appendix of other relevant data and resources available via the Yahoo! Developer Network.
And if you are looking for something specific, you can always try your luck posting on reddit/r/datasets or on Open Data StackExchange.