Here is a collection of data sets from various public repositories, covering a range of industries.
I will keep updating the list as we go. Sources for the links are listed in the References section at the bottom. My thanks to the original posters.
Broken links have been replaced with the latest URLs.
These links point to large volumes of data that governments and organizations around the world have made available for good causes.
Please use them responsibly.
Aviation
National Flight Data Center (NFDC)
FAA Data & Research
Flight Delay Information
FAA Aviation Safety Information Analysis and Sharing (ASIAS)
Aircraft Situation Display to Industry (ASDI)
NTSB Accident Database & Synopses
OpenFlights.org
The Center for Innovation in Engineering and Science Education Real time data sites
MIT Airline Data Project
Space
Real-Time Space Weather Data Sources
Politics
Data on the U.S. Congress – A Joint Effort from Brookings and the American Enterprise Institute
Sports
Open Sports Data/API
Football (Soccer) Stats
Government
Public Government Data Sets
U.S. Department of Homeland Security Data
Public Data for the State of Utah
Compilations by others
Finding Data on the Internet - Inside-R
Nathan Yau's collection of data sets
Dr. Jerry A. Smith's Favorite Data sets
Hilary Mason's "Research Quality" Data-sets
https://bitly.com/bundles/hmason/1
This is a bundle that gathers public data sets that might be interesting to researchers in a variety of fields in one place.
Peter Skomoroch's list of data sets on Delicious
Data Wrangling blog data set list
Other
DonorsChoose.org - Hacking Education: A Contest for Developers and Data Crunchers
Datasets for "The Elements of Statistical Learning"
Enron Email Dataset
http://www.cs.cmu.edu/~enron/
This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a total of about 0.5M messages.
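Since the corpus is organized into per-user folders, a first look at it might simply count how many messages sit in each folder. The snippet below is a minimal sketch of that idea, not an official loader. It assumes the archive has been unpacked into a directory tree shaped like <root>/<user>/<folder>/<message file>, with one plain-text email per file; the function name count_messages_per_folder and that layout are my own assumptions for illustration.

```python
import email
from collections import Counter
from email import policy
from pathlib import Path


def count_messages_per_folder(maildir_root):
    """Count parseable messages under each (user, folder) pair.

    Assumed layout (hypothetical): <root>/<user>/<folder>/.../<message file>
    """
    counts = Counter()
    root = Path(maildir_root)
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip files that are not nested at least two levels deep.
        if len(rel.parts) < 3:
            continue
        user, folder = rel.parts[0], rel.parts[1]
        with open(path, "rb") as fh:
            msg = email.message_from_binary_file(fh, policy=policy.default)
        # Treat files without a From header as non-messages.
        if msg["From"] is None:
            continue
        counts[(user, folder)] += 1
    return counts
```

Calling count_messages_per_folder("maildir") on an unpacked copy would return a Counter keyed by (user, folder), which makes it easy to spot the largest mailboxes before doing any heavier analysis.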