Open Datasets

Quality training data is the foundation of every machine learning project.

We’ve assembled a collection of free, open-source datasets for machine learning use cases. If you have a free, publicly available dataset you’d like to add, contact us to let us know!

 

How it works

Click an entry to open an expanded view of all criteria associated with that dataset.

Click “Sort” to select a field to sort by.
Fields include Producer, Description, Usage, Media Type, Purpose, Language, Labeled, Feature, Cardinality, Quality, Topics/tags, Format, Access, Splits, Fields of Application, Size, and Publication Date.

The Search icon allows for searching by words or phrases.

You can download a CSV file or open “View Larger Version” using the links at the bottom of the Airtable view.
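
Once you have the CSV, the same sorting and searching can be reproduced offline. The following is a minimal sketch using only the Python standard library. It assumes the export has a header row whose column names match the field names listed above; the helper names (read_rows, sort_rows, search_rows) and the sample rows are placeholders made up for illustration, not real catalog entries.

```python
import csv
import io


def read_rows(handle):
    """Parse CSV text from an open file handle into a list of dicts."""
    return list(csv.DictReader(handle))


def sort_rows(rows, field):
    """Sort rows by one column, ignoring case and placing blank values last."""
    def key(row):
        value = str(row.get(field) or "")
        return (value == "", value.lower())
    return sorted(rows, key=key)


def search_rows(rows, phrase):
    """Keep rows in which any cell contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [
        row for row in rows
        if any(needle in str(value or "").lower() for value in row.values())
    ]


# Placeholder rows standing in for the downloaded export (not real entries).
SAMPLE = """Producer,Media Type,Language
Placeholder B,Audio,English
Placeholder A,Text,Spanish
"""

rows = read_rows(io.StringIO(SAMPLE))
by_producer = sort_rows(rows, "Producer")  # like choosing Producer under "Sort"
text_only = search_rows(rows, "text")      # like typing "text" into Search
```

To run this against the real download, pass an open file instead of the sample, for example open("open_datasets.csv", newline="", encoding="utf-8"), where the filename is hypothetical and should be replaced with whatever name your download was saved under.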