Libraries

- As a {Developer} I want to get a Data Package into Node so that I can start using the data for doing analysis and visualizations.
- As a {Researcher} I want to get a Data Package into Julia in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Publisher} I want to add type information to my data so that it is more useful to others and can be used better with tools like visualization programs.
- As a {Publisher} I want to be able to provide a visualization of data in the Data Package so that I can provide my analysis and show my work to users of the data.
- As a {Researcher} I want to be able to save new visualizations so that I can share them with others or include them in the Data Package.
- As a {Researcher/Publisher} I want to know that my data conforms to its Data Package “profile” (tabular, fiscal, etc.) so that I can feel trust in the validity and usefulness of the data.
- As a {Researcher/Publisher} I want to understand the ways in which my data is invalid so that I can know how to fix it.
- As a {Researcher} I want to get a Data Package into R in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Researcher} I want to get a Data Package into Excel in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Researcher} I want to get a Data Package into SPSS in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Researcher} I want to get a Data Package into STATA in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Researcher} working with Ecological Metadata I want to be able to translate my EML dataset to a Data Package so that I can benefit from the wide array of tools available for Data Packages.
- As a {Researcher} I want to get a Data Package into LibreOffice/OpenOffice in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Developer} I want to get a Data Package into Python in seconds so that I can start using the data for doing analysis and visualizations (see the sketch after this list).
- As a {Developer} I want a jQuery plugin for "Core" Data Packages so that I can apply it to a form control that uses a core dataset for autocompletion.
- As a {Researcher}, I want to get my Excel spreadsheet into a Data Package so that I can benefit from better tooling and standardization.
- As a {Developer}, I want to do "exploratory" data analysis in R and "operationalize" that analysis in Python so that I can use the best tool for the job.
- As a {Developer} I want to get a Data Package into Clojure in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Developer} I want to get a Data Package into Julia in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Developer} I want to get a Data Package into C++ in seconds so that I can start using the data for doing analysis and visualizations.
- As a {Machine Learning expert}, I would like to package ML datasets as data packages so that I can easily import them into my ML platform and start using the data for analysis.
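The Python story above is the most concrete of these today. Below is a minimal sketch of what "a Data Package into Python in seconds" can look like, assuming the datapackage-py library (pip install datapackage) and a placeholder descriptor URL.

```python
# Minimal sketch using the datapackage-py library (pip install datapackage).
# The descriptor URL is a placeholder, not a real dataset.
from datapackage import Package

# Load the package descriptor; a local path works just as well as a URL.
package = Package('https://example.com/gdp/datapackage.json')

# Inspect what the package contains.
print(package.descriptor.get('title'))
print(package.resource_names)

# Read the first tabular resource as a list of row dicts,
# with the Table Schema types already applied.
rows = package.resources[0].read(keyed=True)
print(rows[:3])
```

The R, Julia, Clojure and other library stories would follow the same load/inspect/read pattern against their own implementations.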
Integrations

- As a {Developer} working with Data Packages, I want an Elasticsearch integration, so I can integrate data-packaged data with pipelines that use Elasticsearch.
- As a {Developer} working with Data Packages, I want an SPSS integration, so I can integrate data-packaged data with pipelines that use SPSS.
- As a {Developer} working with Data Packages, I want an EPrints integration, so I can integrate data-packaged data with pipelines that use EPrints.
- As a {Developer} working with Data Packages, I want a Mongo integration, so I can integrate data-packaged data with pipelines that use Mongo.
- As a {Developer} working with Data Packages, I want a DAT integration, so I can integrate data-packaged data with pipelines that use DAT.
- As a {Researcher/Government Publisher} I want to add general reference data to my narrow dataset so that my dataset is more useful.
- As a {Researcher/Government Publisher} I want to add general country names to my dataset that only contains country codes so that my dataset is more readable.
- As a {Researcher/Government Publisher} I want to add reference data on inflation to my spending dataset so that the spending lines in my dataset are more understandable.
- As a {Researcher/Government Publisher} I want to map lines in my dataset using its geographic data so that my dataset is more engaging for non-technical users.
- As a {Researcher} I want to be able to reference a remote controlled vocabulary for my dataset so that I can be sure that column(s) of my (tabular) dataset are valid against a single shared list of terms.
- As a {Developer} working with Data Packages, I want a DSpace integration, so I can integrate data-packaged data with pipelines that use DSpace.
- As a {Developer} working with Data Packages, I want Feather integration, so I can integrate data-packaged data with pipelines that use Feather (see the sketch after this list).
- As a {Developer} working with Data Packages, I want HDF5 integration, so I can integrate data-packaged data with pipelines that use HDF5.
- As a {Researcher} working with data, I want a Microsoft Power BI integration, so I can import datasets without downloading them locally.
- As a {Researcher/Publisher} I want an integration with Zenodo so that when I publish my dataset on GitHub, I don't have to retype descriptive information about my dataset.
- As a {Publisher} working with Data Packages, I would like an integration with OpenRefine so that I can output cleaned Data Packages.
- As a {Researcher/Publisher} I want to publish Data Packages to CKAN (with data pushed to the datastore) so that my data is findable and I can have a data API.
- As a {Researcher/Developer} working with Data Packages, I would like the ability to import/export from MS-SQL so that I can use Data Packages in workflows that involve MS-SQL.
- As a {Researcher} working with data in NetCDF, I want NetCDF integration so that I can store my data in plaintext while still retaining its metadata.
- As a {Researcher} I want an integration with https://data.mendeley.com/ so that I can validate my data upon ingest to the service.
- As a {Publisher} working with Data Packages, I would like an integration with Excel so that I can output cleaned Data Packages.
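Several of these integrations (Feather, HDF5, Excel, Power BI) can be approximated today by going through pandas. This is a rough sketch rather than an official integration: it assumes datapackage-py and pandas, with pyarrow and PyTables installed for the Feather and HDF5 writers, and the file names are illustrative.

```python
# Sketch: bridge a Tabular Data Package into Feather/HDF5 via pandas.
# Assumes: pip install datapackage pandas pyarrow tables
import pandas as pd
from datapackage import Package

package = Package('datapackage.json')   # placeholder local descriptor
resource = package.resources[0]         # first tabular resource

# read(keyed=True) returns a list of row dicts with schema types already cast.
df = pd.DataFrame(resource.read(keyed=True))

df.to_feather('data.feather')           # hand-off to Feather pipelines
df.to_hdf('data.h5', key='data')        # hand-off to HDF5 pipelines
```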
Apps

- As a {Publisher} I want to publish my data quickly and easily online.
- As a {Repository Manager} I want a tool that makes it easy for researchers/users to add basic metadata to their research data, so that it is more findable and therefore useful.
- As a {Researcher/Publisher} I want to validate my data with a minimum of clicks, so that I can feel trust in the validity and usefulness of the data.
- As a {Publisher} I want to be able to check that every time I update my data it is still good (conforms to schema) so that I can catch errors early and publish reliable data.
- As a {Developer/Wrangler} I want to use a command line tool that allows me to validate data so that I can feel trust in the validity and usefulness of the data quickly and in the context of my command line workflow.
- As a {Developer} I want an online service that is connected to my data repository (e.g. git repo) and validates data on update so that I can delegate data validation to a third party.
- As a {Government Publisher} I want to make it easy to prove that our published data is valid so that I can claim that we are living up to our transparency commitments.
- As a {Civic Tech Activist} I want to make it easy to assess the quality of data published by the government so that I can make sure that government is living up to its transparency commitments.
- As a {Publisher} I want to embed an interactive preview of my data on my site so that users can be encouraged that this is the correct data for them.
- As a {Publisher} I want to embed a preview button on my site so that users can preview the data and be encouraged that this is the correct data for them.
- As a {Publisher} I want to know how many users have previewed a dataset so that I know how interest in a dataset relates to its actual download numbers.
- As a {Developer} I want to customize an existing wizard for my specific type of data (e.g. including data validation for my schema) so that I can give my users a great user experience.
- As a {Publisher} I want to add useful metadata (e.g. column types) or add new data columns to make the dataset more useful.
- As a {Publisher}, I want to package reproducible steps to get to a certain data state, so my methodology is transparent and can be rerun by others.
- As a {Developer/Data Wrangler} I want to store my Data Package in GitHub and have it automatically pushed into CKAN so that I get a data API and my dataset is listed in a relevant catalog.
- As a {Researcher} I want a tool that can generate basic statistics about a dataset so that I can get a quick preview of the data.
- As a {Developer/Publisher} I want a tool to create an embeddable data summary via iframe so that I can embed data summaries across sites.
- As a {Researcher} I want an app that generates an OpenRefine reconciliation API endpoint from a Tabular Data Package so that I can use it to clean messy data.
- As a {Researcher} I want an app that creates "proxy" Data Packages for well known and reliable data sources so that I can load high quality data using Data Package tooling.
- As a {Repository Manager/Researcher} I want an app that acts as a match-making service for packaging (e.g. biodiversity) data so that data owners are paired with data packagers.

Specs

- As a {Researcher/Publisher} I want to specify the funding that contributed to the creation of a given dataset so that funding agencies can identify the funding source for a given dataset.
- As a {Researcher/Publisher} I want to add a DOI to a dataset so that I can cite it in papers published with the data.

Done

- As a {Researcher/Developer/Wrangler} I want to get a Data Package into Postgres in seconds so that I can start using the data for doing analysis and visualizations (see the sketch below).
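For the Postgres story marked as Done, the load looks roughly like the following. This is a hedged sketch, assuming datapackage-py with the tableschema-sql storage plugin and SQLAlchemy; the connection string and descriptor path are placeholders.

```python
# Sketch: push a Data Package's tabular resources into Postgres.
# Assumes: pip install datapackage tableschema-sql sqlalchemy psycopg2-binary
from datapackage import Package
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:pass@localhost:5432/mydb')  # placeholder DSN

package = Package('datapackage.json')  # placeholder descriptor path
# Write each tabular resource to the database via the SQL storage backend.
package.save(storage='sql', engine=engine)
```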
Scratch

- TDP Friendly Data Editor
- Give Me an API
- Tableau-lite
- Denormalize
- Continuous data integration
- model-it
- join-it
- Gmail plugin - attach a datapackage to an email and then get a preview in Gmail
- Enrichment examples
- TDP + SQL
- TDP + R
- JTS / Tabular Data Package Data Validator
- DP / TDP + CKAN, Oct 14
- Data Package Creator (3), Sep 30
- Geo Data Package
- Slide decks for geeks (1)
- JSON Table Schema v1.0
- Data Package Core Libs
- JSON Table Schema Core Libs
- TDP + BigQuery, Oct 9
- TDP + OpenRefine, Oct 9
- TDP + Redshift
- TDP + AWS RDS
- TDP + Pandas
- TDP + Excel
- TDP + Google Spreadsheets
- Specifications (datapackages.com)
- Ruby library
- Python library
- DataPackage Builder (DataPackagist or next gen. from OpenSpending) (apps.datapackages.com/builder|packager)
- Data Package Manager (dpm)
- JTS Validation service (apis.datapackages.com/jts/validate)
- JTS infer (?) (apis.datapackages.com/jts/infer)
- DataPackage.json validator (apis.datapackages.com/dp/validate)
- Registry (apis.datapackages.com/registry)
- Schemas (schemas.datapackages.com/)
- DataDeck / DataExplorer (OpenRefine as an HTML + JS app)
- Data Summary and Quality spec
- Use Case: Describe and Model
- Wordpress + Data Packages
- As a Data Package creator I want to validate my datapackage.json as often and as early as possible, usually before I have uploaded it anywhere, so that it is correct before I publish it online.
- DataCatalog.js - create your own data catalog in five minutes from a list of your (or others') data packages

DataHub.io

- As a {Data Publisher}, I want to understand how datahub.io differs from old.datahub.io so I can decide which is best suited for my needs.
- As a {Researcher / Data Wrangler}, I want to understand what constitutes a core dataset so I can recommend this to others / contribute additions.
- As a {Data Publisher}, I want to understand how to submit my data packages to the core datasets list so other datahub.io users can find and use them.
- As a {Data Publisher}, I want to make my old.datahub.io datasets available on datahub.io so it is easier for people to find them.
- As a {Data Scientist}, I want to combine several datasets so I can carry out data analysis on a given topic (see the sketch below).
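For the last DataHub.io story, combining several datasets for topic-level analysis, one plausible pattern with today's tooling is to read each package's resource into pandas and join on a shared key. Everything in the sketch below (URLs, resource names, column names) is hypothetical.

```python
# Sketch: combine two data-packaged datasets on shared country/year columns.
# All URLs, resource names and column names are hypothetical.
import pandas as pd
from datapackage import Package

def load_frame(descriptor_url, resource_name):
    """Read one tabular resource from a Data Package into a DataFrame."""
    resource = Package(descriptor_url).get_resource(resource_name)
    return pd.DataFrame(resource.read(keyed=True))

gdp = load_frame('https://example.com/gdp/datapackage.json', 'gdp')
population = load_frame('https://example.com/population/datapackage.json', 'population')

# Join on the shared keys so the combined table can drive topic-level analysis.
combined = gdp.merge(population, on=['country_code', 'year'])
print(combined.head())
```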