(Tabular) Data Package Manager for R

Under Development: v0.1.7

Issues/suggestions

Maintainer: Christopher Gandrud

Description

An R package for creating and installing data packages that follow Open Knowledge’s Tabular Data Package spec.

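dpmr has three core functions:

datapackage_init: creates a new data package from an R data frame and, optionally, a metadata list.

datapackage_install: installs a data package stored locally or remotely.

datapackage_info: reads a data package’s metadata into R.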

Examples

Create Data Packages

To initiate a barebones data package called My_Data_Package in the current working directory, use:

A <- B <- C <- sample(1:20, size = 20, replace = TRUE)
ID <- sort(rep('a', 20))
Data <- data.frame(ID, A, B, C)

datapackage_init(df = Data, package_name = 'My_Data_Package')

This will create a data package with barebones metadata in a datapackage.json file. You can then edit this by hand.
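If you would rather make the edit from R than by hand, the following is a minimal sketch. It assumes the jsonlite package is installed (it is not mentioned elsewhere in this README), and it writes a stand-in datapackage.json to a temporary directory so that it runs without dpmr. The field values are illustrative only.

```r
library(jsonlite)

# Stand-in for a package directory created by datapackage_init()
pkg_dir <- tempfile('My_Data_Package')
dir.create(pkg_dir)
meta_path <- file.path(pkg_dir, 'datapackage.json')

# Illustrative barebones metadata; a real file may contain other fields
write_json(list(name = 'My_Data_Package', title = ''),
           meta_path, auto_unbox = TRUE, pretty = TRUE)

# Read the metadata without simplifying, so nested arrays stay as lists
meta <- fromJSON(meta_path, simplifyVector = FALSE)

# Change one field and write the file back
meta$title <- 'A fake data package'
write_json(meta, meta_path, auto_unbox = TRUE, pretty = TRUE)

updated <- fromJSON(meta_path, simplifyVector = FALSE)
```

Reading with simplifyVector = FALSE keeps arrays such as resources as plain lists, which makes it less likely that the file's structure changes when it is written back.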

Alternatively, you can create a list with the metadata in R and have it included with the data package:

meta_list <- list(name = 'My_Data_Package',
                title = 'A fake data package',
                last_updated = Sys.Date(),
                version = '0.1',
                license = data.frame(type = 'PDDL-1.0',
                        url = 'http://opendatacommons.org/licenses/pddl/'),
                sources = data.frame(name = 'Fake',
                        web = 'No URL, its fake.'))

datapackage_init(df = Data, meta = meta_list)

Note that if you don’t include the resources fields in your metadata list, they will be added automatically. These fields identify the data files’ paths and data schema.
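To give a sense of what a resources entry holds, here is a rough sketch in base R that builds one from a data frame. The helpers guess_type and sketch_resource are hypothetical and are not part of dpmr, the path data/Data.csv is an assumed location, and the type mapping is a simplification. It is not a description of what dpmr does internally.

```r
# Hypothetical helper: map an R column to a table schema type name
guess_type <- function(x) {
    if (is.logical(x)) return('boolean')
    if (is.factor(x)) return('string')   # check before is.integer()
    if (is.integer(x)) return('integer')
    if (is.numeric(x)) return('number')
    'string'  # characters and anything else
}

# Hypothetical helper: describe one data file by its path and schema
sketch_resource <- function(df, path) {
    fields <- lapply(names(df), function(n) {
        list(name = n, type = guess_type(df[[n]]))
    })
    list(path = path, schema = list(fields = fields))
}

Data <- data.frame(ID = rep('a', 3), A = 1:3, B = c(0.5, 1.5, 2.5))

# 'data/Data.csv' is an assumed location for the data file
resources <- list(sketch_resource(Data, 'data/Data.csv'))
str(resources)
```

A list like this could be added to meta_list as the resources element if you want to set the paths and schema yourself rather than rely on the defaults.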

Installing Data Packages

Locally

To load a data package called gdp that is stored in the current working directory, use:

gdp_data <- datapackage_install(path = 'gdp/')
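Before installing from a local path, it can help to confirm that the directory actually contains a datapackage.json file. The sketch below uses a hypothetical helper, has_descriptor, which is not part of dpmr, and demonstrates it on a throwaway directory rather than a real data package.

```r
# Hypothetical helper: TRUE if `path` contains a datapackage.json file
has_descriptor <- function(path) {
    file.exists(file.path(path, 'datapackage.json'))
}

# Demonstrate with a fresh temporary directory
demo_dir <- tempfile('gdp_demo')
dir.create(demo_dir)

before <- has_descriptor(demo_dir)   # no metadata file yet

# Create an empty placeholder metadata file
invisible(file.create(file.path(demo_dir, 'datapackage.json')))
after <- has_descriptor(demo_dir)
```

In practice you might call has_descriptor('gdp') and only run datapackage_install when it returns TRUE.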

From the web

You can install a package stored remotely using its URL. In this example we directly download the gdp data package from GitHub using the URL for its zip file:

URL <- 'https://github.com/datasets/gdp/archive/master.zip'
gdp_data <- datapackage_install(path = URL)

Get Data Package Metadata

Use datapackage_info to read a data package’s metadata into R:

# Print information when working directory is a data package
datapackage_info()

Install dpmr R package

To install the dpmr package use:

devtools::install_github('christophergandrud/dpmr')

To-do for v0.2

dpmr is under development. Key features are planned for version 0.2.


Licensed under GPL-3