Need a way to capture a Title and Description for each resource in the Distribution array #248

cew821 · 2014-01-15T16:24:26Z

The required fields for the distribution type are accessURL and format i.e. MIME-type. I think we should consider adding two additional fields (which could be optional) to the distribution type, which would be resourceTitle and resourceDescription.

Right now, if data managers create a data record using CKAN (i.e. using inventory.data.gov), the data manager is prompted to provide a Title and Description in addition to the accessURL/webService and File Format. When a user browses the inventory within that CKAN instance, they see this title and description as a part of the record. See this screenshot for an example:

However, since the schema doesn't have a field for these Title and Description elements, this data is lost when the data.json file is generated from the inventory. As a result, users of catalog.data.gov don't see a title or description for each resource, even if it was originally provided by a data manager in inventory.data.gov:

I propose that we add some optional fields to the schema that would allow each object in a distribution to include a resourceTitle and resourceDescription text field, which, if present, could be used by catalog.data.gov and other catalogs to give users more information about each resource in the distribution.
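As a rough illustration of the proposal, here is a minimal Python sketch of how an export step might carry the two values through. The input keys (`name`, `description`, `url`, `mimetype`) are assumed to follow CKAN-style resource fields, the function name is hypothetical, and the sample values are made up; none of this is taken from the actual inventory.data.gov export code.

```python
def resource_to_distribution(resource, include_labels=True):
    """Convert a CKAN-style resource dict into a data.json distribution entry.

    Assumption: `resource` uses the keys name, description, url and mimetype.
    """
    # The two fields the schema currently requires.
    entry = {
        "accessURL": resource["url"],
        "format": resource["mimetype"],
    }
    if include_labels:
        # Proposed optional fields: emit them only when the data manager
        # actually supplied a value.
        if resource.get("name"):
            entry["resourceTitle"] = resource["name"]
        if resource.get("description"):
            entry["resourceDescription"] = resource["description"]
    return entry


# Illustrative record; the values are placeholders.
resource = {
    "name": "Widgets (CSV)",
    "description": "Widgets data as a CSV file",
    "url": "https://data.agency.gov/datasets/widgets-statistics/widgets.csv",
    "mimetype": "text/csv",
}

current = resource_to_distribution(resource, include_labels=False)
proposed = resource_to_distribution(resource)
```

With `include_labels=False` the entry holds only accessURL and format, which mirrors the loss described above; with the optional fields enabled, the title and description survive into data.json.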


dsmorgan77 · 2014-01-22T15:03:43Z

👍

pschweitzerusgsgov · 2014-02-27T20:58:02Z

Yes.

DCAT has these fields in its Distribution type, so it should be a simple matter to take notice of them if present.

haleyvandyck · 2014-04-14T21:27:15Z

Thanks for this-- looks good. We've added it to the list of things to consider with the next schema update.

bletalien · 2014-07-17T18:31:41Z

Please!!! Here's an example of why: http://catalog.data.gov/dataset/general-schedule-and-locality-pay

gbinal · 2014-07-17T18:35:48Z

There seems to be a lot of support for this (I literally just heard someone say 'yes, yes, yes please').

What seems to be at issue is not necessarily adding new fields but providing guidance for how these could be used within the distribution when an agency has the complexity and feels strongly about it.

+1

smrgeoinfo · 2014-07-17T18:43:36Z

classic example! @gbinal are you arguing against adding link properties for the accessURLs?

dafeder · 2014-07-17T18:50:08Z

I think the idea would be to use existing title and description field names inside the distribution array (rather than add new field names to the schema, especially using the term "resource" which is CKAN-specific). This would be consistent with DCAT: http://www.w3.org/TR/vocab-dcat/#class-distribution

smrgeoinfo · 2014-07-17T19:18:24Z

So the idea would be to extend the current https://project-open-data.github.io/schema/ distribution that has an accessURL and format to include dcat:title and dcat:description?

from the current POD distribution documentation:
"Distribution is a concatenation, as appropriate, of the following elements: accessURL and format. If an entry has only one dataset, enter details for that one; if it has multiple datasets (such as a bulk download and an API), separate entries as seen below:"

dafeder · 2014-07-17T19:26:43Z

That would certainly make sense for CKAN and DKAN! Agree they should both be optional.

dafeder · 2014-07-17T19:28:08Z

Except, when implementing as RDFa it would be dct:title, not dcat:title. JSON would use simply title.
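To make the naming distinction concrete, here is a small Python sketch that expands the plain JSON keys of a distribution object into prefixed RDF terms. The mapping table and function name are illustrative assumptions, not part of the schema; the only point is that title and description land in the Dublin Core terms namespace (dct:), while accessURL stays in dcat:.

```python
# Hypothetical key-to-term table for a distribution object.
TERM_FOR_KEY = {
    "title": "dct:title",
    "description": "dct:description",
    "accessURL": "dcat:accessURL",
}


def to_prefixed_terms(distribution):
    """Return a copy with known keys renamed; unknown keys are left as-is."""
    return {TERM_FOR_KEY.get(key, key): value for key, value in distribution.items()}
```

Keys that are not in the table pass through unchanged, so the sketch makes no claim about how the remaining fields should be mapped.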

gbinal · 2014-07-24T14:51:28Z

There's been pretty good agreement on allowing for this, though not requiring it. Also:

* It needs to be in sync with DCAT.
* That said, distribution is still not meant for gathering similar datasets; that is what the collection discussion in issue #258 (Guidance on defining collections to group datasets released as a series or in fragments) is for.

philipashlock · 2014-08-25T21:44:25Z

Here's an example of adding the title and description within a distribution just like DCAT allows (see #350). Note that this example also includes the change of format to mediaType (#272) for IANA MIME types and accessURL to downloadURL (#335) for file download URLs.

One question is what good examples of title and description would look like. Should one of them be the file name or a description of the file format? A human-readable value for the file format should already be covered by format.

This is an excerpt, but see the gist for the full data.json

 "distribution": [
    {
        "description": "Widgets data as a CSV file",
        "downloadURL": "https://data.agency.gov/datasets/widgets-statistics/widgets.csv",
        "format": "CSV",
        "mediaType": "text/csv",
        "title": "widgets.csv"
    },
    {
        "description": "Widgets data as a zipped CSV file with attached data dictionary",
        "downloadURL": "https://data.agency.gov/datasets/widgets-statistics/widgets-all.zip",
        "format": "Zipped CSV",
        "mediaType": "application/zip",
        "title": "widgets-all.zip"
    },
    {
        "accessURL": "https://data.agency.gov/api/widgets-statistics/",
        "description": "A fully queryable REST API with JSON and XML output",
        "format": "API",
        "title": "Widgets REST API"
    }
]
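On the consuming side, a catalog would need a fallback for entries that omit the optional title. The following Python sketch shows one possible approach; the function name and the fallback order (title, then the last URL path segment, then format) are assumptions for illustration, not how catalog.data.gov actually behaves.

```python
def display_label(dist):
    """Pick a human-readable label for one distribution entry."""
    if dist.get("title"):
        return dist["title"]
    # Fall back to the last path segment of whichever URL is present.
    url = dist.get("downloadURL") or dist.get("accessURL") or ""
    tail = url.rstrip("/").rsplit("/", 1)[-1]
    # If there is no URL either, use the format, then a generic placeholder.
    return tail or dist.get("format", "Untitled resource")
```

Entries that carry a title are shown with it; older entries without one still get a usable, if less friendly, label.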


gbinal · 2014-09-08T22:45:06Z

This is addressed by baa0178 and by cd7a527

Changes that still need to be addressed are changes in structure, and should we add usage note additions here or not?

* Adds optional describedByType field at the dataset and distribution level (#291, #332)
* Changes contactPoint field to an object that contains the name (fn) and email address (hasEmail) (#358)
* Adds fn field as part of contactPoint replacing earlier use of contactPoint (#358)
* Changes publisher field to an object that allows multiple levels of organizations (#296)
* Changes accessURL field to represent indirect access and to exist only within distribution (#217, #335)
* Changes format field to a human readable description and to exist only within distribution (#272, #293)
* Adds optional description field for use within distribution (#248)
* Adds optional title field for use within distribution (#248)
* Changes accrualPeriodicity field to use ISO 8601 date syntax (#292)
* Changes distribution field to become required-if-applicable and to always contain the accessURL or downloadURL fields (#217)
* Changes license field to be a URL (#196)

gbinal · 2014-11-07T22:57:36Z

Thank you for driving the conversation around this issue and helping to assemble the v1.1 metadata update.

There appears to be strong consensus around this issue, which has been accepted in the v1.1 update and merged into Project Open Data. Project Open Data is a living project though. Please continue any conversations around how the schema can be improved with new issues and pull requests!

It's important for government staff as well as the public to continue to collaborate to make the Open Data Policy ever better. Though the v1.1 update is a substantial update, future iterations do not have to be, so whatever your ideas - big or small - please continue to work with this community to improve how government manages and opens its data.

gbinal added the schema label Apr 14, 2014

gbinal added this to the Next Version of Common Core Metadata Schema milestone Apr 14, 2014

philipashlock added the new-field label May 8, 2014

philipashlock modified the milestone: Next Version of Common Core Metadata Schema (1.0 -> 1.1.) Jul 24, 2014

philipashlock added the schema (future) label Jul 24, 2014

gbinal added a commit that referenced this issue Sep 8, 2014

updating distribution guidance, part 1

baa0178

In response to #217, #248 I still need to update the expanded guidance

gbinal added a commit that referenced this issue Sep 8, 2014

updating distribution guidance, part 2

cd7a527

In response to #217, #248

philipashlock mentioned this issue Sep 23, 2014

A proposal to improve the usability of distribution links in data.gov GSA/datagov-wptheme#362

Closed

gbinal closed this as completed Nov 7, 2014
