Concept help - Data Set

A Data Set describes a record of data, including any location or time boundaries for the data, that has been captured and is available for use under a specific licence. A Data Set may be included in a Data Catalog, and can reference multiple Distributions that record different parts or formats of the data that are available to download.

A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset does not have to be available as a downloadable file. For example, a dataset that is available via an API can be defined as an instance of dcat:Dataset and the API can be defined as an instance of dcat:Distribution. DCAT itself does not define properties specific to API description; these are considered out of scope for this version of the vocabulary. Nevertheless, such properties can be defined in a profile of the DCAT vocabulary.
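To illustrate the API case, the following Python sketch builds a JSON-LD style description of a dataset whose only distribution is an API. It is an illustration under stated assumptions, not the registry's export format: the titles, the example.org URLs and the variable name are hypothetical, and only the vocabulary terms (dcat:Dataset, dcat:distribution, dcat:Distribution, dcat:accessURL, dct:title) come from DCAT and Dublin Core.

```python
import json

# Hypothetical record: every title and URL below is a placeholder.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/example",
    "@type": "dcat:Dataset",
    "dct:title": "Example dataset served through an API",
    # The API is described as a distribution of the dataset.
    "dcat:distribution": [
        {
            "@id": "https://example.org/dataset/example/api",
            "@type": "dcat:Distribution",
            "dct:title": "API access",
            "dcat:accessURL": {"@id": "https://example.org/api/v1/example"},
        }
    ],
}

print(json.dumps(dataset, indent=2))
```

Any API-specific details (for example authentication or endpoints) would need properties from a DCAT profile, since DCAT itself does not define them.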

Fields available on this metadata type

Field ISO definition and Registry Help (where available)
Name The primary name used for human identification purposes.

The title is a unique, clear and descriptive name for the data asset. People searching for data should gain a basic understanding of the business use/intent of the data asset from the title. If a new data asset record is required due to an error in the data, the title should note that it is a revised version.

Free text (max. 200 char) e.g. Team productivity modelling based on APS Employee Census results for 2000-2022

Definition Representation of a concept by a descriptive statement which serves to differentiate it from related concepts. (3.2.39)

Easy to read information about the data asset to enable users to find and evaluate the data asset for their needs. The Description attribute is typically several sentences long and is used to search for the data asset so keywords should be carefully considered. Agencies are encouraged to include field names (e.g. gender/sex, age, address) collected within the data asset. This will help answer any specific research, policy or program questions a user may have, and help manage requests for additional information about the data asset received by your agency.

This attribute is supplemented by the Title, Keyword and Purpose attributes in the ONDC data catalogue.

Free text (max. 500 char) e.g. This model provides the breakdown of team productivity across the APS by the team job functions, providing management with richer insights into employee perceptions on a range of key indicators. These Census indicators include staff engagement, leadership, communication and change management, workplace conditions, health and wellbeing among others.
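The length limits for Name (200 characters) and Definition (500 characters) can be checked before a record is submitted. The following is a minimal Python sketch; the function and constant names are hypothetical, and it assumes the limits are counted in characters the way Python's len() counts them, which the help text does not confirm.

```python
# Limits taken from the help text above; the names are illustrative only.
TITLE_MAX = 200
DESCRIPTION_MAX = 500


def length_problems(title: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means both fields fit."""
    problems = []
    if len(title) > TITLE_MAX:
        problems.append(f"title is {len(title)} characters (max {TITLE_MAX})")
    if len(description) > DESCRIPTION_MAX:
        problems.append(
            f"description is {len(description)} characters (max {DESCRIPTION_MAX})"
        )
    return problems
```

Returning a list of messages rather than raising on the first failure lets a data steward see every over-length field in one pass.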

Is Federated
Is Not Federable
Version Unique version identifier of this metadata item.
References Significant documents that contributed to the development of the metadata item which were not the direct source for the metadata content.
Origin The source (e.g. document, project, discipline or model) for the item (8.1.2.2.3.5)
Comments Descriptive comments about the metadata item (8.1.2.2.3.4)
Deleted The date after which the item has been soft deleted and is no longer visible in the registry
License Information about the license document under which the dataset is made available.
Rights Information about rights held in and over the dataset.

Specifies access (or restrictions) to the data asset. Access is based on the agency’s privacy, security, or other policy approaches that apply to the data asset. Access can be:

• Open - Data that is publicly accessible online (account registration may be required).
• Conditional - Data that is publicly accessible subject to condition(s) that the user must meet to access the data. For example: a fee-for-service model applies to access the data; the user must have a .gov.au email to create an account and access the data; or the data is only accessible at a specific physical location.
• Restricted - Data access is limited for reasons such as legal, privacy and sensitivity. For example: during an embargo period; security classification is PROTECTED and above; access can only be provided under the DATA Scheme.

This attribute is supplemented by the Security Classification and Sensitive Data attributes.

Choose term from: Open, Conditional, Restricted
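Because Rights takes one of three fixed terms, a controlled vocabulary can be modelled as an enumeration. This Python sketch is illustrative only; the class and function names are hypothetical, and the case-sensitive matching is an assumption rather than documented registry behaviour.

```python
from enum import Enum


class AccessRights(Enum):
    """The three access terms listed in the help text above."""
    OPEN = "Open"
    CONDITIONAL = "Conditional"
    RESTRICTED = "Restricted"


def parse_access_rights(text: str) -> AccessRights:
    # Assumption: surrounding whitespace is ignored, but case must match.
    # Enum lookup by value raises ValueError for any other term.
    return AccessRights(text.strip())
```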

Release Date Date of formal publication of the dataset.

The date on which the underlying data asset was made available for use, consumption, or analysis. This date should be updated whenever the underlying data is updated. This is not to be confused with Date Modified, which records the date of the metadata in an agency’s data inventory.

Date/Time in format: AS/NZS ISO 8601.1:2021

e.g. 1973-09

1973-09-17

1973-09-17T23:20:30+04:00
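The three example forms above can be checked mechanically. The following is a minimal Python sketch, not the registry's own validation: the function name parse_release_date is hypothetical, and it accepts only the three forms shown above, which is a narrower set than the full AS/NZS ISO 8601.1:2021 standard allows.

```python
import re
from datetime import datetime

# One (pattern, strptime format) pair per example form shown above.
_FORMS = [
    (re.compile(r"\d{4}-\d{2}"), "%Y-%m"),
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "%Y-%m-%d"),
    (
        re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}"),
        "%Y-%m-%dT%H:%M:%S%z",
    ),
]


def parse_release_date(text: str) -> datetime:
    """Parse a year-month, a full date, or a date-time with a UTC offset."""
    for pattern, fmt in _FORMS:
        if pattern.fullmatch(text):
            # strptime also rejects impossible values such as month 13.
            return datetime.strptime(text, fmt)
    raise ValueError(f"unsupported date form: {text!r}")
```

A year-month value such as 1973-09 is parsed to the first day of that month, so callers that care about precision should keep the original string alongside the parsed value.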

Modification Date Most recent date on which the dataset was changed, updated or modified.

The most recent date the data asset record was created, changed, updated or modified. This is the date on which the metadata of the data asset changed or was first recorded in the data inventory, not a date pertaining to the underlying data asset itself. This attribute is critical for agencies in managing their data assets and is supplemented by the ONDC Publish Date attribute.

Date/Time in format: AS/NZS ISO 8601.1:2021
e.g. 2023-09
2023-09-17
2023-09-17T23:20:30+04:00

Frequency The frequency at which the dataset is published.
Spatial Coverage Spatial or geographic coverage of the dataset.
Temporal Coverage The temporal or time period that the dataset covers.

Combination of the two ONDC fields Temporal coverage from/to:

The start period for the underlying data. Temporal coverage refers to the time period that the data asset covers. This field is related to the attribute Temporal coverage to. 

The end period for the underlying data. Temporal coverage refers to the time period that the data asset covers. The data asset may not have an end date if it is being continually added to, in which case, a value is not required. This field is related to the attribute Temporal coverage from.
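The from/to pair, with its optional end, can be modelled as a small value type. This is a hedged Python sketch: the class name, field names and the rule that an end date may not precede the start date are assumptions for illustration, not part of the ONDC or registry definitions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalCoverage:
    start: date
    # None models a data asset that is still being added to.
    end: Optional[date] = None

    def __post_init__(self):
        # Assumed sanity check: a closed period must not run backwards.
        if self.end is not None and self.end < self.start:
            raise ValueError("coverage end precedes coverage start")

    def is_open_ended(self) -> bool:
        return self.end is None
```

For example, TemporalCoverage(date(2000, 1, 1)) describes a collection that began in 2000 and is still growing.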

Catalog An entity responsible for making the dataset available.
Landing Page A Web page that can be navigated to in a Web browser to gain access to the dataset, its distributions and/or additional information.
Contact Point Relevant contact information for the Dataset.

An email address or a contact web form for users to request additional information related to the data asset. A group email address or contact web form is preferred because it is generic and enduring compared to an individual’s contact. This minimises the need to regularly update this attribute. Some agencies may choose to have a different point of contact for the Australian Government Data Catalogue and for internal purposes.

Group Email (or URL to contact web form) for the point of contact e.g. data.discovery@finance.gov.au

Conforming Specification An established standard to which the described resource conforms.
Item Base

Custom Fields

Field Short definition Long definition
Security Classification Choose term from: UNOFFICIAL, OFFICIAL, OFFICIAL: Sensitive, PROTECTED, SECRET, TOP SECRET

The security classification applied to the data asset is specified by the Australian Government Protective Security Policy Framework - Policy 8.

The originator of the data asset is responsible for applying the relevant Security Classification. This attribute is supplemented by the Sensitive Data and Access Rights attributes.

Data Custodian Free text selected from the following: • For Government department and agency • For non-government organisation • For research organisation e.g. Department of Finance

The data custodian(s) is the agency that is responsible for the data asset and has the authority for sharing and disclosure. The data custodian can differ from the publisher (see Publisher attribute).

An agency may also be a data custodian under the Data Availability and Transparency Act 2022 if:

(a) “[they are] a Commonwealth body; and

(b) [are] not an excluded entity; and

(c) either:

      (i) controls public sector data (whether alone or jointly with another entity), including by having the right to deal with that data; or

      (ii) has become the data custodian of output of a project in accordance with section 20F.”

The data custodian value must be consistent with the Government Directory, Non-Government Organisation (NGO) List or Research Organisations Register. This field is related to the Publisher attribute.

Official Definition

A representation of a dataset in a catalog. Data Catalog Vocabulary (DCAT): 5.3 Class: Dataset