RESO Transport Workgroup

RESO Transport Workgroup - Specifications and Change Proposals

RESO Data Dictionary Endorsement

| RCP | 40 |
| Version | 2.0 |
| Authors | Joshua Darnell (RESO) |
| Status | RATIFIED |
| Date Submitted | December 2021 |
| Date Ratified | December 2023 |
| Dependencies | Web API Core 2.0.0+ |
| Related Links | DD Wiki 2.0, Data Dictionary 2.0 Spreadsheet |


The Data Dictionary endorsement defines models for use in the RESO domain. These include Resources, Fields, Lookups, and Relationships between Resources.

New in version 2.0

RESO End User License Agreement (EULA)

This End User License Agreement (the “EULA”) is entered into by and between the Real Estate Standards Organization (“RESO”) and the person or entity (“End User”) that is downloading or otherwise obtaining the product associated with this EULA (“RESO Product”). This EULA governs End User’s use of the RESO Product and End User agrees to the terms of this EULA by downloading or otherwise obtaining or using the RESO Product.



Summary of Changes


Introduction

The RESO Data Dictionary defines the set of data elements available within RESO’s domain. These consist of resources, fields, and enumerations, also known as lookups.

This document outlines what the Data Dictionary is and how it maps to the RESO Web API transport layer.


Section 1: Purpose

The primary goal of the RESO Data Dictionary is interoperability through the consistent use of standard data elements.

While the Web API Server specification ensures that servers can talk to each other in a uniform manner, servers that use different fields to represent the same data force consumers to map between them. As a result, products that need to interoperate between systems become more complex and slower to bring to market.

The point of the RESO Data Dictionary is to give data consumers and producers a common language for exchanging data.


Section 2: Specification

Overview

The RESO Data Dictionary consists of three main sets of data elements: resources, fields, and lookups (also known as enumerations).

Section 2.1: Data Dictionary Spreadsheet

The Data Dictionary specification is defined as a spreadsheet. Each newly adopted version gets its own spreadsheet when ratified.

This worksheet is divided into three main sections:

Section 2.2: Lookup Resource for Enumeration Metadata

This section defines a RESO Data Dictionary resource called Lookup that can be used to convey metadata about the enumerations available on a given server.

In systems that cover large geographic areas, the amount of metadata can grow quite large. There are lookups for cities, counties, subdivisions, etc. for each of the areas a given vendor covers, which makes it impractical to deliver this information through a static OData XML metadata document.

The Lookup resource also allows human-friendly display names to be used in transport, both for queries and payloads, while providing consumers with a way to replicate metadata, as needed, rather than all at once.

Resource Definition

The Lookup resource is defined as follows:

| Field | Data Type | Sample Value | Nullable | Description |
| LookupKey | Edm.String | "ABC123" | false | The key used to uniquely identify the Lookup entry. |
| LookupName | Edm.String | "ListingAgreementType" | false | The name of the enumeration. This is the LookupName in the adopted Data Dictionary 2.0 spreadsheet. It is called a “LookupName” in this proposal because more than one field can use a given lookup, so it refers to the name of the lookup rather than a given field. For example, CountyOrParish in Listing and OfficeCountyOrParish in Office share the same CountyOrParish LookupName. This MUST match the Data Dictionary definition in cases where the lookup is defined. Vendors MAY add their own enumerations otherwise. The LookupName a given field uses is required to be annotated at the field level in the OData XML Metadata, as outlined later in this proposal. |
| LookupValue | Edm.String | "Seller Reserve" | false | The human-friendly display name the data consumer receives in the payload and uses in queries. This MAY be a local name or synonym for a given RESO Data Dictionary lookup item. |
| StandardLookupValue | Edm.String | "Exclusive Agency" | true | The standard Data Dictionary value of the enumerated value. This field is required when a given enumeration is a standard lookup value, regardless of the value in LookupValue. Local lookups MAY omit this information if they don’t correspond to an existing RESO standard lookup value. |
| LegacyODataValue | Edm.String | "ExclusiveAgency" | true | The Legacy OData lookup value that the server vendor provided in their OData XML Metadata. This value is optional. It provides a stable mechanism for translating OData lookup values to RESO standard lookup display names, and supports historical data that might have included the OData value at some point, even after the vendor had converted to human-friendly display names. |
| ModificationTimestamp | Edm.DateTimeOffset | "2020-07-07T17:36:14+00:00" | false | The timestamp for when the enumeration value was last modified. This is used to help rebuild caches when metadata items change so consumers don’t have to re-pull and reprocess the entire set of metadata when only a small number of changes have been made. |
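
As a rough illustration of the resource definition above, a consumer might model one Lookup record as follows. This is a minimal sketch: the `LookupRecord` class and `parse_lookup` helper are hypothetical names, not part of the specification. Only the field names and nullability come from the definition.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class LookupRecord:
    """One entry of the Lookup resource (illustrative client-side model)."""
    LookupKey: str
    LookupName: str
    LookupValue: str
    ModificationTimestamp: datetime
    StandardLookupValue: Optional[str] = None  # nullable in the definition
    LegacyODataValue: Optional[str] = None     # nullable in the definition


def parse_lookup(item: dict) -> LookupRecord:
    """Build a LookupRecord from a decoded JSON object, rejecting missing required fields."""
    for required in ("LookupKey", "LookupName", "LookupValue", "ModificationTimestamp"):
        if item.get(required) is None:
            raise ValueError(f"{required} is required and must not be null")
    # Normalize a trailing "Z" so that older Python versions can parse the timestamp.
    timestamp = datetime.fromisoformat(item["ModificationTimestamp"].replace("Z", "+00:00"))
    return LookupRecord(
        LookupKey=item["LookupKey"],
        LookupName=item["LookupName"],
        LookupValue=item["LookupValue"],
        ModificationTimestamp=timestamp,
        StandardLookupValue=item.get("StandardLookupValue"),
        LegacyODataValue=item.get("LegacyODataValue"),
    )
```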

Required Annotation

For any String List, Single or String List, Multi field using the Lookup resource, the following MUST be present in the server metadata:

Example

<!-- OData annotation for String List, Single field -->
<Property Name="OfficeCountyOrParish" Type="Edm.String">
  <Annotation Term="RESO.OData.Metadata.LookupName" String="CountyOrParish" />  
</Property>

<!-- OData annotation for String List, Multi field -->
<Property Name="ExteriorFeatures" Type="Collection(Edm.String)">
  <Annotation Term="RESO.OData.Metadata.LookupName" String="ExteriorFeatures" />  
</Property>

Where:

Notes:

Queries

The Lookup resource MUST support queries that use the OData $top and $skip query options, in conjunction with a filter on ModificationTimestamp, so consumers can synchronize since the last update. The client MUST be able to consume the advertised count of records from the server or testing will not pass.

Providers MAY support other queries on this resource, such as filtering by LookupName.

Example: GET Lookups using OData $top and $skip

The following example shows retrieving a page of records using an OData $top and $skip query:

GET /Lookup?$top=100&$skip=0
{
  "value": [{
    "LookupKey": "CDE125",
    "LookupName": "CountyOrParish",
    "LookupValue": "Contra Costa County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2020-07-07T17:36:16Z"
  }, {
    "LookupKey": "BCD124",
    "LookupName": "CountyOrParish",
    "LookupValue": "Ventura County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2020-07-07T17:36:15Z"
  }, {
    "LookupKey": "ABC123",
    "LookupName": "CountyOrParish",
    "LookupValue": "Los Angeles County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2020-07-07T17:36:14Z"
 }]
}

In the previous example, the client has requested 100 records but only the 3 shown were available. Clients should be prepared to paginate with page sizes less than the requested size. For example, if 1,000 were requested but only 100 were supported on the server, the consumer’s next query should have a $top=100 and $skip=100.
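
The pagination behavior described above might be sketched as follows. This is only an illustration: `fetch_all` and the `fetch_page` callback are hypothetical names, and the callback stands in for whatever HTTP client issues `GET /Lookup?$top=...&$skip=...` and returns the decoded `value` array.

```python
from typing import Callable, Dict, List

Page = List[Dict[str, object]]


def fetch_all(fetch_page: Callable[[int, int], Page], requested_top: int = 1000) -> Page:
    """Collect every record, advancing $skip by the number of records actually returned."""
    records: Page = []
    top, skip = requested_top, 0
    while True:
        page = fetch_page(top, skip)  # e.g. GET /Lookup?$top={top}&$skip={skip}
        if not page:
            break  # an empty page means there is nothing left to fetch
        records.extend(page)
        skip += len(page)
        # If the server returned fewer records than requested, adopt its page size.
        top = min(top, len(page))
    return records
```

Advancing `$skip` by the returned page length, rather than by the requested `$top`, is what keeps the loop correct when the server caps page sizes.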

Example: GET Lookups with $count=true

Providers MUST support the OData $count=true parameter.

GET /Lookup?$count=true
{
  "@odata.count": 3,
  "value": [{
    "LookupKey": "CDE125",
    "LookupName": "CountyOrParish",
    "LookupValue": "Contra Costa County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2020-07-07T17:36:16Z"
  }, {
    "LookupKey": "BCD124",
    "LookupName": "CountyOrParish",
    "LookupValue": "Ventura County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2020-07-07T17:36:15Z"
  }, {
    "LookupKey": "ABC123",
    "LookupName": "CountyOrParish",
    "LookupValue": "Los Angeles County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2020-07-07T17:36:14Z"
 }]
}

The count query may be used in conjunction with $top=0 to provide a count without returning any values.

Example: GET records that were updated since they were last synced

If a consumer wanted to catch up with any records updated since the last time they had synced, they might use the following query:

GET /Lookup?$filter=ModificationTimestamp ge 2022-01-01T00:00:00Z&$top=100&$skip=0&$count=true
{
  "@odata.count": 0,
  "value": []
}

Since the count is zero, there were no updates since the last sync on 2022-01-01T00:00:00Z.

If there were updates, something similar to the following might be expected:

GET /Lookup?$filter=ModificationTimestamp ge 2022-01-01T00:00:00Z&$top=100&$skip=0&$count=true
{
  "@odata.count": 1,
  "value": [{
    "LookupKey": "CDE125",
    "LookupName": "CountyOrParish",
    "LookupValue": "Contra Costa County",
    "StandardLookupValue": null,
    "ModificationTimestamp": "2022-03-07T17:36:16Z"
  }]
}
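
A consumer-side sketch of this incremental sync is shown below. The helper names `sync_url` and `apply_updates` are hypothetical, and the cache is assumed to be a plain dictionary keyed by LookupKey. The example percent-encodes the filter expression, which the sample requests above leave unencoded for readability.

```python
from urllib.parse import quote


def sync_url(last_sync: str, top: int = 100, skip: int = 0) -> str:
    """Build the relative URL for records modified since `last_sync` (an ISO 8601 UTC timestamp)."""
    filter_expr = f"ModificationTimestamp ge {last_sync}"
    return f"/Lookup?$filter={quote(filter_expr)}&$top={top}&$skip={skip}&$count=true"


def apply_updates(cache: dict, response: dict) -> int:
    """Upsert the returned records into a cache keyed by LookupKey and return how many were applied."""
    records = response.get("value", [])
    for record in records:
        cache[record["LookupKey"]] = record
    return len(records)
```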

Usage

Example Metadata

This section shows how the Lookup resource might be used in conjunction with data from the Property resource with OData XML Metadata defined as follows:

<?xml version="1.0" encoding="UTF-8"?>
<edmx:Edmx Version="4.0" xmlns:edmx="http://docs.oasis-open.org/odata/ns/edmx">
  <edmx:DataServices>
    <Schema Namespace="org.reso.metadata" xmlns="http://docs.oasis-open.org/odata/ns/edm">
      <EntityType Name="Property">
        <Key>
          <PropertyRef Name="ListingKey"/>
        </Key>
        <Property MaxLength="255" Name="ListingKey" Type="Edm.String"/>
        <Property Name="StandardStatus" Type="Edm.String">
            <Annotation Term="RESO.OData.Metadata.LookupName" String="StandardStatus" />  
        </Property>
        <Property Name="AccessibilityFeatures" Type="Collection(Edm.String)">
            <Annotation Term="RESO.OData.Metadata.LookupName" String="AccessibilityFeatures" />  
        </Property>
        <Property Name="ModificationTimestamp" Precision="27" Type="Edm.DateTimeOffset"/>
      </EntityType>
      <EntityType Name="Lookup">
        <Key>
          <PropertyRef Name="LookupKey"/>
        </Key>
        <Property Name="LookupKey" Type="Edm.String" Nullable="false" />
        <Property Name="LookupName" Type="Edm.String" Nullable="false" />
        <Property Name="LookupValue" Type="Edm.String" Nullable="false" />
        <Property Name="StandardLookupValue" Type="Edm.String" />
        <Property Name="LegacyODataValue" Type="Edm.String" />
        <Property Name="ModificationTimestamp" Precision="27" Type="Edm.DateTimeOffset"  Nullable="false" />
      </EntityType>
    </Schema>
  </edmx:DataServices>
</edmx:Edmx>
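
To connect fields to their Lookup entries, a client could read the LookupName annotations out of metadata like the document above. The following is a minimal sketch using Python's standard XML parser. The function name `lookup_names_by_field` is hypothetical. The annotation term and namespace are taken from the example metadata.

```python
import xml.etree.ElementTree as ET

EDM = "{http://docs.oasis-open.org/odata/ns/edm}"
LOOKUP_TERM = "RESO.OData.Metadata.LookupName"


def lookup_names_by_field(edmx_xml: str, entity_type: str) -> dict:
    """Map each annotated field of `entity_type` to the LookupName it declares."""
    root = ET.fromstring(edmx_xml)
    result = {}
    for entity in root.iter(f"{EDM}EntityType"):
        if entity.get("Name") != entity_type:
            continue
        for prop in entity.findall(f"{EDM}Property"):
            for annotation in prop.findall(f"{EDM}Annotation"):
                if annotation.get("Term") == LOOKUP_TERM:
                    result[prop.get("Name")] = annotation.get("String")
    return result
```

For the metadata shown above, the Property entity type would be expected to yield entries for StandardStatus and AccessibilityFeatures, which can then be used to select the matching LookupName values from the Lookup resource.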

GET Property Record with Human Friendly Standard Lookups

When using the Lookup resource, the values in the payload response will be the human friendly display name for a given enumeration:

GET /Property?$top=1 
{
  "value": [
    {
      "ListingKey": "abc123",
      "StandardStatus": "Active Under Contract",     
      "AccessibilityFeatures": ["Accessible Approach with Ramp", "Accessible Entrance", "Visitable"],
      "ModificationTimestamp": "2020-04-02T02:02:02.02Z"
    }
  ]
}

In the preceding example, the Lookup resource MUST contain the following:


Section 3: Certification

When standards are approved for the RESO Data Dictionary, those changes are stored in a format that allows their corresponding testing rules to be generated automatically, which ensures consistency with Data Dictionary and Web API specifications. This also allows for new versions of the testing tool to be created almost immediately when Data Dictionary standards are passed.

The RESO Data Dictionary application produces robust statistics, which are then ingested into a real-time analytics framework that lets users see industry-wide statistics about resources, fields, and enumerations. This information can be used to inform decisions about standardization and data mapping between RESO certified servers.

Background

The RESO Data Dictionary testing tool ensures compliance with RESO Data Dictionary definitions of resources, fields, and enumerations.

Nonstandard or “local” data elements are also allowed, provided Data Dictionary resources are used whenever present on a given server and metadata for any additional items is in a supported and valid transport format.

Resources are top-level containers in the RESO ecosystem. Some examples are Property, Member, Office, Media, and OpenHouse.

Fields exist within a given resource and have name and type definitions that must be adhered to in order to be considered compliant. In the case of Property, examples of fields are ListPrice, ModificationTimestamp, etc. Fields don’t exist on their own in the metadata. They will always be contained within a top-level resource definition that MUST match RESO Standard Resource definitions when they exist.

Lookups define possible values for a given field, and are used in cases such as StandardStatus and ExteriorFeatures.

Testing Framework

Data Dictionary Certification is provided by the RESO Certification Utils CL, which uses the RESO Commander for the metadata testing portion.

The RESO Commander is an open source, cross-platform Java library created by RESO that uses established community libraries, such as the Apache Olingo OData Client, XML parsers, and JSON Schema Validators, to provide a testing API.

Acceptance tests define the requirements applicants are expected to meet in order to achieve certification. Data Dictionary acceptance tests are written in a domain-specific language (DSL) called Gherkin. This is part of a Behavior Driven Development (BDD) platform called Cucumber, which allows testing workflows to be expressed in a natural language that is intended to be accessible to business analysts and QA testers in addition to programmers. Tests are automatically generated from the adopted Data Dictionary spreadsheet for each given version of the specification, and can target any version of the Data Dictionary from 1.0 onwards.

The benefit of this strategy is that when a new Data Dictionary version is ratified, the tests may be generated and testing can begin right away, significantly reducing tool development time and speeding adoption of the standard.

A command-line interface (CLI) has been provided for local testing. This provides the environment to be used for certification and self-assessment, as well as that needed to run the automated testing tools in a continuous integration and deployment (CI/CD) pipeline, on platforms such as GitHub CI, Jenkins, Travis, or CircleCI, to help prevent regressions in a RESO-certified codebase.

A graphical user interface (GUI) is also available through popular and free Integrated Development Environment (IDE) plugins for IntelliJ and Eclipse. IDEs provide an enhanced testing experience, with better informational messages and the ability to easily run and debug each test step, when needed. The availability of plugins saves significant time in testing, development, and certification. The level of community support is one of the reasons open source tools were chosen as a testing platform.

Testing Methodology

RESO Data Dictionary certification is based on adherence to: a) Resource, Field, and Lookup definitions outlined in each approved RESO Data Dictionary spreadsheet (2.0 at the time of publication), b) Transport requirements regarding authentication and OData conformance, and c) conformance with Data Dictionary to Web API data mappings.

Configuring the Test Client

The starting point is for applicants to create a configuration file in RESOScript (XML) format which contains credentials and a server’s RESO Web API endpoint. A sample RESOScript file and instructions for how to use it will be provided with the initial release of the testing tool.

Metadata Request Using RESO Standard Authentication

When testing begins, an HTTP request is made to an applicant’s given service location with either OAuth2 Bearer Tokens or Client Credentials. Both of these authentication strategies allow for data consumption to be machine automated so that additional interaction from a user isn’t necessary during the authentication process. As such, the RESO Data Dictionary Commander can be used for automated testing. The metadata request is expected to function according to the OData specification in terms of request and response headers and response formats. RESO specifically uses an XML version of OData metadata, which contains an Entity Data Model (EDM) and model definitions, and is often referred to as EDMX.

OData Metadata Validation

Syntax Checking

Metadata returned from a RESO Web API server are checked for XML validity as well as validated against Entity Data Model (EDM) and EDMX definitions published by OASIS, the creators of the OData specification. If metadata are invalid for any reason, Data Dictionary testing will halt.

Semantic Checking

After metadata syntax has been validated, declared data models are checked for correctness. For example, if a given server declares support for the RESO Property resource, then the RESO Commander will look for an OData EntityType definition for Property. If the underlying data model is not found, metadata validation will fail with a diagnostic message to help users understand why a given error occurred. Once the model is found, its field and enumeration definitions will be checked for correctness as well. Another aspect of semantic checking is ensuring that all models have keys so they can be indexed, meaning that a data request can be made to the server by key. This is a basic requirement for fetching data from a server.
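
The key requirement above can be illustrated with a short check over the EDMX document. This is a sketch under simplifying assumptions, not the RESO Commander's implementation: the function name `entity_types_missing_keys` is hypothetical, and keys inherited through an OData BaseType are not considered.

```python
import xml.etree.ElementTree as ET

EDM = "{http://docs.oasis-open.org/odata/ns/edm}"


def entity_types_missing_keys(edmx_xml: str) -> list:
    """Return the names of EntityType definitions that declare no Key/PropertyRef."""
    root = ET.fromstring(edmx_xml)
    missing = []
    for entity in root.iter(f"{EDM}EntityType"):
        key_refs = entity.findall(f"{EDM}Key/{EDM}PropertyRef")
        if not key_refs:
            missing.append(entity.get("Name"))
    return missing
```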

RESO Certification

Several requirements must be met during Data Dictionary testing to ensure conformance with RESO Certification rules.

Conformance with the RESO Standard Data Model

In this step, tests that have been generated from a given adopted RESO Data Dictionary version are run to locate and verify resources, fields, and enumerations contained within a server’s metadata. This phase of testing is designed to test that items declared in the metadata using RESO Standard Field Names are consistent with the Data Dictionary definitions for those items.

Resources

Standard Resources MUST be expressed using RESO Standard Resource Names. For instance, Property would be used rather than Properties; otherwise, the resource will not be counted. These will be verified during the certification process.

For each RESO Standard Resource found, its standard fields and lookups will be verified. Normative resource names for any Standard Resource can be found in the RESO DDWiki.

Fields

Fields have both StandardName and data type mapping requirements.

Implementers are allowed to commingle their own fields and data types alongside RESO standard fields, but standard fields MUST match their Data Dictionary type definition mappings.

Standard Field Names

RESO Standard Fields MUST be named in accordance with the Data Dictionary definitions of those fields when present on a given server instance.

For example, if a server presents a Property resource and list price field data are present, they MUST be conveyed as ListPrice. Local fields SHOULD use the same naming conventions, when practical. There may be reasons to use nonstandard field names, such as for backwards compatibility, but they MUST pass OData validation.

Variations such as Price, or any Data Dictionary synonym of the ListPrice field such as AskingPrice, will fail.

Various techniques are used to find potential matches with Data Dictionary definitions of resources, fields, and enumerations that don’t conform to the RESO Definitions of these items. See Additional Compliance Checking for more information.

Standard Display Names

Prior to Data Dictionary 1.7 and Web API Core 2.0.0, an annotation of RESO.OData.Metadata.StandardName was used to indicate the display name of a field or enumeration. This has been deprecated.

To convey display names of fields or lookups, the Field and Lookup Resources should be used.

Lookups

Underlying OData enumerations for Data Dictionary lookups MUST adhere to the naming conventions outlined in the OData specification and map to the correct types, as outlined in the Data Type Mappings section.

Standard lookup values for OData are provided in the Data Dictionary 2.0 Spreadsheet. They are not required, but are intended to serve as a guide for those using OData.

DEPRECATION NOTICE: RESO will eventually be deprecating OData IsFlags enumerations in favor of the Lookup resource in a future version of the Data Dictionary. This change will come with a major version bump, and perhaps be part of Data Dictionary 3.0, TBD. See the section on the Lookup Resource for more information.

Data Type Mappings

The following mappings apply to the RESO Data Dictionary and Web API specifications. Data Dictionary data types shown in the following table are contained in the SimpleDataType column of the adopted Data Dictionary 2.0 spreadsheet, for instance those for the Property Resource.

| Data Dictionary (1.7+) | Web API Core (2.0.0+) |
| Boolean | Edm.Boolean |
| Collection | Related Resource Expansion, e.g. PropertyRooms or Units expanded into the Property resource. Requires $expand Endorsement. |
| Date | Edm.Date |
| Number | Edm.Decimal OR Edm.Double for decimal values; Edm.Int64 OR Edm.Int32 OR Edm.Int16 for integers. |
| String | Edm.String |
| String List, Single | EITHER Edm.EnumType OR Edm.String |
| String List, Multi | EITHER Collection(Edm.EnumType) OR Edm.EnumType with IsFlags="true" OR Collection(Edm.String) |
| Timestamp | Edm.DateTimeOffset |

Each data type mapping has a corresponding Cucumber BDD acceptance test template that enforces the rules of a given type.

Acceptance Test Templates

Boolean

Boolean values are mapped to the Edm.Boolean data type and MUST contain a literal value of “true” or “false” when returned in a payload for a given Boolean field, which is enforced by the RESO Commander. Boolean fields MAY be null as any OData field is nullable. Null values are interpreted as “false.”

Sample Test

  Scenario: AdditionalParcelsYN
    When "AdditionalParcelsYN" exists in the "Property" metadata
    Then "AdditionalParcelsYN" MUST be "Boolean" data type

Collection

Collection data types are used in two cases:

Collection data types are used in the Data Dictionary to indicate possible expansions, which use the OData Edm.Collection type to present a collection of instances of a given type, such as Media items related to a Property record.

StandardName items for expanded fields have been provided in the adopted Data Dictionary spreadsheet and reference metadata to guide vendors in the meantime. It’s also worth noting that collection items are not nullable in the normal OData sense. Rather, if there are no values present in the collection, the response should be the empty list [] (per the OData specification).

Date

Date data types use the OData Edm.Date data type. Dates are expected to be in the format “yyyy-mm-dd” and should not include time zone offsets. For dates with time zone support, see Timestamp.

Sample Test

  Scenario: AvailabilityDate
    When "AvailabilityDate" exists in the "Property" metadata
    Then "AvailabilityDate" MUST be "Date" data type
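
A strict check of the “yyyy-mm-dd” format might look like the sketch below. The function name `parse_reso_date` is hypothetical. The pattern is matched explicitly so that values carrying a time or time zone offset are rejected rather than silently truncated.

```python
import re
from datetime import date

# Four-digit year, two-digit month, two-digit day, and nothing else.
_DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_reso_date(value: str) -> date:
    """Parse a yyyy-mm-dd string, rejecting time parts, offsets, and impossible dates."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not in yyyy-mm-dd format: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)  # raises ValueError for dates such as 2021-02-30
```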

Number

Numbers may either be Integers or Decimals.

Integers

Numbers without Scale and Precision are treated as Integers in the Data Dictionary.

Integers are expected to be expressed using one of the OData integer data types (Edm.Int64, Edm.Int32, or Edm.Int16, according to the Data Dictionary Type Mappings) and MUST NOT contain length, precision, or scale attributes.

Sample Test

  Scenario: BathroomsFull
    Given that the following synonyms for "BathroomsFull" DO NOT exist in the "Property" metadata
      | FullBaths |
    When "BathroomsFull" exists in the "Property" metadata
    Then "BathroomsFull" MUST be "Integer" data type

Note: Synonyms testing is shown in the Given step of the above example and is discussed further in a subsequent section.

Decimals

Decimals are expected to be Edm.Decimal or Edm.Double according to the Data Dictionary Type Mappings. They MAY contain Precision and Scale attributes, as described by the entity data model type definition, which also MAY be omitted.

If the vendor declares Precision and Scale attributes, they SHOULD match those defined by the Data Dictionary but this is not an absolute requirement. Suggested values are provided in the Data Dictionary specification but they are not mandatory at this time. This is reflected in the BDD acceptance tests.

Sample Test

  Scenario: BuildingAreaTotal
    When "BuildingAreaTotal" exists in the "Property" metadata
    Then "BuildingAreaTotal" MUST be "Decimal" data type
    And "BuildingAreaTotal" precision SHOULD be equal to the RESO Suggested Max Precision of 14
    And "BuildingAreaTotal" scale SHOULD be equal to the RESO Suggested Max Scale of 2

Note: The Data Dictionary contains references to Length and Precision which have been found to be inaccurate with respect to standard definitions of decimal numbers. It uses Length and Precision to mean Precision and Scale, respectively. These items have been corrected in the code generation for decimal acceptance tests.
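
To make the Precision and Scale terminology concrete, the following sketch checks whether a decimal value fits a given Precision (total significant digits) and Scale (digits to the right of the decimal point). The function name is hypothetical, and the defaults of 14 and 2 are the suggested values from the BuildingAreaTotal sample test. The sketch assumes the OData convention that at most Precision minus Scale digits appear to the left of the decimal point.

```python
from decimal import Decimal


def fits_precision_and_scale(value: str, precision: int = 14, scale: int = 2) -> bool:
    """Return True if `value` fits within the given Precision and Scale."""
    _sign, digits, exponent = Decimal(value).as_tuple()
    if not isinstance(exponent, int):
        return False  # NaN and Infinity have non-numeric exponents
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    # Assumed OData convention: integer digits are limited to Precision - Scale.
    return fractional_digits <= scale and integer_digits <= precision - scale
```

Under this reading, a spreadsheet entry of Length 14 and Precision 2 corresponds to Precision 14 and Scale 2, which allows up to twelve digits before the decimal point and two after it.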

String

String values use the OData Edm.String data type. These strings represent a sequence of UTF-8 characters. String data types MAY specify a length attribute that specifies the length of a string a given server supports. The length property is not required by OData and may be omitted.

RESO provides recommended best practices for these lengths, and applicants will be informed when their length definitions don’t match the RESO definitions, but will not fail certification in these cases.

Sample Test

  Scenario: AccessCode
    Given that the following synonyms for "AccessCode" DO NOT exist in the "Property" metadata
      | GateCode |
    When "AccessCode" exists in the "Property" metadata
    Then "AccessCode" MUST be "String" data type
    And "AccessCode" length SHOULD be equal to the RESO Suggested Max Length of 25

String List, Single

The current RESO Web API specification uses the OData Edm.EnumType for single enumerations. As such, Data Dictionary items use this data type as well.

These items are similar to fields in that they MUST follow OData field naming conventions. Enumerations not following these conventions will fail certification in the metadata validation step.

Sample Test

  Scenario: StandardStatus
    When "StandardStatus" exists in the "Property" metadata
    Then "StandardStatus" MUST be "Single Enumeration" data type
    And the following synonyms for "StandardStatus" MUST NOT exist in the metadata
      | NormalizedListingStatus |
      | RetsStatus |
    And "StandardStatus" MUST contain at least one of the following standard lookups
      | LegacyODataValue | StandardLookupValue |
      | Active | Active |
      | ActiveUnderContract | Active Under Contract |
      | Canceled | Canceled |
      | Closed | Closed |
      | ComingSoon | Coming Soon |
      | Delete | Delete |
      | Expired | Expired |
      | Hold | Hold |
      | Incomplete | Incomplete |
      | Pending | Pending |
      | Withdrawn | Withdrawn |
    And "StandardStatus" MUST contain only standard enumerations

String List, Multi

As of Web API 2.0.0 Core, there are three formats allowed for String List, Multi.

Edm.EnumType with IsFlags="true"

The Web API Server Core 2.0.0 specification outlines the use of the OData Edm.EnumType data type with the IsFlags="true" attribute set to signify that a given field supports multi-valued enumerations. Applicants using this format will still be able to be certified.

Collection(Edm.EnumType)

As there are limitations to the IsFlags approach in cases where multi-select items contain more than 64 distinct values, support for Collection(Edm.EnumType) was added to the Data Dictionary Type Mappings and back-ported to the Web API 2.0.0 Core specification to be used instead.

The following sample test covers both representations:

Sample Test

  Scenario: BuyerFinancing
    When "BuyerFinancing" exists in the "Property" metadata
    Then "BuyerFinancing" MUST be "Multiple Enumeration" data type
    And "BuyerFinancing" MAY contain any of the following standard lookups
      | LegacyODataValue | StandardLookupValue |
      | Assumed | Assumed |
      | Cash | Cash |
      | Contract | Contract |
      | Conventional | Conventional |
      | FHA | FHA |
      | FHA203b | FHA 203(b) |
      | FHA203k | FHA 203(k) |
      | Other | Other |
      | Private | Private |
      | SellerFinancing | Seller Financing |
      | TrustDeed | Trust Deed |
      | USDA | USDA |
      | VA | VA |
    But "BuyerFinancing" MUST NOT contain any similar lookups

Collection(Edm.String)

The addition of the Lookup Resource provides another data type for multiple enumerations, the Collection(Edm.String).

This behaves similarly to Collection(Edm.EnumType) in terms of filtering with any() or all(), but uses human-friendly strings instead.
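
As an illustration of filtering on a Collection(Edm.String) field, the sketch below builds OData `any()` lambda expressions from human-friendly values. The helper names `any_filter` and `contains_all_filter` are hypothetical. Single quotes inside values are doubled, as OData string literals require.

```python
def any_filter(field: str, value: str) -> str:
    """Build a $filter expression matching records whose collection contains `value`."""
    escaped = value.replace("'", "''")  # OData escapes a single quote by doubling it
    return f"{field}/any(v:v eq '{escaped}')"


def contains_all_filter(field: str, values: list) -> str:
    """Build a $filter expression matching records whose collection contains every value."""
    return " and ".join(any_filter(field, value) for value in values)
```

For example, `any_filter("AccessibilityFeatures", "Visitable")` produces `AccessibilityFeatures/any(v:v eq 'Visitable')`, which would be passed as the `$filter` query option after URL encoding.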

Timestamp

Timestamps are expected to use the OData Edm.DateTimeOffset data type.

This represents an ISO 8601 compliant date that includes support for both fractional seconds and time zones. The Edm.DateTimeOffset doesn’t have any additional length, precision, or scale attributes. Data conveyed using this format is expected to match the date timestamp data type in the W3C specification.

Sample Test

  Scenario: ModificationTimestamp
    Given that the following synonyms for "ModificationTimestamp" DO NOT exist in the "Property" metadata
      | ModificationDateTime |
      | DateTimeModified |
      | ModDate |
      | DateMod |
      | UpdateDate |
      | UpdateTimestamp |
    When "ModificationTimestamp" exists in the "Property" metadata
    Then "ModificationTimestamp" MUST be "Timestamp" data type

Lookup Resource

RESO supports use of a Data Dictionary resource in order to advertise lookup metadata. This has the advantage of providing human friendly lookup values as well as the ability to more easily replicate large sets of enumerations, such as subdivisions or cities.

The following testing rules will be used during RESO Certification:

Servers MUST be able to provide the entire set of lookups relevant for testing through the replication operation. For example, if a given system advertises 101 records through the $count=true query option but only 100 records can be fetched, testing will fail. The opposite is also true: if 100 records were advertised and 101 were found, testing will not pass.
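
The count rule above amounts to an exact comparison between the advertised count and the records actually replicated. A minimal sketch follows. The function name is hypothetical, and the additional check for duplicate LookupKey values is an assumption of this example rather than a stated testing rule.

```python
def replication_is_complete(advertised_count: int, fetched_records: list) -> bool:
    """Return True only if exactly `advertised_count` distinct records were fetched."""
    unique_keys = {record["LookupKey"] for record in fetched_records}
    return len(fetched_records) == advertised_count and len(unique_keys) == advertised_count
```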

See the Lookup resource section in the specification for more information.

Schema Validation

RESO Certification will include strict schema validation to ensure that the data available in the payload matches what’s advertised in the metadata.

This will consist of creating a JSON Schema representation of the available metadata, including the Field and Lookup resources, and validating all responses from the server with it. JSON Schema allows an additionalProperties flag to be set to false, meaning that if any additional properties or enumerated values exist in the payload that aren’t in the metadata, schema validation will fail.

This will also include stricter validation for things like string lengths. If an Edm.String field declares itself as 100 characters and the payload has 101 characters, the provider will fail certification.
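
The idea can be sketched with a small, hand-written check that mirrors two JSON Schema keywords, additionalProperties and maxLength. This is an illustration only. A real implementation would use a full JSON Schema validator, and the function names and the input shape of `fields` are assumptions made for this example, which covers string fields only.

```python
def build_schema(fields: dict) -> dict:
    """Build a JSON Schema-like dict from {field name: {"maxLength": int}} (illustrative input)."""
    properties = {}
    for name, constraints in fields.items():
        prop = {"type": ["string", "null"]}
        if "maxLength" in constraints:
            prop["maxLength"] = constraints["maxLength"]
        properties[name] = prop
    return {"type": "object", "properties": properties, "additionalProperties": False}


def validation_errors(schema: dict, record: dict) -> list:
    """Return messages for unadvertised fields, non-string values, and over-length strings."""
    errors = []
    for name, value in record.items():
        prop = schema["properties"].get(name)
        if prop is None:
            if schema.get("additionalProperties") is False:
                errors.append(f"{name}: not advertised in the metadata")
            continue
        if value is None:
            continue  # null is allowed by the "null" type above
        if not isinstance(value, str):
            errors.append(f"{name}: expected a string")
            continue
        max_length = prop.get("maxLength")
        if max_length is not None and len(value) > max_length:
            errors.append(f"{name}: length {len(value)} exceeds maxLength {max_length}")
    return errors
```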

Additional References

The current version of the generated BDD acceptance tests from which the Sample BDD Tests above were taken may be found here.

Additional Compliance Checking

In addition to finding exact matches for Standard Resources, Fields, and Lookups, algorithmic techniques and heuristics will be used to determine potential matches with the RESO Data Dictionary standard.

In contrast with the other testing methodologies outlined in this document, the techniques used for additional compliance checking exist to enforce the stated policy that data being presented MUST match the RESO Data Dictionary format when they exist on the server. These methods will continue to be refined over time.

The methods will be published along with the Data Dictionary testing tool for transparency, community review, and to allow self-assessment by applicants prior to RESO Certification.

Informational messages will be generated in cases where potential matches with an existing Data Dictionary definition are found.

Some of the techniques used are described in the following sections.

Synonym Matching

The metadata for a given server is checked for synonyms at the resource and field level.

Synonyms MUST NOT be used at the resource or field level. If a synonym of these items is found within the server metadata, certification will fail.

Example

  Scenario: AccessCode
    Given that the following synonyms for "AccessCode" DO NOT exist in the "Property" metadata
      | GateCode |
    When "AccessCode" exists in the "Property" metadata
    Then "AccessCode" MUST be "String" data type
    And "AccessCode" length SHOULD be equal to the RESO Suggested Max Length of 25
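
The synonym rule in the scenario above could be checked along these lines. This is a sketch under stated assumptions: the `SYNONYMS` table, the function name, and the sample field set are hypothetical, and the comparison is case-sensitive, which may differ from how the testing tool compares names.

```python
# Hypothetical synonym table: standard name -> known non-standard synonyms.
SYNONYMS = {
    "AccessCode": ["GateCode"],
}


def find_synonym_violations(
    field_names: set[str], synonyms: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Return (synonym_found, standard_name) pairs present in the metadata."""
    violations = []
    for standard_name, alternates in sorted(synonyms.items()):
        for alternate in alternates:
            if alternate in field_names:
                violations.append((alternate, standard_name))
    return violations


# Field names advertised for the Property resource on an imaginary server.
property_fields = {"ListingKey", "ListPrice", "GateCode"}
for found, standard in find_synonym_violations(property_fields, SYNONYMS):
    print(f'"{found}" is a synonym of "{standard}" and MUST NOT be used')
```

Any non-empty result corresponds to a certification failure under the rule stated above.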

Similar Name Matching

Edit distance matching has been incorporated into the RESO Commander in order to find potential variations of Data Dictionary Resources and Fields. Specifically, the Levenshtein Distance method is used.

A configuration value has been provided that allows the “fuzziness” threshold to be set to a fraction of the length of each term. Currently, a name is treated as a potential match when its edit distance is less than 25% of the word length. This means that terms of 5-8 characters allow up to 1 edit, terms of 9-12 characters allow up to 2 edits, and so on. The threshold has been chosen to provide a low error rate while still producing meaningful fuzzy matching results.

Edit distance matches within the given threshold will trigger an error in the Data Dictionary Commander. Unresolved matches will not be granted exceptions and will prevent certification from proceeding.
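
The following sketch shows one way to combine Levenshtein distance with a length-based threshold. It is not the RESO Commander's code: the function names are hypothetical, the threshold formula is inferred from the character ranges given above, and measuring the threshold against the length of the standard name (rather than the local name) is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def max_allowed_edits(term_length: int) -> int:
    """Largest edit count strictly below 25% of the term length.

    Reproduces the ranges in the text: 5-8 characters -> 1, 9-12 -> 2.
    """
    return max(0, (term_length - 1) // 4)


def find_similar_names(local_names, standard_names):
    """Yield (local, standard, distance) for near misses of standard names."""
    standard = set(standard_names)
    for local in local_names:
        if local in standard:
            continue  # exact matches are standard names, not variations
        for name in standard_names:
            distance = levenshtein(local, name)
            if distance <= max_allowed_edits(len(name)):
                yield local, name, distance


matches = list(find_similar_names(["ListPrise", "ListingPrice"], ["ListPrice"]))
print(matches)  # [('ListPrise', 'ListPrice', 1)]
```

Here "ListPrise" is one substitution away from the 9-character name "ListPrice" and is flagged, while "ListingPrice" needs three edits and falls outside the 2-edit limit.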

Due to the probabilistic nature of “fuzzy matching,” some false positives may be generated when local terminology too closely resembles RESO Standard items.

Applicants are expected to provide corrections through the new variations review process.

Certification Workflow

The Certification workflow has been optimized around self-assessment prior to certification.

Self Assessment

It’s expected that applicants will ensure they pass all RESO Data Dictionary tests and have reviewed results to their satisfaction prior to applying for certification.

Guides exist to help them with the evaluation process.

Any questions regarding automated testing tools and revised certification procedures should be directed to dev@reso.org. For any other questions, or to start the certification process, please contact RESO Certification.

Application

Those seeking RESO Certification will apply with the Membership Department prior to having their application reviewed by the Certification Department. Once an application has been processed, RESO will confirm the outcome of the automated testing tools using a RESOScript provided by the vendor, as described in the next section.

Certification Issuance

A RESOScript file is required for testing. This file should contain credentials and the service location of the Web API Server instance hosting the Data Dictionary metadata to be tested. See sample Data Dictionary RESOScript file.

Reporting

Data Collection

Metadata for a given server instance will be consumed by the RESO Commander in the OData XML CSDL metadata format but is not stored locally. Data analysis is done in memory and discarded upon termination of the application so applicants’ source code is not retained.

A report will be generated when a certification application is processed that will contain statistics about what was found on a server when the testing tool was run. The report will be used to help the RESO Certification Department and the applicant evaluate results. The report will be emailed to the applicant and kept on file at RESO as proof of certification.

The RESO Commander will also produce summary test statistics in the JSON format with the results of each test step and include relevant data such as Resources, Fields, and Lookups found during testing. These reports will be uploaded into a RESO data collection service for the purpose of analytics.

Data Collection Pipeline

Test data will be collected for analytics purposes. This information will be stored on a cloud drive in order to catalog results.

Once test results are stored, they are sent to a collector service for analysis. The collector will be implemented in Elasticsearch.

While the Collector Service and ancillary reports will be delivered after the MVP testing tool, test data will be available from an API so that analytics may be shown on the RESO Certification Map during the initial release of the Data Dictionary testing tool.

RESO Certification Map

Certification results will be published to the RESO Certification Map, which shows information about certified applicants in a geographical manner.

This information includes, but is not limited to, (1) a report showing the RESO Standard Resources, Fields, and Lookups in relation to the total number available on a per-resource basis; and, once enough aggregate data have been collected, (2) a field comparison report showing how an applicant scored relative to the market average, as shown in the following diagram:

RESO Data Compatibility Report

A comparison tool will be created to show alignment between resources, fields, and lookups between two or more RESO certified organizations. This will be useful for planning conversions and data shares, among other things.

While the reporting format has yet to be decided, conceptually the tool will find the intersection and difference between sets of resources, fields, and lookups between organizations. The information needed to produce these reports will be produced upon the initial release of the Data Dictionary testing tool, and a web-based UI will be created at a later time.
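
Conceptually, the comparison reduces to set operations. The sketch below assumes each organization's metadata has already been flattened into a set of (resource, field) pairs. The function name, the report keys, and the sample data are hypothetical, and only the two-organization case is shown.

```python
def compare_metadata(
    first: set[tuple[str, str]], second: set[tuple[str, str]]
) -> dict[str, set[tuple[str, str]]]:
    """Split two sets of (resource, field) pairs into shared and unique items."""
    return {
        "shared": first & second,       # intersection: available in both
        "only_first": first - second,   # difference: missing from the second
        "only_second": second - first,  # difference: missing from the first
    }


# Hypothetical metadata for two certified organizations.
org_a = {("Property", "ListPrice"), ("Property", "AccessCode"), ("Member", "MemberKey")}
org_b = {("Property", "ListPrice"), ("Member", "MemberKey"), ("Office", "OfficeKey")}

report = compare_metadata(org_a, org_b)
for label, items in report.items():
    print(label, sorted(items))
```

The same approach extends to lookups by adding the lookup value to each tuple, and to more than two organizations by intersecting the sets pairwise or all at once.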

RESO Analytics Dashboard

An analytics dashboard will be populated with testing data, and will be driven by Kibana, a popular real-time analytics tool. This dashboard will be available to RESO staff and workgroup chairs for planning purposes and to provide information regarding adoption of RESO standards.

Display of Information on RESO Website

RESO may use anonymous aggregates collected during the certification process for display on its public websites. These items consist of Resource, Field, and Enumeration tallies but will not be displayed for a given area so as not to reveal the source, unless permission is specifically granted. Aggregate summary reports will be available at the Resource, Field, and Enumeration level.

For example:

Data Retention Policies

Applicants and certification recipients have the right to be forgotten.

At the time of writing, the Data Dictionary testing tool does not store any information during automated testing aside from generating a local log during runtime and producing JSON-based test results used for reporting.

RESO will be retrieving and saving server metadata in XML (EDMX) format at the time of Data Dictionary Certification for further analysis and to show what was retrieved from the server at the time of testing in case future questions arise. Metadata will be stored securely in the cloud and not available publicly. Information about resources, fields, and lookups found in the metadata during certification will be created as a derivative report.

Feature Requests

Feature requests can be submitted as issues on the RESO Commander’s GitHub project or by contacting the RESO development team.

Support

To apply for certification, or for help with an existing application, please contact RESO Certification.

For questions about revised certification procedures or help with RESO’s automated testing tools, please contact RESO’s dev support.


Section 4: Contributors

This document was written by Joshua Darnell.

Thanks to the following contributors for their help with this project:

| Contributor | Company |
| Sergio Del Rio | Templates for Business, Inc. |
| Eric Finlay | Zillow Group |
| Dylan Gmyrek | FBS |
| Rob Larson | Larson Consulting, LLC |
| Paul Stusiak | Falcon Technologies Corp. |
| Cody Gustafson | FBS |

Many thanks to those who contributed to the RESO Data Dictionary specification, including volunteers from the RESO Data Dictionary and Transport Workgroups.

If you would like to contribute, please contact RESO Development. This could mean anything from QA or beta testing to technical writing to doing code reviews or writing code.


Section 5: References

Please see the following references for more information regarding topics covered in this document:


Section 6: Appendices

The following RCPs are related to Data Dictionary 2.0:


Section 7: License

This document is covered by the RESO EULA.

Please contact RESO if you have any questions.