Data exchange formats - Stefanos Cloud Data Exchange Specifications

This post is also available in audio format on my podcast. The post discusses the most common and widely used data exchange formats.

There are various scenarios in which both software and systems engineers need a way to exchange data between users, systems, scripts and applications using mutually understandable protocols and formats. The most widely accepted data exchange formats are the following:

XML
YAML
CSV
JSON

XML

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. Several schema systems exist to aid in the definition of XML-based languages, and programmers have developed many application programming interfaces (APIs) for processing XML data. An XML schema is a description of a type of XML document, typically expressed as constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. Some languages were developed specifically to express XML schemas; the document type definition (DTD) language, which is part of the XML specification itself, is one such schema language. The XML specification is defined at: https://www.w3.org/TR/REC-xml/
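As a minimal sketch of what processing XML through an API can look like, the following Python snippet parses a small document with the standard-library xml.etree.ElementTree module. The document itself (the servers, server and port element names and the name and role attributes) is a made-up example, not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory document; element and attribute names are
# illustrative assumptions, not part of any standard schema.
xml_text = """<servers>
  <server name="web01" role="frontend"><port>443</port></server>
  <server name="db01" role="database"><port>5432</port></server>
</servers>"""

# Parse the text into an element tree and get the root element.
root = ET.fromstring(xml_text)

# Build a mapping of server name -> port from attributes and child elements.
servers = {
    server.get("name"): int(server.findtext("port"))
    for server in root.findall("server")
}

print(root.tag)
print(servers)
```

Note that this only checks that the document is well-formed. Validating it against a schema such as a DTD is a separate step that this sketch does not perform.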

YAML

YAML (a recursive acronym for "YAML Ain't Markup Language") is a human-readable data-serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted. It supports both Python-style indentation to indicate nesting and a more compact format that uses [] for lists and {} for maps. Because of this compact format, YAML 1.2 is a superset of JSON. More details about the YAML specification can be found at: https://yaml.org/
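To illustrate the two styles, the sketch below writes the same made-up record once in indentation-based form and once in the compact form. Python's standard library has no YAML parser, so the snippet only parses the compact form, and it does so with the standard json module; this works because that particular text (with quoted strings) is valid JSON and valid YAML 1.2 at the same time. The record's field names are assumptions chosen for the example.

```python
import json

# The same hypothetical record written in two ways.

# Indentation-based style: valid YAML, but not valid JSON.
block_style = """\
name: web01
ports:
  - 80
  - 443
"""

# Compact style with quoted strings: valid YAML 1.2 and valid JSON.
flow_style = '{"name": "web01", "ports": [80, 443]}'

# Parsed here with the standard-library JSON parser. Reading block_style
# would require a YAML parser, which is not part of the standard library.
record = json.loads(flow_style)

print(record["name"])
print(record["ports"])
```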

CSV

A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas.
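A short sketch of reading and writing records with Python's standard-library csv module follows. The column names and values are invented for the example. It also shows one common convention that the description above does not cover: a field that itself contains a comma is wrapped in double quotes so that it is not split.

```python
import csv
import io

# Hypothetical data: a header line followed by two records.
# The second record has a field containing a comma, so it is quoted.
csv_text = (
    "name,role,port\n"
    "web01,frontend,443\n"
    'db01,"database, primary",5432\n'
)

# DictReader uses the first line as field names and yields one dict per record.
rows = list(csv.DictReader(io.StringIO(csv_text, newline="")))
print(rows[1]["role"])

# Writing: the writer adds quotes only where a field needs them.
out = io.StringIO(newline="")
writer = csv.writer(out)
writer.writerow(["db01", "database, primary", 5432])
print(out.getvalue())
```

All values come back from the reader as strings, so numeric fields such as port need an explicit conversion if they are to be used as numbers.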

JSON

JavaScript Object Notation (JSON) is an open-standard file format and data interchange format that uses human-readable text to transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value). JSON is also used extensively in cloud computing services and applications to achieve Infrastructure as Code (IaC). IaC enables cloud administrators to design and provision cloud infrastructure in an automated fashion. JSON files are used as IaC templates that describe the resources to be provisioned in the cloud infrastructure, along with the features and dependencies of these resources. One example is Azure Resource Manager (ARM) templates, which are written in JSON format. More details about the JSON specification can be found at: https://www.json.org/json-en.html
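The sketch below shows the general idea of a JSON template that lists resources and their dependencies, using Python's standard-library json module. The template layout and its keys (resources, name, type, dependsOn) are simplified assumptions made for illustration; this is not the actual ARM template schema.

```python
import json

# Hypothetical, heavily simplified template. The keys are illustrative
# assumptions and do NOT follow the real ARM template schema.
template = {
    "resources": [
        {"name": "vnet01", "type": "network", "dependsOn": []},
        {"name": "vm01", "type": "virtualMachine", "dependsOn": ["vnet01"]},
    ]
}

# Serialize the Python structure to JSON text, as it would be stored in a file.
text = json.dumps(template, indent=2)
print(text)

# Parse the text back and collect each resource's dependencies.
parsed = json.loads(text)
dependencies = {r["name"]: r["dependsOn"] for r in parsed["resources"]}
print(dependencies)
```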

HTTP Content-Type

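There are various content types available. One place where content types are used is the Content-Type header of the HTTP protocol, which declares the format of the message body, for example application/json, application/xml or text/csv. The Content-Type header field is supported by both the HTTP/1.1 and HTTP/2 protocols.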
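As a rough sketch of how a receiver might use the content type to pick a parser, the following Python function reads a Content-Type header value and decodes the body accordingly. The function name parse_body and the choice of supported types are assumptions made for this example. Header parsing is delegated to the standard-library email.message.Message class, which understands the "type/subtype; parameter=value" syntax.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from email.message import Message


def parse_body(content_type_header: str, body: bytes):
    """Decode body according to a Content-Type header value (hypothetical helper)."""
    # Let the standard library split the media type from its parameters.
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()               # e.g. "application/json"
    charset = msg.get_content_charset() or "utf-8"    # assume UTF-8 if absent

    text = body.decode(charset)
    if media_type == "application/json":
        return json.loads(text)
    if media_type in ("application/xml", "text/xml"):
        return ET.fromstring(text)
    if media_type == "text/csv":
        return list(csv.reader(io.StringIO(text, newline="")))
    raise ValueError(f"unsupported content type: {media_type}")


print(parse_body("application/json; charset=utf-8", b'{"ok": true}'))
print(parse_body("text/csv", b"a,b\n1,2\n"))
```

YAML is left out of this sketch only because the Python standard library does not ship a YAML parser.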