Understanding the glTF 2.0 Format

Introduction

glTF stands for GL Transmission Format.

glTF is a standardized file format for storing and loading 3D scenes. Its fundamental intent is to be easily generated by a 3D creation tool and consumed by any graphics application with minimal processing, regardless of the API used.

Its main difference from other formats is that glTF makes it a top priority for its data to be GPU-ready, meaning fewer processing steps are needed to format, adapt or interpret the data in the file before feeding it to the GPU.

Once you get to know the file format, you will appreciate how well the whole asset-generation flow clicks, from the 3D editing tool all the way to feeding its output into your graphics pipeline. This, in my opinion, is the magic of glTF: it bridges the gap between your design and your implementation in a portable and standardized way.

glTF’s support for animations and physically based materials aligns well with what I want to achieve with my Physically Based Renderer. I’ll use the TinyGLTF library (https://github.com/syoyo/tinygltf) to import and manage the assets, but I will include an overview of the format in this post.
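As a quick preview of TinyGLTF, loading a file takes only a few lines. This is a minimal sketch, assuming a recent TinyGLTF version (the LoadASCIIFromFile signature has changed slightly between releases) and a placeholder file name:

#define TINYGLTF_IMPLEMENTATION
#define STB_IMAGE_IMPLEMENTATION
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include "tiny_gltf.h"

#include <iostream>
#include <string>

int main()
{
    tinygltf::TinyGLTF loader;
    tinygltf::Model model;
    std::string err;
    std::string warn;

    // Parses the JSON and pulls in the referenced .bin buffers and images.
    bool loaded = loader.LoadASCIIFromFile(&model, &err, &warn, "Content/DefaultCube.gltf");
    if (!warn.empty()) std::cout << "Warning: " << warn << std::endl;
    if (!err.empty())  std::cout << "Error: " << err << std::endl;
    return loaded ? 0 : -1;
}

The rest of this post, however, walks through the format by hand, so you can see exactly what a loader like TinyGLTF is doing for you.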

File Format

glTF consists of:

  • JSON file: Describes the scene, nodes and their hierarchy; meshes, materials, cameras, light sources. It also contains pointers to binary and image data.
  • Binary Data: The actual geometry and animation data from the scene.
  • Image Files: Image Data stored as JPG or PNG.
Figure 1: glTF 2.0 overview, from Khronos’ glTF 2.0 spec

Generating our first glTF 2.0 file

Blender 2.8, a free 3D modelling tool, has a top-notch glTF 2.0 exporter:

https://docs.blender.org/manual/en/2.80/addons/io_scene_gltf2.html

Using its exporter, I generated a glTF 2.0 file from the starting scene:

The scene description is:

1.- A cube of 2 m per side with its center located at the origin. It has 8 vertices (the exporter might change the number and order of the vertices):

  1. (-1, -1, 1)
  2. (1, -1, 1)
  3. (-1, -1, -1)
  4. (1, -1, -1)
  5. (-1, 1, 1)
  6. (1, 1, 1)
  7. (-1, 1, -1)
  8. (1, 1, -1)

The cube has a default material assignment.

2.- A camera with:

  1. Position in (7.35889 m, -6.92579 m, 4.95831 m)
  2. Rotation of (0.483536, 0.208704, 0.336872, 0.780483)

3.- A light source with:

  1. Position in (4.07625 m, 1.00545 m, 5.90386 m)
  2. Rotation of (0.169076, 0.272171, 0.75588, 0.570948)
Figure 2: Blender’s Starting scene, a cube, a camera and a light source
Figure 3. The export configuration I used for this scene. Notice the +Y up setting: Blender works with a +Z up convention, so the exporter will convert that for us.

Once you press the export button, you will have 2 files:

*.gltf

*.bin

Open the .gltf file and take a look; it is divided into several sections:

assets and scenes

 "asset" : {
        "generator" : "Khronos glTF Blender I/O v1.1.46",
        "version" : "2.0"
    },
    "scene" : 0,
    "scenes" : [
        {
            "name" : "Scene",
            "nodes" : [
                0,
                1,
                2
            ]
        }
    ],
  • “asset” describes the generator used to create the file and the glTF version, in our case 2.0.
  • “scene” specifies which of the available scenes, from the scene array, will be shown at load time. In the file we just generated there is only one scene, so it will be 0.
  • “scenes” is the scene array; each entry describes the objects that need to be rendered to show that scene. Those objects are called “nodes”, and our “nodes” tag within “scenes” has [0, 1, 2] as its values.

nodes

    "nodes" : [
        {
            "mesh" : 0,
            "name" : "Cube"
        },
        {
            "name" : "Light",
            "rotation" : [
                0.16907551884651184,
                0.7558804154396057,
                -0.2721710503101349,
                0.5709475874900818
            ],
            "translation" : [
                4.076250076293945,
                5.903860092163086,
                -1.0054500102996826
            ]
        },
        {
            "name" : "Camera",
            "rotation" : [
                0.483536034822464,
                0.33687159419059753,
                -0.20870360732078552,
                0.7804827094078064
            ],
            "translation" : [
                7.358890056610107,
                4.958310127258301,
                6.925789833068848
            ]
        }
    ],
  • “nodes” are the actual objects that will be rendered. The “name” field is optional, but most 3D modelling tools will generate a name per object; notice how those names correspond to the names in our Blender scene.
    • “mesh” is the first property of our node. A “mesh” is the geometry of our object and will be described later on; here we specify the index of the mesh, in the “meshes” array, that this node uses.
    • The nodes also contain the initial transformation of those objects, specifically for our camera and our light. Remember that both had positions and rotations? They are included here too; just notice 3 things:
      • The Y and Z axes have been swapped: Blender’s Z becomes glTF’s Y, and Blender’s -Y becomes glTF’s Z, matching the +Y up export setting.
      • In the JSON file the rotation is ordered X, Y, Z, W, while Blender displays it as W, X, Y, Z.
      • Rotation is stored as a quaternion, not as a matrix or as Euler angles. We need to keep that in mind when coding our renderer; see the short glm sketch after this list.
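As a quick illustration of that last point, here is a minimal sketch of turning a node’s “rotation” and “translation” into a transform matrix with glm (the same library used later in this post); note that glm::quat is constructed with w first, while the glTF array is stored x, y, z, w:

#include <glm/glm.hpp>
#include <glm/gtc/quaternion.hpp>

// gltfRotation is the "rotation" array (x, y, z, w),
// gltfTranslation is the "translation" array (x, y, z).
glm::mat4 NodeTransform(const float gltfRotation[4], const float gltfTranslation[3])
{
    // glm::quat takes (w, x, y, z).
    glm::quat rotation(gltfRotation[3], gltfRotation[0], gltfRotation[1], gltfRotation[2]);
    glm::mat4 transform = glm::mat4_cast(rotation);
    // glm matrices are column-major; the fourth column holds the translation.
    transform[3] = glm::vec4(gltfTranslation[0], gltfTranslation[1], gltfTranslation[2], 1.0f);
    return transform;
}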

materials

    "materials" : [
        {
            "doubleSided" : true,
            "emissiveFactor" : [
                0,
                0,
                0
            ],
            "name" : "Material",
            "pbrMetallicRoughness" : {
                "baseColorFactor" : [
                    0.800000011920929,
                    0.800000011920929,
                    0.800000011920929,
                    1
                ],
                "metallicFactor" : 0,
                "roughnessFactor" : 0.4000000059604645
            }
        }
    ],
  • “materials” is the next entry; it specifies the properties of the surfaces our geometry is made of.
    • “doubleSided” specifies whether back-face culling is disabled; when it is false, faces are culled depending on their winding order. This setting can be specified directly in your modelling tool; in Blender, it is in the Material Settings.
Figure 4. Specifying the culling parameter in Blender
  • “emissiveFactor” is not used by our sample, so it stays at [0, 0, 0].
  • “name” is the material name; it matches the material name you assigned in Blender.
  • “pbrMetallicRoughness” holds the parameter values for Base Color, Metallic and Roughness used in PBR rendering; they match the settings of your material node:
Figure 5. Blender’s material settings: Base Color R: 0.8, G: 0.8, B: 0.8, A: 1.0; Metallic set to 0; Roughness set to 0.4.
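As a small sketch, these values can be pulled out with nlohmann/json (introduced in the next section) into the pbrMaterial struct defined further down, in the Storing the Data section; the defaults used here are the ones the glTF spec prescribes when a value is absent:

json materials = cubeFile["materials"];
json pbr = materials[0]["pbrMetallicRoughness"];

pbrMaterial material;
// The spec defaults: baseColorFactor [1, 1, 1, 1], metallic 1.0, roughness 1.0.
material.color = glm::vec4(1.0f);
if (pbr.find("baseColorFactor") != pbr.end())
{
    json baseColor = pbr["baseColorFactor"];
    material.color = glm::vec4(baseColor[0].get<float>(), baseColor[1].get<float>(),
                               baseColor[2].get<float>(), baseColor[3].get<float>());
}
material.metallicFactor = pbr.value("metallicFactor", 1.0f);
material.roughnessFactor = pbr.value("roughnessFactor", 1.0f);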

meshes

    "meshes" : [
        {
            "name" : "Cube",
            "primitives" : [
                {
                    "attributes" : {
                        "POSITION" : 0,
                        "NORMAL" : 1,
                        "TEXCOORD_0" : 2
                    },
                    "indices" : 3,
                    "material" : 0
                }
            ]
        }
    ],
  • “meshes”: this property describes our geometry as primitives, each built from several properties; our cube will have:
    • Attributes:
      • Vertex Positions: “POSITION”
      • Vertex Normals: “NORMAL”
      • Texture Coordinates: “TEXCOORD_0”
    • Indices that will define the vertices that create each triangle.
    • Material that our mesh will use.

accessors

    "accessors" : [
        {
            "bufferView" : 0,
            "componentType" : 5126,
            "count" : 24,
            "max" : [
                1,
                1,
                1
            ],
            "min" : [
                -1,
                -1,
                -1
            ],
            "type" : "VEC3"
        },
        {
            "bufferView" : 1,
            "componentType" : 5126,
            "count" : 24,
            "type" : "VEC3"
        },
        {
            "bufferView" : 2,
            "componentType" : 5126,
            "count" : 24,
            "type" : "VEC2"
        },
        {
            "bufferView" : 3,
            "componentType" : 5123,
            "count" : 36,
            "type" : "SCALAR"
        }
    ],
  • “accessors” are the way we access the binary file, where all the heavy data is stored, and retrieve the data we need to render our geometry. The cube’s binary file has:
    • Vertex Positions, Normals, Texture Coordinates and Indices.
    • Each accessor fetches its data from a buffer view: it fetches a number (“count”) of elements of a given data type (“type”), and each component of that data type has a certain “componentType” (see the small helper sketch after this list).
      • The first accessor will fetch 24 (“count”) VEC3s (“type”); each component of a VEC3 is a float (“componentType” 5126), so in total it will fetch 24*3 floats.
      • The second accessor fetches the same amount of data, but from a different buffer view; this corresponds to our vertex normals.
      • The third accessor fetches 24 VEC2s of type float; these are the texture coordinates.
      • The last accessor fetches 36 unsigned shorts (“componentType” 5123); these are the scalar indices that will assemble our triangles.
      • For more information on accessor types: https://github.com/KhronosGroup/glTF/tree/master/specification/2.0#accessors
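To make the count/type/componentType relationship concrete, here is a small helper sketch that computes how many bytes an accessor covers (only the componentTypes and types used by this cube are handled; 5126 is FLOAT and 5123 is UNSIGNED_SHORT):

#include <cstddef>
#include <cstdint>
#include <string>

// Size in bytes of a single component, from the glTF componentType code.
size_t ComponentSize(uint32_t componentType)
{
    switch (componentType)
    {
    case 5123: return 2; // UNSIGNED_SHORT
    case 5126: return 4; // FLOAT
    default:   return 0; // other componentTypes are not used by this cube
    }
}

// Number of components per element, from the accessor "type" string.
size_t ComponentCount(const std::string &type)
{
    if (type == "SCALAR") return 1;
    if (type == "VEC2")   return 2;
    if (type == "VEC3")   return 3;
    return 0;
}

// Total bytes an accessor covers, e.g. the position accessor:
// 24 (count) * 3 (VEC3) * 4 (FLOAT) = 288 bytes, which matches bufferView 0.
size_t AccessorByteSize(size_t count, const std::string &type, uint32_t componentType)
{
    return count * ComponentCount(type) * ComponentSize(componentType);
}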

buffers and bufferViews

    "bufferViews" : [
        {
            "buffer" : 0,
            "byteLength" : 288,
            "byteOffset" : 0
        },
        {
            "buffer" : 0,
            "byteLength" : 288,
            "byteOffset" : 288
        },
        {
            "buffer" : 0,
            "byteLength" : 192,
            "byteOffset" : 576
        },
        {
            "buffer" : 0,
            "byteLength" : 72,
            "byteOffset" : 768
        }
    ],
  • “bufferViews” represent contiguous chunks of data within a buffer. They point to a specific buffer using an index (“buffer”), and the data within that buffer is delimited by “byteOffset” and “byteLength”.
  • bufferView 0 points to buffer 0; it has an offset of 0 and is 288 bytes long. Matching this with our accessors, you can see that the vertex positions are stored in this part of the buffer and accessed through this buffer view.
    "buffers" : [
        {
            "byteLength" : 840,
            "uri" : "DefaultCube.bin"
        }
    ]

“buffers” specifies where our data is located and how big it is. This means our cube’s geometry is stored in “DefaultCube.bin”.
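Instead of seeking into the .bin file once per attribute (the approach used later in this post), another option is to read the whole buffer into memory once and treat each bufferView as a (byteOffset, byteLength) slice of it; a minimal sketch:

#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads the entire buffer pointed to by "uri" into memory; byteLength comes
// from the "buffers" entry (840 for DefaultCube.bin).
std::vector<char> ReadBuffer(const std::string &uri, size_t byteLength)
{
    std::ifstream binFile(uri, std::ios_base::binary);
    std::vector<char> buffer(byteLength);
    binFile.read(buffer.data(), buffer.size());
    return buffer;
}

// Usage: bufferView 0 (the vertex positions) starts at byteOffset 0 and is 288 bytes long.
// std::vector<char> buffer = ReadBuffer("Content/DefaultCube.bin", 840);
// const char *positions = buffer.data() + 0;

Reading the buffer once also mirrors how the data will eventually be uploaded to the GPU in one contiguous piece.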

Fetching the data to a C++ application – using nlohmann/json

We now understand where the data we need lives, but we still need a way to fetch it.

There are multiple JSON parsers out there, but I decided to use this one:

https://github.com/nlohmann/json

It can be used as a single header and it is easy to use.

Once you have the “nlohmann/json.hpp” header included and the alias added, you can use the code below to parse your whole glTF file into a queryable json object.

#include "nlohmann/json.hpp"

using json = nlohmann::json;

int main()
{
    std::ifstream input("Content/DefaultCube.gltf");
    if (!input)
    {
        std::cout << "Could not find gltf file" << std::endl;
        return -1;
    }

    json cubeFile;
    input >> cubeFile;

Once you have this object, you can access the properties one by one if needed:

To extract the “scenes” property for example:

json scenes = cubeFile["scenes"];

If you print out the content of scenes you will get this:

[{"name":"Scene","nodes":[0,1,2]}]

The library also makes it easy to access the individual values stored in there. Let’s say you want to access the node indices from the “scenes” property and store them in an integer vector:

    json nodeOverview = scenes[0]["nodes"];
    std::vector<uint32_t> nodeIndices;
    nodeIndices.resize(nodeOverview.size());
    for (uint32_t i = 0; i < nodeIndices.size(); i++)
    {
        nodeIndices[i] = nodeOverview[i];
    }
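As a side note, nlohmann/json can also convert a whole array in a single call, which is equivalent to the loop above:

    std::vector<uint32_t> nodeIndices = scenes[0]["nodes"].get<std::vector<uint32_t>>();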

With this handy JSON tool from Niels Lohmann we can go ahead and start filling data structures out of the JSON and the .bin file.

For this Project we will extract:

Mesh – positions, normals, texCoords, indices and material.

Camera – position and rotation.

Light – position and rotation.

The only one that requires additional information pulled out of the .bin file is the mesh.
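For the camera and the light nothing has to be pulled from the .bin file; their translation and rotation come straight out of the “nodes” JSON. A minimal sketch for the camera, using glm (introduced in the next section) for the vector and quaternion types; looking the node up by name is just for illustration, and glm::quat lives in glm/gtc/quaternion.hpp:

json nodes = cubeFile["nodes"];
glm::vec3 cameraPosition(0.0f);
glm::quat cameraRotation(1.0f, 0.0f, 0.0f, 0.0f); // identity, ordered (w, x, y, z)

for (uint32_t i = 0; i < nodes.size(); i++)
{
    if (nodes[i].value("name", "") != "Camera")
        continue;

    json t = nodes[i]["translation"];
    cameraPosition = glm::vec3(t[0].get<float>(), t[1].get<float>(), t[2].get<float>());

    // glTF stores the quaternion as (x, y, z, w); glm::quat takes (w, x, y, z).
    json r = nodes[i]["rotation"];
    cameraRotation = glm::quat(r[3].get<float>(), r[0].get<float>(),
                               r[1].get<float>(), r[2].get<float>());
}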

Storing the Data

For now, we’ll store it in easily accessible structures. I used glm (https://glm.g-truc.net/0.9.9/index.html) for its vector and matrix types and its many math functions; glm is handy because it already includes definitions for vec3 and vec2 that map directly to the VEC3 and VEC2 types in our glTF file. As you can see from the structures below, I will use vectors of these types to store my mesh data.

#include <glm/glm.hpp>

#include <cstdint>
#include <string>
#include <vector>

struct pbrMaterial
{
    glm::vec4 color;
    float metallicFactor;
    float roughnessFactor;
};

struct mesh
{
    std::string name;
    std::vector<glm::vec3> positions;
    std::vector<glm::vec3> normals;
    std::vector<glm::vec2> texCoords;
    std::vector<uint16_t> indices;
    std::vector<pbrMaterial> materials;
};

Getting the mesh data out of the file requires us to:

  • Read the nodes property and parse it to check for the “mesh” property within:
json nodes = cubeFile["nodes"];
for (uint32_t i = 0; i < nodes.size(); i++)
{
  if (nodes[i].find("mesh") != nodes[i].end())
  {
    //There is a mesh in this node:
  • If there is a “mesh” property in the node, store the mesh index it points to; this index tells you which array element to pick from the “meshes” property:
json node = nodes[i];
int meshIndex = node["mesh"];
  • Using this index, you can now parse the “meshes” property and look for the attributes we want within its “primitives” property; once you find the attribute you are looking for, it gives you the index into the “accessors” array:
json meshes = cubeFile["meshes"];
json meshPrimitives = meshes[meshIndex]["primitives"];

uint32_t positionAccessorIndex = 0;
if (meshPrimitives[0]["attributes"].find("POSITION") !=
    meshPrimitives[0]["attributes"].end())
{
  positionAccessorIndex = meshPrimitives[0]["attributes"]["POSITION"];
}
  • Now that you know the accessor index, you have everything you need to fetch the actual data from the binary file:
    • accessors[positionAccessorIndex][“count”] tells you how many elements of the data type you need to allocate.
    • accessors[positionAccessorIndex][“bufferView”] tells you which buffer view index to access, bufferViewIndex.
    • bufferViews[bufferViewIndex][“byteLength”] gives you the number of bytes you will fetch from the buffer.
    • bufferViews[bufferViewIndex][“byteOffset”] provides the offset at which you will find your attribute within the buffer.
json accessors = cubeFile["accessors"];
json bufferViews = cubeFile["bufferViews"];

std::ifstream binFile("Content/DefaultCube.bin", std::ios_base::binary);

// Allocate enough vec3s to hold "count" positions, then read the raw bytes
// that the position bufferView points to straight into that vector.
mesh cubeMesh;
cubeMesh.positions.resize(accessors[positionAccessorIndex]["count"]);

uint32_t bufferViewIndex = accessors[positionAccessorIndex]["bufferView"];
uint32_t bytesCount = bufferViews[bufferViewIndex]["byteLength"];
uint32_t byteOffset = bufferViews[bufferViewIndex]["byteOffset"];
binFile.seekg(byteOffset);
binFile.read((char *)cubeMesh.positions.data(), bytesCount);

After this step, your positions are ready to be sent to any graphics API.
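Normals, texture coordinates and indices follow the exact same pattern; only the accessor index and the element type change. For example, the indices (36 unsigned shorts, matching the uint16_t vector in our mesh struct) could be read like this, looking up the accessor through the primitive’s “indices” field the same way we did for POSITION:

uint32_t indicesAccessorIndex = meshPrimitives[0]["indices"];

cubeMesh.indices.resize(accessors[indicesAccessorIndex]["count"]);

uint32_t indexBufferViewIndex = accessors[indicesAccessorIndex]["bufferView"];
uint32_t indexBytesCount = bufferViews[indexBufferViewIndex]["byteLength"];
uint32_t indexByteOffset = bufferViews[indexBufferViewIndex]["byteOffset"];
binFile.seekg(indexByteOffset);
binFile.read((char *)cubeMesh.indices.data(), indexBytesCount);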

Feeding the data to the Vulkan Graphics Pipeline

Most of that code was directly stolen from:

https://github.com/Overv/VulkanTutorial/blob/master/code/27_model_loading.cpp

I removed its tinyobjloader dependency and added our own primitive glTF parser to get the vertex and normal data.
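Since positions, normals and texture coordinates ended up in separate vectors (they are not interleaved), one straightforward option is to bind them as three separate vertex buffers. A sketch of what the vertex input description could look like under that assumption (the names here are illustrative, not part of the tutorial code):

#include <vulkan/vulkan.h>
#include <glm/glm.hpp>
#include <array>

// Binding 0 = positions (vec3), 1 = normals (vec3), 2 = texCoords (vec2).
std::array<VkVertexInputBindingDescription, 3> GetBindingDescriptions()
{
    std::array<VkVertexInputBindingDescription, 3> bindings{};
    bindings[0] = { 0, sizeof(glm::vec3), VK_VERTEX_INPUT_RATE_VERTEX };
    bindings[1] = { 1, sizeof(glm::vec3), VK_VERTEX_INPUT_RATE_VERTEX };
    bindings[2] = { 2, sizeof(glm::vec2), VK_VERTEX_INPUT_RATE_VERTEX };
    return bindings;
}

// One attribute per binding; offset 0 because each buffer is tightly packed.
std::array<VkVertexInputAttributeDescription, 3> GetAttributeDescriptions()
{
    std::array<VkVertexInputAttributeDescription, 3> attributes{};
    attributes[0] = { 0, 0, VK_FORMAT_R32G32B32_SFLOAT, 0 }; // position
    attributes[1] = { 1, 1, VK_FORMAT_R32G32B32_SFLOAT, 0 }; // normal
    attributes[2] = { 2, 2, VK_FORMAT_R32G32_SFLOAT, 0 };    // texCoord
    return attributes;
}

The index buffer is bound with VK_INDEX_TYPE_UINT16, since the cube’s indices are stored as unsigned shorts.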

Figure 6. The default cube being rendered by a Vulkan Graphics pipeline.