# Frame Conventions
In this note, we describe the coordinate frame conventions used in `viser`.
## Scene tree naming
Each object that we add to the scene in viser is instantiated as a node in a
scene tree. The structure of this tree is determined by the names assigned to
the nodes.
If we add a coordinate frame called `/base_link/shoulder/wrist`, it signifies
three nodes: the `wrist` is a child of the `shoulder` which is a child of the
`base_link`.
If we set the transformation of a given node like `/base_link/shoulder`, both
it and its child `/base_link/shoulder/wrist` will move. Its parent,
`/base_link`, will be unaffected.
## Poses
Poses in `viser` are defined using a pair of fields:
- `wxyz`, a unit quaternion orientation term. This should always be 4D.
- `position`, a translation term. This should always be 3D.
These correspond to a transformation from coordinates in the local frame to the
parent frame:
.. math::
p_\mathrm{parent} = \begin{bmatrix} R & t \end{bmatrix}\begin{bmatrix}p_\mathrm{local} \\ 1\end{bmatrix}
where `wxyz` is the quaternion form of the :math:`\mathrm{SO}(3)` matrix
:math:`R` and `position` is the :math:`\mathbb{R}^3` translation term
:math:`t`.
## World coordinates
In the world coordinate space, +Z points upward by default. This can be
overridden with :func:`viser.SceneApi.set_up_direction()`.
## Cameras
In `viser`, all camera parameters exposed to the Python API use the
COLMAP/OpenCV convention:
- Forward: +Z
- Up: -Y
- Right: +X
Confusingly, this is different from Nerfstudio, which adopts the OpenGL/Blender
convention:
- Forward: -Z
- Up: +Y
- Right: +X
Conversion between the two is a simple 180 degree rotation around the local X-axis.