**Coordinate systems in OmniPhotos** ******************************************************************** * ^ y y ^ * * ^ z | | ^ z * * / | | / * * / | |/ * * +-------> x +-------> x x <-------+ * * | / * * | COLMAP + / OpenGL + Hartley & * * | OpenVSLAM z v OmniPhotos Zisserman * * v y ERP * ******************************************************************** Input =============================================================================== COLMAP coordinate system ------------------------------------------------------------------------------- > "The reconstructed pose of an image is specified as the projection from > world to the camera coordinate system of an image using a quaternion (QW, QX, QY, QZ) > and a translation vector (TX, TY, TZ). The quaternion is defined using the Hamilton > convention, which is, for example, also used by the Eigen library. The coordinates of > the projection/camera center are given by `-R^t * T`, where `R^t` is the inverse/transpose > of the 3x3 rotation matrix composed from the quaternion and `T` is the translation vector. > The local camera coordinate system of an image is defined in a way that the X axis points > to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image." > -- https://colmap.github.io/format.html OpenVSLAM ------------------------------------------------------------------------------- * z forward, x right, y down (same as COLMAP) – inferred from [source code](https://github.com/xdspacelab/openvslam/blob/5a0b1a5f52b4d29b699624052c9d5dc4417d9882/src/openvslam/camera/equirectangular.h#L46): ~~~~~~ cv::KeyPoint convert_bearing_to_keypoint(const Vec3_t& bearing) const override final { cv::KeyPoint undistorted; // convert to unit polar coordinates const double latitude = -std::asin(bearing[1]); const double longitude = std::atan2(bearing[0], bearing[2]); // convert to pixel image coordinated undistorted.pt.x = cols_ * (0.5 + longitude / (2.0 * M_PI)); undistorted.pt.y = rows_ * (0.5 - latitude / M_PI); return undistorted; } ~~~~~~ axis | `bearing` | `latitude` | `longitude` | `x` | `y` | direction :---:|-------------|-----------:|------------:|------|-----|---------- +x | `[+1 0 0]` | 0 | π/2 | 0.75 | 0.5 | right –x | `[-1 0 0]` | 0 | –π/2 | 0.25 | 0.5 | left +y | `[0 +1 0]` | –π/2 | 0 | 0.5 | 1.0 | down –y | `[0 -1 0]` | π/2 | 0 | 0.5 | 0.0 | up +z | `[0 0 +1]` | 0 | 0 | 0.5 | 0.5 | forward –z | `[0 0 -1]` | 0 | π | 1.0 | 0.5 | backward * We export the camera pose using the ["TUM" format](https://github.com/xdspacelab/openvslam/blob/5a0b1a5f52b4d29b699624052c9d5dc4417d9882/src/openvslam/io/trajectory_io.cc#L86): ~~~~~~ else if (format == "TUM") { const Mat33_t& rot_wc = cam_pose_wc.block<3, 3>(0, 0); const Vec3_t& trans_wc = cam_pose_wc.block<3, 1>(0, 3); const Quat_t quat_wc = Quat_t(rot_wc); ofs << std::setprecision(15) << timestamps.at(frm_id) << " " << std::setprecision(9) << trans_wc(0) << " " << trans_wc(1) << " " << trans_wc(2) << " " << quat_wc.x() << " " << quat_wc.y() << " " << quat_wc.z() << " " << quat_wc.w() << std::endl; } ~~~~~~ * The "TUM" format [is defined](https://vision.in.tum.de/data/datasets/rgbd-dataset/file_formats) as follows: * **tx ty tz** (3 floats) give the position of the optical center of the color camera with respect to the world origin as defined by the motion capture system. * **qx qy qz qw** (4 floats) give the orientation of the optical center of the color camera in form of a unit quaternion with respect to the world origin as defined by the motion capture system. !!! Note that the pose is not given in the same `[R t]` format as COLMAP's world-to-camera transform, but as its inverse, the camera-to-world transform `[R t]^-1 = [R^T C]`. Preprocessing =============================================================================== *************************************************** * [R t] * * .---------------. * * | | * * | v * * .----------+----. .---------------. * * | COLMAP_World | | COLMAP_Camera | * * '---------------' '----+----------' * * ^ ^ | ^ * * | | | | * * T_YZ | '---------------' | T_YZ * * | [R^T -R^T×t] | * * v v * * .-----------+---. .--+------------. * * | OpenGL_World | | OpenGL_Camera | * * '---------------' '--+------------' * *************************************************** 1. Cameras are loaded by the SfM loaders (`ColmapLoader::readCameraGeometryColmap` and `OpenVSLAMLoader::loadCameraParameters`) 1. Both file formats use quaternion rotation + translation notation. 1. Cameras are converted to rotation matrix R and camera centre C: ``` // COLMAP camera orientation: x-right, y-down, z-forward. Eigen::Matrix3f R = Eigen::Quaternionf(image.qw, image.qx, image.qy, image.qz).toRotationMatrix(); // Compute camera centre from rotation + translation (standard H&Z stuff). Eigen::Vector3f translation(image.tx, image.ty, image.tz); Eigen::Vector3f C = -R.transpose() * translation; ``` 1. A transformation `colmapTransform = constMatrices.invYZ;` is applied to negate the Y and Z axes. This effectively converts the source coordinate system (world points) from COLMAP/OpenVSLAM to OpenGL coordinates. The target coordinate system remains COLMAP/OpenVSLAM. ``` R = R * openVSLAMTransform; C = openVSLAMTransform * C; ``` 1. A `Camera` object is created for each input camera from K, R, t. Equirectangular images assume an identity intrinsic matrix K. 1. `Camera` class – _Most likely using a transposed rotation matrix (columns instead of rows)._ 1. … 1. `PreprocessingApp::changeBasis` (if ChangeBasis=1) 1. … 1. Write `Cameras.csv` in `CameraSetupDataset::writeCamerasCSV`. Viewer =============================================================================== 1. Read `Cameras.csv` in `CameraSetupDataset::readCamerasCSV` 1. … 1. Shaders … a whole different world.