ArUco markers are square fiducial markers, similar to QR codes, that are commonly used in augmented reality.
- focal length of the camera $f_x$ and $f_y$
- the optical center in the image $(c_x, c_y)$
- the lens distortion coefficients (e.g. radial terms $k_1, k_2$)
The PnP (Perspective-n-Point) algorithm takes $n$ points in the 3-D world and their corresponding 2-D points in the image as inputs, and computes the pose (orientation and position) of the camera relative to the world. (NOTE: the chosen points must not be collinear.)
RANSAC might be useful for removing outliers.
- $s$ is a scale factor for the image point
- $p_c$ is the coordinate in the camera reference frame
- $p_w$ is the coordinate in the world reference frame
- $R$ is the rotation matrix
- $T$ is the translation vector
- $f_x$ and $f_y$ are the scaled focal lengths
- $\gamma$ is the skew parameter which is sometimes assumed to be 0
- $(c_x, c_y)$ is the principal point
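Putting these symbols together, the standard pinhole projection model they describe is (written here in a common formulation; the original notes' notation may have differed slightly):

$$
s \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
= \begin{pmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix} p_c,
\qquad
p_c = R \, p_w + T
$$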
- 4 ArUcos
- 1 camera
- 1 manipulator (UR5)
I used 4 ArUco markers only to average the results; the algorithm works with a single marker. The four corner points of each marker are treated as feature points. I defined the center of the 4 markers as the origin (0, 0) of the world frame, so the 3-D world coordinates of all corner points are known. After detecting the markers in an image, we also know their corresponding 2-D image coordinates, so we can solve the equation above for the translation and rotation vectors. These vectors are the so-called camera pose (orientation and position).
Finally, I used the MoveIt! library in the ROS framework to control a UR5 robot on which I had mounted a camera. As the robot keeps moving, its pose (and therefore the camera pose) changes, so I could verify my algorithm by comparing its output against the pose data shown in the robot control panel.
This is a retrospective write-up: I did this project a year ago and no longer have the accuracy data from my experiments. I only remember that the algorithm worked well.