[question] Camera intrinsics matrix from cameras.sfm #2326

Open
AndreaMaestri18 opened this issue Feb 21, 2024 · 10 comments

@AndreaMaestri18

AndreaMaestri18 commented Feb 21, 2024

Describe the problem
I am trying to obtain the projection coordinates on an image of a 3D point of the mesh generated with Meshroom. However, when constructing the camera intrinsic matrix as described in https://en.wikipedia.org/wiki/Camera_resectioning, I get wrong results.
This is what I get from cameras.sfm after structure from motion:

        {  "intrinsicId": "1128448763",
            "width": "4000",
            "height": "3000",
            "sensorWidth": "6.1699999999999999",
            "sensorHeight": "4.6275000000000004",
            "serialNumber": "6a3bff7cff7dffff206affff13ff3028",
            "type": "radial3",
            "initializationMode": "estimated",
            "initialFocalLength": "3.6099999999999994",
            "focalLength": "3.5988558298758133",
            "pixelRatio": "1",
            "pixelRatioLocked": "true",
            "principalPoint": [
                "13.498251531673015",
                "-26.248889380993322"
            ],
            "distortionInitializationMode": "none",
            "distortionParams": [
                "-0.0063465089616403896",
                "0.0030250668407204571",
                "-0.0017936039159354772"
            ],
            "undistortionOffset": [
                "0",
                "0"
            ],
            "undistortionParams": "",
            "locked": "true"
        }
    ],
    "poses": [
        {
            "poseId": "15822413",
            "pose": {
                "transform": {
                    "rotation": [
                        "-0.35775470843537033",
                        "0.72431172821648682",
                        "-0.58939298346720159",
                        "-0.00060470419753410275",
                        "0.6309865263412392",
                        "0.77579355366530955",
                        "0.93381540088239956",
                        "0.27790020500867246",
                        "-0.2253004064154816"
                    ],
                    "center": [
                        "29.711263680710861",
                        "-35.714602995244938",
                        "11.560545223684104"
                    ]
                },
                "locked": "1"
            }
        },
        ...

from here I did the following:

import numpy as np

f = 3.5988558298758133    # focalLength (mm)
mx = 6.1699999999999999   # sensorWidth (mm)
my = 4.6275000000000004   # sensorHeight (mm)
px = 13.498251531673015
py = -26.248889380993322
p_w = np.array([12.168, 20.072, 15.644, 1])  # point coordinates in world
t = np.array([29.711263680710861, -35.714602995244938, 11.560545223684104])  # center of camera in world

K = np.array([
    [f/mx, 0, px],
    [0, f/my, py],
    [0, 0, 1]
])
R = np.array([[-0.35775470843537033, -0.00060470419753410275, 0.93381540088239956],
              [0.72431172821648682, 0.6309865263412392, 0.27790020500867246],
              [-0.58939298346720159, 0.77579355366530955, -0.2253004064154816]], dtype="double")

I = np.identity(3)
q = np.zeros((3, 4))
q[0:3, 0:3] = I
q[0:3, 3] = -t
M = K @ R @ q
point_image = M @ p_w / (M @ p_w)[2]

obtaining point_image = array([ 13.60954989, -25.90018649, 1. ]), which is unfortunately incorrect. Therefore my questions are the following:

  1. Is K correct?
  2. Is R correct?
  3. What are the units of the principal point? Is it in mm or in pixel coordinates?
  4. Ultimately, what am I doing wrong here?

Additional info: both the coordinates of the point and the camera center are in meters (obtained from Blender), but I guess rescaling everything to mm would not make a difference. Is this wrong?

Desktop (please complete the following and other pertinent information):

  • OS: [linux]
  • Python version [e.g. 3.10]
  • Meshroom version: please specify if you are using a release version or your own build
    • Binary version (if applicable) [e.g. 2023]
@simogasp
Member

simogasp commented Feb 21, 2024

px and py are offsets with respect to the center of the image. I know it's not the standard way that everybody uses, but here you have to add px and py to width/2 and height/2, respectively, to get the real principal point.
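
For instance, with the values from the json above, the conversion looks like this (a minimal sketch in plain Python; the variable names are mine):

width, height = 4000, 3000
px_offset, py_offset = 13.498251531673015, -26.248889380993322  # "principalPoint" from the json

cx = width / 2 + px_offset   # ≈ 2013.5 px
cy = height / 2 + py_offset  # ≈ 1473.8 px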

Also, I never remember in which format, row-major or column-major, the matrices are saved. If after fixing the px and py problem you still have a large projection error, try to read R and transpose it before using it.

@simogasp
Member

Also, for the focal length, again it is not a standard format, as it is expressed in mm. To get it back in pixels you can use the formula

pxFocalLength = (focalLength / sensorWidth) * std::max(image().Width(), image().Height());
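
Plugging in the numbers from the json above gives roughly (a quick sanity-check sketch in Python, not the actual AliceVision code):

focal_length_mm = 3.5988558298758133   # "focalLength" from the json, in mm
sensor_width_mm = 6.17                 # "sensorWidth" from the json, in mm
width, height = 4000, 3000

px_focal_length = (focal_length_mm / sensor_width_mm) * max(width, height)  # ≈ 2333.1 px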

@simogasp
Member

As for the matrix, it should be stored in column-major order, as per the default in Eigen:
https://eigen.tuxfamily.org/dox/group__TopicStorageOrders.html
So your R should be OK.
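
In numpy terms, column-major just means the nine rotation values fill the matrix column by column, e.g. (a sketch; vals is the flat "rotation" list from the json above):

import numpy as np

vals = [-0.35775470843537033, 0.72431172821648682, -0.58939298346720159,
        -0.00060470419753410275, 0.6309865263412392, 0.77579355366530955,
        0.93381540088239956, 0.27790020500867246, -0.2253004064154816]
R = np.array(vals).reshape(3, 3, order='F')  # 'F' = column-major (Fortran) order

which gives exactly the R written out by hand in the snippets above.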

@AndreaMaestri18
Author

hey @simogasp, thanks a lot!

The projection is still way off unfortunately.

pxFocalLength = (f / mx) * 4000
pyFocalLength = (f / my) * 3000
K = np.array([
    [pxFocalLength, 0, 2000 + px],
    [0, pyFocalLength, 1500 + py],
    [0, 0, 1]
])
R = np.array([[-0.35775470843537033, -0.00060470419753410275, 0.93381540088239956],
              [0.72431172821648682, 0.6309865263412392, 0.27790020500867246],
              [-0.58939298346720159, 0.77579355366530955, -0.2253004064154816]], dtype="double")
RT = R.T
I = np.identity(3)
q = np.zeros((3, 4))
q[0:3, 0:3] = I
q[0:3, 3] = -t
M = K @ R @ q
print(M @ p_w / (M @ p_w)[2])

gives array([2.45869170e+03, 2.51985979e+03, 1.00000000e+00]).
This should be the result (see green box):
[Screenshot 2024-02-21 at 18 03 57: expected projection]
but I get:
[Screenshot 2024-02-21 at 18 04 41: actual projection]

Also, why is it pxFocalLength = (focalLength / sensorWidth) * std::max(image().Width(), image().Height()); and not times the width for pxFocalLength and times the height for pyFocalLength?

@simogasp
Member

simogasp commented Feb 21, 2024

also why is it pxFocalLength = (focalLength / sensorWidth) *std::max(image().Width(), image().Height()); and not times the width for pxFocalLength and times the height for pyFocalLength?

It's coming from here when reading the EXIF:
https://github.com/alicevision/AliceVision/blob/57cc8a02f653ce1f754cda2dcf8a3cf517405bf0/src/aliceVision/sfmDataIO/viewIO.cpp#L193

and from here when reading from the JSON:
https://github.com/alicevision/AliceVision/blob/57cc8a02f653ce1f754cda2dcf8a3cf517405bf0/src/aliceVision/sfmDataIO/jsonIO.cpp#L287
Here it is just

fx = (fmm / sensorWidth) * double(width);

with fmm the focal length in mm from the JSON and fx the focal length on x in pixels.

It's confusing because the focal length in pixels is always used internally for all computations, but it's exported in mm for compatibility with the ABC format and software like Maya, Blender, and so on.

@fabiencastan @servantftechnicolor can you check if it is the right conversion?

I see that you transpose R in the code snippet. I imagine it does not work without transposing either, does it?

@AndreaMaestri18
Author

Yeah, indeed it doesn't work without transposing R either. Is perhaps the way I compute the coordinates wrong?

@simogasp
Member

Just to be sure, because I don't speak numpy: I was assuming that

M = K@R@q

is the matrix product of the matrices, right? So that we correctly have K * [R | -R*t]. Am I right?
(You'd better not call it t, because it could confuse people: that is the center c of the camera in world coordinates, and the actual t of the roto-translation matrix is t = -R*c.)
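
In numpy, with c denoting the camera center (what the snippets above call t), that product would read along these lines (a sketch; the names P, x_h and pixel are mine):

c = np.array([29.711263680710861, -35.714602995244938, 11.560545223684104])  # camera center in world
t = -R @ c                               # the actual translation of the roto-translation
P = K @ np.hstack([R, t.reshape(3, 1)])  # 3x4 projection matrix, i.e. K [R | -R c]
x_h = P @ p_w                            # homogeneous image coordinates
pixel = x_h / x_h[2]                     # normalize to get pixel coordinates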

@AndreaMaestri18
Author

Thanks a lot! I got it just now. All the things you suggested were correct:

  • translating the principal point by adding width/2 and height/2
  • FX = (f/sensor_width)*width
  • FY = (f/sensor_height)*height

The reason it was not working for me is that I was reading the coordinates of the point in 3D space from Blender, which flips the axes. I was therefore expecting something that would never happen 😂

But now it works perfectly, even without distortion parameters! Thanks again!!

@simogasp
Member

simogasp commented Feb 22, 2024

The reason it was not working for me is that I was reading the coordinates of the point in 3D space from Blender, which flips the axes.

That was my next question. I was smelling the usual problem with the different conventions used for expressing the camera frame from computer graphics and computer vision... It's always the usual suspect! ;-)

Just for future reference would you mind posting your working snippet of code, like the one above? Thanks!

@AndreaMaestri18
Author

Yes, of course! From the JSON I posted at the beginning of the question I get the data, then:

import numpy as np

# info from json
f = 3.5988558298758133
mx = 6.1699999999999999
my = 4.6275000000000004
px = 13.498251531673015
py = -26.248889380993322
width = 4000
height = 3000

# points
p_w = np.array([12.187, -20.025,  -16.133, 1]) # point coordinates in world
t = np.array([29.711263680710861,-35.714602995244938, 11.560545223684104]) # center of camera in world

pxFocalLength = (f / mx) * width
pyFocalLength = (f / my) * height

K = np.array([
    [pxFocalLength , 0, px+width/2 ],
    [0, pyFocalLength, py+height/2],
    [0, 0, 1]
])
R = np.array([[-0.35775470843537033, -0.00060470419753410275, 0.93381540088239956],
            [0.72431172821648682, 0.6309865263412392, 0.27790020500867246],
            [-0.58939298346720159, 0.77579355366530955, -0.2253004064154816]], dtype = "double"
            )

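# build the extrinsics [R | -R·c]; note the variable t here holds the camera center c in world coordinates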
q = np.zeros((3, 4))
q[0:3, 0:3] = R
q[0:3, 3] = -np.dot(R, t)

M = np.dot(K, q)

pixel_coordinates = np.dot(M, p_w) / np.dot(M, p_w)[2]

and that works perfectly :)

thanks again
