How to convert screen to world coordinates by way of a camera?

dakom · January 12, 2021, 7:28pm

I tried following the guide at https://nalgebra.org/cg_recipes/#screen-space-to-view-space

Currently, I’ve got something like this:

//projection is &Perspective3<f64>
//screen_point is Point2<f64>, x/y coordinates on the screen
//viewport is Vector2<f64>, the screen size
//ndc_depth is f64, I tried -1.0, 0.0, and 1.0

let ndc_point = Point3::new(screen_point.x / viewport.x, screen_point.y / viewport.y, ndc_depth);
let view_point = projection.unproject_point(&ndc_point);
let world_point = camera.view().try_inverse().unwrap().transform_point(&view_point);

I’d expect to get some value where the x and y is in some non-normalized range, like 200,300. Instead I get very low normalized values.

In this particular case I happen to be using the Kiss3d ArcBall camera, so I tried this as well, but I ran into the same issue:

let world_point = camera.inverse_transformation().transform_point(&view_point);

Any tips?

dakom · January 12, 2021, 7:31pm

Also, as a followup, what I really want to be doing is then moving my target object to this new location. In other words, the experience is ultimately dragging a 3d object around on the x and y axes according to mouse movement.

I suppose I could do that once the above is working by getting the near and far world space coordinates on the frustum and then finding the point along that line at the same depth as the target object, but I wonder if there’s a more direct way?

sebcrozet · January 12, 2021, 9:49pm

dakom:

In this particular case I happen to be using the Kiss3d ArcBall camera, so I tried this as well, but I ran into the same issue:
let world_point = camera.inverse_transformation().transform_point(&view_point);
Any tips?

Hi! Could you please try camera.unproject(&screen_point)?

Your use of camera.inverse_transformation() isn’t correct because this inverse_transformation() includes the inverse projection too.

dakom:

Currently, I’ve got something like this:

//projection is &Perspective3<f64>
//screen_point is Point2<f64>, x/y coordinates on the screen
//viewport is Vector2<f64>, the screen size
//ndc_depth is f64, I tried -1.0, 0.0, and 1.0

let ndc_point = Point3::new(screen_point.x / viewport.x, screen_point.y / viewport.y, ndc_depth);
let view_point = projection.unproject_point(&ndc_point);
let world_point = camera.view().try_inverse().unwrap().transform_point(&view_point);

I’d expect to get some value where the x and y is in some non-normalized range, like 200,300. Instead I get very low normalized values.

Mmh, these computations look correct. With an ndc_depth equal to -1.0 you may end up with very small values if the near-plane is very close to the eye, but with 0.0 and 1.0 you should get reasonably large values (depending on your far-plane). Could you perhaps show some actual values for all the terms involved in the computation here?
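
That said, one detail worth double-checking in the snippet quoted above is the screen-to-NDC step. unproject_point inverts the projection, so it expects x and y in the [-1, 1] NDC range, whereas dividing by the viewport size alone yields values in [0, 1]. A minimal sketch of the usual mapping, assuming pixel coordinates with the origin at the top-left corner and y pointing down (the function name screen_to_ndc is made up for illustration and is not part of nalgebra or Kiss3d):

```rust
/// Maps a pixel position to normalized device coordinates.
/// Assumption: pixel origin is the top-left corner with y pointing down,
/// while NDC x and y both span [-1, 1] with y pointing up.
fn screen_to_ndc(px: f64, py: f64, width: f64, height: f64) -> (f64, f64) {
    let ndc_x = 2.0 * px / width - 1.0;
    let ndc_y = 1.0 - 2.0 * py / height;
    (ndc_x, ndc_y)
}
```

With the values from this thread (977, 489 on a 1620 x 686 viewport) this gives roughly (0.206, -0.426) instead of (0.603, 0.713).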

No, there is no built-in way of doing this in Kiss3d. What you propose here sounds like a reasonable way of doing it. Though keep in mind that because of the perspective, keeping the same depth will cause your object to be moved along the surface of a sphere instead of on a plane.

dakom · January 13, 2021, 6:54am

When I use the ArcBall camera’s unproject (modified only to take a stashed screen size as an argument, no other changes at all), I still get the same small values.

Sure!

Code

let view:&Matrix4<f64> = &camera.view;
let projection:&Perspective3<f64> = &camera.projection;
let view_inverse = view.try_inverse().unwrap();
let screen_point = Point2::new(x as f64, y as f64);
let screen_size = camera.get_viewport(); // just retrieves the stashed screensize
let ndc_screen_point = Point3::new(screen_point.x / screen_size.x, screen_point.y / screen_size.y, 0.0);
let view_point_1 = projection.unproject_point(&ndc_screen_point);
let (view_point_2, view_dir_2) = camera.unproject(&screen_point);
let (view_point_3, view_dir_3) = camera.unproject(&Point2::new(screen_point.x / screen_size.x, screen_point.y / screen_size.y));

let world_point_1 = view_inverse.transform_point(&view_point_1);
let world_point_2 = view_inverse.transform_point(&view_point_2);
let world_point_3 = view_inverse.transform_point(&view_point_3);

log::info!("
    \n----- GIVEN -----
    \nview: {:#?}
    \nview_inverse: {:#?}
    \nprojection: {:#?}
    \nscreen_point: {:#?}
    \nscreen_size: {:#?}
    \nndc_screen_point: {:#?}
    \n----- VIEW_POINT -----
    \nview_point_1: {:#?}
    \nview_point_2: {:#?}
    \nview_dir_2: {:#?}
    \nview_point_3: {:#?}
    \nview_dir_3: {:#?}
    \n----- WORLD_POINT -----
    \nworld_point_1: {:#?}
    \nworld_point_2: {:#?}
    \nworld_point_3: {:#?}
", 
    view,
    view_inverse,
    projection,
    screen_point,
    screen_size,
    ndc_screen_point,
    view_point_1,
    view_point_2,
    view_dir_2,
    view_point_3,
    view_dir_3,
    world_point_1,
    world_point_2,
    world_point_3,
);

Logs with screen_point: 977,489 and screen_size: 1620,686

----- GIVEN -----
    
view: Matrix {
    data: [
        1.0,
        -0.000000000000000000000000000000003749399456654644,
        0.00000000000000006123233995736766,
        0.0,
        0.0,
        1.0,
        0.00000000000000006123233995736766,
        0.0,
        -0.00000000000000006123233995736766,
        -0.00000000000000006123233995736766,
        1.0,
        0.0,
        0.0,
        0.0,
        -1000.0,
        1.0,
    ],
}
    
view_inverse: Matrix {
    data: [
        1.0,
        0.0,
        -0.00000000000000006123233995736766,
        0.0,
        -0.000000000000000000000000000000003749399456654644,
        1.0,
        -0.00000000000000006123233995736766,
        0.0,
        0.00000000000000006123233995736766,
        0.00000000000000006123233995736766,
        1.0,
        0.0,
        0.00000000000006123233995736766,
        0.00000000000006123233995736766,
        1000.0,
        1.0,
    ],
}
    
projection: Matrix {
    data: [
        1.0223151257950265,
        0.0,
        0.0,
        0.0,
        0.0,
        2.414213562373095,
        0.0,
        0.0,
        0.0,
        0.0,
        -1.0000200002000021,
        -1.0,
        0.0,
        0.0,
        -0.2000020000200002,
        0.0,
    ],
}
    
screen_point: Point {
    coords: Matrix {
        data: [
            977.0,
            489.0,
        ],
    },
}
    
screen_size: Matrix {
    data: [
        1620.0,
        686.0,
    ],
}
    
ndc_screen_point: Point {
    coords: Matrix {
        data: [
            0.6030864197530864,
            0.7128279883381924,
            0.0,
        ],
    },
}
    
----- VIEW_POINT -----
    
view_point_1: Point {
    coords: Matrix {
        data: [
            0.11798326635932296,
            0.05905201356162632,
            -0.1999980000199998,
        ],
    },
}
    
view_point_2: Point {
    coords: Matrix {
        data: [
            0.02016724924679395,
            -0.017631247844387254,
            999.9,
        ],
    },
}
    
view_dir_2: Matrix {
    data: [
        0.19480420276709898,
        -0.17030786589218974,
        -0.9659433489597218,
    ],
}
    
view_point_3: Point {
    coords: Matrix {
        data: [
            -0.09774436704011258,
            0.04133527372412273,
            999.9,
        ],
    },
}
    
view_dir_3: Matrix {
    data: [
        -0.6703226584807581,
        0.2834738349715284,
        -0.6857915998419808,
    ],
}
    
----- WORLD_POINT -----
    
world_point_1: Point {
    coords: Matrix {
        data: [
            0.11798326635938418,
            0.059052013561687544,
            999.80000199998,
        ],
    },
}
    
world_point_2: Point {
    coords: Matrix {
        data: [
            0.02016724924691641,
            -0.017631247844264796,
            1999.9,
        ],
    },
}
    
world_point_3: Point {
    coords: Matrix {
        data: [
            -0.09774436703999012,
            0.0413352737242452,
            1999.9,
        ],
    },
}

Seems like it’s doubling the world_point depth, but I don’t see where I can account for z-axis being inverted here so maybe that’s misleading. Also, even if I spin the camera around to the other side, differences in the world point movements are within a very small range when they should be bigger.

The camera itself is correct - I see graphics on the screen just fine.

dakom · January 14, 2021, 10:54am

I’m sure you’re busy and I don’t want to put too much pressure on you, and I appreciate the free help. If there’s any more detail I can add, please let me know.

sebcrozet · January 14, 2021, 1:46pm

Thank you for the details. Since your z-range (0.09 - 9999) is very large, I don’t think there is anything wrong.

One thing to keep in mind is that the depth remapping applied by the perspective projection is non-linear (it’s kind of a parabola). So it means that the point (0, 0, -1) in NDC will lie on the near-plane of the frustum, the point (0, 0, 1) in NDC will lie on the far-plane of the frustum, but the point (0, 0, 0) won’t lie at the middle of the frustum: it will actually be much closer to the near-plane.
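
To make the non-linearity concrete, here is a small sketch that inverts the depth mapping of an OpenGL-style perspective projection (NDC z in [-1, 1]). The function name is hypothetical, and the near/far values used in the note below (0.1 and 10000.0) are an assumption based on what the logged projection matrix appears to imply:

```rust
/// Distance in front of the eye (positive, in view-space units) that
/// corresponds to a given NDC depth, assuming an OpenGL-style perspective
/// projection where NDC z = -1 is the near-plane and NDC z = 1 the far-plane.
fn ndc_depth_to_view_distance(ndc_z: f64, near: f64, far: f64) -> f64 {
    2.0 * far * near / ((far + near) - ndc_z * (far - near))
}
```

With near = 0.1 and far = 10000.0, an NDC depth of -1.0 maps to a distance of 0.1, 0.0 maps to about 0.2, 0.9 maps to only about 2.0, and 1.0 maps to 10000. Almost the whole NDC range is spent right in front of the near-plane.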

In your case, you have a fovy of 0.78. Therefore your near-plane has a height equal to tan(0.78 / 2.0) * 0.09 * 2.0 in view-space. Because of the non-linear depth remapping performed by the perspective projection, a z = 0.0 in NDC will give you a z = 0.19 in view-space (there is a negative sign here, but I did not include it for simplicity). So this means that the frustum slice at z = 0.0 in NDC will have a height equal to tan(0.78 / 2.0) * 0.09 * 2.0 * (0.19 / 0.09) = 0.086 (Thales’ theorem) in view-space. So with z = 0.0 in NDC, you should not expect your view-space coordinates about the y axis to be greater than 0.086 (approximately, I did a bunch of rounding in there). This explains the small values you are getting.
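
The same estimate can be redone with the entries of the logged projection matrix instead of rounded values. This is only a sketch: the helper names are hypothetical, and it assumes the standard perspective layout where the first two diagonal entries are 1 / (aspect * tan(fovy / 2)) and 1 / tan(fovy / 2):

```rust
/// Half-height of the frustum slice located `distance` in front of the eye.
fn slice_half_height(fovy: f64, distance: f64) -> f64 {
    (fovy / 2.0).tan() * distance
}

/// View-space x and y of an NDC point placed `distance` in front of the eye.
/// `m11` and `m22` are the first two diagonal entries of the projection
/// matrix (assumed to follow the standard perspective layout).
fn unproject_xy(ndc_x: f64, ndc_y: f64, distance: f64, m11: f64, m22: f64) -> (f64, f64) {
    (ndc_x * distance / m11, ndc_y * distance / m22)
}
```

With fovy = pi / 4 and a distance of about 0.2, the half-height comes out at about 0.083. The logged NDC point (0.603, 0.713) then lands at about (0.118, 0.059), which matches view_point_1.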

With a z-range as big as 0.09-9999, you won’t get any “big” value on the x-y plane in view-space, for any z-value below 0.9 in NDC.

dakom · January 14, 2021, 4:09pm

Ok, so that explains view space - so is my mistake in how I’m mapping that to world space?

e.g. if my object is at say (200,200,0) in world space, and I click on it, I’d expect that one of these world_space_n values would be close to that.

Ultimately, my actual problem is that if I set that object’s transform to any of these world_space values, it doesn’t work (e.g. the x,y values are always around 0,0).

sebcrozet · January 14, 2021, 4:20pm

Having a world_space_n value that is close to (200, 200, 0) isn’t easy. You would need to know the right z component of the NDC point you are unprojecting so it ends up in the right z depth in world-space. And because of the non-linear z-mapping operated by the perspective, this is not easy.

The most reliable thing you can do with the unprojection is build a ray: unproject two points with different z values in NDC. Then you will know that the line passing through these two points will intersect your object. Then you can for example do a ray-cast with that line in order to see exactly where the hit takes place.
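
As a rough sketch of that idea, the snippet below builds a ray from two unprojected points and intersects it with a plane, which is the simplest possible ray-cast. It uses plain arrays rather than nalgebra types so it stands alone, and all names (Vec3, ray_through, intersect_plane) are hypothetical helpers, not Kiss3d or nalgebra API:

```rust
type Vec3 = [f64; 3];

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn dot(a: Vec3, b: Vec3) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Ray through two unprojected points: returns (origin, unit direction).
/// The two points must be distinct.
fn ray_through(near: Vec3, far: Vec3) -> (Vec3, Vec3) {
    let d = sub(far, near);
    let len = dot(d, d).sqrt();
    (near, [d[0] / len, d[1] / len, d[2] / len])
}

/// Point where the ray hits the plane given by a point and a normal.
/// Returns None if the ray is parallel to the plane or the hit is behind
/// the ray origin.
fn intersect_plane(origin: Vec3, dir: Vec3, plane_point: Vec3, plane_normal: Vec3) -> Option<Vec3> {
    let denom = dot(dir, plane_normal);
    if denom.abs() < 1e-12 {
        return None;
    }
    let t = dot(sub(plane_point, origin), plane_normal) / denom;
    if t < 0.0 {
        return None;
    }
    Some([origin[0] + t * dir[0], origin[1] + t * dir[1], origin[2] + t * dir[2]])
}
```

Feeding in the ray logged earlier in the thread (origin about (0.0202, -0.0176, 999.9), direction about (0.195, -0.170, -0.966)) together with the plane z = 0 gives a hit near (201.7, -176.3, 0). This assumes that ray is already expressed in world space, which its z of 999.9 (the eye at 1000 minus the near distance) suggests. For dragging, one option is to intersect with a plane through the object on every mouse move and translate the object by the difference between successive hits.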

I believe there are some techniques with picking (on the GPU) that allow you to get an approximation of the world-space depth at the point where you clicked. I have never implemented picking, but maybe some resources on the internet could give you some interesting ideas.
