# How to convert screen to world coordinates by way of a camera?

I tried following the guide at https://nalgebra.org/cg_recipes/#screen-space-to-view-space

Currently, I’ve got something like this:

``````rust
// projection is &Perspective3<f64>
// screen_point is Point2<f64>, x/y coordinates on the screen
// viewport is Vector2<f64>, the screen size
// ndc_depth is f64; I tried -1.0, 0.0, and 1.0

let ndc_point = Point3::new(screen_point.x / viewport.x, screen_point.y / viewport.y, ndc_depth);
let view_point = projection.unproject_point(&ndc_point);
let world_point = camera.view().try_inverse().unwrap().transform_point(&view_point);
``````

I’d expect to get a value where x and y are in some non-normalized range, like (200, 300). Instead I get very small, normalized-looking values.

In this particular case I happen to be using the Kiss3d ArcBall camera, so I tried this as well, but I ran into the same issue:

``````rust
let world_point = camera.inverse_transformation().transform_point(&view_point);
``````

Any tips?

Also, as a follow-up: what I really want to do is move my target object to this new location. In other words, the experience is ultimately dragging a 3D object around on the x and y axes according to mouse movement.

I suppose I could do that, once the above is working, by getting the near and far world-space coordinates on the frustum and then finding the point along that line at the same depth as the target object, but I wonder if there’s a more direct way?

Hi! Could you please try `camera.unproject(&screen_point)`?

Your use of `camera.inverse_transformation()` isn’t correct because this `inverse_transformation()` includes the inverse projection too.

Mmh, these computations look correct. With an `ndc_depth` equal to `-1.0` you may end up with very small values if the near-plane is very close to the eye, but with `0.0` and `1.0` you should get reasonably large values (depending on your far-plane). Could you perhaps show some actual values for all the terms involved in the computation?

No, there is no built-in way of doing this in Kiss3d. What you propose here sounds like a reasonable way of doing it. Though keep in mind that because of the perspective, keeping the same depth will cause your object to be moved along the surface of a sphere instead of on a plane.
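If it helps, here is a minimal sketch of that idea (all names and values below are hypothetical, not from your code; with nalgebra the points would be `Point3<f64>`, but plain arrays keep the sketch self-contained): unproject the cursor to world-space at the near and far planes, then pick the point on that segment whose world `z` matches the object’s current depth. Fixing the world `z` rather than the view-space distance keeps the motion on a plane instead of a sphere.

```rust
// Hypothetical sketch: intersect the near->far pick ray with the plane
// z = target_z in world-space. `near` and `far` are the cursor unprojected
// at NDC z = -1.0 and z = 1.0 respectively (the values below are made up).
fn point_at_world_z(near: [f64; 3], far: [f64; 3], target_z: f64) -> Option<[f64; 3]> {
    let dz = far[2] - near[2];
    if dz.abs() < 1e-12 {
        return None; // the ray never crosses the plane z = target_z
    }
    let t = (target_z - near[2]) / dz;
    Some([
        near[0] + t * (far[0] - near[0]),
        near[1] + t * (far[1] - near[1]),
        target_z,
    ])
}

fn main() {
    let near = [0.1, 0.05, 999.9]; // made-up world-space points
    let far = [200.0, 100.0, -9000.0];
    if let Some(p) = point_at_world_z(near, far, 0.0) {
        println!("drag target to {p:?}");
    }
}
```

Because `t` is computed in world-space, the dragged point stays on the `z = target_z` plane as the mouse moves.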


When I use the `ArcBall` camera’s `unproject` (modified only to take a stashed screen size as an argument, no other changes at all), I still get the same small values.

Sure!

``````rust
let view: &Matrix4<f64> = &camera.view;
let projection: &Perspective3<f64> = &camera.projection;
let view_inverse = view.try_inverse().unwrap();
let screen_point = Point2::new(x as f64, y as f64);
let screen_size = camera.get_viewport(); // just retrieves the stashed screen size
let ndc_screen_point = Point3::new(screen_point.x / screen_size.x, screen_point.y / screen_size.y, 0.0);
let view_point_1 = projection.unproject_point(&ndc_screen_point);
let (view_point_2, view_dir_2) = camera.unproject(&screen_point);
let (view_point_3, view_dir_3) = camera.unproject(&Point2::new(screen_point.x / screen_size.x, screen_point.y / screen_size.y));

let world_point_1 = view_inverse.transform_point(&view_point_1);
let world_point_2 = view_inverse.transform_point(&view_point_2);
let world_point_3 = view_inverse.transform_point(&view_point_3);

log::info!("
\n----- GIVEN -----
\nview: {:#?}
\nview_inverse: {:#?}
\nprojection: {:#?}
\nscreen_point: {:#?}
\nscreen_size: {:#?}
\nndc_screen_point: {:#?}
\n----- VIEW_POINT -----
\nview_point_1: {:#?}
\nview_point_2: {:#?}
\nview_dir_2: {:#?}
\nview_point_3: {:#?}
\nview_dir_3: {:#?}
\n----- WORLD_POINT -----
\nworld_point_1: {:#?}
\nworld_point_2: {:#?}
\nworld_point_3: {:#?}
",
view,
view_inverse,
projection,
screen_point,
screen_size,
ndc_screen_point,
view_point_1,
view_point_2,
view_dir_2,
view_point_3,
view_dir_3,
world_point_1,
world_point_2,
world_point_3,
);

``````
Logs with `screen_point` = (977, 489) and `screen_size` = (1620, 686):
``````
----- GIVEN -----

view: Matrix {
data: [
1.0,
-0.000000000000000000000000000000003749399456654644,
0.00000000000000006123233995736766,
0.0,
0.0,
1.0,
0.00000000000000006123233995736766,
0.0,
-0.00000000000000006123233995736766,
-0.00000000000000006123233995736766,
1.0,
0.0,
0.0,
0.0,
-1000.0,
1.0,
],
}

view_inverse: Matrix {
data: [
1.0,
0.0,
-0.00000000000000006123233995736766,
0.0,
-0.000000000000000000000000000000003749399456654644,
1.0,
-0.00000000000000006123233995736766,
0.0,
0.00000000000000006123233995736766,
0.00000000000000006123233995736766,
1.0,
0.0,
0.00000000000006123233995736766,
0.00000000000006123233995736766,
1000.0,
1.0,
],
}

projection: Matrix {
data: [
1.0223151257950265,
0.0,
0.0,
0.0,
0.0,
2.414213562373095,
0.0,
0.0,
0.0,
0.0,
-1.0000200002000021,
-1.0,
0.0,
0.0,
-0.2000020000200002,
0.0,
],
}

screen_point: Point {
coords: Matrix {
data: [
977.0,
489.0,
],
},
}

screen_size: Matrix {
data: [
1620.0,
686.0,
],
}

ndc_screen_point: Point {
coords: Matrix {
data: [
0.6030864197530864,
0.7128279883381924,
0.0,
],
},
}

----- VIEW_POINT -----

view_point_1: Point {
coords: Matrix {
data: [
0.11798326635932296,
0.05905201356162632,
-0.1999980000199998,
],
},
}

view_point_2: Point {
coords: Matrix {
data: [
0.02016724924679395,
-0.017631247844387254,
999.9,
],
},
}

view_dir_2: Matrix {
data: [
0.19480420276709898,
-0.17030786589218974,
-0.9659433489597218,
],
}

view_point_3: Point {
coords: Matrix {
data: [
-0.09774436704011258,
0.04133527372412273,
999.9,
],
},
}

view_dir_3: Matrix {
data: [
-0.6703226584807581,
0.2834738349715284,
-0.6857915998419808,
],
}

----- WORLD_POINT -----

world_point_1: Point {
coords: Matrix {
data: [
0.11798326635938418,
0.059052013561687544,
999.80000199998,
],
},
}

world_point_2: Point {
coords: Matrix {
data: [
0.02016724924691641,
-0.017631247844264796,
1999.9,
],
},
}

world_point_3: Point {
coords: Matrix {
data: [
-0.09774436703999012,
0.0413352737242452,
1999.9,
],
},
}

``````

It seems like it’s doubling the `world_point` depth, but I don’t see where I could account for the z-axis being inverted here, so maybe that’s misleading. Also, even if I spin the camera around to the other side, the differences in the world-point values stay within a very small range when they should be bigger.

The camera itself is correct - I see graphics on the screen just fine.

I’m sure you’re busy and I don’t want to put too much pressure on you; I appreciate the free help. If there’s any more detail I can add, please let me know.

Thank you for the details. Since your z-range (0.09 - 9999) is very large, I don’t think there is anything wrong.

One thing to keep in mind is that the depth remapping applied by the perspective projection is non-linear (it is roughly hyperbolic, proportional to 1/z). So the point (0, 0, -1) in NDC will lie on the near-plane of the frustum, and the point (0, 0, 1) in NDC will lie on the far-plane, but the point (0, 0, 0) won’t lie at the middle of the frustum: it will actually be much closer to the near-plane.

In your case, you have a fovy of 0.78. Therefore your near-plane has a height equal to `tan(0.78 / 2.0) * 0.09 * 2.0` in view-space. Because of the non-linear depth remapping operated by the perspective projection, a `z = 0.0` in NDC will give you a `z = 0.19` in view-space (there is a negative sign here, but I did not include it for simplicity). So the frustum slice at `z = 0.0` in NDC will have a half-height equal to `tan(0.78 / 2.0) * 0.19 ≈ 0.08` (Thales’ theorem) in view-space. This means that with `z = 0.0` in NDC, you should not expect your view-space `y` coordinate to exceed roughly `0.08` (approximately; I rounded a lot in there). This explains the small values you are getting.

With a z-range as big as 0.09–9999, you won’t get any “big” value on the `x`-`y` plane in view-space for any z-value below 0.9 in NDC.
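To make these numbers concrete, here is a small sketch that recovers the relevant quantities from the projection-matrix entries in the log above (assuming nalgebra’s OpenGL-style `Perspective3` conventions, where clip `w = -z_view`; the constants are copied straight from the log):

```rust
fn main() {
    // Entries copied from the logged projection matrix:
    let m11: f64 = 2.414213562373095; // = 1 / tan(fovy / 2)
    let m22: f64 = -1.0000200002000021; // = -(far + near) / (far - near)
    let m23: f64 = -0.2000020000200002; // = -2 * far * near / (far - near)

    let half_fovy = (1.0 / m11).atan();
    // z_ndc = (m22 * z_view + m23) / (-z_view); solving z_ndc = 0 gives:
    let z_view = -m23 / m22; // view-space depth that lands at NDC z = 0
    // Half-height of the frustum slice at that depth (Thales):
    let half_height = half_fovy.tan() * z_view.abs();

    println!("half_fovy   = {half_fovy:.4}"); // ≈ 0.3927 (fovy ≈ 0.785)
    println!("z_view      = {z_view:.4}");    // ≈ -0.2, about twice the near-plane distance
    println!("half_height = {half_height:.4}"); // ≈ 0.0828
}
```

So any view-space `y` obtained with NDC `z = 0.0` is bounded by roughly ±0.083, which matches the `view_point_1` values in the log.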

Ok, so that explains view-space. Is my mistake then in how I’m mapping that to world-space?

E.g. if my object is at, say, (200, 200, 0) in world space and I click on it, I’d expect one of these `world_point_n` values to be close to that.

Ultimately, my actual problem is that if I set that object’s transform to any of these `world_point` values, it doesn’t work (e.g. the x, y values are always around 0, 0).

Getting a `world_point_n` value close to (200, 200, 0) isn’t easy. You would need to know the right `z` component of the NDC point you are unprojecting so that it ends up at the right depth in world-space, and because of the non-linear z-mapping operated by the perspective, that `z` is hard to guess directly.
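To illustrate with the projection from the log above (again assuming the OpenGL-style convention `z_ndc = (m22 * z_view + m23) / (-z_view)`; the constants are copied from the log): the camera sits about 1000 units from the target, so an object near the target is around `z_view = -1000`, and the NDC depth you would need is:

```rust
// Forward depth mapping of the logged projection: which NDC depth
// corresponds to a given view-space depth?
fn ndc_depth(m22: f64, m23: f64, z_view: f64) -> f64 {
    (m22 * z_view + m23) / (-z_view)
}

fn main() {
    let m22 = -1.0000200002000021; // from the logged projection matrix
    let m23 = -0.2000020000200002;
    println!("near plane:   {:.4}", ndc_depth(m22, m23, -0.1)); // -1.0000
    println!("object depth: {:.6}", ndc_depth(m22, m23, -1000.0)); // 0.999820
}
```

Almost the entire visible range collapses into NDC depths above 0.999, which is why picking the right `z` by hand is impractical.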

The most reliable thing you can do with the unprojection is build a ray: unproject two points with different `z` values in NDC. Then you know that the line passing through these two points intersects your object, and you can, for example, do a ray-cast along that line to see exactly where the hit takes place.

I believe there are some GPU picking techniques that let you read back an approximation of the world-space depth at the point where you clicked. I have never implemented picking, but some resources on the internet could give you interesting ideas.
