> Date: Wed, 26 Sep 2001 22:25:34 -0400
> From: Blair MacIntyre <blair@c ............>
> Subject: Re: [ARFORUM] Extra calibration hidden matrix issues
>
> I'm not an expert on this, but we happen to be trying to do similar stuff
> to you right now (we can talk about it before ISAR :), so I'll try to throw
> in my $0.02 worth. I don't have an answer, but I have some comments and
> questions ... I'm just throwing out every stupid question I can think of in
> case it helps.
Hi Blair ... thanks for your reply! This is something I've been playing with
for a while now. For the paper/poster I'm presenting at ISWC and ISAR, I'm
showing my new hand-input-based modelling system, which uses 3D tracking and
data gloves. It has always worked, but the accuracy was a bit off and
required some extra matrices to bring everything into line. Now that I have
this problem sorted it runs a lot nicer (please see my other posting, which
was sent just a little before yours).
> Right. If you are trying to actually use the explicit values, you must
> have good camera calibration. Otherwise, the values will be consistent,
> but off by a factor of some sort.
The camera calibration is a tricky issue. It is such a low-res camera
(352x288) that it's hard to get ultra-fine details. Each time we calibrate we
get a slightly different calibration, and getting the alignment right is very
tricky. So the problem is that the calibration matrices are always trying to
correct for effects that aren't really the camera's fault, and which vary
depending on what you are doing and whether you have bumped the camera lately :)
So I now have a faked-up calibration file which defines the camera as being
nice and straight, which gives me good results.
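In case it helps to see what I mean by "straight", here is roughly what such
a camera looks like when built directly in code instead of from a calibration
file. This is only a sketch with a guessed focal length, not my actual
parameter file; the field layout follows ARToolKit's ARParam structure.

    #include <AR/param.h>

    /* Sketch only: an idealised "straight" camera for a 352x288 image.
     * Principal point dead centre, no lens distortion, and a guessed
     * focal length f in pixels (400 here is just an illustration). */
    void make_straight_camera( ARParam *param )
    {
        double f = 400.0;              /* assumed focal length (pixels) */

        param->xsize = 352;
        param->ysize = 288;

        /* 3x4 intrinsic matrix: no skew, principal point at image centre */
        param->mat[0][0] = f;   param->mat[0][1] = 0.0; param->mat[0][2] = 176.0; param->mat[0][3] = 0.0;
        param->mat[1][0] = 0.0; param->mat[1][1] = f;   param->mat[1][2] = 144.0; param->mat[1][3] = 0.0;
        param->mat[2][0] = 0.0; param->mat[2][1] = 0.0; param->mat[2][2] = 1.0;   param->mat[2][3] = 0.0;

        /* distortion: centred, no warping, unity scale */
        param->dist_factor[0] = 176.0;  /* x centre           */
        param->dist_factor[1] = 144.0;  /* y centre           */
        param->dist_factor[2] = 0.0;    /* distortion factor  */
        param->dist_factor[3] = 1.0;    /* scale factor       */
    }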
The interesting part about the distortion correction (or whatever it is) is
that it is applied in two parts. The code which detects the targets and
gives you the matrix via arGetTransMat produces a result which is skewed off
by the calibration, and it is then brought back into line using gl_cpara. If
you have a "straight" camera matrix, the values that come out of
arGetTransMat are already quite good, so you don't need gl_cpara any more :)
And so it works quite well now (see my previous posting with my results)
when using my specially constructed camera models.
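For reference, this is roughly the second stage I'm talking about (a sketch,
not my real code, and it assumes arDetectMarker has already found the target
and that the usual pattern centre/width values are being used):

    #include <AR/ar.h>
    #include <AR/gsub.h>
    #include <GL/gl.h>

    /* arGetTransMat gives the marker pose relative to the camera as a 3x4,
     * and argConvGlpara repacks it into a column-major 4x4 for OpenGL.
     * With a "straight" camera parameter file this pose is already usable
     * on its own, without gl_cpara pulling it back into line afterwards. */
    void load_marker_modelview( ARMarkerInfo *marker )
    {
        double centre[2] = { 0.0, 0.0 };   /* pattern centre (mm) */
        double width     = 80.0;           /* pattern width  (mm) */
        double trans[3][4];
        double gl_para[16];

        arGetTransMat( marker, centre, width, trans );
        argConvGlpara( trans, gl_para );   /* 3x4 -> 16-element GL matrix */

        glMatrixMode( GL_MODELVIEW );
        glLoadMatrixd( gl_para );          /* camera-relative marker pose */
        /* ... draw objects in marker coordinates here ... */
    }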
> Question: are you doing video-mixed AR and getting these values from the
> camera video you are showing the user? Or, are you doing
> optical-see-through, and just using the camera to track?
Ok, I'm doing both :)
Normally we do optical see-through, but the targets weren't aligned and we
weren't sure why. So I rejigged the system so that the video is put in the
background as well, so we can check what the camera is seeing and so forth.
(More on this later.) Then I found there was this misalignment problem, so I
had to fix that, and now things are a lot better.
For shooting videos of the system running, I turn on the video feed and
record everything as it comes out of the laptop straight to a portable VCR.
For actually using the system, you can choose between optical and video mode,
depending on what you want. Video mode is a tad slower and you get all the
VR problems like image lag and so forth, but it allows us to make videos.
Previously, we've shot a real camera through the HMD to capture videos, but
that requires using a tripod and things like that. Now that I'm running my
modelling system and doing things like hand tracking, I need to look through
the HMD to see what is happening, and if I'm looking through it then a
camera won't fit in. So the video AR mode makes it easier to record things.
> Question2: are you _sure_ you have a good calibration? :)
No, the calibration is poor - this is my way of getting around it :) I found
that having a calibrated camera caused more problems than using a straight
camera. Using the default ARToolKit camera parameter file gave better
results than my custom-made calibration file! My file had a permanent offset
to the left and upwards, but the default one was nicely centred when the
target was in the centre of the display. As you moved the target to the edge
of the screen the 3D object would move off a bit faster, but scaling factors
fixed that up.
> Their demos do not need calibrated cameras. They get the reports and draw
> the graphics in the same coordinate system, which does not need to
> correspond to reality, since they never actually worry about anything that
> isn't just "relative to the fiducial". Basically, since their rendering
> corresponds exactly to their extracted matrices, everything lines up by
> definition.
>
> If you want to use these 3D pose estimates, things get hairy, as you've
> discovered.
Yeah. I was sort of hoping to use the camera model to produce an undistorted
3D world coordinate for the target (which I thought I had before, but it
turns out I didn't), but I don't think this is possible. The camera
calibration is a 2D flat-image thing, and doesn't let you correct the 3D
coordinates in the way we'd like.
> I assume you are using a completely different 3D projection matrix (than
> the one you mention below), and trying to take their pose estimates and use
> them? Which means that your projection matrix does not correspond to the
> video on your display.
>
> If you are doing that, I don't see how you can get perfect registration,
> since the graphics projections do not correspond to the camera parameters
> of the world you are seeing in the video.
>
> If you are using optical-see-through, what is the relationship between the
> camera and your display? Have you included that offset? Is it below and
> to the left, perhaps? :)
Ok, I think I've dealt with all these things. In my system, I have a scene
graph which contains all the objects I'm rendering. I also have a 2m high
avatar model person which moves around with you. This avatar contains body
definitions for things like how tall the person is and how far the head is
above the shoulders, and the view frustum is defined there as well. The view
the user sees is controlled by the view frustum object in the scene graph.
My USB camera is mounted 8 cm higher than the Glasstron, and I defined
another frustum for the camera. On the back of that frustum is a polygon
onto which the video can be texture mapped. The cursors for the hand
tracking are defined as child objects relative to the USB camera frustum.
The overall result is that using this scene graph we get the right output.
As the scene is rendered from the wearable user's eyes, the video is not
directly mapped to the display (the frustums have different FOVs and
locations), but because we've modelled all this, the texture mapping draws
the video where it would actually be, with the necessary offsets and
distortion and so forth. The scene graph resolves all kinds of hard problems
graphically, such as converting camera-relative cursors to world coordinates
and handling bizarre camera angles (you can tilt and rotate the camera,
rejig the scene graph, and it will draw everything right). As a special
hack, the video is scaled so that instead of being on a polygon about 1m in
front of you, it is drawn on a massive polygon 10 km away, so it does not
occlude any local objects :)
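To make that last trick concrete, here is a rough sketch of the idea (made-up
variable names, not the actual Tinmith drawing code): scale the backdrop quad
and its distance by the same factor, so it covers exactly the same part of
the view but sits far enough away that no local object can ever be behind it.
The far clipping plane obviously has to be beyond 10 km for this to work.

    #include <GL/gl.h>

    /* Draw the video backdrop quad scaled out from 1 m to 10 km.
     * hw/hh are the half-extents of the original 1 m quad. */
    void draw_video_backdrop( GLuint video_tex, double hw, double hh )
    {
        const double near_d = 1.0;        /* original backdrop distance (m) */
        const double far_d  = 10000.0;    /* where it actually gets drawn   */
        const double s      = far_d / near_d;

        glPushMatrix();
        glScaled( s, s, s );               /* uniform scale keeps angular size */
        glTranslated( 0.0, 0.0, -near_d ); /* quad lands at z = -near_d * s    */

        glEnable( GL_TEXTURE_2D );
        glBindTexture( GL_TEXTURE_2D, video_tex );
        glBegin( GL_QUADS );
            glTexCoord2d( 0.0, 1.0 ); glVertex3d( -hw, -hh, 0.0 );
            glTexCoord2d( 1.0, 1.0 ); glVertex3d(  hw, -hh, 0.0 );
            glTexCoord2d( 1.0, 0.0 ); glVertex3d(  hw,  hh, 0.0 );
            glTexCoord2d( 0.0, 0.0 ); glVertex3d( -hw,  hh, 0.0 );
        glEnd();
        glPopMatrix();
    }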
I have lots of pictures that I can show you of how this all works; it is a
bit hard to explain with just text.
> This is "typical" camera calibration stuff (as far as vision folks are
> concerned), of the sort camera calibration code (such as theirs or things
> like the Microsoft Vision SDK) spit out. It accounts not just for the
pose
> of the camera (the "extrinsic" parameters, I believe), but the optical
> properities of the lens and camera (the "intrinsic" parameters), which
> include the focal length, center of the optical axis (which might be
> off-center causing skew) and so forth.
>
> The skew factor, in particular, is not something that can be dealt with by
> typical 3D projection matricies. If you want to use a "standard"
> OpenGL/Phigs-style projection matrix, you will not be able to get perfect
> registration.
>
> This is something I don't understand how to get around either. Given
this
> matrix, I would like to extract the "closest" Phigs-style parameters (cop,
> direction-of-projection, aspect ration, field of view).
> Not sure how best to do that, but I need it, since at least one of the
> graphics libraries I'm using absolutely will not let me specify an
> arbitrary projection matrix.
I think that given the straight camera model I have now, you can use your
own frustum model and it should work OK, maybe. I haven't tried it
extensively, but it looks good so far. What I was thinking of doing is
getting the distorted camera matrix D, then getting the straight camera
matrix S, and then computing something like D * Inv(S) to give a matrix
which captures the necessary shift, and just applying that. I'm not good
enough with these camera models to really understand what I'm doing though.
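Something like the following, using ARToolKit's small matrix routines (this
is only a sketch of the idea; D and S would be the distorted and "straight"
4x4 camera matrices, filled in by the caller, and whether this combination is
actually the right fix is exactly the part I'm unsure about):

    #include <AR/matrix.h>

    /* C = D * Inv(S): build a one-off correction matrix from the two
     * camera models.  All three matrices are 4x4 ARMat's. */
    void build_correction( ARMat *D, ARMat *S, ARMat *C )
    {
        ARMat *Sinv = arMatrixAlloc( 4, 4 );

        arMatrixInv( Sinv, S );        /* Sinv = S^-1   */
        arMatrixMul( C, D, Sinv );     /* C    = D * S^-1 */

        arMatrixFree( Sinv );
    }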
My problem is that I'm trying to use ARToolKit as a tracker, and in some
cases the camera might not even be pointing in the same direction you are
looking. One possibility for our system is that the HMD looks forward but
the camera points directly down so you can track your hands. In this case
none of the video from the camera would be visible.
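For that case the scene graph is really just chaining a few rigid transforms
together. A rough sketch, in plain column-major OpenGL-style matrices with
made-up names, of how a marker pose from the downward-pointing camera ends up
in world coordinates:

    /* dst = a * b for column-major 4x4 matrices; dst must not alias a or b */
    static void mat_mul( double dst[16], const double a[16], const double b[16] )
    {
        for( int col = 0; col < 4; col++ )
            for( int row = 0; row < 4; row++ ) {
                dst[col*4 + row] = 0.0;
                for( int k = 0; k < 4; k++ )
                    dst[col*4 + row] += a[k*4 + row] * b[col*4 + k];
            }
    }

    /* world_from_body   - avatar pose from the 3D tracker
     * body_from_camera  - fixed offset of the USB camera on the helmet
     * camera_from_marker- pose from arGetTransMat (via argConvGlpara)   */
    void cursor_world_pose( const double world_from_body[16],
                            const double body_from_camera[16],
                            const double camera_from_marker[16],
                            double world_from_marker[16] )
    {
        double tmp[16];
        mat_mul( tmp, world_from_body, body_from_camera );
        mat_mul( world_from_marker, tmp, camera_from_marker );
    }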
> As I said above, arGetTransMat does give you the pose estimate between the
> camera and the fiducial assuming a calibrated camera; the cpara stuff is
> all about setting up a graphics projection that matches the camera
> parameters, for using in video-see-through modes. I suspect the problem
> lies elsewhere in how you are setting up the system.
I think my main problem is that ARToolKit thought my camera was distorted,
when in fact it isn't really distortion at all: just plastic fittings that
move slightly, poor attachment points on the Sony, and the inability to work
out the exact centre of the camera. These change over time and cause bad
calibration, which causes bad output. So the distortion of my camera is
inconsistent, which is why each time I perform calibration I get different
results. Eeeep! I had some problems with my scene graph setup before, but I
think/hope :) that I've sorted all these out and it's just ARToolKit to deal
with now.
I think I have it all right now ... I'm shooting some new videos in the next
day or so which should demonstrate this tracking and everything else. Before,
when I was doing optical only, the registration appeared good, but when I
started the video AR I realised there were some problems. I'll have to get
back to you with more details once I've had a good run with this new setup.
Thanks for your time Blair ... talk to you soon!
regards,
Wayne
---------------------------------------------------------------------------
Wayne Piekarski - PhD Student / Lecturer    Phone:    +61-8-8302-3669
Advanced Computing Research Centre          Fax:      +61-8-8302-3381
University of South Australia               Mobile:   0407-395-889
Research & Development Manager              Internet: wayne@t ..........
SE Network Access Pty Ltd                   http://www.tinmith.net