"Efficiently Combining Positions and Normals for Precise 3D Geometry"



I have generated a dense point cloud of 62,500,000 points (approximately 35,000 points per square inch) and a medium-res 3D mesh of a painting using Agisoft PhotoScan.  It would be possible to generate the dense point cloud and mesh at a higher density, but that's beyond the memory capacity of my Mac mini (16 GB of RAM), so I might out-source further processing of the photogrammetry data.  I also have a mosaic of 36 RTIs at a ground-sample resolution of approximately 500 pixels per inch (250,000 pixels per square inch), plus higher-res RTIs of some details.  The mosaic RTIs have approximately 10 percent overlap horizontally and vertically.


I'm interested in combining the point cloud and normal maps into a 3D model of the painting surface using methods described in the 2005 SIGGRAPH paper whose title is quoted above, "Efficiently Combining Positions and Normals for Precise 3D Geometry," by Nehab, Rusinkiewicz, Davis, and Ramamoorthi (or other methods anyone here might suggest).  However, the link to the authors' source code for the algorithm isn't working.  I'm wondering if others here have tried this technique, and whether anyone can provide the source code (with appropriate permissions) and offer advice for implementing it.  I haven't written to the authors to ask for the code, but may do so.  I'm also interested in hearing others' experience with this or similar techniques when working with large data sets.


Another tool that looks like it might be useful in this regard is XNormal, which "bakes" the texture maps from high-resolution 3D meshes into lower-resolution meshes.  Could it also accurately combine high-resolution RTI normal maps with high-res 3D meshes?  I'm not sure if this modeling technique would produce the same result as the algorithm from the 2005 paper cited above.


I'm also interested in suggestions for an appropriate workflow for cleaning, repairing, and decimating the 3D mesh.  Would it be better to start with the highest-density point cloud and mesh I can generate from the photogrammetry data set, then combine this with the normal maps from the RTIs?  Or should I clean, repair, and decimate the 3D mesh first, and then apply algorithms to combine it with the normal maps?  I'm learning to use MeshLab, but I find the number of possible filters a bit daunting, and it crashes pretty frequently with large data sets (they might be too much for my Mini). 


I also have approximately 53 GB of RAW multispectral images at resolutions of approximately 500 to 1,000 pixels per inch that were captured using a different camera system than the one used to capture the photogrammetry and RTIs.  The 500 dpi images were captured in a 4x4 mosaic, or 16 images per waveband.  There are 12 discrete wavebands (1 UV, 6 visible, and 5 IR) plus visible fluorescence images captured with emission filters.  I'm interested in texturing the 3D mesh generated from the combined photogrammetry and RTI datasets using each of the discrete multispectral wavebands and reconstructed visible, fluorescence, and false-color IR images.  I'd like to know what would be involved in registering these images to a 3D mesh generated from a different set of images. 


I'm hoping that the result of this project will be a 3D model with accurate surface normals that would allow interactive relighting, tiled zooming, algorithmic enhancement, and selective re-texturing at various wavebands in a web viewer, if a suitable viewer becomes available, such as the one being developed by Graeme Earl and the AHRC project, or perhaps one of the Web3D viewers.  Any advice and assistance would be appreciated!







My two cents... The algorithm developed by Nehab et al. is amazingly powerful, but I fear it would be extraordinarily difficult to implement with your data even if you had the source code. The scanner Nehab developed used the same two machine-vision cameras to capture both the range data by structured light AND the photometric stereo produced by the six (?) lamps. Note that in the article he specifies that both cameras were calibrated, i.e. their photogrammetric distortion parameters were all calculated. What this all means is that the normal maps and the range maps were perfectly aligned right from the start. Each normal generated by photometric stereo could be put in a one-to-one correspondence with a range point because they were gathered by the same cameras. 


In your case you have two evidently rich data sets, but no way of precisely aligning them that I can see (did you have any fiducial markers shared by the RTI and the photogrammetry?). If there's the slightest misalignment, the two data sets will not correct each other but will instead propagate even more errors (as Nehab et al. note), the opposite of what you want. Alignment of your RTI data on your photogrammetry data is further complicated by the fact that the RTIs you have produced have not been "undistorted" according to the photogrammetric parameters of the camera. Even if you were careful to use a prime lens with minimal distortion, like a 50mm or 105mm prime, there will still be enough distortion to prevent perfect point-to-texel alignment. I suggested in another thread on the forum that alignment of RTIs and photogrammetry could be done, but it would require code to "undistort" normal maps using the parameters collected by photogrammetry. Why, you may ask, isn't this code already out there (it may be; correct me if I'm wrong)? The (game) designers who use XNormal and equivalent packages aren't concerned that a brick texture perfectly aligns with the building they're modelling. The normal map just gets repeated across the structure of the building to give the impression of photorealism. For them, surface normals are just a vastly more computationally efficient way of creating realistic scenes than using, say, ray-tracing on a detailed 3D surface.
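To make the "undistort the normal map" step concrete, here is a minimal NumPy sketch under some simplifying assumptions of my own: a single-coefficient radial (k1) model and nearest-neighbour resampling. A real pipeline would use the full Brown-Conrady model with all of PhotoScan's exported coefficients (e.g. via OpenCV's cv2.undistort); the function name and parameters here are hypothetical:

```python
import numpy as np

def undistort_map(img, fx, fy, cx, cy, k1):
    """Remap an image (e.g. an RTI normal-map snapshot) from distorted to
    undistorted pixel coordinates using a one-coefficient radial model.
    For each undistorted pixel we apply the *forward* distortion model to
    find where to sample in the source image (nearest-neighbour)."""
    h, w = img.shape[:2]
    v, u = np.mgrid[0:h, 0:w]                  # pixel grid
    x = (u - cx) / fx                          # normalized camera coords
    y = (v - cy) / fy
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2                      # radial distortion factor
    ud = np.clip(np.rint(x * scale * fx + cx), 0, w - 1).astype(int)
    vd = np.clip(np.rint(y * scale * fy + cy), 0, h - 1).astype(int)
    return img[vd, ud]
```

Note that this only moves the normals' *pixel positions*; the normal vectors themselves are left unchanged, which is a reasonable approximation for a nearly flat painting photographed head-on.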


What you could do as an interesting experiment would be to put a light Laplacian filter on your mesh to smooth it. This means the low-frequency structure in the range data will be retained but the high-frequency detail (little bumps, cracks, etc.) removed. You could then filter your normal map to boost the high frequencies and suppress the low (XNormal should be able to do this). Then align and bake your normal map onto the range data. In theory, the small features of the painting will be handled by the RTI-generated normals, and larger structures, like unevenness in the canvas, will be represented by the photogrammetry data. What you don't want is the more error-prone high-frequency texture from your photogrammetry conflicting with the more accurate high-frequency texture from the RTI normals.  
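The normal-map half of that frequency split could be sketched like this (my own assumptions: the Gaussian radius, and re-centring the residual on the "flat" normal (0, 0, 1) so only the detail survives; XNormal or an image editor could do the equivalent):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def highpass_normals(nmap, sigma=8.0):
    """Keep only the high-frequency detail of a normal map (H, W, 3) with
    components in [-1, 1].  Low frequencies are estimated with a Gaussian
    blur and replaced by the 'flat' normal (0, 0, 1), so the result can be
    baked onto a mesh that already carries the low-frequency shape."""
    low = np.stack([gaussian_filter(nmap[..., c], sigma) for c in range(3)],
                   axis=-1)                               # low-frequency estimate
    detail = nmap - low + np.array([0.0, 0.0, 1.0])        # re-centre on flat normal
    norm = np.linalg.norm(detail, axis=-1, keepdims=True)
    return detail / np.clip(norm, 1e-8, None)              # renormalize to unit length
```

A sanity check on the behaviour: a perfectly flat normal map has no high-frequency content, so it should come back unchanged as all (0, 0, 1).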


With the size of your mesh and point cloud, I would suggest using CloudCompare over MeshLab. The former is regularly updated and deals quite well with big data sets, provided you have a good enough video card (I don't think Mac Minis have discrete graphics, do they?). The meshing algorithms in CloudCompare aren't quite the equal of what's in MeshLab, but PhotoScan probably generated a pretty clean mesh for you already, no?


Just a thought.  




Thanks, George!  You've given me lots of useful thoughts in your reply. 


Some quick thoughts:  The PhotoScan software has provided camera calibration parameters, and I've saved undistorted jpegs from the aligned photogrammetry images.  Perhaps I can use the camera calibration parameters to undistort the images used in the RTI sequences, and reprocess the RTIs with the undistorted images?  Or perhaps I can use Agisoft's lens calibration software to generate calibration parameters.  I used the same 45-mm macro prime lens on a micro four-thirds camera (equivalent to a 90-mm macro on a full-frame DSLR) for both the RTIs and photogrammetry, but the distances and resolutions are not the same.  (As you mentioned in a previous post, the 45-mm macro isn't ideal for accurate photogrammetry, but it seems to have produced at least a reasonable result.)  Still, the alignment problems might be too much to overcome, and your alternate approach sounds like it might work.


An update:  I've found the source code and a reference manual through Diego Nehab's website (thanks to Dr. Nehab for making his code available!): 



Based on George's comments, it sounds like another approach might be more productive.  I'll provide updates on my progress with this. 




To effectively undistort the normal maps from RTI you'd need the camera parameters from exactly the camera set-up you used for the RTI, i.e. the same focal length, focal distance, and f-stop. As CHI has indicated in their workflow, it is always prudent to do a calibration sequence with your RTI set-up, just in case you need those parameters at some time in the future. I don't know how accurate it would be to recover them after the fact with the lens-calibration software. It may be good enough for government work. 


In principle it should be possible to align photogrammetry and RTI data quite precisely. After all, the image used for texture in an OBJ file (the image pointed to by the MTL file) is just an undistorted image taken from the original photogrammetry image series. If this texture image could be swapped for a normal map that had undergone the same transformation as the texture image (assuming that you've used imagery for photogrammetry taken with the same set-up and positions as your RTIs), you'd have your alignment. Maybe someone is working on this sort of thing right now?
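Mechanically, the swap described above amounts to editing one statement in the OBJ's companion MTL file. A sketch (map_Kd is the standard MTL diffuse-texture statement; the filenames and function name here are hypothetical):

```python
def swap_mtl_texture(mtl_lines, new_texture):
    """Point a material's diffuse texture (map_Kd) at a different image,
    e.g. a normal-map snapshot that has undergone the same undistortion
    as the original texture image, so the mesh's UVs align it exactly."""
    out = []
    for line in mtl_lines:
        if line.strip().lower().startswith("map_kd"):
            out.append("map_Kd " + new_texture)   # replace the texture path
        else:
            out.append(line)                       # keep everything else as-is
    return out

# Example (hypothetical file contents):
# swap_mtl_texture(["newmtl painting", "Kd 1 1 1", "map_Kd texture.jpg"],
#                  "normals.png")
```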


I'm very interested to hear how you find XNormal/Blender for applying RTI normals to a mesh! A flat painting is a good candidate for this since the normals should suffer a lot less from the errors introduced by objects that have more depth and self-shadowing. 


  • 1 month later...

Hi all!

Back in April 2013 I did some very rudimentary tests on this, but since I have little to no coding background, all I've managed to do is trick PhotoScan into using the extracted normal snapshot as a texture:



My workflow is as follows:

- Capture the RTI sequence, taking one last photo under ambient lighting conditions ('A')

- Unscrew the camera from the copy stand/tripod and capture a photogrammetric sequence of the object

- Process the PTM/RTI (no cropping!) and take a normal-map snapshot ('B')

- Process the photogrammetric sequence in Agisoft PhotoScan (including 'A') up until mesh generation

- Close the project and swap A and B (i.e. rename 'B' to 'A')

- Reopen the project and generate the texture using a single photo, choosing 'A'


I am aware that this is a very cheap solution, since the color-coded normal visualization is not as precise as the normals stored in the RTI file (I think). But it could be used to calculate detailed depth maps from the normals using the coarse depth information from the mesh, this way eliminating the accumulating error when trying to convert normals to depth.
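On the normals-to-depth step: one standard way to avoid exactly that accumulating error is to integrate the gradient field in the Fourier domain (the Frankot-Chellappa method) rather than row by row. A minimal sketch, assuming a unit-length normal map and periodic boundaries (the coarse mesh depth could then replace the unreliable low frequencies, as proposed above):

```python
import numpy as np

def depth_from_normals(nmap):
    """Frankot-Chellappa integration: recover depth (up to a constant)
    from a unit-normal map of shape (H, W, 3), where the surface normal
    is (-p, -q, 1)/|.| with p = dz/dx and q = dz/dy."""
    nx, ny, nz = nmap[..., 0], nmap[..., 1], nmap[..., 2]
    p = -nx / nz                                   # dz/dx
    q = -ny / nz                                   # dz/dy
    h, w = p.shape
    wx = 2 * np.pi * np.fft.fftfreq(w)[None, :]    # angular frequencies (x)
    wy = 2 * np.pi * np.fft.fftfreq(h)[:, None]    # angular frequencies (y)
    denom = wx ** 2 + wy ** 2
    denom[0, 0] = 1.0                              # avoid divide-by-zero at DC
    Z = -1j * (wx * np.fft.fft2(p) + wy * np.fft.fft2(q)) / denom
    Z[0, 0] = 0.0                                  # depth is only defined up to a constant
    return np.real(np.fft.ifft2(Z))
```

Because the low spatial frequencies are the least reliable part of photometric normals, blending them with the photogrammetric depth (rather than keeping them from the integration) is where the hybrid pay-off would come from.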


what do you guys think?


Very clever!  As I understand it, you're replacing the ambient-light image in the photogrammetry sequence with a snapshot of the color-coded normal map from the RTI.  That snapshot gives you only visual (RGB) information about the normal directions; it doesn't actually contain any normal-direction data.  The depth map and normals you generate come only from the photogrammetry mesh, but you can at least get more high-frequency visual information about the normals from the RTI and use that image to add textural detail to your photogrammetry model.  However, it doesn't improve the accuracy of the depth map in the model, since all the depth information comes from the mesh, not from the RTI normals.


You are right, at this stage this doesn't improve the actual mesh's depth or normal data. You could use it to improve your point cloud's normals, though. Reliable low-frequency depth and vertex normals (the dense cloud) are derived from the photogrammetric/SfM scene, and high-frequency surface-normal data with high relative accuracy are acquired using photometric stereo/RTI (as George pointed out above). When you reconstruct a mesh from a point cloud, each vertex's coordinates in space as well as its normal are taken into consideration. What about replacing these low-frequency normal values with the high-frequency ones from the RTI data, before the actual surface reconstruction?


At least in theory, this is how it could be done using MeshLab:

1. Import the textured mesh (see above)

2. Import the dense cloud

3. Transfer the texture color from the mesh to the vertex color of the point cloud

4. Convert each vertex's RGB values to normal-vector values and apply them to the respective vertices

5. Apply surface reconstruction


This would require some coding for steps 3 and 4. You would also lose normal precision using the color-coded normals, since you are working with integers from 0-255 for each axis, whereas the RTI works with float values from -1 to 1 (I'm not sure how many decimals, but I think it's more precise than an integer 0-255). Also, I'm not sure how noticeable the improvement would be.
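The step-4 conversion is just the inverse of the usual normal-map color encoding, which maps each component from [-1, 1] onto [0, 255]. A sketch (the exact encoding the RTI software uses should be verified; the ~2/255 quantization step per axis is the precision loss mentioned above):

```python
import numpy as np

def rgb_to_normal(rgb):
    """Decode 8-bit RGB normal-map colors (0-255) back to unit vectors.
    The standard encoding maps [-1, 1] to [0, 255], so decoding is
    quantized to steps of 2/255 (about 0.008) per axis."""
    n = np.asarray(rgb, dtype=float) / 127.5 - 1.0
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-8, None)    # renormalize to unit length

def normal_to_rgb(n):
    """Forward encoding, useful for round-trip checks."""
    return np.clip(np.rint((np.asarray(n) + 1.0) * 127.5), 0, 255).astype(np.uint8)
```

A round trip through 8 bits perturbs each component by at most about 0.004 before renormalization, which gives a feel for how noticeable (or not) the precision loss might be in practice.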




  • 2 weeks later...

Very interesting work! Out of curiosity, what sort of normal map do you get from the mesh surface of the object without applying the normal map from the RTI as texture? That would be an informative comparison. 


I think the coding you propose is actually quite simple, especially if you save your point cloud as an xyz rgb file. You just need to apply the inverse transformation from the 8-bit integers to the normal vector and call the result Nx, Ny, Nz instead of R, G, B. You could even do it in something like Excel. In CloudCompare you can very easily change the columns from RGB to Nx, Ny, Nz when you import the cloud. 
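As a sketch of that column swap on an ASCII "x y z r g b" cloud (the inverse transform is the standard [0, 255] to [-1, 1] decoding described above; the function name is my own, and CloudCompare would then read the last three columns as Nx Ny Nz):

```python
import numpy as np

def rgb_cloud_to_normal_cloud(lines):
    """Turn 'x y z r g b' text lines into 'x y z nx ny nz' lines by
    decoding the 8-bit color back into a unit normal vector."""
    out = []
    for line in lines:
        x, y, z, r, g, b = (float(t) for t in line.split())
        n = np.array([r, g, b]) / 127.5 - 1.0     # inverse 8-bit encoding
        n /= max(np.linalg.norm(n), 1e-8)         # renormalize to unit length
        out.append("%.6f %.6f %.6f %.6f %.6f %.6f" % (x, y, z, n[0], n[1], n[2]))
    return out

# Example: a vertex colored (128, 128, 255) decodes to a nearly flat normal
# rgb_cloud_to_normal_cloud(["0.0 1.0 2.0 128 128 255"])
```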



This topic is now archived and is closed to further replies.
