For this walkthrough I used Andy McSherry's notebook for VGGSfM and Google's MultiNeRF repository for NeRF training. Thanks to Google's TPU Research Cloud (TRC) for making cloud TPUs available for my use.
@inproceedings{wang2024vggsfm,
  title     = {{VGGSfM}: Visual Geometry Grounded Deep Structure From Motion},
  author    = {Wang, Jianyuan and Karaev, Nikita and Rupprecht, Christian and Novotny, David},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages     = {21686--21697},
  year      = {2024}
}
@inproceedings{barron2022mipnerf360,
  title     = {{Mip-NeRF 360}: Unbounded Anti-Aliased Neural Radiance Fields},
  author    = {Barron, Jonathan T. and Mildenhall, Ben and Verbin, Dor and Srinivasan, Pratul P. and Hedman, Peter},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022}
}
Extract frames from the video with ffmpeg, then run the VGGSfM demo on the extracted frames:
mkdir -p bike/images
ffmpeg -i bike.mp4 -vf "fps=2" bike/images/%04d.jpg   # sample two frames per second
python vggsfm/demo.py visualize=True SCENE_DIR=./bike resume_ckpt=vggsfm_v2_0_0.bin
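Before running the demo, it is worth counting the extracted frames, since VGGSfM will try to register all of them; at 2 fps the count should be roughly twice the clip length in seconds (ffprobe ships with ffmpeg):
ls bike/images | wc -l   # number of extracted frames
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 bike.mp4   # clip duration in seconds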
You can view the output in a browser on port 8097. Lightning AI has a web server extension for this; on other cloud providers you need a tunneling service such as ngrok.
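For example, once ngrok is installed and authenticated, you can expose the port like this; if you have direct SSH access to the machine, a plain SSH tunnel from your local machine does the same job (user@host is a placeholder):
ngrok http 8097                              # prints a public URL that forwards to the local port
ssh -N -L 8097:localhost:8097 user@host      # alternative: run locally, then open http://localhost:8097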
The following commands use the ImageMagick mogrify tool to create the downscaled image folders; I took them from a script in the MultiNeRF repository. Install ImageMagick first:
conda install imagemagick
DATASET_PATH=bike
# images_2: 50% of the original resolution
cp -r "$DATASET_PATH"/images "$DATASET_PATH"/images_2
pushd "$DATASET_PATH"/images_2
ls | xargs -P 8 -I {} mogrify -resize 50% {}
popd
# images_4: 25% of the original resolution
cp -r "$DATASET_PATH"/images "$DATASET_PATH"/images_4
pushd "$DATASET_PATH"/images_4
ls | xargs -P 8 -I {} mogrify -resize 25% {}
popd
# images_8: 12.5% of the original resolution
cp -r "$DATASET_PATH"/images "$DATASET_PATH"/images_8
pushd "$DATASET_PATH"/images_8
ls | xargs -P 8 -I {} mogrify -resize 12.5% {}
popd
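To confirm the downscaling worked, you can compare one frame across the folders with ImageMagick's identify tool (the 0001.jpg name assumes the %04d.jpg pattern from the ffmpeg step):
for d in images images_2 images_4 images_8; do
  identify -format "%d/%f: %wx%h\n" "$DATASET_PATH"/$d/0001.jpg   # prints folder, file name, and width x height
done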
conda install zip
cd ~
zip -r bike.zip bike   # zip from the home directory so the archive stores bike/... rather than absolute paths
MultiNeRF expects the folders to be organized as follows:
bike
├── sparse       # COLMAP output files
│   └── 0
│       ├── cameras.bin
│       ├── images.bin
│       └── points3D.bin
├── images       # images at the original resolution
│   ├── image01.jpg
│   ├── image02.jpg
│   └── ...
├── images_2     # images at 50% of the original resolution
│   ├── image01.jpg
│   ├── image02.jpg
│   └── ...
├── images_4     # images at 25% of the original resolution
│   ├── image01.jpg
│   ├── image02.jpg
│   └── ...
└── images_8     # images at 12.5% of the original resolution
    ├── image01.jpg
    ├── image02.jpg
    └── ...
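A quick way to check your layout against this tree (directories only, two levels deep):
find bike -maxdepth 2 -type d | sort
ls bike/sparse/0   # should list cameras.bin, images.bin, points3D.bin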
Move bike.zip to ~/multinerf/nerf_data/ and unzip it there:
mv bike.zip ~/multinerf/nerf_data/
cd ~/multinerf/nerf_data/
unzip bike.zip
Prepare terminal variables to run the training script:
SCENE=bike
EXPERIMENT=360
DATA_DIR=/home/mashaan14/multinerf/nerf_data
CHECKPOINT_DIR=/home/mashaan14/multinerf/nerf_results/"$EXPERIMENT"/"$SCENE"
Run the training script:
python -m train \
--gin_configs=configs/360.gin \
--gin_bindings="Config.data_dir = '${DATA_DIR}/${SCENE}'" \
--gin_bindings="Config.checkpoint_dir = '${CHECKPOINT_DIR}'" \
--gin_bindings="Config.checkpoint_every = 25000" \
--logtostderr
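Training runs for 250,000 steps, which takes a while; if your SSH session might drop, one option is to run the same command under nohup and follow the log (a sketch, not part of the MultiNeRF scripts):
nohup python -m train \
  --gin_configs=configs/360.gin \
  --gin_bindings="Config.data_dir = '${DATA_DIR}/${SCENE}'" \
  --gin_bindings="Config.checkpoint_dir = '${CHECKPOINT_DIR}'" \
  --gin_bindings="Config.checkpoint_every = 25000" \
  --logtostderr > train.log 2>&1 &   # capture stderr too, since --logtostderr writes there
tail -f train.log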
Here’s a snippet from the training output:
249100/250000: loss=0.00407, psnr=45.627, lr=2.03e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 112785 r/s
249200/250000: loss=0.00407, psnr=45.620, lr=2.03e-05 | data=0.00387, dist=3.0e-05, inte=0.00017, 112759 r/s
249300/250000: loss=0.00407, psnr=45.633, lr=2.03e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 112771 r/s
249400/250000: loss=0.00406, psnr=45.651, lr=2.02e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 112305 r/s
249500/250000: loss=0.00407, psnr=45.628, lr=2.02e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 112297 r/s
249600/250000: loss=0.00407, psnr=45.625, lr=2.01e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 112332 r/s
249700/250000: loss=0.00407, psnr=45.613, lr=2.01e-05 | data=0.00387, dist=3.0e-05, inte=0.00017, 112350 r/s
249800/250000: loss=0.00407, psnr=45.628, lr=2.01e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 112341 r/s
249900/250000: loss=0.00407, psnr=45.623, lr=2.00e-05 | data=0.00387, dist=3.0e-05, inte=0.00017, 112345 r/s
250000/250000: loss=0.00406, psnr=45.654, lr=2.00e-05 | data=0.00386, dist=3.0e-05, inte=0.00017, 109458 r/s
If you are rendering in a new terminal session, set the same variables again:
SCENE=bike
EXPERIMENT=360
DATA_DIR=/home/mashaan14/multinerf/nerf_data
CHECKPOINT_DIR=/home/mashaan14/multinerf/nerf_results/"$EXPERIMENT"/"$SCENE"
Then run the rendering script:
python -m render \
--gin_configs=configs/360.gin \
--gin_bindings="Config.data_dir = '${DATA_DIR}/${SCENE}'" \
--gin_bindings="Config.checkpoint_dir = '${CHECKPOINT_DIR}'" \
--gin_bindings="Config.render_path = True" \
--gin_bindings="Config.render_path_frames = 200" \
--gin_bindings="Config.render_dir = '${CHECKPOINT_DIR}/render/'" \
--gin_bindings="Config.render_video_fps = 25" \
--logtostderr
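When rendering finishes, the videos land in the render_dir set above; a quick listing confirms they are there:
ls -lh "${CHECKPOINT_DIR}/render/"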
To download the rendered videos, type these paths into your provider's file-download dialog:
/home/mashaan14/multinerf/nerf_results/360/bike/render/bike_360_path_renders_step_250000_color.mp4
/home/mashaan14/multinerf/nerf_results/360/bike/render/bike_360_path_renders_step_250000_acc.mp4
/home/mashaan14/multinerf/nerf_results/360/bike/render/bike_360_path_renders_step_250000_distance_mean.mp4
/home/mashaan14/multinerf/nerf_results/360/bike/render/bike_360_path_renders_step_250000_distance_median.mp4
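If your provider has no file-download dialog, scp from your local machine works as well (user@host is a placeholder for your instance):
scp "user@host:/home/mashaan14/multinerf/nerf_results/360/bike/render/*.mp4" .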