
avoid SameFileError during restore_config #45

Open
le-horizon opened this issue Sep 16, 2024 · 3 comments

@le-horizon

Problem: error while running:

jetson-containers run $(autotag nano_llm) python3 -m nano_llm.vision.vla --api mlc --model dusty-nv/openvla-7b-mimicgen --quantization q4f16_ft --dataset dusty-nv/bridge_orig_ep100 --dataset-type rlds --max-episodes 10 --save-stats /data/benchmarks/openvla_mimicgen_int4.json

Message:
18:21:36 | INFO | using chat template 'openvla' for model openvla-7b-mimicgen
18:21:36 | INFO | model 'openvla-7b-mimicgen', chat template 'openvla' stop tokens: [''] -> [2]
18:21:38 | INFO | Warmup response: '测ษ装Έ专装ശ'
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/NanoLLM/nano_llm/vision/vla.py", line 446, in
vla_process_dataset(**{**vars(args), 'dataset': dataset})
File "/opt/NanoLLM/nano_llm/vision/vla.py", line 296, in vla_process_dataset
model = NanoLLM.from_pretrained(model, **kwargs)
File "/opt/NanoLLM/nano_llm/nano_llm.py", line 100, in from_pretrained
model.restore_config()
File "/opt/NanoLLM/nano_llm/nano_llm.py", line 390, in restore_config
shutil.copyfile(backup_path, self.config_path)
File "/usr/lib/python3.10/shutil.py", line 234, in copyfile
raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
shutil.SameFileError: '/data/models/huggingface/models--dusty-nv--openvla-7b-mimicgen/snapshots/865b827044ba379bac2023688dd84520e268e29d/config.json.backup' and '/data/models/huggingface/models--dusty-nv--openvla-7b-mimicgen/snapshots/865b827044ba379bac2023688dd84520e268e29d/config.json' are the same file

Manual inspection shows the symlinks point to the same file.

One possible fix:

logging.debug(f"restoring original model config from {backup_path}")

# Check if source and destination are the same
if os.path.abspath(backup_path) == os.path.abspath(self.config_path):
    # Remove the destination file to force the copy
    os.remove(self.config_path)
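(For what it's worth, shutil.copyfile raises SameFileError based on os.path.samefile(src, dst), which resolves symlinks and compares the underlying files, so using the same check here catches exactly the case the copy would otherwise reject, whereas comparing os.path.abspath() strings would not, since config.json and config.json.backup are different paths.)
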
le-horizon changed the title from "avoid error during restore_config" to "avoid SameFileError during restore_config" on Sep 16, 2024
@dusty-nv
Owner

@le-horizon hmm, I did not think those were symlinks. It is supposed to copy config.json to config.json.backup (not link it) before it applies changes to the config during the quantization stage, then copy the backup back to the original after quantization is complete.
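
Roughly, the pattern being described is just two copies around the quantization step (a sketch of the intended flow, not the actual NanoLLM code):

import shutil

# save a real copy of the original config before quantization edits it
shutil.copyfile(config_path, config_path + '.backup')

# ... quantization stage modifies config.json in place ...

# restore the original config once quantization is complete
shutil.copyfile(config_path + '.backup', config_path)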

Can you try manually restoring config.json.backup -> config.json if needed, then deleting config.json.backup?
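
For reference, a minimal sketch of that manual restore, using the cache paths from the traceback above (assuming it runs inside the container; otherwise substitute the mounted jetson-containers data directory for /data):

import os, shutil

snapshot = '/data/models/huggingface/models--dusty-nv--openvla-7b-mimicgen/snapshots/865b827044ba379bac2023688dd84520e268e29d'
config = os.path.join(snapshot, 'config.json')
backup = os.path.join(snapshot, 'config.json.backup')

os.remove(config)                # drop the symlink into the blob cache
shutil.copyfile(backup, config)  # restore config.json as a regular file
os.remove(backup)                # then delete the backup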

@le-horizon
Author

Thanks for the quick reply. The symlink points here:
jetson-containers//data/models/huggingface/models--dusty-nv--openvla-7b-mimicgen/snapshots/865b827044ba379bac2023688dd84520e268e29d/config.json -> ../../blobs/bdd3c3e855b5e679de41ec76fab0ea0491816773

Let me try overwriting the link with the actual file.

@le-horizon
Author

Seeing this error now:

│ mm_projector_path │ /data/models/huggingface/models--dusty-nv--openvla-7b-mimicgen/snapshots/86 │
├───────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ quant │ q4f16_ft │
├───────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ type │ llama │
├───────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ max_length │ 4096 │
├───────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ prefill_chunk_size │ -1 │
├───────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ load_time │ 15.15503216200159 │
├───────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ params_size │ 3233.0078125 │
└───────────────────────────┴─────────────────────────────────────────────────────────────────────────────┘

Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/NanoLLM/nano_llm/vision/vla.py", line 446, in
vla_process_dataset(**{**vars(args), 'dataset': dataset})
File "/opt/NanoLLM/nano_llm/vision/vla.py", line 318, in vla_process_dataset
vla.action_space = 'bridge_orig'
File "/opt/NanoLLM/nano_llm/vision/vla.py", line 124, in action_space
self._action_space = self.action_spaces[key]
KeyError: 'bridge_orig'
