CPU fp32 to CUDA fp16/bf16 Cast Op Best Practices #21372

Can you modify the consumers so they accept the output from the Cast op? The same output can be reused by many different ops as an input.

Also, Java 20 added efficient fp32 <-> fp16 conversions (`Float.floatToFloat16` and `Float.float16ToFloat`), and these have been incorporated into ONNX Runtime, so if you want to work in a FloatBuffer and have ONNX Runtime use fp16 tensors, you can do that.
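To make the Java 20 conversions concrete, here is a minimal sketch that uses only the JDK methods `Float.floatToFloat16` and `Float.float16ToFloat` (so it needs Java 20 or later). The class name `Fp16BufferSketch` and its two helpers are hypothetical names made up for illustration, not ONNX Runtime API. How the resulting buffer is handed to ONNX Runtime is deliberately left out.

```java
import java.nio.FloatBuffer;
import java.nio.ShortBuffer;

// Hypothetical helper, not part of ONNX Runtime: converts between fp32 and
// fp16 buffers using only the JDK conversions added in Java 20.
final class Fp16BufferSketch {

    private Fp16BufferSketch() {}

    // Converts the remaining floats in src to IEEE 754 binary16 bit patterns.
    // Works on a duplicate view, so the caller's position is left unchanged.
    static ShortBuffer toFp16(FloatBuffer src) {
        FloatBuffer in = src.duplicate();
        ShortBuffer out = ShortBuffer.allocate(in.remaining());
        while (in.hasRemaining()) {
            out.put(Float.floatToFloat16(in.get()));
        }
        out.flip(); // make the written values readable from position 0
        return out;
    }

    // Converts the remaining binary16 bit patterns in src back to fp32.
    static FloatBuffer toFp32(ShortBuffer src) {
        ShortBuffer in = src.duplicate();
        FloatBuffer out = FloatBuffer.allocate(in.remaining());
        while (in.hasRemaining()) {
            out.put(Float.float16ToFloat(in.get()));
        }
        out.flip();
        return out;
    }
}
```

Values that fp16 can represent exactly (for example 1.0, -2.0, 0.5) survive the round trip unchanged. Other values are rounded to the nearest fp16 value on the way in.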

Answer selected by contrebande-labs
Category
EP Q&A