Currently, the IPA prover does the following:
// Step 6.e G_vec_new = G_vec_lo + G_vec_hi * round_challenge_inv
auto G_hi_by_inverse_challenge = GroupElement::batch_mul_with_endomorphism(
    std::span{ G_vec_local.begin() + static_cast<std::ptrdiff_t>(round_size),
               G_vec_local.begin() + static_cast<std::ptrdiff_t>(round_size * 2) },
    round_challenge_inv);
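This is presumably the bottleneck of the IPA prover (to be confirmed). If we rescale G_vec_new by round_challenge, the multiplication becomes G_vec_lo * round_challenge instead of G_vec_hi * round_challenge_inv. Since round_challenge is a 127-bit scalar, the cost of batch_mul_with_endomorphism should drop by roughly 50%.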
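The rescaling can be checked on a toy model. The sketch below is not Barretenberg code: it replaces the curve group with the additive group of integers modulo a prime, and every name in it (`P`, `mul_mod`, `fold_with_inverse`, `fold_rescaled`, `rescaling_identity_holds`) is hypothetical. It only illustrates that multiplying the current fold by the challenge u gives G_lo * u + G_hi. It does not address how the extra factor of u per round would be tracked in the rest of the protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for the curve: the additive group Z_P, where "scalar
// multiplication" is multiplication mod P. Illustrative assumption only.
constexpr std::uint64_t P = 1000000007ULL; // prime; (P-1)^2 fits in 64 bits

std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b) { return (a % P) * (b % P) % P; }
std::uint64_t add_mod(std::uint64_t a, std::uint64_t b) { return (a % P + b % P) % P; }

// Inverse via Fermat's little theorem: a^(P-2) mod P, valid for a != 0 mod P.
std::uint64_t inv_mod(std::uint64_t a)
{
    assert(a % P != 0);
    std::uint64_t result = 1, base = a % P, exp = P - 2;
    while (exp > 0) {
        if (exp & 1) { result = mul_mod(result, base); }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    return result;
}

// Current fold: G_new[i] = G_lo[i] + G_hi[i] * u^{-1}
std::vector<std::uint64_t> fold_with_inverse(const std::vector<std::uint64_t>& g, std::uint64_t u)
{
    const std::size_t half = g.size() / 2;
    const std::uint64_t u_inv = inv_mod(u);
    std::vector<std::uint64_t> out(half);
    for (std::size_t i = 0; i < half; ++i) {
        out[i] = add_mod(g[i], mul_mod(g[half + i], u_inv));
    }
    return out;
}

// Rescaled fold: G_new'[i] = G_lo[i] * u + G_hi[i], using the short challenge u.
std::vector<std::uint64_t> fold_rescaled(const std::vector<std::uint64_t>& g, std::uint64_t u)
{
    const std::size_t half = g.size() / 2;
    std::vector<std::uint64_t> out(half);
    for (std::size_t i = 0; i < half; ++i) {
        out[i] = add_mod(mul_mod(g[i], u), g[half + i]);
    }
    return out;
}

// Checks that u * fold_with_inverse(g, u) == fold_rescaled(g, u) element-wise.
bool rescaling_identity_holds(const std::vector<std::uint64_t>& g, std::uint64_t u)
{
    const auto a = fold_with_inverse(g, u);
    const auto b = fold_rescaled(g, u);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (mul_mod(a[i], u) != b[i]) { return false; }
    }
    return true;
}
```

Calling `rescaling_identity_holds(g, u)` on any even-length vector `g` and nonzero `u` should return true in this toy setting. Whether the same trade is worthwhile on the real curve depends on the profiling mentioned above.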
I don't see whether we can exploit the shortness of the challenges on the verifier side. It might be worth investigating whether splitting the single mul that computes G_0 into a sequence of muls can reduce the number of constraints or improve verification time.