Conversation
```python
        return hidden_states


class LTX2PerturbedAttnProcessor:
```
I think this is just a guider https://github.com/huggingface/diffusers/blob/main/src/diffusers/guiders/skip_layer_guidance.py
Thanks! Looking at the code, it's unclear to me whether SkipLayerGuidance currently works for LTX-2.3 for the following reasons:
- Not attention backend agnostic: if I understand correctly, STG is implemented through `AttentionProcessorSkipHook`, which uses `AttentionScoreSkipFunctionMode` to intercept calls to `torch.nn.functional.scaled_dot_product_attention` and simply return the `value`. But I think other attention backends like `flash-attn` won't call that function and thus will not work with `SkipLayerGuidance`.
- LTX-2.3 does additional computation on the values: LTX-2.3 additionally processes the `value` tensor using learned per-head gates before sending it to the attention output projection `to_out`. This is not supported by the current `SkipLayerGuidance` implementation.
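To make the backend concern concrete, here is a minimal sketch (not the actual diffusers implementation) of intercepting `scaled_dot_product_attention` with a `TorchFunctionMode` so it returns its `value` input directly; any attention backend that never routes through this function would bypass the mode entirely:

```python
import torch
import torch.nn.functional as F
from torch.overrides import TorchFunctionMode


class SkipSDPA(TorchFunctionMode):
    """Sketch of the interception approach: while this mode is active,
    calls to F.scaled_dot_product_attention return the value tensor
    instead of computing attention."""

    def __torch_function__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func is F.scaled_dot_product_attention:
            # args are (query, key, value, ...); skip attention entirely.
            return args[2]
        return func(*args, **kwargs)


q = k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)
with SkipSDPA():
    out = F.scaled_dot_product_attention(q, k, v)
# Inside the mode, `out` is just `v`; a backend such as flash-attn that
# never calls this function would be unaffected by the mode.
```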
I'm not sure whether these issues can be resolved with changes to the SkipLayerGuidance implementation or whether something like a new attention processor would make more sense here.
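If the attention-processor route were taken, the skip step for LTX-2.3 would also need to apply the per-head gates to the `value` tensor before `to_out`. A hypothetical sketch of that gating step (function name and shapes are assumptions for illustration, not taken from this PR):

```python
import torch


def skip_with_per_head_gates(value: torch.Tensor, gates: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: instead of computing attention, return the
    value tensor scaled by learned per-head gates, mirroring the extra
    processing LTX-2.3 is described to apply before the output
    projection (to_out).

    value: (batch, heads, seq_len, head_dim); gates: (heads,)
    """
    # Broadcast each head's gate over the batch, sequence, and head_dim axes.
    return value * gates.view(1, -1, 1, 1)


batch, heads, seq_len, head_dim = 2, 4, 8, 16
value = torch.randn(batch, heads, seq_len, head_dim)
gates = torch.sigmoid(torch.randn(heads))  # stand-in for learned gates
skipped = skip_with_per_head_gates(value, gates)
```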
I have opened a PR with a possible modification to SkipLayerGuidance to allow it to better support LTX-2.3 at #13220.
This is a good callout! From my understanding, the guider as a component doesn't change much across models; LTX-2 is probably an exception. If more models start to do their own form of SLG, we could think about giving them their own guider classes / attention processors. But for now, I think modifications to the existing SLG class make more sense.
What does this PR do?
This PR adds support for LTX-2.3 (official code, model weights), a new model in the LTX-2.X family of audio-video models. LTX-2.3 has improved audio and visual quality and prompt adherence as compared to LTX-2.0.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu
@sayakpaul