I don't believe that it just analyzes the transcription. I asked Gemini to look at the YouTube video referenced on the site below and "build" something that duplicates that device. It produced a pretty good approximation, which it could not have done without going through the full video.
Same - I see a lot of "vaguely interesting, but no way I'm spending 40 minutes on that" videos, and it usually works. However, I've noticed it will occasionally summarize the wrong video for me. Maybe that happens when the video is very new? I'm not sure.
I think Gemini analyzes the transcription.
Can I do the same for free with Qwen3?