And here comes multimodal Chain of Thought.

Amazon researchers paired text with images as input to drive LLM performance ever higher, and they did it with a model of only 770 million parameters (versus the 175 billion in GPT-3.5): “Our method achieves new state-of-the-art performance on the ScienceQA benchmark, outperforming accuracy of GPT-3.5 by 16% and even surpassing human performance.”
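
The paper's core recipe is a two-stage pipeline: first generate a rationale from the question text plus the image, then feed that rationale back in to infer the final answer. Here's a minimal Python sketch of that flow; the `VisionLanguageModel` class, its `generate` method, and the `multimodal_cot` helper are hypothetical placeholders for illustration, not the authors' released code:

```python
class VisionLanguageModel:
    """Hypothetical stand-in for the small vision-language model the
    paper fine-tunes; the real implementation lives in the authors'
    release, not here."""

    def generate(self, text: str, vision, task: str) -> str:
        # Placeholder: a real model would fuse text tokens with image
        # features and decode a sequence conditioned on the task prompt.
        return f"<{task} generated from: {text[:40]}>"


def multimodal_cot(model: VisionLanguageModel, question: str, image_features):
    # Stage 1: generate a rationale from the question plus image features.
    rationale = model.generate(text=question, vision=image_features,
                               task="rationale")
    # Stage 2: append the rationale to the question and infer the answer,
    # so the model reasons over both modalities before committing.
    answer = model.generate(text=f"{question} {rationale}",
                            vision=image_features, task="answer")
    return rationale, answer


if __name__ == "__main__":
    model = VisionLanguageModel()
    rationale, answer = multimodal_cot(
        model, "Which property do these objects share?", image_features=None)
    print(rationale)
    print(answer)
```

Separating rationale generation from answer inference is what lets such a small model benefit from chain-of-thought: the answer stage conditions on a grounded, image-informed rationale instead of having to reason in one shot.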
