I understand there are things a typical LLM can do and things it can't; I mostly asked because I figured it couldn't do it and wanted to see what would happen. But the average person isn't really given much information about the constraints, and all of these companies are promising the moon with these tools.
Short version: it definitely did not have more common sense or information than a human, and we all know it would have given this person a very confident answer about conditions in the area that was likely not correct. Definitely incorrect if it's based off a photo.
In my experience it's particularly flaky when it has to crawl the internet. The other day I asked who won which awards at The Game Awards. Three different models got it wrong, and all of them omitted at least two categories. You could throw a rock at a search engine and find 80 ready-made lists.
But yeah, I can imagine a multimodal model actually having more information and common sense than a human in a (for them) novel situation.
If only to say "don't be an idiot" or "pick higher ground". Or even just as a rubber duck!