This is a recording of a talk given at the International Search Summit: Global Virtual Edition 2022. The way users search on Google and other search engines is changing dramatically. Google multisearch is one example: it introduces an entirely new way to search, using text and images simultaneously. Google Lens has likewise revolutionised the way people can find a product they like. Whereas before we could search only with text and had to convert ideas into keywords, now we can search with different modalities, combining entities, text, and images to get results that match our expectations. In this session, Andrea Volpini from WordLift will introduce the concept of visual semantic SEO and experiment with AI-generated artwork to explain how deep learning models such as CLIP, Google's LiT, and other contrastive neural networks work. He will also walk us through a new form of SEO that combines structured data with prompt engineering. Based on these insights, Andrea will provide practical advice on how to improve your chances of ranking on Google Lens and Google multisearch.