“AI agents have a brigh...

Since Alphabet merged its AI units, DeepMind and Google Brain, the former has taken to a different tempo within the company. The change steered a turnaround for the tech giant in the now hyper-competitive artificial intelligence market. In an exclusive interaction with The Hindu, Seshu Ajjarapu, Senior Director of Engineering & Product at Google DeepMind, and Manish Gupta, Director at Google DeepMind, India, shared what has worked for them since the merger, the truth behind GenAI hype, and why commercialisation is important for research.

Edited excerpts below:

THG: How has the transition been since the merger between Google Brain and DeepMind?

Seshu Ajjarapu: The mandate was always to keep them (Google’s internal teams and DeepMind) separate. We needed one team to think long-term and solve for advanced intelligence and artificial general intelligence (AGI) without the daily pressures of delivering for products. This worked out well for some time. But eventually there was a recognition that there are many ways to solve intelligence. One was through neural nets, which became a separate path after we published the Transformers paper. Then, there was another path we were pursuing called ‘cognitive agents’, where agents were learning how to behave in simulated environments. At some point, all these paths started converging and we realised we don’t need multiple teams doing the same thing. Within six months after the merger in April, we delivered the Gemini models. On the heels of that [launch], we very quickly launched Gemini 1.5 Pro with a longer context window and Mixture of Experts (MoE) architecture. Then, we launched Gemini Flash and Mini.

THG: Are you worried that research might take a backseat because of the shift in focus towards making commercial products?

Seshu: The only way we can show the promise of research is to first apply for a patent, publish the research and then launch a product, right? I’m sure when you consider a company with over 184,000 employees, there will always be someone who doesn’t agree but I think the net impact is that people are happy to contribute finally.

THG: There was a recent comment from Demis Hassabis, CEO of Google DeepMind, about the excess amount of hype around Generative AI and the risk that other areas of research in AI may be left behind. Is this true?

Seshu: Maybe that’s true for the startup ecosystem, but not with us because we have a significant number of people working on non-GenAI research. We tend to have a more balanced portfolio of products. I think there is a particular set of use-cases where GenAI has much to offer. One of them could be a Copilot-like application. Another could be creativity, and another could be companionship. All these use-cases are very compelling. But we do struggle with issues like hallucination, factuality, etc. So, there’s work to be done in these areas. For some problems, I actually don’t think it’s hype at all because it can make a significant amount of change in people’s lives.

THG: There’s a general worry that the investment in GenAI isn’t yielding proportionate returns. What do you think went wrong initially while adopting GenAI applications?

Seshu: GenAI is a means to an end and what you are trying to solve will dictate your means. But we’ve been generally cautious and have a framework in place. Of course, when it comes to AGI, it’s an entirely different approach. We’re trying to take a piece of technology and generalize it more and more, so it has to be technologically pure. So, the idea is not [to simply] apply GenAI to everything. For example, we have a lot of projects around YouTube or Google Maps where we don’t use GenAI.

THG: Speaking about hallucinations, they’ve been a consistent problem with large language models. Isn’t inaccuracy an issue that has to be fully resolved before we push the technology into enterprises?

Manish Gupta: We certainly recognize this as an important problem that’s still unsolved. So, there’s no silver bullet yet, but there are a whole bunch of practical techniques that have been developed. There’s a lot of work going on in grounding these models and ensuring factuality. You can check the output of the LLM with another grounded model and validate it. We also use Retrieval Augmented Generation (RAG), where you don’t start from scratch but ask the model to retrieve information from a specific data set. We have now announced further availability of the two million token context window at the Google I/O Connect event. There is an inherent tension between the GenAI technique, where you’re probabilistically predicting the next token, and ensuring factuality. That still remains.
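
The retrieval step described in this answer can be illustrated with a minimal sketch. It is not Google's implementation: the word-overlap scoring, the function names (`retrieve`, `build_grounded_prompt`) and the three toy passages are all assumptions made for illustration, and a real system would use a proper search index and then pass the prompt to a model.

```python
import re

# Toy "specific data set" (illustrative passages, not real product documentation).
DOCS = [
    "Gemini 1.5 Pro has a long context window.",
    "Retrieval augmented generation retrieves passages from a specific data set.",
    "Assamese is a language spoken in India.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Return the k documents sharing the most tokens with the query.

    Word overlap is a stand-in (assumption) for a real retriever.
    """
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_grounded_prompt(query, documents, k=2):
    """Build a prompt that asks the model to answer only from retrieved text."""
    context = "\n".join(f"- {p}" for p in retrieve(query, documents, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )
```

Calling `build_grounded_prompt("What is the context window of Gemini 1.5 Pro?", DOCS, k=1)` yields a prompt whose context contains only the first passage, so the model is asked to answer from retrieved text rather than from memory alone.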

THG: Will we be seeing the developer base for Google DeepMind grow in India?

Manish: If you look at the 1.5 million developers that are using our Gemini models today, a large number of them come from India. We also see India as a very fertile ground for so many dimensions of AI, whether it’s multimodal or mobile. In fact, another dimension that my team has been looking at is multicultural. Now, we know that you have to go beyond even understanding different languages; we need to understand different cultures so that we don’t respond to queries in, say, Assamese with something that’s simply a literal translation of English. We were very happy to be welcomed into Google DeepMind because our team had been contributing ideas and methods around inclusivity which directly went into the Gemini models. Eventually we recognized that the lack of representation isn’t just a problem for India but is extremely relevant globally.

THG: How important is scale, and in what direction will the sizes of large language models move in future?

Manish: We’ve clearly seen that as these models have scaled, they’ve become visibly more powerful and more robust, and are able to handle a broad range of different kinds of data, different kinds of queries, and so on. There’s no doubt that scale has helped these models become more powerful. Is scale all there is to it? We believe the answer is no. There are things to tackle beyond just the scale of these models. We can’t continue just scaling them till infinity.

THG: What do you predict for the future of AI agents?

Manish: It’s bright. There’s also a lot of very exciting work happening on agentic capabilities. You build agents that take advantage of these models’ capabilities, that are able to explore and reason. You can spawn different systems that are able to learn from multiple agents, and do something more powerful than what a single agent could do. There are some very exciting directions that are being pursued by the research community. We also just introduced the MatFormer framework, where specificity instead of scale is important. It’s very interesting, pioneering work from our team where you can build AI models in a manner that a large model contains inside it a variety of smaller and medium-sized models. Then, you can choose the model depending on the complexity of the task at hand and the cost that you’re willing to pay. We’re enabling elasticity with Gemini just like in cloud, where you can choose the amount of compute you require.
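
The idea of a large model that contains smaller models inside it can be sketched in a few lines. This is a toy illustration under stated assumptions, not the MatFormer code: the class `NestedFFN`, the helper `pick_width`, the random weights and the use of hidden-layer width as the cost measure are all hypothetical. Each smaller "model" here simply reuses the first few hidden units of the full layer, so no separate weights are stored.

```python
import random

class NestedFFN:
    """A feed-forward layer whose smaller variants are prefixes of the full one."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        # w_in has one row per hidden unit; w_out has one row per output value.
        self.w_in = [[rng.uniform(-1, 1) for _ in range(d_model)]
                     for _ in range(d_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(d_hidden)]
                      for _ in range(d_model)]

    def forward(self, x, width):
        """Run the layer using only the first `width` hidden units."""
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)))  # ReLU
                  for row in self.w_in[:width]]
        return [sum(w * h for w, h in zip(row[:width], hidden))
                for row in self.w_out]

def pick_width(widths, budget):
    """Pick the largest nested width within budget, else the smallest one."""
    affordable = [w for w in widths if w <= budget]
    return max(affordable) if affordable else min(widths)
```

With `ffn = NestedFFN(d_model=4, d_hidden=8)` and nested widths `[2, 4, 8]`, a caller with a budget of 5 would get `pick_width([2, 4, 8], 5) == 4` and run `ffn.forward(x, 4)`, trading some capacity for lower cost while sharing the same stored weights.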
