Tag Archives: machine-learning

My 2025 AI Predictions: INs & OUTs

28 Jan

Read the original post on LinkedIn here

With the latest AI darling, DeepSeek AI, wiping billions off the market value of US tech giants just yesterday, 2025 is already shaping up to be a fascinating year for AI. The rapid evolution of AI, its promises, pitfalls, and shifting priorities, sets the stage for a year full of disruption. Here are my predictions for what’s IN and what’s OUT in AI for 2025:

AI Tech Stack: OUT with Training Obsession, IN with Inference*

The obsession with training massive models is OUT. What’s IN? Ruthlessly efficient inference. In 2025, if you’re not optimizing for inference, you’re already behind. Here’s why.

The cost of achieving OpenAI o1 level intelligence fell 27x in just the last 3 months, as my Google Cloud colleague Antonio Gulli observed – impressive price-performance improvement.

https://www.linkedin.com/feed/update/urn:li:activity:7289697297397944320/

The recent DeepSeek AI breakthrough proves this point perfectly. Their R1 model (trained for just $5.6 million, a fraction of OpenAI’s rumored $500 million budget for its o1 model) achieves feature parity with, and even outperforms, major competitors on key benchmarks:

https://arxiv.org/pdf/2501.12948

We have clearly figured out how to make LLM training more effective and cost-efficient. Time to reap the benefits and use the models for inference.

*We will still be enhancing LLMs’ capabilities, developing smaller, purpose-built models and re-training them with new data-sets.

AI Architecture: OUT with Cloud-First, IN with Edge-First**

The pioneers in the most AI-advanced industries like Manufacturing have exposed the limitations of cloud-first AI approaches. According to Gartner, 27% of manufacturing enterprises have already deployed edge computing, and 64% plan to have it deployed by the end of 2027. Why the rush to edge-first AI architectures?

In industrial applications, especially those requiring real-time control and automation, latency requirements as low as 1-10 milliseconds demand a fundamental rethinking of distributed AI system design. At these speeds, edge-to-cloud roundtrips are impractical; systems must operate as edge-native, with processing and decision-making happening locally at the edge.
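The latency arithmetic here is easy to sketch. The snippet below is a purely illustrative budget check, not measured data: the round-trip and inference numbers are assumptions chosen to show why a wide-area hop to the cloud can blow a 10-millisecond control budget while an on-premises edge hop does not.

```python
# Illustrative latency budget check for a real-time control loop.
# All numbers are assumptions for illustration, not measurements.

def fits_budget(inference_ms: float, network_rtt_ms: float, budget_ms: float) -> bool:
    """True if model inference plus the network round-trip fits the control budget."""
    return inference_ms + network_rtt_ms <= budget_ms

BUDGET_MS = 10.0      # tight industrial control deadline
INFERENCE_MS = 5.0    # model forward pass
CLOUD_RTT_MS = 40.0   # assumed WAN round-trip to a cloud region
EDGE_RTT_MS = 0.5     # assumed on-premises hop to an edge node

print(fits_budget(INFERENCE_MS, CLOUD_RTT_MS, BUDGET_MS))  # False: cloud round-trip misses the deadline
print(fits_budget(INFERENCE_MS, EDGE_RTT_MS, BUDGET_MS))   # True: edge processing fits
```

Even with generous assumptions, the cloud round-trip alone can exceed the entire budget — which is exactly why these workloads are moving edge-first.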

One of Synadia’s most innovative customers, Intelecy, a No-Code AI platform that helps industrial companies optimize factory and plant processes with real-time machine learning insights, perfectly illustrates this paradigm shift. Their initial cloud-first approach had processing delays of 15-30 minutes. By redesigning their AI architecture for the edge, they achieved less than one-second round-trip latencies. This dramatic improvement enabled real-world applications like automated temperature control in dairy production, where ML models can provide real-time insights for process optimization.

Processing data where it is generated isn’t just more efficient—it’s becoming a competitive necessity for every industry. Gartner predicts that by 2029, 50% of enterprises will use edge computing, up from just 20% in 2024.

**The cloud’s role in AI isn’t disappearing (of course), but the default is shifting rapidly towards edge-first thinking.

AI Impact: OUT with What-If, IN with What-Now***

Focusing on model capabilities is OUT. What’s IN? Solving real business problems. The most compelling AI stories in 2025 won’t mention model architecture. Instead, they’ll focus on measurable business impact.

Intelecy’s Chief Security Officer Jonathan Camp explains how AI can help ensure quality in manufacturing: “A dairy can use a machine learning forecast model to set temperature control systems using the real-time predicted state of the cheese production process. The process engineering team can use Intelecy insights to identify trends and then automate temperature adjustments on a vat of yogurt to ensure quality and output are not compromised.”

Source: https://www.intelecy.com/industries/food-and-beverage

The shift is clear: success is no longer measured in model capabilities, but in hard metrics like revenue gained, costs saved, and efficiency improved. The question isn’t “What can AI do?” but “What value did it deliver this quarter?”

***As an innovation-obsessed marketer, I’ll never give up on “what-if” dreams, but “what-now” is the state of AI in 2025.

The Elephant in the Room: Can gen AI be trusted?

We’ve solved training costs. We’ve started to crack real-time processing. Now, the focus shifts to trust: Can AI deliver consistent, reliable, and verifiable results at scale?

For example, ask three gen AI chatbots this prompt three times each, and see for yourself:

Name top 3 ski resorts in Europe by the total length of ski runs that are truly interconnected (no bus transfers)

We’re entering the era of agentic AI, where AI-made decisions will be automatically implemented by chains of AI functions. Are we ready?

What’s on your IN/OUT list for 2025?

#AIin2025 #Data #AI #DataAndAI #Tech2025 #FutureOfAI #Inference #Training


The Magic of Generative AI

8 May

“The Magic of Generative AI” is still my favorite talk I’ve ever given, hands down. I loved collaborating with Google’s top AI minds on the story and the visuals, building demos that showed how Vertex AI helps marketers like me, and connecting with fellow AI enthusiasts in awesome places like LA and Rome.

But the best part was diving deep into how large language models (LLMs) actually work, reading those mind-bending research papers, and piecing together the “magic” they create. Preparing this talk was like living Google’s innovation mantra: stay curious, experiment, build something useful.

In this newsletter, I’m sharing my reflections on the magic of Gen AI and how Google’s unique innovation culture was key to making these incredible tools a reality.

Innovation💡 = Curiosity🧐 + Experimentation🧪 + Application 🚀

Curiosity, experimentation, and application: This is the heart of how Google is driving the generative AI revolution. It’s the same formula behind some of our biggest breakthroughs, like Google Search, Translate, and Vertex AI.

Here’s how it works:

  • Curiosity: This is where it all starts – that burning question of “what if?” or “why not?” Curiosity is what drives us to explore the unknown and challenge the status quo.
  • Experimentation: Curiosity without action is just daydreaming. Experimentation is where we get our hands dirty, trying new things, making mistakes, and learning from them. It’s the messy but essential part of the process.
  • Application: The ultimate goal of innovation is to create something that makes a real difference in the world. Application takes those wild ideas and experiments and turns them into practical solutions that people can use and benefit from.

This isn’t just a theory; it’s the blueprint behind Google’s most groundbreaking AI tools.

Embeddings in Google Search: Grasp query intent beyond exact keywords

In 2013, Google researchers authored the seminal paper “Efficient Estimation of Word Representations in Vector Space”. This paper unveiled a revolutionary method for creating word embeddings: mathematical representations of words capturing both their meaning (semantics) and their relationships (semantic similarity). Here’s Google’s innovation formula in action:

  • Curiosity: Dissatisfied with existing word organizational methods, such as dictionaries ordering words by lexicographical order, researchers were curious if a better approach could capture word semantics and organize them by semantic meaning.
  • Experimentation: They explored various neural network types, training objectives, and relationship representations. Through experimentation, they discovered how to automatically create word embeddings. A term worth remembering: an embedding is a mathematical representation of a word that captures its semantic meaning as a vector of numbers, often hundreds of dimensions long (768 is a common size in modern models).
  • Application: Way before it was applied in gen AI, word embeddings found a magical application in semantic search, enabling Google Search 🔍 to grasp query intent beyond exact keywords. For example, a search for “cars that are good on gas” now returns results for fuel-efficient cars, even if the word “gas” doesn’t appear in the options returned.
Source: “The Magic of Generative AI” talk, Google Gen AI Live and Labs event series

Transformer in Google Translate: More accurate translations

In 2017, Google researchers presented “Attention Is All You Need”, introducing the Transformer architecture, built on the concept of attention. It empowers language models to understand context and relationships within word sequences. Curiosity, experimentation, and application were again vital:

  • Curiosity: In the search to improve the quality of language translation, researchers sought ways to model relationships among words in a sentence.
  • Experimentation: They experimented with various mechanisms, relationship representations, and training methods, discovering that much could be extracted simply by paying attention to the relationship between each word and every other word in a sentence. These interdependencies could be computed in parallel, which accelerated time to result, and the resulting representations captured long-range dependencies between words, producing fluent, grammatically correct text. Voilà! The Transformer architecture was born, a huge scientific breakthrough.
  • Application: The transformer revolutionized Google Translate 🌐. The Transformer’s attention mechanisms are excellent at understanding the relationships between words in a sentence, leading to more accurate translations.
Source: Transformers, FT

Let’s see this in action by translating this sentence from English to Italian: “The cat didn’t cross the street because it was too wide.” To translate correctly, the model must work out whether “it” refers to the cat or to the street — exactly the kind of relationship attention captures.

Source: “The Magic of Generative AI” talk, Google Gen AI Live and Labs event series
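For the curious, the attention mechanism itself can be sketched compactly. This is a minimal, single-head, pure-Python illustration of scaled dot-product attention, the core operation from “Attention Is All You Need”: each word’s output is a weighted mix of every word’s value vector, with weights derived from query-key similarity. The tiny 2-D vectors are invented for the demo.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: out[i] = sum_j softmax(q_i·k_j / sqrt(d_k)) * v_j."""
    d_k = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# Three "words" as 2-D vectors; self-attention uses the same vectors for Q, K, V.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(x, x, x)
print(result)  # each row is a context-aware blend of all three inputs
```

In a real Transformer this runs with learned projection matrices, many heads, and hundreds of dimensions — but the weighted-blend idea is the same, and it is how the model decides what “it” attends to.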

Gen AI in Enterprise Search: New way of working

Fast forward to 2023: Google Cloud researchers set out to simultaneously tackle two challenges common to many organizations:

  • How to organize enterprise information scattered across many internal systems
  • How to make this information accessible and useful to enterprises, and seamlessly available and actionable in applications such as customer service bots, document summarization, or as steps in automated workflows.

Not surprisingly, Google Cloud researchers followed the proven innovation framework:

  • Curiosity: While Google Search was designed to scale to organize the world’s information, researchers began exploring whether the technology could be scaled down and offered to enterprises to organize their own information in a way that is easily accessible and useful to them, and only to them.
  • Experimentation: Intrigued by the potential to bring together several cutting-edge technologies, researchers used web crawling to discover content on internal websites and in structured sources, and Optical Character Recognition (OCR) to extract content from all sorts of semi-structured and unstructured documents, creating a wealth of knowledge about the enterprise. They then used embeddings to extract and organize the semantic meaning of all of this data. Once an enterprise’s data has been semantically organized in embeddings, the full power of generative AI can be applied to it across the Vertex AI platform.
  • Application: First launched in March 2023, Google Cloud Vertex AI Search 🔍 quickly became “the killer enterprise app”. A killer application (killer app) is software so necessary or desirable that it proves the core value of some larger technology, such as a video game console, a software platform, or, in this case, gen AI in the enterprise. Killer apps are the pinnacle of innovation: well-designed, easy to use, and solving a real problem for users. Enterprise search is the killer enterprise app because it unlocks unprecedented levels of productivity and efficiency.
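The embed-then-search pattern described above can be sketched end to end. In this toy, the bag-of-words “embedder” is a deliberately crude stand-in for a real embedding model, and the vocabulary, file names, and documents are all hypothetical — but the pipeline shape (embed the corpus, embed the query, rank by cosine similarity) mirrors how enterprise semantic search works.

```python
import math

# Hypothetical tiny vocabulary; a real system uses a learned embedding model.
VOCAB = ["invoice", "payment", "vacation", "policy", "refund"]

def embed(text):
    """Crude bag-of-words 'embedding': count each vocab word in the text."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical "enterprise documents" discovered by crawling/OCR.
docs = {
    "expenses.txt": "invoice payment refund policy",
    "hr-handbook.txt": "vacation policy",
}
index = {name: embed(body) for name, body in docs.items()}

def search(query):
    """Return the document whose embedding is closest to the query's."""
    q = embed(query)
    return max(index, key=lambda name: cosine(q, index[name]))

print(search("refund for an invoice"))  # expenses.txt
print(search("vacation policy"))        # hr-handbook.txt
```

Swap the toy embedder for a real embedding model and the dictionary for a vector database, and this is the backbone of semantic enterprise search.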

These transformative breakthroughs exemplify Google’s dedication to AI innovation, with continued explorations on the horizon.

Ready to experience the magic of Gen AI? Explore Gemini today: https://gemini.google.com/

#GoogleAI #GenerativeAI #Innovation #ArtificialIntelligence #SemanticSearch #NLP #AIInnovation #TransformerArchitecture #ApplicationsOfAI