June 19, 2025
By Anastasiia D.
Data Management, Data Engineering, Databricks, Data & AI Summit
Earlier this month, the Data + AI Summit 2025, hosted by Databricks, took place at the Moscone Center in San Francisco. As one of the largest gatherings of data professionals, engineers, AI researchers, and enterprise leaders, the summit offered a comprehensive look into the evolving data and AI ecosystem.
The central theme of this year was bringing data and AI closer together, not just in tooling but in practice. Our colleague, Bill Sanders, was on-site to take in the keynotes, breakout sessions, and hallway conversations that make this event as much about people as it is about platforms. Here’s what stood out.
Databricks co-founder and CEO Ali Ghodsi opened the summit by tracing the conference’s decade-long journey — a story of exponential growth and a shift in industry focus from data wrangling to intelligent automation.
While the event may now draw tens of thousands, its foundations remain deeply rooted in one principle — openness. And that’s where the conversation naturally turned next.
Ali Ghodsi underscored what has always been at the core of Databricks’ growth: a deep commitment to open source and a mission to democratize data and AI.
Databricks has contributed to and supported some of the most widely adopted tools in the modern data stack, including Apache Spark, Delta Lake, MLflow, Delta Sharing, and Unity Catalog.
Beyond the tools themselves, Ghodsi reinforced the broader goal: making advanced data and AI capabilities available to as many people and organizations as possible, regardless of their infrastructure, cloud provider, or technical maturity.
This vision isn’t pursued in isolation. Ghodsi acknowledged the critical role of key partners in realizing it: AWS, Google Cloud, Microsoft Azure, Accenture, Deloitte, and the growing network of technology and marketplace partners.
He also emphasized the importance of the data providers available through the Databricks Marketplace, which builds on Delta Sharing to let businesses collaborate over live data rather than stale copies.
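To make that concrete, here is a minimal sketch of consuming shared data with the open-source delta-sharing Python client. The profile file path and the share, schema, and table names are placeholders for whatever a provider actually publishes.

```python
import delta_sharing

# A profile file issued by the data provider holds the sharing server
# endpoint and a bearer token; the path here is a placeholder.
profile = "config.share"

# List every table the provider has shared with us.
client = delta_sharing.SharingClient(profile)
for table in client.list_all_tables():
    print(table.share, table.schema, table.name)

# Load one shared table straight into a pandas DataFrame.
# URL format: <profile-file>#<share>.<schema>.<table>
orders = delta_sharing.load_as_pandas(f"{profile}#retail.sales.orders")
print(orders.head())
```

Because the consumer reads the provider's live Delta tables directly, there is no export pipeline to maintain and no drift between the two sides.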
And at the center of it all: 15,000 customers using Databricks to power real-world outcomes. As Ghodsi put it, “That's the real impact. That's where the interesting things are happening.”
While the momentum around data and AI has never been greater, many organizations — especially those with long histories and legacy systems — struggle with AI adoption. The excitement is there. The potential is clear. But the path forward remains blocked by deep architectural and organizational complexity.
“For data and analytics leaders, this moment presents an opportunity to re-evaluate architectural choices, explore new modes of user engagement, and consider emerging frameworks for modernization and AI adoption.”
Bill Sanders
Here’s the most common scenario Databricks encounters across industries: an ecosystem of fragmented, overlapping technologies that have grown organically over time, often without a cohesive plan.
Most enterprises deal with some combination of:
- data warehouses and data lakes running side by side,
- ETL and streaming pipelines stitched between them,
- BI and reporting tools layered on top,
- separate platforms for ML and AI workloads.
Each component may solve a specific problem. But together, they form a brittle, disjointed architecture — difficult to evolve and expensive to maintain. This results in:
- duplicated data and logic across systems,
- inconsistent security and governance policies,
- slow, costly integration work whenever anything changes.
Ghodsi argued that the real problem lies beneath the surface — in the metadata.
Each system not only stores data but also metadata, access controls, security models, and governance logic. Enterprises don’t just manage data silos — they manage policy silos. This makes end-to-end visibility, governance, and control nearly impossible. Without a unified approach to data and metadata, AI becomes harder to operationalize — and even harder to trust.
These challenges mirror what we observe across many of our client engagements, where disconnected data pipelines, legacy BI tooling, and fragmented governance delay innovation. In our experience, optimizing data engineering tasks can result in up to 24% faster ML workflows.
In response to the architectural and operational fragmentation, Databricks continues to advance a solution that is both technically rigorous and strategically bold: the Lakehouse.
It’s a concept introduced by Databricks five years ago — initially met with skepticism, particularly from traditional data warehousing vendors and cloud hyperscalers. But today, the model has gained widespread traction. And in Ghodsi’s view, it’s not just a compelling architecture — it’s a pragmatic path forward.
The first move is deceptively simple: centralize data using open formats.
That means getting data out of proprietary systems and into affordable, open cloud object storage, such as Amazon S3, Azure Data Lake Storage (ADLS), or Google Cloud Storage (GCS), using non-proprietary table formats like Delta Lake or Apache Iceberg.
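As a minimal sketch of that first step, the snippet below lands a dataset in cloud object storage as a Delta table using PySpark. The source file and bucket path are hypothetical, and on Databricks the Delta Lake configuration shown is already set up for you.

```python
from pyspark.sql import SparkSession

# Delta Lake extensions are preconfigured on Databricks; set them
# explicitly only when running open-source Spark elsewhere.
spark = (
    SparkSession.builder.appName("open-format-landing")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    .getOrCreate()
)

# Placeholder source: an export pulled out of a proprietary system.
orders = spark.read.option("header", True).csv("/tmp/exports/orders.csv")

# Land it in open cloud object storage as a Delta table.
# (An Iceberg sink would look much the same with format("iceberg").)
orders.write.format("delta").mode("overwrite").save(
    "s3://acme-lakehouse/bronze/orders"  # hypothetical bucket
)
```

Once the table lives in an open format on storage you own, any engine that speaks Delta or Iceberg can read it, which is exactly the vendor leverage the Lakehouse argument rests on.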
The goal is not just interoperability. It’s control. The Lakehouse gives organizations full ownership of their data while preserving the performance and structure that bridge the historical gap between lakes and warehouses.
But open storage alone isn’t enough. The real breakthrough comes with governance — and more specifically, a unified governance layer that spans the entire data estate.
Key Idea 1: Govern All Data Assets
Modern governance must extend beyond databases. That includes:
- tables and files in cloud object storage,
- ML models, features, and notebooks,
- dashboards and other BI assets,
- the AI agents now being built on top of them.
The vision is lineage-aware governance: track and secure data across its full lifecycle in one consistent framework.
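On Databricks specifically, that lineage is queryable: Unity Catalog records it in system tables. Below is a hedged sketch, assuming lineage system tables are enabled in your workspace; the target table name is hypothetical, and the columns follow the documented system.access schema.

```python
# Run inside a Databricks notebook, where `spark` is predefined.
# Unity Catalog captures table-to-table lineage automatically,
# so "what feeds this table?" becomes an ordinary query.
lineage = spark.sql("""
    SELECT
        source_table_full_name,
        target_table_full_name,
        entity_type,
        event_time
    FROM system.access.table_lineage
    WHERE target_table_full_name = 'main.sales.orders'  -- hypothetical table
    ORDER BY event_time DESC
    LIMIT 20
""")
lineage.show(truncate=False)
```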
Key Idea 2: Unified Capabilities on Top of Data
True governance isn’t just about access permissions. It’s about enabling clarity, coordination, and control across teams. Unity Catalog, Databricks’ governance layer, is designed to support:
- fine-grained access control across workspaces and clouds,
- auditing and automated lineage tracking,
- data discovery and documentation,
- secure sharing with external partners.
While many solutions in the market (like Polaris) focus narrowly on access control for structured data, Databricks is positioning Unity Catalog as a governance platform for the full data + AI lifecycle.
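For a concrete feel of that access-control layer, here is a minimal sketch of Unity Catalog's SQL-based grants, issued from Python. The catalog, schema, table, and group names are all hypothetical.

```python
# Run inside a Databricks notebook, where `spark` is predefined.
# Unity Catalog privileges are granted with standard SQL; the same
# statements work from a SQL editor.
for statement in [
    "GRANT USE CATALOG ON CATALOG main TO `analysts`",
    "GRANT USE SCHEMA ON SCHEMA main.sales TO `analysts`",
    "GRANT SELECT ON TABLE main.sales.orders TO `analysts`",
]:
    spark.sql(statement)

# Verify what the group can now do.
spark.sql("SHOW GRANTS `analysts` ON TABLE main.sales.orders").show()
```

Because the grant lives in the catalog rather than in any one engine, the same policy applies whether the table is read from a notebook, a SQL warehouse, or a BI tool.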
Databricks extends the Lakehouse into a Data Intelligence Platform, where AI is embedded throughout to make data and insights accessible to everyone. The vision focuses on two goals: letting every user, technical or not, ask questions of their data in natural language, and using AI to automate the operational heavy lifting of running the platform itself.
By combining open data infrastructure with intelligent automation, Databricks aims to simplify the user experience and prepare organizations for the next wave of AI-native workflows. We've explored this in depth in our experiment on AI in frontend and backend engineering, where AI streamlines developer velocity across the stack.
Jamie Dimon, CEO of JPMorgan Chase and one of the guest speakers, hammered home the message: “AI isn’t the hard part — data is.”
That message echoed throughout the Databricks Data + AI Summit 2025, where the spotlight was not on flashy AI demos, but on the groundwork required to make AI viable at scale. The event revealed a decisive shift in how enterprises approach data, moving beyond fragmented stacks and legacy dashboards toward simplified, AI-native platforms that emphasize usability, governance, and trust.
While some of the announcements will take time to translate into real-world impact, the direction is unmistakable: simplified experiences, governed access, and more intelligent consumption of data.
As data complexity and expectations around AI grow, data management takes center stage. The Janea Systems team brings over two decades of experience helping organizations manage and modernize their data infrastructure, from open-source platforms to enterprise-grade software.
We're proud to be one of Microsoft’s long-standing technical partners, and our engineering team has co-developed mission-critical Microsoft projects.
Whether you're modernizing pipelines or adopting AI across the SDLC, our team has the experience to deliver intelligent systems that scale. We’ve explored AI-assisted development across the software development lifecycle and now apply those findings to our clients’ projects.
Get in touch – let's explore ways to simplify and scale your data architecture.
Ready to discuss your software engineering needs with our team of experts?