Saket Saurabh, CEO and Co-Founder of Nexla, is an entrepreneur with a deep passion for data and infrastructure. He is leading the development of a next-generation, automated data engineering platform designed to bring scale and velocity to those working with data.
Previously, Saurabh founded a successful mobile startup that achieved significant milestones, including acquisition, IPO, and growth into a multi-million-dollar business. He also contributed to multiple innovative products and technologies during his tenure at Nvidia.
Nexla enables the automation of data engineering so that data can be ready to use. They achieve this through a unique approach built on Nexsets – data products that make it easy for anyone to integrate, transform, deliver, and monitor data.
What inspired you to co-found Nexla, and how did your experiences in data engineering shape your vision for the company?
Prior to founding Nexla, I started my data engineering journey at Nvidia, building highly scalable, high-end technology on the compute side. After that, I took my previous startup through an acquisition and IPO journey in the mobile advertising space, where large amounts of data and machine learning were a core part of our offering – we processed about 300 billion records of data every day.
Looking at the landscape in 2015 after my previous company went public, I was seeking the next big challenge that excited me. Coming from those two backgrounds, it was very clear to me that the data and compute challenges were converging as the industry was moving towards more advanced applications powered by data and AI.
While we didn’t know at the time that Generative AI (GenAI) would progress as rapidly as it has, it was obvious that machine learning and AI would be the foundation for taking advantage of data. So I started to think about what kind of infrastructure is needed for people to be successful in working with data, and how we can make it possible for anybody, not just engineers, to leverage data in their day-to-day professional lives.
That led to the vision for Nexla – to simplify and automate the engineering behind data, as data engineering was a very bespoke solution within most companies, especially when dealing with complex or large-scale data problems. The goal was to make data accessible and approachable for a wider range of users, not just data engineers. My experiences in building scalable data systems and applications fueled this vision to democratize access to data through automation and simplification.
How do Nexsets exemplify Nexla’s mission to make data ready-to-use for everyone, and why is this innovation crucial for modern enterprises?
Nexsets exemplify Nexla’s mission to make data ready-to-use for everyone by addressing the core challenge of data. The 3Vs of data – volume, velocity, and variety – have been a persistent issue. The industry has made some progress in tackling challenges with volume and velocity. However, the variety of data has remained a significant hurdle, as the proliferation of new systems and applications has led to an ever-increasing diversity in data structures and formats.
Nexla’s approach is to automatically model and connect data from diverse sources into a consistent, packaged entity, a data product that we call a Nexset. This allows users to access and work with data without having to understand the underlying complexity of the various data sources and structures. A Nexset acts as a gateway, providing a simple, straightforward interface to the data.
This is crucial for modern enterprises because it enables more people, not just data engineers, to leverage data in their day-to-day work. By abstracting away the variety and complexity of data, Nexsets makes it possible for business users, analysts, and others to directly interact with the data they need, without requiring extensive technical expertise.
We also worked on making integration easy to use for less technical data consumers – from the user interface and how people collaborate and govern data to how they build transforms and workflows. Abstracting away the complexity of data variety is key to democratizing access to data and empowering a wider range of users to derive value from their information assets. This is a critical capability for modern enterprises seeking to become more data-driven and leverage data-powered insights across the organization.
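To make the idea of a data product concrete, here is a minimal, hypothetical sketch – not Nexla's actual API – of what an abstraction like a Nexset implies: consumers see one logical schema and one read interface, while the source-specific details (a CSV file here, but it could just as well be an API or a queue) stay hidden behind it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator
import csv

# Hypothetical illustration only -- not Nexla's actual API.
# A "data product" exposes one logical schema and one records() interface;
# the source-specific reader stays hidden from the data consumer.

@dataclass
class DataProduct:
    name: str
    schema: Dict[str, str]  # logical field -> type, e.g. {"order_id": "string"}
    reader: Callable[[], Iterator[Dict[str, Any]]]  # source-specific record reader

    def records(self) -> Iterator[Dict[str, Any]]:
        """Yield records conformed to the logical schema."""
        for raw in self.reader():
            yield {field: raw.get(field) for field in self.schema}

def csv_reader(path: str) -> Callable[[], Iterator[Dict[str, Any]]]:
    def read() -> Iterator[Dict[str, Any]]:
        with open(path, newline="") as f:
            yield from csv.DictReader(f)
    return read

# A consumer works against the data product, not against the file or API behind it.
orders = DataProduct(
    name="orders",
    schema={"order_id": "string", "amount": "float", "currency": "string"},
    reader=csv_reader("orders.csv"),  # hypothetical sample file
)
for record in orders.records():
    print(record)
```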
What makes data “GenAI-ready,” and how does Nexla address these requirements effectively?
The answer partly depends on how you’re using GenAI. The majority of companies are implementing GenAI with Retrieval-Augmented Generation (RAG). That requires first preparing and encoding data to load into a vector database, and then retrieving that data via search and adding it as context to the prompt sent to a Large Language Model (LLM) that hasn’t been trained on this data. So the data needs to be prepared in such a way that it works well both for vector searches and for LLMs.
Regardless of whether you’re using RAG, Retrieval-Augmented Fine-Tuning (RAFT), or doing model training, there are a few key requirements:
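As a rough illustration of that flow, here is a minimal sketch, assuming a toy in-memory index and placeholder embedding and LLM calls rather than any particular vector database or model vendor:

```python
import numpy as np

# Minimal RAG sketch. embed() is a placeholder for a real embedding model,
# and the final LLM call is only indicated, not made.

def embed(text: str) -> np.ndarray:
    # Placeholder pseudo-embedding; a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def chunk(document: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping character windows for indexing."""
    step = size - overlap
    return [document[i:i + size] for i in range(0, len(document), step)]

# 1. Prepare: chunk and encode documents into a (toy, in-memory) vector index.
documents = ["...enterprise policy text...", "...product FAQ text..."]  # stand-in data
index = [(c, embed(c)) for doc in documents for c in chunk(doc)]

# 2. Retrieve: find the chunks most similar to the question.
def retrieve(question: str, k: int = 3) -> list[str]:
    q = embed(question)
    scored = sorted(index, key=lambda item: float(item[1] @ q), reverse=True)
    return [text for text, _ in scored[:k]]

# 3. Augment: add retrieved chunks as context to the LLM prompt.
question = "What is the refund policy?"
context = "\n\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# generate_answer(prompt)  # hypothetical call to whichever LLM you use
```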
- Data format: GenAI LLMs often work best with data in a specific format. The data needs to be structured in a way that the models can easily ingest and process. It should also be “chunked” in a way that helps the LLM better use the data.
- Connectivity: GenAI LLMs need to be able to dynamically access the relevant data sources, rather than relying on static data sets. This requires continual connectivity to the various enterprise systems and data repositories.
- Security and governance: When using sensitive enterprise data, it’s critical to have robust security and governance controls in place. The data access and usage need to be secure and compliant with existing organizational policies. You also need to govern data used by LLMs to help prevent data breaches.
- Scalability: GenAI LLMs can be data- and compute-intensive, so the underlying data infrastructure needs to be able to scale to meet the demands of these models.
Nexla addresses these requirements for making data GenAI-ready in a few key ways:
- Dynamic data access: Nexla’s data integration platform provides a single way to connect to hundreds of sources and uses various integration styles and data speeds, along with orchestration, to give GenAI LLMs the most recent data they need, when they need it, rather than relying on static data sets.
- Data preparation: Nexla has the capability to extract, transform and prepare data in formats optimized for each GenAI use case, including built-in data chunking and support for multiple encoding models.
- Self-service and collaboration: With Nexla, data consumers not only access data on their own and build Nexsets and flows; they can also collaborate and share their work via a marketplace that ensures data is in the right format and improves productivity through reuse.
- Auto-generation: Integration and GenAI are both hard. Nexla auto-generates many of the steps needed based on choices by the data consumer – using AI and other techniques – so that users can do the work on their own.
- Governance and security: Nexla incorporates robust security and governance controls throughout, including collaboration, to ensure that sensitive enterprise data is accessed and used in a secure and compliant manner.
- Scalability: The Nexla platform is designed to scale to handle the demands of GenAI workloads, providing the necessary compute power and elastic scale.
Converged integration, self-service and collaboration, auto-generation, and data governance need to be built together to make data democratization possible.
How do diverse data types and sources contribute to the success of GenAI models, and what role does Nexla play in simplifying the integration process?
GenAI models need access to all kinds of information to deliver the best insights and generate relevant outputs. If you don’t provide this information, you shouldn’t expect good results. It’s the same with people.
GenAI models need to be trained on a broad range of data, from structured databases to unstructured documents, to build a comprehensive understanding of the world. Different data sources, such as news articles, financial reports, and customer interactions, provide valuable contextual information that these models can leverage. Exposure to diverse data also allows GenAI models to become more flexible and adaptable, enabling them to handle a wider range of queries and tasks.
Nexla abstracts away the variety of all this data with Nexsets, and makes it easy to access just about any source, then extract, transform, orchestrate, and load data so data consumers can focus just on the data, and on making it GenAI-ready.
What trends are shaping the data ecosystem in 2025 and beyond, particularly with the rise of GenAI?
Companies have mostly been focused on using GenAI to build assistants, or copilots, to help people find answers and make better decisions. Agentic AI – agents that automate tasks without people being involved – is definitely a growing trend as we move into 2025. Agents, just like copilots, need integration to ensure that data flows seamlessly – not just in one direction, but also in a way that enables the AI to act on that data.
Another major trend for 2025 is the increasing complexity of AI systems. These systems are becoming more sophisticated by combining components from different sources to create cohesive solutions. It’s similar to how humans rely on various tools throughout the day to accomplish tasks. Empowered AI systems will follow this approach, orchestrating multiple tools and components. This orchestration presents a significant challenge but also a key area of development.
From a trends perspective, we’re seeing a push toward generative AI advancing beyond simple pattern matching to actual reasoning. There’s a lot of technological progress happening in this space. While these advancements might not fully translate into commercial value in 2025, they represent the direction we’re heading.
Another key trend is the increased application of accelerated technologies for AI inferencing, particularly with companies like Nvidia. Traditionally, GPUs have been heavily used for training AI models, but runtime inferencing—the point where the model is actively used—is becoming equally important. We can expect advancements in optimizing inferencing, making it more efficient and impactful.
Additionally, there’s a realization that the available training data has largely been maxed out. This means further improvements in models won’t come from adding more data during training but from how models operate during inferencing. At runtime, leveraging new information to enhance model outcomes is becoming a critical focus.
While some exciting technologies begin to reach their limits, new approaches will continue to arise, ultimately highlighting the importance of agility for organizations adopting AI. What works well today could become obsolete within six months to a year, so be prepared to add or replace data sources and any components of your AI pipelines. Staying adaptable and open to change is critical to keeping up with the rapidly evolving landscape.
What strategies can organizations adopt to break down data silos and improve data flow across their systems?
First, people need to accept that data silos will always exist. This has always been the case. Many organizations attempt to centralize all their data in one place, believing it will create an ideal setup and unlock significant value, but this proves nearly impossible. It often turns into a lengthy, costly, multi-year endeavor, particularly for large enterprises.
So, the reality is that data silos are here to stay. Once we accept that, the question becomes: How can we work with data silos more efficiently?
A helpful analogy is to think about large companies. No major corporation operates from a single office where everyone works together globally. Instead, they split into headquarters and multiple offices. The goal isn’t to resist this natural division but to ensure those offices can collaborate effectively. That’s why we invest in productivity tools like Zoom or Slack—to connect people and enable seamless workflows across locations.
Similarly, data silos are fragmented systems that will always exist across teams, divisions, or other boundaries. The key isn’t to eliminate them but to make them work together smoothly. Knowing this, we can focus on technologies that facilitate these connections.
For instance, technologies like Nexsets provide a common interface or abstraction layer that works across diverse data sources. By acting as a gateway to data silos, they simplify the process of interoperating with data spread across various silos. This creates efficiencies and minimizes the negative impacts of silos.
In essence, the strategy should be about enhancing collaboration between silos rather than trying to fight them. Many enterprises make the mistake of attempting to consolidate everything into a massive data lake. But, to be honest, that’s a nearly impossible battle to win.
How do modern data platforms handle challenges like speed and scalability, and what sets Nexla apart in addressing these issues?
The way I see it, many tools within the modern data stack were initially designed with a focus on ease of use and development speed, which came from making the tools more accessible – enabling marketing analysts to move their data from a marketing platform directly to a visualization tool, for example. The evolution of these tools often involved the development of point solutions, or tools designed to solve specific, narrowly defined problems.
When we talk about scalability, people often think of scaling in terms of handling larger volumes of data. But the real challenge of scalability comes from two main factors: The increasing number of people who need to work with data, and the growing variety of systems and types of data that organizations need to manage.
Modern tools, being highly specialized, tend to solve only a small subset of these challenges. As a result, organizations end up using multiple tools, each addressing a single problem, which eventually creates its own challenges, like tool overload and inefficiency.
Nexla addresses this issue by striking a careful balance between ease of use and flexibility. On one hand, we provide simplicity through features like templates and user-friendly interfaces. On the other hand, we offer flexibility and developer-friendly capabilities that allow teams to continuously enhance the platform. Developers can add new capabilities to the system, but these enhancements remain accessible as simple buttons and clicks for non-technical users. This approach avoids the trap of overly specialized tools while delivering a broad range of enterprise-grade functionalities.
What truly sets Nexla apart is its ability to combine ease of use with the scalability and breadth required by organizations. Our platform connects these two worlds seamlessly, enabling teams to work efficiently without compromising on power or flexibility.
One of Nexla’s main strengths lies in its abstracted architecture. For example, while users can visually design a data pipeline, the way that pipeline executes is highly adaptable. Depending on the user’s requirements—such as the source, destination, or whether the data needs to be real-time—the platform automatically maps the pipeline to one of six different engines. This ensures optimal performance without requiring users to manage these complexities manually.
The platform is also loosely coupled, meaning that source systems and destination systems are decoupled. This allows users to easily add more destinations to existing sources, add more sources to existing destinations, and enable bi-directional integrations between systems.
Importantly, Nexla abstracts the design of pipelines so users can handle batch data, streaming data, and real-time data without changing their workflows or designs. The platform automatically adapts to these needs, making it easier for users to work with data in any format or speed. This is more about thoughtful design than programming language specifics, ensuring a seamless experience.
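A simplified, hypothetical sketch of what that kind of abstraction implies (these are illustrative names, not Nexla's internals): the pipeline is declared once, and a planner picks an execution mode from the declared source, destination, and latency requirements.

```python
from dataclasses import dataclass

# Hypothetical illustration of engine-agnostic pipeline design -- not Nexla's internals.
# The user declares what should move where and how fresh it must be; the platform
# decides how to run it (batch, micro-batch, streaming, ...).

@dataclass(frozen=True)
class PipelineSpec:
    source: str          # e.g. "kafka://orders" or "s3://exports/merchants/"
    destination: str     # e.g. "snowflake://analytics.orders"
    max_latency_s: int   # freshness the data consumer needs

def choose_engine(spec: PipelineSpec) -> str:
    """Map one declarative spec onto an execution mode."""
    if spec.source.startswith("kafka://") and spec.max_latency_s <= 5:
        return "streaming"
    if spec.max_latency_s <= 300:
        return "micro-batch"
    return "batch"

# The same spec format covers real-time and batch cases; only the planner's
# choice changes, not the user's pipeline design.
print(choose_engine(PipelineSpec("kafka://orders", "snowflake://analytics.orders", 2)))
print(choose_engine(PipelineSpec("s3://exports/merchants/", "snowflake://analytics.raw", 3600)))
```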
All of this illustrates that we built Nexla with the end consumer of data in mind. Many traditional tools were designed for those producing data or managing systems, but we focus on the needs of data consumers who want consistent, straightforward interfaces to access data, regardless of its source. Prioritizing the consumer’s experience enabled us to design a platform that simplifies access to data while maintaining the flexibility needed to support diverse use cases.
Can you share examples of how no-code and low-code features have transformed data engineering for your customers?
No-code and low-code features have transformed the data engineering process into a truly collaborative experience for users. For example, in the past, DoorDash’s account operations team, which manages data for merchants, needed to provide requirements to the engineering team. The engineers would then build solutions, leading to an iterative back-and-forth process that consumed a lot of time.
Now, with no-code and low-code tools, this dynamic has changed. The day-to-day operations team can use a low-code interface to handle their tasks directly. Meanwhile, the engineering team can quickly add new features and capabilities through the same low-code platform, enabling immediate updates. The operations team can then seamlessly use these features without delays.
This shift has turned the process into a collaborative effort rather than a creative bottleneck, resulting in significant time savings. Customers have reported that tasks that previously took two to three months can now be completed in under two weeks—a 5x to 10x improvement in speed.
How is the role of data engineering evolving, particularly with the increasing adoption of AI?
Data engineering is evolving rapidly, driven by automation and advancements like GenAI. Many aspects of the field, such as code generation and connector creation, are becoming faster and more efficient. For instance, with GenAI, the pace at which connectors can be generated, tested, and deployed has drastically improved. But this progress also introduces new challenges, including increased complexity, security concerns, and the need for robust governance.
One pressing concern is the potential misuse of enterprise data. Businesses worry that their proprietary data may inadvertently be used to train AI models, costing them their competitive edge, or that the data may be leaked to others in a breach. The growing complexity of systems and the sheer volume of data require data engineering teams to adopt a broader perspective, focusing on overarching system issues like security, governance, and ensuring data integrity. These challenges cannot simply be solved by AI.
While generative AI can automate lower-level tasks, the role of data engineering is shifting toward orchestrating the broader ecosystem. Data engineers now act more like conductors, managing numerous interconnected components and processes like setting up safeguards to prevent errors or unauthorized access, ensuring compliance with governance standards, and monitoring how AI-generated outputs are used in business decisions.
Errors and mistakes in these systems can be costly. For example, AI systems might pull outdated policy information, leading to incorrect responses, such as promising a refund to a customer when it isn’t allowed. These types of issues require rigorous oversight and well-defined processes to catch and address errors before they impact the business.
Another key responsibility for data engineering teams is adapting to the shift in user demographics. AI tools are no longer limited to analysts or technical users who can question the validity of reports and data. These tools are now used by individuals at the edges of the organization, such as customer support agents, who may not have the expertise to challenge incorrect outputs. This wider democratization of technology increases the responsibility of data engineering teams to ensure data accuracy and reliability.
What new features or advancements can be expected from Nexla as the field of data engineering continues to grow?
We’re focusing on several advancements to address emerging challenges and opportunities as data engineering continues to evolve. One is using AI to address data variety. A major challenge in data engineering is managing the variety of data from diverse sources, so we’re leveraging AI to streamline this process. For example, when receiving data from hundreds of different merchants, the system can automatically map it into a standard structure. Today, this process often requires significant human input, but Nexla’s AI-driven capabilities aim to minimize manual effort and enhance efficiency.
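A toy sketch of the kind of mapping involved (the field names and merchants are hypothetical; in practice the mapping would be inferred, for example by an AI model, rather than written by hand per merchant):

```python
# Toy illustration of mapping differently shaped merchant records onto one
# standard structure. Field names are hypothetical; a real system would learn
# or suggest these mappings instead of hard-coding them per merchant.

STANDARD_FIELDS = ["merchant_id", "order_id", "amount", "currency"]

# Per-merchant mapping from the standard field to that merchant's field name.
MERCHANT_MAPPINGS = {
    "merchant_a": {"merchant_id": "mid", "order_id": "orderNo", "amount": "total", "currency": "ccy"},
    "merchant_b": {"merchant_id": "merchantId", "order_id": "id", "amount": "grand_total", "currency": "currency"},
}

def to_standard(merchant: str, record: dict) -> dict:
    """Rename one merchant's fields into the standard structure."""
    mapping = MERCHANT_MAPPINGS[merchant]
    return {field: record.get(mapping[field]) for field in STANDARD_FIELDS}

print(to_standard("merchant_a", {"mid": "A1", "orderNo": "1001", "total": 25.0, "ccy": "USD"}))
print(to_standard("merchant_b", {"merchantId": "B7", "id": "9", "grand_total": 12.5, "currency": "EUR"}))
```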
We’re also advancing our connector technology to support the next generation of data workflows, including the ability to easily generate new agents. These agents enable seamless connections to new systems and allow users to perform specific actions within those systems. This is particularly geared toward the growing needs of GenAI users and making it easier to integrate and interact with a variety of platforms.
Third, we continue to innovate on improved monitoring and quality assurance. As more users consume data across various systems, the importance of monitoring and ensuring data quality has grown significantly. Our aim is to provide robust tools for system monitoring and quality assurance so data remains reliable and actionable even as usage scales.
Finally, Nexla is also taking steps to open-source some of our core capabilities. The thought is that by sharing our tech with the broader community, we can empower more people to take advantage of advanced data engineering tools and solutions, which ultimately reflects our commitment to fostering innovation and collaboration within the field.
Thank you for the great responses. Readers who wish to learn more should visit Nexla.