Denas Grybauskas, Chief Governance and Strategy Officer at Oxylabs – Interview Series

Denas Grybauskas is the Chief Governance and Strategy Officer at Oxylabs, a global leader in web intelligence collection and premium proxy solutions.

Founded in 2015, Oxylabs provides one of the largest ethically sourced proxy networks in the world—spanning over 177 million IPs across 195 countries—along with advanced tools like Web Unblocker, Web Scraper API, and OxyCopilot, an AI-powered scraping assistant that converts natural language into structured data queries.

You’ve had an impressive legal and governance journey across Lithuania’s legal tech space. What personally motivated you to tackle one of AI’s most polarising challenges—ethics and copyright—in your role at Oxylabs?

Oxylabs has always been a flagbearer for responsible innovation in the industry. We were the first to advocate for ethical proxy sourcing and for web scraping industry standards. Now, with AI moving so fast, we must make sure that innovation is balanced with responsibility.

We saw this as a huge problem facing the AI industry, and we could also see the solution. By providing these datasets, we're enabling AI companies and creators to be on the same page regarding fair AI development, which benefits everyone involved. We knew how important it was to keep creators' rights at the forefront while still providing content for the development of future AI systems, so we created these datasets to meet the demands of today's market.

The UK is in the midst of a heated copyright battle, with strong voices on both sides. How do you interpret the current state of the debate between AI innovation and creator rights?

While it's important that the UK government treats productive technological innovation as a priority, it's vital that creators feel empowered and protected by AI, not stolen from. The legal framework currently under debate must find a sweet spot between fostering innovation and protecting creators, and I hope in the coming weeks we see them strike that balance.

Oxylabs has just launched the world's first ethical YouTube datasets, which require creator consent for AI training. How exactly does this consent process work, and how scalable is it for other industries like music or publishing?

All of the millions of original videos in the datasets have the explicit consent of the creators to be used for AI training, connecting creators and innovators ethically. All datasets offered by Oxylabs include videos, transcripts, and rich metadata. While such data has many potential use cases, Oxylabs refined and prepared it specifically for AI training, which is the use that the content creators have knowingly agreed to.

Many tech leaders argue that requiring explicit opt-in from all creators could “kill” the AI industry. What’s your response to that claim, and how does Oxylabs’ approach prove otherwise?

Requiring a prior explicit opt-in for every use of material in AI training presents significant operational challenges and would come at a significant cost to AI innovation. Instead of protecting creators' rights, it could unintentionally incentivise companies to shift development activities to jurisdictions with less rigorous enforcement or differing copyright regimes. However, this does not mean there is no middle ground where AI development is encouraged while copyright is respected. On the contrary, what we need are workable mechanisms that simplify the relationship between AI companies and creators.

These datasets offer one approach to moving forward. The opt-out model, according to which content can be used unless the copyright owner explicitly opts out, is another. The third way would be facilitating deal-making between publishers, creators, and AI companies through technological solutions, such as online platforms.

Ultimately, any solution must operate within the bounds of applicable copyright and data protection laws. At Oxylabs, we believe AI innovation must be pursued responsibly, and our goal is to contribute to lawful, practical frameworks that respect creators while enabling progress.

What were the biggest hurdles your team had to overcome to make consent-based datasets viable?

The path for us was opened by YouTube, which enabled content creators to easily and conveniently license their work for AI training. After that, our work was mostly technical: gathering, cleaning, and structuring the data to prepare the datasets, and building the entire technical setup for companies to access the data they need. But this is something we've been doing for years, in one way or another. Of course, each case presents its own set of challenges, especially when you're dealing with something as large and complex as multimodal data. But we had both the knowledge and the technical capacity to do this. So once YouTube creators got the chance to give consent, the rest was only a matter of putting our time and resources into it.

Beyond YouTube content, do you envision a future where other major content types—such as music, writing, or digital art—can also be systematically licensed for use as training data?

For a while now, we have been pointing out the need for a systematic approach to consent-giving and content-licensing in order to enable AI innovation while balancing it with creator rights. Only when there is a convenient and cooperative way for both sides to achieve their goals will there be mutual benefit.

This is just the beginning. We believe that providing datasets like ours across a range of industries can provide a solution that finally brings the copyright debate to an amicable close.

Does the importance of offerings like Oxylabs’ ethical datasets vary depending on different AI governance approaches in the EU, the UK, and other jurisdictions?

On the one hand, the availability of explicit-consent-based datasets levels the field for AI companies based in jurisdictions where governments lean toward stricter regulation. The primary concern of these companies is that, rather than supporting creators, strict rules for obtaining consent will only give an unfair advantage to AI developers in other jurisdictions. The problem is not that these companies don’t care about consent but rather that without a convenient way to obtain it, they are doomed to lag behind.

On the other hand, we believe that if granting consent and accessing data licensed for AI training is simplified, there is no reason why this approach should not become the preferred way globally. Our datasets built on licensed YouTube content are a step toward this simplification.

With growing public distrust toward how AI is trained, how do you think transparency and consent can become competitive advantages for tech companies?

Although transparency is often seen as a hindrance to competitive advantage, it is also our greatest weapon against mistrust. The more transparency AI companies can provide, the more evidence there is of ethical and beneficial AI training, which rebuilds trust in the AI industry. In turn, creators who see that they and society can get value from AI innovation will have more reason to give consent in the future.

Oxylabs is often associated with data scraping and web intelligence. How does this new ethical initiative fit into the broader vision of the company?

The release of ethically sourced YouTube datasets continues our mission at Oxylabs to establish and promote ethical industry practices. As part of this, we co-founded the Ethical Web Data Collection Initiative (EWDCI) and introduced an industry-first transparent tier framework for proxy sourcing. We also launched Project 4β as part of our mission to enable researchers and academics to maximise their research impact and enhance the understanding of critical public web data.

Looking ahead, do you think governments should mandate consent-by-default for training data, or should it remain a voluntary industry-led initiative?

In a free market economy, it is generally best to let the market correct itself. By allowing innovation to develop in response to market needs, we continually reinvent and renew our prosperity. Heavy-handed legislation is never a good first choice and should only be resorted to when all other avenues to ensure justice while allowing innovation have been exhausted.

It doesn't look like we have reached that point in AI training yet. YouTube's licensing options for creators and our datasets demonstrate that this ecosystem is actively seeking ways to adapt to new realities. Thus, while clear regulation is, of course, needed to ensure that everyone acts within their rights, governments might want to tread lightly. Rather than requiring express consent in every case, they might examine the ways industries can develop mechanisms for resolving the current tensions and take their cues from that when legislating, so as to encourage innovation rather than hinder it.

What advice would you offer to startups and AI developers who want to prioritise ethical data use without stalling innovation?

One way startups can help facilitate ethical data use is by developing technological solutions that simplify the process of obtaining consent and deriving value for creators. As options to acquire transparently sourced data emerge, AI companies need not compromise on speed; therefore, I advise them to keep their eyes open for such offerings.

Thank you for the great interview. Readers who wish to learn more should visit Oxylabs.
