<rss xmlns="" version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>Cloudera Blog</title><description>An RSS news feed containing the latest blog entries from the Cloudera blog</description><link>https://www.cloudera.com/blog.html</link><language>en-US</language><lastBuildDate/><generator/><atom:link rel="self" type="application/rss+xml"><href>https://www.cloudera.com/api/www/blog-feed</href></atom:link><sy:updatePeriod/><item><title>Data for AI Anywhere: Cloudera’s AI Investments Are Fueling a Hiring Surge</title><description><![CDATA[In an industry defined by reductions in force and hiring freezes, Cloudera is taking a different path and actively expanding its global workforce to meet accelerating demand for enterprise AI.]]></description><link>https://www.cloudera.com/blog/business/clouderas-ai-investments-are-fueling-a-hiring-surge.html</link><guid>https://www.cloudera.com/blog/business/clouderas-ai-investments-are-fueling-a-hiring-surge.html</guid><pubDate>Thu, 02 Apr 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Angela Mann]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-overhead-shot-employees-working-2200202394.webp"><p>In an industry defined by reductions in force and hiring freezes, Cloudera is taking a different path and actively expanding its global workforce to meet accelerating demand for enterprise AI.</p>
<p>This growth is the direct result of a multi-year investment strategy in Research &amp; Development (R&amp;D) and AI, which is now entering its breakout phase. We are augmenting our teams globally to build the platform that makes enterprise AI possible anywhere.</p>
<h2>Why R&amp;D is Our North Star</h2>
<p>As our CTO, Sergio Gago, recently noted, we have entered the “Era of Convergence,” where data centers and cloud come together so that AI can be managed “<a href="https://securityonscreen.com/cloudera-era-of-convergence-ai-dec25/" target="_blank" rel="noopener noreferrer">as another part of the workforce</a>.” This shift from experimental pilots to enterprise-scale impact is exactly why we are expanding our R&amp;D teams to build a unified architecture that allows our customers to bring AI to their data, anywhere it lives.</p>
<h2>The Strategy: Investing in the &quot;Era of Convergence&quot;</h2>
<p>The experimentation phase of AI is over. Enterprises are moving from simple proofs of concept to agentic AI with autonomous workflows that require secure, governed access to data across hybrid environments.</p>
<p>To meet this demand, we have significantly ramped up our R&amp;D spending, focusing on:</p>
<ul>
<li><p><b>Cloudera AI Inference:</b> Powered by NVIDIA technology to scale GenAI, agentic workflows, and traditional predictive ML use cases</p>
</li>
<li><p><b>AI Agent Studio:</b> Empowering developers and business teams to build autonomous agents within a trusted data ecosystem using low- and no-code techniques</p>
</li>
<li><p><b>Unified Data:</b> Blurring the lines between the clouds and on-premises data centers to ensure 100% of your data can be made &quot;AI-ready&quot; without friction</p>
</li>
</ul>
<h2>Deep Dive on the Launch of Cloudera Agent Studio</h2>
<p>The surge in our R&amp;D hiring is a direct response to a fundamental shift in the market. In 2024 and 2025, enterprises were experimenting with LLMs. In 2026, they are operationalizing them.</p>
<p>To lead this transition, we recently unveiled <a href="https://docs.cloudera.com/machine-learning/cloud/use-ai-studios/topics/ml-agent-studio-overview.html" target="_blank">Cloudera Agent Studio</a>, a centerpiece of our AI roadmap. Agentic AI is the new frontier: systems that can plan, reason, and execute multi-step tasks across a company's entire data estate.</p>
<h2>Why This Product Matters</h2>
<p>Cloudera Agent Studio is an orchestration layer that allows developers to build autonomous agents that are:</p>
<ul>
<li><p><b>Context-Aware:</b> They use your actual enterprise data (stored in the Cloudera platform) to provide accurate, governed answers</p>
</li>
<li><p><b>Hybrid-Ready:</b> With <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">our new AI Inference service</a> powered by NVIDIA, these agents can run just as efficiently in your private data center as they do in the public clouds</p>
</li>
<li><p><b>Secure by Design:</b> Every action an agent takes is logged and governed by Cloudera Shared Data Experience (SDX), ensuring that AI never sees data it isn't supposed to</p>
</li>
</ul>
<h2>Growing the Team: What We’re Looking For</h2>
<p>We are building the future of the hybrid data and AI platform, so we are looking for builders. Our hiring remains heavy on the Research &amp; Development and Engineering side, but our growth is felt across the entire organization.</p>
<p>We are currently seeking experts who can bridge the gap between &quot;data in motion&quot; and &quot;intelligence at scale.&quot; Current high-priority roles include:</p>
<ul>
<li><p><b>AI Solutions Engineering</b> - Building RAG pipelines and custom GenAI prototypes for global enterprises</p>
</li>
<li><p><b>Platform Engineering</b> - Optimizing Lakehouse architectures and hybrid-cloud deployments (K8s, Iceberg)</p>
</li>
<li><p><b>Machine Learning Ops</b> - Scaling model serving and observability via MLflow and Cloudera AI</p>
</li>
<li><p><b>Data Architecture</b> - Designing the streaming foundations (NiFi, Flink) that feed real-time AI</p>
</li>
</ul>
<h2>Why Join Us Now?</h2>
<p>Cloudera is building to accelerate. We offer a stable, high-innovation environment where you can work on the world's most complex data challenges with leading brands that collectively manage more than 30 exabytes of enterprise data.</p>
<p>If you’re ready to move past the AI hype and start building AI that works in the real world, we have a seat for you.</p>
<p>Explore <a href="/content/www/en-us/careers.html">our open opportunities</a> and help us build for the Era of Convergence.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderas-ai-investments-are-fueling-a-hiring-surge</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Navigating the Future of Data &amp;amp; AI: Key Takeaways from Gartner Data &amp;amp; Analytics 2026</title><description><![CDATA[At Gartner’s 2026 Data &amp; Analytics Summit, the message was clear: the era of experimental AI is over, and the era of integrated, governed, and value-driven AI has begun. As organizations race to modernize, the focus has moved from &quot;What is AI?&quot; to &quot;How do we scale AI reliably?&quot;. Here are five key takeaways from the conference and how Cloudera can help you deliver business value in each of these areas.
]]></description><link>https://www.cloudera.com/blog/business/navigating-the-future-of-data-and-ai-key-takeaways-from-gartner-d-and-a-2026.html</link><guid>https://www.cloudera.com/blog/business/navigating-the-future-of-data-and-ai-key-takeaways-from-gartner-d-and-a-2026.html</guid><pubDate>Wed, 01 Apr 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Katie Gdula]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-woman-reading-tablet.webp"><p>At <a href="https://www.gartner.com/en/newsroom/press-releases/2026-03-11-gartner-data-and-analytics-summit-2026-orlando-day-3-highlights" target="_blank" rel="noopener noreferrer">Gartner’s 2026 Data &amp; Analytics Summit</a>, the message was clear: the era of experimental AI is over, and the era of integrated, governed, and value-driven AI has begun. As organizations race to modernize, the focus has moved from &quot;What is AI?&quot; to &quot;How do we scale AI reliably?&quot;&nbsp;</p>
<p>Here are five key takeaways from the conference and how Cloudera can help you deliver business value in each of these areas.</p>
<h2>5 Key Takeaways from Gartner’s D&amp;A Conference</h2>
<h3>1. There is No AI Without AI-Ready Data</h3>
<p>AI-ready data is the prerequisite for successful AI initiatives. The market is moving toward converged platforms that simplify operations, specifically the open data lakehouse architecture.</p>
<p>A data lakehouse combines the benefits of a traditional data warehouse and the flexibility of data lake architectures. The lakehouse is expected to replace traditional data warehouses because it provides the necessary access to unstructured data—the lifeblood of modern generative AI (GenAI).</p>
<p><b>The Cloudera Edge:</b> <a href="/content/www/en-us/products/open-data-lakehouse.html">Cloudera’s Open Data Lakehouse</a> allows organizations to manage structured and unstructured data across hybrid and multi-cloud environments. By providing a single, unified architecture, Cloudera eliminates data silos, ensuring all your data is AI-ready regardless of where it resides.</p>
<h3>2. The Rise of Agentic Systems</h3>
<p>2026 is the year of <a href="/content/www/en-us/blog/partners/cloudera-agent-studio-and-nvidia-bring-next-gen-agents-to-enterprise-ai.html">AI agents</a>. Unlike simple chatbots, these agents move toward autonomous decision-making and require robust agentic data management to automate complex tasks. AI agents must be governed, budgeted, and contextualized to create value and reduce risk.</p>
<p><b>The Cloudera Edge:</b> Cloudera provides the high-performance data streaming and real-time processing power needed to fuel agentic ecosystems. With <a href="/content/www/en-us/products/data-in-motion.html">Cloudera Data in Motion</a>, enterprises can build the real-time pipelines that allow AI agents to act on the most current data, ensuring autonomous decisions are based on reality, not stale information.</p>
<h3>3. Context is King: Semantics and Graph RAG</h3>
<p>Gartner highlighted that for AI to be trustworthy, it must understand the context of specific jobs and processes. This is driving a shift toward knowledge graphs and graph retrieval-augmented generation (RAG) to handle content complexity and ensure traceability. Leaders need a composite semantic layer to ensure interoperability and transparency.</p>
<p><b>The Cloudera Edge:</b> <a href="/content/www/en-us/products/unified-data-fabric.html">Cloudera’s Unified Data Fabric</a> is designed to handle the complexity of massive datasets while maintaining metadata integrity. By integrating specialized tools for vector databases and knowledge graphs, Cloudera enables graph RAG at scale, allowing enterprises to feed their large language models (LLMs) highly specific, proprietary context while maintaining a clear audit trail of where that information came from.</p>
<h3>4. Governance as a Risk Mitigator</h3>
<p>Gartner also warned, &quot;Governance derisks our aspirations.&quot; In other words, without right-sized governance, AI initiatives will fail to build the trust they need to scale. D&amp;A leaders must modernize governance to meet the requirements of the entire AI lifecycle, from data ingestion to model deployment.</p>
<p><b>The Cloudera Edge:</b> <a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a> offers enterprise-grade security and governance that follows data wherever it goes. Whether you are running a model on-premises or in a public cloud, Cloudera SDX provides a consistent security policy, ensuring that sovereign AI is not just a buzzword, but a reality for regulated industries.</p>
<h3>5. The Hybrid Mandate: Sovereign AI</h3>
<p>A significant focus at the summit was the need for sovereign AI solutions that allow organizations to localize D&amp;A control, particularly for compliance and data privacy. Organizations need platforms that offer unified management while allowing for localized control over data and models.</p>
<p><b>The Cloudera Edge:</b> As the <a href="/content/www/en-us/why-cloudera.html">only true hybrid platform for data and AI</a>, Cloudera gives you the ability to run high-performance AI workloads in the cloud while keeping your most sensitive data on-premises. This hybrid flexibility is the cornerstone of a <a href="/content/www/en-us/campaign/unlock-secure-ai-innovation-in-financial-services-with-sovereign-cloud.html">sovereign AI strategy</a>, giving you total control over your intellectual property.</p>
<h2>Final Thoughts: Moving to an AI-First Mentality</h2>
<p>The industry is moving away from fragmented tools toward unified data management solutions. Success in this new era requires a platform that can handle the entire lifecycle—from data ingestion and engineering to warehousing, machine learning, and monitoring.</p>
<p>Cloudera’s hybrid, open, and secure platform provides the foundation for AI-ready data and the governance to protect it, empowering leaders to turn AI disruption into a sustainable competitive advantage.</p>
<p>To learn more about how Cloudera can power your AI use cases, check out our webinar series “<a href="/content/www/en-us/events/webinars/accelerate-enterprise-and-agentic-ai-7-part-series.html">Accelerate Enterprise &amp; Agentic AI: From Development to Inference with Private AI</a>.”</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=navigating-the-future-of-data-and-ai-key-takeaways-from-gartner-d-and-a-2026</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Moneyball’s Billy Beane on Why Ignoring Data Is the Biggest Risk of All</title><description><![CDATA[In episode 62 of The AI Forecast, How Moneyball&apos;s Billy Beane Changed Baseball Forever with Data Analytics, Billy Beane joins host Paul Muller to discuss how evidence-based decisions challenged traditional baseball. He explains how constraints spur innovation, questioning assumptions is vital, and data helps organizations reinvent decision-making.]]></description><link>https://www.cloudera.com/blog/business/moneyball-billy-beane-on-why-ignoring-data-is-the-biggest-risk-of-all.html</link><guid>https://www.cloudera.com/blog/business/moneyball-billy-beane-on-why-ignoring-data-is-the-biggest-risk-of-all.html</guid><pubDate>Wed, 25 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-podcast-billy-beane.webp"><p><span class="text-lead">Baseball always ran on gut instinct and tradition… until Billy Beane proved the numbers could win.</span></p>
<p>In episode 62 of The AI Forecast, <a href="https://www.youtube.com/watch?v=RZo90YbHtR8&amp;list=PLe-h9HrA9qfAmGHgsmXUZgLL-T4Xjhlq8&amp;index=3" target="_blank" rel="noopener noreferrer">How Moneyball's Billy Beane Changed Baseball Forever with Data Analytics</a>, Billy Beane joins host Paul Muller to discuss how evidence-based decisions challenged traditional baseball. He explains how constraints spur innovation, questioning assumptions is vital, and data helps organizations reinvent decision-making.&nbsp;</p>
<p>From evaluating talent to managing resources, Billy asserts that success depends on creating systems that prioritize evidence over ego. Below are a few of the main moments from Paul and Billy’s fascinating discussion.</p>
<p><b>Reframing Risk</b></p>
<p><b>Paul:</b> How tough is it to navigate that point where you're confident in the idea, but the results aren't showing up quickly enough?</p>
<p><b>Billy:</b> That’s a great question, and I leaned on my assistant. He used to say that if you're going to take a math test and someone is going to give you the answers, wouldn't you take them? We felt like using data was like that. They were giving you the answer to the test. Now, we wanted to leverage data and make a lot of decisions. We knew we weren't going to be right every single time; we weren't going to win every hand, but if we were disciplined with the data, ruthless with the numbers, and consistent in how we made decisions, over time, we would be correct.</p>
<p>I think there were a lot of assumptions when we were doing things that we were nervous about how this was going to turn out, but we felt the opposite, completely. We felt the use of data was kind of a roadmap and a fog light for us. And again, we weren't going to be right about every single decision, but if we were consistent with the way we made decisions over time, we would end up where we wanted to be, and it was going to be that discipline that was going to carry us through.&nbsp;</p>
<p>If you're right three times in a row, everybody's on board. Then the fourth time, if you're wrong, everybody says, ‘Oh, well, I told you that numbers don’t tell you the whole thing.’ And they sort of jump back to an emotional decision-making position, yet they don't hold emotional decisions to the same standard. One of the things we get complimented for, which I think is a little misguided, is that we were risk takers. We were actually completely the opposite. We wanted to manage risk, we wanted to be actuaries, and we thought what was risky was having information to help you make predictive decisions and not using that. That, to us, was the risk.&nbsp;</p>
<p><b>Data Over Orthodoxy</b></p>
<p><b>Paul:</b> The good news is you got famous, and the bad news is you got famous. As other teams figured out what you're up to, how did you find a new edge? How did you stay scrappy?</p>
<p><b>Billy:</b> I think the real revolution was when other teams started realizing the importance of data, collecting their own data, and using that data to build more predictive models. When we first started making decisions, we based them on statistics. Statistics are a result. What teams started figuring out was that there was a better way to measure process, which was a better predictor of skill, and that data collection was important. And quite frankly, it wasn't just about collecting data, but about bringing in some really, really bright, passionate people into our business who previously weren't working there.&nbsp;</p>
<p>The thing about the book Moneyball was that everything in that book was public information. We basically stole Bill James's ideas. The culture allowed us to do it because nobody really tried the ideas of Bill James or what he talked about in his pamphlets for years after that. Over the next 20 years, though, and as we sit here now, teams have become very private. They hire, and they have very large analytical staffs with bright young men and women helping them build these models using biometrics to improve player performance. It's gotten very, very sophisticated—far beyond even my understanding, to be totally frank.</p>
<p><b>Everyone’s a Data Person… Until It Disagrees With Them</b></p>
<p><b>Paul:</b> In my experience, the challenge now is that you may be in a situation—particularly for really bright, experienced people—who will say, “I'm a data-driven person,” and they'll point to data, and they'll agree with it. But as soon as they come up with something that doesn't back up their experience, they may say, “Well, that data's not right, and I'm not going to use that data.” In short, cherry-picking the data is something that I've seen happen, and it goes back to the statement I made about everyone being a data person until it doesn't back up their opinion.&nbsp;</p>
<p><b>Billy:</b> To me, that's the real opportunity. The experiences of a really successful long-term CEO in a business are data, and drawing on those experiences to help him make decisions is data. But I think in many cases, when you're with experienced people, we have a tendency to give in when they say, “Hey, that data’s not right.” Well, my response is usually that you don't get to disagree with the data, because it's not an opinion. It's a fact. In today's world, with all the data that we have exposure to, the real opportunity is when the data tells you one thing and your own experiences tell you something else. Personally, I prefer to always nod towards the data and ignore my own experiences when making decisions. And again, I know many people will disagree with that. To me, the opportunity is when really smart people see the same thing and the data tells them something, because you have to assume your competitor is going to see the same thing you do and make a decision along those lines.</p>
<p>Catch the full conversation with Billy Beane on The AI Forecast on&nbsp;<a href="https://open.spotify.com/show/102S8zoZR6nmZV0HxZlxZu" target="_blank" rel="noopener noreferrer">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/the-ai-forecast-data-and-ai-in-the-cloud-era/id1797635628" target="_blank" rel="noopener noreferrer">Apple Podcasts</a>, and <a href="https://www.youtube.com/@ClouderaInc/podcasts" target="_blank" rel="noopener noreferrer">YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=moneyball-billy-beane-on-why-ignoring-data-is-the-biggest-risk-of-all</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>#ClouderaLife Employee Spotlight: Meet Jim Ewton, A Veteran Building Community and Mission Impact at Cloudera</title><description><![CDATA[Let’s meet Jim and learn how a lifetime of service led him to Cloudera, and how he’s helping fellow veterans find belonging along the way. ]]></description><link>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-jim-ewton-a-veteran-building-community-and-mission-impact-at-cloudera.html</link><guid>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-jim-ewton-a-veteran-building-community-and-mission-impact-at-cloudera.html</guid><pubDate>Fri, 20 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-industrial-logistics-people.webp"><p>“At Cloudera, we just seem to draw individuals from the military,” Jim says. “And that makes me happy because it means they feel comfortable coming to talk to and work with us.”&nbsp;</p>
<p>At Cloudera, innovation starts with belonging. We work to build an environment where people from all backgrounds, including those who have served, can continue their mission in new ways. For <a href="https://www.linkedin.com/in/jim-ewton-4ab365103/" target="_blank">Jim Ewton</a>, a U.S. Air Force veteran and active member of Cloudera’s Veterans Employee Resource Group (ERG), that sense of purpose and community is what makes being a Clouderan special.&nbsp;</p>
<p>Let’s meet Jim and learn how a lifetime of service led him to Cloudera, and how he’s helping fellow veterans find belonging along the way.&nbsp;</p>
<h2>From the Air Force to Cloudera Government Solutions&nbsp;</h2>
<p>Jim spent 23 and a half years in the U.S. Air Force, traveling the world and serving in roles spanning communications and law enforcement. His career took him across Asia and South America, and to 39 U.S. states, including four years at the Pentagon.&nbsp;</p>
<p>“When you wear the uniform that long, it becomes part of who you are,” he says.&nbsp;</p>
<p>After retiring in 2002, Jim transitioned into government contracting before joining Cloudera in 2015. Today, he’s part of Cloudera Government Solutions, the company’s public sector arm that supports sensitive U.S. government missions.&nbsp;</p>
<p>That work carries deep responsibility. <a href="/content/www/en-us/solutions/public-sector.html">Cloudera Government Solutions</a> operates under strict security and compliance standards, supporting agencies that rely on secure, mission-critical data capabilities every day.&nbsp;</p>
<p>“We do a lot of sensitive work,” Jim says. “There are multiple agencies that depend on our capabilities and our software every day.”&nbsp;&nbsp;</p>
<h2>The Hardest Mission: Transitioning to Civilian Life&nbsp;</h2>
<p>The path from military service to civilian life isn’t seamless.&nbsp;</p>
<p>“Even when you take off the uniform, it’s not an immediate immersion into civilian life,” Jim says. “It takes a while. It’s a different world. It can be scary.”&nbsp;&nbsp;</p>
<p>He speaks candidly about the challenges many veterans face—from having to pick out an outfit for work for the first time, to translating military experience into a civilian résumé, to navigating invisible wounds like PTSD or social anxiety. Everything feels new, and recognizing that shock is an important part of the process.&nbsp;</p>
<p>“I say this a lot in my mentorship,” he explains. “A lot of folks coming out of the military have visible or invisible health issues. It’s important to help them find value again in who they are in their new endeavor.”&nbsp;</p>
<p>That belief is what drew him deeper into Cloudera’s Veterans ERG.&nbsp;</p>
<h2>Building Community Through Cloudera’s Veterans ERG&nbsp;</h2>
<p>At Cloudera, our Veterans ERG offers an incredible support system. Members support one another, mentor transitioning service members, and seek ways to give back to the broader military community.&nbsp;&nbsp;</p>
<p>Jim is especially passionate about mentorship, helping veterans translate their skills and experiences into new opportunities.&nbsp;&nbsp;</p>
<p>“The ERGs help create a sense of community,” he says. “I’ve really enjoyed getting involved more, and I hope more Clouderans learn about them and the good work they do.”&nbsp;</p>
<p>Cloudera’s veteran presence extends well beyond the ERG. Veterans serve at every level of the organization, including executive leadership. Seeing that representation throughout sends a powerful message: your background is understood here, and your experience has a place at the table.&nbsp;&nbsp;</p>
<p>“When you see veterans across leadership, it reinforces that you belong here,” Jim says.&nbsp;</p>
<h2>A Culture That Makes Space&nbsp;</h2>
<p>One of the first things Jim noticed when he joined Cloudera was the environment itself. After decades in highly structured military settings, Cloudera’s approachable, casual culture stood out.&nbsp;</p>
<p>“It wasn’t suits and ties. It wasn’t stuffy,” he says. “It was comfortable. People were accepted no matter where they came from—background, education, experience.”&nbsp;</p>
<p>Over more than a decade at Cloudera, Jim has seen the company evolve from its early Hadoop foundations to today’s leadership in hybrid data and AI. Through growth and change, one thing has remained constant: a focus on team building.&nbsp;</p>
<p>“Every time we change direction or pace, leadership comes back to team building,” he says. “That’s always been fundamental.” Now past his 10-year mark, Jim calls Cloudera “one of the best environments I’ve ever been in.”&nbsp;</p>
<h2>Giving Back Is Part of the Mission&nbsp;</h2>
<p>For Jim, being a Clouderan also means giving back. Through the Veterans ERG and <a href="/content/www/en-us/about/philanthropy.html">Cloudera Cares</a> initiatives, he supports organizations like <a href="https://fisherhouse.org/" target="_blank">Fisher House</a>, which provides housing for military families while loved ones receive medical care, and <a href="https://operationmotorsport.org/home-us/" target="_blank">Operation Motorsport</a>, which helps veterans rediscover purpose and community through hands-on engagement in motorsports.&nbsp;</p>
<p>“The testimonies from the young folks are what turned me into a true believer,” he says of Operation Motorsport. “They’re thankful. I cannot begin to tell you how many times they said ‘thank you’ during the event.”&nbsp;</p>
<p>“Just a little bit of energy goes a long way when it comes to offering a helping hand,” he adds. “That’s one thing Cloudera does exceptionally well—we give back.”&nbsp;</p>
<p>Jim also brings a deeply personal dimension to this work. He is supported by a service dog who accompanies him to the office and business reviews, helping to create a sense of calm wherever she goes. “When a dog walks into a room, it changes the environment immediately,” he says. “It provides comfort. Relief. That’s powerful.”&nbsp;</p>
<p>The openness and flexibility to bring his full self (and his fluffy support system!) to work isn’t something he takes lightly.&nbsp;</p>
<h2>Continuing the Mission&nbsp;</h2>
<p>Jim’s story is ultimately about belonging and how powerful it can be when that feeling extends beyond the workplace. Organizations like <a href="https://operationmotorsport.org/home-us/" target="_blank">Operation Motorsport</a> are doing life-changing work to help veterans rediscover purpose and community after service. The impact is tangible, personal, and lasting.&nbsp;</p>
<p>At its best, Cloudera’s culture has always been about showing up—for each other and for the communities around us. As Jim’s journey reflects, there is always room to deepen that impact and to show what it truly means to be a Clouderan: mission-driven, people-first, and committed to making a difference.&nbsp;</p>
<p>Hear from another <a href="/content/www/en-us/blog/culture/clouderalife-employee-spotlight-meet-charles-aad-clouderas-senior-industry-solutions-engineer.html">Clouderan</a> and explore <a href="/content/www/en-us/careers.html">career opportunities</a> at Cloudera.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderalife-employee-spotlight-meet-jim-ewton-a-veteran-building-community-and-mission-impact-at-cloudera</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Reimagining Prescription Analysis: How Specialized AI Agents Solve Healthcare&amp;apos;s Toughest Document Processing Challenges</title><description><![CDATA[In document-intensive fields such as healthcare and pharmaceuticals, the speed and accuracy of data extraction are critical for patient safety and timely care. Prescriptions are critical documents in healthcare workflows, and accurate transcription is paramount to reducing medication errors and adverse drug events.]]></description><link>https://www.cloudera.com/blog/business/reimagining-prescription-analysis-how-specialized-ai-agents-solve-healthcares-toughest-document-processing-challenges.html</link><guid>https://www.cloudera.com/blog/business/reimagining-prescription-analysis-how-specialized-ai-agents-solve-healthcares-toughest-document-processing-challenges.html</guid><pubDate>Thu, 19 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Vish Rajagopalan,Kathy Wong,Maximilian Engelhardt,Laurent Edel,Maxim Belikov]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-doctors-talking-data.webp"><p>In document-intensive fields such as healthcare and pharmaceuticals, the speed and accuracy of data extraction are critical for patient safety and timely care. Prescriptions are critical documents in healthcare workflows, and accurate transcription is paramount to reducing medication errors and adverse drug events.</p>
<p>This blog shows how <a href="/content/www/en-us/solutions/healthcare.html">Cloudera can help healthcare organizations modernize</a>, improving the speed and accuracy of data extraction and prescription generation by replacing traditional optical character recognition (OCR) with specialized AI agents.</p>
<h2>Modernizing the US Pharmacy with Agentic AI</h2>
<p>The US pharmacy sector faces rising demand, tighter margins, and increasing expectations for accuracy and speed. More than <a href="https://www.singlecare.com/blog/news/prescription-drug-statistics" target="_blank">6 billion prescriptions</a> are generated in the US alone every year, yet dispensing still relies heavily on manual data entry, verification, and documentation.&nbsp;</p>
<p>Pharmacist wages have grown, while reimbursement pressure from pharmacy benefit managers (PBMs) and operational friction continue to compress profitability. Pharmacies face a structural challenge: delivering faster, safer dispensing at a time when labor is costly, workflows are increasingly complex, and reimbursement is becoming more volatile.</p>
<p>US pharmacies are experiencing a dual squeeze of rising workload and falling margins:</p>
<ul>
<li><p><b>The labor gap:</b> Pharmacist wages <a href="https://www.bls.gov/ooh/healthcare/pharmacists.htm" target="_blank">average $66/hr</a>, yet a large proportion of their time is consumed by manual data entry and clerical verification.</p>
</li>
</ul>
<ul>
<li><p><b>The audit:</b> Pharmacy benefit managers recoup billions annually via <a href="https://www.kellerrohrback.com/news/clawbacks-prescription-drugs#:~:text=Pharmacy%20benefit%20managers%20(PBMs)%20and,the%20$5%20difference%20for%20profit." target="_blank">clawbacks</a>, retroactive payment reversals triggered by minor documentation errors.</p>
</li>
</ul>
<ul>
<li><p><b>The revenue shift:</b> Dispensing margins continue to decline, while clinical services offer materially stronger economics for pharmacies.</p>
</li>
</ul>
<h2>Moving Beyond Traditional Entity Extraction</h2>
<p>For many years, optical character recognition has been the de facto technology for transcribing prescriptions. However, it continues to struggle with real-world complexity, such as:</p>
<ul>
<li><p><b>Lack of standardized formats:</b> Prescriptions vary widely in format, and handwritten prescriptions further increase complexity due to differences in handwriting and language.</p>
</li>
</ul>
<ul>
<li><p><b>High error rates:</b> This variability leads to frequent errors when optical character recognition processes written text, requiring significant manual review and correction.</p>
</li>
</ul>
<ul>
<li><p><b>Custom software stack:</b> Most optical character recognition-based solutions employ a custom software stack. As such, healthcare systems struggle with licensing, upgrades, and staff training.</p>
</li>
</ul>
<ul>
<li><p><b>Privacy and PII regulations:</b> Patient records are subject to strict regulation (such as GDPR), which constrains how health records are stored, transmitted, and processed.</p>
</li>
</ul>
<h3>The Business Value of AI-Enabled Prescription Verification</h3>
<p>AI-enabled verification strengthens—not replaces—pharmacists by automating repetitive, potentially error-prone steps and converting unstructured prescriptions into reliable data.&nbsp;</p>
<p><b>Labor Optimization&nbsp;</b></p>
<p>Verification is one of the most time-intensive steps in the dispensing workflow, as pharmacists must intake, interpret, transcribe, and confirm each prescription. AI-enabled optical character recognition automates prescription intake and verification, reducing manual effort and allowing pharmacies to meet demand with existing staff—lowering overtime and reliance on relief pharmacists.</p>
<p><b>Reallocated Capacity</b></p>
<p>By reducing time spent on fulfillment, pharmacists regain time for higher-margin clinical services—such as vaccinations, medication therapy management (MTM), and point-of-care testing—improving overall margin mix.</p>
<p><b>Error Reduction</b></p>
<p>Medication errors and clerical discrepancies often stem from inconsistent handwriting, incomplete information, or manual data entry. During pharmacy benefit manager audits, even small documentation errors can result in full claim clawbacks, creating significant financial exposure. AI-enabled optical character recognition adds an automated safety layer by flagging ambiguous or inconsistent data before submission. This improves documentation quality, reduces dispensing errors, and lowers the risk of audit recoupments.&nbsp;</p>
<p><b>Reimbursement Accuracy</b></p>
<p>Pharmacy benefit managers manage most prescription claims and enforce strict documentation standards. Small discrepancies in directions, quantities, or prescriber information frequently trigger claim denials, creating rework and administrative burden. AI-enabled optical character recognition improves documentation accuracy at the point of entry, reducing avoidable denials and the time spent correcting and resubmitting claims. This results in fewer reworks, faster reimbursement, and more predictable cash flow in an already margin-constrained environment.</p>
<h2>Success Story: How a Healthcare Provider Transformed Prescription Analysis with Cloudera AI</h2>
<p>A Central European healthcare provider partnered with Cloudera to modernize prescription analysis under strict PII regulations. The solution replaced a single-pass optical character recognition workflow with an agent-based AI pipeline deployed in a private, air-gapped environment. It improved accuracy by over 16%, reached near human-level performance, and scaled from proof of concept to production in a matter of weeks.</p>
<h3>A Specialized Agentic Approach</h3>
<p>The solution’s effectiveness comes from an orchestrated, AI agent-based workflow that combines fine-tuned vision models with authoritative medical data validation.&nbsp;</p>
<ul>
<li><p>First, a <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI</a> agent extracts prescription data using a vision optical character recognition model specifically trained on real-world prescription formats and handwriting patterns.</p>
</li>
</ul>
<ul>
<li><p>Then, the extracted drug names, dosages, and ingredients are validated against certified medical and drug databases using probabilistic matching.</p>
</li>
</ul>
<ul>
<li><p>Finally, human-in-the-loop feedback continuously retrains the model, allowing the system to learn from prior errors and steadily improve accuracy. This closed-loop approach moves prescription analysis beyond static optical character recognition into a self-improving, production-grade workflow.</p>
</li>
</ul>
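<p>The validation step above can be illustrated with a minimal sketch. It assumes a tiny in-memory list (<code>REFERENCE_DRUGS</code>, a hypothetical stand-in for a certified drug database) and uses simple string similarity from Python's standard library in place of whatever probabilistic matching the production system uses. The function name <code>match_drug</code> and the 0.8 threshold are illustrative assumptions, not part of the Cloudera solution.</p>

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a certified drug database.
REFERENCE_DRUGS = ["amoxicillin", "atorvastatin", "metformin", "metoprolol"]


def match_drug(extracted: str, threshold: float = 0.8):
    """Return (best_match, score, needs_review) for an extracted drug name.

    Names scoring below the threshold are flagged for human review,
    which is where a human-in-the-loop step could pick them up.
    """
    cleaned = extracted.strip().lower()
    scored = [
        (SequenceMatcher(None, cleaned, name).ratio(), name)
        for name in REFERENCE_DRUGS
    ]
    score, best = max(scored)
    return best, score, score < threshold
```

<p>A misspelled extraction such as "Amoxicilin" would resolve to the closest reference entry, while an unrecognizable string would be flagged for pharmacist review rather than silently accepted.</p>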
<h3>Benefits Achieved with Cloudera AI</h3>
<p>This agentic workflow delivered clear operational and financial benefits:</p>
<ul>
<li><p>Improved accuracy: Certified medical database validation reduced optical character recognition and documentation errors.</p>
</li>
</ul>
<ul>
<li><p>Lower operational costs: Automation reduced manual review, error correction, and audit-related rework.</p>
</li>
</ul>
<ul>
<li><p>Faster processing: Automated inference shortened fulfillment cycles and freed pharmacist capacity.</p>
</li>
</ul>
<h2>Next Steps</h2>
<p>Pharmacies that adopt agentic workflows gain speed, resilience, and economic advantage. Those that delay face rising labor costs, greater audit exposure, and widening competitive pressure driven by pharmacy benefit manager requirements.&nbsp;</p>
<p>To learn more about how <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> can power your use cases, check out our webinar series “<a href="/content/www/en-us/events/webinars/accelerate-enterprise-and-agentic-ai-7-part-series.html">Accelerate Enterprise &amp; Agentic AI: From Development to Inference with Private AI</a>.”</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=reimagining-prescription-analysis-how-specialized-ai-agents-solve-healthcares-toughest-document-processing-challenges</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Beyond the Screen: Deepfakes, Trust, and the Next Cybersecurity Frontier </title><description><![CDATA[Trust is the foundation of cooperation, trade, and enterprise decision-making. In the digital age, trust is established through signatures, voices, and virtual interactions. But as deepfake technology rapidly advances, that trust erodes, creating new risks that bypass decades of cybersecurity investment.  ]]></description><link>https://www.cloudera.com/blog/technical/beyond-the-screen-deepfakes-trust-and-the-next-cybersecurity-frontier.html</link><guid>https://www.cloudera.com/blog/technical/beyond-the-screen-deepfakes-trust-and-the-next-cybersecurity-frontier.html</guid><pubDate>Wed, 18 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-window-scene.jpg"><p>Trust is the foundation of cooperation, trade, and enterprise decision-making. In the digital age, trust is established through signatures, voices, and virtual interactions. But as deepfake technology rapidly advances, that trust erodes, creating new risks that bypass decades of cybersecurity investment.&nbsp;&nbsp;</p>
<p>In this episode of <a href="/content/www/en-us/resources/podcast/the-ai-forecast.html">The AI Forecast</a>, Paul Muller speaks with Jim Brennan, Chief Product and Technical Officer at <a href="https://www.getrealsecurity.com/" target="_blank">GetReal Security</a>, about how AI-powered authenticity threats change the enterprise security equation. Their conversation reveals why deepfakes are the new face of social engineering, why technology—not the human eye—must lead the defense, and how leaders can protect their businesses and people.&nbsp;&nbsp;&nbsp;&nbsp;</p>
<h2>The Human Layer Has Become the Weakest Link&nbsp;&nbsp;</h2>
<p><b>Paul:</b> Decades of digital transformation gave us the ability to collaborate instantly. But now the very thing we rely on—the little window on our screens—has become the new attack surface. If I can’t trust what I see, the only fallback is expensive, slow, physical interactions.&nbsp;&nbsp;</p>
<p><b>Jim:</b> A CIO told me, ‘This little window is where I run my business, and now I can’t trust anything coming through it.’ That’s profound. The human eye can’t detect this level of sophistication. Most people are guessing 50/50. That’s why technology, not instinct, has to lead the defense.</p>
<p>Trust fuels cooperation, and cooperation powers business. But deepfakes undermine that trust at its most personal level—the daily conversations and video calls leaders depend on. Jim describes this as a new human-facing interaction layer, which he calls the “display layer,” and Paul jokingly dubbed “Liar 8,” an entirely new attack surface. Unlike firewalls and intrusion detection systems, this is not a technical but a human layer. The medium executives use to communicate and make decisions is now open to manipulation.&nbsp;&nbsp;</p>
<h2>Boards Respond to Realistic Threats, Not Hollywood Plots&nbsp;&nbsp;</h2>
<p><b>Paul:</b> Do boards risk dismissing deepfakes as something that could never happen to them?&nbsp;&nbsp;</p>
<p><b>Jim:</b> It only takes seeing it once to believe it’s real. However, the real challenge is showing boards what it means for their business. If you lean on big sensational stories, they may shrug them off. The reality is that smaller, everyday incidents are already happening, which resonate far more.&nbsp;&nbsp;</p>
<p>He points to fraudulent hiring as a prime example. Attackers are using deepfakes to impersonate candidates and slip through HR processes. Sometimes the motive is simple financial gain, like pocketing a sign-on bonus. Other times, it’s far more serious: nation-state actors planting impostors inside companies for espionage or large-scale fraud.</p>
<p><b>Jim:</b> In the last three months, every Fortune 500 and 1000 company I’ve spoken to has told us it’s having issues with fraudulent hiring. HR teams aren’t built to think like attackers, making hiring an easy target.&nbsp;&nbsp;&nbsp;&nbsp;</p>
<h2>Technology Must Lead the Fight for Digital Authenticity&nbsp;</h2>
<p><b>Paul: </b>We’ve always used technology to fight technology—firewalls, antivirus, intrusion detection. Can we do the same against deepfakes?&nbsp;&nbsp;</p>
<p><b>Jim:</b> You can’t simply train your way out of this problem. Standing up a black-box model and feeding it real and fake examples won’t cut it. The better approach is to use digital forensics to study the artifacts deepfakes leave behind, whether it’s facial distortions, audio noise, or lighting inconsistencies, and then use machine learning to find those signals at scale.</p>
<p>Jim explained that effective defenses must go beyond generic AI, getting “under the covers” of generation tools to identify subtle traces and artifacts. Practically, enterprises can deploy these protections through APIs from platforms like Zoom or Teams, avoiding endpoint installs and keeping defenses scalable. At the same time, awareness is critical—webinars, demos, and simulations give employees the context to pause and think before acting. Technology and training form the two layers needed to protect digital trust.&nbsp;</p>
<h2>Closing Insight for Enterprise Leaders&nbsp;</h2>
<p><b>Jim:</b> We live in an age where you can’t trust anything in this window or screen. New policies for organizations are called for, and new ways of operating are called for as well.&nbsp;</p>
<p>The threat landscape has shifted. Deepfakes are not just a futuristic risk. They are here, undermining both enterprise decision-making and personal safety. From fraudulent hires to AI-cloned ransom calls, digital trust is no longer guaranteed.&nbsp;</p>
<p>The path forward is threefold:&nbsp;</p>
<ul>
<li>Educate boards with credible, relatable examples that fit existing risk frameworks&nbsp;</li>
<li>Equip employees with awareness that “seeing” and “hearing” are no longer enough to establish truth</li>
<li>Deploy technology that can detect and respond to authenticity threats in real time</li>
</ul>
<p>Catch the whole conversation with Jim Brennan on The AI Forecast on <a href="https://open.spotify.com/episode/4XiZgW2bRuopX53sb8AKmK?si=wJZ5-jXcSA-SxE-66-XPaA" target="_blank">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/securing-the-evolving-frontier-of-digital-trust/id1797635628?i=1000722805805" target="_blank">Apple Podcasts</a>, and <a href="https://youtu.be/GkIFK0Bi3PQ?si=Y9O6RPeOOkLUJTkR" target="_blank">YouTube</a>.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=beyond-the-screen-deepfakes-trust-and-the-next-cybersecurity-frontier</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Agent Studio and NVIDIA Bring Next-Gen Agents to Enterprise AI</title><description><![CDATA[Autonomous agents act toward complex goals without requiring human direction at each step. 
In enterprise environments, deploying these agents introduces a more exacting set of challenges: they must navigate heterogeneous data systems; satisfy compliance, audit, and data sovereignty mandates; and keep all data within the organization&apos;s operational boundary.]]></description><link>https://www.cloudera.com/blog/partners/cloudera-agent-studio-and-nvidia-bring-next-gen-agents-to-enterprise-ai.html</link><guid>https://www.cloudera.com/blog/partners/cloudera-agent-studio-and-nvidia-bring-next-gen-agents-to-enterprise-ai.html</guid><pubDate>Wed, 18 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Charu Anchlia,Suryakant Bhardwaj,Pamela Pan]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-ai-agent-txt.webp"><h2>The Foundation: Private Model Deployment with NVIDIA Nemotron</h2>
<p>Enterprise AI starts with data governance. Prompts, proprietary data, and model outputs must stay within the organization's operational boundary, meeting compliance mandates without architectural compromise. This is the core requirement of Private AI: the full inference stack running inside the enterprise, not outside it.</p>
<p><a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference service</a>, powered by NVIDIA NIM microservices, enables high-performance, scalable model serving directly within the enterprise environment, keeping prompts, data, and outputs inside the security perimeter. Accelerated by the NVIDIA AI stack, including <a href="https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/">Blackwell GPUs</a> and <a href="https://developer.nvidia.com/dynamo-triton">Dynamo-Triton</a>, the service supports a wide range of models, including NVIDIA’s Nemotron model family for agentic AI with advanced reasoning, tool use, and long-horizon workflows. This foundation allows organizations to build and run enterprise AI agents directly on their data—securely and at scale.</p>
<h2>Four Pillars of Cloudera Agent Studio</h2>
<h3>1. Dynamic, Iterative, Multi-Step Planning</h3>
<p>Enterprise data environments are not clean. Real deployments involve dozens of databases with inconsistent schemas, sparse documentation, and no deterministic path from a business question to the right data source. The agent must construct that path at runtime.</p>
<p>Agent Studio's orchestrator treats exploration as part of execution. It decomposes complex requests into multi-step plans, executes them iteratively, and self-evaluates after each step before committing to a path. This self-correcting planning loop makes agents reliable in environments they have never encountered and sustains long-horizon workflows across many sequential steps.</p>
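<p>As a rough illustration of the loop described above, the sketch below runs a list of steps, evaluates each result, and retries a step before moving on. It is a simplified assumption of how such a loop could look, not Agent Studio's implementation. The names <code>run_plan</code>, <code>execute</code>, and <code>evaluate</code> are hypothetical, and in a real system both callbacks would be backed by a model and tools.</p>

```python
def run_plan(steps, execute, evaluate, max_retries=2):
    """Execute steps in order, re-running a step when evaluation rejects it.

    execute(step, attempt) produces a candidate result.
    evaluate(step, result) returns True when the result is acceptable.
    """
    results = []
    for step in steps:
        for attempt in range(max_retries + 1):
            result = execute(step, attempt)
            if evaluate(step, result):
                # Commit to this result only after it passes evaluation.
                results.append(result)
                break
        else:
            # No attempt passed evaluation, so stop rather than build on it.
            raise RuntimeError(f"step failed after retries: {step}")
    return results
```

<p>The key design choice is that evaluation happens after every step, so a bad intermediate result is caught before later steps depend on it.</p>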
<h3>2. Multi-Agent Collaboration: Reusability and Transparency</h3>
<p>Complex enterprise workflows span multiple domains, each requiring distinct reasoning strategies and specialized tools. A single agent attempting to cover all of them cannot be well-optimized for any, and the broader its scope, the harder it becomes to understand and govern agent behavior.&nbsp;</p>
<p>Agent Studio is built around specialized agents, each scoped to a specific domain and equipped with the appropriate tools, coordinated by an orchestrator that understands how to delegate. What makes this collaboration transparent and reusable is how agents communicate: each agent writes structured outputs to shared project context, and subsequent agents consume those outputs as explicit, inspectable inputs. The full chain of reasoning is traceable at every step, providing the auditability enterprises require and the reusability to build on prior work across runs.</p>
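<p>One way to picture this hand-off is a shared store of named artifacts that records which agent produced each one. The sketch below is an assumption for illustration only. <code>ProjectContext</code>, <code>Artifact</code>, <code>schema_agent</code>, and <code>sql_agent</code> are hypothetical names and do not reflect Agent Studio's actual API.</p>

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Artifact:
    producer: str  # the agent that wrote this artifact
    name: str
    payload: Any


@dataclass
class ProjectContext:
    artifacts: list = field(default_factory=list)

    def write(self, producer, name, payload):
        artifact = Artifact(producer, name, payload)
        self.artifacts.append(artifact)
        return artifact

    def read(self, name):
        # Return the most recent artifact with this name.
        for artifact in reversed(self.artifacts):
            if artifact.name == name:
                return artifact
        raise KeyError(name)

    def trace(self):
        # Human-readable chain of who produced what, in order.
        return [f"{a.producer} -> {a.name}" for a in self.artifacts]


def schema_agent(ctx):
    # Hypothetical agent: records which table answers the question.
    ctx.write("schema_agent", "table_choice", {"table": "sales"})


def sql_agent(ctx):
    # Consumes the previous agent's output as an explicit input.
    table = ctx.read("table_choice").payload["table"]
    query = f"SELECT region, SUM(revenue) FROM {table} GROUP BY region"
    ctx.write("sql_agent", "query", query)
```

<p>Because every output is written to the shared context with its producer attached, the full chain can be inspected after the run by calling <code>trace()</code>.</p>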
<h3>3. Context Engineering: Accuracy, Speed, and Cost</h3>
<p>At enterprise data scales, passing raw data directly to the model does not work. Context windows are finite, and as unstructured context grows, accuracy degrades well before the window limit is reached.</p>
<p>Agent Studio treats the context window as a precision instrument: at each step, only the information relevant to that agent's specific task reaches the model. This artifact-driven design reduces token consumption, cutting inference cost and latency while improving accuracy. That combination is what makes long-horizon workflows tractable at enterprise scale.</p>
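<p>The idea of passing only task-relevant information to the model can be sketched as a selection step under a token budget. This is a deliberately crude illustration. It scores relevance by keyword overlap and estimates tokens by counting whitespace-separated words, both of which are assumptions. The function name <code>select_context</code> and the artifact dictionary shape are hypothetical.</p>

```python
def select_context(artifacts, task_keywords, token_budget):
    """Pick the most relevant artifact names that fit within the budget."""
    keywords = {k.lower() for k in task_keywords}

    def relevance(artifact):
        # Keyword overlap between the task and the artifact summary.
        return len(keywords & set(artifact["summary"].lower().split()))

    ranked = sorted(artifacts, key=relevance, reverse=True)
    chosen, used = [], 0
    for artifact in ranked:
        cost = len(artifact["summary"].split())  # crude token estimate
        if relevance(artifact) > 0 and used + cost <= token_budget:
            chosen.append(artifact["name"])
            used += cost
    return chosen
```

<p>Irrelevant artifacts never reach the model at all, which is what keeps token consumption, and therefore cost and latency, from growing with the size of the data estate.</p>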
<h3>4. Sandboxed Execution</h3>
<p>What makes autonomous agents genuinely powerful is their ability to dynamically generate tools, skills, and executable code as workflows demand them, capabilities that Agent Studio supports natively. But without isolation, agent-generated code and tools executing directly against enterprise systems present unacceptable risk.&nbsp;</p>
<p>We architected Agent Studio's execution layer around isolation by default. All agent-generated code and tool execution runs in a sandboxed runtime with no access to systems outside their defined scope. Agents begin with zero permissions, and every action is policy-enforced at the infrastructure layer, not inside the agent process itself. This gives regulated industries the auditability they require, without restricting what agents can do.&nbsp;</p>
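<p>The deny-by-default principle can be shown with a small policy gate that sits outside the agent and logs every decision. This sketch only illustrates the permission check and audit trail. It does not provide real process or network isolation, which would need infrastructure such as containers. The names <code>Sandbox</code> and <code>PolicyError</code> and the <code>(agent, action)</code> permission format are hypothetical.</p>

```python
class PolicyError(PermissionError):
    """Raised when an agent attempts an action it was not granted."""


class Sandbox:
    def __init__(self, allowed=None):
        # Agents start with zero permissions unless explicitly granted.
        self.allowed = frozenset(allowed or ())
        self.audit_log = []

    def run(self, agent, action, fn, *args):
        permitted = (agent, action) in self.allowed
        # Every attempt is recorded, whether or not it is allowed.
        outcome = "allowed" if permitted else "denied"
        self.audit_log.append((agent, action, outcome))
        if not permitted:
            raise PolicyError(f"{agent} may not perform {action}")
        return fn(*args)
```

<p>The check lives in the wrapper rather than in the agent's own code, so an agent cannot grant itself permissions, and the audit log captures denied attempts as well as successful ones.</p>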
<h2>Customer Story: Agentic AI Transforming Petabyte-Scale Data Analytics</h2>
<p>Cloudera manages over 30 exabytes of structured data across its customer base, making structured data analytics the area where this architecture delivers immediate impact. A major media and entertainment company deployed it to give business users and analysts a natural language interface to their operational data. Their data estate spanned petabytes across dozens of databases, often with conflicting metadata and sparse documentation.</p>
<p>Cloudera Agent Studio orchestrated specialized agents backed by NVIDIA Nemotron running inside the customer's private network. A business user's analytical question triggered an iterative planning loop: the orchestrator explored the data estate, navigated schema ambiguity, and identified the right data sources autonomously. When the analysis required statistical computation beyond what SQL could express, the orchestrator delegated to the appropriate code execution agent. Intermediate outputs were written as artifacts and passed forward through the long-horizon workflow. All generated code executed in a sandboxed environment, maintaining a complete audit trail throughout.</p>
<p>Workflows that once required a data engineer, developer, and an analyst working in sequence became accessible to any business user. The agents' outputs, including SQL commands, generated code, and visualizations, were written to shared project context throughout, each inspectable and auditable. Those artifacts were also exportable as production pipelines. Because the code that agents generate is deterministic even when the underlying models are not, those pipelines are reliable and reproducible without additional engineering.</p>
<h2>Architecture as Competitive Advantage</h2>
<p>Every pillar in this architecture builds on the one before it. A private inference layer provides the foundation, supporting the call volumes and reliability that long-horizon workflows require. Iterative planning enables agents to navigate environments they have never seen. Multi-agent collaboration brings domain precision to multi-step reasoning. Artifact-based context management improves accuracy while reducing inference cost and latency. Sandboxed execution ensures agents operate safely within defined boundaries, with every action governed and auditable.</p>
<p>Cloudera and NVIDIA bring this architecture to life through <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera Agent Studio</a>, <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference</a> powered by NVIDIA NIM, and the NVIDIA Nemotron family of models. Together, they deliver the foundation, orchestration, and agentic reasoning needed to run enterprise AI agents directly on enterprise data: securely, privately, and at scale.</p>
<p>To learn more, <a href="https://www.youtube.com/watch?v=XZMaJaLBN6s">see Cloudera Agent Studio in action</a>.</p>
<p>Autonomous agents act toward complex goals without requiring human direction at each step.&nbsp;In enterprise environments, deploying these agents introduces a more exacting set of challenges: they must navigate heterogeneous data systems; satisfy compliance, audit, and data sovereignty mandates; and keep all data within the organization's operational boundary.</p>
<p>Long-horizon agents represent a new class of autonomous AI, extending beyond single tasks to pursue objectives across dozens of sequential decisions, running workflows for hours or days while maintaining context throughout. At enterprise scale, every one of those challenges is amplified.</p>
<h2>An Architecture Built for Enterprise AI Agents</h2>
<p>Cloudera designed <a rel="noopener noreferrer" href="https://docs.cloudera.com/machine-learning/cloud/use-ai-studios/topics/ml-agent-studio-overview.html">Cloudera Agent Studio</a> (part of Cloudera AI Studios) in collaboration with NVIDIA to address exactly these challenges.</p>
<ul>
<li><p><a rel="noopener noreferrer" href="https://developer.nvidia.com/nemotron">NVIDIA Nemotron</a> provides the model foundation: it’s purpose-built for agentic AI and the high-throughput inference demands of long-horizon workflows.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera Agent Studio</a> provides the orchestration layer that builds on that foundation through four architectural pillars: dynamic multi-step planning, transparent multi-agent collaboration, context engineering for accuracy, and sandboxed execution. Each pillar addresses a specific requirement that emerges when autonomous agents operate at enterprise scale.&nbsp;</p>
</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-agent-studio-and-nvidia-bring-next-gen-agents-to-enterprise-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Adam Skotnicky on Taming Data Complexity and Building Cloud-Like Simplicity</title><description><![CDATA[Paul Muller, host of The AI Forecast Podcast, and Adam discuss how engineering teams can find their way back to simplicity while maintaining flexibility and control. They delve into why IT teams feel swamped by tooling and operational challenges, how platform engineering can make things easier for users, and what it really means to achieve that cloud-like agility in hybrid environments.]]></description><link>https://www.cloudera.com/blog/business/adam-skotnicky-on-taming-data-complexity-and-building-cloud-like-simplicity.html</link><guid>https://www.cloudera.com/blog/business/adam-skotnicky-on-taming-data-complexity-and-building-cloud-like-simplicity.html</guid><pubDate>Tue, 17 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-window-cleaning.webp"><p>If there’s one thing serial entrepreneur Adam Skotnicky would warn organizations about, it’s data complexity. As VP of Engineering at Cloudera and founder of tcp.cloud and Taikun, which was <a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-08-04-cloudera-acquires-taikun-to-deliver-cloud-experience-to-data-anywhere-for-ai-everywhere.html">recently acquired by Cloudera</a>, Adam is an expert at capitalizing on emerging opportunities in the tech sector without letting complicated data structures hold him back.</p>
<p>Paul Muller, host of <a href="/content/www/en-us/resources/podcast/the-ai-forecast.html">The AI Forecast Podcast</a>, and Adam discuss how engineering teams can find their way back to simplicity while maintaining flexibility and control. They delve into why IT teams feel swamped by tooling and operational challenges, how platform engineering can make things easier for users, and what it really means to achieve that cloud-like agility in hybrid environments.</p>
<p>Here are a few of the main points from the discussion.</p>
<h2>The Pitfall of Overengineering</h2>
<p><b>Paul:</b> Organizations today are managing data across multiple clouds, on-prem, and hybrid environments. From your perspective, what are the biggest challenges they face in that complexity?</p>
<p><b>Adam:</b> The thing is that you need to focus on the core value of what you’re trying to build.&nbsp;</p>
<p>If you go all in, you might overengineer your solution. You don’t need to have all the features on the planet. It’s like a candy shop for engineers, right? They go crazy. Then you have the sugar rush, and then you have this huge fall after that. It’s exactly what it is.</p>
<h2>The Future Is Workload-First, Invisible Infrastructure</h2>
<p><b>Paul:</b> What was the inspiration to try to create a more cloud-like experience in your data center? I think a lot of technologists would say that the issue with this promise of hybrid has always been that my on-premises stuff might have a little bit of automation, but it’s nowhere near as slick or as simple as when I’m using a public cloud service, where they spend a lot of engineering dollars to make it really feel like a catalog. Do you agree that that’s been the compromise in the past, and how did you get around that with what you were doing with Taikun?</p>
<p><b>Adam: </b>If you want to build something similar, the cloud-like experience means removing people from the process. If you have any ticket between you and your application, or if I own this application, you log in, go to the catalog, and deploy things. That’s the ultimate goal. Beyond that, no people touch it; they observe, make sure it works, and ensure it’s performant and secure. They do this without you, without requiring anything from you, and that’s how public cloud works. That’s the experience; that’s what cloud-like means.</p>
<p><b>Paul:</b> Talk to me about what you’re seeing in the marketplace as it comes to deploying these big data workloads. How does a self-service, flexible cloud experience empower teams to focus on insights rather than infrastructure?&nbsp;</p>
<p><b>Adam:</b> I absolutely agree that it’s about workload and workload only. It’s not about the infrastructure, and that’s why we don’t want anyone to touch it. You want to abstract the infrastructure completely, but we still allow you to go and tinker with it. You can do that and explore, but in production environments, you shouldn’t touch it. You should follow best practices because then you can finally focus on the workload, and you shouldn’t go from the workload down. The infrastructure should be there. That’s what we’re doing at Taikun. We focus on the workload.&nbsp;</p>
<h2>One Platform, Any Environment</h2>
<p><b>Paul:</b> What are people using workloads like the Cloudera platform going to notice that's different about this new way of working as they start to deploy?</p>
<p><b>Adam:</b> We are now the abstraction layer for Cloudera services, so Cloudera services will be independent of that environment, so they can run on public or private cloud on your few servers or hundreds or thousands of servers and still have the same experience. You can now run as many of them as you want, connect them to as many endpoints as you want, choose where to combine, and then configure them. It’s not a public cloud or a hybrid cloud. You can use both. You can run your production environments, which you can scale on-prem because of data sovereignty, and you can play with technologies in the public cloud because you can scale up and down from zero to a hundred in minutes. You can combine these approaches.</p>
<p><b>Paul:</b> Amazing. What do people need to do to start preparing for this new world? Is it something they can just instantly drop in, and it’s a technology problem, or how much of this is a people problem where you need to start to get people to think differently? What do I need to do to get ready to get the most out of hybrid?</p>
<p><b>Adam:</b> You can choose your approach. You can go with my preferred way, which we call the “golden path.” Everything is built in, so you can go one way or another or somewhere in between. You can still run your good old virtual machines side by side with this environment. There are loads and loads of know-how built into the structures and processes already in place. Both approaches will be there, and in Cloudera products, if you choose not to interface with this new world, it’ll be embedded for you.</p>
<p>Catch the full conversation with Adam Skotnicky on The AI Forecast on&nbsp;<a href="https://open.spotify.com/episode/54sHI1NZrIF413jsHjIZ6k" target="_blank" rel="noopener noreferrer">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/the-secret-to-creating-the-cloud-like-experience/id1779293119?i=1000747995473" target="_blank" rel="noopener noreferrer">Apple Podcasts</a>, and <a href="https://www.youtube.com/watch?v=w4cbzJRq7B8&amp;list=PLe-h9HrA9qfAmGHgsmXUZgLL-T4Xjhlq8" target="_blank" rel="noopener noreferrer">YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=adam-skotnicky-on-taming-data-complexity-and-building-cloud-like-simplicity</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Now is the Time for Higher Education Institutions to Master Data Lineage</title><description><![CDATA[In today&apos;s state, local, and education (SLED) environments—especially higher education—budgets are under constant scrutiny, and the demand for data excellence is constant. One high-impact change to your data workflows can transform the quality of your data and AI while lowering costs.]]></description><link>https://www.cloudera.com/blog/business/now-is-the-time-for-higher-education-institutions-to-master-data-lineage.html</link><guid>https://www.cloudera.com/blog/business/now-is-the-time-for-higher-education-institutions-to-master-data-lineage.html</guid><pubDate>Mon, 16 Mar 2026 17:43:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Jeremiah Morrow,Hilary Billingslea,Art Jordan]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/person-from-audience-talking.webp"><p>In today's state, local, and education (SLED) environments—especially higher education—budgets are under constant scrutiny, and the demand for data excellence is constant. That means doing more with fewer resources. One high-impact change to your data workflows that can transform the quality of your data and AI while lowering costs is automating and documenting data lineage.</p>
<p>Higher education institutions are battling data complexity: critical data lives across systems and environments that were never designed to talk to each other, including on-premises databases, cloud environments, and edge devices. Managing fields like student IDs, grant IDs, or year-to-date endowment performance across sources and teams is necessary but difficult, manual, and prone to error.</p>
<p>Without first having trusted, high-quality data, high-impact analytic and AI use cases remain a pipe dream. However, if higher ed institutions have a unified view of data lineage across systems, they can successfully leverage this data for AI-driven insights and actions in curriculum development, student recruiting, student retention, efficient campus operations, migrations to the cloud, and so much more.</p>
<p><a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Data Lineage</a> provides an automated and consistent way to map the flow of data from its creation (source) to its ultimate consumption (BI or AI). It harvests and interprets metadata very quickly, helping organizations build a comprehensive knowledge graph that shows exactly how data is created, transformed, and consumed, consistently across the entire map with no gaps.</p>
<h2>Achieving Data Excellence with Cloudera Data Lineage</h2>
<p>In our recent webinar, <a href="https://www.carahsoft.com/learn/event/72519-building-trust-and-compliance-in-sled-organizations-with-cloudera-octopai-data-lineage" target="_blank" rel="noopener noreferrer">Building Trust and Compliance in SLED Organizations</a>, hosted by Cloudera and partner Carahsoft, panelist Art Jordan (Sales Go-to-Market Director, Data Intelligence Products for Cloudera Data Lineage) noted that “data lineage is a billion-dollar problem.” If you rely on manual processes and have blind spots in your data mapping, inefficiencies and delays are inevitable, which creates critical challenges around explainable AI, personally identifiable information (PII) privacy, and regulatory compliance.</p>
<p>Cloudera Data Lineage addresses these challenges by providing detailed views of lineage with dependencies and transformations consistently across the entire map:</p>
<ul>
<li><p>Cross-system lineage: Provides lineage at the system level from the entry point all the way to reporting, analytics, and any data consumer.</p>
</li>
<li><p>Inner-system lineage: Details the asset-level lineage within an extract, transform, and load (ETL) process, report, or database object. This includes seeing how a field is derived or calculated inside a pipeline or repository.</p>
</li>
<li><p>End-to-end lineage: Provides asset-level lineage between systems. This accounts for complex relationships where one field may feed multiple systems or come from multiple sources (one-to-many and many-to-one).</p>
</li>
</ul>
<p>Mastering lineage gives higher education institutions the ability to perform upstream and downstream analytics and mapping quickly. It provides end-to-end visibility and governance, enabling organizations to understand where their data is going, where it came from, and how it was derived. This transparency, together with the ability to guarantee integrity, is essential for ensuring that the data used in AI models and delivered to senior leadership and external partners is trusted and high quality.</p>
<h2>Success Story: How The University of Arizona Improved Efficiency and Cut Costs with Cloudera Data Lineage</h2>
<p>The University of Arizona (U of A), a major research university, implemented Cloudera Data Lineage within their University Analytics and Institutional Research department. Their environment included running 10,000 extract, transform, and load (ETL) jobs each night and housing close to 40,000 distinct columns in their data warehouse. Manual data documentation was challenging due to this sheer volume.</p>
<p>The university achieved significant efficiency gains and cost reduction by:</p>
<ul>
<li><p>Performing ETL impact analysis: Analyzing the impact of major PeopleSoft updates (which change data types and lengths or delete columns) previously took the data engineering team a week or more. Cloudera Data Lineage cut this time down to a few days.</p>
</li>
<li><p>Consolidating artifacts: Each ETL job consumes compute, storage, and logging resources. Using Cloudera’s end-to-end metadata view, U of A consolidated artifacts, reducing ETL jobs from 10,000 down to 8,000. This 20% reduction lowered infrastructure costs, decreased pipeline complexity, and reduced operational overhead while improving data consistency and governance across the environment.</p>
</li>
<li><p>Leveraging rapid discovery: Using the Cloudera Data Lineage discovery module, the team compiled a list of all ETL jobs containing specific commented-out SQL. This task, which was required for a major system upgrade, would have taken significant time to perform manually but was completed instantly via automation.</p>
</li>
</ul>
<p>Crucially, Cloudera Data Lineage strengthened audit readiness and data accuracy by providing stakeholders with clear visibility into how data flows through pipelines, repositories, and BI reports. Instead of relying solely on the data engineering team to manually trace data origins and transformations, compliance, institutional research, and finance teams could independently verify where data came from and how it was calculated. This reduced the risk of reporting errors and accelerated responses to regulatory and accreditation inquiries, all while easing pressure on lean IT budgets and resources.</p>
<h2>Take the Next Step</h2>
<p>Are you confident in your organization’s ability to prove compliance and data accuracy when faced with budget scrutiny or rapid operational change? What is the single most complex data pipeline transformation you would like to automatically document and map next week?&nbsp;</p>
<p><a href="/content/www/en-us/contact-sales.html">Let’s discuss</a> how Cloudera Data Lineage can help you achieve data excellence.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=now-is-the-time-for-higher-education-institutions-to-master-data-lineage</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Enterprise Landing Zones Matter: Why Cloudera Runs Natively Within Governed AWS Environments</title><description><![CDATA[Enterprise cloud adoption has matured. Organizations no longer deploy workloads into isolated or unrestricted cloud accounts. Instead, they operate within governed, cloud provider landing zones that enforce security, identity, networking, and compliance controls by default.]]></description><link>https://www.cloudera.com/blog/business/why-cloudera-runs-natively-within-governed-aws-environments.html</link><guid>https://www.cloudera.com/blog/business/why-cloudera-runs-natively-within-governed-aws-environments.html</guid><pubDate>Fri, 13 Mar 2026 19:26:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Corin Bishop,Peter Ryan]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-building-mirror.webp"><p>Enterprise cloud adoption has matured. Organizations no longer deploy workloads into isolated or unrestricted cloud accounts. Instead, they operate within governed, cloud provider landing zones that enforce security, identity, networking, and compliance controls by default.</p>
<p>When data and AI platforms don’t integrate cleanly into these landing zones and instead expect customers to weaken governance or introduce exceptions to cloud controls, deployments slow down. Security reviews become more complex, operational risk increases, and platform teams lose confidence in long-term scalability.</p>
<p>Enterprise buyers increasingly expect data and AI platforms to work with their cloud governance models, not around them. We’re proud to note that, under real customer conditions, the <a href="/content/www/en-us">Cloudera</a> platform runs natively inside <a href="https://aws.amazon.com/controltower/" target="_blank">AWS Control Tower</a>–managed landing zones, delivering scale, compliance, and long-term trust.</p>
<h2>Landing Zones Are Now the Enterprise Default</h2>
<p>Landing zones act as a standardized cloud foundation, allowing organizations to scale securely and consistently. They define how accounts are created, how identity and access are managed, how networks are structured, and how security controls are enforced.</p>
<p>For large enterprises and regulated industries, operating within landing zones isn’t optional; it’s the default for running workloads in public clouds at scale.</p>
<h2>Validating Cloudera with AWS Control Tower</h2>
<p>To validate Cloudera under real-world enterprise conditions, we deployed the platform within an Amazon Web Services (AWS) landing zone built using AWS Control Tower. This environment included:</p>
<ul>
<li><p>A multi-account structure aligned with enterprise patterns</p>
</li>
<li><p>Centralized <a href="https://aws.amazon.com/iam/" target="_blank">AWS Identity and Access Management</a> (IAM)</p>
</li>
<li><p>Preventive and detective security guardrails</p>
</li>
<li><p>Standardized networking, logging, and monitoring</p>
</li>
</ul>
<p>The validation demonstrated that Cloudera can be deployed, operated, and scaled without breaking or bypassing AWS landing zone controls. Running Cloudera natively within this environment reduces deployment risk, shortens security review cycles, and accelerates time to value for enterprise customers.</p>
<p>Specific outcomes from the validation exercise include:</p>
<ul>
<li><p>Cloudera operates within AWS Control Tower–managed accounts without requiring privileged exceptions</p>
</li>
<li><p>Security and compliance guardrails remain intact</p>
</li>
<li><p>Platform operations align with enterprise IAM and networking models</p>
</li>
<li><p>Customers can deploy Cloudera as a first-class workload within their governed AWS environments</p>
</li>
</ul>
<h2>Governance and Innovation Are Not Opposites</h2>
<p>There is a persistent misconception that governance slows innovation. In practice, strong cloud foundations enable faster and safer adoption by removing ambiguity and reducing operational friction.&nbsp;</p>
<p>By aligning our platform with enterprise landing zone architectures, Cloudera supports both innovation and control. Customers can confidently adopt advanced analytics and AI capabilities on the Cloudera platform without compromising their cloud governance model.</p>
<p>To learn more about how you can deploy Cloudera natively within governed AWS environments, <a href="/content/www/en-us/contact-sales.html">reach out</a> to our professional services team, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">check out our product demos</a>, or <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">sign up for a free 5-day trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=why-cloudera-runs-natively-within-governed-aws-environments</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>How Cloudera and Salt AI Deliver a Flagship AI Foundation for Life Sciences</title><description><![CDATA[Life sciences teams are working with more data, models, and regulatory scrutiny than ever before. And much of that data—omics, imaging, electronic health records, trial protocols, real‑world evidence, and more—is stored in unstructured formats that are hard to search and govern. ]]></description><link>https://www.cloudera.com/blog/business/how-cloudera-and-salt-ai-deliver-a-flagship-ai-foundation-for-life-sciences.html</link><guid>https://www.cloudera.com/blog/business/how-cloudera-and-salt-ai-deliver-a-flagship-ai-foundation-for-life-sciences.html</guid><pubDate>Thu, 12 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Aber Whitcomb,Andreas Skouloudis]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-gettybu003307.webp"><p><i>Figure 1. How The Cloudera and Salt AI Partnership Accelerates Innovation in Life Sciences</i></p>
<h2>From Experiments to Business Value</h2>
<p>In enterprise deployments, combinations of Cloudera and Salt AI have enabled organizations to achieve unprecedented scale, with a throughput of thousands of data engineering jobs per hour, faster prototyping of complex R&amp;D workflows, and step‑change performance and cost improvements for machine learning workloads like AlphaFold2. For example, Salt AI has delivered <a href="https://www.salt.ai/blog/accelerating-drug-discovery-with-alphafold2-optimization" target="_blank">processing times 22x faster</a> than previous benchmarks for AlphaFold2. Equally important, these gains come with full telemetry, governance inheritance, and a clear audit trail for every workflow run. Ultimately, teams can focus on scientific outcomes rather than on integrating existing data and technology solutions.</p>
<p>Salt AI will continue to invest in interoperability with a broad ecosystem of clouds, data platforms, and models while collaborating with partners like Cloudera to publish concrete patterns that regulated industries can adopt and adapt. For life sciences teams, that means more choices—and clearer examples—for turning AI experiments into durable, trustworthy systems. Learn more about <a href="/content/www/en-us/products.html">Cloudera capabilities</a> and <a href="https://www.salt.ai/" target="_blank">the Salt AI platform</a>.&nbsp;</p>
<h2>Life Sciences AI Needs Patterns, Not One‑Off Proofs Of Concept</h2>
<p>Life sciences teams are working with more data, models, and regulatory scrutiny than ever before. And much of that data—omics, imaging, electronic health records, trial protocols, real‑world evidence, and more—is stored in unstructured formats that are hard to search and govern.&nbsp;</p>
<p>AI has the potential to redefine what’s possible in the life sciences—transforming vast, disconnected stores of biological and clinical data into actionable intelligence that accelerates discovery, sharpens decision making, and ultimately helps bring lifesaving innovations to patients faster. But first, organizations must prove that AI‑driven decisions are explainable, stable, and compliant.</p>
<p>In this environment, one‑off proofs of concept (POCs) are not enough. To achieve an acceptable level of governance and trust in AI-driven insights, life sciences organizations need to combine a trusted data and compute foundation with an intelligence layer that can orchestrate models and workflows at scale.</p>
<h2>The Cloudera and Salt AI Partnership: A Reference Architecture for Contextual, Trusted AI at Scale</h2>
<p>Cloudera and Salt AI are partnering to offer one powerful reference combination for life sciences teams.&nbsp;</p>
<ul>
<li><p><a href="/content/www/en-us/products.html">Cloudera</a> provides an open data lakehouse and enterprise AI platform that integrates data streaming, data engineering, data warehousing, and ML/GenAI at scale with a unified security and governance layer through SDX. This framework features attribute-based data access controls, lineage, and active metadata enrichment and cataloging.</p>
</li>
<li><p><a href="https://www.salt.ai/" target="_blank">Salt AI</a> leverages those foundational security mechanisms and adds an orchestration layer across AI models and data. The scalable infrastructure continuously captures context (prompts, system prompts, workflow designs, run performance, user roles, and data sources), enabling complex use cases that capture full value from both specialized and general AI models. Tool calls for agentic operations can be readily spun up through Salt’s txt2 assistant, and pipelines come alive visually in the canvas, showcasing exactly how data flows.</p>
</li>
</ul>
<p>This partnership enables life sciences organizations to apply fine-grained controls across on-premises, public cloud, and hybrid environments; use any model appropriate to a given task; and achieve an auditable, visual record of how AI systems make decisions.</p>
<p>In addition, both Cloudera and Salt AI drive computational and operational efficiencies across the data lifecycle. Leveraging GPU acceleration frameworks, Cloudera delivers improvements on data engineering and LLM inferencing workloads of up to <a href="https://blogs.nvidia.com/blog/cloudera-spark-irs-gpus/" target="_blank">20x</a> and <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">36x</a>, respectively. Similarly, Salt AI <a href="https://www.salt.ai/blog/accelerating-drug-discovery-with-alphafold2-optimization" target="_blank">offers optimizations</a> such as a split-compute architecture that balances CPU and GPU processes, a sophisticated caching system, and the ability to swap, mix, and combine AI models into workflows. The more complex the pipeline and the more it is run, the greater the compute efficiencies when running on Salt.</p>
<h2>Built to Live in a Broader Ecosystem</h2>
<p>The Cloudera and Salt AI solution is explicitly designed to work seamlessly within each customer’s existing ecosystem of clouds, data platforms, and AI tools. It can be deployed in a customer’s virtual private cloud (VPC), with no public egress, and integrates with a diverse array of model providers, vector stores, and data systems.</p>
<p><a href="/content/www/en-us/products/open-data-lakehouse.html">Cloudera’s open data lakehouse</a>, built on Apache Iceberg, offers a flexible and performant table format that combines multi-function analytics and automated data management capabilities (e.g., schema and partition evolution). This approach standardizes feature engineering workflows across disparate and diverse data sources, facilitating <a href="/content/www/en-us/blog/business/navigating-gxp-compliance-in-the-age-of-precision-medicine-and-ai.html">GxP compliance</a> in life sciences.&nbsp;</p>
<p>Additionally, the <a href="/content/www/en-us/blog/business/the-future-delivered-today-the-ai-powered-data-lakehouse.html">Cloudera Iceberg REST catalog</a> enables data sharing with other public cloud data platforms (e.g., Databricks, Snowflake) that support Apache Iceberg tables. Salt AI offers a mechanism that transforms text queries into R&amp;D workflows that orchestrate LLMs, graph databases, modeling tools, and internal systems. Furthermore, it empowers researchers to convert code (e.g., Python scripts) into visual workflows, improving cross-functional collaboration among research teams. These capabilities accelerate innovation cycles by democratizing siloed research initiatives and automating the integration of complex systems without the labor-intensive effort to build custom integration and orchestration logic.</p>
<p>For organizations standardizing on Cloudera, this partnership offers a fast path: governed data combined with contextual orchestration, ready for use cases like molecule design, drug repurposing, translational medicine, protocol authoring, and medical affairs assistants. For others, it serves as a blueprint for marrying existing data platforms with a context‑first AI orchestration layer.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=how-cloudera-and-salt-ai-deliver-a-flagship-ai-foundation-for-life-sciences</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Accelerating Humanitarian Impact with AI</title><description><![CDATA[Mercy Corps operates in environments where timely, well-informed decisions are essential to effective crisis response. Teams must rapidly assess conditions and draw on research and historical knowledge, often under intense pressure.]]></description><link>https://www.cloudera.com/blog/business/accelerating-humanitarian-impact-with-ai.html</link><guid>https://www.cloudera.com/blog/business/accelerating-humanitarian-impact-with-ai.html</guid><pubDate>Thu, 12 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-doctors-talking-data.webp"><p><a href="https://www.mercycorps.org/" target="_blank">Mercy Corps</a> operates in environments where timely, well-informed decisions are essential to effective crisis response. Teams must rapidly assess conditions and draw on research and historical knowledge, often under intense pressure.</p>
<p>As global crises have increased in scale and complexity, this model has become harder to sustain. At the same time, funding constraints have driven sector-wide contraction, requiring organizations like Mercy Corps to do more with fewer resources, even as delays in analysis can have real consequences on the ground.</p>
<p>To address this challenge, Mercy Corps began exploring how data and AI could reduce friction in crisis research without replacing human judgment. By combining Mercy Corps’ humanitarian expertise with Cloudera’s data and AI capabilities, the two organizations set out to strengthen crisis response and support Mercy Corps’ mission at scale.</p>
<h2>Managing Processes at Scale</h2>
<p>Mercy Corps’ Global Crisis Analysis teams support decision-making across the organization by producing research on aid and development topics in rapidly changing contexts. Their work informs everything from emergency response planning to longer-term program design. These teams analyze conflict dynamics, food insecurity, displacement trends, and economic shocks to help anticipate needs and guide action.</p>
<p>Historically, this research relied on manual processes. Analysts navigated across numerous news sources, websites, and information platforms, copying and recording information into spreadsheets and documents before synthesizing it into reports. While thorough, this process was time consuming and created bottlenecks when rapid crisis analysis was required.</p>
<p>As the scale and pace of crises increased, Mercy Corps recognized that this model was not sustainable. The organization also faced practical constraints. Technical capacity was limited, teams were under-resourced, and building new AI solutions internally would have required investments that were difficult to absorb while maintaining existing operations.</p>
<h2>Realizing the Power of Professional Services</h2>
<p><a href="/content/www/en-us/services-and-support/professional-services.html">Cloudera’s Professional Services team</a> provided the capacity and expertise Mercy Corps needed at a critical moment. And through this partnership, Mercy Corps gained support from leading technical experts without the added strain of bringing in additional staff or infrastructure.</p>
<p><i>“The intention of this project wasn’t just to come in, do the work and then leave,”</i> said Laurence Da Luz, Senior Director, CTO &amp; Portfolio. <i>“It was to set them up to be self-sufficient.”</i></p>
<p>Cloudera’s team brought deep experience in data, analytics, and AI, along with a clear understanding of the operational and mission constraints humanitarian organizations face. Working closely with Mercy Corps stakeholders, the Professional Services team helped translate real-world challenges into a scalable solution that could evolve as needs changed.</p>
<p>Rather than approaching the engagement as a one-time delivery, the focus was on partnership and enablement. The goal was to move quickly during a period of crisis while setting Mercy Corps up with a solution they could adapt, extend, and sustain over time.</p>
<h2>A Human-Centered Approach to AI</h2>
<p>From the outset, the partnership was guided by a clear objective to start with the people and decisions that matter most. Cloudera Professional Services worked closely with Mercy Corps teams to understand how crisis research happens in practice and where delays and bottlenecks most directly affect outcomes.&nbsp;</p>
<p><i>“Recognizing that there is still a human element in the solution was vitally important,”</i> said Da Luz. <i>“The goal for us wasn’t to replace what they do with AI, as much of the work still requires human nuance and expertise.”</i></p>
<p>Rather than attempting to automate judgment, the solution was designed to accelerate it. AI was applied to handle information aggregation and early summarization, enabling analysts to spend more time interpreting findings and applying contextual expertise where human judgment is essential.</p>
<p>This approach resulted in a flexible, AI-driven research capability that brings fragmented workflows into a more unified experience. These capabilities allowed analysts to quickly identify, access, and synthesize information from diverse sources, reducing research cycle time while maintaining human oversight.</p>
<p>At a technical level, Mercy Corps’ solution leverages multiple agentic workflows aligned to different humanitarian research themes. These agent workflows process large volumes of diverse, fast-changing humanitarian and social data. The resulting output helps surface highly relevant information based on the analyst’s stated objectives. Because the system supports conversational interaction, analysts can iteratively refine results and guide the output toward their specific scenario, while retaining full control over interpretation and final conclusions.</p>
<p>Designed for the realities of humanitarian work, the solution adapts to varied geographies, audiences, and crisis types without requiring significant changes to existing workflows. Support for evolving research needs, multilingual sources, and rapidly changing conditions allows teams to respond faster and make more informed decisions in moments where timing and context are critical.</p>
<h2>Impact Beyond Innovation</h2>
<p>For Cloudera team members, working on the Mercy Corps project has been especially meaningful. Beyond the technical challenges, the work offered a direct connection between technology and social impact. Many involved have spoken about the pride that comes from knowing their work helps support humanitarian efforts around the world.&nbsp;</p>
<p><i>“It’s quite humbling when you sit and understand the work they’re doing and the reasons behind it,”</i> said Alastair Elliot, Director of Professional Services, North EMEA.&nbsp;</p>
<p>The project gave the team new insights and learnings to help refine and expand on Cloudera’s existing AI capabilities. It also directly helped to strengthen Cloudera’s library of proven patterns and reference architectures, applicable across industries. This combination of learning and collaboration reflects the company’s culture of empowering teams to pursue work that aligns with both business goals and values.</p>
<h2>AI Solutions for a Deeply Human Purpose</h2>
<p>Cloudera’s partnership with Mercy Corps demonstrates what is possible when advanced data and AI capabilities are paired with a clear mission and a collaborative approach. By focusing on human needs, operational realities, and long-term sustainability, the two organizations delivered a solution that accelerates impact where it matters most.</p>
<p>We are proud of the work accomplished together and inspired by the potential ahead. This collaboration serves as a model for how organizations can apply AI responsibly, effectively, and with purpose, not just to solve technical problems, but to support people and communities around the world.</p>
<p>Learn more about how Cloudera’s <a href="/content/www/en-us/services-and-support/professional-services.html">Professional Services</a> team can support the most complex data and AI initiatives.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=accelerating-humanitarian-impact-with-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Scalable AI Economics: Achieving Secure, Hybrid Intelligence with Cloudera, AMD, and Dell Technologies </title><description><![CDATA[Enterprise interest in generative and agentic AI has accelerated dramatically over the past two years. Organizations across industries are exploring how AI agents, intelligent assistants, and automation can improve productivity, streamline operations, and unlock insights from growing volumes of enterprise data. Yet as enthusiasm grows, so do questions around cost, security, and operational complexity.]]></description><link>https://www.cloudera.com/blog/partners/scalable-ai-economics-achieving-secure-hybrid-intelligence-with-cloudera-amd-and-dell-technologies.html</link><guid>https://www.cloudera.com/blog/partners/scalable-ai-economics-achieving-secure-hybrid-intelligence-with-cloudera-amd-and-dell-technologies.html</guid><pubDate>Wed, 11 Mar 2026 13:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Stephen Catanzano]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-girls-walking-datamesh.webp"><p><a href="/content/www/en-us/campaign/achieving-economical-ai-value-securely-at-scale.html">Enterprise interest in generative and agentic AI</a> has accelerated dramatically over the past two years. Organizations across industries are exploring how AI agents, intelligent assistants, and automation can improve productivity, streamline operations, and unlock insights from growing volumes of enterprise data. Yet as enthusiasm grows, so do questions around cost, security, and operational complexity.</p>
<p>One reality is becoming increasingly clear: not every AI workload requires graphics processing units (GPUs) or massive foundation models. In fact, many high-value enterprise use cases can be delivered efficiently using central processing units (CPUs) and smaller, task-focused language models, particularly when deployed close to the data they serve.</p>
<p>A growing number of organizations are now reevaluating their AI strategies through this lens. Rather than pursuing scale at any cost, they are prioritizing return on intelligence: the ability to deploy AI solutions securely, economically, and at scale. This shift is shaping how enterprises think about infrastructure, data architecture, and governance as AI moves from experimentation into production.</p>
<h2>A Shift in Enterprise AI Economics</h2>
<p>Research from Enterprise Strategy Group (now part of Omdia) indicates that approximately <a href="https://research.esg-global.com/reportaction/515202089/Marketing" target="_blank" rel="noopener noreferrer">80% of organizations view AI agents as a top or high business priority</a>. These agents promise tangible benefits through automation, faster decision-making, and improved employee and customer experiences. However, many organizations continue to struggle with the cost and operational burden associated with GPU-centric deployments.</p>
<p>GPU infrastructure can introduce significant capital expense, power consumption, and supply-chain constraints. For many real-time inference and knowledge-driven workloads, this approach can be misaligned with business needs. As a result, enterprises are increasingly exploring alternatives that better match compute resources to workload requirements.</p>
<p>This is where CPU-based AI, paired with smaller language models, has emerged as a practical option. Rather than pursuing the largest possible models, organizations are using the assets they already own to sidestep the budget challenges of purchasing or accessing GPUs. The goal is to right-size AI architectures for efficiency, security, and scalability.</p>
<h2>Right-Sized AI and the Role of Small Language Models</h2>
<p>Small language models (SLMs) are designed to perform specific enterprise tasks such as summarization, question answering, content generation, and code assistance. Typically containing far fewer parameters than large language models, SLMs can run effectively on modern CPUs while delivering strong performance for targeted use cases.</p>
<p>This approach offers several advantages. CPU-based inference reduces infrastructure costs, lowers power consumption, and simplifies deployment. It also enables organizations to run AI workloads within existing data centers or private cloud environments, addressing concerns around data sovereignty and regulatory compliance.</p>
<p>Within this context, Cloudera has positioned its Private AI strategy around enabling enterprises to deploy and operate AI systems entirely within their own controlled environments. By combining an open data lakehouse architecture with integrated governance and MLOps capabilities, <a href="/content/www/en-us/products/enterprise-ai.html">Cloudera supports AI development</a> that remains close to enterprise data.</p>
<h2>Infrastructure Matters: CPUs and Enterprise Platforms</h2>
<p>The effectiveness of CPU-based AI depends heavily on the underlying infrastructure. Advances in modern processors have significantly improved performance-per-dollar for analytics and inference workloads. <a href="https://www.amd.com/en/products/processors/server/epyc.html" target="_blank" rel="noopener noreferrer">AMD EPYC™ processors</a>, for example, are designed to deliver high core density, strong memory bandwidth, and built-in security features, making them well suited for AI inference and data-intensive workloads.</p>
<p>When deployed on <a href="https://www.dell.com/en-us/lp/dt/amd-servers" target="_blank" rel="noopener noreferrer">enterprise-grade systems from Dell Technologies</a>, organizations can scale AI workloads reliably while leveraging validated architectures optimized for data and AI platforms. This combination allows enterprises to modernize AI capabilities without re-architecting their entire infrastructure footprint.</p>
<p>From an operational perspective, this model enables organizations to reuse existing investments, accelerate deployment timelines, and reduce dependency on specialized hardware. Across these scenarios, the emphasis is not on model size, but on efficiency, responsiveness, and trust.</p>
<h3>Practical AI Use Cases With CPUs</h3>
<p>Many of today’s most valuable AI applications can run efficiently on CPUs without the need for massive models or GPU acceleration. Examples include:</p>
<p><b>Internal Knowledge Assistants</b></p>
<p>Enterprises often store critical knowledge across documents, emails, and reports. By applying SLMs to this data, organizations can enable natural-language access to internal information, improving decision-making while keeping sensitive data on premises.</p>
<p><b>Employee and Agent Assist Chatbots</b></p>
<p>HR, IT, and customer support teams face recurring questions that can be automated through secure, internal chatbots. CPU-based AI enables always-available assistance without introducing external data exposure.</p>
<p><b>Content and Documentation Generation</b></p>
<p>Marketing, compliance, and engineering teams frequently produce repetitive content. AI-assisted generation and summarization can accelerate workflows while maintaining consistency and governance.</p>
<p><b>Software Development Support</b></p>
<p>SLM-powered assistants can generate code snippets, tests, and documentation within enterprise firewalls, helping development teams improve productivity without sending intellectual property to public AI services.</p>
<p><b>Predictive Analytics and Optimization</b></p>
<p>In manufacturing and operations, CPU-based AI models analyze sensor and operational data to predict failures and optimize performance, reducing downtime and operational costs.</p>
<h2>Data Gravity and the Importance of On-Premises AI</h2>
<p>Despite widespread cloud adoption, a significant portion of enterprise data remains on premises. Omdia research indicates that many organizations keep between <a href="https://research.esg-global.com/reportaction/515202097/Marketing" target="_blank" rel="noopener noreferrer">26% and 75% of their data in local or private environments</a>. This data gravity presents challenges when AI processing requires moving sensitive information to external platforms.</p>
<p>Private AI architectures address this challenge by bringing AI to the data rather than the other way around. By running AI workloads within existing environments, organizations reduce latency, improve performance, and maintain compliance with regulations such as GDPR, HIPAA, and industry-specific mandates.</p>
<p><a href="/content/www/en-us/products/machine-learning.html">Cloudera’s approach</a> integrates data ingestion, governance, model management, and serving within a single platform. Combined with CPU-based infrastructure, this enables enterprises to move from pilot projects to production AI more efficiently.</p>
<h2>From Pilot to Production: Measuring Outcomes</h2>
<p>One of the most significant barriers to AI adoption has been the gap between proof-of-concept and production deployment. CPU-based AI architectures help narrow this gap by reducing cost and operational complexity.</p>
<p>Organizations adopting this approach report several outcomes:</p>
<ul>
<li>Lower total cost of ownership for inference-heavy workloads</li>
<li>Faster deployment cycles by avoiding specialized hardware procurement</li>
<li>Reduced energy consumption aligned with sustainability goals</li>
<li>Improved ROI through workload-appropriate compute selection</li>
</ul>
<p>These benefits reinforce a growing consensus that enterprise AI success depends as much on economics and governance as it does on model performance.</p>
<h2>Conclusion: A Practical Path Forward for Enterprise AI</h2>
<p>The next phase of enterprise AI will not be defined by the largest models or the most powerful hardware. Instead, it will be shaped by organizations that can deploy AI securely, economically, and at scale, using architectures aligned with real business needs.</p>
<p>By combining <a href="/content/www/en-us.html">Cloudera’s data and governance platform</a> with <a href="/content/www/en-us/partners/solutions/amd.html">AMD</a> EPYC processors and <a href="/content/www/en-us/partners/solutions/dell-technologies.html">Dell Technologies</a> infrastructure, enterprises have a viable path to operationalizing AI within their own environments. This right-sized approach enables organizations to focus on outcomes, not infrastructure complexity, and to unlock AI value where their data already lives.</p>
<p>As enterprises continue to move AI initiatives from experimentation into production, practical, CPU-based Private AI architectures are likely to play an increasingly important role.</p>
<p>To learn more about achieving economical AI with Cloudera, AMD, and Dell Technologies, download the <a href="/content/www/en-us/campaign/achieving-economical-ai-value-securely-at-scale.html">Omdia Showcase Brief</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=scalable-ai-economics-achieving-secure-hybrid-intelligence-with-cloudera-amd-and-dell-technologies</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>When AI Models Converge, Proprietary Data Becomes the Advantage</title><description><![CDATA[Today’s leading large language models (LLMs)—including Claude, GPT, Gemini, Grok, Mistral, and Llama—are all trained on broadly available public internet data and built on comparable architectures. As a result, performance gaps between models are shrinking.]]></description><link>https://www.cloudera.com/blog/business/when-ai-models-converge-proprietary-data-becomes-the-advantage.html</link><guid>https://www.cloudera.com/blog/business/when-ai-models-converge-proprietary-data-becomes-the-advantage.html</guid><pubDate>Tue, 10 Mar 2026 16:09:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Pamela Pan]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-woman-typing-on-laptop.jpg"><p>Today’s leading large language models (LLMs)—including Claude, GPT, Gemini, Grok, Mistral, and Llama—are all trained on broadly available public internet data and built on comparable architectures. As a result, <a href="https://artificialanalysis.ai/trends" target="_blank">performance gaps between models are shrinking</a>, and the competitive edge once associated with choosing a specific AI model is narrowing. 
At the same time, <a href="https://www.mckinsey.com/capabilities/people-and-organizational-performance/our-insights/the-agentic-organization-contours-of-the-next-paradigm-for-the-ai-era" target="_blank">business research</a> and <a href="https://timesofindia.indiatimes.com/technology/tech-news/oracle-cofounder-larry-ellison-on-the-biggest-problem-that-all-ai-models-including-chatgpt-gemini-grok-llama-have/articleshow/127537262.cms" target="_blank">executive commentary</a> increasingly point to the same dynamic: AI delivers the greatest long-term value when it can run on proprietary, organizational data that competitors cannot access or replicate.</p>
<p><i>&quot;For these [foundation] models to reach their peak value, you need to train them not just on publicly available data, but you need to make privately owned data available to those models.&quot; -Oracle Co-founder, Chairman, and CTO Larry Ellison, <a href="https://www.youtube.com/live/4eCFmbX5rAQ?t=323s" target="_blank">Oracle AI World 2025</a></i></p>
<p>As foundational capabilities become more standardized, differentiation shifts from the model itself to how effectively enterprises capture, govern, and operationalize their unique data assets. That shift raises a practical question: how do organizations turn proprietary data into a lasting AI advantage?&nbsp;</p>
<h2>RAG Is a Starting Point, Not a Differentiation Strategy</h2>
<p>Many organizations begin their AI journey with a simple architecture: call a cloud-hosted model and add retrieval-augmented generation (RAG) to pull in internal documents. This approach is effective for early experimentation. It allows teams to build prototypes quickly and demonstrate value immediately.</p>
<p>However, it has limitations when the goal is competitive differentiation. RAG retrieves information at query time, but it does not fundamentally change how the model understands a domain. The model remains general-purpose, and the underlying enterprise knowledge stays external to the model itself. If competitors can access the same base models and implement similar retrieval pipelines, the resulting capabilities are difficult to distinguish.</p>
<p>For enterprises seeking durable advantage, simply retrieving proprietary data is not enough. The model must learn from it.</p>
<h2>Building AI on Proprietary Data</h2>
<p>To turn proprietary data into a lasting advantage, organizations need to go beyond simply querying external models. They need to adapt models to their own data and run them within environments they control. This is where fine tuning and private inference become important.</p>
<h3>Fine Tuning</h3>
<p>Fine tuning allows organizations to adjust a model’s internal weights using proprietary datasets so that domain knowledge is embedded in how the model behaves. Instead of retrieving information at query time, the model begins to understand the organization’s terminology, workflows, and decision patterns.&nbsp;</p>
<p>In many cases, organizations also augment their training pipelines with synthetic data, generating enterprise-grade datasets that expand training coverage while addressing compliance and data availability challenges. Over time, these approaches create AI systems that are aligned with the business itself, not just the public Internet.</p>
<h3>AI Inference</h3>
<p>Once models are adapted to proprietary data, the next step is how they are deployed and operated in production. Running AI inference within private infrastructure allows organizations to operate AI systems directly within their enterprise environment. This approach provides several important benefits:</p>
<ul>
<li><p>Data privacy and control. Prompts, model artifacts, and outputs remain within the organization’s environment rather than being sent to external services.</p>
</li>
<li><p>Improved performance. Deploying models closer to where enterprise data resides can reduce latency and improve responsiveness for production applications.</p>
</li>
<li><p>Unified governance. Security policies, access controls, and data lineage can be maintained consistently across the entire AI lifecycle.</p>
</li>
</ul>
<p>At enterprise scale, competitive advantage increasingly comes from the ability to adapt models to proprietary data and run models where that data resides.</p>
<h2>Your Data, Your Models, Your Way</h2>
<p>In a world where foundation models continue to converge, the ability to operationalize AI on unique enterprise data will increasingly define long-term competitive advantage.&nbsp;</p>
<p>Cloudera believes the next era of enterprise AI will be defined by this shift toward Private AI architectures. With Cloudera <a href="/content/www/en-us/products/machine-learning/ai-workbench.html">AI Workbench</a>, <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">AI Inference Service</a>, and <a href="/content/www/en-us/products/machine-learning/ai-studios.html">AI Studios</a> (which include low-code tools for RAG and model fine tuning), we provide the end-to-end, governed control needed to ingest, fine-tune, and serve models within your trusted perimeter, across any cloud or data center.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=when-ai-models-converge-proprietary-data-becomes-the-advantage</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Dr. Jake Trippel on Why Your Technical Debt Is Compounding</title><description><![CDATA[In  episode 52 of The AI Forecast, Why LLMs Aren’t Enough and How AI Fabrics Will Change Everything, host Paul Muller sits down with Dr. Jake Trippel, Dean of the College of Business and Technology at Concordia University, St. Paul, and Co-Founder &amp; CTO of Codename 37, to unpack what’s holding enterprises back from scaling AI]]></description><link>https://www.cloudera.com/blog/business/dr-jake-trippel-on-why-your-technical-debt-is-compounding.html</link><guid>https://www.cloudera.com/blog/business/dr-jake-trippel-on-why-your-technical-debt-is-compounding.html</guid><pubDate>Tue, 10 Mar 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-podcast-dr-jake-trippel.webp"><p>AI is only as powerful as the data architecture behind it.</p>
<p>In episode 52 of The AI Forecast, <a href="https://youtu.be/bP8hzgQQDHk?si=xn1mQY7jzUZbs6WH">Why LLMs Aren’t Enough and How AI Fabrics Will Change Everything</a>, host Paul Muller sits down with <b>Dr. Jake Trippel</b>, Dean of the College of Business and Technology at Concordia University, St. Paul, and Co-Founder &amp; CTO of Codename 37, to unpack what’s holding enterprises back from scaling AI:</p>
<ul>
<li><p>Siloed data architecture</p>
</li>
<li><p>Misunderstanding of the power of machine learning, deep learning, and neural networks</p>
</li>
<li><p>Compounding technical debt</p>
</li>
</ul>
<p>Their conversation ranges from cloud versus on-prem economics to the coming shift from SaaS applications to bot-based experiences. Below are key moments from their discussion.</p>
<h2>Why AI Architectures Are Hitting Their Limits</h2>
<p><b>Paul</b>: Tell us about what we’ve seen in the past with AI and data architectures, and why we need to rethink them now.</p>
<p><b>Jake</b>: We went through the digital transformation era, that was the challenge with data. We stayed in data silos because that's how our platforms were architected, and that's how data was organized. Then we tried to do a bunch of integrations. We tried to do all these app integration engines. We tried to find nifty ways to do it, but what happened was we created a spaghetti mess pulling ELT to ETL, system to system.</p>
<p>Now fast forward to today. The challenge now is that these organizations are incentivized to keep us in silos because now comes AI data silos, the data still in silos, and that's where the power of cloud comes in. That's where we're proud to be a Cloudera partner.</p>
<p>Imagine the same problem, except amplified. I’ve got AI agents up the kazoo — awesome — but they’re only working inside their own data silo.</p>
<p>People are going to want more. They’re going to want agents that can work together, talk together, and reason together. But how do you do that if your data is still stuck in silos? To get to this data mesh state is going to require a transformational change, and that's why Cloudera is a cool solution that can help folks do that.</p>
<h2>Why Large Language Models Aren’t Enough</h2>
<p><b>Paul</b>: What are some of the hacks, best practices, tips or tricks that you use to help you get the most out of what you do with data?</p>
<p><b>Jake</b>: The biggest thing is understanding that large language models are not the answer for everything. AI is a big world.</p>
<p>Large language models are awesome for some things, but they’re really bad for others. People have to understand the power of machine learning, deep learning, and neural networks — which are really the guts of the other two.</p>
<p>The skillset of our time right now is being able to develop or use the right models for the right use cases, and to rapidly get through data. That’s where people need to focus.</p>
<h2>The Compounding Effect of Technical Debt</h2>
<p><b>Paul</b>: How do organizations, in your opinion and experience, pragmatically start to move from where they’ve been to where they’re going? How do they clean their data up? Is there a mechanism by which they can do it without breaking?</p>
<p><b>Jake</b>: That's a big loaded question, so I'll try to pull it apart a little bit. You’re three decades in for a reason. We still see AS/400s out there — and they work. You got to give IBM credit.</p>
<p>The challenge that these organizations have though is how much capital are you expending? Because of the compounding effect of this technical debt — you can kick the can down the road year after year, decade after decade. The cost is only going to grow.</p>
<p>But now at least you have options. We can take the data out and we can do a lot more with it than we ever have before. Instead of ripping off the Band-Aid approach, as long as we have access to the data and continue access to the data, we can now create any type of experience we want in parallel.</p>
<h2>Why Some AI Workloads Are Moving Back On-Prem</h2>
<p><b>Paul</b>: What are you seeing with your existing clients today as they’re looking to deploy new workloads?</p>
<p><b>Jake</b>: We are seeing a massive migration back to on-prem. Couldn’t believe it. Never would have predicted that.</p>
<p>As these organizations are doing more model development, training, and so on, the cloud cost model is just too expensive. I have not met a CFO who’s excited about spending how much a month training these models.</p>
<p>So, they’re making the investment. They’re going back to data centers. They’re depreciating it over the next five years. We’re seeing this in medical devices, financial services, aviation — it’s typically hybrid, but for particular workloads, especially training and development, it’s way more cost effective.</p>
<h2>AI as an Amplifier for Learning — Good and Bad</h2>
<p><b>Paul</b>: What are you seeing in terms of the academic world and how we prepare the workforce of the future?</p>
<p><b>Jake</b>: AI is an amplifier. It’s going to amplify the good — and it’s going to amplify the bad.</p>
<p>On the good side, people will learn 10, 20 times faster than they ever have before. I’ve built models that can read books in three seconds flat. I can now immerse myself in the data and create any type of learning experience I want adapted to my learning style.</p>
<p>The bad side is students choosing, I don’t have to do anything. I can let AI do all my work and I’m not going to learn anything. That’s the part that scares me.</p>
<p>The skillset of our time is, I hope you like learning. You’re going to be doing it every single day of the rest of your career.</p>
<p>Listen to the full conversation with Dr. Jake Trippel on The AI Forecast on <a href="https://open.spotify.com/episode/2QCNnEe7cWIjXQWDQesdx6?si=14ebf81c01514f4b">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/why-llms-arent-enough-and-how-ai-fabrics-will-change/id1792001677?i=1000740667250&amp;l=ko">Apple Podcasts</a>, and <a href="https://www.youtube.com/watch?v=bP8hzgQQDHk">YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=dr-jake-trippel-on-why-your-technical-debt-is-compounding</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera’s 2026 Trends in Data and AI Webinar Recap</title><description><![CDATA[I sat down with Manasi Vartak, Cloudera’s chief AI architect, and Mike Gualtieri, vice president and principal analyst at Forrester Research, for Cloudera’s 2026 Trends in Data and AI webinar to discuss how to deploy agentic AI at scale.
]]></description><link>https://www.cloudera.com/blog/business/clouderas-2026-trends-in-data-and-ai-webinar-recap.html</link><guid>https://www.cloudera.com/blog/business/clouderas-2026-trends-in-data-and-ai-webinar-recap.html</guid><pubDate>Mon, 09 Mar 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Robert Hryniewicz]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1164321876.jpg"><p>I recently sat down with Manasi Vartak, Cloudera’s chief AI architect, and Mike Gualtieri, vice president and principal analyst at Forrester Research, for Cloudera’s 2026 Trends in Data and AI webinar to discuss how to deploy agentic AI at scale.</p>
<p>While our conversation had a forward-thinking, future-oriented slant, I kicked off the webinar by posing this retrospective question: What is one belief about AI that died in 2025?</p>
<p>Between the three of us, we discovered that in 2025, several long-held beliefs about AI finally collapsed. I want to share with you the philosophies Manasi and Mike identified that we are leaving behind as we step into this new and exciting year in AI development.&nbsp;</p>
<h2>The Beliefs That Died: The Intellectual Gatekeeping of Agentic AI</h2>
<p>2025 began with the belief that agentic AI would be accessible only to a select few. With novel technologies, it is a basic instinct to defer to the tried-and-true experts: PhDs, engineers, and so on.</p>
<p>However, we are now seeing regular business users build their own functional AI pipelines. Manasi recalled the “lightning strike moment” from last year that sparked this realization—at a hackathon in our Agent Studio, an employee from our strategy department built a complete pipeline that had the potential to save $3 million a year. This was an incredible feat performed by someone without specialized training in agentic AI strategy.</p>
<p>To Manasi, this was the sign that agentic AI is truly being democratized across the board.</p>
<h3>The Beliefs That Died: Ubiquity of AI Hallucinations</h3>
<p>This past year, Mike noticed a marked reduction in AI hallucinations. He acknowledged they still occur but pointed out that, in the past, conversations surrounding AI use focused heavily on them as a threat to its dependability. Now, these fears are much less common.&nbsp;&nbsp;</p>
<p>Mike posited that people now have a better understanding of how to control the scope of an LLM through prompting, RAG techniques, and other methods. Enough users now understand the circumstances in which hallucinations arise, as well as the techniques for mitigating them.</p>
<h3>The Bigger Pattern</h3>
<p>AI has become genuinely actionable because it is now reliable and usable at scale. As agentic AI becomes more democratized, autonomous systems are no longer limited to elite technical teams—they can be deployed across organizations to execute defined tasks end-to-end. Improved accuracy and fewer hallucinations mean these systems can operate with minimal human oversight, shifting AI from an advisory role to an operational one.&nbsp;&nbsp;</p>
<p>Operational AI truly stands out because it reliably eases manual work while achieving impressive results like quicker cycle times, cost savings, and better decision-making. It’s exciting to see how automation brings real value to daily operations, making them smarter and more efficient, rather than just being limited to isolated tests.&nbsp;</p>
<h2>Why These Belief Shifts Matter Going Into 2026</h2>
<p>As trust in AI becomes informed rather than aspirational, the question is no longer whether AI can act, but where it is allowed to act. With increased confidence in data integrity and greater output reliability, AI can now move beyond isolated silos into core business processes and decision-making loops.&nbsp;&nbsp;</p>
<p>The real challenge now is whether organizations are structured to support this democratization. Spreading AI throughout the entire company means shifting away from bottlenecks that restrict experimentation to just a few technical teams. When operational leaders can safely access data across different environments, they’re empowered to build, test, and launch AI-powered tools that truly meet business needs. Without wider, well-managed access to data, AI stays centralized and disconnected from daily operations.</p>
<p>Organizations stuck in old beliefs or unwilling to adapt to new ones risk stalling and falling behind as the technology advances. Cloudera’s platform is designed to avoid this outcome and weather these changes in the ever-volatile AI landscape. Whether your data resides in the cloud, in data centers, or at the edge, Cloudera provides universal access to data for AI across the entire enterprise, with governed, enterprise-wide intelligence.</p>
<p>These themes and more are covered in detail by Manasi, Mike, and me in our talk, and I invite you to explore these shifts in greater depth with us in our <a href="/content/www/en-us/events/webinars/2026-data-trends.html">2026 Trends in Data and AI webinar</a>. For more insight into what these observations mean in practice and how your organization can make the most of democratized AI in your own environment, explore <a href="/content/www/en-us/blog.html">Cloudera’s latest resources</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderas-2026-trends-in-data-and-ai-webinar-recap</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>From Log Overload to Mission Readiness: Rethinking Government Data Architecture</title><description><![CDATA[Agencies that invest now in flexible, distribution-first architectures will strengthen both their cybersecurity and compliance postures while ensuring they’re well positioned to adapt to whatever comes next. Tools like Cloudera Data Flow make it possible to achieve the scalability, observability, and performance that today’s public sector organizations demand. ]]></description><link>https://www.cloudera.com/blog/technical/from-log-overload-to-mission-readiness-rethinking-government-data-architecture.html</link><guid>https://www.cloudera.com/blog/technical/from-log-overload-to-mission-readiness-rethinking-government-data-architecture.html</guid><pubDate>Mon, 02 Mar 2026 17:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Ian Brooks]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-blue-orange-back-person-walking.webp"><p>Across government agencies today, data is both a mission enabler and a hidden drain on resources. From cybersecurity and threat detection to compliance and citizen service delivery, public-sector missions depend on timely, trusted data. Yet the success of these programs—and the regulations that ensure their accountability—create an invisible cost: a flood of log data that strains infrastructure, slows systems, and inflates storage budgets.&nbsp;</p>
<p>To stay compliant, agencies and other regulated organizations must manage this growing data volume responsibly. But as it accumulates, log data can overwhelm even the most capable environments—consuming storage, increasing processing time, and degrading overall performance.&nbsp;</p>
<p>For many agencies, security information and event management (SIEM) platforms like <a href="https://www.splunk.com/" target="_blank" rel="noopener noreferrer">Splunk</a> sit at the heart of cybersecurity operations, yet even these best-in-class tools can struggle to keep pace. That’s why progressive agencies are rethinking the data architecture behind their SIEM platforms: not abandoning SIEM, but optimizing how data moves into and through those systems. Let’s talk about what that looks like in practice.</p>
<h2>A New Approach to Data Movement: Cloudera Data Flow&nbsp;</h2>
<p>Public-sector organizations are increasingly adopting solutions to streamline data movement. Smarter data distribution helps agencies improve system performance and reliability, control costs, and maintain end-to-end awareness of how data moves across their environments.&nbsp;</p>
<p>Cloudera Data Flow provides centralized control and visibility across on-premises and cloud environments, helping agencies manage data more securely and efficiently at scale. Rather than relying on one-off pipelines or manual integrations, Cloudera Data Flow functions as a connective layer that intelligently routes, filters, and delivers data where it’s needed. In short, it connects and manages data intelligently across environments, minimizing duplication and complexity while conserving both infrastructure and human resources.&nbsp;</p>
<p>For agencies balancing tight budgets and strict mandates, Cloudera Data Flow offers clear advantages, including:&nbsp;</p>
<ul>
<li><p><b>Optimized resources</b>: Route only the most critical data to Splunk or other SIEM tools, while archiving less-urgent logs in cost-effective object storage</p>
</li>
</ul>
<ul>
<li><p><b>Reduced noise</b>: Preprocess and filter high-volume data to accelerate analysis and improve the signal-to-noise ratio</p>
</li>
</ul>
<ul>
<li><p><b>Maintained compliance</b>: Preserve auditable chains of custody and full observability of every data flow</p>
</li>
</ul>
<ul>
<li><p><b>Hybrid continuity</b>: Support mission operations seamlessly across secure on-premises environments and evolving cloud initiatives</p>
</li>
</ul>
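<p>The routing and filtering ideas above can be illustrated with a minimal Python sketch. It is not Cloudera Data Flow configuration or API code; the <code>LogEvent</code> type, the severity levels in <code>CRITICAL_LEVELS</code>, and the destination names <code>siem</code> and <code>archive</code> are all hypothetical, chosen only to show severity-based routing with simple duplicate suppression.</p>

```python
from dataclasses import dataclass

# Hypothetical set of severities treated as critical (an assumption for
# illustration, not a Cloudera Data Flow or Splunk setting).
CRITICAL_LEVELS = {"ERROR", "CRITICAL", "ALERT"}


@dataclass(frozen=True)
class LogEvent:
    """A simplified log record; frozen so instances are hashable."""
    source: str
    level: str
    message: str


def route(event: LogEvent) -> str:
    """Return an illustrative destination name for one event."""
    if event.level.upper() in CRITICAL_LEVELS:
        return "siem"      # critical events go to the SIEM tool
    return "archive"       # everything else goes to low-cost object storage


def distribute(events):
    """Group events by destination, skipping exact duplicates."""
    batches = {"siem": [], "archive": []}
    seen = set()
    for event in events:
        if event in seen:  # noise reduction: drop exact repeats
            continue
        seen.add(event)
        batches[route(event)].append(event)
    return batches
```

<p>In practice, this kind of decision would more likely be expressed as routing rules within the data flow itself than in application code; the sketch only makes the logic concrete.</p>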
<table>
<tbody><tr><td><p>Interested in a deep dive into how universal data distribution works with Cloudera?</p>
<p>Explore the <a href="https://community.cloudera.com/t5/Developer-Blogs/How-To-Optimize-Log-Ingestion-With-Cloudera-Data-Flow/ba-p/413457" target="_blank" rel="noopener noreferrer">step-by-step guide on optimizing Splunk log ingestion with Cloudera Data Flow</a> to see how this can be implemented in practice.</p>
</td>
</tr></tbody></table>
<h2>Rethinking the Data Pipeline&nbsp;</h2>
<p>The shift toward universal data distribution reflects a larger change in how agencies think about data pipelines. For years, data integration was treated more like retrofitted plumbing—cobbling together different pipes and materials to connect and move data stored in different formats, within different tools, and governed by different rules.&nbsp;&nbsp;</p>
<p>Today, the limitations of that approach are clear. For true operational resilience, data flows need to be unified and transparent, regardless of where the data lives. Open-source technologies like <a href="/content/dam/www/marketing/resources/ebooks/scaling-nifi-for-the-enterprise-with-cloudera-dataflow.pdf.landing.html">Apache NiFi</a> have made this approach more accessible, allowing agencies to test, replay, and adjust data flows without disruption.&nbsp;&nbsp;</p>
<p>Using an open-source framework allows these disparate systems and data formats to work together seamlessly, enabling modernization without abandoning existing investments. For public sector IT leaders, this evolution strengthens mission continuity.&nbsp;</p>
<p>By reimagining data distribution as a core capability, agencies can turn what was once operational overhead into an architectural advantage that keeps everything operating smoothly and in sync.&nbsp;</p>
<h2>A Future-Proof Data Strategy for the Public Sector&nbsp;</h2>
<p>Looking ahead, data complexity isn’t going away; it’s accelerating. The growth of technologies such as edge devices, IoT sensors, and AI-enabled monitoring will only increase the volume and variety of data that must be collected, secured, and analyzed while staying in compliance.&nbsp;</p>
<p>Agencies that invest now in flexible, distribution-first architectures will strengthen both their cybersecurity and compliance postures while ensuring they’re well positioned to adapt to whatever comes next. Tools like <a href="/content/www/en-us/products/dataflow.html">Cloudera Data Flow</a> make it possible to achieve the scalability, observability, and performance that today’s public sector organizations demand.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=from-log-overload-to-mission-readiness-rethinking-government-data-architecture</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Why Native Observability is the Heart of Hybrid Cloud</title><description><![CDATA[Cloudera Observability delivers more than just the &quot;why&quot; behind performance; it provides a comprehensive cycle of insight. We’ve &quot;bottled&quot; the diagnostic intelligence gathered from more than 1.3 million nodes under subscription to create sophisticated diagnostic tools. Now, with the integration of Cloudera Cloud Factory (formerly known as Taikun CloudWorks), we’re best placed to extend these capabilities beyond cloud-native infrastructure management.]]></description><link>https://www.cloudera.com/blog/business/why-native-observability-is-the-heart-of-anywhere-cloud.html</link><guid>https://www.cloudera.com/blog/business/why-native-observability-is-the-heart-of-anywhere-cloud.html</guid><pubDate>Fri, 27 Feb 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Ron Pick]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty858239368.webp"><p>In the current enterprise technology landscape, we’re witnessing an industry-wide scramble. As organizations shift from monolithic architectures to complex environments leveraging heterogeneous infrastructures, cloud-based data platforms are hitting a visibility—i.e.,&nbsp; observability—wall. Their response has been a wave of reactive, multi-billion-dollar acquisitions designed to &quot;bolt-on&quot; the observability that they lack natively.</p>
<p>But observability shouldn't be a postscript or a line item from a recent merger; it must be a core capability. At Cloudera, <a href="/content/www/en-us/products/cloudera-data-platform/observability.html">we’re evolving</a> our native observability DNA into a unified, hybrid-first powerhouse, proving that true insight across the entire data estate is a foundational requirement for a <a href="/content/www/en-us/products/unified-data-fabric.html">unified data fabric</a>, <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a>, <a href="/content/www/en-us/products/data-in-motion.html">data in motion</a>, <a href="/content/www/en-us/products/enterprise-ai.html">AI</a>, and your <a href="/content/www/en-us/products/cloudera-data-platform.html">data platform</a> as a whole. This is true whether you run your apps, workloads, models, and agents in public clouds, on-premises in data centers, or at the edge.&nbsp;</p>
<h2>The Multi-Faceted Nature of Observability: Beyond Simple Monitoring</h2>
<p>True observability is not a single tool; it’s a foundational capability baked into the data platform to answer critical questions for every stakeholder across the data estate. Whether it’s a business analyst wondering why a dashboard hasn't refreshed, a database admin investigating a long-running query, or a system admin identifying skewed data storage across cluster nodes, observability must offer telemetry that’s integrated to provide immediate, actionable answers.</p>
<p>In the reality of hybrid and multi-cloud landscapes, relying on separate, single-purpose tools (for data quality, cloud performance, infrastructure health, and so on) that don’t operate across the entire data landscape doesn’t grant true visibility. Instead, it creates a data silo problem of disconnected islands of observed systems.&nbsp;</p>
<p>It’s the interplay between these systems (in data, workloads, resource utilization, etc.) that necessitates observability. When these categories are disconnected, organizations lose the deep context required for operational excellence. Achieving that level of insight requires visibility that links logs, metrics, and traces cohesively between the data layer and the underlying infrastructure, along with everything in between.</p>
<h2>The Inevitable Complexity of the Hybrid AI Era</h2>
<p>The rise of generative AI and large-scale modeling has fundamentally transformed hybrid architecture from a strategic choice into <a href="/content/dam/www/marketing/resources/analyst-reports/the-future-of-enterprise-data-and-analytics-is-hybrid.pdf">a technical necessity</a>. AI workloads demand a delicate balance between massive cloud-scale compute for training and localized, on-premises data gravity for privacy and low-latency inference, leading the modern enterprise to become an intricate web of heterogeneous environments.&nbsp;</p>
<p>This shift toward a truly distributed footprint—spanning from the core data center to the public cloud and out to the edge—inherently magnifies complexity, as workloads behave differently both within and between these various infrastructures. This complexity makes it exponentially harder to get to the critical &quot;why&quot; behind performance lags, cost spikes, or consumption issues. In this <a href="/content/dam/www/marketing/resources/webinars/why-a-true-hybrid-platform-is-the-answer-to-data-complexity.landing.html">hybrid AI era</a>, system complexity without a unified view and telemetry becomes an unmanageable black box, leaving IT leaders unable to predict or prevent critical failures.</p>
<h2>The &quot;Bolt-On&quot; Trap: Why Observability Cannot Be an Afterthought</h2>
<p>There’s been a recent surge in cloud-based data providers acquiring observability startups: <a href="https://www.snowflake.com/en/news/press-releases/snowflake-announces-intent-to-acquire-observe-to-deliver-ai-powered-observability-at-enterprise-scale/" target="_blank" rel="noopener noreferrer">Snowflake acquiring Observe</a>, <a href="https://www.paloaltonetworks.com/company/press/2025/palo-alto-networks-to-acquire-chronosphere--next-gen-observability-leader--for-the-ai-era" target="_blank" rel="noopener noreferrer">Palo Alto Networks acquiring Chronosphere</a>, and more. These multi-billion-dollar acquisitions show that when data platforms lack native observability, they eventually hit a &quot;visibility wall.&quot; These providers are now attempting to bolt on what should have been a core capability.</p>
<p>For the modern enterprise, a fragmented, cloud-only approach will not provide the visibility they need to achieve true operational excellence:</p>
<ul>
<li><p>Cloud-only tools are restricted to a specific segment of the stack, ignoring the vast data estate existing outside the public cloud.</p>
</li>
</ul>
<ul>
<li><p>Tools with bolted-on observability struggle to provide the unified context needed to understand the cause of issues across complex hybrid environments. Customers frequently find themselves juggling disjointed interfaces for logs, metrics, and traces, which highlights a significant lack of cohesion between the data layer and the infrastructure supporting it.</p>
</li>
</ul>
<h2>Cloudera's Native and Unified Observability Capability</h2>
<p><a href="/content/www/en-us/products/cloudera-data-platform/observability.html">Cloudera Observability</a> is a native, foundational capability that moves beyond simple monitoring to act as a unifying powerhouse. By positioning visibility as a foundational requirement, Cloudera provides total insight across the entire hybrid cloud: on-premises, public cloud, and at the edge. And by leveraging OpenTelemetry as the observability framework to collect and capture distributed traces and metrics, we’re aligned with the leading framework of observability standards.</p>
<p>Cloudera Observability delivers more than just the &quot;why&quot; behind performance; <a href="https://docs.cloudera.com/observability/cloud/overview/topics/obs-understanding-observ.html" target="_blank" rel="noopener noreferrer">it provides a comprehensive cycle of insight</a>. We’ve &quot;bottled&quot; the diagnostic intelligence gathered from more than 1.3 million nodes under subscription to create sophisticated diagnostic tools. Now, with the integration of <a href="https://docs.cloudera.com/csa-operator/1.4/installation/topics/csa-op-installation-process-taikun.html" target="_blank" rel="noopener noreferrer">Cloudera Cloud Factory</a> (formerly known as Taikun CloudWorks), we’re best placed to extend these capabilities beyond cloud-native infrastructure management.</p>
<p>This evolution places predictive reliability firmly within reach for the modern enterprise, transforming maintenance from a cycle of reactive patching into a proactive strategy. By leveraging advanced warnings on known issues and security vulnerabilities, organizations can finally transcend traditional troubleshooting to achieve a state of continuous, reliable performance across their entire data estate.</p>
<p>Ultimately, observability is the only way to navigate the complexity of the hybrid AI era, through a <a href="/content/www/en-us.html">data platform built with observability in its DNA</a>. To learn more about how you can achieve true observability with Cloudera, <a href="/content/www/en-us/contact-sales.html" target="_blank" rel="noopener noreferrer">reach out</a> to our professional services team, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">check out our product demos</a>, or <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">sign up for a free 5-day trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=why-native-observability-is-the-heart-of-anywhere-cloud</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Bring AI Models to Your Data with Cloudera AI Inference Service</title><description><![CDATA[Instead of sending your data to the cloud as context for models, Cloudera brings the models to you—unblocking intelligence exactly where it’s needed, securing it by design, and scaling it confidently behind your own firewall.]]></description><link>https://www.cloudera.com/blog/business/bring-ai-models-to-your-data-with-cloudera-ai-inference-service.html</link><guid>https://www.cloudera.com/blog/business/bring-ai-models-to-your-data-with-cloudera-ai-inference-service.html</guid><pubDate>Mon, 23 Feb 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Pamela Pan,Peter Ableda]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-night-city-scape.jpg"><p>We’ve entered a new phase of AI adoption: <a href="https://www.cio.com/article/3850763/88-of-ai-pilots-fail-to-reach-production-but-thats-not-all-on-it.html#:~:text=The%20proof%20of%20concept%20(POC,in%2Dhouse%20Al%20expertise.%E2%80%9D" target="_blank">88% of enterprise AI projects stall before reaching production</a>, not because of poor ideas or weak models, but because infrastructure can’t keep up. Cloud APIs get expensive fast. Governance is an afterthought. Latency adds up. And for <a href="/content/www/en-us/solutions.html">regulated industries</a>, moving sensitive data to a public endpoint is just not an option.&nbsp;</p>
<p>Closing the gap between an AI pilot and full-scale production requires bringing intelligence directly to the source. <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference service</a> gives enterprise teams a secure, performant, and cost-effective production model serving layer—running directly where the data lives.&nbsp;</p>
<p>Instead of sending your data to the cloud as context for models, Cloudera brings the models to you—unblocking intelligence exactly where it’s needed, securing it by design, and scaling it confidently behind your own firewall.</p>
<h2>3 Reasons Why Bringing AI to Your Data is Important: Privacy, Cost, and Choice at Scale</h2>
<h3>Keep Data Private and Protected</h3>
<p>Most AI services require you to send data to the cloud, creating risks around compliance, cost, and latency. Cloudera takes the approach of bringing models to where your data already lives. Whether it’s in a secure virtual private cloud (VPC) or within an air-gapped (fully offline and isolated) on-premises environment, this model-to-data strategy ensures your information stays private and governed, while still enabling high-performance inference to power AI in production.&nbsp;</p>
<h3>Predictable Economics in the Long Run</h3>
<p>Running AI in the cloud 24/7 can lead to spiraling, unpredictable expenses. Per-request fees create a budget that fluctuates with usage, making long-term forecasting difficult. By shifting inference to infrastructure the organization already owns and controls, teams can bypass these external usage fees. Once AI moves into steady-state production, costs become more predictable, allowing for a higher return on investment as workloads scale.</p>
<h3>Control and Choice</h3>
<p>Most cloud AI providers steer customers into their proprietary ecosystem, making it hard to switch, extend, or fully control your models. With Cloudera AI Inference service, you can deploy a wide range of AI capabilities, from open-source GenAI LLMs like NVIDIA’s Nemotron to traditional predictive models, without giving up control or ownership of your intellectual property. Accelerated by the NVIDIA AI stack—<a href="https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/" target="_blank">NVIDIA Blackwell GPUs</a>, <a href="https://developer.nvidia.com/dynamo-triton" target="_blank">NVIDIA Dynamo-Triton</a>, and <a href="https://www.nvidia.com/en-us/ai-data-science/products/nim-microservices/" target="_blank">NVIDIA NIM microservices</a> for high-performance, scalable model serving—Cloudera AI Inference service lets you innovate freely while keeping your AI infrastructure flexible, portable, and future-proof.</p>
<h2>Success Stories: Early Adoption of Cloudera AI Inference Service On Premises</h2>
<p>Cloudera AI Inference service is unlocking new AI use cases in places where the cloud can’t go: offline environments, sovereign infrastructure, and latency-critical operations. Here are three real-world scenarios now enabled by Cloudera AI Inference service and already underway with early adopters.</p>
<h3>National Security: Air-Gapped Intelligence That Never Sleeps or Leaks</h3>
<p><a href="/content/www/en-us/solutions/public-sector.html">In national defense</a>, speed and security are non-negotiable. But until recently, intelligence officers spent thousands of hours manually sifting through sensitive, offline documents—slowed by process, overwhelmed by volume, and unable to leverage public AI tools without risking exposure.</p>
<p>Now, with Cloudera AI Inference service running inside air-gapped environments, defense agencies can deploy powerful LLM assistants that scan and summarize massive document collections in seconds. These models operate entirely offline: no internet, no cloud dependencies, no data leakage, helping analysts make faster decisions without compromising security.</p>
<h3>Global Finance: Instant Operations, Zero Data Exposure</h3>
<p><a href="/content/www/en-us/solutions/financial-services.html">Cross-border finance</a> lives in dozens of languages. Previously, translating documents like contracts, fraud reports, or compliance updates meant using external tools, raising serious concerns over data exposure and auditability.</p>
<p>Today, one of the top global credit card providers is exploring Cloudera AI Inference service and testing on-premises deployment of multilingual models to translate sensitive communications across more than 200 markets in real time, and fully under internal control. By running inference on their own infrastructure, they’re unlocking faster internal operations and customer response times, while avoiding the compliance risks of third-party APIs.</p>
<h3>Public Sector: AI Agents for Every Employee</h3>
<p><a href="/content/www/en-us/solutions/public-sector.html">Government agencies</a> are under pressure to serve more people, faster—yet employees often rely on outdated portals and dense policy manuals. Public GenAI tools aren’t an option due to privacy mandates and unpredictable costs.</p>
<p>Early implementations of Cloudera AI Inference service are supporting on-premises AI chatbots trained on internal agency documentation. These agents help staff and constituents navigate complex topics with speed and confidence, delivering answers instantly, while maintaining full control over the data, prompts, and outputs.</p>
<h2>Looking Ahead: The Future of AI is Anywhere Data Lives</h2>
<p>By bringing the model to where your data lives, Cloudera AI Inference service is helping organizations scale intelligence on their own terms—with predictable cost and flexibility to choose from a wide range of production models. Whether you’re navigating air-gapped security mandates or optimizing high-volume global operations, the path to production-grade AI is now open.</p>
<p><a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> is the trusted foundation for building, deploying, and governing all types of AI—from generative and agentic AI to traditional machine learning—across your data estate.&nbsp;</p>
<p>Ready to scale? Don’t let infrastructure limit the AI strategy. Visit the <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference service</a> webpage for use case demos, learn more about it in this <a href="/content/www/en-us/events/webinars/enterprise-grade-genai.html?utm_medium=clouderan&amp;utm_source=field&amp;keyplay=AI&amp;utm_campaign=Other---FY26-Q4-GLOBAL-VE-WebinarCloudera-Enterprise-Grade-GenAI&amp;cid=701Ui00000fqoFdIAI">webinar</a>, or <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">book a demo</a> to see how to turn “AI anywhere” into a reality.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=bring-ai-models-to-your-data-with-cloudera-ai-inference-service</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>#ClouderaLife Employee Spotlight: Meet Josephine Tan, Cloudera’s Senior Director, Human Resources, APAC</title><description><![CDATA[Let’s take a moment to get to know Josephine Tan better, explore her journey with Cloudera, and discover how the Lunar New Year is bringing the Singapore office together at the start of this new season.]]></description><link>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-josephine-tan-clouderas-senior-director-human-resources-apac.html</link><guid>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-josephine-tan-clouderas-senior-director-human-resources-apac.html</guid><pubDate>Tue, 17 Feb 2026 20:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-gettyImages-1716777308.jpg"><p>At Cloudera, we pride ourselves on fostering an environment focused on employee well-being and professional growth. At the end of 2025, that commitment was recognized as several Cloudera offices earned<a href="/content/www/en-us/blog/culture/cloudera-grows-recognition-as-great-place-to-work.html"> Best Places to Work</a> honors—including the Singapore office, a close-knit and highly connected team. One of the team members fueling that success is Josephine Tan, Senior Director, Human Resources for the Asia-Pacific (APAC) region.</p>
<p>Approaching her sixth year with the company, Josephine is proud to share the firm foundation of the Singapore office’s culture. “When a culture is strong, that’s where trust within the team grows,” she shared. “We empower people; there’s this level of trust and honesty.”</p>
<p>This time of year, the office looks forward to celebrating both big and small wins, whether it’s the end of a quarter or the start of the Lunar New Year. “You work hard, you play hard, and that’s very much what the Singapore team believes.”</p>
<p>Let’s take a moment to get to know Josephine Tan better, explore her journey with Cloudera, and discover how the Lunar New Year is bringing the Singapore office together at the start of this new season.</p>
<h2>Meet Josephine Tan&nbsp;</h2>
<p>Josephine joined Cloudera in March 2020, mere days before lockdown took effect. At that time, she not only had to learn the ropes of a new job but also navigate an entirely uncharted professional landscape, as business was conducted online. “Luckily, a growth mindset is part of Cloudera’s DNA.”</p>
<p>She is dedicated to leading the region’s people strategy with a warm focus on nurturing talent, fostering a positive culture, and supporting organizational growth. She always keeps the community at the heart of everything she does.</p>
<p>“What drives employees in Singapore is one objective, one goal. It’s all about the power of ‘we.’” This credo is Josephine’s North Star. “I believe in growing the team’s expertise.”</p>
<p>For Josephine, her role in HR is truly about inspiring progress: “I see this as a place where I can make a difference. HR is not all about maintenance, it’s about making possible change happen.”</p>
<p>Her commitment to driving actionable change shows up in both her professional and extracurricular life. Even outside of work, she prioritizes philanthropic efforts and public service: “In my free time, I will ask myself: how can I give back to the community?” It’s a question that defines her approach to leadership and lifestyle, rooted in impact and meaning.</p>
<h2>Singapore’s Collaborative Character</h2>
<p>While it may have presented new challenges, the idea of a remote work environment was not an obstacle. Rather, it was a novel opportunity to foster stronger connections with other Clouderans. “We’re so close-knit now because we went from working fully remote to having the luxury of coming back to the office,” she asserts. This shared experience drew the team together, strengthening and reaffirming their dedication to their work.</p>
<p>When asked what makes the Singapore office special, Josephine happily shared, “the people.” She believes this quality comes from the cosmopolitan and welcoming spirit ingrained in Singapore. “There are easily five different cultures present in this one small office,” she notes. The diverse makeup of the office fosters a spirit of collaboration and inclusion among teams, which is a big reason why the office has been proudly recognized as a ‘<a href="https://www.greatplacetowork.com/about" target="_blank">Great Place to Work</a>’ for two consecutive years.</p>
<h2>Celebrating The Lunar New Year</h2>
<p>This solidarity shows up in how the workplace comes together and celebrates special occasions, such as the Lunar New Year. “All nationalities are welcome to celebrate, and we embrace that,” Josephine said. One of the Singapore office’s favorite Lunar New Year activities is the prosperity salad toss, or Yu Sheng. People gather to toss mixed ingredients like shredded vegetables, crackers, and raw fish high in the air while shouting auspicious phrases. “It’s like having turkey at Christmas,” she explained, a holiday tradition that symbolizes abundance and vigor.</p>
<p>Clouderans in Singapore also celebrate the holiday with festive activities such as decorating the office, exchanging oranges as gifts, and enjoying a special quarter-end lunch, where the prosperity salad toss is often the star. They also have fun dressing up to match the theme: “We dress in denim with a touch of gold or red. Red is an auspicious color for the Lunar New Year.”</p>
<p>Josephine attributes part of her office’s camaraderie to hosting activities like this—an example of how Cloudera leadership globally supports initiatives as part of the company’s commitment to inclusive, locally led culture-building. Lunar New Year values such as renewal, reunion, and recognition are reflected in this way at Cloudera Singapore: “This is a time to celebrate and express gratitude and appreciation for the time we spent building the business up to this point, to reinforce our shared commitment, and to look forward to a new chapter.”</p>
<h2>Closing Thoughts</h2>
<p>Josephine’s dedication to service and her community exemplify Cloudera’s caring, people-centered culture. Her story shows how proactive effort and collaboration lead to meaningful growth and strong bonds. For Josephine, Cloudera is the perfect place to live out these values.</p>
<p>Hear from another <a href="/content/www/en-us/blog/culture/clouderalife-employee-spotlight-meet-leo-brunnick-chief-product-officer.html">Clouderan</a> and explore career <a href="/content/www/en-us/careers.html">opportunities</a> at Cloudera.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderalife-employee-spotlight-meet-josephine-tan-clouderas-senior-director-human-resources-apac</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>You Can Build It Yourself, But Should You? Protecting the Value of Modern Data Platforms</title><description><![CDATA[In complex environments, early implementation decisions often determine whether a platform becomes a durable foundation or an expensive capability that never quite delivers on its promise. ]]></description><link>https://www.cloudera.com/blog/business/you-can-build-it-yourself-but-should-you-protecting-the-value-of-modern-data-platforms.html</link><guid>https://www.cloudera.com/blog/business/you-can-build-it-yourself-but-should-you-protecting-the-value-of-modern-data-platforms.html</guid><pubDate>Tue, 10 Feb 2026 17:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Jim Bisordi]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-doctors-talking-data.webp"><p>Organizations don’t invest in modern data platforms casually. They invest to support a range of mission-critical needs—from real-time fraud detection and global inventory visibility, to <a href="/content/www/en-us/products/machine-learning.html#:~:text=Private%20AI%20by%20design">private AI</a> readiness and consistent governance across complex regulatory environments.&nbsp;</p>
<p>With those outcomes in mind, teams come in ready to move fast and build with purpose. But it doesn’t take long to realize that translating intent to impact and value is harder than expected.&nbsp;</p>
<p>In complex environments, early implementation decisions often determine whether a platform becomes a durable foundation or an expensive capability that never quite delivers on its promise.&nbsp;</p>
<h2>Why Experience Compresses Time-to-Value&nbsp;</h2>
<p>The problem is that implementation is often treated as a checklist—specific steps that ladder up to a specific outcome—when it’s really a decision tree. Each choice made along the way can take teams down very different paths with long-term consequences that aren’t always obvious at the time.&nbsp;</p>
<p>These learning curves are costly. They can quietly lock in architectural and governance decisions that limit flexibility, scale, and trust long after launch, dramatically increasing total cost of ownership and time to value.&nbsp;</p>
<p>Teams with deep platform and solution implementation experience approach these projects with a seasoned perspective. They recognize patterns early, know which trade-offs actually matter (and which don’t), and design for real operating conditions rather than idealized ones, shaping early decisions that protect the platform’s long-term value and accelerate the path to durable outcomes.&nbsp;</p>
<h2>What Professional Services &amp; Training Actually Means in Practice&nbsp;</h2>
<p>This is where Professional Services &amp; Training (PS&amp;T) comes in: a team that works with you to bridge the gap between purchasing a new platform and seeing it adopted across the organization. This phase is a critical time in the platform’s lifecycle, as these early steps set the organization up for long-term success.&nbsp;</p>
<p>Industry-specific experts on PS&amp;T teams act as an extension of in-house teams during platform adoption and use case implementation, bringing the perspective of having done this hundreds of times before in similarly complex environments. They help shape early decisions, navigate trade-offs, and avoid common pitfalls in data flow, <a href="/content/www/en-us/services-and-support/training/learning-paths/data-governance.html">governance</a>, security and integration, so teams don’t discover too late that something foundational needs to be reworked. Just as importantly, they transfer that knowledge back to internal teams, ensuring long-term platform ownership, confidence, and self-sufficiency remain internal.&nbsp;</p>
<p>By engaging PS&amp;T early, organizations can move from evaluation to execution more quickly and confidently, avoiding unexpected challenges along the way. Instead of spending months tuning pipelines, rethinking governance models, or retrofitting for scale, teams start with a foundation designed to support today’s use cases and grow with them over time.&nbsp;</p>
<h2>When “Working” Still Isn’t Enough&nbsp;</h2>
<p>Once the platform is live, teams often assume the job is complete, but it’s really just the beginning. Despite having the tools they asked for, many still struggle to extract real value from their data. Doing so requires building trust, broadening adoption, and confidently operationalizing insights.&nbsp;</p>
<p>The gap between standing up a platform and genuinely using it is often driven by subtle, slow-moving issues—ones that don’t immediately break the system outright, but quietly erode confidence. Over time, this can lead to fragmented usage, shadow systems, stalled initiatives, and growing skepticism about the platform’s ROI. By the time these issues are recognized, momentum can be hard to recover.&nbsp;</p>
<p>Early decisions set the trajectory for whether a platform becomes foundational or gradually sidelined.&nbsp;</p>
<h2>AI-Driven Use Cases in Regulated Environments&nbsp;</h2>
<p>This dynamic becomes even more pronounced in messy, real-world environments with regulatory or operational complexity. Here, early decisions can determine whether private AI initiatives, for example, become durable assets, or introduce new risk.&nbsp;</p>
<h3>Healthcare&nbsp;</h3>
<p>In <a href="/content/www/en-us/solutions/healthcare.html">healthcare</a>, private AI enables a wide range of use cases, from automating administrative workflows to supporting advanced imaging and diagnostics. But realizing those benefits starts well before any model is trained.&nbsp;</p>
<p>It all starts at the foundation—bringing data together across hybrid environments and ensuring it is properly permissioned, tagged, and contextualized. Without that structure, AI outputs can lack the clinical or regulatory context needed to be trusted, undermining decision integrity, defensibility, and compliance. In these environments, early implementation decisions determine whether AI capabilities mature into trusted clinical tools or remain constrained by governance and data access limitations.&nbsp;</p>
<h3>Telecommunications&nbsp;</h3>
<p><a href="/content/www/en-us/solutions/telecommunications.html">Telecommunications</a> organizations face similar challenges. Data is generated continuously across highly distributed infrastructure, often spanning regions and regulatory jurisdictions.&nbsp;</p>
<p>Private AI can open up real-time threat detection, outage prediction, and network optimization, but only when governance, lineage, and access controls are consistent. When these foundations are uneven, AI-driven insights may look actionable on the surface, but lack the context needed to be truly useful.&nbsp;</p>
<p>While AI initiatives (the examples used here) tend to surface these challenges quickly, the same dynamics apply to analytics modernization, regulatory reporting, operational intelligence, and any use case that depends on trusted, well-governed data. In any case, success depends less on how sophisticated the models are, and more on consistency in early architecture and governance decisions that shape how data is accessed, secured, and interpreted.&nbsp;</p>
<h2>Where Implementation Becomes Adoption: How Momentum Is Built&nbsp;</h2>
<p>Even with the right technical foundation, realizing the full value of the data platform doesn’t happen all at once. It’s a deliberate process—one that builds confidence incrementally as teams validate results, expand usage, and integrate insights into everyday workflows.&nbsp;</p>
<p>Teams that succeed tend to treat implementation as the beginning of the journey, not the finish line. They start with well-scoped use cases, build trust in the results, and scale deliberately as confidence grows.&nbsp;</p>
<p>This is where Professional Services &amp; Training plays a guiding role—partnering with teams to sequence adoption, reinforce governance as usage expands, drive new AI use cases, and keep momentum moving without introducing rework. The result is a solution that steadily proves its value over time, protects the original investment, and becomes a dependable foundation for analytics, AI, and future data initiatives.&nbsp;</p>
<p>For teams thinking about how to move from standing up a platform to fully realizing its value, <a href="/content/www/en-us/services-and-support/professional-services.html">Cloudera’s PS&amp;T resources</a> explore what that journey looks like in practice.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=you-can-build-it-yourself-but-should-you-protecting-the-value-of-modern-data-platforms</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The Next Evolution of Enterprise Analytics – The Data Intelligence Platform</title><description><![CDATA[Modern enterprises rely on multiple analytics platforms to support a wide range of workloads, including business intelligence and reporting, real-time analytics, observability, machine learning, and AI. ]]></description><link>https://www.cloudera.com/blog/technical/the-next-evolution-of-enterprise-analytics-the-data-intelligence-platform.html</link><guid>https://www.cloudera.com/blog/technical/the-next-evolution-of-enterprise-analytics-the-data-intelligence-platform.html</guid><pubDate>Mon, 09 Feb 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Divya Karmagam]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1256169936-1.jpg"><p style="text-align: center;"><a href="https://community.cloudera.com/t5/Developer-Blogs/Unlocking-Cross-Engine-Analytics-with-Cloudera-s-Open/ba-p/413455" target="_blank"><b>See It in Action</b> <br>
</a>&nbsp;</p>
<p style="text-align: center;">Want to see what a data intelligence platform looks like in practice? <br>
See how Iceberg tables managed by Cloudera can be queried by Snowflake and Databricks without copying data or compromising governance. </p>
<h2>How to Shift to an Intelligence-First Platform&nbsp;</h2>
<p>Adopting an intelligence platform represents a fundamental shift not just in infrastructure, but in how organizations think about and trust their data. The transition period is especially critical because it sets the expectations for reliability, integration, and adoption across teams. Early missteps can create lingering challenges and resistance to longer-term adoption.&nbsp;</p>
<p>Done well, this shift balances stability and progress, keeping mission-critical processes running while delivering early wins that build confidence and momentum.&nbsp;</p>
<p>Cloudera’s Professional Services &amp; Training (PS&amp;T) team helps organizations navigate this shift with care, avoiding common architectural pitfalls and building a durable foundation that supports future analytics and AI use cases.&nbsp;</p>
<p>Learn more about our <a href="/content/www/en-us/services-and-support/professional-services.html">PS&amp;T capabilities here</a>.</p>
<p>Lakehouses solved a lot of enterprise problems by unifying and simplifying data storage. But the operating landscape at the enterprise level has shifted. Today, organizations are <a href="/content/www/en-us/products/data-services.html">coordinating more tools</a>, <a href="/content/www/en-us/products/cloudera-data-platform.html">managing more data</a>, <a href="/content/www/en-us/products/machine-learning.html">operationalizing AI</a>, and navigating increasing <a href="/content/www/en-us/blog/business/embrace-a-hybrid-data-platform-for-dora-compliance.html">regulatory scrutiny</a>.&nbsp;</p>
<p>As a result, data can no longer be treated as something that’s queried occasionally or in isolation. It now needs to be operational—meaning ready for real-time use, automated decision-making, and AI-driven workflows across the organization. This shift is pushing architectures beyond lakehouses and toward a more dynamic data intelligence platform.&nbsp;</p>
<h2>What Changed? Analytics Became Multi-Platform&nbsp;</h2>
<p>Modern enterprises rely on multiple analytics platforms to support a wide range of workloads, including business intelligence and reporting, real-time analytics, observability, machine learning, and AI.&nbsp;</p>
<p>Each team brings its own needs to the same data, and in practice, platform choices are driven by productivity and speed rather than architectural purity. Much of that data also remains on premises or in regulated environments, where moving it to the cloud isn’t practical or permitted.&nbsp;</p>
<p>The original lakehouse model assumed convergence on a small number of analytics platforms. Reality proved otherwise: tools, users, and workloads diverged. The challenge now is supporting that diversity without sacrificing consistency or control.&nbsp;</p>
<h2>The Cost of Treating Data as Platform-Owned&nbsp;</h2>
<p>Despite lakehouse implementations, enterprise data often remains tightly coupled to the platform that manages it. When another platform needs access, the data is often copied, transformed, or exported to fit that environment.&nbsp;</p>
<p>Over time, simply keeping data consistent and accessible across these various platforms becomes a challenge. Duplicate datasets, fragile pipelines, delayed insights, and inconsistent <a href="/content/www/en-us/resources/faqs/data-governance.html">governance</a> introduce operational risk and drive up costs.&nbsp;</p>
<p>The result is a familiar pattern: rising spend, growing complexity, and declining trust in the data and its outputs.</p>
<h2>From Lakehouse to Intelligence Infrastructure&nbsp;</h2>
<p>The lakehouse helped bring structure to a fragmented <a href="/content/www/en-us/solutions/customer-insights.html">analytics landscape</a>, making it easier for data systems to work together. As enterprises move into the era of <a href="/content/www/en-us/resources/faqs/data-intelligence.html">full-scale</a> data intelligence platforms, the focus changes.&nbsp;</p>
<p>Instead of data being shaped and owned by individual tools, it becomes the foundation of the architecture—anywhere that data physically resides. All tools sit on top of a shared data layer, rather than pulling data into isolated environments and producing siloed outputs.&nbsp;</p>
<p>This shift allows teams to choose the right compute engine for each workload—whether it’s SQL analytics, large-scale processing, or AI—confident they’re operating on the same governed, trusted data foundation.&nbsp;</p>
<h2>What is a Data Intelligence Platform?&nbsp;</h2>
<p>A data intelligence platform is a shared infrastructure for data. Think of it like city infrastructure—the roads, power lines, and plumbing beneath a city that every building taps into and relies on.&nbsp;&nbsp;</p>
<p>In the same way, a data intelligence platform provides a centralized foundation that powers many different tools, compute engines, and applications, with governance and context embedded by design rather than bolted on later.&nbsp;</p>
<p>It’s characterized by:&nbsp;</p>
<ul>
<li><p>A shared data layer built on <a href="/content/www/en-us/products/open-data-lakehouse.html">open data</a> formats&nbsp;</p>
</li>
<li><p>Rich metadata <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">lineage</a> that captures structure, meaning, and history&nbsp;</p>
</li>
<li><p>Built-in <a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">governance</a> that travels with the data&nbsp;</p>
</li>
<li><p>Support for multiple analytics and AI engines&nbsp;</p>
</li>
<li><p>The ability to evolve without re-architecting from scratch&nbsp;</p>
</li>
</ul>
<h2>Open Foundations Make Data Intelligence Possible&nbsp;</h2>
<p>A platform like this only works if data can be shared safely across all tools and environments, whether on premises, in the cloud, at the edge, or a combination. Open table formats are the common foundation that makes cross-engine interoperability possible (to continue with our city metaphor: the building codes and street standards that make the city navigable by everyone).&nbsp;</p>
<p>Without them, connecting tools often means dealing with mismatched formats, inconsistent latencies, proprietary lock-in, or data that must be governed across geographic boundaries. This can lead to familiar pain points: reduced auditability, inconsistent views of data, and growing challenges around trust.&nbsp;</p>
<p>By contrast, open formats reduce lock-in and support a growing ecosystem of tools (i.e., set it up once and let it grow with your tech stack over time). They make it easier to define governance policies once and enforce them everywhere (including where data can’t easily move), regardless of which engine needs access. This also creates a consistent “<a href="https://arize.com/ai-memory/" target="_blank">memory layer</a>” for AI-driven systems, making them more reliable, auditable, and adaptable through built-in traceability and historical context.&nbsp;</p>
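<p>The idea of defining a governance policy once and enforcing it from every engine can be sketched in a few lines. This is a minimal illustration, not Cloudera’s implementation: the <code>POLICIES</code> catalog, the <code>can_read</code> check, the <code>Engine</code> class, and the engine, role, and table names are all hypothetical.</p>

```python
# Hypothetical shared policy catalog: table name mapped to the roles allowed to read it.
# In a real platform this would live in a shared catalog, not in application code.
POLICIES = {
    "claims": {"analyst", "auditor"},
    "patients": {"clinician"},
}


def can_read(role, table):
    # Single policy check that every engine calls; unknown tables are denied.
    return role in POLICIES.get(table, set())


class Engine:
    # Stand-in for any compute engine (SQL, batch processing, AI) over the shared data layer.
    def __init__(self, name):
        self.name = name

    def query(self, role, table):
        # The engine does not keep its own copy of the rules; it defers to the shared check.
        if not can_read(role, table):
            return f"{self.name}: denied"
        return f"{self.name}: ok"


engines = [Engine("sql"), Engine("spark"), Engine("ai")]

# The same role gets the same answer from every engine.
allowed = [engine.query("analyst", "claims") for engine in engines]
denied = [engine.query("analyst", "patients") for engine in engines]

print(allowed)
print(denied)
```

<p>Because each engine defers to the same check, changing one entry in the catalog changes the outcome everywhere at once, which is the property the paragraph above describes.</p>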
<p>Without open formats and embedded governance, intelligence quickly fragments back into silos, eroding the very advantages data intelligence platforms are designed to deliver.&nbsp;&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-next-evolution-of-enterprise-analytics-the-data-intelligence-platform</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cheers! To Professional Growth With Toastmasters</title><description><![CDATA[Cloudera’s culture is rooted in empowerment, continuous learning, and creating spaces where people can thrive both personally and professionally. It’s a mindset and approach that is reflected by every Clouderan across the globe, building an environment where anyone can feel empowered to take on new challenges and grow. ]]></description><link>https://www.cloudera.com/blog/culture/cheers-to-professional-growth-with-toastmasters.html</link><guid>https://www.cloudera.com/blog/culture/cheers-to-professional-growth-with-toastmasters.html</guid><pubDate>Wed, 28 Jan 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1141462781.jpg"><p>Cloudera’s culture is rooted in empowerment, continuous learning, and creating spaces where people can thrive both personally and professionally. It’s a mindset and approach that is reflected by every Clouderan across the globe, building an environment where anyone can feel empowered to take on new challenges and grow.&nbsp;</p>
<p>A fantastic example of this culture in action comes from our team in Cork, Ireland. Here, thanks to the enthusiastic efforts of Clouderans like Noel Hayes, Senior Manager Global Order Management, the office has started a Toastmasters club: a vibrant and entirely in-person community that, in just the last year, has already made a meaningful impact on its members’ confidence, communication skills, and leadership growth.&nbsp;</p>
<p>Here’s how Noel’s own journey led him to get the club off the ground, empower others to join, and continue growing its footprint.&nbsp;</p>
<h2>Creating an Environment of Growth and Understanding</h2>
<p>Noel, a long-time participant in Toastmasters himself, has always understood how much structured public speaking practice can shape a person’s confidence and professional capability. Early in his career he found presenting challenging, but through regular participation and taking on both speaking roles and leadership roles in meetings, he gained confidence and strengthened his ability to lead with presence.</p>
<p>Noel saw an opportunity to bring that same growth experience to his colleagues. When he first raised the idea, Cloudera’s leadership encouraged him to revisit it once people began returning to the office. The first meetings began in late 2024, and by 2025, the club was officially chartered. Today, the club has soared to over 40 members.&nbsp;</p>
<p>With meetings every two weeks, members gather in person to practice speaking, take on a number of different roles that strengthen leadership and listening skills, and encourage one another through structured feedback and shared experiences.</p>
<p>The environment is intentionally welcoming. Members range from people who once dreaded speaking in public to those who had never stepped into a role like this before.</p>
<h2>Taking the Plunge into Professional Growth&nbsp;</h2>
<p>One great example of this group’s impact comes from the experiences of Barry O’Driscoll, Senior Sales Operations Analyst. Barry’s journey to the Toastmasters club started with a conversation with Noel. That conversation snowballed quickly, with Barry joining the group and ultimately competing, and finishing second, in an internal <a href="https://www.linkedin.com/pulse/cloudera-toastmasters-club-wild-noel-hayes-kzj1e/" target="_blank" rel="noopener noreferrer">competition</a> and later placing second in an international Toastmasters event.&nbsp;&nbsp;</p>
<p>That’s just one of many who have joined Toastmasters since the group started meeting. And while it may feel overwhelming, it’s possible to start small. “Just join a meeting,” said O’Driscoll. “Once you see the energy in the room, get to know the way it works, it will blow your mind.”&nbsp;</p>
<p>These experiences demonstrate how opportunities like the Cork Toastmasters club align with Cloudera’s broader values. By empowering employees to lead initiatives, providing space for learning, and supporting each person’s development journey, Cloudera continues to build a culture where people feel supported to grow and contribute in meaningful ways.&nbsp;</p>
<h2>Fostering a Stronger Community&nbsp;</h2>
<p>Toastmasters is a powerful tool for Clouderans in the Cork office to sharpen their skills, get more comfortable with anxiety-inducing elements of their work, or just step outside their comfort zones. But beyond that, it’s a place where employees can forge a broader sense of community. Oftentimes it’s easy to get siloed into groups based on your role and the kind of work you’re involved in.&nbsp;</p>
<p>A club like this is open to everyone, from HR to engineers. In any given meeting, one might get to interact with someone they would otherwise never cross paths with. And because the club is fully in person, members have the chance to build that rapport on an even deeper level and support each other’s professional development.&nbsp;&nbsp;</p>
<h2>Looking to the Future&nbsp;</h2>
<p>As the Toastmasters club continues to develop, members are setting new goals: advancing through Toastmasters’ structured learning program, achieving distinguished status, and connecting with other clubs in the community. There is also interest in exploring how this model might support employees in other Cloudera locations, helping them build confidence and community through shared learning experiences.</p>
<p>The success of the Cork Toastmasters club is a reminder that development happens in many forms, and that when people are encouraged, supported, and trusted to lead, their potential expands far beyond what they once believed possible.&nbsp;</p>
<p>Find out more about <a href="/content/www/en-us/careers.html">opportunities</a> in our Cork, Ireland, office. And learn more about how <a href="/content/www/en-us/about/our-culture.html">Cloudera</a> is helping build a workplace where employees can learn, grow, and thrive.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cheers-to-professional-growth-with-toastmasters</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Openness in the Age of AI</title><description><![CDATA[If the AI revolution has given way to one universal data management truth, it’s the need for openness and interoperability across the data estate. After all, AI is only as good as the data it can actually reach.]]></description><link>https://www.cloudera.com/blog/business/openness-in-the-age-of-ai.html</link><guid>https://www.cloudera.com/blog/business/openness-in-the-age-of-ai.html</guid><pubDate>Tue, 27 Jan 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Matthew Michaelides]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-ai-agent-txt.webp"><p>If the AI revolution has given way to one universal data management truth, it’s the need for openness and interoperability across the data estate. After all, AI is only as good as the data it can actually reach.</p>
<p>No longer are enterprises willing to invest in disconnected legacy technologies. The cost of silos, once measured in infrastructure alone, is now exponentially higher when measured in lost time to value and the inability to run AI at scale. Considering this landscape, enterprises can’t afford not to rethink their data architectures.</p>
<p>At Cloudera, we define openness as a three-layered data management architecture (see Figure 1):&nbsp;&nbsp;</p>
<ul>
<li><p><b>Open compute:</b> The ability to use any engine regardless of where the data is stored</p>
</li>
<li><p><b>Open catalog:</b> The ability to swap in and out, and interoperate across different data access layers, ensuring schema and governance are consistent regardless of the viewing engine</p>
</li>
<li><p><b>Open data:</b> The ability to move and access data assets wherever they sit</p>
</li>
</ul>
<p>More broadly, openness is at the heart of who we are at Cloudera:</p>
<ul>
<li><p><a href="/content/www/en-us/blog/business/the-iceberg-wave-how-an-open-format-became-an-enterprise-standard.html">Early proponent of Apache Iceberg</a>: Cloudera began supporting Iceberg in our <a href="https://docs.cloudera.com/cdp-public-cloud/cloud/cdp-iceberg/topics/iceberg-in-cdp.html" target="_blank">public cloud Lakehouse in 2021</a>. Other vendors quickly followed suit—implicitly acknowledging Iceberg as the winner of the open table format war. In 2024, Databricks acquired Tabular, due in part to its open governance and sophisticated features. In 2025, both Snowflake and Amazon Web Services (AWS) invested in expanding Iceberg support and features.</p>
</li>
</ul>
<ul>
<li><p><a href="/content/www/en-us/open-source.html">Open-source foundation and ecosystem</a>: Deeply embedded in the open-source community since its founding in 2008, Cloudera was the first company to commercialize open-source data lake technology and continues to contribute to and support more than 50 open-source projects. Our open-source foundation gives customers freedom of choice: they can opt in or out of Cloudera distributions far more easily than they could with vendors whose proprietary overlays lock them in. <b>Cloudera customers don’t </b><i><b>have</b></i><b> to stay; they <i>choose</i> to stay.&nbsp;</b></p>
</li>
</ul>
<ul>
<li><p><a href="/content/www/en-us/blog/business/democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration.html">Interoperability across the data management stack</a>: Providing open compute, catalog, and data ensures interoperability at each level of the data management stack so our customers can truly win in the age of AI without having to build from scratch. Additionally, Cloudera provides the flexibility to use any compute engine or land data in any cloud service provider (CSP), and provides full access to features regardless of where the data resides or what compute engine is used. Conversely, some vendors restrict access to features based on whether all layers of the stack are running in the same platform. <b>Own your data. Control your data. Use your data—that is the promise of Cloudera</b>.&nbsp;</p>
</li>
</ul>
<p>For a deeper dive on the importance of openness in the age of AI, read our blog: <a href="/content/www/en-us/blog/business/the-future-delivered-today-the-ai-powered-data-lakehouse.html">The Future Delivered Today: The AI-Powered Data Lakehouse</a>.</p>
<small><i>Figure 1: How Cloudera Powers Unparalleled Openness and Interoperability</i></small>]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=openness-in-the-age-of-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>2025 Was the Year the Cloud Reminded Us Who&amp;apos;s Really in Control</title><description><![CDATA[In October, Amazon Web Services (AWS)&apos;s US-East-1 region went dark for 15 hours—a DNS error affecting DynamoDB took down over 1,000 companies. In June, a null pointer exception in Google Cloud&apos;s Service Control binary disabled multiple systems including Cloud Storage, Compute Engine, and BigQuery for several hours, with ripple effects hitting Spotify, Discord, and OpenAI.]]></description><link>https://www.cloudera.com/blog/business/2025-was-the-year-the-cloud-reminded-us-whos-really-in-control.html</link><guid>https://www.cloudera.com/blog/business/2025-was-the-year-the-cloud-reminded-us-whos-really-in-control.html</guid><pubDate>Mon, 26 Jan 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Suzy Tonini]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-modern-architecture-building.webp"><h2>Why the outages keep happening, and what you can actually do about it</h2>
<p>2025 was rough if you were betting your business on a single cloud vendor. In December, Snowflake customers watched helplessly as a schema update cascaded across multiple regions, <a href="https://www.theregister.com/2025/12/18/snowflake_update_caused_a_blizzard/">blocking queries for 13 hours</a>. Databricks users dealt with <a href="https://status.azuredatabricks.net/pages/history/5d49ec10226b9e13cb6a422e">days of degraded AI services</a>.&nbsp;</p>
<p>In October, Amazon Web Services (AWS)'s US-East-1 region <a href="https://www.crn.com/news/cloud/2025/aws-15-hour-outage-5-big-ai-dns-ec2-and-data-center-keys-to-know?page=1">went dark for 15 hours</a>—a DNS error affecting DynamoDB took down over 1,000 companies. In June, a null pointer exception in Google Cloud's Service Control binary <a href="https://www.crn.com/news/cloud/2025/multiple-cloud-services-down-as-google-cloudflare-resolve-issues">disabled multiple systems</a> including Cloud Storage, Compute Engine, and BigQuery for several hours, with ripple effects hitting Spotify, Discord, and OpenAI.</p>
<p>Across all of these incidents, the pattern was the same: customers refreshed status pages and waited for someone else to fix the problem. The difference between vendors is not whether outages happen, it’s what options you have when they do.</p>
<h3>The Pattern: Single Points of Failure with Global Reach</h3>
<p><a href="https://status.snowflake.com/history">Snowflake’s December incident</a> was triggered by a backwards-incompatible database schema update. Version mismatch errors caused operations to fail or hang indefinitely across multiple regions on AWS, Microsoft Azure, and Google Cloud Platform (GCP). Snowflake's communications stated there were no workarounds except for customers who had pre-configured replication to non-impacted regions. Everyone else waited.</p>
<p><a href="https://status.azuredatabricks.net/pages/history/5d49ec10226b9e13cb6a422e">Databricks’ December outage</a>, which spanned multiple days, included Unity Catalog issues, compute degradation across multiple regions, and a prolonged Mosaic AI disruption. Status updates repeatedly noted they were &quot;working with the cloud provider on potential mitigation paths.&quot; That phrase tells you everything about the dependency chain: when Azure has a bad day, Databricks customers in Azure regions have a bad day too.</p>
<p>The <a href="https://www.crn.com/news/cloud/2025/multiple-cloud-services-down-as-google-cloudflare-resolve-issues">Google Cloud June incident</a> revealed the same vulnerability. A faulty policy with blank fields was inserted into global configuration tables and replicated worldwide within seconds. The corrupted data triggered crash loops that took down core services for 7.5 hours. Google's own status dashboards were initially unavailable—SRE teams could not even confirm the scope of the disaster.</p>
<p>Regional redundancy does not help when the failure is logical rather than physical. When a platform relies on globally coordinated metadata or shared configuration, a single bad update propagates everywhere. The failure follows you from region to region.</p>
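<p>The difference between a physical and a logical failure can be made concrete with a toy model. This is a simplified sketch under stated assumptions, not a description of any vendor’s architecture: the <code>Region</code> class, the region names, the version numbers, and the <code>push_global_config</code> rollout are all hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    hardware_ok: bool = True
    config_version: int = 1

    def is_serving(self, bad_versions):
        # A region serves traffic only if its hardware is up and its config is valid.
        return self.hardware_ok and self.config_version not in bad_versions


def push_global_config(regions, version):
    # Hypothetical globally coordinated rollout: every region receives the update at once.
    for region in regions:
        region.config_version = version


def serving_regions(regions, bad_versions):
    return [region.name for region in regions if region.is_serving(bad_versions)]


regions = [Region("us-east"), Region("eu-west"), Region("ap-south")]
bad_versions = {2}  # assumption: version 2 is the faulty update

# Physical failure: one region loses hardware, and the others keep serving.
regions[0].hardware_ok = False
after_physical = serving_regions(regions, bad_versions)

# Logical failure: the hardware recovers, but the faulty config replicates everywhere.
regions[0].hardware_ok = True
push_global_config(regions, 2)
after_logical = serving_regions(regions, bad_versions)

print(after_physical)
print(after_logical)
```

<p>In this model, adding more regions improves the first outcome but never the second, because every region shares the same configuration path.</p>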
<p>Additionally, in these scenarios, the infrastructure is distributed, but control remains centralized. When Snowflake's control plane breaks, it doesn’t matter that they run on AWS, Azure, and Google Cloud underneath. When Databricks is waiting on Azure to fix something, multi-cloud marketing does not help. The single point of failure is the proprietary layer on top.</p>
<h3>What Analysts Are Saying</h3>
<p>The Gartner® <a href="https://www.gartner.com/en/newsroom/press-releases/2025-05-13-gartner-identifies-top-trends-shaping-the-future-of-cloud">2025 analysis of cloud adoption trends</a> estimates that more than 50% of organizations will not get the expected results from their multi-cloud implementations by 2029. The core problem: lack of interoperability between environments.&nbsp;</p>
<p>In <a href="https://www.forrester.com/blogs/predictions-2026-cloud-outages-private-ai-on-private-clouds-and-the-rise-of-the-neoclouds/">Forrester Predictions 2026: Cloud Outages, Private AI On Private Clouds, And The Rise Of The Neoclouds</a>, the research firm predicts at least two major multiday cloud outages in 2026. The cloud industry is undergoing a massive infrastructure transition as hyperscalers race to build AI-native data centers. That investment is coming at a cost: legacy x86 and ARM environments are being deprioritized, leading to aging infrastructure faltering amid growing complexity.</p>
<p>The same Forrester predictions piece estimates that at least 15% of enterprises will shift toward <a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-08-06-cloudera-data-services-brings-private-ai-to-the-data-center.html">private AI</a> deployments built on private clouds in 2026. The drivers: rising AI costs, concerns about data lock-in, and the operational risk of depending on infrastructure that is increasingly optimized for someone else's priorities. The 2025 outages were a preview of what happens when your workloads are not the provider's top concern.</p>
<h3>Architect for Resilience with Cloudera</h3>
<p>Most enterprises have “accidental multi-cloud” architectures by way of acquisitions, shadow IT, or best-of-breed tool selection—not through deliberate architectural planning. Their workloads are scattered across providers but they lack the ability to move data and workloads when things go wrong.&nbsp;</p>
<p>Architecting for resilience involves ensuring your data and AI platform enables portability and eliminates single points of failure.</p>
<p>The <a href="/content/www/en-us/products.html">Cloudera platform</a> is designed for portability, giving you the ability to fail over between environments to maintain operations. Workloads and data can move across AWS, Azure, Google Cloud, and on-premises environments without rewrites, friction, or vendor lock-in. Updates are not forced as global, non-backward-compatible changes.</p>
<p><a href="/content/www/en-us/blog/business/the-inevitable-outage-why-your-hybrid-strategy-needs-multi-cloud-resilience.html">When the inevitable outage happens</a>, you have options: fail over to another cloud or move workloads back to your data center. You’re not stuck watching a status page—you remain in control of your data and can maintain consistent operations and compliance no matter where data resides.</p>
<p>For a deeper dive on how to build a resilient architecture with Cloudera, read our blog: <a href="/content/www/en-us/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html">Architecting for Data Resilience: Ensuring Business Continuity with Cloudera</a></p>
<h3>Looking Ahead</h3>
<p>The AI buildout is straining infrastructure, and analyst firms point to more turbulence moving forward: Forrester predicts multiday outages, Gartner predicts defensive multi-cloud adoption. Enterprises that come through 2026 in good shape will be those who treat resilience as an architectural principle rather than a compliance checkbox.</p>
<p>Cloudera does not have push-button cross-cloud failover out of the box—nobody does. But we’re architecturally positioned to support that resilience in ways proprietary platforms are not.</p>
<p>If the 2025 outages made you uncomfortable, we would like to have that conversation. Because the cloud is just someone else's computer. And when that computer has a bad day, you should have somewhere else to go.</p>
<p>To learn more about how you can architect for resilience with Cloudera, <a href="/content/www/en-us/contact-sales.html">reach out</a> to our professional services team, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">check out our product demos</a>, or <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html?internal_keyplay=ALL&amp;internal_campaign=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;cid=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;internal_link=WWW-Nav-u01">sign up for a free 5-day trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=2025-was-the-year-the-cloud-reminded-us-whos-really-in-control</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Hybrid by Design: The New AI Mandate</title><description><![CDATA[For the better part of a decade, the enterprise technology mandate was simple: “cloud first,” or more pointedly “cloud only.” Modernizing meant moving to the public cloud, and on-premises architecture was viewed as legacy infrastructure to be maintained until it could eventually be migrated.]]></description><link>https://www.cloudera.com/blog/business/hybrid-by-design-the-new-ai-mandate.html</link><guid>https://www.cloudera.com/blog/business/hybrid-by-design-the-new-ai-mandate.html</guid><pubDate>Fri, 23 Jan 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Blake Tow]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-bridge.jpg"><p>For the better part of a decade, the enterprise technology mandate was simple: “cloud first,” or more pointedly “cloud only.” Modernizing meant moving to the public cloud, and on-premises architecture was viewed as legacy infrastructure to be maintained until it could eventually be migrated.</p>
<p>Fast forward to today, and that narrative has shifted dramatically, with AI as the major catalyst. A recent <a href="https://www.zdnet.com/article/ai-kills-cloud-first-hybrid-computing-comeback/" target="_blank">ZDNet article</a>, citing research from Deloitte and 451 Research, declared that the cloud-first era is over as we enter a more pragmatic, hybrid-by-design era. This approach elevates on-premises infrastructure from legacy debt to the central pillar of a strategic, optimized architecture.</p>
<p>At Cloudera, we’re living for this moment. While the industry swung wildly toward cloud only, we realized that what organizations really needed was “<a href="/content/www/en-us/why-cloudera/hybrid-data-platform.html">the cloud experience, anywhere</a>.” Now, the <a href="https://www.deloitte.com/content/dam/insights/articles/2025/us188546_tt-26/pdf/DI_Tech-trends-2026.pdf" target="_blank">market is catching up</a>, and enterprises are waking up to the fact that workloads must move fluidly between public clouds, private data centers, and the edge. Here’s why the shift is happening, and why Cloudera is uniquely positioned to lead it.</p>
<h2>The Inference Economics Wake-Up Call</h2>
<p>The primary driver of this shift is what analysts call the &quot;AI infrastructure reckoning.&quot; In the early days of generative AI (GenAI), everyone rushed to the cloud for massive compute power to train models. But as organizations move from experimentation to production, the math changes.</p>
<p>The critical tipping point? Inference costs. While training a model is a massive, episodic burst of compute that’s perfect for the public cloud, running that model (inference) requires 24/7 compute. When you scale AI to enterprise levels, the recurring costs of cloud inference and data egress become prohibitively expensive.</p>
<p>In 2026, the smart play is workload-first rather than cloud-first:</p>
<ul>
<li><p><b>Public cloud:</b> Ideal for bursty training workloads and elastic experimentation.</p>
</li>
<li><p><b>On premises:</b> The cost-effective powerhouse for consistent, high-volume production inference.</p>
</li>
<li><p><b>Edge:</b> Critical for low-latency decision-making where the speed of light is the bottleneck.</p>
</li>
</ul>
<p>Cloudera allows you to execute a workload-first approach seamlessly. With <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a>, you can spin up a workspace in one infrastructure to contextualize a model on massive datasets, and then deploy that same model to another infrastructure for inference, without refactoring. We bring the compute to the data, rather than paying the &quot;gravity tax&quot; of moving petabytes of data to the compute. This empowers you to choose the deployment pattern that fits your reality, whether that means training on premises to secure sensitive IP and deploying to the cloud, or vice versa.</p>
<h2>Resilience via Hybrid Failover</h2>
<p>Another reason why enterprises are rethinking their cloud-only strategy is “concentration risk.” Put plainly: if all your workloads are tied to a single cloud provider, your business goes dark when the <a href="/content/www/en-us/blog/business/the-inevitable-outage-why-your-hybrid-strategy-needs-multi-cloud-resilience.html">inevitable outage</a> happens. Relying on a single public cloud provider for all data and AI operations creates a single point of failure. This is no longer just a matter of good business sense. Regulators are stepping in with frameworks like DORA (Digital Operational Resilience Act) to prevent concentration risk from causing systemic catastrophes.</p>
<p>For many, cloud-only resilience is simply too little. True resilience now requires the agility to move workloads instantly, whether to survive outages or navigate geopolitical mandates.</p>
<p>In a hybrid world, resilience comes from diversity. A proper hybrid architecture allows you to fail over not just from one region to another, but from public cloud to private cloud, or even from one hyperscaler to another.</p>
<p>Cloudera supports a <a href="/content/www/en-us/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html">resilient architecture</a>. Our platform can be configured to replicate data, metadata, and security policies across environments. This setup establishes a powerful &quot;failover anywhere&quot; capability. With these configurations in place, mission-critical applications can fail over in any direction, whether moving from a downed public cloud region to a private data center, or shifting from on premises to the cloud to handle sudden spikes.</p>
<h2>Security and Governance: The Sovereignty Factor</h2>
<p>Another friction point in the cloud-first approach is governance. Fragmented policies across hyperscalers and on-premises systems create security blind spots. As <a href="https://www.sdxcentral.com/opinions/data-sovereignty-and-the-future-of-freedom-a-vision-for-the-ai-era/" target="_blank">data sovereignty</a> and regulatory pressures intensify, enterprises are facing a complex web of compliance requirements. Whether navigating regional mandates like GDPR and the EU Data Act, industry standards like HIPAA and PCI DSS, or self-imposed controls for IP protection, organizations are realizing they cannot simply expose sensitive data to public environments. Instead, many are moving workloads back on-premises to regain control.</p>
<p>The challenge: How do you govern a hybrid estate without massively multiplying your workload?</p>
<p><a href="/content/www/en-us/products/unified-data-fabric.html">Cloudera’s unified data fabric</a> solves this challenge by first unlocking data access and automating understanding from a business perspective, regardless of location. This foundation allows you to decouple security and governance from the underlying infrastructure. You simply define a policy once, such as masking PII for specific users, and that policy follows the data, whether it resides in an S3 bucket, an on-premises cluster, or an edge stream.</p>
<p>We’ve further strengthened this fabric with the addition of <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Data Lineage</a> (formerly Octopai), which delivers automated, end-to-end visibility into your data's journey. These advanced capabilities allow teams to trace data flows across complex hybrid environments to ensure compliance and trust, earning Cloudera recognition as a <a href="/content/www/en-us/campaign/the-forrester-wave-data-fabric-platforms-q4-2025.html">Leader in The Forrester Wave™: Data Fabric Platforms, Q4 2025</a>. While others may stitch together separate tools, Cloudera delivers a unified platform that secures and manages the entire experience.</p>
<h2>Not All Hybrid Architectures Are Created Equal</h2>
<p>The 2025 outages may have served as the nail in the coffin of the cloud-only era. But as <a href="https://www.spglobal.com/content/dam/spglobal/mi/en/documents/solutions/451R_Consulting_TIA_HybridCloud_2024.pdf" target="_blank">451 Research</a> notes, there’s a critical difference between hybrid-by-accident architectures that leave organizations struggling with silos and complexity, and an architecture that’s hybrid by design. A designed approach includes a consistent, portable platform that abstracts complexity across data centers, clouds, and the edge, anchored by a unified data fabric with replication.</p>
<p>To succeed in 2026 and beyond, organizations cannot afford accidental architectures. Cloudera’s hybrid-by-design architecture enables enterprises to stop compromising on where their data lives. Instead, they can start capitalizing on what their data can do, turning the inherent diversity of the hybrid estate into a strategic asset rather than a burden.</p>
<p>We deliver a consistent cloud experience by bringing the best parts of the cloud to wherever the data lives. This includes cost efficiency, scalability, elasticity, increased agility, reduced IT effort, faster access to innovation, and high availability. We’re the only data and AI platform company that brings AI to data anywhere: in clouds, data centers, and at the edge.&nbsp;</p>
<p>To learn more about how you can build a hybrid-by-design architecture with Cloudera, <a href="/content/www/en-us/contact-sales.html">reach out</a> to our professional services team, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">check out our product demos</a>, or <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html?internal_keyplay=ALL&amp;internal_campaign=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;cid=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;internal_link=WWW-Nav-u01">sign up for a free 5-day trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=hybrid-by-design-the-new-ai-mandate</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Luz Erez on Bringing Humanity Back to Healthcare with AI</title><description><![CDATA[In this episode of The AI Forecast, host Paul Muller sits down with Luz Erez, founder of MDClone, to explore how synthetic data is changing the way healthcare organizations conduct research, deploy AI, and safeguard sensitive information. Their conversation spans everything from clinical workflows and physician burnout to the role synthetic data plays in validating AI agents safely at scale.]]></description><link>https://www.cloudera.com/blog/business/luz-erez-on-bringing-humanity-back-to-healthcare-with-ai.html</link><guid>https://www.cloudera.com/blog/business/luz-erez-on-bringing-humanity-back-to-healthcare-with-ai.html</guid><pubDate>Thu, 22 Jan 2026 17:01:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1146067345.jpg"><p>Few types of data carry as much potential—or as much responsibility—as medical data. When used properly, healthcare data can improve outcomes, accelerate research, and quite literally save lives. But accessing and analyzing that data remains one of the hardest challenges in enterprise AI.</p>
<p>In this episode of The AI Forecast, host Paul Muller sits down with Luz Erez, founder of MDClone, to explore how synthetic data is changing the way healthcare organizations conduct research, deploy AI, and safeguard sensitive information. Their conversation spans everything from clinical workflows and physician burnout to the role synthetic data plays in validating AI agents safely at scale.</p>
<p>Here are the key takeaways from the conversation.</p>
<h2>Bringing Humanity Back to Healthcare Through Automation</h2>
<p><b>Paul</b>: We talk about AI and what it’s going to do for business, and specifically what it’s going to do in the medical industry—but what would you like to see AI automate for you personally?</p>
<p><b>Luz</b>: One of the things that happened to us during the last 40 years, we lost contact with people. A physician today spends 60% of his time behind the desk doing registering things, regulation, dosing, and so on.</p>
<p>He should be with you as a person. The rest of the work—important work, but work that AI can do—will be done by machines. And it will totally alter the way that we interact. Free time will be more, and many of the interactions that we don’t like about work will be done by machines. I really believe it’s a much, much better future. I’m excited.</p>
<h2>Why Medical Research Needs a New Data Model</h2>
<p><b>Paul</b>: You talk about the complexity of retrospective research, but what does retrospective research mean in this context?</p>
<p><b>Luz</b>: Retrospective research means I’m doing research by looking at data of patients that already exist. And most of the time, people understand there is a difference between correlation and causality.</p>
<p>A researcher might look at medical data and say: I want all the medications of patients that had a relapse in kidney disease while on a beta blocker. Tools like SQL can’t answer this because first you have to define what “on a beta blocker” means, and what a “relapse” means.</p>
<p>As a physicist, I ask: what are the basic rules? The basic rules are rules of time and people, which means this is longitudinal. So, the main question is, how frequent is something taking place? Once I put the mathematics inside it, I could build logic and a system on top of it. But then I saw another problem. I can find the answers, but I cannot give them to anyone. A physician can ask about their patient, but population-level research requires consent, privacy, and governance. So how do you solve this?</p>
<p>We built something called synthetic data. The engine looks at real data, but it doesn’t give you that data. It gives you a list of avatars that look like the original data. Any statistics will be the same, but there’s no one-to-one correlation with real people. There is no PHI issue.</p>
<p>Synthetic data allows you to share data, train models, and collaborate—without violating privacy. And today, synthetic data plays a major role in AI.</p>
<h2>Synthetic Data as the Foundation for Safe Medical AI</h2>
<p><b>Paul</b>: To get synthetic data of sufficient fidelity, surely it still has to come from actual data?</p>
<p><b>Luz</b>: Sure. It looks at actual data and creates synthetic data. There is a balance between privacy and utility. You set the level of privacy, and we give you the best utility possible.</p>
<p>The key difference is governance. When a machine does this automatically and users only see synthetic data, all the ethical and privacy issues go away. Not everything can be synthetic—rare cases are hard—but for common medical data, synthetic data works extremely well.</p>
<p><b>Paul</b>: How do you see this changing the future of medicine?</p>
<p><b>Luz</b>: AI agents are already doing things like offering dosing recommendations with incredible accuracy, but not enough yet. For dosing, you need absolute certainty. To validate these agents, you need hundreds of thousands of cases—and you don’t have them yet.</p>
<p>With synthetic data, we can bootstrap. We generate more and more cases until we can prove the agent works 100% of the time. We’ve shown that primary caregivers can save 40–60% of session time using agents like this, but only if they’re validated correctly.</p>
<p>Without synthetic data, you can’t safely test these systems at scale. I truly believe synthetic data is one of the bedrocks of medical AI.</p>
<p>Listen to the full conversation with Luz Erez on The AI Forecast on <a href="https://open.spotify.com/episode/40oF8ndYdKCxEPVnueLFqM?si=XuOvlxnSTgG1VP1GxgLrPg" target="_blank">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/unlocking-healthcares-data-potential-with-mdclones-luz/id1779293119?i=1000739472085" target="_blank">Apple Podcasts</a>, and <a href="https://www.youtube.com/watch?v=uwJf4MKBt2Y" target="_blank">YouTube</a>.</p>
<p>You can also learn more about Cloudera’s partnership with MDClone at <a href="/content/www/en-us.html">cloudera.com</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=luz-erez-on-bringing-humanity-back-to-healthcare-with-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Winter, Unplugged: Making the most of the Holiday Season</title><description><![CDATA[Every year, Cloudera offers “Unplug” days to help employees truly disconnect from work, recharge, and focus on what matters to them outside of the office. For some, that means pursuing a long-awaited passion project.]]></description><link>https://www.cloudera.com/blog/culture/winter-unplugged-making-the-most-of-the-holiday-season.html</link><guid>https://www.cloudera.com/blog/culture/winter-unplugged-making-the-most-of-the-holiday-season.html</guid><pubDate>Tue, 20 Jan 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Ashton Stockstill]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-lake.webp"><h2>Is there anything about Unplug you want to share?</h2>
<p>“<i>This isn't just a holiday break; it’s a necessary pause to fully embrace this moment before a long farewell.</i>” - Jennifer Parker, Sr. Manager, Contracts</p>
<p>&nbsp;</p>
<p>“<i>It's great, and I really enjoyed these days. Especially as an employee, these breaks are important to refresh our minds and be 100 percent when we return</i>.” - Gaurav Sharma, Software QA Engineer</p>
<p>&nbsp;</p>
<p>“<i>Unplug gives me the freedom to step away, recharge, and return with better focus and energy. It’s a strong signal of trust and a people-first culture.</i>” - Vishnuprakash Palanisamy, Staff Software Engineer</p>
<p><a href="/content/www/en-us/about/our-culture.html">Learn more</a> about how Cloudera is helping create an inclusive and supportive workplace for everyone. </p>
<p><i>“These unplugged trips have always turned out to be more productive than I expected. This time, they gave me a fresh perspective on life and its surprises — along with an answer to the philosophical question most present in my mind, just as similar trips have done before.”</i> - Shivam Kumar, Software Engineer&nbsp; II </p>
<p>“<i>When not spending time with my four children and family, I was able to work on building a robotic dog</i>.” – Dr. Christopher Royles, Field CTO EMEA </p>
<p><i>“Since Brazil is in the summer season, I spent time doing some cycling training in beautiful places.</i>” - Everton Fernandes, Sr. Manager, Solutions Engineering</p>
<p><i>“One of the most rewarding moments for us on the talent acquisition team is when candidates tell us they’ve already heard about our Unplug program through friends or social media. When people want to work here not just for the role, but because we genuinely respect personal time, you know the culture is real.”</i> - Rachit Chandra, Director, Talent Acquisition, APAC</p>
<p>“<i>I stayed in a wooden house in the mountains of Chiang Mai, Thailand, spending my days meditating, reading, and disconnecting from the rhythm of city life</i>.” – Ziyang Yang, Senior Talent Acquisition Advisor</p>
<p>“<i>I was able to spend time with family and friends - Pantomime, Christmas markets, dining out, touring London, watching the Christmas lights and firework show on New Year’s Eve</i>.” - Deepa Pednekar, Senior Practice Manager, EMEA</p>
<p><i>“I really appreciate the fact that I can disconnect completely without worrying about tons of emails waiting for me on my return to the office. Getting quality time off makes me very productive and motivated to give my best after the break!</i>” - Stamatis Zampetakis, Senior Staff Engineer</p>
<h2>What does it mean to you to work at a company that offers Unplug?</h2>
<p>“<i>As a busy mom, wife, and professional, having the flexibility to take time off when I need it—whether that’s to align with my kids’ school schedules, hockey tournaments, or gymnastics meets—means a lot. Cloudera Unplug gives me the freedom to step away without guilt, knowing I can take care of what matters most at home and come back recharged and ready to give my very best at work.</i>” – Molly Boyer, Sr. Director, Communications and Analyst Relations</p>
<p>&nbsp;</p>
<p>“<i>Unplug days offer a chance for employees to recharge mentally and build a stronger connection with family, which in turn prevents burnout and leads to stronger loyalty, creativity and productivity.</i>” - Dimas Ramaditya, Partner Sales Manager</p>
<p>&nbsp;</p>
<p>“<i>Having the opportunity to pursue one’s dreams with a healthy body creates a healthy mind.</i>” - Niel Dunnage, Principal Strategic Customer Success Manager</p>
<p>“<i>Scuba Diving! I went to Mexico and did a lot of cave diving in the famous cenotes. And then taught my oldest to dive as well!</i>” - Sergio Gago, CTO</p>
<p>“<i>It is nothing short of amazing to have this time off. I have always appreciated and thoroughly cherished any company that supports their employees’ well-being with much-needed time off to unplug and recharge the batteries for a new year!</i>” - Morry Bowling, Partner Sales Director</p>
<p>Every year, Cloudera offers “Unplug” days to help employees truly disconnect from work, recharge, and focus on what matters to them outside of the office. For some, that means pursuing a long-awaited passion project. For others, that means seeing more of the world or spending quality time with family and friends. No matter how Clouderans choose to spend them, these dedicated, enterprise-wide breaks reinforce Cloudera’s commitment to the humanity behind the company—ensuring our team’s well-being and promoting a positive work-life balance.&nbsp;</p>
<p>We just returned from our Winter Unplug, which gave Clouderans the opportunity to close out an incredible year with a moment to reset. With the new year underway and more employees returning from their breaks, we checked in with Clouderans from across the globe to see how they spent their 2025 Winter Unplug.</p>
<p><i>“My family plans our year around unplug days, and we all look forward to them. We schedule vacations to coincide with unplug periods, allowing for extended breaks without missing extra workdays. This approach has significant business benefits too. My coworkers do the same, so out-of-office times cluster around unplug days, resulting in more days when everyone is online together.”</i> - Jason Fehr, Senior Staff Engineer </p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=winter-unplugged-making-the-most-of-the-holiday-season</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>2026 Predictions: The Architecture, Governance, and AI Trends Every Enterprise Must Prepare For</title><description><![CDATA[2026 enterprise tech predictions on AI, data architecture, governance, hybrid control planes, and scaling intelligent workflows for measurable impact.]]></description><link>https://www.cloudera.com/blog/business/2026-predictions-the-architecture-governance-and-ai-trends-every-enterprise-must-prepare-for.html</link><guid>https://www.cloudera.com/blog/business/2026-predictions-the-architecture-governance-and-ai-trends-every-enterprise-must-prepare-for.html</guid><pubDate>Thu, 08 Jan 2026 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1262811236.jpg"><p>2026 marks the transition from experimentation to intelligence orchestration—a moment where AI, data, infrastructure, and governance converge into a single operating model. If 2024 and 2025 were defined by proofs of concept and one-off model deployments, 2026 will be the breakout year when enterprises begin operationalizing AI at scale, safely and with measurable ROI.&nbsp;</p>
<p>According to our Cloudera leadership team, this is the year when data evolves from passive storage to active organizational memory, enabling data everywhere for AI anywhere by unifying cloud and on-prem control planes. It’s also the year when AI agents move from demonstrations to becoming part of the digital workforce, but only if enterprises put governance, security, and responsible AI practices on equal footing with compute priorities.</p>
<p>Here’s what our leaders predict for the year ahead.&nbsp;</p>
<h2>Abhas Ricky, Chief Strategy Officer: The Data Foundation Becomes the Intelligence Layer&nbsp;</h2>
<p>In 2026, the leaders in the race to capitalize on AI will be the organizations that recognize that data’s value comes from how well it can be understood and acted on (not merely from how much of it exists). Data must function as a living, semantic, and governed memory system that AI can learn from and reason with.&nbsp;</p>
<p>In other words, you can’t scale AI until you re-architect the data beneath it.&nbsp;</p>
<p>Every dataset—whether structured, unstructured, real-time, or generated by a model—must carry its own semantics, lineage, and guardrails. This embedded context allows the modern <a href="/content/www/en-us/products/open-data-lakehouse.html">data lakehouse</a> to evolve from passive storage into an active intelligence layer that can contextualize information, enforce policy, audit decisions, and preserve traceability.&nbsp;</p>
<p>With this foundation in place, enterprises can begin building truly autonomous workflows that recall, adapt, and self-correct—the capabilities that will define AI ROI in the years ahead.&nbsp;</p>
<h2>Manasi Vartak, Chief AI Architect: Agentic AI Moves to Production and Governance Becomes Non-negotiable&nbsp;</h2>
<p>Despite headlines predicting a slowdown, enterprise demand for generative and agentic AI will continue to rise in 2026, but with a decisive shift toward measurable ROI (i.e., fewer rogue experiments, and more predictable and intentional use-case-based applications). Much of that value will come from enterprise-adapted models, gradually reducing reliance on public models as organizations prioritize solutions tailored to their own data and workflows.&nbsp;</p>
<p>The last few years were about testing AI’s limits.&nbsp;&nbsp;</p>
<p>2026 is about scaling what works.&nbsp;</p>
<p>To deploy agentic systems in production, organizations will need:&nbsp;</p>
<ul>
<li><p>Strong governance frameworks&nbsp;</p>
</li>
<li><p>Clear data access controls&nbsp;</p>
</li>
<li><p>Security rules and permission frameworks defining what data agents can access and what actions they are allowed to take&nbsp;</p>
</li>
<li><p>Observability into agent actions and decision-making&nbsp;</p>
</li>
<li><p>Agent registries and workflow versioning to track how agents evolve over time&nbsp;</p>
</li>
</ul>
<p>This necessarily broadens the definition of responsible AI. Fairness and bias mitigation remain important, but enterprises now require end-to-end accountability across data pipelines, system behaviors, and the choices AI agents make if they want to scale agentic AI safely and profitably.&nbsp;</p>
<h2>Sergio Gago, CTO: The Era of Convergence and the Rise of One Control Plane</h2>
<p>After years of tension between on-prem control and cloud elasticity, 2026 is the year of true convergence. Hybrid infrastructure is no longer a compromise between legacy and cloud systems. It has instead become the architectural backbone that enables intelligence at scale.&nbsp;</p>
<p>Across Cloudera’s leadership team, one theme stood out: AI agents will become part of the operational workflow. But until now, their effectiveness has been limited by fragmented data access. Some models could reach only cloud-based data, while others pieced together partial views across environments. Most thought a unified control plane simply wasn’t possible.&nbsp;</p>
<p>That changes in 2026.&nbsp;</p>
<p><a href="/content/www/en-us/products/cloudera-data-platform.html">Cloudera’s hybrid architecture</a> allows workloads (including AI agents) to run wherever they make the most sense, guided by policy, governance, and efficiency rather than storage location, unlocking the next generation of intelligent, coordinated enterprise systems.&nbsp;</p>
<h2>Implications by Vertical&nbsp;</h2>
<p>These predictions aren’t just theoretical. They stand to impact and influence sector operations. Retail and financial services, in particular, are positioned for profound transformation as data foundations strengthen, agentic AI moves to production, and control planes converge.&nbsp;</p>
<h3>Neelabh Pant, Director of Global AI: Retail: From Siloed Systems to Real-Time, Connected Intelligence</h3>
<p>Retailers are already seeing <a href="https://www.globenewswire.com/news-release/2024/11/29/2989141/0/en/The-Rise-of-Artificial-Intelligence-in-Retail-Market-A-164-74-billion-Industry-Dominated-by-Tech-Giants-Microsoft-Google-Oracle-Servicenow-MarketsandMarkets.html" target="_blank" rel="noopener noreferrer">outsized returns</a> from AI, with early adopters realizing ROI up to <a href="https://www.gartner.com/en/industries/retail-digital-transformation" target="_blank" rel="noopener noreferrer">six times faster</a>. In 2026, success will hinge on:&nbsp;</p>
<ul>
<li><p>Connecting data across stores, supply chains, customer interactions, and online ecosystems</p>
</li>
<li><p>Enabling AI agents to act on real-time information, from inventory updates and returns to customer preferences</p>
</li>
<li><p>Empowering nontechnical teams to create new data connections and workflows without waiting on IT to build them</p>
</li>
</ul>
<p>A unified control plane means AI agents can navigate data and make inferences regardless of where it lives, unlocking personalization, operational efficiency, and faster decision-making. Retailers that modernize their data architectures will continue to set the pace of innovation.&nbsp;</p>
<h3>Adrien Chenailler, Sr. Director, AI Industry Solutions: Financial Services: AI Becomes an Operational Layer, Not a Project</h3>
<p>Financial institutions have spent years modernizing their data foundations. In 2026, that work pays off. Banks, insurers, and investment firms will increasingly run day-to-day operations on AI, with agents already supporting things like:&nbsp;</p>
<ul>
<li><p>Credit risk scoring</p>
</li>
<li><p>Fraud detection and prevention</p>
</li>
<li><p>Compliance investigations</p>
</li>
<li><p>Credit memo preparation</p>
</li>
<li><p>Customer service workflows</p>
</li>
</ul>
<p>With <a href="https://www.globenewswire.com/news-release/2025/11/11/3185161/31982/en/Finextra-and-Cloudera-Report-Reveals-Hybrid-AI-is-the-New-Standard-in-Financial-Services-with-91-Citing-It-as-Highly-Valuable.html" target="_blank" rel="noopener noreferrer">91% of financial services leaders</a> already calling hybrid AI highly valuable, there’s a reduced need for experimentation—we've already done that. Now, enterprises will compete on execution. Unified control planes provide the secure, governed environment AI needs to analyze sensitive data across systems without compromising compliance or sovereignty.&nbsp;</p>
<p>Cloudera’s platform is built for exactly this moment, enabling access to data anywhere for AI everywhere with governed, enterprise-wide intelligence, whether your data lives in the cloud, in data centers, or at the edge.&nbsp;&nbsp;</p>
<p>To learn how your organization can prepare for 2026 and beyond, <a href="/content/www/en-us.html">explore Cloudera’s latest resources and insights</a>.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=2026-predictions-the-architecture-governance-and-ai-trends-every-enterprise-must-prepare-for</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Unleash Peak Performance: Get 13x Faster Queries with Cloudera Lakehouse Optimizer</title><description><![CDATA[Whether you&apos;re a platform lead focused on cost controls, a data architect designing scalable solutions, or a data engineer streamlining processes, Cloudera Lakehouse Optimizer is built for you. It comes with policy templates and defaults, enabling immediate optimization without extensive configuration.]]></description><link>https://www.cloudera.com/blog/business/unleash-peak-performance-get-13x-faster-queries-with-cloudera-lakehouse-optimizer.html</link><guid>https://www.cloudera.com/blog/business/unleash-peak-performance-get-13x-faster-queries-with-cloudera-lakehouse-optimizer.html</guid><pubDate>Wed, 31 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Adam Benlemlih,Navita Sood]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-lake-in-the-middle-of-mountains.webp"><p>Cloudera's commitment to an <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a> empowers customers with the flexibility to use any engine or tool of choice—whether from Cloudera, other vendors, or open source. We understand the complexity of modern data ecosystems, and our engine-neutral approach ensures seamless collaboration across teams accessing data to build analytical or AI applications and agents. We continuously enhance our lakehouse with innovative features for speed, security, automation, and interoperability, ensuring all engines run concurrently and efficiently and have access to all features and optimizations.</p>
<p>The <a href="https://docs.cloudera.com/management-console/cloud/clo/topics/clo-introduction.html">Cloudera Lakehouse Optimizer</a> provides predictive and intelligent <a href="/content/www/en-us/blog/technical/cloudera-lakehouse-optimizer-easier-to-deliver-high-performance-iceberg-tables.html#">optimizations</a>, automating <a href="https://docs.cloudera.com/cdp-public-cloud/cloud/cdp-iceberg/topics/iceberg-in-cdp.html">Apache Iceberg</a> table maintenance and ensuring your open data lakehouse remains performant, scalable, and cost-effective. This service empowers data teams with a cost-efficient lakehouse for all their AI and analytical workloads.</p>
<h2>The Proof is in the Performance: 13x Faster Queries and 36% Storage Cost Reduction!</h2>
<p>We know that performance and cost efficiency are paramount, which is why we're sharing compelling results from our internal benchmarks. We tested Cloudera Lakehouse Optimizer using 7 TPC-DS tables (107 GB of data), executing TPC-DS queries before and after optimization. Even after accounting for caching and removing outliers, the results are significant:</p>
<ul>
<li><p><b>13x faster queries</b>: Our data shows an average 13x query time improvement, reducing average query time from 24 seconds to a mere 1.8 seconds after optimization!&nbsp;</p>
</li>
<li><p><b>36% storage cost reduction</b>: Cloudera Lakehouse Optimizer also drives substantial cost savings by optimizing your storage footprint. Our benchmarks revealed a 36% reduction in dataset size, from 107 GB to 68 GB. This directly translates to a lower total cost of ownership (TCO).</p>
</li>
</ul>
<p>These results demonstrate how Cloudera Lakehouse Optimizer improves query performance for downstream AI, reporting, and analytics, and also significantly reduces your storage costs.</p>
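<p>The two headline figures can be recomputed from the numbers quoted above. This short Python sketch uses only those published values; the variable names are illustrative.</p>

```python
# Figures quoted in the benchmark summary above.
avg_query_before_s = 24.0
avg_query_after_s = 1.8
size_before_gb = 107.0
size_after_gb = 68.0

# Speedup is the ratio of average query times before and after optimization.
speedup = avg_query_before_s / avg_query_after_s

# Storage reduction is the drop in dataset size relative to the original size.
storage_reduction_pct = (size_before_gb - size_after_gb) / size_before_gb * 100

print(f"speedup: {speedup:.1f}x")
print(f"storage reduction: {storage_reduction_pct:.1f}%")
```

<p>The results come to roughly 13.3x and 36.4%, in line with the rounded 13x and 36% figures.</p>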
<h2>What Makes Cloudera Lakehouse Optimizer Stand Out?&nbsp;</h2>
<p>Whether you're a platform lead focused on cost controls, a data architect designing scalable solutions, or a data engineer streamlining processes, Cloudera Lakehouse Optimizer is built for you. It comes with policy templates and defaults, enabling immediate optimization without extensive configuration. For specific requirements, the graphical user interface (GUI) and application programming interface (API) offer best-in-class controls.</p>
<p>Let's explore how Cloudera Lakehouse Optimizer uniquely tackles table optimization to deliver these performance and storage benefits:</p>
<ul>
<li><p><b>Intelligent policies</b>: Cloudera Lakehouse Optimizer assesses whether a table requires optimization, ensuring only necessary actions are executed, and autonomously runs the optimizations as and when necessary. It offers rich and configurable action arguments against all Iceberg optimizations, covering a large set of arguments to enable maximum performance.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Engine and storage agnostic</b>: Once the tables are optimized by the Lakehouse Optimizer, any engine accessing the data from the lakehouse will see exactly the same improvements in the performance of the queries, whether those engines are Cloudera owned, open source, or from another vendor. These optimizations also apply to data stored in any cloud object storage or on-premises object stores.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Unmatched scope and control</b>: Cloudera Lakehouse Optimizer allows granular control over policy application. You can create and apply policies at the table, namespace, or even entire catalog level, which keeps management flexible and scalable as your lakehouse evolves. Optimizations can be defined against nearly all arguments, so you can tailor the policy definition to your tables. This broad scope is a significant differentiator compared to other solutions with more limited policy application. The optimizer also includes a dedicated GUI, enabling all users to comfortably configure and monitor optimizations. For programmatic control, comprehensive API/command line interface (CLI) access is also available. It also provides flexibility and control over when and how optimizations run:</p>
<ul>
<li><b>Event-based intelligent scheduling</b>: Automatically triggers optimizations when a table event occurs, such as an update, insert, or delete.</li>
<li><b>Time-based scheduling</b>: Allows you to schedule optimizations on a set, recurring basis using a cron-like schedule—a feature not available from AWS S3 Table Maintenance or Databricks Predictive Optimizer.</li>
<li><b>Manual executions</b>: Trigger manual executions of policies, enabling on-demand optimization.</li>
</ul>
</li>
</ul>
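<p>To make the three trigger modes above concrete, here is a minimal Python sketch of how event-based, time-based, and manual triggers might be evaluated for a single policy. It is not the Lakehouse Optimizer API: <code>MaintenancePolicy</code>, <code>should_run</code>, and the table name are hypothetical, and the cron matcher is a deliberate simplification that accepts only <code>*</code> or a single integer in each of the five fields.</p>

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def cron_matches(expr, now):
    """Match a simplified 5-field cron expression ('*' or integers only)."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields")
    current = [
        now.minute,
        now.hour,
        now.day,
        now.month,
        (now.weekday() + 1) % 7,  # cron convention: 0 = Sunday
    ]
    return all(f == "*" or int(f) == value for f, value in zip(fields, current))


@dataclass
class MaintenancePolicy:
    """Hypothetical policy; field names are illustrative only."""
    scope: str
    on_events: frozenset = frozenset()
    cron: Optional[str] = None


def should_run(policy, now, event=None, manual=False):
    """Return True if any of the three trigger modes fires."""
    if manual:
        return True  # manual execution: run on demand
    if event is not None and event in policy.on_events:
        return True  # event-based: a table event the policy listens for
    if policy.cron is not None and cron_matches(policy.cron, now):
        return True  # time-based: the schedule matches the current minute
    return False


nightly = MaintenancePolicy(
    scope="sales.orders",
    on_events=frozenset({"delete"}),
    cron="0 2 * * *",  # every day at 02:00
)

print(should_run(nightly, datetime(2026, 1, 5, 2, 0)))
print(should_run(nightly, datetime(2026, 1, 5, 9, 30), event="delete"))
print(should_run(nightly, datetime(2026, 1, 5, 9, 30)))
```

<p>A real scheduler would also support ranges, lists, and step values in the schedule, and would check whether the table actually needs maintenance before running.</p>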
<h2>Ready to Transform Your Lakehouse?</h2>
<p>Experience the power of automated, intelligent Iceberg table optimization and realize significant performance and cost benefits today.</p>
<ul>
<li><p>Learn more about the Cloudera Lakehouse Optimizer by watching a <a href="https://app.getreprise.com/launch/z6eJWkn/">demo</a>.</p>
</li>
<li><p>Take advantage of our special promotional offer: All data processed through Cloudera Lakehouse Optimizer will be free until April 26th, 2026! While there is a <a href="/content/www/en-us/products/pricing.html">minimal base cost</a>, this promotion ensures you can explore Cloudera Lakehouse Optimizer’s capabilities without worrying about data processing fees. Furthermore, you can set consumption limits via the Cloudera Management Console to ensure costs never exceed your expectations.</p>
</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=unleash-peak-performance-get-13x-faster-queries-with-cloudera-lakehouse-optimizer</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Deliver Repeatable, Measurable, and Enterprise-Ready AI for Life Sciences</title><description><![CDATA[Unlock how AI for life sciences accelerates drug discovery and insights with robust data unification, governance, and measurable enterprise-ready results.]]></description><link>https://www.cloudera.com/blog/business/deliver-repeatable-measurable-and-enterprise-ready-ai-for-life-sciences.html</link><guid>https://www.cloudera.com/blog/business/deliver-repeatable-measurable-and-enterprise-ready-ai-for-life-sciences.html</guid><pubDate>Tue, 30 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Laura Blewitt]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-freeway-lights.webp"><h2>Deliver Repeatable, Measurable, and Enterprise-Ready AI for Life Sciences</h2>
<p>Pharmaceutical and life science companies use AI to enhance drug discovery, clinical development, and patient experiences. In these regulated environments, the key to unlocking AI-assisted breakthroughs and return on investment (ROI) is a back-to-basics approach that focuses on data unification, interoperability, and security and governance.</p>
<p>On the latest episode of the <a href="https://www.healthcareitnews.com/podcast/discover-life-sciences-blueprint-scalable-ai" target="_blank" rel="noopener noreferrer">Healthcare IT News podcast</a>, HIMSSCast, Rameez Chatni, Global Director of AI Solutions at Cloudera, explains that the industry is transitioning from a nascent focus on AI strategy back to the bedrock of a robust data foundation.&nbsp;</p>
<h3>Ensure Interoperability Across the Value Chain</h3>
<p>The typical global pharma organization comprises 12 to 15 distinct, enterprise-like verticals—R&amp;D, manufacturing, commercial, and so on—and building an AI-ready data set requires managing sophisticated, distributed architectures.<br>
<br>
<a href="/content/www/en-us/products/unified-data-fabric.html">Data unification</a> is difficult, and the solution isn't to force all data into one homogenous system. Instead, organizations are embracing a hybrid architecture that accommodates on-premises systems, multiple clouds, and software-as-a-service (SaaS) solutions.</p>
<p>Using open-source, interoperable technologies that support open data formats ensures that multiple query engines can access data for a variety of engineering, analytic, and AI workloads, and reduces the risk of vendor lock-in.</p>
<p>The ultimate goal for data unification is to give AI models the context they need to connect the dots across the organization and provide better outputs. One contextual model many pharma companies are leveraging is a knowledge graph. This structure captures the relationships within the business that humans often miss (linking drugs to genes, diseases, clinical trials, and commercial data), creating a truly comprehensive and usable data set.</p>
<p>However, these advanced architectures hinge on one critical, often-overlooked first step: data inventory and <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">data lineage</a>. These are the unsung heroes and foundational pillars that prevent different functions (like R&amp;D and manufacturing) from duplicating licenses for the same data sets and wasting resources.</p>
<h3>Treat Governance as a Feature, Not a Bug</h3>
<p>In a sector that is trying to innovate quickly with data, governance is frequently an afterthought, and projects can stall for as long as nine months as a result. Rameez argues that governance must be treated as a feature, not a bug. This means transforming it into “governance as a service,” a proactive, continuous capability within the enterprise.</p>
<p>The only way to achieve governance as a service is through a multidisciplinary center of excellence (CoE) that connects business leaders, data strategists, technology architects, and privacy and legal counsel. This ensures technical teams, who understand how data moves, can communicate effectively with legal teams, who understand privacy and consent restrictions.</p>
<p>Crucially, governance should be applied early. Failure to consider compliance, like restrictions on using clinical trial data for secondary purposes, can halt an entire project late in the game. In fact, AI should be applied to governance itself to accelerate contract reviews and ensure compliance checks are automated and auditable.</p>
<h3>Prove ROI to Achieve Scale</h3>
<p>The industry is littered with reports of AI pilot failures. Organizations that are just starting their AI journeys should find the operational AI use cases first. Automating &quot;boring&quot; tasks like clinical trial protocol writing (saving a week on each of a thousand documents) or processing adverse events faster are clear, quick wins.&nbsp;</p>
<p>Rameez advises that success starts with defining a clear, measurable ROI that aligns with the business. In pharma, enabling a “fail fast” culture is itself a form of ROI: a computational failure is significantly cheaper than a late-stage clinical trial crash.</p>
<p>Rameez frames this ROI simply, advising that organizations take steps to identify and solve issues quickly, before they snowball: &quot;The earlier you find problems... you can get to a (solution) much faster before it becomes a much bigger problem.&quot;</p>
<p>Finally, standardize your systems: define the agentic frameworks, the tools, the support models, and, most importantly, have clear rules for promotion from development to a validated, auditable production environment.</p>
<h3>The Next Frontier: Personalized AI</h3>
<p>Looking ahead, the next three to five years promise even greater transformation. We’ll see a rise in personalized agents that tailor interactions and insights to the individual user.<br>
</p>
<p><a href="/content/www/en-us/why-cloudera/enterprise-ai.html">AI models</a> will evolve to optimize for multiple parameters simultaneously. Instead of optimizing just for efficacy, models will suggest molecules that are effective, non-toxic, manufacturable, and shelf-stable, all at once. We may even see the first commercially available drug marketed as “generated by AI.”</p>
<p>Want to learn how to prepare your organization for this future? Listen to the <a href="https://www.healthcareitnews.com/podcast/discover-life-sciences-blueprint-scalable-ai" target="_blank" rel="noopener noreferrer">full conversation</a> with Rameez Chatni for all the details on AI implementation and best practices.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=deliver-repeatable-measurable-and-enterprise-ready-ai-for-life-sciences</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Context Is the Hard Part: Practical Lessons in Building Agentic AI Systems</title><description><![CDATA[Why context engineering is important, and how teams are delivering it]]></description><link>https://www.cloudera.com/blog/business/context-is-the-hard-part-practical-lessons-in-building-agentic-ai-system.html</link><guid>https://www.cloudera.com/blog/business/context-is-the-hard-part-practical-lessons-in-building-agentic-ai-system.html</guid><pubDate>Mon, 29 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Pamela Pan,Navita Sood]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-building-and-windows.jpg"><h2>Why context engineering is important, and how teams are delivering it</h2>
<p>“How do you get the right data, in the right place, at the right time?”&nbsp;</p>
<p>That’s the core challenge behind bringing agentic AI to life in the enterprise. While large language models (LLMs) have unlocked powerful reasoning and orchestration capabilities, their effectiveness hinges on something more foundational: delivering the right business context for reasoning and taking action. Context engineering is a discipline focused on shaping how data, metadata, access policies, and memory come together to guide agent behavior in a secure and explainable way.</p>
<p>At Cloudera, we see this firsthand while partnering with enterprise customers experimenting with new generative AI (GenAI) and agentic AI use cases. Building agentic AI systems depends on something most organizations struggle with: data architectures that capture, govern, and reuse knowledge across the AI lifecycle.</p>
<p>In this blog, we share our approach to building agentic AI systems, which groups foundational capabilities into three buckets: Connect, Contextualize, and Consume. This approach enables our enterprise customers to build intelligent, trusted, explainable, and production-ready agentic systems.</p>
<h3>Connect: Break Down Silos with Control</h3>
<p>Modern AI agents can’t thrive in fragmented environments. However, most enterprises have data that’s spread across multiple clouds, data centers, legacy systems, and inconsistent formats. Exposing that data to an AI system without structure or safeguards leads to performance issues and governance risk.</p>
<p>In successful implementations, we’ve seen organizations focus first on creating a <b>unified data layer</b> that spans environments and formats. This doesn’t mean centralizing all data, but instead stitching it together in a <a href="/content/www/en-us/campaign/the-forrester-wave-data-fabric-platforms-q4-2025.html">data fabric architecture</a>. This provides a unified layer with <b>shared metadata, access policies, federated data engineering, and runtime interoperability.&nbsp;</b></p>
<p>Implementing an open table format and standard API access simplifies data access while delivering flexibility. <a href="/content/www/en-us/products/open-data-lakehouse.html">Open lakehouse</a> architectures matter here because they provide real-time, consistent views of data across engines, especially for agentic workflows that depend on reliable retrieval-augmented generation (RAG) and reasoning.</p>
<h3>Contextualize: Give Agents More Than Access</h3>
<p>After data is connected, the challenge shifts to helping agents understand what data exists and how it's used. That starts with <b>discovery</b>: automatically identifying data sources across cloud and on-premises systems and activating the metadata (table names, fields, formats, and more). Tools like <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> scan ETL scripts, reverse-engineer pipeline logic, and capture how data moves and transforms across systems from source to final destination, including every dependency along the way.</p>
<p>This information forms the basis for <b>lineage</b>, which shows how datasets are related and how they change over time. Lineage matters when you need to validate a result, explain a recommendation or agent action, or trace a broken output to its source. It creates transparency and confidence in the systems with which agents interact.</p>
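<p>Tracing a broken output back to its sources can be pictured as a walk over a lineage graph. The Python sketch below assumes lineage is available as a simple mapping from each dataset to the datasets it is derived from; the dataset names and the <code>trace_sources</code> helper are hypothetical and are not part of any Cloudera product API.</p>

```python
from collections import deque

# Hypothetical lineage edges: each dataset maps to the datasets it is derived from.
UPSTREAM = {
    "exec_dashboard": ["sales_summary"],
    "sales_summary": ["orders_clean", "fx_rates"],
    "orders_clean": ["orders_raw"],
    "orders_raw": [],
    "fx_rates": [],
}


def trace_sources(dataset, upstream):
    """Return every upstream dataset of `dataset`, nearest first (breadth-first)."""
    seen = []
    queue = deque(upstream.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(upstream.get(current, []))
    return seen


print(trace_sources("exec_dashboard", UPSTREAM))
# ['sales_summary', 'orders_clean', 'fx_rates', 'orders_raw']
```

<p>Listing the nearest dependencies first lets an investigator, human or agent, check the most likely culprits before working back to raw sources.</p>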
<p>Finally, <b>cataloging</b> brings this information into a usable structure. A centralized metadata store helps both humans and agents locate what they need, understand relationships between datasets, and surface policies that affect how data should be handled. A strong catalog acts like a blueprint, delivering a knowledge graph that gives agents a clear, navigable map of the enterprise’s data estate. It captures the technical, operational, and business metadata, including the business definitions and business logic required to understand the data and take action.</p>
<p>Contextualization enables agents to do more than retrieve information. It allows them to explore patterns, ask better questions, and make decisions with a deeper understanding of the environment they operate in.</p>
<h3>Consume: Deliver the Right Context at the Right Time</h3>
<p>The final step in building agentic systems involves enabling AI to take action in a way that is traceable, safe, and grounded in the right information. This is where architectural choices matter—guardrails, observability, and controlled access all shape whether agents behave predictably when it counts.</p>
<p>We’ve found it helpful to map common context engineering techniques to the underlying data challenges they’re designed to solve. Here are some examples of how they show up in practice:</p>
<table>
<tbody><tr><td><p><b>Data Readiness Challenge</b></p>
</td>
<td><p><b>Context Engineering Technique</b></p>
</td>
<td><p><b>Cloudera’s Approach</b></p>
</td>
</tr><tr><td><p>Sensitive data leaking into prompts</p>
</td>
<td><p>Prompt engineering</p>
</td>
<td><p>Prompt gateways to redact sensitive data</p>
</td>
</tr><tr><td><p>Messy, unstructured data or outdated vector indexes</p>
</td>
<td><p>RAG</p>
</td>
<td><p>Governed and secure real-time streaming data pipelines</p>
</td>
</tr><tr><td><p>Lack of lineage, brittle training sets</p>
</td>
<td><p>Fine tuning</p>
</td>
<td><p>Improve AI explainability with lineage tracking</p>
</td>
</tr><tr><td><p>Agents overstepping, opaque decisions</p>
</td>
<td><p>Tool/API access</p>
</td>
<td><p>Metadata tagging, autonomous data classification, fine-grained access and full audit trails on every system call</p>
</td>
</tr><tr><td><p>Agents unable to access internal enterprise knowledge</p>
</td>
<td><p>Model Context Protocol (MCP)</p>
</td>
<td><p>Controlled access to Apache Iceberg-backed context with REST catalogs</p>
</td>
</tr></tbody></table>
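<p>To illustrate the first row of the table, here is a minimal Python sketch of the kind of redaction step a prompt gateway might apply before a prompt reaches a model. The two patterns and the <code>redact</code> function are illustrative assumptions rather than a description of Cloudera's implementation; a production gateway would need far broader coverage and testing.</p>

```python
import re

# Illustrative patterns only; a production gateway would need far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt):
    """Replace matches of each pattern with a [LABEL] placeholder."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


print(redact("Refund jane.doe@example.com, SSN 123-45-6789, order 4471."))
# Refund [EMAIL], SSN [SSN], order 4471.
```

<p>Placeholders keep the sentence structure intact, so the model can still reason about the request without seeing the sensitive values.</p>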
<p>Choosing the right technique depends on the agent’s role, data sensitivity, and operational environment. Below are common enterprise use cases and the recommended combinations that have worked well in practice:</p>
<table>
<tbody><tr><td><p><b>Use Case</b></p>
</td>
<td><p><b>Recommended Method(s)</b></p>
</td>
</tr><tr><td><p>Internal knowledge assistant</p>
</td>
<td><p>RAG + vector DB + prompt engineering fallback</p>
</td>
</tr><tr><td><p>Sales enablement bot with customer relationship management (CRM) data</p>
</td>
<td><p>Function calling + business context injection</p>
</td>
</tr><tr><td><p>Product-specific support agent</p>
</td>
<td><p>Fine-tuning or RAG + MCP shared context</p>
</td>
</tr><tr><td><p>Data analytics multi-agentic workflow to extract insights&nbsp;</p>
</td>
<td><p>LangGraph + MCP + tool access + chunked memory</p>
</td>
</tr><tr><td><p>Document understanding (PDF, Excel)</p>
</td>
<td><p>Multi-modal inputs + preprocessing pipelines</p>
</td>
</tr></tbody></table>
<p>This approach to consumption ensures agents are operating with precision, security, and alignment to business goals.</p>
<h3>Takeaways: From Framework to Action</h3>
<p>At Cloudera, we’ve spent years navigating the complexities of enterprise data: bridging silos, enforcing governance, building secure pipelines for AI and analytics, and surfacing lineage across hybrid environments. So when agentic AI patterns began emerging, we weren’t starting from scratch. We knew where context lives, and how to capture it safely and securely with the right guardrails.</p>
<p>With <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a>, teams can automatically map data flows, trace dependencies, and catalog metadata across cloud and on-premises environments. With data catalogs, observability, and access control layered in, agents can interact with systems more safely and intelligently. Teams gain visibility, governance, and trust, all critical for scaling these workflows across the enterprise.</p>
<p>To make these pieces actionable, we’ve integrated these capabilities into our <a href="/content/www/en-us/products/open-data-lakehouse.html">Open Data Lakehouse</a> and <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Studios</a>, giving enterprises the foundation to design, deploy, and manage secure agentic systems in production.</p>
<p>Learn more about how <a href="/content/www/en-us/why-cloudera.html">Cloudera</a> can help you productionize your AI agents with the business context they need.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=context-is-the-hard-part-practical-lessons-in-building-agentic-ai-system</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Grows Recognition as Great Place to Work</title><description><![CDATA[At Cloudera, our people are the heart of our innovation. That’s what makes recognitions like the Great Places to Work so important to us as an organization.]]></description><link>https://www.cloudera.com/blog/culture/cloudera-grows-recognition-as-great-place-to-work.html</link><guid>https://www.cloudera.com/blog/culture/cloudera-grows-recognition-as-great-place-to-work.html</guid><pubDate>Fri, 19 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-people-talking.jpg"><p>At Cloudera, our people are the heart of our innovation. That’s what makes recognitions like the Great Places to Work so important to us as an organization. <a href="https://www.greatplacetowork.com/about" target="_blank">Great Places to Work</a> recognizes organizations that put an emphasis on employee well-being and professional growth. These priorities are deeply aligned with Cloudera’s ongoing commitment to put employees first and foster a collaborative work environment. At Cloudera that means building a workplace where everyone can feel included, supported, and given opportunities to grow and learn.&nbsp;</p>
<p>This year has been an exciting one, with multiple recognitions and certifications from offices around the globe. Over the course of 2025, the company has secured Great Places to Work certifications across regions, including our offices in Ireland, Singapore, Costa Rica, Spain, Italy, and France. Here’s what our offices have earned so far:</p>
<p>Costa Rica</p>
<ul>
<li><p>Best Places to Work Costa Rica (#6 Place)</p>
</li>
<li><p>Best Place to Work By Employee Quantity (20-100, #7 Place)</p>
</li>
</ul>
<p>Spain&nbsp;</p>
<ul>
<li><p>1st Time Certifying, Best Workplaces&nbsp;</p>
</li>
<li><p>Best Workplaces in Tech (#2 Place)&nbsp;</p>
</li>
</ul>
<p>Italy</p>
<ul>
<li><p>1st Time Certifying, Best Workplaces&nbsp;</p>
</li>
</ul>
<p>France</p>
<ul>
<li><p>1st Time Certifying, Best Workplaces</p>
</li>
</ul>
<p>Ireland</p>
<ul>
<li><p>Best Workplace in Ireland (Med) (#1 Place)&nbsp;</p>
</li>
<li><p>Best Small/Med Workplace in Europe (#13 Place)</p>
</li>
<li><p>Best Workplace for Women</p>
</li>
<li><p>Best Workplace for Health &amp; Wellbeing</p>
</li>
</ul>
<p>Singapore&nbsp;</p>
<ul>
<li><p>Best Place to Work (Small) (#3 Place)&nbsp;</p>
</li>
</ul>
<p>“Cloudera is a global leader in data management and enterprise AI, and that leadership is fueled by the incredible talent and commitment of our people,” said Amy Nelson, Chief Human Resources Officer. “As our global footprint expands, these Great Places to Work certifications mark important milestones and reinforce the culture and dedication our teams bring to life every day.”</p>
<p>To celebrate Cloudera’s commitment to employee well-being, we interviewed country leaders from the teams recognized to discuss what being certified as a Great Place to Work means to them.&nbsp;</p>
<p>Watch our video celebrating these honors.&nbsp;&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-grows-recognition-as-great-place-to-work</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The 3 Eras of Women Leaders in Technology: Mary Wells’ Perspective</title><description><![CDATA[The conversation about women in technology has changed a lot over the years. What began as a push for visibility has become something much bigger: a story about representation, allyship, and influence.]]></description><link>https://www.cloudera.com/blog/culture/the-3-eras-of-women-leaders-in-technology-mary-wells-perspective.html</link><guid>https://www.cloudera.com/blog/culture/the-3-eras-of-women-leaders-in-technology-mary-wells-perspective.html</guid><pubDate>Wed, 17 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1255077848.jpg"><p>The conversation about women in technology has changed a lot over the years. What began as a push for visibility has become something much bigger: a story about representation, allyship, and influence.</p>
<p>Mary Wells, Chief Marketing Officer at Cloudera, has had a front-row seat for that evolution. Over a career of more than 25 years at some of the biggest names in tech, she’s seen firsthand how women’s roles and voices have transformed. As the executive sponsor of Cloudera’s Women Leaders in Technology (WLIT) initiative, she helps foster that next stage of growth: creating space for women and allies to learn, lead, and lift each other up.&nbsp;</p>
<p>Drawing from her experience, Mary describes the evolution of women leading in technology through three eras. Each builds on the one before it, with a new era just beginning to take shape.&nbsp;</p>
<h2>Era One: Representation and Belonging</h2>
<p>A couple of decades ago, progress meant simply being seen.&nbsp;</p>
<p>Many women in tech were “the only one”—the only woman in a department, on a project team, or even in an entire building. These pioneers faced the dual challenge of doing their jobs while also proving they belonged.&nbsp;&nbsp;</p>
<p>In a recent interview, Mary reflected on her experiences during this era with informal meetups for women in tech at various company and industry events. In hindsight, she sees these as the early grassroots versions of today’s more formal women-in-tech support networks.&nbsp;&nbsp;</p>
<p>Mary recalls women sharing stories of being the only woman on their floor or in their department. Some left those WLIT conversations with other leaders (who happen to be women) in tears—not from sadness, but from relief. For many, it was the first time they realized they weren’t alone in their workplace struggles. Seeing their experiences reflected in others created a sense of representation and belonging.&nbsp;</p>
<p>Simple conversations broke the feelings of isolation, creating a sense of solidarity. Women worked together to listen, encourage, and prove that belonging was a form of strength.&nbsp;</p>
<p>During this era, peer communities gave women the courage to take a seat at the table and stay.&nbsp;</p>
<h2>Era Two: Confidence and Voice</h2>
<p>Once women had a place in the room, the conversation started to shift. It wasn’t enough to just be present. It was time to meaningfully participate.&nbsp;</p>
<p>That’s why this second era of women leading in tech can be characterized by confidence. Women started searching for ways to use their voices, influence decisions, and lead authentically. Mary recalls that about ten years ago, the questions she heard most often centered on self-doubt. Women were asking, “How do we make our presence count?”&nbsp;</p>
<p>At the time, “imposter syndrome” became the go-to phrase to describe the gap between physically being in the room and truly feeling like you belonged there.&nbsp;</p>
<p>But as time went on, she sensed this was a misnomer. Imposter syndrome wasn’t just a women’s issue. Everyone experiences self-doubt at some point. The key isn’t to wait until it disappears, but to move forward anyway. For her, confidence often begins with courage. “Do it afraid,” she tells colleagues, as a reminder that stepping out of your comfort zone usually means you’re growing.&nbsp;</p>
<p>This was the era when women stopped waiting for permission to lead and began shaping conversations of their own.&nbsp;</p>
<h2>Era Three: All are Welcome Through Allyship and Partnership</h2>
<p>This third era is about allyship and shared responsibility. It’s no longer just a “women’s issue”—today, all are welcome. Men and women alike are working to build teams that reflect the diversity of the world around them.&nbsp;&nbsp;</p>
<p>Mary has seen this shift firsthand. At a recent women-in-tech-leadership panel during an event in London, she looked out at a crowd that was nearly 60% men. For her, seeing allies and a broader peer group actively listening to these challenges captured how far the conversation had come.&nbsp;</p>
<p>She recalls a moment when a male colleague questioned why forums like WLIT were needed, and another man quickly stepped in to say, “Look around the table,” implying that for most in attendance, the answer was obvious. That kind of allyship, Mary notes, gives the conversation credibility and momentum.&nbsp;&nbsp;</p>
<p>Progress now depends on everyone showing up, listening, and lifting others along the way.&nbsp;</p>
<h2>The Emerging Era: Leadership and Influence&nbsp;</h2>
<p>A new chapter is already unfolding, and this next era is about influence: ensuring women aren’t just part of the conversation about the future of tech but are helping to define it. The WLIT sessions throughout Cloudera’s global EVOLVE event series offer a vivid example of what this new era looks like in practice.</p>
<p>Under the theme “Accelerate Action, Accelerate Innovation,” WLIT brought together leading voices across industries to explore topics ranging from adaptive leadership to responsible AI. Across four events, we saw over 300 external registrants and nearly 200 attendees, demonstrating a strong appetite for these crucial conversations.&nbsp;</p>
<p>Together we discussed:</p>
<ul>
<li><p>Leading with governance and transparency (inspired by the rules of robotics)</p>
</li>
<li><p>Shaping a responsible AI future that people are excited to engage with</p>
</li>
<li><p>Cultivating adaptive, human-first leadership styles</p>
</li>
</ul>
<p>The feedback from these sessions reflects how resonant and needed these conversations are.&nbsp;</p>
<p>One attendee shared:</p>
<p style="text-align: center;"><i>“The WLIT panel from NY was honestly one of the most refreshingly honest and engaging panels I’ve seen. The diversity of thought and representation was great!”</i></p>
<p>For Mary, the WLIT sessions at <b>EVOLVE</b> demonstrate how influence becomes impact, a natural evolution of the journey. The focus is no longer on women proving they belong in tech leadership; it’s on leading, as equals, the conversations that will shape the future. The goal isn’t to be seen as “women leaders” anymore. Instead, we’d rather simply be seen as leaders.&nbsp;&nbsp;</p>
<h2>Looking Ahead: Women Leading in Technology</h2>
<p>Each era has paved the way for the next. Belonging built confidence, confidence created allyship, and allyship is leading to influence. That fourth era, Mary says, is already taking shape.</p>
<p>The story of women leading in technology is still being written. It’s a story of resilience, courage, and connection. Of people who chose to lift one another up rather than climb alone.&nbsp;</p>
<p>At Cloudera and across the industry, leaders like Mary Wells remind us that progress is about using our seat at the table to make space for others and to shape what comes next.&nbsp;</p>
<p>Experience the impact of Women Leaders in Technology at <b>EVOLVE</b>25 today:</p>
<p>Want to learn more? Check out our&nbsp;<a href="/content/www/en-us/about/women-leaders-in-technology.html">Women Leaders in Tech page</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-3-eras-of-women-leaders-in-technology-mary-wells-perspective</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Patrick Moorhead Insights: Overinvest in Data to Scale AI</title><description><![CDATA[In this episode of The AI Forecast, host Paul Muller sits down with Patrick Moorhead for a wide-ranging conversation on the evolution of AI and AI-powered scalability.]]></description><link>https://www.cloudera.com/blog/business/patrick-moorhead-insights-overinvest-in-data-to-scale-ai.html</link><guid>https://www.cloudera.com/blog/business/patrick-moorhead-insights-overinvest-in-data-to-scale-ai.html</guid><pubDate>Tue, 16 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-building-mirror.webp"><h2>Patrick Moorhead Insights: Overinvest in Data to Scale AI</h2>
<p>Few people have had a front-row seat to more technological revolutions than Patrick Moorhead. As founder, CEO, and chief analyst at Moor Insights &amp; Strategy, he’s spent decades tracking the intersection of hardware, software, and business transformation.</p>
<p>In this episode of The AI Forecast, host Paul Muller sits down with Patrick for a wide-ranging conversation on the evolution of AI—from lessons learned during the dot-com era to the rise of <a href="/content/www/en-us/products/unified-data-fabric.html">hybrid multi-cloud fabrics</a> and the future of human-machine collaboration.</p>
<p>Here are the key takeaways from the conversation.</p>
<h3>Comparing the Dot-Com Era to Today’s AI Moment</h3>
<p><b>Paul:</b> I was listening to Scott Galloway and Ed Elson’s podcast a few days ago, and they were likening the level of exuberance and, frankly, even the level of dealmaking we’re seeing in the AI marketplace at the moment to the dot-com era. And we all know how that ended: the internet won, but it didn’t get there via a straight line. How does today’s AI moment compare to past waves of innovation that you’ve seen?</p>
<p><b>Patrick:</b> I am more comfortable about this than dot-com. When I was part of dot-com, it was, oh my gosh, I’m losing 35 bucks a bag on dog food and VCs are triple investing in the same businesses. It was like putting multiple chips on a craps table and it was pretty clear that that wasn’t going to work.</p>
<p>I like to call it the law of “if thens”—and the law of if thens said, okay, I’m building all this dark fiber and all this capability. If I can get a service that distributes video over the internet… If I can have a PC connected to a DSL or cable modem… If I have gaming that is not multiplayer yet… then yeah, we can fill up these pipes. That’s too many “if thens.”</p>
<p>Today, if I have a web browser, what I can do is absolutely amazing. All the agents that I have running are through a web browser. Now, don't get me wrong, <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">enterprise AI</a> is challenging, but it’s already making a difference. You can see it touches more than the internet did. This is touching healthcare. This is touching consumers. This is touching entertainment. This is touching every form of personal productivity. It’s touching code development. Google said that they're doing 20% of all their code based on AI. That is absolutely mind blowing.</p>
<h3>Garbage In, Garbage Out—Still True in 2025</h3>
<p><b>Paul:</b> What are some of the best practices you’ve learned as someone who works with data all the time?</p>
<p><b>Patrick: </b>The most important thing I learned, I learned in 1984 in my first computer class—and that was garbage in, garbage out. And it has never changed.&nbsp;</p>
<p>If you look at GenAI today, it’s amplified. Your data has to be that much better to get a good decision. If you have the right workload and the right model, the biggest impediment to enterprise AI success is having the right data. I think it’s one of the reasons that Cloudera is such an important company.</p>
<p><b>Paul:</b> What are you seeing happen in the C-suite and at the boardroom table when it comes to recognizing and addressing the challenges of data quality and fragmentation?</p>
<p><b>Patrick: </b>The successful companies really do have a proper data management strategy—bringing multivariate data in multiple formats, making sure it’s clean, tagged correctly, and secure. We’ve been talking about having a data management strategy for decades, and this time, it matters.</p>
<h3>The Hybrid Future: Why Optionality Wins</h3>
<p><b>Paul:</b> The idea of being able to get to all your data everywhere all the time is going to be critical to connecting the dots. Because let's face it, when you're in the boardroom as an executive, the whole point of getting all those various functions together in one room was to try and assemble all the various experiences and data points that you had across the business to create a cohesive business decision. So, I suppose this hybrid notion—that you’re going to have to be able to get to data no matter where it is—you were talking about this 10 years ago.</p>
<p><b>Patrick:</b> I was the analyst who was the cloud denier, saying that hybrid was going to be the way to go. Enterprises want optionality. There are things where they want to leave the driving to someone else, and there are some things they want to control.</p>
<p>Even a 15-year-old company has an Oracle database, an SAP implementation, a mixture of on-prem, public cloud, enterprise SaaS, and now sovereign cloud. You must be able to work across all of those.</p>
<p>If you try to copy all of your data into one giant place—it’s impossible. The cost to copy and assemble it all is a complete and utter failure. That’s why I came up with this idea that the future is going to be about hybrid multi-cloud fabrics—whether it’s security, data, compute, or automation.</p>
<p>You want to choose vendors that operate in every modality. Otherwise, you’ll be playing whack-a-mole until the cows come home.</p>
<p><b>Paul: </b>For boards preparing for AI at scale—what’s your advice?</p>
<p><b>Patrick:</b> Overinvest in data. Spend more money than you think you need. Don’t do it alone—find a partner that has hybrid multi-cloud fabrics. If you find a partner that’s cloud-only or on-prem-only, you’ve failed.</p>
<p>Catch the full conversation with Patrick Moorhead on The AI Forecast on <a href="https://open.spotify.com/episode/5vJuqs9vMaoPCgJFputIba?si=KbUSRoXTS4-_LH6YejqlFw" target="_blank" rel="noopener noreferrer">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/fueling-the-ai-future-data-deployment-and/id1779293119?i=1000731958218" target="_blank" rel="noopener noreferrer">Apple Podcasts</a>, and <a href="https://www.youtube.com/watch?v=O2vGICTnxbw" target="_blank" rel="noopener noreferrer">YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=patrick-moorhead-insights-overinvest-in-data-to-scale-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera’s Top Takeaways from AWS re:Invent 2025</title><description><![CDATA[AWS re:Invent is a technology conference that, for most of the world, needs no introduction. This year, more than 60,000 technology professionals descended on the Venetian in Las Vegas for the week to share best practices, hear about the latest innovations in cloud technology, and discuss infrastructure, data, and AI strategy. ]]></description><link>https://www.cloudera.com/blog/partners/clouderas-top-takeaways-from-aws-re-invent-2025.html</link><guid>https://www.cloudera.com/blog/partners/clouderas-top-takeaways-from-aws-re-invent-2025.html</guid><pubDate>Thu, 11 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Jeremiah Morrow]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-people-talking-meeting.webp"><p><a href="https://reinvent.awsevents.com/" target="_blank" rel="noopener noreferrer">AWS re:Invent</a> is a technology conference that, for most of the world, needs no introduction. This year, more than 60,000 technology professionals descended on the Venetian in Las Vegas for the week to share best practices, hear about the latest innovations in cloud technology, and discuss infrastructure, data, and AI strategy.&nbsp;</p>
<p>Cloudera had a big presence on the expo floor and throughout the week in sessions. It was our busiest re:Invent ever. Here are some of the top takeaways, stories, and announcements from the week, and what they mean for our customers.</p>
<h2>It’s (Still) All About AI</h2>
<p>In <a href="/content/www/en-us/blog/partners/key-takeaways-from-aws-reinvent-2024.html">last year’s recap</a>, our top takeaway was the pervasiveness of AI across the event. This year was much the same. Booth conversations, announcements, sessions, and demos were centered around nearly universal questions from attendees: how can we deploy AI in production safely and securely, and how can we build AI that we trust to run core parts of our business?</p>
<p>While everyone was talking about the promise of AI, many of the execution challenges lay further down the stack, in the data infrastructure. In the largest enterprises, data is still often siloed in various systems, clouds, and data centers, making it difficult to find, unify, secure, govern, and provide access to that data for analytics and AI.&nbsp;</p>
<p>Ultimately, distributed environments are inevitable, and hybrid architectures are the likely result. The goal is to leverage a platform that can apply a common set of security and governance policies across distributed data stores and enable portability so customers don’t have to care where their data lives. Taking a “cloud and data anywhere for AI everywhere” approach can solve many of the challenges customers face in delivering on their AI vision.</p>
<h2>AWS Announcements</h2>
<p>Amazon Web Services (AWS) made several announcements that Cloudera customers should know about.&nbsp;</p>
<p><a href="https://www.aboutamazon.com/news/aws/aws-graviton-5-cpu-amazon-ec2" target="_blank" rel="noopener noreferrer">AWS Graviton5 release</a>. Last year, <a href="/content/www/en-us/blog/partners/cloudera-and-aws-partner-to-deliver-cost-efficient-and-sustainable-infrastructure-for-ai-and-analytics.html">we wrote a blog post</a> announcing support for AWS Graviton, which provides customers with cheaper, more efficient, and more sustainable compute power. Now, according to AWS’s benchmarks, AWS Graviton5 delivers 25% higher performance than the previous generation, and several customers have reported significant performance improvements.</p>
<p><a href="https://aws.amazon.com/blogs/aws/improve-model-accuracy-with-reinforcement-fine-tuning-in-amazon-bedrock/" target="_blank" rel="noopener noreferrer">New capabilities in Amazon Bedrock</a>. Many Cloudera customers use Amazon Bedrock for AI model development. AWS announced reinforcement fine-tuning within Amazon Bedrock, which enables customers to create more accurate models that learn from feedback and deliver better business results. Reinforcement fine-tuning is automated, so even developers who aren’t machine learning (ML) experts can use the tool.</p>
<p><a href="https://aws.amazon.com/blogs/aws/amazon-s3-vectors-now-generally-available-with-increased-scale-and-performance/" target="_blank" rel="noopener noreferrer">Amazon S3 Vectors is now GA</a>. Amazon S3 Vectors is the first cloud object storage to support the storing and querying of vector data. Now, customers can run and query vector-based AI and ML workloads directly on Amazon S3 without moving the data into a specialized vector database. Amazon S3 Vectors integrates with Amazon Bedrock, further streamlining AI workloads in AWS.</p>
<p><b>Sovereign cloud</b> was another theme of the conference, and a critical point for many customers dealing with sensitive data or regulatory concerns. Ultimately, <a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-10-09-cloudera-leverages-aws-to-deliver-a-sovereign-ready-data-and-ai-platform-as-a-launch-partner-for-the-aws-european-sovereign-cloud.html">Cloudera and AWS are working together</a> to ensure customers have access to the cloud innovation they need to be successful with AI while being excellent stewards of their customers’ data.</p>
<h2>Cloudera Sessions</h2>
<p>Cloudera experts and customers presented at several sessions throughout the week. Here are a few of the topics we covered, with links to the videos where they are available.</p>
<p><a href="https://www.youtube.com/watch?v=WNqMkWqsdII" target="_blank" rel="noopener noreferrer">From Data to Action: Agentic-Powered Humanitarian Response</a>. This session focuses on how Mercy Corps used Cloudera’s “Data Anywhere for AI Everywhere” approach to build MercyCORE, an agentic AI platform that enables the delivery of mission-critical insights to support humanitarian aid and revolutionize crisis response.</p>
<p><a href="https://www.youtube.com/watch?v=maWlvpbF3n8" target="_blank" rel="noopener noreferrer">Powering Credit Risk Modernization with Cloudera and AWS</a>. Most financial services institutions still struggle with manual processes, especially when they need to work with sensitive data. This session focuses on how financial institutions can integrate reasoning models and agentic patterns to arrive at a credit decision efficiently while maintaining compliance.</p>
<p><b>Hands-On Lab: Build an AI Agent with Cloudera Agent Studio.</b> In this hands-on session, we introduced attendees to Cloudera Agent Studio, part of <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Studios</a>, and walked them through a workflow to build their own agents.&nbsp;</p>
<h2>Cloudera Booth Demos</h2>
<p>Cloudera booth demos highlighted the depth and breadth of our platform capabilities, and our ability to support our customers’ AI journeys across a diverse set of organizational requirements. While we covered many data and AI topics, here are some of the highlights:</p>
<p><b>Building a Knowledge Graph with Cloudera.</b> Knowledge graphs solve the two biggest problems with generalized large language models (LLMs) for business use: they are inaccurate and they’re not deterministic. By giving the data relational context through a knowledge graph and making that context available to the model through graph retrieval-augmented generation (GraphRAG), we can produce more accurate, more deterministic results from AI. <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> supports this process by unifying the underlying data across silos and systems so models have the best context available.&nbsp;</p>
<p><b>Cloudera Octopai Data Lineage.</b> Data lineage is one of the biggest challenges and opportunities in data governance. Often, the large organizations we talk to don’t know where all their data lives. The first step in building a unified view of data is understanding where it lives and how it flows across the organization, and <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> makes it easier than ever to see the full view of your data estate.</p>
<p><b>Data in Motion.</b> The need for real-time insights has never been greater. Automating operational workflows often requires action at the point of ingestion–well before the data lands in a data lake. <a href="/content/www/en-us/resources/faqs/data-in-motion.html">Cloudera Data in Motion</a> enables organizations to ingest, process, and analyze data in true real time, and customers are using these capabilities for everything from network monitoring and automation to fraud detection to cybersecurity.</p>
<p><b>Cloudera AI Inference Service, powered by NVIDIA on AWS.</b> Although most organizations have deployed AI in some capacity, many are struggling to move from experimentation to reliable, scalable deployment. The first step in operationalizing AI is having a trusted, high-performance inference layer that can serve models consistently across teams and environments. <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference Service</a> makes it easier than ever to deploy, scale, and govern AI workloads with enterprise-grade speed and control.</p>
<p><b>AI-Powered Knowledge Base on Cloudera with AMD EPYC on AWS.</b> Making data searchable and turning unstructured documents into usable knowledge is a common AI use case, but organizations struggle to do so in a cost-effective manner without relying on large, GPU-intensive models. <a href="/content/www/en-us/partners/solutions/amd.html">Small language models (SLMs) on Cloudera</a>—powered by AMD EPYC™ CPUs on AWS—make it easy to build a secure, high-performance knowledge base, delivering fast semantic search with no GPUs required.</p>
<p><b>Protegrity Banking Demo: Secure GenAI &amp; Analytics on Cloudera with AWS.</b> Protegrity and Cloudera have showcased a <a href="/content/dam/www/marketing/resources/whitepapers/how-regulated-firms-can-have-it-all-cloud-ai-and-security.pdf">banking solution</a> that secures sensitive financial data on the Cloudera platform. By integrating Protegrity’s data-centric protection with Amazon S3, the solution establishes granular, role-based access controls. This approach keeps data protected by default, empowering enterprises to confidently scale their AI and analytics pipelines while adhering to strict compliance mandates.&nbsp;</p>
<h2>Next Steps</h2>
<p>AWS re:Invent 2025 reinforced what we’ve been hearing from many of our customers: <a href="/content/www/en-us/campaign/the-state-of-enterprise-ai-and-modern-data-architectures.html">everyone is at least experimenting with AI</a>. Many organizations are under pressure to show value from their AI projects, and building AI securely, reliably, and in a cost-efficient way is mission-critical.&nbsp;</p>
<p>Our partnership with AWS, and our joint vision for a data and AI ecosystem that ensures unified, secure, governed access to organizational data, is what enterprises need to succeed with their AI initiatives.&nbsp;</p>
<p>If you’re ready to see Cloudera on AWS in action, check out our <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">5-day trial</a>, which features many of the capabilities we showcased at re:Invent.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderas-top-takeaways-from-aws-re-invent-2025</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Has the Recent Acquisition Put Your Streaming Data on Lockdown?</title><description><![CDATA[The competitive landscape of enterprise data management is undergoing a significant reshaping following IBM’s announcement of its planned acquisition of Confluent, a data streaming platform.]]></description><link>https://www.cloudera.com/blog/business/has-the-recent-acquisition-put-your-streaming-data-on-lockdown.html</link><guid>https://www.cloudera.com/blog/business/has-the-recent-acquisition-put-your-streaming-data-on-lockdown.html</guid><pubDate>Tue, 09 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Katie Gdula]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-blurred-lights-freeway.jpg"><h2>Unlock The Power of Agnostic Streaming for Enterprise AI</h2>
<p>The competitive landscape of enterprise data management is undergoing a significant reshaping following <a href="https://newsroom.ibm.com/2025-12-08-ibm-to-acquire-confluent-to-create-smart-data-platform-for-enterprise-generative-ai" target="_blank" rel="noopener noreferrer">IBM’s announcement</a> of its planned acquisition of Confluent, a data streaming platform. The deal is valued at $11 billion—a staggering price tag that validates two key components of a modern data strategy:</p>
<ol>
<li>Real-time data streaming is no longer a luxury, but the indispensable foundation for the next generation of AI agents, intelligent applications, and true business automation.&nbsp;<br>
</li>
<li><a href="/content/www/en-us/resources/faqs/data-in-motion.html">Data in motion</a> is a critical layer in integrated data and AI platforms—something Cloudera has been offering our customers for many years now.</li>
</ol>
<p>Additionally, IBM’s acquisition of an independent vendor suggests a market trend toward consolidation, where vendors strive for comprehensive control over the data lifecycle—from ingestion to serving. Importantly, this trend doesn’t always align with customer needs: there are many use cases where organizations need an agile, “drop-in” data-in-motion solution, independent of a data and AI platform, deployable anywhere real-time streaming analytics, insight, and reasoning are needed.</p>
<p>This shift toward consolidation has many implications for organizations interested in an independent data-in-motion solution, such as an operator for Kubernetes. In this blog, we’ll look at a few key considerations and what a shift toward vendor lock-in could mean for your data estate.</p>
<h2>The Hidden Cost of Consolidation</h2>
<p>Prior to the acquisition announcement, Confluent was well-known in the market as an open, independent, and cloud-agnostic data streaming solution. For organizations needing to get real-time data to a place where AI can be applied as quickly as possible, adopting a solution like Confluent was an easy choice.</p>
<p>Now, organizations that chose Confluent for its openness and flexibility face the possibility of vendor lock-in. Will this once-independent streaming vendor become a pipeline designed primarily to feed the new parent company’s broader, heavier platform? Will your once agile data-in-motion solution suddenly come with the heavyweight baggage of an entire enterprise stack that you neither want nor need?</p>
<p>This fear is valid: the reality is that when a tech giant consumes a smaller, more focused vendor, priorities inevitably shift.&nbsp;</p>
<p>Now is the time to ask: Do you want your real-time data strategy, the lifeblood of your AI future, tied to a single, proprietary ecosystem? Or do you need a solution built for openness, free of dependency on any specific vendor platform, that integrates into your existing data ecosystem with complete platform independence?</p>
<h2>The Power of an Independent Data-in-Motion Solution</h2>
<p>If the IBM-Confluent news has you concerned about the future direction, feature focus, or pricing of your current data-in-motion investment, consider an <a href="/content/www/en-us/blog/technical/cloudera-container-service-built-in-security-and-smarter-cost-control.html">independent, managed, containerized alternative</a>.&nbsp;</p>
<p>Cloudera’s data-in-motion solution is available both as an integrated part of our platform and as an <a href="/content/www/en-us/blog/business/accelerating-deployments-of-streaming-pipelines-announcing-data-in-motion-on-kubernetes.html">independent operator for Kubernetes</a>. While your use case will dictate which is the better option for you, here are a few of the benefits associated with using an independent solution focused purely on data in motion:</p>
<ul>
<li><p><b>Platform Independence: Your Data, Your Cloud.</b> Cloudera’s data-in-motion operators are engineered from the ground up to be platform agnostic. You can run your critical, real-time pipelines—Kafka, Flink, and more—on any public cloud, on-premises data center, or hybrid environment without penalty. This means you can focus on moving and processing data, not on migrating to a vendor’s preferred ecosystem.</p>
</li>
</ul>
<ul>
<li><p><b>Faster Innovation, Less Bloat.</b> Cloudera natively incorporates the three pillars of data in motion—Apache Kafka, Flink, and NiFi—giving you a complete, visual, drag-and-drop environment for building and running efficient streaming analytics, data flow, ingestion, and routing. For example, because Kafka alone has historically not been the most efficient at data flow processing, Cloudera delivers operators for a combined Kafka/Flink solution as well as a NiFi flow-based engine operator that easily integrates with Kafka/Flink.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Data in Motion Your Way–Not Dictated by a Vendor.</b> Cloudera offers an independent, enterprise-grade Operator for Kubernetes for data in motion that can manage the full suite of real-time needs. And all this can easily integrate with the rest of the Cloudera platform for a full lifecycle option, secure and governed, from edge to generative AI.</p>
</li>
</ul>
<h2>Conclusion</h2>
<p>We’re here for the customers who believe their data strategy should not be constrained by a single vendor's all-encompassing platform ambitions. We are here for the fastest path to production for your real-time pipelines, driven by open source, and delivered with the freedom to deploy and run anywhere.</p>
<p>If you’re worried about the heavy hand of IBM changing your Confluent experience, or if you simply believe your data-in-motion solution should work everywhere you have data, talk to us. We’re ready to help you navigate this changing landscape and put your data back in motion, on your terms. <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html?internal_keyplay=ALL&amp;internal_campaign=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;cid=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;internal_link=WWW-Nav-u01">Try our five-day trial on AWS</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=has-the-recent-acquisition-put-your-streaming-data-on-lockdown</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Integrate Agentic Workflows Using Cloudera AI Workbench MCP Server</title><description><![CDATA[Learn how Cloudera AI Workbench MCP Server enhances agentic workflows to list projects, upload files, and run jobs securely.]]></description><link>https://www.cloudera.com/blog/technical/integrate-agentic-workflows-using-cloudera-ai-workbench-mcp-server.html</link><guid>https://www.cloudera.com/blog/technical/integrate-agentic-workflows-using-cloudera-ai-workbench-mcp-server.html</guid><pubDate>Thu, 04 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Patrick Hunt,Peter Ableda,Khauneesh Saigal]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1252473576.png"><p><small><i>Figure 2. Cloudera Workbench MCP Server: Security by Design</i></small></p>
<p>&nbsp;</p>
<h2>How to Get Started with Cloudera MCP Server</h2>
<p>Cloudera MCP Server is designed to help your assistants interact directly with your platform, all while operating within your established governance.</p>
<p>Getting started is a straightforward process:</p>
<ol>
<li><b>Configure the server:</b> Run the open-source server in Docker, providing your Cloudera AI Workbench host and API key as secrets</li>
<li><b>Connect your client:</b> Point your preferred MCP client (like Cloudera Agent Studio) to the server using its STDIO command</li>
<li><b>Make your first request:</b> You can test the connection by asking your assistant to “list my projects”</li>
</ol>
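<p>As a rough sketch of steps 1 and 2, the Python snippet below builds a client configuration entry in the <code>mcpServers</code> shape that several MCP clients accept for STDIO servers. The image name, the environment variable names, and the server label are placeholders assumed for illustration, not values taken from the Cloudera repository; check its README for the real ones.</p>

```python
import json


def build_stdio_server_entry(image, env_names):
    """Return a STDIO launch entry that runs the server in Docker.

    `docker run -i --rm` keeps stdin open for STDIO transport and removes
    the container on exit. Each `-e NAME` (no value) tells Docker to copy
    NAME from the caller's environment, so secrets never appear in argv.
    """
    args = ["run", "-i", "--rm"]
    for name in env_names:
        args += ["-e", name]
    args.append(image)
    return {"command": "docker", "args": args}


# Hypothetical placeholders: none of these names come from the project docs.
IMAGE = "example/cai-workbench-mcp-server:latest"
ENV_NAMES = ["WORKBENCH_HOST", "WORKBENCH_API_KEY"]

config = {
    "mcpServers": {
        "cloudera-workbench": build_stdio_server_entry(IMAGE, ENV_NAMES),
    }
}

if __name__ == "__main__":
    # Paste the printed JSON into your MCP client's server configuration.
    print(json.dumps(config, indent=2))
```

<p>Passing only the variable names keeps the host and API key out of the command line, in line with the credential guidance for the server.</p>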
<h3>Example Workflows</h3>
<p>Here are some examples of tasks you can perform through an assistant connected to the Cloudera MCP Server:</p>
<ul>
<li><p>List all my active projects and show me any jobs that are still running</p>
</li>
<li><p>Upload the new-data-august.zip file to the “fraud-detection” project</p>
</li>
<li><p>Create a job using the train-v3.py script, give it 2 CPUs and 8GB of memory, and run it</p>
</li>
<li><p>Log these metrics to the experiment named “resnet-sweep” and tag the run with “new-data”</p>
</li>
<li><p>Take the latest model build and deploy it to the staging endpoint</p>
</li>
<li><p>Restart the “gradio-demo” application</p>
</li>
</ul>
<p>The server includes tools to support these workflows across the project lifecycle, including file management, job execution, experiment tracking, model deployment, and application management.</p>
<h2>Learn More</h2>
<p>For detailed setup steps, examples, and a full list of capabilities, please visit the <a href="https://github.com/cloudera/CAI_Workbench_MCP_Server" target="_blank" rel="noopener noreferrer">Cloudera MCP Server GitHub repository</a>. Note: GitHub projects are provided as-is and are not formally supported by Cloudera. The Cloudera MCP Server project is made available under the Apache 2.0 license, and Cloudera provides no warranty, support, or maintenance for its use.</p>
<p>To learn more about how MCP and Cloudera work together, check out our blog <a href="/content/www/en-us/blog/technical/bringing-context-to-genai-with-cloudera-mpc-servers.html">Bringing Context to GenAI with Cloudera MCP Servers</a>.</p>
<p><small><i>Figure 1. Cloudera AI Workbench MCP Server: Architecture</i></small></p>
<p><small>&nbsp;</small></p>
<h2>Integrates with Existing Governance</h2>
<p>Cloudera MCP Server is designed to work with your existing enterprise governance, not bypass it.</p>
<ul>
<li><p><b>For data scientists and AI engineers:</b> This can help reduce context switching, allowing you to stay in your chat or IDE while initiating platform tasks. The assistant can handle the coordination, while the platform handles the execution.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>For platform and MLOps teams:</b> It helps with routine operations such as triggering an eval script, uploading new datasets, and running repeated test runs. The integration also supports updating, deleting, and restarting applications, as well as tracking experiments.</p>
</li>
</ul>
<h3>Security by Design</h3>
<p>Security is a core component of the server's design, intended to fit within an enterprise environment.</p>
<ul>
<li><p><b>STDIO transport:</b> By default, it uses Standard Input/Output (STDIO) for communication between the assistant and the server. This avoids the need to open and manage a new network endpoint for this interaction.</p>
</li>
</ul>
<ul>
<li><p><b>Credential management:</b> The server is designed to read credentials from Docker secrets or environment variables, avoiding the need to hard-code keys or pass them in command-line arguments.</p>
</li>
</ul>
<ul>
<li><p><b>Easy access:</b> It uses your existing Cloudera AI Workbench API keys, allowing you to scope permissions appropriately for different users and use cases.</p>
</li>
</ul>
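<p>The credential pattern described above can be sketched as follows. This is an illustrative Python helper, not code from the server itself: the function name and the lookup order (secret file first, then environment variable) are assumptions. The only fixed detail is that Docker mounts secrets as files under <code>/run/secrets</code> by default.</p>

```python
import os
from pathlib import Path


def read_credential(name, secrets_dir="/run/secrets", environ=None):
    """Return a credential from a Docker secret file, else an env variable.

    Hypothetical helper: looks for a file named after the lowercased
    credential under `secrets_dir`, then falls back to `environ[name]`.
    Raises KeyError when the credential is found in neither place.
    """
    environ = os.environ if environ is None else environ
    secret_file = Path(secrets_dir) / name.lower()
    if secret_file.is_file():
        # Secret files often end with a newline; strip it.
        return secret_file.read_text().strip()
    value = environ.get(name)
    if not value:
        raise KeyError(f"credential {name!r} not found in secrets or environment")
    return value


if __name__ == "__main__":
    # Demo with an explicit environment so no real key is needed.
    demo_env = {"WORKBENCH_API_KEY": "demo-key"}
    print(read_credential("WORKBENCH_API_KEY", environ=demo_env))
```

<p>Because the key is resolved at startup from a file or the environment, it never has to be hard-coded or passed as a command-line argument.</p>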
<p><span class="text-lead"><b>Automate Tasks and Improve Data Practitioner Efficiency</b></span></p>
<p>There are quite a few mundane tasks a data scientist or AI engineer does as part of their daily workflow—like uploading datasets, running and iterating the same scripts for different hyperparameters, observing experiments, and so on. Offloading these tasks to an AI agent could save resources and add significant value.</p>
<p>That’s where the <a href="https://github.com/cloudera/CAI_Workbench_MCP_Server" target="_blank" rel="noopener noreferrer">Cloudera AI Workbench MCP Server</a> comes in: it’s an open-source <a href="/content/www/en-us/blog/business/a-beginners-guide-to-the-model-context-protocol-mpc-what-it-is-and-why-it-matters.html">Model Context Protocol (MCP)</a> server designed to better integrate with your agentic workflow.&nbsp;</p>
<h2>What Cloudera MCP Server Is and How It Helps</h2>
<p>Cloudera’s MCP Server acts as a secure translator. It enables assistants (like <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Agent Studio</a>, Claude, or Cursor) to execute tasks directly inside your Cloudera AI Workbench environment.</p>
<p>This means you can ask your assistant to list projects, upload files, and run jobs, and the server will carry out the action using the platform's standard APIs.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=integrate-agentic-workflows-using-cloudera-ai-workbench-mcp-server</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>How to Avoid Building Brick Walls with Your Data and AI Platforms</title><description><![CDATA[Most large organizations today would never choose just one vendor to run their data and AI initiatives. A single, preferred cloud vendor? Perhaps, but multi-cloud and hybrid adoption have grown, particularly as these organizations prepare for the next, inevitable public cloud outage. ]]></description><link>https://www.cloudera.com/blog/business/how-to-avoid-building-brick-walls-with-your-data-and-ai-platforms.html</link><guid>https://www.cloudera.com/blog/business/how-to-avoid-building-brick-walls-with-your-data-and-ai-platforms.html</guid><pubDate>Tue, 02 Dec 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Jeff Healey]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty914197996.webp"><p>Most large organizations today would never choose just one vendor to run their data and AI initiatives. A single, preferred cloud vendor? Perhaps, but multi-cloud and <a href="/content/www/en-us/campaign/how-financial-services-institutions-are-scaling-ai.html">hybrid adoption</a> have grown, particularly as these organizations prepare for the next, inevitable <a href="/content/www/en-us/blog/business/the-inevitable-outage-why-your-hybrid-strategy-needs-multi-cloud-resilience.html">public cloud outage</a>. Companies need flexible options on where and when they run their workloads in the most cost-optimized ways, say when there’s an economic downturn or as budgets tighten.</p>
<p>If you take a glimpse into the data and AI architectures of Fortune 2000 IT organizations, you’ll find a myriad of technologies implemented from the vendors scattered as dots across Gartner Magic Quadrants and Forrester Waves.&nbsp;</p>
<p>When you’re active with mergers and acquisitions and need a quick win, it’s easy to buy into the hype of certain vendors’ claims. And despite their best intentions to maintain an open ecosystem approach, these large organizations sometimes fail to read the fine print before investing heavily in overhyped offerings.&nbsp;</p>
<p>The result? Accidental architectures with brick walls—locking organizations into single vendors, which can lead to higher costs, limited flexibility, and slower innovation.</p>
<p>This blog explores the most common vendor lock-in pitfalls and the critical questions you should ask during platform evaluations, with examples of how Cloudera’s open data architecture helps you sidestep these challenges.</p>
<h2>Forced Costly Cloud Migrations and Lack of Support for Data Fabric and Data Sovereignty&nbsp;</h2>
<p><b>Does your data and AI platform run where my data lives?</b></p>
<p>Cloudera runs anywhere your data lives, so you can securely process and govern distributed data across hybrid environments with the same, consistent platform. <a href="/content/www/en-us/blog/business/trino-the-federation-engine-powering-your-unified-data-fabric.html">Cloudera’s integration of Trino</a> takes this even further. It enables fast, federated queries across data warehouses, lakes, and on-premises systems—without moving data. By centralizing access and accelerating insights, Trino is a key enabler for organizations building unified data fabrics and preparing for the next frontier: agentic AI.</p>
<p>Cloud-only data and AI platforms can’t handle on-premises data without forcing cloud migrations that cost millions of dollars in rewrites and refactoring—at the end of which you’re locked into a single vendor.</p>
<p><b>Does your platform allow me to connect data across silos, from on-premises systems to public clouds and everywhere in between?</b></p>
<p>That’s what a data fabric supports—allowing data to be accessed and used anywhere, by anyone, securely and efficiently. In recognition of our strengths in this area, Cloudera was just <a href="/content/www/en-us/campaign/the-forrester-wave-data-fabric-platforms-q4-2025.html">named a Leader in the 2025 Forrester Wave for Data Fabric Platforms</a>.&nbsp;</p>
<p>Vendors that don’t meet the minimum data management requirements to support data fabric use cases aren’t featured in Forrester’s report. Take note of popular platform vendors that are missing from this evaluation—investing in their solutions will force your organization to move all of your data into a single system.</p>
<p><b>Can your platform run in air-gapped environments to deliver sovereign deployments?&nbsp;</b></p>
<p>Cloudera delivers private AI by supporting fully air-gapped, sovereign deployments where control planes and data never leave your environment—a requirement for regulated industries, <a href="/content/www/en-us/blog/business/the-shifting-airgapped-data-processing-market-what-it-means-for-the-public-sector.html">particularly the public sector</a>. Other platforms require constant connection to their control plane, making true private AI impossible.</p>
<h2>Catalogs that Only Work Inside a Data Estate with Limited Functionality</h2>
<p><b>Does your data catalog work across my entire data estate?</b></p>
<p>Cloudera (and particularly <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a>) provides full-stack lineage and governance across all your data platforms. Other platforms only govern data that you've migrated into that platform, breaking data mesh architectures. Also, Cloudera Octopai Data Lineage delivers visual lineage out of the box with full integration. This is a key differentiator compared to other vendors that offer an API endpoint but no tooling, UI, or integrations.</p>
<p><b>Does your data and AI platform deliver complete governance?</b></p>
<p><a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a> has been production-proven for years, providing complete governance across all workloads.&nbsp;</p>
<p>Other vendors fall short in this area: one announced catalog offerings years ago, with features like tag-based governance only recently reaching GA (three years after the initial announcement), while critical capabilities like attribute-based access control remain in public preview. Operating on a two-to-three-year gap between big announcements and production delivery is the definition of a hype machine.</p>
<h2>Hidden Costs, Lack of Guardrails, and an Immature Data Warehouse</h2>
<p><b>Do you offer transparent pricing with guardrails to avoid bill shock?</b></p>
<p>Cloudera offers <a href="/content/www/en-us/products/pricing.html">transparent pricing</a> without hidden multipliers or consumption traps. Other vendors introduce features without guardrails, hitting customers with thousands of dollars in surprise bills for even just one day of testing.</p>
<p><b>Can your data warehouse handle true enterprise demand?</b></p>
<p><a href="/content/www/en-us/products/data-warehouse.html">Cloudera Data Warehouse</a> provides production-grade data warehouse capabilities with high availability (HA) and seamless scaling.</p>
<p>While other vendors have added autoscaling and HA, it’s important to review whether these are compatible or separate functions—if the latter, you’ll be forced to choose one or the other. Additional limitations to be on the lookout for are regional and vendor-managed storage.</p>
<h2>Limited Data Streaming with a Tax on Dubious Performance Gains</h2>
<p><b>Can your data and AI platform handle data-intensive streaming workloads?</b></p>
<p>Cloudera delivers production-proven <a href="/content/www/en-us/resources/faqs/data-in-motion.html">Apache Flink, Kafka, and NiFi</a> for complex streaming workloads. Other vendors can't compete against Flink, specifically, and have no streaming play.</p>
<p><b>Do you charge for performance gains on streaming workloads?</b></p>
<p><a href="/content/www/en-us/products/stream-processing.html">Cloudera Streaming</a> has no premium pricing tiers. Others force a ~3× cost multiplier, even though streaming workloads often see no performance gain. It’s not uncommon for these vendors to charge you more when you optimize—up to 80% more, based on internal analyses.</p>
<p><b>Does your platform deliver true open source Kafka or a proprietary, unproven version?&nbsp;</b></p>
<p>Cloudera relies on mature, open-source Apache Kafka with a proven track record. Others don’t run Apache Kafka at all. They ship a proprietary Kafka-lookalike that’s still early, unproven at scale, and wrapped in opaque pricing.</p>
<h2>Lack of Clarity Around AI Ownership (vs. API Access Rentals) and AI Assistants (vs. Chatbots)</h2>
<p><b>With your data and AI platform, will I own my AI models or do you simply charge me for API access?</b></p>
<p><a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> enables companies to own and operate their AI models privately on their infrastructure. Other vendors act as “middlemen” for public APIs, exposing customers to sudden service cutoffs and uncapped costs while collecting massive fees.</p>
<p><b>Is your platform infused with reliable AI assistants to improve productivity?</b></p>
<p><a href="/content/www/en-us/products/machine-learning/ai-assistants.html">Cloudera AI Assistants</a> are embedded across the platform from day one with genuine intelligence. Other vendors are repackaging basic retrieve-and-respond chatbots as innovation, but if an assistant can't trace data lineage, enforce governance, or reason across structured and unstructured data, it's just search with a better interface.</p>
<h2>Vendors Jumping On The “Open” and “Unified” Bandwagon Without the Infrastructure to Support These Claims</h2>
<p><b>How open is your data and AI platform, really?</b></p>
<p><a href="/content/www/en-us/open-source/apache-iceberg.html">Cloudera supports Apache Iceberg</a> and Hudi today across multiple engines without vendor lock-in. Other vendors claim an open approach, but their table format support is often several years away, or still in beta, and essentially remains proprietary, trapping customers.</p>
<p><b>What level of support does your platform provide for Apache Iceberg?</b></p>
<p>Cloudera supports Apache Iceberg with full read and write capabilities across the platform without vendor lock-in. Cloudera’s <a href="/content/www/en-us/blog/business/democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration.html">Iceberg REST Catalog</a> further enhances data sharing by delivering an open, universal metadata layer that enables zero-copy access across popular platforms, engines, and teams.&nbsp;</p>
<p>Other vendors claim openness, but their Iceberg support is still in beta. And their “unified” table format? Practitioners skip it in real deployments—using it means duplicating data or sacrificing performance, since their optimizations only work on proprietary formats.</p>
<h2>Avoid Vendor Lock-In: Choose an (Actually) Open, Unified, Governed Data and AI Platform</h2>
<p>Cloudera is the only data and AI platform company that large organizations trust to bring AI to their data anywhere it lives. Unlike other providers, Cloudera delivers a consistent cloud experience that converges public clouds, data centers, and the edge, leveraging a proven open-source foundation. As the pioneer in big data, Cloudera empowers businesses to apply AI and assert control over 100% of their data, in all forms, delivering unified security, governance, and real-time predictive insights. The world’s largest organizations across all industries rely on Cloudera to transform decision-making and ultimately boost bottom lines, safeguard against threats, and save lives.</p>
<p>To learn more about how to securely prepare, integrate, and analyze data at scale with Cloudera, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">check out our product demos</a> or <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html?internal_keyplay=ALL&amp;internal_campaign=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;cid=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;internal_link=WWW-Nav-u01">sign up for a free 5-day trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=how-to-avoid-building-brick-walls-with-your-data-and-ai-platforms</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Achieve Workload Portability Without the Rewrite</title><description><![CDATA[Cloudera’s cloud bursting capability brings the cloud to your data.]]></description><link>https://www.cloudera.com/blog/business/achieve-workload-portability-without-the-rewrite.html</link><guid>https://www.cloudera.com/blog/business/achieve-workload-portability-without-the-rewrite.html</guid><pubDate>Tue, 25 Nov 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Blake Tow,Tushar Sharma]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-and-skyscrapers.webp"><p><span class="text-lead"><b>Cloudera’s cloud bursting capability brings the cloud to your data</b></span></p>
<p>The conversation around cloud adoption has matured significantly. For modern data-driven organizations, it’s no longer a question of if they should use the cloud, but how they can strategically blend public cloud agility with the security and control of their on-premises infrastructure.</p>
<p>Although the hybrid cloud market is <a href="https://www.precedenceresearch.com/hybrid-cloud-market?utm_source" target="_blank" rel="noopener noreferrer">projected to grow to over $300 billion by 2030</a>, many organizations are hitting a wall. They’re discovering that simply connecting an on-premises data center to a public cloud doesn't create a truly hybrid platform.</p>
<p>Instead, they’re often forced into a lift-and-shift cycle: permanently relocating applications and continually replicating massive datasets to the cloud just to get temporary compute capacity. This leads to fragmented management, rising costs due to data duplication, and data staleness.</p>
<h2>The Problem: How to Handle Data Spikes</h2>
<p>Scalability is a top priority for enterprises. Businesses frequently face sudden spikes in data volume that require additional resources—whether it's end-of-month reporting, model training, or seasonal traffic.</p>
<p>Resource contention during these spikes creates bottlenecks that force organizations to miss critical service level agreements or objectives (SLAs and SLOs), which can result in potential regulatory fines and increased customer churn.</p>
<p>Historically, IT leaders had two imperfect choices to handle these spikes:</p>
<ol>
<li><b>Over-provisioning</b>: Buying costly on-premises hardware that sits idle most of the time to account for peak demand</li>
<li><b>Migration</b>: Moving data and workloads to the cloud permanently, which is complex, risky, and fraught with compliance risks</li>
</ol>
<h2>The Solution: Bring the Cloud to Your Data</h2>
<p>Unlike the traditional lift-and-shift model, Cloudera’s approach brings the cloud to the data.</p>
<p>Cloudera’s cloud bursting capability enables organizations to extend the private data center into a public cloud—only when needed—and scale back down when the demand subsides. This approach instantly bridges resources to handle demand without the risk or cost of data migration.</p>
<p>Here’s how it works:</p>
<ul>
<li><p>Spin up a Hybrid Data Hub in the public cloud. This temporary compute cluster combines cloud elasticity with secure access to your on-premises data to handle heavy workloads (for example, a Spark job).</p>
</li>
</ul>
<ul>
<li><p>This cloud workload reads from and writes to on-premises storage directly (such as Hadoop Distributed File System, or HDFS), intelligently fetching only the precise data subset required for the specific task rather than moving entire datasets.</p>
</li>
</ul>
<ul>
<li><p>Once the job is done, the cloud resources spin down. Your data is never replicated to the cloud; it is read only into memory and stays safely on-premises.</p>
</li>
</ul>
<h3>Why This Approach Changes the Game</h3>
<p>By using Cloudera’s cloud bursting capability, built on its unified runtime and hybrid control plane, organizations can finally achieve workload portability without the rewrite. Benefits include:</p>
<h4>Zero Data Migration</h4>
<p>This architecture eliminates the cost and complexity of application redesign and massive data migration. Organizations don't need to create and maintain a copy of their data in the cloud just to run a query. Any such copy can fall out of sync with the original before the copy process is even completed. To optimize performance, the system uses advanced techniques like projection pushdown (reading only the columns a query needs) and partition pruning (skipping partitions that cannot match the query). This guarantees high-performance query results without the latency or cost of moving massive datasets.</p>
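<p>To make the two optimizations concrete, here is a minimal, self-contained Python sketch. It is a toy model, not Cloudera’s implementation: the table layout, the column names, and the <code>scan</code> function are all invented for illustration, and the counters simply stand in for data fetched from on-premises storage.</p>

```python
# Toy partitioned table: each partition key (a month) maps to its rows.
TABLE = {
    "2025-09": [{"txn_id": 1, "amount": 40.0, "country": "IN"}],
    "2025-10": [
        {"txn_id": 2, "amount": 75.5, "country": "US"},
        {"txn_id": 3, "amount": 12.0, "country": "IN"},
    ],
    "2025-11": [{"txn_id": 4, "amount": 300.0, "country": "DE"}],
}


def scan(table, columns, partitions):
    """Read only the requested partitions and columns.

    Returns (rows, stats). The stats count the partitions and cell values
    actually touched, as a stand-in for data fetched over the network.
    """
    wanted = set(partitions)
    rows, cells_read, partitions_read = [], 0, 0
    for key, partition_rows in table.items():
        if key not in wanted:
            continue  # partition pruning: the whole partition is skipped
        partitions_read += 1
        for row in partition_rows:
            # projection pushdown: only the requested columns are read
            rows.append({col: row[col] for col in columns})
            cells_read += len(columns)
    return rows, {"partitions_read": partitions_read, "cells_read": cells_read}


if __name__ == "__main__":
    pruned = scan(TABLE, ["txn_id", "amount"], ["2025-10"])
    full = scan(TABLE, ["txn_id", "amount", "country"], list(TABLE))
    print("pruned scan:", pruned[1])
    print("full scan:  ", full[1])
```

<p>In this toy table, querying two columns for a single month touches 1 partition and 4 values, compared with 3 partitions and 12 values for a full scan. The same idea, applied to PB-scale storage, is what keeps a bursting workload from pulling entire datasets across the wire.</p>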
<h4>Centralized Security and Governance</h4>
<p>One of the biggest barriers to hybrid adoption is security. With Cloudera, the security context moves with the workload. We establish a two-way cross-realm trust between on-premises Active Directory and the cloud, which guarantees that the user submitting the job in the cloud is authorized by the same policies defined in Ranger on-premises. All metadata and governance rules remain centralized to maintain compliance with regulations like GDPR and HIPAA.</p>
<h4>Strategic Workload Isolation and SLA Assurance</h4>
<p>Resource contention on-premises often forces IT to play traffic cop, delaying lower-priority jobs to keep mission-critical ones running. Cloud bursting resolves this conflict. Organizations can now use strategic workload isolation to offload specific workloads to the cloud so they can maintain critical SLAs and SLOs for their core business processes. Whether it’s meeting a strict deadline for regulatory reporting or delivering real-time fraud detection without added latency, performance can be assured without over-provisioning hardware.</p>
<h3>Real-World Application: Faster Time to Value</h3>
<p>Imagine a data engineer working on a fraud detection model. The on-premises cluster is at 95% capacity, and a new threat vector requires immediate model retraining. Running this locally would choke the production pipeline and cause an SLA breach.</p>
<p>With Cloudera, that data engineer can:</p>
<ul>
<li><p>Burst to the cloud in real time to access the necessary compute power</p>
</li>
<li><p>Process the sensitive data that lives on-prem without permanently moving it</p>
</li>
<li><p>Shut down the cloud instance immediately after the job completes</p>
</li>
</ul>
<p>This capability also accelerates software development by enabling teams to create instant development environments that leverage zero-copy data access from their production on-premises source.</p>
<h2>The Future is Cloud Anywhere</h2>
<p>Cloudera is the only data and AI platform company that brings AI to your data anywhere it lives. Whether the data is in the data center, the public cloud, or at the edge, we deliver a consistent cloud experience that empowers you to make smarter, faster decisions.</p>
<p>Ready to bring the cloud to your data?</p>
<ul>
<li><a href="https://docs.cloudera.com/hybrid-cloud/latest/architecture/topics/hc-remote-data-access-on-premise.html" target="_blank" rel="noopener noreferrer">Dive deep into the architecture of our Hybrid Environments</a></li>
<li><a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">Try Cloudera today</a></li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=achieve-workload-portability-without-the-rewrite</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>How Leading Data Teams Build AI-Ready Pipelines with Apache Iceberg and Spark</title><description><![CDATA[Lessons from two global enterprises modernizing data engineering for scalable AI.]]></description><link>https://www.cloudera.com/blog/technical/how-leading-data-teams-build-ai-ready-pipelines-with-apache-iceberg-and-spark.html</link><guid>https://www.cloudera.com/blog/technical/how-leading-data-teams-build-ai-ready-pipelines-with-apache-iceberg-and-spark.html</guid><pubDate>Mon, 24 Nov 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Pamela Pan,Ying Chen,Akshat Mathur]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-gettyImages-1221230030.jpg"><p><span class="text-lead"><b>Lessons from two global enterprises modernizing data engineering for scalable AI</b></span></p>
<p>From predictive analytics to generative AI, every business is looking to turn data into value. But for many teams, the real challenge lies beneath the surface—in the data engineering work required to make that data usable, trusted, and scalable. Across complex environments, engineers are still stitching together pipelines using legacy table formats, duplicating logic across tools, and retrofitting governance after the fact. These inefficiencies create drag at every stage, delaying outcomes and limiting the impact of even the most advanced AI and analytics initiatives.</p>
<p>For enterprises looking to streamline and future-proof their data engineering stack, <a href="/content/www/en-us/open-source/apache-iceberg.html">Apache Iceberg</a> as the open table format and <a href="/content/www/en-us/resources/faqs/apache-spark.html">Apache Spark</a> as the open compute engine have proven to be a powerful combination. Together, they offer an open, scalable, and standardized foundation for processing and managing petabyte (PB)-scale data without sacrificing governance, flexibility, or performance.</p>
<p>In this blog, we will take a closer look at how two global organizations transformed their data pipelines using Spark and Iceberg with the <a href="/content/www/en-us.html">Cloudera</a> data and AI platform. We’ll explore how they reduced query times by 80%, standardized workflows across teams, and accelerated their path from raw data to AI-ready insights.</p>
<h2>How Vodafone Idea Slashed Query Times by 80%</h2>
<p><a href="/content/www/en-us/customers/vodafone-idea.html">Vodafone Idea</a> is one of the three major telecommunications companies in India, serving 220 million customers. The company was struggling with scale issues: its Hive-based data lake had ballooned to more than 17 PB, and performance bottlenecks were putting critical business operations at risk. Some reporting queries took more than 70 hours to complete! This delayed compliance, analytics, and regulatory reporting.</p>
<p>Rather than simply upgrading infrastructure, Vodafone Idea chose to re-architect its data platform. Collaborating with Cloudera, the company leveraged Iceberg for faster queries through optimized metadata and schema evolution, and rebuilt its processing workflows on Spark to leverage distributed compute for efficient, large-scale data processing.&nbsp;</p>
<p>For regulatory reporting, they paired Iceberg with <a href="/content/www/en-us/blog/technical/unlocking-the-benefits-of-apache-impala.html">Apache Impala</a> as the interactive query engine to support fast, reliable access to PB-scale datasets. While Impala handled the reporting queries, Iceberg played a critical role behind the scenes—its support for ACID transactions (atomicity, consistency, isolation, and durability—properties that ensure database transactions are processed reliably and consistently), flexible schema evolution capabilities, and rich metadata kept reporting workflows consistent, even as data changed.</p>
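<p>The consistency guarantee described above comes from committing changes as immutable snapshots. The Python sketch below is a conceptual toy, not the Iceberg API: the <code>ToyTable</code> class and its methods are invented for illustration. It shows how an atomic check-and-swap on a “current snapshot” pointer lets a report keep reading a stable view while new data is committed.</p>

```python
class ToyTable:
    """Toy model of a snapshot-based table; not the Iceberg API."""

    def __init__(self):
        # Each snapshot is an immutable tuple of rows; index 0 is the empty table.
        self._snapshots = [()]
        self._current = 0

    def current_snapshot_id(self):
        return self._current

    def read(self, snapshot_id=None):
        """Return the rows of a snapshot (the current one by default)."""
        sid = self._current if snapshot_id is None else snapshot_id
        return list(self._snapshots[sid])

    def commit(self, new_rows, expected_snapshot_id):
        """Append rows as a new snapshot, or fail if another writer won.

        The check-and-swap on the current pointer is the atomic step: the
        new snapshot becomes visible all at once or not at all.
        """
        if expected_snapshot_id != self._current:
            raise RuntimeError("concurrent commit detected; retry on latest snapshot")
        self._snapshots.append(self._snapshots[self._current] + tuple(new_rows))
        self._current = len(self._snapshots) - 1
        return self._current


if __name__ == "__main__":
    table = ToyTable()
    s1 = table.commit([{"id": 1}], table.current_snapshot_id())
    # A long-running report pins snapshot s1 while ingestion continues.
    table.commit([{"id": 2}], s1)
    print("report view:", table.read(s1))
    print("latest view:", table.read())
```

<p>A reader pinned to an earlier snapshot is unaffected by later commits, and a writer holding a stale snapshot ID is rejected rather than silently overwriting newer data. Real table formats add metadata files, a catalog, and retry logic on top of this basic idea.</p>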
<p>Through integration with <a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a>, the team also gained fine-grained governance with role-based and attribute-based access control, making sure that the right people had access to the right data. This foundation enabled the business to deliver timely and auditable reports while meeting growing regulatory demands.&nbsp;</p>
<table>
<tbody><tr><td><h3>Transforming Telecom with Data-Driven Efficiency</h3>
<p>By partnering with Cloudera, <a href="/content/www/en-us/customers/vodafone-idea.html">Vodafone Idea</a> preserved flexibility, strengthened governance, and accelerated insight delivery at scale, without having to rebuild its entire data stack. Using Spark for ingestion, Iceberg for unified table management, and Impala for reporting, they modernized their foundation while reusing existing logic and workflows.</p>
<p>Together, this architecture delivered measurable results:</p>
<ul>
<li>Reduced query times by 80%.</li>
<li>Decreased pipeline failures via Spark’s resilience at scale and Iceberg’s robust table management capabilities.</li>
<li>Improved regulatory reporting (faster and more reliable).</li>
</ul>
</td>
</tr></tbody></table>
<h2><br>
How a Pharmaceutical Company Consolidated in Order to Scale: One Tech Stack, 10,000 Jobs</h2>
<p>A global pharmaceutical company managing PB-scale clinical research data faced a familiar but growing challenge: they had too many tools in play, leading to data reliability challenges and difficulty meeting compliance standards, on top of facing pressure to support faster AI and analytics. The data engineering teams needed to run more than 10,000 daily ETL jobs, but lacked a standardized way to build, govern, or validate pipelines across teams.</p>
<p>With <a href="/content/www/en-us/partners/solutions/amazon-web-services.html">Cloudera on AWS</a>, the company set a clear direction forward. The team standardized all data pipelines using Spark on <a href="/content/www/en-us/products/data-engineering.html">Cloudera Data Engineering</a>, unifying and scaling processing across batch, streaming, and machine learning workloads. At the same time, they adopted Iceberg as the default open table format to ensure consistent schema evolution, built-in version control, and enterprise-grade governance across teams and environments.</p>
<p>By adopting Spark and Iceberg on Cloudera, the company laid a clean, scalable DataOps foundation that standardized data pipelining, enabled secure data sharing across teams and tools, and paved the way for faster and more advanced AI and analytics. This foundation now supports everything from regulatory audit workflows to AI models that accelerate clinical trial discovery and drug development, ensuring the company can seamlessly integrate any new technology or engine in the future.</p>
<table>
<tbody><tr><td><h3>Transforming Pharma with a Unified Data Platform</h3>
<p>Standardizing on Cloudera’s platform gave the global pharmaceutical company a new level of operational consistency:</p>
<ul>
<li>Governance without disruption: Iceberg’s write-audit-publish pattern allows upstream teams to validate data before releasing it to production—without breaking downstream workflows.</li>
<li>Time travel for traceability: Regulatory teams can access historical data snapshots instantly, enabling clean rollback and audit support.</li>
<li>Shared pipeline logic: With Spark as the unified engine, teams—ranging from data engineers to data scientists—can collaborate easily and reuse core transformations across jobs and environments, reducing duplication and simplifying maintenance.</li>
</ul>
</td>
</tr></tbody></table>
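<p>As a rough illustration of the time travel, rollback, and write-audit-publish patterns described above, the sketch below builds the Iceberg Spark SQL statements involved. It only assembles strings and does not connect to a cluster. The catalog name <code>lake</code>, the table <code>clinical.trial_results</code>, and the snapshot ID are hypothetical placeholders, not details from the customer deployment.</p>

```python
# Sketch only: these helpers return Iceberg Spark SQL text. Running the SQL
# requires a Spark session configured with an Iceberg catalog.
# Catalog, table, and snapshot values used below are hypothetical.

def time_travel_by_snapshot(table, snapshot_id):
    """Read the table exactly as it was at a given snapshot."""
    return f"SELECT * FROM {table} VERSION AS OF {snapshot_id}"


def time_travel_by_timestamp(table, timestamp):
    """Read the table as it was at a point in time."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def rollback_to_snapshot(catalog, table, snapshot_id):
    """Make an earlier snapshot the current table state."""
    return f"CALL {catalog}.system.rollback_to_snapshot('{table}', {snapshot_id})"


def enable_write_audit_publish(table):
    """Let writers stage snapshots for audit before they become current."""
    return f"ALTER TABLE {table} SET TBLPROPERTIES ('write.wap.enabled'='true')"


def publish_audited_snapshot(catalog, table, snapshot_id):
    """Publish a staged snapshot once validation has passed."""
    return f"CALL {catalog}.system.cherrypick_snapshot('{table}', {snapshot_id})"


audit_query = time_travel_by_snapshot("lake.clinical.trial_results", 42)
publish_call = publish_audited_snapshot("lake", "clinical.trial_results", 42)
```

<p>In a write-audit-publish flow, the writing job would typically set the Spark session property <code>spark.wap.id</code> so that its commit is staged rather than published. Downstream readers keep seeing the previous snapshot until the publish call runs.</p>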
<h2><br>
Building a Modern Foundation for Data Engineering and AI</h2>
<p>These two stories share a common thread: both organizations faced fragmentation, scale pressure, and growing complexity in their data workflows. By standardizing on Apache Spark and Apache Iceberg with Cloudera, they rebuilt their pipelines around open, scalable, and trusted components—enabling better governance, faster performance, and cleaner data flows for AI and analytics.</p>
<p>With <a href="/content/www/en-us/products/data-engineering.html">Cloudera Data Engineering</a>, enterprises get an end-to-end solution that runs across hybrid and multi-cloud environments. It brings together Spark, Iceberg, and integrated orchestration with Airflow to empower teams to:</p>
<ul>
<li>Build pipelines once, and run them anywhere—in the data center or on clouds</li>
<li>Maintain trust and governance at scale in the <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a></li>
</ul>
<p>Watch this <a href="/content/www/en-us/products/data-engineering/cdp-tour-data-engineering.html">interactive demo</a> to see how Spark and Iceberg power trusted, scalable pipelines on Cloudera. Try it yourself with the Cloudera Data Engineering <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">5-day trial</a> and start building AI-ready data workflows today.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=how-leading-data-teams-build-ai-ready-pipelines-with-apache-iceberg-and-spark</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The Future Delivered Today: The AI-Powered Data Lakehouse</title><description><![CDATA[Cloudera’s open foundations enable organizations to access 100% of their data, wherever it resides.]]></description><link>https://www.cloudera.com/blog/business/the-future-delivered-today-the-ai-powered-data-lakehouse.html</link><guid>https://www.cloudera.com/blog/business/the-future-delivered-today-the-ai-powered-data-lakehouse.html</guid><pubDate>Fri, 21 Nov 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Dipankar Mazumdar]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1292714050.jpg"><p><i>Figure 3: Cloudera AI’s Offering with AI Workbench and Inference Service</i></p>
<h3>Cloudera AI Workbench</h3>
<p><a href="/content/www/en-us/products/machine-learning/ai-workbench.html">Cloudera AI Workbench</a> is the collaborative environment where data scientists, analysts, and engineers develop, fine-tune, and test models. It brings together notebooks, low-code application builders (<a href="https://docs.cloudera.com/machine-learning/cloud/applied-ml-prototypes/topics/ml-amps-overview.html" target="_blank" rel="noopener noreferrer">AMPs</a>), and specialized studios for every stage of AI development. To accelerate AI development and deployment, Cloudera AI Workbench underpins <a href="/content/www/en-us/products/machine-learning/ai-studios.html">four AI studios</a> that bridge the gap between business and technical teams, fostering collaboration on AI projects.</p>
<ul>
<li><b>Synthetic Data Studio</b> generates synthetic datasets for testing and model training when real data is limited or restricted.</li>
<li><b>Fine-Tuning Studio</b> adapts open foundation models with enterprise-specific datasets for higher relevance and accuracy.</li>
<li><b>RAG Studio</b> builds RAG pipelines that connect LLMs (such as OpenAI, Anthropic, Amazon Bedrock) to relevant private data for grounded, contextual outputs.</li>
<li><b>Agent Studio</b> enables the creation of multi-step, agentic workflows that use models, MCPs, APIs, and internal data sources to automate domain-specific tasks.</li>
</ul>
<p>All of these capabilities operate on the open lakehouse (on Iceberg’s foundations), giving teams governed, zero-copy access to the data needed for specific tasks.</p>
<h3>Cloudera MCP Server</h3>
<p>Cloudera is also extending the openness of its AI platform through a series of emerging MCP services, beginning with the open-source <a href="https://github.com/cloudera/CAI_Workbench_MCP_Server" target="_blank" rel="noopener noreferrer">Cloudera AI Workbench MCP Server</a>. This service is designed for AI system integration, enabling agentic and tool-calling capabilities within the AI Workbench. It provides the framework for LLMs to securely interact with Cloudera AI Workbench features and components—bringing models, data, and applications into automated enterprise workflows. In this architecture, intelligent agents can reason, act, and automate tasks across the trusted, governed Cloudera environment while maintaining the security, control, and auditability required in regulated industries.</p>
<h3>Cloudera AI Inference Service</h3>
<p>The <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference Service</a> brings models into production with autoscaling, high availability, and end-to-end observability. It supports both traditional ML models and large language models (LLMs), serving predictions and responses with low latency. Models can be deployed as REST or gRPC endpoints with enterprise-grade security, ensuring reliable and consistent access from applications and agents.</p>
<p>The <a href="https://docs.cloudera.com/machine-learning/1.5.4/models/topics/ml-using-model-registry.html" target="_blank" rel="noopener noreferrer">Cloudera AI Registry</a>, integrated within the inference layer, provides centralized model lifecycle management with MLflow-compatible APIs for tracking, versioning, artifact storage, and lineage. You can choose from a range of open and enterprise language models, such as Llama, Cohere, Gemma, and Mistral.</p>
<p>The inference layer also includes built-in monitoring and observability, enabling teams to track latency, throughput, and model drift while maintaining full lineage and compliance through SDX governance. This ensures that model predictions are explainable and traceable, which is a key requirement for enterprise-grade AI.</p>
<h2>The Future is Driven by AI, and AI is Fueled by <i>All</i> Data</h2>
<p>AI success depends as much on data architecture as on model/agent capability. The lakehouse provides that foundation, unifying analytical, operational, and AI workloads on a single, governed data plane. When built on open standards, it ensures that data, metadata, and models can interoperate across tools, clouds, and teams without friction.</p>
<p>Together, Cloudera AI Workbench, AI Inference Service, and the integrated AI Registry complete the data-to-AI lifecycle on an open lakehouse foundation. Built directly on governed Iceberg tables and open metadata access, this stack ensures that every model, prompt, and agent operates on trusted, versioned data.</p>
<p>The future of enterprise AI will not be defined by proprietary stacks, but by open foundations that unify data, governance, and intelligence through shared standards and transparent interoperability.</p>
<p>To learn more about how to securely prepare, integrate, and analyze data at scale with Cloudera, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">check out our product demos</a> or <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html?internal_keyplay=ALL&amp;internal_campaign=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;cid=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;internal_link=WWW-Nav-u01">sign up for a free 5-day trial</a>.</p>
<p><i>Figure 1: Cloudera’s Data and AI Platform Built on Open Foundations (Apache Iceberg)</i></p>
<p>We’ll now review how the different components in Cloudera's platform (<b>Figure 1</b>) support teams in building ML pipelines and GenAI applications, as well as the different stages of the data and AI lifecycle—from ingest to inference—while operating as one interoperable platform. Each component is built on open standards, ensuring flexibility and interoperability across environments.</p>
<h3>Storage: Apache Iceberg</h3>
<p>Apache Iceberg is the open, versioned, and transactional table format that underpins Cloudera’s lakehouse architecture. Iceberg enables schema evolution, time travel, and atomic operations, allowing both analytical and AI workloads to operate consistently on the same governed data. Cloudera offers a <a href="/content/www/en-us/blog/technical/metadata-management-data-governance-with-cloudera-sdx.html">governed</a> and versioned foundation that ensures that every model, prompt, or retrieval task draws from a consistent and traceable view of data.&nbsp;</p>
<p>Iceberg’s native capabilities like <a href="https://iceberg.apache.org/docs/nightly/evolution/#schema-evolution" target="_blank">schema evolution</a> also align closely with how AI datasets evolve. Feature stores, training datasets, and retrieval corpora can all share the same Iceberg tables in Cloudera’s lakehouse, using snapshots to freeze consistent views for training while new data continues to flow in for inference. This eliminates the divide between analytical tables and AI-specific storage.</p>
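<p>To make the idea of freezing a consistent view for training concrete, here is a toy in-memory model of snapshot-based reads. It is not the Iceberg API. The class <code>SnapshotTable</code> and the sample rows are invented for illustration, and real Iceberg snapshots reference shared data files instead of copying rows.</p>

```python
# Toy model of snapshot isolation, written for illustration only.
# Each commit produces a new immutable snapshot; older snapshots stay readable.

class SnapshotTable:
    def __init__(self):
        # Snapshot 0 is the empty table.
        self._snapshots = [()]

    def append(self, rows):
        """Commit new rows and return the ID of the snapshot that was created."""
        self._snapshots.append(self._snapshots[-1] + tuple(rows))
        return len(self._snapshots) - 1

    def current_snapshot_id(self):
        return len(self._snapshots) - 1

    def read(self, snapshot_id=None):
        """Read the latest state, or the state pinned to a snapshot ID."""
        if snapshot_id is None:
            snapshot_id = self.current_snapshot_id()
        return list(self._snapshots[snapshot_id])


features = SnapshotTable()

# A training job pins the snapshot that exists when it starts.
training_snapshot = features.append(
    [{"user": 1, "clicks": 3}, {"user": 2, "clicks": 5}]
)

# New data keeps arriving for inference while training runs.
features.append([{"user": 3, "clicks": 8}])

training_rows = features.read(training_snapshot)  # frozen view
latest_rows = features.read()                      # includes the new row
```

<p>Recording the pinned snapshot ID alongside a trained model is one way to make the training set reproducible later.</p>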
<h3>Ingestion: Cloudera Data in Motion</h3>
<p><a href="/content/www/en-us/products/dataflow.html">Cloudera DataFlow</a>, built on <a href="https://nifi.apache.org/" target="_blank">Apache NiFi</a>, forms the foundation for continuous data movement into the lakehouse. It enables low-latency ingestion from diverse enterprise sources (databases, APIs, IoT devices, and event logs) to support both batch and streaming workloads. Recent innovations in NiFi’s native Apache Iceberg <a href="https://kevinbtalbert.github.io/iceberg/nifi/nifi-iceberg/" target="_blank">integration</a> now allow data to be written directly into the open lakehouse without intermediate staging. This tight coupling between NiFi and Iceberg reduces pipeline complexity and brings ingestion closer to the open table format itself.</p>
<p>In real-time use cases, NiFi, Apache Kafka, and Apache Flink form an event-driven ingestion fabric: NiFi orchestrates and routes data, Kafka provides durable streaming, and Flink enables real-time enrichment before persisting data into Iceberg. This design ensures that data remains both fresh and governed across all downstream consumers. This continuous flow of multimodal data is what also powers AI workloads on the lakehouse. By making real-time data continuously available in Iceberg tables under consistent governance, enterprises can feed GenAI systems with timely, domain-specific information, making RAG pipelines and agentic workflows more precise, grounded, and reliable.</p>
<h3>Catalog: Cloudera Iceberg REST Catalog</h3>
<p>The <a href="https://docs.cloudera.com/runtime/7.3.1/overview/topics/cr-ds-cloudera-iceberg-rest-catalog.html" target="_blank">Cloudera Iceberg REST Catalog</a> (based on the open <a href="https://iceberg.apache.org/rest-catalog-spec/" target="_blank">REST specification</a>) provides a centralized and interoperable metadata service that allows any third-party engine (such as Snowflake, Redshift, and Databricks) that supports the open specification to have <a href="/content/www/en-us/blog/business/democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration.html">zero-copy access</a> to Iceberg tables. This is a key aspect for organizations, as they are not restricted to just one compute engine offered by one platform and therefore have the flexibility to choose the best compute for the task. Users can use their preferred tools while the same security and governance policies offered by Cloudera follow the data everywhere, ensuring consistency across environments.&nbsp;</p>
<p><span class="text-lead">Cloudera’s open foundations enable organizations to access 100% of their data, wherever it resides</span></p>
<p>Across industries, data teams are rethinking how to build and run systems that do more than store information: they’re looking to turn data into <a href="/content/www/en-us/resources/faqs/artificial-intelligence.html">intelligence</a>. Just as important, they need these systems to <a href="/content/www/en-us/blog/partners/cloudera-announces-interoperability-ecosystem-with-founding-members-aws-and-snowflake.html">interoperate</a>. AI models, feature pipelines, business intelligence (BI) reports, and batch jobs often span multiple teams and engines. Sharing data across those boundaries without copying or refactoring is now a first-order requirement.&nbsp;</p>
<p>Traditionally, organizations have relied on a two-tier architecture: data warehouses optimized for BI and reporting, and data lakes designed for large-scale AI and machine learning (ML). This separation came at a cost: complex data movement, specialized engineering, and duplicated storage across systems that rarely stayed in sync.&nbsp;</p>
<p><a href="/content/www/en-us/resources/faqs/data-lakehouse.html">Cloudera’s open lakehouse</a> architecture addresses this challenge, bringing together analytical (BI, ad-hoc queries) and AI (predictive and generative AI, or GenAI) workloads on a single, governed data foundation. With open table formats like <a href="https://iceberg.apache.org/" target="_blank" rel="noopener noreferrer">Apache Iceberg</a>, this unified data architecture enables organizations to bring compute to data (not the other way around) and provides the foundation for running AI workloads closer to the data. AI workloads on the lakehouse can operate directly on governed, versioned, and high-quality data.</p>
<p><a href="/content/www/en-us.html">Cloudera</a> is the only data and AI platform company that brings AI to data anywhere. Leveraging our proven open-source foundation, we deliver a consistent cloud experience that converges public clouds, data centers, and the edge.</p>
<h2>The Importance of Open Foundations for Running AI Workloads</h2>
<p>Over the last decade, enterprises have learned that performance and scalability alone are not enough, and that flexibility and interoperability determine long-term success. AI workloads, in particular, depend on the ability to use disparate data sources, frameworks, and tools without being constrained by proprietary formats or systems.&nbsp;</p>
<p>That’s where <a href="https://www.onehouse.ai/blog/open-table-formats-and-the-open-data-lakehouse-in-perspective" target="_blank" rel="noopener noreferrer">open table formats</a> like Apache Iceberg have reshaped the architecture of data platforms. Iceberg separates the logical definition of a table from its physical storage layout, allowing multiple engines and frameworks to read and write the same data with full transactional guarantees. This openness makes it possible to evolve infrastructure and adopt new compute engines without rewriting pipelines.&nbsp;</p>
<p>Running production-grade pipelines requires a unified platform that can connect data, models, and governance across every stage of the AI lifecycle. At the core, there are data and feature engineering pipelines that continuously transform raw structured, semi-structured, and unstructured data into AI-ready features, maintaining lineage and reproducibility for model training and evaluation.&nbsp;</p>
<p>Beyond traditional ML, GenAI introduces new operational requirements. Teams need infrastructure and access to data for <a href="/content/www/en-us/resources/faqs/retrieval-augmented-generation-rag.html">retrieval-augmented generation</a> (RAG), fine-tuning <a href="/content/www/en-us/resources/faqs/large-language-models.html">large language models</a> (LLMs) on private data, and building <a href="/content/www/en-us/resources/faqs/agentic-ai.html">agentic</a> workflows that combine models, prompts, <a href="https://www.anthropic.com/news/model-context-protocol" target="_blank" rel="noopener noreferrer">Model Context Protocol (MCP)</a> servers, and APIs to solve domain-specific tasks. These workloads rely on both tabular and unstructured data (text, documents, images, and embeddings), all governed under a single data and metadata plane. Additionally, a scalable inference layer is essential to deploy and serve these models securely and efficiently.</p>
<p>As AI workloads become increasingly multi-modal and agentic, access to <a href="/content/www/en-us/products/cloudera-data-platform/sdx/data-catalog.html">catalogs</a> and metadata becomes just as critical. AI pipelines, retrieval systems, and autonomous agents all rely on metadata to discover datasets, reproduce training states, and maintain lineages. An open catalog provides a universal way for these systems to query, register, and track datasets—regardless of where or how they are processed.&nbsp;</p>
<p>Cloudera’s open foundation enables organizations to support the complete spectrum of analytical, predictive, and GenAI workloads.</p>
<h2>Cloudera’s Unified Data and AI Platform</h2>
<p>Cloudera’s <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a> unifies data engineering, analytics, and AI on the same governed architecture by building on open foundations like Apache Iceberg and the Iceberg REST Catalog. The platform is designed around the principle that workloads (whether analytical or AI) should operate where the data already lives. By eliminating the friction of moving or duplicating data, teams can build continuous pipelines that span ingestion, transformation, analytics, and model operations with full lineage and governance.</p>
<p><i>Figure 2: Cloudera’s Iceberg REST Catalog Enables Interoperability with Third-Party Engines</i></p>
<p>This catalog layer is critical for feature engineering pipelines, agentic workflows, and retrieval systems to locate and access governed datasets dynamically. AI agents can treat the REST Catalog much like a knowledge graph of enterprise data: they can discover available tables, interpret their schemas, and reason over table metadata (such as partitioning, snapshots, and lineage) to determine which datasets to use.</p>
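<p>For a sense of what table discovery looks like at the protocol level, the sketch below builds request URLs for three routes defined in the open Iceberg REST catalog specification: listing namespaces, listing tables, and loading a table's metadata. It performs no network calls. The base URI, namespace, and table names are hypothetical, and authentication is left out.</p>

```python
from urllib.parse import quote

# Sketch only: builds URLs for routes in the Iceberg REST catalog spec.
# The host, namespace, and table names used below are hypothetical.


def _root(base_uri, prefix=""):
    """Return the versioned API root, with the optional catalog prefix."""
    root = base_uri.rstrip("/") + "/v1"
    if prefix:
        root = root + "/" + quote(prefix, safe="")
    return root


def _namespace_segment(namespace_parts):
    """Multi-level namespaces are joined with the 0x1F unit separator."""
    return quote("\x1f".join(namespace_parts), safe="")


def list_namespaces_url(base_uri, prefix=""):
    return _root(base_uri, prefix) + "/namespaces"


def list_tables_url(base_uri, namespace_parts, prefix=""):
    segment = _namespace_segment(namespace_parts)
    return _root(base_uri, prefix) + "/namespaces/" + segment + "/tables"


def load_table_url(base_uri, namespace_parts, table, prefix=""):
    tables = list_tables_url(base_uri, namespace_parts, prefix)
    return tables + "/" + quote(table, safe="")


base = "https://catalog.example.com/api/"
tables_url = list_tables_url(base, ("analytics", "daily"))
table_url = load_table_url(base, ("analytics", "daily"), "page_views")
```

<p>An agent or engine would issue GET requests against these URLs. The load-table response carries the table metadata, including schemas, partition specs, and snapshots, which is the information an agent can inspect before deciding what to query.</p>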
<h3>Security and Governance: Cloudera SDX</h3>
<p><a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a> is the unified security and governance framework that spans every service, from ingestion to inference. SDX provides a single, consistent layer for data lineage, auditing, access control, and policy enforcement, ensuring that every workload inherits the same security model regardless of where it runs. It integrates with enterprise identity systems (LDAP, SSO, OAuth) and supports fine-grained, role- and attribute-based access controls across structured and unstructured data.</p>
<p>By coupling SDX with the open lakehouse foundation, Cloudera ensures that data, models, and AI agents operate within the same governed boundary—delivering transparency, reproducibility, and trust for both analytical and GenAI workloads.</p>
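<p>The combination of role-based and attribute-based access control can be pictured with a small toy evaluator. This is a conceptual sketch, not the SDX policy model or API. The <code>Policy</code> class, the resource name, and the roles and attributes are all made up for the example.</p>

```python
from dataclasses import dataclass, field

# Toy access check for illustration only. Names and policy shape are invented.


@dataclass
class Policy:
    resource: str
    allowed_roles: frozenset
    # Attribute conditions that must all match, for example a region tag.
    required_attributes: dict = field(default_factory=dict)


def is_allowed(policy, user_roles, user_attributes):
    """Grant access only if a role matches AND every attribute condition holds."""
    has_role = bool(policy.allowed_roles.intersection(user_roles))
    attributes_match = all(
        user_attributes.get(key) == value
        for key, value in policy.required_attributes.items()
    )
    return has_role and attributes_match


policy = Policy(
    resource="reports.regulatory_filings",
    allowed_roles=frozenset({"compliance_analyst"}),
    required_attributes={"region": "IN"},
)

analyst_in_region = is_allowed(policy, {"compliance_analyst"}, {"region": "IN"})
analyst_out_of_region = is_allowed(policy, {"compliance_analyst"}, {"region": "UK"})
engineer_in_region = is_allowed(policy, {"data_engineer"}, {"region": "IN"})
```

<p>The point of a shared governance layer is that a check like this is defined once and applied to every engine that touches the data, so each workload sees the same answer.</p>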
<h3>Cloudera Data and AI Services</h3>
<p>The unified services layer brings together all the functional capabilities that teams need to transform, analyze, and operationalize AI, all while working on the same governed data.</p>
<p><b>Data Engineering</b></p>
<p><a href="/content/www/en-us/products/data-engineering.html">Cloudera Data Engineering</a>, built on open-source Apache Spark and Apache Airflow, provides a serverless service for building, orchestrating, and scaling data pipelines directly on Iceberg tables—enabling reliable, reproducible ETL and feature pipelines for analytics and AI workloads across hybrid environments.</p>
<p><b>AI Services</b></p>
<p>The Cloudera AI services layer operationalizes the full lifecycle of AI, starting from model training and fine-tuning to secure deployment—all running natively on the same governed data foundation with Iceberg. It unifies model development, registry, and inference into a single workflow that bridges data engineering and AI operations.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-future-delivered-today-the-ai-powered-data-lakehouse</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>How Clouderans Give Back During The Season of Thanks</title><description><![CDATA[Our annual Week of Giving is a dedicated time for our global Cloudera community to come together to live out our values, collaborate, and make a positive impact on the world. This year, our theme is “A Season of Thanks, A Week of Giving.”]]></description><link>https://www.cloudera.com/blog/culture/how-clouderans-give-back-during-the-season-of-thanks.html</link><guid>https://www.cloudera.com/blog/culture/how-clouderans-give-back-during-the-season-of-thanks.html</guid><pubDate>Thu, 20 Nov 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Ashton Stockstill]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/person-from-audience-talking.webp"><p>Our annual Week of Giving is a dedicated time for our global Cloudera community to come together to live out our values, collaborate, and make a positive impact on the world. This year, our theme is “A Season of Thanks, A Week of Giving.” In essence, the week is about far more than just giving back. It’s a time for us all to reflect on the things we are thankful for while embracing opportunities for service through independent volunteering, company events, and donations to local community organizations.&nbsp;</p>
<p>It’s always fulfilling and impactful for Clouderans to get out and make a difference for causes that matter to them. As we wrap up this week, Clouderans have been participating in events around the world. This year, our teams got involved in a variety of efforts that included creating youth mental health toolkits, playing Bingo with seniors, donating coats through <a href="http://onewarmcoat.org/" target="_blank">One Warm Coat</a>, joining Christmas light workshops, and volunteering with the <a href="https://wck.org/" target="_blank">World Central Kitchen</a>.</p>
<p>Our people are the heart of our Cloudera Cares program and Week of Giving. Their dedication, passion, and time are what make these events so special. With that, let’s hear from Clouderans about what they find special during this time of year.</p>
<h2>What makes you most thankful to work at Cloudera?</h2>
<p><i>“I'm thankful that Cloudera empowers its employees to take the lead on supporting causes we're passionate about. It is wonderful to have access to a platform like Benevity, where I can give to causes I believe in and the Company matches those funds, doubling my impact.” – TJ Sundar, Private Cloud Field Specialist (EMEA)&nbsp;</i></p>
<p><i>“I'm thankful for the opportunities to be in service to the community that I live in and where I grew up, alongside my fellow Clouderans. I've volunteered for many virtual and in-person Cloudera Cares events, and I always leave feeling more connected to my colleagues.” – Renee Castro, Learning &amp; Enrichment Partner (AMER)</i></p>
<p><i>“For me, it’s the people and the culture. I’m constantly grateful to be surrounded by such a talented, smart, and driven group of individuals. There’s a genuine spirit of collaboration, and I feel like I learn something new from my colleagues every single day. The culture here truly encourages growth, curiosity, and supporting one another, which makes coming to work inspiring and rewarding.” – Laura Hughes, Director, R&amp;D Operations &amp; Programs (EMEA)&nbsp;</i></p>
<h2>What makes giving back and volunteering important to you?</h2>
<p><i>“Volunteering for Second Harvest, handing out food and essentials for people reminds me not to take things for granted. It is incredibly meaningful for me to see how thankful the recipients are.” – Westley Chan, Sr. Manager, Business Applications (AMER)</i></p>
<p><i>“Giving back is important to me as it reminds me that my actions, no matter how small, can create a meaningful change. Volunteering allows me to help others and impact people's lives. I find this extremely rewarding.” – Deepa Pednekar, Senior Practice Manager (EMEA)</i></p>
<p><i>“It’s important to me because I want to lead by example for my child. We take so much from the world, and volunteering gives me a chance to give something back.” – Asha Mohan Chandran, Learning &amp; Enrichment Partner (APAC)&nbsp;</i></p>
<h2>Why is Week of Giving such an important part of the employee experience at Cloudera?</h2>
<p><i>“This is a time to strengthen our culture and remember that, together, we can achieve remarkable things.” – Marcus Fig, Cloud Sales Specialist (AMER)</i></p>
<p><i>“It’s one thing for a company to have values; it’s another to live them. Events like this are the 'living' part. They're a core part of our experience because they show that Cloudera Cares is more than just a program. It's an action. Week of Giving is a unique opportunity to bond with colleagues. It's a chance to collaborate with people in a different way than you normally would, whether you’re decorating smiley-face goodie bags or sharing Halloween-themed baked goods. These interactions build our culture and strengthen our relationships.” – TJ Sundar</i></p>
<p><i>“I've participated in volunteering events at previous companies that felt like we were just &quot;checking the box.&quot; Week of Giving—in addition to all other volunteering events I've participated in at Cloudera—feels full of intention and attracts individuals who truly care about the communities served. From those who run the events to the volunteers themselves, we are doing more than just &quot;checking the box&quot; - we truly care about impact.” – Renee Castro</i></p>
<h2>What have you learned from your time volunteering, both with colleagues and in your community?</h2>
<p><i>“Teamwork and organization. Special thanks and shoutouts to the folks organizing these volunteer events to bring people together and help the community. Every little bit helps!” – Westley Chan</i></p>
<p><i>“Volunteering, both with colleagues and in the community, has taught me the power of collaboration and empathy. I’ve learned that even small actions can have a meaningful impact, and that working together toward a common goal strengthens connections and builds a sense of shared purpose. It’s inspiring to see the difference we can make when we combine our skills, time, and energy to help others.” – Laura Hughes&nbsp;</i></p>
<p><i>“I've learned the power of collaboration, empathy and giving back to society. I've found that these events help strengthen relationships beyond the workplace. It's taught me humility and just realizing that meaningful change starts small with consistent efforts. You never know - your little contributions can bring big smiles across many individuals and communities.” – Deepa Pednekar</i></p>
<h2>Continuing Cloudera’s Commitment to Our Global Community&nbsp;</h2>
<p>As we celebrate another Week of Giving, we want to thank everyone who participated in this year’s activities. It has been so rewarding to see Clouderans from across our global offices volunteer and give back to their communities.&nbsp;</p>
<p><a href="/content/www/en-us/about/philanthropy.html">Learn more</a> about how Clouderans are helping shape the communities that we all call home.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=how-clouderans-give-back-during-the-season-of-thanks</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Trino: The Federation Engine Powering Your Unified Data Fabric</title><description><![CDATA[Connect, manage, and govern data across hybrid and multi-cloud environments]]></description><link>https://www.cloudera.com/blog/business/trino-the-federation-engine-powering-your-unified-data-fabric.html</link><guid>https://www.cloudera.com/blog/business/trino-the-federation-engine-powering-your-unified-data-fabric.html</guid><pubDate>Thu, 20 Nov 2025 05:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Katie Gdula]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-blue-gears.jpg"><p><span class="text-lead"><b>Connect, manage, and govern data across hybrid and multi-cloud environments</b></span></p>
<p>In today’s data landscape, organizations often grapple with massive, distributed data estates spanning multiple clouds and on-premises systems. This complexity leads to data silos and costly, time-consuming data movement for analysis.&nbsp;</p>
<p>A <a href="/content/www/en-us/blog/business/cloudera-named-a-leader-in-the-2025-forrester-wave-for-data-fabric-platforms.html">unified data fabric</a> addresses this challenge by providing an architectural layer that automates and orchestrates data discovery, access, and management across distributed, hybrid environments. It connects data, without data movement, from any source, applies consistent governance, and delivers unified access for analytics, AI, and real-time decision-making.</p>
<p>Trino, an open-source distributed SQL query engine, is a key component of <a href="/content/www/en-us/products/unified-data-fabric.html">Cloudera’s data fabric</a>. It enables big data analytics and data engineering by running interactive queries and batch processing across vast amounts of data, without requiring unnecessary data movement or storage format conversions. Trino can, in a single query, collate data from multiple sources, including data lakes, and run federated queries across these disparate systems.</p>
<h2>High-Level Use Cases for Trino&nbsp;</h2>
<p>Trino is versatile, supporting a diverse array of use cases–from high-speed, ad-hoc analytics to complex batch processes.&nbsp;</p>
<h3>Centralized Data Access and Analytics with Query Federation</h3>
<p>Query federation is a core strength of Trino. It provides the ability to query many disparate data sources within the same system using a single SQL query. This capability dramatically simplifies analytics for users who need a comprehensive view of all their data. Trino's architecture is designed for diverse connectivity, allowing it to federate across dozens of heterogeneous sources. A key feature is zero-copy data, which eliminates the need for expensive, and sometimes risky, data movement or replication.</p>
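<p>As a rough illustration of query federation, the sketch below builds a single SQL statement that joins tables from two different catalogs using Trino's fully qualified <code>catalog.schema.table</code> naming. The catalog, schema, table, and column names (<code>hive</code>, <code>postgresql</code>, <code>orders</code>, <code>customers</code>) are hypothetical placeholders rather than names from any Cloudera deployment, and the snippet only assembles the query text; it does not connect to a cluster.</p>

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Trino table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_revenue_query(orders_table: str, customers_table: str) -> str:
    """Build one SQL statement that joins tables living in two different sources."""
    return (
        "SELECT c.region, SUM(o.amount) AS revenue\n"
        f"FROM {orders_table} AS o\n"
        f"JOIN {customers_table} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


# Hypothetical catalogs: a data lake table and an operational database table.
orders = qualified("hive", "sales", "orders")
customers = qualified("postgresql", "crm", "customers")
query = federated_revenue_query(orders, customers)
print(query)
```

<p>The resulting string could be submitted through any Trino client. The point is that both sources appear in one statement, so no data has to be copied into a common store first.</p>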
<h3>Interactive and High-Performance Data Analytics</h3>
<p>Trino is built primarily for interactive analytics, with an architecture designed from the ground up for efficient, low-latency query performance. Data analysts and data scientists can query large amounts of data, test hypotheses, conduct A/B testing, and build visualizations or dashboards directly. Trino is designed to be performant enough to enable analytics that were previously impossible or took hours to complete.</p>
<h3>Batch ETL Processing Across Disparate Systems</h3>
<p>While interactive analysis is key, Trino also accelerates large extract, transform, load (ETL) processes that typically run in batches and are resource-intensive. Engineers can speed up ETL processes using standard SQL statements, avoiding more complex, error-prone, and hard-to-maintain code-based ETL processes that work with a range of data sources and targets.</p>
<h2>Cloudera with Trino: A Unified Data Fabric is the Pathway to Agentic AI Anywhere</h2>
<p>Cloudera's integration of Trino addresses the needs of organizations with large, heterogeneous data estates and prepares them for the future of data: agentic AI. A unified data fabric is the foundation for trusted AI.&nbsp;</p>
<p>The key differentiators of the Cloudera + Trino integration include low-latency performance for agentic AI anywhere, providing real-time reasoning directly within business flows, with unified governance and security, and a focused experience with AI automation.</p>
<h3>Hybrid and Multi-Cloud Deployment: AI Everywhere</h3>
<p>Cloudera provides an anywhere cloud experience with a data and AI platform that allows customers to run the identical software stack and unified control plane across public clouds, private clouds, and on-premises data centers. This is a decisive advantage for organizations concerned with data sovereignty and regulatory requirements.</p>
<p>Trino on Cloudera is optimized for on-premises and cloud environments and can be deployed to federate data across systems using <a href="https://docs.cloudera.com/data-warehouse/1.5.5/administration/topics/dw-trino-connectors.html" target="_blank" rel="noopener noreferrer">certified connectors</a>. Unlike cloud-native, SaaS-only architectures, Cloudera's hybrid approach is essential for regulated industries, like banking and government, whose operational data cannot be moved to a public cloud vendor’s SaaS platform.</p>
<h3>Low-Latency Performance for Agentic AI&nbsp;</h3>
<p>Cloudera leverages Trino's architecture to enable operational AI—the application of AI/ML models to live, real-time business processes—key to anyone pursuing agentic AI. Trino’s architecture is massively parallel processing (MPP), in-memory, and pipelined, allowing for sub-second to few-second performance. For interactive analytics workloads, Trino can be 2 to 30 times faster than Apache Spark. Data scientists can embed real-time model inference logic directly into a low-latency, federated Trino query, combining fast federated access with the power of Python AI/ML for true operational AI and agentic workflows.</p>
<h3>Unified Governance and Security</h3>
<p>For enterprise adoption, centralized governance is paramount. Trino is integrated with <a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a>, ensuring consistent security and management. This added layer of security ensures that all metadata and access controls are unified to simplify management and self-service access. Cloudera delivers a single endpoint to access all data across various engines, including Trino, without needing to replicate access and security policies.&nbsp;</p>
<h3>Focused Experience and AI Automation</h3>
<p>Cloudera enhances the user experience for administrators and practitioners, driving efficiency and democratizing access to data. Teams benefit from automated warehouse management, natural language access, and simplified administration through guided federation connector setup and a true hybrid deployment model–simplifying data architecture and empowering zero-copy analytics with no ETL burden.</p>
<p>With Trino, Cloudera delivers a &quot;govern once, access everywhere&quot; solution, providing a secure, high-performance query engine that runs identically across your hybrid, multi-cloud estate–a necessity for mastering the complexity of modern enterprise data and enabling real-time AI workflows.</p>
<h2>Next Steps: Building a Unified Data Fabric with Cloudera and Trino</h2>
<p>Cloudera’s unified data fabric enables organizations to govern every dataset, track every lineage, and trust every prediction, ensuring responsible AI that aligns with enterprise and regulatory standards. Trino extends the value of Cloudera’s data fabric by centralizing data access, performing interactive and high-performance analytics, and running batch processing across disparate systems.</p>
<p>To learn more about how Cloudera with Trino can transform your analytics and AI experience, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html?menu-resources">schedule a virtual demo</a>.</p>
<p>Cloudera was recently named a Leader in The Forrester Wave™: Data Fabric Platforms, Q4 2025. <a href="/content/www/en-us/campaign/the-forrester-wave-data-fabric-platforms-q4-2025.html">Access the report</a> to understand the trends shaping data fabric architectures—and how we believe Cloudera continues to lead the way.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=trino-the-federation-engine-powering-your-unified-data-fabric</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Inside the Third Wave of Data and AI</title><description><![CDATA[From the rise of the internet to the explosion of cloud computing, AI and data analytics are the latest major shakeup in the world of big data.]]></description><link>https://www.cloudera.com/blog/business/inside-the-third-wave-of-data-and-ai.html</link><guid>https://www.cloudera.com/blog/business/inside-the-third-wave-of-data-and-ai.html</guid><pubDate>Fri, 14 Nov 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-water.png"><p>From the rise of the internet to the explosion of cloud computing, every major technological era has reshaped how we use and create data. Now, according to Cloudera Chief Technology Officer Sergio Gago, we’re entering a third phase of big data focused on convergence.&nbsp;</p>
<p>He recently joined The AI Forecast podcast to discuss how the convergence of cloud and on-premises systems is setting the stage for a new generation of private AI—where enterprises can fully control their data, models, and AI life cycles.</p>
<p>Here are the key takeaways from the conversation.</p>
<h2>The Convergence of Cloud and On-Prem—and Why It Enables Private AI</h2>
<p><b>Paul:</b> Let’s talk about your vision. What does the third wave of big data mean to you, and why is it so important?&nbsp;</p>
<p><b>Sergio: </b>We started with the era of control. Many companies had their own data centers that gave them control of their data. Then the cloud came in and we entered what we call the era of convenience. So, you had teams with a credit card that could go into any hyperscaler and start playing with data either for machine learning or for building dashboards. It was so easy that it brought shadow IT into many enterprises, which made controlling cost, TCO, and data governance growing challenges.&nbsp;</p>
<p>That was the story of cloud and data. Now today, you kick a rock and there are hundreds of engines, databases, and options. We talk about Frankenstein architectures now, where companies have dozens—if not hundreds—of components and are struggling to bring them together. The era of convenience brought this complexity.&nbsp;</p>
<p>Now fast forward with the advent of <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">AI and AI agents</a> and the regulation and compliance requirements for many enterprises and startups alike. To comply, organizations need to bring all the controls of the first era back, especially in large enterprises. All that is forcing companies and individuals to converge and manage both worlds, the data center and the cloud, to have the control and governance of the data center with the convenience of the cloud. That’s why we call the third wave the era of convergence.&nbsp;</p>
<h2>Private AI: Full-Lifecycle Control and the Human Advantage</h2>
<p><b>Paul:</b> I wanted to talk to you about the private <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">AI component.</a> With private data, I have a tremendous competitive advantage. How does private AI help me tap into that?&nbsp;</p>
<p><b>Sergio:</b> Private AI is the ability to control the full life cycle of your AI applications. What models do you use? How do you deploy them? Which ones are approved from a compliance perspective? How do you make sure the model weights stay constant for as long as you need? Then you have data from your company that lives both in the cloud and in the data center. You need to safely bring that data into your model—either for training, fine-tuning, or other techniques like RAG. That’s what makes your model unique to you.&nbsp;</p>
<p>The competitive advantage of most companies today is the data, but also the skills—the human capacity to drive insights. It’s not necessarily the data itself but the experience and domain knowledge that allow you to interpret it. Private AI helps you preserve that advantage by controlling everything from model lifecycle to prompt management, lineage, and benchmarking so you can move from proof of concept to true production workloads.&nbsp;</p>
<h2>Build for ROI and Risk—With Agents, Governance, and Culture in the Loop</h2>
<p><b>Paul:</b> When we talk about topics like convergence, we sometimes run the risk of alienating businesspeople who'll see this as more of a CTO-type of discussion, a technical discussion. From your perspective, what does something like convergence do to unlock new use cases or business value that you couldn't get before as a CEO or business leader?&nbsp;</p>
<p><b>Sergio: </b>I think that the CEO will always want to understand the actual value of a tool, either in terms of ROI or cost reduction, or value improvement for your company. GenAI is just the conveyor belt for all those things.&nbsp;&nbsp;</p>
<p>At the same time, the second angle every CEO has front and center is risk—either from FOMO or from fear of becoming the next company in the headlines due to a massive AI hallucination. Those are the two sides of the scale that CEOs are working with.&nbsp;</p>
<p>GenAI use cases need to start from the business side. Bring in compliance, governance, IT, cybersecurity, and legal from the very beginning so that it doesn’t become an experiment in the garage that then doesn’t go anywhere. Showing value in those terms allows you to then take them to the enterprise.&nbsp;</p>
<p>Catch the full conversation with Sergio Gago on The AI Forecast on<a href="https://open.spotify.com/episode/1tzeLBho6WJQHQvCVpouv6?si=e42d03bd9663426d" target="_blank" rel="noopener noreferrer"> Spotify</a>,<a href="https://podcasts.apple.com/us/podcast/private-ai-big-datas-third-era-with-sergio-gago/id1779293119?i=1000732950941" target="_blank" rel="noopener noreferrer"> Apple Podcasts</a>, and<a href="https://www.youtube.com/watch?v=XbrsTQlVvjM" target="_blank" rel="noopener noreferrer"> YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=inside-the-third-wave-of-data-and-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>A 5-Step Framework To Streamline Your Post-Merger Data Strategy</title><description><![CDATA[This article introduces a five-step framework to address those challenges and accelerate value capture in M&amp;A settings. This framework will ensure your post-merger data strategy with Cloudera delivers the capabilities needed to streamline the technology integration process. ]]></description><link>https://www.cloudera.com/blog/business/a-5-step-framework-to-streamline-your-post-merger-data-strategy.html</link><guid>https://www.cloudera.com/blog/business/a-5-step-framework-to-streamline-your-post-merger-data-strategy.html</guid><pubDate>Thu, 13 Nov 2025 18:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Andreas Skouloudis]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-girls-walking-datamesh.webp"><p>Inorganic growth strategies, such as mergers and acquisitions (M&amp;A), serve as strategic growth levers, enabling companies to realize revenue and cost synergies or to rapidly acquire emerging capabilities that will deliver long-term competitive advantage. Today, for instance, we observe major organizations acquiring smaller, innovative AI start-ups to accelerate their AI transformation efforts and gain a competitive edge.&nbsp;</p>
<p>Technology integration plays a crucial role in value capture from M&amp;As. A <a href="https://www.deloitte.com/ch/en/services/consulting-financial/research/accelerating-it-services.html" target="_blank" rel="noopener noreferrer">Deloitte study</a> argues that IT is a key driver of integration benefits, accounting for more than 50% of all synergies. However, due to the proliferation of data silos and varying technology architectures and environments, organizations face several post-merger data challenges in realizing technology integration benefits.&nbsp;</p>
<p>This article introduces a five-step framework to address those challenges and accelerate value capture in M&amp;A settings. This framework will ensure your post-merger data strategy with Cloudera delivers the capabilities needed to streamline the technology integration process.&nbsp;</p>
<p><i>Figure 1: Post-Merger Data Integration Framework with Cloudera</i></p>
<h2>1. Accelerate Post-Merger Integration with Cloudera Octopai Data Lineage</h2>
<p>At the start of post-merger integration, the data discovery phase frequently becomes a bottleneck, since fragmented and undocumented sources delay critical analytics and compliance efforts. <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> addresses this challenge by providing an automated, AI-powered metadata management solution that accelerates data discovery, end-to-end lineage, and cataloging across complex hybrid and multi-cloud environments.&nbsp;</p>
<p>Cloudera Octopai Data Lineage effectively maps data flows and fills metadata gaps, providing multi-dimensional lineage that traces origins and transformations for complete visibility. With more than 60 native integrations and universal connectors for non-native systems, Cloudera Octopai Data Lineage streamlines the onboarding of acquired data estates, thereby improving data transparency, quality, and trust.&nbsp;</p>
<p>For example, in banking merger scenarios, this capability facilitates rapid identification and tagging of risk-related datasets, ensuring compliance with regulatory standards such as BCBS 239, while minimizing the need for extensive manual audits or intervention.&nbsp;</p>
<h2>2. Integrate Disparate Data Sources with Cloudera Data In Motion</h2>
<p>Integrating diverse data sources and eliminating complex, custom ETL pipelines is a critical post-merger challenge. Cloudera delivers robust capabilities for batch and real-time data ingestion, processing, and data distribution through <a href="/content/www/en-us/products/dataflow.html">Cloudera Data Flow</a> (powered by Apache NiFi) and <a href="/content/www/en-us/products/stream-processing.html">Cloudera Streaming</a> (powered by Apache Kafka and Apache Flink).&nbsp;</p>
<p>With more than 450 connectors, Cloudera Data Flow provides a visual, drag-and-drop interface to ingest data from a variety of heterogeneous data sources, whether on-premises, in the cloud, or at the edge. In addition, Cloudera Streaming provides a messaging bus architecture that decouples source systems from consuming systems between the two entities, thereby eliminating point-to-point integrations that add architectural complexity and higher costs.&nbsp;</p>
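<p>To make the cost of point-to-point integration concrete, here is a minimal back-of-the-envelope sketch. It assumes the worst case in which every source system feeds every consuming system, and the system counts are hypothetical. It is generic arithmetic, not a description of how Cloudera Streaming is implemented.</p>

```python
def point_to_point_links(sources: int, consumers: int) -> int:
    """Worst case: every source connects directly to every consumer."""
    return sources * consumers


def bus_links(sources: int, consumers: int) -> int:
    """With a shared message bus, each system connects once, to the bus."""
    return sources + consumers


# Hypothetical example: 6 source systems and 5 consuming systems.
sources, consumers = 6, 5
direct = point_to_point_links(sources, consumers)
via_bus = bus_links(sources, consumers)
print(f"point-to-point: {direct} integrations, via bus: {via_bus} integrations")
```

<p>The gap widens as either side grows, which is why decoupling producers from consumers tends to matter most when two organizations' systems are being combined.</p>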
<p>During post-merger integration, these capabilities can significantly accelerate and simplify data movement between organizations. For instance, Cloudera Data Flow can be used to quickly integrate on-premises data from legacy source systems of the acquired company into the cloud-native data warehouse of the parent company, expediting decision-making.&nbsp;</p>
<h2>3. Build a Secure Data Sharing Layer on Cloudera Open Data Lakehouse with Apache Iceberg</h2>
<p>Data sharing between merging entities is an essential requirement for integrated decision-making and deriving insights. This process can be complex due to the diverse exploratory analytics and business intelligence technologies, as well as the varying data security models used by different systems.&nbsp;</p>
<p>An <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a> approach that combines <a href="/content/www/en-us/open-source/apache-iceberg.html">Apache Iceberg</a>, the <a href="/content/www/en-us/blog/business/democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration.html">Cloudera Iceberg REST Catalog</a>, and <a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a> enables organizations to develop a unified data sharing layer. This layer is compatible with various analytical engines (for example, Snowflake, Databricks, AWS EMR, AWS Athena, and Salesforce Data Cloud, as long as these engines are Iceberg REST Catalog enabled) and provides a fine-grained security and governance model to manage access for a diverse range of users, including the newly integrated data science teams.&nbsp;</p>
<p>For example, two healthcare organizations engaged in drug manufacturing can leverage Cloudera to construct a <a href="/content/www/en-us/blog/business/navigating-gxp-compliance-in-the-age-of-precision-medicine-and-ai.html">GxP-compliant</a> data lakehouse that consolidates the data assets of the merging entities while ensuring adherence to regulatory requirements.</p>
<h2>4. Standardize Cross-Environmental Initiatives on a Single, Multi-Cloud Environment</h2>
<p>The different environments used for analytical activities in the two merging entities lead to duplicative operations throughout the data lifecycle, including multiple data engineering pipelines for common tasks such as data ingestion and standardization.</p>
<p>Cloudera empowers organizations to standardize data and AI operations on a common runtime across various private and public cloud environments. This capability derives from the underlying containerized infrastructure model used across environments, a consistent user authentication and authorization mechanism (Cloudera SDX), and Cloudera Manager, which serves as the single pane of glass for managing clusters across different deployment environments and regions.</p>
<p>In a post-merger context, this standardization is transformative: the two companies can integrate their data lifecycle operations onto a single runtime, eliminating redundant tools and facilitating the sharing of data, insights, and AI models. This leads to reduced technology and labor costs for data operations and AI/ML model development, increased practitioner productivity, consolidation of multiple tools, and reduction of data silos.&nbsp;</p>
<h2>5. Scale AI Initiatives Anywhere with Cloudera AI&nbsp;</h2>
<p>Post-acquisition or merger, the immediate challenge is integrating the disparate tools, models, and data scientists from the newly acquired innovative start-up, all while managing changing capacity demands. <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI Workbench and AI Inference</a> empower organizations to scale AI initiatives on-premises or in the cloud by:</p>
<ul>
<li><p>Providing a container-based, end-to-end solution for feature engineering, model training, experimentation tracking, and model deployment</p>
</li>
</ul>
<ul>
<li><p>Facilitating AI model sharing that allows data scientists to collaborate among disparate teams</p>
</li>
</ul>
<ul>
<li><p>Leveraging <a href="/content/www/en-us/blog/partners/cloudera-and-nvidia-deliver-ai-powered-transformation-in-financial-services.html">hardware and software</a> acceleration services from Cloudera partners that can speed up the entire data science lifecycle by improving data engineering performance by 20x and AI inference performance by up to 6x&nbsp;</p>
</li>
</ul>
<p>With Cloudera, the integrated company can achieve substantial cost reduction by moving persistent, compute-intensive workloads such as AI/ML model serving to on-premises environments. More importantly, it can accelerate the time-to-market for new, combined AI applications. This allows the organization to rapidly realize the “competitive advantage” it sought from the M&amp;A in the first place.</p>
<h2>Take the Next Step to Ensure Successful Integration After Your Next Merger and Acquisition</h2>
<p>Cloudera can accelerate the post-merger integration of data assets and analytical capabilities between the two integrating entities. Our platform offers scalability across the data lifecycle, an infrastructure-agnostic deployment model, and interoperability of the data lakehouse on Cloudera services and Apache Iceberg. This combination provides an architectural blueprint for standardizing AI/ML initiatives and data operations, and for delivering a data sharing model that can be used by both Cloudera and non-Cloudera services.&nbsp;</p>
<p>To schedule a demo or product tour, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">contact our team</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=a-5-step-framework-to-streamline-your-post-merger-data-strategy</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloud Migration Checklist: Getting Your Data Landscape Ready</title><description><![CDATA[If you’re considering moving your data assets, processes, and applications to the cloud, you’re in good company. But if you’re dreading the move, you’re not alone there either. A data migration will inevitably strain your organization’s time, resources, and patience. But this article is here to help—a good checklist can make the process smoother so you can focus on execution. ]]></description><link>https://www.cloudera.com/blog/technical/cloud-migration-checklist-getting-your-data-landscape-ready.html</link><guid>https://www.cloudera.com/blog/technical/cloud-migration-checklist-getting-your-data-landscape-ready.html</guid><pubDate>Thu, 06 Nov 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Ron Pick]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty163521492.jpg"><p>Do you know where your data is? The number of people who can pat their server and say fondly, “Right here!” is decreasing. Instead, more people are lifting their eyes to the heavens and answering, “Um… up there… somewhere…” McKinsey reports that in 2025, large enterprises have <a href="https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/projecting-the-global-value-of-cloud-3-trillion-is-up-for-grabs-for-companies-that-go-beyond-adoption" target="_blank">60% of their environment in the cloud</a>.&nbsp;</p>
<p>If you’re considering moving your data assets, processes, and applications to the cloud, you’re in good company. But if you’re dreading the move, you’re not alone there either. A data migration will inevitably strain your organization’s time, resources, and patience. But this article is here to help—a good checklist can make the process smoother so you can focus on execution.&nbsp;</p>
<p>We’ve put together a cloud migration checklist below. It’s a helpful framework that covers the points you need to address for a successful migration.</p>
<h2>Do You Have Someone To Head The Migration?</h2>
<p>If you can’t check this one, stop in your tracks. Do not pass go; do not go to jail; do not head to Free Parking; do not MOVE!</p>
<p>A revolution without a leader will quickly dissolve into chaos. A cloud migration will face the same fate. The leader of a cloud migration must have both strong technical skills and strong interpersonal skills, because <a href="https://itbusinessnet.com/2021/06/surmounting-the-cloud-adoption-plateau/" target="_blank">personnel issues can stall or hinder a migration</a>. Your migration leader needs to facilitate a shift in your data’s location as well as a shift of your employees’ attitudes and perspectives about data.</p>
<p>If you don’t have one person who can fulfill both roles, you can divide the leadership between a technical “migration architect” and an interpersonal “migration evangelist,” with each responsible for the cloud migration steps in their area of expertise.&nbsp;</p>
<p>One tool that will help your migration evangelist is a <a href="/content/www/en-us/products/cloudera-data-platform/sdx/data-catalog.html">data intelligence platform with a data catalog</a>. When every employee can locate the data asset they need, no matter where it’s currently situated, resistance decreases and acceptance increases.&nbsp; &nbsp;</p>
<h2>Do You Know What You Can Leave Behind?</h2>
<p>Don’t move garbage. If you’re rolling your eyes and saying “Duh!”, then you haven’t been part of lift-and-shift migrations that take a legacy system and move it to a cloud environment, basically as is. If your organization has had its legacy system for more than a few years, it’s almost guaranteed to have garbage: outdated assets, defunct reports, redundant processes… all kinds of digital dust bunnies scampering around.</p>
<p>That’s not to say there isn’t ever a place for lift-and-shift migrations. However, if you’re trying to do the migration thing right, then spend the time sorting through what you have and deciding what’s valuable enough to be migrated, and what should stay behind.</p>
<p>Here an <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html?utm_medium=sem&amp;utm_source=google&amp;keyplay=MDA&amp;utm_campaign=Other---AlwaysOn-FY26-GLOBAL-WS-Website-Data-Lineage-Schedule-a-Demo-Form&amp;cid=701Ui00000YZo89IAD&amp;xvar=geo_amer_octopai_data_lineage_phrase&amp;utm_term=data%20fabric&amp;gad_source=1&amp;gad_campaignid=22879563311&amp;gbraid=0AAAAACcmMQHdqgnYMNJLfo0VkKJENIU-l&amp;gclid=CjwKCAjw0sfHBhB6EiwAQtv5qTLr-HTHoKGo-xSu4EdBhkl_7-y2HJxIrUpWe0ysV9j5jY7nS2kicRoC9JsQAvD_BwE">automated data lineage solution</a> can be invaluable. In minutes to hours, automated data lineage can create a complete mapping of your legacy data landscape, revealing your data flow and interconnections. A close reading of this data lineage map will show you almost everything you need to decide what goes to the cloud and what can be relegated to the past.</p>
<h2>Are Your Applications Ready To Take Advantage Of The Cloud’s Benefits?</h2>
<p>So you’ve decided what’s coming to the cloud. Fantastic! Now it’s time to look more closely at your applications and pipelines. The real financial and operational benefits of cloud migration are only achieved when your data systems architecture is designed to take advantage of the cloud’s benefits, such as:</p>
<ul>
<li>Dynamic scaling</li>
<li>Distributed workloads</li>
<li>Serverless computing capabilities</li>
<li>Powerful AI and ML capabilities</li>
</ul>
<p>Make yourself a checklist for each application that you plan to migrate. For each one, check which cloud benefits that application is poised to take advantage of in its current state. For example, if an application cannot yet run across a variable number of servers, and you simply replicate it in the cloud, it can’t take advantage of distributed workloads.&nbsp;</p>
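<p>One lightweight way to keep such a per-application checklist is sketched below. It records which of the four cloud benefits each application can use in its current state and reports the gaps. The application names and their readiness values are hypothetical, and the structure is only one possible layout.</p>

```python
from dataclasses import dataclass, field

# The four cloud benefits named in the list above.
CLOUD_BENEFITS = (
    "dynamic scaling",
    "distributed workloads",
    "serverless computing",
    "AI and ML capabilities",
)


@dataclass
class AppChecklist:
    """Per-application record of which cloud benefits it can use as-is."""
    name: str
    ready_for: set = field(default_factory=set)

    def gaps(self) -> list:
        """Benefits the application cannot yet take advantage of, in list order."""
        return [b for b in CLOUD_BENEFITS if b not in self.ready_for]


# Hypothetical applications, for illustration only.
apps = [
    AppChecklist("billing-reports", {"dynamic scaling", "distributed workloads"}),
    AppChecklist("legacy-etl"),
]
for app in apps:
    print(app.name, "->", app.gaps())
```

<p>An application with a long list of gaps is a candidate for the refactoring or optimizing work discussed next, while one with no gaps may be ready to move as it stands.</p>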
<h2>What Needs To Be Done To Make This Specific Application Cloud-Ready?</h2>
<p>Sometimes bringing your application up to cloud speed is simple and quick. Sometimes it requires hours upon hours of development time. Possible scenarios include:</p>
<ul>
<li>Refactoring (reconstructing the application to match cloud capabilities)</li>
<li>Optimizing (the tweaks needed are more minor than refactoring)</li>
</ul>
<p>When you see the investment needed, you can then make an educated decision as to how to handle the application in question. You may decide to refactor, to optimize, or just to leave it alone for now and do a lift-and-shift, as sometimes the return on investment for refactoring or optimizing just isn’t worth it in your current situation.</p>
<h2>Check? Check!</h2>
<p>Data migrations aren’t easy or enviable, but with a detailed cloud database migration checklist to guide you, they can at least feel a little more manageable. Ready to bring your data landscape up to speed? Check!</p>
<p>For tips on how to reduce cloud costs once your migration is complete, check out this blog next: <a href="/content/www/en-us/blog/business/3-steps-to-cutting-cloud-costs-with-data-lineage.html">3 Steps to Cutting Cloud Costs with Data Lineage</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloud-migration-checklist-getting-your-data-landscape-ready</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Named a Leader in the 2025 Forrester Wave for Data Fabric Platforms</title><description><![CDATA[We’re thrilled to share that Cloudera has been named a Leader in the 2025 Forrester Wave for Data Fabric Platforms.]]></description><link>https://www.cloudera.com/blog/business/cloudera-named-a-leader-in-the-2025-forrester-wave-for-data-fabric-platforms.html</link><guid>https://www.cloudera.com/blog/business/cloudera-named-a-leader-in-the-2025-forrester-wave-for-data-fabric-platforms.html</guid><pubDate>Wed, 05 Nov 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Wim Stoop]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-blue-orange-back-person-walking.webp"><p>We’re thrilled to share that Cloudera has been named a Leader in the 2025 Forrester Wave for Data Fabric Platforms. This recognition underscores its commitment to helping organizations unify, secure, and activate their data across hybrid and multi-cloud environments.</p>
<p>In this blog, we cover what a data fabric is and why it matters, what sets Cloudera apart, the <a href="/content/www/en-us/products/cloudera-data-platform.html">key capabilities of the Cloudera platform</a> that resulted in this position as a Leader, and why all of this matters for Cloudera customers.</p>
<h2>What is a Data Fabric?</h2>
<p>In a world where data is more distributed than ever, enterprises need a way to connect the dots across silos—from on-premises systems to public clouds and everywhere in between. That’s exactly what a data fabric enables.</p>
<p>A data fabric is an architectural approach that connects, manages, and governs data across hybrid and multi-cloud environments. This approach allows data to be accessed and used anywhere, by anyone, securely and efficiently. Instead of forcing organizations to move all their data into a single system, a data fabric creates a virtual, unified layer that integrates data from multiple sources (clouds, on-premises, streaming, and edge) into one consistent framework. It provides end-to-end visibility, lineage, governance, and access, so teams can find, trust, and use the right data in real time.</p>
<h2>Why Data Fabric Matters Now</h2>
<p>As organizations accelerate AI adoption and cloud transformation, they face a common challenge: data fragmentation. Data lives across multiple clouds, legacy systems, and on-premises environments—making it difficult to govern, secure, and operationalize for business impact.</p>
<p>A <a href="/content/www/en-us/products/unified-data-fabric.html">data fabric</a> addresses this by providing an architectural layer that automates and orchestrates data management across distributed environments. It connects data from any source, applies consistent governance, and delivers unified access for analytics, AI, and real-time decision-making.</p>
<p>Forrester’s evaluation of the top data fabric vendors highlights the importance of this capability as enterprises seek to implement data and AI initiatives securely and at scale and, in our opinion, Cloudera’s position as a leader in making it a reality.</p>
<p>According to the Forrester report, “[Cloudera’s] focus on private cloud and on-premises deployments gives it a stronghold in industries with data sovereignty or legacy system requirements.” This long-standing foundation, combined with our open hybrid cloud strategy, has helped our customers modernize data architectures without compromising control or governance.</p>
<h2>Key Capabilities for Open Data Fabric: Where Cloudera Scored Highest</h2>
<p>In our opinion, receiving a 5/5 score from Forrester reflects more than product maturity: it signals leadership, customer validation, and measurable differentiation. In the 2025 Forrester Wave for Data Fabric Platforms, Cloudera received the highest scores possible (5/5) in seven criteria:</p>
<ul>
<li><p>End-to-End Integrated Fabric</p>
</li>
<li><p>Unified Data Catalog</p>
</li>
<li><p>Real-Time Performance and Scalability</p>
</li>
<li><p>Metadata Management</p>
</li>
<li><p>Agentic AI</p>
</li>
<li><p>Vision</p>
</li>
<li><p>Roadmap</p>
</li>
</ul>
<p>For <b>End-to-End Integrated Fabric</b>, Forrester defines a score of 5 as delivering advanced data management through a comprehensive, unified management portal that spans distributed environments, with integrated metadata, governance, and policies. It also recognizes the vendor as a leading contributor to key open-source fabric components.</p>
<p>In <b>Unified Data Catalog</b>, a 5/5 score indicates the vendor provides superior support for features such as a unified and automated data catalog across multiple data fabrics, AI-powered discovery, classification and enrichment of metadata, full customization, native integration with third-party catalogs, and the ability for business users to leverage the catalog with full capabilities.</p>
<p>Achieving a 5/5 score in <b>Real-Time Performance and Scalability</b> indicates the vendor provides superior support for features such as certified hardware integration with NVIDIA GPUs, integration with SIMD, automated advanced AI/ML query tuning, automated tiered storage, automatic adding and dropping of resources, AI-enabled intelligent workload management, advanced horizontal scale-out, dynamic sharding and balancing, and automated scale-up and scale-down.</p>
<p>In <b>Metadata Management</b>, Forrester looks for advanced automation, including end-to-end metadata discovery, tagging, and classification; AI automation (such as automated tagging of sensitive data); comprehensive integrated metadata across distributed fabrics; and integrated support for the data product lifecycle. Cloudera’s acquisition of Octopai enhances these capabilities by delivering deep lineage and metadata intelligence across hybrid environments, supporting the full lifecycle of governed data products.</p>
<p>The <b>Agentic AI</b> criterion recognizes vendors that embed autonomous AI agents to support the <a href="/content/www/en-us/products/unified-data-fabric.html">data fabric</a>. To earn a 5, platforms must demonstrate AI agents that automate integration, governance, and discovery, operating collaboratively and contextually.</p>
<p>Forrester’s 5/5 score in the <b>Vision</b> and <b>Roadmap</b> criteria is reserved for vendors whose strategy anticipates customer needs and shapes the direction of the market, along with evidence of execution. Cloudera’s clear, open, and hybrid approach, bridging data, analytics, and AI across any environment, demonstrates a bold and differentiated vision that continues to lead the industry forward. Our investments in intelligent automation, interoperability, and agentic AI illustrate forward momentum in the roadmap.</p>
<p>Together, these 5/5 scores affirm Cloudera’s position as a trusted, future-ready data platform that unifies data, analytics, and AI across any cloud or infrastructure. The 5/5 scores in the metadata management and agentic AI criteria demonstrate how Cloudera’s data fabric continues to evolve to meet the needs of modern, AI-driven enterprises.</p>
<h2>What Sets Cloudera Apart: A Strategy Built for the Future of Data and AI</h2>
<p>At Cloudera, our mission is to make data and AI work together, seamlessly and securely, across any environment. We believe Cloudera stood out in Forrester’s evaluation for its open, hybrid-by-design architecture that enables enterprises to manage data seamlessly across on-premises and multi-cloud environments, powered by open standards and open source innovation.</p>
<p>The Forrester report notes that “Cloudera’s data fabric strategy tackles the challenge of fragmented data, aiming to deliver integrated governance, visibility, and secure access across hybrid and multicloud environments.”</p>
<p>Key elements of our strategy include:</p>
<ul>
<li><p><b>Integrated governance and visibility: </b><a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a> ensures that policies for access, lineage, and compliance are applied consistently across all workloads. This unified approach brings consistency and transparency across all data assets.</p>
</li>
</ul>
<ul>
<li><p><b>Metadata intelligence and lineage:</b> <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage </a>enables end-to-end lineage, impact analysis, and automated metadata management.</p>
</li>
</ul>
<ul>
<li><p><b>Open architecture and interoperability: </b>Cloudera’s AI-ready architecture brings together advanced analytics, machine learning, and real-time streaming to help organizations transform raw data into actionable insight faster. It is designed to work seamlessly with non-Cloudera engines, supporting flexibility and avoiding lock-in.</p>
</li>
</ul>
<ul>
<li><p><b>Intelligent automation:</b> Our roadmap invests in agentic AI, automation, and intelligent data fabric capabilities to optimize workloads and deliver adaptive data experiences.</p>
</li>
</ul>
<ul>
<li><p><b>Trusted and proven:</b> Our platform is proven at global scale, trusted by leading banks, telecommunications providers, and public sector organizations to power some of the world’s most data-intensive and mission-critical operations with reliability and confidence.</p>
</li>
</ul>
<p>These advancements underscore our commitment to helping enterprises simplify complexity, ensure trust, and accelerate innovation as data becomes the foundation of AI-driven transformation.</p>
<h2>Why Customers Should Care: Data Fabric is the Foundation for Trusted AI</h2>
<p>As enterprises scale their AI initiatives, the importance of a unified, governed data layer cannot be overstated. AI models are only as good as the data they’re built on, and that data must be accessible, high-quality, and compliant. Once you have trusted, governed data from anywhere, you can power trusted AI everywhere.</p>
<p>Cloudera’s data fabric enables organizations to govern every dataset, track every lineage, and trust every prediction, ensuring responsible AI that aligns with enterprise and regulatory standards. Cloudera’s open data lakehouse extends the value of the data fabric by enabling secure analytics, machine learning, and AI on unified, high-quality data.</p>
<p>Together, the <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Unified Data Fabric</a> and <a href="/content/www/en-us/products/open-data-lakehouse.html">Cloudera Open Data Lakehouse</a> form the foundation for a modern enterprise data strategy, one that brings intelligence to every workload, user, and business decision. With Cloudera, organizations don’t just unify data; they unlock its full potential to drive innovation, resilience, and responsible AI at scale.</p>
<h2>See the Full Evaluation</h2>
<p>We invite you to read The Forrester Wave™: Data Fabric Platforms, Q4 2025 to see how vendors stack up and why Cloudera was named a Leader. <a href="/content/www/en-us/campaign/the-forrester-wave-data-fabric-platforms-q4-2025.html">Access the report</a> to understand the trends shaping data fabric architectures and how we believe Cloudera continues to lead the way.</p>
<p><i>Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change. For more information, read about Forrester’s objectivity <a href="https://www.forrester.com/about-us/objectivity/">here</a>.</i></p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-named-a-leader-in-the-2025-forrester-wave-for-data-fabric-platforms</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The Inevitable Outage: Why Your Hybrid Strategy Needs Multi-Cloud Resilience</title><description><![CDATA[The recent global IT outage experienced by a major cloud hyperscaler was a disruptive, real-world reminder that downtime and service disruptions are inevitable. The event impacted services across banking, retail, and healthcare, and served as a powerful warning that relying on any single provider, or even a single cloud region, creates a critical business vulnerability. ]]></description><link>https://www.cloudera.com/blog/business/the-inevitable-outage-why-your-hybrid-strategy-needs-multi-cloud-resilience.html</link><guid>https://www.cloudera.com/blog/business/the-inevitable-outage-why-your-hybrid-strategy-needs-multi-cloud-resilience.html</guid><pubDate>Wed, 29 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Blake Tow]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty157283393.jpg"><p>The recent global IT outage experienced by a major cloud hyperscaler was a disruptive, real-world reminder that downtime and service disruptions are inevitable. The event impacted services across banking, retail, and healthcare, and served as a powerful warning that relying on any single provider, or even a single cloud region, creates a critical business vulnerability.&nbsp;</p>
<p>This outage highlights the critical risk of a single-provider strategy, rather than an inherent problem with the cloud. It’s the clearest example yet of why a hybrid cloud strategy—one that gives you the freedom to move data and AI workloads between clouds and data centers—must include multi-cloud capabilities.</p>
<p>This is why <a href="/content/www/en-us.html">Cloudera's &quot;anywhere cloud&quot; approach</a> is the clear choice for organizations looking to ensure business continuity. When we say &quot;data anywhere,&quot; we mean it: in your data centers, at the edge, and across multiple public clouds.</p>
<h2>Hybrid is the Foundation for Freedom</h2>
<p>For years, Cloudera has championed a hybrid cloud strategy as the foundation for enterprise freedom. We believe that you should have the flexibility to run your data and AI workloads where it makes the most sense for your business—whether that’s in your own data centers, in a public cloud, or at the edge—and the choice to move them as needed depending on changing business imperatives.</p>
<p>The goal of hybrid is to deliver a consistent cloud experience anywhere, giving you the agility and scalability of the public cloud while maintaining the security and control of your private cloud. This approach is designed to give enterprises the freedom to move data and AI workloads between clouds and data centers, without friction or vendor lock-in. This freedom from infrastructure lock-in is the core of a <a href="/content/www/en-us/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html">resilient architecture</a>.</p>
<h2>The Key: Hybrid <i>Includes</i> Multi-Cloud for Resilience</h2>
<p>While this hybrid foundation provides crucial freedom and choice, the recent outage exposed a critical blind spot in many hybrid strategies: if your architecture simply connects your data center to a single public cloud provider, you’re still dangerously exposed. You’ve merely swapped one single point of failure for another.</p>
<p>As we discussed in our <a href="/content/www/en-us/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html">last post</a>, true resilience is about eliminating single points of failure. A modern hybrid strategy, therefore, must be a multi-cloud strategy. Achieving true business continuity means having the freedom to “failover anywhere.” This capability must go beyond a simple on-premises-to-cloud connection to include failover between cloud regions, back to your data center, and critically, from one cloud provider to another.</p>
<h2>How Cloudera's Cloud Anywhere Platform Makes This a Reality</h2>
<p>On paper, a multi-cloud failover strategy is the obvious answer. In reality, it’s incredibly complex. Different cloud providers have different APIs, data services, and security models. For most organizations, moving a mission-critical data workload from one cloud to another would require a painful, time-consuming effort to refactor applications, re-architect security policies, and migrate data.</p>
<p>Reducing this complexity is precisely the problem our platform was built to solve. Cloudera’s cloud anywhere platform enables a true &quot;failover anywhere&quot; strategy by providing two essential, unique capabilities:</p>
<ul>
<li><p><b>A consistent, portable platform: </b>Our <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a> and portable data services run identically everywhere. We provide a consistent &quot;write-once, run-anywhere&quot; data and AI platform that runs on any cloud, including AWS, Azure, and Google Cloud, as well as in your private data center. This eliminates the need to refactor applications or workloads when moving between different infrastructures, giving you true portability and eliminating infrastructure dependency.</p>
</li>
</ul>
<ul>
<li><p><b>A unified data fabric with replication:</b> A workload encompasses more than data; it also includes the security and governance that must travel with it. Our <a href="/content/www/en-us/products/unified-data-fabric.html">Unified Data Fabric</a>, powered by <a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience (SDX)</a>, ensures that critical metadata, security, and governance policies are consistent everywhere. Capabilities like <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> provide deep metadata management and lineage, which is also critical context for a failover scenario. Our <a href="/content/www/en-us/products/cloudera-data-platform/sdx/replication-manager.html">Replication Manager</a> then replicates both the data and its critical context, which includes metadata and policies, to another environment.</p>
</li>
</ul>
<p>This combination makes the multi-cloud resilience scenario a practical reality. You can run your primary workloads on one cloud provider while using Replication Manager to maintain a synchronized, secondary environment on a completely different cloud provider. When an outage strikes your primary provider, you can quickly promote the secondary environment, ensuring business continuity with minimal data loss (recovery point objective, or RPO) and minimal downtime (recovery time objective, or RTO).</p>
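<p>To make the two recovery metrics concrete, here is a minimal sketch of how achieved RPO and RTO could be computed for a failover drill. The function names, timestamps, and targets are hypothetical and chosen purely for illustration; they are not part of Replication Manager or any Cloudera API.</p>

```python
from datetime import datetime, timedelta


def achieved_objectives(last_replicated, outage_start, service_restored):
    """Return (data_loss_window, downtime) for one failover event.

    data_loss_window: time between the last successful replication and the outage
    (what RPO bounds); downtime: time from outage to restored service (what RTO bounds).
    """
    data_loss_window = outage_start - last_replicated
    downtime = service_restored - outage_start
    return data_loss_window, downtime


def meets_targets(data_loss_window, downtime, rpo_target, rto_target):
    """True when both the data-loss window and the downtime are within target."""
    return data_loss_window <= rpo_target and downtime <= rto_target


# Hypothetical drill: last sync at 09:55, outage at 10:00, secondary promoted by 10:20.
loss, down = achieved_objectives(
    datetime(2025, 10, 20, 9, 55),
    datetime(2025, 10, 20, 10, 0),
    datetime(2025, 10, 20, 10, 20),
)
ok = meets_targets(loss, down, rpo_target=timedelta(minutes=15), rto_target=timedelta(minutes=30))
print(loss, down, ok)
```

<p>In this sketch, replicating more often shrinks the data-loss window, while a rehearsed promotion procedure shrinks the downtime.</p>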
<h2>Your Hybrid Strategy Must Be Multi-Cloud Ready</h2>
<p>The recent outage should be treated as a drill. It was a test of every organization's resilience strategy, and it exposed a common, critical vulnerability: single-provider dependency. A hybrid architecture is the right foundation for the modern enterprise, but if your strategy has a single-provider blind spot, it's not truly resilient. Don't wait for the next, inevitable disruption to discover this.&nbsp;</p>
<p>Cloudera provides the true &quot;cloud experience anywhere,&quot; giving you the capability to design a resilience plan that can withstand any failure. To learn more about how to build a truly resilient architecture, read our blogs &quot;<a href="/content/www/en-us/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html">Architecting for Data Resilience</a>&quot; and &quot;<a href="/content/www/en-us/blog/business/mastering-multi-cloud-with-cloudera-strategic-data-ai-deployments-across-clouds.html">Mastering Multi-Cloud with Cloudera</a>&quot;.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-inevitable-outage-why-your-hybrid-strategy-needs-multi-cloud-resilience</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Strengthen Data Governance with the Power of Automated Data Lineage</title><description><![CDATA[Learn how successful governance managers leverage next-gen data lineage tools to improve data governance a hundredfold in four key ways.]]></description><link>https://www.cloudera.com/blog/business/strengthen-data-governance-with-the-power-of-automated-data-lineage.html</link><guid>https://www.cloudera.com/blog/business/strengthen-data-governance-with-the-power-of-automated-data-lineage.html</guid><pubDate>Tue, 28 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Ron Pick]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-two-people-product.jpg"><p>Trying to manage governance without a <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">comprehensive data lineage solution</a> can leave you feeling like your data keeps running away. It’s not easy to keep up with data and metadata on the move. Successful governance managers and data stewards leverage a data lineage tool to improve governance a hundredfold in four key ways we’ll explore next.&nbsp;<br>
&nbsp;</p>
<h2>4 Ways A Data Lineage Tool Will Improve Data Governance</h2>
<h3>1. Correcting Errors</h3>
<p>Maintaining quality is a key goal of <a href="/content/www/en-us/resources/faqs/data-governance.html">data governance.</a> It’s your responsibility to make sure that management and business users make important decisions based on accurate information.</p>
<p>If you find erroneous data, of course remove and replace it ASAP. But if you’re constantly correcting retroactively instead of fixing the origin of the error, you’ll be constantly pulling weeds in that data field. Long term, it’s much more effective to identify where in the system the error was introduced and fix it at the source.&nbsp;</p>
<p>A comprehensive data lineage tool enables you to trace any data point’s journey upstream to origin and downstream to target, inspecting every process that transformed the data along the way.&nbsp;</p>
<p>In the case of flawed data, you can use data lineage to quickly conduct root cause analysis to work backward from where the error first appeared and identify the stage and/or process where the data changed from accurate to flawed. You can then correct the problem at the root, eliminating the proliferation of dirty data and the necessity of correcting that data wherever it travels in your environment.&nbsp;</p>
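<p>As a rough illustration of that backward trace, the sketch below walks a small lineage graph upstream and reports the datasets that are flawed even though all of their inputs are clean. The dataset names, the <code>UPSTREAM</code> mapping, and both functions are hypothetical; a real lineage tool would supply the graph and the quality checks.</p>

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the upstream datasets it is derived from.
UPSTREAM = {
    "revenue_dashboard": ["sales_mart"],
    "sales_mart": ["orders_clean", "fx_rates"],
    "orders_clean": ["orders_raw"],
    "orders_raw": [],
    "fx_rates": [],
}


def trace_upstream(lineage, start):
    """Return every upstream dataset of `start` in breadth-first order."""
    seen = []
    queue = deque(lineage.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen


def find_root_cause(lineage, start, is_flawed):
    """Return flawed datasets whose direct inputs are all clean (likely error origins)."""
    candidates = [start] + trace_upstream(lineage, start)
    return [
        node
        for node in candidates
        if is_flawed(node)
        and not any(is_flawed(parent) for parent in lineage.get(node, []))
    ]


# Assume quality checks flagged these three datasets as flawed.
flawed = {"revenue_dashboard", "sales_mart", "orders_clean"}
origins = find_root_cause(UPSTREAM, "revenue_dashboard", lambda name: name in flawed)
print(origins)
```

<p>Here the error surfaces in the dashboard, but the trace points at the step that produces <code>orders_clean</code>, since its only input is still accurate.</p>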
<h3>2. Keeping Up With Minor Changes</h3>
<p>If you want to work in an industry where change seems slow, try paleontology. When you work in data governance, change is constant and fast. Technologies evolve, source systems develop, your dataset structure is modified to reflect new business demands from your data, calculation methods change, and so on.</p>
<p>All the constant little changes need to be reflected in your data governance platform, or you’ll quickly wind up with piles of ungoverned data. If it's left up to human, manual effort to keep the data governance platform updated, then it’s very easy for a change to fall through the cracks.</p>
<p><a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Automated data lineage tools</a>&nbsp;for data governance, on the other hand, will periodically and automatically run through all your metadata and make note of any new additions, deletions or changes. They will then update your data governance platform with the new fields, calculations or other metadata.</p>
<p>With an automated data lineage solution at your back, you can concentrate on managing and governing data instead of chasing it.</p>
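<p>The periodic scan described above boils down to comparing two metadata snapshots. The following sketch assumes each snapshot is a plain dictionary keyed by field name; the field names and the <code>diff_metadata</code> helper are illustrative only and do not represent any product's API.</p>

```python
def diff_metadata(previous, current):
    """Compare two metadata snapshots and list added, removed, and changed fields."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = sorted(
        name
        for name in previous.keys() & current.keys()
        if previous[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken by two consecutive scans.
last_scan = {
    "customer_id": {"type": "int"},
    "net_revenue": {"type": "decimal", "formula": "gross - tax"},
    "legacy_flag": {"type": "bool"},
}
this_scan = {
    "customer_id": {"type": "int"},
    "net_revenue": {"type": "decimal", "formula": "gross - tax - discounts"},
    "region_code": {"type": "string"},
}

changes = diff_metadata(last_scan, this_scan)
print(changes)
```

<p>Each entry in the result is something that would otherwise have to be reported and entered into the governance platform by hand.</p>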
<h3>3. Preparing For Major Changes</h3>
<p>Mergers and migrations and transitions—oh, my! Most data professionals will probably experience, if not preside over, at least one of these major events over the course of their careers.&nbsp;</p>
<p>The transition is usually unavoidable. And it will just as unavoidably wreak havoc with the work of anyone in your business who touches data and its results—from governance to BI to business—unless you foresee where the changes made to accommodate the new system will impact your current workflows.&nbsp;</p>
<p>Short of a crystal ball, this foresight can only be had by creating a complete visualization of your current system and data flow, comparing it with the intended layout and processes of the new system, and planning how to transition smoothly from one to the other.&nbsp;</p>
<p>It also usually involves lots of communication between members of different departments to apprise them of the slated changes and ask how these changes will affect them, their data and their processes (and then hope they actually respond in a timely fashion). This process, when done manually, typically takes an entire data department months to complete.</p>
<p>Furthermore, an upcoming major transition can be an opportunity—an opportunity to make your data governance more efficient by pruning out dormant fields, consolidating overlapping definitions and checking the consistency of process results. But capitalizing on that opportunity can take months of manual mapping efforts just to prepare for the real work of streamlining your data management.&nbsp;</p>
<p>An automated data lineage tool can turn those months of manual impact analysis into days, or even a single day. Talk about efficiency. One small step for an automated data lineage tool; one giant leap for data governance.&nbsp;</p>
<h3>4. Setup</h3>
<p>Let’s take a trip down memory lane to the day your company got a new enterprise data governance platform: Congratulations! This platform is going to work wonders for your company as soon as you set it up. But that’s easier said than done.&nbsp;</p>
<p>Data governance platforms usually have an incorporated <a href="/content/www/en-us/products/cloudera-data-platform/sdx/data-catalog.html">data catalog</a>, and setup means populating that catalog with all the metadata you are planning to govern. That process usually takes months upon months of work. However, with an automated data lineage tool, you can set up an entire data catalog on your lunch break.</p>
<p>As mentioned above, a comprehensive data lineage solution doesn’t lie down on the job after the initial cleanup. It periodically refreshes, updating your data governance platform with any metadata changes or additions, so you don’t have to endanger your working relationship with any other department by reminding them constantly to update you or the platform every time they make a change to a field, a process or a report.</p>
<h2>Picking The Right Tool For Data Lineage In Data Governance</h2>
<p>Not everything that calls itself a “data lineage” solution can actually perform all the functions above. Some tools come with built-in automated lineage functions that still require significant manual labor (and headache). As such, it’s important to evaluate solutions to ensure they offer the full suite of capabilities and metadata management you need.</p>
<p>To that end, request a demo to get started with <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a>—an automated lineage solution that can perform these functions and improve your data governance today.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=strengthen-data-governance-with-the-power-of-automated-data-lineage</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Architecting for Data Resilience: Ensuring Business Continuity with Cloudera</title><description><![CDATA[Learn how Cloudera customers are uniquely positioned to ensure business continuity from our portable hybrid architecture and data resiliency tools.]]></description><link>https://www.cloudera.com/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html</link><guid>https://www.cloudera.com/blog/business/architecting-for-data-resilience-ensuring-business-continuity-with-cloudera.html</guid><pubDate>Wed, 22 Oct 2025 18:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Jeremiah Morrow,Eileen O’Loughlin]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-2people-looking-at-tablet.webp"><p>The recent global IT outage experienced by a cloud hyperscaler was a reminder of a universal truth in technology: even if it’s minimal, downtime and service disruptions are inevitable. While the impact was widespread, disrupting services across retail, banking, healthcare, and other sectors, this wasn’t a failure unique to a single provider or a single cloud. It illustrates that disruption can occur anywhere: in any cloud region, with any provider.&nbsp;</p>
<p>The key takeaway is clear: organizations can and must take control by building a resilient data architecture that can adapt and thrive amid constant change. In this blog, we’ll share how <a href="/content/www/en-us.html">Cloudera</a> customers are uniquely positioned to ensure business continuity thanks to the flexibility of our portable architecture and tools that ensure seamless failover and recovery. Cloudera is the only data and AI platform company that brings AI to data anywhere: in clouds, data centers, and at the edge.</p>
<h2>What Does it Mean to Architect for Resilience?</h2>
<p>Data resilience is an organization's ability to withstand, recover quickly from, and minimize the impact of data-related disruptions or failures. It is a proactive approach to business continuity, going beyond backup or disaster recovery to ensure that critical data always remains:</p>
<ul>
<li><p><b>Available:</b> Accessible to users and applications when needed (minimizing recovery time objective or RTO)</p>
</li>
<li><p><b>Intact/accurate (data integrity)</b>: Uncorrupted and unaltered (minimizing recovery point objective or RPO)</p>
</li>
<li><p><b>Secure</b>: Protected from unauthorized access, loss, or theft</p>
</li>
</ul>
<p>Architecting for true resilience involves two core, interconnected pillars: technology that enables portability and a vetted process for failover.</p>
<h3>1. Enable Failover Anywhere: Eliminate Single Points of Failure</h3>
<p>Relying on a single provider, a single cloud, or even a single region within a cloud creates a critical business vulnerability, or single point of failure. Outages occur due to hardware failures, software issues, human error, natural disasters, or cyberattacks. The goal of resilience is to ensure that when one environment goes down, your operations can seamlessly and automatically continue elsewhere.</p>
<p>This means you must be able to failover anywhere—between cloud regions, across cloud providers, and even back to a <a href="/content/www/en-us/products/cloudera-data-platform.html">data center</a>. Business operations must continue, and critical systems must remain up and running, regardless of where the initial disruption occurred.</p>
<h3>2. Have a Vetted Plan for Resilience</h3>
<p>Technology can provide resilience capability, but the process is essential for successful business continuity. Too many disaster recovery plans are written once and rarely revisited, even as people and technology evolve. A well-vetted plan is documented, practiced, and revisited regularly to ensure that the organization can execute in the event of a failure. Some elements of the plan include:&nbsp;</p>
<ul>
<li><p><b>Prioritizing workloads</b> to ensure mission-critical operations, such as transaction processing in retail and remote monitoring in healthcare, have the strictest service level agreements (SLAs), with the lowest RTO and RPO targets.</p>
</li>
<li><p><b>Ensuring redundancy and high availability </b>by establishing the ability to failover between environments to maintain operations.</p>
</li>
<li><p><b>Backing up </b>critical data and metadata, and establishing retention policies and governance.</p>
</li>
</ul>
<h2>How Does Cloudera Help Organizations Architect for Resilience?</h2>
<p>Cloudera is the only data and AI platform provider that delivers a consistent cloud experience to data anywhere. This gives enterprises the freedom to move data and AI workloads between clouds and data centers—without friction or vendor lock-in—so that you’re no longer tied to any one piece of infrastructure. As a result, organizations can reduce business risk by leveraging Cloudera to architect for resilience and maintain consistent operations and compliance no matter where data resides.</p>
<p>The Cloudera platform supports high availability and disaster tolerance through our solutions and services, including:</p>
<ul>
<li><p><a href="/content/www/en-us/products/cloudera-data-platform.html">Portable Data Services</a>: Cloudera’s platform, including cloud-native data services and data lake, runs consistently on any cloud (AWS, Azure, Google Cloud) and on premises in Kubernetes. The freedom from underlying infrastructure enables customers to configure a variety of available sites—mixing different clouds and on-premises resources—to drastically reduce dependency on a single platform or vendor.</p>
</li>
</ul>
<ul>
<li><p><a href="/content/www/en-us/blog/business/what-makes-data-in-motion-architectures-a-must-have-for-the-modern-enterprise.html">Data in Motion</a>: Cloudera Data Flow, Cloudera Streaming Analytics, and Cloudera Streams Messaging enable customers to capture, process, and distribute data anywhere in real time. For mission-critical, real-time workloads like fraud&nbsp;detection and network monitoring, a potential outage can have significant business impact. Cloudera ensures these services remain highly available and can be replicated across environments.</p>
</li>
</ul>
<ul>
<li><p><a href="https://docs.cloudera.com/replication-manager/cloud/operations/topics/rm-about-replication-manager.html">Replication Manager</a>: This core Cloudera component provides a simplified approach to backup and recovery. It replicates not just the data but also the metadata and the critical security and governance policies tied to that data. This replication enables easy migration, continuous synchronization, and, most importantly, the ability to fail over quickly, with minimal data loss, by promoting a secondary, replicated environment that runs alongside the primary operating environment.</p>
</li>
</ul>
<ul>
<li><p><a href="/content/www/en-us/products/open-data-lakehouse.html">Open Data Lakehouse</a>: Cloudera’s open data lakehouse provides secure data management and portable cloud-native data analytics with a write-once, run-anywhere approach. This eliminates the time and costs associated with refactoring applications or workloads when moving between different infrastructures.</p>
</li>
</ul>
<p><i><b>Figure 1.</b>&nbsp;Cloudera Delivers the Cloud Experience Anywhere for AI Everywhere</i></p>
<p>Together, these capabilities enable Cloudera customers to run mission-critical data and AI workloads with confidence, ensuring near-zero downtime and data loss for their most important business processes, even during an infrastructure-level outage.&nbsp;</p>
<h2>How AM-BITS Architected for Resilience in the Face of Geopolitical Instability</h2>
<p>For many businesses, the recent outage was just a blip. But what if the disruption were a true disaster, like a war? Based in Ukraine, <a href="/content/www/en-us/customers/am-bits.html">AM-BITS</a>, an IT solutions provider for the banking, telecom, and retail sectors, faced an urgent need to secure and migrate its clients’ mission-critical data after geopolitical disruption forced organizations to rapidly accelerate their shift from on-premises systems to the cloud. A typical cloud migration could take six months or more, a timeline that many businesses could not afford.</p>
<p>To address this crisis of continuity, AM-BITS built a modern, multi-tenant data and <a href="/content/www/en-us/products/machine-learning.html">AI platform</a> powered by Cloudera. Leveraging Cloudera Shared Data Experience (Cloudera SDX), AM-BITS rapidly provided a “technical safe harbor” for its clients’ data assets, cutting the time needed to securely migrate data to the cloud by 50%. Because Cloudera operates seamlessly across any environment, AM-BITS’ clients gained true flexibility: they could migrate to the cloud quickly, but they also maintained the option to move to a different cloud or bring data back on premises. By leveraging Cloudera, AM-BITS turned portability into a powerful tool for business continuity.</p>
<h2>Next Steps</h2>
<p>Data-related disruptions and outages can be caused by hardware failures, software issues, human error, natural disasters, cyberattacks, and more. It’s critical that organizations design their systems&nbsp;with those points of failure in mind and have a plan in place to recover their IT systems and data quickly and without significant disruption.</p>
<p>To learn more about how you can architect for resilience with Cloudera, take a look at our <a href="https://docs.cloudera.com/cdp-reference-architectures/latest/cdp-ra-operations/topics/cdp-ra-checklist-resources.html">disaster recovery checklist and resources</a>, or reach out to our professional services team, which can help you design a plan for resilience.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=architecting-for-data-resilience-ensuring-business-continuity-with-cloudera</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Developer Relations at Cloudera: What We’re Building for Our Developer Community</title><description><![CDATA[In this blog post, I’ll share what I’ve seen in my first month on the job and what excites me most about what we’re building here. My goal is to enable practitioners to learn, explore, and build with technologies that matter—whether that’s open data architectures with Apache Iceberg; streaming systems with Apache Flink, Kafka, and NiFi; or generative AI (GenAI) applications. So, importantly, I’ll discuss how Cloudera’s platform and data services can support running and delivering these technologies, securely,  at scale, anywhere (clouds, data centers, and at the edge) with the openness and trust that developers expect.]]></description><link>https://www.cloudera.com/blog/business/developer-relations-at-cloudera-what-we-are-building-for-our-developer-community.html</link><guid>https://www.cloudera.com/blog/business/developer-relations-at-cloudera-what-we-are-building-for-our-developer-community.html</guid><pubDate>Wed, 22 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Dipankar Mazumdar]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1338981383.jpg"><p><b><i>Figure 4</i></b><i>: Cloudera’s Enterprise Intelligent Center at NYC <b>EVOLVE</b></i></p>
<h2>My Next 30 Days: What I’m Looking Forward To Most</h2>
<p>I love how Cloudera focuses on developers. Teams here talk constantly about how to make things easier to use, how to remove friction, and how to listen to feedback from practitioners in the field. That mindset of putting developer productivity and real-world needs first is exactly where Developer Advocacy can add value.</p>
<p>We’re building a home for the developer community: a place where engineers can learn, try things out, and build without friction. Our focus is on helping developers move from “this looks hard” to “I can build this” with the right patterns, explanations, and tools.</p>
<p>To materialize that vision of a true home for developers, keep an eye out for a Cloudera <b>Developer Hub</b>—a central place where the community can find all of this content, access labs, ask questions, and exchange ideas with other practitioners.&nbsp;</p>
<p>More to come on that soon! In the meantime, stay up to date with our latest practitioner news by <a href="https://community.cloudera.com/" target="_blank">subscribing to the Cloudera Community.</a></p>
<p><b><i>Figure 3</i></b><i>: Cloudera Iceberg REST Catalog and how it offers interoperability with third-party engines</i></p>
<h3>Launching New Technologies</h3>
<p>That’s why it was exciting to see the launch of the <a href="https://docs.cloudera.com/runtime/7.3.1/overview/topics/cr-ds-cloudera-iceberg-rest-catalog.html" target="_blank" rel="noopener noreferrer">Cloudera Iceberg REST Catalog</a> at <a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-09-25-cloudera-accelerates-ai-and-analytics-projects-with-a-unified-platform-for-secure-governed-and-performant-data.html">NYC EVOLVE</a>. With this release, developers can use third-party engines to access Cloudera-managed data directly—without copying or moving it around. Just as important, the same security and governance policies follow the data everywhere, ensuring consistency no matter where it’s accessed.</p>
<p>Alongside the REST Catalog, we also announced the <a href="/content/www/en-us/blog/technical/cloudera-lakehouse-optimizer-easier-to-deliver-high-performance-iceberg-tables.html">Lakehouse Optimizer</a>. For engineers, particularly those working with Iceberg, this matters because it takes care of the tedious, behind-the-scenes work that usually comes with managing Iceberg tables. Instead of manually handling tasks like compacting small files, rewriting manifests, or cleaning up position deletes, engineers can let the optimizer do this automatically.</p>
<p>What that translates to is simple: faster queries and lower storage costs, without developers needing to constantly tune or babysit their tables. And since it’s built as an open service, the same optimizations apply no matter which Iceberg-compatible engine you’re running.&nbsp;</p>
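<p>To make “compacting small files” concrete, here is a minimal sketch of the bin-packing idea behind compaction. It is an assumption-laden illustration, not the Lakehouse Optimizer’s implementation: the function name, the thresholds, and the file sizes are all hypothetical. Small files are grouped into batches that stay at or under a target size, so each batch can be rewritten as one larger file.</p>

```python
def plan_compaction(file_sizes_mb, target_mb=512, small_file_mb=128):
    """Group small files into batches whose combined size stays within target_mb.

    Uses first-fit decreasing bin packing. Only batches with two or more
    files are returned, because rewriting a single file saves nothing.
    """
    small = sorted((s for s in file_sizes_mb if small_file_mb > s), reverse=True)
    groups = []
    for size in small:
        for group in groups:
            # Place the file in the first batch that still has room.
            if target_mb >= sum(group) + size:
                group.append(size)
                break
        else:
            # No existing batch has room, so start a new one.
            groups.append([size])
    return [g for g in groups if len(g) >= 2]


# Hypothetical file sizes in MB, for illustration only.
sizes = [600, 100, 90, 80, 64, 32, 16, 8]
batches = plan_compaction(sizes, target_mb=256, small_file_mb=128)
print(batches)
```

<p>In this example seven small files collapse into two rewrite batches, which is where the faster queries come from: fewer files to open and less metadata to scan per query.</p>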
<p>The same mindset of openness shows up in how Cloudera approaches GenAI workloads. Instead of betting on just closed-source models (which have their own advantages and challenges), <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> embraces flexibility: support for open-source large language models (LLMs) like LLaMA and Mistral, including models hosted on Hugging Face, plus the ability to fine-tune them on enterprise-specific data. That matters because developers want choice. They want to train, fine-tune, and deploy models in their own infrastructure with the same security and governance as the rest of their stack.</p>
<h3>Creating Two-Way Feedback Loops</h3>
<p>And, finally, there’s the <i>momentum</i>. At NYC <b>EVOLVE</b>, I saw firsthand how engineers and decision-makers are leaning in and asking questions about deploying GenAI use cases, integrating Iceberg into their architecture, and making their data architectures more open, future-proof, and cost-friendly. That kind of curiosity is what excites me. It reinforces why building a stronger developer community is so important here.</p>
<p>Our goal within Developer Relations is to turn those conversations into something actionable. This means showing how developers can build open architectures with Iceberg, how they can run multi-compute pipelines seamlessly with interoperability guarantees, and how they can build AI agents on top of their lakehouse data, among other groundbreaking innovations.</p>
<p><span class="text-lead"><b>This blog is part two of two</b></span></p>
<p>I recently joined <a href="/content/www/en-us.html">Cloudera</a> to lead <b>Developer Relations (DevRel)</b>, and I’m excited to build out this team and connect with the worldwide developer community.</p>
<p>In this blog post, I’ll share what I’ve seen in my first month on the job and what excites me most about what we’re building here. My goal is to enable practitioners to learn, explore, and build with technologies that matter, whether that’s open data architectures with Apache Iceberg; streaming systems with Apache Flink, Kafka, and NiFi; or generative AI (GenAI) applications. Just as important, I’ll discuss how Cloudera’s platform and data services can support running and delivering these technologies securely, at scale, and anywhere (in clouds, in data centers, and at the edge), with the openness and trust that developers expect.</p>
<p>In part one, I cover what Developer Relations means for Cloudera. While the DevRel function can look different from one organization to another, our focus will be on educating, engaging, and building a two-way relationship with developers.</p>
<h2>My First 30 Days: What I’ve Seen</h2>
<p>What struck me right away at Cloudera is how much emphasis is placed on openness and the flexibility it creates for those building on the platform. True to its open-source foundations, Cloudera still values and prioritizes openness, which is evident in its approach to open standards and frameworks. That’s why so much investment is going into technologies that carry that vision forward.&nbsp;</p>
<h3>Prioritizing Openness</h3>
<p>Take <a href="https://iceberg.apache.org/" target="_blank" rel="noopener noreferrer">Apache Iceberg</a> as an example. Cloudera has been an early proponent of Iceberg as the foundation for an open data architecture because it reflects that same vision of openness and interoperability.</p>
<p><i><b>Figure 2</b>: Comparative representation of catalogs with different implementations and catalogs that speak ‘REST’</i></p>
<p>This is exactly the problem the Iceberg REST Catalog was designed to solve. The REST Catalog API provides a <b>universal standard</b> for server–client communication, ensuring that Iceberg clients can interact with any compliant catalog, regardless of the server implementation’s underlying technology or programming language. Users can create tables, branch versions, or list snapshots through the same API, no matter which catalog sits underneath.</p>
<p>For developers, this removes the need for one-off connectors and reduces friction when adopting new engines. For organizations, it helps avoid locking Iceberg tables into a single platform’s catalog, while still keeping governance and security consistent. In short, everyone speaks the same “language.”</p>
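<p>As a rough illustration of what “speaking the same language” means, the sketch below builds the URLs a client would call to list or load tables. The path layout and the unit-separator encoding for multi-level namespaces follow the open Iceberg REST Catalog specification as I understand it; the helper names and the host <code>catalog.example.com</code> are hypothetical, and authentication and the HTTP calls themselves are left out.</p>

```python
from urllib.parse import quote

# Multi-level namespaces are joined with the unit separator (0x1F) in URL paths.
NAMESPACE_SEPARATOR = "\x1f"


def _root(base_uri, prefix=""):
    """Base path for catalog routes: {base}/v1 or {base}/v1/{prefix}."""
    parts = [base_uri.rstrip("/"), "v1"]
    if prefix:
        parts.append(quote(prefix, safe=""))
    return "/".join(parts)


def _encode_namespace(namespace):
    """Percent-encode a namespace given as a tuple of levels."""
    return quote(NAMESPACE_SEPARATOR.join(namespace), safe="")


def list_tables_url(base_uri, namespace, prefix=""):
    """URL for listing the tables in a namespace (an HTTP GET)."""
    return f"{_root(base_uri, prefix)}/namespaces/{_encode_namespace(namespace)}/tables"


def load_table_url(base_uri, namespace, table, prefix=""):
    """URL for loading one table's metadata (an HTTP GET)."""
    return f"{list_tables_url(base_uri, namespace, prefix)}/{quote(table, safe='')}"


# Hypothetical catalog host and names, for illustration only.
print(list_tables_url("https://catalog.example.com", ("analytics", "sales")))
print(load_table_url("https://catalog.example.com/", ("analytics",), "orders", prefix="prod"))
```

<p>Because every compliant catalog exposes the same routes, an engine only needs the catalog’s base URI and credentials; nothing in the client changes when the catalog implementation behind that URI does.</p>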
<p><i><b>Figure 1</b>: Apache Iceberg as the foundation for open data architecture with Cloudera</i></p>
<p>Iceberg gives developers an open table format that isn’t tied to one engine or one vendor. You can write data with Spark, stream updates with Flink, query with Trino or Hive—all against the same table. That level of interoperability has traditionally been limited, if not absent, in other data architectures such as <a href="https://www.onehouse.ai/blog/towards-open-data---part-1-cloud-warehouses-now-love-open-formats" target="_blank">cloud data warehouses</a>, but it’s exactly what modern data and AI platforms need. For customers, this becomes a real advantage. By building their data architecture on Iceberg, they make themselves future-proof, and any new compute engine can be plugged into the same tables without costly migrations or lock-in.</p>
<p>Building on that, the <a href="https://iceberg.apache.org/rest-catalog-spec/" target="_blank">Iceberg REST Catalog</a> takes openness a step further. While open table formats like Iceberg have broadened access to data, the <a href="https://iceberg.apache.org/terms/#catalog" target="_blank">catalog</a> is another critical component in the lakehouse architecture that needs to be interoperable.&nbsp;</p>
<p>Today, there are many different catalog implementations—both open-source and proprietary. The challenge is that managing Iceberg tables across different catalogs has historically required custom integrations, making true interoperability difficult. On top of that, many vendor platforms only provide full support if developers use their own built-in catalog. That dependency limits what can be shared with other engines and tools, creating a new form of lock-in.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=developer-relations-at-cloudera-what-we-are-building-for-our-developer-community</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Developer Relations at Cloudera: Introducing DevRel and What Developer Advocacy Means for Cloudera</title><description><![CDATA[In this blog post,  I’ll explain what Developer Relations means for Cloudera. The DevRel function can look different from one organization to another, depending on the goals of advocacy. This is my fourth DevRel gig, and at Onehouse, Dremio, and Qlik, the focus was slightly different. But the crux has always been the same: educating, engaging, and building a two-way relationship with developers.]]></description><link>https://www.cloudera.com/blog/business/developer-relations-at-cloudera-introducing-devrel-and-what-developer-advocacy-means-for-cloudera.html</link><guid>https://www.cloudera.com/blog/business/developer-relations-at-cloudera-introducing-devrel-and-what-developer-advocacy-means-for-cloudera.html</guid><pubDate>Tue, 21 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Dipankar Mazumdar]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1338714910.jpg"><p><i><b>Figure 3</b>: Cloudera’s support and usage of various open source software</i></p>
<p>In line with its open-source foundations, Cloudera is built on the principle of being open for integration, frameworks, and standards. That openness gives developers the freedom to use the tools they already know, adopt new ones as the ecosystem evolves, and avoid being locked into a narrow path.</p>
<p>That is why DevRel is so critical here. It means staying deeply engaged with the open-source ecosystem, while also enabling enterprise developers who rely on Cloudera to solve real problems in data and AI using these foundational technologies.&nbsp;</p>
<p>At Cloudera, our DevRel work is anchored in three pillars: <b>awareness, engagement, and impact</b>. Awareness is about making sure developers discover and understand what’s possible. Engagement is about meeting them where they are. And impact is about driving real outcomes: helping developers be more productive, shaping better products through feedback, and strengthening the open-source projects we all depend on.</p>
<h2>What We’re Building for Developers</h2>
<p>As I wrap up my first month, I keep coming back to a simple thought: my journey has always been shaped by community. I started as an engineer leaning on open source—reading docs, interpreting code, and learning from community blogs. Over time, I contributed back in different ways.&nbsp;</p>
<p>Now at Cloudera, I see the chance to extend that same cycle: to learn, share, and build alongside developers. Here’s what we’ll be working on in the coming months:</p>
<ul>
<li><p><b>Technical deep-dives</b>: This includes blogs, how-tos, and whitepapers on how to operationalize technologies like Iceberg, Spark, Flink, NiFi, Ozone, Kafka, and more at scale with Cloudera. They’ll show real patterns, tradeoffs, and examples you can reuse.</p>
</li>
</ul>
<ul>
<li><p><b>New explainer series</b>: These are short, focused breakdowns of concepts, use cases, and learnings from production in the data and AI space. The goal is to cut through jargon and give developers a clear mental model.</p>
</li>
</ul>
<ul>
<li><p><b>Hands-on labs</b>: These are guided, runnable examples you can try on your own laptop or cloud environments. If a blog tackles the “why,” labs will show the “how.”</p>
</li>
</ul>
<ul>
<li><p><b>Community events</b>: We are meeting engineers wherever they learn and code. Meetups, workshops, and conference sessions are where we will engage directly, exchange ideas, and learn from one another.</p>
</li>
</ul>
<p><a href="https://community.cloudera.com/" target="_blank" rel="noopener noreferrer">Join me at the Cloudera Community</a> and engage with the content, try out the code, give feedback, and ask questions!</p>
<p><span class="text-lead"><b>This blog is part one of two</b></span></p>
<p>It has been slightly more than four weeks since I joined <a href="/content/www/en-us.html">Cloudera</a> to lead <b>Developer Relations (DevRel)</b>. A month may seem brief, but it’s enough to feel the pulse of a community—its culture, its people, and the momentum behind some of the key technologies that Cloudera drives.</p>
<p>In this blog post, I’ll explain what Developer Relations means for Cloudera. The DevRel function can look different from one organization to another, depending on the goals of advocacy. This is my fourth DevRel gig, and at Onehouse, Dremio, and Qlik, the focus was slightly different. But the crux has always been the same: educating, engaging, and building a two-way relationship with developers.</p>
<p>In part two, I’ll share what I’ve seen in my first 30 days, our plans for supporting practitioners in their pursuits and use of the technologies that matter most to them, and how our platform supports their efforts.&nbsp;</p>
<h2>What is Developer Advocacy?</h2>
<p>Developer Advocacy is a specific role within the <a href="https://en.wikipedia.org/wiki/Developer_relations" target="_blank" rel="noopener noreferrer">DevRel function</a>, and while there are other roles within the function, we will use the two terms interchangeably in this blog.</p>
<p><i><b>Figure 2:</b> A day in the life (of DevRel)</i></p>
<p>On the other hand, it’s about carrying developers' voices back into product engineering and making sure their needs shape what gets built next. When done right, DevRel creates a two-way feedback loop. We show what's possible with a platform, and we also listen to and incorporate where developers get stuck (the issues/errors), what excites them (capabilities), and how the community evolves with the ecosystem.</p>
<h2>What Does Developer Advocacy Mean at Cloudera?</h2>
<p>At Cloudera, developers have always been at the center. The company sits at a unique intersection: <a href="/content/www/en-us/open-source.html">open-source commitment</a> on one side and <a href="/content/www/en-us/customers.html?menu-why">enterprise adoption</a> on the other. Cloudera has a long history of contributing to foundational Apache projects like Spark, Flink, Kafka, Ozone, NiFi, and Iceberg, while also serving a global customer base that depends on these technologies for production-grade scale and reliability.</p>
<p><i><b>Figure 1</b>: Developer Relations as an interface with product, engineering, and marketing teams and developers</i></p>
<p>At its core, DevRel is the <b><i>bridge</i></b> between technologies (products) and developers. On one hand, it’s about enabling developers to be productive, grow, and succeed with a range of data and AI technologies. This involves breaking down complex system internals in the form of blogs, books, or papers; showing how to accomplish something (with code); and exploring possible use cases via demos, hands-on labs, or webinars. It’s about being present where developers already are: meetups and conferences, open-source GitHub repositories, Slack channels, and forums.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=developer-relations-at-cloudera-introducing-devrel-and-what-developer-advocacy-means-for-cloudera</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The Shifting Airgapped Data Processing Market: What It Means for the Public Sector</title><description><![CDATA[Discover how airgapped data collection and advanced data processing are transforming the public sector with secure, efficient information management.]]></description><link>https://www.cloudera.com/blog/business/the-shifting-airgapped-data-processing-market-what-it-means-for-the-public-sector.html</link><guid>https://www.cloudera.com/blog/business/the-shifting-airgapped-data-processing-market-what-it-means-for-the-public-sector.html</guid><pubDate>Thu, 16 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Jeremiah Morrow]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1355607977.jpg"><p><span class="text-lead"><b>For organizations in the U.S. public sector, the ability to leverage data in secure, air-gapped cloud and on-premises environments is not a preference—it’s a non-negotiable security and operational requirement.</b></span></p>
<p>Many public sector agencies currently use platform-as-a-service (PaaS) solutions for secure data processing using Apache Spark. However, with many solution providers moving to multi-tenant software-as-a-service (SaaS) offerings, these PaaS solutions are being deprecated. Moving forward, organizations for whom single-tenancy is a critical requirement will need to evaluate alternative solutions for air-gapped data processing. For most of them, a multi-tenant SaaS solution is simply not an option.</p>
<p>Cloudera is uniquely positioned to support mission-critical networks because our data and AI platform is designed for absolute control and <a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-10-09-cloudera-leverages-aws-to-deliver-a-sovereign-ready-data-and-ai-platform-as-a-launch-partner-for-the-aws-european-sovereign-cloud.html">sovereignty</a>. For public sector agencies looking to maintain secure customer operations, Cloudera provides a clear and stable path forward.</p>
<h2>A Platform Built for the Mission, Not the Public Cloud</h2>
<p>As a graduated In-Q-Tel portfolio company, Cloudera has a long history of successful, Technology Readiness Level (TRL) 9 mission-proven deployment across the U.S. civilian, defense, and intelligence communities.</p>
<p>To serve this market exclusively, we established Cloudera Government Solutions, Inc. (CGSI). Headquartered in the Washington, D.C. metropolitan area, CGSI is a dedicated subsidiary focused solely on the unique needs of government agencies. Our expertise is U.S.-based, cleared, and focused on ensuring mission success, evidenced by a strong, growing presence with Authority to Operate (ATO) qualifications across numerous secure networks.</p>
<p>For program managers and technical leaders re-evaluating their data strategy, the choice is simple: rely on a platform designed to support your specific industry requirements. Cloudera is the proven solution for any public sector agency requiring a robust, <a href="/content/www/en-us/products/cloudera-data-platform.html">self-contained data platform</a>.</p>
<h2>The Clear Choice for Secure, Air-Gapped Data and AI</h2>
<p>When moving your critical Spark workloads and data pipelines, Cloudera offers distinct advantages that ensure stability, control, and future-readiness:</p>
<h3>Cloud Anywhere</h3>
<p>Cloudera was built from the ground up to support workloads across hybrid and multi-cloud environments. Our data and AI platform delivers a consistent cloud experience across data centers, private clouds, and the tactical edge, environments where pure cloud-native solutions simply cannot operate. We’re the de facto Spark provider for on-premises deployments leveraging object stores (S3 and Ozone), Kubernetes, virtual machine (VM), and bare metal technologies. This means your secure, self-managed data environment is our foundation, not an edge case or deprecated feature.</p>
<h3>Unified, Open, and Built for Longevity</h3>
<p>Our platform is built on an open-source foundation with an open-standards approach to integration, reducing the risk of vendor lock-in and ensuring maximum interoperability. Collectively, Cloudera customers, ranging from Global 2000 companies to government agencies, manage more than 25 exabytes of data using our platform, demonstrating unparalleled scale and enterprise stability. Cloudera has more than $1 billion in annual recurring revenue to back our long-term partnership commitment. We provide a single, unified platform with an <a href="/content/www/en-us/products/open-data-lakehouse.html">open data lakehouse</a> and a comprehensive data fabric to manage the entire lifecycle of data, from streaming and data engineering to machine learning and enterprise AI.</p>
<h3>Mission-Ready AI Everywhere</h3>
<p>The ability to deploy modern AI is increasingly vital to mission success. Cloudera accelerates the full AI lifecycle (from data preparation to private generative and agentic AI) with real-time, low-latency inference. You can deploy in certified AI infrastructure on premises and, crucially, in fully air-gapped cloud environments for absolute data control and sovereignty. This enables you to bring AI to your data, anywhere it resides, without ever compromising security.</p>
<h3>Comprehensive Data Control and Governance</h3>
<p>In government environments, data control is paramount. Cloudera delivers enterprise-wide data security, governance, lineage, and observability within a single platform. Our technology is tested rigorously to meet the most stringent regulatory and accreditation standards, with documented support for FIPS 140. This comprehensive compliance ensures your programs achieve and maintain their Authority to Operate (ATO) with confidence.</p>
<h2>Unwavering Commitment to Your Success</h2>
<p>Our investment in your mission goes beyond technology. CGSI provides an ecosystem of support designed specifically for the U.S. government:</p>
<ul>
<li><p><b>Dedicated, cleared U.S. expertise</b>: We offer professional services and 24x7 support from cleared U.S. citizens on U.S. soil. Our subject matter experts are available for everything from hands-on installation and optimization to supporting your most complex, mission-critical cases.</p>
</li>
</ul>
<ul>
<li><p><b>Robust partner ecosystem</b>: We partner with all key federal system integrators (FSIs) and technology providers to ensure seamless integration and mission success.</p>
</li>
</ul>
<ul>
<li><p><b>Expert training</b>: We offer comprehensive training and certification programs via live private on-site, live public, or self-service on-demand training to empower your teams to become self-sufficient experts on the platform.</p>
</li>
</ul>
<p>For government agencies that require the full power of modern data and AI without compromising on security or control, Cloudera is the proven, trusted, and superior choice.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-shifting-airgapped-data-processing-market-what-it-means-for-the-public-sector</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera and Protegrity: Delivering Secure AI and Analytics for Regulated Industries</title><description><![CDATA[Recently, Cloudera partnered with Protegrity, a global leader in data security and privacy, to address those security, compliance, and privacy concerns that leave regulated industries seemingly hamstrung in their adoption of AI. ]]></description><link>https://www.cloudera.com/blog/partners/cloudera-and-protegrity-delivering-secure-ai-and-analytics-for-regulated-industries.html</link><guid>https://www.cloudera.com/blog/partners/cloudera-and-protegrity-delivering-secure-ai-and-analytics-for-regulated-industries.html</guid><pubDate>Wed, 15 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Jerome Alexander]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-909200230.webp"><p>The rapid embrace of AI tools and models is yielding serious results for businesses across nearly every industry. Advanced and predictive analytics are providing deeper insights into business operations, newer forms of AI, like agentic AI, are transforming customer experiences, and machine learning is streamlining complex processes.&nbsp;</p>
<p>But for businesses in highly regulated industries—financial institutions, healthcare institutions, or any other business that’s subject to added compliance, security, and privacy considerations—the path to AI acceleration has extra obstacles along the way. For many of those highly regulated organizations, it may feel like AI is simply not an option. That, however, does not have to be the case.&nbsp;</p>
<p>Recently, <a href="https://www.protegrity.com/news/protegrity-and-cloudera-partner-to-elevate-enterprise-data-protection">Cloudera partnered with Protegrity</a>, a global leader in data security and privacy, to address those security, compliance, and privacy concerns that leave regulated industries seemingly hamstrung in their adoption of AI.&nbsp;</p>
<h2>Capitalize on AI While Maintaining Compliance</h2>
<p>Whether it’s a financial firm contending with <a href="/content/www/en-us/blog/business/embrace-a-hybrid-data-platform-for-dora-compliance.html">GDPR and DORA guidelines</a> or a healthcare institution bound by longstanding regulations like HIPAA, non-compliance is an extremely dangerous prospect.&nbsp;</p>
<p>Adherence to regulatory guidelines isn’t just a security issue. Failure to stay in compliance can bring serious financial and operational consequences that set the business back. Cloudera and Protegrity’s collaboration simplifies governance and auditability, helping streamline protection at scale while reducing operational complexity and costs. For organizations navigating highly regulated environments, this means the ability to innovate securely while ensuring adherence to evolving standards.</p>
<p>Unlike other platforms that require data movement to centralized locations, Cloudera enables businesses to apply AI directly to their data, wherever it resides: in clouds, data centers, or at the edge. That means organizations can avoid the added risk and complexity that come with moving data from one location to another in order to feed an AI initiative.&nbsp;</p>
<p>And now, the partnership with Protegrity adds advanced data protection tools, such as vaultless tokenization, format-preserving encryption (FPE), dynamic data masking, and anonymization. These tools integrate seamlessly with Cloudera’s platform, enabling organizations to secure sensitive data while applying AI. For example, a financial institution using Cloudera can tokenize customer data with Protegrity’s solutions, ensuring compliance with GDPR while running predictive analytics in real time.</p>
<h2>Partnering to Enhance Data Protection Across Environments</h2>
<p>Cloudera and Protegrity bring a deep understanding of the data challenges that face highly regulated businesses, and together they provide the heightened level of support and security needed to unlock the full potential of proprietary data without increasing risk exposure.&nbsp;</p>
<p>Cloudera’s enterprise data platform and Protegrity’s robust data protection enable highly regulated organizations to adopt AI, machine learning, and cloud analytics while ensuring compliance and data protection. These businesses can securely share and analyze sensitive information across teams and third parties, generating and harnessing richer insights and making real-time decisions without compromising trust.</p>
<p>Facing a heightened regulatory and compliance burden doesn’t have to mean sacrificing the benefits of AI, machine learning, and advanced analytics. As the only data and AI platform company that large organizations trust to bring AI to their data anywhere it lives, Cloudera and its partner ecosystem deliver the security and scalability needed to support any enterprise.</p>
<p>Learn more about how <a href="/content/www/en-us/partners.html">Cloudera and its partners</a> can secure AI and advanced analytics for highly regulated industries.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-and-protegrity-delivering-secure-ai-and-analytics-for-regulated-industries</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Container Service—Built-in Security and Smarter Cost Control</title><description><![CDATA[Cloudera Container Service is our enhanced Kubernetes platform (replacing Compute Cluster). Enhancements include simplified lifecycle management, built-in security, and cost-optimized workload management across multi-cloud environments.]]></description><link>https://www.cloudera.com/blog/technical/cloudera-container-service-built-in-security-and-smarter-cost-control.html</link><guid>https://www.cloudera.com/blog/technical/cloudera-container-service-built-in-security-and-smarter-cost-control.html</guid><pubDate>Wed, 08 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Bhagya Lakshmi Gummalla]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty726887381.webp"><p><i>Figure 1: Cloudera Container Service Architecture</i></p>
<h2>Simplified Kubernetes Lifecycle Management</h2>
<p>Cloudera continues to invest in making Kubernetes and add-on services easier to operate across environments. With Cloudera Container Service, you can now deploy Kubernetes clusters from an intuitive UI. Looking ahead, our roadmap includes extending unified lifecycle management across the whole Cloudera-managed cluster estate, enabling enterprise admins to manage lifecycle updates consistently from a single UI.</p>
<h2>Built-In Security and Compliance</h2>
<p>Cloudera Container Service provides several security features out of the box, ensuring that Kubernetes deployments are secure from day one, which helps you move faster and reduce risk. These features include:</p>
<ul>
<li><b>Istio service mesh</b>: Ensures secure, authenticated communication between microservices, without requiring users to install or configure Istio separately.</li>
<li><b>Knox gateway (as an Istio External Authorization Provider)</b>: Delivers enterprise-grade authentication and access control with external services while maintaining Istio's native security framework.</li>
<li><b>Calico</b>: Provides network policy enforcement to isolate workloads and meet compliance requirements through fine-grained traffic control for secure pod-to-pod communication.</li>
<li><b>Private cluster support</b>: Restricts access to within the customer’s cloud network, keeping workloads isolated from public internet exposure and reducing the need for complex network policy configurations.</li>
<li><b>IMDSv2 (Instance Metadata Service Version 2)</b>: Uses session-based tokens to protect access to AWS instance metadata, mitigating risks such as server-side request forgery and improving cloud workload security.</li>
<li><b>Non-transparent proxy support</b>: Enables secure, auditable outbound traffic from Kubernetes clusters without requiring manual proxy setup for each data service configuration.</li>
</ul>
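<p>For readers unfamiliar with IMDSv2, the session-token flow looks roughly like the sketch below. It illustrates the general AWS mechanism rather than anything specific to Cloudera Container Service; the helper names are hypothetical, and the requests only succeed when sent from inside an EC2 instance.</p>

```python
import urllib.request

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds=21600):
    """Step 1: build a PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token, path="meta-data/instance-id"):
    """Step 2: build a GET request that presents the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_instance_id(timeout=2):
    """Run both steps in order; only works from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

<p>Because a metadata read must present a token obtained through a separate PUT request, a simple forwarded GET request is not enough to read instance metadata.</p>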
<h2>Smarter, Cost-Optimized Workload Management</h2>
<blockquote>“By 2026, organizations performing real-time cost or performance optimization of <b>cloud-based workloads will rise</b> from less than 20% in 2022, <b>to 50%.</b>”&nbsp;- <a href="https://www.gartner.com/en/infrastructure-and-it-operations-leaders/insights/manage-and-optimize-a-cloud-environment" target="_blank" rel="noopener noreferrer">Gartner(™)</a>, Evolve Service Management and Cloud Operations</blockquote>
<p>This projection underscores the growing focus on cloud cost optimization as organizations seek to manage expenses while leveraging cloud technologies.&nbsp;</p>
<p>By giving enterprises control over cost-saving mechanisms, Cloudera ensures that organizations only pay for the resources they actually use while maintaining the flexibility of Kubernetes-based workloads.&nbsp;</p>
<p>Cloudera’s latest enhancements enable organizations to optimize spending while maintaining performance in several ways, including:</p>
<ul>
<li><p><b>AWS Graviton support</b>: Enables cost-effective compute with ARM-based instances, reducing cloud expenses and energy consumption. Further, building multi-architecture container images enables a “build once, deploy anywhere” approach.</p>
</li>
</ul>
<ul>
<li><p><b>Suspend/resume clusters</b>: Allows enterprises to pause workloads when not in use and resume them when needed, cutting down on unnecessary infrastructure costs.</p>
</li>
</ul>
<ul>
<li><p><b>Shared data services</b>: Optimizes resources by allowing multiple data services to leverage shared infrastructure, reducing duplication and improving efficiency.</p>
</li>
</ul>
<ul>
<li><p><b>Apache YuniKorn</b>: Enables higher cluster density, lower operational costs, and improved performance through an intelligent resource scheduler with enhanced workload placement and scheduling techniques like bin-packing, hierarchical quota management, and gang scheduling.</p>
</li>
</ul>
<h2>Leveled-Up: Cloudera AI Inference Service with NVIDIA Accelerated Compute</h2>
<p>Cloudera AI Inference service is the first data service onboarded to Cloudera’s enhanced Kubernetes platform. By leveraging Cloudera Container Service, AI workloads can now move from development to production faster, more securely, and more cost-effectively than ever before.</p>
<p>Cloudera Container Service plays a critical role in enabling AI inference by providing:</p>
<ul>
<li><p><b>Optimized performance</b>: Efficient scheduling and orchestration of NVIDIA accelerated compute, ensuring AI workloads are allocated the compute power they need without over-provisioning resources.</p>
</li>
</ul>
<ul>
<li><p><b>Enterprise-grade security</b>: AI workloads remain fully contained within Cloudera’s secure, enterprise-ready platform, ensuring data governance and compliance.</p>
</li>
</ul>
<ul>
<li><p><b>Automated infrastructure management</b>: The platform handles cluster scaling, security policies, and workload isolation, allowing data scientists and AI engineers to focus on model optimization instead of infrastructure management.</p>
</li>
</ul>
<h2>Future-Ready Kubernetes: Built for AI, Analytics, and Beyond</h2>
<p>As part of Cloudera’s broader vision of supporting diverse workloads—from real-time data streaming to large-scale analytics and next-generation enterprise applications—this enhancement is a boon for organizations with an AI-first approach.&nbsp;</p>
<p>With Kubernetes as the foundation, Cloudera solves today’s infrastructure challenges and prepares your organization for future innovation.</p>
<p>Interested in learning more and seeing what’s in store for the future?</p>
<ul>
<li><p><a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">Sign up for a 5-day free trial of Cloudera on cloud</a>.</p>
</li>
<li><p><a href="/content/www/en-us/contact-sales.html">Contact us to speak directly with a member of our sales team</a>.</p>
</li>
</ul>
<p><span class="text-lead"><b>Introducing Cloudera Container Service: Simple, Secure, Cost Efficient</b></span></p>
<p>Cloudera Container Service is our enhanced Kubernetes platform (replacing Compute Cluster). Enhancements include simplified lifecycle management, built-in security, and cost-optimized workload management across multi-cloud environments.</p>
<p>With Cloudera Container Service, you can focus on innovation rather than infrastructure complexity, ensuring that Kubernetes deployments are secure, scalable, and cost-effective across multi-cloud environments.</p>
<blockquote>“Kubernetes should be an enabler, not an obstacle,” said Karthik Krishnamoorthy, Cloudera’s Vice President for Product Management. “With these enhancements, we’re giving enterprises the tools to manage Kubernetes more efficiently, reduce cloud costs, and onboard powerful AI and data-driven applications—all while ensuring built-in security.”&nbsp;</blockquote>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-container-service-built-in-security-and-smarter-cost-control</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>#ClouderaLife Employee Spotlight: Meet Leo Brunnick, Chief Product Officer</title><description><![CDATA[This month’s #ClouderaLife Spotlight features Leo Brunnick, Chief Product Officer]]></description><link>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-leo-brunnick-chief-product-officer.html</link><guid>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-leo-brunnick-chief-product-officer.html</guid><pubDate>Mon, 06 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1456193345.jpg"><p>At Cloudera, leadership is about more than driving business strategy; it’s about inspiring innovation and nurturing community. No one embodies that spirit more clearly than Leo Brunnick, Cloudera’s Chief Product Officer.&nbsp;</p>
<p>As he settles into his tenure, one quality of Cloudera stands out to Leo.&nbsp;</p>
<blockquote>“What I see is energy. What I see is joy. I see a group of people desperately wanting Cloudera to do well and win,” he shared. “I’ve been at many companies, and they’re not all like this.”&nbsp;</blockquote>
<p>That collective drive, he believes, is what sets Cloudera apart. “People aren’t here just for performance reviews or scores—they’re here to make Cloudera successful. And that’s rare.”&nbsp;</p>
<p>Let’s get to know Leo Brunnick and explore how Cloudera has supported his leadership journey and empowered him to shape our product vision.&nbsp;</p>
<h2>Meet Leo Brunnick&nbsp;</h2>
<p>As Chief Product Officer, Leo guides Cloudera’s product strategy and innovation agenda, helping ensure the company is at the forefront of data, AI, and cloud transformation.&nbsp;</p>
<p>What drew him in was not just the technology, but the people. “Our CEO, Charles Sansbury, had built his dream team across sales, marketing, and finance, and he needed leadership in product and engineering. I saw that what I could bring would make a real difference. That’s what gets me out of bed in the morning.”&nbsp;</p>
<h2>Leo’s Journey to Cloudera&nbsp;</h2>
<p>When Charles first reached out, Leo was intrigued. After speaking with executives and board members, the decision became clear.&nbsp;</p>
<p>“I don’t think I’ve ever seen a board more supportive of giving a company what it needs to be successful,” Leo said. “Cloudera was in a spot where if it made the right moves, it could take advantage of the mega trends in AI and data. That was terribly exciting.”&nbsp;</p>
<p>For Leo, it wasn’t just about joining a strong company but also helping it break through. “Cloudera is this close to becoming an even bigger success story. It’s fun to have a brass ring to chase.”&nbsp;</p>
<h2>Driving Innovation: Cloudera Data Services&nbsp;</h2>
<p>Cloudera recently announced the launch of <a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-08-06-cloudera-data-services-brings-private-ai-to-the-data-center.html">Cloudera Data Services</a>, a transformative platform designed to bring private AI and cloud-native agility directly to the data center.&nbsp;</p>
<p>Leo is energized by what this means for customers and employees alike. “That full easy button of the cloud—now available on-premises. That’s fundamentally different,” he said.&nbsp;</p>
<p>For him, this isn’t just a technology milestone but a company-wide opportunity. “When you can move quickly and deploy differently in the on-prem environment, it impacts the whole company. It is a full team sport regarding what Cloudera is poised to do now.”&nbsp;</p>
<p>Clouderans play a role in shaping this future, from how products are built and packaged to how they’re sold, supported, and scaled. Leo sees this as one of the most exciting parts of the journey: everyone has a hand in making it real.&nbsp;</p>
<h2>Culture, Community, and Representation&nbsp;</h2>
<p>Leo’s family is deeply connected to Latin heritage, which has shaped his personal life and professional outlook. Having spent time in El Paso and now living in Austin, he has long embraced the vibrancy, traditions, and energy of Hispanic culture.&nbsp;</p>
<p>“I just love the culture, the tradition, the energy, and the vibrancy,” he said. Now, as part of Cloudera’s Latinx Employee Resource Group (ERG), he feels less like a leader with a title and more like a participant in a supportive community. “ERG lead is just a fancy title. Really, I feel grateful to be allowed to be part of the group.”&nbsp;</p>
<p>For him, ERGs are about belonging: “It just feels better when you’re around people you care about and who care about you. ERGs help people connect and feel seen for who they are. Taken together, all those perspectives make Cloudera a special place.”&nbsp;</p>
<h2>Leading with Energy and Authenticity&nbsp;</h2>
<p>When asked about his leadership style, Leo is quick to ground it in humility.&nbsp;</p>
<p>“I’m never going to be the smartest person in the room, and I’m not always going to be right. But what I bring is energy and authenticity. People want to be part of a winning team—they just want to know how to participate.”&nbsp;</p>
<p>That belief drives his hands-on approach. From San Jose to Costa Rica, Raleigh, Budapest, and Bangalore, Leo embraces “management by walking around.” As he puts it, “You’ve got to get out there, pound the drum, and get people fired up.”&nbsp;</p>
<p>For him, leadership is also about clarity: “This is what we’re doing. This is why. Repeat it repeatedly. That’s how you build trust across teams.”&nbsp;</p>
<h2>Closing Thoughts&nbsp;</h2>
<p>For those considering a career at Cloudera, Leo’s advice is both candid and inspiring.&nbsp;</p>
<p>“Be ready, because Cloudera is a full-contact sport. This isn’t a place to just punch in and out. People here lean in and give it all they’ve got. And we want others who feel the same way.”&nbsp;</p>
<p>His words reflect the spirit of Cloudera: authentic, passionate, and all-in on success.&nbsp;</p>
<p>Want to learn about more inspiring Clouderans? Read <a href="/content/www/en-us/blog/culture/clouderalife-employee-spotlight-meet-charles-aad-clouderas-senior-industry-solutions-engineer.html">here</a>.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderalife-employee-spotlight-meet-leo-brunnick-chief-product-officer</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Democratize Data for AI Using Interoperability Across Engines and Zero-Copy Data Collaboration</title><description><![CDATA[How Cloudera Iceberg REST catalog enables open, AI-ready enterprises.]]></description><link>https://www.cloudera.com/blog/business/democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration.html</link><guid>https://www.cloudera.com/blog/business/democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration.html</guid><pubDate>Fri, 03 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Pamela Pan,Akshat Mathur,Bill Zhang]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-174962301.jpg"><h2><i>How Cloudera Iceberg REST Catalog Enables Open, AI-Ready Enterprises</i></h2>
<p>Interoperability has long been a buzzword, not a capability enterprises can count on in practice. Instead, data architects are often left stitching together fragmented systems, chief data officers face massive risk and vendor lock-in from siloed governance, and platform leaders are restricted from providing a consistent data view to their teams. Whether driven by mergers, multi-cloud strategies, or external partnerships, the pattern repeats: rising costs, slower innovation, and limited ability to scale AI with confidence.</p>
<p>At Cloudera, we’ve helped our customers navigate these challenges—disconnected metadata layers, duplicated data pipelines, and governance models that fail to extend across tools—always striving to enable open, AI-ready enterprises that unlock interoperability at scale.</p>
<h3>Why Openness Matters for Enterprise AI</h3>
<p>To scale AI workloads, organizations require visibility and control over the data that fuels them. Metadata intelligence plays a critical role in this equation, enabling organizations to understand where data lives, how it’s structured, and how it’s used across teams and tools.&nbsp;</p>
<p>With open standards like Apache Iceberg and the Iceberg REST Catalog, enterprises gain a unified layer of metadata that supports zero-ETL data sharing, enforces governance, and powers secure interoperability across analytics and AI engines. This foundation transforms fragmented infrastructure into a connected, AI-ready data architecture—one where metadata becomes the key to accelerating access to insights while maintaining trust.</p>
<h3>Open, Secure, and Simple: Cloudera Iceberg REST Catalog</h3>
<p>The <a href="https://docs.cloudera.com/runtime/7.3.1/overview/topics/cr-ds-cloudera-iceberg-rest-catalog.html" target="_blank" rel="noopener noreferrer">Cloudera Iceberg REST Catalog</a> powers our open data lakehouse and helps organizations simplify architecture, reduce duplication, and extend secure data access wherever it’s needed.</p>
<p>It acts as a universal, interoperable metadata layer and provides zero-copy access to Iceberg tables across tools, clouds, and teams, enabling open-source and third-party tools to access the same data. Features and benefits include:</p>
<ul>
<li><b>Open and engine-agnostic</b>: Provides standards-based APIs that support tools like Athena, Databricks, Redshift, and Snowflake—enabling interoperability without vendor lock-in</li>
<li><b>Decoupled by design</b>: Abstracts query engines from backend metastores, reducing complexity and increasing portability across environments</li>
<li><b>Real-time metadata access</b>: Supports fast, up-to-date metadata queries from Iceberg-compatible metastores, improving data visibility across teams</li>
<li><b>Governed and secure</b>: Extends fine-grained access controls, row-level permissions, and enterprise identity access management (IAM) integration (such as LDAP and OAuth2) to all connected systems—ensuring consistent policy enforcement at scale</li>
</ul>
<p><i><b>Figure 1.</b> Cloudera's Iceberg REST Catalog provides a universal, interoperable metadata layer, enabling open source and third-party tools to access the same data.&nbsp;</i></p>
<h3>Real-World Use Cases and Impact of Iceberg REST Catalog</h3>
<p>The following real-world examples illustrate how organizations are using the Iceberg REST Catalog to simplify their data stack, reduce total cost of ownership (TCO), and accelerate time to value, all while keeping data where it belongs.</p>
<p>Together, these examples demonstrate how Cloudera’s open and interoperable approach accelerates AI outcomes, drives operational efficiency at enterprise scale, and enables security and compliance.</p>
<h4>Data Sharing: Scale AI Applications to 3,000+ Cross-Platform Users</h4>
<p>A luxury automotive manufacturer faced mounting challenges in securely sharing data with an external partner using Databricks. Traditional methods relied on data duplication, which introduced cost, complexity, and architectural inflexibility.&nbsp;</p>
<p>By adopting the Iceberg REST Catalog, the customer established secure, zero-ETL data sharing across both internal systems and external platforms. This open, standards-based approach allowed them to choose the best tools for the job—using Spark for complex data pipelines and Impala for fast SQL analytics. With this foundation, the company scaled AI applications to more than 3,000 users while maintaining full governance and control over data access.</p>
<h4>Data Warehouse Optimization: Reduce Data Movement Costs by 74%</h4>
<p>Following a merger, a global satellite company encountered significant roadblocks in unifying fragmented data locked in proprietary systems. Without a consistent, interoperable data layer, their AI and analytics initiatives were slow to scale and difficult to manage.&nbsp;</p>
<p>Cloudera’s open data lakehouse architecture, powered by the Iceberg REST Catalog, helped the customer consolidate these silos and establish a single source of truth for all of its AI and analytics workloads. By querying managed Iceberg tables directly in S3, they eliminated the need for redundant data pipelines and replatforming efforts, leading to a 74% reduction in data movement costs.</p>
<h3>Demo: A Closer Look at Data Sharing via Cloudera’s Iceberg REST Catalog</h3>
<p>This <a href="https://app.getreprise.com/launch/wy18oBX/" target="_blank" rel="noopener noreferrer">interactive demo</a> brings the Iceberg REST Catalog to life through a financial services scenario. At the fictional Parent Bank, different teams use their preferred tools, such as Snowflake and Amazon Athena, to securely access one governed source of data, all without complex ETL or costly data movement.&nbsp;</p>
<p>For a deeper dive into this offering and how it can benefit your organization, explore these resources:</p>
<ul>
<li><a href="/content/www/en-us/products/open-data-lakehouse.html">Visit our product page</a> to learn more about Cloudera’s open data lakehouse.</li>
<li><a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-09-25-cloudera-accelerates-ai-and-analytics-projects-with-a-unified-platform-for-secure-governed-and-performant-data.html">Read the press release</a> for the full announcement about Cloudera’s vision for open data sharing.</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=democratize-data-for-ai-using-interoperability-across-engines-and-zero-copy-data-collaboration</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>3 Steps to Cutting Cloud Costs with Data Lineage</title><description><![CDATA[Discover 3 steps to reduce cloud costs by using data lineage to track, manage, and optimize your data storage with Octopai Data Lineage and Cloudera.]]></description><link>https://www.cloudera.com/blog/business/3-steps-to-cutting-cloud-costs-with-data-lineage.html</link><guid>https://www.cloudera.com/blog/business/3-steps-to-cutting-cloud-costs-with-data-lineage.html</guid><pubDate>Thu, 02 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Ron Pick]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty157283393.jpg"><p>Ever promise someone the moon? If so, it’s unlikely you knew the price tag in advance.</p>
<p>Promise someone a cloud, on the other hand, and you can calculate your costs down to a thousandth of a cent.&nbsp;</p>
<p>Amazon, Microsoft, and Google offer cloud data storage cost calculators that will make your head spin with their specificity: How many TiB of data do you need for streaming reads on Google BigQuery? Do you want ra3.4xlarge or ra3.xlplus instances on Amazon Redshift, and how many nodes?</p>
<p>While storing data in the cloud is often billed as being more cost-efficient than using on-premises data storage, in truth reducing your cost for cloud storage requires investigation, elimination, and optimization. Let’s take it step by step.</p>
<h2>Step 1: Investigation</h2>
<p>One of the simplest ways of reducing data storage costs is to store less data. Obvious, yes. Easy, no.</p>
<p>There’s a reason why you have all that data. Sometimes a good reason—like for operational, administrative, and business processes—but sometimes the reason isn’t all that great, such as “we haven’t gotten rid of it yet.”&nbsp;</p>
<p>In every data ecosystem, there’s outdated, redundant, and poor-quality data that you can, and should, get rid of. But how do you locate it?</p>
<p>The answer is automated&nbsp;<a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">data lineage</a>: the data housekeeper’s faithful sidekick.</p>
<p>Imagine that you have a magic wand that helps with spring cleaning. This wand tells you where each item in your household was bought, when it was last used, what shape it’s in, if you have any other items that serve the same function, and so on.</p>
<p>This is what automated data lineage does for your data ecosystem. Let it loose, and within minutes you’ll have a complete mapping of your data flow: what data assets feed what reports and trace back to which sources. Comprehensive data lineage shows this both at a zoomed-out, source-system level and at a zoomed-in, column-to-column level. It can even get into the ETL processes and show exactly what transformations were performed on the data as it moved.&nbsp;</p>
<p>Once you have the complete picture mapped out, you can move on to the second step: elimination.</p>
<h2>Step 2: Elimination</h2>
<p>Take a close look at your data lineage, and ask the following questions:</p>
<ul>
<li>Are any of these data assets or data uses (reports, for example) redundant?</li>
<li>Are any of these data assets or data uses outdated or otherwise no longer relevant?</li>
</ul>
<p>Answering “yes” points you to data that can be offloaded, directly reducing cloud-based storage costs. But offload wisely! Even if you’ve identified two data assets that are effectively duplicates, if they are both being used by downstream reports, you can’t just go and delete one of them before you line up its replacement.&nbsp;</p>
<p>Leveraging your <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">data lineage</a> for impact analysis empowers you to foresee the impact of changing a business process and take proper advance action to prevent issues.</p>
<p>Now that you’ve identified and eliminated data you don’t need (outdated, redundant, poor quality), it’s time to move on to data that you do need to keep around but could store more efficiently.</p>
<h2>Step 3: Optimization</h2>
<p>Take another look at your data lineage mapping, and ask the following questions about the data you are storing:</p>
<ul>
<li>What are we using this data for?</li>
<li>How often do we need to access it?</li>
<li>How fast does it need to be available when we do want to access it?</li>
</ul>
<p>Cloud storage providers usually offer a range of storage tiers that trade retrieval speed for price. For example, Amazon S3 offers Standard storage for frequently accessed data ($0.023 per GB per month), Standard-Infrequent Access storage for data that’s accessed infrequently but should be retrieved in milliseconds when needed ($0.0125 per GB per month), Glacier Flexible Retrieval storage for archive and backup data that can be retrieved in anywhere from 1 minute to 12 hours ($0.0036 per GB per month), and Glacier Deep Archive storage for archive data that’s accessed only once or twice a year and can take up to 12 hours to retrieve ($0.00099 per GB per month).</p>
<p>Storing 1 TB (1,000 GB) of data in Standard storage would cost $23 a month. Storing the same 1 TB of data in Glacier Deep Archive storage would cost $0.99 a month, roughly 96% less! If your organization currently stuffs all of its data into standard cloud storage without differentiating based on access needs, optimizing your storage can significantly reduce your storage costs.&nbsp;</p>
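<p>The arithmetic behind those figures can be sketched in a few lines of Python. This is a minimal illustration rather than a pricing tool: the per-GB monthly rates are the ones quoted in this post, the function and variable names are hypothetical, 1 TB is treated as 1,000 GB, and retrieval and request fees are ignored.</p>

```python
# Per-GB monthly storage rates quoted in this post (USD); illustrative only.
RATES_PER_GB_MONTH = {
    "Standard": 0.023,
    "Standard-IA": 0.0125,
    "Glacier Flexible Retrieval": 0.0036,
    "Glacier Deep Archive": 0.00099,
}

GB_PER_TB = 1000  # assumption: decimal terabytes, matching the $23 example


def monthly_cost(terabytes, tier):
    """Return the monthly storage cost in USD for the given tier."""
    return terabytes * GB_PER_TB * RATES_PER_GB_MONTH[tier]


def tiering_savings(terabytes, from_tier, to_tier):
    """Return the monthly saving in USD from moving data between tiers."""
    return monthly_cost(terabytes, from_tier) - monthly_cost(terabytes, to_tier)


if __name__ == "__main__":
    for tier in RATES_PER_GB_MONTH:
        print(f"{tier}: ${monthly_cost(1, tier):.2f} per TB per month")
    saving = tiering_savings(1, "Standard", "Glacier Deep Archive")
    print(f"Saving per TB moved to Deep Archive: ${saving:.2f} per month")
```

<p>Swapping in your own data volumes per tier gives a quick first estimate of what re-tiering could save before you account for retrieval charges.</p>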
<h2>From Storage to Computing and Back Again</h2>
<p>Data lineage can reduce your data storage costs by showing you both:</p>
<ul>
<li>Which data you can eliminate</li>
<li>Which data you can store more effectively</li>
</ul>
<p>But that’s not all! Less data not only reduces cloud storage costs; it can also reduce compute costs. <a href="/content/www/en-us/products/data-warehouse.html">Cloud-based data warehouses</a> like Snowflake and Amazon Redshift usually have a pay-per-usage model on compute, charging for the time it takes to run queries across the datasets. The more data you include in your query, the longer it will take to run, and the higher your charge will be.&nbsp;</p>
<p>Reducing the amount of data you’re storing (or keeping in standard storage) will usually mean less data included in your queries, indirectly reducing compute costs. But data lineage also provides you with a direct way to decrease your compute costs: restricting exploration queries.&nbsp;</p>
<p>Exploration queries tend to use a lot of computing power. With a clear data lineage map, your data team can see exactly where the relevant data is, enabling them to run much more targeted queries across the platform, and eliminating or reducing the need for general exploration queries.&nbsp;</p>
<h2>Next Steps</h2>
<p>If cloud data storage costs are getting you down, it’s time to turn the tables and get them down instead. Just pull out your automated data lineage magic wand and follow these steps: Investigate! Eliminate! Optimize!&nbsp;</p>
<p>See those data storage costs shrink!? Okay, it may take a wee bit more work than that. But when your enterprise gets its next, lower bill from its cloud data services provider, it will still feel magical.&nbsp;</p>
<p>Want to learn more?&nbsp; Request a demo to get started with <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a>—an automated data lineage solution that can help you implement these steps and reduce your cloud storage costs today.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=3-steps-to-cutting-cloud-costs-with-data-lineage</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Empowering Enterprise AI with Structured Synthetic Data: Preserving Privacy and Source-Statistical Properties</title><description><![CDATA[In the era of data-driven AI, enterprises need high-quality datasets to analyze or train AI models, yet data privacy regulations and ethical concerns restrict the use or sharing of real-world data.]]></description><link>https://www.cloudera.com/blog/business/empowering-enterprise-ai-with-structured-synthetic-data-preserving-privacy-and-source-statistical-properties.html</link><guid>https://www.cloudera.com/blog/business/empowering-enterprise-ai-with-structured-synthetic-data-preserving-privacy-and-source-statistical-properties.html</guid><pubDate>Wed, 01 Oct 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Andreas Tsiartas,Yi-Hsun Tsai,Robert Hryniewicz]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty636173148.jpg"><p>In the era of data-driven AI, enterprises need high-quality datasets to analyze or train AI models, yet data privacy regulations and ethical concerns restrict the use or sharing of real-world data. How can organizations innovate without compromising sensitive information?&nbsp;</p>
<p>At Cloudera, we’ve pioneered a solution that bridges this gap. Cloudera’s Synthetic Data Studio, part of the <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Studio</a> toolset, creates entirely synthetic datasets that mimic an organization’s actual data patterns, so organizations can innovate without risk to confidential information.</p>
<table>
<tbody><tr><td><h2>Key Takeaways</h2>
<p>Cloudera’s approach to synthetic data generation offers a blueprint for enterprises wanting to use or share sensitive structured data. The approach illustrates:</p>
<ul>
<li><p>Privacy as a feature: Synthetic data becomes a strategic asset that enables innovation in restricted domains</p>
</li>
<li><p>Statistical fidelity matters: Clustering and seed instructions ensure synthetic data retains the nuanced relationships that make models effective</p>
</li>
<li><p>Scalability for enterprise AI: Automated workflows reduce the cost and time of synthetic data generation</p>
</li>
</ul>
</td>
</tr></tbody></table>
<h2>The Business Challenge: Leveraging AI Models While Ensuring Compliance</h2>
<p>Consider a financial services company striving to predict loan defaults. Real-world data in this domain is a treasure trove of sensitive details: income levels, employment histories, and credit scores. Sharing such data with third parties or AI models is fraught with regulatory and ethical hurdles.</p>
<p>Traditional synthetic data methods often fall short, failing to capture the nuanced logical relationships between variables (such as how existing debts might influence repayment behavior) or the logical consistency between data points across rows and columns. Companies require a synthetic data solution that can scale, preserve the statistical integrity of the original data, and ensure compliance with privacy standards.</p>
<h2>Cloudera’s Solution: Structured Synthetic Data Generation&nbsp;</h2>
<p>Cloudera’s solution follows a four-step workflow that incorporates clustering techniques, Cloudera Synthetic Data Studio, and rigorous validation.&nbsp;</p>
<h3>Step 1: Profile Data</h3>
<p>The journey begins with partitioning and clustering the data to create statistical profiles. By categorizing borrowers into groups based on risk levels—high-risk versus low-risk applicants, for instance—and further clustering numerical variables like loan amounts and interest rates, we distill the dataset into “seed instructions.”&nbsp;</p>
<p>Seed instructions encode the statistical properties of each group, such as means, standard deviations, and correlations, while embedding borrower information such as loan grades or loan statuses. This step ensures that the synthetic data inherits the structure of the original data without exposing sensitive details.&nbsp;&nbsp;</p>
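<p>To make the idea of a statistical profile concrete, here is a minimal sketch using only the Python standard library. It is not Cloudera's implementation: the records and column names are hypothetical, and grouping by an existing risk label stands in for the clustering step described above. It simply shows how per-group means, standard deviations, and correlations could be computed without carrying any individual row forward.</p>

```python
import math
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical loan records; column names are illustrative only.
RECORDS = [
    {"risk": "low", "loan_amount": 10000.0, "interest_rate": 6.0},
    {"risk": "low", "loan_amount": 12000.0, "interest_rate": 7.0},
    {"risk": "low", "loan_amount": 14000.0, "interest_rate": 8.0},
    {"risk": "high", "loan_amount": 20000.0, "interest_rate": 15.0},
    {"risk": "high", "loan_amount": 26000.0, "interest_rate": 18.0},
    {"risk": "high", "loan_amount": 32000.0, "interest_rate": 21.0},
]


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / norm if norm else 0.0


def build_profiles(records, group_key, columns):
    """Summarize each group with means, standard deviations, and pairwise correlations."""
    groups = defaultdict(list)
    for row in records:
        groups[row[group_key]].append(row)
    profiles = {}
    for label, rows in groups.items():
        values = {c: [r[c] for r in rows] for c in columns}
        profiles[label] = {
            "count": len(rows),
            "mean": {c: mean(v) for c, v in values.items()},
            "stdev": {c: stdev(v) for c, v in values.items()},
            "correlation": {
                f"{a}~{b}": pearson(values[a], values[b])
                for i, a in enumerate(columns)
                for b in columns[i + 1:]
            },
        }
    return profiles
```

<p>Calling <code>build_profiles(RECORDS, "risk", ["loan_amount", "interest_rate"])</code> returns one summary per risk group; only these aggregates, not the underlying rows, would then feed the generation step.</p>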
<h3>Step 2: Generate Data Using Cloudera Synthetic Data Studio</h3>
<p>With these seed instructions in place, the next phase leverages LLM-powered generation. Using advanced models like Llama 3.3-70B-Instruct, we synthesize new records guided by the statistical blueprints seen in the seed instructions. Cloudera Synthetic Data Studio acts as a creative force, generating data that preserves the relationships and patterns defined in the seed instructions.</p>
<p>This is where the magic happens: the model doesn’t just produce random numbers but constructs data that reflects the complexity of real-world scenarios, such as how a borrower’s income might logically influence their repayment history.&nbsp;&nbsp;</p>
<h3>Step 3: Filter Data</h3>
<p>However, not all generated data meets the required quality. To ensure fidelity, we employ an innovative LLM-as-a-judge workflow.&nbsp;</p>
<p>This step evaluates synthetic outputs against a set of criteria, including formatting consistency, logical coherence (for example, ensuring mortgage accounts align with home ownership status), and realism (for example, generating plausible interest rates). Only data that scores highly—meeting a threshold of 9 out of 10—is retained. This filtering process acts as a quality gate, ensuring that the final dataset is both realistic and statistically robust.&nbsp;&nbsp;</p>
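<p>The quality gate itself is easy to sketch. In the example below, a simple rule-based function stands in for the LLM judge so that the code runs on its own; the field names, the specific rules, and the point deductions are all assumptions made for illustration. Only the threshold of 9 out of 10 comes from the workflow described above.</p>

```python
from typing import Callable

THRESHOLD = 9  # minimum score out of 10, as quoted in the text


def rule_based_judge(record: dict) -> int:
    """Hypothetical stand-in for an LLM judge: start at 10, deduct per failed check."""
    score = 10
    # Logical coherence (assumed rule): renters should not hold mortgage accounts.
    if record["mortgage_accounts"] > 0 and record["home_ownership"] == "RENT":
        score -= 5
    # Realism (assumed band): interest rates should fall between 1% and 40%.
    rate = record["interest_rate"]
    if rate > 40.0 or 1.0 > rate:
        score -= 5
    # Formatting (assumed rule): the loan grade should be a single letter from A to G.
    if record["grade"] not in set("ABCDEFG"):
        score -= 2
    return score


def keep_high_quality(records, judge: Callable[[dict], int], threshold: int = THRESHOLD):
    """Keep only the records whose judge score meets the threshold."""
    return [record for record in records if judge(record) >= threshold]


# Hypothetical generated records used to exercise the filter.
GENERATED = [
    {"grade": "B", "home_ownership": "MORTGAGE", "mortgage_accounts": 1, "interest_rate": 9.5},
    {"grade": "C", "home_ownership": "RENT", "mortgage_accounts": 2, "interest_rate": 12.0},
    {"grade": "A", "home_ownership": "OWN", "mortgage_accounts": 0, "interest_rate": 95.0},
]
```

<p>With these sample records, <code>keep_high_quality(GENERATED, rule_based_judge)</code> retains only the first one. In practice the <code>judge</code> argument would wrap a call to a model, which is why the filter accepts any callable that returns a score.</p>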
<h3>Step 4: Validate Data</h3>
<p>The final phase of the workflow involves statistical and visual validation. By comparing synthetic data to the original dataset using metrics like KL divergence for categorical variables and mean/standard deviation differences for continuous features, we confirm that the synthetic data mirrors the real-world distributions.&nbsp;</p>
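<p>Both kinds of metric mentioned here are straightforward to compute. The sketch below is a plain-Python illustration rather than the validation code used in the study: it computes KL divergence (in nats) for one categorical column, with a small smoothing constant assumed so that unseen categories do not cause division by zero, and a relative difference for a summary statistic of a continuous column.</p>

```python
import math
from collections import Counter
from statistics import mean, stdev


def category_distribution(values, categories, smoothing=1e-9):
    """Relative frequency of each category, smoothed so no probability is zero."""
    counts = Counter(values)
    total = len(values) + smoothing * len(categories)
    return {c: (counts[c] + smoothing) / total for c in categories}


def kl_divergence(real_values, synthetic_values):
    """KL(real || synthetic) in nats for one categorical column."""
    categories = sorted(set(real_values) | set(synthetic_values))
    p = category_distribution(real_values, categories)
    q = category_distribution(synthetic_values, categories)
    return sum(p[c] * math.log(p[c] / q[c]) for c in categories)


def relative_difference(real_values, synthetic_values, statistic=mean):
    """Relative gap in a statistic between real and synthetic data.

    Assumes the statistic of the real data is nonzero.
    """
    real = statistic(real_values)
    return abs(statistic(synthetic_values) - real) / abs(real)
```

<p>A divergence near zero means the synthetic category mix closely tracks the original. Passing <code>statistic=stdev</code> to <code>relative_difference</code> gives the matching check for spread.</p>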
<h2>The Impact: Privacy Without Compromise</h2>
<p>Cloudera’s approach generates data that is free of personally identifiable information (PII) and sensitive patterns, yet retains the statistical fidelity needed to train accurate models. This enables companies to share synthetic data with third-party systems or collaborate with external partners without fear of data breaches or regulatory penalties.&nbsp;&nbsp;</p>
<p>As shown in Table 1, when we used a Llama 3.3-70B-Instruct model to generate structured loan data (27 columns total), 100% of the generated records matched the expected output format, 97.2% contained no logical cross-column errors when judged by an LLM, statistical means deviated by 12% from the original distribution, and cross-column correlations deviated by 0.24.</p>
<table>
<tbody><tr><td colspan="4"><p><b>Structured Data Generation Results Using Llama 3.3-70B-Instruct</b></p>
</td>
</tr><tr><td><p><b>Data Integrity</b></p>
</td>
<td><p>100% format accuracy</p>
</td>
<td colspan="2"><p>The synthetic data is a perfect match for the original structure.</p>
</td>
</tr><tr><td><p><b>Statistical Fidelity</b></p>
</td>
<td><p>12% mean deviation</p>
</td>
<td colspan="2"><p>The synthetic data accurately mimics the key statistical properties of the original.</p>
</td>
</tr><tr><td><p><b>Cross-Column Logical Consistency</b></p>
</td>
<td><p>2.8% logical errors</p>
</td>
<td colspan="2"><p>The generated data reflects real-world logical relationships.</p>
</td>
</tr><tr><td><p><b>Cross-Column Correlation Preservation</b></p>
</td>
<td><p>0.24 correlation difference</p>
</td>
<td colspan="2"><p>The key connections between features are authentically preserved.</p>
</td>
</tr></tbody></table>
<p><i>Table 1: Structured Data Generation Results Using Llama 3.3-70B-Instruct</i></p>
<h2>Conclusion</h2>
<p>As AI models grow more complex and privacy regulations tighten, the demand for high-quality, privacy-compliant data will only intensify. In the coming years, we expect structured data generation methodologies to redefine industries from healthcare to finance, where data privacy is non-negotiable.&nbsp;</p>
<p>Cloudera’s structured synthetic data approach shows that enterprises can meet this demand without compromising on privacy or performance. By combining clustering, Cloudera Synthetic Data Studio, and rigorous evaluations, organizations can unlock the full potential of structured data.&nbsp;</p>
<p>If you’re interested in learning more, <a href="/content/www/en-us/products/machine-learning/ai-studios/product-tour.html">take our product tour</a> of Cloudera AI Studios, or reach out to our team at <a href="mailto:ai_feedback@cloudera.com">ai_feedback@cloudera.com</a>.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=empowering-enterprise-ai-with-structured-synthetic-data-preserving-privacy-and-source-statistical-properties</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>A Year-Over-Year Look at AI Challenges and Shifting Perspectives</title><description><![CDATA[The AI landscape is constantly shifting. So, how does today’s AI environment compare to one year ago? How have attitudes changed? What challenges are enterprise leaders facing when it comes to AI adoption? Let’s dive into some of the biggest shifts.]]></description><link>https://www.cloudera.com/blog/business/a-year-over-year-look-at-ai-challenges-and-shifting-perspectives.html</link><guid>https://www.cloudera.com/blog/business/a-year-over-year-look-at-ai-challenges-and-shifting-perspectives.html</guid><pubDate>Tue, 30 Sep 2025 13:30:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-window-cleaning.webp"><p>In just the last few years, artificial intelligence (AI) has exploded across enterprise organizations, with new use cases emerging&nbsp; at a rapid pace. Tools and models like<a href="/content/www/en-us/about/news-and-blogs/press-releases/2025-04-16-96-percent-of-enterprises-are-expanding-use-of-ai-agents-according-to-latest-data-from-cloudera.html"> AI agents</a> have introduced new opportunities and innovations that are redefining the marketplace.</p>
<p>Cloudera’s latest report, <a href="/content/www/en-us/campaign/the-evolution-of-ai-the-state-of-enterprise-ai-and-data-architecture.html?internal_keyplay=AI&amp;internal_campaign=Thought-Leadership-Reports---AlwaysOn-FY26-GLOBAL-CT-Report-State-of-Enterprise-AI-Survey&amp;cid=701Ui00000cKCojIAG&amp;internal_link=press-release-body-link">The Evolution of AI: The State of Enterprise AI and Data Architecture</a>, paints a clear picture. Most organizations have moved beyond experimentation and are integrating AI models into some of the most important facets of their businesses: 96% of IT leaders surveyed say that AI is at least somewhat integrated into core business processes. At the same time, many leaders feel they’ve yet to realize the full potential of AI, and challenges to adoption and secure use of AI persist.</p>
<p>The AI landscape is constantly shifting. So, how does today’s AI environment compare to one year ago? How have attitudes changed? What challenges are enterprise leaders facing when it comes to AI adoption? Let’s dive into some of the biggest shifts.</p>
<h1>Confidence in Data Is Rising, but with Room to Improve</h1>
<p>No matter the industry, maintaining a competitive edge depends on how quickly an organization can make accurate, informed decisions. But going a level deeper, that ability hinges on how well an organization can tap into its own data. For AI to be impactful, IT leaders should strive to make 100% of their data accessible. Cloudera’s survey reveals a notable gap here: just 9% said that all their data is available and accessible for AI.</p>
<p>Nearly one quarter (24%) of respondents said that they trust their data much more than they did last year, but 41% said they only trusted their data somewhat more. While confidence in data has shown signs of growth, enterprise leaders still hold some security concerns around AI implementation. Of those surveyed, 46% say they’re worried about the security and compliance risks that AI presents. And two of the top concerns relating to AI security are focused on data—50% cite data leakage during model training, and 48% note unauthorized data access as top challenges.</p>
<p>These results are not surprising. Enterprise leaders must maximize value from AI without exposing sensitive data or falling out of compliance. At a time when new regulations are constantly emerging, that can be easier said than done. As organizations strengthen their data architecture and capabilities, governance remains a focal point of any strategy to ensure consistent security.</p>
<h1>AI Adoption Challenges Persist</h1>
<p>Even as enterprise IT leaders show more trust in their data year over year (YoY), many of the same AI adoption challenges cited in 2024 remain. For example, data integration is still ranked as the top technical limitation in data architectures when supporting AI workloads. Other challenges cited by survey respondents in 2025 included storage performance, compute power, lack of automation, and latency.</p>
<p>While many of the same challenges from 2024 remain, one of the biggest shifts is the cost of accessing compute capacity for training models. The share of IT leaders who cite this as a barrier to AI adoption rose from 8% in 2024 to 42% this year, a 34-point jump! As enterprises push for more AI initiatives, with new tools and models, the costs of adoption and operation grow quickly, particularly if the data architecture supporting AI initiatives is not ready to handle more complex systems.</p>
<p>Then there’s the age-old problem of data silos, which have long caused trouble for IT leaders. Breaking down silos is a critical piece of effective AI. When a model is trained on incomplete data, the outputs are vulnerable to inaccuracies that could prove costly. Of the IT leaders surveyed by Cloudera, 61% say that siloed data has at least sometimes negatively impacted their ability to scale AI initiatives, but many are seemingly getting a handle on this problem, with 35% saying this was rarely impacting their own AI initiatives.</p>
<h1>What’s Next for AI and Data Architecture</h1>
<p>AI is now integrated into some of the most critical business functions across enterprises. As enterprise leaders become more familiar with AI tools and models, the demand for data has accelerated shifts in data architecture. Those shifts have seen organizations become more data-driven culturally, giving leaders more confidence in their organization’s data.</p>
<p>And yet, many of the same challenges surrounding AI adoption and security have remained consistent YoY, while new difficulties around operating costs have emerged.&nbsp;</p>
<p>Wherever an organization finds itself in its AI journey, having the right data architecture and AI infrastructure is critical to establishing long-term success.</p>
<p>Check out the full <a href="/content/www/en-us/campaign/the-evolution-of-ai-the-state-of-enterprise-ai-and-data-architecture.html?internal_keyplay=AI&amp;internal_campaign=Thought-Leadership-Reports---AlwaysOn-FY26-GLOBAL-CT-Report-State-of-Enterprise-AI-Survey&amp;cid=701Ui00000cKCojIAG&amp;internal_link=blog-body-content">report</a> and learn more about how Cloudera is helping organizations bring AI to their data, anywhere it resides.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=a-year-over-year-look-at-ai-challenges-and-shifting-perspectives</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Enterprise AI and Data Architecture in 2025: From Experimentation to Integration</title><description><![CDATA[In 2024, Cloudera set out to understand the state of enterprise AI and data architectures, releasing its first survey report on the subject: The State of Enterprise AI and Modern Data Architectures. The results from that survey painted a picture of an enterprise AI landscape where IT leaders were ready to capitalize on AI but struggled with outdated data architectures.]]></description><link>https://www.cloudera.com/blog/business/enterprise-ai-and-data-architecture-in-2025-from-experimentation-to-integration.html</link><guid>https://www.cloudera.com/blog/business/enterprise-ai-and-data-architecture-in-2025-from-experimentation-to-integration.html</guid><pubDate>Thu, 25 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1190444867.jpg"><p>In 2024, Cloudera set out to understand the state of enterprise AI and data architectures, releasing its first survey report on the subject:<a href="https://www.cloudera.com/campaign/the-state-of-enterprise-ai-and-modern-data-architectures.html"> The State of Enterprise AI and Modern Data Architectures</a>. The results from that survey painted a picture of an enterprise AI landscape where IT leaders were ready to capitalize on AI but struggled with outdated data architectures.</p>
<p>Now, a year later, how are enterprises faring in their AI journeys? To better understand the current state of AI and data architecture, Cloudera released a follow-up survey report: <a href="https://www.cloudera.com/campaign/the-evolution-of-ai-the-state-of-enterprise-ai-and-data-architecture.html?internal_keyplay=AI&amp;internal_campaign=Thought-Leadership-Reports---AlwaysOn-FY26-GLOBAL-CT-Report-State-of-Enterprise-AI-Survey&amp;cid=701Ui00000cKCojIAG&amp;internal_link=blog-body-content">The Evolution of AI: The State of Enterprise AI and Data Architecture</a>.</p>
<p>The survey of 1,574 enterprise IT leaders across the US, EMEA, and APAC shows that AI is moving from experimentation to deep integration, with the focus on data and current data architecture deployments evolving in lockstep.</p>
<p>Let’s dive into the findings.</p>
<h2>The State of Enterprise AI: Maximizing Value</h2>
<p>This year’s report reveals that enterprise AI has moved from experimentation to full integration in core processes and workflows:</p>
<ul>
<li><p>96% of respondents say that AI is at least somewhat integrated into their core business processes</p>
</li>
<li><p>54% say they have significant AI integration</p>
</li>
<li><p>21% say it’s already fully embedded&nbsp;</p>
</li>
</ul>
<p>These numbers make it clear—AI has become table stakes for enterprise success.</p>
<p>And the benefits of AI aren’t something relegated to the abstract or hypothetical. A growing number of IT leaders are seeing real value generated. In fact, most (52%) report they’re significantly successful in realizing measurable value from AI, while only 1% have yet to see results.</p>
<p>So, what types of AI are these organizations utilizing to generate that success? Cloudera’s survey found enterprise IT leaders are tapping into a broad set of AI forms. This includes generative (60%), deep learning (53%), predictive (50%), supervised learning (43%), classification (41%), agentic (36%), and regression (24%) models.</p>
<p>As AI portfolios diversify, the lesson is clear: leaders aren’t relying on a single “hero model” but building collections tuned to use case, risk, and cost. Likewise, they want visibility and control over all their data, not just a subset, so decisions are smarter and AI more effective.</p>
<p>Enterprises are gearing up for newer forms of AI. Agentic capabilities are crossing from experiments to production. Sixty-seven percent feel more prepared to manage agents than a year ago (26% say much more prepared). Already, 36% run agents as a primary model type, and 83% believe investing in agents is essential to maintaining a competitive edge.</p>
<p>Leading organizations will pair guardrails with clear ownership models for agent actions and data access. The pivot from applications to intelligent agents is underway, and success will depend on unifying policies wherever those agents run.</p>
<h2>Examining Today’s Data Attitudes and Architectures</h2>
<p>Enterprise culture around data is maturing. Eighty-six percent of leaders describe their organization as at least moderately data-driven. The share calling their culture extremely data-driven rose to 24%, up from 17% a year ago. That culture shift is accompanied by growing confidence in enterprise data as well.</p>
<p>Among survey respondents, 24% say they trust their organization’s data much more than they did one year ago, and another 41% say they trust their organization’s data somewhat more.&nbsp;</p>
<p>As enterprise leaders look to enable AI at scale, the foundation of data architecture they choose may vary:</p>
<ul>
<li><p>63% of organizations are storing their data in private clouds</p>
</li>
<li><p>52% are storing data in public clouds</p>
</li>
<li><p>38% say they rely on on-premises mainframes</p>
</li>
<li><p>32% note they use on-premises distributed options</p>
</li>
</ul>
<p>With data spread across a mix of storage methods, success with AI hinges on an organization’s ability to bring AI to data anywhere: in clouds, data centers, or at the edge.</p>
<h2>As Confidence in Data Rises, the Bottlenecks Still Bite</h2>
<p>Even as enterprises grow more confident in their data and embrace a wider range of AI models, many adoption and implementation challenges persist. Asked what the biggest technical limitation of their architecture was, respondents chose data integration (37%) as their top issue. This is followed by storage performance (17%), compute power (17%), lack of automation (17%), and latency (12%).&nbsp;</p>
<p>Then there are challenges that have evolved since last year. Compared to 2024, the cost of accessing compute capacity for training AI models is on the rise. One year ago, just 8% of surveyed IT leaders noted these costs were too high. Today, that number has increased to 42%, a 34-point jump!</p>
<p>Many respondents also face challenges in accessing and using their organization’s data for AI initiatives. While 38% of global respondents note that most of their organization’s data is accessible and usable for these initiatives, just 9% say that all of their data is available. With data inaccessible to AI, these organizations may be missing potential market opportunities or making decisions based on faulty information.</p>
<h2>Where Are AI and Data Headed Next?</h2>
<p>Enterprise leaders are more confident in their data. AI is becoming deeply integrated into core processes, transforming everything from operational efficiency to customer experience. But many have yet to make all of their data accessible to AI. This gap in access within data architectures not only poses serious competitive risks but also means AI initiatives may not be as effective as they otherwise could be.</p>
<p>Maximizing the value of AI is critical for the long-term outlook of enterprises, particularly as they seek to scale the technology. Overcoming these challenges starts with understanding internal data needs and prioritizing partners and tools that help bring AI to data anywhere, wherever that data resides.</p>
<p>Read the <a href="https://www.cloudera.com/campaign/the-evolution-of-ai-the-state-of-enterprise-ai-and-data-architecture.html?internal_keyplay=AI&amp;internal_campaign=Thought-Leadership-Reports---AlwaysOn-FY26-GLOBAL-CT-Report-State-of-Enterprise-AI-Survey&amp;cid=701Ui00000cKCojIAG&amp;internal_link=blog-body-content">full report</a> to uncover the current state of AI and data architecture, and learn more about why Cloudera is the only data and AI platform company that large organizations trust to bring AI to their data anywhere it lives.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=enterprise-ai-and-data-architecture-in-2025-from-experimentation-to-integration</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Revolutionize Your Data Strategy: Unleash the Power of Cloudera Octopai Data Lineage for Seamless Metadata Management and Data Lineage</title><description><![CDATA[Unlock seamless metadata management and data lineage with Cloudera &amp; Octopai, transforming your data strategy for better insights and control.]]></description><link>https://www.cloudera.com/blog/technical/revolutionize-your-data-strategy-unleash-the-power-of-cloudera-octopai-data-lineage-for-seamless-metadata-management-and-data-lineage.html</link><guid>https://www.cloudera.com/blog/technical/revolutionize-your-data-strategy-unleash-the-power-of-cloudera-octopai-data-lineage-for-seamless-metadata-management-and-data-lineage.html</guid><pubDate>Thu, 18 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1388687270.jpg"><p>Today’s data landscape is vast and continues to evolve rapidly. With organizations collecting more data than ever before—across cloud and on-premises platforms and various analytics tools—businesses must navigate an increasingly complex ecosystem of data sources. When data is spread across multiple environments, tracking and understanding its flow becomes complex, error-prone, and time-consuming.</p>
<p>In such complex data ecosystems, metadata and data lineage become the single source of truth, leading to improved data utilization, breaking down data silos, aiding regulatory compliance, and providing AI governance. On the flip side, lacking appropriate metadata and <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">data lineage infrastructure</a> becomes a barrier to achieving actionable insights, and businesses struggle to get a complete view of their data, making it difficult to ensure quality, compliance, and security.&nbsp;</p>
<p>&nbsp;</p>
<h2>The Challenge in Managing Metadata and Data Lineage Across Various Environments and Tools</h2>
<h3>Inconsistent Metadata Management</h3>
<p>Metadata is often called the &quot;data about data.&quot; It can be business-, social-, or operations-related, and it provides essential context for raw data, such as its structure, format, source, and the rules governing its use. When metadata is inconsistent or fragmented across systems, it leads to several challenges, including:</p>
<ul>
<li><p><b>Inconsistent definitions:</b> Different departments or systems may use different terms or definitions for the same data elements. For instance, a customer record in the sales department might not have the same metadata as a customer record in the finance department. This inconsistency creates confusion and reduces the ability to work cross-functionally. The business impact can be significant: sales might report 10,000 active customers based on recent interactions, while finance reports only 7,500 because they define &quot;active&quot; differently. Such discrepancies can lead to misguided strategic decisions, misallocated budgets, and even strained customer relationships due to inconsistent communication across departments.<br>
</p>
</li>
<li><p><b>Difficulties in data discovery: </b>Metadata enables teams to quickly locate the data they need, but when metadata isn’t centralized or well-maintained, it becomes a needle-in-a-haystack situation for data engineers and analysts. Teams waste valuable time searching for the right data and may miss important datasets altogether, resulting in incomplete analyses.<br>
</p>
</li>
<li><p><b>Lack of contextual understanding:</b> Without a clear understanding of how data is structured and its intended use, teams may misinterpret it or apply it incorrectly. For example, if an analyst doesn’t know that a dataset has been cleaned or transformed, they may spend time reprocessing data unnecessarily or using outdated information.</p>
</li>
</ul>
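<p>The &quot;active customer&quot; discrepancy described above is easy to reproduce. In this hypothetical sketch, the field names, the 90-day window, and both definitions are assumptions chosen for illustration; the point is only that two reasonable definitions applied to the same records yield different counts.</p>

```python
from datetime import date, timedelta

TODAY = date(2025, 9, 18)  # fixed date so the example is reproducible

# Hypothetical customer records; field names are illustrative.
CUSTOMERS = [
    {"id": 1, "last_interaction": date(2025, 9, 1), "last_payment": date(2025, 8, 20)},
    {"id": 2, "last_interaction": date(2025, 8, 30), "last_payment": date(2024, 12, 1)},
    {"id": 3, "last_interaction": date(2025, 1, 5), "last_payment": date(2025, 9, 10)},
    {"id": 4, "last_interaction": date(2025, 9, 15), "last_payment": None},
]


def active_for_sales(customer, window_days=90):
    """Sales-style definition (assumed): any interaction within the window."""
    return customer["last_interaction"] >= TODAY - timedelta(days=window_days)


def active_for_finance(customer, window_days=90):
    """Finance-style definition (assumed): a payment within the window."""
    paid = customer["last_payment"]
    return paid is not None and paid >= TODAY - timedelta(days=window_days)


def count_active(customers, definition):
    """Count the customers that a given definition treats as active."""
    return sum(1 for customer in customers if definition(customer))
```

<p>Recording which definition each system applies, as shared metadata, is what allows the two departments to reconcile their numbers instead of disputing them.</p>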
<p>&nbsp;</p>
<h3>Poor Data Traceability&nbsp;</h3>
<p>Data lineage refers to the <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">traceability of data</a>, including its origins, transformations, and movements throughout an organization's systems. Without clear data lineage, businesses struggle to understand how data flows, where it’s coming from, and how it changes over time. This becomes especially problematic when:</p>
<ul>
<li><p><b>Data is distributed across platforms:</b> Many businesses use a combination of on-premises systems, cloud platforms, and a variety of third-party applications. Each system may use different formats or methodologies for managing metadata and lineage, making it difficult to see a unified view of how data is being used and transformed.<br>
</p>
</li>
<li><p><b>Lack of visibility into transformations:</b> When data moves through multiple stages or systems, it undergoes various transformations. Without clear tracking of these changes, teams can’t confidently rely on the data for analytics, leading to incorrect insights and decisions. Missing or incomplete data lineage also hinders troubleshooting errors or improving processes.</p>
</li>
</ul>
<ul>
<li><p><b>Data traceability gaps:</b> As data moves through pipelines and systems, traceability is often lost. If teams can’t pinpoint exactly where data has been sourced or how it’s been altered, it becomes a challenge to maintain data integrity and ensure that the data is trustworthy for use in critical decision-making.</p></li>
</ul>
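<p>One way to picture what traceability provides is a record, for every column, of where it came from and what was done to it. The sketch below uses a hypothetical column-level lineage table (the column names and transformations are invented for illustration) and walks it back to the original source.</p>

```python
# Hypothetical column-level lineage: each column maps to
# (source column, transformation applied to produce it).
# Columns with no entry are treated as original sources.
COLUMN_LINEAGE = {
    "report.revenue_usd": ("warehouse.revenue", "convert currency to USD"),
    "warehouse.revenue": ("staging.order_total", "sum by customer and month"),
    "staging.order_total": ("crm.orders.total", "drop cancelled orders"),
}


def trace_to_source(column: str) -> list[tuple[str, str]]:
    """Walk upstream from a column, listing each hop and how it was produced."""
    path = []
    visited = set()
    while column in COLUMN_LINEAGE:
        if column in visited:
            raise ValueError(f"cycle detected at {column}")
        visited.add(column)
        source, transformation = COLUMN_LINEAGE[column]
        path.append((column, transformation))
        column = source
    path.append((column, "original source"))
    return path


if __name__ == "__main__":
    for name, step in trace_to_source("report.revenue_usd"):
        print(f"{name}: {step}")
```

<p>When any hop in such a table is missing, the walk stops early, which is precisely the traceability gap described above.</p>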
<h3>Fragmentation from Data Silos</h3>
<p>When data is siloed within individual departments or tools, the ability to understand how data moves across the organization is compromised. Data silos cause fragmentation, which exacerbates the challenge of managing metadata and data lineage, including:</p>
<ul>
<li><p><b>Disjointed metadata: </b>As data is stored across multiple systems, metadata often resides in silos as well. Each system might have its own metadata repository, which makes it difficult to maintain a consistent, enterprise-wide understanding of the data’s lifecycle. Without a holistic view of metadata, it becomes nearly impossible to track data lineage accurately.<br>
</p>
</li>
<li><p><b>Inability to integrate new tools:</b> When data is siloed and metadata is not standardized, integrating new tools into the existing ecosystem becomes a monumental task. For example, adding new data sources or analytics tools requires businesses to manually reconcile metadata across systems, which can lead to errors and slow down adoption.<br>
</p>
</li>
<li><p><b>Difficulty in maintaining compliance:</b> As data becomes more fragmented, ensuring that it complies with governance and regulatory standards becomes more challenging. Without a consistent understanding of where data has been and how it’s been altered, businesses cannot guarantee compliance with standards like GDPR, HIPAA, or other industry-specific regulations.</p>
</li>
</ul>
<h2>Cloudera Octopai Data Lineage Unifies and Automates Metadata Management and Data Lineage Across Tools</h2>
<p><a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> offers a unified, intuitive solution that eliminates the fragmentation caused by data silos and complex integrations, helping organizations strengthen governance and streamline collaboration. Its capabilities act as the backbone of initiatives including data quality, compliance and governance, and cross-team collaboration.</p>
<ul>
<li><p><b>Consistent metadata management:</b> It aggregates metadata from various sources into a single, centralized repository. This ensures that all metadata—whether from cloud platforms, on-premises systems, or third-party tools—is accessible in one place.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Automatic data lineage tracking: </b>It automatically maps and tracks data lineage. This is achieved through intelligent algorithms that scan the data pipelines and connections between systems, creating a visual representation of how data flows across the organization. Data lineage capabilities are multilayered (cross-system, inner-system, and end-to-end column level), supporting granular governance, debugging, and AI/ML explainability. This delivers end-to-end visibility and near real-time updates, and enables quick detection of errors and their impact.</p>
</li>
</ul>
<ul>
<li><p><b>Breaks down silos with prebuilt connectors:</b> Cloudera Octopai Data Lineage <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html#integrations">provides more than 60 connectors</a>, covering a range of widely used platforms, including databases, cloud platforms, and ETL and BI tools. While APIs and connectors both serve as means to integrate with other systems and tools, connectors simplify the integration process significantly, providing a ready-to-use interface for connecting to a data source or system without requiring extensive custom development.&nbsp;</p>
</li>
</ul>
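<p>To make the idea of column-level lineage more concrete, the following Python sketch models lineage as a list of (source column, target column) edges and walks it backwards to find the original sources of a report column. It is a minimal illustration only: the table and column names, the <code>LINEAGE_EDGES</code> list, and the <code>upstream_sources</code> function are hypothetical and do not represent the Cloudera Octopai Data Lineage API or its internal algorithms.</p>
<pre><code class="language-python">
# Hypothetical column-level lineage edges: (source column, target column).
# In practice such edges would be harvested from ETL jobs, SQL scripts, and BI tools.
LINEAGE_EDGES = [
    ("crm.customers.email", "staging.customers.email"),
    ("erp.orders.amount", "staging.orders.amount"),
    ("staging.orders.amount", "warehouse.sales.revenue"),
    ("staging.customers.email", "warehouse.sales.customer_email"),
    ("warehouse.sales.revenue", "bi.dashboard.total_revenue"),
]


def upstream_sources(column, edges):
    """Return the sorted root columns that feed `column`, directly or indirectly."""
    # Index the edges so each target column maps to its direct parents.
    parents = {}
    for source, target in edges:
        parents.setdefault(target, set()).add(source)

    roots, seen, stack = set(), set(), [column]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        current_parents = parents.get(current, set())
        if current_parents:
            stack.extend(current_parents)
        elif current != column:
            # No recorded parents: this column is an original source.
            roots.add(current)
    return sorted(roots)


print(upstream_sources("bi.dashboard.total_revenue", LINEAGE_EDGES))
</code></pre>
<p>In this toy graph, the dashboard metric traces back to a single ERP column, which is the kind of answer a team needs when a number on a report looks wrong.</p>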
<h3>Connectors for Apache Hive and Apache Impala workloads on Cloudera platform</h3>
<p>Two connectors we want to highlight are those for Apache Hive and Apache Impala, two widely used SQL-based query engines in enterprise data environments. Hive and Impala are critically important in AI/ML workloads, where they are used for staging data, running transformations, and serving real-time analytics.</p>
<p>These connectors offer the following capabilities and benefits:</p>
<ul>
<li><p>Seamlessly integrate metadata and data lineage from Hive and Impala into <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage,</a> providing a more complete view of your data ecosystem.</p>
</li>
</ul>
<ul>
<li><p>Easily track how data flows and transforms across Hive, Spark, and Impala environments, ensuring greater visibility, data quality, and governance.&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Accelerate data discovery, enhance collaboration, and improve compliance, all while reducing the complexity of managing metadata across multiple platforms.&nbsp;</p>
</li>
</ul>
<h2>What This Means for the Future of Data and AI</h2>
<p>Whether managing a small set of data sources or large, complex data ecosystems and AI workloads, Cloudera Octopai Data Lineage is built to scale. Businesses can efficiently manage their metadata and data lineage as their data infrastructure evolves, and have the capabilities and support needed to govern model pipelines, trace training data, and meet AI auditability standards.&nbsp;</p>
<p>In a world where AI is shaping critical decisions, managing data pipelines in isolation is no longer sufficient. Organizations need full transparency into the data entering, flowing through, and leaving AI models. With Cloudera Octopai Data Lineage’s deep lineage and metadata integration, Cloudera extends governance to AI workloads—enabling responsible AI development, deployment, and oversight while ensuring compliance and trust in the data powering AI.</p>
<p>To learn more, reach out to your account team. To see how Cloudera customers are pioneering new use cases, sign up for <a href="/content/www/en-us/events/evolve.html">Cloudera EVOLVE</a> near you.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=revolutionize-your-data-strategy-unleash-the-power-of-cloudera-octopai-data-lineage-for-seamless-metadata-management-and-data-lineage</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera + NVIDIA Deliver AI-Powered Transformation in Financial Services</title><description><![CDATA[Cloudera and NVIDIA come together to streamline data pipelines at scale with Cloudera’s data management capabilities with NVIDIA’s full-stack services.]]></description><link>https://www.cloudera.com/blog/partners/cloudera-and-nvidia-deliver-ai-powered-transformation-in-financial-services.html</link><guid>https://www.cloudera.com/blog/partners/cloudera-and-nvidia-deliver-ai-powered-transformation-in-financial-services.html</guid><pubDate>Wed, 17 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Andreas Skouloudis]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-exterior-modern-high-rise-building.jpg"><p><i>Figure 1: Cloudera and NVIDIA deliver value across the data science lifecycle</i></p>
<p>In this blog, we will highlight three use cases that showcase how, together, Cloudera and NVIDIA deliver value with analytics and <a href="/content/www/en-us/solutions/financial-services.html">AI for financial services</a> institutions.</p>
<h2>NVIDIA RAPIDS Accelerator for Apache Spark for AML/KYC Compliance&nbsp;</h2>
<p>The anti-money laundering and know your customer (AML/KYC) compliance lifecycle in large financial organizations is a highly compute-intensive process. This is due to the need to integrate and standardize vast volumes of data across various activities, such as:&nbsp;</p>
<ul>
<li><p>Entity resolution, which requires the standardization of cross-border data subject to different data clearance processes and sourced from a wide range of transactional systems and external entities (such as credit card transactions, wire transfers, and SWIFT messages).&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Data consolidation from multiple AML/KYC systems that store information in different formats, which must be normalized into a unified schema and structured into data products (such as cross-business-unit AML data marts).</p>
</li>
</ul>
<ul>
<li><p>Ongoing transaction monitoring and regulatory reporting that require data processing, enrichment, and the application of rules.</p>
</li>
</ul>
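<p>The data consolidation step above can be pictured with a small Python sketch that maps records from two source systems onto one unified schema. This is an illustration under stated assumptions: the field names, the <code>FIELD_MAPS</code> dictionary, and the <code>normalize</code> function are hypothetical, and a production pipeline would apply the same logic at scale in an engine such as Apache Spark.</p>
<pre><code class="language-python">
# Hypothetical records from two AML/KYC systems with different field names and formats.
SYSTEM_A = [{"cust_id": "A-17", "amt": "1,250.00", "ccy": "usd"}]
SYSTEM_B = [{"customerId": "B-42", "amount": 980.5, "currency": "EUR"}]

# Per-system mapping from source field names to the unified schema.
FIELD_MAPS = {
    "system_a": {"cust_id": "customer_id", "amt": "amount", "ccy": "currency"},
    "system_b": {"customerId": "customer_id", "amount": "amount", "currency": "currency"},
}


def normalize(record, system):
    """Rename fields and standardize value formats for one source record."""
    unified = {FIELD_MAPS[system][key]: value for key, value in record.items()}
    # Amounts may arrive as formatted strings or as numbers; store them as floats.
    unified["amount"] = float(str(unified["amount"]).replace(",", ""))
    unified["currency"] = unified["currency"].upper()
    # Keep the origin of each row so it can be traced back later.
    unified["source_system"] = system
    return unified


data_mart = [normalize(r, "system_a") for r in SYSTEM_A]
data_mart += [normalize(r, "system_b") for r in SYSTEM_B]
for row in data_mart:
    print(row)
</code></pre>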
<p>For many Cloudera customers who have implemented AML/KYC use cases, Apache Spark plays a pivotal role in enabling these analytic workloads. Apache Spark is a powerful engine for <a href="/content/www/en-us/products/data-engineering.html">data engineering,</a> providing capabilities like in-memory computing and distributed processing. However, the surge in transaction volumes and the increasing variety of new data sources for AML/KYC compliance place additional strain on existing compute infrastructure, demanding even greater performance.</p>
<p>The NVIDIA RAPIDS Accelerator for Apache Spark transparently offloads specific data processing operations from CPU to GPU, without requiring any code modifications. As a result, Cloudera customers have experienced <a href="https://blogs.nvidia.com/blog/cloudera-spark-irs-gpus/" target="_blank" rel="noopener noreferrer">performance improvements of up to 20x</a> by using it for Apache Spark 3.0 workloads.</p>
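<p>As a rough illustration of what “no code modifications” can look like in practice, the accelerator is typically switched on through Spark configuration rather than through changes to the job itself. The Python sketch below assembles a <code>spark-submit</code> command with the <code>spark.plugins</code> and <code>spark.rapids.sql.enabled</code> settings. The jar path, the GPU amount, and the application file name are placeholders chosen for this example, and the exact settings should be checked against the RAPIDS Accelerator documentation for your Spark and Cloudera versions.</p>
<pre><code class="language-python">
import shlex

# Settings that enable the RAPIDS Accelerator plugin; the application code is unchanged.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
    # Assumption for this example: one GPU per executor. Tune for your cluster.
    "spark.executor.resource.gpu.amount": "1",
}


def build_submit_command(app_path, jar_path, conf):
    """Return a spark-submit argument list with one --conf flag per setting."""
    command = ["spark-submit", "--jars", jar_path]
    for key, value in sorted(conf.items()):
        command += ["--conf", f"{key}={value}"]
    command.append(app_path)
    return command


# Placeholder paths: substitute the accelerator jar and the job used in your environment.
cmd = build_submit_command(
    "aml_entity_resolution.py", "/opt/rapids/rapids-4-spark.jar", RAPIDS_CONF
)
print(shlex.join(cmd))
</code></pre>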
<h2>NVIDIA NIM Microservices for Fraud Prevention in Payments</h2>
<p>Two of the greatest challenges in fraud prevention are the explosion in transaction volumes in digital and credit card payments and the increasing sophistication of fraud techniques. These factors have led to resource contention and scalability challenges for AI/ML inference, necessitating the deployment of multiple composable AI/ML models to address emerging fraud methods.</p>
<p>To tackle these challenges, the Cloudera AI Inference service includes NVIDIA NIM microservices, which are designed to deliver high-performance, low-latency, and high-throughput inference for fraud prevention AI models on NVIDIA accelerated computing. For example, by using NVIDIA NIM, Cloudera AI Inference service can deliver up to 6x performance improvement for PyTorch models (using the <a href="https://developer.nvidia.com/blog/accelerating-inference-up-to-6x-faster-in-pytorch-with-torch-tensorrt/" target="_blank" rel="noopener noreferrer">Torch-TensorRT library</a>) and a 2.5x improvement for TensorFlow models (using the <a href="https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/optimization.html" target="_blank" rel="noopener noreferrer">TF-TensorRT library</a>), both of which are widely used in payments fraud prevention.</p>
<p>In addition, the Cloudera AI Inference service accelerates inference requests executed on NVIDIA accelerated computing by leveraging NVIDIA’s dynamic batching feature. This feature enables the combination of server-side inference requests, avoiding the inefficiency of processing one request at a time, which leaves much of the GPU idle. As a result, the Cloudera AI Inference service with NVIDIA NIM improves GPU utilization, reducing future GPU capital expenditures to meet growing demands for fraud prevention.&nbsp;</p>
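<p>The effect of dynamic batching can be illustrated with a toy Python simulation that groups incoming requests so that each model invocation serves several of them. This is a simplified sketch, not NVIDIA’s implementation: the <code>dynamic_batches</code> function, its <code>max_batch_size</code> and <code>max_delay_ms</code> parameters, and the arrival times are all hypothetical.</p>
<pre><code class="language-python">
def dynamic_batches(arrivals_ms, max_batch_size, max_delay_ms):
    """Group request arrival times into batches.

    A batch is closed when it is full or when the next request arrives more
    than `max_delay_ms` after the first request in the batch.
    """
    batches, current = [], []
    for arrival in arrivals_ms:
        if current and (len(current) == max_batch_size
                        or arrival - current[0] > max_delay_ms):
            batches.append(current)
            current = []
        current.append(arrival)
    if current:
        batches.append(current)
    return batches


# Hypothetical arrival times (in milliseconds) of fraud-scoring requests.
arrivals = [0, 1, 2, 3, 4, 20, 21, 50]
batches = dynamic_batches(arrivals, max_batch_size=4, max_delay_ms=5)
print(len(arrivals), "requests served with", len(batches), "model invocations")
print(batches)
</code></pre>
<p>Here eight requests are served with four model invocations instead of eight. Requests that arrive close together share an invocation, while isolated requests are not held back beyond the delay limit.</p>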
<h2>NVIDIA AI-Q Blueprint for Loan Origination in Retail Banking</h2>
<p>Credit underwriting is an important capability in banking, spanning many different lending activities such as mortgages, credit card lending, commercial banking, and trade finance. These processes have historically been inefficient given the number of activities involved in the origination process, from application submission to funding, and the numerous roles participating in the decision process.&nbsp;</p>
<p>While traditional AI/ML models can streamline many individual activities in the loan origination workflow, the process from the customer’s perspective still feels slow and fragmented. This is where agentic AI can have a significant impact: in this context, agentic AI can reduce the effort required to collect and summarize information and to draft credit decisions. It can also deliver a personalized and consistent lending experience by standardizing reviews during the approval process. Additionally, it can deliver personalized product recommendations based on the customer’s behaviors and spending patterns, with a multiple-agent workflow that orchestrates various tools, data, and AI agents.</p>
<p>By leveraging NVIDIA AI-Q Blueprint on NVIDIA accelerated computing with the Cloudera AI Inference service, banking organizations can achieve this transformative vision. For example, by using AI-Q Blueprint, Cloudera can orchestrate a multi-agent workflow that includes a GenAI-based personalized loan advisor deployed on NVIDIA NIM, an AI-based document processing agent leveraging optical character recognition (OCR) and natural language processing (NLP) techniques, and existing credit decisioning tools.</p>
<h2>Next Steps</h2>
<p>The combined power of Cloudera’s unified, cloud-anywhere data platform and NVIDIA’s hardware and software capabilities offers a holistic solution for the development of agentic AI solutions.&nbsp;</p>
<ul>
<li><p><a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Visit this page</a> to learn more about the Cloudera AI Inference service.</p>
</li>
<li><p><a href="/content/dam/www/marketing/resources/whitepapers/cloudera-and-nvidia-accelerate-ai-in-the-financial-services-industry.pdf">Read this whitepaper</a> by Enterprise Strategy Group to learn about the Cloudera + NVIDIA joint value proposition.</p>
</li>
</ul>
<p>Cloudera and NVIDIA enable organizations to streamline complex data pipelines at scale by combining Cloudera’s data management capabilities with NVIDIA’s full-stack services:</p>
<ul>
<li><p><b>Data processing</b> with <a href="/content/www/en-us/products/data-engineering.html">Apache Spark</a> on Cloudera and NVIDIA RAPIDS Accelerator for Apache Spark streamlines execution of feature engineering and data engineering workloads.</p>
</li>
</ul>
<ul>
<li><p><b>AI/ML model deployment</b> with <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference</a> and NVIDIA NIM microservices improves the throughput and latency performance of artificial intelligence (AI) models (both traditional AI/ML and generative AI).</p>
</li>
</ul>
<ul>
<li><p><b>Agentic AI orchestration</b> with NVIDIA AI-Q Blueprint enables the integration of AI agents with private data and the interaction with other systems through APIs.</p>
</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-and-nvidia-deliver-ai-powered-transformation-in-financial-services</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>#ClouderaLife Employee Spotlight: Meet Amy Nelson, Cloudera’s Chief Human Resources Officer</title><description><![CDATA[Let’s meet Amy Nelson and learn about her journey at Cloudera, the culture she’s helping to create, and how she empowers Clouderans to thrive and give back.]]></description><link>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-amy-nelson-clouderas-chief-human-resources-officer.html</link><guid>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-amy-nelson-clouderas-chief-human-resources-officer.html</guid><pubDate>Tue, 16 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1200143416-1.jpg"><p><span class="blue-nova">“At Cloudera, our greatest strength is our people. What sets us apart is how we empower employees to think independently, act with autonomy, and make a real impact,”</span> Amy Nelson says.</p>
<p>That belief has guided Amy throughout her career and continues to define her leadership at Cloudera. As Chief Human Resources Officer, Amy is the center of this people-first approach, shaping a workplace where individuals feel valued and empowered.</p>
<p>Let’s meet Amy Nelson and learn about her journey at Cloudera, the culture she’s helping to create, and how she empowers Clouderans to thrive and give back.</p>
<h2>Meet Amy Nelson</h2>
<p>Amy oversees everything from workforce planning and leadership development to inclusion and engagement programs, always keeping the community at the heart of her work.</p>
<p>That philosophy comes to life in the initiatives Amy drives: expanding learning and development programs, embedding purpose through Cloudera Cares, and advancing accessibility so every employee feels supported. Each effort reinforces her larger promise—to make Cloudera a great place to work and where people truly belong.</p>
<h2>What Drew Amy to Cloudera</h2>
<p>Amy’s career has always been rooted in people and purpose. When she encountered Cloudera, she saw a company that mirrored her values.</p>
<p><span class="blue-nova">“What initially drew me to the company was its strong commitment to innovation and people-first culture,”</span> she says. <span class="blue-nova">“I saw an opportunity to contribute to a company that truly believed in aligning talent strategy with long-term growth.”</span></p>
<p>Since then, her role has expanded far beyond traditional HR. <span class="blue-nova">“The past few years have pushed HR to the forefront of business strategy,”</span> she says. <span class="blue-nova">“I’m proud to have helped guide the company through times of change with empathy and purpose.”</span></p>
<p>From recognition as a <a href="/content/www/en-us/about/news-and-blogs/press-releases/2024-10-02-cloudera-honored-with-first-time-win-at-prestigious-2024-singapore-best-workplaces-in-technology-awards.html">Great Place to Work</a> in multiple countries to strengthening career development pathways, Amy has helped Cloudera evolve while keeping its culture grounded in belonging and resilience.</p>
<h2>Creating a Workplace Where Everyone Can Thrive</h2>
<p>For Amy, a strong employee experience blends vision with practical support. Through her efforts, Cloudera has introduced personalized learning paths, leadership programs, and clear growth frameworks to help employees see a future for themselves at the company.</p>
<p>Listening is central to Amy’s approach. <span class="blue-nova">“We place so much emphasis on employee feedback, especially through our engagement surveys,”</span> she says. <span class="blue-nova">“We treat that feedback as a strategic input, not just a data point. It helps us make better decisions and evolve our culture in real, responsive ways.”</span></p>
<p>The employee value proposition has also been reshaped through that input. Focus groups and surveys helped Cloudera articulate what employees already felt: this is a place where your ideas matter, your work shapes the future, and your voice truly counts.</p>
<p>Amy has also championed accessibility and inclusivity. Participation in the Disability Index led to meaningful changes such as flexible work policies, enhanced benefits, and home office stipends. <span class="blue-nova">“Ultimately, our goal is to create an environment where every employee, whether they identify as having a disability or not, feels supported, empowered, and able to contribute meaningfully,” </span>she says.</p>
<p>She credits Cloudera’s Learning &amp; Enrichment team for making these programs effective. <span class="blue-nova">“Accessibility is another key strength,”</span> she says. <span class="blue-nova">“Whether virtual, in-person, or self-led learning, our programs are designed to be flexible and inclusive, giving employees the freedom to grow in a way that fits their learning style and schedule.”</span></p>
<h2>Empowering Clouderans to Give Back</h2>
<p>Amy believes people are more engaged and prouder of their work when connected to it.</p>
<p>Through&nbsp;<a href="/content/www/en-us/about/philanthropy.html">Cloudera Cares</a>—the company’s corporate social responsibility program—she helps create a culture of giving back which embodies the collective spirit of Clouderans worldwide, reinforcing the company’s mission to create positive impact inside and outside the workplace.</p>
<p>For Amy, this passion for service reflects Cloudera’s identity. <span class="blue-nova">“Giving back has always been a natural extension of who we are at Cloudera,”</span> she says. <span class="blue-nova">“The passion our employees bring to their work is the same passion they bring to their communities, and that’s something we’re incredibly proud of.”</span></p>
<p>By embedding purpose into the employee experience, Amy helps Clouderans unite globally to deliver innovation and create lasting social impact.</p>
<h2>Hiring and Scaling in a Fast-Changing Tech Landscape</h2>
<p>As Cloudera grows, Amy ensures its people strategy evolves with it.</p>
<p><span class="blue-nova">“We’re leveraging the right mix of tools, technology, and human connection to make the hiring journey seamless and engaging from first touchpoint to offer decision,”</span> she says.</p>
<p>Looking ahead, she is focused on building teams that reflect diverse perspectives and creating an environment where new talent can contribute, grow, and lead. This forward-looking approach ensures Cloudera is meeting today’s needs and building the foundation for its future.</p>
<h2>Closing Thoughts</h2>
<p>As Amy looks ahead, her vision for Cloudera is rooted in purpose and belonging. <span class="blue-nova">“I want us to continue building an environment where every employee, no matter where they are in the world or in their career, feels a deep sense of purpose and belonging here,”</span> she says. <span class="blue-nova">“I want Cloudera to be known not just as a place where people want to work, but a place they’re proud to be part of—a community that supports their growth, reflects their values, and inspires their best every day.”</span></p>
<p>Her advice for those interested in joining Cloudera reflects the company’s fast-moving, collaborative spirit. <span class="blue-nova">“Be ready to collaborate with some of the brightest, most passionate people in the data space,”</span> she says. <span class="blue-nova">“Things evolve quickly, but that means you’ll constantly have the chance to grow your skills and make an impact.”</span></p>
<p>Amy’s journey proves that when people feel supported, valued, and connected to something bigger, they can achieve extraordinary things for themselves, their company, and their communities.</p>
<p>Hear from another&nbsp;<a href="/content/www/en-us/blog/culture/clouderalife-employee-spotlight-meet-charles-aad-clouderas-senior-industry-solutions-engineer.html">Clouderan</a> and explore <a href="/content/www/en-us/careers.html">career opportunities</a> at Cloudera.</p>
<p><span class="blue-nova">“It’s not just about giving back, it’s about embedding purpose into our culture,”</span> she says.</p>
<p><span class="blue-nova">“For me, creating a standout employee experience starts with a simple but powerful belief: everyone should feel like they belong and that their voice matters,”</span> she says.<span class="blue-nova"> “From day one at Cloudera, I’ve championed the idea that culture isn’t just built for employees, it’s built with them.”</span></p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderalife-employee-spotlight-meet-amy-nelson-clouderas-chief-human-resources-officer</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Austin Week of Learning</title><description><![CDATA[Recently, one of our 2025 summer interns, Savanna Morris, spent some time in Austin, Texas exploring Cloudera’s Week of Learning and all that the event had to offer for those looking to grow in their careers. ]]></description><link>https://www.cloudera.com/blog/culture/austin-week-of-learning.html</link><guid>https://www.cloudera.com/blog/culture/austin-week-of-learning.html</guid><pubDate>Mon, 15 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Savanna Morris]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-group-studying.webp"><p>At Cloudera, summer interns are vital to the day-to-day success of our business. But the role interns take on is about much more than just learning a business and supporting teams. It’s about taking on opportunities to learn and grow too.&nbsp;&nbsp;</p>
<p>Recently, one of our 2025 summer interns, Savanna Morris, spent some time in Austin, Texas exploring Cloudera’s Week of Learning and all that the event had to offer for those looking to grow in their careers.&nbsp;</p>
<p>Let’s hear from Savanna about what that experience was like:</p>
<p>As part of my summer internship with Cloudera, I had the opportunity to visit Austin, Texas for &quot;The Week of Learning,&quot; an event dedicated to fostering knowledge and skill development across various disciplines. From interactive workshops to community-driven volunteering events, the week offered a diverse range of opportunities for attendees to expand their horizons and gather insight into their roles at Cloudera.&nbsp;</p>
<h2>Highlights of the Week</h2>
<p>The event kicked off with the first workshop of the day, “Unlocking your Conflict Style,” featuring Raena Mareder, the Manager of Learning and Enrichment, and Misha D’ Andrea, the Learning and Enrichment Partner, who both set an enthusiastic tone for the days to follow. The workshop, with approximately twenty attendees in the first session, offered thoroughly engaging content. Not only were the activities interactive, but they also fostered greater collaboration among colleagues. As a result, I had the opportunity to connect with coworkers from various departments who had flown in from all over the States.&nbsp;</p>
<h2>Workshops &amp; Interactive Sessions</h2>
<p>A key component of The Week of Learning was its focus on hands-on experiences. Attendees had the chance to participate in workshops covering topics such as:</p>
<ul>
<li><p>Unlocking your Conflict Style: This very informational workshop discussed various conflict styles applicable to different workplace scenarios. As a part of the activities, we took the Thomas-Kilmann Conflict Mode assessment to discover our most prominent conflict style. Complemented by the assessment, the workshop provided a deeper understanding of personal management styles and their individual shortcomings, fostering self-discovery.&nbsp;</p>
</li>
<li><p>Communicate to Connect: Complementing “Unlocking your Conflict Style,” this workshop offered practical insight into honing communication skills through storytelling. For one activity, we were tasked with matching workplace “stories” to common story arcs. In their concluding remarks, Misha and Raena shared insights on public speaking, emphasizing the core principle of &quot;connecting with the audience, through presence, with yourself.&quot;&nbsp;</p>
</li>
<li><p>Udemy Lunch and Learn: Complementing the workshops, a &quot;Lunch and Learn&quot; session was held in collaboration with Udemy, focusing on AI enablement. Udemy provides access to over a thousand AI-related courses, with new content added daily. Employees have full access to Udemy's extensive catalog, including certification courses, to enhance their AI knowledge. Shayde Christian, Chief Data Officer, concluded the event by answering questions about AI and sharing insights on this growing industry.&nbsp;</p>
</li>
<li><p>Conscious Leadership: Misha and Raena wrapped up Austin’s Week of Learning with a very interactive workshop focused on “conscious leadership.” Members were asked to self-segregate into card-based groups, highlighting our human tendency to gravitate towards similarities rather than embracing differences. For the subsequent activity, attendees were asked to refer to the Ladder of Inference, a method designed to foster a conscious leadership mindset when approaching decisions based on sets of information. Using the Ladder of Inference, attendees were introduced to increasing levels of data to come up with the best course of action. The result was effective collaboration as well as utilization of differing perspectives.&nbsp;</p>
</li>
</ul>
<h2>Season of Service Opportunity</h2>
<p>In addition to the variety of workshop opportunities, attendees also had a chance to participate in a service opportunity with <a href="https://www.safeaustin.org/">SAFE</a>, a non-profit dedicated to assisting survivors of child abuse, sexual assault, trafficking, and domestic violence. SAFE’s Volunteer Services Director, Stefanie Lebens, spoke on their effectiveness in managing these sensitive situations while also introducing their work to many of the attendees. Following the presentation, approximately thirty Clouderans filled baskets with essential household items such as dish sets, soap, sponges, and trash bag rolls. To further connect with the recipients, attendees also included personalized encouraging notes in each basket. The event fostered a positive and supportive community and atmosphere, with all volunteers demonstrating hearts ready and willing to serve.&nbsp;</p>
<h2>Personal Reflection</h2>
<p>Attending Austin’s Week of Learning was truly an unforgettable experience. The Learning &amp; Enrichment team did a wonderful job planning this event. As a remote intern working in Global Communications, I found that the event fostered connections with fellow Clouderans, exposed me to office culture, and helped me develop leadership skills, and I learned more about myself in the process. I will apply the skills I was exposed to and encouraged to develop in my life moving forward. Additionally, in an increasingly remote world, the in-person interaction among coworkers was refreshing! I was able to meet so many talented individuals and interact with them in ways I would not have been able to remotely. I am truly grateful for this incredible opportunity.&nbsp;</p>
<h2>Looking Forward</h2>
<p>The success of The Week of Learning underscores Cloudera’s commitment to continuous growth and education. We can always improve and learn at any stage in our careers to become better, more effective colleagues. I wholeheartedly encourage participating in future Week of Learning workshops and events. Not only will it give you an excellent opportunity to connect with colleagues from various departments and locations, but it will also give you valuable skills applicable to all aspects of your life.&nbsp;</p>
<p>Learn more about how <a href="https://www.cloudera.com/about/our-culture.html#">Cloudera</a> is furthering its commitment to fostering growth and education opportunities. </p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=austin-week-of-learning</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Reduce Data Management and Hosting Costs with Data Lineage</title><description><![CDATA[Data lineage can help large organizations reduce costs across various areas. Here are some common expenditures where data lineage can be beneficial]]></description><link>https://www.cloudera.com/blog/business/reduce-data-management-and-hosting-costs-with-data-lineage.html</link><guid>https://www.cloudera.com/blog/business/reduce-data-management-and-hosting-costs-with-data-lineage.html</guid><pubDate>Mon, 15 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Ron Pick]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-industrial-logistics-people.webp"><p>Data lineage can help large organizations reduce costs across various areas. Here are some common expenditures where data lineage can be beneficial:</p>
<ul>
<li><p><b>Infrastructure and storage:</b> <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Data lineage</a> allows organizations to understand data usage patterns, access frequencies, and uncover data dependencies. By analyzing this information, organizations can optimize their infrastructure and storage strategies, avoiding unnecessary storage costs and efficiently allocating resources based on data usage patterns.<br>
</p>
</li>
<li><p><b>Data integration and ETL:</b> Large organizations often deal with complex data integration and extract, transform, load (ETL) processes. Data lineage helps identify redundant or inefficient data integration steps, allowing organizations to streamline their processes and reduce development, maintenance, and operational costs associated with ETL.<br>
</p>
</li>
<li><p><b>Data quality:</b> Poor data quality can result in significant costs for organizations. Data lineage helps trace data quality issues back to their source, enabling organizations to identify the responsible processes or systems. By addressing these issues at their root, organizations can reduce costs associated with data cleansing, error correction, and rework caused by poor data quality.<br>
</p>
</li>
<li><p><b>Regulatory compliance:</b> Compliance with data regulations is essential for large organizations, and non-compliance can result in substantial penalties. Data lineage provides transparency into data flows, transformations, and access, supporting organizations in demonstrating compliance and reducing the risk of costly violations.<br>
</p>
</li>
<li><p><b>Analytics and reporting:</b> Data lineage facilitates data discovery and understanding of data sources, transformations, and calculations. By empowering data analysts and business users with self-service analytics capabilities through data lineage, organizations can reduce the time and effort spent on data exploration, preparation, and reporting.<br>
</p>
</li>
<li><p><b>Impact analysis:</b> When making changes to data sources, structures, or processes, organizations need to understand the downstream impact. Data lineage enables impact analysis by tracing the flow of data and identifying the systems, reports, or applications affected by changes. By conducting thorough impact analysis, organizations can mitigate the risks of costly errors and minimize the associated costs.<br>
</p>
</li>
<li><p><b>Operational support: </b>Data lineage provides insights into data dependencies and the relationships between different systems or processes. This information helps organizations troubleshoot issues, identify bottlenecks, and optimize performance. By resolving issues more efficiently and reducing downtime, organizations can lower operational and support costs.</p>
</li>
</ul>
<p>It’s important to note that the specific impact areas where data lineage can help will vary depending on the organization’s industry, data landscape, and specific challenges. Conducting a thorough assessment of the organization’s data ecosystem and understanding its pain points will help identify the areas where data lineage can provide the most significant cost reduction and process improvement opportunities.</p>
<h2>How Data Lineage Can Help Organizations Save</h2>
<p>Data lineage can help organizations reduce costs by providing valuable insights into the origin, movement, and transformation of data throughout its lifecycle. Here are some ways to leverage data lineage to reduce costs:</p>
<ul>
<li><p><b>Identify unnecessary data processes:</b> Data lineage allows you to trace the path of data from its source to its destination, enabling you to identify redundant or unnecessary data processes. By eliminating these redundant processes, you can reduce resource consumption and associated costs.<br>
</p>
</li>
<li><p><b>Optimize data storage: </b>Data lineage helps you understand which datasets are frequently accessed and which ones are seldom used. By analyzing this information, you can optimize your data storage strategies, such as implementing tiered storage or archiving infrequently accessed data. This approach helps reduce storage costs by allocating resources more efficiently.<br>
</p>
</li>
<li><p><b>Identify data quality issues: </b>Poor data quality can lead to increased costs due to errors, rework, and inefficiencies. By leveraging data lineage, you can track the origin of data quality issues, identify the responsible processes or systems, and take corrective actions. Improving data quality reduces the need for data cleansing and error correction, leading to cost savings.<br>
</p>
</li>
<li><p><b>Streamline data integration processes:</b> Data lineage enables you to understand how different data sources are integrated into your systems. By analyzing the lineage, you can identify complex and inefficient data integration processes. Simplifying and streamlining these processes can reduce development, maintenance, and operational costs.<br>
</p>
</li>
<li><p><b>Enhance data governance: </b>Data lineage provides transparency into data flows, transformations, and dependencies, supporting robust data governance practices. Effective data governance ensures compliance with regulations, reduces the risk of data breaches or non-compliance penalties, and avoids associated costs.<br>
</p>
</li>
<li><p><b>Support impact analysis: </b>Data lineage helps you understand how changes in data sources, structures, or processes impact downstream systems and applications. By conducting impact analysis, you can identify potential risks, assess the cost implications of changes, and make informed decisions, thereby minimizing the chances of costly errors.<br>
</p>
</li>
<li><p><b>Facilitate data discovery and self-service analytics:</b> Data lineage helps data consumers easily discover relevant datasets and understand their lineage. By empowering users to explore and access data independently, you can reduce the time and effort spent by data engineers or analysts in fulfilling data requests, leading to cost savings.</p>
</li>
</ul>
<p>Remember that leveraging data lineage effectively requires proper data governance, documentation, and tools for capturing and visualizing lineage information. It is also crucial to regularly review and update the lineage information as data and processes evolve over time.</p>
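<p>As a rough illustration of the impact analysis and unused-data ideas above, the sketch below models lineage as a small directed graph in Python. The asset names and the <code>LINEAGE</code> dictionary are invented for this example and are not output from any Cloudera tool; a real lineage catalog would supply this graph.</p>

```python
from collections import deque

# Hypothetical lineage graph: each key feeds the assets listed in its value.
# All asset names are invented for illustration.
LINEAGE = {
    "crm.customers": ["etl.clean_customers"],
    "etl.clean_customers": ["dw.dim_customer"],
    "dw.dim_customer": ["bi.churn_report", "ml.churn_model"],
    "erp.orders": ["dw.fact_orders"],
    "dw.fact_orders": ["bi.revenue_report"],
    "staging.legacy_export": [],
}


def downstream_impact(graph, changed_asset):
    """Return every asset reachable from changed_asset (breadth-first)."""
    seen = set()
    queue = deque([changed_asset])
    while queue:
        current = queue.popleft()
        for consumer in graph.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


def unused_assets(graph):
    """Return assets that feed nothing and are fed by nothing."""
    fed = {c for consumers in graph.values() for c in consumers}
    return sorted(a for a, consumers in graph.items() if not consumers and a not in fed)


print(downstream_impact(LINEAGE, "crm.customers"))
print(unused_assets(LINEAGE))
```

<p>In this toy graph, a change to <code>crm.customers</code> touches the cleaning job, the customer dimension, one report, and one model, while <code>staging.legacy_export</code> has no producers or consumers and would be a candidate for archiving or removal.</p>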
<h2>Automating Data Lineage with Cloudera</h2>
<p>Ready to cut costs and improve efficiency? <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html#demo">Request a demo</a> to get started with <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> today.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=reduce-data-management-and-hosting-costs-with-data-lineage</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Redefining AI Leadership: Inside the Rise of the Chief AI Enablement Officer</title><description><![CDATA[AI is moving from experimentation to execution. Yet as adoption scales across the enterprise, one truth is becoming clear: tools alone are not enough. Success hinges on the people, processes, and leadership that bring AI into daily business operations.]]></description><link>https://www.cloudera.com/blog/business/redefining-ai-leadership-inside-the-rise-of-the-chief-ai-enablement-officer.html</link><guid>https://www.cloudera.com/blog/business/redefining-ai-leadership-inside-the-rise-of-the-chief-ai-enablement-officer.html</guid><pubDate>Fri, 12 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-lake.webp"><p>AI is moving from experimentation to execution. Yet as adoption scales across the enterprise, one truth is becoming clear: tools alone are not enough. Success hinges on the people, processes, and leadership that bring AI into daily business operations.</p>
<p>On a recent episode of <a href="https://www.youtube.com/@ClouderaInc/podcasts">The AI Forecast</a>, host Paul Muller sat down with Donna Beasley, Cloudera’s first-ever Chief AI Enablement Officer, to explore this newly emerging role, the challenges of scaling adoption, and what it takes to build organizational readiness when no blueprint exists.</p>
<h3>AI’s Impact Relies on Operational Discipline</h3>
<p><b>Paul</b>: AI’s impact still hinges on the basic principles of operational discipline and execution. Yet a <a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai">McKinsey survey</a> showed that while 78% of companies use AI, only 17% saw a meaningful contribution to earnings before interest and taxes (EBIT). Why is the impact not showing up in the results?</p>
<p><b>Donna</b>: You can’t push this on people, nor can you hold people back. You’ve got to meet them where they are and then help them just take whatever that next step is. For many people, the first encounters with AI feel uncertain—even intimidating. Some worry it could replace their role, while others don’t know where to begin. I focus on creating space for employees to explore without pressure. The goal isn’t to force everyone into the same pace of change, but to ensure everyone can see how AI connects to their work. Progress comes in steady steps. When someone sees a colleague using AI effectively, they’re far more likely to return and be ready to try it themselves. That momentum—built gradually and reinforced through real examples—turns curiosity into measurable business impact.</p>
<h3>The Chief AI Enablement Officer Role Has No Blueprint</h3>
<p><b>Paul</b>: You’ve stepped into a brand-new role at Cloudera, and unlike other executive positions there’s no predecessor, no set KPIs. How did you think about defining success in the absence of a blueprint?</p>
<p><b>Donna</b>: There is no right or wrong answer to this. We’re kind of forging this path as we go forward. The advantage here is that Cloudera already had guardrails in place—an AI council, security guidelines, and governance. That foundation meant I could focus on putting tools in people’s hands, building confidence, and creating pathways from casual use to real innovation.</p>
<p>I approached success in phases. First, we ensured everyone had access to AI tools so they could start experimenting. Next, I focused on departments eager to adopt and show early wins. From there, the goal has been to build advanced learning paths for power users who want to go deeper. We track progress through adoption metrics and by looking at what people are creating—like the internal tools employees have started building for their teams. That phased approach ensures we’re not just experimenting, but turning new capabilities into practices that can scale across the organization.</p>
<h3>Success Starts with the Right Departments</h3>
<p><b>Paul</b>: If you’re advising other companies on where to start, which departments provide the most natural footholds for AI adoption?</p>
<p><b>Donna</b>: Marketing is absolutely the place where it makes a lot of sense to start. The outputs are already designed for public sharing, so you can sidestep some of the trickier data concerns. Once one team demonstrates success, others line up. Our CMO wanted marketing to be the poster child, and from there, adoption spread quickly to engineering, sales, and beyond.</p>
<p>Each department comes with a different level of complexity. Marketing provides the fastest wins because the work is already meant for public use, which lowers data risk. Sales brings strong opportunities but requires careful governance since customer information is involved. Engineering is a natural fit because developers already operate within strict guardrails and coding practices. That’s why I always suggest starting where adoption is easiest. Early successes create momentum, and once employees see tangible results, adoption expands naturally across the business without forcing it in areas with higher risks.</p>
<h3>Organizational Change Requires Trust and Patience</h3>
<p><b>Paul</b>: One of the biggest challenges leaders face isn’t technical at all. It’s about trust, resistance, and change management. How do you help employees move past fear and skepticism?</p>
<p><b>Donna</b>: It’s much more carrot than stick. If an idea or practice is powerful enough, the value will come through in the long run. Sometimes the tools don’t work perfectly at the onset, and I’m upfront about that reality. It takes the pressure off people. If they’re not ready today, that’s fine—because once they see colleagues using AI successfully, curiosity takes over. My role is to meet them at their pace and make adoption approachable.</p>
<p>Fear and resistance are normal reactions when people are asked to change how they work. I focus on building trust through transparency—acknowledging when tools don’t perform as expected and reminding teams that a human must always be in the loop. That openness helps take the pressure off and makes adoption feel less risky. I also use peer examples to create positive momentum: when small pilot groups demonstrate success, others naturally want to join. By letting curiosity and proof drive the process, adoption spreads more smoothly and scales in a way that feels approachable rather than forced.</p>
<p>Catch the full conversation with Donna Beasley on The AI Forecast on <a href="https://open.spotify.com/episode/6nC75kLslRL0f9TOpRvOdY">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/enabling-the-future-of-ai-with-donna-beasley/id1779293119?i=1000725898498">Apple Podcasts</a>, and <a href="https://www.youtube.com/watch?v=GED3yYSJ5OY&amp;list=PLe-h9HrA9qfAmGHgsmXUZgLL-T4Xjhlq8&amp;index=1">YouTube</a>.&nbsp;&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=redefining-ai-leadership-inside-the-rise-of-the-chief-ai-enablement-officer</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Unlocking Enterprise AI Potential: Knowledge Distillation for Customer Support Analytics</title><description><![CDATA[Unlock enterprise AI potential with knowledge distillation for customer support analytics. Learn how AI for business boosts efficiency, and reduces costs.]]></description><link>https://www.cloudera.com/blog/technical/unlocking-enterprise-ai-potential-knowledge-distillation-for-customer-support-analytics.html</link><guid>https://www.cloudera.com/blog/technical/unlocking-enterprise-ai-potential-knowledge-distillation-for-customer-support-analytics.html</guid><pubDate>Thu, 11 Sep 2025 11:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Andreas Tsiartas,Yi-Hsun Tsai,Jugoslav Djajic,Robert Hryniewicz]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-woman-looking-at-data.webp"><h2>Business Challenge: Balancing AI Model Speed and Accuracy Without Compromising Data Privacy</h2>
<p>Cloudera’s customer support team leverages <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">AI models</a> to analyze and summarize customer support tickets in real time. The system takes comments from customers or Cloudera support agents as input, analyzes each comment, and extracts a set of analytics, such as sentiment and a summary. These analytics are essential to improving the customer experience at Cloudera.</p>
<p>Due to the sensitive nature of the customer data being processed in this pipeline, only models running in local environments can be used and no customer data can be shared with any external sources.&nbsp;</p>
<p>Initially, to analyze the comments, the team relied on a local LLM (Goliath 120B), which met basic performance requirements but lagged in speed and generation quality. On average, each request took 12-15 seconds to process, and requests came in every 30 seconds. Adherence to the expected output was 77.5%, and generation accuracy was lower than that of proprietary models, which created a bottleneck for scalability.</p>
<p>The challenges of using a large local LLM (Goliath 120B) were clear: slower response times, increased costs, lower generation accuracy than state-of-the-art cloud-based models, and compliance risks.</p>
<p>Large organizations face similar trade-offs, balancing AI accuracy and speed against the risks of data exposure.</p>
<h2>Cloudera’s Solution: Knowledge Distillation with Private Data&nbsp;</h2>
<p>Cloudera’s breakthrough lies in a privacy-first approach to knowledge distillation.&nbsp;</p>
<p>Instead of training models on raw customer data, which carried regulatory and exposure risks, we generated synthetic datasets using <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera Synthetic Data Studio</a>. This new low-code tool in Cloudera AI mimicked real-world interactions, such as technical questions and troubleshooting scenarios, without ever exposing private information.</p>
<p>Generating synthetic customer support interactions reduced regulatory and exposure risk. It also let the team send the synthetic data to state-of-the-art, cloud-based LLMs to extract insights such as customer sentiment. These cloud-based LLMs extracted information much more accurately than large local LLMs, which made them an ideal source for distillation.</p>
<p>Cloudera’s synthetic data solution eliminated compliance and privacy risks and produced higher-quality synthetic data than existing large local LLMs could generate. This approach made it possible to distill knowledge from state-of-the-art models into a small LLM that solves the same problem as Goliath 120B at a lower cost and with higher accuracy.</p>
<h2>Our Process</h2>
<p>Data generation: Using the Synthetic Data Studio data generation workflow, we crafted a prompt instructing Claude Sonnet to generate customer support questions and answers. The prompt also sets the tone and details the structure of each sample. In addition, we provide a list of topics that appear in real-world data (such as customer support for <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> or <a href="/content/www/en-us/products/data-warehouse.html">Cloudera Data Warehouse</a>) and use seed topics to ensure the generated tickets are both diverse and realistic.</p>
<p>Evaluation: Using the Synthetic Data Studio evaluation workflow, the team crafted a prompt to instruct an LLM-as-a-judge on how to evaluate the quality of the generated data, then filtered out low-quality samples.</p>
<p>Fine-tuning: Using only the filtered data, the team split the data into training and development sets and distilled knowledge from the Claude Sonnet model into a Meta Llama 3.1 8B Instruct model. The team ran multiple experiments to select the fine-tuning parameters that maximize the performance of the distilled LLM.</p>
<p>Using both human and automated LLM-as-a-judge evaluations, the team scored real-world customer support ticketing questions and answers. Cloudera’s team focused on answers where the deployed and distilled LLMs differed and reported the win rate of each LLM. In addition, they measured speed improvements (average running time), adherence to the expected output, and the cost to deploy the model.</p>
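<p>The filtering and train/development split described in this process can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the team's actual pipeline: the <code>judge_score</code> field, the 1 to 5 score scale, the threshold of 4, and the 10% development fraction are all hypothetical choices made for illustration.</p>

```python
import random


def filter_and_split(samples, min_score=4, dev_fraction=0.1, seed=0):
    """Drop low-scoring samples, then split the rest into train and dev sets."""
    # Keep only samples the (hypothetical) LLM judge rated at or above the threshold.
    kept = [s for s in samples if s["judge_score"] >= min_score]
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(kept)
    # Hold out at least one sample for development when anything survives.
    dev_size = max(1, int(len(kept) * dev_fraction)) if kept else 0
    return kept[dev_size:], kept[:dev_size]


# Toy synthetic samples with made-up judge scores cycling from 1 to 5.
samples = [
    {"question": f"Question {i}", "answer": f"Answer {i}", "judge_score": i % 5 + 1}
    for i in range(20)
]
train, dev = filter_and_split(samples)
print(len(train), len(dev))
```

<p>Filtering before splitting means the development set is drawn from the same quality-controlled pool as the training set, so a score on it reflects the data the distilled model was actually tuned on.</p>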
<h2>The Results</h2>
<p>Improved speed: Processing time dropped 95%.</p>
<p>Better output structure: Output adherence rose from 77.5% to 99.5%.</p>
<p>Higher LLM accuracy: When comparing the smaller distilled LLM (Llama 3.1 8B) against the deployed LLM (Goliath 120B), the distilled model’s win rate was 70% vs. 30% with Phi-4 as the judge and 63% vs. 37% with human evaluators.</p>
<p>Improved cost and efficiency: The smaller distilled LLM reduced compute and memory needs while increasing real-time scalability and maintaining data privacy, and throughput improved 11x.</p>
<p>The results are clear: enterprises can achieve AI excellence without compromising data privacy. By synthesizing training data and distilling knowledge, businesses avoid trade-offs between innovation and compliance.&nbsp;&nbsp;</p>
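<p>To make the headline numbers concrete, the short Python sketch below works through the arithmetic. It assumes the 95% reduction applies to the 12-15 second baseline quoted earlier, which the article does not state explicitly, and the list of judge verdicts is invented purely to show how a win rate is computed.</p>

```python
def reduced_time(baseline_seconds, reduction_percent):
    """Apply a percentage reduction to a baseline processing time."""
    return baseline_seconds * (100 - reduction_percent) / 100


def win_rate(judgments, model):
    """Fraction of pairwise comparisons that the given model won."""
    return judgments.count(model) / len(judgments)


# Baseline range and reduction quoted in the article.
fast, slow = reduced_time(12, 95), reduced_time(15, 95)
print(f"Estimated time per request: {fast}-{slow} seconds")

# Hypothetical judge verdicts: 7 of 10 comparisons favor the distilled model.
verdicts = ["distilled"] * 7 + ["deployed"] * 3
print(f"Distilled win rate: {win_rate(verdicts, 'distilled'):.0%}")
```

<p>Under that assumption, each request would take well under a second, comfortably inside the 30-second arrival interval mentioned above.</p>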
<p>Enterprises today face a steep challenge: they want to leverage advanced AI models to stay competitive, but need to keep the high costs of cloud-based large language models (LLMs) under control and stay compliant with data privacy regulations.&nbsp;</p>
<p>So how can businesses explore cutting-edge AI without overextending budgets or exposing sensitive private data? At Cloudera, we’ve developed a solution that turns this challenge into an opportunity—using synthetic data generated from private data and knowledge distillation to build cost-efficient, accurate, and compliant AI systems.&nbsp;&nbsp;</p>
<p>In this article, we discuss how Cloudera Synthetic Data Studio, part of <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Studios</a>, allows organizations to capitalize on AI innovation even when real-world data is scarce or sensitive.</p>
<h2>Synthetic Data Enables Innovation Without Regulatory Risk</h2>
<p>By developing a knowledge distillation approach, Cloudera achieved a 95% reduction in processing time, increased output structure adherence to 99.5%, and deployed a distilled Llama 3.1 8B model that beat the prior Goliath 120B model with a 70% win rate when judged by Phi-4 and a 63% win rate in human evaluations.</p>
<p>This method eliminated compliance risks by avoiding direct use of sensitive data and also unlocked 11x greater throughput, showing that smaller, fine-tuned models can surpass larger, resource-intensive alternatives in both speed and precision.</p>
<p>Try our <a href="https://github.com/cloudera/CML_AMP_Knowledge_Distillation_With_Private_Data">AMP</a> to explore how to use private synthetic data to distill knowledge from a large model to a smaller model for a customer support use case.</p>
<p><span class="byline"><i><b>Figure 1</b>. The impact of the synthetic data distillation approach on speed, adherence, and cost for the customer support use case. The AWS cost is the hypothetical cost of running the LLM on the AWS Cloud (based on Feb 2025 prices).</i></span></p>
<h2><span class="blue-nova">Use Case and Key Takeaways</span></h2>
<p>Use case: Drawing from an internal use case, we’ll show how we significantly improved the performance and overall throughput for Cloudera’s customer support ticket pipeline through knowledge distillation using synthetic data generated from private data, while maintaining data privacy and regulatory compliance.</p>
<p><b>Key takeaways:</b></p>
<ul>
<li><p>Data privacy as a competitive advantage: Synthetic data enables innovation without regulatory risk.&nbsp;</p>
</li>
<li><p>Cost-effective performance: Smaller, fine-tuned models outperform larger, resource-heavy alternatives.</p>
</li>
<li><p>Applicable to multiple use cases: The same approach can power use cases from fraud detection to personalized customer service.</p>
</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=unlocking-enterprise-ai-potential-knowledge-distillation-for-customer-support-analytics</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Celebrating Cloudera and IBM’s Milestone Impact in Brazil</title><description><![CDATA[Cloudera and IBM are celebrating eight years of collaboration highlighted by continuous innovation, impressive growth, and real-world impact on enterprise digital transformation across Brazil. ]]></description><link>https://www.cloudera.com/blog/partners/celebrating-cloudera-and-ibms-milestone-impact-in-brazil.html</link><guid>https://www.cloudera.com/blog/partners/celebrating-cloudera-and-ibms-milestone-impact-in-brazil.html</guid><pubDate>Wed, 03 Sep 2025 12:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1006006496-1.jpg"><p>Cloudera and IBM are celebrating eight years of collaboration highlighted by continuous innovation, impressive growth, and real-world impact on enterprise digital transformation across Brazil. Since the collaboration began in 2017, Cloudera and IBM have together generated strong business results—achieving US$100 million in annual recurring revenue by 2020, with growth continuing thanks to the deepening synergy between the two companies.</p>
<h2>A Powerful Partnership</h2>
<p>Cloudera’s open and scalable data platform, integrated seamlessly with IBM’s advanced technologies such as watsonx, BigSQL, and Cognos, is shaping the future of data and AI for large organizations. Joint engineering and solutions teams from both companies collaborate to provide up-to-date integrations and smooth technical support, backed by robust professional services that are central to the partnership’s success.</p>
<h2>Delivering End-to-End Value</h2>
<p>By combining their expertise, Cloudera and IBM empower clients with end-to-end solutions across the enterprise data and AI lifecycle—from ingestion to inference—regardless of environment: on-premises, hybrid cloud, multicloud, or edge. The shared mission is clear: deliver a seamless, secure, and scalable customer experience that meets the demands of today’s digital business landscape.</p>
<p>“Our collaboration with IBM is one of our most valuable strategic assets,” says Rubia Coimbra, Vice President of Cloudera for Latin America. “We are helping companies unlock real-time value from their data with intelligence and scalability.”&nbsp;</p>
<p>Marcela Vairo, VP of Data &amp; AI, Americas at IBM, echoes this, pointing to the collaboration’s focus on innovation and customer impact: “Together, we are committed to providing seamless, end-to-end solutions that empower enterprises across all environments to unlock the full value of their data and AI investments, ensuring security, governance, and exceptional performance.”</p>
<h2>Joint Services and Solutions</h2>
<p>Key offerings through the Cloudera-IBM alliance include:</p>
<ul>
<li>Data in Motion (DIM): Real-time data capture, processing, and analytics</li>
<li>Data Services: Secure infrastructure and intelligent data management tools</li>
<li>Enterprise AI Integration: Predictive and generative modeling with robust privacy and compliance</li>
<li>Integrated Support &amp; Professional Services: Joint technical assistance and tailored deployments</li>
<li>Co-engineering: Ongoing innovation focused on interoperability and performance</li>
</ul>
<h2>Impact on Key Industries</h2>
<p>Both the financial and healthcare sectors in Brazil have particularly benefited. Financial organizations have gained agility to meet regulatory requirements and produce real-time insights, while healthcare providers leverage predictive analytics to improve patient care and operational efficiency.</p>
<p>Cloudera and IBM’s ongoing collaboration demonstrates how working together can set a standard for secure, scalable, and innovative data and AI solutions in complex enterprise environments, now and in the years to come.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=celebrating-cloudera-and-ibms-milestone-impact-in-brazil</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Navigating GxP Compliance in the Age of Precision Medicine and AI</title><description><![CDATA[Explore how Cloudera ensures GxP compliance amid data explosion and AI in healthcare to accelerate precision medicine with secure, auditable data.]]></description><link>https://www.cloudera.com/blog/business/navigating-gxp-compliance-in-the-age-of-precision-medicine-and-ai.html</link><guid>https://www.cloudera.com/blog/business/navigating-gxp-compliance-in-the-age-of-precision-medicine-and-ai.html</guid><pubDate>Tue, 02 Sep 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Bruce Wilcox,Rameez Chatni,Jeremiah Morrow]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1179048342-1.jpg"><p>Like many industries, life sciences is facing an explosion of data. This data—from genomic sequences to real-world patient insights—has the potential to be an engine for innovation, accelerating drug discovery and revolutionizing patient care. However, while other industries are able to capitalize on the growth of data, innovation and transformation in life sciences has been hampered by concerns over patient safety, product efficacy, and regulatory scrutiny.</p>
<p>Good Practice (GxP) compliance seeks to build trust and progress by establishing a framework that ensures integrity and reliability within each phase of the pharmaceutical value chain, from research and development to manufacturing and distribution. And data is a critical component of GxP compliance.</p>
<p>“GxP” stands for “Good [Industry] Practice,” where the “x” is a placeholder for a specific field. GxP compliance is the adherence to procedures and standards set and enforced by regulatory agencies within highly regulated industries, such as life sciences and pharmaceuticals, to ensure the quality of products and the safety of patients.</p>
<p>To comply with GxP, every data point must be traceable and auditable. But data is often distributed across a variety of systems, clouds, and data centers, and its volume and velocity make traceability even more difficult.</p>
<p>This blog will explore the need for GxP compliance, the complexities that data and AI introduce, and how Cloudera empowers life sciences organizations to navigate these requirements with confidence and agility.</p>
<h2>GxP Compliance: A Foundation of Life Sciences</h2>
<p>At its core, GxP compliance is about protecting patients. Good Manufacturing Practice (GMP), Good Clinical Practice (GCP), and Good Laboratory Practice (GLP) ensure pharmaceutical products, medical devices, and biotechnologies are consistently produced, tested, and distributed according to the highest standards, translating to safe and effective treatments.</p>
<p>Crucially, GxP mandates data integrity. Principles like ALCOA+ (Attributable, Legible, Contemporaneous, Original, Accurate, Complete, Consistent, Enduring, Available) ensure all data is reliable, trustworthy, and verifiable. Without this integrity, the foundation for safe and effective medical products crumbles, potentially leading to patient harm.</p>
<h2>Adhering to GxP is a Complex Challenge</h2>
<p>The life sciences industry operates on a foundation of trust. Patients, healthcare providers, and regulators must trust that pharmaceutical companies adhere to the highest standards. Adhering to GxP demonstrates a commitment to quality and ethical conduct, fostering this confidence.</p>
<p>Non-compliance carries significant consequences, from financial penalties and product recalls to legal actions and irreparable reputational damage. In fact, regulatory compliance costs account for <a href="https://amplelogic.com/glossary/what-is-gxp/" target="_blank" rel="noopener noreferrer">nearly 25% of a typical pharmaceutical manufacturing facility’s annual operating expenditures</a>. GxP acts as a critical risk mitigation framework, transforming a perceived burden into a robust system for operational excellence and long-term viability.</p>
<p>However, life sciences companies face significant challenges adhering to GxP:</p>
<ul>
<li><p><b>The explosion of data</b>: The growth in volume, variety, and velocity of data in life sciences organizations makes GxP compliance more difficult than ever, and the proliferation and distribution of systems, tools, and platforms introduces even more complexity into the auditability and traceability of data.&nbsp;</p>
</li>
<li><p><b>Added complexity of AI/ML workloads</b>: The growth of artificial intelligence (AI) and machine learning (ML) across the value chain introduces even more complexity. Challenges include model explainability and transparency, bias detection and mitigation, rigorous training data governance, and managing continuous learning and model drift.</p>
</li>
<li><p><b>Shared responsibility and system integration</b>: Most platforms and tools are not inherently “GxP certified”; validation is ultimately the customer’s responsibility. For customers who have moved some workloads to the cloud, distributed and hybrid environments complicate defining GxP system boundaries. The principle often applies: &quot;if GxP data can flow, validate all downstream systems.&quot; Integrating modern data platforms with existing, often siloed, legacy GxP-validated systems to achieve end-to-end data lineage is a significant hurdle.</p>
</li>
</ul>
<ul>
<li><p><b>Operational rigor</b>: Beyond technology, GxP compliance demands meticulous documentation, rigorous change control, continuous monitoring, and highly trained personnel. The burden of ongoing validation and revalidation is substantial.</p>
</li>
</ul>
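<p>The &quot;validate all downstream systems&quot; principle mentioned above is, in effect, a reachability question over a data-flow graph. The following is a minimal Python sketch, assuming a hypothetical lineage map in which each system name points to the systems it feeds. The system names and the <code>downstream_systems</code> helper are illustrative only and do not correspond to any Cloudera API.</p>

```python
from collections import deque

def downstream_systems(flows, source):
    """Return every system reachable from `source` by following data flows."""
    seen = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for target in flows.get(current, []):
            # Skip systems already visited so cyclic flows terminate.
            if target not in seen and target != source:
                seen.add(target)
                queue.append(target)
    return sorted(seen)

# Hypothetical lineage map: system name mapped to the systems it sends data to.
flows = {
    "lims": ["data_lake"],
    "data_lake": ["ml_training", "bi_reports"],
    "ml_training": ["model_registry"],
    "hr_system": ["bi_reports"],
}

in_scope = downstream_systems(flows, "lims")
print(in_scope)
```

<p>In this sketch, every system returned for a GxP source would be a candidate for validation scope, while a system that only feeds a shared report (such as the hypothetical <code>hr_system</code>) would not be pulled in by this source.</p>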
<h2>Empowering Compliance: How Cloudera Simplifies the GxP Journey</h2>
<p>Cloudera is a <a href="/content/www/en-us/products/cloudera-data-platform.html">data platform</a> built to address these challenges, providing a robust and scalable foundation for achieving and maintaining GxP compliance across diverse workloads, including AI and ML. Cloudera provides several capabilities that enable life sciences companies to use more of their data for analytic insights. The following features support GxP compliance:</p>
<h3>Unified Security, Governance, and Lineage</h3>
<p><a href="/content/www/en-us/products/cloudera-data-platform/sdx.html">Cloudera Shared Data Experience</a> (SDX) provides a consistent security, governance, and metadata layer across hybrid and multi-cloud environments. By leveraging the combined power of Apache Ranger for fine-grained access controls, Kerberos for strong authentication, and Apache Atlas for full lineage, SDX simplifies GxP validation by providing a single pane of glass for critical controls, enabling consistent policy enforcement, streamlined auditing, and encryption at rest and in transit (TLS/SSL).</p>
<h3>Cloudera Octopai Data Lineage</h3>
<p><a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> further enhances Cloudera SDX’s data lineage capabilities by providing automated, end-to-end data lineage across the entire enterprise data estate, including systems both within and outside Cloudera's platform. Cloudera Octopai Data Lineage automatically discovers and maps data flows from source to consumption across diverse technologies, from extract, transform, and load (ETL) tools and databases to BI reports and ML models. This comprehensive, cross-platform visibility provides a robust audit trail crucial for GxP compliance, enabling deep impact analysis, root cause identification, and ensuring the trustworthiness of data used for critical life sciences insights.</p>
<h3>End-to-End Auditability and AI/ML Support</h3>
<p>Cloudera offers extensive audit trails for user access, data modifications, and policy changes. <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">Cloudera AI</a> provides capabilities for tracking model development and experimentation, which is crucial for GxP-compliant AI and ML. This includes supporting experiment tracking, model registry and versioning, MLOps pipeline validation, and providing a foundation for bias detection.</p>
<h2>Conclusion: Cloudera Accelerates Innovation While Ensuring Compliance</h2>
<p>GxP compliance is not just a regulatory hurdle; it's a critical enabler for innovation and trust in life sciences. The complexities of exploding data and rapid AI/ML adoption demand a robust, unified data platform.</p>
<p>Cloudera provides a scalable foundation for achieving and maintaining GxP compliance across diverse workloads with its comprehensive security, governance, and auditability capabilities. By simplifying the GxP journey, Cloudera empowers life sciences organizations to accelerate innovation, bring life-saving therapies to market faster, and maintain trust.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=navigating-gxp-compliance-in-the-age-of-precision-medicine-and-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Data Catalog Implementation: A Step-by-Step Guide</title><description><![CDATA[Implementing a data catalog can be daunting and tedious. That’s why we’ve compiled a step-by-step data catalog guide to help you and your organization.]]></description><link>https://www.cloudera.com/blog/business/data-catalog-implementation-a-step-by-step-guide.html</link><guid>https://www.cloudera.com/blog/business/data-catalog-implementation-a-step-by-step-guide.html</guid><pubDate>Wed, 20 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Ron Pick]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-building-mirror.webp"><p>As organizations deal with a deluge of too much data (data bloat) coming from every system and landscape, having a well-organized and easily accessible data catalog is critical. Data teams and owners need to understand where data originated and where it resides. Without this knowledge, their job becomes challenging.&nbsp;</p>
<p>Data catalogs offer a number of benefits:</p>
<ul>
<li><p><b>Better decision-making:</b> Data catalogs provide quick and easy access to high-quality data. The availability of accurate and timely data enables business users to make informed decisions, improving overall business strategies.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Improved collaboration: </b>By serving as a central repository for enterprise data, a data catalog facilitates collaboration among different teams. Everyone has access to the same data and the same understanding of what the data represents, reducing miscommunications and discrepancies.</p>
</li>
</ul>
<ul>
<li><p><b>Better risk management and compliance: </b>Data catalogs help businesses maintain regulatory compliance by providing a clear record of what data is stored and how it’s used. This can be particularly beneficial in industries that have to comply with regulations like GDPR or HIPAA. Together with data lineage, catalogs act as a source of truth for the origins of data.</p>
</li>
</ul>
<p>While the benefits are clear, implementing a <a href="/content/www/en-us/products/cloudera-data-platform/sdx/data-catalog.html">data catalog</a> can be daunting and tedious. From speaking with and surveying data owners, we’ve compiled a step-by-step guide to help you successfully implement a data catalog in your organization.</p>
<h2>Best Practices for Implementing a Data Catalog: An 11-Step Guide</h2>
<p>Below are some best practices to follow when implementing a data catalog, broken down into easy-to-follow steps.</p>
<h3>1. Define a Clear Purpose and Scope</h3>
<p>Before jumping into the implementation process, clearly outline the purpose and scope of the data catalog. Identify the types of data to be included, who the intended audience is, and the business goals that the data catalog will support. A well-defined purpose and scope will guide the implementation process so that the catalog effectively serves its intended function.</p>
<h3>2. Identify and Involve Stakeholders</h3>
<p>Successful implementation of a data catalog requires the involvement of key stakeholders. These can include members from the data team and business teams. Including them in the design and implementation process ensures that the data catalog meets their needs and aligns with business goals.</p>
<h3>3. Establish Data Governance Policies</h3>
<p>Establishing robust data governance policies is a crucial part of implementing a data catalog. These policies should define data standards, access controls, and data quality measures. They ensure the data catalog remains accurate, up-to-date, and secure.&nbsp;</p>
<h3>4. Use Existing Catalog Metadata Standards</h3>
<p>Ensuring consistency and interoperability within your data catalog involves defining catalog metadata standards and data models to promote coherence with other systems and data sources. Examples of these standards include uniform headers and mandatory descriptions.</p>
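<p>A standard such as &quot;mandatory descriptions&quot; is easiest to enforce when it is checked automatically. Here is a minimal Python sketch, assuming catalog entries are plain dictionaries. The field names in <code>REQUIRED_FIELDS</code> and the sample entry are hypothetical, so substitute whatever your own standard requires.</p>

```python
# Hypothetical set of fields that every catalog entry must fill in.
REQUIRED_FIELDS = ("name", "description", "owner", "source_system")

def missing_fields(entry):
    """List the required fields that are absent or blank in a catalog entry."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(entry.get(field) or "").strip()
    ]

# The description is whitespace only and source_system is absent.
entry = {"name": "customer_orders", "description": "  ", "owner": "sales-data"}
print(missing_fields(entry))
```

<p>A check like this could run whenever an entry is created or edited, so that incomplete entries are flagged before they reach users.</p>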
<h3>5. Automate Metadata Capture</h3>
<p>Leverage leading metadata management tools like <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> to automate the process of capturing metadata from various sources. Automated metadata capture increases efficiency, accuracy, and consistency in your data catalog.</p>
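<p>To illustrate what automated capture means at its simplest, the Python sketch below reads table and column metadata directly from a database instead of asking someone to type it in. It uses an in-memory SQLite database purely as a stand-in. It is not how Cloudera Octopai Data Lineage works internally, and the <code>capture_table_metadata</code> helper and <code>orders</code> table are made up for the example.</p>

```python
import sqlite3

def capture_table_metadata(conn):
    """Collect table and column metadata from a SQLite connection."""
    metadata = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, default_value, pk).
        metadata[table] = [{"column": c[1], "type": c[2]} for c in columns]
    return metadata

# Stand-in source system: an in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
print(capture_table_metadata(conn))
```

<p>Because the metadata is read from the source itself, it stays in step with schema changes each time the capture runs.</p>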
<h3>6. Define Clear Milestones</h3>
<p>Defining milestones is a crucial part of implementing your data catalog. This process includes:</p>
<ul>
<li><p>Identifying data assets to be cataloged: Prioritize data assets for cataloging based on the guidelines shared in the next section.</p>
</li>
</ul>
<ul>
<li><p>Defining metadata requirements: Determine the level of detail and additional information required for each data asset—initially, less is sometimes more as you figure out what works best.&nbsp;&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Creating a timeline: Identify key milestones and set start and end dates for the project.</p>
</li>
</ul>
<ul>
<li><p>Defining phases of the project: Break down the project into manageable phases.</p>
</li>
</ul>
<ul>
<li><p>Assigning responsibilities: Assign tasks to ensure completion on time and to the required quality standards. Everyone should be aligned to the catalog.</p>
</li>
</ul>
<ul>
<li><p>Establishing quality control measures: Ensure the captured metadata is accurate, complete, and consistent with established standards.</p>
</li>
</ul>
<ul>
<li><p>Monitoring progress: Keep track of the project’s progress and adjust the plan as necessary to stay on track and meet milestones.</p>
</li>
</ul>
<h3>7. Prioritize Data Assets</h3>
<p>When populating your <a href="/content/www/en-us/products/cloudera-data-platform/sdx/data-catalog.html">data catalog</a>, prioritize data assets that are critical to the organization’s operations and can significantly impact business outcomes. Consider business-critical data, high-value data, frequently used data, data that is hard to find, and new data assets.</p>
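<p>One way to turn these criteria into a working order is a simple weighted score. The Python sketch below is only an illustration: the weights, criterion names, and sample assets are all hypothetical and should be adjusted to your organization’s priorities.</p>

```python
# Hypothetical weights; tune these to your organization's priorities.
WEIGHTS = {
    "business_critical": 3,
    "high_value": 2,
    "frequently_used": 2,
    "hard_to_find": 1,
    "new_asset": 1,
}

def priority_score(flags):
    """Sum the weights of every criterion flagged True for an asset."""
    return sum(weight for name, weight in WEIGHTS.items() if flags.get(name))

# Sample assets with the criteria each one meets.
assets = {
    "orders_table": {"business_critical": True, "frequently_used": True},
    "legacy_export": {"hard_to_find": True},
    "churn_features": {"high_value": True, "new_asset": True},
}

# Highest score first: these are the assets to catalog earliest.
ranked = sorted(assets, key=lambda name: priority_score(assets[name]), reverse=True)
print(ranked)
```

<p>The value of a score like this lies less in the exact numbers than in making the prioritization explicit and easy to revisit with stakeholders.</p>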
<h3>8. Populate the Data Catalog</h3>
<p>Collaborate with data owners or subject matter experts to document various attributes about the data assets they manage. This information—including data source, lineage, quality, and usage—can then be used to populate the data catalog.</p>
<h3>9. Train Users How to Use Search and Discovery Capabilities</h3>
<p>The metadata management tool you’ve invested in should provide search and discovery capabilities—such as filters, tags, owners, and other search parameters—which enable users to quickly find and access the data they need. Work with the vendor to ensure users are trained on how to use the tool effectively.</p>
<h3>10. Monitor Usage and Adoption</h3>
<p>Keep track of how your data catalog is being used and adopted within the organization. This will help you assess whether it’s meeting the organization’s needs and whether users are effectively leveraging its capabilities.</p>
<h3>11. Provide Ongoing Maintenance and Support</h3>
<p>Just like any other system, a data catalog requires ongoing maintenance and support. This includes regular updates and enhancements to ensure it remains relevant, useful, and up-to-date. This process also involves monitoring and rectifying any issues that may arise, thus ensuring the catalog’s integrity and usability.</p>
<h2>Conclusion and Next Steps</h2>
<p>Implementing a data catalog can be a complex process, but with careful planning, stakeholder involvement, and a focus on quality and usability, it can yield significant benefits for an organization.&nbsp;</p>
<p>By following these best practices, you can ensure a successful data catalog implementation that supports your organization’s data management and business goals. Remember that the data catalog is a living entity, continually evolving with your organization’s changing data landscape. It requires a dedicated effort and commitment to keep it accurate, useful, and valuable for all its users.</p>
<p>Ready to conquer data chaos? <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html#demo">&nbsp;Request a demo</a> to get started with <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> today—instantly harness automated metadata capture, end-to-end lineage, and intuitive cataloging so your teams can collaborate effortlessly, make smarter decisions, and stay compliant without the headache of manual cataloging.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=data-catalog-implementation-a-step-by-step-guide</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Recognized Among the Nation’s Cybersecurity Leaders</title><description><![CDATA[Cloudera has been recognized as one of the nation&apos;s top cybersecurity experts, leading innovation in data security and business resilience. ]]></description><link>https://www.cloudera.com/blog/business/cloudera-recognized-among-the-nation-s-cybersecurity-leaders.html</link><guid>https://www.cloudera.com/blog/business/cloudera-recognized-among-the-nation-s-cybersecurity-leaders.html</guid><pubDate>Thu, 14 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1213825456.jpg"><p>Cloudera is proud to announce our <a href="https://rankings.newsweek.com/americas-best-cybersecurity-companies-2025">recognition </a>as one of the nation’s most trusted cybersecurity providers, one of just 83 companies selected from thousands evaluated nationwide. Validated through expert assessments and public reviews, Cloudera earned top marks in three critical security categories: application and data security, infrastructure security, and operational technology (OT) and IoT security.<br>
</p>
<p>This award reflects Cloudera’s <a href="/content/www/en-us/solutions/lower-business-risks.html">cybersecurity leadership</a> across government, finance, healthcare, telecommunications, and education sectors. Whether our customers are managing sensitive citizen data or mission-critical financial systems, Cloudera provides a secure, scalable platform designed to meet the highest standards of trust and compliance.<br>
</p>
<h3>Where Cloudera Stands Out</h3>
<p>Cloudera earned service quality honors in three core cybersecurity domains:<br>
</p>
<ul>
<li><p>Application &amp; Data Security: Awarded for robust encryption, access control, and governance features.</p>
</li>
<li><p>Infrastructure Security: Honored for our zero-trust architecture and resilient hybrid-cloud capabilities.</p>
</li>
<li><p>OT and IoT Security: Recognized for delivering enterprise-grade security for connected systems and intelligent infrastructure.</p>
</li>
</ul>
<p>These capabilities reflect Cloudera’s long-standing belief that security and governance are foundational to how data platforms should operate.<br>
</p>
<p>“Our platform helps customers bring data insights to life quickly and safely,” said Carolyn Duby, Cloudera Field CTO. “That starts with embedding security at every layer, from data to infrastructure to connected systems. We help customers secure and govern data across multiple platforms through a single pane of glass, applying zero-trust principles every step of the way.”</p>
<h3>Security Across Verticals: A Business Imperative</h3>
<p>Security is a foundational enabler of trust, compliance, and innovation across every industry. Cloudera’s secure-by-design architecture protects everything from research data and healthcare records to payment systems and connected infrastructure.</p>
<p>To support these wide-ranging needs, Cloudera has embedded governance and protection into every layer of our platform. This commitment to compliance is demonstrated by our recent <a href="https://www.cloudera.com/about/news-and-blogs/press-releases/2025-06-25-cloudera-achieves-fedramp-moderate-authorization-advancing-its-commitment-to-secure-data-analytics-and-ai-in-the-public-sector.html">FedRAMP Moderate authorization</a>, which makes us a trusted provider to U.S. government agencies, and <a href="https://www.cloudera.com/about/news-and-blogs/press-releases/2024-08-22-cloudera-achieves-pci-dss-4-compliance-to-unlock-business-value-from-ai-for-financial-institutions.html">PCI DSS 4.0 compliance</a>, which unlocks new opportunities for financial institutions to harness AI while maintaining the highest data security standards.</p>
<h3>Evolving with the Threat Landscape</h3>
<p>As cybersecurity threats grow more sophisticated and AI-driven, organizations need more than basic protection—they need intelligent, adaptive tools that evolve alongside the threat landscape. That’s why Cloudera secures platforms and empowers customers to derive insight from their own cybersecurity data. We support the full data lifecycle, from ingestion and threat modeling to anomaly detection and AI-enabled response.</p>
<p>“AI is a strategic asset in detection and response,” said Duby. “Cloudera enables some of the world’s most critical enterprises to manage high-volume cyber data and respond quickly using low-code tooling, AI code assistants, and hybrid-cloud solutions.”</p>
<p>To further support this vision, Cloudera’s acquisition of <a href="https://www.cloudera.com/about/news-and-blogs/press-releases/2024-11-14-cloudera-to-acquire-octopais-platform.html">Octopai</a> brings advanced metadata visibility and automated data lineage into the platform, helping customers identify where sensitive data lives, how it flows, and how to govern it securely across complex hybrid environments.</p>
<h3>Built for Security. Driven by Innovation</h3>
<p>This national recognition confirms what our customers already know: Cloudera is a trusted partner for building secure, data-driven operations at scale. From public sector modernization to financial compliance, we help organizations navigate complex regulatory landscapes while turning data into a strategic advantage.</p>
<p>Whether you’re scaling AI in a tightly regulated industry or building mission-critical applications in the cloud, Cloudera delivers the tools and trust needed to move forward securely, intelligently, and confidently.</p>
<p>Explore how Cloudera helps protect your mission-critical data and drive secure innovation. Learn more about our <a href="/content/www/en-us/products/cloudera-data-platform.html">platform and public-sector capabilities</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-recognized-among-the-nation-s-cybersecurity-leaders</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera’s 2025 Season of Service: A Word from Our Ambassadors</title><description><![CDATA[Every July, our teams set out for Cloudera’s Season of Service: a month-long initiative that gives Clouderans the opportunity to get outside of the office, work together, and give back to our communities. To make it happen, we host events around the world, led by our exceptional team of Cloudera Cares ambassadors.]]></description><link>https://www.cloudera.com/blog/culture/cloudera-s-2025-season-of-service-a-word-from-our-ambassadors.html</link><guid>https://www.cloudera.com/blog/culture/cloudera-s-2025-season-of-service-a-word-from-our-ambassadors.html</guid><pubDate>Tue, 12 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1360699558.jpg"><p><span class="blue-nova">“What’s most meaningful to me about being a Cloudera Cares Ambassador is seeing the real impact we can have—both in our communities and within Cloudera.</span></p>
<p><span class="blue-nova">Just last week, I delivered donations to a children’s hospital where kids with ADHD were preparing for their summer camp. Seeing how happy they were to receive even a small present was incredibly heartwarming. Moments like that remind me why this work matters so much. It’s not just about giving—it’s about connection, kindness, and showing people they’re not alone.”</span></p>
<p>- Gloria Benko, Software Engineer</p>
<p><span class="blue-nova">“I’m most looking forward to promoting the culture of engagement as a Cloudera Costa Rica office, outside the work space, while doing activities that serve the community.”</span></p>
<p>- Natalia Aviles Fioravanti, Lead Data Analyst</p>
<p><span class="blue-nova">“The opportunity to provide service and witness the positive ripple effect of our collective actions through Cloudera Cares brings me immense joy. It's more than just a program; it's a chance to live out my values and contribute to something larger than myself.”</span></p>
<p>- Kylie Drake, Data Analytics and Insight Manager</p>
<p>Cloudera Cares has an impressive 29 ambassadors representing 12 countries across nine business units who brought these events to life around the globe. As we wrap up a busy few weeks of engagement, we invited our ambassadors to reflect on their experiences and share why this month holds such significance for them, touching on their goals for this season and why they joined the program in the first place.</p>
<p>Here’s what they had to say.</p>
<h2>Meet Our Cloudera Cares Ambassadors&nbsp;</h2>
<p>When asked why the Cloudera Cares ambassadorship matters, Gloria Benko reflected on the greater alignment with her personal values. “Becoming a Cloudera Cares Ambassador was a natural step for me. Through this role, I can help bring people together around meaningful causes, create opportunities to give back, and foster a sense of community and compassion at work. It’s about making a difference, even in small ways, and encouraging others to do the same.”</p>
<p>Other ambassadors shared similar sentiments. Kylie Drake told us that she recognized a profound personal need to contribute more meaningfully to the world around her. Natalia Aviles Fioravanti, Lead Data Analyst, added that service is part of her DNA. To her, “Cloudera Cares just felt natural.”</p>
<p>Ken Tabuchi had more personal ties to the cause, sharing that “My journey as a Japanese American, raised in the U.S. and now living in Japan after years in Australia, has deeply ingrained in me the philosophy of ‘Think globally, act locally.’” When considering the Cloudera Cares initiative as a whole, he said that it was the global perspective, combined with a passion for making a tangible difference, that drew him into the program.</p>
<p>Kylie Drake added, “I see it as a powerful platform to channel my time, talents, and effort towards initiatives that create a better world. For me, contributing in this way is not just a duty, but a privilege, and the least I can do to help foster positive change.” It is a mindset we feel perfectly embodies both our ambassadors and our company-wide approach toward facilitating the change we wish to see.</p>
<p><span class="blue-nova">“As a Cloudera Cares Ambassador, I see an incredible opportunity to pave the way for corporate social responsibility (CSR), no matter where I am in the world. Cloudera allows me to actively contribute to initiatives that align with my vision of a more sustainable and equitable future, demonstrating how a global mindset can translate into meaningful local action.”</span></p>
<p>- Ken Tabuchi, Customer Operations Manager</p>
<p><span class="blue-nova">“I enjoy the moments when we connect beyond our usual work settings—Clouderans coming together as a community to support one another and create a positive impact. It’s these experiences that truly bring the spirit of &quot;We’re better together&quot; to life.”</span></p>
<p>- Irene Li, Associate HR Business Partner</p>
<p>To learn more about our <a href="https://www.cloudera.com/about/our-culture.html">corporate culture</a> or Season of Service, or to become a Cloudera Cares Ambassador, <a href="mailto:cloudera_cares@cloudera.com">get in touch</a>.</p>
<p>Every July, our teams set out for Cloudera’s Season of Service: a month-long initiative that gives Clouderans the opportunity to get outside of the office, work together, and give back to our communities. To make it happen, we host events around the world, led by our exceptional team of Cloudera Cares ambassadors.</p>
<p>This initiative is one of the many that contribute to our <a href="https://www.cloudera.com/about/philanthropy.html?tab=1">broader corporate social responsibility (CSR) goals</a>.</p>
<h2>Season of Service: What It Means to Us&nbsp;</h2>
<p>Our ambassadors are Clouderans who opt into additional volunteering opportunities beyond their day jobs and perfectly embody the value of giving, both in and out of the office. Before jumping into this year’s events, we asked them to share what they most hoped to accomplish during this year’s Season of Service.</p>
<p>Gloria Benko, Software Engineer, shared that she was “most looking forward to the smiles of the people and the happiness that comes from giving back; it’s the most rewarding feeling.” Kylie Drake, Data Analytics and Insight Manager, shared that she was “looking forward to empowering others with abundant opportunities to make a difference in the world.”</p>
<p>As Clouderans, we do what we can to ‘think globally, act locally,’ a mantra that keeps us grounded in the fact that even the small things we do each day make a big difference. Ken Tabuchi, Customer Operations Manager, shared that this mindset is what makes him believe that by working together for the greater good through events like Cloudera’s Season of Service, we can “overcome geographical barriers and foster a stronger sense of shared purpose and collective impact across our diverse teams.”</p>
<h2>Cloudera Cares Initiatives&nbsp;</h2>
<p>If you ask a Clouderan what makes working here special, most of them will tell you about the unique corporate culture rooted in a group of people who actually care–about each other, our customers, and our impact. That culture is a large part of why we believe this year’s events were so successful.</p>
<p>Our 2025 Season of Service events brought together Clouderans from around the world, offering their time and energy to causes that support children, seniors, underserved communities, and those in need of essential supplies. Together, we made a meaningful difference across continents, demonstrating the power of collective action and compassion.</p>
<p>Our contributions spanned both virtual and in-person efforts, including mentoring children in Bangalore, assembling SAFE kitchen kits in Austin, supporting the elderly in Cork, community gardening, preparing Maitri care packages in Santa Clara, crafting toys in Budapest, and much more.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-s-2025-season-of-service-a-word-from-our-ambassadors</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Redesigning Decisions: How Enterprises Can Unlock AI’s True Potential</title><description><![CDATA[Cassie Kozyrkov, first and former Chief Decision Scientist at Google and now founder and CEO of Kozyr, a company dedicated to advancing decision intelligence, joined The AI Forecast to explore how this often-discounted discipline can transform AI from a technical novelty into a strategic asset by simply remembering to ask ‘why’. ]]></description><link>https://www.cloudera.com/blog/business/redesigning-decisions-how-enterprises-can-unlock-ai-s-true-potential.html</link><guid>https://www.cloudera.com/blog/business/redesigning-decisions-how-enterprises-can-unlock-ai-s-true-potential.html</guid><pubDate>Mon, 11 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1072593696.jpg"><p>While AI excitement continues to revolve around bigger models and better data, many enterprises get caught up in the complexity of bigger, faster, better. Precision, the quiet backbone of an effective strategy, often remains overlooked as a consequence. Yet without it, even the most powerful AI can accelerate the wrong outcomes.&nbsp;</p>
<p>Cassie Kozyrkov, first and former Chief Decision Scientist at Google and now founder and CEO of Kozyr, a company dedicated to advancing decision intelligence, joined The AI Forecast to explore how this often-discounted discipline can transform AI from a technical novelty into a strategic asset by simply remembering to ask ‘why’.&nbsp;</p>
<p>Here are some key takeaways from that conversation.&nbsp;</p>
<h3>Choosing the right model isn’t enough. Ask better questions first.&nbsp;&nbsp;</h3>
<p><b>Paul:</b> There’s growing concern about the dark side of AI hype, particularly its impact on thinking and decision-making. Are we genuinely improving how we think, or just outsourcing it?&nbsp;</p>
<p><b>Cassie:</b> The biggest misconception is that AI’s value lies in prediction. It doesn’t. The true power is in choosing better actions, but only if we start with the right question. Enterprises often overprioritize modeling and predictions while neglecting the decision-making process that gives those models value. The model itself, no matter how advanced, is secondary to the clarity of the problem it’s solving.&nbsp;&nbsp;</p>
<p>Decision intelligence is about starting with the right question, not just optimizing the algorithm. We’ve invested heavily in tools that generate answers, not frameworks that help us ask the right questions.&nbsp;&nbsp;</p>
<p>I caution against the cognitive cost of overreliance. When shortcuts replace engagement, organizations lose their capacity for deep thinking. Think of, “Should I be surprised I don’t get big biceps if I use a forklift?”&nbsp;</p>
<h3>The unicorn myth is stalling enterprise AI.&nbsp;</h3>
<p><b>Paul:</b> If the real advantage lies in asking better questions, how should enterprises rethink the way they structure their AI efforts? What does it take to succeed across modeling, data management, and translating insights into decisions?&nbsp;</p>
<p><b>Cassie:</b> Enterprises often fall into the trap of hiring a so-called “unicorn,” someone expected to do it all, but believing that myth holds companies back. It assumes technical skills alone deliver value, when in reality, data science is inherently interdisciplinary. Real success doesn’t come from individual brilliance; it comes from well-orchestrated teams with distinct, complementary roles.&nbsp;</p>
<p>Enterprises must shift from “finding the right person” to “designing the right process,” since AI is a team sport. Success depends on collaboration between decision-makers, domain experts, data scientists, and engineers—all aligned around the decisions that matter.&nbsp;</p>
<p>Start with the decision, not the tool. Define what you’re trying to decide, what information supports it, who owns the outcome, and what action it should drive. That’s how you make AI useful, but decision intelligence ties it together. It creates clarity across disciplines, helps you reason through uncertainty, and ensures your AI efforts drive real, actionable outcomes.&nbsp;</p>
<h3>AI is a magic lamp. The danger lies in the wisher.&nbsp;&nbsp;</h3>
<p><b>Paul:</b> If AI success depends on collaboration and well-designed processes, how should organizations be thinking about their responsibility in using these powerful tools? What concerns you most about how they’re approaching generative and agentic AI today?&nbsp;</p>
<p><b>Cassie: </b>When answers are cheap, the real power lies in clarity of intent. We talk too much about the genie and not enough about the lamp, or the person making the wish.&nbsp;</p>
<p>Enterprises that focus on the wisher, not just the genie, are the ones thinking in terms of decision intelligence. The danger isn’t in how powerful the AI is; it’s in how well the human formulates the request. You can make the genie as big and strong as you like, but how skilled is the wisher? How responsible? How thoughtful?&nbsp;</p>
<p>The lamp, which contains and governs the AI, is just as critical. In the world of agentic AI, that structure matters as much as the model itself.&nbsp;</p>
<p>Too many enterprises are fixated on what AI can do, neglecting how humans frame the problem. That’s where misalignment happens, not because the AI is flawed, but because the ask was misguided. The real skill enterprise leaders need isn’t prompt engineering, it’s decision design. We need more skilled wishers who know what to ask, why it matters, and what to do with the answer. <br>
 <br>
Catch the full conversation with Cassie Kozyrkov on The AI Forecast on <a href="https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fopen.spotify.com%2Fepisode%2F77f7BXJrRX94Msekw8xbsZ&amp;data=05%7C02%7C%7Cc3aa8fac02c94b33b87408ddcf6ea793%7Cbf0b48c768944eb6bf538621676ccee4%7C0%7C0%7C638894796341407891%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&amp;sdata=foyUN%2F57EDWaGLPEsqWMqLIEZcqCqiHEGNgM%2BkDrvXE%3D&amp;reserved=0">Spotify</a>, <a href="https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpodcasts.apple.com%2Fus%2Fpodcast%2Fai-means-nothing-if-you-dont-know-why-youre-using-it%2Fid1779293119%3Fi%3D1000719790604&amp;data=05%7C02%7C%7Cc3aa8fac02c94b33b87408ddcf6ea793%7Cbf0b48c768944eb6bf538621676ccee4%7C0%7C0%7C638894796341432760%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&amp;sdata=Caa0AcrEe7NOriJ5zv50iVjtfrJjGB9TYbCoxhyMVJo%3D&amp;reserved=0">Apple Podcasts</a>, and <a href="https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DTrg9Tz49NrE%26list%3DPLe-h9HrA9qfAmGHgsmXUZgLL-T4Xjhlq8&amp;data=05%7C02%7C%7Cc3aa8fac02c94b33b87408ddcf6ea793%7Cbf0b48c768944eb6bf538621676ccee4%7C0%7C0%7C638894796341452641%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&amp;sdata=vbz3mUjs9%2Fpi7DnNMhuW1%2FqkEeJyPmv9AYdzBNTEKkI%3D&amp;reserved=0">YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=redesigning-decisions-how-enterprises-can-unlock-ai-s-true-potential</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Bringing Private AI To Your Data Center with Cloudera Data Services 1.5.5</title><description><![CDATA[Bring private AI securely to your data center with Cloudera Data Services 1.5.5 — delivering GPU-accelerated AI inference, hybrid cloud agility, and more.]]></description><link>https://www.cloudera.com/blog/partners/bringing-private-ai-to-your-data-center-with-cloudera-data-services-1-5-5.html</link><guid>https://www.cloudera.com/blog/partners/bringing-private-ai-to-your-data-center-with-cloudera-data-services-1-5-5.html</guid><pubDate>Mon, 11 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Blake Tow,Rahul Sharma]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-gettybu003307.webp"><p>Once a distant vision, enterprise AI has become an immediate strategic imperative for business and technology leaders. A recent <a href="https://cloud.google.com/resources/roi-of-generative-ai" target="_blank" rel="noopener noreferrer">Google report</a> found that 74% of large organizations are already seeing a significant return on investment (ROI) from generative AI (GenAI) investments.</p>
<p>Despite the clear mandate, the path forward is fraught with risk: a staggering 88% of enterprise AI projects fail to reach production, <a href="https://www.cio.com/article/3850763/88-of-ai-pilots-fail-to-reach-production-but-thats-not-all-on-it.html" target="_blank" rel="noopener noreferrer">according to recent research</a>. This high failure rate is a direct result of the complex hurdles businesses must overcome. Leaders are forced to navigate a minefield of security vulnerabilities, unpredictable costs, a persistent skills gap, and uncertain ROI, obstacles that often prove insurmountable.</p>
<p>What if you could eliminate these barriers by flipping the script? Instead of moving your most sensitive, proprietary data to external models and facing the security vulnerabilities and unpredictable costs that entails, what if you could bring the power of GenAI to your data where it already lives, securely within the data center you've already invested in?</p>
<p>Now, you can: <a href="/content/www/en-us/products/data-services.html">Cloudera Data Services</a> can help you modernize to a more powerful and secure platform, empower your teams, and unlock the full potential of private AI within your data center. This means bringing AI directly to your proprietary data to innovate securely behind your own firewall, without exposing sensitive intellectual property.</p>
<h2>Introducing Cloudera Data Services 1.5.5</h2>
<p>Cloudera Data Services is a powerful suite of containerized applications for <a href="/content/www/en-us/products/data-engineering.html">data engineering</a>, data warehousing, and AI that you run securely in your own data center. This latest release marks a significant leap forward and expands on that foundation to unlock the full potential of private AI within your enterprise.&nbsp;</p>
<p>With this release, Cloudera Data Services now includes:</p>
<ul>
<li><b>Private enterprise AI</b>: The ability for secure innovation behind your firewall by bringing AI directly to your data, without exposing sensitive intellectual property.</li>
<li><b>A true cloud-native experience</b>: The agility, elasticity, and autoscaling of the cloud combined with the security and control of your own on-premises data center.</li>
<li><b>Dramatically improved efficiency</b>: A modern architecture where the independent scaling of compute and storage reduces infrastructure costs and optimizes resource utilization.</li>
<li><b>An empowered practitioner experience</b>: Access to the next-generation tools and self-service experience your teams need to accelerate their time to value.</li>
<li><b>An open data lakehouse foundation</b>: A unified platform for your analytics and AI that is powered by open standards like Apache Iceberg and Trino to eliminate vendor lock-in and offers a purpose-built petabyte-scale object store powered by Apache Ozone.</li>
</ul>
<h2>Private Enterprise AI Behind Your Firewall</h2>
<p>Running <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">enterprise AI</a> across a hybrid environment has historically meant building complex, unsupported DIY solutions. Without a consistent, unified platform, crucial capabilities for the end-to-end AI lifecycle were siloed in different places, forcing teams to manually bridge the gaps. Cloudera Data Services 1.5.5 fundamentally changes this by providing the tools to access data anywhere – creating one seamless AI lifecycle on a single hybrid platform. This allows you to innovate faster, without fragmenting the process or compromising on security and control.</p>
<p>An exciting aspect of this release is the introduction of on-premises support for <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI Inference service</a>, accelerated by NVIDIA, and <a href="https://docs.cloudera.com/machine-learning/cloud/setup-ai-studios/topics/ml-ai-studios-overview.html" target="_blank" rel="noopener noreferrer">Cloudera AI Studios</a>. Previously, these offerings were available only in the cloud.&nbsp;</p>
<ul>
<li><b>Cloudera AI Inference service</b> provides a secure and scalable engine to deploy and manage all of your AI models in production, right in your data center. This allows you to bring AI directly to your data, keeping your most valuable assets and sensitive intellectual property secure within your own firewall.&nbsp;</li>
<li>To address the persistent skills gap, <b>Cloudera AI Studios</b> offer low-code templates that empower your existing teams to rapidly build and deploy GenAI applications and agents. This dramatically reduces the need for scarce and specialized AI talent. With this integrated, secure-by-default platform, you can now go from prototype to production in days, not months, to deliver tangible business value faster than ever before.</li>
</ul>
<p>These capabilities empower organizations to accelerate AI adoption by building and running GenAI applications within the security of their own data centers, keeping sensitive intellectual property safely behind their firewall. This move from monolithic clusters to a suite of agile, containerized applications delivers a true cloud-native experience on premises, providing agility and efficiency without sacrificing security or control.</p>
<h2>Empowering Practitioners and Accelerating Value</h2>
<p>A platform is only as good as the people who use it. That’s why this release of Data Services is practitioner-focused, designed to remove bottlenecks and accelerate time to value. According to a <a href="/content/dam/www/marketing/resources/analyst-reports/the-tei-of-cloudera-on-private-cloud.pdf?daqp=true">Forrester study</a>, customers who adopt this modern architecture see 80% faster workload deployment and a 20% increase in productivity for their data teams. This is achieved by providing a streamlined, self-service experience that reduces dependency on administrators and empowers your data scientists and engineers to focus on results, not administrative overhead.</p>
<p>This enhanced experience comes to life through new capabilities that simplify daily workflows. Practitioners can now onboard themselves securely with self-service Kerberos, debug issues faster with the new Hive Query History, and gain greater autonomy through fine-grained Spark job access control lists. By simplifying the onboarding experience and setup, your technical teams can spend less time on administrative tasks and more time delivering the business value that drives your enterprise forward.</p>
<h2>A Platform Built for Enterprise Scale</h2>
<p>These advanced AI and practitioner-focused capabilities are built upon a foundation engineered for the world’s most demanding enterprises. The key architectural advantage of <a href="/content/www/en-us/products/data-services.html">Data Services</a> is its decoupled and containerized nature, which allows you to scale compute and storage resources independently.&nbsp;</p>
<p>This efficiency translates into major savings. According to internal reporting, one large global bank is driving an estimated <b>$28 million in annual infrastructure savings and a 30% improvement in compute efficiency by modernizing with Cloudera Data Services</b>. This is made possible by a platform relentlessly focused on the core enterprise pillars of end-to-end security for all data and applications, reliability for mission-critical business continuity, and the scalability to perform consistently as data and workloads grow.</p>
<h2>Your Path to Modernization Starts Here</h2>
<p>Cloudera Data Services 1.5.5 provides a clear, proven path to transforming your data architecture and unlocking the full potential of enterprise AI. This release marks a significant leap forward, delivering a powerful, cloud-native platform designed to accelerate data modernization efforts. We invite you to take the next step on your modernization journey.&nbsp;</p>
<p>Whether you're ready to schedule a demo to see the new capabilities in Data Services 1.5.5, book a deep-dive session with our technical experts, or begin planning your upgrade or a proof of concept, <a href="/content/www/en-us/contact-sales.html">we're here to help you get started</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=bringing-private-ai-to-your-data-center-with-cloudera-data-services-1-5-5</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Named a Leader and Fast Mover in 2025 GigaOm Radar for Streaming Data Platforms</title><description><![CDATA[For the second year in a row, Cloudera has been named a Leader in the GigaOm Radar for Streaming Data Platforms report. The 2025 edition of the report evaluated the top 17 streaming data platforms in the market against various functional and non-functional criteria. Cloudera was recognized for delivering a well-rounded solution in all fundamental aspects of streaming data management.]]></description><link>https://www.cloudera.com/blog/business/cloudera-named-a-leader-and-fast-mover-in-2025-gigaom-radar-for-streaming-data-platforms.html</link><guid>https://www.cloudera.com/blog/business/cloudera-named-a-leader-and-fast-mover-in-2025-gigaom-radar-for-streaming-data-platforms.html</guid><pubDate>Fri, 08 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[André Araújo]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-lake.webp"><p>For the second year in a row, Cloudera has been named a Leader in the <a href="/content/www/en-us/campaign/2025-gigaom-radar-for-streaming-data-platforms.html">GigaOm Radar for Streaming Data Platforms</a> report. The 2025 edition of the report evaluated the top 17 streaming data platforms in the market against various functional and non-functional criteria. Cloudera was recognized for delivering a well-rounded solution in all fundamental aspects of streaming data management.</p>
<h2>Cloudera’s Key Differentiators in the GigaOm Radar for Streaming Data Platforms Report</h2>
<p>GigaOm highlighted the following as Cloudera’s strengths in the streaming data market:<br>
</p>
<h3>Enabling a Wide Range of Real-Time Applications</h3>
<p>Cloudera brings together Apache NiFi, Apache Kafka, and Apache Flink—the de facto standards in data streaming—to support a broad array of real-time applications, including data product creation, multi-cloud data movement, and AI/ML pipelines.</p>
<h3>Providing Advanced Analytics For Immediate Insights</h3>
<p>The ability to process large amounts of data in real time is paramount for our customers. We make this possible, scalable, and easy. With <a href="https://docs.cloudera.com/csa/latest/ssb-overview/topics/csa-ssb-intro.html">Cloudera SQL Stream Builder</a>, users can create real-time pipelines using SQL to perform advanced analytics and streaming data processing.<br>
</p>
<h3>Accelerating Generative AI (GenAI) Applications</h3>
<p><a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> makes it fast and easy to get data into and out of AI models by supporting native integration for AI use cases—for example, integration with vector databases like Pinecone, Milvus, and Qdrant via exclusive NiFi processors for rapid processing and embedding of data. Cloudera provides seamless data flow from ingestion to production AI, supporting advanced use cases in data science and AI.</p>
<p>Real-time analytics and AI-driven applications are a necessity in today's business use cases. Cloudera's streaming data platform is well-positioned to provide customers with the streaming capabilities they need to build GenAI pipelines and workloads, create and update models in real time, and perform fast and scalable analytics on the streaming data using features like SQL Stream Builder, multi-cloud pipelines, and vector DB connectivity.</p>
<h3>Enabling Cost-Effective and Scalable Data Storage with Apache Iceberg</h3>
<p>Most streaming data is eventually stored at rest. Due to regulatory or business requirements, this data may have to be retained for several years, amounting to several petabytes of data. <a href="https://docs.cloudera.com/cdp-public-cloud/cloud/cdp-iceberg/topics/iceberg-in-cdp.html">Apache Iceberg</a> offers a low-cost and scalable storage solution for streaming data—ideal for stream enrichment and long-term data access.&nbsp;</p>
<p>Data lakehouses are at the core of modern enterprise architectures. Cloudera is one of the major contributors to Apache Iceberg, which powers our own data lakehouse. Any streaming solution in such an architecture must integrate effectively with the data lakehouse, providing low-cost, efficient streaming data storage and enrichment. Cloudera provides this integration natively, supporting it through NiFi, Flink, and Kafka integrations with Apache Iceberg.<br>
</p>
<h3>Unlocking True Flexibility by Running Real-Time Data On-Premises or Across Any Cloud</h3>
<p>Cloudera offers unparalleled support for a fully integrated, open-source-aligned, data-in-motion stack. We’re the only vendor offering a seamless experience across Apache NiFi for diverse data ingestion, Apache Kafka for scalable event streaming, and Apache Flink for powerful real-time processing. This unique combination empowers enterprises to build robust, real-time data pipelines that span on-premises, multi-cloud, and edge environments.</p>
<p>At Cloudera, we treat data in motion as a first-class citizen in our platform because we believe the ability to move and process data in real time is key to success for any data-driven organization. We blend open-source innovation with enterprise-ready tools to give our customers the best of both worlds—the continuous innovation of the open source community backed by enterprise-scale support and customer service. By doing so, Cloudera delivers real-time data movement and processing you can trust.</p>
<p>Streaming capabilities are usually part of larger business use cases and are required to integrate with other types of technologies (for example, <a href="/content/www/en-us/products/open-data-lakehouse.html">data lakehouses</a>, <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">AI inference</a>, and <a href="/content/www/en-us/products/data-engineering.html">data engineering tools</a>). We recognize the importance of using streaming technologies as part of a larger ecosystem and deliver an integrated experience that enables customers to implement various types of business applications and use cases quickly and easily.</p>
<h3>Read the Report</h3>
<p><a href="/content/www/en-us/campaign/2025-gigaom-radar-for-streaming-data-platforms.html">Read GigaOm’s analysis</a> of the vendors that excel in the streaming data platform space and how they stack up to each other. You’ll see for yourself why Cloudera’s comprehensive, enterprise-ready streaming solution is positioned as a Leader and Fast Mover.</p>
<p>The report also provides details about:</p>
<ul>
<li><p>The diverse vendor landscape: Learn about the broad range of vendors from niche specialists to full-stack cloud providers like Cloudera.</p>
</li>
<li><p>Real-time processing: Get an understanding of the vendors who perform well at providing real-time processing, analytics, and machine learning integration.</p>
</li>
<li><p>Strategic fit and integration: See why it’s important to consider alignment of platform capabilities with infrastructure, scalability, and enterprise toolchains when comparing solutions.</p>
</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-named-a-leader-and-fast-mover-in-2025-gigaom-radar-for-streaming-data-platforms</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>AI and Data in the Real World: Key Lessons from the Mid-Year Enterprise Tech Conference Circuit</title><description><![CDATA[So far in 2025, a key theme has emerged consistently across the most influential data and cloud events, including AWS in Sydney, Hamburg, DC and NYC, DTW Ignite, UN Open Source Week and Gartner Sydney: AI implementation is adaptable to a range of use cases and organizational needs. Whether your use case requires a specific type of AI, a human touch, or has real-world impact, this year’s major tech events reinforced how to take advantage of AI’s versatility, allowing your organization to unlock the full potential of its data. ]]></description><link>https://www.cloudera.com/blog/business/ai-and-data-in-the-real-world-key-lessons-from-the-mid-year-enterprise-tech-conference-circuit.html</link><guid>https://www.cloudera.com/blog/business/ai-and-data-in-the-real-world-key-lessons-from-the-mid-year-enterprise-tech-conference-circuit.html</guid><pubDate>Thu, 07 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty51572797-1.jpg"><p>So far in 2025, a key theme has emerged consistently across the most influential data and cloud events, including AWS in Sydney, Hamburg, DC and NYC, DTW Ignite, UN Open Source Week and Gartner Sydney: AI implementation is adaptable to a range of use cases and organizational needs. 
Whether your use case requires a specific type of AI, calls for a human touch, or has real-world impact, this year’s major tech events reinforced how to take advantage of AI’s versatility, allowing your organization to unlock the full potential of its data.&nbsp;</p>
<p>Here’s what you missed if you couldn’t attend these events.&nbsp;</p>
<h3>The Many Forms of AI: From Traditional to Agentic&nbsp;</h3>
<p>Whether it’s improving customer experiences, accelerating operational efficiency or transforming humanitarian aid and public health, the message is clear: organizations that understand how to harness and scale AI effectively and responsibly will be the ones to lead.&nbsp;</p>
<p>That process starts with understanding the right type of AI for your use case. <a href="https://www.linkedin.com/in/wimstoop/" target="_blank">Wim Stoop</a>, Director of Product Marketing at Cloudera, emphasized this point at DTW Ignite. Different types of AI have varying uses, and knowing how to leverage each type will drive real value for businesses and customers.&nbsp;</p>
<p>Stoop differentiated between four categories of AI:&nbsp;</p>
<ol>
<li><b>Traditional AI</b> focuses on performing specific, predefined tasks. Using algorithms and pre-programmed rules, traditional AI is great for internal efficiencies, such as network optimization and anomaly detection.&nbsp;</li>
<li><b>Generative AI</b> creates new content based on the data it’s trained on. This often has stronger customer-facing use cases, including generating personalized offers and chatbot responses.</li>
<li><b>Multimodal AI</b> processes information from multiple data types simultaneously, enabling smarter in-house support for use cases including anomaly and fraud detection using multiple data streams.&nbsp;</li>
<li><b>Agentic AI</b> operates autonomously. It can learn from data and execute complex tasks independently, enabling use cases like autonomous ticket resolution or resource reallocation.&nbsp;&nbsp;</li>
</ol>
<p>A key emerging theme from this year’s events is that smart AI use is a focus for organizations in 2025: effective AI implementation starts with usability and choosing the right type of AI for your use case.&nbsp;</p>
<h3>Bringing a Human Touch to AI</h3>
<p>How do we ensure that AI produces the most accurate, useful results possible? During DTW Ignite’s “AI at Scale for Business: From Data Chaos to Intelligent Action” panel, Stoop highlighted that AI is only as effective as the data that fuels it. AI integrations must be scalable, built with strong data fundamentals and consolidated across the organization. To make that happen, AI requires a human touch—as people and professionals, we must ensure data integrity, handle governance, and combat bias within AI platforms.</p>
<p>The “Data to Decisions: How Cloudera AI Powers Agile Crisis-Response at Scale” talk at Gartner Sydney, led by <a href="https://www.linkedin.com/in/domenic-brasacchio/" target="_blank">Domenic Brasacchio</a>, Senior Account Director at Cloudera, took this point further. He emphasized that humans have the responsibility to maintain data quality as the baseline for trustworthy AI outcomes. Fostering AI literacy across teams and creating a culture and infrastructure that support and supervise AI will help achieve this.&nbsp;</p>
<h3>How AI is Already Making a Difference in the Real World&nbsp;</h3>
<p>At AWS Hamburg, <a href="https://www.linkedin.com/in/jaidev-karthickeyan-83783613/" target="_blank">Jaidev Karthickeyan</a>, Global Head, Value Advisory at Cloudera, discussed AI's real-world application. Using <a href="/content/www/en-us/about/news-and-blogs/press-releases/2024-11-25-clouderas-advanced-ai-solutions-accelerate-global-humanitarian-aid-for-mercy-corps.html">Mercy Corps</a> as an illustrative example, he showed how AI can be a tool for impact, resilience, and saving lives, empowering the organization's humanitarian and public health missions.</p>
<p>At the event, Karthickeyan showcased Mercy Corps’ AI-powered response system, which improved field intelligence and decision-making under resource-constrained conditions. By centralizing what was once fragmented research data and optimizing compute capacity while facing spotty connections out in the field, AI met their urgent need for contextualized, applicable and action-ready intelligence.&nbsp;&nbsp;</p>
<p>Examples like this highlight how AI can be a tool for impact, resilience and, where applicable, lifesaving efficiency. Further, the lessons learned here can be applied to any use case: the ability to creatively apply AI to strained processes allows enterprises to reach their goals more easily. As such, AI is quickly becoming an invaluable tool for any business looking to overcome long-standing roadblocks that slow growth or efficiency.&nbsp;</p>
<h3>Cloudera’s Role in Democratizing AI</h3>
<p>Cloudera’s presence at major enterprise tech events this year has reinforced our position as a trusted thought leader in AI and data innovation. From event stages to collaborative sessions, our teams have demonstrated how Cloudera’s hybrid, open data platforms continue to meet the evolving needs of enterprises and mission-driven organizations alike. <a href="https://www.linkedin.com/in/sergiogh/" target="_blank">Sergio Gago</a>, CTO at Cloudera, reinforced the significance of this message at UN Open Source Week: Cloudera leads in helping organizations manage and make the best use of their data.</p>
<p>Whether enabling global scalability, ensuring end-to-end governance, or transforming data into real-time, actionable insights, Cloudera remains focused on helping organizations unlock the full potential of their data. As the conversation around AI deepens, Cloudera stands ready—shaping the future with solutions that are open, secure and built for impact.</p>
<h3>Continue the Conversation: Join us at an Upcoming Event</h3>
<p>Interested in joining the conversation and learning more about how to effectively utilize AI and data in the real world? <a href="/content/www/en-us/events.html">Click here to see where we’ll be</a> and how to join us at any of these upcoming events:</p>
<ol>
<li>AWS Summit Mexico City: Aug. 6</li>
<li>Big Data London: Sept. 24-25</li>
<li>Big Data Paris: Oct. 1-2</li>
<li>Gitex 2025: Oct. 13-17</li>
<li>AWS re:Invent: Dec. 1-5</li>
</ol>
<p>Or <a href="/content/www/en-us/events/evolve.html">join us at an EVOLVE25 event near you</a> to connect with industry visionaries, data and AI experts and your peers to explore the impact of accessible data and AI across industries.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=ai-and-data-in-the-real-world-key-lessons-from-the-mid-year-enterprise-tech-conference-circuit</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Activate Your Data and Transform Possibilities with AI Agents</title><description><![CDATA[This blog explores how this partnership addresses major data challenges and reveals real-world applications that can spark your business’s AI transformation. Whether you&apos;re exploring AI adoption for the first time or aiming to scale current efforts, this guide provides actionable insights.]]></description><link>https://www.cloudera.com/blog/partners/activate-your-data-and-transform-possibilities-with-ai-agents.html</link><guid>https://www.cloudera.com/blog/partners/activate-your-data-and-transform-possibilities-with-ai-agents.html</guid><pubDate>Mon, 04 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Jerome Alexander]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-blue-orange-back-person-walking.webp"><p>Artificial intelligence (AI) is no longer just a futuristic concept. Today, enterprise AI is reshaping businesses globally, unlocking new efficiencies, and creating pathways to innovation. A major breakthrough in this transformation is the emergence of AI agents, specialized tools designed to automate complex tasks, analyze vast amounts of data, and deliver actionable insights.</p>
<p>However, the road to successfully implementing AI is not without its hurdles. Poor data quality, fragmented systems, and integration challenges hinder many organizations from realizing AI’s full potential. That’s where <a href="https://www.cloudera.com/partners/solutions/crewai.html">CrewAI and Cloudera</a> are stepping in as partners to bridge the gap.</p>
<p>This blog explores how this partnership addresses major data challenges and reveals real-world applications that can spark your business’s AI transformation. Whether you're exploring AI adoption for the first time or aiming to scale current efforts, this guide provides actionable insights.</p>
<h3>Key Applications of AI Agents Across Industries</h3>
<p>AI agents are delivering major impacts across industries. Below are some use cases that highlight their applications and benefits:</p>
<h3>Financial Auditing</h3>
<ul>
<li><p>Problem: Manual auditing is time-consuming and prone to errors, leading to compliance risks.</p>
</li>
<li><p>Solution: AI agents automate regulatory audits by ingesting and analyzing financial data in real time, flagging discrepancies instantly.&nbsp;</p>
</li>
</ul>
<h3>Predictive Maintenance in Manufacturing</h3>
<ul>
<li><p>Problem: Unexpected equipment failures lead to costly downtime.&nbsp;</p>
</li>
<li><p>Solution: AI agents monitor sensors across machines, detecting potential issues before they escalate. Preventative maintenance schedules are generated automatically.&nbsp;</p>
</li>
</ul>
<h3>Contract Management</h3>
<ul>
<li><p>Problem: Reviewing legal contracts manually is tedious and error-prone.&nbsp;</p>
</li>
<li><p>Solution: AI-powered contract analysis automates text extraction, highlights risk clauses, and compares terms across documents.&nbsp;</p>
</li>
</ul>
<p>These use cases showcase AI’s immense potential to streamline operations and deliver tangible results, all while enhancing decision-making capabilities.</p>
<h3>Understanding the Challenges of AI Adoption</h3>
<p>Adopting AI technologies in enterprise settings isn’t as straightforward as many assume. Here are some of the most common barriers for organizations, as discussed in Cloudera’s recent webinar “<a href="https://www.cloudera.com/content/dam/www/marketing/resources/webinars/activate-your-data-transforming-possibilities-with-ai-agents.landing.html">Activate Your Data: Transforming Possibilities with AI Agents</a>”:</p>
<ol>
<li>Data quality: 29% of enterprises reported that poor data quality and availability are the biggest factors delaying value realization in AI projects. Without reliable and complete datasets, AI models can't generate meaningful or actionable insights.</li>
<li>Data integration: Nearly 31% of organizations cite data integration as a major roadblock. Data scattered across legacy systems, cloud platforms, and departmental silos creates barriers to seamless AI deployment. Connecting these fragmented sources into one cohesive system often feels like an impossible task.</li>
<li>Operational scalability: Even businesses that succeed in developing AI models face challenges scaling them into production environments. For example, 62% of companies struggle to move AI models from the development phase into full-scale production. Without scalable platforms, AI risks becoming an interesting experiment instead of a business-critical tool.</li>
</ol>
<p>These challenges are clear indicators that traditional approaches to managing enterprise data systems are no longer sufficient. Instead, businesses must rethink how they manage their data, ensuring they have a foundation of trusted data on which to deploy AI models.</p>
<h3>CrewAI and Cloudera Pioneering Intelligent AI Solutions</h3>
<p>To address these challenges, the partnership between <a href="https://www.cloudera.com/partners/solutions/crewai.html">CrewAI and Cloudera</a> offers a paradigm shift in how enterprises leverage AI. By combining CrewAI’s AI agent technology with Cloudera’s secure, scalable data platform, businesses can operationalize data like never before. Here’s why this partnership stands out:</p>
<ul>
<li><p>Secure AI frameworks: Security and compliance are built into every stage. From financial institutions to healthcare providers, industries dealing with sensitive information can confidently deploy AI agents knowing their data privacy is safeguarded.</p>
</li>
<li><p>Integrated ecosystem: Seamless interoperability with <a href="https://www.cloudera.com/partners.html">leading technologies and ecosystem partners</a> means enterprises don’t have to overhaul existing systems and can instead integrate AI into their current tech stack.</p>
</li>
<li><p>Scalable infrastructure: Adaptable and robust solutions designed for enterprises of any size. Whether automating mundane back-office tasks or implementing predictive analytics in manufacturing, the infrastructure supports a wide range of applications.</p>
</li>
</ul>
<p>With this powerful partnership, businesses can activate their dormant data, turning challenges into opportunities.</p>
<h3>Don’t Just Sit on Your Data: Activate It Today</h3>
<p>AI agents are shaping the future of enterprise operations. With the collaboration between CrewAI and Cloudera, businesses now have access to intelligent, secure, and scalable solutions designed to turn their data into their most valuable asset.</p>
<p>If your organization is ready to harness the possibilities of AI, we encourage you to explore the opportunities this partnership offers. Whether you're looking to improve customer service, enhance productivity, or redefine your data strategy, CrewAI and Cloudera provide the tools you need.</p>
<p>Learn more about the CrewAI and Cloudera partnership <a href="https://www.cloudera.com/partners/solutions/crewai.html">here</a>, or head over to our Community for a deep dive into the topic: <a href="https://community.cloudera.com/t5/Community-Articles/Fully-Private-Agents-with-Cloudera-s-AI-Inference-Service/ta-p/400799">Fully Private Agents with Cloudera's AI Inference Service and CrewAI</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=activate-your-data-and-transform-possibilities-with-ai-agents</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>A Conversation with Mary Wells and Alison Fragale on Learning to Lead as Your Authentic Self </title><description><![CDATA[Cloudera’s Chief Marketing Officer, Mary Wells, recently hosted an energizing, candid conversation with Likeable Badass author and organizational psychologist, Alison Fragale, as part of Cloudera’s Women Leaders in Tech (WLIT) series. Together, they explored what it means to lead with clarity and confidence while staying grounded in who you are. ]]></description><link>https://www.cloudera.com/blog/culture/a-conversation-with-mary-wells-and-alison-fragale-on-learning-to-lead-as-your-authentic-self.html</link><guid>https://www.cloudera.com/blog/culture/a-conversation-with-mary-wells-and-alison-fragale-on-learning-to-lead-as-your-authentic-self.html</guid><pubDate>Fri, 01 Aug 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Debbie Kruger]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/person-from-audience-talking.webp"><p>Cloudera’s Chief Marketing Officer, Mary Wells, recently hosted an energizing, candid conversation with Likeable Badass author and organizational psychologist, Alison Fragale, as part of Cloudera’s Women Leaders in Tech (WLIT) series. Together, they explored what it means to lead with clarity and confidence while staying grounded in who you are. They discussed how to balance status and power, warmth and authority, influence and intention, and how to navigate doubt and redefine ambition to understand the difference between holding power and earning respect. Packed with both inspiration and practical tools, it was a conversation about how to lead like a likeable badass. 
Here’s what you missed.</p>
<h3>What Is a “Likeable Badass”?</h3>
<p>Alison opened the conversation by unpacking the science behind her book’s title, a phrase that instantly grabbed Mary’s attention. What is a “likeable badass”? For Alison, it means showing up as both warm and assertive. She explained that the combination unlocks real influence, not by demanding it, but by earning respect through capability and care.</p>
<p>She said, “Respect comes from showing up as warm and assertive—caring and capable.”</p>
<p>Mary echoed this with her reflection on confidence and self-perception. “That internal conversation about wanting to be respected and liked is so real for many of us,” she said. “You’ve got to get your swagger and keep inspiring.”</p>
<p>Together, they reframed likability not as a trade-off with authority, but as its amplifier. Their message: credibility is built through both kindness and edge, and you don’t need to sacrifice one for the other.</p>
<h3>Balancing Authority and Warmth</h3>
<p>When Mary asked how to balance assertiveness and warmth—especially under pressure—Alison quickly acknowledged how real that challenge is. For her, the default can lean too far toward assertiveness. For others, warmth comes more easily.</p>
<p>Their shared advice was to zoom out: leadership isn’t about perfect balance in every moment but building a long-term track record of trust.</p>
<p>Alison said, “If you have a history of warmth, people will give you grace when you need to go harder. Then if you have a more assertive moment, it’s just a blip in a much longer story.”</p>
<p>Mary added, “You know when you’re doing it, so own it. Be transparent, and people will respect that.”</p>
<p>The takeaway: Respect and authenticity can go hand in hand, especially when leadership is built on intention rather than perfection.</p>
<h3>Confidence, Doubt, and the Inner Critic</h3>
<p>Mary turned the conversation to a widely resonant question: How do you keep your confidence when doubt creeps in? How do you quiet the inner critic without losing your ambition?</p>
<p>Alison offered a grounding reframe. The presence of doubt, she said, doesn’t mean you’re weak, but that you’re growing. Like muscle soreness after a good workout, it’s uncomfortable but a sign you’re getting stronger.</p>
<p>She summarized it with insight and clarity: “The doubt sits in the gap between where I am and where I’m going.”</p>
<p>Mary shared how she consciously avoids the term “impostor syndrome,” favoring the phrase “inner critic” instead. Her advice was to pause and look at the bigger picture: “If you just think about yourself five or ten years ago and where you are now, you’re like, I was hoping for this.”</p>
<p>For both leaders, confidence is built, lost, and rebuilt with every step forward, and self-doubt is a normal part of achievement.</p>
<h3>Redefining Power and Status</h3>
<p>As the conversation turned to power and status, Alison clearly distinguished between the two. She said power comes from controlling resources: money, authority, and access. Status, however, exists in perception. You can’t own it outright, but you can shape it by how you show up.</p>
<p>“Power is what you hold, and status is how others see you.”</p>
<p>Mary emphasized how status, especially for women, is often gained through contribution, but that contribution must be strategic, not invisible. It’s not about doing more; it’s about being intentional and being seen.</p>
<p>Alison underscored that with a pointed reminder: “Be of service, but do it in ways that reflect your unique value.” She urged women to think about how they offer help, not as a general act of generosity, but as a way to showcase their distinct skills and strengths.</p>
<p>Mary agreed. “We don’t want to become a doormat,” she said. “We want to show up with value that reflects our strengths.”</p>
<p>Together, they reframed service as strategy and influence as something earned through volume, visibility, and value.</p>
<h3>Practical Swagger: Tactics for Influence</h3>
<p>Alison offered two of her most practical and powerful recommendations to close the discussion: make introductions and start a ripple effect of compliments. These small, intentional actions build trust and amplify your leadership presence in ways that matter.</p>
<p>“Make introductions. Talk people up. Add value in 5 minutes.”</p>
<p>She explained how thoughtful connections can position you as a connector and how spotlighting others builds status for them and you. “People will talk you up in return, and often when you’re not in the room,” she said.</p>
<p>Mary added her practice: every year, she writes a personal list of goals—one for each year of her age—anchored by a theme. It’s an exercise in reflection and intention. Her advice: “Put that stake in the ground and work backwards.”</p>
<p>These strategies provide a framework for leading with purpose, presence, and practical impact every day.</p>
<h3>Closing Reflections</h3>
<p>This WLIT conversation reflected Cloudera’s belief that leadership can be strategic and human. Mary and Alison challenged conventional models of influence and offered a blueprint for a more inclusive and authentic future.</p>
<p>To bring their insights into everyday practice, here are a few key takeaways:</p>
<ul>
<li><p>Show up as both warm and assertive to earn real respect and influence</p>
</li>
<li><p>Build long-term trust instead of aiming for perfect balance in every moment</p>
</li>
<li><p>Treat self-doubt as a sign of growth, not a reason to hold back</p>
</li>
<li><p>Earn influence by being caring and capable, not just by holding power</p>
</li>
<li><p>Make introductions and elevate others to grow your presence and credibility</p>
</li>
</ul>
<p>As Alison concluded, “Strategy and authenticity are not opposites. Just be yourself.” That line captured it all. Influence isn’t about changing who you are; it’s about showing up with clarity, purpose, and presence. And when you do that, you don’t just lead; you inspire.</p>
<p>&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=a-conversation-with-mary-wells-and-alison-fragale-on-learning-to-lead-as-your-authentic-self</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Ensure Transparency and Build Trust with Data Lineage Automation—Here’s How</title><description><![CDATA[In today’s data-centric world, data is an organization’s most valuable asset. Yet, many struggle to maintain reliable, trustworthy data amidst complex, evolving environments. This challenge is especially critical for executives responsible for data strategy and operations.]]></description><link>https://www.cloudera.com/blog/business/ensure-transparency-and-build-trust-with-data-lineage-automation-here-s-how.html</link><guid>https://www.cloudera.com/blog/business/ensure-transparency-and-build-trust-with-data-lineage-automation-here-s-how.html</guid><pubDate>Wed, 30 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Zinette Ezra]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-468838119.jpg"><p>In today’s data-centric world, data is an organization’s most valuable asset. Yet, many struggle to maintain reliable, trustworthy data amidst complex, evolving environments. This challenge is especially critical for executives responsible for data strategy and operations.&nbsp;</p>
<p>Here’s how automated data lineage can transform these challenges into opportunities, as illustrated by the journey of a health services company we’ll call “HealthCo.”</p>
<h4>The Data Strategy</h4>
<p>Like many forward-thinking organizations, HealthCo’s leaders recognized early on that data is more than a valuable asset: it’s a strategic imperative. They put data at the forefront of their business, integrating it into decision-making processes, products, and services. By doing so, they aimed to drive innovation, optimize operations, and enhance patient care.&nbsp;</p>
<p>They invested heavily in data infrastructure and hired a talented team of data scientists and analysts. Their goal was to develop sophisticated data products, such as predictive analytics models to forecast patient needs, patient care optimization tools, and operational efficiency dashboards. These data products were intended to enhance patient outcomes, streamline hospital operations, and provide actionable insights for decision-making.&nbsp;</p>
<p>This strategic choice justified further investment into their data team, infrastructure, management, and science. HealthCo’s team envisioned a flywheel effect where the more value they derived from their data products, the more they could invest in and enhance their data capabilities.</p>
<h4>The Problem: Siloed and Inconsistent Data</h4>
<p>Despite the strategic vision, HealthCo encountered significant challenges as it scaled. The complexity of its data ecosystem became a major obstacle. The company's data team managed a diverse array of data sources and integration tools, including SQL Server, Oracle databases, and Informatica. It also used multiple BI tools such as Power BI, Tableau, MicroStrategy, and Qlik. This complex web of platforms created substantial integration and management hurdles.</p>
<p>HealthCo’s hybrid data environment offered flexibility and access to advanced tools, but also introduced significant integration challenges. Each system had its own protocols and handling methods, making it difficult to create a unified view. For instance, aligning patient care data from Oracle databases with operational metrics from Power BI was daunting without clear data lineage. Different departments managed their data independently, leading to silos and inconsistencies. This fragmentation meant patient treatment data might not align with financial records, causing conflicting insights that undermined decision-making.</p>
<p>As data inconsistencies grew, so did skepticism about the accuracy of the data. Decision-makers hesitated to rely on data-driven insights, fearing the consequences of potential errors. The deployment of new data products, such as machine learning models for predicting patient readmissions, was delayed due to fears of inaccuracies and potential negative impacts on patient care. Ensuring compliance with healthcare regulations became a daunting task. The inability to trace data lineage accurately made it difficult to demonstrate compliance during audits. This situation posed legal risks and threatened the organization’s reputation.&nbsp;</p>
<p>The lack of trust in data created inertia. Despite the potential for high-impact data-driven initiatives, HealthCo hesitated to deploy data products directly to healthcare providers and patients, fearing the high risk of data inaccuracies. This hesitation impeded their ability to fully leverage their data investments and improve patient care.</p>
<h4>The Solution: Automated Data Lineage</h4>
<p>Automated data lineage solved these challenges, offering comprehensive, end-to-end visibility of data flow across all systems. For HealthCo, this meant that stakeholders could finally see how data moved from its source through various transformations to its final destination. This visibility was crucial for quickly identifying and rectifying data quality issues, ensuring consistent and reliable insights. By mapping out data lineage, HealthCo broke down data silos, enabling a unified approach to data management. This led to better integration and consistency across the organization. For example, operational efficiency metrics could now be directly correlated with patient outcomes, providing a holistic view that was previously unattainable.</p>
<p>Accurate data lineage rebuilt trust among decision-makers. HealthCo’s leadership could confidently rely on data-driven insights, knowing the data’s journey was well-documented and reliable. This trust empowered them to deploy new data products without fear of inaccuracies, driving innovation and operational improvements.&nbsp;</p>
<p>Automated data lineage also made it easier to track data processes and demonstrate compliance with healthcare regulations. During audits, HealthCo could clearly show how data was handled and processed, reducing the risk of non-compliance penalties. This protected the organization legally and also reinforced its commitment to high standards of data governance.</p>
<p>By embracing automated, multidimensional data lineage, HealthCo maintained a consistent and reliable data environment across its hybrid systems. Solving the data lineage problem directly supported its data products by ensuring data integrity and reliability. Predictive analytics models became more accurate as they were based on trustworthy data flows. Patient care optimization tools could pull consistent and integrated data from multiple sources, leading to more effective treatment plans. Operational efficiency dashboards could now provide real-time, accurate insights into hospital operations, enabling better decision-making.</p>
<h4>Make Data Your Most Trusted Asset with Cloudera Octopai Data Lineage</h4>
<p>This is where <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> excels. Cloudera’s metadata management solution simplifies the process of managing data lineage by providing automated, multidimensional mapping that integrates seamlessly with complex hybrid data landscapes. Cloudera Octopai Data Lineage provides deep visibility into the transformation processes performed by other dedicated tools (such as dbt, Informatica, Talend, SSIS, or custom SQL scripts) and automatically maps and analyzes these external transformation workflows, making them understandable, traceable, and manageable.</p>
<p>The Cloudera Octopai Data Lineage workspace provides a single pane for organizations to discover, understand, govern, and trust their data across the entire data estate, from on-premises to multi-cloud environments. It’s designed to empower data practitioners, business users, and data stewards to confidently leverage data for analytics and AI, ensuring that organizations can maintain data accuracy, rebuild trust, and drive innovation. This solution helps businesses bridge the gap between data strategy and operational reality, turning data into a reliable, strategic asset that powers success.</p>
<p>For executives responsible for data strategy and operations, ensuring data reliability and compliance while navigating a complex data ecosystem is a formidable challenge. Automated data lineage is essential in addressing these challenges, and Cloudera’s solution makes it both achievable and manageable. With automated data lineage, organizations can unlock the full potential of their data, transforming it into a powerful asset that drives their success.</p>
<p>To see for yourself how <a href="/content/www/en-us/products/unified-data-fabric/data-lineage.html">Cloudera Octopai Data Lineage</a> enables complete data traceability and builds trust in enterprise data, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">schedule a demo</a>!</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=ensure-transparency-and-build-trust-with-data-lineage-automation-here-s-how</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>#ClouderaLife Employee Spotlight: Meet Charles Aad, Cloudera’s Senior Industry Solutions Engineer</title><description><![CDATA[Between global customer visits, raising two young children, and attending classes in three countries, Charles Aad earned an Executive MBA from one of the world’s top programs — without pressing pause on his day job.]]></description><link>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-charles-aad-clouderas-senior-industry-solutions-engineer.html</link><guid>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-charles-aad-clouderas-senior-industry-solutions-engineer.html</guid><pubDate>Tue, 29 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1225383258.jpg"><p><b>Between global customer visits, raising two young children, and attending classes in three countries, Charles Aad earned an Executive MBA from one of the world’s top programs — without pressing pause on his day job.</b></p>
<p>At Cloudera, trust and growth are core tenets of our ethos. That’s why we support our employees’ goals and do what we can to help them advance their careers, both within the company and beyond.</p>
<p>Whether it’s pursuing advanced education, participating in <a href="/content/www/en-us/blog/culture/clouderalife-employee-spotlight-meet-orla-mccarthy-cloudera-s-vice-president-of-professional-services-emea.html">skill development training</a>, or finding purpose through our <a href="/content/www/en-us/blog/culture/celebrating-a-busy-week-of-giving-at-cloudera.html">community initiatives</a> around the globe, we’re committed to meeting people at every stage of their growth. Through our culture of connection, Clouderans continue to foster an environment that values development at every step.</p>
<p>Charles Aad reflects that culture. A seasoned data professional based in France, Charles recently completed a top-tier Executive MBA while managing major customer accounts, traveling internationally for collaborative trainings, and raising a family. His journey is one of ambition, balance, and impact.</p>
<p>Let’s meet Charles Aad and explore how Cloudera supported his personal and professional growth.</p>
<h3>Meet Charles Aad</h3>
<p>Charles Aad’s decision to join Cloudera traces back to a memorable moment earlier in his career. While working with a customer on a data-solution project at a previous company, he heard firsthand how highly the customer spoke of Cloudera, praising the company’s agility, customer-centric mindset, and industry-focused use cases. So, when he found an opportunity to join Cloudera, the choice felt natural. “Cloudera was at the top of my list,” said Charles.</p>
<p>As a Senior Industry Solutions Engineer, Charles helps leading enterprises solve complex data challenges, from modernizing platforms to unlocking AI-powered services. Based in France and part of the EMEA team, he’s known for balancing technical expertise with business insight and for always keeping the customer at the center.</p>
<p>Whether helping a bank improve credit scoring applications or enabling a company to launch new digital services, Charles focuses on delivering outcomes that matter.</p>
<p>That mindset is especially critical as AI continues to reshape the industry. Charles works closely with clients to translate emerging technologies into meaningful, personalized use cases.</p>
<p>“For me, it’s not just about delivering technology,” Charles said. “It’s about helping customers do something useful with it and making sure they can turn it into real value for their own customers.”</p>
<h3>How Cloudera Helped Make an Executive MBA Possible</h3>
<p>When Charles decided to pursue an Executive MBA at HEC Paris, a globally top-ranked business school, it wasn’t just a personal milestone. It was also a reflection of Cloudera’s commitment to growth, flexibility, and real support for its people. Over 18 months, Charles joined a cohort of 57 professionals representing 27 nationalities, with immersive modules in France, Germany, and the United States.</p>
<p>It was an intense commitment alongside his full-time role and family life, but Cloudera made it possible. Through its education assistance policies, Charles received financial backing and time away to attend classes. His managers helped make room in his schedule, and he took advantage of unplugged days—an internal benefit that allows employees to disconnect for personal growth.</p>
<p>“Without the support of Cloudera, it wouldn’t have been possible,” Charles shared. “Executive MBA programs are usually very tough to pursue, but with Cloudera’s help, I had fewer things to worry about.”</p>
<p>Charles describes his colleagues as people who genuinely care about each other’s successes and are willing to take real measures to make their dreams possible. His MBA journey felt like climbing a mountain with the right people alongside him. “People guided me the whole way, some doing the heavy lifting for me when I needed it.”</p>
<h3>Growing Through Travel, Training, and Global Collaboration</h3>
<p>At Cloudera, collaboration extends across borders. As part of the EMEA team, Charles has worked with colleagues and customers worldwide, from New Zealand to North America, sharing ideas and learning from diverse experiences.</p>
<p>Last May, he attended a training course in Amsterdam, where Cloudera brought together team members from across regions. The training went beyond technical content and was designed to connect people around real customer stories and shared use cases.</p>
<p>“It was truly eye-opening,” he said. “You weren’t just learning the material but learning how others think, work with their customers, and solve problems.”</p>
<p>The global perspective continues to shape how Charles approaches collaboration and customer impact today.</p>
<h3>A Culture of Belonging, Mentorship, and Impact</h3>
<p>For Charles, Cloudera’s people-first mindset stands above the rest. “From day one at Cloudera, you feel like part of a community. You’re a team player, and together, we strive for success.”</p>
<p>Charles credits Cloudera’s open-door policy across the organization, which creates space for people to ask questions, gain new perspectives, and collaborate openly. Cloudera embeds mentorship and leadership support at every level. “You can talk to anyone,” he explained. “People are humble, open, and willing to help.”</p>
<p>That same spirit of support extends beyond the workplace. One initiative close to Charles’ heart is Cloudera’s annual <a href="/content/www/en-us/blog/culture/cloudera-week-of-giving-recap.html">Week of Giving</a>, a company-wide program that empowers employees to give back to their communities. Employees can volunteer during work hours or donate to causes they care about, and Cloudera reinforces its commitment by matching employee contributions. For Charles, it reflects a culture where giving and growing go hand in hand.</p>
<h3>Outside the Office</h3>
<p>A lifelong learner, Charles often spends his train rides reading leadership and coaching books. One of his recent favorites is Trillion Dollar Coach.</p>
<p>Outside of work, Charles cherishes time with his wife and two young children. “When you have kids, they are your number one hobby,” he said. He also enjoys exploring the scenic routes around Paris, especially running the stretch from Saint-Cloud to Versailles.</p>
<h3>Closing Thoughts</h3>
<p>Charles’ journey offers an inspiring example of the supportive and people-first culture at Cloudera. He’s proof that if you’re willing to take initiative and work with your team, Cloudera is the place to find meaningful growth and connections that support your long-term career.&nbsp;</p>
<p>Hear from another <a href="/content/www/en-us/blog/culture/clouderalife-employee-spotlight-meet-susan-wulf-clouderas-senior-director-of-learning-and-enrichment.html">Clouderan</a> and explore career <a href="/content/www/en-us/careers.html">opportunities</a> at Cloudera.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderalife-employee-spotlight-meet-charles-aad-clouderas-senior-industry-solutions-engineer</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>It’s Not About Being First: Building AI That Works</title><description><![CDATA[Jack Kennedy, CTO and co-founder of Whippy, joined The AI Forecast to discuss the evolution of Whippy from a scrappy consulting operation to a fully productized platform, the surprising advantages of building with niche programming languages, and why speed—not just scale—is a critical differentiator in enterprise AI.]]></description><link>https://www.cloudera.com/blog/business/its-not-about-being-first-building-ai-that-works.html</link><guid>https://www.cloudera.com/blog/business/its-not-about-being-first-building-ai-that-works.html</guid><pubDate>Fri, 25 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-woman-reading-tablet.webp"><p>AI agents are rapidly emerging as one of the most powerful tools in the enterprise AI toolkit. However, for organizations looking beyond experimentation, the real challenge lies in deploying them effectively and scaling to deliver lasting value. How do enterprises unlock that value quickly? And how can they trust automation enough to implement it as a core part of their operations?</p>
<p>Jack Kennedy, CTO and co-founder of Whippy, joined The AI Forecast to discuss the evolution of Whippy from a scrappy consulting operation to a fully productized platform, the surprising advantages of building with niche programming languages, and why speed—not just scale—is a critical differentiator in enterprise AI.</p>
<p>Here are some key takeaways from that conversation.</p>
<h3>Scale Doesn’t Matter If You’re Wrong</h3>
<p>Paul: The pace of change in AI has been rapid, and as a CTO, part of your role is managing that, deciding when to lean in versus when to hold back. So, how do you make good decisions about where to apply different models? When do you replace what you’ve already built to make it better, and when do you just say this is good enough for now?</p>
<p>Jack: My principle from the start has been to stay on top of what’s new, but remember it’s better to be right than first. It doesn’t matter what you do as long as the last thing you do is correct. That mindset helps us avoid chasing hype and stay focused on building things that actually work.</p>
<p>Paul: How do you figure out what’s worth scaling versus where a human is still needed? What’s your framework for deciding what AI should take on and what it shouldn’t?<br>
<br>
Jack: We work with a lot of staffing and recruiting agencies. There’s a role like a phone recruiter that churns at 125 percent a year. That’s a great candidate for automation. AI is better than a human in that case: it speaks every language, works 24/7, and documents everything directly into your CRM. But there are parts of the process, like convincing a candidate to take the job, where the human touch still matters. That’s not something we try to automate. You have to be selective about where you scale and where you don’t. Not everything should scale, even if it can.</p>
<h3>Rein in the Dreamers and Empower the Pragmatists</h3>
<p>Paul: AI has changed fast. How do you think customers react, especially regarding fear versus excitement?</p>
<p>Jack: There are three buckets of people — the Dreamers, the Pragmatists, and the Skeptics.</p>
<p>First, the Dreamers. They want to automate everything and sit on a beach while the business runs. They’ll ask, “Can we do this?” and I’ll say, you can automate it, but should you? Technically, yes, you can automate many job responsibilities. However, once we discuss what it takes to maintain that kind of system, and how valuable the human touch still is in many parts of the business, they usually start to rethink their approach.</p>
<p>Then there are the Pragmatic users. These are the people who handle ten tasks a day and recognize that two of them could be automated. They’ve already experimented with tools like ChatGPT and understand the potential. They are practical, open to change, and often become internal champions who help set the stage for broader adoption.</p>
<p>These are the people we focus on first. They’re the ones who help you scale AI responsibly. If you empower the Pragmatists and guide the Dreamers toward reality, you create a solid foundation. That makes it much easier to bring along the third group, the Skeptics.</p>
<h3>Win Over the Skeptics by Starting Small</h3>
<p>Paul: What advice do you have for people considering building AI apps, especially given the many tech stack options and the pressure to move quickly?</p>
<p>Jack: If you’re in a company where people are either scared of AI or eager to automate everything, you need to be the voice of reason in the middle. As mentioned earlier, the third bucket of users is the Skeptics. These are the ones who are hesitant or cautious about adopting AI. For them, the best approach is to start with one low-risk, clearly worthwhile project.</p>
<p>One client was getting 100 after-hours calls in Spanish that no one returned. We suggested capturing those voicemails, translating them, and emailing transcripts to the team. It was a low-risk change that didn’t interfere with existing systems, but it immediately improved the customer experience and helped bring the skeptics on board. It showed that AI could create real value without forcing a big leap.</p>
<h3>Future-Proof Your Stack by Staying Flexible</h3>
<p>Paul: What should leaders consider regarding long-term flexibility when choosing between APIs and models?</p>
<p>Jack: We’ve been provider-agnostic since day one, and that has been important. We separate data storage from model usage. We store the data we pass into the model, store the data it gives back, and log which provider it came from. That way, we can rerun something if it fails, switch providers when better ones come along, and reuse the same data without rebuilding everything.<br>
<br>
I’d be cautious about tools that don’t let you take your data out or track what’s going on. Those are the risky ones. There’s a whole wave of LLM proxies now that let you plug multiple providers into one dashboard, which is useful. But even with tools like that, you still need your own logging and architecture. If you own your data and track everything, you’ll be in a strong position no matter where the ecosystem goes.<br>
<br>
Catch the full conversation with Jack Kennedy on The AI Forecast on <a href="https://open.spotify.com/episode/0Qw935vZ6bhjtyd6WzyZGx?si=fbcf8787a24f4dbb">Spotify</a>, <a href="https://podcasts.apple.com/us/podcast/from-concept-to-capability-making-ai-actually-useful/id1779293119?i=1000714485135">Apple Podcasts</a>, and <a href="https://youtu.be/kOMPm5EX-pM?si=74kyiT1XmeaovoAK">YouTube</a>.</p>
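<p>The provider-agnostic logging Jack describes can be sketched in a few lines of Python. This is an illustration only, not Whippy’s implementation: <code>CallLog</code>, <code>CallRecord</code>, and the two stand-in providers are hypothetical names, and a real system would persist records to a database rather than keep them in memory.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a provider is any function that maps a prompt to a completion.
Provider = Callable[[str], str]


@dataclass
class CallRecord:
    """One model call: what went in, what came out, and who answered."""
    prompt: str
    response: str
    provider: str


@dataclass
class CallLog:
    """Keeps model inputs and outputs in our own store, separate from any provider."""
    providers: Dict[str, Provider]
    records: List[CallRecord] = field(default_factory=list)

    def call(self, provider: str, prompt: str) -> str:
        # Run the prompt through the named provider and record the exchange.
        response = self.providers[provider](prompt)
        self.records.append(CallRecord(prompt, response, provider))
        return response

    def rerun(self, index: int, provider: str) -> str:
        # Replay a stored prompt, possibly against a different provider.
        return self.call(provider, self.records[index].prompt)


# Stand-in providers for illustration; real ones would wrap vendor SDK calls.
log = CallLog(providers={
    "provider_a": lambda prompt: "A: " + prompt.upper(),
    "provider_b": lambda prompt: "B: " + prompt.lower(),
})
log.call("provider_a", "Summarize this voicemail")
log.rerun(0, "provider_b")
```

<p>Because the prompt is stored independently of the provider that answered it, switching providers or retrying a failed call only needs the stored record, which is the flexibility described above.</p>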
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=its-not-about-being-first-building-ai-that-works</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Self-Service vs. Centralized Data Management: How to Leverage Data Lineage to Empower and Control</title><description><![CDATA[Cloudera Octopai Data Lineage can assist organizations in both self-service and centralized data management scenarios. It’s a robust, automated data lineage solution that helps organizations gain visibility and control over their data assets.]]></description><link>https://www.cloudera.com/blog/technical/self-service-vs-centralized-data-management-how-to-leverage-data-lineage-to-empower-and-control.html</link><guid>https://www.cloudera.com/blog/technical/self-service-vs-centralized-data-management-how-to-leverage-data-lineage-to-empower-and-control.html</guid><pubDate>Fri, 25 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Noam Shaby]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-people-talking-meeting.webp"><p>In the era of big data, organizations are grappling with the challenge of effectively managing and leveraging vast amounts of data. Two prominent approaches have emerged: self-service data management and centralized data management. Each approach has its own merits and trade-offs, and understanding how to leverage data lineage within these frameworks can empower organizations and provide crucial control over their data assets.</p>
<h3>What is Self-Service Data Management?</h3>
<p>Self-service data management enables business users to have direct access to and control over their data. It allows them to explore, manipulate, and analyze data without heavy reliance on IT or data specialists. This approach promotes agility and empowers business users to make faster, data-driven decisions. However, self-service data management also comes with risks, including data inconsistencies, lack of governance, and potential security vulnerabilities.</p>
<h3>What is Centralized Data Management?</h3>
<p>Centralized data management emphasizes a more structured and governed approach. Data is managed and controlled by a dedicated team of data professionals, ensuring data quality, security, and compliance. This approach offers greater control and reduces the risk of data inconsistencies. However, it may introduce bottlenecks and hinder agility as business users need to rely on the central team for accessing and analyzing data.</p>
<h2>Data Lineage Offers the Best of Self-Service and Centralized Data Management</h2>
<p>Organizations can leverage data lineage to strike a balance between the benefits of self-service and centralized data management. Data lineage—automated tracking of data sources, transformations, and movements—provides a comprehensive view of how data flows through the organization, from its origin to its consumption. It helps trace the data’s journey and dependencies, enabling users to understand the context, quality, and reliability of the data.</p>
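<p>To make the idea concrete, lineage can be pictured as a graph in which each dataset points to the datasets it was derived from. The Python sketch below is a generic illustration with made-up dataset names; it does not represent how any particular lineage product stores or traverses its metadata.</p>

```python
from collections import deque

# Hypothetical lineage edges: each dataset maps to the datasets it is derived from.
UPSTREAM = {
    "sales_dashboard": ["sales_summary"],
    "sales_summary": ["orders_clean", "customers"],
    "orders_clean": ["orders_raw"],
    "orders_raw": [],
    "customers": [],
}


def trace_origins(dataset, upstream):
    """Return every dataset that feeds into `dataset`, nearest first."""
    seen = []
    queue = deque(upstream.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(upstream.get(current, []))
    return seen


origins = trace_origins("sales_dashboard", UPSTREAM)
```

<p>Walking the graph in the opposite direction, from a source to everything derived from it, answers the impact question: which reports are affected if this table changes?</p>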
<h3>Strike a Balance Between Agility and Control with Cloudera Octopai Data Lineage</h3>
<p>Cloudera Octopai Data Lineage can assist organizations in both self-service and centralized data management scenarios. It’s a robust, automated data lineage solution that helps organizations gain visibility and control over their data assets.&nbsp;</p>
<p>By leveraging Cloudera Octopai Data Lineage, organizations can empower business users while maintaining control and governance over their data assets. Cloudera Octopai Data Lineage’s intuitive interface and automation features make it accessible to both technical and non-technical users, fostering collaboration and supporting data-driven decision-making across the organization. Business users can explore and understand data lineage, regardless of their technical expertise, ensuring they’re working with accurate and reliable data.</p>
<ul>
<li><p>In a self-service data management context, Cloudera Octopai Data Lineage capabilities empower business users to navigate through complex data ecosystems. They can easily identify data sources, understand data transformations, and assess the impact of any of their changes. This visibility reduces the risk of making decisions based on inaccurate or outdated information, enabling business users to make more informed choices and drive better outcomes.</p>
</li>
</ul>
<ul>
<li><p>In a centralized data management framework, Cloudera Octopai Data Lineage enables data professionals to establish and enforce data governance policies effectively. By providing a comprehensive view of data lineage, Cloudera Octopai Data Lineage helps data teams understand how data is being used, who has access to it, and whether it complies with regulatory requirements. This insight enhances data governance practices, improves data quality, and ensures compliance across the organization.</p>
</li>
</ul>
<p>In conclusion, Cloudera Octopai Data Lineage offers a powerful solution for both self-service and centralized data management scenarios. Whether enabling business users to explore and analyze data or empowering data professionals to enforce governance policies, Cloudera Octopai Data Lineage capabilities can help organizations unlock the full potential of their data assets in a controlled and empowered manner.</p>
<p>If you’re interested in learning how Cloudera Octopai Data Lineage enables complete data traceability and builds trust in enterprise data, <a href="/content/www/en-us/products/cloudera-data-platform/cdp-demos.html">schedule a demo</a>! Discover how to transform your data flows for greater efficiency, compliance, and control.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=self-service-vs-centralized-data-management-how-to-leverage-data-lineage-to-empower-and-control</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Introducing Cloudera Surveyor: Your New Go-To for Seamless Kafka Management </title><description><![CDATA[We’re thrilled to announce the launch of Cloudera Surveyor  for Apache Kafka, a powerful new tool designed to simplify the management and monitoring of your Kafka clusters! ]]></description><link>https://www.cloudera.com/blog/technical/introducing-cloudera-surveyor-your-new-go-to-for-seamless-kafka-management.html</link><guid>https://www.cloudera.com/blog/technical/introducing-cloudera-surveyor-your-new-go-to-for-seamless-kafka-management.html</guid><pubDate>Thu, 24 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Gary Gaur]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty986494584.jpg"><p>We’re thrilled to announce the launch of <a href="https://docs.cloudera.com/csm-operator/1.4/surveyor-overview/topics/csm-op-surveyor-overview.html" target="_blank">Cloudera Surveyor</a>&nbsp; for Apache Kafka, a powerful new tool designed to simplify the management and monitoring of your Kafka clusters! Previously, managing Kafka meant navigating complex <a href="https://aws.amazon.com/what-is/cli/" target="_blank">command-line interfaces (CLIs)</a> or dealing with outdated tools, resulting in a lack of a cohesive view of your environment due to disparate components or accumulated technical debt. That's why we built Surveyor with a modern architecture focused on intuitive design, scalability, and robust security, providing an out-of-the-box, unified management experience.</p>
<p>Surveyor is engineered to provide a centralized, intuitive, scalable, and secure platform for managing Apache Kafka clusters efficiently through a graphical interface, reducing the reliance on the Kafka CLI. What sets Surveyor apart is its broad compatibility: it can connect to and manage any Kafka distribution that provides an API compatible with Apache Kafka 2.4.1 or higher. This includes Kafka clusters deployed with the <a href="https://docs.cloudera.com/csm-operator/1.0/kafka-deploy-configure/topics/csm-op-configuring-cluster-operator.html">Strimzi Cluster Operator in Cloudera</a>, <a href="https://docs.cloudera.com/csm-operator/1.2/index.html">Streams Messaging Kubernetes Operator</a>, Kafka clusters running in Cloudera on-premises and Cloudera on cloud, as well as third-party Kafka distributions.</p>
<p><small><i>Figure 1: Visualizing your Kafka clusters with ease: The Cloudera Surveyor UI provides a clear overview and streamlined management capabilities.</i></small></p>
<h4>The Surveyor Advantage: Overcoming Kafka Management Hurdles</h4>
<p>Surveyor is engineered to address key challenges faced by Kafka users and administrators, designed specifically to fill the management gap with a truly modern solution:</p>
<ul>
<li><p>Intuitive graphical interface: Say goodbye to the command line for everyday tasks. Surveyor provides a user-friendly UI for managing your Kafka clusters, topics, and consumer groups, making administration tasks much more efficient.</p>
</li>
</ul>
<ul>
<li><p>Multi-cluster support: Manage all your Kafka clusters from a single, centralized interface. Surveyor allows you to add, remove, and switch between clusters with ease, and even assign arbitrary tags for better organization and filtering.</p>
</li>
</ul>
<ul>
<li><p>Actionable health and performance insights: Gain deep visibility into your Kafka environment. Surveyor offers real-time insights into cluster health, performance, and key metrics, helping you quickly identify and address issues across brokers, topics, partitions, replicas, and consumer groups.</p>
</li>
</ul>
<ul>
<li><p>Enhanced security: Security is paramount. Surveyor adopts a secure-first approach, with TLS enabled by default for encrypted communication and robust LDAP authentication support. It also enforces Kafka ACLs to ensure users only perform actions they are authorized to do.</p>
</li>
</ul>
<ul>
<li><p>Scalability for enterprise deployments: Engineered for large-scale production environments, Surveyor can handle hundreds of Kafka clusters, each with hundreds of brokers and thousands of topics, ensuring it grows with your needs.</p>
</li>
</ul>
<ul>
<li><p>Broad Kafka compatibility: Manage diverse Kafka deployments from a single pane of glass, whether they’re on-premises, in the cloud, or on Kubernetes.</p>
</li>
</ul>
<h4>Insights in Action: An Example of Surveyor Success</h4>
<p>Consider the following use case: Sarah is a Kafka administrator who manages over 20 diverse Kafka clusters deployed across on-premises data centers, public clouds, and Kubernetes environments. While reviewing her centralized Surveyor dashboard, she notices a “Critical” health status flagged on her primary production cluster.&nbsp;</p>
<p>Instead of Sarah needing to sift through countless logs spread across different environments or running multiple CLI commands for each cluster, Surveyor highlights an “Offline Partition” alert. This directs Sarah to check the broker's logs and address a disk failure. A quick drill-down, guided by Surveyor's intuitive visual indicators, reveals the specific topic and partition experiencing issues, along with the affected broker.&nbsp;</p>
<p>Thanks to Surveyor's real-time, centralized insights and clear visual cues, Sarah quickly pinpoints the problem, takes corrective action, and restores service much faster than before, significantly minimizing the impact on her organization's critical data streams.</p>
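<p>For readers curious what sits behind an “Offline Partition” alert: in Apache Kafka, a partition is offline when no broker is currently acting as its leader, which cluster metadata reports as a leader ID of -1. The Python sketch below applies that check to a hypothetical metadata snapshot. It is not Surveyor’s implementation, and the field names are assumptions made for illustration.</p>

```python
# Hypothetical snapshot of partition metadata; a leader of -1 means no broker leads it.
NO_LEADER = -1

partitions = [
    {"topic": "orders", "partition": 0, "leader": 1, "replicas": [1, 2, 3]},
    {"topic": "orders", "partition": 1, "leader": NO_LEADER, "replicas": [2, 3, 1]},
    {"topic": "payments", "partition": 0, "leader": 3, "replicas": [3, 1, 2]},
]


def offline_partitions(metadata):
    """Return (topic, partition, preferred broker) for partitions without a leader."""
    return [
        (p["topic"], p["partition"], p["replicas"][0])
        for p in metadata
        if p["leader"] == NO_LEADER
    ]


alerts = offline_partitions(partitions)
```

<p>The first broker in a partition’s replica list is its preferred leader, so it is a reasonable first place to look when a partition goes offline.</p>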
<h4>Core and Out-of-Box Component of Cloudera Streams Messaging</h4>
<p>Surveyor is a key component of the newly released <a href="https://docs.cloudera.com/csm-operator/1.4/index.html" target="_blank" rel="noopener noreferrer">Cloudera Streams Messaging - Kubernetes Operator 1.4.0</a>. With this release, we’re integrating Surveyor directly into your Kubernetes environments, enhancing your ability to deploy and manage Kafka and related components on existing, shared Kubernetes infrastructure, completely eliminating the need for dedicated setups. This delivers flexible, agile, and rapid deployment and scaling for variable workloads, standardization on existing Kubernetes infrastructure, and operational efficiency through simple upgrades and swift cluster creation.&nbsp;&nbsp;</p>
<p>Cloudera Streams Messaging is deployed primarily via the Cloudera platform and the Cloudera Streams Messaging Operator on Kubernetes. Cloudera provides a managed, integrated platform with Cloudera Streams Messaging as a core service, ideal for comprehensive data lifecycle management. The <a href="https://docs.cloudera.com/csm-operator/1.4/index.html" target="_blank" rel="noopener noreferrer">Cloudera Streams Messaging - Kubernetes Operator</a>, on the other hand, allows native deployment and management of Kafka components directly on Kubernetes, offering a more agile, cloud-native approach for those prioritizing Kubernetes-based workflows.&nbsp;</p>
<p>Beyond messaging with Kafka, Cloudera's broader <a href="https://www.cloudera.com/products/stream-processing.html?tab=1" target="_blank" rel="noopener noreferrer">Data in Motion portfolio</a> includes robust offerings for data ingestion and transformation with Apache NiFi and powerful stream processing and analytics with Apache Flink.&nbsp;</p>
<h4>Ready to Experience Seamless Kafka Management?</h4>
<p><a href="https://docs.cloudera.com/csm-operator/1.4/surveyor-overview/topics/csm-op-surveyor-overview.html" target="_blank" rel="noopener noreferrer">Cloudera Surveyor</a> for Apache Kafka is proprietary software developed by Cloudera, ensuring it meets our high standards for performance, reliability, and support. We’re incredibly excited about Surveyor and its potential to transform how you manage your Kafka deployments.</p>
<p>Dive deeper into its capabilities and see it in action:</p>
<ul>
<li><p>Explore the documentation: Learn more about deploying, configuring, and using Cloudera Surveyor for Apache Kafka by visiting our <a href="https://docs.cloudera.com/csm-operator/1.4/surveyor-overview/topics/csm-op-surveyor-overview.html" target="_blank" rel="noopener noreferrer">official documentation</a>.</p>
</li>
</ul>
<ul>
<li><p>Connect with the community: Join the conversation, ask questions, and share your experiences with Cloudera Surveyor on our Community <a href="https://community.cloudera.com/t5/What-s-New-Cloudera/Announcing-Cloudera-Surveyor-with-Cloudera-Streams-Messaging/ba-p/411591" target="_blank" rel="noopener noreferrer">announcement post</a>.</p>
</li>
</ul>
<p>Get ready for a more efficient, intuitive, and secure Kafka experience!</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=introducing-cloudera-surveyor-your-new-go-to-for-seamless-kafka-management</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Unlocking the Benefits of Apache Impala</title><description><![CDATA[High-Performance SQL with Cloudera and Apache Impala]]></description><link>https://www.cloudera.com/blog/technical/unlocking-the-benefits-of-apache-impala.html</link><guid>https://www.cloudera.com/blog/technical/unlocking-the-benefits-of-apache-impala.html</guid><pubDate>Tue, 22 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty913004178.jpg"><h2>High-Performance SQL with Cloudera and Apache Impala</h2>
<p>Today's business leaders understand the value of leveraging data to make real-time business decisions. What’s not always so clear is how to get from point A to point B. They’re dealing with overwhelming volumes of data stored across data centers and clouds and need to access, analyze, and derive meaningful insight from it securely, accurately, and efficiently.&nbsp;</p>
<p>Speed is another factor for organizations trying to query and analyze huge volumes of data. As datasets grow to a massive scale, the higher latency and processing time of batch-processing frameworks can keep an organization from achieving real-time insights. One technology that helps enable faster insights is Apache Impala, an open-source SQL query engine designed for high-performance analytics on big data. Impala reduces query run time in part by keeping a cluster of coordinators and many executors online at all times, so when a user submits a SQL query, Impala can start work on it right away.</p>
<p>In contrast with batch processing systems, Apache Impala <a href="https://docs.cloudera.com/runtime/7.3.1/impala-overview/topics/impala-overview.html" target="_blank">leverages</a> caching technology to catalog data and metadata, allowing it to be used for interactive analytic workloads and offering a step in the right direction toward real-time data-driven decision-making.</p>
<h4>The Apache Impala Difference</h4>
<p>Apache Impala is a <a href="https://impala.apache.org/docs/build/html/topics/impala_intro.html" target="_blank">distributed</a>, massively parallel processing (MPP)-style database engine. It provides high-performance, low-latency SQL queries and the ability to query high volumes of data in Apache Hadoop.</p>
<p>Key benefits of Apache Impala include:</p>
<ul>
<li><p><b>Reduced complexity:</b> Apache Impala offers a single system for big data processing and analytics. That means organizations can avoid complex and costly modeling and ETL for their analytics.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Positive user experience:</b> Impala uses the same unified storage platform, metadata, SQL syntax, ODBC driver, and user interface (UI) as Apache Hive. This means that data scientists and analysts will be familiar with the SQL interface, simplifying the query process and making integrations easier.&nbsp;</p>
</li>
</ul>
<ul>
<li><p><b>Cost effective:</b> Given the amount of data at hand—often well into the terabytes—cost effectiveness is a major business consideration. Apache Impala delivers SQL queries in a cluster environment, making scaling simple and convenient, while reducing overall costs.&nbsp;</p>
</li>
</ul>
<p>Through these elements, users gain a unified and familiar platform to handle both real-time and batch-oriented queries. With all that said, what does something like Impala look like in practice? A recent bit of competition offers a glimpse into just how powerful the tool can be in the real world.&nbsp;</p>
<h4>Exploring the Impact of Impala&nbsp;</h4>
<p>Recently, Apache Impala’s capabilities were put to the test in the “One Trillion Row” <a href="https://medium.com/coiled-hq/one-trillion-row-challenge-5bfd4c3b8aef" target="_blank">challenge</a>, where the tool was evaluated for its file scanning and aggregation performance against a massive dataset: one trillion records containing temperature measurement data spread across 100,000 files, totaling around 2.4 TB.</p>
<p>Impala handled the challenge with ease; all it took was a simple SQL query. The challenge showed that using Apache Impala to run queries on vast datasets can yield significant savings in both cost and time.</p>
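<p>For a sense of what such a query looks like, the sketch below assumes the challenge asks for the minimum, mean, and maximum temperature per weather station. The table and column names are hypothetical, and Python’s built-in <code>sqlite3</code> module stands in for Impala so the example runs anywhere; it says nothing about Impala’s performance or the exact query used.</p>

```python
import sqlite3

# Hypothetical table and column names; sqlite3 stands in for Impala here.
QUERY = """
SELECT station, MIN(measure), AVG(measure), MAX(measure)
FROM measurements
GROUP BY station
ORDER BY station
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (station TEXT, measure REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("Oslo", -3.0), ("Oslo", 5.0), ("Cairo", 30.0), ("Cairo", 34.0)],
)
rows = conn.execute(QUERY).fetchall()
conn.close()
```

<p>The aggregation itself is a single <code>GROUP BY</code>; the difficulty at trillion-row scale lies in scanning the files quickly, which is the work a distributed engine spreads across its executors.</p>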
<h4>High-Performance SQL with Cloudera and Apache Impala</h4>
<p>At a time when data volumes are soaring and the ability to deliver real-time data insights is critical to success, it’s important for business leaders to choose the right tool for the job. Cloudera’s platform gives organizations a powerful means to manage and analyze data in real time and at rapidly increasing scale.&nbsp;</p>
<p>With support for Apache Impala, integration of a powerful open table format like Apache Iceberg, and the flexibility of an open data lakehouse, Cloudera ensures data can be managed easily, remain secure and compliant, and still be accessed and queried quickly.&nbsp;</p>
<p>To learn more about how your organization can take advantage of Apache Impala with Cloudera, here are a few next steps you can take:</p>
<ul>
<li><p><a href="https://docs.cloudera.com/runtime/7.3.1/impala-overview/topics/impala-overview.html" target="_blank">Review the technical documentation</a> of Cloudera and Apache Impala</p>
</li>
<li><p><a href="https://www.cloudera.com/contact-sales.html" target="_blank">Contact us</a> to speak directly with a member of our sales team</p>
</li>
<li><p>Start a <a href="https://www.cloudera.com/products/cloudera-public-cloud-trial.html" target="_blank">5-day free trial</a> of Cloudera solutions</p>
</li>
</ul>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=unlocking-the-benefits-of-apache-impala</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Ready to Scale: Tackling the Top Challenges of Agentic AI Adoption </title><description><![CDATA[Agentic AI is the next step in enterprise automation. Unlike traditional assistants or chatbots, these agents are autonomous systems that can reason, plan, and act, making complex decisions in real-time without human prompting.]]></description><link>https://www.cloudera.com/blog/business/ready-to-scale-tackling-the-top-challenges-of-agentic-ai-adoption.html</link><guid>https://www.cloudera.com/blog/business/ready-to-scale-tackling-the-top-challenges-of-agentic-ai-adoption.html</guid><pubDate>Mon, 21 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1134688471.jpg"><h3>What is Agentic AI—and Why It’s Gaining Ground&nbsp;</h3>
<p>Agentic AI is the next step in enterprise automation. Unlike traditional assistants or chatbots, these agents are autonomous systems that can reason, plan, and act, making complex decisions in real-time without human prompting. Whether it’s rerouting supply chains, supporting diagnostic assistance, or flagging financial risk, agents are already changing how businesses operate.&nbsp;</p>
<p>This shift isn’t hypothetical. In Cloudera’s 2025 global <a href="https://www.cloudera.com/campaign/the-future-of-enterprise-ai-agents.html">survey</a> of nearly 1,500 IT leaders, 96% of organizations said they plan to expand their use of AI agents next year, and 84% believe agents are essential to staying competitive. What was once emerging tech is now a strategic imperative.</p>
<p>But while interest is high, scaling agentic AI isn’t simple. Fifty-three percent cite data privacy and compliance as their top concern. Others are held back by integration (40%), implementation complexity (39%), and gaps in governance (30%). These barriers aren’t stopping adoption but are forcing leaders to rethink how they go from pilots to production.&nbsp;</p>
<h3>The Roadblocks&nbsp;</h3>
<p>Scaling agentic AI isn’t just a technical lift—it’s a trust test. As enterprises move from limited pilots to real-world workflows, concerns around data privacy, system integration, and ethics come into sharper focus.&nbsp;</p>
<p>Data privacy tops the list. With agents accessing sensitive systems like financial records, patient data, and proprietary insights, organizations must lock down what those agents can access and infer. The stakes are high: IBM <a href="https://www.ibm.com/think/insights/cost-of-a-data-breach-2024-financial-industry">reports</a> the average data breach cost is $4.45 million, a figure expected to keep climbing. One misstep can lead to compliance violations and a breakdown in public trust.</p>
<p>Technical complexity follows close behind. Forty percent of leaders cite integration with legacy systems as a significant challenge, especially in sectors like telecom or finance, where infrastructure spans decades. More urgently, enterprises face a talent gap. Seventy-six percent of large companies <a href="https://aithority.com/machine-learning/ust-ai-report-93-percentage-of-large-firms-see-ai-as-vital-but-most-face-talent-shortages/?">report</a> a shortage of AI-skilled talent, and 44% say it’s slowing them down. Agentic AI requires hybrid teams who understand both the tech and the business. Without that bridge, even well-funded projects can stall.</p>
<p>Then there’s the ethical dimension. Fifty-one percent of leaders are concerned about bias in AI systems. A Yale <a href="https://medicine.yale.edu/news-article/bias-in-bias-out-yale-researchers-pose-solutions-for-biased-medical-ai/">study</a>, cited in Cloudera’s report, showed that diagnostic agents trained on non-diverse datasets performed worse for underrepresented patients, leading to delays and misdiagnosis. Bias can surface at any stage, whether data collection, model design, or deployment, and scale quickly without strong oversight.</p>
<p>Organizations are responding. <a href="https://www.cloudera.com/campaign/the-future-of-enterprise-ai-agents.html">Thirty-eight percent</a> have implemented bias audits and human review processes, and another 36% use bias-detection tools. But bias training isn’t a checkbox; it must be continuous, transparent, and accountable to earn lasting trust.</p>
<h3>The Blueprint for Breaking Through&nbsp;</h3>
<p>The enterprises succeeding with agentic AI aren’t starting with sweeping rollouts; they’re starting with intentional, future-ready pilots designed to prove long-term value. High-impact internal projects help teams test workflows, establish controls, and demonstrate outcomes before scaling across the organization.</p>
<p>Cloudera’s latest research reveals a clear trend: most organizations begin with contained, low-risk use cases like internal IT support or DevOps automation. Tasks such as password resets or ticket routing are easy to automate and offer measurable ROI with minimal disruption. In fact, 78 percent of organizations already use agents for customer support, and 71 percent apply them to process automation. These early wins help build momentum, credibility, and operational readiness.&nbsp;</p>
<p>But these pilots are more than technical trials; they are a test of the teams behind them. Moving from localized projects to enterprise-scale deployment brings new challenges, including tighter risk management, stronger governance, and deeper system integration. Meeting those demands depends not just on robust platforms but also on having people with the skills, alignment, and oversight to lead the way.&nbsp;</p>
<p>Technology alone does not scale. People do. Rapid results are important, but even the most promising pilots stall without the right talent to sustain and extend them. While 85 percent of enterprises say GenAI investments laid a strong foundation for agentic AI, 34 percent still cite lack of expertise as a barrier to growth.&nbsp;</p>
<p>That is why upskilling is critical to move beyond pilot mode. In healthcare, for example, radiologists are learning to validate AI-generated diagnostics, while administrative teams adapt to working alongside agents that manage scheduling and records. These kinds of human-AI partnerships are essential—not just to maintain trust and compliance, but to ensure real, lasting impact.&nbsp;</p>
<h3>The Time to Scale Is Now&nbsp;</h3>
<p>Agentic AI is no longer on the horizon, it’s here. Across industries, agents are moving from pilots to production: streamlining diagnostics in healthcare, predicting churn in telecom, and improving compliance in finance. These aren’t experiments; they’re operational systems already delivering measurable impact.&nbsp;</p>
<p>The companies positioned to lead have already done the groundwork. They’ve modernized infrastructure, trained their teams, and embedded governance across the AI lifecycle. Those who wait risk falling behind competitors and rising customer and regulatory expectations.&nbsp;</p>
<p>Let’s build trusted agentic AI together. Contact Cloudera today to see how it can help you scale with confidence, or&nbsp;<a href="https://www.cloudera.com/products/cloudera-public-cloud-trial.html?internal_keyplay=ALL&amp;internal_campaign=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;cid=FY25-Q1-GLOBAL-CDP-5-Day-Trial&amp;internal_link=WWW-Nav-u01">start your free trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=ready-to-scale-tackling-the-top-challenges-of-agentic-ai-adoption</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Agents of Change: The Next Phase of Enterprise AI </title><description><![CDATA[Abhas Ricky, Chief Strategy Officer at Cloudera, joined The AI Forecast to discuss the real momentum behind agentic AI.  ]]></description><link>https://www.cloudera.com/blog/business/agents-of-change-the-next-phase-of-enterprise-ai.html</link><guid>https://www.cloudera.com/blog/business/agents-of-change-the-next-phase-of-enterprise-ai.html</guid><pubDate>Thu, 17 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-ai-agent-txt.webp"><p>AI agents are quickly becoming one of the most powerful tools in the enterprise AI toolkit. According to Cloudera’s latest Agentic AI Survey, 96% of enterprises plan to expand their use of AI agents. But what’s driving this surge, and how can organizations turn hype into results?&nbsp;</p>
<p>Abhas Ricky, Chief Strategy Officer at Cloudera, joined The AI Forecast to discuss the real momentum behind agentic AI.&nbsp;&nbsp;</p>
<p>Here are some key takeaways from that conversation.&nbsp;</p>
<h4>AI agents are no longer a theory. They’re digital coworkers.</h4>
<p><b>Paul:</b> Let’s talk about the big headline from the most recent report on agentic AI, citing that 96% of enterprises plan to expand their use of AI agents in 2025.&nbsp;</p>
<p><b>Abhas</b>: If you look at the last 40 years of technology, the dream has always been to simplify complex tasks. Thanks to agentic AI, we’re seeing the evolution of that dream come to life. These aren’t just chatbots or assistants. They’re digital workers with memory and autonomy. They process tasks independently, adapt based on past behavior, and don’t need to be constantly updated or instructed.&nbsp;</p>
<p>That is what makes them so powerful, and why enterprises already see 10 to 100x productivity improvements.&nbsp;&nbsp;</p>
<h4>The biggest blocker? It’s not tech. It’s trust.</h4>
<p><b>Paul:</b> On the other side of that 96%, what’s slowing workflows down? What’s keeping that other 4% on the sidelines?&nbsp;</p>
<p><b>Abhas</b>: That 4% points to the core blocker of enterprise adoption: data privacy. Over 53% of enterprises said scaling AI is their biggest challenge.</p>
<p>Agents operate across multiple systems, such as CRM and sensitive medical or financial data. Without clear boundaries, you’re introducing risk at every step. It’s not just about protecting the data; it’s about ensuring agents know who can access what at what time. That means robust policies, metadata governance, APIs, and enterprise-grade authorization frameworks.&nbsp;</p>
<h4>Start with fast wins, then scale.</h4>
<p><b>Paul: </b>Let’s talk about implementation. Where should enterprises start when scaling agentic AI to see value fast?&nbsp;</p>
<p><b>Abhas:</b> There are always cases where low-hanging fruit can deliver quick wins. That’s why we work with customers to break down large, complex initiatives into manageable micro-workflows that show value fast.&nbsp;</p>
<p>For example, a global bank we partnered with wanted to reduce its mortgage processing time from four weeks to just six hours. It’s a bold goal, but we didn’t try to overhaul everything at once. We started by automating sub-workflows like tax document review and data benchmarking. That’s where the early ROI appeared, and from there, we expanded.&nbsp;</p>
<p>The key is to break the use case into smaller workflows. Some will be easy and fast to implement, others more complex and time-consuming. Our recommendation is always the same: prove value early, pick quick wins first, earn the right to go deeper, and keep up the momentum. And remember: wherever possible, bring the model to the data, not the data to the model, because most enterprise data still remains locked where it was created.&nbsp;</p>
<h4>The road ahead involves full autonomy and talent arbitrage.</h4>
<p><b>Paul:</b> In the next 12 to 18 months, what specific advancements do you expect in autonomous agents?&nbsp;</p>
<p><b>Abhas:</b> You’ll see companies offering fully autonomous agents, especially as new protocols like MCP and A2A make systems more interoperable. But there’s a bigger shift happening, too, where AI is becoming geopolitical. Countries and companies are building sovereign AI stacks, and we’ll see more demand for full-stack LLM engineers who can build enterprise AI from the ground up.</p>
<p>Certain jobs will now be augmented. But the number of jobs created will outnumber the reduction in jobs because of AI. The nature of the job will just be different. I’m optimistic that we will never lose the value of human input – regardless of how advanced the LLMs and systems become.&nbsp;&nbsp;</p>
<p>Catch the full conversation with Abhas Ricky on The AI Forecast on <a href="https://open.spotify.com/episode/0gfnjLxaKEPICemOLi4Eht?si=ICPfGrEkSNqxfN3ekcLFZw">Spotify</a>, <a href="https://podcasts.apple.com/gb/podcast/why-ai-agents-are-the-future-of-enterprise-ai/id1779293119?i=1000709243869">Apple Podcasts</a>, and <a href="https://www.youtube.com/watch?v=BqElzAn4bU0&amp;list=PLe-h9HrA9qfAmGHgsmXUZgLL-T4Xjhlq8&amp;index=5">YouTube</a>.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=agents-of-change-the-next-phase-of-enterprise-ai</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Data Visualization Now Available On-Premises, Unifying Your Analytics Experience</title><description><![CDATA[Today, we&apos;re excited to announce that Cloudera Data Visualization is now available for deployment on Cloudera on premises, delivering a truly unified analytics experience regardless of where your data resides.]]></description><link>https://www.cloudera.com/blog/business/cloudera-data-visualization-now-available-for-private-cloud-base-unifying-your-analytics-experience.html</link><guid>https://www.cloudera.com/blog/business/cloudera-data-visualization-now-available-for-private-cloud-base-unifying-your-analytics-experience.html</guid><pubDate>Wed, 16 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Matthew Michaelides]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-clouds-and-skyscrapers.webp"><p>Data visualization has become the cornerstone of modern business intelligence (BI), but organizations operating in hybrid and multi-cloud environments face significant challenges maintaining consistency, security, and cost-effectiveness across their analytics landscape. Today, we're excited to announce that <a href="/content/www/en-us/products/cloudera-data-platform/data-visualization.html">Cloudera Data Visualization</a> is now available for deployment on Cloudera on premises, delivering a truly unified analytics experience regardless of where your data resides.</p>
<h3>Bridging the On-Premises and Cloud Analytics Divide</h3>
<p>For enterprises with strict data governance requirements, the struggle to maintain consistent visualization capabilities across deployment models has been real. Organizations often find themselves managing duplicate dashboards, inconsistent reporting processes, and fragmented security models as they navigate between on-premises and cloud environments.</p>
<p>Cloudera Data Visualization on-premises eliminates these pain points by providing a single, intuitive experience that works seamlessly across your entire data estate. This deployment option complements our existing visualization capabilities in <a href="/content/www/en-us/products/data-warehouse.html">Cloudera Data Warehouse</a> and <a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a>, on-premises and in the cloud, creating a truly unified platform that adapts to your infrastructure choices rather than forcing you to adapt to the tool.</p>
<h3>Business Benefits That Impact Your Bottom Line</h3>
<p>The expanded deployment options for Cloudera Data Visualization deliver compelling advantages for enterprises:</p>
<h3>Rapid Value Delivery Through an Intuitive, Unified Experience</h3>
<p>One of the most significant advantages of Cloudera Data Visualization is the speed at which organizations can begin to get value from their data. With a single, unified interface for all users regardless of whether they’re working on-premises or in the cloud, onboarding is faster and more straightforward. Business analysts and data scientists alike benefit from an intuitive user experience that minimizes the learning curve, enabling teams to start building and sharing impactful dashboards and reports in record time.</p>
<p>This unified BI experience not only accelerates time-to-insight but also reduces the operational overhead associated with managing multiple BI tools or training users on different platforms. <a href="https://www.youtube.com/watch?v=y6FWF_R9ix4">Native AI tooling and automation</a> further streamline the process, empowering users to uncover insights and take action with minimal technical barriers. The result is a more agile, data-driven organization where insights are democratized and accessible to all.</p>
<h3>Cost Optimization and Consolidation</h3>
<p>Organizations typically save over 50% compared to competitive BI tools by eliminating per-user licensing fees. You pay only for compute used in the cloud or a single platform fee on-premises, dramatically reducing your analytics spend while consolidating vendors.</p>
<h3>Enhanced Security and Governance</h3>
<p>With <a href="https://www.dbta.com/Editorial/News-Flashes/Cloudera-Boosts-Metadata-Management-with-New-Solution-Capabilities-165318.aspx">&quot;single security&quot;</a> integration, Cloudera Data Visualization maintains consistent data governance and security across your entire data and analytics estate. This approach is particularly valuable for organizations with stringent regulatory requirements, providing complete control of data used for dashboards, reports, and analytics.</p>
<h3>Future-Proofed Analytics with Workload Portability and Flexibility</h3>
<p>The &quot;write once, deploy anywhere&quot; model delivers unmatched flexibility for BI analytics users. Your team can develop visualizations, dashboards, and reports in one environment and seamlessly deploy them across your infrastructure, providing enterprise flexibility amid evolving business and regulatory requirements. As organizations continue to <a href="https://www.corporatecomplianceinsights.com/saas-evolves-hybrid-models-take-center-stage/">evolve their data, analytics, and AI deployment models</a> to account for cost and security considerations, BI and analytics users can count on a stable and consistent visualization experience.</p>
<h3>Real-World Impact: Major Global Bank Success Story</h3>
<p>A major global bank has implemented Cloudera Data Visualization as its enterprise security analytics platform, integrating system logs, network traffic, endpoint data, and threat intelligence across its global operations. With approximately 2,100 regular users interacting with over 4,500 live dashboards and more than 200 visuals, they have achieved an estimated 55% direct cost savings compared to their previous enterprise BI tool and captured $5 million in operational savings, while maintaining stricter security controls than was previously possible.</p>
<h3>Transform Your Analytics Experience Today</h3>
<p>As organizations continue to balance cloud innovation with on-premises requirements, Cloudera Data Visualization provides the consistent, secure, and cost-effective solution needed to drive data-powered transformation.&nbsp;</p>
<p>Ready to experience unified visualization across your hybrid data landscape? Contact your Cloudera representative today to learn how Cloudera Data Visualization on premises can transform your analytics experience.</p>
<p>Watch the <a href="/content/www/en-us/events/cloudera-now-cdp.html?internal_keyplay=cross&amp;internal_campaign=ClouderaNow---FY26-Q2-GLOBAL-VE-WebinarCloudera-Cloudera-Now&amp;cid=701Ui00000UGzCjIAL&amp;internal_link=p07">ClouderaNOW webinar</a> to see Cloudera Data Visualization firsthand.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-data-visualization-now-available-for-private-cloud-base-unifying-your-analytics-experience</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>A Beginner’s Guide to the Model Context Protocol (MCP): What it is and Why it Matters</title><description><![CDATA[Let’s explore why communication protocols are critical to enterprise AI and how MCP can serve as the connective tissue that helps organizations scale AI reliably, securely, and efficiently over time. ]]></description><link>https://www.cloudera.com/blog/business/a-beginners-guide-to-the-model-context-protocol-mpc-what-it-is-and-why-it-matters.html</link><guid>https://www.cloudera.com/blog/business/a-beginners-guide-to-the-model-context-protocol-mpc-what-it-is-and-why-it-matters.html</guid><pubDate>Tue, 15 Jul 2025 17:01:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-ai-agent-txt.webp"><p>Artificial intelligence (AI) promises to redefine workflow efficiency, and it’s taking the business world by storm. It should come as no surprise that integrating <a href="https://www.cloudera.com/blog/business/generative-ai-needs-to-become-private-to-thrive-introducing-private-ai.html">private AI</a>—AI that works and learns exclusively from internal company data—is a high-priority item for organizations around the world.&nbsp;&nbsp;</p>
<p>The obvious challenge lies in making private AI both safe and usable across the company, allowing it to connect and communicate across distributed data environments that vary by department, region, and sector. At the heart of that challenge is communication: your systems need to talk to each other to stay in sync and aligned.&nbsp;</p>
<p>This is where communication protocols come in: they define how different parts of your AI ecosystem talk to each other, set boundaries for scope, and help keep queries focused and efficient. Among these, Model Context Protocol (MCP) stands out as a candidate for a universal standard, offering your organization a consistent way to connect AI models with data, tools, and people across your entire environment.&nbsp;</p>
<p>Let’s explore why communication protocols are critical to enterprise AI and how MCP can serve as the connective tissue that helps organizations scale AI reliably, securely, and efficiently over time.&nbsp;</p>
<h3>Communication Protocols: Building AI That Understands Business Context</h3>
<p>At a basic level, communication protocols govern how systems exchange information.&nbsp;</p>
<p>APIs enable the physical transfer of data between systems. But in AI-driven environments, simple data exchange isn’t enough. AI needs to understand the data and act on it within a specific context.&nbsp;&nbsp;</p>
<p>Communication protocols build on top of APIs by adding structure, intent, and meaning. They define what to say, when to say it, and how to interpret what’s said. In an enterprise AI setting, this means guiding how AI models interact with your data, tools, and users.&nbsp;</p>
<p>Without clear protocols, AI tools can lose sight of or completely fail to understand the context of a request, misinterpret intent, or pull incorrect or incomplete data. These guardrails are crucial in enterprise environments where data lives all over the place: on site, in the cloud, across departments—or even continents—and decisions need to be made quickly and accurately.&nbsp;</p>
<p>Communication protocols give AI a shared language and a clear framework so it can identify the right data, understand its relevance, and take the right action within your business context.&nbsp;</p>
<h3>What is MCP? The Universal Adapter</h3>
<p>MCP stands for Model Context Protocol. It’s a specific communication protocol designed as a “universal adapter” that allows easier integration between tools and systems.</p>
<p>Most organizations work with a variety of systems—databases, tools, cloud platforms—all built at different times by different vendors. MCP was created as a flexible, extensible architecture to make it easy to integrate systems and get them to talk to each other.&nbsp;</p>
<p>MCP structures inputs so the AI knows what tools to use, why to use them, and how to frame the query. It also imposes guardrails to ensure the AI accesses only the relevant and requested data.&nbsp;</p>
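<p>As a rough illustration of what structured inputs look like, here is a minimal Python sketch of a tool description and a matching tool call. The field names <code>name</code>, <code>description</code>, and <code>inputSchema</code> and the <code>tools/call</code> method follow the public MCP specification. The tool itself (<code>get_inventory_level</code>), its parameters, and the <code>validate_call</code> helper are hypothetical and exist only for illustration; a real server would validate against the full JSON Schema.</p>

```python
# Hypothetical tool description, in the shape the MCP specification uses for tools.
INVENTORY_TOOL = {
    "name": "get_inventory_level",  # hypothetical tool name
    "description": "Return current stock for one item across regional warehouses.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "item_id": {"type": "string"},
            "region": {"type": "string"},
        },
        "required": ["item_id"],
    },
}


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks a server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def validate_call(tool, call):
    """Illustrative guardrail: list the problems with a call (empty if in scope)."""
    problems = []
    params = call["params"]
    if params["name"] != tool["name"]:
        problems.append("unknown tool: " + params["name"])
        return problems
    schema = tool["inputSchema"]
    allowed = set(schema["properties"])
    supplied = set(params["arguments"])
    for missing in sorted(set(schema["required"]) - supplied):
        problems.append("missing argument: " + missing)
    for extra in sorted(supplied - allowed):
        problems.append("unexpected argument: " + extra)
    return problems


call = build_tool_call(1, "get_inventory_level", {"item_id": "Item X"})
print(validate_call(INVENTORY_TOOL, call))  # prints []
```

<p>The point of the sketch is that the model never sends free-form text to a backend system: it names a declared tool and supplies only the arguments that tool's schema allows, which is what keeps a query scoped.</p>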
<p>In an enterprise setting, this structured coordination is key, which we’ll explain in the “Examples and Use Cases: Why MCP?” section with a few examples imagining MCP used in a private enterprise LLM setting.&nbsp;</p>
<h3>4 Reasons MCP is Cloudera’s Choice for a Universal Communications Protocol</h3>
<p>There are several communication protocols that could, in theory, replicate some of what MCP offers. So why does Cloudera advocate for MCP? The simple answer is contextual orchestration and tool calling.&nbsp;</p>
<p>MCP provides the structure and context AI models need to access the right tools at the right time reliably, securely, and in alignment with business logic.&nbsp;</p>
<p>Here are four ways we’ve seen MCP positively impact enterprise AI environments:&nbsp;</p>
<h3>1. Keeps AI Models and Humans on Task&nbsp;</h3>
<p>One of the most challenging parts of building an enterprise AI system is making sure that the AI stays on task. You don’t want outputs that are just technically correct. You want outputs that are relevant to your specific operations.&nbsp;&nbsp;</p>
<p>MCP acts as a contract between the user, the model, and the business logic. It structures communication with clear intent, scope, and constraints, keeping your private AI aligned with your goals so it doesn’t drift into irrelevant or overly generic responses.&nbsp;</p>
<p>MCP also defines how your team works with AI. When using MCP as a standardized protocol, it’s easier to build user-friendly interfaces where your various departments and teams can ask specific questions or make decisions through AI agents without needing to understand how everything works behind the scenes, as the examples later in this article illustrate.&nbsp;</p>
<h3>2. Improves LLM Context Management&nbsp;</h3>
<p>Think of MCP like a USB-C plug: a standard and universal connector that works on different devices, even if they have different manufacturers. Because large language models (LLMs) are stateless (in other words, they do not retain context between requests on their own), MCP takes on a similar role, supplying that context in a standard way.&nbsp;&nbsp;</p>
<p>Enterprises looking to feed their proprietary or operational data into LLMs need standardization like this because it creates the link that connects an LLM and the chosen internal data storage location (like a local or cloud server).&nbsp;&nbsp;</p>
<p>It acts as a consistent layer that injects the right context—like user roles, prior interactions, or task-specific data—into each interaction regardless of source. This prevents fragmentation, ensures the model responds appropriately across use cases, and helps maintain reliable performance.&nbsp;</p>
<h3>3. Supports a Modular, Scalable Infrastructure&nbsp;</h3>
<p>MCP offers a universal adapter that works across environments, allowing on-premises, hybrid cloud, and globally distributed systems to work together. This is critical for enterprises that don’t have a single, unified tech stack today as well as for those planning to adopt new technologies over time.&nbsp;&nbsp;</p>
<p>MCP lets you scale AI adoption across teams, regions, and tools without rebuilding your infrastructure or reinventing the wheel with every change. It ensures interoperability between systems and provides a consistent framework that can also support governance, security, and compliance needs as AI scales.&nbsp;</p>
<h3>4. Future-Proofs Your AI Strategy&nbsp;</h3>
<p>Because MCP is vendor- and platform-agnostic, it offers flexibility as your tech stack evolves. You can update, swap out, or add new tools without rewriting how everything communicates. Its extensibility also supports custom additions and integrations that go beyond just LLMs, helping you connect AI capabilities across your broader enterprise.&nbsp;</p>
<p>Put simply, this means the investments you make today won’t become immediately outdated the minute something new comes along.&nbsp;</p>
<h3>Examples and Use Cases: Why MCP?</h3>
<p>To better cement this concept, we’re going to outline a few examples that demonstrate how MCP can positively impact workflows across teams and departments.&nbsp;</p>
<p>First, imagine a director of marketing needs a report that shows all EMEA channel metrics to decide on next quarter’s strategy. They input the following into their enterprise search model:&nbsp;</p>
<p>Prompt: &quot;Summarize last month’s campaign performance by channel for the EMEA region.&quot;<br>
Tool called: Marketing analytics platforms<br>
How MCP helps:&nbsp;&nbsp;</p>
<ul>
<li><p>Defines the reporting period and scope (by channel and region)&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Normalizes data from multiple sources&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Ensures consistent metrics (click-through rates, conversions, etc.)&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Produces a user-friendly output that can immediately guide next steps&nbsp;</p>
</li>
</ul>
<p>Next, imagine an IT director auditing high-level employee user accounts to ensure any loose ends are closed before the end of the month.&nbsp;</p>
<p>Prompt: &quot;List all user accounts with admin access that haven't logged in for 30 days.&quot;<br>
Tool called: Internal identity and access management (IAM) framework<br>
How MCP helps:</p>
<ul>
<li><p>Applies security constraints (admin-level query)&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Specifies filtering logic (last login &gt; 30 days ago)&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Enforces role-based access control so only authorized staff can see results&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Produces a user-friendly output that allows immediate action to be taken with confidence&nbsp;</p>
</li>
</ul>
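<p>The filtering and access-control steps in the IAM example can be sketched in a few lines of Python. This is only an assumption about how such a tool might behave behind the protocol: the <code>Account</code> record, the <code>iam-auditor</code> role name, and the <code>stale_admin_accounts</code> function are all hypothetical.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Account:
    username: str
    is_admin: bool
    last_login: datetime


def stale_admin_accounts(accounts, requester_roles, now, max_idle_days=30):
    """Return admin usernames that have been idle for more than max_idle_days.

    Raises PermissionError unless the requester holds the hypothetical
    "iam-auditor" role, mirroring the role-based access control step.
    """
    if "iam-auditor" not in requester_roles:
        raise PermissionError("requester may not view admin account data")
    cutoff = now - timedelta(days=max_idle_days)
    # Keep only admin accounts whose last login is older than the cutoff.
    return sorted(
        a.username for a in accounts if a.is_admin and cutoff > a.last_login
    )


now = datetime(2025, 7, 15)
accounts = [
    Account("root-emea", True, datetime(2025, 5, 1)),
    Account("ops-admin", True, datetime(2025, 7, 10)),
    Account("jdoe", False, datetime(2025, 1, 3)),
]
print(stale_admin_accounts(accounts, {"iam-auditor"}, now))  # prints ['root-emea']
```

<p>Checking the requester's role before touching any data, rather than filtering the results afterward, means an unauthorized request fails outright instead of returning a partial answer.</p>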
<p>Now, picture an inventory manager with warehouses distributed around the world who needs to know how much inventory of a certain item they have in stock and where it’s located.&nbsp;</p>
<p>Prompt: &quot;What’s the current inventory level of “Item X” across our regional warehouses?&quot;<br>
Tool called: Inventory management system<br>
How MCP helps:</p>
<ul>
<li><p>Clarifies entity (“Item X”), timeframe (“current”), and locations (“regional warehouses”)&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Prevents irrelevant or overly broad queries&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Connects to the right real-time tool or dataset securely&nbsp;</p>
</li>
</ul>
<ul>
<li><p>Produces an easy-to-read report that satisfies the query in seconds&nbsp;</p>
</li>
</ul>
<p>In each use case, the AI tool used MCP for guidance, ensuring the query responses were relevant, clear, and fulfilled with correctly sourced data.&nbsp;</p>
<h3>Keep Reading: MCP and Cloudera</h3>
<p>Now that you understand what MCP is, how it works, and what it offers your organization, it’s time to dig a little deeper. Head to our next blog, <a href="https://www.cloudera.com/blog/technical/bringing-context-to-genai-with-cloudera-mpc-servers.html">Bringing Context to GenAI with Cloudera MCP Servers</a>, to learn in more detail how MCP and Cloudera work together.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=a-beginners-guide-to-the-model-context-protocol-mpc-what-it-is-and-why-it-matters</wfw:commentRss><slash:comments>0</slash:comments></item><item><title> The Iceberg Wave: How an Open Format Became an Enterprise Standard</title><description><![CDATA[Apache Iceberg is now the de facto open standard for managing large-scale structured, semi-structured, and evolving data. It was originally developed in 2017 at Netflix to address the challenges of delivering reliable, petabyte (PB)-scale analytics on Apache Hive and Spark, and has since grown into a robust, open-table format suited to run multiple workloads concurrently. ]]></description><link>https://www.cloudera.com/blog/business/the-iceberg-wave-how-an-open-format-became-an-enterprise-standard.html</link><guid>https://www.cloudera.com/blog/business/the-iceberg-wave-how-an-open-format-became-an-enterprise-standard.html</guid><pubDate>Mon, 14 Jul 2025 17:01:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Navita Sood]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-mountains-cloud-hero.jpg"><h2>Cloudera Innovations Propelling Iceberg Adoption</h2>
<p>Apache Iceberg is now the de facto open standard for managing large-scale structured, semi-structured, and evolving data. It was originally developed in 2017 at Netflix to address the challenges of delivering reliable, petabyte (PB)-scale analytics on Apache Hive and Spark, and has since grown into a robust, open-table format suited to run multiple workloads concurrently.&nbsp;</p>
<p>Iceberg unifies your data and provides SQL behavior to easily access that data. As it continues to evolve with richer SQL capabilities and simplified data operations, Iceberg is increasingly favored by users of varying technical expertise, not just data engineers but also data consumers (data scientists, analysts, and application developers) seeking fast, reliable access to any data.</p>
<p>With Iceberg, organizations gain true separation of compute and storage, enabling unparalleled flexibility. If you're looking for multifunction analytics, AI readiness, and vendor freedom, no other table format comes close.</p>
<h3>A Vibrant and Growing Community</h3>
<p>In less than 10 years, Iceberg has evolved from emerging tech to enterprise standard. Iceberg’s momentum can be credited to its architectural strengths as well as the vibrant, open community behind it.&nbsp;</p>
<p>Importantly, the Iceberg community is led by its users, not just a single vendor. This user-driven governance model helps ensure the project evolves in ways that serve broad, real-world needs—a major reason why it has gained so much traction.</p>
<h3>Key Takeaways from the Iceberg Summit</h3>
<p>Iceberg’s mainstream adoption was evident at the 2025 Iceberg <a href="https://www.icebergsummit2025.com/" target="_blank" rel="noopener noreferrer">Summit in San Francisco</a>. The event brought together startups, Fortune 500s, and the three major cloud providers (AWS, Microsoft, and Google). Attendees joined from across the globe, both in person and virtually, all eager to learn, contribute, and grow the ecosystem.&nbsp;</p>
<p>A few themes in particular dominated conversations at the summit: interoperability and Iceberg's growing prominence (its expanding ecosystem and capabilities, including automation).</p>
<h4>Interoperability</h4>
<p>From Netflix to Apple to Bloomberg, many organizations shared how Iceberg enables them to manage a single source of truth that powers multiple workloads—eliminating redundant data copies and reducing data movement across systems. They discussed the various types of workloads that rely on Iceberg’s trusted data layer to deliver segmentation, personalization, churn/relapse predictions, recommendations, optimized customer experience, and more.</p>
<h4>Exploding Ecosystem</h4>
<p>Another highlight was the emergence of new open-source tools such as Comet, Polaris, and Lance in the Iceberg ecosystem, designed to enhance performance and support multi-modal analytics and AI.</p>
<h4>Updates Coming in Iceberg V3 and V4</h4>
<p>There was a lot of excitement around the capabilities coming in Iceberg V3 and V4. V3 will significantly bolster <a href="/content/www/en-us/services-and-support/training/learning-paths/data-governance.html">data governance</a>, performance optimization, and support for more complex data types like Variant and Geospatial. By leveraging the principles of columnar format, Variant enables advanced querying capabilities, such as filtering and aggregations, on semi-structured data without requiring extensive transformations. Support for Geospatial will allow organizations to manage location-based data, unlocking new use cases. The new adaptive metadata layout proposed in V4 promises to improve performance for small files.</p>
<h4>Automated Data Management</h4>
<p>Another hot topic was automating routine maintenance (partitioning, sorting, compaction) via policy-driven, DevOps-style interfaces to reduce manual toil. As organizations bring more data into Iceberg tables, manual maintenance becomes a huge bottleneck, since they must hire experts to perform these tasks.&nbsp;</p>
<p>As more and more engines access the data in these Iceberg tables, governance, security, and lineage become high priority. Visibility into data flows and data transformations becomes critical to trust the data. This led to discussions around the need for catalog federation and governance to improve visibility across Iceberg tables.&nbsp;</p>
<h3>Iceberg Adoption at Cloudera</h3>
<p>Cloudera featured native integration of Apache Iceberg in its <a href="https://docs.cloudera.com/cdp-public-cloud/cloud/cdp-iceberg/topics/iceberg-in-cdp.html" target="_blank" rel="noopener noreferrer">public cloud Lakehouse platform in 2021</a>, followed by <a href="https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/iceberg-overview/topics/iceberg-overview-base.html" target="_blank" rel="noopener noreferrer">on-premises in 2022</a>. Today, a majority of our customers are either running or testing new workloads on Iceberg; in total, our customers manage PBs of data on Iceberg.</p>
<blockquote>“Iceberg is a growth vector for Cloudera. We’re seeing a surge in customers migrating Hive workloads to Iceberg to modernize and future-proof their data platforms.” - Venkat Rajaji, SVP of Product Management, Cloudera</blockquote>
<p>Once a company starts its Iceberg journey, the benefits compound, resulting in growing volumes of data on Iceberg tables, expansion of workloads, and emergence of new use cases. Faster performance is often the first motivator, followed by interoperability and workload flexibility for agility. Moving to Iceberg reduces storage, ETL, and operational costs by up to 75%. Capabilities like time travel, snapshots, write-audit-publish, and hidden partitioning further improve efficiency, making it the right choice to deploy new use cases.</p>
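<p>For readers new to the idea, time travel over snapshots can be pictured with a small Python toy model. This is a conceptual sketch only, not the Iceberg API: the <code>ToyTable</code> class and its methods are hypothetical, and it copies rows for simplicity, whereas a real Iceberg snapshot records which data files make up the table at that point.</p>

```python
class ToyTable:
    """Toy model of snapshot-based time travel; not the Iceberg API."""

    def __init__(self):
        # Snapshot 0 is the empty table; each snapshot is an immutable tuple.
        self._snapshots = [()]

    def append(self, rows):
        """Commit a new immutable snapshot and return its id."""
        self._snapshots.append(self._snapshots[-1] + tuple(rows))
        return len(self._snapshots) - 1

    def scan(self, as_of=None):
        """Read the latest snapshot, or an earlier one when as_of is given."""
        snapshot_id = len(self._snapshots) - 1 if as_of is None else as_of
        return list(self._snapshots[snapshot_id])


table = ToyTable()
first = table.append([{"order": 1, "amount": 40}])
second = table.append([{"order": 2, "amount": 75}])
print(len(table.scan()))             # prints 2
print(len(table.scan(as_of=first)))  # prints 1
```

<p>Because a commit never changes an earlier snapshot, a report or a model-training run can be reproduced later by reading the table as of the snapshot it originally used.</p>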
<p>Some of the most popular <a href="/content/www/en-us/campaign/introducing-apache-iceberg-the-case-for-an-open-data-lakehouse-powered-by-cloudera.html?internal_keyplay=MDA&amp;internal_campaign=Other---FY25-Q3-GLOBAL-Iceberg-Powered-by-Cloudera-WP&amp;cid=701Hr000001fkFfIAI&amp;internal_link=p01">use cases</a> for Iceberg at Cloudera are:</p>
<ul>
<li><b>Data sharing</b> between different vendor systems owned by trusted parties, like different business units within an organization or with trusted partners and suppliers.&nbsp;</li>
<li><b>Data engineering</b> for massive-scale data preparation and best price performance.</li>
<li><b>Near real-time analytics and decisioning</b> by ingesting streaming data into the lakehouse.</li>
<li><b>Regulatory compliance reporting and continuous risk mitigation</b>, leveraging Iceberg’s time travel features and Cloudera’s governance, lineage, and auditing capabilities.</li>
<li><b>Optimizing analytics cloud spend</b> by unlocking the data in Iceberg and leveraging Cloudera’s robust ingestion and data processing capabilities.</li>
<li><b>Accelerating data prep for AI</b> by leveraging Spark and NiFi for faster data processing.</li>
<li><b>Efficient model training</b> across multiple data versions with reduced compute and storage usage.</li>
<li><b>Multi-tiered feature stores</b> that combine Iceberg and HBase for low-latency AI.</li>
<li><b>Running hybrid workloads</b> using compute in public cloud on sensitive data stored on premises.</li>
</ul>
<p>Listen to <a href="https://www.youtube.com/watch?v=iF8tSf0tvts" target="_blank">Illumina</a> and <a href="https://www.youtube.com/watch?v=5NaJkcV9GdI" target="_blank">LY Corporation</a>’s journey with Apache Iceberg and how they are overcoming their data and analytic challenges at scale.</p>
<h3>Cloudera Innovations to Address Common Challenges&nbsp;</h3>
<p>While Lakehouse and Iceberg offer <a href="/content/www/en-us/products/open-data-lakehouse.html">significant benefits</a>, including converging all your data and accelerating analytics, our customers have shared a few challenges related to adopting Iceberg. First, their data lies in multiple clouds, on premises, and in edge systems, and moving all that data to the cloud to leverage Iceberg is almost impossible. Hence, they need the same Iceberg support on premises and in the cloud. Second, they need integration with multiple vendor engines so they can easily share data across systems with confidence, lineage, and traceability. Third, as the data grows, manually and continuously optimizing Iceberg tables for performance becomes very expensive, requiring experts and compute resources. Lastly, while Iceberg increases the usage of data, the freedom to bring in any tool introduces risks and requires effective governance and security tools to control access and provide metadata management for auditability, lineage, and visibility to better understand the data and drive usability.</p>
<p>We’re always innovating to solve customer challenges and have made several platform enhancements to address these common pain points, including:</p>
<ul>
<li><b>Iceberg everywhere with the hybrid lakehouse:</b> Delivers native support for Iceberg on premises and in multiple public clouds with the ability to port applications and code to use Impala, Spark, NiFi, Flink, and Hive on the same data with the same experience. This allows customers to modernize their data center with cloud-native capabilities. Iceberg on Ozone delivers S3-compatible object stores on premises. Cloudera enables organizations to unify their data in cloud and on premises under a single governance and security model—with fine-grained access controls, versioned metadata, and a shared catalog.</li>
<li><b>Real-time application building:</b> Build real-time CDC pipelines and seamlessly ingest and unify batch and streaming data with our Data in Motion offering for streaming pipelines (NiFi+Kafka+Flink-on-Iceberg).</li>
<li><b>Full interoperability with REST catalog integration:</b> Drive interoperability with external engines and open ecosystems with single security and governance.</li>
<li><b>Lower TCO and faster performance with the</b> <a href="/content/www/en-us/blog/technical/cloudera-lakehouse-optimizer-easier-to-deliver-high-performance-iceberg-tables.html">Cloudera Lakehouse Optimizer</a>: Built-in AI auto-tunes compaction, snapshot expiry, and layout—no manual tuning required.</li>
<li><b>Complete understanding of all data sources and destinations:</b> Octopai by Cloudera unlocks intelligent metadata automation and full-lifecycle lineage for all data flows even outside of Cloudera to give better visibility into data.</li>
<li><b>HA/DR and low latency across applications:</b> Iceberg table replication provides resilience and flexibility for HA data architectures.</li>
<li><b>Risk-free and fast adoption with smart migration tools:</b> Our “Hive Tables to Apache Iceberg” <a href="https://newsletter.api.simpplr.com/r?et=newsletter.link.clicked&amp;u=https%3A%2F%2Fsessionize.com%2Fapp%2Forganizer%2Fsession%2F17552%2F831897&amp;tenantId=00D5Y000001c282UAA&amp;newsletterId=edc7ca40-710c-4430-80f9-89ea8a3130b8&amp;userId=a0w5Y00000qezjvQAA&amp;blockId=block-aSKVq23WFofHFHTaYifPv9&amp;blockType=RichText&amp;index=3&amp;clickType=link" target="_blank">blueprint</a> simplifies onboarding.&nbsp;</li>
</ul>
<blockquote>“As we envision a future where Apache Iceberg is the foundation and linchpin, empowering cross-platform data and AI, we relentlessly enhance Iceberg's capabilities to unlock unprecedented agility and intelligence for every enterprise.” - Bill Zhang, VP of Product Strategies at Cloudera</blockquote>
<h3>Road Ahead</h3>
<p>We believe that Iceberg will continue to dominate as the enterprise standard for open-table formats. The new innovations in automated optimizations, multi-modal support, metadata management, and Python integration will only further drive adoption. Other open-table formats will likely take a more specialized approach suited to run specific workloads or in specific environments to complement Iceberg.&nbsp;</p>
<p>Cloudera’s goal is to help customers build an open data lakehouse powered by Iceberg with lower complexity, greater flexibility, and higher impact. We’re focused on delivering enterprise-grade security and governance, additional optimizations, tiered storage mechanisms, and a “catalog of catalogs” to enhance interoperability and collaboration. You can get started today with the <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">Cloudera Lakehouse 5-day trial</a> or by reading our <a href="/content/www/en-us/open-source/apache-iceberg.html">how-to guides</a>.&nbsp;</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-iceberg-wave-how-an-open-format-became-an-enterprise-standard</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Privacy-First Enterprise AI Innovation with Cloudera Synthetic Data Studio </title><description><![CDATA[The Challenge of Data Privacy, Quality, and Access for AI Applications]]></description><link>https://www.cloudera.com/blog/business/privacy-first-enterprise-ai-innovation-with-cloudera-synthetic-data-studio.html</link><guid>https://www.cloudera.com/blog/business/privacy-first-enterprise-ai-innovation-with-cloudera-synthetic-data-studio.html</guid><pubDate>Tue, 01 Jul 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Andreas Tsiartas,Khauneesh Saigal,Yi-Hsun Tsai]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1187832984.jpg"><h3>The Challenge of Data Privacy, Quality, and Access for AI Applications&nbsp;</h3>
<p>Enterprises are facing a dilemma: they must automate their business processes with AI to stay competitive and reduce costs while contending with strict data privacy regulations such as the General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA). On top of that, they are saddled with the high costs of cloud-based large language models (LLMs) and a scarcity of high-quality, open, and readily available data, all while needing to manage access around enterprise proprietary information and sensitive customer interactions—technical support tickets, financial records, or healthcare data—that must be kept private and cannot be shared or exposed.&nbsp;</p>
<p>This creates several challenges for AI developers. First, using raw data for model training risks legal penalties due to non-compliance. Second, sharing data with cloud-based LLMs introduces privacy vulnerabilities. Third, the lack of accessible, high-quality data leads to accuracy gaps in AI models. The result? Stalled innovation, missed opportunities, and a growing gap between AI’s potential and its practical implementation in enterprises.</p>
<p>At <a href="/content/www/en-us/">Cloudera</a>, we’re committed to empowering enterprises to harness AI’s full potential without compromising data privacy or budget constraints. As part of that mission, we’ve released <a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Studios</a>, which makes advanced AI <a href="/content/www/en-us/blog/business/clouderas-ai-studios-making-advanced-ai-accessible-to-all.html">accessible to all</a>—both technical and non-technical users—by providing modular, no-code tools with high-code extensibility that guide developers through the generative AI (Gen AI) lifecycle.</p>
<p>Cloudera Synthetic Data Studio is part of this toolset, and it helps organizations adapt powerful AI models while adhering to regulatory requirements and maintaining operational efficiency. With Synthetic Data Studio, users can generate high-quality synthetic data for fine-tuning open language models for specific use cases, evaluate the performance of retrieval-augmented generation (RAG) or agentic systems, perform AI-powered data augmentation, and much more, all without exposing sensitive information.&nbsp;</p>
<h4>Synthetic Data Studio Overview&nbsp;&nbsp;</h4>
<p>Synthetic Data Studio is a strategic enabler for enterprises navigating the complexities of modern AI. By combining a privacy-first design with advanced AI workflows, Synthetic Data Studio empowers teams to train accurate models using synthetic data derived from real-world examples. This approach eliminates data exposure risks and ensures compliance with regulatory requirements.&nbsp;</p>
<p>The studio also enables organizations to scale AI applications across diverse use cases—ranging from customer support to fraud detection—allowing teams to test RAG, agentic, and other systems using data grounded in proprietary documents. To ensure quality, synthetic datasets are evaluated using an LLM-as-a-judge, retaining only the highest-quality outputs for downstream workflows.</p>
<h4>Intuitive Workflows to Ensure Model Accuracy and Reliability</h4>
<p>The studio’s workflow is intuitive and powerful. Starting with a no-code/low-code interface, teams can instruct LLMs to generate synthetic data that mirrors real-world patterns. For example, customer support teams can create synthetic support tickets that reflect real technical queries or service requests. The system supports multiple synthesis methods, such as free-form generation, supervised fine-tuning, and model alignment, and allows grounding generation using private documents to maintain contextual relevance.&nbsp;&nbsp;</p>
<p>Once generated, synthetic datasets undergo rigorous evaluation. A chosen LLM acts as a judge, assessing the data against custom criteria to ensure only the highest-quality outputs are retained. This quality control step is critical for maintaining model accuracy and reliability. In addition, human evaluators can intervene and further filter the generated data for even higher-quality outputs.</p>
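<p>The judge-and-filter step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Synthetic Data Studio’s implementation: the <code>Sample</code> class, the <code>filter_with_judge</code> function, the 0-to-5 scoring scale, and the <code>toy_judge</code> stand-in (which a real pipeline would replace with a call to the chosen LLM) are all hypothetical.</p>

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """One synthetic record, e.g. a generated support ticket and its answer."""
    prompt: str
    completion: str


def filter_with_judge(
    samples: List[Sample],
    judge: Callable[[Sample], float],
    min_score: float = 4.0,
) -> List[Sample]:
    """Score every sample with the judge and keep those at or above min_score."""
    scored = [(sample, judge(sample)) for sample in samples]
    kept = [pair for pair in scored if pair[1] >= min_score]
    # Best-scoring samples first; ties keep their original order.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [sample for sample, _ in kept]


def toy_judge(sample: Sample) -> float:
    """Hypothetical stand-in for an LLM judge.

    Scores a completion by its word count, capped at 5.0. A real judge
    would prompt an LLM with the custom evaluation criteria instead.
    """
    word_count = len(sample.completion.split())
    return min(5.0, float(word_count))
```

<p>Passing the judge in as a plain function keeps the filter independent of any particular model, and a human reviewer could apply a second pass to the returned list.</p>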
<p>Finally, datasets are automatically integrated into Cloudera AI Workbench projects for subsequent workflows. For organizations needing external integration, datasets can also be exported in formats like JSON or CSV for use with platforms like Hugging Face.&nbsp;&nbsp;</p>
<h4>Open, Scalable Architecture to Embrace Third-Party Tooling and Deliver Reliability</h4>
<p>Synthetic Data Studio’s LLM-agnostic architecture is built for flexibility and leverages both Amazon Bedrock and <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference</a>, which allows it to support advanced techniques like knowledge distillation, free-form data generation, supervised fine-tuning, reinforcement learning, and preference optimization (KTO, DPO, PPO, ORPO) to build reasoning models for agentic systems. This adaptability is paired with scalable performance through parallel processing and fallback mechanisms, ensuring reliability even with large datasets.&nbsp;</p>
<p>Seamless integration with CI/CD pipelines via Cloudera AI Workbench Jobs API ensures synthetic data generation and augmentation workflows align with enterprise DevOps practices. This integration reduces friction and accelerates time-to-value for AI projects.&nbsp;</p>
<p>And integration with other Cloudera AI Studios, such as the Fine-Tuning Studio, further streamlines workflows. Whether refining models, testing agentic systems, or optimizing for specific use cases, Synthetic Data Studio provides the tools to accelerate development without compromising security.</p>
<h4>Use Cases and Impact: 95% Reduction in Processing Time</h4>
<p>The real value of Synthetic Data Studio becomes evident when applied to practical scenarios. For example, Cloudera’s customer support team used the studio to generate high-quality datasets for knowledge distillation to a smaller LLM, and the results were transformative. According to internal testing, processing time for support ticket analysis fell by 95% compared with a larger LLM; the distilled model achieved a 70% win rate against larger LLMs (like Goliath-120B); and compute resource requirements dropped significantly, enabling 11x throughput for real-time analytics.&nbsp;&nbsp;</p>
<p>The studio’s versatility extends beyond customer support. In the financial sector, synthetic transaction data can be used to train models for lending decisions without exposing customer information. In software development, synthetic coding problems and solutions improve LLM performance on code generation. For regulatory compliance, teams can test models against custom criteria to ensure adherence to standards.&nbsp;&nbsp;</p>
<h4>The Future of Private AI with Cloudera’s Synthetic Data Studio</h4>
<p>Synthetic Data Studio is a blueprint for how enterprises can innovate with AI while safeguarding data. By democratizing access to synthetic data generation methods, such as knowledge distillation, Cloudera empowers organizations to:&nbsp;</p>
<ul>
<li><p>Reduce costs: Use smaller distilled models specialized in specific use cases.</p>
</li>
<li><p>Compete with confidence: Leverage cutting-edge AI with regulatory compliance.&nbsp;&nbsp;</p>
</li>
<li><p>Build ethically: Establish trust by ensuring data privacy remains a competitive advantage.&nbsp;&nbsp;</p>
</li>
</ul>
<p>In business, where trust and compliance are paramount, Synthetic Data Studio offers a path forward. It’s not just about solving today’s challenges—it’s about enabling enterprises to lead tomorrow’s AI revolution responsibly.</p>
<p>As next steps, explore Synthetic Data Studio <a href="https://github.com/cloudera/CAI_AMP_Synthetic_Data_Studio">here</a>, or try our generative AI capabilities, powered by Cloudera AI, via our <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html?utm_medium=sem&amp;utm_source=google&amp;keyplay=ALL&amp;utm_campaign=FY25-Q2-GLOBAL-ME-PaidSearch-5-Day-Trial&amp;cid=701Hr000001fVx4IAE&amp;utm_term=cloudera&amp;gad_source=1&amp;gad_campaignid=20463667387&amp;gbraid=0AAAAACcmMQFnnF-_Cyh_jdg9sxNkQW_HE&amp;gclid=Cj0KCQjw0qTCBhCmARIsAAj8C4afzTNp9fPuKPWjM0pMt73cSi9Vb64CHDNcSknBjJ5g0DrDG4cz31UaAjSjEALw_wcB">5-day free trial</a> of Cloudera on cloud.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=privacy-first-enterprise-ai-innovation-with-cloudera-synthetic-data-studio</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>#ClouderaLife Employee Spotlight: Meet Susan Wulf, Cloudera’s Senior Director of Learning and Enrichment (L&amp;amp;E)</title><description><![CDATA[Let’s meet Susan Wulf and learn more about what it’s like growing a career with Cloudera.]]></description><link>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-susan-wulf-clouderas-senior-director-of-learning-and-enrichment.html</link><guid>https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-susan-wulf-clouderas-senior-director-of-learning-and-enrichment.html</guid><pubDate>Mon, 30 Jun 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Culture]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty534570145.jpg"><p>At Cloudera, we’re committed to fostering a culture of inclusion, where everyone feels welcomed, valued, and empowered to grow. That culture includes respecting individual aspirations by encouraging people to explore new interests, embrace fresh challenges, and shape their own career paths.&nbsp;</p>
<p>Leaders like Susan Wulf embody this spirit. A long-time Clouderan, Wulf stepped into a role that aligned with her passion: leading the Learning and Development Team. What was once a lesser-known initiative is now a thriving program, thanks in large part to her vision, dedication, and drive.&nbsp;</p>
<p>Join us as we meet Susan Wulf and explore how Cloudera supported her career growth and empowered her to follow her interests.</p>
<h4>Meet Susan Wulf</h4>
<p>Susan Wulf is Cloudera’s Senior Director of Learning and Enrichment (L&amp;E), focused on building programs that help Clouderans grow, thrive and unlock their full potential. She joined the company in 2016 via Hortonworks, a <a href="https://www.cloudera.com/about/news-and-blogs/press-releases/2019-01-03-cloudera-and-hortonworks-complete-planned-merger.html">company that merged with Cloudera</a> in 2019, starting with the company as a Human Resources Business Partner. After establishing herself within HR, Susan was offered an exciting opportunity: to lead Cloudera’s L&amp;E team.&nbsp;</p>
<p>Susan was exposed to Learning &amp; Development early in her career and reignited that spark at Cloudera. She was drawn in by the opportunity to help others develop their skills, learn new things and explore opportunities as she had. Energized by the impact and potential of L&amp;E, she knew this was the direction she wanted to take her career.&nbsp;</p>
<p>“It’s been such a fun journey and a good change,” said Susan. “I never anticipated that leadership taking a chance on me would bring me here. Now I have the opportunity to work with an amazing team providing so much content to Clouderans.”&nbsp;</p>
<h4>Susan’s Cloudera Journey</h4>
<p>At the start of this career shift, Susan was understandably nervous about leading a relatively new L&amp;E team, but she stepped up to the challenge and never looked back.&nbsp;</p>
<p>“It was probably one of the scariest transitions in my career. But I also think it’s good to be scared sometimes,” said Susan. “It means you’re being challenged.”&nbsp;</p>
<p>Susan spearheaded a complete redesign of the L&amp;E department and led its global expansion. Under her guidance, it grew from a handful of courses into a robust library of programs and interest-driven resources that Clouderans increasingly seek out, with engagement numbers reflecting that momentum.&nbsp;</p>
<p>“When I look back five years ago to now, it’s pretty astonishing to see what we’ve created.”&nbsp;</p>
<h4>Cloudera Fosters a Community of Growth &nbsp;</h4>
<p>At Cloudera, Susan found the kind of support that’s hard to come by—a culture where people genuinely care and where growth is fueled by trust, flexibility and connection. Whether she was encouraged to explore a new career path or received unwavering support during a major life transition, she was continuously met with understanding and opportunity.&nbsp;</p>
<p>From early mentorship to leadership that asked the right questions and believed in her potential, Susan experienced firsthand how Cloudera invests in its people. “It’s been a meaningful opportunity, and one that isn’t easy to find elsewhere,” she said.</p>
<p>She emphasized that at Cloudera, support comes through in everyday interactions, in the way people listen, collaborate and create space for each other to grow both personally and professionally. It is unlike what she had experienced before.&nbsp;</p>
<h4>About Learning and Development</h4>
<p>It took a massive effort to build Learning and Development into what it is today. Through thoughtful leadership discussions, post-program surveys and valuable input from employees across the company, Cloudera now has a thriving program rooted in the things that most interest Clouderans.&nbsp;&nbsp;</p>
<p>Cloudera’s expansive e-learning library, built on platforms like Udemy and Articulate Suite, fosters interactive global workplace training and career exploration. Whether learners prefer self-paced e-learning, in-person sessions, interactive breakout discussions, visual presentations or live instruction, Susan ensures there are options to meet every learning style.&nbsp;&nbsp;</p>
<p>The result is a dynamic and inclusive learning environment that continues to evolve with the needs of its people. Susan attributes much of this success to a culture that listens first, adapts often and meets learners exactly where they are.&nbsp;</p>
<h4>Programs That Spark Growth</h4>
<p>The project that excites Susan most right now is the Executive Presence course, which focuses on crafting messages for executive audiences with attention to delivery, pitch and tone.</p>
<p>“I’m excited for Clouderans to take part in the course and then apply what they’ve learned in their day-to-day interactions. It’s the part I’m looking forward to the most—it’s truly exciting.”&nbsp;</p>
<h4>Beyond Cloudera</h4>
<p>Susan lives in Denver, Colorado, with her husband Tim and their two young children, Emma and Jack, who keep them busy outdoors, whether playing in the backyard, hiking or swimming. A Pilates and yoga enthusiast, she enjoys staying active and spending time with family and friends when she’s not helping Clouderans build their skill sets or discover their full potential.&nbsp;</p>
<h4>Closing Thoughts</h4>
<p>Susan Wulf’s journey is a strong reflection of Cloudera’s growth-minded culture. Her story shows what’s possible when people are supported, curious and open to new challenges. With a team-first mindset and a passion for helping others grow, Susan not only shapes her own path but also creates opportunities for those around her.&nbsp;</p>
<p>At Cloudera, when people are encouraged to explore their career paths and supported by a culture that genuinely cares, the possibilities continue to grow.&nbsp;</p>
<p>Want to keep reading? <a href="https://www.cloudera.com/services-and-support/professional-services.html">Click here to explore more</a> about Learning and Development at Cloudera, or <a href="https://www.cloudera.com/blog/culture/clouderalife-employee-spotlight-meet-orla-mccarthy-cloudera-s-vice-president-of-professional-services-emea.html">click here</a> to meet another inspiring Clouderan.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderalife-employee-spotlight-meet-susan-wulf-clouderas-senior-director-of-learning-and-enrichment</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The 2025 Enterprise Tech Event Season: What to Watch and Why It Matters</title><description><![CDATA[To keep pace with the latest advancements in AI, IT leaders have to stay entrenched in the action. That’s why Cloudera is scheduled to share insights based on what we’re seeing from our enterprise customers at all the major 2025 tech events in this year’s circuit. Together, we’re shaping AI, cloud and enterprise IT.]]></description><link>https://www.cloudera.com/blog/business/the-2025-enterprise-tech-event-season-what-to-watch-and-why-it-matters.html</link><guid>https://www.cloudera.com/blog/business/the-2025-enterprise-tech-event-season-what-to-watch-and-why-it-matters.html</guid><pubDate>Wed, 25 Jun 2025 11:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-girl-ai-agent.webp"><p>To keep pace with the latest advancements in AI, IT leaders have to stay entrenched in the action. That’s why Cloudera is scheduled to share insights based on what we’re seeing from our enterprise customers at all the major 2025 tech events in this year’s circuit. Together, we’re shaping AI, cloud and enterprise IT.</p>
<p>The early events in this year’s circuit, Gartner Data &amp; Analytics Summit in Tokyo (<a href="https://www.gartner.com/en/conferences/apac/data-analytics-japan">Gartner D&amp;A Tokyo</a>), Gartner Data &amp; Analytics Summit in London (<a href="https://www.gartner.com/en/conferences/emea/data-analytics-uk">Gartner D&amp;A London</a>), and Dell Technologies World 2025 (<a href="https://www.dell.com/en-us/dt/events/delltechnologiesworld/2025/index.htm">DTW 2025</a>), revealed a resounding focus: determining the corporate preparation and soft skills needed to implement a private enterprise AI strategy in 2025.</p>
<p>Here’s what you missed if you couldn’t attend these events.</p>
<h3>Agentic AI is Accelerating Across Industries</h3>
<p>This year’s events have so far proven that enterprises are actively deploying private agentic AI (agentic AI trained and used exclusively on internal company data) across key sectors, including healthcare, financial services, industrial and telecommunications. Cloudera also observed this to be true in our 2025 report, <a href="https://www.cloudera.com/campaign/the-future-of-enterprise-ai-agents.html">The Future of Enterprise AI Agents</a>, which surveyed 1,484 enterprise IT leaders across 14 countries.</p>
<p>During <a href="https://www.dell.com/en-us/dt/events/delltechnologiesworld/2025/index.htm">Dell Tech World</a>, a Cloudera-hosted session on Private Agentic AI noted that while some trends in adoption are universal, there are nuances by industry. Each sector must plan to address a unique mix of obstacles (technical, organizational, and ethical) when rolling out private AI and AI agents, or plan to fail long term.</p>
<h3>Hybrid Cloud is Becoming the Standard</h3>
<p>A hybrid cloud approach to data management is quickly becoming a standard for enterprises hoping to leverage emerging technologies like AI. <a href="https://www.cloudera.com/content/dam/www/marketing/resources/whitepapers/cio-whitepaper-data-architecture-and-strategy-in-the-ai-era.pdf.landing.html">93% of organizations</a> are moving to hybrid and multi-cloud, according to <a href="https://www.linkedin.com/in/jake-bengtson-2b807240/">Jake Bengtson</a> at Dell Tech World.</p>
<p>But early signals from the 2025 tech event circuit reveal a clear distinction: success hinges not just on adopting hybrid cloud, but on embracing a true hybrid model.&nbsp;</p>
<p>A <a href="https://www.cloudera.com/content/dam/www/marketing/resources/whitepapers/what-is-true-hybrid-checklist.pdf.landing.html">true hybrid</a> model, one that spans the full data lifecycle, from ingestion to transformation, warehousing, and machine learning, and treats the entire enterprise environment (data center, cloud, and edge) as a unified platform, unlocks innovation, better governance, and more scalable operations faster.</p>
<p>Still, many enterprise IT leaders see hybrid as a transitional state rather than a long-term strategy. Speakers at Dell Tech World 2025 emphasized a need to switch that mindset, as it may signal the difference between enterprises that fully leverage their data, and those that continually struggle to do so.</p>
<h3>The Biggest Barrier to Adoption is Privacy</h3>
<p>Another emerging theme is that there are clear and consistent barriers for organizations hoping to adopt enterprise AI. At Gartner D&amp;A London, speakers <a href="https://www.linkedin.com/in/wimstoop/">Wim Stoop</a> and <a href="https://www.linkedin.com/in/boaz-rubin-b7b1786/?originalSubdomain=il">Boaz Rubin</a> talked about what they have seen to be the top three data governance and trust barriers:</p>
<h3>Privacy – “Can I trust my proprietary data’s safety in public AI models?”</h3>
<p>The short answer is no. Public AI will never be as safe as private AI, which is why organizations are focusing their efforts here. <a href="https://www.cloudera.com/content/dam/www/marketing/resources/whitepapers/cio-whitepaper-data-architecture-and-strategy-in-the-ai-era.pdf.landing.html">72% of organizations</a> have cited “privacy” as both a top focus and barrier to adoption. The organizations that achieve the full potential of AI will be the ones that implement a private AI strategy centered on robust assurances around training data, model integrity, and respect for security and privacy.</p>
<h3>Reliability – “Can I trust that my data quality will give useful AI results?”</h3>
<p>Even with a custom private AI implementation, IT and data leaders are not always confident in the consistency and availability of enterprise data, Stoop and Rubin report. That’s why having a single data platform that spans local and cloud infrastructures will help address this challenge—it provides unified governance, consistent data quality, and the scalability needed to adopt increasingly large AI models.</p>
<h3>Responsibility – “Can I trust my AI models will give meaningful insight?”</h3>
<p>Using an AI agent adds an extra layer between users and raw data, which can complicate transparency and confidence in outcomes. IT and data leaders recognize the need to build trust and integrity into their AI models, ensuring the insights generated meet or exceed the standards of accuracy and relevance they would expect from manual research. Otherwise, results risk being incomplete, inconclusive, or inaccurate. To address this, Stoop and Rubin shared that leaders are increasingly turning to flexible and scalable <a href="https://www.cloudera.com/solutions.html">cloud management technologies</a> that support AI model training, inferencing, and the transformation of data into actionable insights.</p>
<h3>Deployment Should Depend on Business Goals</h3>
<p>Another key theme repeated throughout this year’s events is that a private enterprise AI strategy is not one-size-fits-all. Speakers across industries reported what should come as no surprise—deployment will look different depending on the organization’s business imperatives—whether focused on economy, resilience, performance, carbon footprint, or compliance.</p>
<p><a href="https://www.cloudera.com/content/dam/www/marketing/resources/whitepapers/what-is-true-hybrid-checklist.pdf.landing.html">True hybrid</a> is critical for tailoring deployment to business goals. It enables enterprises to move data and analytics to where they are best placed. Operating as a single platform, data and workloads move friction-free multi-directionally, and IT leaders can use a single, common control plane regardless of where or how data and analytics are deployed.</p>
<h3>The Top Soft Skill for Leaders is Vision</h3>
<p>What’s the difference between organizations that successfully implement AI and those that don’t? According to <a href="https://www.linkedin.com/in/yaelbenarie/">Yael Ben Arie</a>, the CEO of Octopai, a company <a href="https://www.cloudera.com/about/news-and-blogs/press-releases/2024-11-14-cloudera-to-acquire-octopais-platform.html">recently acquired</a> by Cloudera, and Cloudera’s <a href="https://www.linkedin.com/in/navitasood/">Navita Sood</a>, it’s a leader with vision.</p>
<p>It has become apparent that AI implementation requires data leaders to have both hard tech skills and strategic soft skills. They’re often tasked with erasing years of neglected, abandoned data, curing the company of data obesity, and building the most trusted, scalable, AI-ready data ecosystem—all while keeping the lights on.</p>
<p>During the Beyond AI: The Soft Skills &amp; Methodologies That Set Data Leaders Apart discussion at Gartner D&amp;A Orlando, Arie and Sood shared that the most effective data leaders all have a clear goal or set outcome in mind. It might evolve over time, but it’s there from the start. AI is rapidly transforming the way enterprises approach automation and decision-making, so leaders need to know where they’re headed (at least on a macro level) or they risk getting distracted or derailed.</p>
<h3>Continue the Conversation: Join Us at an Upcoming Event</h3>
<p>Ready to join the conversation and help shape the future of enterprise AI? <a href="https://www.cloudera.com/events.html">Click here</a> for more information about where we’ll be and how to schedule a meeting or join us at any of these upcoming events:</p>
<ul>
<li><p>AWS Summit NYC: July 16</p>
</li>
<li><p>AWS Summit Mexico City: Aug. 6</p>
</li>
<li><p>Big Data Paris: Oct. 1-2</p>
</li>
<li><p>Gitex 2025: Oct. 13-17</p>
</li>
<li><p>AWS re:Invent</p>
</li>
</ul>
<p>Or <a href="https://www.cloudera.com/events/evolve.html">join us at an EVOLVE25 event near you</a> to connect with industry visionaries, data and AI experts, and your peers to explore the impact of accessible data and AI across industries.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-2025-enterprise-tech-event-season-what-to-watch-and-why-it-matters</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>The Future Is Already Here—And It&apos;s Agentic</title><description><![CDATA[Let me take you on a journey—not into some far-off sci-fi future, but into a tomorrow that’s just around the corner and uses Agentic AI.]]></description><link>https://www.cloudera.com/blog/business/the-future-is-already-here-and-its-agentic.html</link><guid>https://www.cloudera.com/blog/business/the-future-is-already-here-and-its-agentic.html</guid><pubDate>Mon, 23 Jun 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Sergio Gago]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1178328942.jpg"><p>Let me take you on a journey—not into some far-off sci-fi future, but into a tomorrow that’s just around the corner.</p>
<p>Imagine this: you walk into your workplace and some of your “colleagues” are no longer human. They’re not robots in the traditional sense, but agents—autonomous software entities, each trained on vast datasets, equipped with decision-making power, and capable of performing economic, civic, and operational tasks at scale. These agents write policies, monitor supply chains, process health records, generate news, and even govern our digital interactions.</p>
<p>This isn’t a scene from a movie. It’s the tectonic shift heading our way, and it will change how we work, how our governments function, and even how our communities operate. In this world, digital public infrastructure (DPI) will not be a convenience. It will be a lifeline.</p>
<h2>Sovereignty in the Age of Agents</h2>
<p>We like to say, “Everyone has data.” But the real question is: Where is it? Who controls it? Who governs access to it? In a world run by agents, these questions aren’t just technical—they’re about power and independence.</p>
<p>A sovereign nation that cannot locate, trust, or manage its data is no longer sovereign. A government that cannot verify what its own agents have learned—or with whom they are communicating—is no longer governing.</p>
<p>To survive and thrive in this new ecosystem, DPI must evolve into what I call Digital Shoring: a foundation for sovereign, trusted, and open environments built on four pillars:</p>
<ol>
<li><p><b>Open Data</b> – not just access, but trust. Data lineage, provenance, and verifiable governance. Knowing where your data came from and where it’s going is no longer optional.</p>
</li>
<li><p><b>Open Source Software</b> – because critical infrastructure built on black boxes is neither secure nor sovereign.</p>
</li>
<li><p><b>Open Standards</b> – because without shared protocols, agents can’t cooperate, institutions can’t interoperate, and governments can’t govern.</p>
</li>
<li><p><b>Open Skills</b> – because the capacity to read a balance sheet—or audit a neural net—shouldn’t belong to a privileged few.</p>
</li>
</ol>
<p>This is the backbone of an agentic society that is fair, sovereign, and resilient.</p>
<h3>Agentic Intelligence: More Than Just Fancy Tools</h3>
<p>Let’s talk about what agents actually are—and what they aren’t.</p>
<p>Imagine I hand a company’s financial statement to two readers: a junior analyst and a seasoned economist. Both might understand the numbers, but only one can extract strategic insight. Similarly, agents can read, analyze, and reason—but the quality of their actions depends entirely on the skills they are equipped with. These skills can be trained, acquired, or—critically—shared.</p>
<p>In public sector contexts, this presents an extraordinary opportunity. Why should every institution reinvent the same agent? Why can’t the skills of a fraud detection agent used in one department be transferred, securely and ethically, to another?</p>
<p>Just as people share their expertise, we need infrastructure for sharing agentic capabilities across digital institutions. This is where organizations like the UN can help, by setting the standards and guiding their adoption through the Global Digital Compact initiative.</p>
<h3>From “Sovereign Cloud” to “Sovereign AI Platforms”</h3>
<p>Right now, much of the conversation is about keeping data inside national borders. But in the world of agents, that is not enough. What really matters is where and how models are trained, how they are managed, and how we keep them in check.</p>
<p>We need Sovereign AI Platforms that manage agents much as HR departments manage employees: verifying credentials, ensuring alignment, monitoring performance, and enabling collaboration.</p>
<p>At Cloudera, we’re developing the scaffolding for such platforms: secure <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">hybrid AI environments</a>, open-source data pipelines, governance-first orchestration layers, and modular LLM serving infrastructure that respects national compliance frameworks. But no company can do this alone. This is a global mission.</p>
<h3>Open by Design. Governed by Default.</h3>
<p>Governments around the world are already realizing that private AI cannot be built on public cloud monopolies. Digital identity and agent oversight need to be open and transparent, not hidden, ad hoc, or opaque.</p>
<p>So the future must be open by design (in code, in data, in protocols) but governed by default. That means Digital IDs that authenticate not only humans but also agents and their behavior, knowledge graphs that maintain shared institutional knowledge across systems, and audit trails that document every decision, every inference, every prompt.</p>
<p>This isn’t just about technology. It is about building a new kind of digital society—one designed to empower states, safeguard citizens, and align intelligence with democratic values.</p>
<h3>The Path Forward</h3>
<p>This transformation will not be easy. It will require bold policy, sustained investment, cross-border cooperation, and—above all—technical leadership grounded in values.</p>
<p>But make no mistake: digital cooperation is not optional. It is the condition for sovereignty in an agentic world. Without it, we are left with silos, vendor lock-in, and algorithmic drift. With it, we build a future where intelligence—human or machine—serves the public good.</p>
<p>So let’s move beyond the buzzwords. Let’s build platforms, protocols, and public goods that are open, modular, and sovereign. Let’s treat agents not just as tools, but as members of a digital society in need of governance, trust, and cooperation.</p>
<p>And maybe—just maybe—when we look back at today from the vantage point of tomorrow, we’ll remember this moment not as a crisis, but as the moment we chose to govern the future together.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=the-future-is-already-here-and-its-agentic</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Streamlining Critical Business Capabilities in Financial Services with Cloudera</title><description><![CDATA[Drive data analytics for financial services with Cloudera. Leverage AI in financial services to boost efficiency, compliance, risk management.]]></description><link>https://www.cloudera.com/blog/business/streamlining-critical-business-capabilities-in-financial-services-with-cloudera.html</link><guid>https://www.cloudera.com/blog/business/streamlining-critical-business-capabilities-in-financial-services-with-cloudera.html</guid><pubDate>Fri, 13 Jun 2025 11:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Andreas Skouloudis]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-bank-front.jpg"><p>We’re living in an era of unprecedented transformation in financial services. Powerful technology and business disruptions are impacting the market, including <a href="/content/www/en-us/why-cloudera/enterprise-ai.html">generative AI (GenAI)</a>, cloud computing, an evolving regulatory environment, and financial product innovations, such as digital assets and payment models. In response, financial services organizations are accelerating their efforts to digitize their operations and deliver a consistent, data-driven, digital-first customer experience across channels.</p>
<p>However, as firms attempt to capitalize on technological innovations and maximize value from these investments, they’re running into several challenges, including:&nbsp;</p>
<ul>
<li><p>High cloud costs for compute-intensive tasks, such as AI/ML training and data engineering</p>
</li>
<li><p>Incomplete insights due to data and technology silos, which negatively impact decision-making</p>
</li>
<li><p>Lack of real-time responses due to data duplication across systems, which increases latency in data pipelines</p>
</li>
<li><p>Lack of a comprehensive, fine-grained security and governance model from existing data and analytics tools</p>
</li>
</ul>
<h3>Why Financial Services Firms are Partnering with Cloudera</h3>
<p>As the <a href="/content/www/en-us/why-cloudera/hybrid-data-platform.html">only true hybrid platform</a> for data, analytics, and AI, Cloudera is uniquely positioned to help financial services firms overcome these challenges, successfully progress their digital transformation initiatives, and embrace a modern data architecture.&nbsp;</p>
<p>Some of our key differentiators include:</p>
<ul>
<li><p>Multi-function analytics for building solutions across the data lifecycle, including real-time and batch data movement, AI/ML, GenAI model contextualization and deployment, data engineering, and data warehousing for compute-intensive workloads</p>
</li>
<li><p>Vendor-agnostic deployment model, supporting cloud and on-premises environments across vendors and regions</p>
</li>
<li><p>Integrated security and governance, offering a consistent, fine-grained access model across data services and deployment models to meet the most demanding and complex security requirements</p>
</li>
<li><p>Open-table format (<a href="/content/www/en-us/open-source/apache-iceberg.html">Apache Iceberg</a>) and the ability (through <a href="/content/www/en-us/about/news-and-blogs/press-releases/2024-08-06-cloudera-strengthens-metadata-management-with-modernized-data-catalog-and-iceberg-rest-integration.html">Iceberg REST Catalog</a>) to integrate with the company’s broader data and analytics landscape</p>
</li>
</ul>
<p>Additionally, from our work with over 450 financial services institutions across regions, we’ve identified several critical business capabilities where Cloudera provides exceptional business value. Let’s look at a few examples:&nbsp;</p>
<h3>Regulatory Compliance</h3>
<p>While more financial data provides more granular insights into risk, managing increasing volumes of data can also make it more difficult to maintain regulatory compliance. This is especially true for banks that are struggling with the multiple data silos so often found in traditional architectures. Inflexible architectures also make it extremely difficult to address new or evolving regulatory requirements, such as <a href="/content/www/en-us/blog/business/embrace-a-hybrid-data-platform-for-dora-compliance.html">DORA</a>, a framework designed to strengthen the operational resilience of financial institutions to combat cyberattacks specifically targeting this industry.&nbsp;</p>
<p>Cloudera can serve as the backbone of a hybrid <a href="/content/www/en-us/blog/technical/laying-the-foundation-for-modern-data-architecture.html">modern data architecture</a> by enabling organizations to extend their on-premises analytical footprint to the cloud and use transient compute resources for end-of-month or end-of-quarter tasks, such as regulatory reporting. In addition, it can streamline complex data management tasks, such as auditing historical data and modeling market scenarios through the time-travel capabilities of <a href="/content/www/en-us/open-source/apache-iceberg.html">Apache Iceberg</a>.&nbsp;</p>
<h3>Financial Risk Management</h3>
<p>Financial risk management, whether evaluating market, credit, or liquidity risk, is at the heart of a bank’s operations. As a result, banks need to continuously evolve their risk management strategies and reduce the time it takes to complete risk-related analytics processes (for example, stress testing).</p>
<p><a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a> streamlines the training and deployment lifecycle of data science models to deliver innovative AI/ML models for risk management. In addition, Apache Iceberg simplifies the process of integrating new risk attributes into existing models by optimizing foundational data management tasks, such as schema and partition evolution.&nbsp;</p>
<h3>Fraud Prevention</h3>
<p>Cyberattacks and fraud attempts are becoming increasingly sophisticated as cybercriminals take advantage of new technologies, AI most notably. To combat these threats, banks also need to leverage these new technologies, yet traditional, inflexible architectures can make it difficult to implement new solutions quickly and seamlessly.&nbsp;</p>
<p>Cloudera addresses multifaceted cyberthreats by offering real-time data processing capabilities using <a href="/content/www/en-us/products/dataflow.html">Cloudera Data Flow</a> and <a href="/content/www/en-us/products/stream-processing.html">Cloudera Streaming</a>, enabling prompt detection and response. In addition, it offers a comprehensive AI deployment service to prevent high-volume, real-time fraud attempts by optimizing the underlying NVIDIA GPUs through <a href="/content/www/en-us/about/news-and-blogs/press-releases/2024-10-08-cloudera-unveils-ai-inference-service-with-embedded-nvidia-nim-microservices-to-accelerate-genai-development-and-deployment.html">NIM Microservices</a>. By leveraging Cloudera to train models on all of a firm’s data, financial services companies can produce more accurate models that result in fewer false positives, reducing friction for customers while keeping their assets safe.&nbsp;</p>
<h3>Next Steps: Innovation in Financial Services</h3>
<p>If you want to learn more about how Cloudera is accelerating innovation in financial services, <a href="/content/dam/www/marketing/resources/whitepapers/accelerating-financial-services-innovation.pdf.landing.html">check out our whitepaper</a>. It includes several customer success stories across different financial services verticals and regions. Or, see for yourself how Cloudera can benefit your organization with our <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">5-day free trial</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=streamlining-critical-business-capabilities-in-financial-services-with-cloudera</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera Supercharges Your Private AI with Cloudera AI Inference, AI-Q NVIDIA Blueprint, and NVIDIA NIM</title><description><![CDATA[Together, Cloudera and NVIDIA empower businesses to leverage the latest advancements in AI easily, efficiently, and cost-effectively on all of their data, whether public or private.]]></description><link>https://www.cloudera.com/blog/partners/cloudera-supercharges-your-private-ai-with-cloudera-ai-inference-nvidia-ai-q-and-nvidia-nim.html</link><guid>https://www.cloudera.com/blog/partners/cloudera-supercharges-your-private-ai-with-cloudera-ai-inference-nvidia-ai-q-and-nvidia-nim.html</guid><pubDate>Wed, 11 Jun 2025 14:00:00 UTC</pubDate><comments/><category><![CDATA[Partners]]></category><dc:creator><![CDATA[Zoram Thanga,Dennis Duckworth]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty-1731101767.jpg"><p>As we speak with our customers about their goals for AI, a common pain point we hear is that their plans and implementations are sometimes stalled due to concerns about privacy. They want to use AI on all of their corporate data since that is the way their employees and customers will get the most accurate results and answers, but they realize they can’t send their data out to a public endpoint for a closed-source large language model (LLM) since, 1) there is too much data, and 2) their data would no longer be private.</p>
<p>To address these concerns, Cloudera has begun espousing the concept of <a href="/content/www/en-us/blog/business/generative-ai-needs-to-become-private-to-thrive-introducing-private-ai.html">Private AI</a>, which would allow these customers to get all of the benefits that AI brings and keep their proprietary data safe and secure.</p>
<p>NVIDIA is seeing the same challenge, but at a much higher and broader level: nation states. Governments are realizing that it isn’t in the best interests of their nations to run AI in another country, so they’re working to build out the infrastructure that they need to keep their data and their AI within their own borders. They can then control what other countries or entities they share their data or AI results with.</p>
<p>At the GTC Paris conference today, NVIDIA provided the building blocks for <a href="https://blogs.nvidia.com/blog/what-is-sovereign-ai/" target="_blank" rel="noopener noreferrer">Sovereign AI</a>&nbsp;to support governments in their efforts. This initiative aligns well with Cloudera’s focus on enabling our customers to implement their own Private AI platforms.&nbsp;</p>
<p>NVIDIA made two other announcements that are of particular interest to Cloudera. In this blog, we’ll dive into the <a href="https://build.nvidia.com/blueprints" target="_blank" rel="noopener noreferrer">AI-Q NVIDIA Blueprint for Enterprise Research</a> and <a href="https://www.nvidia.com/en-us/ai-data-science/products/nim-microservices/" target="_blank" rel="noopener noreferrer">NVIDIA NIM</a>, and what they mean for our customers.</p>
<h2>AI-Q NVIDIA Blueprint with Cloudera AI</h2>
<p>NVIDIA’s&nbsp;introduction of the AI-Q blueprint for enterprise research provides&nbsp;<a href="/content/www/en-us/products/machine-learning.html">Cloudera AI</a>&nbsp;with more capabilities for supporting our customers’ complex agentic AI needs.&nbsp;</p>
<p>Cloudera AI Inference can host all of the <a href="https://developer.nvidia.com/nemo-retriever" target="_blank" rel="noopener noreferrer">NVIDIA NeMo Retriever</a> and LLM inference microservices that make up the <a href="https://blogs.nvidia.com/blog/ai-agents-blueprint/" target="_blank" rel="noopener noreferrer">AI-Q NVIDIA Blueprint</a>, including <a href="https://www.nvidia.com/en-us/ai-data-science/foundation-models/llama-nemotron/" target="_blank" rel="noopener noreferrer">NVIDIA Llama Nemotron</a> reasoning models. Combining the strong privacy and security provided by the Cloudera AI platform for the model endpoints with the powerful <a href="https://docs.nvidia.com/aiqtoolkit/latest/index.html" target="_blank" rel="noopener noreferrer">NVIDIA Agent Intelligence toolkit</a>, you can take your enterprise agentic applications to the next level.</p>
<h3><b>Benefits of Using AI-Q NVIDIA Blueprint with Cloudera AI</b></h3>
<p>Leveraging AI-Q NVIDIA Blueprint within Cloudera AI Inference service unlocks massive AI potential. This powerful combination integrates leading reasoning models packaged as NVIDIA NIM and NeMo Retriever microservices onto Cloudera AI, and it ensures seamless connectivity between agents, tools, and data through full compatibility with the NVIDIA Agent Intelligence toolkit.</p>
<p>This multi-framework capability empowers organizations to build sophisticated enterprise retrieval-augmented generation (RAG) applications with robust privacy and security, taking full advantage of state-of-the-art AI advancements.</p>
<h2>NVIDIA NIM Microservice with Cloudera AI Inference</h2>
<p>NVIDIA's NIM&nbsp;container is a game-changer for getting the best performance from LLMs quickly and easily: it significantly speeds up LLM deployment and inference by automatically selecting the best inference backend based on the model and GPU hardware, enabling&nbsp;a model-agnostic inference solution that streamlines the production serving of numerous cutting-edge LLMs.&nbsp;</p>
<p>Digging deeper, the NVIDIA NIM microservice enables users to quickly deploy LLMs accelerated by <a href="https://docs.nvidia.com/tensorrt-llm/index.html" target="_blank" rel="noopener noreferrer">NVIDIA TensorRT-LLM</a>, vLLM, or SGLang for top-tier inference on any NVIDIA accelerated platform. It supports models stored in Hugging Face or TensorRT-LLM formats, enabling enterprise-grade inference for a vast array of LLMs. Users can rely on smart defaults for optimized latency and throughput or fine-tune performance with simple configuration options. As part of <a href="https://www.nvidia.com/en-us/data-center/products/ai-enterprise/" target="_blank" rel="noopener noreferrer">NVIDIA AI Enterprise</a>, the NVIDIA NIM microservice receives continuous updates from NVIDIA, ensuring compatibility with a wide range of popular LLMs.</p>
<h3><b>Benefits of Using the&nbsp;NVIDIA NIM within Cloudera AI Inference</b></h3>
<p>NVIDIA's NIM provides our customers more flexibility in how they can make use of LLMs in their AI applications. Cloudera AI Inference service already has NVIDIA NIM embedded in it, so customers can implement the NVIDIA NIM microservice quickly and easily. Customers get the benefits of NVIDIA NIM with the ease of use, security, and streamlined support of a single, unified&nbsp;platform: Cloudera.</p>
<p>Through its seamless integration into our AI Inference service, NVIDIA NIM microservice offers significant advantages for Cloudera AI customers, including:</p>
<ul>
<li><p>Accelerated deployment: Get your LLM applications up and running faster with pre-built, optimized containers.</p>
</li>
<li><p>Enhanced performance: Leverage the full potential of NVIDIA accelerated computing for high-speed inference and reduced latency.</p>
</li>
<li><p>Scalability: Easily scale your LLM deployments to meet the demands of your growing business.</p>
</li>
<li><p>Simplified management: Manage and monitor your LLM deployments with Cloudera's intuitive interface.</p>
</li>
</ul>
<h3>Conclusion</h3>
<p>Together, Cloudera and NVIDIA empower businesses to leverage the latest advancements in AI easily, efficiently, and cost-effectively on all of their data, whether public or private. By simplifying the AI application lifecycle, from development to deployment, and by optimizing performance, we're helping our users unlock the full potential of AI.</p>
<p>Be sure to check out <a href="https://blogs.nvidia.com/blog/sovereign-ai-agents-factories/" target="_blank" rel="noopener noreferrer">NVIDIA’s blog</a> about announcements out of GTC Paris and Cloudera’s blogs on AI, especially the most recent one about “<a href="/content/www/en-us/blog/business/ai-in-a-box--experience-the-future-of-private-ai-at-dell-technologies-world-part-1.html">AI in a Box</a>,” powered by Dell, NVIDIA, and Cloudera, which gives customers a new way to implement Private AI quickly, easily, and with minimal risk.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=cloudera-supercharges-your-private-ai-with-cloudera-ai-inference-nvidia-ai-q-and-nvidia-nim</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Bringing Context to GenAI with Cloudera MCP Servers</title><description><![CDATA[Enhance enterprise GenAI with Cloudera MCP Servers. Find out how you can securely deploy context-aware AI by using MCP servers.]]></description><link>https://www.cloudera.com/blog/technical/bringing-context-to-genai-with-cloudera-mpc-servers.html</link><guid>https://www.cloudera.com/blog/technical/bringing-context-to-genai-with-cloudera-mpc-servers.html</guid><pubDate>Thu, 05 Jun 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Technical]]></category><dc:creator><![CDATA[Peter Ableda]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-person-servers.png"><p><small>Figure 1: Two scenarios of AI agents accessing data for AI context:</small></p>
<ul>
<li><small>Left: Without a common protocol, AI agents must handle multiple unique APIs to access context from each source. </small></li>
<li><small>Right: MCP unifies access, enabling agents to retrieve context through a single interface, simplifying integration and improving scalability.&nbsp;</small></li>
</ul>
<h2>Agentic Architectures Need a Standard Integration Layer</h2>
<p>As organizations rush to <a href="/content/www/en-us/blog/business/the-breakout-year-for-enterprise-ai-agents.html">adopt agentic architectures</a>, a consistent integration layer is more important than ever.&nbsp;</p>
<p>“The frenzy around adopting agentic architectures is driving organizations to launch multiple initiatives in parallel. While this momentum is encouraging, it also risks creating the modern equivalent of spaghetti code—something we’ve seen before in the early days of software engineering. What companies truly need is a simplified, standards-based architecture that ensures interoperability across the diverse systems participating in the agentic ecosystem. Anthropic’s MCP is emerging as a promising standard in this space, already seeing broad adoption from AI vendors.”<br>
- Sanjeev Mohan, Principal at SanjMo and former Gartner analyst</p>
<p>MCP isn’t a proprietary Cloudera tool—it’s a widely adopted standard that avoids vendor lock-in while tapping into a growing ecosystem of tools. Cloudera’s approach to MCP Servers aligns with the MCP <a href="https://www.anthropic.com/news/model-context-protocol">philosophy</a> of openness, simplicity, and control. Cloudera MCP Servers run natively within Cloudera’s unified platform, eliminating risky data movement and enabling seamless deployment across both multi-cloud and on-premises environments.</p>
<h2>A Tenet of Private AI: Bringing AI Compute to Data</h2>
<p>AI’s transformative power relies on the quality of the data that fuels it. When data and AI systems operate in isolation, disconnected information delays insights, creates fragile pipelines, and leaves models without the necessary context for accurate decisions.&nbsp;</p>
<p><a href="/content/www/en-us/why-cloudera/enterprise-ai.html">Cloudera brings data and AI together</a> in a cohesive lifecycle. Data flows smoothly into AI workflows, governed by shared metadata, security policies, and optimized compute resources. This approach eliminates costly data duplication and movement while making every prediction traceable to its origin—ensuring transparency, trust, and compliance.</p>
<h2>Take the Next Step</h2>
<p>Ready to eliminate integration friction? <a href="https://github.com/cloudera/iceberg-mcp-server">Explore Cloudera MCP Server for Apache Iceberg</a>, currently available in preview, and discover how you can empower your AI applications with the context they need, right where your data lives. To put this into action today, <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">try our FREE 5-day trial</a>.</p>
<p>Three years ago, Cloudera customers began exploring generative AI to transform data interactions—building intelligent assistants, summarizing complex documents, and generating insights on demand. And today, our customers manage more than 25 exabytes (that’s 25 billion gigabytes!) of enterprise data across on-premises and cloud environments.</p>
<h2>The Context Gap in Enterprise AI</h2>
<p>How organizations manage their data is key: in the age of AI, context isn’t just helpful—it’s the difference between accurate decisions and hallucinations. AI models need seamless access to proprietary data to generate insights, answer questions, or automate workflows. Yet, in most organizations, this data remains fragmented across siloed object stores, Iceberg tables, Kafka streams, and operational databases. Developers waste valuable time writing custom connectors and maintaining fragile pipelines—a tax on innovation that slows time to value.</p>
<h2>Introducing Cloudera MCP Servers: A Universal Gateway to Your Data</h2>
<p>That’s where Cloudera’s Model Context Protocol (MCP) Servers come in. Our servers are built on MCP and provide a universal gateway to governed enterprise data. MCP is an <a href="https://docs.anthropic.com/en/docs/agents-and-tools/mcp">open standard</a> that aims to standardize AI integration in the same way that Microsoft Open Database Connectivity (ODBC) standardized access to relational databases (more on MCP in the next section).</p>
<p>To support this mission, we’re launching with Cloudera MCP Server for <a href="https://github.com/cloudera/iceberg-mcp-server">Apache Iceberg via Impala</a>. Apache Iceberg is the backbone of modern lakehouses, offering petabyte-scale management, ACID compliance, time travel, and granular governance. It’s the perfect starting point for bridging the gap between data and AI.</p>
<p>By starting with Apache Iceberg, we address a critical challenge: AI applications need real-time, governed access to analytical data without additional custom code. Our MCP Server enables developers to query Iceberg tables in natural language and integrate seamlessly with frameworks such as CrewAI, Microsoft AutoGen, LangChain, LangGraph, and LlamaIndex, as well as agentic AI toolkits that work with these frameworks, such as <a href="https://developer.nvidia.com/agent-intelligence-toolkit">NVIDIA Agent Intelligence (AIQ) toolkit</a>, all while maintaining robust security with Cloudera SDX policies. And this is just the beginning: future Cloudera MCP Servers will extend support to streaming data, operational databases, and file/object stores.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=bringing-context-to-genai-with-cloudera-mpc-servers</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Beyond the Textbook: Peter Norvig on the Future of AI Literacy</title><description><![CDATA[In a world where AI is rapidly reshaping everything from how we work to how we live, one truth stands out: education is the cornerstone of both innovation and safety.  ]]></description><link>https://www.cloudera.com/blog/business/beyond-the-textbook-peter-norvig-on-the-future-of-ai-literacy.html</link><guid>https://www.cloudera.com/blog/business/beyond-the-textbook-peter-norvig-on-the-future-of-ai-literacy.html</guid><pubDate>Wed, 04 Jun 2025 16:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Cloudera]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty1146067345.jpg"><p>In a world where AI is rapidly reshaping everything from how we work to how we live, one truth stands out: education is the cornerstone of both innovation and safety.&nbsp;&nbsp;</p>
<p>On <a href="https://www.cloudera.com/resources/podcast/the-ai-forecast.html">The AI Forecast</a>, host Paul Muller sat down with <a href="https://www.norvig.com/">Peter Norvig</a>, former Director of Research at Google, Stanford Education Fellow, and co-author of the world’s most-used AI textbook, <i>Artificial Intelligence: A Modern Approach</i>. Their discussion spanned decades of AI progress, the deep transformation of education, and what the next evolution of data might look like.</p>
<p>Here are some takeaways from Paul and Peter’s conversation.&nbsp;</p>
<p><b>AI education must go beyond a static textbook, according to the author of the go-to AI education textbook.</b></p>
<p><b>Paul:</b> How would you describe the state of AI education today?&nbsp;</p>
<p><b>Peter:</b> It’s overwhelming. There’s just too much going on, too fast. When I started, you could keep up with every new development. AI papers came out slowly, and textbooks were relevant for years. Now, there are a dozen breakthrough papers published every week.&nbsp;</p>
<p>I don’t think the “one textbook” model works anymore. What we need is something interactive and personalized. And we need to shift from the idea that you go to college for four years, get a degree, and never have to learn again. That’s not how the world works. AI education should be a lifelong, continually evolving experience.&nbsp;</p>
<p><b>Paul:</b> What’s your view on how AI is impacting education?&nbsp;</p>
<p><b>Peter:</b> AI is changing both what we teach and how we teach. Tools like <a href="https://chatgpt.com/">ChatGPT</a> or <a href="https://github.com/features/copilot">GitHub Copilot</a> can be great accelerators, but only if you already know what you’re doing. They can lead you astray if you don’t.&nbsp;</p>
<p>The real value of learning to program isn’t memorizing syntax – it’s developing judgment. It’s about knowing how to scope a problem, debug it, recover from failure, and think critically. That’s the mindset we should teach, whether people are coding or using AI tools. The goal is not to produce code. It’s to produce understanding.&nbsp;</p>
<p><b>AI is all about solving uncertainty.</b></p>
<p><b>Paul:</b> What does AI mean to you?&nbsp;</p>
<p><b>Peter:</b> In our book, we define it as making better decisions to accomplish your goals. But that’s also what economists and software engineers try to do. The real difference lies in the kind of problems AI takes on.&nbsp;</p>
<p>Traditional software is about complexity: writing exact rules to solve exact problems. But AI deals with uncertainty. You're asked to classify an image, decide what a sentence means, or predict how someone might vote. There’s often no ground truth. You're trying to make the best decision based on incomplete, noisy, or ambiguous data. That’s where AI lives.&nbsp;</p>
<p><b>Can AI give us the most unbiased view of our world?</b></p>
<p><b>Paul:</b> What are you most excited about when it comes to the future of AI?&nbsp;</p>
<p><b>Peter:</b> I’m excited about new ways of gathering data, especially video. We’ve made huge strides in training on text because it’s compact. Image models are catching up. But video? That’s still untapped.&nbsp;</p>
<p>The challenge is scale. Training on all of YouTube isn’t economically feasible today. But give it a few more generations of processing power, and we’ll get there. What excites me is that the video medium captures action. It gives a more accurate view of the world. And, unlike text or photos, it’s less biased.&nbsp;</p>
<p>Everything written is something someone thought was important. Every photo was taken deliberately. But some videos are just a camera running 24/7. That’s about as close as we get to an unbiased record of physical reality. Combine that with what we already know, and we start to connect the physics of the world with the psychology of the world.&nbsp;</p>
<p>We’ve improved processing power by a factor of 1,000 at least three times in my lifetime. We just need to do it one more time.&nbsp;</p>
<p>Catch the full conversation with Peter Norvig on The AI Forecast on <a href="https://podcasts.apple.com/gb/podcast/the-path-to-safe-ai-education-with-peter-norvig/id1779293119?i=1000704577281">Apple Podcasts</a>, <a href="https://open.spotify.com/episode/2KN4mo0e5VszMi2Nxgk0D8?si=Ra1QjkoLRPSAtSfvZUFJxQ">Spotify</a>, and <a href="https://www.youtube.com/watch?v=fWeIypJ22IM">YouTube</a>.</p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=beyond-the-textbook-peter-norvig-on-the-future-of-ai-literacy</wfw:commentRss><slash:comments>0</slash:comments></item><item><title>Cloudera&apos;s AI Studios: Making Advanced AI Accessible to All</title><description><![CDATA[The demand for AI-driven applications is surging, and enterprises have reached an inflection point where they can no longer afford fragmented, siloed development.]]></description><link>https://www.cloudera.com/blog/business/clouderas-ai-studios-making-advanced-ai-accessible-to-all.html</link><guid>https://www.cloudera.com/blog/business/clouderas-ai-studios-making-advanced-ai-accessible-to-all.html</guid><pubDate>Fri, 30 May 2025 13:00:00 UTC</pubDate><comments/><category><![CDATA[Business]]></category><dc:creator><![CDATA[Charu Anchlia,Robert Hryniewicz]]></dc:creator><content:encoded><![CDATA[<img class="cmp-blog-page__blog-banner blog-banner" src="/content/dam/www/marketing/blog/b01/b01-getty2080252145.jpg"><h3>RAG Studio</h3>
<p>The RAG Studio enables rapid development and deployment of RAG applications through a no-code interface. By integrating external knowledge sources with large language models (LLMs), users can create a more informed, context-aware <a href="/content/www/en-us/products/machine-learning.html">AI application</a> that excels at document search and question-answering tasks involving real-time, dynamic data.</p>
<h3>Synthetic Data Studio</h3>
<p>The Synthetic Data Studio provides users with powerful tools to create synthetic datasets for fine-tuning, model training or alignment, and evaluation. This studio offers a scalable and compliant alternative when real-world data is scarce or sensitive. By producing data that mirrors real-world patterns, the studio enables organizations to improve AI model and application robustness while ensuring compliance with regulations like CCPA in the US and GDPR in the EU.</p>
<p><span class="text-lead">The demand for AI-driven applications is surging, and enterprises have reached an inflection point where they can no longer afford fragmented, siloed development.&nbsp;</span></p>
<p>Traditionally, AI development has been done by data scientists or machine learning experts with deep expertise in multiple tools and frameworks. But now a new class of builders has emerged: developers who are enthusiastic about using the power of generative AI (GenAI) to solve their real-world use cases but who often lack specialized AI skills. Enterprises need solutions that reduce development complexity for these GenAI builders, giving them an easier, faster path to production while maintaining enterprise-grade security, governance, and scalability.</p>
<p>Additionally, conventional enterprise software upgrade cycles are too slow to match the pace of AI innovation. This delay puts organizations at risk of building solutions that are outdated before they are even deployed. Enterprises need adaptive, modular solutions that evolve in step with the AI landscape, ensuring that their solutions remain cutting-edge.</p>
<p><a href="/content/www/en-us/products/machine-learning/ai-studios.html">Cloudera AI Studios</a> can help solve these challenges: by providing modular, no-code tools with high-code extensibility, these studios accelerate AI adoption by guiding developers through various stages of the AI application lifecycle. AI Studios are designed to not only streamline development but also equip a broader range of users with knowledge about the underlying technologies—empowering organizations to solve meaningful business challenges with GenAI.</p>
<h2>Democratizing AI Innovation: The Strategic Vision and Design Behind Cloudera AI Studios</h2>
<p>Delivering real enterprise value with GenAI demands mastering distinct stages across the complete AI application lifecycle. We deliberately architected AI Studios to map directly to these critical stages, democratizing the entire process with intuitive, low-code tools accessible to all users—regardless of technical expertise.</p>
<p>By seamlessly guiding users through each development stage, our approach eliminates traditional barriers and dramatically accelerates time-to-value. Our comprehensive ecosystem addresses critical challenges across the GenAI lifecycle:</p>
<ul>
<li><p><b>Synthetic Data Studio</b> reimagines data availability by generating enterprise-grade synthetic datasets that solve compliance and data scarcity challenges.</p>
</li>
<li><p><b>Retrieval-Augmented Generation (RAG) Studio</b> transforms model intelligence by seamlessly connecting foundation models with organizational knowledge, delivering contextually aware AI.</p>
</li>
<li><p><b>Fine Tuning Studio</b> redefines model specialization through frictionless adaptation workflows that align generic models with specific domain expertise.</p>
</li>
<li><p><b>Agent Studio</b> pioneers the next frontier of business transformation through sophisticated agentic applications that deliver measurable value across the enterprise.</p>
</li>
</ul>
<h3>Fine Tuning Studio</h3>
<p>The Fine Tuning Studio serves as a one-stop shop for customizing foundation models to meet specific business needs through increased model accuracy and domain relevance. Without it, fine-tuning would require writing extensive code and managing complex workflows. Instead, users can train, compare, and evaluate adapters against base models through a single interface. With built-in support for supervised fine-tuning (SFT), MLflow-based evaluation, and native integration with Cloudera AI Workbench and Inference, Fine Tuning Studio simplifies and accelerates the entire model adaptation process.</p>
<h3>Customizable Across Expertise Levels</h3>
<p>The artificial boundary between technical and non-technical users has historically limited innovation. Our architectural approach breaks down this divide and differentiates AI Studios from other low-code solutions by offering seamless switching between visual interfaces and full code environments—customizable to each user’s expertise and needs. These deliberate &quot;escape hatches&quot; prevent lock-in to proprietary black-box solutions with limited functionality, giving executive decision-makers confidence to invest in low-code solutions for serious AI development.&nbsp;</p>
<p>Our architecture builds on the strong foundation of the Cloudera AI Workbench—an established, enterprise-grade, self-service data science product with developer-friendly features like interactive notebooks, models, jobs, and applications. By accessing <a href="/content/www/en-us/products/machine-learning/ai-studios.html">AI Studios</a> within the AI Workbench, developers can begin with intuitive visual interfaces and transition to custom coding environments as greater control or expertise is required.&nbsp;</p>
<p>We designed AI Studios this way because of our strong belief that technical growth should be encouraged rather than constrained. Business developers with domain expertise can use the visual interfaces and collaborate effectively in the same environment with data experts using the coding interfaces—accelerating AI adoption across the enterprise.</p>
<h3>Interoperable Functionality Through AI Development Stages</h3>
<p>Each of the AI Studios is designed to be fully functional on its own, yet can seamlessly interoperate with the other studios by sharing the resulting artifacts in the same project environment. This extends beyond studio-to-studio integration to encompass the entire Cloudera AI platform. The studios integrate directly with the underlying AI Workbench and AI Inference services, creating an end-to-end system with consistent data and model governance throughout.</p>
<p>For example, Synthetic Data Studio can generate domain-specific training datasets that Fine Tuning Studio can then use to adapt a foundation model for agentic tasks. This specialized model can then be served by the Cloudera AI platform to power agentic applications orchestrated in Agent Studio, with contextual knowledge enhancement through RAG Studio. This deliberate multi-level interoperability enables organizations to build comprehensive AI solutions while still allowing users to have full flexibility in selecting which stages of the GenAI lifecycle they want assistance with and which they want to handle independently.&nbsp;</p>
<h3>Accelerated Integration of Open-Source Innovation</h3>
<p>We built each of the AI Studios as independent components capable of rapid release cycles aligned with the AI community's pace of innovation. This modular architecture allows AI Studios to leverage state-of-the-art open-source frameworks and swap underlying libraries without disrupting core functionality.</p>
<p>This reflects our belief that no single organization will drive AI innovation in isolation, and embraces open-source innovation as a way of contributing to a broader ecosystem of shared advancement.</p>
<h2>Introducing Cloudera AI Studios</h2>
<p>AI Studios provide a purpose-built experience targeted to each of the critical stages of the Generative AI lifecycle. The studios are Synthetic Data Studio, Fine Tuning Studio, RAG Studio, and Agent Studio.</p>
<h2>The Next Chapter in Enterprise AI</h2>
<p>As GenAI becomes a cornerstone of enterprise innovation, AI Studios represent a paradigm shift: bringing the power of AI to a broader set of users while maintaining the robustness and security that organizations demand. <a href="/content/www/en-us/products/machine-learning/ai-studios.html">AI Studios</a> are now available in the Cloudera AI Workbench, and together with the <a href="/content/www/en-us/products/machine-learning/ai-inference-service.html">Cloudera AI Inference service</a>, power <a href="/content/www/en-us/products/machine-learning.html">Cloudera’s enterprise AI platform</a>.</p>
<h3>Agent Studio</h3>
<p>Agent Studio empowers enterprises to build, test, and deploy AI agents that combine the reasoning capabilities of LLMs with the operational strength of traditional software. Through native integration with the Cloudera platform, Agent Studio uniquely exposes the full suite of Cloudera’s enterprise-grade services—<a href="/content/www/en-us/products/dataflow.html">Cloudera Data Flow</a>, <a href="/content/www/en-us/products/data-warehouse.html">Cloudera Data Warehouse</a>, <a href="/content/www/en-us/products/cloudera-data-platform/data-visualization.html">Cloudera Data Visualization</a>, and more—as composable, callable agents. This foundation, when combined with open-source agents and frameworks, enables sophisticated multi-agent orchestration that seamlessly coordinates operations across diverse data environments, from structured and unstructured data to real-time streams.</p>
<small>Figure 1: Cloudera AI: Enabling every stage of the Generative AI lifecycle</small><small>Figure 2: All four AI Studios are interoperable within a single Cloudera AI project.</small><small>Figure 3: Synthetic Data Studio within a Cloudera AI Workbench project</small><small>Figure 4: Fine Tuning Studio within a Cloudera AI Workbench project</small><small>Figure 5: RAG Studio within a Cloudera AI Workbench project</small><small>Figure 6: Agent Studio within a Cloudera AI Workbench project</small>
<p>The core design philosophies of AI Studios are:</p>
<ul>
<li><p>Customizable across expertise levels</p>
</li>
<li><p>Interoperable functionality through AI development stages</p>
</li>
<li><p>Accelerated integration of open-source innovation</p>
</li>
</ul>
<p><span class="text-lead">The future of AI is about more than advanced algorithms: it’s about making AI more accessible, interoperable, and impactful across the enterprise. Empower your workforce to begin building enterprise-grade AI applications by starting with Cloudera’s <a href="/content/www/en-us/products/cloudera-public-cloud-trial.html">5-day free trial</a>.</span></p>
]]></content:encoded><wfw:commentRss>https://www.cloudera.com/api/www/blog-feed?page=clouderas-ai-studios-making-advanced-ai-accessible-to-all</wfw:commentRss><slash:comments>0</slash:comments></item></channel></rss>