Disruptive Technology - Modern Data Platforms for AI & Beyond w/ WekaIO - Top Shelf Tech

23 August 2021

Join us for our first Top Shelf Tech: Disruptive Technology series, with WekaIO President Jonathan Martin and CTO Shimon Ben-David chatting live with The Instillery COO Jeremy Nees about how Weka changes the game for enterprises who need a zero compromise modern data platform for AI and beyond.

Disruptive Technology is a series dedicated to showcasing the disruptors from around the world who share our spirit of shaking up the world of technology.

Watch the video below or scroll down for the full transcript.

Video

Transcript

Jeremy Nees

Welcome along to Top Shelf Tech. Today, we are kick-starting our Disruptive Technology series, where we have handpicked companies from around the world that we think are gonna shake up technology as we know it and be disruptors in their field. So, today I have Jonathan Martin and Shimon Ben-David from WekaIO joining me.

Jonathan Martin

Hey, thanks, Jeremy.

Jeremy Nees

Maybe we just want to start with letting us know where in the world you're joining us from.

Jonathan Martin

Jonathan coming to you all the way from sunny Los Angeles. It's obviously not New Zealand, but you know, it's not too bad here either. I'm a huge New Zealand fan. So yeah, actually very new to Weka, technically in my third week, but previously was the General Manager for storage software at HP. And I was also the CMO for EMC, Pure Storage and Hitachi before this.

Jeremy Nees

Awesome. Thanks, Jonathan.

Shimon Ben-David

I'm actually connecting from the San Francisco bay area, Sunnyvale, California.

I'm actually living here right next to our US headquarters. At Weka, I'm the CTO. I've actually been at Weka for the last seven years. I moved to the Bay Area six years ago to open our US office. I've been around some other storage companies in the past: XtremIO, which was sold to EMC, and before that, with the same founding team as Weka, XIV, which we sold to IBM. That was a really nice appliance as well.

Jeremy Nees

Awesome. So you've both got some pretty impressive credentials there, which probably brings us around to WekaIO. Storage has maybe not always been one of the things that people talk about as disruptive technology, but Weka has a pretty interesting story. So did you want to just fill us in a bit on where Weka has come from and what problems you guys are looking to solve?

Jonathan Martin

So maybe I can jump in and Shimon you can help out along the way. So we've been around for kind of six or seven years at this point. And the market that we're in, as you said, has not, for at least the last 10 to 12 years, really been the sexiest market. You know, creating these big tin buckets to store your information. So I think there's a lot of analogies with the automotive industry over the last 20 years or so. Where you could buy a car 20 years ago and it came with a combustion engine and steel radial tires, and a turbocharger and all those sorts of things. And you know, you roll the clock forward 20 years and they still come with turbochargers and drink dinosaur juice. They're cooler and they go faster and they're a bit more powerful, but there's not fundamentally anything different.

And then Tesla came along and really turned the industry on its head by picking a small, but very, very particular vertical. So this was, if you think back to the first car, the Tesla Roadster. High-performance cars targeted at people that really care about the planet and driving electric vehicles and that transformed the industry by throwing away what had come before, and instead, doubling down on electric motors, batteries, software and AI. And that I think is what we're doing with Weka, we've taken a very different approach. The last 20 years has really been dominated by the hardware companies that Shimon and I've talked about. Weka is a software-based company. We develop all our products in the cloud, but they can also be used in your data centre as well.

And we pick a particular set of workloads, and those workloads are maybe not the workloads of the past, although we can provide significant value to those, but it's really the workloads of today and tomorrow. So things like artificial intelligence, things like machine learning.

That's obviously a very trendy, cool topic. You look at any enterprise. For the last five years, they've been trying to take concepts from high-performance computing and scientific computing and bring them into the enterprise, typically in the form of some artificial intelligence project. And let's face it, most of these projects the last five years have been kind of kitchen sink exercises. They've been your passion projects in organizations. But a lot of organizations, particularly organizations that are looking to transform industries or organizations that are looking to transform themselves, are looking at how they do artificial intelligence at scale.

And when they make that step, they suddenly find kind of two things. One is the amount of information that they're trying to process. Where in the traditional data centre it's probably doubling every 18 months, the amount of information that they're creating on these new platforms is going up by an order of magnitude or maybe even two orders of magnitude every year. So massive petascale and exascale workloads. And then the second thing is that they need the ability to process all of that information, again, an order of magnitude or more faster than they've been able to historically. So I want to pause and Shimon I'll hand over to you, and you can talk a little bit about kind of how customers are bringing those concepts to life.

Shimon Ben-David

Great. Definitely. Thank you, Jonathan. So what we're seeing is actually, as Jonathan mentioned, we're targeting use cases where we can really provide value to customers. So, financial environments, life science environments, AI ML environments, where the common thing about them is that they need to process massive amounts of data. And there's a value in processing that data in a fast and efficient way. So, environments that are going on-prem and on-cloud raise that whole discussion of where the data should reside. Do I have any data gravity? So Weka is designed to solve all of these issues in all of these environments, and actually allow customers to gain insights from their data in a fast and efficient way.

So I'll give some customer use cases where I think we showed a lot of value. Some customers in the life science environment, for example, that were handling petabytes of data using regular NAS appliances had challenges in terms of scaling capacity. So they had several petabytes on the NAS appliances and they couldn't even grow anymore.

They needed to scale it to double-digit petabytes and to triple-digit petabytes. They had issues with the performance of that appliance and also managing that appliance. So with Weka what we did is we actually came in and we solved their scaling problems. So we placed the system over there that can actually scale to the petabytes.

It actually already is at double-digit petabytes, and it scaled in terms of performance. It actually even scaled between data centres. So, with Weka, we wanted to change the paradigm that you need multiple different storage appliances to manage your entire pipeline. And that's actually what we did.

So, another example: we went to another customer, and this is an IoT environment, where they initially ingest data, transform the data and then train and run inference on that data. It's kind of like an endless cycle. And what we saw is that that customer was actually using multiple different storage appliances, simply because each different client uses a different connectivity protocol, and because of the cost associated with each appliance. And they constantly kept copying the data between the different environments to accommodate their needs. And what we actually showed is that by placing a Weka system that could be connected to all of these environments in all of the protocols, you don't need these limited data silos. You can actually eliminate the silos and place everything on a single Weka system that can communicate using all of these protocols.

And that can actually provide the performance and scale that is needed for all of these pipeline steps. Additionally, one thing that is very unique about the Weka system is its ability to move data between environments. So if a customer has multiple data centres, or if a customer as Jonathan mentioned, a lot of customers are going to the cloud nowadays, so they have some sort of cloud strategy. So Weka is actually able to run on multiple data centres. It's able to run on the cloud and it's able to mobilize the data between the different environments. So now customers that are challenged with trying to copy the data and backup between different environments using different utilities don't need to do that anymore because the Weka environment is actually doing the data mobility for them.

And I think I'd finish with another piece of information. When we designed Weka we wanted to create the best storage of all breeds. So we wanted to create storage environments that would break the paradigm that storage is very, and I'm going to be a bit technical here, throughput oriented. So usually we see storage appliances that are handling massive amounts of large files. And that's great and there are many use cases for that, but we wanted to create a storage environment that would also be able to accommodate massive amounts of metadata or massive amounts of smaller files. Because if we look at an AI ML pipeline or a life science pipeline, what we see is that during the pipeline, the data actually transforms multiple times and different applications are running on top of it. And each of them has unique IO requirements. Some require throughput, some require IOPS and low latency, and we designed Weka to actually accommodate all of these different requirements. And Weka is actually very unique in its ability to accommodate all of these patterns on the same data sets.

Jeremy Nees

Cool. So what I'm hearing is Weka actually enables businesses, if they're looking at it from an R&D perspective, to look at how they can run data pipelines to actually accelerate development using data. So that could be through AI and ML modelling, whereas with other technology maybe the latency or the storage held back your ability to process data and make good business decisions. And therefore what you're trying to do is really accelerate that paradigm for these businesses, where they can do what they need to do as part of the overall business life cycle a lot quicker and a lot simpler when it involves large amounts of data in high-performance computing workloads.

Shimon Ben-David

Yeah, exactly. Eventually, it's all about gaining insights from the data. So you're acquiring massive amounts of data. And the question is how fast can you gain insights from it? I'll give another example. At an autonomous car company, we were able to get them 20 times faster than their current environment. So just by placing Weka in and eliminating these data silos and the data movement, and obviously benefiting from the Weka performance, they were able to do 20 times more work in the same amount of time. So eventually that's the value.

Jeremy Nees

Yeah, 20 times is a massive leap really for a business like that. From some of my background playing with storage and playing with scale-out storage systems, often the other piece that's missed is there's a lot of cost there. So if you're running with a proprietary system, it's very expensive to scale out, or if you're running with an open-source system and parallel cluster file systems, there's a huge amount of complexity. They're very challenging environments to scale out and to manage. So what I'm hearing as well is you've got very good deployment options. So, looking at what you're doing on AWS, there's the ability to deploy virtually from the marketplace and very quickly get up and running with Weka.

Is that something that you see as a major point of difference and competitive advantage in your business as well?

Jonathan Martin

Yeah, absolutely. So, the last 20 years in storage has been defined by really just a bunch of compromises. And the compromises, you know: do I want speed? Then I need to go with a block device. What's a great block device out there? Do I want to scale? Then I need to go with object. Do I want simplicity? I need file system semantics, shareability, et cetera, et cetera. Or more recently, the compromises have been, do I run on-prem or do I run in the cloud? So when you look at the storage market overall, it's been a set of compromises.

Most people today, do you want speed? Yes. Do I want simplicity? Yes. Do I want to scale? Yes. Do you want to be able to run this on-prem or in the cloud? Same product. Yes. But why would you not want all of those things? So I think that's really one of the big differentiators here: the compromises of the last 20 years and those false choices that IT organizations have been made to take, we can kind of throw out the window with this solution, which is why more and more companies are beginning to adopt this technology. I think initially in AI and machine learning environments, but I think once they kind of taste it, take a bite of the purple, and see the speed, simplicity and scale benefits of the solution, they then begin to apply them to general-purpose, tier-one workloads as well.

Jeremy Nees

Awesome. Now the Kiwis watching probably know the name Weka and I understand that your mascot is a small flightless bird, but actually, the name is about what you guys are about isn't it?

Jonathan Martin

You could probably count the zeros better than I can. You've probably heard of gigabytes and terabytes and petabytes and exabytes and yottabytes, and zettabytes, I think that's the right way around. But somewhere up there, ten to the power of 30 is a Weka. It's very large, only one step away from a Wonkabyte.

Jeremy Nees

Oh, very good. Hey guys, thanks a lot for joining us today to give us a bit of a rundown. Was there anything else that you thought you'd want to add today? Anything else that you think the audience would be interested in, in terms of Weka and where you guys are going?

Jeremy Nees

Awesome. Hey, look, thanks a lot for joining us for the first Disruptive Tech session on Top Shelf Tech. It's been great to have you guys along, and I'll certainly be watching the company over the next couple of years and looking forward to seeing what you've got.

Jonathan Martin

Awesome. Thanks for the opportunity.

Jeremy Nees

Nice to meet you. Bye. Thank you.