"Data is about the survival of the most well informed" - Swami Sivasubramanian’s Keynote

Swami Sivasubramanian’s keynote on data and ML kicks off the third day of re:Invent 2021 with the quote “Data is about the survival of the most well informed” before explaining that being informed allows you to respond best to the unexpected. [Tips fedora to shady character lurking in the back of the room “Oh hey COVID”]

As you’d expect with their broad set of data services, AWS is focusing on an end-to-end data and ML journey, which it defines as having capability across data, analytics and ML. One theme that keeps coming through in the data presentations is security and access controls.

One of the reasons we acquired privacy specialist TwoBlackLabs was to focus on security and privacy by design up front when building on cloud. Nothing illustrates these two domains coming together better than data storage, particularly where granular controls are available. Many organisations have had a massive blind spot here, securing data only at the server or database level using network or OS focused controls. Granular data controls allow you to define and implement access control policies on individual fields, rows or other levels of the data stack, taking into account who is actually accessing what. Expect this to be a big change in the industry over the next few years.
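To make the idea of row and field level controls a little more concrete, here is a minimal Python sketch. It is purely illustrative and is not how Lake Formation or any other AWS service implements this: the `Policy` class, the role names and the sample customer records are all hypothetical, made up for this example. The point is simply that the policy is evaluated per caller, per row and per field, rather than at the server or network boundary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: the fields a role may see and the rows it may read."""
    allowed_fields: FrozenSet[str]
    row_filter: Callable[[dict], bool]


# Illustrative sample data, not taken from any real system.
CUSTOMERS: List[dict] = [
    {"id": 1, "name": "Ana", "region": "APAC", "email": "ana@example.com"},
    {"id": 2, "name": "Ben", "region": "EMEA", "email": "ben@example.com"},
]

# Hypothetical roles: an analyst limited to APAC rows and non-personal fields,
# and an admin who can see everything.
POLICIES: Dict[str, Policy] = {
    "apac_analyst": Policy(
        allowed_fields=frozenset({"id", "region"}),
        row_filter=lambda row: row["region"] == "APAC",
    ),
    "admin": Policy(
        allowed_fields=frozenset({"id", "name", "region", "email"}),
        row_filter=lambda row: True,
    ),
}


def query_as(role: str, rows: List[dict] = CUSTOMERS) -> List[dict]:
    """Return only the rows and fields that the given role is allowed to see."""
    policy = POLICIES.get(role)
    if policy is None:
        return []  # deny by default when no policy exists for the role
    return [
        {name: value for name, value in row.items() if name in policy.allowed_fields}
        for row in rows
        if policy.row_filter(row)
    ]


if __name__ == "__main__":
    print(query_as("apac_analyst"))
    print(query_as("admin"))
```

In this sketch the analyst never sees names or email addresses and never sees rows outside APAC, while a role with no policy gets nothing at all. Denying by default is a design choice in the example, but it is the one I would usually reach for.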

Back to the keynote. A slide pitching “Moving fast with broad access to data” versus “Data governance” is displayed, highlighting the potential conflict between security and access. Swami explains that strong data governance up front will in fact help you move fast, which echoes our experience: customers regularly get tripped up trying to engineer in data governance after the fact.

Managed data services are another big push from AWS and one I personally advocate for. When Lake Formation was launched at re:Invent 2018 it was IMO one of the highlights of the event, as it took a multitude of AWS services and put them together in a manageable and integrated fashion. If we think about Big Data (a term I hate), prior to cloud services the cost of entry was astronomical and didn’t allow for experimentation, making it a tough business case. Managed and integrated cloud-based data services are the key to unlocking data for smaller and smaller businesses - make it easy, make it cost effective, make it dynamic.

The Aurora driver customer example is worth a watch if you have a spare 5-10 mins. It really is one of those future-looking “dream big” examples which I love seeing at re:Invent.

As always there is a swag of new announcements, so let's get into them.

Announcements:

  1. Amazon DevOps Guru for RDS. Goes beyond detecting issues and recommends solutions through in-depth root cause analysis, including guidance on resolving the issues.
  2. Amazon RDS Custom was announced for Oracle earlier in the year and is now available for Microsoft SQL Server.
  3. DynamoDB has a new Standard-Infrequent Access table class for infrequently accessed data, which reduces cost by up to 60%. Tables can be switched between table classes as requirements change.
  4. AWS Database Migration Service (DMS) Fleet Advisor accelerates database migration by automatically taking an inventory of your on-premises databases and providing a comprehensive and customized migration plan.
  5. SageMaker Ground Truth Plus is about reducing the cost and improving the quality of labelling data
  6. SageMaker Studio Notebook allows you to perform data engineering, analytics and ML workflows in one notebook.
  7. SageMaker Infrastructure Innovation brings together SageMaker Training Compiler to accelerate model training, SageMaker Inference Recommender to reduce deployment times and SageMaker Serverless Inference to lower cost of ownership with pay-per-use pricing.
  8. Kendra Experience Builder - Kendra is AWS’s intelligent search service powered by machine learning. With Experience Builder you can deploy intelligent search applications with a few clicks. Easy.
  9. SageMaker Studio Lab - Ever wanted to get into machine learning in your spare time? Studio Lab provides no cost, no setup access to machine learning tech. Accessibility and inclusiveness in machine learning keep improving with initiatives like Studio Lab.
  10. AWS AI & ML Scholarship - not a tech announcement but a $10M per year scholarship fund

If you haven't heard it enough yet at re:Invent, a large portion of the newly announced services constantly reinforce the message that there is no coding required. NO CODING REQUIRED. Got it yet?

Probably the thing that amazes me most about listening to the AWS Data and ML keynote is that if AWS just did Data and ML they would be an enormous business. In fact if I draw a strange parallel to when I visited the Boeing factory in Seattle many years ago, the scale was simply overwhelming. The building that housed the assembly line for 747s (yes it was some time ago) had the capacity for six 747s and then some. A 747 is crazy big when you stand next to it, but the factory is so large it has a climate of its own (Google it). If a large cloud data company is a 747, AWS is the Boeing Everett factory that assembles them.

And I’ll leave you on that note...