Last week I was one of the room hosts for Containerdays. I met old friends and made new ones. In this post I want to share a couple of talks I think you should check out when you get a chance (once they get uploaded to Containerdays’ YouTube channel).
The MC team: Timothy Mamo; Adelina Simion, Senior Go Engineer & DevRel; David de Hoop, Cloud Native Architect @ Team Rockstars IT; and Rachid Zarouali, Docker Captain, Cloud Architect & Founder of Sevensphere.
Summaries (I tried!):
Observability is expensive
Chris Cooney, Director of Developer Advocacy at Coralogix (a SaaS observability company), says that observability is too expensive. But what can we do about it?
Well, we can abandon the pattern of putting all data on really expensive SSDs, moving it to magnetic EBS after a week, and to S3 after three weeks. When (studies show) something like 30% of that data is never used, that’s just throwing money away.
Instead, optimize your demand, for instance by:
- Transforming debug information into metrics
- Routing other logs (info and warnings) through routing logic and then to either magnetic EBS, SSDs, or S3
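The routing idea above can be sketched roughly like this (a minimal, illustrative Go sketch, not Coralogix’s actual pipeline; the types, tier names, and the `Frequent` flag are my own assumptions):

```go
// Sketch of severity-based log routing: debug entries become metric
// increments instead of stored logs, everything else is assigned a
// storage tier mirroring the SSD / magnetic EBS / S3 split.
package main

import "fmt"

type Level int

const (
	Debug Level = iota
	Info
	Warn
)

// LogEntry is a simplified stand-in for a real log record.
type LogEntry struct {
	Level    Level
	Service  string
	Message  string
	Frequent bool // e.g. set by a rule that flags frequently-queried streams
}

type Tier string

const (
	SSD      Tier = "ssd"
	Magnetic Tier = "magnetic-ebs"
	S3       Tier = "s3"
)

// route turns debug entries into metric increments and picks a storage
// tier for everything else; the bool reports whether the entry is stored.
func route(e LogEntry, metrics map[string]int) (Tier, bool) {
	if e.Level == Debug {
		metrics["debug_logs_total:"+e.Service]++ // metric instead of stored log
		return "", false
	}
	if e.Frequent {
		return SSD, true // hot data stays on fast storage
	}
	if e.Level == Warn {
		return Magnetic, true
	}
	return S3, true // rarely-queried info goes straight to cheap storage
}

func main() {
	metrics := map[string]int{}
	entries := []LogEntry{
		{Debug, "api", "cache miss", false},
		{Warn, "api", "slow query", false},
		{Info, "billing", "invoice sent", false},
	}
	for _, e := range entries {
		if tier, stored := route(e, metrics); stored {
			fmt.Printf("%s -> %s\n", e.Message, tier)
		}
	}
	fmt.Println("debug metrics:", metrics)
}
```

The point of the sketch is that the expensive decision (where data lives) is made at ingest time, per entry, rather than aging everything through every tier.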
Building Kubernetes operators
Lars Francke (Stackable) & Jannik Heyl (Evoila) built 10+ k8s operators and lived to tell the tale.
Execsum: OperatorHub.io has unmaintained operators (but I mean, have you seen the state of the Cloud Native landscape?!), operator SDKs are not much beyond templating, writing operators in Rust is “AWESOME”, and (Red Hat OpenShift) (re-)certification is non-trivial.
The operator pattern is incomplete:
- Pattern describes how an operator works
- What’s missing:
- Standardized conventions for how database operators should work
- Unified monitoring APIs
- Standardized Custom Resource Definitions
- Standardized backup & restore
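For context, the part of the pattern that *is* described (“how an operator works”) boils down to a reconcile loop: compare desired state from a custom resource with observed state, and take a corrective step. A toy sketch, with simplified stand-in types (real operators use client-go or controller-runtime and react to watch events):

```go
// Toy reconcile loop at the heart of the operator pattern.
package main

import "fmt"

// DatabaseSpec stands in for a Custom Resource's desired state.
type DatabaseSpec struct {
	Name     string
	Replicas int
}

// ClusterState stands in for what actually exists in the cluster.
type ClusterState struct {
	Replicas int
}

// reconcile nudges observed state one step toward the spec and
// reports what it did.
func reconcile(spec DatabaseSpec, state *ClusterState) string {
	switch {
	case state.Replicas < spec.Replicas:
		state.Replicas++
		return fmt.Sprintf("%s: scaled up to %d", spec.Name, state.Replicas)
	case state.Replicas > spec.Replicas:
		state.Replicas--
		return fmt.Sprintf("%s: scaled down to %d", spec.Name, state.Replicas)
	default:
		return spec.Name + ": in sync"
	}
}

func main() {
	spec := DatabaseSpec{Name: "orders-db", Replicas: 3}
	state := &ClusterState{Replicas: 1}
	// A real controller runs reconcile on every watch event;
	// here we just loop until the state converges.
	for {
		fmt.Println(reconcile(spec, state))
		if state.Replicas == spec.Replicas {
			break
		}
	}
}
```

The speakers’ point is that everything *around* this loop (CRD conventions, monitoring APIs, backup & restore) is what the pattern leaves unspecified, so every operator reinvents it.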
Cloud Native - the past, present and future
Christopher Liljenstolpe (Cisco) & Andrew Randall (Microsoft, prev. Kinvolk), two absolute powerhouses, talked about the history, present and future of cloud native. Definitely one of my favorite talks.
My new friend at the CNCF (mainly because he also loves dinosaurs), Jorge Castro, had a very similar(ly epic) talk.
Anyway, from Andrew and Christopher’s talk:
AI Impact on Cloud Native
- Cloud Native as a platform for AI workloads
- Drives demand for GPU capacity, scale, efficiency
- E.g. OpenAI scaled up to 7,500 k8s nodes per cluster in Azure
- Enable AI-powered applications
- Innovation
- Competitiveness
- Simplified operations
- Improve operator satisfaction
- Enabler of scale, efficiency and optimized architectures
- Smarter, real-time optimization decisions
Some cautions:
- CUDA is, for all intents and purposes, a single vendor API
- The vast majority of software requires CUDA
- We are possibly walking into another single vendor API monoculture which is not the way
- Some aspects of AI (e.g. LLM training) are extremely resource-intensive
- In some cases rack power densities are increasing by an order of magnitude in the space of 2-3 years
- This, and the matching cooling load, will collide immediately with ESG goals
| 2020 | Drivers | 2025 |
| --- | --- | --- |
| Serverless computing is niche | Re-architecting applications to be cloud native | Serverless goes mainstream due to elasticity & low ops overhead (60% of new event-driven apps) |
| Disjointed dev tools create a bumpy road to production | Increased focus on DX | Self-service developer portals create a paved road to prod (75% of orgs with platform teams) |
| Locally installed IDEs | Anywhere access to development workspaces | Browser-based IDEs (30% of large enterprises) |
| Development environments assumed safe, left unprotected | Supply chain security risks | Securing development environments is a key priority (60% of organizations) |
| Monitoring via proprietary SDKs and agents | Developer-centric observability practices | White-box observability using OpenTelemetry instrumentation (70% of new cloud-native apps) |
Concluding thoughts
The future of cloud-based app development and deployment will be:
- Characterized by a more streamlined, automated, enjoyable, and productive developer and operator experience
- Driven by AI
- Deployed in heterogeneous environments, with data and compute optimally (and potentially dynamically) located for efficiency, performance, custom and regulatory needs
Miscellaneous:
- I definitely also loved the talk by Anna McDougall and Andreas Prang (Axel Springer) about their platform engineering journey.