Saturday, April 04, 2015
Weekly Kubernetes Community Hangout Notes - April 3 2015
Every week the Kubernetes contributing community meets virtually over Google Hangouts. We want anyone who’s interested to know what’s discussed in this forum.
Agenda:
- Quinton - Cluster federation
- Satnam - Performance benchmarking update
Notes from meeting:
- Quinton - Cluster federation
  - Ideas floating around after the meetup in SF
    - Please read and comment
  - Not targeted for 1.0, but putting a doc together to show the roadmap
  - Can be built outside of Kubernetes
  - API to control things across multiple clusters, including some logic
    - Auth(n)(z)
    - Scheduling policies
    - …
- Different reasons for cluster federation
  - Zone (un)availability: resilient to zone failures
  - Hybrid cloud: some workloads in the cloud, some on premises, for various reasons
  - Avoiding cloud provider lock-in, for various reasons
  - “Cloudbursting” - automatic overflow into the cloud
- Hard problems
  - Location affinity: how close do pods need to be?
    - Workload coupling
    - Absolute location (e.g. EU data needs to stay in the EU)
  - Cross-cluster service discovery
    - How does service/DNS work across clusters?
  - Cross-cluster workload migration
    - How do you move an application piece by piece across clusters?
  - Cross-cluster scheduling
    - How do we know enough about clusters to decide where to schedule?
    - Possibly use a cost function to achieve affinities with minimal complexity
    - Can also use cost to determine where to schedule (underused clusters are cheaper than overused clusters); see the sketch at the end of this section
- Implicit requirements
  - Cross-cluster integration shouldn’t create cross-cluster failure modes
    - Each cluster should be independently usable in a disaster situation where Ubernetes dies
  - Unified visibility
    - Want to have unified monitoring, alerting, logging, introspection, UX, etc.
  - Unified quota and identity management
    - Want to have the user database and auth(n)/(z) in a single place
- Important to note: most causes of software failure are not the infrastructure
  - Botched software upgrades
  - Botched config upgrades
  - Botched key distribution
  - Overload
  - Failed external dependencies
- Discussion:
  - Where do you draw the “Ubernetes” line?
    - Likely at the availability zone, but could be at the rack or the region
  - Important not to pigeonhole the design and shut out other users
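To make the cross-cluster scheduling idea above concrete, here is a minimal sketch of a cost function along the lines discussed: underused clusters are cheaper than overused ones, and an absolute-location constraint (like EU data staying in the EU) rules a cluster out entirely. All of the types, fields, and weights below are hypothetical illustrations, not part of any Kubernetes API.

```go
package main

import "fmt"

// Cluster is a hypothetical summary of one federated cluster's state;
// placeholders for whatever signals a federated scheduler could gather.
type Cluster struct {
	Name        string
	Region      string  // e.g. "eu-west", "us-central"
	Utilization float64 // fraction of capacity in use, 0.0-1.0
}

// Workload is a hypothetical unit to be placed, with an optional
// absolute-location constraint (e.g. EU data must stay in the EU).
type Workload struct {
	Name           string
	RequiredRegion string // empty means "anywhere"
}

// cost returns a lower number for better placements: underused clusters
// are "cheaper" than overused ones, and a violated location constraint
// makes the placement infeasible.
func cost(w Workload, c Cluster) (float64, bool) {
	if w.RequiredRegion != "" && w.RequiredRegion != c.Region {
		return 0, false // infeasible: absolute-location constraint
	}
	// The quadratic term makes near-full clusters sharply more expensive.
	return c.Utilization * c.Utilization, true
}

// pickCluster chooses the feasible cluster with the lowest cost.
func pickCluster(w Workload, clusters []Cluster) (Cluster, bool) {
	var best Cluster
	bestCost, found := 0.0, false
	for _, c := range clusters {
		cc, ok := cost(w, c)
		if !ok {
			continue
		}
		if !found || cc < bestCost {
			best, bestCost, found = c, cc, true
		}
	}
	return best, found
}

func main() {
	clusters := []Cluster{
		{Name: "a", Region: "us-central", Utilization: 0.9},
		{Name: "b", Region: "eu-west", Utilization: 0.4},
	}
	w := Workload{Name: "billing", RequiredRegion: "eu-west"}
	if c, ok := pickCluster(w, clusters); ok {
		fmt.Printf("schedule %s on cluster %s\n", w.Name, c.Name)
	}
}
```

A real federated scheduler would feed in many more signals, but even this shape shows how affinity and utilization can be traded off through a single number, which is the "minimal complexity" appeal of the cost-function approach.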
- Satnam - Soak Test
  - Want to measure things that run for a long time to make sure that the cluster is stable over time: performance doesn’t degrade, no memory leaks, etc.
  - github.com/GoogleCloudPlatform/kubernetes/test/soak/…
  - Single binary, puts lots of pods on each node, and queries each pod to make sure that it is running.
  - Pod creation has gotten much, much faster (even in the past week), which makes the whole test run faster.
  - Once the pods are up and running, we hit the pods via the proxy. The decision to go through the proxy was deliberate, so that we also exercise the Kubernetes apiserver (a sketch of this pattern follows below).
  - Code is already checked in.
  - Pins pods to each node, exercises every pod, and makes sure that you get a response from each node.
  - Single binary, runs forever.
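As a rough sketch of the pattern described above (not the actual checked-in soak test), the loop below hits each pod through the apiserver proxy so that every request also exercises the apiserver. The apiserver address, pod names, and the exact proxy path are assumptions for illustration.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

func main() {
	// Hypothetical apiserver address and pod names; the real soak test
	// would discover these from the cluster itself.
	apiserver := "http://localhost:8080"
	pods := []string{"soak-pod-node1", "soak-pod-node2"}

	for { // run forever, like the soak test binary
		for _, pod := range pods {
			// Going through the apiserver proxy (rather than the pod IP
			// directly) is the deliberate choice noted above: every
			// request also exercises the apiserver. This v1beta3-style
			// proxy path is an assumption for illustration.
			url := fmt.Sprintf("%s/api/v1beta3/proxy/namespaces/default/pods/%s/",
				apiserver, pod)
			resp, err := http.Get(url)
			if err != nil {
				log.Printf("pod %s unreachable: %v", pod, err)
				continue
			}
			body, _ := io.ReadAll(resp.Body)
			resp.Body.Close()
			log.Printf("pod %s responded (%d bytes)", pod, len(body))
		}
		time.Sleep(10 * time.Second)
	}
}
```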
- Brian - v1beta3 is enabled by default; v1beta1 and v1beta2 are deprecated and will be turned off in June. Upgrading existing clusters, etc. should still work.
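For client authors, the most visible effect of the version change is in request URLs. A minimal sketch, assuming the v1beta3 convention of moving the namespace into the URL path (earlier versions passed it as a query parameter):

```go
package main

import "fmt"

func main() {
	// Hypothetical pod lookup shown under both URL conventions; the
	// exact layouts are assumptions for illustration only.
	ns, pod := "default", "my-pod"

	// Deprecated style: namespace as a query parameter.
	oldURL := fmt.Sprintf("/api/v1beta1/pods/%s?namespace=%s", pod, ns)

	// v1beta3 style: namespace in the path.
	newURL := fmt.Sprintf("/api/v1beta3/namespaces/%s/pods/%s", ns, pod)

	fmt.Println(oldURL)
	fmt.Println(newURL)
}
```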