Conversation


@agracey agracey commented Nov 20, 2020

No description provided.


agracey commented Nov 20, 2020

https://www.youtube.com/watch?v=q2t-NWK-oNA Can we do a cover of this "I'm a Linux Girl in a Cloud Native World"? I know it doesn't flow but whatever 😆

@agracey agracey requested a review from timirnich November 21, 2020 01:01

agracey commented Nov 21, 2020

@timirnich This is not done (I don't think), I would just like a sanity check on direction.


agracey commented Nov 29, 2020

@timirnich I'll add an intro with some comments about the latest us-east-1 outage being caused by Linux OS-level limits not being understood, and link to a few tweets that I think are interesting. I might actually go in a slightly different direction with this, since we were given a really interesting example to talk about.


@timirnich timirnich left a comment


Good one! Could go a bit deeper in a number of places if you want to put some more flesh onto the bone.


Linux has a lot of resource constraints built into the OS, and these can cause failures in odd and unexpected ways.

As we saw in the AWS example, process limits exist!

Briefly explain what a process limit is and what happens when it gets hit?
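As a hedged sketch of what that could look like on a typical Linux box (exact values vary by distro and cgroup configuration):

```shell
# Soft limit on the number of processes/threads the current user may create.
ulimit -u

# System-wide ceiling on the number of threads (Linux-specific path).
if [ -r /proc/sys/kernel/threads-max ]; then
  cat /proc/sys/kernel/threads-max
fi

# When a process hits RLIMIT_NPROC, fork() fails with EAGAIN. Many programs
# surface this as "Resource temporarily unavailable" or even "cannot allocate
# memory", neither of which obviously points at a process limit.
```

That misleading error message is part of why limit-related outages are hard to diagnose: the failure shows up far from the limit that caused it.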


# Security!

NOTE: I'm also not a security expert so the main point is that nothing has changed, and there's just more work to do as Kubernetes is insecure by default.

Rephrase to something like "the additional layers like container orchestration increase the attack surface."


NOTE: I'm also not a security expert so the main point is that nothing has changed, and there's just more work to do as Kubernetes is insecure by default.

Just because the application is walled off in a container doesn't mean there's no threat. There have been many CVEs found where a rogue process can break out of its confinement like Tai Lung from Kung Fu Panda. This means that we still need to think about security on the host and in the applications, as well as security of the cluster data plane and network.

This would be a nice segue to talk about things like Kata containers, Container-in-VM, etc.
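For reference, sandboxed runtimes such as Kata plug into Kubernetes through a RuntimeClass. A minimal sketch, assuming the kata runtime is already installed on the nodes (the `kata` name is illustrative and depends on the installation):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata        # illustrative; must match the installed runtime handler
handler: kata
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-app   # illustrative name
spec:
  runtimeClassName: kata   # run this pod inside a lightweight VM
  containers:
    - name: app
      image: example/app:latest   # illustrative image
```

With this in place, a container escape only reaches the VM boundary rather than the shared host kernel.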


We still need to be concerned with host and physical security if running locally. Setting up AppArmor profiles, correct firewall rules, and minimal privileges is still needed to keep the control plane and applications secure. This also means not running privileged pods in your cluster, as those can get root access to your node and mess with kernel parameters (ask me how I know that's possible...).

There's also security concerns with the build pipeline. Too many tools take a shortcut and ask for the container to mount the docker socket. This would allows a ci/cd script to make changes to your cluster. One thing that can help this (in my opinion) is moving from the docker daemon to cri-o and using tools that are runtime agnostic.

Suggested change
There's also security concerns with the build pipeline. Too many tools take a shortcut and ask for the container to mount the docker socket. This would allows a ci/cd script to make changes to your cluster. One thing that can help this (in my opinion) is moving from the docker daemon to cri-o and using tools that are runtime agnostic.
There's also security concerns with the build pipeline. Too many tools take a shortcut and ask for the container to mount the docker socket. This would allow a ci/cd script to make changes to your cluster. One thing that can help this (in my opinion) is moving from the docker daemon to cri-o and using tools that are runtime agnostic.
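The earlier point about minimal privileges and not running privileged pods translates into a container `securityContext`. A hedged sketch of what that could look like (names and image are illustrative, not from the draft):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app   # illustrative name
spec:
  containers:
    - name: app
      image: example/app:latest   # illustrative image
      securityContext:
        privileged: false              # never grant host-level access
        allowPrivilegeEscalation: false
        runAsNonRoot: true
        capabilities:
          drop: ["ALL"]                # start from zero capabilities
```

Dropping everything and adding back only what the workload needs is the same least-privilege habit Linux admins already apply on the host.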


Sadly, spinning up new instances and scaling horizontally to infinity doesn't fix the physics of how long it takes electrons to move around. This means that we should still be concerned with both computational complexity *and* resource constraints. These aren't normally important early in the process but can definitely become costly once you push the system to larger scale.

For an example that I've seen play out in my systems, knowing when to use different storage types can dramatically change the end user's experience of a system. The same is true for knowing when to break up single services or components into multiple pieces to allow faster or more flexible scaling.

Elaborate on that? Sounds really interesting...
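As one concrete illustration of the storage-type point: in Kubernetes the choice is usually expressed per claim through a StorageClass. A sketch, assuming the cluster offers a fast-SSD class (the class name here is hypothetical and cluster-dependent):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-cache            # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast-ssd  # hypothetical class; depends on the cluster
  resources:
    requests:
      storage: 10Gi
```

Moving a latency-sensitive workload from a default network-backed class to local SSD is the kind of change that is invisible in the application code but very visible to the end user.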
