The WASI systems libraries that define the standard platform for WebAssembly/wasm are Capability-based.
From their first high level goal:
> Define a set of portable, modular, runtime-independent, and WebAssembly-native APIs which can be used by WebAssembly code to interact with the outside world. These APIs preserve the essential sandboxed nature of WebAssembly through a Capability-based API design.
https://github.com/WebAssembly/WASI/blob/main/README.md#wasi...
WASI needs to prove itself first. For example, the very first goal it gives itself is portability, but the API is just a thin clone of POSIX. I see no evidence they're targeting common platforms people use, like Windows or HTML, with this API. The API defines things like "character device inodes", which is very UNIX-specific.
How can you implement an object capability system on WASM? It gives modules a flat memory space in which you can run C, so nothing stops one library from interfering with the memory of another except software-level verification, at which point you don't need WASM anymore. At most it could be a Mojo-style system in which cooperating message-passing processes can send each other interfaces.
> How can you implement an object capability system on WASM?
It's been well known for decades that the germ of an object capability system already exists in Unix - the file descriptor (that's why the control message for transferring them over sockets is called SCM_RIGHTS).
Capsicum was developed to make that potentiality a reality. It didn't require too much work. Since most things are represented by file descriptors, that was just a matter of tightening what you can do with the existing FDs (no going from a directory FD to its parent unless you have the right to the parent, or some parent of that parent, by another FD), and introducing process and anonymous shared memory FDs so that a global namespace is no longer needed to deal with these resources.
So WASI is derived from an actually existing object capability architecture - Capsicum - which happens to be a simple refinement of the Unix API that everyone knows, and which every modern OS has at least been very much inspired by.
https://www.cl.cam.ac.uk/research/security/capsicum/
FDs are owned by processes, not libraries, and are by themselves not sufficient to implement a sandbox. A lot of policies people want to implement in the real world can't be expressed in UNIX or with file descriptors. For instance: UNIX kernels don't understand HTTP, but restricting socket access to particular domains is a common need. Of course, all of this can be hacked on top. Another example: stopping libraries from quitting the process can't be done with FDs.
Every modern OS has very much not been inspired by UNIX. Windows has little in common with it, e.g. no fork/exec equivalents; the web is a sort of OS these days and has no shared heritage with UNIX; and many operating systems that ship a UNIX core as a convenience don't use its APIs or design patterns at the API level you're meant to use, e.g. an Android app doesn't look anything like a UNIX app, and Cocoa APIs aren't UNIX either.
Windows has strong similarities with Unix, and if you look at a really different OS, for instance IBM i, that becomes clear. The Windows-Unix affinity is so great that you even interact with devices through file handles via Read, Write, and IoCtl methods.
Check "Inside Windows NT" by Helen Custer, an official account. She explicitly credits the handles to Unix. That's not surprising - not only was Unix fresh on the minds of the NT developers, with quite a few of them having Unix backgrounds, but every conceptual ancestor of Windows NT was at least significantly influenced by Unix:
- VMS: The VAX/VMS team were in regular contact with Ken Thompson, and got the idea for channel IDs (= file descriptors) for representing open files and devices from him, as well as the idea of three standard channels which child processes inherit by default: input, output, error (the error one was at the time a very recent development, I think in Unix v6 or v7)
- MICA: Would have been a combined Unix and VMS compatible system.
- OS/2: FDs with read, write, ioctl again.
Even MS-DOS is already highly Unix-influenced: they brought in file descriptors in DOS 2.0 and even called the source files implementing the API "XENIX.ASM" and "XENIX2.ASM" (see the recent open source release.)
I have deliberately chosen not to make anything of the fact that Windows NT was intended to be POSIX compatible either (and even supports fork, which WASI mercifully doesn't), because my point is that all modern general-purpose operating systems are at least very much inspired by and deeply indebted to Unix. I would accept that OSes that are not general purpose may not be, and old operating systems made in siloed environments like IBM's are fundamentally very different. IBM i is very different to Unix and that's clear in its native APIs.
Cocoa and Android APIs don't look much like the basic Unix APIs, it's true, even if they are implemented in terms of them. WASI wants to define APIs at that lower level of abstraction. It's tackling the problem at a different level (the inter-process level) to what object capability _languages_ are tackling (the intra-process level).
Caja's spiritual successor is HardenedJS (https://hardenedjs.org/), authored by some of the same folks (Mark Miller + friends). As I understand it, Caja attempted to secure not just JavaScript but the DOM as well, which ultimately proved to be too large, interconnected, and rapidly changing a surface to keep up with.
LavaMoat (https://lavamoat.github.io/), while not quite object capabilities, builds on HardenedJS to provide runtime supply-chain security protections to JS apps (Node.js or browser) by eliminating ambient authority and only exposing global capabilities per npm package according to user-specified policy. LavaMoat is used in production at MetaMask, protecting ~300M users.
OCapN (https://github.com/ocapn/ocapn/) is a nascent effort to standardize a distributed object capability protocol (transferring capabilities across mutually distrusting peers).
One capability mechanism that's in wide use but not really well known or touched on in the article is Android's RPC mechanism, Binder (and a lot of its history predates Android, from what I recall).
Binder handles work just like object capabilities: you can only use what's sent to you, and a process can delegate other Binder handles onward.
Android hides most of this behind its permission model, but the capabilities still exist and can be implemented by anyone in the system.
Yes, and macOS/iOS have XPC, which is similar to Binder. Binder is a BeOS-era thing. Parts of Android were written by former Be engineers, so the API terminology is the same (binders, loopers, etc).
Binder is also somewhat like Mojo in that you can do fast in-process calls with it, iirc. The problem is that, as you note, this isn't very useful in the Android context because within a process there's no way to keep a handle private. Mojo's ability to move code in and out of processes actually is used by Chrome extensively, usually either for testing (simpler to run everything in-process when debugging) or because not every OS it runs on requires the same configuration of process networks.
Why not?
Answering the question: “Can this process access this resource?” is equivalent to solving the halting problem.
There’s a reason simpler access control models are popular. Even ACLs are completely untenable in practice. Look at all the trouble accidentally-public s3 buckets create.
Now I am aware that answering the question is NP-hard, but why (and how) is it equivalent to solving the halting problem?
A module has a line of code that gives the capability to the component we are asking about.
Is that line of code executed?
Replace the line with “halt”, and change the question to “Does this program halt?”
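To make the reduction concrete, here is a tiny sketch (all names invented, nothing from a real sandboxing API): statically deciding whether the capability grant on the last line is ever reached is exactly the problem of deciding whether the loop before it terminates.

```java
// Hypothetical sketch: whether grantNetwork() executes depends on whether
// the arbitrary computation guarding it halts.
interface Sandbox { void grantNetwork(); }

class HaltingGrant {
    static void maybeGrant(Sandbox sandbox, long n) {
        // Collatz-style iteration: no general procedure can decide, for
        // arbitrary code in this position, whether the loop ever exits.
        while (n != 1) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        }
        sandbox.grantNetwork();   // reached if and only if the loop halts
    }
}
```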
Yup, this is all really hard, which is why it hasn't been much more than a research project up to this point.
If I had to guess, the supply chain problems that may eventually cause this to be created will need to get, oh, I don't know, call it two orders of magnitude worse before the system as a whole really takes note. Then, since you can't really write a new language just for this, even though I'd like that to happen, it'll get bodged on to the side of existing languages and it won't be all that slick.
That said, I do think there's probably some 80/20 value in creating an annotation to the effect of "this library doesn't need filesystem access or sockets" and having a linter or some other tool validate it externally to the main language compiler/runtime. The point of this would not be to solve the capabilities problem for libraries that are doing intrinsically tricky things, because that's really hard to do correctly, but just to get a lot of libraries out of the line of fire. There are a lot of libraries that already don't need to do those things, and more that could easily be tweaked to just take passed-in file handles or whatever if there were a concrete reason to design them that way.
The library that I personally could do the most damage with on my GitHub is a supervision tree library for Go. It doesn't need any capabilities to speak of. The closest thing is that you can pass in a logger object and that is constrained to specific calls too. Even a hack that just lets me say that this library doesn't need anything interesting would at least get that out of the set of libraries that could be exploited.
Or to put it another way, rather than trying to perfectly label all the code doing tricksy stuff, maybe we can start by labelling the code that doesn't.
I'd also point out that I think the question of libraries is different than things like Chrome isolation. Those things are good, but they're for treating data carefully and limiting blast radiuses; I'm looking at the problem of "if I download this library and miss one single file is it going to upload every AWS token it can find to someone who shouldn't have them".
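As a purely hypothetical illustration of that kind of declaration (in Java syntax for concreteness; no such annotation or checker exists), it could be as small as this, with an external linter rejecting any use of java.io, java.net, or ProcessBuilder in packages that claim to need nothing:

```java
// Hypothetical declaration only; the names and the checker are invented.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

enum Capability { FILESYSTEM, NETWORK, SUBPROCESS }

@Retention(RetentionPolicy.CLASS)
@Target(ElementType.PACKAGE)
@interface Needs {
    Capability[] value() default {};   // empty = "nothing interesting": no files, sockets, or subprocesses
}

// In the library's package-info.java:
// @Needs({}) package com.example.somelib;
```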
The right place to start is definitely a deep study of the Java SecurityManager because it came the closest to what seems to be needed here, and yet failed. One reason it was hard to use is that many permissions transitively implied others without that being obvious. For example, you said you want an annotation that lets you drop filesystem access or sockets. But that's not enough, is it?
- Loading native code = granting all permissions
- Access to the unsafe package = granting all permissions
- Many syscalls that write data to user buffers = granting all permissions
- Being able to run a sub-process = granting all permissions
So you need at minimum to exclude all of those too. But then you also have:
- Tampering with global state of any kind e.g. the default HTTP server can be mutated by anything in Go (I think?). If you can modify the logging system to write to a new location you might be able to use that to escape the sandbox.
- Deserialization of objects can be a sandbox escape.
And then what about the threat model? If you can cause every process that includes your library to segfault simultaneously, that's a DoS attack on a company that can profitably be used for extortion. Are DoS-driven extortions in scope or out? This is why System.exit is a permission in the SecurityManager.
And so on. The number of ways you can accidentally configure an exploitable set of permissions is huge, and because nobody seemed to care much, there was no tooling to help avoid such misconfigurations.
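To make the transitivity point concrete, here is a minimal sketch using the real (now deprecated-for-removal) SecurityManager API; the class and permission names are the actual JDK ones, but the deny-list itself is illustrative, not a recommendation.

```java
// Sketch only: flags a few "single" permissions that are transitively
// equivalent to AllPermission, as described above.
import java.security.Permission;

public final class AuditingSecurityManager extends SecurityManager {
    @Override
    public void checkPermission(Permission perm) {
        String name = perm.getName();

        // Loading native code runs outside every in-VM check.
        if (perm instanceof RuntimePermission && name.startsWith("loadLibrary."))
            throw new SecurityException("native code = everything: " + perm);

        // Access to Unsafe-style internals lets you rewrite arbitrary memory.
        if (perm instanceof RuntimePermission
                && (name.equals("accessClassInPackage.sun.misc")
                    || name.equals("accessClassInPackage.jdk.internal.misc")))
            throw new SecurityException("Unsafe = everything: " + perm);

        // Spawning a subprocess escapes the SecurityManager entirely.
        if (perm instanceof java.io.FilePermission && perm.getActions().contains("execute"))
            throw new SecurityException("subprocess = everything: " + perm);

        // A real manager would delegate the rest to the policy:
        // super.checkPermission(perm);
    }
}
// Installed via System.setSecurityManager(new AuditingSecurityManager()),
// an API that modern JDKs have deprecated for removal.
```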
I did not see a link to your screenshotted comment in the article, so I looked it up to save others the work: https://news.ycombinator.com/item?id=43936830 (not the whole thing is screenshotted, and the comment and its thread are good, thanks!)
Oops, sorry, there was a link and it seems to have been removed during editing. I put it back, thanks!
Shouldn’t it be automatic based on behavior? An annotation is ripe for exploitation if the system itself can’t make sense of its own parts.
That would be preferable, certainly. I'm staying vague because there's a world of differences in how all the languages would most easily be able to implement something like this and I'm trying to stay language non-specific. Imagine how you'd solve this in Rust versus Ruby.
It's unclear to me why the "god object" pattern described in the article isn't a good solution. The pattern in the article is different from the "god object" pattern as commonly known[1], in that the god object in the article doesn't need to be referenced after initialization. It's basically used once during initialization then forgotten.
It's normal for an application to be built from many independent modules that accept their dependencies as inputs via dependency inversion[2]. The modules are initialized at program start by code that composes everything together. Using the "god object" pattern from the article is basically the same thing.
[1] https://en.wikipedia.org/wiki/God_object [2] https://en.wikipedia.org/wiki/Dependency_inversion_principle
The God Object needs to be referenced at least transitively by the wrappers you hand out, and its API surface needs to grow constantly to encompass any possibly security-sensitive operation.
Dependency injection doesn't help here much, at least not with today's languages and injectors. The injector doesn't have any opinion on whether a piece of code should be given something or not, it just hands out whatever a dependency requests. And the injector often doesn't have enough information to precisely resolve what a piece of code needs or resolve it at the right time, so you need workarounds like injecting factories. It could be worth experimenting with a security-aware dependency injector, but if you gave it opinions about security it'd probably end up looking a lot like the SecurityManager did (some sort of config language with a billion fine grained permissions).
What is "injector"? Is this the Service Locator anti-pattern?
The application's entry point takes the God Object of capabilities, slices it according to what it thinks its submodules should be able to access, and then initializes the submodules with the capabilities it has decided upon. Obviously, if some submodules declare that they need access to everything plus a kitchen sink, the choice is either a) give up and give them access to everything, or b) look for a replacement that requires fewer capabilities.
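A minimal sketch of that wiring in Java (all interface and class names here are invented for illustration): the entry point is the only code that ever sees the full capability set, and each submodule receives only the narrow slice it needs.

```java
import java.util.HashMap;
import java.util.Map;

// Narrow capability interfaces handed to submodules.
interface Clock { long now(); }
interface BlobStore { byte[] get(String key); void put(String key, byte[] value); }
interface Mailer { void send(String to, String subject, String body); }

/** The "God Object": referenced once at startup, then forgotten. */
record Capabilities(Clock clock, BlobStore store, Mailer mailer) {}

/** A submodule that can read the clock and the store, but cannot send mail. */
final class ReportGenerator {
    private final Clock clock;
    private final BlobStore store;
    ReportGenerator(Clock clock, BlobStore store) { this.clock = clock; this.store = store; }
    String run() { return "report generated at " + clock.now(); }
}

final class Main {
    public static void main(String[] args) {
        Capabilities caps = bootstrap();                                 // built once from real resources
        var reports = new ReportGenerator(caps.clock(), caps.store());   // sliced
        System.out.println(reports.run());
        // caps is not stored anywhere else; after wiring it can be garbage collected.
    }

    // Stub implementations so the sketch runs; a real app would wrap the
    // filesystem, an SMTP client, etc.
    private static Capabilities bootstrap() {
        Map<String, byte[]> data = new HashMap<>();
        BlobStore store = new BlobStore() {
            public byte[] get(String key) { return data.get(key); }
            public void put(String key, byte[] value) { data.put(key, value); }
        };
        return new Capabilities(System::currentTimeMillis, store, (to, subject, body) -> {});
    }
}
```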
There's a great paper implementing this idea in the Node.js ecosystem: [BreakApp: Automated, Flexible Application Compartmentalization](https://ic.ese.upenn.edu/pdf/breakapp_ndss2018.pdf), which modifies the `require` signature to allow specifying a security scope in which the module can be run.
It doesn't quite work at the capabilities level, but it does provide some novel protections against unusual supply-chain attacks such as denial-of-service attacks which may otherwise require no special capabilities.
Thanks for the BreakApp paper. I read it this morning. This sort of thing is heading in the right direction and I've explored it before (without writing a paper), but there are some problems with their approach that they seem to have ignored which ended up sucking up a lot of the time I put into it. OTOH there are lots of good points: the usability issues are front and center, and they (very roughly) sketch out a threat model.
The first problem is that their attempt to abstract program location has a lot of bugs. You can't solve cross-process GC with the approach they outline (in, like, one paragraph). Using finalizers to release cross-process references fails the moment there's a reference cycle. Tracing cycles across heaps requires a new kind of GC that doesn't exist today, as far as I know, made much harder by the fact that they're including DoS attacks in their threat model. And intercepting value writes then batching them to save IPCs changes program semantics in pretty subtle ways. I think this is actually the core problem of sandboxing libraries and everything else is a well-understood implementation distraction, so it'd be good to have a paper that focused exclusively on that. They also seem to think that modules return DAGs, but they don't: program heaps are directed cyclic graphs, not acyclic.
The second problem is that their policy language is too simple and doesn't have any notion of transitive permissions. This is the same issue that made SecurityManager hard to use. You can grant a module filesystem access but in a typical JS app being tested or run on a developer's desktop, that's equivalent to granting all permissions (because it can write to some module directory). Even if you use Docker to ship the app there's no guarantee the js files inside the container are read only, as in a container programs often run as root.
The third problem is the only sandbox they offer is LXC containers, but containers aren't meant to be sandboxes and often aren't "out of the box". And of course they're Linux specific but development of JS apps often takes place on non-Linux machines. The details of actually doing kernel sandboxing for real are rather complex.
Still, something like this is the right direction of travel. The usability issues with process sandboxing arise due to performance problems and the harshness of the address space transition. Allowing object graphs to be 'shadowed' into another process, with proper handling of memory management, and then integrating that into language runtimes seems like the right approach.
Hadn't heard of BreakApp! Paper author Nikos Vasilakis also contributed to Mir (https://github.com/andromeda/mir).
This is similar to my work on LavaMoat (https://lavamoat.github.io/), which provides runtime supply-chain security protections to JS apps (Node.js or browser) by eliminating ambient authority and only exposing global capabilities per npm package according to user-specified policy. LavaMoat is used in production at MetaMask, protecting ~300M users.
Ecosystems get it right because they have to. E.g. iOS and Android etc. This ain't so good on desktop systems.
Probably the compiled program should just get the permissions it needs.
A simple capability system for libraries might be the good that is the enemy of perfect:
- Pure - can only access compute and its own memory plus passed-in parameters (needs immutable languages or serialization at interop)
- Storage IO - Pure, but can do storage IO. IO on what? Anything the program has access to.
- Network IO - similar concept
- Desktop in Window - can do UI stuff in the window
- Desktop General - modals, notifications, new windows, etc.
- Etc...
Not very fine-grained, but many libraries can be Pure.
It ain't perfect.
A Pure library that formats a string can still inject some nasty JS hoping that you'll use that string on a web page! Ultimately... useful computation is messy and you can't secure everything in advance through capabilities alone.
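A hedged sketch of how those coarse tiers could look as host-provided context objects (Java for concreteness; every name here is invented): a library advertises its tier through its constructor or entry-point signature, and the host passes in exactly that tier and nothing broader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.util.List;

// Invented tier interfaces, roughly matching the list above.
interface PureContext { /* nothing: compute and passed-in parameters only */ }

interface StorageContext extends PureContext {
    InputStream openRead(String path) throws IOException;
    OutputStream openWrite(String path) throws IOException;
}

interface NetworkContext extends PureContext {
    Socket connect(String host, int port) throws IOException;
}

// A Pure library: takes data in, returns data, receives no context at all.
final class CsvFormatter {
    static String format(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) sb.append(String.join(",", row)).append('\n');
        return sb.toString();
    }
}

// A Storage IO library: its signature makes the tier it needs visible.
final class DiskCache {
    private final StorageContext storage;
    DiskCache(StorageContext storage) { this.storage = storage; }
}
```

This says nothing about which files or hosts a library may touch; it only makes the tier visible at the API boundary, which is exactly the "not very fine-grained" trade-off described above.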
IIUC iOS and Android don't have library sandboxing; any code that you allow to run in your app's process can access the whole address space. Apps themselves are sandboxed, but that doesn't help with the class of problem that this post is about.
Android does have it, although its adoption is currently optional.
https://privacysandbox.google.com/private-advertising/sdk-ru...
https://files.spritely.institute/papers/spritely-core.html
Spritely seems very relevant, but I don't see it get much mention when this topic pops up.
The capability system I hear talked about too little for some reason, and which is even more "chromey baby", is workerd using isolates. You can clearly see the lineage from Sandstorm/Cap'n Proto, and it's kind of crazy that something like this is finally in a mainstream platform. Sure, the concept is not taken to the extreme, without much possibility to delegate/demote capabilities at runtime, but the direction is clearly what we need. Whenever I have to come back to other environments I immediately feel the lack of trust, clarity, and control they have.
For some reason I'm having a horrible time googling workerd, any tips? -- also, I might be looking up the wrong thing? (I'm hoping to learn more about the environment you are in before you "have to come back to other environments")
https://github.com/cloudflare/workerd
An interesting analogy could be Rust's unsafe. One important property of unsafe is that it's not inherently contagious. A safe function can contain an unsafe block, and that unsafe block can contain unsafe code. What this entails in practice is the function's author pinky-swearing that, despite its internals, the function won't violate memory safety. The analogy isn't exact. In particular, functions are allowed to place human-language constraints on their callers, which aren't verified, in order to uphold that guarantee. I wonder why there hasn't been any work done in this direction. It seems promising, and if there are any obvious walls it hits, at least I haven't figured them out.
unsafe blocks aren't anywhere near enough. Java code inside the sandbox couldn't do unsafe operations, yet sandbox escapes happened fairly regularly.
Ahh, JAAS... Java applets were removed, but JAAS, which only existed to support applets, could not be removed because there was lots of code that depended on the Login and Subject classes and the runAs() method. Why would such code exist though, if JAAS existed only to support applets? Well, because the Login class could be used to acquire credentials. For example the Krb5Login class could be used as a Java kinit in Kerberos environments.
Anyways, JAAS' Permission class and model are weak, but yeah, they could be used to limit libraries' capabilities. A capability model would be much better than a permission model.
JAAS is on death row and will be gone.
https://openjdk.org/jeps/411
https://openjdk.org/jeps/486
Hmmm, but Login and Subject and related classes stick around, right?
I wouldn't count on that; since Java 9 there has been a more aggressive approach towards deprecated code, hence @Deprecated(forRemoval = true) is now possible.
https://docs.oracle.com/en/java/javase/24/docs/api/java.base...
=> "The following methods in this class for user-based authorization that are dependent on Security Manager APIs are deprecated for removal: "
Surprised Pony isn't mentioned:
https://www.ponylang.io/
Link on capabilities specifically:
https://tutorial.ponylang.io/object-capabilities/object-capa...
Without speaking to all of the issues, this is all made much harder by the underlying hardware having extremely bad defaults. The idea that running code on the hardware is itself an unsafe operation means that any time you want to touch it you need proxies and intermediate languages and all this by default.
It's pretty easy for me to imagine a world where running code was safe by default, and this followed all the way to the top. It's obviously not that onerous, else JavaScript wouldn't be as successful as it is. Most of the details the post touches on are then just package management and grouping concerns.
> The idea that running code on the hardware is itself an unsafe operation
Ignoring microcontrollers and tiny embedded stuff, no hardware or modern operating system I know of works that way.
Modern hardware almost all has an MMU (which blocks I/O once the process table is set up), and most have an IOMMU (which partitions the hardware so that mutually distrusting operating systems can run directly on the same machine).
The remaining architectural holes are side channel / timing attacks that hit JS just as hard as bare metal.
Process isolation only works across processes. You can't just execute an untrusted block of code without setting up a whole sandbox for it.
Okay, so let's say you have an untrusted block of code which tries to calculate a silly hash of all the memory it can reach via indirect loads, and then to zero whatever memory it can reach via indirect stores (which is usually just the whole of the process's memory on modern systems, in both cases). What mechanisms do you propose that would allow one to blindly run this code without erasing all kinds of precious information in memory, ideally still returning 42 in r0 to the caller in the end, but without leaking any sensitive information via r1?
CHERI
Quite lapidary.
But enlightening: I did not previously know that CHERI had an explicit tool (CCall) to implement an unspoofable "restore privileges and return from subroutine" instruction.
I think Newspeak may have potential in this area.
https://newspeaklanguage.org/
I’m really stuck on a relatively minor point: why does Joe-E have to ban the finally keyword?
I had an explanation of that but deleted it because the article is already too long. The Joe-E paper explains their reasoning. Briefly, Java uses exceptions to indicate certain kinds of errors that might leave the application in an undefined state, like stack overflows, running out of memory, and so on. Finally blocks allow you to execute code after those events occur. Therefore, sandboxed code could end up running inside a VM that has entered a somewhat indeterminate state.
This highlights a subtle detail of sandboxing schemes that is often overlooked. The guarantees Java provides around safety are tightly scoped, and often amount to little more than saying the JVM itself won't crash. It's not that hard to arrange for a stack overflow to occur while some standard library code is running, which means execution can abort in nearly any place. If the code you're calling into isn't fully exception-safe, the library's global variables (if any) can be left in a logically corrupted state, which might be exploitable.
If Java finally blocks had a filter clause, that could help, but finally is sometimes implicit as with try-with-resources.
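A contrived sketch of the failure mode being described (not from the Joe-E paper; the code is invented): a StackOverflowError aborts a "library" operation halfway through, the error is swallowed, and a finally block then runs against the corrupted global state.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyAfterVmError {
    // "Library" global state with an invariant: entries are always added in pairs.
    static final List<String> registry = new ArrayList<>();

    static void addTwoEntries() {
        registry.add("first");
        recurse(1);               // blows the stack partway through the operation...
        registry.add("second");   // ...so this line never runs
    }

    static void recurse(int depth) { recurse(depth + 1); }

    public static void main(String[] args) {
        try {
            addTwoEntries();
        } catch (StackOverflowError e) {
            // The VM survives, but the library's invariant is now broken.
        } finally {
            // Sandboxed code placed here would observe the corrupted state.
            System.out.println("registry size = " + registry.size());   // prints 1
        }
    }
}
```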
Per Wikipedia: non-deterministic execution.
https://en.wikipedia.org/wiki/Joe-E
See also: formal verification methodologies like those applied to seL4. The holistic practice of rigorous proofs of safety and correctness is the elephant in the room, beyond mere testing, language features, or any single solution.