One of the authors here. Would enjoy hearing what you think
I remember implementing this decades ago. Brings me back to see it again :)
Your implementation looks nice.
You are right that often lots of lattice tricks and such are played in real implementations.
If you are worried about speed and memory usage while dealing with context sensitivity, and want to make the lattice taller or wider, incorporate BDDs or a similar mechanism. Somewhat pointless for single functions, but as context sensitivity grows, you either need BDDs or summaries (or some form of auto-sharing) if you want to represent nice large lattices.
Lattice-value BDDs are a thing too, so you don't have to spend a bunch of time figuring out how to represent the lattice as a BDD (alternatively, you can use something like Souffle and friends, which will take care of it under the covers) :)
This is the historically normal mechanism for effective representation of this sort of problem, because it deals with sharing of equivalent values, etc. for you.
Also, if you want later passes to take advantage of the final lattice information, there are ways to represent it in the IR effectively, the same way you would with value ranges.
In value ranges, you would do:
if (a0 > 5) { a1 = range_info(a0, >5) } else { a2 = range_info(a0, <=5) }
You can do the same thing with the final type info to pass it along to later things.
The limit of this kind of splitting is basically SSI and friends.
If you are interested in the general theory of splitting and using it to represent info or solve other types of problems faster, see: https://arxiv.org/pdf/1403.5952 and friends.
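That splitting idea can be sketched in a few lines (hypothetical helper names, assuming a simple interval domain; this is not the IR encoding any particular compiler uses): at a branch on `a0 > 5`, each arm gets its own name carrying the fact the branch proved, so later passes can just look the fact up.

```python
# Toy version of range splitting at a branch (all names are made up).
def split_on_branch(facts, var, pivot):
    """Refine `var`'s (lo, hi) range for the true/false arms of `var > pivot`."""
    lo, hi = facts[var]
    true_facts = dict(facts, **{var + "_t": (max(lo, pivot + 1), hi)})
    false_facts = dict(facts, **{var + "_f": (lo, min(hi, pivot))})
    return true_facts, false_facts

facts = {"a0": (0, 100)}
t, f = split_on_branch(facts, "a0", 5)
assert t["a0_t"] == (6, 100)   # on the true arm, a0 > 5
assert f["a0_f"] == (0, 5)     # on the false arm, a0 <= 5
```

The same shape works for type facts instead of ranges: split the name at the branch and attach the refined type to each arm's name.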
I was wondering what you would use to find function callers—happy to see k-CFA employed!
I wasn’t able to get too deep in your article; have you looked at demand-CFA for making some analysis cheaper?
How is analysis going for higher-order functions?
Cool article! I was wondering how high k goes in production? You can easily get call stacks that are 20-100 depth so the theoretical maximum is around 100 but I imagine that would be very very slow
I don't know how deep call stacks go (and separately, I don't know if I'm allowed to quote stats like that), but people tend not to do points-to with k higher than 2 or 3, because it explodes the whole graph
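The blow-up is easy to see with back-of-the-envelope arithmetic (the numbers here are illustrative, not production stats): with C call sites, there are up to C^k possible length-k call-site contexts.

```python
# Rough upper bound on k-CFA contexts: every length-k string of call
# sites is a potential context. The call-site count is made up.
call_sites = 1000                  # a mid-sized program

for k in (1, 2, 3):
    contexts = call_sites ** k
    print(f"k={k}: up to {contexts:,} contexts")
# k=3 already admits a billion contexts per analyzed entity.
```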
IIRC the complexity of k-CFA is exponential in k, which is not great. There’s another version called m-CFA; I don’t remember exactly what it does differently, but it doesn’t blow up exponentially
Yes, the time complexity of m-cfa is polynomial.
See https://arxiv.org/pdf/1311.4231
I'll borrow a few choice pieces from the paper to try to explain why.
The main difference is that:
k-CFA is sensitive to the last k calls.
m-CFA is sensitive to the top m stack frames.
These turn out not to be equivalent. Compare k=1 with m=1 on a program where main calls a, which calls b: in k-CFA, the context for b will be the call to b; in m-CFA, it will be the call to a.
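A toy encoding of that main → a → b example (helper names are made up, just to make the distinction concrete): k-CFA keeps the last k call sites on the path, while m-CFA keeps the call sites that created the caller frames still on the stack.

```python
# Toy model of k-CFA vs m-CFA contexts on main -> a -> b (made-up names).
call_sites = ["main->a", "a->b"]   # call sites executed, in order

def kcfa_context(sites, k):
    """Last k call sites: for k=1 while analyzing b, the call to b."""
    return tuple(sites[-k:])

def mcfa_context(sites, m):
    """Entry sites of the top m caller frames: for m=1, the call to a."""
    return tuple(sites[:-1][-m:])

assert kcfa_context(call_sites, 1) == ("a->b",)     # the call to b
assert mcfa_context(call_sites, 1) == ("main->a",)  # the call to a
```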
How m-CFA came about and why it works:
k-CFA is EXPTIME-complete (so no polynomial algorithm can exist).
However, it was observed that there were plenty of provably k-CFA-equivalent points-to analyses for OO languages that ran in provably polynomial time (and none for functional languages). Diving into this led folks to discover that the difference between objects and closures matters a lot to the time bounds, and led to the formulation of m-CFA.
Space complexity was never as much of an issue as time complexity was, due to BDDs and other things.
It's not really supposed to be for any particular language. We didn't want to even say Ruby in the title because the experiment is on a very very (very!) stripped-down Ruby.
IMO the more interesting thing here is incrementally discovering and building the interprocedural flows-to relation.
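For a sense of what incremental discovery can look like, here is a toy worklist sketch (my own encoding, not the article's actual implementation): each newly discovered edge propagates only the facts it adds, rather than recomputing the whole relation.

```python
# Toy incremental flows-to: discover edges one at a time and propagate
# only the new facts backward to predecessors. All names are made up.
from collections import defaultdict

edges = defaultdict(set)     # x -> successors of x ("x flows to y" edges)
flows_to = defaultdict(set)  # x -> everything reachable from x so far

def add_edge(src, dst):
    """Discover a new edge and propagate only the facts it adds."""
    work = [(src, dst)]
    while work:
        x, y = work.pop()
        new = ({y} | flows_to[y]) - flows_to[x]
        if not new:
            continue                     # nothing changed; stop propagating
        flows_to[x] |= new
        for p, succs in edges.items():   # predecessors of x may reach more now
            if x in succs:
                work.append((p, x))
    edges[src].add(dst)

add_edge("a", "b")
add_edge("b", "c")           # incrementally extends a's flows-to set
assert flows_to["a"] == {"b", "c"}
```

A real implementation would index predecessors instead of scanning all edges, but the incremental shape is the same: new fact in, affected nodes re-run, fixpoint reached.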
> We know from experience that YJIT compiles over 9000 methods when running Shopify’s production code
All your methods are belong to us?