A new OS architecture for heterogeneous multicore systems analysis
Provide a short summary of the paper:
This paper advocates for a redesign of operating systems to reflect the
distributed nature of the underlying hardware. Future generations of
hardware will offer larger numbers of cores that are also heterogeneous. At
the same time, the interconnect between these cores will be subject to many
of the same problems currently experienced in networks. Furthermore, the
larger number of cores will make cache coherency much more complicated and
make it more difficult to provide the shared-memory abstraction. So, the
authors propose that new operating systems be composed of multiple kernels,
one for each core, and that those kernels communicate strictly through
well-defined message-passing interfaces. Then, operating systems will look
very similar to distributed systems and will be able to leverage all of the
advances in that field.
What is the strength of this paper:
The authors of this paper provide an actual operating system implementation
that they benchmark on a variety of machines to substantiate their claims.
As such, they are able to demonstrate very good scaling characteristics.
What is the weakness of this paper:
The authors are unable to ensure that the performance gains witnessed as
compared to Linux and Windows are due specifically to the multi-kernel model
advocated for in this paper. It is possible that some of these differences
are due to implementation artifacts or other factors.
Your qualifications to review this paper:
I know the material, but am not an expert.
Writing quality:
Good.
Relevance to SOSP:
People would read it before conference and attend session.
Experimental Methodology:
Good.
Novelty of Paper:
This is a new contribution to an established area.
Overall paper merit:
Strong accept.
Additional comments:
Would it have been possible to show the incremental improvements of each
aspect advocated for in the paper? For example, could blocking and
non-blocking operations be compared for a workload in AnonOS? Similarly,
could shared memory and message-passing costs be compared within AnonOS?
This would ensure that all of the gains witnessed are due specifically to
the functions advocated for in this paper.
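To make the blocking versus non-blocking comparison above concrete, here is a minimal sketch in Python. It is not AnonOS code: the `Core` class, the message tuples, and both update functions are hypothetical names invented for illustration, and the cores are simulated in a single thread. It shows one replicated value being updated either one core at a time (waiting for each acknowledgement) or in split-phase style (issue all requests, then collect the acknowledgements).

```python
from collections import deque


class Core:
    """Hypothetical per-core kernel: a private replica plus a message inbox."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.inbox = deque()   # messages from other cores, oldest first
        self.replica = {}      # this core's private copy of the OS state

    def send(self, dest, msg):
        """Deliver msg to dest's inbox; the only way cores interact here."""
        dest.inbox.append((self, msg))

    def step(self):
        """Handle one pending message; return True if it was an ack."""
        if not self.inbox:
            return False
        sender, (kind, key, value) = self.inbox.popleft()
        if kind == "update":
            self.replica[key] = value          # apply to the local replica
            self.send(sender, ("ack", key, value))
            return False
        return kind == "ack"


def blocking_update(origin, others, key, value):
    """Update each replica in turn, waiting for its ack before moving on."""
    origin.replica[key] = value
    serialized_waits = 0
    for core in others:
        origin.send(core, ("update", key, value))
        core.step()                  # remote core applies the update and acks
        if origin.step():            # origin stalls here until the ack arrives
            serialized_waits += 1
    return serialized_waits


def split_phase_update(origin, others, key, value):
    """Issue every request first, then collect the acks in a second phase."""
    origin.replica[key] = value
    for core in others:              # phase 1: send without waiting
        origin.send(core, ("update", key, value))
    for core in others:              # remote cores make progress on their own
        core.step()
    acks = 0
    while origin.inbox:              # phase 2: drain the acknowledgements
        if origin.step():
            acks += 1
    return acks


cores = [Core(i) for i in range(4)]
waits = blocking_update(cores[0], cores[1:], "page", "mapped")
acks = split_phase_update(cores[0], cores[1:], "page", "unmapped")
print(waits, acks, [c.replica["page"] for c in cores])
```

Both variants leave every replica with the same value. What a real measurement would capture is that the blocking version serializes one round trip per core, while the split-phase version lets the requests overlap.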
| Attribute | Value |
|---|---|
| Provide a short summary of the paper | The paper describes the multikernel, a new OS design that uses message passing for communication. |
| What is the strength of the paper? (1-3 sentences) | They have an actual implementation of an OS that is very different from conventional OSs, in an area that is interesting to many people. They have a decent evaluation. |
| What is the weakness of the paper? (1-3 sentences) | I didn't like the graphs earlier in the paper with no explanation of where they came from. I also didn't find that the paper really drew me in. There are a lot of details without strong motivation. |
| Your qualifications to review this paper | |
| Writing Quality | |
| Relevance to SOSP? | |
| Experimental Methodology | Good |
| Novelty of paper | This is a new contribution to an established area/This is very novel |
| Overall paper merit | |
| Provide additional detailed comments to the author | The paper is in a good area with good motivation for a new OS, but the writing could be a little better. There's also a jump from the high level to the details, and it would be nice to see more of the OS design. |
| Additional comments to PC (not seen by author) | |
Short summary: The authors argue that an operating system should be treated like a distributed message-passing system where shared state is made explicit rather than implicit. They contend that this would allow a machine to consist of very heterogeneous cores running different systems software yet still communicating, and would scale to very large numbers of these cores. The design has a preliminary implementation which reveals no significant performance cost on current hardware for this approach.
Paper strengths: The design is of current interest, and using current shared-memory machines to implement a message-passing system is easy and convincing because the architectures are fundamentally message based under the hood. The paper makes several strong arguments for this architecture, such as dealing with heterogeneity (citing network and graphics cards in particular as "too smart" to be accessed as dumb peripherals, as current techniques require) and adding the ability to batch updates for more efficient bus utilization: the typical message-system arguments.
Paper weaknesses: The exact design of their operating system seems strangely vague to me considering they have actually implemented it. Processes are run using something like scheduler activations, but how memory sharing is coordinated between these processes (they say it's possible) isn't clear. They really have two designs going on: one for an ideal architecture that is natively message based, and one that is actually implemented on shared hardware. How exactly do mechanisms like VM work in the implementation, and does this differ from how they would work on the shared architecture? Where does the networking stack go? Some of this is buried in there, but it could be much better organized. Furthermore, what are the implications of the split-phase nature? I think a few examples of how you would implement kernel subsystems would really help. In general, what hardware architecture are they proposing? It's worth noting that message systems (microkernels) did not originally take off, more or less because the constant factor was too big, and this shows up in their charts for a single core. They're arguing that this has changed now that we have lots of cores and what we care about is further out on that graph. The fundamental argument of this paper is that systems should be based on message passing. Good argument. However, by introducing an actual system, they add a huge number of questions which can't really be answered in the same paper as a position paper arguing for messages. I'm not sure they quite manage to finesse these two tasks.
Qualifications: 3?
Writing quality: excellent.
Relevance to SOSP: 4 or 5?
Experimental methodology: The preliminary implementation is an important step towards evaluating the proposed architectural change, and seems sufficient for a first step. This feels like a very reasonable thing to do and furthermore is well motivated, so the burden of proof may be somewhat lower. The mix of macro- and micro-benchmarks serves to illuminate the strengths while arguing that the architecture has no inherent inefficiencies. It would have been nice to see a carefully engineered implementation of one of the numerical tasks achieve an optimal speedup. The loopback tests and 9(c) especially leave me somewhat curious, since in the case of 9(c) it's not clear that the performance of the systems "do not differ significantly."
Novelty: They cite a lot of work, although I wish there were more discussion of the difference between this system and microkernels; they do mention exokernels and two true microkernel references, but I would like some more discussion of this. [I think] the most novel element is advocating exposing message passing among heterogeneous hardware, although there certainly are things like this in the supercomputer world. Maybe this is an indication that this is a good idea, since those systems are definitely scalable.
Overall Paper merit: Accept
I agree with many of their arguments, and think that the
implementation does a reasonable job of making the point that there is
no inherent bottleneck. However, the real coup would be demonstrating
a system where some significant benefit is attained, or one that
integrates heterogeneity much better than is possible today. Their
mention of an ARM port makes me think they agree with this point.
There are several very good papers in all this work, and the question
is how to divide it. I'm not sure this piece exactly hits the nail on
the head, but it's a nice piece of work, with a lot of nice
engineering techniques.
Summary: This paper focuses
on the problem that hardware is changing and diversifying faster than
operating system software. The internal organization of a
general-purpose computer increasingly resembles a network rather than a
single system; therefore it is no longer useful to tune a general
purpose OS design for a particular hardware model. The approach that
this paper takes is to model a new OS structure, the multikernel, that
treats the machine as a network of independent cores, assumes no
inter-core sharing at the lowest level, and moves traditional OS
functionality to a distributed system of processes that communicate via
message-passing. The authors provide the main design principles of
replication, message passing and split-phase invocation and implement
the multikernel model (AnonOS). They show that AnonOS is scalable and
adapts to a diverse set of hardware characteristics, while providing
competitive performance.
What is the strength of the paper? (1-3 sentences) The main
contribution of this paper is to treat the machine as a network of
independent cores and then to make use of previous ideas in distributed
systems. I believe that this core idea has a lot of potential for
expansion. Furthermore, this paper is very well structured with a clear
problem statement, the identification of the design principles for
their model and a detailed justification for these choices, and finally
an implementation and evaluation of the model.
What is the weakness of the paper? (1-3 sentences) One weakness of
the paper is that it does not present the negative aspects of the work
or state any issues the authors noticed while implementing AnonOS.
Your qualifications to review this paper: 2
Writing quality: 5
Comment: The writing was clear and I liked how the authors motivated
each part and each design decision of the paper.
Relevance to SOSP? 5
Comment: I think it is a very interesting paper and tackles a problem
relevant to the community.
Experimental methodology: 4
Comment: There is a nice parallel between section 5 and 3.4 where the
authors evaluate all the goals stated previously, but in some cases the
presentation could have been a little clearer.
Novelty of paper: 5
Comment: Their idea of treating the machine as a network of independent
cores with no directly shared memory is, to the best of my knowledge,
very novel.
Overall paper merit: 5
Provide additional detailed comments to the author: I have written the
comments with each question instead.
| Attribute | Value |
|---|---|
| Provide a short summary of the paper | The emergence of heterogeneous multicore architectures necessitates a complete redesign of the underlying OS structure used to manage them. Rather than looking at the OS from a traditional point of view as one based on shared-memory data structures and lock-based contention mechanisms, the OS of the future should be viewed as a distributed system of interacting components that communicate via message passing. Viewing the OS as a distributed system allows us to customize the lowest levels of an OS to take advantage of the particular features of a given hardware platform, while maintaining a good layer of abstraction for any necessary communication between core OS services. |
| What is the strength of the paper? (1-3 sentences) | The case is made pretty well as to why you might like to have an OS designed from the ground up as a distributed system. The design and implementation of AnonOS are clear, and the experiments run are pretty convincing for the evaluation criteria they laid out, i.e. comparable performance to OSs running on today's hardware, but with much better scalability properties for future hardware. |
| What is the weakness of the paper? (1-3 sentences) | The paper seems to be very repetitive in places. Every new section seems to re-provide motivation for designing the OS as a distributed system. The case is made well in sections 1/2; leave it at that. The evaluation could have been done on more advanced architectures, such as the Intel Core i7 or the AMD Barcelona. Additionally, it would have been interesting to see how well it scaled in a virtual machine setting configured for 256+ cores, even if the performance wasn't too great. |
| Your qualifications to review this paper | |
| Writing Quality | |
| Relevance to SOSP? | |
| Experimental Methodology | The authors verify their design through a set of experiments designed to demonstrate that it (1) has good baseline performance, (2) scales with the number of cores, (3) is portable with minimal refactoring, (4) exploits message passing throughout, and (5) is sufficiently modular to make use of topology-awareness. |
| Novelty of paper | OS as a distributed system... |
| Overall paper merit | |
| Provide additional detailed comments to the author | |

| Attribute | Value |
|---|---|
| Provide a short summary of the paper | The paper describes a new operating system design, the multikernel, for many-core processors. The idea is to treat each core as a semi-independent device that communicates with other cores through message passing. The system design is similar to microkernels in that the OS consists of userspace services (running on separate cores) but differs in that even the kernel does not have shared memory. |
| What is the strength of the paper? (1-3 sentences) | The paper presents an interesting OS design that takes microkernels to a logical conclusion of removing all implicit sharing within the kernel. The design is well motivated: sharing must be removed to scale to many cores. |
| What is the weakness of the paper? (1-3 sentences) | The paper does not explain how this work is different from prior microkernel work until the middle of the paper. The experiments do not consider heterogeneous processors/cores as was discussed in the motivation. Finally, the experiments don't seem to show a fundamental improvement in scalability. |
| Your qualifications to review this paper | |
| Writing Quality | |
| Relevance to SOSP? | |
| Experimental Methodology | |
| Novelty of paper | |
| Overall paper merit | |
| Provide additional detailed comments to the author | I recommend describing the novelty of the design (compared to microkernels, Tornado, etc.) early in the paper. |
| Additional comments to PC (not seen by author) | |
This paper proposes re-evaluating the design of multicore operating systems as distributed networked systems. By viewing each core as a separate processing unit with its own state, and using message passing and various distributed-system protocols to manage consensus and sharing, one can generalize the operating system for the ever-changing underlying hardware. A general message-passing interface and split-phase programming can allow for pipelining and other optimizations that decouple the techniques and mechanisms from the policies for dealing with sharing across cores. Furthermore, many of the contributions in the theoretical and systems literature on distributed systems can be pulled into future OS design for multicore systems.
Strengths:
The authors bring up many convincing arguments and motivate the problem quite well. Furthermore, the AnonOS implementation of the multikernel design is a nice demonstration that this is more than just an academic discussion. The fact that it was able to perform comparably to current systems is encouraging.
Weaknesses:
It's not clear to me how much of the performance overhead seen in the evaluation is inherent to their implementation of the design and how much is the actual design of a multikernel system. To the authors' benefit, however, they do state that the evaluation is not thorough and that instead what should be focused on is the fact that the performance numbers are comparable which motivates the design choices.
Your qualifications to review this paper: 3
Writing Quality: 5
Relevance to SOSP?: 5
Experimental Methodology: 4
Novelty of paper: 4
Overall paper merit: 5
Provide additional detailed comments to the author:
This is a nice piece of work that focuses the community on thinking about new designs for multicore systems, and I think that approaching the problem the way this paper does will allow operating system designers to keep up more easily with the rate of change of hardware. That said, my only concern is that if this does become a viable design point, more distributed-system problems will be introduced, at the cost of performance, that could have been solved more easily with current techniques. In other words, if you choose to introduce the benefits of distributed system design into OS design, you also introduce the inherent set of problems that that community is trying to tackle. Small problems in more distributed, higher-latency network settings might be amplified at the scale of a multicore OS.
Additional comments to PC:
I think this is a fresh look at multicore OS design. It's
important, I think, for the community to consider this work in moving
forward and keeping up with the hardware trends, and to move away from
incremental improvements and tweaks made to new systems based on
specific design decisions made for the older versions.
| Attribute | Value |
|---|---|
| Provide a short summary of the paper | To make an operating system that scales to heterogeneous multicore systems, the authors develop an OS where the cores are loosely coupled as in a cluster: each core communicates with other cores using only an asynchronous message abstraction (which, on the systems the authors test, is implemented using shared memory); most operating system state is replicated between cores instead of shared. Performance results show that this approach can scale to tens of cores better than the approaches used by commodity operating systems. |
| What is the strength of the paper? (1-3 sentences) | The "multikernel" criteria are clearly stated and likely widely applicable. The design choices are backed up by experiments on real systems that suggest their approach is likely to scale to many-way multicore systems. |
| What is the weakness of the paper? (1-3 sentences) | Heterogeneity, despite playing such a prominent role in the paper, is not actually addressed, not even in simulation or a model. The paper does not do a good job at quickly distinguishing itself from prior work. |
| Your qualifications to review this paper | I know the material, but am not an expert. |
| Writing Quality | Good |
| Relevance to SOSP | People would attend session |
| Experimental Methodology | Excellent |
| Novelty of paper | Incremental improvements |
| Overall paper merit | Strong accept |
| Provide additional detailed comments to the author | The paper talks about heterogeneity as a motivating factor, but does not attempt to evaluate how the approach deals with heterogeneity with any rigor; the paper would be fine without mentioning heterogeneity. How this approach differs from microkernel and K42-like approaches should be addressed more prominently than a couple of paragraphs near the end of the paper. The experiments are generally good; however, some things about them were wanting: (1) Given that the interconnect network requires more hops between different pairs of processors in some cases, it would be helpful to see worst-case RPC timings. (2) Comparing Apache (a web server usually chosen for configurability that typically operates using many kernel threads) with some custom web server seems unfair. The lack of description of the architecture of the custom web server (does it use a thread/process pool model like Apache?) makes it difficult to gauge how realistic the comparison is. |
| Additional comments to PC (not seen by author) | |

| Attribute | Value |
|---|---|
| Provide a short summary of the paper | The author argues that the shared-memory model for multicore systems is inappropriate in the face of their rapidly increasing complexity and heterogeneity; instead, he says that we should shift to the shared-nothing distributed system paradigm. The author proposes the multikernel model, similar to an exokernel, where the simple kernel is shared across all cores and a per-core user-level monitor coordinates core-specific state. The author provides a multikernel implementation called AnonOS, which he demonstrates to have similar scalability to Windows and Linux and better performance in some cases. |
| What is the strength of the paper? (1-3 sentences) | The author provides a good motivation for the paradigm shift and builds upon many good past results to develop his multikernel model. |
| What is the weakness of the paper? (1-3 sentences) | The author is a bit inconsistent about his claims; e.g., in Figure 7, he shows that the unmap latency achieved by AnonOS is lower than that of either Windows or Linux, but later on, he says that the OSes should not really be compared since the latter are so much more mature. |
| Your qualifications to review this paper | |
| Writing Quality | |
| Relevance to SOSP? | |
| Experimental Methodology | 5 |
| Novelty of paper | 4 |
| Overall paper merit | 6 |
| Provide additional detailed comments to the author | I would like to have the claims clarified: you say that AnonOS should not be compared with Linux and Windows, yet present superior performance in some cases. Why is it that they should not be compared? Also, you say that there is a significant difference between multikernels and VMMs; I would have liked to see more justification for why this is true, especially since in the related work section you mention that the distributed systems paradigm is also used in VM work like Disco. |
| Additional comments to PC (not seen by author) | Work is very interesting. I would have liked to see the author mention that the distributed system paradigm has been applied before to manycore systems. As is, it sounds like he is the first (until I reach the related work section). |
Provide a short summary of the paper:
This paper starts from the premise that hardware is evolving too rapidly for OS software to keep up with it. Furthermore, hardware is becoming less uniform over time, and the current strategy of treating any unusual processing units as off-chip devices doesn't scale well. The solution proposed by the paper is to treat the processor more like a distributed system and have the OS be programmed like a distributed system. Each core is its own processing node and everything communicates via message passing. This allows an OS to be easily adapted to different architectures by simply optimizing the message-passing code. The rest of the OS can remain unchanged. Also described is an implementation of an OS called AnonOS using the ideas presented in the paper. It is shown that AnonOS scales better than existing OSes on some workloads, and as well as they do on others, even though AnonOS has not been optimized much at all.
What is the strength of the paper? (1-3 sentences)
The paper makes a strong argument for why message passing is a preferable paradigm for programming an OS. Furthermore, they show that existing OSes employ techniques that simply don't scale well in multi-core NUMA environments.
What is the weakness of the paper? (1-3 sentences)
The paper uses one example to argue why existing OSes will not be able to adapt to new architectures well. This is unconvincing for two reasons. Firstly, perhaps it was simply bad design in Windows that made it so much work to remove the dispatcher lock, and secondly, implementing a whole new OS is a huge amount of work too.
Your qualifications to review this paper
Writing Quality
Relevance to SOSP?
Experimental Methodology:
Average
Novelty of paper:
This is a new contribution to an established area
Overall paper merit:
Strong accept - This is of interest to Sensys, and a novel or new
contribution to existing area with good/average methodology, or an
incremental contribution paper that has excellent methodology.
Provide additional detailed comments to the author:
The presentation is good and the ideas are certainly interesting. I
would like to see some absolute performance comparisons to existing
systems.
Additional comments to PC:
Similar work has been done before (Tornado/K42), but this is still an
interesting paper and worth reading.
| Attribute | Value |
|---|---|
| Provide a short summary of the paper | The authors propose a new operating system design, a multikernel, for emerging heterogeneous multi/manycore architectures. Their design uses the distributed systems techniques of replication, message passing, and split-phase (i.e. asynchronous) communication. They discuss an implementation of their operating system design, and evaluate their design using a series of microbenchmarks as well as a web server and relational database. |
| What is the strength of the paper? (1-3 sentences) | The authors include a detailed discussion of how system diversity, core heterogeneity, interconnect topology, and cache coherency are motivating factors for redesigning operating system structure. The discussion of these is valuable to the community at large to consider. |
| What is the weakness of the paper? (1-3 sentences) | The authors motivate heterogeneous systems as being a major influence in the design of their operating system, but they fail to perform any evaluation on a heterogeneous system. Granted, there are not many (if any) heterogeneous systems other than GPU + CPU, but they failed to even show their system on this configuration. |
| Your qualifications to review this paper | I know a lot about this area |
| Writing Quality | Good |
| Relevance to SOSP? | People would attend session, but not read it beforehand |
| Experimental Methodology | Good |
| Novelty of paper | This is a new contribution to an established area |
| Overall paper merit | Accept - This is of interest to SOSP, a novel or new contribution with average/weak methodology, or incremental contribution paper that has good methodology. |
| Provide additional detailed comments to the author | |
| Additional comments to PC (not seen by author) | We are seeing a lot of operating systems papers that have proposed novel techniques, and shown how much better applications perform using their operating system. However, many/most of the performance benefits seem to come from eliminating bottlenecks, such as kernel crossings, that contemporary operating systems could eliminate themselves. It's harder to evaluate these papers when it is unclear how many of these novel techniques are the true cause of the performance increases, and how many of the techniques have artifacts that can be applied to existing operating systems to benefit performance. |