Serverless computing: Projects

CS 260r Spring 2019

Final project writeups

“Alto Snapshot/Restore: Enabling Stateful, Long-Lived Serverless Computation”, by James Larisch, Alisha Ukani, and Vincent Viego
“HyperFork: Improving Serverless Latency Through Virtual Machine Flash-Cloning”, by Michael Colavita, David Gardner, and Mark Wilkening
“λ Bureau of Investigation”, by Juan Esteller and William Qian
“Long Running Functions as a Service”, by Daniel Inge
“Melange: Toward Better Persistant Storage for Serverless Applications”, by Kristen Eberts and Benjamin D. Lee
“Native DAG Execution in OpenWhisk”, by Harshita Gupta and Eric Lu
“PEPPA: Platform microbEnchmarks—Performance Playground for functions-as-A-service”, by Alex Wendland, Athena Kan, and William Wang
“Pikachu: Serverless Message Passing”, by Yang Zhou and David Ralph Hughes

Expectations

Your project is a system that measures, uses, or extends the serverless computing model.

The primary outcome of the course will be a project writeup of sufficient density and polish that you would feel comfortable submitting the writeup for peer review (6–12 pages). You will also submit:

(Group projects) An individual contribution document detailing your contribution to the work, and your perceptions of the contributions of others (≤1 page).
A short self-evaluation (1–2 paragraphs). You may assign yourself a grade, or you may simply discuss your self-perceived level of effort and investment over the course.

Purpose: Self-evaluations are a common feature of performance reviews in industry and academia. They have advantages and disadvantages. They are often difficult to write; they can reward self-aggrandizement and enforce existing power structures. But they also enforce valuable introspection, and help people realize just how much work they’ve actually accomplished.
Any code and data collected for the project. No need for superb software engineering, but ideally the code should be accompanied by enough documentation that a motivated user could attempt to replicate your results.

All course material is due on our final deadline, Friday May 17 at 11:59pm. There is a coursewide extension until Monday May 20 at 7:00am. That extended deadline is a hard deadline: no further extensions will be granted.

Project writeup

The final project writeup should be a PDF no more than 14 pages long. Follow the format of a typical research paper: 11pt font, two-column, single-spaced. Do not double-space.

Writeups will be shared with the class and posted publicly unless you explicitly withhold permission to do so.

Research papers from the Spring 2017 Verified Systems class

Michael Ernst has some interesting advice on how to write a technical paper. (He has other advice too.) Some key points:

“The goal of writing a paper is to change people’s behavior: for instance, to change the way they think about a research problem or to convince them to use a new approach. … As a general rule, your paper needs to convince the audience of three key points: that the problem is interesting, that it is hard, and that you solved it.”

“A common mistake is to focus on what you spent the most time on. Do not write your paper as a chronological narrative of all the things that you tried, and do not devote space in the paper proportionately to the amount of time you spent on each task. Most work that you do will never show up in any paper; the purpose of infrastructure-building and exploration of blind alleys is to enable you to do the small amount of work that is worth writing about. Another way of stating this is that the purpose of the paper is not to describe what you have done, but to inform readers of the successful outcome or significant results, and to convince readers of the validity of those conclusions.”

“A related work section should not only explain what research others have done, but in each case should compare and contrast that to your work and also to other related work. After reading your related work section, a reader should understand the key idea and contribution of each significant piece of related work, how they fit together (what are the common themes or approaches in the research community?), and how your work differs.”

But do not get too hung up on these points, especially in the context of a class project. There are many ways to write a good research paper, and different communities have different standards. Regarding the quotes above:

Negative results are important research, but negative results may not change people’s behavior.
“Interesting” and “hard” are subjective and depend on audience receptivity and authorial skill. A problem can be interesting for many reasons, including that the problem has intrinsically interesting features, that other people care about the problem (whether or not the authors do), or that a solution to the problem would open up a new class of problem. And a problem can be hard for many reasons, including that the problem has simply never been noticed before: a hard problem need not have a hard solution.
A chronological narrative is often interesting to me.

Your contribution is a key aspect of the project writeup. What did you do that adds to, or slightly inflects, the sum of world knowledge?

A typical writeup will follow this format, which is not mandatory.

Title. Something grabby that correctly describes a part of the contribution.
Abstract. A paragraph or two that concisely describes the motivation for the work (the problem the work addresses), the contribution of the work, and a highlight of your results. The abstract tends to target an already-expert audience.
Introduction. A page or so that expands on the abstract. The introduction is written for a broader audience than the abstract; its purpose is to situate the technically adept, but not necessarily expert, reader with the research problem as you define it. Your introduction, for example, might start out with a brief explanation of serverless computing, followed by a description of the issue with serverless computing that you address.

Often an introduction will end with a list of contributions, and an “outline” paragraph that says “The rest of this paper is organized as follows. Section 2, etc.”
Related work (this may also appear just before the conclusion). A description of related research, especially research closely related to your own work. The purposes of this section are citation and comparison. Foundational work requires citation only; “Amazon Web Services introduced modern serverless computing with AWS Lambda in 2014 [19].” More recent work requires comparison. Describe for each group of citations (1) the core idea, (2) what is complementary with your work, (3) what is more advanced than your work, and (4) what is advanced upon by your work. (2)–(4) are optional—some papers will be entirely complementary with or orthogonal to your work. (Hopefully no work will be entirely more advanced than yours!)
Architecture. A half-page defining the context in which your work runs.
Design. The system you built, or the experiments you designed, described in enough relevant detail that a skilled system builder could replicate your work. Design is not about the names of your classes or your software engineering prowess, though you should include these details when they are relevant. Multiple pages.
Implementation. Aspects of your system that are not relevant to the design, but might be relevant to understanding your evaluation. Typically a page or less.
Evaluation. For systems work, this will often include the following subsections:
- Experimental setup. Describe how you ran your experiments. What kinds of machine? How much memory? How many trials? How did you prepare the machine before each trial?
  
  Systems Benchmarking Crimes, by Gernot Heiser
- The experiments themselves, grouped by purpose. Include figures.
- A summary of the experimental results.
Some good evaluations are organized around performance hypotheses: statements that the experiments aim to support or disprove.

Not all projects will have systems evaluations, but every project should be evaluated. If you are introducing a new computing model, for example, you could evaluate whether existing systems would fit the model.
Discussion. This optional section can speculate about your work. Feel free to put stuff here that is based on intuition and that you haven’t backed up with experiment. Do you think, in the end, that the project was a good idea? What would you do next? What should readers learn?
Conclusion. A paragraph or so.

Do not punt the writeup. In previous years, I have seen writeups with “lorem ipsum dolor” text in the abstract, and an introduction that said “Our verification VERIFIES SOMETHING HERE”. Bs were assigned.

Questions

Hyperfork

What factor do you think dominates function cold-start latency? Finding a free instance? Spinning up a VM? Loading libraries/runtimes? Copying over code? Copying over data?
We’re currently preparing infrastructure to assess several implementations of our forking primitive. Aside from the standard serverless applications (booting a kernel, starting a Node/python interpreter, starting a JVM), are there any workloads we could use to assess more general applicability (similar to Potemkin/SnowFlock)?

Messaging

Adding message passing interface to serverless function would add programming difficulties. Since programmer now needs to consider communication with other function instances, instead of simply pulling data, doing stateless computation on the data, pushing bash the data.

How do you think could control the programming difficulties? Or do you think it could be a problem for serverless computing?

Alto

How much interaction do you think serverless functions have with the file system on their VMs?

We want to snapshot FS state so that if a function performs any FS operations, we can snapshot and migrate that function, and the FS operations will appear unchanged on the new environment (assuming the actual files are also mirrored). We’re definitely going to support a very limited set of FS operations for the project, but it’d be helpful to put this in context of what users expect.
Is filesystem access important? Should we give Alto VMs filesystem access? This leads to a larger question – should Alto VMs be advertised as VMs? For runtime-based programs for which we tout the ability to snapshot and restore in-memory data structures, filesystem access seems superfluous. If we do away with it, why are we calling them VMs?
In a world in which cloud providers implement FaaS functions beneath the process level, would you be willing to deploy applications dealing with secret/sensitive information in this environment so long as the provider guarantees that functions belonging to different tenants will not be collocated within the same process?

Serverless Relational Database Migration

What sort of evaluations/comparisons/tests would be the most interesting and useful for us to do?
I would like to know why the class thinks that NoSQL databases have been overly represented among the early serverless database options such as DynamoDB. Is it due to the hype of NoSQL (serverless + nosql = vC mOnEy !!!!11!!!!1!!) or is there something that makes NoSQL better for serverless applications (ease of implementation for the cloud provider or the user)?

Debugging

We’re considering the possibility of using message passing to enable arbitrary code execution for debugging purposes – but this creates the opportunity for a major security pitfall. How would you suggest balancing flexibility with security? E.g. we can have a DSL for debugging (probably “future work” if we do this), but that could be a pain for people who now have to learn YADSL (yet another DSL)
Given the programming that you all have performed over the course of your projects, what parts of serverless programming have you found the most difficult to debug? Or have you found that it is relatively straightforward?

Benchmarking

Is it more useful to developers to have cold/warm/scaling results that are updated on an ongoing basis, or to have a broader set of performance characteristics, such as S3 throughput?
What can we do for part 2 of our project?
What is a good next step / 2nd part for our project? (Ideas: Automation, generalization, prediction, UI)

DAGular

Currently, numpywren is implemented using pywren and boto3, which rely on AWS Lambda. Since we are extending OW to compete with numpywren, our benchmarks should ideally compare numpywren running on an OW deployment with numpywren running on our modified OW. This, however, requires that we re-write much of numpywren to use OW semantics. Is there a more targeted way to benchmark our project, so that we can spend time on optimizations instead?

Dan

I’m not sure the best way to go about collecting results. If I write the code to migrate Lambdas while they are blocked I can record some timings but what should I be comparing these times to? Running the same thing with OpenWhisks controller architecture?