Artifact Review Summary: Boki: Stateful Serverless Computing with Shared Logs
Artifact Details
Badges Awarded
Artifact Available | Artifact Functional | Results Reproduced
Description of the Artifact
The artifact is located at https://github.com/ut-osa/boki-benchmarks; the correct version of the artifact is on the main branch. According to the authors, this repository includes the source code of Boki's evaluation workloads and the scripts for running experiments. It contains all materials for the artifact evaluation of the SOSP ’21 paper Boki: Stateful Serverless Computing with Shared Logs.
According to the reviewers, the artifact includes three major modules:
- the core algorithm of Boki (source code in `~/boki`)
- the evaluation suites (evaluation programs in `~/workloads`, configuration files and execution scripts in `~/experiments`)
- supporting materials (Docker build files in `~/dockerfiles`, auxiliary utility scripts in `~/scripts`, and the `~/README.md` document)
The entry point for the evaluation is `~/experiments/run_quick.sh`. Logically, this script executes the `run_once.sh` script in each experiment directory with a set of configurations.
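The control flow described above can be pictured with a minimal sketch. This is illustrative only and is not the repository's actual `run_quick.sh`, which may be organized differently; it simply mirrors the description of "run every experiment's `run_once.sh`".

```bash
#!/bin/bash
# Illustrative sketch only -- not the repository's actual run_quick.sh.
# It mirrors the description above: walk every experiment directory and
# invoke its run_once.sh, which picks up that experiment's configuration.
set -euo pipefail

EXPERIMENTS_DIR="$(cd "$(dirname "$0")" && pwd)"   # e.g. ~/experiments

for exp in "${EXPERIMENTS_DIR}"/*/; do
  if [[ -x "${exp}run_once.sh" ]]; then
    echo "=== Running experiment: ${exp}"
    (cd "${exp}" && ./run_once.sh)
  fi
done
```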
According to all reviewers, the artifact is well organized and easy to evaluate, and the experiments could be run successfully by following the documentation.
Environment(s) Used for Testing
Two environments were used successfully for the evaluation: the author-provided environment and the user-built environment.
Some reviewers used the public EC2 instance in region `us-east-2` that the authors had prepared for running experiments. It was a `t3.micro` instance with 2 vCPUs, 1 GiB of RAM, and a 12 GB EBS disk. On the software side, it ran Ubuntu 20.04 with all necessary dependencies installed, such as Python 3.8.10.
Other reviewers created an EC2 instance on their own, based on the AMI provided by the authors. That instance was a `t2.micro` with 1 vCPU, 1 GB of RAM, and 16 GB of disk space; this controller machine ran Ubuntu 20.04.2 LTS with Python 3.8 installed.
Either of these instances serves as the controller machine for the cluster that hosts all the serverless experiments. The `config.json` file in each experiment folder determines what kind of cluster should be created for that evaluation; a quick way to inspect these files is shown below.
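To see what cluster an experiment will request before launching it, the configuration files can simply be pretty-printed. The snippet below assumes the directory layout described above and makes no assumption about the JSON schema itself.

```bash
# Pretty-print every experiment's cluster description. The schema is defined
# by the artifact; no particular fields are assumed here.
for cfg in ~/experiments/*/config.json; do
  echo "== ${cfg}"
  python -m json.tool "${cfg}"
done
```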
Step-By-Step Instructions to Exercise the Artifact
- Setting up the controller machine.
  - For an author-provided environment, the authors granted reviewers access to the instance by adding the reviewers' public keys. The instance had all required packages pre-installed, so no extra setup was required for the evaluation.
  - For a user-built environment, the reviewer took the following additional steps to make the instance suitable for running experiments (a consolidated command sketch covering the whole workflow appears after this list):
    - Redirect `python` from Python 2.7 to Python 3.8 with the commands `sudo rm /usr/bin/python` and `sudo ln -s /usr/bin/python3 /usr/bin/python`.
    - Install the packages `unzip` and `python3.8-venv`, which are needed to unzip and install AWS CLI version 1.
    - Configure the AWS CLI within the instance to allow connections to itself.
    - Clone the repository with all git submodules: `git clone --recursive https://github.com/ut-osa/boki-benchmarks.git`.
    - Execute `scripts/setup_sshkey.sh` to set up the SSH keys that will be used to access the experiment VMs.
- Setting up EC2 security group and placement group.
  - The controller machine creates EC2 instances with security group `boki` and placement group `boki-experiments`. The security group includes firewall rules for the experiment VMs (including allowing the controller machine to SSH into them), while the placement group instructs AWS to place the experiment VMs close together.
  - Reviewers had to execute `scripts/aws_provision.sh` on the controller machine to create these groups with the correct configurations.
- Building Docker images (optional).
  - Use the script `scripts/docker_images.sh` to build the Docker images used by the experiments in this artifact. As the authors have already pushed all compiled images to Docker Hub, there is no need to run this script as long as you do not modify the source code of Boki (in the `~/boki` directory) or of the evaluation workloads (in `~/workloads`).
- Running the experiments.
  - As an initial attempt, run `~/experiments/run_quick.sh`. The script should run for at least an hour and terminate without any problems. After it completes, run `~/boki-benchmarks/scripts/summarize_results.py` to obtain a printed summary of the recorded metrics.
  - Compare the obtained summary with the expected one located at `~/boki-benchmarks/tmp/expected_summary.txt`. Also, compare the results with the major claims made in the paper for precise evaluation results.
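For convenience, the steps above can be condensed into a single command sequence. This is only a sketch based on the steps listed in this review: it assumes a user-built Ubuntu 20.04 controller instance, the repository cloned into the home directory, and that `~/experiments`, `scripts/`, and `tmp/` refer to the corresponding directories inside the clone. The AWS CLI installation shown (the v1 bundled installer) is one possible method; reviewers may have installed it differently.

```bash
#!/bin/bash
# Consolidated sketch of the walkthrough above (user-built environment).
# Assumptions: fresh Ubuntu 20.04 controller instance, AWS credentials at hand,
# repository cloned into $HOME, AWS CLI v1 installed via the bundled installer.
set -euo pipefail

# 1. Point `python` at Python 3 and install the helper packages mentioned above.
sudo rm /usr/bin/python
sudo ln -s /usr/bin/python3 /usr/bin/python
sudo apt-get update
sudo apt-get install -y unzip python3.8-venv

# Install AWS CLI version 1 (one possible method: the official bundled
# installer, which is what unzip is needed for) and configure credentials.
curl -fsSL "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o awscli-bundle.zip
unzip -q awscli-bundle.zip
sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
aws configure

# 2. Clone the artifact with all submodules and set up SSH keys for the
#    experiment VMs.
cd "${HOME}"
git clone --recursive https://github.com/ut-osa/boki-benchmarks.git
cd boki-benchmarks
./scripts/setup_sshkey.sh

# 3. Create the `boki` security group and `boki-experiments` placement group.
./scripts/aws_provision.sh

# 4. (Optional) Rebuild Docker images; only needed if ~/boki or ~/workloads
#    was modified, since prebuilt images are already on Docker Hub.
# ./scripts/docker_images.sh

# 5. Run the quick experiment suite (at least an hour), then summarize the
#    recorded metrics and compare against the expected summary.
./experiments/run_quick.sh
python ./scripts/summarize_results.py
cat ./tmp/expected_summary.txt   # compare with the printed summary by hand
```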
How The Artifact Supports The Paper
Above all, the artifact strongly supports the major performance claims made in Sections 7.2-7.4.
In Section 7.2 (BokiFlow), the authors made the following arguments:
- In both workloads, BokiFlow achieves much lower latencies than Beldi for all throughput values.
- In the `movie` workload, when running at 200 requests per second (RPS), BokiFlow’s median latency is 26ms, 4.7× lower than Beldi’s (121ms).
- In the `travel` workload, BokiFlow’s median latency is 18ms at 500 RPS, 4.3× lower than Beldi’s (78ms).
In Section 7.3 (BokiStore), the authors made the following arguments:
- BokiStore achieves 1.18–1.25× higher throughput than MongoDB.
- BokiStore has considerable advantages over MongoDB in transactions (up to 2.3× faster).
- BokiStore is slower than MongoDB for non-transactional reads.
In Section 7.4 (BokiQueue), the authors made the following arguments:
- When the P:C (producer-to-consumer) ratio is 1:4, both BokiQueue and Pulsar achieve double the throughput of Amazon SQS. BokiQueue achieves up to 2.0× lower latencies than Pulsar.
- When the P:C ratio is 4:1, Amazon SQS suffers significant queueing delays, limiting its throughput. BokiQueue and Pulsar have very similar throughput, while BokiQueue achieves 1.18× lower latency than Pulsar in the case of 256 producers.
- When the P:C ratio is 1:1, BokiQueue consistently achieves higher throughput and lower latency than both Amazon SQS and Pulsar.
All in all, there are nine major claims in this part of the paper. According to the evaluations by all reviewers, these claims held up; there were no significant conflicts between the authors’ arguments and the experimental outcomes.
In conclusion, the Artifact Available badge is approved because all reviewers had access to the source code of the artifact. The Artifact Functional badge is approved because all reviewers could re-execute all the experiments the authors describe in the paper. The Results Reproduced badge is approved because all reviewers obtained evaluation results that met the expectations stated by the authors.