Front of Ballroom A and Ballroom B
Welcome to the Summit
SPDK: State of the Project
SPDK is continuing to evolve – from both a technology and community perspective. In this talk, Jim will review the significant changes over the last year and provide insight into what to expect from SPDK over the coming year.
PMDK: State of the Project
The SNIA NVM Programming Model is an agreement between dozens of companies on how the Operating System exposes persistent memory, building on the standard storage APIs. But to make the most of persistent memory, application writers want to access persistence directly using memory semantics, and this can be tricky programming. This is where PMDK comes in. In this talk, Andy will explain the goals of PMDK, the primary motivation for creating it, and how well it has met those goals so far. Andy will talk about what has worked well, as well as some of the challenges we still have ahead of us for PMDK.
VTune: State of the Product
TBD
Morning break
Squeezing Compression and Encryption into SPDK
SPDK has already enabled encryption and compression bdevs by fully leveraging DPDK’s existing variety of drivers.
This talk first introduces the compress bdev and its overall architecture, then explains in detail how we manage the layout of the device and leverage the Persistent Memory Development Kit (PMDK) to store metadata in super-fast persistent memory. We then cover the encryption-related features (the encryption bdev and SED Opal), with an overview of how they work and how to use them.
VTune and Analyzers Overview (first part)
Overview of Intel Tools (e.g. Intel® Parallel Studio, Emon, etc), methodology to characterize workloads/systems, Intel® Optane™ DC Persistent Memory (Apache Pass) tools, and examples on which to use depending on the situation
FusionEngine 2.0--Alibaba user-space full stack solution for storage
FusionEngine is a user-space storage system developed by Alibaba. It initially provided a user-space storage stack for a single node, and it has recently made major advances, which we are announcing as FusionEngine 2.0. It provides tiered storage on a single node and storage access across the network. Together, these help to improve performance, reduce CPU utilization, and greatly simplify troubleshooting.
Full stack optimization for UDisk with SPDK
UDisk provides persistent block storage to virtual cloud hosts in UCloud, and it is automatically replicated to protect users from component failure. Currently, UDisk-SSD cannot meet the performance requirements of some users, and we still had a lot of work to do to make full use of the CPU, NVMe SSDs, and the network. With the help of solutions provided by SPDK and technologies such as RDMA, we deeply optimized the whole I/O path and achieved a breakthrough in performance. The new-generation UDisk-RSSD provides up to 1.2M IOPS with latency as low as 100 µs. In this presentation, we will share our experience and results from the performance optimization, along with some work we plan to do in the future.
Afternoon break
Prepare for the next generation of memory, is your application a good candidate?
Applications are developing an insatiable appetite for DRAM memory. It is well known that the limited availability of system memory has a direct impact on performance for many software programs. To keep up with this demand, platforms have added more and more expensive memory since alternative solutions were not widely available… until now. This talk provides step-by-step instructions on how to use Intel® VTune™ Amplifier to determine whether an application may be a good candidate for using Intel Optane DC persistent memory as an affordable, high-capacity, volatile memory.
Optimize system configurations and workloads for Intel® Optane™ DC persistent memory
Have you ever wondered if your system is configured well for its typical loads? Or if your typical workloads are well optimized for your system? Will your workloads benefit from Intel Optane DC persistent memory? State-of-the-art performance analysis tools for longer runs do not always give sufficiently detailed performance metrics. More detailed performance analysis tools can overwhelm the user with a huge amount of fine-grained data. Intel® VTune™ Amplifier’s Platform Profiler provides an adequate amount of data for a user to detect whether there is any problem with the system configuration, or whether there is any pressure on specific system components, such as memory or I/O, that causes performance bottlenecks. This presentation focuses on how to use Intel® VTune™ Amplifier Platform Profiler for (1) analyzing the suitability of your workload for Intel® Optane™ DC PMM and (2) analyzing performance on an Intel® Optane™ DC PMM enabled system.
PMDK essentials
The SNIA NVM Programming Model is an agreement between dozens of companies on how an OS exposes persistent memory, building on the standard storage APIs. But to make the most of persistent memory, application writers want to access persistence directly using memory semantics, and this can be tricky programming. This is where PMDK comes in. In this talk, Andy will explain the goals of PMDK, the primary motivation for creating it, and how well it has met those goals so far. Andy will talk about what has worked well, as well as some of the challenges we still have ahead of us for PMDK.
High Performance Pooled Storage for RSD Architectures
Driving efficiency and performance is critical to modern data center architectures. This talk will cover how the features of SPDK (lockless design, user-space libraries, core affinity, bdev stacking, and debuggability) were used to provide high-performance nondurable block storage with RAID 0, thin provisioning, QoS, clones, snapshots, and Redfish/RSD-compliant management.
Afternoon break
SPDK based user space NVMe/TCP transport solution and Intel’s 100Gb NIC update
In November 2018, NVM Express released the new specification of the TCP transport for NVMe over Fabrics. In this talk, we introduce the design, implementation, and development plan of the NVMe-oF TCP transport in SPDK. Currently, SPDK implements the TCP transport on both the host and target sides, and it can be tested against the Linux kernel solution with good interoperability. We will also present experimental results that demonstrate the performance and scalability of SPDK's NVMe-oF TCP transport implementation. In addition, we will introduce some techniques for further improving the performance of SPDK's solution: (1) using a user-space TCP stack to replace the kernel TCP stack, and (2) leveraging NIC features, where we will introduce Intel’s new 100 Gb NIC. Compared with the kernel solution, the SPDK-based NVMe-oF solution has much better per-CPU-core performance in several respects (e.g., IOPS, latency).
Accelerating Redis with Intel Optane DC Persistent Memory
This talk introduces the optimizations of Redis on DCPMM, including the detailed designs for different data structures, migrating data to DCPMM by identifying hot and cold data, and leveraging DCPMM's persistence capability to improve persistence performance. With these optimizations, Redis shows the same level of performance and latency as on DRAM and meets the customer’s SLA requirements. Because the data is persistent in DCPMM, the AOF needs to store only the data location rather than the real data, which dramatically reduces disk I/O and improves Redis persistence performance by more than 2x.
VTune and Analyzers Overview (second part)
Overview of Intel Tools (e.g. Intel® Parallel Studio, Emon, etc), methodology to characterize workloads/systems, Intel® Optane™ DC Persistent Memory (Apache Pass) tools, and examples on which to use depending on the situation.
Persistent Memory Provisioning/Configuration tools
This session is aimed at System Administrators or Application Developers with minimal or no experience working with persistent memory. Usha will introduce and demonstrate how to provision persistent memory in Linux using the open source ndctl utility.
Afternoon break
Persistent Memory Programming Made Easy with pmemkv
Introducing pmemkv, an open-source local key/value store for persistent memory based on PMDK. Written in C/C++, pmemkv provides optimized language bindings for Java, JavaScript, and Ruby. It includes multiple storage engines that are tailored for different use cases. Fast, flexible, and bulletproof, pmemkv is an easy way to modify applications to use persistent memory.
End-to-end data protection with SPDK NVMe/TCP target
This talk is an update on end-to-end data protection with SPDK since the last SPDK US summit.
The strategy for the SPDK iSCSI target is working well, and the TCP transport of the SPDK NVMe-oF target will support DIF insert/strip next.
There are some differences between the SPDK iSCSI target and the TCP transport of the SPDK NVMe-oF target.
The SPDK NVMe-oF target already supports DIF passthrough. The TCP transport of the SPDK NVMe-oF target converts an SGL into data sent within a series of PDUs for transmission across a TCP fabric.
This talk will mainly cover these differences and the initial performance evaluation of DIF insert/strip with the TCP transport of the SPDK NVMe-oF target.
Front of Ballroom A and Ballroom B
Introduce a new VM and Container file accelerator and live recovery feature in SPDK Vhost
In this presentation, we propose an SPDK user-space vhost-user-fs solution that can be used to accelerate file access in VMs and containers. We will present this solution in detail, including its use of techniques such as virtio-fs and blobfs. Building on this solution, we aim to provide a fast, consistent, and secure way to share a directory tree on the host with guests.
Live recovery is very useful in production environments because it lets users upgrade their vhost process without interrupting VMs. In this presentation, we will also introduce this new feature.
Lessons learned from MemVerge, an avid PMDK user
MemVerge is a startup based in Silicon Valley, focused on building the next generation of high-performance data infrastructure based on persistent memory. In this presentation, we will start by briefly introducing some of the data infrastructure software that MemVerge has been working on, followed by a discussion of a few use cases we are currently exploring. As an avid user of PMDK, we would like to share a few lessons learned during our product development, where some product components are being actively prototyped with PMDK. In particular, we would like to talk about our experience using PMDK’s memory allocator. We discuss a few memory fragmentation challenges that we encountered and present our solutions based on allocation classes and the memory arena control functions that were recently introduced in PMDK 1.6.
Morning break
Introduction of Baidu Chitu Storage with SPDK NVMe-oF Application
In deep learning scenarios, video and image training runs on GPUs, which requires highly concurrent random reads of a large number of small data blocks. Under the common HDFS storage architecture, both small-file performance and storage space utilization are low. High-speed GPU computing components need to be matched with high-speed NVMe storage, and it is difficult for the HDFS storage architecture to unleash the full performance of NVMe storage.
Baidu Chitu Storage provides high-throughput, low-latency shared NVMe storage for GPU clusters and adopts a layered architecture. The lower layer is a distributed, highly available, block-level logical volume system based on NVMe-oF, and the upper layer is a parallel file system called Baidu Parallel FS (bpfs) built on the logical volumes. Small files and their metadata are packaged and stored in logical volumes. The client accesses small files on the remote storage server directly through NVMe-oF. SPDK NVMe-oF is the key module of the data path.
Baidu Chitu's storage latency is 10 µs higher than local NVMe. On a single client, multithreaded random I/O on 16K small files can saturate the 100G network and deliver 500K IOPS for small files. Before reaching the hardware bottleneck, the aggregate throughput of multiple storage servers increases linearly. Meanwhile, the aggregate throughput of multi-client random reads also increases linearly while the latency remains unchanged. Test performance for video and image training on 2 billion files is equivalent to that of a local multi-NVMe RAID 0. Under the pressure of millions of IOPS on the storage servers, using SPDK NVMe-oF reduces CPU overhead by a factor of 8 compared with using Linux kernel modules. At the same time, the development and operation efficiency of an SPDK user-mode program is much higher than that of a kernel module.
Optimize your PMDK application’s performance with the help of Intel® VTune™ Amplifier profiler
Take a deep dive into the details of profiling to optimize performance with persistent memory. If you want to take advantage of Intel® Optane™ DC persistent memory in AppDirect mode, then the PMDK library is probably your best bet. But what can you do if the performance you get doesn’t satisfy you? In this talk, you will learn how to use Intel® VTune™ Amplifier to optimize PMDK-based applications.
Intel NVM technology and solution evolutions
Morning break
Integrating SPDK in the NAS gateway
The goal of integrating SPDK in our NAS system is to build a highly available, high-performance cache layer with low latency. We will use SPDK to take over the local NVMe device and export it with the SPDK iSCSI target, or with SPDK NVMe-oF for higher performance. Then, in our gateway nodes, one local NVMe device and one remote NVMe device can be mirrored to form a cache layer in our NAS system. In this talk, we will go over the cache layer design of our NAS system and how we use SPDK to access remote NVMe devices in our system.
Accelerate Spark with Intel Optane DC Persistent Memory
Data volumes are growing rapidly in the big data area, and more and more memory is consumed either in computation or in holding intermediate data for analytic jobs. For these memory-intensive workloads, end users have to scale out the compute cluster or extend memory with storage such as HDD or SSD to meet the requirements of computing tasks. When scaling out the cluster, the extra cost of cluster management, operation, and maintenance increases the total cost if the extra CPU resources are not fully utilized. To address these shortcomings, Intel Optane DC persistent memory (Optane DCPM) breaks the traditional memory/storage hierarchy and scales up the computing server. It offers higher capacity than memory, and higher bandwidth and lower latency than storage such as SSD or HDD. Spark is widely used for analytics such as SQL and ML in cloud environments. In the cloud, the low performance of remote data access is a typical obstacle for users, especially for I/O-intensive queries. ML workloads are iterative, so I/O bandwidth is key to end-to-end performance. In this talk, we will introduce how to accelerate Spark SQL with OAP (https://github.com/Intel-bigdata/OAP) to achieve an 8X performance gain for SQL in the cloud, and how to use an RDD cache leveraging Intel Optane DCPMM to improve K-means performance by 2.5X. We will also take a deep dive into the root causes of these performance gains.
Persistent Memory – which mode do I want? Where are the “gotchas” hidden?(first part)
Intel Optane DC persistent memory can be configured either as persistent memory (AppDirect) or as main memory, with DRAM used as a cache (memory mode). Each mode poses some challenges for adoption. To extract the best performance from this memory technology and identify which configuration mode is best suited for your application, it is necessary to understand the architectural flow from the core to the memory and some key glass jaws. In this presentation, we will present an overview of the uncore architecture and the architectural glass jaw issues and conditions to monitor when using Intel Optane DC persistent memory in both memory configurations, along with the relevant architectural background and recommendations on which tools and profiles to use with customer applications.
Morning break
Persistent Memory – which mode do I want? Where are the “gotchas” hidden?(second part)
Intel Optane DC persistent memory can be configured either as persistent memory (AppDirect) or as main memory, with DRAM used as a cache (memory mode). Each mode poses some challenges for adoption. To extract the best performance from this memory technology and identify which configuration mode is best suited for your application, it is necessary to understand the architectural flow from the core to the memory and some key glass jaws. In this presentation, we will present an overview of the uncore architecture and the architectural glass jaw issues and conditions to monitor when using Intel Optane DC persistent memory in both memory configurations, along with the relevant architectural background and recommendations on which tools and profiles to use with customer applications.
VTune - Performance characterization of SPDK using Intel® VTune™ Amplifier
With traditional interrupt driven I/O, the CPU is either doing something useful or waiting. With SPDK’s polled I/O the CPU is always 100% busy so traditional profiling techniques don’t work. Intel® VTune™ Amplifier can identify “empty” spinning so you can balance core loading, balance SSDs, see the throughput per device, PCIe traffic breakdown and lots of good stuff. Learn how to use Intel VTune Amplifier to optimize your I/O performance.
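To illustrate why a polling core looks fully busy regardless of load, here is a minimal Python sketch of a poll loop that counts productive and empty iterations. It is a toy model, not SPDK code and not how VTune measures spinning; the function name `run_poller` and the in-memory queue are hypothetical stand-ins for a real completion queue.

```python
from collections import deque


def run_poller(completions, iterations):
    """Poll a completion queue a fixed number of times.

    Returns (productive, empty): how many polls found work and how
    many found nothing. Every iteration consumes CPU either way,
    which is why utilization alone says little about useful work.
    """
    productive = 0
    empty = 0
    for _ in range(iterations):
        if completions:
            completions.popleft()  # handle one completed I/O
            productive += 1
        else:
            empty += 1  # "empty" spin: the core was busy but did no work
    return productive, empty


# Hypothetical workload: three completions arrive, then the queue stays idle.
queue = deque(["io-1", "io-2", "io-3"])
productive, empty = run_poller(queue, iterations=10)
print(f"productive={productive} empty={empty}")
```

In this toy run the core is busy for all ten iterations, yet only three of them do useful work; the ratio of empty to productive polls is the kind of signal that helps when balancing load across cores.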
Why SSD developers need pynvme, and why pynvme needs SPDK?
SSDs are becoming ubiquitous in both the client and data center markets. The requirements on function, performance, and reliability are refreshed frequently. As a result, SSD designs, especially the firmware, have been continually upgraded and restructured over the past decade.
Testing keeps these changes under control. However, firmware testing is not as mature as software testing. We have well-developed methodologies, processes, and tools for software, but the embedded platform where the firmware executes provides only limited computation and memory resources, so it is difficult to run full tests in the native embedded environment. In practice, SSD vendors run system tests with third-party software, consuming huge resources. The existing tools lack the flexibility to test efficiently against a vendor's own features and flaws. SSD developers need an infrastructure for implementing their tests or programs at low cost. Our pynvme is just the answer.
The test-dedicated, lightweight NVMe driver is the most essential part of the solution, and for it we rely on SPDK. First, SPDK is reliable: it is designed and tested in large-scale data centers. Second, SPDK is highly modular: we can choose the modules we need and extend them with our own features. Last but not least, SPDK is active and open: people work on better quality and the latest features, so we can focus on the features for testing. And that is without even mentioning the best-in-class performance.
As part of the SPDK community, pynvme also contributes upstream. Another interesting side effect is that, once SSD devices pass the pynvme tests, they have also passed the very first test of SPDK!
Afternoon break
[Lab 1] SPDK Hands on Lab
SPDK NVMe-OF acceleration
• Latest progress done with NVMe-OF RDMA performance.
• Introduce initial proposal to extend the internal POSIX-like transport API to allow better integration with zero-copy enabled TCP stacks, such as Mellanox’s VMA.
• Explain how Mellanox T10-DIF offload can be used to add integrity functionality to the NVMe-OF protocol on the initiator and target sides, including adding, stripping, and verifying T10-DIF or a simple CRC.
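For readers unfamiliar with the add/strip/verify operations mentioned in the last bullet, the following is a minimal software sketch in Python. It assumes the common T10-DIF layout of an 8-byte protection tuple (2-byte CRC guard, 2-byte application tag, 4-byte reference tag) appended to each data block, with the guard computed using the CRC-16/T10-DIF polynomial 0x8BB7. The helper names are hypothetical, and this does not represent the Mellanox offload or any SPDK API; the offload performs the equivalent work in hardware.

```python
import struct

PI_SIZE = 8  # guard (2 bytes) + application tag (2 bytes) + reference tag (4 bytes)


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def insert_dif(block: bytes, ref_tag: int, app_tag: int = 0) -> bytes:
    """Append a protection tuple to a data block ("add")."""
    guard = crc16_t10dif(block)
    return block + struct.pack(">HHI", guard, app_tag, ref_tag)


def verify_dif(protected: bytes, expected_ref_tag: int) -> bool:
    """Recompute the guard and compare both guard and reference tag ("verify")."""
    block, pi = protected[:-PI_SIZE], protected[-PI_SIZE:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(block) and ref_tag == expected_ref_tag


def strip_dif(protected: bytes) -> bytes:
    """Drop the protection tuple and return the original data ("strip")."""
    return protected[:-PI_SIZE]


# Hypothetical 512-byte block; the reference tag here is an arbitrary example value.
block = bytes(range(256)) * 2
protected = insert_dif(block, ref_tag=42)
print(len(protected), verify_dif(protected, expected_ref_tag=42))
```

A single flipped bit in the data, or a block delivered under the wrong reference tag, makes `verify_dif` return `False`, which is the integrity check that the initiator and target sides rely on.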
Creating C++ apps with libpmemobj
With persistent memory, data can be retained after a program crash or power failure. In this session, learn how to make your C++ application persistent memory aware using the Persistent Memory Development Kit (PMDK). The presentation includes a walkthrough of C++ code samples.
Afternoon break
[Lab 2] PMDK Hands on Lab
[Lab 3] Intel® VTune™ Amplifier Hands on Lab