Advances in memory hardware have fueled the emergence of in-memory computing systems. At the same time, the explosion of data is placing unprecedented demands on memory capacity to handle ever-growing data sizes. In-memory computing systems therefore increasingly rely on hybrid memory to cache the large volumes of data awaiting processing. Our preliminary study finds that some existing data management strategies sacrifice application performance to keep memory utilization low, and hence can induce frequent I/O operations between the memory system and the storage system.
To address this problem, we propose a hybrid memory system that combines fast and relatively slow memory hardware. To realize a runtime system that automatically optimizes data management on hybrid memory, we will (1) design a new in-memory cache layer shared among parallel executors co-hosted on the same computing node, aiming to improve the overall hit rate of data blocks; (2) develop a middleware layer on top of existing deep learning frameworks that streamlines the support and implementation of online learning applications; (3) design a unified in-memory computing architecture with an efficient data management strategy to optimize memory allocation and reclamation for ML applications.
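To make component (1) concrete, the following is a minimal sketch of a node-local block cache shared by co-hosted executors. The class name, block-ID keying, byte-based capacity accounting, and the LRU eviction policy are all illustrative assumptions rather than the proposal's actual design; the point is only that a single shared cache lets all executors on a node benefit from each other's cached blocks, raising the overall hit rate.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch of a node-local cache shared by co-hosted executors.
 * The LRU policy and capacity accounting are placeholder assumptions, not
 * the proposed data management strategy itself.
 */
public final class SharedBlockCache {
    private final long capacityBytes;
    private long usedBytes = 0;
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    // An access-ordered LinkedHashMap gives a simple LRU ordering:
    // get() moves an entry to the tail, so the head is least recently used.
    private final LinkedHashMap<String, byte[]> blocks =
        new LinkedHashMap<>(16, 0.75f, true);

    public SharedBlockCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Any executor on the node can look up a block cached by any other. */
    public synchronized byte[] get(String blockId) {
        byte[] data = blocks.get(blockId);
        if (data != null) hits.incrementAndGet(); else misses.incrementAndGet();
        return data;
    }

    /** Insert a block, evicting least-recently-used blocks when over capacity. */
    public synchronized void put(String blockId, byte[] data) {
        while (usedBytes + data.length > capacityBytes && !blocks.isEmpty()) {
            Map.Entry<String, byte[]> eldest = blocks.entrySet().iterator().next();
            usedBytes -= eldest.getValue().length;
            // In the full system, eviction would spill the block to slow
            // memory or storage instead of discarding it.
            blocks.remove(eldest.getKey());
        }
        blocks.put(blockId, data);
        usedBytes += data.length;
    }

    /** Hit rate across all co-hosted executors, the metric component (1) targets. */
    public double hitRate() {
        long h = hits.get(), m = misses.get();
        return (h + m) == 0 ? 0.0 : (double) h / (h + m);
    }
}
```

Under this assumed design, per-executor private caches would be replaced by one `SharedBlockCache` instance per node, so a block loaded by one executor is a hit for every other executor on the same node.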