Dr. Longbin Lai obtained his Degree of Bachelor and Master from Shanghai Jiao Tong University (SJTU) in 2010 and 2013 respectively. After that, He joined the Database group in CSE, UNSW, Sydney for Ph.D study under the supervision of Prof. Xuemin Lin and Dr. Lu Qin. He obtained his Ph.D degree in Mar. 2017, and his thesis is available here. He now joins Alibaba Damo Academy to build intelligent systems for big (graph) data processing. His research interests include graph algorithms, graph database, database management, distributed algorithms and systems.
PhD in Computer Science and Engineering, 2017
University of New South Wales, Sydney
Master in Computer Engineering, 2013
Shanghai Jiao Tong University
BSc in Information Security, 2010
Shanghai Jiao Tong University
To develop a graph pattern matching system in the distributed context, while gluing together the academic results of optimal join …
Recently there emerge many distributed algorithms that aim at solving subgraph matching at scale. Existing algorithm-level comparisons failed to provide a systematic view of distributed subgraph matching mainly due to the intertwining of strategy and optimization. In this paper, we identify four strategies and three general-purpose optimizations from representative state-of-the-art algorithms. We implement the four strategies with the optimizations based on the common Timely dataflow system for systematic strategy-level comparison. Our implementation covers all representative algorithms. We conduct extensive experiments for both unlabelled matching and labelled matching to analyze the performance of distributed subgraph matching under various settings, which is finally summarized as a practical guide.
Graph pattern matching is one of the most fundamental problems in graph database and is associated with a wide spectrum of applications. Due to its computational intensiveness, researchers have primarily devoted their efforts to improving the performance of the algorithm while constraining the graphs to have singular labels on vertices (edges) or no label. Whereas in practice graphs are typically associated with rich properties, thus the main focus in the industry is instead on powerful query languages that can express a sufficient number of pattern matching scenarios. We demo PatMat in this work to glue together the academic efforts on performance and the industrial efforts on expressiveness. To do so, we leverage the state-of-the-art join-based algorithms in the distributed contexts and Cypher query language - the most widely-adopted declarative language for graph pattern matching. The experiments demonstrate how we are capable of turning complex Cypher semantics into a distributed solution with high performance.