Bio

I work as an engineering manager in the query engine group at Databricks. My team focuses on delivering a best-in-class SQL experience for customers through modern, rich, and easy-to-use language features.

Previously, I managed an engineering team at Amazon Redshift responsible for Redshift SQL language features, ingestion, migrations, and query federation. Prior to that, I worked as a research scientist at Datometry on query cross compilation. Earlier, I was part of the query optimizer team for Greenplum Database, working on ORCA.

My research interests include query processing in massively parallel database systems, privacy-preserving data publishing, and data mining.

I received my Ph.D. from the Department of Computer Science at North Carolina State University. My advisor was Dr. Ting Yu. Before I came to NC State University, I obtained my bachelor's degree from Zhejiang University, Hangzhou, China. I am an alumnus of Tianjin Nankai High School.

Professional Service

I serve regularly on the program committees of database conferences and as a reviewer for IEEE Transactions on Knowledge and Data Engineering.

PC member - CIKM 2014, SIGMOD 2019 (Industrial Track), VLDB 2020 (Industrial Track), ICDE 2021, SIGMOD 2022, ICDE 2022

Publications

Selected publications. For the full list please refer to my Google Scholar page.

Orca - A Modular Query Optimizer Architecture for Big Data
MA Soliman, L. Antova, V. Raghavan, A. El-Helw, Z. Gu, E. Shen, GC Caragea, C. Garcia-Alvarado, F. Rahman, M. Petropoulos, F. Waas, S. Narayanan, K. Krikellas, R. Baldwin

2014 ACM SIGMOD International Conference on Management of Data. [pdf] [src]

Datometry Hyper-Q: Bridging the Gap Between Real-Time and Historical Analytics
L. Antova, R. Baldwin, D. Bryant, T. Cao, M. Duller, J. Eshleman, Z. Gu, E. Shen, MA Soliman, F. Waas

2016 ACM SIGMOD International Conference on Management of Data. [pdf]

Mining Frequent Graph Patterns with Differential Privacy
Entong Shen, Ting Yu

2013 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. [pdf] [slides]

Differentially Private Spatial Decompositions
G Cormode, C Procopiuc, D Srivastava, E Shen, T Yu

2012 IEEE 28th International Conference on Data Engineering (ICDE), 20-31. [pdf] [slides] [src]

Reversing Statistics for Scalable Test Databases Generation
Entong Shen, Lyublena Antova

Proceedings of the Sixth International Workshop on Testing Database Systems, 2013. [pdf]

Empirical Privacy and Empirical Utility of Anonymized Data
G Cormode, CM Procopiuc, E Shen, D Srivastava, T Yu

2013 IEEE 29th International Conference on Data Engineering Workshops. [pdf]

Patents

Methods and Apparatus to Anonymize a Dataset of Spatial Data
GR Cormode, CM Procopiuc, D Srivastava, E Shen
US Patent 8,627,488

Method and System for Transparent Interoperability between Applications and Data Management Systems
FM Waas, M Soliman, Z Gu, LR Antova, TA Cao, E Shen, MA Duller
US Patent 10,628,438

Method and System for Workload Management for Data Management Systems
FM Waas, M Soliman, Z Gu, LR Antova, TA Cao, E Shen, MA Duller
US Patent 10,594,779