Data science, human factors, text mining, applied machine learning, software engineering, privacy
My research interests are user-centered and data-driven, and fall into two parts. First, I design machine learning approaches that assist users with difficult interactive tasks (i.e., learning to interact with users to solve their "real pains"). Specifically, I have worked on e-Commerce search (helping online shoppers quickly find an item), software development question retrieval (helping developers find semantically relevant questions), and mobile security decision making (helping users understand the purpose of data collection). My ongoing work focuses on natural language to code synthesis, i.e., semantic parsing. Second, I am interested in designing statistical studies that uncover new insights in human factors. More details can be found in my research statement.
Text Mining for Mobile Security Interaction
CLAP: A Recommender System for Assisting User Security Interaction
Learning Search Log to Optimize Numerical Facet Interface
Interactive Hierarchical Moment-based Inference
Constructing a topic hierarchy for a large text collection, such as business documents, news articles, social media messages, or research publications, helps information workers, data analysts, and researchers summarize and navigate the collection efficiently at multiple granularities. However, fully automatic approaches are error prone and often fail to meet user requirements. We propose to give users the freedom to construct topical hierarchies via interactive operations such as expanding a branch and merging several branches. We build our approach on a spectral learning framework, the moment-based inference method, and our technical contributions are twofold. First, we derive robust inference solutions for each operation, so that user editing does not lose information for the inference. Second, we optimize the algorithms of the moment-based framework, so our proposed method is orders of magnitude faster than existing hierarchical topic construction methods.