Bin Cui

Orcid: 0000-0003-1681-4677

Affiliations:
  • Peking University, School of Electronics Engineering and Computer Science, Beijing, China
  • National University of Singapore, School of Computing, Singapore (former)
  • Peking University, Beijing, China


According to our database, Bin Cui authored at least 340 papers between 2001 and 2024.

Bibliography

2024
Diffusion Models: A Comprehensive Survey of Methods and Applications.
ACM Comput. Surv., April, 2024

Welcome to a New Era of the Data Science and Engineering Journal (DSE).
Data Sci. Eng., March, 2024

Individual and Structural Graph Information Bottlenecks for Out-of-Distribution Generalization.
IEEE Trans. Knowl. Data Eng., February, 2024

LIST: Learning to Index Spatio-Textual Data for Embedding based Spatial Keyword Queries.
CoRR, 2024

Retrieval-Augmented Generation for AI-Generated Content: A Survey.
CoRR, 2024

Structure-Guided Adversarial Training of Diffusion Models.
CoRR, 2024

Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing.
CoRR, 2024

RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models.
CoRR, 2024

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs.
CoRR, 2024

SpotServe: Serving Generative Large Language Models on Preemptible Instances.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
OneSketch: A Generic and Accurate Sketch for Data Streams.
IEEE Trans. Knowl. Data Eng., December, 2023

Experimental Analysis of Large-scale Learnable Vector Storage Compression.
Proc. VLDB Endow., December, 2023

BurstSketch: Finding Bursts in Data Streams.
IEEE Trans. Knowl. Data Eng., November, 2023

An Efficient Transfer Learning Based Configuration Adviser for Database Tuning.
Proc. VLDB Endow., November, 2023

AutoML for Deep Recommender Systems: A Survey.
ACM Trans. Inf. Syst., October, 2023

Graph-Based Non-Sampling for Knowledge Graph Enhanced Recommendation.
IEEE Trans. Knowl. Data Eng., September, 2023

HoppingSketch: More Accurate Temporal Membership Query and Frequency Query.
IEEE Trans. Knowl. Data Eng., September, 2023

P²CG: a privacy preserving collaborative graph neural network training framework.
VLDB J., July, 2023

On the Evolutionary of Bloom Filter False Positives - An Information Theoretical Approach to Optimizing Bloom Filter Parameters.
IEEE Trans. Knowl. Data Eng., July, 2023

A Sketch Framework for Approximate Data Stream Processing in Sliding Windows.
IEEE Trans. Knowl. Data Eng., May, 2023

VolcanoML: speeding up end-to-end AutoML via scalable search space decomposition.
VLDB J., March, 2023

ICS-GNN⁺: lightweight interactive community search via graph neural network.
VLDB J., March, 2023

DRGI: Deep Relational Graph Infomax for Knowledge Graph Completion.
IEEE Trans. Knowl. Data Eng., March, 2023

Survey on performance optimization for database systems.
Sci. China Inf. Sci., February, 2023

Hetu: a highly efficient automatic parallel distributed deep learning system.
Sci. China Inf. Sci., January, 2023

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-Aware Deep Architecture.
IEEE Trans. Knowl. Data Eng., 2023

Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent.
Proc. VLDB Endow., 2023

SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training.
Proc. VLDB Endow., 2023

ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems.
Proc. VLDB Endow., 2023

Towards Designing and Learning Piecewise Space-Filling Curves.
Proc. VLDB Endow., 2023

Towards General and Efficient Online Tuning for Spark.
Proc. VLDB Endow., 2023

Double-Anonymous Sketch: Achieving Top-K-fairness for Finding Global Top-K Frequent Items.
Proc. ACM Manag. Data, 2023

A Unified and Efficient Coordinating Framework for Autonomous DBMS Tuning.
Proc. ACM Manag. Data, 2023

JoinSketch: A Sketch Algorithm for Accurate and Unbiased Inner-Product Estimation.
Proc. ACM Manag. Data, 2023

Scapin: Scalable Graph Structure Perturbation by Augmented Influence Maximization.
Proc. ACM Manag. Data, 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement.
Proc. ACM Manag. Data, 2023

TreeSensing: Linearly Compressing Sketches with Flexibility.
Proc. ACM Manag. Data, 2023

LadderFilter: Filtering Infrequent Items with Small Memory and Time Overhead.
Proc. ACM Manag. Data, 2023

DBPA: A Benchmark for Transactional Database Performance Anomalies.
Proc. ACM Manag. Data, 2023

Graph Neural Networks in Recommender Systems: A Survey.
ACM Comput. Surv., 2023

On-Device Recommender Systems: A Tutorial on The New-Generation Recommendation Paradigm.
CoRR, 2023

CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models.
CoRR, 2023

Poisoning Attacks Against Contrastive Recommender Systems.
CoRR, 2023

Accelerating Scalable Graph Neural Network Inference with Node-Adaptive Propagation.
CoRR, 2023

Model-enhanced Vector Index.
CoRR, 2023

VQGraph: Graph Vector-Quantization for Bridging GNNs and MLPs.
CoRR, 2023

BOURNE: Bootstrapped Self-supervised Learning Framework for Unified Graph Anomaly Detection.
CoRR, 2023

Improving Automatic Parallel Training via Balanced Memory Workload Optimization.
CoRR, 2023

FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference.
CoRR, 2023

OpenBox: A Python Toolkit for Generalized Black-box Optimization.
CoRR, 2023

Transfer Learning for Bayesian Optimization: A Survey.
CoRR, 2023

StreamE: Learning to Update Representations for Temporal Knowledge Graphs in Streaming Scenarios.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Model-enhanced Vector Index.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Improving Diffusion-Based Image Synthesis with Context Prediction.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Rover: An Online Spark SQL Tuning Service via Generalized Transfer Learning.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Hyper-USS: Answering Subset Query Over Multi-Attribute Data Stream.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

SketchPolymer: Estimate Per-item Tail Quantile Using One Sketch.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

MicroscopeSketch: Accurate Sliding Estimation Using Adaptive Zooming.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

SteadySketch: Finding Steady Flows in Data Streams.
Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Hierarchical Interest Modeling of Long-tailed Users for Click-Through Rate Prediction.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

REncoder: A Space-Time Efficient Range Filter with Local Encoder.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

SketchConf: A Framework for Automatic Sketch Configuration.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

HyperCalm Sketch: One-Pass Mining Periodic Batches in Data Streams.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Finding Simplex Items in Data Streams.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

KVSAgg: Secure Aggregation of Distributed Key-Value Sets.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Mitigating Semantic Confusion from Hostile Neighborhood for Graph Active Learning.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

ProxyBO: Accelerating Neural Architecture Search via Bayesian Optimization with Zero-Cost Proxies.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
CuWide: Towards Efficient Flow-Based Training for Sparse Wide Models on GPUs.
IEEE Trans. Knowl. Data Eng., 2022

Coloring Embedder: Towards Multi-Set Membership Queries in Web Cache Sharing.
IEEE Trans. Knowl. Data Eng., 2022

LTC: A Fast Algorithm to Accurately Find Significant Items in Data Streams.
IEEE Trans. Knowl. Data Eng., 2022

Elastic Bloom Filter: Deletable and Expandable Filter Using Elastic Fingerprints.
IEEE Trans. Computers, 2022

Facilitating Database Tuning with Hyper-Parameter Optimization: A Comprehensive Experimental Evaluation.
Proc. VLDB Endow., 2022

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.
Proc. VLDB Endow., 2022

Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale.
Proc. VLDB Endow., 2022

An I/O-Efficient Disk-based Graph System for Scalable Second-Order Random Walk of Large Graphs.
Proc. VLDB Endow., 2022

Stingy Sketch: A Sketch Framework for Accurate and Fast Frequency Estimation.
Proc. VLDB Endow., 2022

Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Update.
Proc. VLDB Endow., 2022

Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training.
CoRR, 2022

Efficient Graph Neural Network Inference at Large Scale.
CoRR, 2022

Distributed Graph Neural Network Training: A Survey.
CoRR, 2022

OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning.
CoRR, 2022

Diffusion Models: A Comprehensive Survey of Methods and Applications.
CoRR, 2022

Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates.
CoRR, 2022

Efficient End-to-End AutoML via Scalable Search Space Decomposition.
CoRR, 2022

DFG-NAS: Deep and Flexible Graph Neural Architecture Search.
CoRR, 2022

Instance-wise Prompt Tuning for Pretrained Language Models.
CoRR, 2022

HetuMoE: An Efficient Trillion-scale Mixture-of-Expert Distributed Training System.
CoRR, 2022

LECF: recommendation via learnable edge collaborative filtering.
Sci. China Inf. Sci., 2022

AutoDC: an automatic machine learning framework for disease classification.
Bioinform., 2022

PaSca: A Graph Neural Architecture Search System under the Scalable Paradigm.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

MinMax Sampling: A Near-optimal Global Summary for Aggregation in the Wide Area.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Towards Dynamic and Safe Configuration Tuning for Cloud Databases.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Hunting Temporal Bumps in Graphs with Dynamic Vertex Properties.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

BlindFL: Vertical Federated Machine Learning without Peeking into Your Data.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

K-core decomposition on super large graphs with limited resources.
Proceedings of the SAC '22: The 37th ACM/SIGAPP Symposium on Applied Computing, Virtual Event, April 25, 2022

DivBO: Diversity-aware CASH for Ensemble Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Graph Attention Multi-Layer Perceptron.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Model Degradation Hinders Deep Graph Neural Networks.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

TransBO: Hyperparameter Optimization via Two-Phase Transfer Learning.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Transfer Learning based Search Space Design for Hyperparameter Tuning.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Analyzing Online Transaction Networks with Network Motifs.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

TJOSConf: Automatic and Safe System Environment Operations Platform.
Proceedings of the ICSCA 2022: 11th International Conference on Software and Computer Applications, Melaka, Malaysia, February 24, 2022

BurstBalancer: Do Less, Better Balance for Large-scale Data Center Traffic.
Proceedings of the 30th IEEE International Conference on Network Protocols, 2022

NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning.
Proceedings of the International Conference on Machine Learning, 2022

Deep and Flexible Graph Neural Architecture Search.
Proceedings of the International Conference on Machine Learning, 2022

Information Gain Propagation: a New Way to Graph Active Learning with Soft Labels.
Proceedings of the Tenth International Conference on Learning Representations, 2022

The Stair Sketch: Bringing more Clarity to Memorize Recent Events.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Contrastive Learning for Sequential Recommendation.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

TSPLIT: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-aware Deep Architecture (Extended Abstract).
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

DRGI: Deep Relational Graph Infomax for Knowledge Graph Completion: (Extended Abstract).
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Zoomer: Boosting Retrieval on Web-scale Graphs by Regions of Interest.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

PeriodicSketch: Finding Periodic Items in Data Streams.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

HET-KG: Communication-Efficient Knowledge Graph Embedding Training via Hotness-Aware Cache.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Message from Data Science and Systems 2022 Program Chairs.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

PointCLIP: Point Cloud Understanding by CLIP.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scalable Graph Sampling on GPUs with Compressed Graph.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

UDM: A Unified Deep Matching Framework in Recommender Systems.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Distributed Machine Learning and Gradient Optimization
Springer, ISBN: 978-981-16-3419-2, 2022

2021
Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs.
VLDB J., 2021

Model averaging in distributed machine learning: a case study with Apache Spark.
VLDB J., 2021

Sys-TM: A Fast and General Topic Modeling System.
IEEE Trans. Knowl. Data Eng., 2021

FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation.
Proc. VLDB Endow., 2021

Grain: Improving Data Efficiency of Graph Neural Networks via Diversified Influence Maximization.
Proc. VLDB Endow., 2021

Finding Group Steiner Trees in Graphs with both Vertex and Edge Weights.
Proc. VLDB Endow., 2021

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework.
Proc. VLDB Endow., 2021

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition.
Proc. VLDB Endow., 2021

Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation.
Proc. VLDB Endow., 2021

Enhanced review-based rating prediction by exploiting aside information and user influence.
Knowl. Based Syst., 2021

A classification framework for multivariate compositional data with Dirichlet feature embedding.
Knowl. Based Syst., 2021

Dense-to-Sparse Gate for Mixture-of-Experts.
CoRR, 2021

Graph Attention Multi-Layer Perceptron.
CoRR, 2021

Evaluating Deep Graph Neural Networks.
CoRR, 2021

CausCF: Causal Collaborative Filtering for Recommendation Effect Estimation.
CoRR, 2021

GMLP: Building Scalable and Flexible Graph Neural Networks with Feature-Message Passing.
CoRR, 2021

BurstSketch: Finding Bursts in Data Streams.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

ResTune: Resource Oriented Tuning Boosted by Meta-Learning for Cloud Databases.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

ALG: Fast and Accurate Active Learning Framework for Graph Convolutional Networks.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

VF²Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Node Dependent Local Smoothing for Scalable Graph Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

RIM: Reliable Influence-based Active Learning on Graphs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

ROD: Reception-aware Online Distillation for Sparse Graphs.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

DeGNN: Improving Graph Neural Networks with Graph Decomposition.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

OpenBox: A Generalized Black-box Optimization Service.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Challenges and Opportunities of Building Fast GBDT Systems.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Efficient and Scalable Structure Learning for Bayesian Networks: Algorithms and Applications.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

UniNet: Scalable Network Representation Learning with Metropolis-Hastings Sampling.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Explore User Neighborhood for Real-time E-commerce Recommendation.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

CuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs (Extended Abstract).
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

CausCF: Causal Collaborative Filtering for Recommendation Effect Estimation.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

iMap: Incremental Node Mapping between Large Graphs Using GNN.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
SKCompress: compressing sparse and nonuniform gradient in distributed machine learning.
VLDB J., 2020

SF-Sketch: A Two-Stage Sketch for Data Streams.
IEEE Trans. Parallel Distributed Syst., 2020

On-Off Sketch: A Fast and Accurate Sketch on Persistence.
Proc. VLDB Endow., 2020

Hunting multiple bumps in graphs.
Proc. VLDB Endow., 2020

EdgeDIPN: a Unified Deep Intent Prediction Network Deployed at the Edge.
Proc. VLDB Endow., 2020

GARG: Anonymous Recommendation of Point-of-Interest in Mobile Networks by Graph Convolution Network.
Data Sci. Eng., 2020

DASFAA 2020 Special Issue Editorial.
Data Sci. Eng., 2020

Graph Neural Networks in Recommender Systems: A Survey.
CoRR, 2020

Contrastive Pre-training for Sequential Recommendation.
CoRR, 2020

Snapshot boosting: a fast ensemble framework for deep neural networks.
Sci. China Inf. Sci., 2020

Reliable Data Distillation on Graph Convolutional Network.
Proceedings of the 2020 International Conference on Management of Data, 2020

Memory-Aware Framework for Efficient Second-Order Random Walk on Large Graphs.
Proceedings of the 2020 International Conference on Management of Data, 2020

WavingSketch: An Unbiased and Generic Sketch for Finding Top-k Items in Data Streams.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Sliding Sketches: A Framework using Time Zones for Data Stream Processing in Sliding Windows.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript.
Proceedings of the 37th International Conference on Machine Learning, 2020

Efficient Diversity-Driven Ensemble for Deep Neural Networks.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

ColumnSGD: A Column-oriented Framework for Distributed Stochastic Gradient Descent.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

PSGraph: How Tencent trains extremely large-scale graphs with Spark?
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Group Recommendation with Latent Voting Mechanism.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Preference-Aware Mask for Session-Based Recommendation with Bidirectional Transformer.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Densely-Connected Transformer with Co-attentive Information for Matching Text Sequences.
Proceedings of the Web and Big Data - 4th International Joint Conference, 2020

Efficient Automatic CASH via Rising Bandits.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Large-scale Graph Analysis: System, Algorithm and Optimization
Springer, ISBN: 978-981-15-3927-5, 2020

2019
Fine-grained probability counting for cardinality estimation of data streams.
World Wide Web, 2019

Fast and accurate stream processing by filtering the cold.
VLDB J., 2019

An Experimental Evaluation of Large Scale GBDT Systems.
Proc. VLDB Endow., 2019

Special Issue of APWeb-WAIM 2019.
Data Sci. Eng., 2019

Fast De-anonymization of Social Networks with Structural Information.
Data Sci. Eng., 2019

PS2: Parameter Server on Spark.
Proceedings of the 2019 International Conference on Management of Data, 2019

Buying or Browsing?: Predicting Real-time Purchasing Intent using Attention-based Deep Network with Multiple Behavior.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

MLlib*: Fast Training of GLMs Using Spark MLlib.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Coloring Embedder: A Memory Efficient Data Structure for Answering Multi-set Query.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Multi-copy Cuckoo Hashing.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Sparse Gradient Compression for Distributed SGD.
Proceedings of the Database Systems for Advanced Applications, 2019

Magic Cube Bloom Filter: Answering Membership Queries for Multiple Sets.
Proceedings of the IEEE International Conference on Big Data and Smart Computing, 2019

FeatureBand: A Feature Selection Method by Combining Early Stopping and Genetic Local Search.
Proceedings of the Web and Big Data - Third International Joint Conference, 2019

2018
Spatiotemporal Recommendation in Geo-Social Networks.
Proceedings of the Encyclopedia of Social Network Analysis and Mining, 2nd Edition, 2018

CO²: Inferring Personal Interests From Raw Footprints by Connecting the Offline World with the Online World.
ACM Trans. Inf. Syst., 2018

Cold Filter: A Meta-Framework for Faster and More Accurate Stream Processing.
Proceedings of the 2018 International Conference on Management of Data, 2018

SketchML: Accelerating Distributed Machine Learning with Data Sketches.
Proceedings of the 2018 International Conference on Management of Data, 2018

DimBoost: Boosting Gradient Boosting Decision Tree to Higher Dimensions.
Proceedings of the 2018 International Conference on Management of Data, 2018

CRAN: A Hybrid CNN-RNN Attention-Based Model for Text Classification.
Proceedings of the Conceptual Modeling - 37th International Conference, 2018

GLM+: An Efficient System for Generalized Linear Models.
Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing, 2018

Fine-Grained Probability Counting: Refined LogLog Algorithm.
Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing, 2018

Single Hash: Use One Hash Function to Build Faster Hash Based Data Structures.
Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing, 2018

SSS: An Accurate and Fast Algorithm for Finding Top-k Hot Items in Data Streams.
Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing, 2018

Geo-Edge: Geographical Resource Allocation on Edge Caches for Video-on-Demand Streaming.
Proceedings of the 4th International Conference on Big Data Computing and Communications, 2018

CUTE: Querying Knowledge Graphs by Tabular Examples.
Proceedings of the Web and Big Data - Second International Joint Conference, 2018

2017
GVoS: A General System for Near-Duplicate Video-Related Applications on Storm.
ACM Trans. Inf. Syst., 2017

UniAD: A Unified Ad Hoc Data Processing System.
ACM Trans. Database Syst., 2017

Fast Parallel Path Concatenation for Graph Extraction.
IEEE Trans. Knowl. Data Eng., 2017

An Experimental Evaluation of SimRank-based Similarity Search Algorithms.
Proc. VLDB Endow., 2017

LDA*: A Robust and Large-scale Topic Modeling System.
Proc. VLDB Endow., 2017

MLog: Towards Declarative In-Database Machine Learning.
Proc. VLDB Endow., 2017

SF-sketch: slim-fat-sketch with GPU assistance.
CoRR, 2017

Heterogeneity-aware Distributed Parameter Servers.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

SF-sketch: A Fast, Accurate, and Memory Efficient Data Structure to Store Frequencies of Data Items.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

TencentBoost: A Gradient Boosting Tree System with Parameter Server.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

From Raw Footprints to Personal Interests: Bridging the Semantic Gap via Trip Intention Aggregation.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

StroMAX: Partitioning-Based Scheduler for Real-Time Stream Processing System.
Proceedings of the Database Systems for Advanced Applications, 2017

ABC: A practicable sketch framework for non-uniform multisets.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

TeslaML: Steering Machine Learning Automatically in Tencent.
Proceedings of the Web and Big Data - First International Joint Conference, 2017

2016
Spatio-Temporal Recommendation in Social Media
Springer Briefs in Computer Science, Springer, ISBN: 978-981-10-0748-4, 2016

Distinguishing re-sharing behaviors from re-creating behaviors in information diffusion.
World Wide Web, 2016

Joint Modeling of User Check-in Behaviors for Real-time Point-of-Interest Recommendation.
ACM Trans. Inf. Syst., 2016

Adapting to User Interest Drift for POI Recommendation.
IEEE Trans. Knowl. Data Eng., 2016

POS: A High-Level System to Simplify Real-Time Stream Application Development on Storm.
Data Sci. Eng., 2016

Expert team finding for review assignment.
Proceedings of the Conference on Technologies and Applications of Artificial Intelligence, 2016

Tornado: A System For Real-Time Iterative Analysis Over Evolving Data.
Proceedings of the 2016 International Conference on Management of Data, 2016

Real-time Video Recommendation Exploration.
Proceedings of the 2016 International Conference on Management of Data, 2016

Cross-layer approach to joint transmitter selection for cooperative transmission.
Proceedings of the 16th International Symposium on Communications and Information Technologies, 2016

Satisfiability of Linear Time Mu-Calculus on Finite Traces.
Proceedings of the Computing and Combinatorics - 22nd International Conference, 2016

2015
A multi-source integration framework for user occupation inference in social media systems.
World Wide Web, 2015

Dynamic User Modeling in Social Media Systems.
ACM Trans. Inf. Syst., 2015

Heterogeneous Environment Aware Streaming Graph Partitioning.
IEEE Trans. Knowl. Data Eng., 2015

PAGE: A Partition Aware Engine for Parallel Graph Computation.
IEEE Trans. Knowl. Data Eng., 2015

Modeling Location-Based User Rating Profiles for Personalized Recommendation.
ACM Trans. Knowl. Discov. Data, 2015

An Efficient Similarity Search Framework for SimRank over Large Dynamic Graphs.
Proc. VLDB Endow., 2015

Exploiting Matrix Dependency for Efficient Distributed Matrix Computation.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

TencentRec: Real-time Stream Recommendation in Practice.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Community Level Diffusion Extraction.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Joint Modeling of Users' Interests and Mobility Patterns for Point-of-Interest Recommendation.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Evaluations and measurements of a high frequency nanocrystalline core transformer for power converters.
Proceedings of the IECON 2015, 2015

Finding top-k local users in geo-tagged social media data.
Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

2014
LCARS: A Spatial Item Recommender System.
ACM Trans. Inf. Syst., 2014

LogGP: A Log-based Dynamic Graph Partitioning Method.
Proc. VLDB Endow., 2014

HAT: an efficient buffer management method for flash-based hybrid storage systems.
Frontiers Comput. Sci., 2014

A temporal context-aware model for user behavior modeling in social media systems.
Proceedings of the International Conference on Management of Data, 2014

Towards unified ad-hoc data processing.
Proceedings of the International Conference on Management of Data, 2014

Parallel subgraph listing in a large-scale graph.
Proceedings of the International Conference on Management of Data, 2014

Efficient cohesive subgraphs detection in parallel.
Proceedings of the International Conference on Management of Data, 2014

User Group Oriented Temporal Dynamics Exploration.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
A Gram-Based String Paradigm for Efficient Video Subsequence Search.
IEEE Trans. Multim., 2013

TeRec: A Temporal Recommender System Over Tweet Stream.
Proc. VLDB Endow., 2013

Community Specific Temporal Topic Discovery from Social Media.
CoRR, 2013

A Multiple Feature Integration Model to Infer Occupation from Social Media Records.
Proceedings of the Web Information Systems Engineering - WISE 2013, 2013

bCATE: A Balanced Contention-Aware Transaction Execution Model for Highly Concurrent OLTP Systems.
Proceedings of the Web-Age Information Management - 14th International Conference, 2013

LCARS: a location-content-aware recommender system.
Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013

A unified model for stable and temporal topic detection from social media data.
Proceedings of the 29th IEEE International Conference on Data Engineering, 2013

PAGE: a partition aware graph computation engine.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Hotness-aware buffer management for flash-based hybrid storage systems.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

A Probabilistic Data Replacement Strategy for Flash-Based Hybrid Storage System.
Proceedings of the Web Technologies and Applications - 15th Asia-Pacific Web Conference, 2013

2012
Bursty event detection from collaborative tags.
World Wide Web, 2012

Evolutionary taxonomy construction from dynamic tag space.
World Wide Web, 2012

Approaches to Exploring Category Information for Question Retrieval in Community Question-Answer Archives.
ACM Trans. Inf. Syst., 2012

A Framework for Similarity Search of Time Series Cliques with Natural Relations.
IEEE Trans. Knowl. Data Eng., 2012

Challenging the Long Tail Recommendation.
Proc. VLDB Endow., 2012

Extracting representative motion flows for effective video retrieval.
Multim. Tools Appl., 2012

Recommending Flickr groups with social topic model.
Inf. Retr., 2012

Temporal provenance discovery in micro-blog message streams (abstract only).
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Provenance-based Indexing Support in Micro-blog Platforms.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Keyword Query Reformulation on Structured Data.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Efficient approximation of the maximal preference scores by lightweight cubic views.
Proceedings of the 15th International Conference on Extending Database Technology, 2012

2011
Correlation-based retrieval for heavily changed near-duplicate videos.
ACM Trans. Inf. Syst., 2011

Constrained Skyline Query Processing against Distributed Data Sites.
IEEE Trans. Knowl. Data Eng., 2011

Operation-aware buffer management in flash-based systems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Modeling User Expertise in Folksonomies by Fusing Multi-type Features.
Proceedings of the Database Systems for Advanced Applications, 2011

Log-Compact R-Tree: An Efficient Spatial Index for SSD.
Proceedings of the Database Systems for Advanced Applications, 2011

Finding a Wise Group of Experts in Social Networks.
Proceedings of the Advanced Data Mining and Applications - 7th International Conference, 2011

2010
Practical Online Near-Duplicate Subsequence Detection for Continuous Video Streams.
IEEE Trans. Multim., 2010

Exploring Correlated Subspaces for Efficient Query Processing in Sparse Databases.
IEEE Trans. Knowl. Data Eng., 2010

A generalized framework of exploring category information for question retrieval in community question answer archives.
Proceedings of the 19th International Conference on World Wide Web, 2010

ACAR: An Adaptive Cost Aware Cache Replacement Approach for Flash Memory.
Proceedings of the Web-Age Information Management, 11th International Conference, 2010

Multiple feature fusion for social media applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Content-enriched classifier for web video classification.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Efficient similarity matching of Time Series Cliques with natural relations.
Proceedings of the 26th International Conference on Data Engineering, 2010

Detecting bursty events in collaborative tagging systems.
Proceedings of the 26th International Conference on Data Engineering, 2010

ISIS: A New Approach for Efficient Similarity Search in Sparse Databases.
Proceedings of the Database Systems for Advanced Applications, 2010

Distributed Cache Indexing for Efficient Subspace Skyline Computation in P2P Networks.
Proceedings of the Database Systems for Advanced Applications, 2010

Temporal and Social Context Based Burst Detection from Folksonomies.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
A novel framework for efficient automated singer identification in large music databases.
ACM Trans. Inf. Syst., 2009

Bounded coordinate system indexing for real-time video clip search.
ACM Trans. Inf. Syst., 2009

Efficient Skyline Computation in Structured Peer-to-Peer Systems.
IEEE Trans. Knowl. Data Eng., 2009

Accelerating sequence searching: dimensionality reduction method.
Knowl. Inf. Syst., 2009

Hybrid information retrieval policies based on cooperative cache in mobile P2P networks.
Frontiers Comput. Sci. China, 2009

Linking identical neighborly partitions for efficient high-dimensional similarity search in unstructured peer-to-peer systems.
Distributed Parallel Databases, 2009

Implementation Issues of A Cloud Computing Platform.
IEEE Data Eng. Bull., 2009

A Novel Content Distribution Mechanism in DHT Networks.
Proceedings of the NETWORKING 2009, 2009

Routing Questions to the Right Users in Online Communities.
Proceedings of the 25th International Conference on Data Engineering, 2009

Hybrid Retrieval Mechanisms in Vehicle-Based P2P Networks.
Proceedings of the Computational Science, 2009

A Revisit of Query Expansion with Different Semantic Levels.
Proceedings of the Database Systems for Advanced Applications, 2009

Video Annotation System Based on Categorizing and Keyword Labelling.
Proceedings of the Database Systems for Advanced Applications, 2009

Constructing evolutionary taxonomy of collaborative tagging systems.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Efficient information retrieval in mobile peer-to-peer networks.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

The use of categorization information in language models for question retrieval.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
PGG: An Online Pattern Based Approach for Stream Variation Management.
J. Comput. Sci. Technol., 2008

Semantic similarity based on compact concept ontology.
Proceedings of the 17th International Conference on World Wide Web, 2008

Achieving Effective Multi-term Queries for Fast DHT Information Retrieval.
Proceedings of the Web Information Systems Engineering, 2008

Towards Efficient and Flexible KNN Query Processing in Real-Life Road Networks.
Proceedings of the Ninth International Conference on Web-Age Information Management, 2008

Parallel Distributed Processing of Constrained Skyline Queries by Filtering.
Proceedings of the 24th International Conference on Data Engineering, 2008

iSky: Efficient and Progressive Skyline Computing in a Structured P2P Network.
Proceedings of the 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), 2008

Compacting music signatures for efficient music retrieval.
Proceedings of the EDBT 2008, 2008

Effective Skyline Cardinality Estimation on Data Streams.
Proceedings of the Database and Expert Systems Applications, 19th International Conference, 2008

Squeezing Long Sequence Data for Efficient Similarity Search.
Proceedings of the Progress in WWW Research and Development, 2008

2007
Efficient index-based KNN join processing for high-dimensional data.
Inf. Softw. Technol., 2007

LINP: Supporting Similarity Search in Unstructured Peer-to-Peer Networks.
Proceedings of the Advances in Data and Web Management, 2007

Effective variation management for pseudo periodical streams.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

QueST: querying music databases by acoustic and textual features.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Evaluating MAX and MIN over Sliding Windows with Various Size Using the Exemplary Sketch.
Proceedings of the Advances in Databases: Concepts, 2007

An Optimized Process Neural Network Model.
Proceedings of the Advances in Databases: Concepts, 2007

CLAIM: An Efficient Method for Relaxed Frequent Closed Itemsets Mining over Stream Data.
Proceedings of the Advances in Databases: Concepts, 2007

Optimizing Moving Queries over Moving Object Data Streams.
Proceedings of the Advances in Databases: Concepts, 2007

2006
Indexing and Integrating Multiple Features for WWW Images.
World Wide Web, 2006

ICICLE: A semantic-based retrieval system for WWW images.
Multim. Syst., 2006

IMPACT: A twin-index framework for efficient moving object query processing.
Data Knowl. Eng., 2006

Classifying E-Mails Via Support Vector Machine.
Proceedings of the Advances in Web-Age Information Management, 2006

Towards efficient automated singer identification in large music databases.
Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), 2006

Exploring composite acoustic features for efficient music similarity query.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

A new image segmentation algorithm based on kernel spatial fuzzy c-means (FCM).
Proceedings of the 6th Industrial Conference on Data Mining, Workshop Proceedings, 2006

HSI: A Novel Framework for Efficient Automated Singer Identification in Large Music Database.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Summarizing Frequent Patterns Using Profiles.
Proceedings of the Database Systems for Advanced Applications, 2006

A Web-based Framework For On-line Collaborative Learning.
Proceedings of the 10th International Conference on CSCW in Design, 2006

2005
Indexing High-Dimensional Data for Efficient In-Memory Similarity Search.
IEEE Trans. Knowl. Data Eng., 2005

On Effective E-mail Classification via Neural Networks.
Proceedings of the Database and Expert Systems Applications, 16th International Conference, 2005

Towards Optimal Utilization of Main Memory for Moving Object Indexing.
Proceedings of the Database Systems for Advanced Applications, 2005

Indexing Text and Visual Features for WWW Images.
Proceedings of the Web Technologies Research and Development - APWeb 2005, 7th Asia-Pacific Web Conference, Shanghai, China, March 29, 2005

Exploring Bit-Difference for Approximate KNN Search in High-dimensional Databases.
Proceedings of the Database Technologies 2005, 2005

2004
Main Memory Indexing: The Case for BD-Tree.
IEEE Trans. Knowl. Data Eng., 2004

Adaptive Quantization of the High-Dimensional Data for Efficient KNN Processing.
Proceedings of the Database Systems for Advanced Applications, 2004

Diagonal Ordering: A New Approach to High-Dimensional KNN Processing.
Proceedings of the Database Technologies 2004, 2004

2003
Supporting Frequent Updates in R-Trees: A Bottom-Up Approach.
Proceedings of the 29th International Conference on Very Large Data Bases, 2003

Contorting High Dimensional Data for Efficient Main Memory Processing.
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003

2001
Range Top/Bottom k Queries in OLAP Sparse Data Cubes.
Proceedings of the Database and Expert Systems Applications, 12th International Conference, 2001

