Kevin Lin

ORCID: 0000-0002-4968-0532

According to our database, Kevin Lin authored at least 111 papers between 2011 and 2024.

Lost in the Middle: How Language Models Use Long Contexts.
Trans. Assoc. Comput. Linguistics, 2024

MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities.
CoRR, 2024

IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation.
CoRR, 2024

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos.
CoRR, 2024

Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation.
CoRR, 2024

CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval.
CoRR, 2024

Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation.
CoRR, 2024

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs.
CoRR, 2024

COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training.
CoRR, 2024

MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

"It Can Relate to Real Lives": Attitudes and Expectations in Justice-centered Data Structures & Algorithms for Non-Majors.
Proceedings of the 55th ACM Technical Symposium on Computer Science Education, 2024

Open X-Embodiment: Robotic Learning Datasets and RT-X Models : Open X-Embodiment Collaboration.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Diffusion and Multi-Domain Adaptation Methods for Eosinophil Segmentation.
Proceedings of the 2024 7th International Conference on Machine Vision and Applications, 2024

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Teaching Accessible Design in Data Structures and Algorithms.
Proceedings of the 2024 ACM Conference on International Computing Education Research, 2024

Partial-View Object View Synthesis via Filtering Inversion.
Proceedings of the International Conference on 3D Vision, 2024

Text2Motion: from natural language instructions to feasible plans.
Auton. Robots, December, 2023

RALF: Accuracy-Aware Scheduling for Feature Store Maintenance.
Proc. VLDB Endow., November, 2023

MetaEx-GAN: Meta Exploration to Improve Natural Language Generation via Generative Adversarial Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning.
CoRR, 2023

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation.
CoRR, 2023

MM-VID: Advancing Video Understanding with GPT-4V(ision).
CoRR, 2023

DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design.
CoRR, 2023

MemGPT: Towards LLMs as Operating Systems.
CoRR, 2023

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation.
CoRR, 2023

OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation.
CoRR, 2023

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision).
CoRR, 2023

Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models.
CoRR, 2023

DisCo: Disentangled Control for Referring Human Dance Generation in Real World.
CoRR, 2023

Aligning Large Multi-Modal Model with Robust Instruction Tuning.
CoRR, 2023

Partial-View Object View Synthesis via Filtered Inversion.
CoRR, 2023

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action.
CoRR, 2023

Equitable Grading Best Practices.
Proceedings of the 54th ACM Technical Symposium on Computer Science Education, Volume 2, 2023

Few-Shot Adaptation for Parsing Contextual Utterances with LLMs.
Proceedings of the Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023, 2023

Equivariant Similarity for Vision-Language Foundation Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Uncertainty Quantification for Eosinophil Segmentation.
Proceedings of the 2023 10th International Conference on Bioinformatics Research and Applications, 2023

An Empirical Study of Multimodal Model Merging.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Decomposing Complex Queries for Tip-of-the-tongue Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

ReCo: Region-Controlled Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Adaptive Human Matting for Dynamic Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Neural Voting Field for Camera-Space 3D Hand Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GIT: A Generative Image-to-text Transformer for Vision and Language.
Trans. Mach. Learn. Res., 2022

Multimodal graph neural network for video procedural captioning.
Neurocomputing, 2022

Cross-modal Representation Learning for Zero-shot Action Recognition.
CoRR, 2022

Reading Between the Lines: Student Experiences of Resubmission in an Introductory CS Course.
Proceedings of the SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, 2022

CS Education for the Socially-Just Worlds We Need: The Case for Justice-Centered Approaches to CS in Higher Education.
Proceedings of the SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, 2022

Approaches for Weaving Responsible Computing into Data Structures and Algorithms Courses.
Proceedings of the SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, 2022

Crossmodal Representation Learning for Zero-shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Cross-Domain Complementary Learning Using Pose for Multi-Person Part Segmentation.
IEEE Trans. Circuits Syst. Video Technol., 2021

Combining optimal control and learning for autonomous aerial navigation in novel indoor environments.
CoRR, 2021

VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling.
CoRR, 2021

Do Abstractions Have Politics? Towards a More Critical Algorithm Analysis.
CoRR, 2021

How Can We Make Office Hours Better?
Proceedings of the SIGCSE '21: The 52nd ACM Technical Symposium on Computer Science Education, 2021

Nifty Web Apps: Build a Web App for Any Text-Based Programming Assignment.
Proceedings of the SIGCSE '21: The 52nd ACM Technical Symposium on Computer Science Education, 2021

Strategies for Authentic Assessments of Mastery in CS Courses.
Proceedings of the SIGCSE '21: The 52nd ACM Technical Symposium on Computer Science Education, 2021

Do Abstractions Have Politics? Toward a More Critical Algorithm Analysis.
Proceedings of the 2021 Conference on Research in Equitable and Sustained Participation in Engineering, 2021

Constructing Taxonomies from Pretrained Language Models.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Learning Nonparametric Human Mesh Reconstruction From A Single Image Without Ground Truth Meshes.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Mesh Graphormer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

KLSI Methods for Human Simultaneous Interpretation and Towards Building a Simultaneous Machine Translation System Reflecting the KLSI Methods.
Proceedings of the Artificial Intelligence in HCI, 2021

End-to-End Human Pose and Mesh Reconstruction with Transformers.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Cross-Batch Reference Learning for Deep Retrieval.
IEEE Trans. Neural Networks Learn. Syst., 2020

Liquidity transmission and the subprime mortgage crisis: a multivariate GARCH approach.
Soft Comput., 2020

Inducing Taxonomic Knowledge from Pretrained Transformers.
CoRR, 2020

VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training.
CoRR, 2020

A Berkeley View of Teaching CS at Scale.
CoRR, 2020

Evaluating NLP Models via Contrast Sets.
CoRR, 2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
CoRR, 2020

Perspectives on Allyship in Academia.
Proceedings of the 51st ACM Technical Symposium on Computer Science Education, 2020

How Can We Make Office Hours Better?
Proceedings of the 51st ACM Technical Symposium on Computer Science Education, 2020

Transitioning From Peer Instruction to POGIL with Guided Lecture Notes.
Proceedings of the 51st ACM Technical Symposium on Computer Science Education, 2020

It Seemed Like a Good Idea at the Time (Hindsight is 2020).
Proceedings of the 51st ACM Technical Symposium on Computer Science Education, 2020

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

Neural Module Networks for Reasoning over Text.
Proceedings of the 8th International Conference on Learning Representations, 2020

Learning to Generate Multiple Style Transfer Outputs for an Input Sentence.
Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

Unsupervised Deep Learning of Compact Binary Descriptors.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Cross-Domain Complementary Learning with Synthetic Data for Multi-Person Part Segmentation.
CoRR, 2019

Grammar-based Neural Text-to-SQL Generation.
CoRR, 2019

Subgoals, Problem Solving Phases, and Sources of Knowledge: A Complex Mangle.
CoRR, 2019

DeepBase: Deep Inspection of Neural Networks.
Proceedings of the 2019 International Conference on Management of Data, 2019

Subgoals, Problem Solving Phases, and Sources of Knowledge.
Proceedings of the 50th ACM Technical Symposium on Computer Science Education, 2019

Adversarial Learning for Fine-Grained Image Search.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Reasoning Over Paragraph Effects in Situations.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Dialectic: Enhancing Text Input Fields with Automatic Feedback to Improve Social Content Writing Quality.
CoRR, 2017

A Sharp Error Analysis for the Fused Lasso, with Application to Approximate Changepoint Screening.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Adversarial Ranking for Language Generation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Revisiting compressed sensing: exploiting the efficiency of simplex and sparsification methods.
Math. Program. Comput., 2016

Cross-batch Reference Learning for Deep Classification and Retrieval.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Abandoned Object Detection via Temporal Consistency Modeling and Back-Tracing Verification for Visual Surveillance.
IEEE Trans. Inf. Forensics Secur., 2015

Supervised Learning of Semantics-Preserving Hashing via Deep Neural Networks for Large-Scale Image Search.
CoRR, 2015

Rapid Clothing Retrieval via Deep Learning of Binary Codes and Hierarchical Search.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Location-aware object detection via coherent region grouping.
Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Deep learning of binary hash codes for fast image retrieval.
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015

Face Verification using LBP Feature and Clustering.
Proceedings of the VISAPP 2014, 2014

Object Detection for Neighbor Map Construction in an IoV System.
Proceedings of the 2014 IEEE International Conference on Internet of Things, 2014

Left-Luggage Detection from Finite-State-Machine Analysis in Static-Camera Videos.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Network-assisted device discovery for LTE-based D2D communication systems.
Proceedings of the IEEE International Conference on Communications, 2014

Optimization for Compressed Sensing: the Simplex Method and Kronecker Sparsification.
CoRR, 2013

Teleport: space navigation by detecting the self-motion of a mobile device.
Proceedings of the SIGGRAPH Asia 2013, 2013

Target-driven video summarization in a camera network.
Proceedings of the IEEE International Conference on Image Processing, 2013

Biologically inspired 3D trajectory prediction system using a moth flight-to-light tracking model.
Proceedings of the 2011 IEEE International Conference on Signal and Image Processing Applications, 2011
