About Me

I am a PhD candidate in Mechanical Engineering at the University of California, Berkeley, working with Prof. Masayoshi Tomizuka. My research focuses on building trustworthy planning algorithms for autonomous agents such as vehicles and robots. I received my Master of Science in 2022, during my PhD studies at UC Berkeley. Previously, I received my Bachelor of Engineering from Harbin Institute of Technology, where I worked with Prof. Huijun Gao and Prof. Weichao Sun on systems control and fault diagnosis.

Interests
  • Machine Learning
  • Reinforcement Learning
  • Control
  • Optimization
Education
  • Ph.D. in Mechanical Engineering, 2024 (Expected)

    University of California, Berkeley

  • M.S. in Engineering, 2022

    University of California, Berkeley

  • B.Eng. in Automation, 2019

    Harbin Institute of Technology, China

Work Experience

Interactive Behavior Modeling, [Honda Research Institute USA, Inc.](https://usa.honda-ri.com/)
Student Associate / Research Intern
Aug 2023 – Present San Jose, CA
  • Designed algorithms that improve the safe generalization of prediction models by incorporating the vehicles' behavior planning modules
  • Captured information that is invariant across traffic scenes in different partitions of the training datasets using unsupervised learning
  • Evaluated the proposed algorithm on large-scale datasets, e.g., the Waymo Open Motion Dataset and the Argoverse 2 Dataset, after preprocessing them into a unified format

[University of California, Berkeley](https://www.berkeley.edu)
Graduate Student Researcher / Instructor
Aug 2019 – Present Berkeley, CA
  • Conduct research on trustworthy and safe machine learning, especially reinforcement learning, and its application to real-world systems
  • Give lectures and lead discussion sessions in graduate-level courses

Machine Learning Infrastructure Team, [Google LLC](https://about.google/)
Software Engineer Intern
May 2023 – Aug 2023 Sunnyvale, CA
  • Designed and built a JAX-ONNX backend library: Jaxonnxruntime. GitHub: https://github.com/google/jaxonnxruntime
  • Passed more than 700 unit tests in both ONNX backend test suites and customized scenarios including Large Language Models
  • Converted the original PyTorch LLaMA model to JAX
  • Exported and served the converted models using the JAX ecosystem on Google Cloud internal server platforms
  • Benchmarked the inference of JAX Transformers on model servers with different parallel partition rules on GPUs and TPUs
  • Customized the library based on the needs of users at Google

Discover Ads Auction Team, [Google LLC](https://about.google/)
Software Engineer Intern
May 2022 – Aug 2022 Mountain View, CA
  • Designed and built an offline reinforcement learning infrastructure in TensorFlow for the Discover ads auction
  • Trained deep neural networks on real-world data to optimize long-term auction value and achieve a better trade-off between advertiser and user value
  • Conducted A/B testing of the trained algorithm on production traffic and refined the models accordingly

Projects

Trustworthy Reinforcement Learning Algorithms for Real-World Application
  • Designed a guided online distillation algorithm for safe reinforcement learning (RL): extracted skills from human demonstrations with a Decision Transformer, then distilled them into a lightweight network during online interactive fine-tuning to enhance safety

  • Proposed a metric to quantify interaction intensity in multi-agent RL, which guides resource allocation for training diverse policies under a constrained budget

  • Develop a simulator based on a generative (diffusion) model that produces human-like interactions, can be trained concurrently with planning modules, and accepts their feedback to improve sample efficiency and final safety performance

Resulting Publications:

Machine Learning Framework and Algorithms Design for Decision Making
  • Designed a spatio-temporal graph dual-attention network for multi-agent prediction, considering context information, trajectories of interactive agents, and physical feasibility constraints

  • Proposed a Pessimistic Offline Reinforcement Learning algorithm, which mitigates the distributional shift problem by explicitly handling out-of-distribution states

  • Built a hierarchical planning framework for long-horizon tasks, in which a high-level module reasons about long-term strategies and plans sub-goals, and low-level goal-conditioned offline reinforcement learning algorithms achieve the sub-tasks

Resulting Publications:

Interaction-Aware Behavior Planning for Autonomous Vehicles
  • Built an interaction-aware behavior planning algorithm, which predicts the cooperativeness of surrounding vehicles and solves the resulting POMDP with Monte Carlo tree search (MCTS)

  • Proposed a general hierarchical planning framework, which safely handles various complex urban traffic conditions

  • Built a simulator that reproduces real traffic scenarios, in which the proposed algorithms achieved both a high completion rate and a low collision rate

Resulting Publications:

Machine Learning Based Fault Diagnosis for Industrial Processes
  • Built an integrated SVM model that uses KPCA to extract and compress information and a GA to optimize the model parameters

  • Evaluated the algorithm on the Tennessee Eastman process benchmark. Ablation studies showed that KPCA and GA both improve the performance of the SVM

Resulting Publications:

Publications

(2023). Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration. In arXiv preprint arXiv:2309.09408.


(2023). Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization. In arXiv preprint arXiv:2310.07218.


(2022). Hierarchical Planning Through Goal-Conditioned Offline Reinforcement Learning. In IEEE Robotics and Automation Letters (RA-L).


(2022). Dealing with the Unknown: Pessimistic Offline Reinforcement Learning. In Conference on Robot Learning (CoRL).


(2021). Spatio-temporal graph dual-attention network for multi-agent prediction and tracking. In IEEE Transactions on Intelligent Transportation Systems.


(2021). A safe hierarchical planning framework for complex driving scenarios based on reinforcement learning. In IEEE International Conference on Robotics and Automation (ICRA).


(2020). Interaction-aware behavior planning for autonomous vehicles validated with real traffic data. In Dynamic Systems and Control Conference (DSCC).


(2019). A Novel Integrated SVM for Fault Diagnosis Using KPCA and GA. In Journal of Physics: Conference Series.
