
Riteng (Gavin) Zhang

I am an undergraduate student at Boston College, passionate about researching deep neural network interpretability and building interpretable AI systems. I am majoring in Mathematics and Computer Science, complemented by a minor in Philosophy.


About Me

Research Interests ✏️

Branch Specialization Analysis Project 🌳


The Branch Specialization [3] Analysis Project is a self-directed project that will proceed in several stages. It is currently in its early stages, focused on providing baseline and evaluation metrics for branch specialization consistency and on exploring how branch specialization can combine the functional and architectural modularity of deep learning models; a rough sketch of such a consistency metric appears after the list below. Understanding and analyzing branch specialization is crucial for several reasons:

  - It aids in creating more interpretable models, as it becomes clearer what roles different parts of the network are playing.

  - It can lead to more efficient network designs, where unnecessary or redundant branches can be identified and pruned without loss of overall functionality.

  - It provides insights that can be utilized in neural architecture search (NAS) to design optimized and task-specific models.
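As a rough illustration (not part of the project's actual codebase), the sketch below shows one possible consistency metric: given per-branch feature attribution vectors from several independently trained runs, it reports the mean pairwise cosine similarity of matching branches across runs. The function name, array shapes, and the assumption that branches are already aligned across runs are illustrative choices for this sketch.

```python
# Illustrative sketch only: a possible branch-specialization consistency metric.
# Assumes per-branch attribution vectors have already been computed for several
# independently trained runs (different seeds), and that branch indices are
# already aligned across runs (in practice, branches may need to be matched first).
import numpy as np

def branch_consistency(attributions: np.ndarray) -> float:
    """attributions: shape (n_runs, n_branches, n_features).
    Returns mean pairwise cosine similarity of matching branches across runs
    (values near 1.0 suggest consistent specialization across seeds)."""
    n_runs, n_branches, _ = attributions.shape
    # L2-normalize each attribution vector so dot products become cosines.
    unit = attributions / (np.linalg.norm(attributions, axis=-1, keepdims=True) + 1e-12)
    sims = [
        float(unit[i, b] @ unit[j, b])
        for b in range(n_branches)
        for i in range(n_runs)
        for j in range(i + 1, n_runs)
    ]
    return float(np.mean(sims))

# Random attributions give a near-zero score, a natural chance-level baseline.
rng = np.random.default_rng(0)
print(branch_consistency(rng.normal(size=(4, 3, 128))))
```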


Papers under this project:


Analyzing Variations in Branch Attribution in Non-monolithic Models (advised by Professor Sergio Alvarez)

This paper investigates the variability of layer-wise feature attribution across branches in different branched neural networks (monolithic vs. inception-like designs). Even with identical datasets, model architectures, and hyperparameters, training from different initial parameters leads to differences in neuron roles and contributions. Our focus is on determining whether branched models with a monolithic design exhibit higher variation in branch attribution than inception-like or other non-monolithic branched neural networks.
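To make "branch attribution" concrete, here is a minimal, hypothetical sketch of one way it could be estimated by ablation: zero out a branch's output and measure how much the model's output changes. The two-branch toy architecture and the norm-based attribution below are assumptions made for illustration, not the paper's actual method.

```python
# Hypothetical sketch: ablation-based branch attribution for a toy two-branch
# ("inception-like") module whose branch outputs are summed. Not the paper's method.
import torch
import torch.nn as nn

class TwoBranchNet(nn.Module):
    def __init__(self, d_in=16, d_hidden=32, d_out=10):
        super().__init__()
        self.branch_a = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_out))
        self.branch_b = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_out))

    def forward(self, x, ablate=None):
        a = self.branch_a(x)
        b = self.branch_b(x)
        if ablate == "a":
            a = torch.zeros_like(a)  # silence branch a
        if ablate == "b":
            b = torch.zeros_like(b)  # silence branch b
        return a + b

def branch_attribution(model, x):
    """Attribute each branch by how much the output moves when it is ablated."""
    with torch.no_grad():
        full = model(x)
        return {name: (full - model(x, ablate=name)).norm().item() for name in ("a", "b")}

# Repeating this over models trained from different seeds, then taking the
# variance of each branch's attribution, gives the kind of variation studied here.
x = torch.randn(8, 16)
print(branch_attribution(TwoBranchNet(), x))
```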


Other Research 📖


Interpretability Formalizing and Automating Framework

Inspired by “Post-hoc Interpretability for Neural NLP: A Survey” [1], this project encodes interpretability methods into a structured, formal framework of Python classes. It seeks to establish a unified representation that automatically captures the essence (functionality and application) of deep learning interpretability methods across various dimensions. It categorizes methods by characteristics such as global vs. local, as previous surveys have done, and evaluates properties such as complexity and fidelity. The project also aims to generate new methods through approaches such as generative sequence models (e.g., tree RNNs).
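As a small sketch of what encoding interpretability methods as Python classes could look like, the snippet below defines an illustrative schema and registry. The class names, fields, and the single example entry are assumptions made for this sketch, not the framework's actual design.

```python
# Illustrative schema only: one way interpretability methods could be encoded
# as structured Python objects. Field names and the example entry are assumptions.
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable

class Scope(Enum):
    LOCAL = "local"    # explains individual predictions
    GLOBAL = "global"  # explains overall model behavior

@dataclass
class InterpretabilityMethod:
    name: str
    scope: Scope
    post_hoc: bool               # applied after training vs. built into the model
    complexity: str              # rough cost note, e.g. "one backward pass per input"
    fidelity_notes: str          # how faithfully it reflects the underlying model
    explain: Callable[..., Any]  # callable implementing the method

def saliency(model, x):
    """Placeholder for a gradient-based explanation routine."""
    ...

METHODS = [
    InterpretabilityMethod(
        name="Input gradients (saliency)",
        scope=Scope.LOCAL,
        post_hoc=True,
        complexity="one backward pass per input",
        fidelity_notes="first-order approximation of model sensitivity",
        explain=saliency,
    ),
]

# A unified registry makes methods filterable and comparable programmatically,
# e.g. selecting all local, post-hoc methods:
print([m.name for m in METHODS if m.scope is Scope.LOCAL and m.post_hoc])
```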



Evaluation of LLM Zero to Few-Shot Ability when Expecting Formatted Output

These experiments systematically evaluate GPT models under varying numbers of in-context demonstrations (cf. [2]), focusing on their ability to produce correctly formatted output and accurate answers. The collected metrics help assess the models’ capabilities and limitations across diverse scenarios and formatting requirements.
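A minimal sketch of the kind of evaluation loop such experiments use is shown below: it checks both whether the model's output parses in the requested format (here, JSON with an "answer" field) and whether the answer is correct, across different shot counts. `query_model` is a hypothetical stand-in for a real GPT API call, and the prompt template is an assumption for this sketch.

```python
# Minimal sketch of an evaluation loop for formatted-output experiments.
# `query_model` is a hypothetical placeholder for an actual GPT API call;
# the JSON format and prompt template are illustrative assumptions.
import json

def query_model(prompt: str) -> str:
    """Stand-in for a real LLM call (e.g., a chat-completion request)."""
    return '{"answer": "42"}'

def evaluate(demonstrations, test_items, n_shots):
    """Measure format compliance and accuracy with n_shots in-context examples."""
    shots = "\n".join(
        f'Q: {q}\nA: {json.dumps({"answer": a})}' for q, a in demonstrations[:n_shots]
    )
    format_ok = correct = 0
    for question, gold in test_items:
        prompt = f'{shots}\nAnswer in JSON as {{"answer": ...}}.\nQ: {question}\nA:'
        raw = query_model(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output counts against format compliance
        format_ok += 1
        correct += int(str(parsed.get("answer")) == str(gold))
    n = len(test_items)
    return {"n_shots": n_shots, "format_rate": format_ok / n, "accuracy": correct / n}

demos = [("What is 2 + 2?", "4"), ("What is the capital of France?", "Paris")]
tests = [("What is 6 * 7?", "42")]
for k in (0, 1, 2):  # zero-shot through few-shot
    print(evaluate(demos, tests, k))
```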


Publications 📄


Leveraging LLMs and MLPs in Designing a Computer Science Placement Test System (in Proceedings of CSCI 2023; co-authored with Yi Li and Angela Qu; advised by Professor Maira Samary)

Different types of models, including LLMs, are utilized to create an automated process for conducting a Computer Science (CS) placement test in a step-by-step manner. The framework’s limitations and potential are discussed.
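Purely as a hypothetical illustration of a step-by-step pipeline in this spirit (not the published system's design), the sketch below has an LLM-style scorer grade per-topic answers and a small MLP map topic scores to a placement level; every name, topic, and component here is an assumption.

```python
# Hypothetical pipeline sketch, not the published system: an LLM-style scorer
# grades free-form answers per topic, and a tiny MLP maps topic scores to a
# placement level. All names, topics, and weights are illustrative.
import numpy as np

TOPICS = ["loops", "recursion", "data_structures"]

def score_answer(topic: str, answer: str) -> float:
    """Placeholder for an LLM-based rubric scorer returning a score in [0, 1]."""
    return 0.5

class TinyMLP:
    """Illustrative 3-input MLP that outputs one of three placement levels."""
    def __init__(self, seed: int = 0):
        rng = np.random.default_rng(seed)  # untrained random weights, demo only
        self.w1 = rng.normal(size=(len(TOPICS), 8))
        self.w2 = rng.normal(size=(8, 3))

    def predict(self, scores):
        hidden = np.maximum(0, np.asarray(scores) @ self.w1)  # ReLU layer
        return int(np.argmax(hidden @ self.w2))               # placement class

answers = {topic: "student answer..." for topic in TOPICS}
scores = [score_answer(topic, answers[topic]) for topic in TOPICS]
print("recommended placement level:", TinyMLP().predict(scores))
```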


Startup - Blossoms ai 🤖➕🎓


As CAIO and co-founder of BlossomsAI, I am deeply committed to transforming education through the power of artificial intelligence. At BlossomsAI, we believe that a personalized, individualized, and customized approach is key to unlocking every student’s full potential. Our goal is to provide teachers with the tools and resources they need to save time, increase efficiency, and focus on nurturing the unique abilities and interests of each student.


Travel ✈️

ACL 2023

NeurIPS 2023

Talk

Understanding LSTM Networks, Boston College Experimental Math & ML lab, Nov 2023

References

[1] Madsen, Andreas, Siva Reddy, and Sarath Chandar. “Post-hoc Interpretability for Neural NLP: A Survey.” ACM Computing Surveys 55, no. 8 (2022): 1-42.

[2] Min, Sewon, et al. “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?” arXiv preprint arXiv:2202.12837 (2022).

[3] Voss, Chelsea, et al. “Branch Specialization.” Distill (2021).