I am an undergraduate student at Boston College, passionate about researching deep neural network interpretability and building interpretable AI systems. I am majoring in Mathematics and Computer Science, complemented by a minor in Philosophy.
🔈 Research Assistant for Professor Emily Prud’hommeaux on topics including ASR for under-resourced languages and language model evaluation for neuroatypical language.
💬 Teaching assistant for CSCI3399 Vision and Learning and CSCI3345 Machine Learning at Boston College
🌻 Co-founder and machine learning engineer of BlossomsAI.
📫 Contact Information: zhangcoj@bc.edu
📄 Here’s my CV.
The Branch Specialization [3] Analysis Project is my own multi-stage project. It is currently in its very early stages, focusing on providing baselines and evaluation metrics for branch specialization consistency and on exploring how branch specialization could combine the functional and architectural modularity of deep learning models. Understanding and analyzing branch specialization is crucial for several reasons:
- It aids in creating more interpretable models, as it becomes clearer what roles different parts of the network are playing.
- It can lead to more efficient network designs, where unnecessary or redundant branches can be identified and pruned without loss of overall functionality.
- It provides insights that can be utilized in neural architecture search (NAS) to design optimized and task-specific models.
Papers under this project:
Analyzing Variations in Branch Attribution in Non-monolithic Models (advised by Professor Sergio Alvarez)
This paper investigates the variability in layer feature attribution across branches in different branched neural networks (monolithic design vs. inception-like). Even with identical datasets, model architectures, and hyperparameters, training from different initial parameters leads to differences in neuron roles and contributions. We focus on determining whether branched models with a monolithic design show higher variation in branch attribution than inception-like models or non-monolithic branched neural networks.
Interpretability Formalizing and Automating Framework
Inspired by “Post-hoc Interpretability for Neural NLP: A Survey” [1], this new project aims to encode interpretability methods into a structured, formal framework using Python classes. It seeks to establish a unified representation that automatically captures the essence (functionality and application) of deep learning interpretability methods across various dimensions. It categorizes interpretability methods by characteristics such as global vs. local, as previous surveys have done, and evaluates their complexity, fidelity, etc. The project also hopes to generate new methods through innovative approaches, such as generative sequence models (like tree RNNs).
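As a rough illustration of what encoding interpretability methods as Python classes could look like, here is a minimal sketch. The class name `InterpretabilityMethod`, its fields, the helper `by_scope`, and the two example entries are all hypothetical and chosen for illustration only; they are not the project's actual design.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """Whether a method explains the whole model or a single prediction."""
    GLOBAL = "global"
    LOCAL = "local"


@dataclass(frozen=True)
class InterpretabilityMethod:
    """Hypothetical record describing one interpretability method."""
    name: str
    scope: Scope
    complexity: str     # free-text placeholder for a cost description
    fidelity_note: str  # free-text placeholder for how faithful it is


def by_scope(methods, scope):
    """Return the methods whose scope matches the requested one."""
    return [m for m in methods if m.scope is scope]


# Illustrative entries only; the descriptions are assumptions.
methods = [
    InterpretabilityMethod(
        name="input gradient saliency",
        scope=Scope.LOCAL,
        complexity="one backward pass per input",
        fidelity_note="first-order approximation around the input",
    ),
    InterpretabilityMethod(
        name="probing classifier",
        scope=Scope.GLOBAL,
        complexity="trains an auxiliary classifier",
        fidelity_note="measures what is decodable, not what is used",
    ),
]

local_names = [m.name for m in by_scope(methods, Scope.LOCAL)]
```

A flat record like this makes it easy to filter or compare methods along one dimension at a time; a richer framework would likely need structured fields instead of free text.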
Evaluation of LLM Zero to Few-Shot Ability when Expecting Formatted Output
These experiments are designed to systematically evaluate the performance of GPT models under various conditions, focusing on their ability to generate formatted output and provide accurate answers. The collected metrics aid in assessing the models’ capabilities and limitations in handling diverse scenarios and formatting requirements.
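To make the two kinds of metric concrete, here is a minimal sketch that scores responses for format compliance and answer accuracy separately. It assumes, purely for illustration, that the expected format is a JSON object with an `answer` key; the actual formats, prompts, and metrics used in the experiments are not specified here, and the function names `score_response` and `summarize` are hypothetical.

```python
import json


def score_response(raw, expected_answer):
    """Return (is_well_formatted, is_correct) for one raw model response.

    Assumed format: a JSON object containing an "answer" key.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False, False
    if not isinstance(parsed, dict) or "answer" not in parsed:
        return False, False
    predicted = str(parsed["answer"]).strip().lower()
    return True, predicted == expected_answer.strip().lower()


def summarize(responses, expected):
    """Aggregate format compliance and accuracy over paired lists."""
    scores = [score_response(r, e) for r, e in zip(responses, expected)]
    n = len(scores)
    if n == 0:
        return {"format_rate": 0.0, "accuracy": 0.0}
    return {
        "format_rate": sum(ok for ok, _ in scores) / n,
        "accuracy": sum(correct for _, correct in scores) / n,
    }


# Made-up responses for illustration, not experimental data.
responses = ['{"answer": "Paris"}', "The answer is Paris.", '{"answer": "Lyon"}']
expected = ["Paris", "Paris", "Paris"]
summary = summarize(responses, expected)
```

In this sketch a response that is not well formatted is also counted as incorrect, even if the right answer appears in the free text. Whether to score such responses separately is a design choice the experiments may handle differently.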
Leveraging LLMs and MLPs in Designing a Computer Science Placement Test System (In Proceedings of CSCI 2023, coauthored with Yi LI and Angela Qu, advised by Professor Maira Samary)
Different types of models, including LLMs, are utilized to create an automated process for conducting a Computer Science (CS) placement test in a step-by-step manner. The framework’s limitations and potential are discussed.
As CAIO and co-founder of BlossomsAI, I am deeply committed to transforming education through the power of artificial intelligence. At BlossomsAI, we believe that a personalized, individualized, and customized approach is key to unlocking every student’s full potential. Our goal is to provide teachers with the tools and resources they need to save time, increase efficiency, and focus on nurturing the unique abilities and interests of each student.
ACL 2023
NeurIPS 2023
Understanding LSTM Networks, Boston College Experimental Math & ML lab, Nov 2023
[1] Madsen, Andreas, Siva Reddy, and Sarath Chandar. “Post-hoc Interpretability for Neural NLP: A Survey.” ACM Computing Surveys 55, no. 8 (2022): 1-42.
[2] Min, Sewon, et al. “Rethinking the role of demonstrations: What makes in-context learning work?” arXiv preprint arXiv:2202.12837 (2022).
[3] Voss, Chelsea, et al. “Branch Specialization.” Distill (2021).