Graduate Defense: Graham Annett

August 14 @ 3:30 pm - 5:30 pm MDT

Dissertation Defense

Dissertation Information

Title: Multimodal Deep Learning Approaches For Web Environments

Program: Doctor of Philosophy in Computing

Advisor: Dr. Timothy Andersen, Computer Science

Committee Members: Dr. Hoda Mehrpouyan, Computer Science; Dr. Casey Kennington, Computer Science; and Dr. Grady Wright, Mathematics

Abstract

This dissertation presents a framework for the development of deep learning models tailored for dynamic web environments, leveraging generalized pre-trained multimodal models. A task generation framework, applied to multiple web datasets, is introduced, facilitating instruction finetuning of models to execute multi-action web workflows. This approach improves the adaptability of pre-trained models to a diverse range of novel web tasks, which is crucial for the reliable operation of web agents.

Further, the work explores model interpretability, providing insights into the operational processes of these models. A novel encoding schema extending the Decision Transformer (Chen et al., 2021) is proposed, which enhances the adaptability of these models for downstream tasks, thereby broadening their practical applicability. These enhancements not only improve the functionality of models as agents but also ensure their reliability and transparency, essential for their deployment in real-world applications.

This work showcases methods and tools for the aggregation and curation of complex web trajectories, refining the software foundations of open-ended tasks in web environments. The findings indicate that deep learning models are increasingly capable of practical deployment, enabling agents to facilitate effective and efficient interactions across a variety of web-based tasks. This work contributes to the field by detailing the implementation of advanced learning models as agents within the web, advancing the deployment and utilization of AI in complex digital landscapes.