The CAIB Artificial Intelligence Research Student Internship Program

The application for Spring 2025 is now closed.

What is the CAIB Artificial Intelligence Research Student Internship Program?

The program trains and develops the next generation of leaders in various fields related to AI, its governance, and its use for research purposes. The program offers research opportunities and diverse training on artificial intelligence that build on the strengths of graduate and undergraduate education at the University of Maryland. A key aspect of the internship is the work carried out closely with a supervisor, complemented by a bi-monthly workshop series that offers training on essential AI skills and opportunities to exchange and discuss progress.

What do interns do?

Dedicate 10 hours per week to a research project of relevance to artificial intelligence under the supervision of faculty at UMD;
Develop a work plan in collaboration with a supervisor that outlines the research activities, deliverables, and contributions for the internship;
Attend bi-monthly seminars and training sessions;
Prepare a report and present it at the end of the spring semester.

What do interns get?

Hands-on research experience by working on a cutting-edge research project at UMD;
Opportunity to present their work at the Smith School of Business and potentially beyond;
Recognition for their work, including a $2500 stipend upon successful completion of the program;
A professional and academic network built through work with a supervisor, seminars and sessions, and more.

Eligibility

To be eligible, you must be enrolled in an undergraduate or graduate program at UMD for the duration of the internship. The program is open to undergraduate and graduate students from any level and all disciplines. Students must show a keen interest in research and academic exploration.

How to Apply

The application form contains the following sections:

Applicant background and academic information.
Recommendation: Contact information for at least one academic or employment reference.
Project selection: Applicants may select up to two projects and explain their interests and qualifications for each project.
Program fit: Applicants are asked to describe their academic and learning objectives, time management skills, and overall career goals.

Supporting Documents

A current CV.
Electronic copies of transcripts.
Proof of enrollment.
Supplementary documents, if applicable.

Selection and Review Process

Applicants will be evaluated on whether they submit a complete application on time, their fit for the projects (both the objective requirements and the requirements outlined by faculty for each project), and how their participation in the program will advance their own future endeavors. Students currently in the Interdisciplinary Business Honors (IBH) Program will receive preference in the review process.

All completed applications will be reviewed by the program lead (Bertrand Stoffel) and faculty supervisors. Applicants may be shortlisted for an interview.

Contact Bertrand Stoffel, Internship Program Lead, at bstoffel@umd.edu with any questions.

Spring 2025 Projects

Project lead and authors: Professor Margret Bjarnadottir, with Professors David Anderson at Villanova University, David Rea at Lehigh University, and David Ross at University of Florida.

Project description: The goal of this project is to answer important questions about the US labor market. Key questions include the impact of laws on the US labor market, the feasibility of providing key information about skills and salaries to job seekers, and the pricing of skills and job requirements. To achieve our goals, we plan to extract key information from 135 million LinkedIn job listings from 2020 to 2023. Doing this at scale is a natural application of AI (e.g., large language models). With skills and salaries successfully extracted, we can develop appropriate analytical approaches to summarize the information in a manner that is useful to job seekers, employees, employers, and policy makers.

Learning opportunities: The intern will apply ChatGPT at scale to extract important information from job postings (through API calls) and use modern topic modeling approaches (e.g., BERTopic) to understand the contents and trends of job listings.

Requirements: The ideal AI intern is self-motivated, has strong attention to detail and a curious mindset, and is able to work independently. Ideally, the AI intern has some experience with coding and script execution.

Project lead and authors: Jessica Clark, Aseem Baji

Project description: Our goal is to analyze the design of AI systems, particularly in the pre-modeling phases. We are particularly interested in two interrelated questions:

How to optimally formulate a predictive problem in a way that is aligned with long-term business outcomes. Specifically, many problems can be formulated either as regression (predicting numerical values) or classification (predicting categorical values).
Quantifying the impact of data preprocessing on AI model performance and making data-driven recommendations for defaults, depending on underlying data characteristics and modeling algorithm.

To answer both questions, we have been running large-scale experiments on UMD’s HPC system. An intern would continue managing these experiments, as well as analyzing the results and implementing code to conduct new experiments. We also need help building out a detailed literature review on designing predictive systems in the business literature and beyond.
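As a rough illustration of the first question, the same numeric outcome can be framed either as a regression target or as a classification target. The sketch below is not the project's code: the data, the threshold of 50, and all function names are hypothetical, and a one-feature least-squares line stands in for whatever models the experiments actually use.

```python
from statistics import mean

# Hypothetical records: (feature, numeric outcome), e.g. prior visits and next-month spend.
records = [(1, 20.0), (2, 35.0), (3, 55.0), (4, 80.0), (5, 120.0)]
THRESHOLD = 50.0  # assumed business cutoff for a "high-value" outcome


def classification_targets(rows, threshold):
    """Classification formulation: label each row 1 if its outcome meets the cutoff."""
    return [int(y >= threshold) for _, y in rows]


def fit_line(rows):
    """Regression formulation: ordinary least squares with a single feature."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in rows) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(records)
predicted = [slope * x + intercept for x, _ in records]

# Turn the regression predictions into decisions by applying the same cutoff,
# then compare them with the labels from the classification formulation.
decisions = [int(p >= THRESHOLD) for p in predicted]
labels = classification_targets(records, THRESHOLD)
agreement = sum(d == l for d, l in zip(decisions, labels)) / len(labels)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, agreement={agreement:.0%}")
```

On this toy data the two formulations lead to the same decisions. The research question is how they compare on real business data, where they may diverge.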

Learning opportunities: The intern will learn about state-of-the-art machine learning techniques for real business problems and technical skills in using UMD’s high performance computing system. They will also get experience in research activities such as analyzing results from large-scale experiments, telling stories with data through charts and tables, and reviewing literature.

Requirements: Solid knowledge of statistics, including hypothesis testing. The intern should be interested in ML and AI and know the vocabulary and main techniques. Solid programming experience, especially in Python and ideally also in R. Experience working in a Unix command-line interface would be a plus, but this can be learned during the internship.

Project lead and authors: Prof. Cody Buntain, colleagues in the Center for AI, Data, and Conflict (CAIDAC)

Project description: The proliferation of social media represents a transformative opportunity for conflict studies and for tracking the proliferation and use of weaponry, as conflicts are increasingly documented in these online spaces. At the same time, the scale and types of data available are problematic for traditional open-source intelligence. This project focuses on identifying specific weapon systems used by armed groups in the Ukraine War, Syrian civil war, fall of Afghanistan, and in the conflict in Israel. Because the large scale of social media makes manual assessment difficult, this project will develop new AI-based methods to identify weapons embedded in images shared on social media and to examine how the resulting collection of military-relevant images and their post times interact with the offline, real-world conflict. We will investigate how these images are used, the response these images receive, and the degree to which these images help us understand kinetic engagements in these conflicts.

Learning opportunities: The intern will gain experience in multi-modal machine learning methods, particularly in object-detection and object-recognition models. This work will include approaches for data collection and annotation for a specific AI task. These methods will extend beyond text-based analytics and provide the intern with new tools for understanding online audience engagement.

Requirements: The intern should have some skill and experience with Python and machine learning methods as well as a basic literacy with Unix/Linux tools to access computational resources.

Project lead and authors: Lauren Rhue and Jessica Clark

Project description: This project explores how AI/ML can reduce disparities in crowdfunding platforms. Employees at crowdfunding platforms select which projects to promote on their website. Promoted projects are twice as likely to be successful as regular projects, so that selection process influences which projects are ultimately funded. Using more than 100,000 projects from the crowdfunding platform Kickstarter.com, we adapt existing fairness-aware AI/ML models to explore alternative ways of identifying which projects to promote. We find that both accuracy-based models and fairness-aware models identify a more diverse set of projects for promotion than those selected by employees. We plan to use LLMs to evaluate the results from our AI/ML models and simulate whether our recommendations would have led some projects to be successful if they had been selected for promotion. With this simulation, we also consider the business impact of our AI/ML model on the platform.

Learning Opportunities: Students will process the Kickstarter dataset as inputs to LLM prompts. Using the GPT-4o or Claude API, students will create prompts according to an experimental design and record the results of those prompts in an analysis dataset.

Requirements: Students must know Python. Students should feel comfortable working with .csv files and the pandas library. Students will process data, so any data wrangling skills are a plus. Previous knowledge about LLMs is not required.
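To give a sense of the workflow described above, the following is a minimal sketch of turning rows of a .csv file into prompts under a simple factorial design. Everything in it is an assumption made for illustration: the column names, the sample rows, the two factors, and the prompt wording are hypothetical, and no LLM API is called. It uses the standard-library csv module so that it runs without dependencies; the same steps could be written with pandas.

```python
import csv
import io
from itertools import product

# Hypothetical sample standing in for the Kickstarter data; column names are assumed.
SAMPLE_CSV = """project_id,title,category,goal_usd
101,Solar Backpack,Technology,15000
102,Community Cookbook,Publishing,4000
"""

# Hypothetical experimental factors; each combination is one condition.
FACTORS = {
    "persona": ["platform employee", "first-time backer"],
    "detail": ["title only", "title and goal"],
}


def build_prompt(row, persona, detail):
    """Render one prompt for one project under one experimental condition."""
    description = f'"{row["title"]}" ({row["category"]})'
    if detail == "title and goal":
        description += f' with a funding goal of ${int(row["goal_usd"]):,}'
    return (
        f"You are a {persona}. Would you promote the project {description}? "
        "Answer yes or no."
    )


def build_design(csv_text):
    """Cross every project with every condition; responses are filled in later."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conditions = list(product(*FACTORS.values()))
    design = []
    for row, (persona, detail) in product(rows, conditions):
        design.append(
            {
                "project_id": row["project_id"],
                "persona": persona,
                "detail": detail,
                "prompt": build_prompt(row, persona, detail),
                "response": None,  # placeholder for the model's answer
            }
        )
    return design


design = build_design(SAMPLE_CSV)
print(len(design), "prompts")
print(design[1]["prompt"])
```

Keeping the condition labels alongside each prompt means the recorded responses can later be grouped by factor when building the analysis dataset.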

Project Lead and Author: Wendy R. Sanhai, Ph.D., MBA

Project Description: Drug development is a time-intensive, expensive endeavor. Pharmaceutical and biotech companies need to optimize clinical trials and drug development processes so that patients can benefit from these therapies in a cost-effective and timely manner. AI is a powerful tool that can be leveraged to reduce timelines, identify potential risks and develop mitigation strategies during these processes. Under the guidance of the Project Lead, the intern will conduct research and develop a report identifying recommendations for leveraging AI across clinical trials and drug development processes to include:

Data integration: Ways to integrate data across clinical trials, literature reviews, and regulatory documentation into a structured format.
Challenge Prediction: Identify challenges and mitigation strategies in clinical trials and drug development.
Risk Reduction: Proactively identify and reduce the risk of project failure.
Improve Operations: Accelerate time to market by streamlining the drug development process, leading to faster access to innovative therapies.

Learning Opportunities: By the end of this project, the intern will develop research and analytical skills and understand ways by which AI can be leveraged to potentially address challenges faced by life science companies in the business of clinical trials and drug development.

Requirements: The intern must possess good verbal and written communication skills, be proficient in Microsoft Office Suite, and have a basic understanding of medical, business, and science terminology. The intern must be able to follow through on instructions provided by the project lead and to work in a team environment.

Project Lead and Author: Wendy R. Sanhai, Ph.D., MBA

Project Description: Medical device development is very different from drug development in many ways. The timelines, regulatory requirements, and costs vary significantly from those of other medical products. Furthermore, the lifecycle of medical devices is much shorter than that of drugs and requires more frequent updates and improvements. AI can be used to optimize development cycles and to reduce costs and risks to patients by incorporating real-world data into subsequent iterations of these devices in the following ways:

Personalized Device Design: AI algorithms can create customized device designs based on individual patient characteristics, improving treatment efficacy and reducing side effects.
Accelerated Development: AI can streamline the development process by automating tasks like data analysis, simulation, and regulatory compliance, leading to faster time-to-market.
Continuous Improvement: AI-powered monitoring systems can collect real-world data on device performance and patient outcomes, enabling continuous improvement and optimization.

Learning opportunities: By the end of this project, the intern will understand ways in which AI can be leveraged to improve medical device development and will develop research techniques and critical analytical and communication skills that can be applied to other fields and throughout their career.

Project lead and authors: Prof. Cody Buntain, lead, and Do Won Kim (PhD student)

Project description: This project will examine how emotional cues in visual and multi-modal content drive user engagement in online spaces. Drawing on studies of emotion in psychology and political mobilization, we will develop and test several hypotheses around which emotions drive this engagement and how this engagement differs across modalities (i.e., text versus imagery). Gathering content from multiple online spaces (at least Twitter, Facebook, Reddit, and Instagram) and training new methods for emotion recognition across modalities, we will develop tools to test these hypotheses on real-world social media data. Results from this project will be submitted to a high-quality academic venue, and resulting models will be released as open-source.

Learning opportunities: The intern will gain experience in multi-modal machine learning methods as well as audience analysis and social media analytics. These methods will extend beyond simple text-based analytics and provide the intern with new tools for understanding online audience engagement.

Requirements: The intern should have some skill and experience with Python and machine learning methods as well as a basic literacy with Unix/Linux tools to access computational resources.

Project lead and authors: Profs. Cody Buntain, Keng-Chi Chang (UCSD and Dartmouth)

Project description: Recent advances in computer vision allow for rapid image characterization for tasks like object detection, but the validity of these models suffers in subjective tasks. Multi-modal models may resolve such issues by integrating additional textual and social context, which may enable analysts to match visual media to the related narratives and social issues the media is intended to activate (e.g., social narratives, “dog whistle” imagery, etc.), and new generative AI methods may support fine-grained content creation for specific audience segments. This project will develop new generative-AI tools that combine insights from audience analysis with user-provided descriptions to craft new content that includes audience-specific visual concepts, symbols, and narratives. We will then assess how these AI-human-team-generated visuals compare to content gathered via digital-trace data among the target audiences.

Learning opportunities: The intern will extend extant models for multi-modal AI content generation to include new visual concepts extracted from audience-specific digital-trace data. This work will include programming and training conditional AI generation models, producing new content with these models, and evaluating the output with audience experts.

Requirements: The intern should have some skill and experience with Python and machine learning methods as well as a basic literacy with Unix/Linux tools to access computational resources.