About me: I work on deployable and interpretable solutions for data-intensive research problems. In my Ph.D. thesis research, I developed a human-in-the-loop framework and related methods that are used daily by public health stakeholders to identify and diagnose data events from large volumes of public health data.
My expertise is in applied AI, data science, and computer systems. I also have extensive experience working directly with line-level and aggregated public health data. Outside of that, I have experience across the stack (e.g., research in caching, building NiChrome) and in applications like urban data, protein folding, and LLMs (automated prompt engineering and fine-tuning).
[New] I'm starting to look for post-Ph.D. opportunities. Let's chat!
Thesis Proposal
December 8th, 2023
Public health data aggregators publish millions of data points across many data streams, like the daily number of influenza cases, hospitalizations, and deaths per county and state in the United States. So that their users, including public health experts, do not draw erroneous conclusions, these aggregators must identify noteworthy changes in their data, including those that result from data errors and outbreaks. However, given increasing data volumes and limited numbers of data reviewers, aggregators can only have some of their data inspected manually.

This thesis introduces a human-in-the-loop framework for public health data aggregators to inspect their data given their reviewer resources. Currently, an automated method ranks each new data point in context, with a correction for high-volume settings, so that reviewer attention is focused on the most noteworthy data. Still, reviewers need to investigate hundreds of data points from the ranked list before being able to understand the scope of the impacted data. Accordingly, we propose including an automated module that jointly identifies multiple data points that are noteworthy in context.
[Slides Here] [YouTube Playlist Here]
Creating an Automated Ideological Transformer Using Moral Reframing [GPT2]. Ananya Joshi*, Christiane Fellbaum, Michael Guerzhoy.
Cooperative Rule Caching for SDN Switches. Ori Rottenstreich, Ariel Kulik, Ananya Joshi, Jennifer Rexford, Gábor Rétvári, Daniel Menasché. 2020 IEEE 9th International Conference on Cloud Networking (CloudNet).
Data Plane Cooperative Caching With Dependencies. Ori Rottenstreich, Ariel Kulik, Ananya Joshi, Jennifer Rexford, Gábor Rétvári, Daniel Menasché. 2020 IEEE Transactions on Network and Service Management.
An open repository of real-time COVID-19 indicators. Alex Reinhart, Logan Brooks … Ananya Joshi …, Roni Rosenfeld, Ryan J. Tibshirani. Proceedings of the National Academy of Sciences.
NiChrome, Google: Intern project with Anthony Rolland, advised by Ron Minnich and Christopher Koch
COVIDCast Engineering, Delphi: Contributor
Research at MIT Lincoln Laboratory on Probability of Cloud Free Line of Sight
Urban Data Mining in Switzerland, ETH Zurich
Time Series Data: Clarifying Practical Approaches
This was an 80-minute active-learning lecture for students in Machine Learning in Practice. In a post-class anonymous survey (85% completion rate), 83% found the material to be at the right level (1 student found it too easy and 1 found it too difficult). Across all tasks identified in the learning objectives, students reported increasing their skills along a scale that ranged from being unfamiliar with the task, to being able to define the task, to being able to classify tasks that belong to the component, to knowing at least 2 ways to approach the task, to being familiar with some technical insights/nuances of the tasks.
AI for Social Good in Public Health, which used a follow-along Colab notebook to demonstrate statistical properties of data streams, like nonstationarity, noisiness, and weekday effects, and to walk through methods for processing data with these properties.
Please contact me if you would like to use this activity in your teaching.
Best Paper Award (SIGCSE '23) [Group Award]
Carnegie Mellon University Graduate Student Service Award 2022 [Group Award]
Mentored an undergraduate student on changepoint detection (Sep 2022-Aug 2023) [Blog].
Mentored students in Pittsburgh Girls-of-Steel weekly on data science basics (Sep 2022-Jun 2023).
Lead of SCS Ph.D. Wellness Group (Sep 2020-May 2022)
Co-Instructor for 15-996 (Spring 2022/23) [Article] & TA for Machine Learning in Practice (Spring 2023)
Co-Organizer of the Delta Workshop for drift phenomena at KDD '24! [Link]
Completed the Eberly Center's Future Faculty Program
Joined the Council of State and Territorial Epidemiologists' (CSTE) Peer-to-Peer Technical Assistance network as a mentor
Selected Courses: Grad AI (A+), Mobile & Pervasive Computing [IoT] (A+)
Joined an AI/Healthcare panel from the Coding School as part of a free machine learning course for high school students.
Presented a talk at the CMU Artificial Intelligence Seminar Series (Apr 2023).
Presented a poster at the first annual InsightNet meeting (Apr 2023).
Hobbies: Pittsburgh is a great place to explore new hobbies. I've enjoyed starting rock climbing and martial arts here! My views and opinions are my own!
April 30, 2024
This semester, I had the opportunity to put together two practical, active learning lectures related to AI x Public Health!
April 25, 2024
Last week, I attended the first annual InsightNet conference in North Carolina!
April 25, 2024
Interesting Papers, Talks, Videos Related to Public Health Monitoring
April 20, 2024
If you are a CS/AI/ML student interested either in applying your methods to public health or in developing methods for public health applications, these opportunities may be interesting to you:
April 10, 2024
Using Synthetic Persona Generation to Improve Prompt Responses
November 1, 2020
Epidemic forecasting, by its very nature, is a direct application of computer science, math, and biology.
September 24, 2020
Deploying the Shiftview repo on Google Cloud creates a website that highlights my work on ShiftView, an ideological transformer based on moral foundations theory. You can find a light demo of the work in the sidebar. Please be patient, as the translations require time to generate. Additionally, because some of the data is uncensored, there may be cursing and confusing speech. I am not responsible for the translations generated. In its current state, ShiftView is simply a guide for what is possible with automated moral reframing.
June 30, 2020
The following are the posts I could salvage from the bitsvbytes website. While this project was put on hiatus during COVID, I learned a lot about designing public health interventions using a data-driven approach!
June 30, 2020
In 2019-2020, I was a Fulbright research student in Singapore. I kept a weekly blog (formerly called bitsvbytes because I was using ML techniques to reduce mosquito bites) that turned into a daily blog at the start of COVID. Unfortunately, I kept my blog (with all my writing, pictures, models, and demos for Buzznet) on Heroku (with a Heroku database) with the hope of eventually restarting the service once I could shrink the footprint (and the cost) of hosting this resource.