About me: I work on deployable and interpretable solutions for data-intensive research problems. In my Ph.D. thesis research, I developed a human-in-the-loop framework and related methods that are used daily by public health stakeholders to identify and diagnose data events from large volumes of public health data.
My expertise is in applied AI, data science, and computer systems. I also have extensive experience working directly with line-level and aggregated public health data. Outside of that, I have experience across the stack (e.g. research in caching, building NiChrome) and in applications like urban data, protein folding, and LLMs (automated prompt engineering and fine tuning).
[New] I'm starting to look for post-Ph.D. opportunities. Let's chat!
Thesis Proposal
December 8th, 2023
Growing volumes of public health-related data render existing techniques for identifying changes in disease dynamics and data quality assurance, designed for smaller data volumes, obsolete. Accordingly, my thesis presents a practical framework for experts to monitor large-scale aggregate public health data. Our novel methods, which are simple, scalable, and shown to be accurate in real-world settings, identify data corresponding to quality issues or changes in disease dynamics. Coupled with our custom user interfaces, these methods have led to a 53x increase in monitoring efficiency for data experts at the Delphi Group at Carnegie Mellon University, who can now detect about 200 noteworthy data issues from 15 million new data points each week. As a final step, we are building communication pipelines to disseminate this expert-analyzed data to public health stakeholders, including state and local health departments, thereby enabling actionable intelligence from Delphi's vast volume of public health data.
[Slides Here][YouTube Playlist Here]
Creating an Automated Ideological Transformer Using Moral Reframing [GPT2]
Ananya Joshi*, Christiane Fellbaum, Michael Guerzhoy
Cooperative Rule Caching for SDN Switches
Ori Rottenstreich, Ariel Kulik, Ananya Joshi, Jennifer Rexford, Gábor Rétvári, Daniel Menasché, 2020 IEEE 9th International Conference on Cloud Networking (CloudNet).
Data Plane Cooperative Caching With Dependencies
Ori Rottenstreich, Ariel Kulik, Ananya Joshi, Jennifer Rexford, Gábor Rétvári, Daniel Menasché, 2020 IEEE Transactions on Network and Service Management.
An open repository of real-time COVID-19 indicators
Alex Reinhart, Logan Brooks … Ananya Joshi …, Roni Rosenfeld, Ryan J. Tibshirani, Proceedings of the National Academy of Sciences.
🛠️ NiChrome, Google: Intern project with Anthony Rolland, advised by Ron Minnich and Christopher Koch
🛠️ COVIDCast Engineering, Delphi: Contributor since 2020.
🛠️ Cases2Beds: Project lead for a tool to provision hospital beds, developed in collaboration with the Allegheny County Health Department and shared with several other county public health departments during the COVID-19 pandemic. Presented on 01/08/2020 at the COVID-19 Trends and Impact Surveys Data Users Meeting.
🛠️ Research at MIT Lincoln Laboratory on Probability of Cloud Free Line of Sight
Urban Data Mining in Switzerland, ETH Zurich
Time Series Data: Clarifying Practical Approaches
This was an 80-minute active-learning lecture for students in Machine Learning in Practice. In a post-class anonymous survey (85% completion rate), 83% found the material to be at the right level (1 student found it too easy and 1 found it too difficult). Across all tasks identified in the learning objectives, students reported an increase in their skills (e.g., moving up among levels such as being unfamiliar with the task, being able to define the task, being able to classify tasks that belong to the component, knowing at least 2 ways to approach the task, and being familiar with some technical insights/nuances of the tasks).
AI for Social Good in Public Health, which used a follow-along Colab notebook to demonstrate statistical properties of data streams, such as nonstationarity, noisiness, and weekday effects, and to walk through methods for processing data with these properties.
Please contact me if you would like to use these activities!
MLCommons Medical Working Group Presentation: Monitoring for Health Events: Bridging Healthcare and Public Health Approaches.
Motivated by the goal of enhancing collaboration between these fields, the lecture proposes a foundational step in scoping opportunities for integrating machine learning orchestration in healthcare and public health. Details available upon request.
Best Paper Award (SIGCSE '23) [Group Award]
Carnegie Mellon University Graduate Student Service Award 2022 [Group Award]
Co-Organizer of the Delta Workshop for drift phenomena at KDD '24! Link
Presenting at KDD Doctoral Consortium 2024
Presented a talk at CMU Artificial Intelligence Seminar Series (Apr 2024).
Presented a poster at the first annual InsightNet meeting (Apr 2024).
Co-Instructor for 15-996 (Spring 2022/23) Article & TA for Machine Learning in Practice (Spring 2023)
Completed the Eberly Center's Future Faculty Program
Selected Courses: Grad AI (A+), Mobile & Pervasive Computing [IoT] (A+)
Partial travel grant to KDD, fee waiver for NU's Future Faculty Workshop.
Joined the Council of State and Territorial Epidemiologists' (CSTE) Peer-to-Peer Technical Assistance network as a mentor for the Pennsylvania Dept. of Health and Santa Clara County Public Health Department. On a biweekly cadence, I prepare lectures and hands-on Colab notebooks for concepts relevant to mentees! (2024+)
Lead of SCS Ph.D. Wellness Group (Sep 2020-May 2022)
Joined AI/Healthcare panel from the Coding School as part of a free machine learning course for high school students (2024).
Mentored students in Pittsburgh Girls-of-Steel for data science basics weekly (Sep 2022-Jun 2023).
Mentored my amazing undergraduate student Tara Lakdawala on a changepoint detection project (Sep 2022-Aug 2023) [Blog]. She's now at Goldman Sachs!
Congratulations to my incredible master's student, Richa Gadgil, on her graduation from the Machine Learning Master's degree at CMU! Richa was a core contributor for the FlaSH project and we are excited to keep working at Delphi with her this summer! (May 2024)
Hobbies: Pittsburgh is a great place to explore new hobbies. I've enjoyed starting rock climbing and martial arts here! My views and opinions are my own!
July 2, 2024
A few weeks ago, I had the opportunity to attend the Council of State and Territorial Epidemiologists (CSTE) Annual Conference right here in Pittsburgh! Getting involved with CSTE this past year, especially through the forecasting and modeling workgroup calls, has been incredibly rewarding, and it was great to see some familiar faces in person.
July 2, 2024
I arrived in Nairobi last week to start my summer internship with the AI team at IBM Research! I'm enjoying my project, and the team has a welcoming culture.
May 16, 2024
The CMS proposed rule will impact how data related to public health will be collected. Two standout quotes related to monitoring large-scale systems are,
April 30, 2024
This semester, I had the opportunity to put together two practical, active learning lectures related to AI x Public Health!
April 25, 2024
Last week, I attended the first annual InsightNet conference in North Carolina!
April 25, 2024
Interesting Papers, Talks, Videos Related to Public Health Monitoring
April 20, 2024
If you are a CS/AI/ML student interested in either applying your methods to public health or developing methods for public health applications, these opportunities/insights may be interesting to you:
April 10, 2024
Using Synthetic Persona Generation to Improve Prompt Responses
November 1, 2020
Epidemic forecasting, by its very nature, is a direct application of computer science, math, and biology.
September 24, 2020
Deploying the Shiftview repo on Google Cloud creates a website that highlights my work on ShiftView, an ideological transformer based on moral foundations theory. You can find a light demo of the work in the sidebar. Please be patient, as the translations require time to generate. Additionally, because some of the data is uncensored, there may be cursing and confusing speech. I am not responsible for the translations generated. In its current state, ShiftView is simply a guide for what is possible with automated moral reframing.
June 30, 2020
The following are the posts from the bitsvbytes website that I could salvage. While this project was put on hiatus during COVID, I learned a lot about designing public health interventions using a data-driven approach!
June 30, 2020
In 2019-2020, I was a Fulbright research student in Singapore. I kept a weekly blog (formerly called bitsvbytes because I was using ML techniques to reduce mosquito bites) that turned into a daily blog at the start of COVID. Unfortunately, I kept my blog (with all my writing, pictures, models, and demos for Buzznet) on Heroku (with a Heroku database) with the hope of eventually restarting the service once I could shrink the footprint (and the cost) of hosting this resource.