Meiyi Ma

Assistant Professor
Computer Science
Vanderbilt University
Office: 401E, 1025 16th Ave S, Nashville, TN 37212
Email: meiyi.ma AT vanderbilt.edu


2026
2025
Abstract:
Emergency response services are critical to public safety, with 9-1-1 call-takers playing a key role in ensuring timely and effective emergency operations. To ensure consistent call-taking performance, quality assurance is implemented to evaluate and refine call-takers' skill sets. However, traditional human-led evaluations struggle with high call volumes, leading to low coverage and delayed assessments. We introduce LogiDebrief, an AI-driven framework that automates traditional 9-1-1 call debriefing by integrating Signal Temporal Logic (STL) with Large Language Models (LLMs) for rigorous performance evaluation with full coverage. LogiDebrief formalizes call-taking requirements as logical specifications, enabling systematic assessment of 9-1-1 calls against procedural guidelines. It employs a three-step verification process: (1) contextual understanding to identify responder types, incident classifications, and critical conditions; (2) STL-based runtime checking with LLM integration to ensure compliance; and (3) automated aggregation of results into quality assurance reports. Beyond its technical contributions, LogiDebrief has demonstrated real-world impact. Successfully deployed at the Metro Nashville Department of Emergency Communications, it has assisted in debriefing 1,701 real-world calls, saving 311.85 hours of active engagement. Empirical evaluation with real-world data confirms its accuracy, while a case study and an extensive user study highlight its effectiveness in enhancing call-taking performance.
Abstract:
Reward design is a key component of deep reinforcement learning (DRL), yet some tasks and designer objectives are difficult to express as a scalar cost function. Among the various techniques, formal methods integrated with DRL have garnered considerable attention due to their expressiveness and flexibility in defining rewards and requirements for the different states and actions of an agent. Nevertheless, the use of Signal Temporal Logic (STL) to guide reward design for multi-agent reinforcement learning (MARL) remains underexplored. The complex interactions, heterogeneous goals, and critical safety requirements of multi-agent systems exacerbate this challenge. In this paper, we propose a novel STL-guided multi-agent reinforcement learning framework. The STL requirements are designed to include both task specifications, according to the objective of each agent, and safety specifications. The robustness values obtained by checking states against the STL specifications are used to generate rewards. We validate our approach through experiments on a variety of testbeds. The results demonstrate significant performance improvements over MARL without STL guidance, along with a remarkable increase in the overall safety rate of the multi-agent systems.
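The robustness-to-reward mechanism described in this abstract can be sketched in a few lines. The specification, threshold, and function names below are illustrative assumptions, not the paper's implementation; the computation follows the standard STL robustness semantics for an "always" operator over a finite trace:

```python
def robustness_always_gt(signal, threshold):
    """Robustness of G(signal > threshold) over a finite trace:
    the worst-case margin, positive iff the specification holds."""
    return min(s - threshold for s in signal)

def stl_reward(agent_distances, d_safe=2.0):
    """Turn the robustness of an assumed safety spec
    ("always keep at least d_safe meters of separation")
    into a scalar reward for one agent."""
    return robustness_always_gt(agent_distances, d_safe)

# A trace whose smallest separation is 2.5 m yields a margin of 0.5
print(stl_reward([3.1, 2.8, 2.5, 3.0]))
```

A positive reward indicates the trace satisfies the safety specification with that much slack, while a negative value quantifies how badly it was violated, giving the learner a graded signal rather than a binary penalty.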
Abstract:
Medical image segmentation plays a critical role in both image interpretation and disease diagnosis. Data-driven approaches such as U-Net have significantly advanced the field by enabling pixel-level classification of anatomical structures. However, the resulting...
Abstract:
Ensuring the online safety of youth has motivated research toward machine learning (ML) methods capable of accurately detecting social media risks after the fact. However, for these detection models to be effective, they must proactively identify high-risk scenarios (e.g., sexual solicitations, cyberbullying) to mitigate harm. This "real-time" responsiveness is a recognized challenge within the risk detection literature. Therefore, this paper presents a novel two-level framework that first uses reinforcement learning to identify conversation stop points and prioritize messages for evaluation. Then, we optimize state-of-the-art deep learning models to accurately categorize risk priority (low, high). We apply this framework to a time-based simulation using a rich dataset of 23K private conversations with over 7 million messages donated by 194 youth (ages 13-21). We conducted an experiment comparing our new approach to a traditional conversation-level baseline and found that the timeliness of conversation classification improved significantly, from over 2 hours to approximately 16 minutes, with only a slight reduction in accuracy (0.88 to 0.84). This study advances real-time detection approaches for social media data and provides a benchmark for future reinforcement learning training that prioritizes the timeliness of classifying high-risk conversations.
Abstract:
Predicting spatiotemporal patterns is essential for traffic flow forecasting in Intelligent Transportation Systems (ITS), where accurate predictions can greatly enhance traffic management. Currently, data-driven approaches, such as Graph Neural Networks (GNNs) combined with physics-informed partial differential equations (PDEs), have shown promising performance in capturing complex traffic dynamics. However, these models sometimes make unexpected and incorrect predictions with high certainty, which can mislead real-world decision-making, particularly in critical scenarios. This lack of uncertainty quantification (UQ), which estimates the confidence of neural network predictions beyond prediction accuracy, limits the reliability of the predictions. Existing UQ methods in traffic forecasting are typically applied to purely data-driven models, leaving the effects of physics-informed approaches on UQ largely unexplored. In this paper, we address these challenges by combining multiple UQ baselines with physics-informed methods to investigate the impact of physical constraints on UQ for traffic forecasting. Additionally, we introduce a new physics-informed loss function to guide the model learning process, which encodes an exponential relation that complements the linear relations of the PDE layer. Through extensive experiments on real-world traffic datasets, we demonstrate that our proposed method outperforms existing approaches, achieving reductions of up to 14.8% in short-term and 8.7% in long-term traffic speed prediction errors. Sensitivity analysis further illustrates the robustness of our approach under perturbations.
Abstract:
Wearable sensor technology has significantly enhanced healthcare quality, including physical therapy (PT). However, due to the design of current deep learning models, existing works often ignore the unique variations of rest intervals between repetitions and variations in individual user progress, potentially hindering effective therapy outcomes. To address these limitations, we introduce EXACT, a novel framework designed to improve exercise segmentation and differentiate between exercise variations and rest intervals in PT applications. EXACT leverages a unique combination of a U-Net architecture integrated with Model-Agnostic Meta-Learning (MAML), enhanced with residual connections and attention mechanisms to capture subtle variations in exercise patterns and rest intervals. This approach addresses key challenges in segmenting dense, multivariate IMU data, providing a robust solution that adapts to new tasks with minimal retraining. EXACT achieves up to a 20% improvement in segmentation Dice score over state-of-the-art U-Net models, demonstrating superior performance in distinguishing queried exercises from other exercises and rest intervals and in handling variability in patient movements. Through rigorous evaluation and ablation studies, we demonstrate that the attention and residual connections are essential for propagating relevant feature information and maintaining generalizability across varied exercise contexts. EXACT's adaptability and precision make it a valuable tool for real-time monitoring in PT, offering enhanced insights into patient progress and exercise quality in rehabilitation tracking. The code for our project is available at the EXACT Codebase.
Abstract:
Emergency response services are vital for enhancing public safety by safeguarding the environment, property, and human lives. As frontline members of these services, 9-1-1 dispatchers have a direct impact on response times and the overall effectiveness of emergency operations. However, traditional dispatcher training methods, which rely on role-playing by experienced personnel, are labor-intensive, time-consuming, and often neglect the specific needs of underserved communities. To address these challenges, we introduce Sim911, the first training simulation for 9-1-1 dispatchers powered by Large Language Models (LLMs). Sim911 enhances training through three key technical innovations: (1) knowledge construction, which utilizes archived 9-1-1 call data to generate simulations that closely mirror real-world scenarios; (2) context-aware controlled generation, which employs dynamic prompts and vector bases to ensure that LLM behavior aligns with training objectives; and (3) validation with looped correction, which filters out low-quality responses and refines system performance. Experimental results show Sim911's superior performance in effectiveness and equity. Beyond its technical advancements, Sim911 delivers significant social impact. Successfully deployed at the Metro Department of Emergency Communications (MXDEC) of 'City X,' a mid-sized U.S. city with a population of over 700,000 whose communications center employs around 80 dispatchers and call-takers, Sim911 has been integrated into multiple training sessions, saving time for dispatchers. By supporting a diverse range of incident types and caller tags, Sim911 provides more realistic and inclusive training experiences.
In a user study, 90% of participants found Sim911 to be as effective as or superior to traditional human-led training, making it a valuable tool for emergency communications centers nationwide, particularly those facing staffing challenges.
Abstract:
There is a growing trend toward AI systems interacting with humans to revolutionize a range of application domains such as healthcare and transportation. However, unsafe human-machine interaction can lead to catastrophic failures. We propose a novel approach that predicts future states by accounting for the uncertainty of human interaction, monitors whether predictions satisfy or violate safety requirements, and adapts control actions based on the predictive monitoring results. Specifically, we develop a new quantitative predictive monitor based on Signal Temporal Logic with Uncertainty (STL-U) to compute a robustness degree interval, which indicates the extent to which a sequence of uncertain predictions satisfies or violates an STL-U requirement. We also develop a new loss function to guide the uncertainty calibration of Bayesian deep learning and a new adaptive control method, both of which leverage STL-U quantitative predictive monitoring results. We apply the proposed approach to two case studies: Type 1 Diabetes management and semi-autonomous driving. Experiments show that the proposed approach improves safety and effectiveness in both case studies.
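To make the notion of a robustness degree interval concrete, the following is a simplified, hypothetical sketch: the function name and the "always (x > threshold)" requirement are assumptions, and STL-U itself is defined over sequences of uncertain predictions from Bayesian deep learning, which this does not reproduce. Given per-step lower and upper prediction bounds, it returns the pessimistic and optimistic robustness of the requirement:

```python
def robustness_interval_always_gt(lower, upper, threshold):
    """Robustness interval of G(x > threshold) when each prediction
    is only known to lie in [lower[t], upper[t]].
    Returns (worst-case, best-case) robustness degrees."""
    rho_low = min(lo - threshold for lo in lower)   # every prediction at its lower bound
    rho_high = min(hi - threshold for hi in upper)  # every prediction at its upper bound
    return rho_low, rho_high

# Example: predicted glucose bounds vs. an assumed hypoglycemia threshold of 70 mg/dL
print(robustness_interval_always_gt([80, 75, 72], [90, 85, 80], 70))
```

If the whole interval is positive, the requirement is satisfied under every admissible prediction; if the interval straddles zero, satisfaction is uncertain, which is exactly the case where a predictive monitor can trigger a conservative control adaptation.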
Abstract:
This paper proposes a multi-mode ACC based on an interpretable Finite State Machine (FSM) to address changes in infrastructure and surrounding conditions while meeting macroscopic safety and comfort requirements in traffic flow. Specifically, the paper designs and simulates a merge-yield control mode, a dynamic speed-change mode, and safety- and monitoring-mode switching under defined state transitions alongside the traditional car-following mode. Advanced ACC algorithms have been applied to improve traffic efficiency and have demonstrated energy savings, yet they have typically been deployed in a single use case: car-following mode. In this mode, where the Autonomous Vehicle (AV) maintains an appropriate distance or time gap to the preceding vehicle, decelerating when the gap is small and accelerating when it is large, the system may struggle to guarantee safety and comfort in complex and variable driving scenarios. Although mode-switching ACC and merge-mode controllers, some even involving lateral control, have been proposed, their dynamics when driving alongside other controlled or human-driven vehicles in a Connected and Autonomous Vehicles (CAVs) traffic environment remain unclear. The paper includes results from an implementation that was successfully tested on the open road, as well as simulation results that show dampened disturbances from the mode-switching approach compared to single-mode use of an ACC controller in the same scenarios.
2024
Abstract:
Smart cities operate on computational predictive frameworks that collect, aggregate, and utilize data from large-scale sensor networks. However, these frameworks are prone to multiple sources of data and algorithmic bias, which often lead to unfair prediction results. In this work, we first demonstrate that bias persists at a micro-level both temporally and spatially by studying real city data from Chattanooga, TN. To alleviate the issue of such bias, we introduce FairGuard, a micro-level temporal logic-based approach for fair smart city policy adjustment and generation in complex temporal-spatial domains. The FairGuard framework consists of two phases. First, we develop a static generator that is able to reduce data bias based on temporal logic conditions by minimizing correlations between selected attributes. Second, to ensure fairness in predictive algorithms, we design a dynamic component to regulate prediction results and generate future fair predictions by harnessing logic rules. To navigate potential conflicts among these single fairness rules, including logical contradictions and data interference, we formulate detection strategies grounded in Satisfiability Modulo Theories (SMT) across both the logic and data levels. Furthermore, acknowledging the limitations of fairness rules focused on a single attribute, we enhance the Static FairGuard to accommodate heterogeneous fairness rules that simultaneously consider multiple protected attributes. In addition, we develop an interactive online visualizer that displays the adjustments made to correct unfair city states, thereby improving fairness alongside the prediction outcomes from the dynamic component. Evaluations showcase that the logic-enabled Static FairGuard can effectively reduce biased correlations, while Dynamic FairGuard can guarantee fairness for protected groups at runtime with minimal impact on overall performance.
Abstract:
This paper proposes a merge yield mode for Adaptive Cruise Control (ACC) that can switch with the traditional car-following mode to meet macroscopic requirements of safety and comfort in traffic flow. In car-following mode, an Autonomous Vehicle (AV) follows a time gap with the preceding vehicle. However, the system may struggle to guarantee safety and comfort in complex and variable driving scenarios where vehicles enter and leave the lane of travel. Motivated by this, the paper proposes a multi-mode ACC based on interpretable Finite State Machine (FSM) that addresses challenges in infrastructure changes and vehicles merging into (and out of) the current lane of travel. Also, we present results from an implementation that was successfully tested on the open road, and simulation results that show dampened disturbances from the mode-switching approach, compared to single-mode use of an ACC controller in the same scenarios.
Abstract:
Recent advancements in federated learning (FL) have greatly facilitated the development of decentralized collaborative applications, particularly in the domain of Artificial Intelligence of Things (AIoT). However, a critical aspect missing from the current research landscape is the ability to enable data-driven client models with symbolic reasoning capabilities. Specifically, the inherent heterogeneity of participating client devices poses a significant challenge, as each client exhibits unique logic reasoning properties. Failing to consider these device-specific specifications can result in critical properties being missed in the client predictions, leading to suboptimal performance. In this work, we propose a new training paradigm that leverages temporal logic reasoning to address this issue. Our approach involves enhancing the training process by incorporating mechanically generated logic expressions for each FL client. Additionally, we introduce the concept of aggregation clusters and develop a partitioning algorithm to effectively group clients based on the alignment of their temporal reasoning properties. We evaluate the proposed method on two tasks: a real-world traffic volume prediction task consisting of sensory data from fifteen states and a smart city multi-task prediction utilizing synthetic data. The evaluation results exhibit clear improvements, with performance accuracy improved by up to 54% across all sequential prediction models.
Abstract:
Emergency and non-emergency response systems are essential services provided by local governments and are critical to protecting lives, the environment, and property. The effective handling of (non-)emergency calls is critical for public safety and well-being. By reducing the burden imposed by non-emergency callers, residents in critical need of assistance through 911 will receive a faster and more effective response. Collaborating with the Department of Emergency Communications (DEC) in Nashville, we analyzed 11,796 non-emergency call recordings and developed Auto311, the first automated system to handle 311 non-emergency calls, which (1) effectively and dynamically predicts ongoing non-emergency incident types to generate tailored case reports during the call; (2) itemizes essential information from dialogue contexts to complete the generated reports; and (3) strategically structures system-caller dialogues with optimized confidence. We used real-world data to evaluate the system's effectiveness and deployability. The experimental results indicate that the system effectively predicts incident types with an average F1 score of 92.54%. Moreover, the system successfully itemizes critical information from relevant contexts to complete reports, evincing a 0.93 average consistency score compared to the ground truth. Additionally, emulations demonstrate that the system effectively decreases conversation turns as utterances grow longer and categorizes the ongoing call with 94.49% mean accuracy.
Abstract:
Recent global estimates suggest that as many as 2.41 billion individuals have health conditions that would benefit from rehabilitation services. Home-based Physical Therapy (PT) faces significant challenges in providing interactive feedback and meaningful observation for therapists and patients. To fill this gap, we present MicroXercise, which integrates micro-motion analysis with wearable sensors, providing therapists and patients with a comprehensive feedback interface, including video, text, and scores. Crucially, it employs multi-dimensional Dynamic Time Warping (DTW) and attribution-based explainable methods to analyze existing deep learning neural networks for exercise monitoring, focusing on a high granularity of exercise. This synergistic approach is pivotal, providing output matching the input size to precisely highlight critical subtleties and movements in PT, thus transforming complex AI analysis into clear, actionable feedback. By highlighting these micro-motions in different metrics, such as stability and range of motion, MicroXercise significantly enhances the understanding and relevance of feedback for end-users. Comparative performance metrics underscore its effectiveness over traditional methods, such as improvements of 39% in Feature Mutual Information (FMI) and 42% in Continuity. MicroXercise is a step forward in home-based physical therapy, providing a technologically advanced and intuitively helpful solution to enhance patient care and outcomes.
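For readers unfamiliar with the alignment step, a minimal multivariate DTW can be written as follows. This is a textbook dynamic-programming formulation with Euclidean local distance; the variable names and example data are illustrative, and MicroXercise's actual multi-dimensional DTW may differ:

```python
import math

def multidim_dtw(a, b):
    """DTW alignment cost between two multivariate sequences,
    given as lists of equal-length feature vectors (e.g., IMU frames)."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = best cost of aligning a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])   # Euclidean distance across sensor axes
            D[i][j] = d + min(D[i - 1][j],      # advance in a only
                              D[i][j - 1],      # advance in b only
                              D[i - 1][j - 1])  # match the two frames
    return D[n][m]

reference = [(0, 0), (1, 1), (2, 2)]  # hypothetical 2-axis template motion
attempt = [(0, 0), (2, 2)]            # the same motion performed faster
print(multidim_dtw(reference, attempt))
```

A low cost means the patient's repetition warps onto the template well despite tempo differences, which is what makes DTW well suited to comparing exercise repetitions of unequal length.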
Abstract:
Monte Carlo tree search (MCTS) is one of the most capable online search algorithms for sequential planning tasks, with significant applications in areas such as resource allocation and transit planning. Despite its strong performance in real-world deployment, the inherent complexity of MCTS makes it challenging to understand for users without technical background. This paper considers the use of MCTS in transportation routing services, where the algorithm is integrated to develop optimized route plans. These plans are required to meet a range of constraints and requirements simultaneously, further complicating the task of explaining the algorithm's operation in real-world contexts. To address this critical research gap, we introduce a novel computation tree logic-based explainer for MCTS. Our framework begins by taking user-defined requirements and translating them into rigorous logic specifications through the use of language templates. Then, our explainer incorporates a logic verification and quantitative evaluation module that validates the states and actions traversed by the MCTS algorithm. The outcomes of this analysis are then rendered into human-readable descriptive text using a second set of language templates. The user satisfaction of our approach was assessed through a survey with 82 participants. The results indicated that our explanatory approach significantly outperforms other baselines in user preference.
2023
Abstract:
Video capturing devices with limited storage capacity have become increasingly common in recent years. As a result, there is a growing demand for techniques that can effectively analyze and understand these videos. While existing approaches based on data-driven methods have shown promise, they are often constrained by the availability of training data. In this paper, we focus on dashboard camera videos and propose a novel technique for recognizing important events, detecting traffic accidents, and trimming accident video evidence based on anomaly detection results. By leveraging meaningful high-level time-series abstraction and logical reasoning methods with state-of-the-art data-driven techniques, we aim to pinpoint critical evidence of traffic accidents in driving videos captured under various traffic conditions with promising accuracy, continuity, and integrity. Our approach highlights the importance of utilizing a formal system of logic specifications to reason over the relational features extracted from a sequence of video frames while meeting the practical constraints of real-time deployment.
Abstract:
Given the availability of abundant data, deep learning models have been advanced and become ubiquitous in the past decade. In practice, due to many different reasons (e.g., privacy, usability, and fidelity), individuals also want the trained deep models to forget some specific data. Motivated by this, machine unlearning (also known as selective data forgetting) has been intensively studied, which aims at removing the influence that any particular training sample had on the trained model during the unlearning process. However, people usually employ machine unlearning methods as trusted basic tools and rarely have any doubt about their reliability. In fact, the increasingly critical role of machine unlearning makes deep learning models susceptible to the risk of being maliciously attacked. To better understand the performance of deep learning models in malicious environments, we believe that it is critical to study the robustness of deep learning models to malicious unlearning attacks, which happen during the unlearning process. To bridge this gap, in this paper, we first demonstrate that malicious unlearning attacks pose immense threats to the security of deep learning systems. Specifically, we present a broad class of malicious unlearning attacks wherein maliciously crafted unlearning requests trigger deep learning models to misbehave on target samples in a highly controllable and predictable manner. In addition, to improve the robustness of deep learning models, we also present a general defense mechanism, which aims to identify and unlearn effective malicious unlearning requests based on their gradient influence on the unlearned models. Further, theoretical analyses are conducted to analyze the proposed methods. Extensive experiments on real-world datasets validate the vulnerabilities of deep learning models to malicious unlearning attacks and the effectiveness of the introduced defense mechanism.
Abstract:
An increasing number of monitoring systems have been developed in smart cities to ensure that a city’s real-time operations satisfy safety and performance requirements. However, many existing city requirements are written in English with missing, inaccurate, or ambiguous information. There is a high demand for assisting city policymakers in converting human-specified requirements to machine-understandable formal specifications for monitoring systems. To tackle this limitation, we build CitySpec (Chen et al., 2022), the first intelligent assistant system for requirement specification in smart cities. To create CitySpec, we first collect over 1,500 real-world city requirements across different domains (e.g., transportation and energy) from over 100 cities and extract city-specific knowledge to generate a dataset of city vocabulary with 3,061 words. We also build a translation model and enhance it through requirement synthesis and develop a novel online learning framework with shielded validation. The evaluation results on real-world city requirements show that CitySpec increases the sentence-level accuracy of requirement specification from 59.02% to 86.64%, and has strong adaptability to a new city and a new domain (e.g., the F1 score for requirements in Seattle increases from 77.6% to 93.75% with online learning). After the enhancement from the shield function, CitySpec is now immune to most known textual adversarial inputs (e.g., the attack success rate of DeepWordBug (Gao et al., 2018) after the shield function is reduced to 0% from 82.73%). We test the CitySpec with 18 participants from different domains. CitySpec shows its strong usability and adaptability to different domains, and also its robustness to malicious inputs.
Abstract:
Physical therapy (PT) is crucial for patients to restore and maintain mobility, function, and well-being. Many on-site activities and body exercises are performed under the supervision of therapists or clinicians. However, the postures of some exercises at home cannot be performed accurately due to the lack of supervision, quality assessment, and self-correction. Therefore, in this paper, we design a new framework, PhysiQ, that continuously tracks and quantitatively measures people's off-site exercise activity through passive sensory detection. In the framework, we create a novel multi-task spatio-temporal Siamese Neural Network that measures absolute quality through classification and relative quality, based on an individual's PT progress, through similarity comparison. PhysiQ digitizes and evaluates exercises on three different metrics: range of motion, stability, and repetition.
Abstract:
To demonstrate the value of machine learning-based smart health technologies, researchers have to deploy their solutions into complex real-world environments with real participants. This gives rise to many, oftentimes unexpected, challenges for creating technology in a lab environment that will work when deployed in real home environments. In other words, like more mature disciplines, we need solutions for what can be done at development time to increase success at deployment time. To illustrate an approach and solutions, we use an example of an ongoing project that is a pipeline of voice-based machine learning solutions that detects the anger and verbal conflicts of the participants. For anonymity, we call it the XYZ system. XYZ is a smart health technology because, by notifying the participants of their anger, it encourages the participants to better manage their emotions. This is important because being able to recognize one's emotions is the first step to better managing one's anger. XYZ was deployed in 6 homes for 4 months each and monitors the emotion of the caregiver of a dementia patient. In this paper, we demonstrate some of the necessary steps to be accomplished during the development stage to increase deployment-time success, and show where continued work is still necessary. Note that the complex environments arise both from the physical world and from complex human behavior.
Abstract:
Smart cities operate on computational predictive frameworks that collect, aggregate, and utilize data from large-scale sensor networks. However, these frameworks are prone to multiple sources of data and algorithmic bias, which often lead to unfair prediction results. In this work, we first demonstrate that bias persists at a micro-level both temporally and spatially by studying real city data from Chattanooga, TN. To alleviate the issue of such bias, we introduce Fairguard, a micro-level temporal logic-based approach for fair smart city policy adjustment and generation in complex temporal-spatial domains. The Fairguard framework consists of two phases: first, we develop a static generator that is able to reduce data bias based on temporal logic conditions by minimizing correlations between selected attributes. Then, to ensure fairness in predictive algorithms, we design a dynamic component to regulate prediction results and generate future fair predictions by harnessing logic rules. Evaluations show that logic-enabled static Fairguard can effectively reduce biased correlations while dynamic Fairguard can guarantee fairness on protected groups at run-time with minimal impact on overall performance.
2022
Abstract:
Background: The field of dietary assessment has a long history, marked by both controversies and advances. Emerging technologies may be a potential solution to address the limitations of self-report dietary assessment methods. The Monitoring and Modeling Family Eating Dynamics (M2FED) study uses wrist-worn smartwatches to automatically detect real-time eating activity in the field. The ecological momentary assessment (EMA) methodology was also used to confirm whether eating occurred (ie, ground truth) and to measure other contextual information, including positive and negative affect, hunger, satiety, mindful eating, and social context. Objective: This study aims to report on participant compliance (feasibility) to the 2 distinct EMA protocols of the M2FED study (hourly time-triggered and eating event-triggered assessments) and on the performance (validity) of the smartwatch algorithm in automatically detecting eating events in a family-based study. Methods: In all, 20 families (58 participants) participated in the 2-week, observational, M2FED study. All participants wore a smartwatch on their dominant hand and responded to time-triggered and eating event-triggered mobile questionnaires via EMA while at home. Compliance to EMA was calculated overall, for hourly time-triggered mobile questionnaires, and for eating event-triggered mobile questionnaires. The predictors of compliance were determined using a logistic regression model. The number of true and false positive eating events was calculated, as well as the precision of the smartwatch algorithm. The Mann-Whitney U test, Kruskal-Wallis test, and Spearman rank correlation were used to determine whether there were differences in the detection of eating events by participant age, gender, family role, and height. Results: The overall compliance rate across the 20 deployments was 89.26% (3723/4171) for all EMAs, 89.7% (3328/3710) for time-triggered EMAs, and 85.7% (395/461) for eating event-triggered EMAs. 
Time of day (afternoon odds ratio [OR] 0.60, 95% CI 0.42-0.85; evening OR 0.53, 95% CI 0.38-0.74) and whether other family members had also answered an EMA (OR 2.07, 95% CI 1.66-2.58) were significant predictors of compliance to time-triggered EMAs. Weekend status (OR 2.40, 95% CI 1.25-4.91) and deployment day (OR 0.92, 95% CI 0.86-0.97) were significant predictors of compliance to eating event-triggered EMAs. Participants confirmed that 76.5% (302/395) of the detected events were true eating events (ie, true positives), and the precision was 0.77. The proportion of correctly detected eating events did not significantly differ by participant age, gender, family role, or height (P>.05). Conclusions: This study demonstrates that EMA is a feasible tool to collect ground-truth eating activity and thus evaluate the performance of wearable sensors in the field. The combination of a wrist-worn smartwatch to automatically detect eating and a mobile device to capture ground-truth eating activity offers key advantages for the user and makes mobile health technologies more accessible to nonengineering behavioral researchers.
Abstract:
Rigorous approaches based on formal methods have the potential to fundamentally improve many aspects of deep learning. This article discusses the challenges and future directions of formal methods enhanced deep learning for smart cities.
Abstract:
As various smart services are increasingly deployed in modern cities, many unexpected conflicts arise due to various physical world couplings. Existing solutions for conflict resolution often rely on centralized control to enforce predetermined and fixed priorities of different services, which is challenging due to the inconsistent and private objectives of the services. Centralized solutions also miss opportunities to resolve conflicts more effectively according to their spatiotemporal locality. To address this issue, we design a decentralized negotiation and conflict resolution framework named DeResolver, which allows services to resolve conflicts by communicating and negotiating with each other to reach a Pareto-optimal agreement autonomously and efficiently. Our design features a two-step self-supervised learning-based algorithm to predict each opponent's acceptable proposals and their rankings through the negotiation. Our design is evaluated with a smart city case study of three services: intelligent traffic light control, pedestrian service, and environmental control. In this case study, a data-driven evaluation is conducted using a large dataset consisting of the GPS locations of 246 surveillance cameras and an automatic traffic monitoring system with more than 3 million records per day to extract real-world vehicle routes. The evaluation results show that our solution achieves much more balanced results, i.e., it only increases the average waiting time of vehicles, the measurement metric of the intelligent traffic light control service, by 6.8%, while reducing the weighted sum of air pollutant emissions, measured for the environmental control service, by 12.1%, and the pedestrian waiting time, the measurement metric of the pedestrian service, by 33.1%, compared to a priority-based solution.
Abstract:
An increasing number of monitoring systems have been developed in smart cities to ensure that real-time operations of a city satisfy safety and performance requirements. However, many existing city requirements are written in English with missing, inaccurate, or ambiguous information. There is a high demand for assisting city policy makers in converting human-specified requirements to machine-understandable formal specifications for monitoring systems. To tackle this limitation, we build CitySpec, the first intelligent assistant system for requirement specification in smart cities. To create CitySpec, we first collect over 1,500 real-world city requirements across different domains from over 100 cities and extract city-specific knowledge to generate a dataset of city vocabulary with 3,061 words. We also build a translation model and enhance it through requirement synthesis and develop a novel online learning framework with validation under uncertainty. The evaluation results on real-world city requirements show that CitySpec increases the sentence-level accuracy of requirement specification from 59.02% to 86.64%, and has strong adaptability to a new city and a new domain (e.g., F1 score for requirements in Seattle increases from 77.6% to 93.75% with online learning).
2021
Abstract:
Sensing is becoming more and more pervasive. New sensing modalities are enabling the collection of data not previously available. Artificial Intelligence (AI) and cognitive assistance technologies are improving rapidly. Cyber Physical Systems (CPS) are making significant progress in utilizing AI and Machine Learning (ML). This confluence of technologies is giving rise to the potential to achieve the vision of ambient intelligence. This paper describes some of the main challenges and research directions for ambient intelligence from a CPS perspective.
Abstract:
Wearable devices, such as smartwatches and head-mounted devices (HMDs), demand new input methods for a natural, subtle, and easy-to-use way to enter commands and text. In this paper, we propose and investigate ViFin, a new technique for command input and text entry, which harnesses finger-movement-induced vibration to track continuous micro finger-level writing with a commodity smartwatch. Inspired by the recurrent neural aligner and transfer learning, ViFin recognizes continuous finger writing, works across different users, and achieves an accuracy of 90% and 91% for recognizing numbers and letters, respectively. We quantify our approach's accuracy through real-time system experiments across different arm positions, writing speeds, and smartwatch position displacements. Finally, a real-time writing system is implemented and assessed through two user studies on real-world tasks.
Abstract:
How can the advantages of formal methods be brought to emerging smart cities? We discuss several core challenges and our recent efforts as the first step toward developing novel formal methods to ensure safety and performance in smart cities.
Abstract:
Aim: The aim of this study is to develop a Smarthealth system of monitoring, modelling, and interactive recommendation solutions (for caregivers) for in-home dementia patient care that focuses on caregiver-patient relationships. Design: This descriptive study employs a single-group, non-randomized trial to examine functionality, effectiveness, feasibility, and acceptability of the novel Smarthealth system. Methods: Thirty persons with Alzheimer's Disease or related dementia and their family caregivers (N = 30 dyads) will receive and install Smarthealth technology in their homes. There will be a 1-month observation phase for collecting baseline mood states and a 2-month implementation phase during which caregivers will receive stress management techniques for each detected negative mood state. Caregivers will report technique implementation and usefulness via an Ecological Momentary Assessment system on the study-provided smartphone. Caregivers will provide daily, self-reported mood and health ratings. Instruments measuring caregiver assessment of disruptive behaviours and their effect on caregivers; caregiver depressive symptoms, anxiety and stress; caregiver strain; and family functioning will be completed at baseline and 3 months. The study received funding in 2018 and ethics board approval in 2019. Discussion: This study will develop and test novel in-home technology to improve family caregiving relationships. Results from this study will help develop and improve the Smarthealth recommendation system and determine its usefulness, feasibility, and acceptability for persons with dementia and their family caregivers. Impact: The Smarthealth technology discussed will provide in-home stress reduction resources at a time when older adults may be experiencing increasingly high rates of isolation and anxiety and caregiver dyads may be experiencing high levels of relationship strain.
Trial registration: This study was registered with ClinicalTrials.gov (Identifier NCT04536701).
Abstract:
Predictive monitoring—making predictions about future states and monitoring whether the predicted states satisfy requirements—offers a promising paradigm in supporting the decision making of Cyber-Physical Systems (CPS). Existing work on predictive monitoring mostly focuses on monitoring individual predictions rather than sequential predictions. We develop a novel approach for monitoring sequential predictions generated from Bayesian Recurrent Neural Networks (RNNs) that can capture the inherent uncertainty in CPS, drawing on insights from our study of real-world CPS datasets. We propose a new logic named Signal Temporal Logic with Uncertainty (STL-U) to monitor a flowpipe containing an infinite set of uncertain sequences predicted by Bayesian RNNs. We define STL-U strong and weak satisfaction semantics based on whether all or some sequences contained in a flowpipe satisfy the requirement. We also develop methods to compute the range of confidence levels under which a flowpipe is guaranteed to strongly (weakly) satisfy an STL-U formula. Furthermore, we develop novel criteria that leverage STL-U monitoring results to calibrate the uncertainty estimation in Bayesian RNNs. Finally, we evaluate the proposed approach via experiments with real-world CPS datasets and a simulated smart city case study, which show very encouraging results, with the STL-U based predictive monitoring approach outperforming baselines.
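The strong/weak satisfaction distinction above can be illustrated with a minimal sketch. This is not the paper's implementation: it approximates the flowpipe by a finite set of sampled predicted sequences and checks a simple hypothetical requirement "always x < threshold"; all function names are illustrative.

```python
# Illustrative sketch (not STL-U's formal semantics): strong vs. weak
# satisfaction of "always x < threshold" over sequences sampled from a
# Bayesian RNN's predictive distribution.

def satisfies_always_below(seq, threshold):
    """Boolean satisfaction of "always x < threshold" for one sequence."""
    return all(x < threshold for x in seq)

def flowpipe_satisfaction(sampled_seqs, threshold):
    """Strong: every sampled sequence satisfies the requirement.
    Weak: at least one sampled sequence satisfies it."""
    results = [satisfies_always_below(s, threshold) for s in sampled_seqs]
    return {"strong": all(results), "weak": any(results)}

samples = [
    [0.2, 0.4, 0.3],  # stays below 0.5: satisfies
    [0.1, 0.9, 0.2],  # exceeds 0.5 at step 2: violates
]
print(flowpipe_satisfaction(samples, threshold=0.5))
# {'strong': False, 'weak': True}
```

In the paper, the flowpipe is an infinite set characterized by the Bayesian RNN's uncertainty estimates rather than a finite sample, but the all-versus-some structure of the two semantics is the same.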
Abstract:
COVID-19 has caused many disruptions in conducting smart health research. Both in-lab sessions and in-home deployments had to be delayed or canceled because in-person meetings were no longer allowed. Our research project on “in-home monitoring with personalized recommendations to reduce the stress of caregivers of Alzheimer’s patients” was affected. To enable continued research without any person-to-person contact, we created an out-of-the-box deployment solution. The solution is multifaceted and deals with everything from technical adjustments, deployment documentation, EMA additions, additional monitoring software, use of videos, Zoom, and TeamViewer, to budget changes, new logistics, and changes to IRBs. This article briefly describes the purpose and design of the original system and then articulates the necessitated changes. We also provide lessons learned and an initial evaluation of the effectiveness of the solutions after the changes. The evaluation surveys the opinions of seven people who assembled, initialized, and deployed our system in home environments. We believe that the various solutions we developed can be applied to other similar projects, and will be helpful to new projects even when personal contact returns.
Abstract:
Healthcare cognitive assistants (HCAs) are intelligent systems or agents that interact with users in a context-aware and adaptive manner to improve their health outcomes by augmenting their cognitive abilities or complementing a cognitive impairment. They assist a wide variety of users ranging from patients to their healthcare providers (e.g., general practitioner, specialist, surgeon) in several situations (e.g., remote patient monitoring, emergency response, robotic surgery). While HCAs are critical to ensure personalized, scalable, and efficient healthcare, there exists a knowledge gap in finding the emerging trends, key challenges, design guidelines, and state-of-the-art technologies suitable for developing HCAs. This survey aims to bridge this gap for researchers from multiple domains, including but not limited to cyber-physical systems, artificial intelligence, human-computer interaction, robotics, and smart health. It provides a comprehensive definition of HCAs and outlines a novel, practical categorization of existing HCAs according to their target user role and the underlying application goals. This survey summarizes and assorts existing HCAs based on their characteristic features (i.e., interactive, context-aware, and adaptive) and enabling technological aspects (i.e., sensing, actuation, control, and computation). Finally, it identifies critical research questions and design recommendations to accelerate the development of the next generation of cognitive assistants for healthcare.
Abstract:
With the development of the Internet of Things, millions of sensors are being deployed in cities to collect real-time data. This leads to a need for checking city states against city requirements at runtime. In this paper, we develop a novel spatial-temporal specification-based monitoring system for smart cities. We first describe a study of over 1,000 smart city requirements, some of which cannot be specified using existing logics such as Signal Temporal Logic (STL) and its variants. To tackle this limitation, we develop SaSTL -- a novel Spatial Aggregation Signal Temporal Logic -- for the efficient runtime monitoring of safety and performance requirements in smart cities. We develop two new logical operators in SaSTL to augment STL for expressing spatial aggregation and spatial counting characteristics that are commonly found in real city requirements. We define Boolean and quantitative semantics for SaSTL in support of the analysis of city performance across different periods and locations. We also develop efficient monitoring algorithms that can check a SaSTL requirement in parallel over multiple data streams (e.g., generated by multiple sensors distributed spatially in a city). Additionally, we build a SaSTL-based monitoring tool to support the decision making of different stakeholders in specifying and runtime monitoring their requirements in smart cities. We evaluate our SaSTL monitor by applying it to three case studies with large-scale real city sensing data (e.g., up to 10,000 sensors in one study). The results show that SaSTL has much higher expressiveness coverage than other spatial-temporal logics, with a significant reduction in the computation time for monitoring requirements. We also demonstrate that the SaSTL monitor improves the safety and performance of smart cities via simulated experiments.
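The combination of spatial aggregation with quantitative semantics described above can be sketched with a toy example. This is illustrative only, not SaSTL's formal semantics: it aggregates readings from several spatially distributed sensors and computes the quantitative robustness of a hypothetical requirement "always aggregated value <= limit"; all names and values are made up.

```python
# Toy sketch of spatial-aggregation monitoring (hypothetical operators,
# not SaSTL's formal semantics): aggregate multiple sensor streams at
# each time step, then compute the robustness of "always agg <= limit".

def spatial_aggregate(streams, t, agg=max):
    """Aggregate the readings of all sensors at time index t."""
    return agg(stream[t] for stream in streams)

def robustness_always_below(streams, limit, agg=max):
    """Quantitative robustness of "always agg <= limit": the minimum
    margin (limit - value) over the trace; positive means satisfied,
    and the magnitude says by how much."""
    horizon = min(len(s) for s in streams)
    return min(limit - spatial_aggregate(streams, t, agg)
               for t in range(horizon))

pm25 = [[30, 35, 40], [28, 50, 33]]  # two sensors' (made-up) readings
print(robustness_always_below(pm25, limit=55))  # 5
```

Monitoring each stream (or each aggregation region) independently is what makes the parallel checking mentioned in the abstract natural: the per-region robustness values can be computed concurrently and combined afterward.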
Abstract:
Obesity is a risk factor for many health issues, including heart disease, diabetes, osteoarthritis, and certain cancers. One of the primary behavioral causes, dietary intake, has proven particularly challenging to measure and track. Current behavioral science suggests that family eating dynamics (FED) have high potential to impact child and parent dietary intake, and ultimately the risk of obesity. Monitoring FED requires information about when and where eating events are occurring, the presence or absence of family members during eating events, and some person-level states such as stress, mood, and hunger. To date, there exists no system for real-time monitoring of FED. This paper presents MFED, the first system of its kind for monitoring FED in the wild in real time. Smart wearables and Bluetooth beacons are used to monitor and detect eating activities and the location of the users at home. A smartphone is used for the Ecological Momentary Assessment (EMA) of a number of behaviors, states, and situations. While the system itself is novel, we also present a novel and efficient algorithm for detecting eating events from wrist-worn accelerometer data. The algorithm improves the eating gesture detection F1-score by 19% with less than 20% of the computation of state-of-the-art methods. To date, the MFED system has been deployed in 20 homes with a total of 74 participants, and responses from 4750 EMA surveys have been collected. This paper describes the system components, reports on the eating detection results from the deployments, proposes two techniques for improving ground-truth collection after the system is deployed, and provides an overview of the FED data generated from the multi-component system, which can be used to model and more comprehensively understand family eating dynamics.
Abstract:
As various smart services are increasingly deployed in modern cities, many unexpected conflicts arise due to various physical world couplings. Existing solutions for conflict resolution often rely on centralized control to enforce predetermined and fixed priorities of different services, which is challenging due to the inconsistent and private objectives of the services. Centralized solutions also miss opportunities to resolve conflicts more effectively according to their spatiotemporal locality. To address this issue, we design a decentralized negotiation and conflict resolution framework named DeResolver, which allows services to resolve conflicts by communicating and negotiating with each other to reach a Pareto-optimal agreement autonomously and efficiently. Our design features a two-level semi-supervised learning-based algorithm to predict each opponent's acceptable proposals and their rankings through the negotiation. Our design is evaluated with a smart city case study of three services: intelligent traffic light control, pedestrian service, and environmental control. In this case study, a data-driven evaluation is conducted using a large dataset consisting of the GPS locations of 246 surveillance cameras and an automatic traffic monitoring system with more than 3 million records per day to extract real-world vehicle routes. The evaluation results show that our solution achieves much more balanced results, i.e., it only increases the average waiting time of vehicles, the measurement metric of the intelligent traffic light control service, by 6.8%, while reducing the weighted sum of air pollutant emissions, measured for the environmental control service, by 12.1%, and the pedestrian waiting time, the measurement metric of the pedestrian service, by 33.1%, compared to a priority-based solution.
2020
Abstract:
We present SaSTL---a novel Spatial Aggregation Signal Temporal Logic---for the efficient runtime monitoring of safety and performance requirements in smart cities. We first describe a study of over 1,000 smart city requirements, some of which cannot be specified using existing logics such as Signal Temporal Logic (STL) and its variants. To tackle this limitation, we develop two new logical operators in SaSTL to augment STL for expressing spatial aggregation and spatial counting characteristics that are commonly found in real city requirements. We also develop efficient monitoring algorithms that can check a SaSTL requirement in parallel over multiple data streams (e.g., generated by multiple sensors distributed spatially in a city). We evaluate our SaSTL monitor by applying it to two case studies with large-scale real city sensing data (e.g., up to 10,000 sensors in one requirement). The results show that SaSTL has much higher expressiveness coverage than other spatial-temporal logics, with a significant reduction in the computation time for monitoring requirements. We also demonstrate that the SaSTL monitor can help improve the safety and performance of smart cities via simulated experiments.
Abstract:
Recurrent Neural Networks (RNNs) have achieved great success in sequential prediction tasks. In practice, the target sequence often follows certain model properties or patterns (e.g., reasonable ranges, consecutive changes, resource constraints, temporal correlations between multiple variables, existence, unusual cases, etc.). However, RNNs cannot guarantee that their learned distributions satisfy these model properties. This is even more challenging when predicting for large-scale and complex Cyber-Physical Systems. Failure to produce outcomes that meet these model properties results in inaccurate and even meaningless predictions. In this paper, we develop a new temporal logic-based learning framework, STLnet, which guides the RNN learning process with auxiliary knowledge of model properties and produces a more robust model for improved future predictions. Our framework can be applied to general sequential deep learning models and trained in an end-to-end manner with back-propagation. We evaluate the performance of STLnet using large-scale real-world city data. The experimental results show that STLnet not only improves the accuracy of predictions but, importantly, also guarantees the satisfaction of model properties and increases the robustness of RNNs.
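The general idea of guiding training with property knowledge can be sketched as follows. This is a hedged illustration, not STLnet's actual mechanism (which uses temporal logic and a teacher network): it simply augments a prediction loss with a penalty proportional to how much the predicted sequence violates one hypothetical property, a reasonable-range constraint lo <= x <= hi. All names and the weighting scheme are assumptions for illustration.

```python
# Illustrative sketch (not STLnet's method): a property-augmented loss
# that penalizes predictions leaving a known valid range [lo, hi].

def mse(pred, target):
    """Standard mean squared error over two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def range_violation(pred, lo, hi):
    """Total amount by which predictions fall outside [lo, hi]."""
    return sum(max(lo - p, 0.0) + max(p - hi, 0.0) for p in pred)

def guided_loss(pred, target, lo, hi, lam=1.0):
    """Prediction loss plus a weighted property-violation penalty."""
    return mse(pred, target) + lam * range_violation(pred, lo, hi)

pred, target = [0.5, 1.2, -0.1], [0.4, 1.0, 0.0]
print(guided_loss(pred, target, lo=0.0, hi=1.0))
```

Because both terms are (sub)differentiable in the predictions, a loss of this shape can be minimized end-to-end with back-propagation, which is the spirit of the framework described above.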
2019
Abstract:
Cities are deploying tens of thousands of sensors and actuators and developing a large array of smart services. The smart services use sophisticated models and decision-making policies supported by Cyber Physical Systems and Internet of Things technologies. The increasing number of sensors collects a large amount of city data across multiple domains. The collected data have great potential value but have not yet been fully exploited. This survey focuses on the domains of transportation, environment, emergency and public safety, energy, and social sensing. This article carefully reviews both the datasets being collected across 14 smart cities and the state-of-the-art work in modeling and decision-making methodologies. The article also points out the characteristics, the challenges faced today, and those challenges that will be exacerbated in the future. Key data issues addressed include heterogeneity, interdisciplinarity, integrity, completeness, real-timeliness, and interdependencies. Key decision-making issues include safety and service conflicts, security, uncertainty, humans in the loop, and privacy.
Abstract:
Family relationships influence eating behavior and health outcomes (e.g., obesity). Because eating is often habitual (i.e., automatically driven by external cues), unconscious behavioral mimicry may be a key interpersonal influence mechanism for eating within families. This pilot study extends existing literature on eating mimicry by examining whether multiple family members mimicked each other's bites during natural meals. Thirty-three participants from 10 families were videotaped while eating an unstructured family meal in a kitchen lab setting. Videotapes were coded for participants' bite occurrences and times. We tested whether the likelihood of a participant taking a bite increased when s/he was externally cued by a family eating partner who had recently taken a bite (i.e., bite mimicry). A paired-sample t-test indicated that participants had a significantly faster eating rate within the 5 s following a bite by their eating partner, compared to their bite rate at other times (t = 7.32, p < .0001). Nonparametric permutation testing identified five of 78 dyads in which there was significant evidence of eating mimicry; and 19 of 78 dyads that had p values < .1. This pilot study provides preliminary evidence that suggests eating mimicry may occur among a subset of family members, and that there may be types of family ties more prone to this type of interpersonal influence during meals.
2018
Abstract:
Resolution of conflicts across services in smart cities is an important yet challenging problem. We present CityResolver, a decision support system for conflict resolution in smart cities. CityResolver uses an Integer Linear Programming-based method to generate a small set of resolution options and a Signal Temporal Logic-based verification approach to compute these resolution options' impact on city performance. The trade-offs between resolution options are shown in a dashboard to support decision makers in selecting the best resolution. We demonstrate the effectiveness of CityResolver by comparing its performance with two baselines: a smart city without conflict resolution, and CityGuard, which uses priority rule-based conflict resolution. Experimental results show that CityResolver can reduce the number of requirement violations and significantly improve city performance.
2017
Abstract:
Conflicting health information is a primary barrier to the self-management of chronic diseases. An increasing number of people now rely on mobile health apps and online health websites to meet their information needs and often receive conflicting health advice from these sources. This problem is more prevalent and severe in the setting of multi-morbidities. In addition, medical information can often conflict with an individual's regular activity patterns. In this work, we formulate the problem of finding conflicts in heterogeneous health applications, including health websites, health apps, online drug usage guidelines, and daily activity logging applications. We develop a comprehensive taxonomy of conflicts based on the semantics of textual health advice and activities of daily living. Finding conflicts in health applications poses its own unique lexical and semantic challenges. These include large structural variation between pairs of textual advice, finding conceptual overlap between pairs of advice, inferring the semantics of an advice (i.e., what to do, why, and how) and of activities, and aligning activities suggested in advice with activities of daily living based on their underlying dependencies and polarity. Hence, we develop Preclude2, a novel semantic rule-based solution to detect conflicts in activities and health advice derived from heterogeneous sources. Preclude2 utilizes linguistic rules and external knowledge bases to infer advice. In addition, Preclude2 considers personalization and context-awareness while detecting conflicts. We evaluate Preclude2 using 1156 real advice statements covering 8 important health topics, 90 online drug usage guidelines, 1124 online disease-specific health advice statements covering 34 chronic diseases, and 2 activity datasets. The evaluation is personalized based on 34 real prescriptions.
Preclude2 detects direct, conditional, sub-typical, quantitative, and temporal conflicts from 2129 advice statements with 0.91, 0.83, 0.98, 0.85 and 0.98 recall, respectively. Overall, it achieves 0.88 recall for detecting inter-advice conflicts and 0.89 recall for detecting activity–advice conflicts. We also demonstrate the effects of personalization and context-awareness on conflict detection from heterogeneous health applications.
Abstract:
With the rapid digitalization of the health sector, people often turn to mobile apps and online health websites for health advice. Health advice generated from different sources can be conflicting, as different sources address different aspects of health (e.g., weight loss, diet, disease) or are unaware of the context of a user (e.g., age, gender, physiological condition). Conflicts can occur due to lexical features (such as negation, antonyms, or numerical mismatch) or can be conditioned upon time and/or physiological status. We formulate the problem of finding conflicting health advice and develop a comprehensive taxonomy of conflicts. While a similar research area in the natural language processing domain explores the problem of textual contradiction identification, finding conflicts in health advice poses its own unique lexical and semantic challenges. These include large structural variation between text and hypothesis pairs, finding conceptual overlap between pairs of advice, and inferring the semantics of an advice (i.e., what to do, why, and how). Hence, we develop Preclude, a novel semantic rule-based solution to detect conflicting health advice derived from heterogeneous sources, utilizing linguistic rules and external knowledge bases. As our solution is interpretable and comprehensive, it can also guide users towards conflict resolution. We evaluate Preclude using 1156 real advice statements covering 8 important health topics that were collected from smartphone health apps and popular health websites. Preclude achieves 90% accuracy and outperforms the baseline approach's accuracy and F1 score by about 1.5 times and 3 times, respectively.
Abstract:
Research in the areas of the Internet of Things, cyber-physical systems, and smart health often employs sensor systems at residences for continuous monitoring. Such research-oriented residential monitoring systems (RRMSs) usually face two major challenges: long-term reliable operation management and validation of system functionality with minimal human effort. Targeting these two challenges, this paper describes M2G, a monitor of monitoring systems with ground-truth validation capabilities. It consists of two subsystems, the Monitor2 system and the Ground-truth validation system. The Monitor2 system encapsulates a flexible set of general-purpose components to monitor the operation and connectivity of heterogeneous sensor devices (e.g., smartwatches, smartphones, microphones, beacons, etc.), a local base station, and a cloud server. It provides a user-friendly interface and supports different types of RRMSs in various contexts. The system also features a ground-truth validation facility to support obtaining ground truth in the field. Additionally, customized alerts can be sent to remote administrators and other personnel to report any dysfunction or inaccuracy of the system in real time. M2G is applied to three very different case studies: the M2FED system, which monitors family eating dynamics [1]; an in-home wireless sensing system for monitoring nighttime agitation [2]; and the BESI system, which monitors behavioral and environmental parameters to predict health events and provide interventions [3]. The results indicate that M2G is a comprehensive system that (i) requires little time and effort to adapt to an existing RRMS, (ii) provides reliable data collection and reduces data loss by detecting faults in real time, and (iii) provides a convenient and timely ground-truth validation facility.
Abstract:
Nowadays, a growing number of smart services are being developed and deployed in cities around the world. IoT platforms have emerged to integrate smart city services and city resources and thus improve city performance in the domains of transportation, emergency, environment, public safety, etc. Despite the increasing intelligence of smart services and the sophistication of platforms, safety issues in smart cities are not addressed adequately, especially those arising from the integration of smart services. Therefore, CityGuard, a safety-aware watchdog architecture, is developed. To the best of our knowledge, it is the first architecture that detects and resolves conflicts among the actions of different services while considering both safety and performance requirements. Prior to developing CityGuard, safety and performance requirements and a spectrum of conflicts are specified. Sophisticated models are used to analyze secondary effects and to detect device and environmental conflicts. A simulation based on New York City is used for the evaluation. The results show that CityGuard (i) identifies unsafe actions and thus helps protect the city from safety hazards, (ii) detects and resolves two major types of conflicts, i.e., device and environmental conflicts, and (iii) improves overall city performance.
2016
Abstract:
The populations of large cities around the world are growing rapidly. Cities are beginning to address this problem by implementing significant sensing and actuation infrastructure and building services on this infrastructure. However, as the density of sensing and actuation increases and as the complexity of services grows, there is an increasing potential for conflicts across Smart City services. These conflicts can cause unsafe situations and disrupt the benefits that the services were originally intended to provide. Although some of these conflicts can be detected and avoided during service design, many can still occur unpredictably at runtime. This paper carefully defines and enumerates the main issues regarding the detection and resolution of runtime conflicts in smart cities. In particular, it focuses on conflicts that arise across services. This issue is becoming more and more important as Smart City designs attempt to integrate services from different domains (transportation, energy, public safety, emergency, medical, and many others). Research challenges are identified and addressed that deal with uncertainty, dynamism, real-time operation, mobility and spatio-temporal availability, duration and scale of effect, efficiency, and ownership. A watchdog architecture is also described that oversees the services operating in a Smart City. This watchdog detects and resolves conflicts, learns and adapts, and provides additional inputs to the decision-making aspects of services. Using data from a Smart City dataset, an emulated set of services and activities using those services is created to perform a conflict analysis. A second analysis hypothesizes 41 future services across 5 domains. Both evaluations demonstrate the high probability of conflicts in the smart cities of the future.
2015
2014
2013
2011