Introduction
The CDC defines syndromic surveillance as a form of public health surveillance that employs automated data acquisition and statistical alerts to monitor disease indicators in real time or near real time. Its main goal is to detect large-scale outbreaks early, with high sensitivity and specificity, by identifying probable clusters of disease onset in prediagnostic data and comparing them with a threshold generated from a set of normal symptomatic cases (Patrick 2007). In addition, it provides information on the size and tempo of outbreak trends and reassures the public. It draws on continuous, routine data collection, including patient counts at emergency departments, absentee counts, sales at medication counters, and laboratory test results, to obtain information on an outbreak.
The system operates in a sequence of stages: it gathers details of symptoms, categorizes cases into syndromes, and analyzes the data to estimate probable outbreaks. It then sends signals to public health departments to initiate responses while avoiding false alarms. The system also collects surrogate data such as work or school absenteeism and reports of domestic animal and avian illness. It can draw on alternative data sources and, where it performs with relatively high confidence, it can absorb integrated data from multiple sources. It employs automated electronic reporting to transfer data to a central database and uses statistical algorithms, supported by complex information technology infrastructures, to analyze the data and generate daily counts of nonspecific disease indicators (May et al. 2009). Though the information it provides is not definitive, it produces a close estimate of the probability of an outbreak (Fricker 2006). Timeliness is a key attribute of the system (Heffernan et al. 2004).
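To make the detection step concrete, the minimal sketch below compares a day's prediagnostic syndrome count against a threshold derived from a historical baseline, in the spirit of the threshold comparison described above; the data, the 14-day baseline, and the three-standard-deviation cutoff are illustrative assumptions rather than parameters of any particular operational system.

```python
# Minimal sketch of threshold-based aberration detection on a daily syndrome count.
# The data, baseline window, and 3-sigma cutoff are illustrative assumptions only.
from statistics import mean, stdev

def daily_alert(history: list[int], today: int, k: float = 3.0) -> bool:
    """Flag today's count if it exceeds baseline mean + k standard deviations."""
    baseline_mean = mean(history)
    baseline_sd = stdev(history) or 1.0  # avoid a zero threshold on flat baselines
    return today > baseline_mean + k * baseline_sd

# Example: two weeks of ED respiratory-syndrome counts, then an elevated day.
baseline = [12, 15, 11, 14, 13, 16, 12, 14, 15, 13, 12, 14, 16, 13]
print(daily_alert(baseline, today=27))  # True -> signal sent to the health department
```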
Comparison with Traditional Surveillance System
Syndromic surveillance is more complex because it uses informatics, statistics, and advanced epidemiological methods (Buehler 2004). Traditional surveillance, on the other hand, involves manual data collection through compulsory reporting of specific, diagnosed diseases to authorities (Green and Kaufman 2002). While traditional surveillance focuses on the initial stages of severe, diagnosed illness, syndromic surveillance tries to estimate and predict outbreaks a few days earlier, which allows time for disease management and thereby minimizes mortality (Figure 1, p. 8: What is Syndromic Surveillance?). Another major difference is that traditional surveillance relies on retrospective epidemiological studies to address specific risks, whereas syndromic surveillance does not focus on specific risk events. As a result, the two systems differ in data aggregation, detection methods, and detection models (Fricker 2006). Further, traditional surveillance is generally more accurate because it depends on diagnostic data, whereas syndromic surveillance uses prediagnostic data. Traditional surveillance typically draws on voluntary reports, while syndromic surveillance employs data gathered through protocols and automated routines (Mandl et al. 2004). Since the two systems balance different approaches and functions, syndromic surveillance is best used to enhance traditional surveillance rather than to replace it (May et al. 2009; Heffernan et al. 2004).
Strengths in Syndromic Surveillance Systems
Recent studies reveal that the key strengths of syndromic surveillance systems lie in data collection and transfer, cooperation between multiple sources, the use of ICD-9 codes, and space-time analysis. Furthermore, successful reporting of biological activity and reassurance to public health departments make syndromic surveillance systems more effective as infectious disease warning systems.
Since syndromic surveillance relies on electronic data collection and transfer, it is capable of providing data quickly. It also has the advantage of estimating deviations of occurrence rates from normal levels relatively accurately, because it can compute case rates within a time interval or emulate a stochastic model of disease spread. In addition, advanced space-time analysis enhances sensitivity and specificity with good timeliness (Bravata et al. 2004).
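As a rough illustration of the space-time idea, the sketch below counts cases per spatial grid cell and flags cells whose current weekly count far exceeds that cell's own historical mean; a genuine space-time scan statistic is considerably more sophisticated, and the grid, data, and cutoff here are purely illustrative.

```python
# Crude sketch of space-time analysis: flag grid cells whose current weekly count is
# far above that cell's historical weekly mean. A real space-time scan statistic
# (e.g., Kulldorff's) is much more involved; data here are simulated for illustration.
import numpy as np

rng = np.random.default_rng(5)
weeks, grid = 30, (4, 4)
history = rng.poisson(5, size=(weeks, *grid))   # weekly counts per spatial cell
current = rng.poisson(5, size=grid)
current[1, 2] = 18                              # localized cluster in one cell

mu = history.mean(axis=0)
sd = history.std(axis=0)
z = (current - mu) / np.where(sd > 0, sd, 1.0)
print("flagged cells:\n", np.argwhere(z > 3))   # most likely only cell (1, 2)
```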
The nontraditional clinical data provided by healthcare venues are crucial in terms of accuracy, as symptoms of an attack are likely to resemble those of common illnesses such as influenza (Lombardo et al. 2003).
C1-MILD, C2-MEDIUM, and C3-ULTRA are methods used to determine baseline levels of syndrome occurrence. Hutwagner et al. (2005) show that these methods account for large population movements to a specific geographic area by using weekly trends and incorporating values from seven-day baseline periods. They can also stratify data to generate more surveillance information. When integrated, these methods generate optimal information (Hutwagner et al. 2005). Cooperation among multiple sources enables the generation of reports in a timely fashion (Das et al. 2003).
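A simplified sketch of a C1/C2-style check appears below: today's count is standardized against the mean and standard deviation of a seven-day baseline window, with an optional two-day guard band for the C2 variant; the cutoff of three and the other details are simplifications and should not be read as the published EARS algorithms.

```python
# Simplified sketch of a C1/C2-style aberration check: compare today's count with the
# mean and standard deviation of a 7-day baseline window. C2 conventionally lags the
# baseline by a 2-day guard band; the cutoff of 3 is a commonly cited default, but all
# details here are a simplification rather than the published algorithms.
from statistics import mean, stdev

def ears_like_statistic(counts: list[int], lag: int = 0) -> float:
    """Standardized deviation of the latest count from a lagged 7-day baseline."""
    baseline = counts[-(8 + lag):-(1 + lag)]   # 7 days, ending `lag` days before today
    sd = stdev(baseline)
    return (counts[-1] - mean(baseline)) / (sd if sd > 0 else 1.0)

counts = [10, 12, 9, 11, 13, 10, 12, 11, 14, 25]   # last value is today's count
print(ears_like_statistic(counts, lag=0) > 3)       # C1-style check
print(ears_like_statistic(counts, lag=2) > 3)       # C2-style check with guard band
```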
ICD-9 codes are used to group reports into syndromes. Comparison with other health indicators shows that this is the most efficient means of achieving timeliness. Miller et al. indicate that ICD-9 codes have accurately segregated data relating to deaths from influenza and pneumonia. Betancourt et al. (2007) likewise report that ICD-9 codes offer high sensitivity, specificity, and accuracy in grouping cases into syndromes.
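The sketch below illustrates how ICD-9-coded visits might be grouped into syndromes by code prefix; the prefixes and syndrome labels are simplified examples chosen for illustration, not an authoritative mapping used by any operational system.

```python
# Illustrative sketch of grouping ICD-9-coded visits into syndromes by code prefix.
# The prefixes and syndrome labels below are simplified examples, not an authoritative
# mapping such as those used by operational systems.
SYNDROME_PREFIXES = {
    "respiratory": ("460", "461", "465", "466", "480", "486", "487"),
    "gastrointestinal": ("008", "009", "558", "787"),
    "fever": ("780.6",),
}

def classify_icd9(code: str) -> str:
    for syndrome, prefixes in SYNDROME_PREFIXES.items():
        if code.startswith(prefixes):
            return syndrome
    return "other"

visits = ["486", "487.1", "780.60", "558.9", "715.9"]
print([classify_icd9(c) for c in visits])
# ['respiratory', 'respiratory', 'fever', 'gastrointestinal', 'other']
```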
Syndromic surveillance is thus instrumental in public health surveillance, as it is an effective tool for detecting seasonal, epidemic, and pandemic influenza and other natural disease outbreaks, and it complements diagnosis-based surveillance in detecting emerging diseases. Its approach involves entering patient information into a database and applying statistical algorithms, including comparisons with non-outbreak trends, to detect morbidity and mortality trends (Chretien et al. 2008).
In the past decade, syndromic surveillance systems have been able to establish links between healthcare providers and public health and to detect and manage outbreaks effectively. The system has also facilitated quick countermeasures when detection fails. It overcomes human limitations by detecting abnormal incidences, allowing officials to respond to an attack even when sentinel physicians fail to notice such cases. It also gives public health authorities response time as well as reassurance, thus preventing chaos and economic losses (Balter et al. 2005).
Weaknesses in Syndromic Surveillance Systems
Though the system enjoys many advantages, it also suffers from limitations relating to the variability of prediction, data collection, symptom classification, multiple data streams, sensitivity, multiple tradeoffs, and signal detection.
Since data collection happens at an early stage of disease, specificity is compromised and information on affected regions becomes prone to errors (Henning 2004).
Accuracy
Because small outbreaks produce only weak signals against background noise, accuracy suffers. Intense signals, indicated by a larger signal-to-noise ratio, reduce variability and increase accuracy. Bioterrorist attacks may take the form of multiple small outbreaks, so this lack of accuracy can limit the effectiveness of syndromic surveillance in such cases (Reis & Mandl 2003). Accuracy can, however, be improved by integrating data from multiple regional emergency departments. Exogenous factors that syndromic systems do not account for also affect normal levels, which again reduces accuracy. A plausible solution is to record the disease and outbreak levels attributable to such factors and subtract them from the data (Wang et al. 2005).
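The sketch below illustrates both remedies at once: counts from several simulated regional emergency departments are pooled to strengthen weak signals, and a simple day-of-week effect, standing in for an exogenous factor, is estimated and subtracted before thresholding; the data and the adjustment model are illustrative assumptions only.

```python
# Sketch: pool counts across regional EDs to strengthen weak signals, then subtract a
# simple day-of-week effect (one exogenous factor) before thresholding.
# The data and the adjustment model are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
n_days, n_eds = 70, 5
dow = np.tile(np.arange(7), n_days // 7)                  # day-of-week index per day
dow_effect = np.array([0, 0, 0, 0, 0, -3, -4])            # assumed weekend dip in visits
lam = 10 + dow_effect[dow]                                # expected visits per ED per day
counts = rng.poisson(lam[:, None], size=(n_days, n_eds))  # simulated counts for 5 EDs

pooled = counts.sum(axis=1)                               # pool EDs -> stronger signal
dow_means = np.array([pooled[dow == d].mean() for d in range(7)])
adjusted = pooled - dow_means[dow]                        # remove day-of-week baseline

threshold = 3 * adjusted.std()
print("days flagged:", np.flatnonzero(adjusted > threshold))
```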
Data Collection
Syndromic surveillance systems rely on International Classification of Diseases (ICD) diagnostic codes and emergency department (ED) chief complaints (CC). Reis and Mandl (2004) report lower accuracy and sensitivity in hospitals employing chief complaints and higher accuracy and sensitivity in those using diagnostic codes.
Data can be collected using the EISO survey, but input is often insufficient because hospital staff are too busy to answer the questions, and this limitation reduces accuracy. One solution is to involve the human resources department to elicit maximum information (Das et al. 2003). Systems that collect and use pharmaceutical data, such as EPIFAR, are effective for detection; as a weakness, however, they cannot differentiate natural disease outbreaks from bioterrorist attacks (Bravata et al. 2004). Further, the lack of a specific classification of symptoms under a uniform code system reduces sensitivity and accuracy while increasing false positives and negatives (Das et al. 2003).
Data availability is another problem that weakens the systems. Detection methods need three to five years of historical data to calculate baseline levels, and limited historical data produces inaccurate aberration estimates. Simulations of outbreaks using animal models or past outbreaks can be used to create reference standards.
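The following sketch shows why history length matters: with several years of simulated weekly counts, a week-of-year baseline mean and standard deviation can be estimated and a current observation standardized against them, whereas a single season would leave each week's baseline resting on one noisy observation; the seasonal model and data are illustrative assumptions.

```python
# Sketch: build a week-of-year baseline from several years of historical counts.
# With only one season of history, each week's baseline would be a single noisy
# observation, which is why short histories produce unreliable aberration estimates.
# Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(1)
years, weeks = 4, 52
seasonal = 20 + 15 * np.cos(2 * np.pi * (np.arange(weeks) - 2) / weeks)  # winter peak
history = rng.poisson(seasonal, size=(years, weeks))      # 4 years of weekly counts

baseline_mean = history.mean(axis=0)                      # per-week expected count
baseline_sd = history.std(axis=0, ddof=1)

week, observed = 3, 60
z = (observed - baseline_mean[week]) / baseline_sd[week]
print(f"week {week}: expected ~{baseline_mean[week]:.1f}, observed {observed}, z = {z:.1f}")
```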
Monitoring of multiple data streams has been used in many systems to increase accuracy, sensitivity, and comprehensiveness. With multiple data streams, a larger population can be monitored and more cases identified, so the results better reflect the actual situation. On the other hand, more data implies higher cost, and more data streams increase false positives. This challenge can be addressed by focusing on common symptoms in a region; however, this is promising only when background counts and incidences of such symptoms increase proportionally. The use of multivariate detection algorithms generates optimal solutions with respect to the tradeoffs among sensitivity, specificity, and timeliness, increasing accuracy (Stoto 2005).
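One standard multivariate approach consistent with the algorithms discussed by Stoto (2005) is a Hotelling T²-style statistic computed across several data streams; the sketch below applies it to simulated emergency department, medication sales, and absenteeism counts, with the covariance estimate and control limit serving purely as illustrative assumptions.

```python
# Sketch: a Hotelling T^2-style statistic over several data streams (e.g., ED visits,
# OTC medication sales, absenteeism). Data, covariance, and control limit are
# illustrative assumptions, not a specific operational algorithm.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
baseline = rng.multivariate_normal(mean=[50, 200, 30],
                                   cov=[[25, 10, 5], [10, 400, 8], [5, 8, 16]],
                                   size=120)              # 120 days, 3 streams
mu = baseline.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(baseline, rowvar=False))

def t2(observation: np.ndarray) -> float:
    d = observation - mu
    return float(d @ cov_inv @ d)

today = np.array([62, 245, 38])                           # modest rise in all 3 streams
alarm_limit = chi2.ppf(0.99, df=3)                        # approximate control limit
print(t2(today) > alarm_limit)
```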
Sensitivity
Sensitivity in detecting abnormal patterns is very important (Patrick 2007). Systems with low sensitivity generate more false alarms and add to overheads. Moreover, frequent false alarms cause officials to ignore genuine warnings (Bravata et al. 2004). Studies show that sensitivity decreases as the average distance between individuals grows, because more cases must be recorded and evaluated for successful detection. More records and analysis demand greater capacity than most, if not all, systems possess, which in turn limits their capabilities (Kaufmann et al. 2005). Reis & Mandl (2004) argue that sensitivity can be increased through complex statistical processes and temporal smoothing in data grouping.
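As an illustration of the general idea of temporal smoothing rather than the specific procedure of Reis & Mandl (2004), the sketch below applies a seven-day moving average before thresholding, so that a sustained but moderate rise that is lost in day-to-day noise becomes easier to detect; the data and window are illustrative assumptions.

```python
# Sketch: simple temporal smoothing (7-day moving average) before thresholding, as one
# way to lower noise so that a sustained moderate rise becomes detectable. Simulated
# data; not the specific procedure of Reis & Mandl (2004).
import numpy as np

def moving_average(x: np.ndarray, window: int = 7) -> np.ndarray:
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

rng = np.random.default_rng(3)
counts = rng.poisson(10, size=60).astype(float)
counts[45:] += 4                                # sustained but modest rise near the end

raw_z = (counts - counts[:45].mean()) / counts[:45].std()
smooth = moving_average(counts)
smooth_z = (smooth - smooth[:39].mean()) / smooth[:39].std()  # pre-rise windows only

print("raw days over z=3:     ", np.flatnonzero(raw_z > 3))
print("smoothed days over z=3:", np.flatnonzero(smooth_z > 3))
```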
Tradeoff
Another limitation lies in the tradeoff between false-positive rates, sensitivity, and timeliness. False positives can be reduced by lowering sensitivity or timeliness. In the absence of diagnostic data, statistical analysis over a larger timeframe can reduce false positives and improve sensitivity (Stoto 2005; Das et al. 2003). The tradeoff between specificity and sensitivity is also a weakness: an approach that increases sensitivity to one type of attack may reduce specificity for other types (Stoto 2005). The tradeoff between sensitivity and timeliness is a further fundamental weakness (Balter et al. 2005). The detailed and complicated analysis that determines sensitivity requires time; therefore, where quicker reports are necessary, sensitivity is compromised. Sensitivity can, however, be improved through the use of clinical diagnoses and confirmations.
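The tradeoff can be made concrete on simulated data: in the sketch below, raising the alert threshold lowers the false-alarm rate on baseline days but delays detection of an injected, slowly growing outbreak; all data and parameters are illustrative assumptions.

```python
# Sketch: the sensitivity/timeliness/false-alarm tradeoff on simulated data. Raising the
# alert threshold k lowers the false-alarm rate on baseline days but delays detection of
# an injected outbreak. All data and parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
baseline = rng.poisson(10, size=365).astype(float)
outbreak = baseline.copy()
outbreak[300:] += np.arange(65) * 0.5            # slowly growing outbreak from day 300

mu, sd = baseline[:300].mean(), baseline[:300].std()
for k in (2.0, 3.0, 4.0):
    false_alarms = np.mean((baseline[:300] - mu) / sd > k)
    hits = np.flatnonzero((outbreak[300:] - mu) / sd > k)
    delay = hits[0] if hits.size else None
    print(f"k={k}: false-alarm rate {false_alarms:.3f}, days to detect {delay}")
```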
Studies also show that the system is not effective in detecting slowly spreading agents. Monitoring less common symptoms and syndromes, using data from multiple sources, and analyzing the maximum number of indicators can mitigate this problem. Integrating geographical patterns with statistical analysis provides a broader data set. Additionally, geographic patterns can identify the location of disease spread as well as information about its dispersal pattern (RAND 2004).
Signals and Detection
RAND researchers have sought to determine the minimum outbreak size and speed of detection. Their findings indicate that it takes about two days to detect incidences of more than nine cases, and no more than a 50% chance of triggering an alarm was found for an incidence of 18 cases. Figure 2 (Fig 1, page 3?) shows that the probability of detecting an attack rises from about 60% to 100% two days after the initial attack, compared with a probability of around 15% within the first day. These are low detection rates, and accuracy becomes a key issue for the systems.
Timeliness
Once an abnormal pattern is noticed, cases must be investigated for their causes, history of exposure, and possible sources, which increases the time needed for analysis and reduces timeliness. This can mean the system fails to deliver timely information for diseases that spread quickly (Kaufmann et al. 2005). Larger population dispersals require more cases, which increases the time required for analysis (Kaufmann et al. 2005). In addition, systems that rely on manual reporting cause delays in data input (Bravata et al. 2004).
Conclusion: Current Syndromic Surveillance Systems Compared with an Ideal Infectious Disease Early Warning System
Early warning systems aim to alert about the spread and/or outbreak of infectious diseases, so timing is a crucial factor, along with accuracy and specificity. An ideal system would collect accurate data; perform correct, sensitive, and specific analysis within the shortest time span; and distinguish a terrorist attack from natural outbreaks. The data collected also need to be comprehensive enough to reflect the status of disease outbreaks accurately. Systems such as those using a Bayes classifier and an ICD-9 code classifier have sufficient accuracy to detect epidemics of moderate to large scale (Ivanov et al. 2002), and typically have specificities of around 89% and 96%, respectively (Wagner et al. 2004; Paladini 2004). Data accuracy and sensitivity vary among data grouping methods: chief complaints produce low accuracy and sensitivity, while diagnostic codes achieve better results (Reis & Mandl 2003). While some systems provide sufficiently comprehensive data, they are not cost-effective. Current computer systems enable fast and efficient data transfer, an essential component of an effective system. Detection is driven by data, and reporting must be accurate and timely. Systems such as ESSENCE, AIRMA, and Pulsar are able to operate with good timeliness, though there is room for improvement (Burkom et al. 2004; Shapiro 2004). On the other hand, ASPREN and the syndromic surveillance component of NC DETECT show delays in output delivery and analysis (Clothier et al. 2005; Travers et al. 2006).
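To indicate what a Bayes-classifier front end might look like, the toy sketch below trains a naive Bayes classifier on a handful of made-up chief complaints and assigns a new complaint to a syndrome; the training set, keywords, and labels are illustrative assumptions and do not represent the classifiers evaluated by Ivanov et al. (2002).

```python
# Toy sketch of a naive Bayes chief-complaint classifier, in the spirit of the Bayesian
# approaches cited above. The tiny training set, keywords, and syndrome labels are
# illustrative assumptions, not a validated classifier.
from collections import Counter, defaultdict
import math

train = [
    ("cough fever shortness of breath", "respiratory"),
    ("sore throat cough congestion", "respiratory"),
    ("vomiting diarrhea abdominal pain", "gastrointestinal"),
    ("nausea diarrhea cramps", "gastrointestinal"),
]

word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

def classify(text: str) -> str:
    vocab = {w for c in word_counts.values() for w in c}
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        log_p = math.log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():                      # Laplace-smoothed word likelihoods
            log_p += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = log_p
    return max(scores, key=scores.get)

print(classify("fever and cough"))                  # -> 'respiratory'
```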
Systems vary in timeliness depending on the methods applied and the resources available, and certainly not all systems operate within an adequate time interval (Cooper et al. 2008). As stated earlier, although systems are able to attain high accuracy, specificity, and sensitivity, they all involve tradeoffs among sensitivity, timeliness, and specificity (Balter et al. 2005; Stoto 2005). In most cases, these tradeoffs are not conducive to an effective early warning system (Soto et al. 2006). Soto et al. (2006) remark that these tradeoffs must be resolved to make the systems effective; due to limitations in current statistics, computing, and information technology, this is extremely difficult to attain, and such systems always come with higher price tags and operating costs. Current syndromic systems are able to manage data effectively and operate in a timely fashion, but the tradeoffs remain key sources of complication. Although some systems possess some of the required features of an ideal system, there is currently no single system endowed with all of them. Thus, it becomes evident that current syndromic surveillance systems cannot fully meet the requirements of an ideal and efficient early warning system for infectious diseases.