Integrating clinical, molecular, and social contact data at statewide and national scales (100k+ cases) without degrading web browser performance.
Disk and network footprint reduced to approximately 30% of original size.
Simplified memory arrays replace complex database objects.
Visual layout renders incrementally to keep the software interactive.
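The incremental rendering idea can be illustrated with a minimal sketch. It is written in Python with `asyncio` as a stand-in for the browser event loop, not the dashboard's actual code; the names `layout_incrementally`, `batch_size`, and `on_batch` are hypothetical.

```python
import asyncio
from typing import Callable, Iterator, List, Sequence


def chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def layout_incrementally(
    node_ids: Sequence,
    batch_size: int,
    on_batch: Callable[[Sequence], None],
) -> int:
    """Process nodes in small batches, yielding control between batches.

    Assumption: `on_batch` stands in for whatever draws or positions a
    batch of nodes; here it is just a callback.
    """
    done = 0
    for batch in chunks(node_ids, batch_size):
        on_batch(batch)
        done += len(batch)
        # Hand control back to the event loop so other work can run.
        await asyncio.sleep(0)
    return done


rendered_batches: List[Sequence] = []
total = asyncio.run(
    layout_incrementally(list(range(10)), 4, rendered_batches.append)
)
```

Smaller batches keep each step short at the cost of more scheduling overhead; the right batch size is something to measure rather than assume.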
// Person attribute labels repeated for every case
"Nodes": [
  {
    "id": "case1",
    "ehars_uid": "E-102",
    "age_dx": "32",
    "gender": "man"
  },
  {
    "id": "case2",
    "ehars_uid": "E-103",
    "age_dx": "45",
    "gender": "woman"
  }
]
Redundant field labels inflate file size and increase memory overhead.
// Labels defined once; values stored as lists
"persons_attribute_schema": {
  "ehars_uid": { "type": "String" },
  "age_dx": { "type": "String" },
  "gender": { "type": "String" }
},
"Nodes": {
  "id": ["case1", "case2"],
  "ehars_uid": ["E-102", "E-103"],
  "age_dx": ["32", "45"],
  "gender": ["man", "woman"]
}
Columnar storage defines each label once, eliminating the redundancy and reducing file size.
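The conversion between the two layouts can be sketched as follows. This is an illustration in Python, not the project's own converter; `rows_to_columns` and `columns_to_rows` are hypothetical helper names.

```python
from typing import Any, Dict, List


def rows_to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Turn a list of per-case objects into one list per field."""
    fields: List[str] = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    # Missing values become None so every column keeps the same length.
    return {field: [row.get(field) for row in rows] for field in fields}


def columns_to_rows(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Rebuild per-case objects by reading the same index in every column."""
    length = len(next(iter(columns.values()), []))
    return [
        {field: values[i] for field, values in columns.items()}
        for i in range(length)
    ]


rows = [
    {"id": "case1", "ehars_uid": "E-102", "age_dx": "32", "gender": "man"},
    {"id": "case2", "ehars_uid": "E-103", "age_dx": "45", "gender": "woman"},
]
columns = rows_to_columns(rows)
```

The round trip is lossless as long as every column has the same length, which is why missing values are stored as explicit nulls rather than omitted.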
"rle": true, "len": 969, "values": ["Unknown", "Alive"], "runs": [400, 569]
"front": true, "suffixes": ["XS02H", "3491"], "lens": [0, 5]
"delta": true, "values": [172800, 86400, 86400]
| Metric | Count |
|---|---|
| Total Nodes (Cases) | 6,949 |
| Total Links (Edges) | 15,339 |
| Linked (Clustered) Nodes | 4,601 |
| Unlinked (Isolated) Nodes | 2,348 |
| MSPP Cases (Multiple Sequences) | 1,070 (3,120 nodes) |
Index,Partner,Contact
case1_id,case2_id,Social
case2_id,unseq1_id,Social

Index,SexWork,SubstanceUse,InternetDating
case1_id,Yes,No,Yes
unseq1_id,No,Yes,No
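As an illustration of how the two comma-separated tables above could feed the network, the sketch below reads the contact list into edges and the risk table into per-case flags using Python's standard `csv` module. The variable names and the Yes/No-to-boolean conversion are assumptions made for this example.

```python
import csv
import io
from typing import Dict, List, Tuple

PARTNERS_CSV = """Index,Partner,Contact
case1_id,case2_id,Social
case2_id,unseq1_id,Social
"""

RISK_CSV = """Index,SexWork,SubstanceUse,InternetDating
case1_id,Yes,No,Yes
unseq1_id,No,Yes,No
"""


def read_rows(text: str) -> List[Dict[str, str]]:
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# One (source, target, contact type) tuple per reported contact.
edges: List[Tuple[str, str, str]] = [
    (row["Index"], row["Partner"], row["Contact"])
    for row in read_rows(PARTNERS_CSV)
]

# Risk flags keyed by case; "Yes" becomes True, anything else False.
risk_by_case: Dict[str, Dict[str, bool]] = {
    row["Index"]: {k: v == "Yes" for k, v in row.items() if k != "Index"}
    for row in read_rows(RISK_CSV)
}

# Every case named in a contact becomes a node, sequenced or not.
node_ids = sorted({node for src, dst, _ in edges for node in (src, dst)})
```

Deriving the node list from the edges is what lets an unsequenced case such as `unseq1_id` appear in the network even though it has no molecular record.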
// Match all multi-cluster individuals
{
  "condition": "AND",
  "rules": [
    {
      "field": "CLUSTER_ID",
      "operator": "contains",
      "value": ";"
    }
  ]
}
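A rule object like the one above could be evaluated against a case record roughly as follows. This is a sketch, not the dashboard's filter engine: only `AND` and `contains` appear in the source, so the `OR` branch, the `equal` operator, nested groups, and the sample cluster IDs are assumptions added for illustration.

```python
from typing import Any, Callable, Dict

# Operator table; only "contains" is taken from the source example.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "contains": lambda actual, expected: str(expected) in str(actual),
    "equal": lambda actual, expected: actual == expected,  # assumed
}


def matches(record: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Return True if the record satisfies the rule group."""
    results = []
    for rule in query["rules"]:
        if "rules" in rule:
            # Nested group (assumed to be allowed): evaluate recursively.
            results.append(matches(record, rule))
        else:
            test = OPERATORS[rule["operator"]]
            results.append(test(record.get(rule["field"], ""), rule["value"]))
    if query["condition"] == "AND":
        return all(results)
    return any(results)  # assumed meaning of any non-AND condition


QUERY = {
    "condition": "AND",
    "rules": [{"field": "CLUSTER_ID", "operator": "contains", "value": ";"}],
}

# Hypothetical records: a semicolon separates multiple cluster IDs.
cases = [
    {"id": "case1", "CLUSTER_ID": "C12"},
    {"id": "case2", "CLUSTER_ID": "C12;C40"},
]
multi_cluster = [c["id"] for c in cases if matches(c, QUERY)]
```

The query works because a case belonging to several clusters stores them in one semicolon-delimited string, so the presence of `;` is enough to detect multi-cluster membership.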
// Attribute schema defined in JSON
"persons_attribute_schema": {
  "poor_quality": { "type": "String" },
  "fraction_ambig": { "type": "Number" },
  "TNA_eligible": { "type": "String" }
},
// Parallel arrays in persons_attributes
"persons_attributes": {
  "poor_quality": ["No", "Yes", null],
  "fraction_ambig": [0.0015, 0.052, null],
  "TNA_eligible": ["Yes", "No", null]
}
Enables granular searching, filtering, and case color-coding in the visualization dashboard.
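Filtering and color-coding over the parallel arrays can be sketched as below. The `select` and `quality_color` helpers, the 0.05 ambiguity threshold, and the color names are all hypothetical; the source does not specify how the dashboard maps values to colors.

```python
from typing import Any, Callable, Dict, List, Optional

persons_attributes: Dict[str, List[Optional[Any]]] = {
    "poor_quality": ["No", "Yes", None],
    "fraction_ambig": [0.0015, 0.052, None],
    "TNA_eligible": ["Yes", "No", None],
}


def select(
    attrs: Dict[str, List[Optional[Any]]],
    field: str,
    predicate: Callable[[Any], bool],
) -> List[int]:
    """Return the case indices whose value for `field` passes the predicate.

    Missing values (None) never match.
    """
    return [
        i for i, value in enumerate(attrs[field])
        if value is not None and predicate(value)
    ]


def quality_color(attrs: Dict[str, List[Optional[Any]]], i: int) -> str:
    """Pick a display color for case `i` (color names are placeholders)."""
    poor = attrs["poor_quality"][i]
    if poor is None:
        return "gray"  # no quality value recorded for this case
    return "red" if poor == "Yes" else "green"


# Hypothetical threshold: flag cases with more than 5% ambiguous bases.
high_ambiguity = select(persons_attributes, "fraction_ambig", lambda v: v > 0.05)
tna_eligible = select(persons_attributes, "TNA_eligible", lambda v: v == "Yes")
colors = [quality_color(persons_attributes, i) for i in range(3)]
```

Because every array shares the same index, an index returned by one filter can be used directly to look up any other attribute of the same case.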
| Component | Updates | Public Health Impact |
|---|---|---|
| Web Visualization Dashboard | Optimized memory allocation and non-blocking loading stages. | Supports seamless interactive visualization of 100k+ cases without application freezes or crashes. |
| Network Annotation Engine | Integrated sequence quality metrics and unsequenced cases into the network schema. | Enables full display of social links, data quality flags, and unsequenced cases in outbreak tables. |
| Data Quality Control Pipeline | Integrated sequence quality reports directly into the network generation pipeline. | Automates identification of poor quality and treatment-naive cases for quick prioritization. |