Challenge
Modern cyber-attacks are often conducted by distributing digital documents that contain malware.
AIS researchers have developed a classifier that uses features derived from dynamic analysis of a document viewer as it renders the document in question. The approach classifies the disposition of digital documents with greater than 98 percent accuracy, even when the model is trained on only a small amount of data. To keep the classification model small, and therefore scalable, the researchers employ an entity resolution strategy: features that are syntactically different but believed to be semantically equivalent, differing only because of programmatic randomness, are merged into one. Entity resolution makes it possible to build a comprehensive model of benign functionality from relatively few training documents, and the model does not improve significantly with additional training data.
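To make the entity resolution idea concrete, the following is a minimal sketch and not the researchers' actual implementation. It assumes that features are strings recorded while the viewer renders a document, and that the randomness shows up as numbers, memory addresses, or GUIDs. The rule list, the feature strings, the function names (`resolve`, `build_model`, `classify`), and the "any unseen feature is suspicious" decision rule are all hypothetical choices made for illustration.

```python
import re

# Hypothetical normalization rules (an assumption, not from the source):
# each one replaces a token that tends to vary between runs with a placeholder.
_RULES = [
    (re.compile(r"0x[0-9a-f]+"), "<ADDR>"),
    (re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"), "<GUID>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def resolve(feature: str) -> str:
    """Map a raw feature string to a canonical form shared by its variants."""
    canonical = feature.lower()
    for pattern, placeholder in _RULES:
        canonical = pattern.sub(placeholder, canonical)
    return canonical


def build_model(benign_traces):
    """Collect the resolved features seen across all benign training traces."""
    return {resolve(f) for trace in benign_traces for f in trace}


def classify(trace, model, max_unseen=0):
    """Label a trace by counting resolved features absent from the benign model.

    The threshold rule is an illustrative assumption.
    """
    unseen = {resolve(f) for f in trace} - model
    return "malicious" if len(unseen) > max_unseen else "benign"


# Hypothetical traces: two benign documents that differ only in random tokens.
benign_traces = [
    ["read_file:C:\\Temp\\acr1234.tmp", "load_module:0x7FFDE4A0"],
    ["read_file:C:\\Temp\\acr9876.tmp", "load_module:0x7FFD1B30"],
]
model = build_model(benign_traces)

suspect = ["read_file:C:\\Temp\\acr5555.tmp", "create_process:cmd.exe"]
verdict = classify(suspect, model)
```

In this sketch the four raw training features collapse into two canonical entries, which is why the model stays small and why further benign documents that differ only in random tokens add nothing new to it.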
Key Insights: