Malware Detection in Internet of Things Using Opcodes and Machine Learning



Khare, Aditi Atul



In recent years, the exponential growth of Internet of Things (IoT) devices has created a major security threat. These devices are often deployed before they are secured. Most IoT devices are either unsecured or weakly secured, and attackers are taking advantage of this. Even a single infected IoT device has the potential to spread malware to the entire network. Attackers use obfuscation techniques such as polymorphism to avoid detection. This research focuses on polymorphic malware detection in IoT networks using opcodes and machine learning. ARM-based malware was used for testing because ARM-based platforms account for a large share of IoT devices, which makes the results more indicative of real-world attacks. Opcodes were extracted by disassembling the samples in the dataset with the IDA Pro disassembler. A sequentially ordered dataset of the opcodes was then created for use in detection. Four datasets, namely Dmalware, Dgoodware, Dunseenmalware and Dunseengoodware, were created. A polymorphed version of the unseen malware dataset was also created to test the utility of the approach in polymorphic malware detection. We used the sequential pattern mining algorithm Mind the Gap: Frequent Sequence Mining to mine the most frequent patterns in malware. These Maximal Sequential Patterns (MSPs) were categorized by functionality using ARM resources. Three approaches were tested and compared. The first approach built an opcode-rank dictionary from opcode frequency in the malware dataset and used it to create vectors for machine learning classification. The second approach used the frequency of MSPs to vectorize the given dataset, while the third approach used the MSP type as a feature for detection. Machine learning classifiers such as Decision Tree, KNN, Random Forest, SVM and AdaBoost were used to detect malware, including polymorphic malware.
The sequential pattern mining approaches were observed to be faster and more resilient to polymorphed malware. A comparative study showed that the MSP list approach performs comparably to the MSP type approach. The MSP list approach also has faster pre-processing runtimes and lower memory usage, making it a viable approach for malware classification.
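
To make the first two vectorization approaches concrete, the following is a minimal Python sketch and not the implementation used in this work. All function names, the toy opcode sequences, the fixed vector length, the zero rank for unknown opcodes, and the alphabetical tie-breaking rule are assumptions made for illustration. Pattern occurrences are counted as contiguous matches only, which is a simplification: the sequence mining algorithm named above is not reproduced here, and the patterns are supplied by hand.

```python
from collections import Counter


def build_opcode_rank(malware_sequences):
    """Rank opcodes by total frequency in the malware set (rank 1 = most frequent)."""
    counts = Counter(op for seq in malware_sequences for op in seq)
    # Sort by descending count, then alphabetically so ties are deterministic
    # (the tie-breaking rule is an assumption of this sketch).
    ordered = sorted(counts, key=lambda op: (-counts[op], op))
    return {op: rank for rank, op in enumerate(ordered, start=1)}


def rank_vector(sequence, rank, length):
    """Replace each opcode by its rank, then truncate or zero-pad to `length`.

    Opcodes missing from the dictionary map to 0 (an assumption of this sketch).
    """
    vec = [rank.get(op, 0) for op in sequence[:length]]
    return vec + [0] * (length - len(vec))


def count_pattern(sequence, pattern):
    """Count contiguous, possibly overlapping, occurrences of `pattern`."""
    seq, pat = tuple(sequence), tuple(pattern)
    m = len(pat)
    return sum(1 for i in range(len(seq) - m + 1) if seq[i:i + m] == pat)


def msp_frequency_vector(sequence, msps):
    """One feature per pattern: how often that pattern occurs in the sample."""
    return [count_pattern(sequence, p) for p in msps]


# Toy opcode sequences standing in for disassembled ARM samples (hypothetical).
toy_malware = [
    ["LDR", "MOV", "BL", "LDR", "MOV"],
    ["LDR", "STR", "BL"],
]
# Hand-written patterns standing in for mined MSPs (hypothetical).
toy_msps = [("LDR", "MOV"), ("BL",)]

opcode_rank = build_opcode_rank(toy_malware)
sample_rank_vec = rank_vector(["MOV", "LDR", "ADD"], opcode_rank, length=5)
sample_msp_vec = msp_frequency_vector(toy_malware[0], toy_msps)
```

Either kind of vector could then be passed to any of the classifiers listed above. Note that the MSP vector has one entry per pattern, so its length does not depend on the length of the sample.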



Malware Detection, Sequential Pattern Mining, Polymorphic Malware, Malware in Internet of Things (IoT)