CUTE: Traffic Classification Using TErms

Abstract

Among traffic classification approaches, Deep Packet Inspection (DPI) methods are considered the most accurate. These methods, however, have two drawbacks: (i) they are inefficient, since they use complex regular expressions as protocol signatures, and (ii) they require manual intervention to generate and maintain signatures, partly because of the signature complexity. In this paper, we present CUTE, an automatic traffic classification method that relies on sets of weighted terms as protocol signatures. The key idea behind CUTE is the observation that, given appropriate weights, the occurrence of a specific term is more important than the relative location of terms in a flow. This observation is based on experimental evaluations as well as theoretical analysis, and leads to several key advantages over previous classification techniques: (i) CUTE is much faster than other classification schemes, since matching flows against weighted terms is significantly faster than matching regular expressions; (ii) CUTE can classify network traffic using only the first few bytes of a flow in most cases; and (iii) unlike most existing classification techniques, CUTE can be used to classify partial (or even slightly modified) flows. Even though CUTE replaces complex regular expressions with a set of simple terms, we show through theoretical analysis and experimental evaluations (based on two large packet traces from tier-one ISPs) that its accuracy is as good as or better than that of existing complex classification schemes, i.e., CUTE achieves precision and recall rates of more than 90%. Additionally, CUTE can successfully classify more than half of the flows that other DPI methods fail to classify.
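
To make the idea of weighted-term signatures concrete, the following is a minimal sketch and not the authors' implementation. It assumes a signature is a mapping from byte-string terms to weights, that a flow's score for a protocol is the sum of the weights of the terms occurring anywhere in the first few bytes, and that a flow is assigned to the highest-scoring protocol above a threshold. The names SIGNATURES, score, and classify, the example terms and weights, and the max_bytes and threshold parameters are all hypothetical; the abstract does not specify how CUTE selects or weights terms.

```python
from typing import Dict, Optional

# Hypothetical signatures: protocol -> {term: weight}.
# Terms and weights are illustrative assumptions, not values from the paper.
SIGNATURES: Dict[str, Dict[bytes, float]] = {
    "http": {b"GET ": 0.5, b"HTTP/1.1": 0.8, b"Host:": 0.6},
    "smtp": {b"EHLO": 0.9, b"MAIL FROM": 0.8, b"220 ": 0.3},
}


def score(payload: bytes, terms: Dict[bytes, float]) -> float:
    """Sum the weights of the terms that occur anywhere in the payload.

    Only occurrence matters here; the position and order of terms are ignored.
    """
    return sum(weight for term, weight in terms.items() if term in payload)


def classify(payload: bytes, max_bytes: int = 256,
             threshold: float = 1.0) -> Optional[str]:
    """Return the best-scoring protocol, or None if no score reaches the threshold.

    Only the first max_bytes bytes of the flow are inspected (assumed cutoff).
    """
    prefix = payload[:max_bytes]
    scores = {proto: score(prefix, terms) for proto, terms in SIGNATURES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    flow = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
    print(classify(flow))
    # Reordered (slightly modified) flow: the same terms occur, in another order.
    print(classify(b"Host: example.com\r\nGET /index.html HTTP/1.1\r\n"))
```

Because the score depends only on which terms occur, the reordered flow in the usage example receives the same label as the original one, which is the property that a position-sensitive regular expression would not provide.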
