Let me have a look at your data! Perfect, you have an anomaly!

Being perfect in the world of big data has a different meaning. When we analyze behavior and look at your business data, anomalies are the gold nuggets we are looking for. The state of “normal” doesn’t sit well with data. Anomalies and big changes, good or bad, are what help a business grow and improve.

Three’s Company

Anomalies in streaming data can occur in three primary ways. The first is when a value goes above or below an expected range, for example a temperature reading outside the expected range of temperatures. The second depends on when, or how often, a data pattern appears: 20 people inside an office at 9 am is usual, but 20 people inside an office at 1 am is an anomaly. The third type is relative, based on how one metric behaves in relation to another: a car decelerating after the brakes are applied is usual, but a car that accelerates after the brakes are applied is a dangerous anomaly. OK, now we need to detect them.
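To make the three types concrete, here is a minimal Python sketch with one check per type. The thresholds, office hours, and function names are all made up for illustration; a real system would learn these limits from data rather than hard-code them.

```python
from datetime import datetime

# Hypothetical limits, chosen only for illustration.
TEMP_RANGE = (15.0, 30.0)      # expected temperature range, in Celsius
OFFICE_HOURS = range(8, 19)    # hours (08:00 to 18:59) when a busy office is normal
BUSY_HEADCOUNT = 20            # headcount that is only usual during office hours


def out_of_range(temperature: float) -> bool:
    """Type 1: the value falls below or above the expected range."""
    low, high = TEMP_RANGE
    return temperature < low or temperature > high


def unusual_for_time(headcount: int, when: datetime) -> bool:
    """Type 2: a normal-looking value shows up at an unusual time."""
    return headcount >= BUSY_HEADCOUNT and when.hour not in OFFICE_HOURS


def contradicts_related_signal(brake_applied: bool,
                               speed_before: float,
                               speed_after: float) -> bool:
    """Type 3: one metric moves against what a related metric implies."""
    return brake_applied and speed_after > speed_before
```

With these assumed limits, 20 people at 9 am passes the second check, 20 people at 1 am fails it, and a car that speeds up while braking fails the third.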

Fantastic Anomalies and Where to Find Them

There are numerous techniques to detect anomalies in streaming data. Traditional statistical methods track the history of data streams, create correlations among fields, and then report unseen values. Then came ML and AI (machine learning and artificial intelligence), where a machine learns from past data and builds a model of it. Anomalous behavior is detected by comparing real-time data values with the generated model.
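As a rough illustration of the "build a model from history, then compare live values against it" idea, the sketch below uses the simplest statistical model there is: a mean and a standard deviation. The function names, the sample readings, and the threshold of 3 standard deviations are assumptions made for this example only.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Summarise past readings as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def anomaly_score(value, baseline):
    """Distance of a new reading from the baseline, in standard deviations."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any different value is maximally unusual.
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def is_anomalous(value, baseline, threshold=3.0):
    """Flag readings that sit more than `threshold` deviations from the mean."""
    return anomaly_score(value, baseline) > threshold


# Illustrative temperature history, not real data.
history = [21.0, 21.5, 22.0, 21.8, 22.3, 21.9, 22.1, 21.7]
baseline = fit_baseline(history)
```

Here a new reading of 22.0 stays well inside the threshold, while a reading of 30.0 is flagged. Note that the baseline is fixed once it is fitted, which matters as soon as the data pattern changes.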

Both traditional statistics and traditional machine learning have some drawbacks. Traditional statistics cannot be generalized to all applications: different data models need different statistical approaches before they can detect anomalies. Then there is the issue of continuous learning, where both statistics and traditional ML techniques fall short: as the data pattern changes, models have to be manually adjusted using new training data.

Anomaly detection in Cantiz Nucleus uses HTM and CLA to detect unusual patterns in streaming data. HTM (hierarchical temporal memory) is a family of learning algorithms that can store, learn, infer, and recall high-order sequences. HTM helps a system remember patterns in time-series data, correlate metrics within the data, and surface relative anomalies. CLA (cortical learning algorithm) provides continuous learning, i.e. the models generated through training change dynamically as the data pattern changes, which avoids repetitive manual training for each use case.
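The continuous-learning property is easier to picture with a toy example. The sketch below is not HTM or CLA and says nothing about how Cantiz Nucleus is implemented; it swaps in a much simpler technique, an exponentially weighted mean and variance, purely to show a model that keeps adjusting itself as readings arrive. The class name, `alpha`, `threshold`, `warmup`, and the sample stream are all hypothetical.

```python
class AdaptiveDetector:
    """Toy detector whose notion of 'normal' is updated on every reading."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # how quickly old data is forgotten
        self.threshold = threshold  # flag beyond this many standard deviations
        self.warmup = warmup        # readings to observe before flagging
        self.count = 0
        self.mean = 0.0
        self.var = 0.0

    def observe(self, value):
        """Return True if `value` looks anomalous, then learn from it."""
        self.count += 1
        if self.count == 1:
            self.mean = value
            return False
        deviation = value - self.mean
        std = self.var ** 0.5
        flagged = (self.count > self.warmup
                   and std > 0
                   and abs(deviation) > self.threshold * std)
        # Exponentially weighted update of mean and variance.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return flagged


# Illustrative stream: readings hover around 20, then settle around 40.
stream = [20, 21, 19] * 4 + [40, 41, 39] * 10
detector = AdaptiveDetector()
flags = [detector.observe(x) for x in stream]
```

On this stream the jump to 40 is flagged when it happens, but the flags stop once the detector has absorbed the new level, with no retraining step in between. A baseline fitted once on the first twelve readings would keep flagging every later value until someone refitted it by hand.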

Love your anomalies

Anomaly detection enables users to create big data projects with descriptions and sample data. The description and sample data play a pivotal role in training and model creation. Once the model is created, users can pass their real stream data to the model, which reports anomalous behavior with a score and the data point attached to it. And this is where you start to love your anomalies.