How we teach machines to see the invisible: an ML approach to black ice detection

December 18, 2025

IoT

Weather condition monitoring

Road condition monitoring

The problem nobody sees

Black ice is one of the most insidious hazards on the road. This transparent layer of ice on asphalt is virtually impossible to spot with the naked eye, yet it causes thousands of accidents every winter. Road services traditionally rely on weather forecasts and visual inspection, but these methods don't provide an accurate real-time picture.

At Prylada, we decided to approach the problem differently: instead of trying to see ice, we taught machines to sense it through sensors and understand it through machine learning.

Our approach: when data speaks for itself

Prylada's black ice detection solution: from sensors to intelligence

While this article focuses on the ML approach to analyzing sensor data, it's important to understand the complete Prylada system that makes this intelligence possible. Our black ice detection solution combines hardware, connectivity, and machine learning into a comprehensive road safety platform.

Complete IoT system for road safety

Our monitoring system collects data from multiple sources simultaneously:

  • Radar analyzes road surface structure
  • Temperature sensors track surface conditions
  • Humidity sensors detect the presence of water

But raw data is just numbers. The real magic begins when we apply machine learning to transform these numbers into understanding.

Philosophy: reliability through determinism

We deliberately chose traditional deterministic ML over trendy generative models. Why? Imagine a road safety system giving different answers each time for the same sensor readings. Absurd, right?

When surface temperature is -2°C, humidity is 95%, and the radar shows a specific signal amplitude, the road condition must be determined unambiguously. Every time. Without variation. That's determinism: same input data → same result.

In critical systems, predictability matters more than creativity.

The main challenge: learning without a teacher

The problem of labeled data

Typically, ML models are trained with supervised learning, where you have examples of correct answers. Want to predict radiation from temperature and humidity? Collect a dataset where each combination of temperature and humidity has a recorded radiation value. The model finds patterns and learns to predict.

But with black ice, we face a fundamental problem: we have no "teacher".

Nobody stands 24/7 at every intersection with a sign saying "wet ice here now" or "dry now". It's impossible to create a labeled dataset of road surface conditions for training.

Solution: the system learns on its own

We applied unsupervised learning, that is, learning without a teacher. The system independently analyzes sensor data and identifies natural groups: patterns that correspond to different road conditions.

This approach has proven effective across various IoT applications. Research shows that unsupervised learning methods like K-means successfully handle unlabeled sensor data, achieving high accuracy in classification tasks where traditional labeled datasets are unavailable or prohibitively expensive to create.

It's like showing a child thousands of animal photos without naming them, and they notice on their own: "These big striped ones are usually together, and these small fluffy ones are a separate group." Our system works the same way: it finds groups of similar states in the multidimensional space of sensor data.

How it works: clustering in 6 dimensions

Multidimensional state space

Each moment in time is characterized by 6 parameters:

  • Humidity
  • Surface temperature
  • Radar signal amplitude
  • Additional metrics from radar and external sensors (the remaining three parameters)

These 6 numbers can be represented as a point in 6-dimensional space. Yes, we can't visualize this (our brains are wired for 3 dimensions), but mathematics handles it perfectly.
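To make the idea concrete, here is a minimal Python sketch of turning readings into points. The article does not name the three additional metrics, so the `extra_metric_*` fields are placeholders, and the z-score scaling step is an assumption of this sketch (a common preparation for distance-based clustering), not a confirmed part of the Prylada pipeline.

```python
from dataclasses import dataclass, astuple

import numpy as np


@dataclass(frozen=True)
class SensorReading:
    """One moment in time. The three extra_metric_* names are hypothetical."""
    humidity_pct: float
    surface_temp_c: float
    radar_amplitude: float
    extra_metric_1: float  # placeholder for an unnamed radar/external metric
    extra_metric_2: float  # placeholder
    extra_metric_3: float  # placeholder


def to_matrix(readings):
    """Stack readings into an (n_samples, 6) array: one row per moment."""
    return np.array([astuple(r) for r in readings], dtype=float)


def standardize(X):
    """Z-score each column so no single unit dominates Euclidean distance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against division by zero for constant columns
    return (X - mean) / std, mean, std


# Made-up readings, for illustration only.
readings = [
    SensorReading(95.0, -2.0, 0.80, 0.1, 0.2, 0.3),
    SensorReading(40.0, 12.0, 0.20, 0.4, 0.1, 0.5),
    SensorReading(70.0, 3.0, 0.50, 0.2, 0.3, 0.4),
]
X = to_matrix(readings)
X_scaled, mean, std = standardize(X)
print(X.shape)  # (3, 6)
```

Keeping `mean` and `std` around matters: a new reading has to be scaled with the same values before it can be compared with the cluster centers.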

Finding meaning in correlations

Before training, we analyze how parameters relate to each other. For example, observations show:

  • A positive correlation of 0.73 between radar amplitude and ice formation: when amplitude increases, ice probability also increases
  • A negative correlation of -0.69 between humidity and temperature, a typical inverse relationship

These connections help the model better understand the physics of the processes.
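A correlation check like this can be sketched in a few lines of NumPy. The data below is synthetic (temperature and humidity are constructed to be inversely related), so the printed values illustrate the technique and are not the figures quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic data only: humidity is built to fall as temperature rises.
n = 500
temperature = rng.normal(loc=2.0, scale=5.0, size=n)
humidity = 80.0 - 2.0 * temperature + rng.normal(scale=8.0, size=n)
amplitude = rng.normal(loc=0.5, scale=0.1, size=n)  # independent of the others

X = np.column_stack([humidity, temperature, amplitude])
names = ["humidity", "temperature", "amplitude"]

# rowvar=False: each column is a variable, each row an observation.
corr = np.corrcoef(X, rowvar=False)

# Print each pair once (upper triangle of the matrix).
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]} vs {names[j]}: {corr[i, j]:+.2f}")
```

The result is a symmetric matrix of Pearson coefficients between -1 and +1, which is what a correlation heatmap displays.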

K-means: how the system finds groups

We use the K-means algorithm—one of the classic clustering methods:

  1. Initialization: Initial group "centers" are randomly selected in 6-dimensional space
  2. Iteration: Each data point (moment in time) is assigned to the nearest center
  3. Update: Centers shift to the average point of their group
  4. Repeat: The process continues until centers stabilize
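The four steps map almost line for line onto code. Below is a minimal NumPy sketch of the classic algorithm on synthetic 6-dimensional data. It is an illustration, not the production implementation, and the `seed` parameter is an addition of this sketch: since initialization is random, fixing the seed is one way to make training itself reproducible.

```python
import numpy as np


def kmeans(X, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm. Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # 1. Initialization: pick k distinct data points as starting centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # 2. Assignment: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update: each center moves to the mean of its points.
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        # 4. Repeat until the centers stop moving.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


# Two synthetic, well-separated groups in 6 dimensions.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(50, 6))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(50, 6))
X = np.vstack([blob_a, blob_b])

centers, labels = kmeans(X, k=2, seed=0)
centers_again, labels_again = kmeans(X, k=2, seed=0)

# Same data and same seed give the same model.
print(np.array_equal(centers, centers_again))
```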

How many groups are needed? We use the Elbow method: analysis showed that splitting the data into 5 clusters is optimal. This number gives good separation of states with the smallest number of clusters; adding more clusters yields only marginal improvement.
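The Elbow method can be sketched as follows: run K-means for several values of k, record the within-cluster sum of squared distances (inertia), and look for the k after which the curve flattens. Several things here are assumptions of the sketch: the data is synthetic with 3 groups (so the elbow lands at 3, not 5), the deterministic farthest-point initialization is swapped in for random initialization to keep the example reproducible, and `pick_elbow` with its `min_gain` threshold is a simple illustrative heuristic, since the elbow is usually read off a plot.

```python
import numpy as np


def farthest_point_init(X, k):
    """Deterministic start: the first point, then repeatedly the farthest one."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(X[d.min(axis=1).argmax()])
    return np.array(centers, dtype=float)


def kmeans_inertia(X, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = farthest_point_init(X, k)
    for _ in range(max_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return float((d.min(axis=1) ** 2).sum())


def pick_elbow(ks, inertias, min_gain=0.3):
    """Return the first k where adding one more cluster helps by < min_gain."""
    for i in range(len(ks) - 1):
        gain = (inertias[i] - inertias[i + 1]) / inertias[i]
        if gain < min_gain:
            return ks[i]
    return ks[-1]


# Synthetic data: three well-separated groups in 2 dimensions.
rng = np.random.default_rng(1)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in true_centers])

ks = list(range(1, 7))
inertias = [kmeans_inertia(X, k) for k in ks]
best_k = pick_elbow(ks, inertias)
print(best_k)
```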

What we get as output

The system automatically identified 5 different road surface conditions:

  • Dry - dry surface (safe)
  • Damp - damp surface
  • Wet - wet surface (after rain)
  • Frost - ice/frost
  • Slush - snow/mixture of snow and water

We assign these names after training, analyzing the characteristics of each cluster. For the algorithm, these are just groups 0, 1, 2, 3, 4—but for us, these are safety-critical states.

Visualization: understanding the clusters

PCA 2D projection with cluster centers

The visualization shows how our 6-dimensional data projects onto a 2D plane using Principal Component Analysis.

The first visualization demonstrates how the algorithm sees the data. Each point represents a moment in time with its 6 sensor readings, projected onto a 2D plane for human comprehension. Principal Component Analysis (PCA) is widely used in road surface monitoring applications to reduce dimensionality while preserving as much of the variance in the data as possible. The clusters appear as follows:

  • Purple cluster (damp) — high humidity, moderate temperature
  • Dark blue cluster (wet) — wet surface conditions
  • Cyan cluster (dry) — dry and safe conditions
  • Green cluster (frost/anomaly) — ice formation zone
  • Yellow-green cluster (slush) — mixed precipitation

The red X marks show cluster centers in this projection—the "ideal" representatives of each state in 6-dimensional space. Notice how clearly separated the clusters are, with minimal overlap. This separation is what makes accurate classification possible.
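A projection like this can be sketched with NumPy alone. The example below uses synthetic 6-dimensional data with two groups; `fit_pca` and `project` are names chosen for this illustration. The key detail is that cluster centers are projected with the same mean and axes as the data points, which is what lets them be drawn on the same plot.

```python
import numpy as np


def fit_pca(X, n_components=2):
    """Return (mean, components) from an SVD of the centered data."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(points, mean, components):
    """Map 6-D points onto the principal axes (result has 2 columns)."""
    return (np.atleast_2d(points) - mean) @ components.T


# Synthetic data: two groups in 6 dimensions.
rng = np.random.default_rng(7)
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(60, 6))
cluster_b = rng.normal(loc=3.0, scale=0.5, size=(60, 6))
X = np.vstack([cluster_a, cluster_b])

mean, components = fit_pca(X)
points_2d = project(X, mean, components)

# Centers go through the same transform as the data points.
center_a_2d = project(cluster_a.mean(axis=0), mean, components)
center_b_2d = project(cluster_b.mean(axis=0), mean, components)
print(points_2d.shape)  # (120, 2)
```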

Pair plot: deep dive into feature relationships

This visualization shows all pairwise relationships between our features.

This comprehensive pair plot reveals the full complexity of our data.

Key insights from the visualization:

  • Diagonal histograms show the distribution of each feature across clusters—notice how different states occupy different ranges
  • Scatter plots reveal relationships between feature pairs—some clusters separate clearly on certain combinations (like corrected_amplitude vs humidity)
  • Temperature patterns (bottom rows) show distinct groupings—cold temperatures correlate strongly with ice/frost clusters
  • Yellow cluster (cluster 4) appears as an outlier in many projections—these are anomalies or edge cases

The color coding helps identify which combinations of features are most discriminative. For instance, the relationship between temperature and humidity clearly separates safe (dry) from dangerous (frost/wet) conditions.

Real-time operation

From data to decision in seconds

When the trained model receives new sensor readings:

  1. A 6-value vector is formed
  2. Distance to each of the 5 cluster centers is calculated
  3. Data is assigned to the nearest cluster
  4. The system instantly reports the state, for example "Frost" or "Dry"
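These steps amount to a nearest-center lookup. Here is a minimal sketch under stated assumptions: the `CENTERS` values are invented, the mapping from cluster index to name in `STATE_NAMES` is hypothetical (the article assigns names by inspecting each cluster after training), and the input is assumed to be scaled the same way as the training data.

```python
import numpy as np

# Hypothetical centers: 5 clusters x 6 features. Real ones come from training.
CENTERS = np.array([
    [-1.2,  1.0, -0.8,  0.0, 0.1, -0.2],
    [ 0.3,  0.4, -0.2,  0.1, 0.0,  0.3],
    [ 1.0,  0.2,  0.4,  0.3, 0.2,  0.1],
    [ 0.9, -1.3,  1.2,  0.5, 0.4, -0.1],
    [ 0.6, -0.6,  0.7, -0.4, 0.8,  0.6],
])

# Hypothetical index-to-name mapping, assigned by a person after training.
STATE_NAMES = {0: "Dry", 1: "Damp", 2: "Wet", 3: "Frost", 4: "Slush"}


def classify(reading, centers=CENTERS, names=STATE_NAMES):
    """Return (state_name, cluster_index) for one 6-value reading."""
    # 1. Form the 6-value vector.
    vector = np.asarray(reading, dtype=float)
    if vector.shape != (centers.shape[1],):
        raise ValueError(f"expected {centers.shape[1]} values, got {vector.shape}")
    # 2. Distance to each of the 5 cluster centers.
    distances = np.linalg.norm(centers - vector, axis=1)
    # 3. Assign to the nearest cluster.
    cluster = int(distances.argmin())
    # 4. Report the human-readable state.
    return names[cluster], cluster


state, cluster = classify([0.8, -1.2, 1.1, 0.4, 0.5, 0.0])
print(state, cluster)
```

Because `classify` has no random component, the same reading always yields the same state.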

All this happens on the server side, automatically, without human intervention.

Adapting to change

Weather and road conditions change. A model trained in October may lose accuracy in January. That's why we retrain the system approximately once a week, when:

  • New data accumulates
  • Weather conditions change dramatically

Importantly, retraining happens based on the previous model, so we don't lose accumulated knowledge but supplement it with new data.
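The article does not spell out how retraining builds on the previous model. One plausible reading, sketched below purely as an assumption, is a warm start: the K-means iterations on new data begin from the previous centers instead of random ones. A side effect is that cluster indices keep their meaning, so the cluster that was named "Frost" last week is still the same index this week.

```python
import numpy as np


def refine_centers(X_new, previous_centers, max_iter=50):
    """Lloyd iterations on new data, starting from the previous model's centers."""
    centers = np.array(previous_centers, dtype=float)
    for _ in range(max_iter):
        d = np.linalg.norm(X_new[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X_new[labels == j]
            if len(members) > 0:  # a cluster with no new data keeps its center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


# Synthetic example in 2 dimensions: conditions drift slightly over a week.
previous_centers = [[0.0, 0.0], [5.0, 5.0]]
rng = np.random.default_rng(3)
X_new = np.vstack([
    [0.5, 0.5] + rng.normal(scale=0.2, size=(100, 2)),
    [5.5, 5.5] + rng.normal(scale=0.2, size=(100, 2)),
])

refined = refine_centers(X_new, previous_centers)
print(refined.round(2))
```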

Advantages over traditional approaches

1. No manual configuration

The traditional approach requires months of engineering work to set threshold values:

  • "If temperature below -1°C AND humidity above 90%, then ice"
  • "If radar amplitude in range X-Y, then snow."

Our system finds these boundaries itself, and does it better than humans by analyzing relationships across all 6 dimensions simultaneously. This aligns with Road Weather Information System (RWIS) best practices, which increasingly incorporate machine learning to improve detection accuracy beyond traditional threshold-based systems.
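For comparison, the first threshold rule quoted above looks like this in code. The function name and the returned strings are illustrative. The radar rule is left out because its range is not specified.

```python
def threshold_rule(surface_temp_c, humidity_pct):
    """Hand-written rule from the text: below -1 C and above 90% means ice."""
    if surface_temp_c < -1.0 and humidity_pct > 90.0:
        return "ice"
    return "no ice"


print(threshold_rule(-2.0, 95.0))  # ice
print(threshold_rule(-0.9, 95.0))  # no ice: a 0.1 C difference flips the answer
print(threshold_rule(-2.0, 89.5))  # no ice: so does half a percent of humidity
```

Each condition looks at one parameter at a time, and every cut-off has to be chosen and maintained by hand.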

2. Solution universality

The same approach can be applied to different monitoring tasks:

  • Black ice detection (implemented)
  • Helicopter flight state determination (in development)
  • Multi-layer road surface analysis (planned)

3. User transparency

Road services don't need to understand correlation matrices or multidimensional spaces. They receive a simple, clear result: "Frost, treatment required" or "Dry, safe".

Visualization: from complexity to clarity

For analytics

During development, we use complex graphs:

  • Correlation matrices — show parameter relationships
  • Pair plots — visualize clusters in 2D projections

These tools help us verify that the model correctly separates states.

For end users

For end clients, we plan to provide density plots:

  • Clearly show the current state
  • Display system confidence (proximity to cluster center)
  • Border values are visually apparent—it's clear why there might be fluctuation between states
  • Anomalies are highlighted in a different color
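The article describes confidence only as proximity to the cluster center. One possible way to turn that into a number, offered here as an assumption and not as the Prylada formula, is a margin score that compares the distance to the nearest center with the distance to the second nearest: 1.0 means the reading sits exactly on a center, 0.0 means it sits on the border between two states.

```python
import numpy as np


def margin_confidence(vector, centers):
    """Hypothetical score in [0, 1]: 1 at a center, 0 on a cluster border."""
    diffs = np.asarray(centers, dtype=float) - np.asarray(vector, dtype=float)
    d = np.sort(np.linalg.norm(diffs, axis=1))
    if d[1] == 0.0:  # two identical centers: no meaningful margin
        return 0.0
    return float(1.0 - d[0] / d[1])


# Made-up centers in 2 dimensions for readability.
centers = [[0.0, 0.0], [4.0, 0.0], [0.0, 10.0]]

print(margin_confidence([0.0, 0.0], centers))  # exactly on a center
print(margin_confidence([1.0, 0.0], centers))  # closer to the first center
print(margin_confidence([2.0, 0.0], centers))  # halfway between two centers
```

Readings with a low score are the ones likely to fluctuate between two states.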

But most often, a simple text status in the interface is sufficient.

What's next: expanding capabilities

Air transport monitoring

The next task is to determine the helicopter's flight state. The system will analyze:

  • GPS data (coordinates, speed, altitude)
  • Pressure sensors
  • Accelerometers

Interesting feature: cross-checking altitude from GPS against altitude derived from atmospheric pressure will protect against GPS failures.
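Such a check might look like the sketch below. It converts pressure to altitude with the standard international barometric formula and flags a disagreement with GPS. The 50 m tolerance and the function names are assumptions made for illustration, and in practice the sea-level reference pressure would need to be calibrated to local weather.

```python
SEA_LEVEL_PRESSURE_HPA = 1013.25  # standard atmosphere; varies with weather


def pressure_altitude_m(pressure_hpa, sea_level_hpa=SEA_LEVEL_PRESSURE_HPA):
    """International barometric formula: pressure (hPa) to altitude (m)."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


def altitudes_agree(gps_altitude_m, pressure_hpa, tolerance_m=50.0):
    """True if GPS and barometric altitude differ by at most tolerance_m."""
    return abs(gps_altitude_m - pressure_altitude_m(pressure_hpa)) <= tolerance_m


# About 898.76 hPa corresponds to roughly 1000 m in the standard atmosphere.
print(round(pressure_altitude_m(898.76)))
print(altitudes_agree(1000.0, 898.76))  # GPS consistent with the barometer
print(altitudes_agree(3000.0, 898.76))  # GPS reading looks suspicious
```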

Again, unsupervised learning—no one will label every second of flight.

Multi-layer radar analysis

Our radars can "see" not just the surface but also the pavement structure in depth:

  • Layer of water or snow from precipitation
  • Concrete pavement
  • Base (soil, if concrete is porous)

This will open new possibilities for road surface diagnostics. We're currently working on radar firmware and noise filtering algorithms.

Approach philosophy: trust data, verify results

Our experience demonstrates several important principles:

1. Determinism for critical systems
In safety, there's no room for randomness. Predictability matters more than flexibility.

2. Data is smarter than humans at finding patterns
Unsupervised learning finds patterns that humans might miss in multidimensional space.

3. Simplicity at the output, complexity inside
The user gets a simple answer, but behind it lies correlation analysis, 6-dimensional clustering, and continuous retraining.

4. Adaptation is critical
A static model becomes outdated. Regular retraining maintains relevance.

5. Approach universality
One methodology solves different monitoring tasks, from roads to helicopters.

Conclusion: machine learning as a bridge between sensors and decisions

Black ice is invisible to the eye but not to a properly trained ML system. We don't just collect sensor data; we extract meaning from it and turn it into action.

The traditional approach would require an army of specialists constantly adjusting thresholds and rules. Our approach is a self-learning system that adapts to reality and works more accurately than humans. While other approaches, such as deep learning, have achieved 96% accuracy, they require extensive labeled datasets that are often impractical to obtain for road monitoring applications.

This is just the beginning. Each new sensor type, each new monitoring task, is an opportunity to apply the same principle: let the data speak for itself.
