Depth Enhanced Heat Map Generation


Today’s stores are not just interested in what you buy. They also want to know how you shop, how much you shop, and how you move about the store. Tracking and understanding the shopper within the brick-and-mortar space has been challenging and of high interest in the retail ecosystem. This challenge is not unique to retail; it is equally present in many other verticals such as casinos, malls, sports stadiums and even office environments. With the challenges introduced by COVID, it is even more important to understand and track the shopper, because tracking shopper behavior can inform retailers about better distancing measures and provide other insights that help maintain customer safety.

Currently there are two types of technology that these vertical platforms use: basic IP (internet protocol) security cameras and heat maps. The biggest challenge in tracking shoppers arises when the two are used together. Although heat maps provide store owners with valuable insight, the challenge stems from heat map data being completely anonymized. Furthermore, the exact distance from the shopper to the camera is unknown, which leaves room for error and subjective interpretation of the outputs.

Experimenting with basic store layouts isn’t a new idea. For instance, grocery stores place the higher-margin items at the front of the store and the more in-demand items on the right-hand side of an aisle. But using heat maps and depth cameras can make this analysis possible on an entirely different scale.

Methodology and Approach

I present to you H.E.A.T

H.E.A.T is a platform for the service model. The market for this platform has great growth potential, and the platform can benefit various industries such as retail and gaming/entertainment. Both of these industries want data on their customers’ foot traffic, the patterns in that traffic, and the areas where customers spend most of their dwell time (Hot Spots).

Our customers are companies that are investing in smart spaces and are challenged by the robustness issues of current people-tracking systems. For example, the customer can be a grocery store, a clothing store, a casino or a sports stadium. Casinos will benefit greatly from this technology because it will help them address two main issues: keeping their staff and customers safe, and catching anomalous or dangerous situations before they arise. Adopting the combination of heat map technology and depth cameras will be game changing for grocery and clothing stores, as it will help them track their customers’ purchase patterns more effectively and identify the saturation areas and times in their stores.

With the implementation of H.E.A.T, retail companies will have a better tool to understand and analyze the traffic in their stores. Solutions that address human traffic optimization have been implemented before, but they have drawbacks such as a lack of immediate feedback, being complicated to use, and obstructed camera views. What makes H.E.A.T different is that it uses a depth camera not only to generate heat maps but also to get exact measurements of the floor space. Instead of having to watch endless hours of security camera footage to see which areas lack or exceed in traffic over a long period of time, or look at confusing heat maps that aggregate data over several weeks or months, these companies will be able to see real-time heat mapping of their store and implement the changes they need to make the store work as well as possible.

Additionally, current solutions have transformation issues since the floor tends to obstruct the scene view. With the addition of depth cameras, the floor can be used to determine the proper distance from the camera and get exact measurements for a more accurate heat map.

For example, consider customers who are unsure of a product: by looking at the hot spots on the map, employees can see the real-time clusters that appear throughout the store and help those who are having trouble finding a specific item. If certain aisles are more popular during certain hours of the day, the manager can use this information to better restock those aisles and gain more revenue by modifying product placement to fit customers’ needs. Keep in mind that generating heat maps is the low-hanging fruit; this technology has the potential to support higher-level analysis features.


In this section, we will go over the implementation details of H.E.A.T.


Our algorithm is as follows:

  1. Collect RGB and depth data from the RealSense camera and use the suggested filters/preprocessing algorithms to clean up noise.
  2. Use OpenVINO image segmentation networks on the RGB image to find the pixels associated with humans.
  3. Using the depth data, find the ground plane. For demo purposes, we fit a sum-of-exponentials function as suggested by this paper. The algorithm for fitting a sum-of-exponentials function can be found here. Additionally, use the pixels from step 2 to find the depth data associated with each human.
  4. Convert the depth data from pixel coordinates to 3D camera-frame coordinates, where the z-axis refers to the depth.
  5. Project the mean of the human pixels onto the ground plane and show a top-down view of the plane to the user.
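
Steps 4 and 5 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: it assumes an ideal pinhole camera with no lens distortion, a ground plane already available in the form normal . p + offset = 0 with a unit-length normal, and a level camera so that (x, z) can be read directly as top-down floor coordinates. The function names, the intrinsics and the sample pixels are all hypothetical.

```python
from statistics import fmean


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to camera-frame (x, y, z).

    Assumes an ideal pinhole model: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. The z-axis is the depth axis.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def centroid(points):
    """Mean of a list of 3D points (the 'mean of the human pixels' in 3D)."""
    return tuple(fmean([p[i] for p in points]) for i in range(3))


def project_to_plane(point, normal, offset):
    """Orthogonally project a 3D point onto the plane normal . p + offset = 0.

    `normal` is assumed to be unit length.
    """
    signed_dist = sum(n * c for n, c in zip(normal, point)) + offset
    return tuple(c - signed_dist * n for c, n in zip(point, normal))


# Hypothetical intrinsics and human pixels (u, v, depth in metres).
fx = fy = 600.0
cx, cy = 320.0, 240.0
human_pixels = [(320, 240, 2.0), (380, 240, 2.0)]

# Assumed ground plane: camera y-axis points down, floor 1.5 m below the camera.
ground_normal = (0.0, 1.0, 0.0)
ground_offset = -1.5

points_3d = [deproject(u, v, d, fx, fy, cx, cy) for u, v, d in human_pixels]
foot_point = project_to_plane(centroid(points_3d), ground_normal, ground_offset)
floor_xz = (foot_point[0], foot_point[2])  # top-down position on the floor
```

With a tilted camera, the top-down coordinates would instead come from expressing the projected point in two axes that lie within the fitted ground plane.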

The figure below illustrates the workflow described above.


Technology Used

We use an Intel RealSense D435 for capturing depth data. On the AI/ML side, we use OpenVINO for optimized inference as well as Pandas, NumPy, OpenCV, and pyrealsense2 for processing. For displaying results, we use Plotly, Matplotlib, and Streamlit.
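
Before the results are displayed, the per-person floor positions have to be turned into something a plotting library can draw. One simple way to do that, shown here as a hedged sketch rather than the approach H.E.A.T necessarily takes, is to count positions per cell of a metric floor grid. The function name `accumulate_heatmap`, the floor extents and the cell size are assumptions made for illustration.

```python
import math


def accumulate_heatmap(positions, x_range, z_range, cell_m):
    """Count floor positions per grid cell.

    positions: iterable of (x, z) floor coordinates in metres.
    x_range, z_range: (min, max) extents of the mapped floor area in metres.
    cell_m: side length of one square grid cell in metres.
    Returns a list of rows (indexed by z) of columns (indexed by x).
    """
    x_min, x_max = x_range
    z_min, z_max = z_range
    cols = math.ceil((x_max - x_min) / cell_m)
    rows = math.ceil((z_max - z_min) / cell_m)
    grid = [[0] * cols for _ in range(rows)]
    for x, z in positions:
        if not (x_min <= x < x_max and z_min <= z < z_max):
            continue  # skip detections outside the mapped floor area
        # Clamp guards against floating-point rounding at the upper edge.
        col = min(int((x - x_min) / cell_m), cols - 1)
        row = min(int((z - z_min) / cell_m), rows - 1)
        grid[row][col] += 1
    return grid


# Hypothetical detections: two people near the same spot, one elsewhere,
# and one outside the 4 m x 4 m mapped area.
detections = [(0.1, 2.0), (0.2, 2.1), (1.6, 0.4), (5.0, 5.0)]
heatmap = accumulate_heatmap(detections, x_range=(-2.0, 2.0),
                             z_range=(0.0, 4.0), cell_m=0.5)
```

If one position per person is added on every frame, each cell count multiplied by the frame period approximates the dwell time in that cell, and the resulting grid can be passed to Matplotlib or Plotly as a 2D image.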