Spatio-temporal deep learning for modeling dynamic drop-surface interactions
Reuse License
Description of rights: InC-1.0
Abstract
Sliding water drops are a familiar everyday phenomenon, for example on windows, but they also play an important role in many industrial processes. They serve as sensitive probes of wetting, adhesion, friction, and electrostatic charges, yet quantitative analysis remains difficult. In particular, friction forces depend on the contact-line width and on the dynamic advancing and receding contact angles. Measuring these quantities along the sliding path is challenging because high-resolution front-view imaging requires complex optics and restricts the observable area.
This dissertation presents a single-view quantitative drop measurement framework that extracts drop geometry and dynamics from high-speed side-view videos using a combination of signal processing, computer vision, machine learning, and time series modeling. It enables automated analysis without additional cameras or mirror-based front-view setups and makes it possible to track drop metrics across the full sliding path.
First, the 4-segment super-resolution optimized-fitting (4S-SROF) method couples an Efficient Sub-Pixel Convolutional Network with an optimized polynomial fitting strategy to reconstruct high-resolution drop contours and extract dynamic contact angles from low-resolution videos. The method improves contact-angle accuracy by about 20% for angles below 90° and 30% above 90°, while remaining computationally efficient for large datasets.
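To illustrate the general idea of reading a contact angle off a polynomial fitted to the drop contour, the following is a minimal sketch. It is not the 4S-SROF method: it uses a single polynomial instead of four segments, has no super-resolution step, and assumes the baseline lies at y = 0 with contour points already extracted. All function names (`polyfit`, `contact_angle_deg`, `synthetic_left_edge`) are hypothetical.

```python
import math

def polyfit(ts, vs, degree):
    """Least-squares fit v ~ c[0] + c[1]*t + ... via the normal equations."""
    n = degree + 1
    a = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(v * t ** i for t, v in zip(ts, vs)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= f * a[col][k]
            b[row] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = b[i] - sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = s / a[i][i]
    return coeffs

def contact_angle_deg(xs, ys, side="left", degree=2):
    """Contact angle from contour points near the contact line.

    Assumption: baseline at y = 0, y measured upward. Fitting x as a
    function of y keeps the fit well behaved for angles near and above 90
    degrees, where y(x) becomes steep or multi-valued.
    """
    slope = polyfit(ys, xs, degree)[1]  # dx/dy evaluated at y = 0
    if side == "right":
        slope = -slope  # mirror so the liquid is always on the +x side
    return math.degrees(math.atan2(1.0, slope))

def synthetic_left_edge(theta_deg, radius=1.0, height=0.1, n=21):
    """Left edge of a circular-cap drop with a known contact angle (test data)."""
    yc = -radius * math.cos(math.radians(theta_deg))
    ys = [height * i / (n - 1) for i in range(n)]
    xs = [-math.sqrt(radius ** 2 - (y - yc) ** 2) for y in ys]
    return xs, ys

angle_60 = contact_angle_deg(*synthetic_left_edge(60.0))
angle_120 = contact_angle_deg(*synthetic_left_edge(120.0))
```

On this synthetic circular cap the sketch recovers both the 60° and the 120° test angle to within a degree; real contours are noisy and pixelated, which is the problem the super-resolution and optimized-fitting steps are meant to address.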
Second, the dissertation formulates front-view contact-line width estimation as a temporal inference problem from side-view measurements. Using water and water–glycerol drops on surfaces with controlled chemical and topographic patterns, an LSTM model achieves an RMSE of about 67 µm (approximately 2.4% relative error) and reconstructs drop width continuously along the sliding path, avoiding the mirror or second-camera limitation of front-view imaging.
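The error figures above can be made concrete with a short sketch of how an RMSE and a relative error might be computed for predicted widths. The width values below are invented for illustration only, and normalising the RMSE by the mean measured width is an assumption; the abstract does not state how the relative error is defined.

```python
import math

def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(measured))

def relative_rmse(predicted, measured):
    """RMSE divided by the mean measured value (assumed normalisation)."""
    mean_measured = sum(measured) / len(measured)
    return rmse(predicted, measured) / mean_measured

# Illustrative widths in micrometres; not data from the dissertation.
measured_um = [2800.0, 2750.0, 2900.0, 2850.0]
predicted_um = [2860.0, 2700.0, 2830.0, 2920.0]

error_um = rmse(predicted_um, measured_um)
error_rel = relative_rmse(predicted_um, measured_um)
```

Under the same assumed normalisation, an RMSE of about 67 µm at roughly 2.4% relative error would correspond to a typical drop width on the order of 2.8 mm.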
Third, hand-crafted features are replaced by end-to-end spatiotemporal representation learning using a CNN-Transformer architecture that operates on short video sequences and velocity. A position-invariant video processing pipeline keeps the drop centered in a sliding spatial window and reduces memory and computation by over 80%. A custom BlurVGG8-ConvTran model with low-dimensional absolute positional encoding achieves about 48 µm error (approximately 1.7% relative), remains robust under surface defects and imaging perturbations, and provides Grad-CAM visualizations to interpret salient regions.
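The idea of keeping the drop centered in a sliding spatial window can be sketched as follows. This is an illustrative simplification rather than the dissertation's pipeline: it assumes grayscale frames stored as lists of pixel rows, a dark drop on a bright background, and a fixed intensity threshold. The helper names are hypothetical.

```python
def drop_center_column(frame, threshold=128):
    """Midpoint column of all pixels darker than the threshold, or None."""
    cols = [x for row in frame for x, value in enumerate(row) if value < threshold]
    if not cols:
        return None
    return (min(cols) + max(cols)) // 2

def crop_centered(frame, window_width, threshold=128):
    """Crop a fixed-width window centered on the drop.

    Returns the cropped frame and the left offset of the window. The window
    is clamped to the frame, so the drop is off-center near the borders.
    """
    width = len(frame[0])
    center = drop_center_column(frame, threshold)
    if center is None:
        center = width // 2  # no drop found: fall back to the frame center
    left = center - window_width // 2
    left = max(0, min(left, width - window_width))
    return [row[left:left + window_width] for row in frame], left

def make_frame(drop_start, width=20, height=4, drop_width=3):
    """Synthetic frame: bright background, dark drop in the bottom two rows."""
    frame = [[255] * width for _ in range(height)]
    for y in (height - 2, height - 1):
        for x in range(drop_start, drop_start + drop_width):
            frame[y][x] = 0
    return frame

# The same drop at two positions along the sliding path.
crop_a, left_a = crop_centered(make_frame(12), window_width=7)
crop_b, left_b = crop_centered(make_frame(5), window_width=7)
saving = 1 - 7 / 20  # fraction of columns discarded in this toy example
```

Both crops are identical even though the drop sits at different positions in the full frame, which is the sense in which the downstream model input is position-invariant. The saving depends on the ratio of window to frame width; the toy value here says nothing about the figure reported above.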
Overall, the dissertation delivers a unified, data-driven pipeline for measurement from single-view scientific videos. The approach simplifies experimental instrumentation and enables precise, automated estimation of drop dynamics. It further allows researchers to monitor drop width continuously along the full sliding path, overcoming the field-of-view limitations of conventional front-view imaging. Beyond wetting, the proposed methods provide broadly applicable tools for learning-based measurement under practical imaging constraints.