TacEva: A Performance Evaluation Framework for Vision-Based Tactile Sensors

Abstract

Vision-based tactile sensors (VBTSs) are widely used in robotic tasks because of the high spatial resolution they offer and their relatively low manufacturing costs. However, variations in their sensing mechanisms, structural dimensions, and other parameters lead to significant performance disparities between VBTSs currently in use. This makes it challenging to optimize VBTSs for specific tasks, as both the initial choice and subsequent fine-tuning are hindered by the lack of standardized metrics. To address this issue, we present TacEva, a comprehensive evaluation framework for the quantitative analysis of VBTS performance. We define a set of performance metrics that capture and quantify the key characteristics displayed in typical application scenarios. For each metric, we designed an experimental pipeline that provides a structured procedure for performance quantification. We then applied this evaluation approach to multiple VBTSs with distinct sensing mechanisms. The results show that the proposed framework yields a thorough evaluation of each design and provides quantitative indicators for each performance dimension. This enables researchers to pre-select the most appropriate VBTS on a task-by-task basis, and also offers performance-guided insights for the optimization of VBTS design.

Existing VBTS Work and Their Evaluation Methods

For a detailed explanation of the mechanisms, refer to this comprehensive survey. VBTSs are categorized according to their underlying sensing principles: (i) Intensity Mapping Method (IMM), which infers contact geometry and pressure through spatial variations in reflected light intensity; (ii) Marker Displacement Method (MDM), which detects surface deformation by tracking the displacement of embedded markers under force; and (iii) Modality Fusion Method (MFM), which employs transparent skin to enable multimodal perception.

Standard Performance

Calibration Process

Definition. Two sequential steps with the sensor on a robot: (1) Surface geometry via first-contact mapping with a 10 mm spherical indenter; (2) Force/position mapping from synchronized images and 6‑axis F/T labels across randomized normal + shear stimuli.

Protocol. Probe the surface on a grid (≈0.1 mm steps) until contact (threshold ≈0.02 N), then indent to safe depths per device while adding small x–y displacements. Train a common ResNet‑18 baseline (70/20/10 split) to regress $(P_x, P_y, P_z, F_x, F_y, F_z)$. Report MAE, $R^2$, and sMAPE: $$\text{sMAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{\frac{|y_i| + |\hat{y}_i|}{2} + \epsilon} \times 100\%$$
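The three reported scores can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' evaluation code: the function names `mae`, `r2`, and `smape` and the default `eps` value are assumptions, and each function scores one output channel (for example $F_z$) at a time.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error for one output channel."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true: Sequence[float], y_pred: Sequence[float],
          eps: float = 1e-8) -> float:
    """Symmetric MAPE in percent, following the formula above.

    eps (value assumed here) keeps the denominator non-zero when both
    the label and the prediction are zero.
    """
    total = sum(
        abs(t - p) / ((abs(t) + abs(p)) / 2 + eps)
        for t, p in zip(y_true, y_pred)
    )
    return 100.0 * total / len(y_true)
```

Because sMAPE normalizes each error by the mean magnitude of label and prediction, it makes channels with different units and ranges (mm versus N) comparable, which MAE alone does not.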

Analysis. ViTacTip minimizes absolute force errors; GelSight variants excel in P_z; marker‑free GelSightWM is strong in F_z/P_z but weaker in F_xy; MagicTac is competitive in P_xy yet noisier in F_z.

Spatial Resolution

Definition. Ability to distinguish closely spaced features. We report accuracy as a function of tolerance $\epsilon$ using a grating‑classification task:

$$\text{SR}(\epsilon) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}[|\hat{r}_i - r_i| \leq \epsilon]$$

Protocol. 3D‑printed dot/line gratings (≈0.05–2.0 mm). 100 presses per sample with randomized yaw. Train classifier; sweep ε.
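The tolerance sweep can be sketched as follows, assuming the classifier's output has already been mapped back to a grating spacing in mm. The names `sr_accuracy` and `sweep_tolerance` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Sequence


def sr_accuracy(r_pred: Sequence[float], r_true: Sequence[float],
                eps: float) -> float:
    """SR(eps): fraction of presses whose predicted spacing lies within
    eps (mm) of the true spacing."""
    hits = sum(1 for p, t in zip(r_pred, r_true) if abs(p - t) <= eps)
    return hits / len(r_true)


def sweep_tolerance(r_pred: Sequence[float], r_true: Sequence[float],
                    tolerances: Sequence[float]) -> Dict[float, float]:
    """Evaluate SR(eps) for each tolerance in the sweep."""
    return {eps: sr_accuracy(r_pred, r_true, eps) for eps in tolerances}
```

With `eps = 0` the score reduces to plain classification accuracy; larger tolerances forgive confusions between neighboring spacings, so the curve is non-decreasing in `eps`.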

Analysis. Above ≈5 mm, all sensors are near‑perfect. At 0.05 mm, GelSight/GelSightWM reach ≈99%, MagicTac ≈98%, and ViTacTip ≈80%, reflecting differences in gel stiffness/geometry and effective pixel density.

The evaluation uses dot and line grating samples, with spacing from 0 mm (flat) to 2 mm, to determine the minimum resolvable feature size. All four sensors were benchmarked on these samples.

Spatial Resolution Test Samples

Dot and line samples from 0.0625 mm to 2 mm spacing. Example samples: Dot 0.05 mm, Dot 2 mm, Line 0.05 mm, Line 2 mm.

Sensitivity

Definition. Normal compliance: $S = \Delta z / F$ (mm/N). Uniformity (0–1): $U = 1 / (1 + \sigma/|\mu|)$ from binned sensitivity means.

Protocol. Reuse calibration data; bin by (x,y); compute mean S per bin to form maps; aggregate μ, σ for U.
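A minimal sketch of the binning and uniformity computation is given below. It assumes each calibration sample is a tuple `(x, y, dz, f)` with indentation `dz` in mm and normal force `f` in N, uses square bins of side `bin_size`, and takes σ as the population standard deviation; the source specifies none of these details, and the function names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Sequence, Tuple

Sample = Tuple[float, float, float, float]  # (x, y, dz, f), assumed layout


def sensitivity_map(samples: Iterable[Sample],
                    bin_size: float) -> Dict[Tuple[int, int], float]:
    """Mean compliance S = dz / f (mm/N) per square (x, y) bin."""
    bins = defaultdict(list)
    for x, y, dz, f in samples:
        if f == 0:
            continue  # S is undefined without a measurable force
        key = (math.floor(x / bin_size), math.floor(y / bin_size))
        bins[key].append(dz / f)
    return {key: mean(values) for key, values in bins.items()}


def uniformity(bin_means: Sequence[float]) -> float:
    """U = 1 / (1 + sigma / |mu|) over the binned sensitivity means."""
    mu = mean(bin_means)
    sigma = pstdev(bin_means)  # population std is an assumption
    return 1.0 / (1.0 + sigma / abs(mu))
```

For example, `uniformity(list(sensitivity_map(samples, 1.0).values()))` gives a single score in (0, 1], where 1 means every bin has the same mean sensitivity.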

Analysis. ViTacTip is most sensitive but less uniform (edge‑enhanced S); GelSight/MagicTac are stiffer with higher U.

Robustness

Spatial Robustness

Definition. Stability of error across location and depth. Compute MAE per radial bin and per depth bin; robustness (lower is better):

$$R_{\text{spatial},c} = \frac{1}{2} \left[ \text{STD}(\{m^{\text{dist}}_b\}) + \text{STD}(\{m^{\text{depth}}_d\}) \right]$$

Protocol. Collect a held‑out grid (≈1.6k points) with the same probing pattern; evaluate by bins over normalized radius/depth.
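The per-channel score can be sketched as below, assuming signed prediction errors for one channel $c$ together with each test point's radius and depth already normalized to $[0, 1]$. Equal-width bins, the population standard deviation, and the names `binned_mae` and `spatial_robustness` are assumptions made for this illustration.

```python
from statistics import pstdev
from typing import List, Sequence


def binned_mae(errors: Sequence[float], coords: Sequence[float],
               n_bins: int) -> List[float]:
    """MAE per equal-width bin of a coordinate normalized to [0, 1].

    Empty bins are skipped so they do not distort the spread.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for err, coord in zip(errors, coords):
        b = min(int(coord * n_bins), n_bins - 1)  # coord == 1.0 -> last bin
        sums[b] += abs(err)
        counts[b] += 1
    return [s / n for s, n in zip(sums, counts) if n > 0]


def spatial_robustness(errors: Sequence[float], radii: Sequence[float],
                       depths: Sequence[float], n_bins: int = 5) -> float:
    """R_spatial for one channel: mean of the spread of the radial-bin
    MAEs and the spread of the depth-bin MAEs (lower is better)."""
    m_dist = binned_mae(errors, radii, n_bins)
    m_depth = binned_mae(errors, depths, n_bins)
    return 0.5 * (pstdev(m_dist) + pstdev(m_depth))
```

A sensor whose error is the same in every bin scores 0, regardless of how large that error is; the metric measures stability across the surface, not accuracy.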

Analysis. ViTacTip holds force errors flat across the surface; planar gels show edge growth (notably in F_z and P_xy). Depth improves P_xy after shallow contact.

Lighting Robustness

Definition. Sensitivity of prediction error to illumination changes (transparent/semi‑transparent devices). Example metric:

$$R_{\text{light}} = \frac{|\frac{I_c}{I_o} - 1|}{|\frac{I_c}{I_o} - 1| + |\frac{\text{MAE}_c}{\text{MAE}_o} - 1|}$$

Protocol. Test under four scenes (diffuse/point/mixed; varying intensity). Compare to training‑light baseline using mean grayscale intensity.
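The example metric can be sketched as a small function. The reading of the subscripts is an assumption: $o$ is taken to denote the training-light baseline and $c$ the changed lighting scene, with $I$ the mean grayscale intensity. The function name and the handling of the degenerate case (neither intensity nor error changes) are likewise assumptions.

```python
def lighting_robustness(i_changed: float, i_orig: float,
                        mae_changed: float, mae_orig: float) -> float:
    """R_light from the relative changes in intensity and in MAE.

    Values near 1 mean the error barely moves for a given intensity
    shift; values near 0 mean the error change dominates.
    """
    d_intensity = abs(i_changed / i_orig - 1.0)
    d_error = abs(mae_changed / mae_orig - 1.0)
    if d_intensity + d_error == 0.0:
        # The formula is 0/0 here; raising is a choice made for this sketch.
        raise ValueError("R_light is undefined when nothing changes")
    return d_intensity / (d_intensity + d_error)
```

For instance, a 50% rise in mean intensity that doubles the MAE gives `lighting_robustness(150, 100, 0.2, 0.1)`, which is 0.5 / (0.5 + 1.0) = 1/3.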

Analysis. ViTacTip's errors grow under bright point sources; MagicTac's intensities shift less but error variance can rise due to grid interactions with external light.

Repeatability

Definition. Across $N$ repeats at $K$ points and $D$ depths, per‑channel variability (lower is better):

$$\text{Rep}_c = \frac{1}{KD} \sum_{k,d} \text{STD}(\hat{c}_{k,d,1..N})$$

Protocol. K≈100 random points, step 0.1 mm to max depth; N=10 repeats per (point,depth).
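A minimal sketch of the repeatability score for one channel follows. It assumes the predictions are stored as a nested list `preds[k][d]` holding the $N$ repeated predictions at point $k$ and depth $d$, and uses the population standard deviation; the storage layout and the function name are hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def repeatability(preds: Sequence[Sequence[Sequence[float]]]) -> float:
    """Rep_c: standard deviation over the N repeats, averaged over all
    (point, depth) pairs. Lower means more repeatable."""
    stds = [pstdev(repeats) for point in preds for repeats in point]
    return sum(stds) / len(stds)
```

Averaging the per-pair spreads, rather than pooling all predictions, keeps differences between points and depths from being counted as noise.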

Analysis. ViTacTip is most repeatable for forces and competitive for positions; GelSight is strongest in P_z; MagicTac is intermediate for position and higher variance for force.

Additional Analysis

Inter‑sensor Variability

Compare reconstructed surfaces across units of the same type via rigid alignment and nearest‑neighbor distances inside the common hull; the mean absolute surface gap summarizes manufacturing consistency.
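The gap computation can be sketched as below under two simplifying assumptions: the two surface point clouds have already been rigidly aligned, and the common hull is approximated by the overlap of their x-y bounding boxes. Both simplifications, the brute-force nearest-neighbor search, and the function names are choices made for this sketch rather than details taken from the source.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def common_xy_box(a: Sequence[Point],
                  b: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Overlap of the two clouds' x-y bounding boxes (stand-in for the hull)."""
    x_lo = max(min(p[0] for p in a), min(p[0] for p in b))
    x_hi = min(max(p[0] for p in a), max(p[0] for p in b))
    y_lo = max(min(p[1] for p in a), min(p[1] for p in b))
    y_hi = min(max(p[1] for p in a), max(p[1] for p in b))
    return x_lo, x_hi, y_lo, y_hi


def mean_surface_gap(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Mean nearest-neighbor distance from points of A (inside the common
    region) to cloud B. Assumes A and B are already aligned."""
    x_lo, x_hi, y_lo, y_hi = common_xy_box(a, b)
    inside: List[Point] = [
        p for p in a if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi
    ]
    if not inside:
        raise ValueError("the two surfaces do not overlap")
    gaps = [min(math.dist(p, q) for q in b) for p in inside]
    return sum(gaps) / len(gaps)
```

The brute-force search is quadratic in the number of points; for dense reconstructions a spatial index such as a k-d tree would be the usual replacement.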

Hysteresis

Quantify the area between load/unload F–Δz curves (trapezoidal rule) over the overlap range at multiple surface points; ViTacTip shows measurable, spatially varying hysteresis, while GelSight variants/MagicTac show no clear hysteresis under our protocol.
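The area computation at a single surface point can be sketched as follows. It assumes each curve is a list of `(dz, F)` pairs with distinct depths, resamples both curves by linear interpolation onto a shared grid over the overlapping depth range, and integrates the absolute force gap with the trapezoidal rule. The grid size `n` and the function names are assumptions.

```python
from typing import List, Sequence, Tuple

Curve = Sequence[Tuple[float, float]]  # (dz in mm, F in N)


def interp(x: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Linear interpolation; xs strictly increasing, x inside its range."""
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]  # guards against rounding just past the last knot


def hysteresis_area(load: Curve, unload: Curve, n: int = 101) -> float:
    """Area (N*mm) between load and unload curves over their overlap."""
    lz, lf = zip(*sorted(load))      # sort by depth; unload data usually
    uz, uf = zip(*sorted(unload))    # arrives in descending order
    lo, hi = max(lz[0], uz[0]), min(lz[-1], uz[-1])
    if hi <= lo:
        raise ValueError("load and unload curves do not overlap")
    step = (hi - lo) / (n - 1)
    grid: List[float] = [lo + step * i for i in range(n)]
    gap = [abs(interp(z, lz, lf) - interp(z, uz, uf)) for z in grid]
    # Trapezoidal rule on the uniform grid.
    return sum(0.5 * (gap[i] + gap[i + 1]) * step for i in range(n - 1))
```

Repeating the call at several surface points gives the spatial distribution of hysteresis described above; an area near zero at every point corresponds to the "no clear hysteresis" case.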

TacEva: A Performance Evaluation Framework for Vision-Based Tactile Sensors

We propose a comprehensive framework for evaluating vision-based tactile sensors that systematically compares them across design properties and sensing performance. In this paper, we demonstrate the framework on four representative VBTSs (ViTacTip, MagicTac, GelSight, GelSightWM).
