TacEva: A Performance Evaluation Framework for Vision-Based Tactile Sensors

1Imperial-X Initiative, Imperial College London, 2Waseda University,
3King's College London, 4Queen Mary University of London

Equal Contribution, *Corresponding author
Research framework diagram

We propose a comprehensive framework for evaluating vision-based tactile sensors, systematically comparing designs across both design properties and sensing performance. In this paper, we showcase our framework on four representative VBTSs (ViTacTip, MagicTac, GelSight, GelSightWM).

Abstract

Vision-based tactile sensors (VBTSs) are widely used in robotic tasks because of the high spatial resolution they offer and their relatively low manufacturing costs. However, variations in their sensing mechanisms, structural dimensions, and other parameters lead to significant performance disparities between VBTSs currently in use. This makes it challenging to optimize VBTSs for specific tasks, as both the initial choice and subsequent fine-tuning are hindered by the lack of standardized metrics. To address this issue, we present TacEva, a comprehensive evaluation framework for the quantitative analysis of VBTS performance. We define a set of performance metrics that capture and quantify the key characteristics displayed in typical application scenarios. For each metric, we design an experimental pipeline that provides a structured procedure for performance quantification. We then apply this evaluation approach to multiple VBTSs with distinct sensing mechanisms. The results show that the proposed framework yields a thorough evaluation of each design and provides quantitative indicators for each performance dimension. This enables researchers to pre-select the most appropriate VBTS on a task-by-task basis, and it offers performance-guided insights for the optimization of VBTS design.

Interactive Sensing Mechanism Simulator

This interactive demo illustrates how different VBTS sensing mechanisms respond to a ball pressing into a soft surface: IMM (intensity + lighting changes), MDM (marker displacement), IMM+MDM (combined), and MDM+MFM (marker displacement through a transparent surface).

Drag the ball in any side-view panel.

IMM

Side View

Camera View

Side view: ball presses gel, producing contact shadow + photometric intensity changes.

MDM

Side View

Camera View

Side view: markers spread laterally and compress vertically as deformation grows.

IMM + MDM

Side View

Camera View

Hybrid side view overlays IMM lighting behavior with MDM marker displacement.

MDM + MFM

Side View

Camera View

Transparent-surface effect: background remains visible while marker displacement is tracked.

VBTS Evaluation Landscape (Appendix A1/A2)

This table summarizes reported metric coverage across VBTS works from the appendix (Table A1), grouped by the sensing-mechanism taxonomy in ref. [8]: IMM, MDM, and MFM (including hybrids). The application-level interpretation (Table A2) appears below the main evaluation sections.

Table A1: Reported Evaluation Metrics Across VBTS Works

Y = reported, N = not reported, P = partial/indirect.

| Mechanism | Work / Sensor | Res | FOV | Gel | FPS | Calib | SR | S | Rep | R_spatial | R_light |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IMM | DIGIT [29] | Y | Y | N | Y | N | Y | N | N | N | N |
| IMM | DTact [49] | Y | Y | Y | Y | Y | Y | Y | N | N | Y |
| IMM | OmniTact [30] | Y | Y | N | Y | N | N | N | N | N | N |
| IMM | Minsight [50] | Y | Y | Y | Y | Y | Y | Y | N | Y | N |
| IMM | ThinTact [32] | Y | Y | Y | Y | N | Y | P | N | P | N |
| IMM | InSight [51] | Y | Y | Y | Y | Y | P | Y | N | Y | N |
| IMM | GelTip [34] | N | N | Y | N | P | Y | N | N | N | N |
| MDM | GelForce [28] | Y | Y | Y | Y | Y | Y | Y | N | N | N |
| MDM | ChromaTouch [52] | Y | Y | Y | Y | Y | N | Y | N | N | N |
| MDM | TacTip [10] | Y | Y | Y | Y | Y | P | Y | N | Y | N |
| MDM | DelTact [53] | Y | Y | Y | N | Y | N | Y | N | P | N |
| MDM | Tac3D [54] | Y | Y | Y | Y | Y | N | Y | N | N | N |
| IMM+MDM | GelSight [3] | Y | Y | Y | Y | Y | Y | N | N | N | N |
| IMM+MDM | GelSlim [39] | Y | Y | N | Y | Y | N | N | Y | Y | Y |
| IMM+MDM | UV-Tac [55] | Y | Y | Y | Y | Y | P | Y | N | P | Y |
| IMM+MDM | DenseTac 2.0 [31] | Y | Y | Y | Y | Y | N | N | N | N | N |
| IMM+MFM | VisTac [56] | Y | Y | Y | Y | P | P | N | N | N | N |
| IMM+MFM | Finger-STS [57] | Y | Y | Y | N | N | Y | N | N | N | N |
| IMM+MFM | TIRgel [58] | Y | Y | N | Y | N | Y | N | N | N | Y |
| IMM+MFM | HiVTac [59] | Y | Y | Y | Y | Y | N | Y | N | Y | N |
| MDM+MFM | ViTacTip [11] | Y | Y | N | N | Y | Y | Y | N | N | Y |
| MDM+MFM | FingerVision [12] | Y | Y | Y | Y | N | N | P | N | P | P |
| MDM+MFM | SpecTac [60] | N | Y | Y | Y | Y | N | N | N | N | N |
| MDM+MFM | VPTS [61] | Y | Y | Y | Y | Y | N | N | N | N | N |
| IMM+MDM+MFM | MagicTac [41, 62] | Y | Y | Y | Y | Y | Y | N | Y | N | Y |
| IMM+MDM+MFM | F-Touch [9] | Y | Y | Y | Y | Y | N | N | Y | N | Y |
| Evaluation / Review | PBR-Design [36] | Y | Y | Y | Y | Y | Y | N | N | Y | Y |
| Evaluation / Review | VT-Review [63] | N | N | N | N | N | N | N | N | N | N |
| Evaluation / Review | Eval [37] | N | N | N | Y | N | Y | Y | N | N | N |
| Evaluation / Review | Sparsh [38] | N | N | N | Y | N | Y | N | N | P | P |
| Evaluation / Review | TacEva (Ours) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |

Standard Performance

Calibration Process

VBTS calibration setup diagram

Definition. Two sequential steps with the sensor on a robot: (1) Surface geometry via first-contact mapping with a 10 mm spherical indenter; (2) Force/position mapping from synchronized images and 6‑axis F/T labels across randomized normal + shear stimuli.

Protocol. Probe the surface on a grid (≈0.1 mm steps) until contact (threshold ≈0.02 N), then indent to safe depths per device while adding small x–y displacements. Train a common ResNet‑18 baseline (70/20/10 split) to regress $(P_x, P_y, P_z, F_x, F_y, F_z)$. Report MAE, $R^2$, and sMAPE: $$\text{sMAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{\frac{|y_i| + |\hat{y}_i|}{2} + \epsilon} \times 100\%$$
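As a minimal sketch, the sMAPE metric above can be computed as follows (NumPy assumed; the function name and the ε = 1e-8 default are illustrative, not from the paper):

```python
import numpy as np

def smape(y_true, y_pred, eps=1e-8):
    """Symmetric MAPE: mean of |y - y_hat| over the average magnitude
    (|y| + |y_hat|) / 2, with a small eps to keep the denominator nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0 + eps
    return float(np.mean(np.abs(y_true - y_pred) / denom) * 100.0)
```

Applied per channel, this would yield one sMAPE value for each of the six regressed quantities $(P_x, P_y, P_z, F_x, F_y, F_z)$.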

Calibration error results

Analysis. ViTacTip minimizes absolute force errors; GelSight variants excel in Pz; marker‑free GelSightWM is strong in Fz/Pz but weaker in Fxy; MagicTac is competitive in Pxy yet noisier in Fz.

Spatial Resolution

Spatial resolution evaluation results showing accuracy vs tolerance for different VBTS

Definition. Ability to distinguish closely spaced features. We report accuracy as a function of tolerance $\epsilon$ using a grating‑classification task:

$$\text{SR}(\epsilon) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}[|\hat{r}_i - r_i| \leq \epsilon]$$

Protocol. 3D‑printed dot/line gratings (≈0.05–2.0 mm). 100 presses per sample with randomized yaw. Train classifier; sweep ε.
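The tolerance sweep can be sketched as follows (NumPy assumed; function and argument names are illustrative):

```python
import numpy as np

def spatial_resolution_accuracy(r_pred, r_true, tolerances):
    """SR(eps): fraction of predicted grating spacings r_hat whose absolute
    error from the true spacing r is at most eps, for each tolerance eps."""
    err = np.abs(np.asarray(r_pred, float) - np.asarray(r_true, float))
    return {eps: float(np.mean(err <= eps)) for eps in tolerances}
```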

Analysis. Above ≈0.5 mm spacing, all sensors are near‑perfect. At 0.05 mm, GelSight/GelSightWM reach ≈99%, MagicTac ≈98%, and ViTacTip ≈80%, reflecting gel stiffness/geometry and effective pixel density.

All four sensors were benchmarked on dot and line grating samples, with spacing from 0 mm (flat) to 2 mm, to determine the minimum resolvable feature size.

Spatial Resolution Test Samples

Dot and line samples from 0.0625 mm to 2 mm spacing; examples shown below.

Dot 0.05mm
Dot 2mm
Line 0.05mm
Line 2mm

Sensitivity

Sensitivity analysis showing normal compliance and uniformity maps for different VBTS

Definition. Normal compliance: $S = \Delta z / F$ (mm/N). Uniformity (0–1): $U = 1 / (1 + \sigma/|\mu|)$ from binned sensitivity means.

Protocol. Reuse calibration data; bin by (x,y); compute mean S per bin to form maps; aggregate μ, σ for U.
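Both quantities above reduce to a few lines (NumPy assumed; names are illustrative):

```python
import numpy as np

def compliance(delta_z, force):
    """Per-sample normal compliance S = dz / F (mm/N)."""
    return np.asarray(delta_z, float) / np.asarray(force, float)

def uniformity(bin_means):
    """Uniformity U = 1 / (1 + sigma / |mu|) over the per-bin mean
    sensitivities; U approaches 1 as the sensitivity map becomes uniform."""
    m = np.asarray(bin_means, float)
    return float(1.0 / (1.0 + m.std() / abs(m.mean())))
```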

Analysis. ViTacTip is most sensitive but less uniform (edge‑enhanced S); GelSight/MagicTac are stiffer with higher U.

Robustness

Spatial Robustness

Spatial robustness evaluation showing error stability across location and depth

Definition. Stability of error across location and depth. Compute MAE per radial bin and per depth bin; robustness (lower is better):

$$R_{\text{spatial},c} = \frac{1}{2} \left[ \text{STD}(\{m^{\text{dist}}_b\}) + \text{STD}(\{m^{\text{depth}}_d\}) \right]$$

Protocol. Collect a held‑out grid (≈1.6k points) with the same probing pattern; evaluate by bins over normalized radius/depth.
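Given per-bin MAE values for channel c, the robustness score above is a one-liner (NumPy assumed; names are illustrative):

```python
import numpy as np

def spatial_robustness(mae_dist_bins, mae_depth_bins):
    """R_spatial,c: mean of the standard deviations of the per-radius and
    per-depth MAE bins for one channel (lower = more spatially stable error)."""
    return float(0.5 * (np.std(mae_dist_bins) + np.std(mae_depth_bins)))
```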

Analysis. ViTacTip holds force errors flat across the surface; planar gels show edge growth (notably in Fz and Pxy). Depth improves Pxy after shallow contact.

Lighting Robustness

Lighting robustness experimental setup and results
Lighting robustness comparison table

Definition. Sensitivity of prediction error to illumination changes (transparent/semi‑transparent devices). Example metric:

$$R_{\text{light}} = \frac{|\frac{I_c}{I_o} - 1|}{|\frac{I_c}{I_o} - 1| + |\frac{\text{MAE}_c}{\text{MAE}_o} - 1|}$$

Protocol. Test under four scenes (diffuse/point/mixed; varying intensity). Compare to training‑light baseline using mean grayscale intensity.
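A sketch of the example metric, with subscript c the changed lighting condition and o the training-light baseline (the eps guard against a zero denominator is our addition, not part of the paper's formula):

```python
def lighting_robustness(I_c, I_o, mae_c, mae_o, eps=1e-12):
    """R_light: relative shift in mean grayscale intensity, normalized by the
    sum of the intensity shift and the relative shift in prediction MAE."""
    d_int = abs(I_c / I_o - 1.0)      # relative change in mean image intensity
    d_err = abs(mae_c / mae_o - 1.0)  # relative change in prediction error
    return d_int / (d_int + d_err + eps)
```

Values near 1 indicate that the image changed under the new lighting but the prediction error did not; values near 0 indicate errors growing faster than the image changed.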

Analysis. ViTacTip's errors grow under bright point sources; MagicTac's intensities shift less but error variance can rise due to grid interactions with external light.

Repeatability

Repeatability analysis showing per-channel variability across multiple sensor measurements

Definition. Across $N$ repeats at $K$ points and $D$ depths, per‑channel variability (lower is better):

$$\text{Rep}_c = \frac{1}{KD} \sum_{k,d} \text{STD}(\hat{c}_{k,d,1..N})$$

Protocol. K≈100 random points, step 0.1 mm to max depth; N=10 repeats per (point,depth).
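With the repeats stacked into a (K, D, N) array, the per-channel score above can be sketched as (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def repeatability(preds):
    """Rep_c: mean over all (point k, depth d) cells of the standard deviation
    across the N repeated measurements; preds has shape (K, D, N)."""
    return float(np.mean(np.std(np.asarray(preds, float), axis=-1)))
```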

Analysis. ViTacTip is most repeatable for forces and competitive for positions; GelSight is strongest in Pz; MagicTac is intermediate for position and higher variance for force.

Additional Analysis

Inter‑sensor Variability

Inter-sensor variability analysis

Compare reconstructed surfaces across units of the same type via rigid alignment and nearest‑neighbor distances inside the common hull; the mean absolute surface gap summarizes manufacturing consistency.
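The nearest-neighbour comparison can be sketched as a brute-force pairwise distance (NumPy assumed; names are illustrative, and the rigid alignment and hull cropping are presumed done beforehand):

```python
import numpy as np

def mean_surface_gap(points_a, points_b):
    """Mean nearest-neighbour distance from reconstructed surface A to surface B,
    assuming both point clouds are already rigidly aligned and cropped to their
    common hull (brute force; adequate for a few thousand points per surface)."""
    a = np.asarray(points_a, float)[:, None, :]   # (Na, 1, 3)
    b = np.asarray(points_b, float)[None, :, :]   # (1, Nb, 3)
    d = np.linalg.norm(a - b, axis=-1)            # (Na, Nb) pairwise distances
    return float(d.min(axis=1).mean())
```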

Hysteresis

Hysteresis analysis

Quantify the area between load/unload F–Δz curves (trapezoidal rule) over the overlap range at multiple surface points; ViTacTip shows measurable, spatially varying hysteresis, while GelSight variants/MagicTac show no clear hysteresis under our protocol.
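The loop-area computation can be sketched as follows (NumPy assumed; names are illustrative, and each depth sweep is presumed sorted in increasing Δz before calling):

```python
import numpy as np

def hysteresis_area(dz_load, f_load, dz_unload, f_unload, n=200):
    """Area between the loading and unloading F-dz curves over their overlapping
    indentation range, evaluated on a common depth grid with the trapezoidal rule."""
    dz_l, f_l = np.asarray(dz_load, float), np.asarray(f_load, float)
    dz_u, f_u = np.asarray(dz_unload, float), np.asarray(f_unload, float)
    lo, hi = max(dz_l.min(), dz_u.min()), min(dz_l.max(), dz_u.max())
    z = np.linspace(lo, hi, n)                       # common depth grid
    gap = np.abs(np.interp(z, dz_l, f_l) - np.interp(z, dz_u, f_u))
    return float(np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(z)))
```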

Table A2: VBTS Sensors and Evaluation by Application-Level Interpretation

| Mechanism | Sensor / Paper | Geometry-Focused Evaluations | Force-Focused Evaluations | Robustness-Critical Evaluations |
|---|---|---|---|---|
| IMM | DIGIT [29] | Pose control; contact and pose tracking | - | - |
| IMM | DTact [49] | Contact reconstruction; pose estimation; object recognition | - | Illumination robustness |
| IMM | OmniTact [30] | Connector insertion; contact-angle estimation | - | - |
| IMM | Minsight [50] | Tactile servoing; lump detection | Force estimation | - |
| IMM | ThinTact [32] | Delicate grasping; insertion; sliding-pose manipulation | Gentle force regulation | - |
| IMM | InSight [51] | Shape; orientation; posture sensing | Multi-contact force mapping | - |
| IMM | GelTip [34] | Contact localization | - | - |
| MDM | GelForce [64] | - | Traction-field estimation | - |
| MDM | ChromaTouch [52] | Curvature estimation (3D displacement field) | - | - |
| MDM | TacTip [10] | In-hand rolling/reorientation; edge following; localization; JND discrimination | - | - |
| MDM | DelTact [54] | Contact reconstruction | Force mapping | - |
| MDM | Tac3D [53] | Spatial resolution; contact reconstruction | Force estimation; friction-coefficient estimation; real-time mapping | - |
| IMM+MDM | GelSight [3] | Contact reconstruction; texture recognition; USB insertion | Force/stiffness estimation; slip detection | - |
| IMM+MDM | GelSlim [33,39,65] | Pose and shape reconstruction; insertion | Force/torque estimation; force-controlled manipulation; slip detection | Durability (>3000 grasps) |
| IMM+MDM | DenseTac 2.0 [31,44] | Dense contact reconstruction | Force/torque estimation | - |
| IMM+MDM | UV-Tac [55] | Contact localization; alignment | Normal and shear force mapping | UV/white-light switching |
| IMM+MFM | VisTac [56] | Contact reconstruction/localization; pose estimation; insertion | - | - |
| IMM+MFM | Finger-STS [57] | Object/texture recognition | Dense slip detection; bead-maze tasks | - |
| IMM+MFM | TIRgel [58] | Object classification | - | Ambient-brightness resistance |
| IMM+MFM | HiVTac [59] | Pose estimation; deformation analysis | Force estimation; grasping | - |
| MDM+MFM | ViTacTip [11] | Grating classification; pose estimation; localization | Force estimation | Lighting robustness (GAN-based) |
| MDM+MFM | FingerVision [12,66-68] | Surface reconstruction; deformation field | Force estimation; slip/vibration detection | Grasp-stability tests (shaking) |
| MDM+MFM | SpecTac [60] | 3D triangulation; feature matching | Force estimation | SIFT feature detection |
| MDM+MFM | VPTS [61] | Contact reconstruction; proximity exploration | - | Vision-proximity-tactile fusion |
| IMM+MDM+MFM | MagicTac [41,62] | Grating classification; pose estimation; contact localization | Force estimation | Lighting robustness; manufacture error; wear and tear |
| IMM+MDM+MFM | F-Touch [9] | Object/texture recognition | Force/torque estimation | - |
| Evaluation / Review | PBR-Design [36] | Contact reconstruction (embossed text, grasping); VBTS design | - | Design robustness (optical stability) |
| Evaluation / Review | VT-Review [63] | 3D visuo-tactile contact reconstruction | Force estimation | Robustness across textures and marker layouts |
| Evaluation / Review | Eval [37] | Edge detection; contact reconstruction | Minimum detectable force; sensitivity mapping; slip and frequency tests | Temperature/material dependency; real-world grasping |
| Evaluation / Review | Sparsh [38] | Pose tracking; texture recognition | Force estimation/mapping; slip-accumulation; bead-maze tasks; grasp stability | - |
| Evaluation / Review | TacEva (Ours) | Spatial resolution; contact localization | Force estimation; force sensitivity mapping | Repeatability; spatial and lighting robustness |

Summary of VBTS Evaluation

Summary comparison of VBTS performance across all evaluation metrics

Selection guide:
- ViTacTip: best for low‑force, deep/soft contacts and force repeatability; sensitive to lighting and weaker in ultra‑fine resolution.
- MagicTac: fast, with strong planar localization; force estimates are noisier; control lighting when possible.
- GelSight: highest camera resolution and stable depth (Pz); modest frame rate and edge effects.
- GelSightWM: practical choice when shear is secondary; robust Pz/Fz without markers.

BibTeX

@article{taceva,
  title   = {TacEva: A Performance Evaluation Framework for Vision-Based Tactile Sensors},
  author  = {Cong, Qingzheng and Oh, Steven and Fan, Wen and Luo, Shan and Althoefer, Kaspar and Zhang, Dandan},
  journal = {Advanced Intelligent Systems},
  year    = {2026},
  pages   = {e202501179},
  doi     = {10.1002/aisy.202501179},
  url     = {https://doi.org/10.1002/aisy.202501179}
}