AI Learns Common Sense from Touch, Not Just Vision
Appendix for Octopi: Object Property Reasoning with Large Tactile-Language Models
- APPENDIX A: ANNOTATION DETAILS
- APPENDIX B: OBJECT DETAILS
- APPENDIX C: PROPERTY STATISTICS
- APPENDIX D: SAMPLE VIDEO STATISTICS
- APPENDIX E: ENCODER ANALYSIS
- APPENDIX F: PG-INSTRUCTBLIP AVOCADO PROPERTY PREDICTION
VI. EXPERIMENTAL RESULTS
To address the above questions, we evaluated OCTOPI using (i) accuracy on the physical understanding tasks in PHYSICLEAR’s test set, (ii) accuracy on scenario reasoning tasks, (iii) task success rate on a real robot, and (iv) property prediction accuracy on unseen objects. We tested two versions of OCTOPI, OCTOPI-7b and OCTOPI-13b, which use Vicuna-7b v1.5 and Vicuna-13b v1.5 as their LLMs, respectively.
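For intuition, the accuracy metrics in (i), (ii), and (iv) amount to tallying exact-match correctness per task over the test set. Below is a minimal sketch of that tally; the example data format, the task names, and the `model.predict` interface are assumptions for illustration, not OCTOPI's actual API.

```python
from collections import defaultdict

def evaluate_accuracy(model, test_set):
    """Tally per-task accuracy over (task, prompt, label) examples.

    `model.predict(prompt)` is a hypothetical interface standing in for
    querying the tactile-language model; task names mirror the paper's
    PC / PSS / POM abbreviations.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example in test_set:
        prediction = model.predict(example["prompt"])
        correct[example["task"]] += int(prediction == example["label"])
        total[example["task"]] += 1
    return {task: correct[task] / total[task] for task in total}

# Usage with a stub model and two toy examples:
class StubModel:
    def predict(self, prompt):
        return "object A"

test_set = [
    {"task": "PC", "prompt": "Which object is harder, A or B?", "label": "object A"},
    {"task": "PSS", "prompt": "Which object is the roughest?", "label": "object C"},
]
print(evaluate_accuracy(StubModel(), test_set))  # {'PC': 1.0, 'PSS': 0.0}
```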
A. Tactile-grounded Physical Understanding with Object Property Descriptions
During tactile feature alignment and end-to-end fine-tuning, we trained OCTOPI on comparison tasks (i.e., PC, PSS, and POM) to align its physical understanding of these properties and ...
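These comparison tasks can be pictured as templated question-answer pairs over pairs or sets of tactile observations. The sketch below shows one plausible way to template them, reading PC, PSS, and POM as property comparison, property superlative selection, and property-object matching; the exact wording and task formats are assumptions for illustration, not the paper's published training prompts.

```python
# Plausible prompt templates for the three comparison tasks; the exact
# wording used to train OCTOPI is not shown here, so these are assumptions.
PROMPT_TEMPLATES = {
    # PC (property comparison): compare one property across two objects.
    "PC": "Given tactile videos of object A and object B, which feels {comparative}?",
    # PSS (property superlative selection): pick the extreme among several objects.
    "PSS": "Given tactile videos of objects {objects}, which is the {superlative}?",
    # POM (property-object matching): match a property description to an object.
    "POM": "Which of objects {objects} matches this description: {description}?",
}

def build_prompt(task: str, **fields: str) -> str:
    """Fill a comparison-task template with its fields."""
    return PROMPT_TEMPLATES[task].format(**fields)

print(build_prompt("PC", comparative="harder"))
print(build_prompt("PSS", objects="A, B, C", superlative="roughest"))
print(build_prompt("POM", objects="A, B", description="soft, smooth, and not bumpy"))
```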