AI Learns Common Sense from Touch, Not Just Vision
Appendix for Octopi: Object Property Reasoning with Large Tactile-Language Models
- APPENDIX A: ANNOTATION DETAILS
- APPENDIX B: OBJECT DETAILS
- APPENDIX C: PROPERTY STATISTICS
- APPENDIX D: SAMPLE VIDEO STATISTICS
- APPENDIX E: ENCODER ANALYSIS
- APPENDIX F: PG-INSTRUCTBLIP AVOCADO PROPERTY PREDICTION
VI. EXPERIMENTAL RESULTS
To address the above questions, we evaluated OCTOPI using (i) accuracy on the physical understanding tasks in PHYSICLEAR’s test set, (ii) accuracy on scenario reasoning tasks, (iii) task success rate on a real robot, and (iv) property prediction accuracy on unseen objects. We tested two versions of OCTOPI, OCTOPI-7b and OCTOPI-13b, which use Vicuna-7b v1.5 and Vicuna-13b v1.5 as their LLMs, respectively.
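For intuition, the accuracy metrics in (i), (ii), and (iv) amount to tallying exact-match correctness per task over the test set. Below is a minimal sketch of that tally; the example data format, the task names, and the `model.predict` interface are assumptions for illustration, not OCTOPI's actual API.

```python
from collections import defaultdict

def evaluate_accuracy(model, test_set):
    """Tally per-task accuracy over (task, prompt, label) examples.

    `model.predict(prompt)` is a hypothetical interface standing in for
    querying the tactile-language model; task names mirror the paper's
    PC / PSS / POM abbreviations.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example in test_set:
        prediction = model.predict(example["prompt"])
        correct[example["task"]] += int(prediction == example["label"])
        total[example["task"]] += 1
    return {task: correct[task] / total[task] for task in total}

# Usage with a stub model and two toy examples:
class StubModel:
    def predict(self, prompt):
        return "object A"

test_set = [
    {"task": "PC", "prompt": "Which object is harder, A or B?", "label": "object A"},
    {"task": "PSS", "prompt": "Which object is the roughest?", "label": "object C"},
]
print(evaluate_accuracy(StubModel(), test_set))  # {'PC': 1.0, 'PSS': 0.0}
```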
A. Tactile-grounded Physical Understanding with Object Property Descriptions
During tactile feature alignment and end-to-end fine-tuning, we trained OCTOPI on comparison tasks (i.e., PC, PSS, and POM) to align its physical understanding of these properties and ...
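These comparison tasks can be pictured as templated question-answer pairs over pairs or sets of tactile observations. The sketch below shows one plausible way to template them, reading PC, PSS, and POM as property comparison, property superlative selection, and property-object matching; the exact wording and task formats are assumptions for illustration, not the paper's published training prompts.

```python
# Plausible prompt templates for the three comparison tasks; the exact
# wording used to train OCTOPI is not shown here, so these are assumptions.
PROMPT_TEMPLATES = {
    # PC (property comparison): compare one property across two objects.
    "PC": "Given tactile videos of object A and object B, which feels {comparative}?",
    # PSS (property superlative selection): pick the extreme among several objects.
    "PSS": "Given tactile videos of objects {objects}, which is the {superlative}?",
    # POM (property-object matching): match a property description to an object.
    "POM": "Which of objects {objects} matches this description: {description}?",
}

def build_prompt(task: str, **fields: str) -> str:
    """Fill a comparison-task template with its fields."""
    return PROMPT_TEMPLATES[task].format(**fields)

print(build_prompt("PC", comparative="harder"))
print(build_prompt("PSS", objects="A, B, C", superlative="roughest"))
print(build_prompt("POM", objects="A, B", description="soft, smooth, and not bumpy"))
```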