Title: Charting the Landscape of Neuro-symbolic Reasoners
Speaker: Xuelong An
In the first half of my presentation, I will share ongoing work on building a comprehensive benchmark to empirically assess the plethora of neuro-symbolic models. Interest in this family of models has grown in recent years, as evidenced by the constant influx of novel methods and of benchmarks that test their robust generalization and reasoning capabilities. However, the successes reported for neuro-symbolic methods on individual datasets are often disparate and hard to compare with one another. There is no unified, comprehensive test for assessing the full panorama of neuro-symbolic models. To design such a benchmark, we survey the current landscape of neuro-symbolic architectures and benchmarks. From this survey, we propose a general taxonomy for classifying current and future neuro-symbolic models and reasoning benchmarks, which helps us understand how they relate to each other. Building on this taxonomy, we propose SaSSY-CLEVR, a heterogeneous benchmark suite that can serve as a common testing ground on which different neuro-symbolic reasoners can compare their strengths and limitations.
If time allows, in the second half of my presentation, I will share a series of experiments assessing neuro-symbolic (NeSy) models on CLEVR-Hans3, which tests the kind of object-centric reasoning adopted in SaSSY-CLEVR. In our study, we expand on the work of Stammer et al. (2021) by testing the robustness of a traditional convolutional neural network (CNN) and of NeSy architectures comprising a Slot Attention and a Set Transformer component. We evaluate different NeSy variants by comparing their classification accuracy after fine-tuning them on a modified version of the CLEVR-Hans3 dataset containing four different kinds of data complications. We find that models using the pretrained Slot Attention maintain good classification performance across data complications, indicating that the object-centric representations built by this perceptual component are crucial for model robustness. We also find that a Slot Attention module paired with fully connected layers, rather than a Set Transformer, achieves the best overall performance, underscoring the importance of controlled comparisons.