How Well Do Machines Perform on IQ tests: a Comparison Study on a Large-Scale Dataset

Yusen Liu, Fangyuan He, Haodi Zhang, Guozheng Rao, Zhiyong Feng, Yi Zhou

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), Special Track on Understanding Intelligence and Human-level AI in the New Machine Learning era, pages 6110-6116. https://doi.org/10.24963/ijcai.2019/846

AI benchmarking has become an increasingly important task. As many researchers have suggested, Intelligence Quotient (IQ) tests, which are widely regarded as one of the predominant benchmarks for measuring human intelligence, raise an interesting challenge for AI systems. To solve IQ tests automatically, machines need to use, combine and advance many areas of AI, including knowledge representation and reasoning, machine learning, natural language processing and image understanding. Moreover, automated IQ tests provide an ideal testbed for integrating symbolic and sub-symbolic approaches, as both have proven useful here. Hence, we argue that IQ tests, although not suitable for testing machine intelligence, provide an excellent benchmark for the current development of AI research. Nevertheless, most existing IQ test datasets are not comprehensive enough for this purpose, so the conclusions drawn from them are not representative. To address this issue, we create IQ10k, a large-scale dataset that contains more than 10,000 IQ test questions. We also conduct a comparison study on IQ10k with a number of state-of-the-art approaches.
Keywords:
Special Track on Understanding Intelligence and Human-level AI in the New Machine Learning era: Learning knowledge representations
Special Track on Understanding Intelligence and Human-level AI in the New Machine Learning era: Integrating Learning and (any form of) Reasoning