Study shows languages with more speakers tend to be harder for machines to learn


Illustration of measuring learning difficulty in Study 1. Circles represent observed bits-per-symbol that are needed (on average) to encode/predict symbols based on increasing amounts of training data for different (hypothetical) documents in different (hypothetical) languages, each with a source entropy of 5. Credit: Scientific Reports (2023). DOI: 10.1038/s41598-023-45373-z
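To make the quantity in the caption concrete, here is a minimal Python sketch of how bits-per-symbol on held-out text can be tracked as the amount of training data grows. It is not the study's method: it assumes a simple character-level unigram model with add-one smoothing, and the names `bits_per_symbol` and `learning_curve`, along with the toy strings, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bits_per_symbol(train: str, test: str) -> float:
    """Average bits needed to encode `test` with a unigram character model.

    Assumption: the model is fit on `train` with add-one smoothing, and the
    alphabet is taken from both strings so every test symbol has nonzero
    probability. Real language models are far more sophisticated.
    """
    alphabet = set(train) | set(test)
    counts = Counter(train)
    total = len(train) + len(alphabet)  # add-one smoothing denominator
    log_prob = sum(math.log2((counts[ch] + 1) / total) for ch in test)
    return -log_prob / len(test)


def learning_curve(text: str, held_out: str, sizes: list[int]) -> list[tuple[int, float]]:
    """Bits-per-symbol on `held_out` after training on the first n symbols."""
    return [(n, bits_per_symbol(text[:n], held_out)) for n in sizes]


# Toy data (hypothetical): a skewed two-symbol "language".
text = "aaab" * 25
held_out = "aaab" * 2
curve = learning_curve(text, held_out, [0, 4, 100])
for n, bits in curve:
    print(f"{n:>4} training symbols: {bits:.3f} bits per symbol")
```

With no training data the model falls back to a uniform guess over the alphabet. As it sees more text, the bits-per-symbol figure drops toward the entropy of the source, and how quickly it drops is one way to think about learning difficulty.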

Just a few months ago, many people would have found it hard to imagine how well artificial intelligence-based "language models" could imitate human language. What ChatGPT writes is often indistinguishable from human-generated text.

A research team at the Leibniz Institute for the German Language (IDS) in Mannheim, Germany, has now used text material in 1,293 different languages to investigate how quickly different computer language models learn to "write." The surprising result: languages spoken by a large number of people tend to be more difficult for algorithms to learn than languages with a smaller linguistic community. The study is ...


Copyright of this story belongs solely to phys.org.