Prevalence of Confusing Code in Software Projects

Dan Gopstein, Hongwei Zhou, Phyllis Frankl and Justin Cappos

Prior work has shown that extremely small code patterns, such as the conditional operator and implicit type conversion, can cause considerable misunderstanding in programmers. Until now, the real-world impact of these patterns, known as ‘atoms of confusion’, was only speculative. This work uses a corpus of 14 of the most popular and influential open source C and C++ projects to measure the prevalence and significance of these small confusing patterns. Our results show that the 15 known types of confusing micro patterns occur millions of times in programs like the Linux kernel and GCC, appearing on average once every 23 lines.
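To make the two patterns named above concrete, the following minimal C sketch pairs each one with a more explicit equivalent. It is an illustration only: the function names are hypothetical and the snippets are not drawn from the studied corpus or from the authors' materials.

```c
/* Conditional operator: compact, but the branch is folded into one expression. */
int pick_with_conditional(int flag) {
    return flag ? 4 : 2;
}

/* The same logic written out with an explicit if statement. */
int pick_with_if(int flag) {
    if (flag) {
        return 4;
    }
    return 2;
}

/* Implicit type conversion: the double is silently truncated toward zero
 * when it is assigned to an int, with no cast to signal the change. */
int truncate_implicitly(double value) {
    int result = value;
    return result;
}

/* The same conversion with the truncation made visible by a cast. */
int truncate_explicitly(double value) {
    return (int)value;
}
```

Each pair behaves identically; the difference is only in how much the reader has to infer, which is the property the abstract's measurements are concerned with.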