Developers' Code Reuse Security Conundrum: Cut, Paste, Fail
GitHub Projects Riddled With Flawed Stack Overflow Code, Researchers Find
Code reuse kills - software quality, that is.
A new study of C++ code snippets shared on code-sharing site Stack Overflow traced how they ended up in software projects hosted on GitHub.
Researchers reviewed more than 72,000 code snippets drawn from 1,325 Stack Overflow posts that were reused in at least one GitHub project over a 10-year period. All told, they identified 69 vulnerable code snippets, which they categorized as belonging to one of 29 types of vulnerabilities.
"Bad coding practices, improper checks for unusual or exceptional conditions, and improper input validation were most prevalent types of vulnerabilities," write report authors Morteza Verdi, Ashkan Sami and Jafar Akhondali of Shiraz University in Iran; Foutse Khomh and Gias Uddin at Polytechnique Montreal University in Quebec; and Alireza Karami Motlagh at Iran's Chamran University.
Their study - "An Empirical Study of C++ Vulnerabilities in Crowd-Sourced Code Examples" - is being reviewed for possible publication in the journal IEEE Transactions on Software Engineering.
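To make two of the categories the authors name more concrete, improper input validation and missing checks for exceptional conditions, here is a minimal C++17 sketch. It is not taken from the study or from any Stack Overflow post; the table, the function name lookup_checked and the scenario are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical lookup table used only for this illustration.
constexpr std::array<int, 4> kTable{10, 20, 30, 40};

// The pattern to avoid, shown as a comment so nobody pastes it:
//   int lookup_unchecked(const char* text) { return kTable[std::atoi(text)]; }
// It trusts the caller's text, so "-1", "99" or "abc" can index out of bounds
// or silently read element 0.

// Validated lookup: rejects empty, non-numeric, negative, trailing-garbage
// and out-of-range input instead of indexing blindly.
std::optional<int> lookup_checked(std::string_view text) {
    if (text.empty()) return std::nullopt;

    std::size_t index = 0;
    const char* first = text.data();
    const char* last = first + text.size();

    // from_chars reports parse errors and overflow through the error code,
    // and does not accept a sign for unsigned types.
    auto [ptr, ec] = std::from_chars(first, last, index);
    if (ec != std::errc{} || ptr != last) return std::nullopt;

    // Bounds check before touching the table.
    if (index >= kTable.size()) return std::nullopt;
    return kTable[index];
}
```

A caller then has to handle the empty result explicitly, which is the point: the unusual case becomes visible in the type rather than being left to chance.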
Several dozen vulnerabilities might not seem like much. But code repositories have a one-to-many use model, meaning the impact of code-snippet errors can quickly magnify.
"The 69 vulnerable code snippets found in Stack Overflow were reused in a total of 2,859 GitHub projects," the researchers say.
Their study, apparently the first to look at how C++ errors flow from Stack Overflow to GitHub, is meant to highlight the risks that using unverified code can pose. "The people who are using Stack Overflow, they shouldn't trust it fully," Ashkan Sami, associate professor of computer science, engineering and information technology at Shiraz University in Iran, tells The Register. "It's better for programmers to do it the hard way and learn secure coding."
Can't Stop the Cut and Paste
To help, the researchers have developed a browser extension that Stack Overflow contributors can use to check for security errors in code snippets before they upload them to the code-sharing site. But will developers buy in?
The researchers say they have already notified developers of the projects they studied about all errors they found, but often didn't see a swift reaction.
"Although they acknowledged the vulnerabilities, many of them are still not corrected today," the researchers say.
That's a concern, because numerous enterprise applications and mobile apps today get constructed using other people's code. Many developers, for example, regularly point to Uber as an organization whose apps might be built with only 10 percent in-house code, with the rest coming from open source projects.
In another study on the risks posed by code reuse back in May 2017, researcher Felix Fischer delivered a paper that he'd co-authored on the topic - "Stack Overflow Considered Harmful? The Impact of Copy & Paste on Android Application Security" - at the IEEE Symposium on Security and Privacy.
Fischer said that he and a team of researchers extracted 4,000 code snippets posted to Stack Overflow that were copied and pasted into 1.3 million Android applications available on Google Play. Of those snippets, 1,161 had errors or security flaws.
The researchers found rampant code reuse. "We found that 15.4 percent of all 1.3 million Android applications contained security-related code snippets from Stack Overflow," the researchers said. "Out of these, 97.9 percent contain at least one insecure code snippet."
The researchers added: "The proliferation of insecure code snippets within the Android ecosystem, and thus the impact of insecure code snippets posted on Stack Overflow, poses a major and dangerous problem for Android application security."
Textbooks Often Have Errors Too
Code quality shortfalls are a well-known phenomenon. The question is: What should be done to fix them?
That question was posed at the December 2018 Black Hat Europe in London. At the ending "locknote" panel discussion, an audience member asked Black Hat founder Jeff Moss if it was time to get tough on vendors that produce poor software, because the basics - including the Open Web Application Security Project's top 10 most critical application security risks - haven't changed fundamentally in years. Developers don't seem to be learning; are new regulations required?
Moss responded: "If you search for source code samples, they're almost all insecure." He said he's discussed with "a major search engine" what it would take to prioritize well-written code snippets in code results, to help solve this problem. Its answer: While the search engine staff didn't have time to figure out what was best, if Moss wanted to send them his best code, the search engine would be happy to feature it at the top of results.
Moss said the problem doesn't just exist online, but is widespread in computer science and code-development textbooks too. He estimated that for perhaps $200,000 or $300,000, major textbooks and websites could be rewritten to include only error-free code.
Secure Code Repositories Start at Home
For writing more secure code, culture remains another challenge. Stu Hirst, principal cloud security engineer at British online food order and delivery service Just Eat, speaking at last week's ScotSoft conference in Edinburgh, Scotland, advocated literally showing developers the risks that poorly written or poor-quality reused code can create, for example by demonstrating how it can be hacked. He says such discussions are essential for fostering a culture in which coders write secure code, without trying to impose punitive measures.
Some organizations are also actively supporting their developers' urge to cut and paste code snippets.
Earlier this year, the CISO of a European financial services firm told me that his organization's approach has been to maintain its own repository of code snippets that have been vetted and trusted, from which in-house developers can draw, thus saving time and contributing to more secure and stable software builds.
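The CISO did not describe how that repository is implemented. As one hedged illustration of the idea, the C++17 sketch below keeps a set of approved snippets and lets a build or review step ask whether a pasted snippet matches one of them. The names normalize and VettedSnippets are hypothetical, and the whitespace normalization is a deliberate simplification: it would also collapse whitespace inside string literals, and a real system would more likely store cryptographic hashes alongside review metadata.

```cpp
#include <cctype>
#include <string>
#include <string_view>
#include <unordered_set>

// Collapse every run of whitespace to a single space and trim both ends,
// so that merely reformatting a snippet does not change its identity.
std::string normalize(std::string_view code) {
    std::string out;
    bool pending_space = false;
    for (unsigned char ch : code) {
        if (std::isspace(ch)) {
            // Remember the gap, but never emit leading whitespace.
            pending_space = !out.empty();
        } else {
            if (pending_space) out.push_back(' ');
            pending_space = false;
            out.push_back(static_cast<char>(ch));
        }
    }
    return out;  // a trailing gap is dropped because it is never emitted
}

// Hypothetical in-memory stand-in for an organization's vetted repository.
class VettedSnippets {
public:
    // Record a snippet that has passed security review.
    void approve(std::string_view code) { approved_.insert(normalize(code)); }

    // True only if the snippet matches an approved one after normalization.
    bool is_vetted(std::string_view code) const {
        return approved_.count(normalize(code)) != 0;
    }

private:
    std::unordered_set<std::string> approved_;
};
```

Under these assumptions, a snippet that differs from an approved one only in layout still counts as vetted, while any change to the tokens themselves, such as swapping an operator, sends it back for review.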
The organization also regularly evaluates open source offerings, and it isn't afraid to tear up code built in-house when a better open source alternative becomes available. Finally, the organization encourages developers to contribute to open source libraries. Beyond helping to maintain quality code for all, this also boosts developer morale through the kudos that can accompany working on open source libraries.
Perhaps just as important, developers still get to cut and paste many of the code snippets they require to get their job done - albeit only so long as they come from a trusted source. Because like many things in life, building a quality final product requires using high-quality components.