Comments on Neil Mitchell's Blog (Haskell etc): Undefined Behaviour in C
Neil Mitchell. 14 comments, newest first. Feed updated 2024-03-23T14:36:09.980+00:00.

Random Rust Fan (2017-08-08T13:50:29.160+01:00):
A fast parser with a C interface for Haskell -- an almost perfect excuse to try out Rust IMO. I would refrain from asking for a rewrite in Rust though, since the project is already published. But honestly, check it out at https://www.rust-lang.org, and maybe try it out on another system-level project.

Ioasaph Vespertine (2017-01-02T15:39:58.876+00:00):
Scan-build was my first thought: http://clang-analyzer.llvm.org/ So, I aimed scan-build from checker-279 at your simplified example, and it didn't catch anything other than the `malloc(0)`. Enabling additional checkers that seemed somewhat related didn't change this.

Cranking up warnings on clang itself griped about missing prototypes and that the global should be either static or extern, but didn't call out what's effectively a dead store (one way or the other).

I hoped that inlining f using a GNU statement expression might make it gripe, but that didn't work either.

Neil Mitchell (2016-12-11T21:36:38.003+00:00):
In fact, I have allocated memory at the point in both cases, so I think both cases are fully defined. I guess I had hoped that UBSan also found unspecified behaviour, but maybe not.

Anonymous (2016-12-10T17:09:13.905+00:00):
Actually, the root cause of your issue is unspecified behavior: the C standard does not specify the order in which the LHS and the RHS of an assignment are evaluated. The undefined behavior then happens in only one of the two cases, which might explain why it is difficult for the static analysis tools to spot it.

George Colpitts (2016-12-07T21:54:32.409+00:00):
FWIW, http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
Three-part blog entry, What Every C Programmer Should Know About Undefined Behavior, by LLVM founder Chris Lattner. C is a very scary language.

enobayram (2016-12-07T17:39:44.619+00:00):
Try running your debug binary under Valgrind, it'll immediately tell you which line is offending. I treat Valgrind like a last-resort antibiotic, as using it too often can develop lazy habits. In the world of C/C++ you have to be on your toes all the time. 60% of your brain should be spared for proving that what you're doing at a given moment is not UB.

greg (2016-12-07T17:37:26.846+00:00):
Hmm, then my only hope is UBSan
http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html

If it doesn't detect the bug, the folks on the team will likely want to hear about it.

Bence Kodaj (2016-12-07T09:18:48.470+00:00):
Did you try FlexeLint? I pasted your simplified example (after adding #include <stdlib.h>) into http://gimpel-online.com//cgi-bin/genPage.py?srcFile=diy.c&cgiScript=analyseCode.py&title=Do-It-Yourself+Example+%28C%29&intro=This+example+allows+you+to+specify+your+own+C+code.&compilerOption=online32.lnt&includeOption={{quotedIncludeOption}} , and it gave me a warning: "Warning 591: Variable 'array' depends on the order of evaluation; it is modified through function 'f(void)' via calls: f()"

Bence Kodaj (2016-12-07T09:14:10.759+00:00):
This comment has been removed by the author.

Neil Mitchell (2016-12-07T06:59:15.565+00:00):
I was also hoping to find the undefined behaviour even if the compiler did pick RHS before LHS, as they are free to do, which would avoid any problems.

Neil Mitchell (2016-12-07T06:54:05.074+00:00):
greg, just tried it with both clang and gcc address sanitisers and no luck. The problem is that in truth I allocate a shared buffer that stores the document structure, a small number of attributes, and a small number of nodes, so that small parse trees require only one allocation. If the nodes grow, then I allocate some space just for some nodes, and if they grow more, I repeat that process. It means the initial small nodes buffer remains valid on reallocation, it just happens to point at somewhere I no longer care about, and the address sanitiser is perfectly happy.

greg (2016-12-07T03:49:56.608+00:00):
I'm pretty sure ASan will find this at runtime.
http://clang.llvm.org/docs/AddressSanitizer.html

Neil Mitchell (2016-12-06T23:37:14.472+00:00):
RV Match essentially involves running kcc, and is based on kcc, so I suspect with RV Match I covered everything KCC does, but I'm not 100% sure.

Anonymous (2016-12-06T22:06:39.117+00:00):
Maybe give KCC a try: https://github.com/kframework/c-semantics
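
Editor's sketch: the evaluation-order issue discussed in this thread can be illustrated with a small, hypothetical C fragment. It is not the code from the post; the names `nodes`, `count`, `capacity`, `add_node` and `link_child` are invented for illustration, and the growth policy (doubling via `realloc`) is an assumption rather than the scheme Neil describes. The point it tries to show is the usual workaround: finish the call that may move the buffer before indexing into the buffer.

```c
#include <stdlib.h>

/* Hypothetical global buffer; all names here are illustrative assumptions. */
static int *nodes = NULL;
static size_t count = 0, capacity = 0;

/* Appends a zeroed slot, growing the buffer when it is full.
   Returns the index of the new slot, or (size_t)-1 if allocation fails.
   realloc may move the block, so a pointer derived from `nodes` before
   this call can be stale afterwards. */
static size_t add_node(void) {
    if (count == capacity) {
        size_t new_cap = capacity ? capacity * 2 : 1;
        int *p = realloc(nodes, new_cap * sizeof *p);
        if (p == NULL) return (size_t)-1;
        nodes = p;
        capacity = new_cap;
    }
    nodes[count] = 0;
    return count++;
}

/* Risky form (deliberately not compiled here):
       nodes[parent] = (int)add_node();
   The order in which the two sides are evaluated is unspecified. If the
   compiler reads `nodes` for the left-hand side first and add_node()
   then moves the buffer, the store goes through the old pointer. */

/* Safer form: complete the call first, then index the current buffer. */
static int link_child(size_t parent) {
    size_t child = add_node();
    if (child == (size_t)-1) return -1;
    nodes[parent] = (int)child;
    return 0;
}
```

With this shape, calling `add_node()` once and then `link_child(0)` stores the new child's index in slot 0 regardless of whether the buffer moved. Whether a given tool flags the risky form depends on which order the compiler happened to pick, which matches the experience reported in the comments above.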