How to verify C functions with array parameters using Isabelle

I am using Isabelle to verify a C program. During the verification process, I used the c-parser install-C-file to load the code, but the following issue occurred:

“attempt to cause decay to pointer on array without an address”

The c code is as follows

void GetName(char a[32]);
typedef struct Attribute {
    char name[32];
} Attribute;
void PartitionInit(){
    Attribute a;    
    GetName(a.name);
}

The line of code that caused the error is GetName(a.name);

How can I make install-C-file load this file properly?

Thank you very much!

This is hitting one of the restrictions of the StrictC language.

As in normal C, in GetName, the parameter a translates to just a char pointer. The error message is trying to say that the record field that is being provided isn’t compatible with just a pointer.

In “normal” C this would work, because the compiler could use the address of the array field inside the struct, but in StrictC taking the address of a local variable isn’t allowed, and the struct Attribute a is a local variable.

If you make it a global variable instead, you could pass the address of the name field to the function and things should work. Or you could pass the struct, but that would pass it by value, not by reference.

1 Like

Thank you very much, I have understood the reason for the issue. However, I would like to know the reason for such limitations, is there any material I could refer to? What should I pay attention to if I want to add this feature? Although I think it might be very difficult, I still want to give it a try, hoping to understand the principles of c-parser. In addition, I would also like to understand how to get started with the ML language.

There is not much written up about the C subset itself. I’ve recently updated the C parser README to link to the relevant papers and the corresponding PhD thesis, but these mostly talk about the C memory model that the parser uses. Might be worth checking, though, it’s been a while since I read them.

For the local variable restriction, the main reason is that we want to be able to use a simpler state model for local variables, because you have to reason about them a lot. If you could take their address, they would have to live in the heap together with all the rest. This means, any write to the heap could potentially change the value of any local variable of the same type (that’s in fact what you want when you take its address). That means after each heap write, you’d have to re-establish the value of all locals in the proof.

The second problem is that if local variables live in the heap, they have to be allocated and de-allocated each time a function is called and returns. This means pointers can become invalid when you return from a function, and you have to prove that enough memory is available when you call a function. These are in fact real issues in C – your stack can overflow and a function return can produce a dangling pointer if you’ve taken the address of a local variable and stored it somewhere else in the heap. The stack overflow issue is usually one you want to ignore in the verification (that is, do the the verification modulo stack overflow not occurring). The dangling pointer problem only exists if you can take the address of local variables, so the current model avoids it completely. A model that adds the feature would have to reason about it explicitly.

There are ways to mitigate the complexity the feature introduces. For instance, you could make the addresses of local variable a special pointer type to a special area in the heap, so that they are easier to tell apart in the formalisation and so it becomes easier to emit conditions that you don’t store them in the heap for instance. The Simpl language also provides mechanisms to verify a program modulo specific guard failures, so if you encode the out-of-stack problem with guards, you can still verify assuming the stack will not overflow. So the feature could be added, but it’s fairly heavy-weight and we have so far been able to write all necessary code without it. It would likely also break existing proofs that use the old model, and we have a lot of those.