I have studied mcs code in seL4 kernel for last three weeks, And I can’t understand the senario as mentiond in seL4-manual-latest.pdf as flollow:
Could you please give some hint to me, to help me understand the schedule context donation?
In intial state: A have a SC which is a postive thread;B and C are passive threads without SC.
Then A execute syscall seL4_call->enter seL4 kernel->sendIPC Question: Current endpoint state is idle, we can’t donate A’s SC to B with the code show as below, and B can’t execute any user code, such as seL4_Recv for don’t have a SC. Which means, B have no chance to get A’s SC! **
** Could you please give some exapmle user code to show me how to make SC donate from A to B to C to be come true?
To leave an explanation here in text form: for scheduling context donation to happen, the server must be waiting on an endpoint (i.e. the endpoint state must be waiting to receive, not idle).
To achieve that, you would usually make the server first active for initialisation, then let the server send a message that it is finished initialising, using SendRecv so that the server is now waiting for its first message to handle in “normal” passive operation. To make the server passive, you unbind the scheduling context it had during initialisation. It is now a passive server that with an endpoint in waiting-to-receive state and scheduling context donation can happen.
but I still have doubt at call chain A->B->C when B is revoked,what will happen?
As far as I can concer,in hte last seL4_ReplyRecv(B_endpoint, , , B_reply) in task B, it will directly retune in function doReplyTransfer.
And I can not find the call chain bettwen A and C, It seem like SC in C can’t return back to A after B’s reply cap is revoked?
To answer the easier one first: revoking a cap means deleting all descendants of a cap, and deleting a cap deletes the cap itself (children of the cap may continue existing).
Revoke is used to re-use authority to a resource you have delegated, but still retain control over. Delete is usually only useful for managing your own cspace.
To really fully re-use a resource (not just remove authority you have delegated), you would usually revoke the untyped cap that the resource was retyped from. That will clean up all references to this resource any other component in the system might have and reset its memory.
I think you are correct, if an object in the call chain is removed (which can happen when the last cap to that object is deleted, e.g. when the untyped cap it was created from is revoked), the reply chain is broken and the SC cannot travel back. You have to have sufficient trust in the system to know the communication structure will not be disturbed for A to reclaim its scheduling context in this scenario.
@mbrcknl@corey.lewis I remember discussions about this during the invariant proofs for MCS. Can you confirm that what I describe above is true?
If you have a dynamic system like that, you’d have to make sure that the components that have sufficient authority to delete things either have the ability to reset A or otherwise restore its operation. To make sense of this behaviour: if you think of IPC with scheduling context donation as a procedure call, this behaviour is similar to the client A calling a procedure B which calls a procedure C. If the memory for B disappears while C is running, you wouldn’t expect execution to return to A – the reply from C wouldn’t make sense to A. You’d instead have some kind of exception and/or recovery mechanism.
Yes, if a reply object in the middle of the call chain is revoked, the chain is broken and the scheduling context can never go back to the originator.
I originally toyed with designs where we would preserve the chain, but this added a lot of complexity to the kernel for gains we couldn’t find a decent use case for - tearing down a server in the middle of a nested RPC doesn’t seem like something you would want to work at all.