OpenStack allows volumes to be attached to and detached from running instances. When a volume is detached from a VM (using the appropriate API call to the compute service), the compute service (Nova) removes the VM’s access to the volume and then calls the storage service (Cinder) on the user’s behalf to revoke access on the storage side. However, the storage service API can also be invoked directly by the user. A detachment can be forced this way even if the volume is still attached; Cinder does not notify Nova in that case, which results in a surprise removal on the compute side. The compute node thus still considers the VM entitled to access the storage volume, although any access results in I/O errors because the storage service rejects it.
Due to the way the SCSI and multipath stacks work in Linux, the devices used to access the storage can be reused. It is thus possible that a newly created storage volume uses the device that was incompletely revoked before, resulting in the old VM getting access to the new volume despite not owning it.
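The reuse scenario can be illustrated with a deliberately simplified toy model. This is only a sketch: the class `ComputeHost`, its methods, the lowest-free-slot allocation and the VM and volume names are all hypothetical and do not correspond to the real Linux SCSI/multipath stack or to any OpenStack API. It only shows why a stale host-side device mapping combined with a reused identifier leaks a volume.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeHost:
    """Toy model of a compute host (hypothetical, not the real SCSI stack)."""

    # Device slot (think: LUN number) -> volume currently exported on it.
    backend_exports: dict[int, str] = field(default_factory=dict)
    # VM name -> device slots the host still lets that VM use.
    vm_devices: dict[str, set[int]] = field(default_factory=dict)

    def attach(self, vm: str, volume: str) -> int:
        # Assumption: the lowest free slot is handed out, so freed slots are reused.
        slot = next(i for i in range(256) if i not in self.backend_exports)
        self.backend_exports[slot] = volume
        self.vm_devices.setdefault(vm, set()).add(slot)
        return slot

    def detach_via_compute(self, vm: str, slot: int) -> None:
        # Regular path: the host removes the device, then the export is dropped.
        self.vm_devices[vm].discard(slot)
        self.backend_exports.pop(slot, None)

    def force_detach_via_storage(self, slot: int) -> None:
        # Problematic path: only the export is dropped, the host mapping stays.
        self.backend_exports.pop(slot, None)

    def readable_volumes(self, vm: str) -> set[str]:
        # Volumes the VM can actually read through the slots it still holds.
        slots = self.vm_devices.get(vm, set())
        return {self.backend_exports[s] for s in slots if s in self.backend_exports}


host = ComputeHost()
slot = host.attach("vm-a", "vol-a")
host.force_detach_via_storage(slot)     # vm-a keeps a stale device
host.attach("vm-b", "vol-b")            # the freed slot is reused for vol-b
print(host.readable_volumes("vm-a"))    # prints {'vol-b'}
```

In this model, detaching through `detach_via_compute` instead leaves `vm-a` with no readable volumes after the slot is reused, which corresponds to the regular detach path through the compute service.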
In OpenStack setups, this affects storage access via iSCSI and via Fibre Channel.
Authenticated users of the OpenStack IaaS service might thus accidentally get access to storage volumes that they are not authorized to access. This can also be provoked deliberately by performing many forced detachments.
The vulnerability has been assigned CVE-2023-2088.
In the reference implementation of Sovereign Cloud Stack, the storage service is provided by Ceph and access to the storage is handled through RADOS block devices, not the SCSI layer. No problematic reuse of connections, devices or identifiers has been found for this setup, and storage isolation is maintained.
The cloud-in-a-box configurations before ~R4 this year used iSCSI to access local storage volumes; this has meanwhile been replaced by a single-node Ceph setup, so cloud-in-a-box is no longer affected. Note that we don’t consider the cloud-in-a-box setup to be meant for production, so we would not necessarily provide patches for it with the same urgency even if it were still affected.
It is possible for providers to diverge from the SCS default setup with Ceph storage and connect other block storage backends that use SCSI and are thus affected by this issue. We have double-checked with the four production public clouds that currently use SCS (Betacloud, pluscloud open, Wavestack, regio.digital cloud), and they are not affected.
The issue has been reported in private by Jan Wasilewski to the OpenStack Vulnerability Management Team. The reporter and the upstream developers have worked together to address the issue with fixes, and an embargo date has been set to Wednesday, 2023-05-10, 15:00 UTC. At that time, the patches will get merged and an OpenStack Security Advisory (OSSA-2023-003) will be published. The issue is tracked in OpenStack issue #2004555, which should be publicly accessible after the advisory has been published.
Under the responsible disclosure approach used here, the information was shared with a select group of trusted OpenStack users so that they could prepare updates and protect their user data in time for the publication.
The SCS and OSISM teams have analyzed the information carefully and determined that the SCS IaaS reference implementation from OSISM is not affected in the default configuration.
To avoid Cinder removing devices that Nova still assumes it has access to, Cinder should reject force removals of still-attached volumes unless the removal request comes from Nova. There are patches from the upstream maintainers that help Cinder make that distinction. The OpenStack os-brick library gets support for a force parameter that Nova and Glance can then use. In addition, config changes need to be applied to enable Nova to send service tokens along with the user tokens; Cinder then uses the service token to validate the provenance of the request.
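The decision described above can be sketched as follows. This is a minimal illustration, not Cinder's actual code: the types `Token` and `DetachRequest`, the function `allow_detach` and the role name `service` are assumptions made for this example. The sketch also assumes that a detach request for a volume that is not attached is outside the scope of this rule and is allowed.

```python
from dataclasses import dataclass
from typing import Optional

SERVICE_ROLE = "service"  # assumed name of the role carried by Nova's service user


@dataclass(frozen=True)
class Token:
    """Hypothetical, already-validated token: who it belongs to and their roles."""
    user: str
    roles: frozenset


@dataclass(frozen=True)
class DetachRequest:
    """Hypothetical detach request as seen by the storage API."""
    volume_id: str
    volume_attached: bool
    user_token: Token
    service_token: Optional[Token] = None  # only sent by services such as Nova


def allow_detach(req: DetachRequest) -> bool:
    # Volumes that are not attached have no compute-side device to leave stale.
    if not req.volume_attached:
        return True
    # For attached volumes, require proof that the request comes from a service.
    token = req.service_token
    return token is not None and SERVICE_ROLE in token.roles


alice = Token("alice", frozenset({"member"}))
nova = Token("nova", frozenset({"service"}))

direct = DetachRequest("vol-1", True, alice)        # user calls the storage API
via_nova = DetachRequest("vol-1", True, alice, nova)  # Nova acts for the user
print(allow_detach(direct), allow_detach(via_nova))   # prints: False True
```

The point of the design is that the user token alone never suffices for an attached volume, whatever roles the user holds: only the additional service token proves that the compute side has already released the device.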
The next minor release of OSISM (expected end of May / early June) will include the necessary changes for the force parameter. It is still under investigation whether the changes to enable service tokens can be safely applied as part of the automated upgrade process or whether they will instead be documented as an important hint for cloud operators whose storage implementation diverges from the default setup with Ceph (where none of this is needed).
Providers can implement a workaround until the next minor release. It entails a config change ensuring that Nova uses a user with a service role to send tokens to Cinder on behalf of users, plus a policy on the Cinder API that enforces this role. This is described in more detail in the OpenStack Security Advisory. We suggest that providers who use OSISM in modes that may require such protection get in touch with us.
The authors would like to thank the reporter, the upstream OpenStack developers and the OpenStack Vulnerability Management Team for the responsible reporting, careful analysis, fixing, testing and professional handling of the issue and the OSISM team for additional analysis.
The SCS security contact is security@scs.community, as published on https://scs.community/.well-known/security.txt.