Earlier  
Posted Nick Remark
#openstack-nova - 2019-06-28
14:45:26 mriedem sean-k-mooney: we fail
14:45:52 mriedem sean-k-mooney: https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L468
14:46:26 mriedem gibi: so i think we agree that rolling back the port binding update is simpler than the allocation
14:46:33 gibi mriedem: good point about port bound to a host. But can it be a port with device_id=instance_uuid that is not bound?
14:46:35 mriedem and i'd prefer we include a rollback
14:47:01 mriedem gibi: "But can it be a port with device_id=instance_uuid that is not bound?" that i'm not sure about
14:47:03 mriedem sean-k-mooney: ^
14:47:05 sean-k-mooney ... ok was an api breakage on upgrade but i understand why it was done
14:47:12 mriedem sean-k-mooney: oh i think we can,
14:47:15 mriedem because of shelve offload
14:47:31 mriedem a shelved instance still has its ports and volumes
14:47:37 mriedem but those ports and volumes aren't "bound" to a host
14:47:40 sean-k-mooney yes shelve offloaded would still have the device id set
14:47:47 gibi ack
14:47:55 gibi then I have to check for boundness as well
14:48:01 mriedem does heal_allocations filter out instances that aren't on a host....
14:48:04 mriedem it should implicitly,
14:48:08 mriedem because it's using instance.node
14:48:30 sean-k-mooney gibi: you can filter py ports with vif_type!=vif-unbound
14:48:32 gibi yeah it checks instance.node then
14:48:47 gibi so we don't need an extra vif_type!=vif-unbound check
14:49:17 sean-k-mooney ok i guess that makes sense
14:49:34 gibi if we don't know the where the instance runs then we don't know which RP tree need to be targeted with the healing
14:50:10 gibi mriedem: I don't get your comment at https://review.opendev.org/#/c/668184/1/nova/cmd/manage.py@1891
14:50:11 sean-k-mooney yep. so we whould never need to heal offloaded instances
14:50:47 mriedem now i'm not so sure where we unbind the port on shelve offload,
14:50:49 gibi mriedem: is it about trying the rollback for each port even if one of it fails?
14:50:51 mriedem because https://github.com/openstack/nova/blob/324da0532f3b59aa16233a93a260d289e55860fb/nova/compute/manager.py#L5168 is a noop for neutron
14:51:12 mriedem gibi: yes
14:51:22 mriedem right now you'd stop on the first port update that fails
14:51:31 mriedem i'm saying we should try to clean all and fail at the end
14:51:51 gibi mriedem: OK got it, that could be done
14:52:29 mriedem https://github.com/openstack/nova/blob/324da0532f3b59aa16233a93a260d289e55860fb/nova/network/neutronv2/api.py#L3183
14:52:39 mriedem so are ports unbound when the instance is shelved offloaded....
14:53:12 sean-k-mooney maybe in driver.destroy
14:53:15 sean-k-mooney ill go check
14:53:18 sean-k-mooney but they might not be
14:53:50 sean-k-mooney destroy should at least unplug them
14:54:07 mriedem right
14:54:14 sean-k-mooney acttuly it cant unbind
14:54:18 mriedem driver.destroy just does the unplug, the virt drivers never mess with port bindings
14:54:20 sean-k-mooney because we use destoy in reboot
14:54:28 sean-k-mooney and we dont ubind there
14:54:57 mriedem anyway for gibi's heal_allocation changes it doesn't really matter b/c we wouldn't get that far anyway for a shelved offloaded instance b/c we won't try to heal allocations on an instance that doesn't have a host est
14:55:00 mriedem *set
14:55:46 gibi ack
14:56:08 sean-k-mooney right its just something to look into to see if we have a latent bug
14:56:45 sean-k-mooney i would expect it to happen as part of moving it ot offloaded
14:57:08 sean-k-mooney but i dont think it would break anything either
14:57:10 gibi so in summary. I will add the rollback of neutron updates. But I only add a todo to possible rollback-retry
14:57:36 gibi mriedem: is that works for you ?
14:57:53 gibi mriedem: or you are a bit more on the paranoid side :)
14:58:50 mriedem the backoff retry could be a follow up
14:58:58 gibi mriedem: OK
14:59:00 mriedem try to stagger these changes as much as possible
14:59:04 mriedem since this is a large change
14:59:05 sean-k-mooney oh other edgecase. do we currently allow attaching ports with resouce requests
14:59:13 mriedem sean-k-mooney: no
14:59:17 sean-k-mooney cool
14:59:20 gibi :)
14:59:46 gibi mriedem: I can split out the helper functions from the current commit if that helps
15:00:16 mriedem gibi: sure
15:00:39 mriedem sean-k-mooney: this is where that would fail https://github.com/openstack/nova/blob/324da0532f3b59aa16233a93a260d289e55860fb/nova/compute/api.py#L4408
15:01:25 sean-k-mooney cool. AttachInterfaceWithQoSPolicyNotSupported is slightly missleading but it makes sense
15:01:46 sean-k-mooney we allow attaching ports with QoS polices jut not with min bandwith qos policies
15:01:59 sean-k-mooney e.g. DSCP is fine as is max bandwith
15:02:40 sean-k-mooney its really we dont allow attaching with resource requests.
15:02:42 gibi sean-k-mooney: file a bug, I'm happy to amend the message of AttachInterfaceWithQoSPolicyNotSupported exception with a more specific message
15:04:08 mriedem efried: the reason i didn't jump on https://storyboard.openstack.org/#!/story/2005258 was because of the osc semantic patterns to follow for other osc commands as mentioned in the comments - i gets a bit hairy
15:04:30 mriedem rather than just like a simple --append or --remove
15:04:31 sean-k-mooney gibi: sure i could but we will also be changing that at some point too right
15:04:53 sean-k-mooney e.g. we will eventrually allow it if the host can fulfil the request but like not until U+
15:04:58 gibi sean-k-mooney: based on my current speed of progress with the server move operation patches it will be far in the future
15:05:29 gibi sean-k-mooney: but yes, in theory I would like to get the support for attach interface if the request fitts on the current host
15:05:52 sean-k-mooney well i know that neutorn want to start modeing ip pools in placment and use resouce resquest in the future
15:06:08 sean-k-mooney so i expect it to become more common
15:06:15 gibi yeah
15:06:31 mriedem are you talking about for routed networks?
15:06:35 mriedem modeling routed networks
15:06:49 sean-k-mooney not just routed netwrok but that is one usecase
15:06:51 mriedem i.e. https://review.opendev.org/#/c/656885/
15:07:19 mriedem that's been kicked around for many years and is going to be complicated i imagine
15:07:42 mriedem we don't even have integration testing for the routed networks support that is documented in the neutron docs
15:08:08 mriedem https://docs.openstack.org/neutron/latest/admin/config-routed-networks.html
15:08:11 sean-k-mooney i think they also wanted to supprot using sharing aggrates to model things like floating ips and what host can recive an ip form which subnet pool
15:08:44 mriedem well the good news is pop up teams officially exist now so all of this work can get done!
15:09:14 sean-k-mooney ya without a cross project effort this is hard to make progress on
15:09:52 mriedem i was being a sarcastic jerk but yeah
15:11:09 sean-k-mooney hehe well i have tried to do enough stuff that crosses the nova neutron boundry that i know without getting buyin form both core teams up front and finding people to care about it it will fail
15:11:33 sean-k-mooney anyway thats a tangent
15:11:56 mriedem lyarwood: i guess you got busy https://review.opendev.org/#/q/topic:bug/1653953+(status:open+OR+status:merged)
15:13:11 lyarwood mriedem: busy spamming yes, hopefully that's what you had in mind.
15:13:24 mriedem it is
15:14:01 lyarwood wonderful, there are also DNM changes actually testing things on each branch btw
15:14:08 lyarwood I'll add them to the same topic now to make it clear
15:14:45 mriedem oh i see
15:15:05 mriedem nova DNM changes that depend on the related devstack-plugin-ceph change per branch that depeds on the backport per nova branch....
15:15:28 lyarwood and around and around we go
15:16:08 lyarwood but that should allow the fix to land then the change to enable the test without everything blowing up like it did with my cinder migrate series.
15:16:08 mriedem this is going to be a pain for cross-cell resize, i'm not sure how i'm going to handle that

Earlier   Later