| Posted | Nick | Remark | |
|---|---|---|---|
| #openstack-nova - 2019-06-26 | |||
| 18:13:16 | sean-k-mooney | melwitt: this seam oververly complex https://review.opendev.org/#/c/627765/28/nova/compute/manager.py@8884 | |
| 18:14:32 | sean-k-mooney | also _destroy_orphan_instances is not a greate name for that since it might not destroy anything . | |
| 18:15:02 | sean-k-mooney | melwitt: i was wrong however its adding a new periodici task not extending the existing tasks | |
| 18:41:20 | melwitt | sean-k-mooney: yeah, that's what I had thought. it should get much simpler if it's changed to extend the existing periodic. and I think the suggestion at the ptg from dansmith was to add another enumerated choice to the existing conf option that is something like "reap-unknown" which will reap both known deleted and unknown guests. and otherwise just log unknowns | |
| 18:43:20 | openstackgerrit | Merged openstack/nova master: Fix update_provider_tree signature in reference docs https://review.opendev.org/667408 | |
| 18:43:52 | openstackgerrit | Eric Fried proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | |
| 18:44:18 | mriedem | sean-k-mooney: it's named _destroy* because it's similar to the existing _destroy_running_instances | |
| 18:44:21 | mriedem | which might also not destroy anything | |
| 18:44:38 | sean-k-mooney | mriedem: ah ok | |
| 18:45:12 | sean-k-mooney | my main issue with this is it will only really work for the libvirt dirver | |
| 18:45:17 | sean-k-mooney | and the fake driver | |
| 18:53:52 | mriedem | yup | |
| 18:54:01 | mriedem | i believe i noted something along those lines last time i did a deep review on it | |
| 18:54:07 | openstackgerrit | Miguel Ángel Herranz Trillo proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | |
| 18:54:13 | mriedem | i seem to remember why they needed to lookup by name at the time, and it was libvirt-specific | |
| 18:54:35 | sean-k-mooney | ya im looking at it again | |
| 18:55:02 | sean-k-mooney | its because we can have libvirt domains that are for instance that nova created but have been deleted form the db | |
| 18:55:40 | sean-k-mooney | so to destory the running guest they need to use the libvirt domain name | |
| 18:55:58 | mriedem | having said all that, i'm not opposed to starting with something that could eventually be implemented by other drivers | |
| 18:56:07 | mriedem | as long as it's graceful about other drivers not implementing the necessary interface | |
| 18:56:31 | sean-k-mooney | ya it handels the not implmetned excetion correctly in the manager | |
| 18:56:58 | sean-k-mooney | im think of asking them to use the uuid instead of the name however | |
| 18:57:36 | sean-k-mooney | the instance uuid is set in the libvirt domains uuid field | |
| 18:57:57 | mriedem | yeah uuid is ideal if we can use it | |
| 18:58:14 | sean-k-mooney | but i libvirt allows us to deleate by uuid | |
| 18:58:32 | sean-k-mooney | me might just be pushing the translation into the dirver | |
| 18:59:48 | sean-k-mooney | im also wondering how to deal with the fact we could be leaking plugged interfaces | |
| 19:03:55 | mriedem | heh, | |
| 19:04:01 | mriedem | well we could be leaking all sorts of things | |
| 19:04:07 | mriedem | storage connections | |
| 19:04:42 | sean-k-mooney | will undefineing the domin delete the root disk? | |
| 19:04:55 | sean-k-mooney | or other disks | |
| 19:05:26 | mriedem | i doubt it | |
| 19:05:46 | sean-k-mooney | if we are reaping the orpahn vms that were created by nova but are deleted in the db we really should be cleaning up all there local resouces in that case which this current patch does not attemtp to do | |
| 19:06:03 | mriedem | otherwise we wouldn't need separate calls during driver.destroy to cleanup the volumes via brick and unplug vifs via os-vif | |
| 19:06:19 | mriedem | at some point this could also be cyborg devices and such couldn't it? | |
| 19:06:20 | sean-k-mooney | ya that is a good point | |
| 19:06:26 | sean-k-mooney | ya | |
| 19:07:19 | sean-k-mooney | so if we are going to reap these we need to try and clean up as much of the resouces as we can or just support powering down the instance but not reaping them? | |
| 19:07:50 | openstackgerrit | Merged openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | |
| 19:07:57 | sean-k-mooney | if we just delete the domain the operator has noting ot go on to figure out what they need to clean up manually | |
| 19:09:11 | sean-k-mooney | its tricky however as from the domain we dont know what nova create or not but i think we should at lesast try to do some cleanup | |
| 19:12:52 | sean-k-mooney | anyway im going to get something to eat | |
| 19:13:05 | sean-k-mooney | o/ | |
| 19:19:29 | mriedem | there is meta in the domain that tells us if nova created it | |
| 19:19:42 | sean-k-mooney | yes there is | |
| 19:19:48 | mriedem | for libvirt anyway | |
| 19:20:05 | sean-k-mooney | which i asked yonglihe to check when deleteing the domian | |
| 19:20:13 | sean-k-mooney | in the libvirt case that is | |
| 19:20:48 | efried | mriedem: +1 on the ML note, thanks for that | |
| 19:21:24 | mriedem | np | |
| 19:22:36 | sean-k-mooney | that is what https://review.opendev.org/#/c/648912/14/nova/virt/libvirt/driver.py@9695 is doing and its used to filer the domains here https://review.opendev.org/#/c/627765/28/nova/compute/manager.py@8886 | |
| 19:59:33 | efried | mriedem: took me three days to go through all the specs, but http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007381.html | |
| 20:10:01 | mriedem | efried: huh https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.*+reviewedby:self seems to not work properly, it's only showing me https://review.opendev.org/#/c/631154/ but i've clearly commented on https://review.opendev.org/#/c/648686/ as well | |
| 20:10:17 | mriedem | maybe reviewedby is only on the latest PS? | |
| 20:10:55 | mriedem | aha | |
| 20:10:56 | mriedem | https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.*+reviewer:self | |
| 20:11:00 | mriedem | reviewer:self, not reviewedby:self | |
| 20:14:14 | efried | whoops, thanks | |
| 21:31:07 | melwitt | mnaser, imacdonn: hi, as responders to the ML thread awhile back, I have a spec up for showing server status as UNKNOWN if host status is UNKNOWN that has been receiving some reviews. your reviews would be helpful for deciding whether it goes forward https://review.opendev.org/666181 | |
| 21:33:10 | mriedem | dansmith: you may want to drop your +2 to a +1 or 0 https://review.opendev.org/#/c/457886/ until i get the ceph job results on it | |
| 21:41:44 | openstackgerrit | Miguel Ángel Herranz Trillo proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | |
| 21:44:11 | mriedem | dansmith: nvm, lyarwood already had a patch up to test that | |
| 22:14:40 | mriedem | efried: are you ok with me just pushing up this test change and +2ing? https://review.opendev.org/#/c/659780/3/nova/tests/unit/virt/disk/mount/test_nbd.py | |
| 22:14:51 | efried | ... | |
| 22:15:58 | efried | mriedem: yes, and I'll +A | |
| 22:16:52 | openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | |
| 22:17:37 | mriedem | done | |
| 22:18:58 | efried | +A | |
| 22:25:39 | mnaser | melwitt: left a comment thanks :D | |
| 22:43:44 | melwitt | thanks | |
| #openstack-nova - 2019-06-27 | |||
| 13:13:51 | gibi | mriedem: the previous patches just refactored the heal allocation code path | |
| 13:14:10 | gibi | to make it easy to add the port healing | |
| 13:15:30 | gibi | mriedem: the basic structure of final patch | |
| 13:18:00 | gibi | mriedem: during _heal_allocations_for_instance first we get if the instance has ports that needs healing | |
| 13:18:26 | gibi | this _get_port_allocations_to_heal is the majority of the added code there | |
| 13:19:01 | gibi | as it detect is there is any port that needs healing and generates the allocation fragment for that port | |
| 13:19:33 | mriedem | ok i'm assuming just checking if the port has a resource request (if hitting neutron directly) or checking the info cache for a bound port with the allocatoin in the binding profile? | |
| 13:19:52 | gibi | we are hitting neutron directly | |
| 13:20:02 | mriedem | hmm, that's kind of expensive isn't it? | |
| 13:20:19 | mriedem | why can't we use the info cache to tell if the vif has a port allocation? | |
| 13:20:49 | mriedem | or is that some kind of chicken and egg issue? or we don't trust the cache? | |
| 13:20:55 | gibi | we need to hit neutron to see if there is resource request on the port | |
| 13:21:00 | gibi | that is not part of the vif cache | |
| 13:21:23 | mriedem | sure but the binding profile in the vif cache has a thing that tells us if it had a resource request right? | |
| 13:21:44 | gibi | mriedem: now, the binding profile in the cache only could tell us if there is allocation for the port | |
| 13:21:52 | openstackgerrit | Eric Fried proposed openstack/nova-specs master: WIP: Physical TPM passthrough https://review.opendev.org/667926 | |
| 13:21:57 | gibi | but if there is no allocation then we don't know if there is a need for such allocation | |
| 13:22:48 | mriedem | so let's say the binding profile in the vif cache doesn't have that allocation marker, but we find a port with a resource request - we're going to heal the allocation by putting allocations to placement for that instance right? | |
| 13:23:03 | mriedem | are we also going to update the cache to indicate it's now got an allocatoin? | |
| 13:23:14 | gibi | mriedem: right, and also updating neutron with the rp uuid | |
| 13:23:27 | gibi | mriedem: hm, we are not updating the info cache | |
| 13:23:29 | gibi | :/ | |
| 13:24:16 | mriedem | yeah, so i'm wondering about the scenario that our cache is busted, we go tell placement there is an allocation, but then what happens when we teardown that port/server in the compute service? are we basing that on the cache? b/c if so, we'll leak the allocation correcT? | |
| 13:24:25 | openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Physical TPM passthrough https://review.opendev.org/667928 | |
| 13:25:07 | mriedem | similarly are we only creating the port allocation in heal_allocations if the port is bound? | |
| 13:25:13 | gibi | mriedem: we cannot leak allocation during delete as delete deletes everything for the consumer | |
| 13:25:28 | mriedem | gibi: what about when we detach the port though? | |
| 13:25:31 | mriedem | but don't delete the server | |
| 13:26:31 | gibi | I hope that code path works based on the allocation key in the binding profile, but let me check | |