| Posted | Nick | Remark | |
|---|---|---|---|
| #openstack-nova - 2019-06-26 | |||
| 17:10:29 | sean-k-mooney | maybe im missreading it as its kind of hard to read the initilal bug | |
| 17:10:31 | mriedem | a pre-check on one of the candidate dest hosts failed | |
| 17:10:44 | mriedem | which triggers a reschedule to another dest host in the conductor live migration task | |
| 17:10:49 | mriedem | the 2nd host works | |
| 17:11:17 | mriedem | but b/c the pre-check failed on the first dest host the instance action event for that one is error which sets the overall action message to 'Error' | |
| 17:11:36 | sean-k-mooney | ah ok | |
| 17:11:46 | mriedem | iow, actions aren't reschedule aware | |
| 17:11:56 | mriedem | or what is a non-fatal error | |
| 17:12:05 | sean-k-mooney | right | |
| 17:12:27 | sean-k-mooney | should we be loging the prechecks as events? | |
| 17:12:59 | sean-k-mooney | i was not exepcting to see compute_check_can_live_migrate_destination events | |
| 17:13:25 | mriedem | hard to say | |
| 17:13:33 | mriedem | if you configure nova to not have alternate hosts for reschedules | |
| 17:13:46 | mriedem | then you'd likely want to know it's that dest pre-check that failed right? | |
| 17:14:18 | sean-k-mooney | maybe or jsut that you had no valid hosts? | |
| 17:14:29 | sean-k-mooney | / exausted teh list of alternates | |
| 17:14:57 | sean-k-mooney | i thikn we would still log the failure right | |
| 17:15:02 | openstackgerrit | Merged openstack/nova master: Remove orphaned comment from _get_group_details https://review.opendev.org/667135 | |
| 17:15:07 | mriedem | sure, if you set https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.migrate_max_retries to 0 for now retries, or you don't have any alternate hosts | |
| 17:15:44 | mriedem | idk, anyway, it's tangential to the auto-revert failed resize thing mel was asking about | |
| 17:15:45 | sean-k-mooney | for me this feels like we are leaking an implemenation detail as an event | |
| 17:16:04 | sean-k-mooney | ya it is | |
| 17:16:11 | mriedem | instance action events are basically entirely leaked implementation details :) | |
| 17:16:17 | mriedem | the event names come from the methods they decorate | |
| 17:16:27 | mriedem | there is no guarantee on api stability for those things | |
| 17:17:17 | sean-k-mooney | ok personally i would prefer not to decorate that function | |
| 17:17:45 | sean-k-mooney | but as you said its tangental to melwitt's topic | |
| 17:36:44 | openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Add a rbd_connect_timeout configurable https://review.opendev.org/667421 | |
| 17:36:57 | openstackgerrit | Eric Fried proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | |
| 17:38:12 | Nick_A | any idea why metadata would send all /24 routes in a region to each instance? http://paste.openstack.org/show/y0lE42EA59yhnu7G1KnY/ | |
| 17:38:25 | openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix AttributeError in RT._update_usage_from_migration https://review.opendev.org/667687 | |
| 17:38:26 | openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix RT init arg order in test_unsupported_move_type https://review.opendev.org/667688 | |
| 17:48:03 | openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Multiple API cleanup changes https://review.opendev.org/666889 | |
| 17:59:52 | sean-k-mooney | yonglihe: efried just reviewing https://review.opendev.org/#/c/648912/14 but why are we looking up instance by name? | |
| 18:01:58 | efried | sean-k-mooney: I haven't the foggiest. I'm involved here in an administrative capacity :) | |
| 18:02:43 | sean-k-mooney | the domain xml has the instance uuid stored in the uuid field for several release now so im wondering why we would use the instance domain name instead | |
| 18:03:43 | melwitt | sean-k-mooney: been meaning to get to that review. after what we briefly discussed at the ptg, that change would be best to piggyback onto the existing periodic for handling deleted instances. I don't know why it would be looking up instances by name | |
| 18:03:45 | efried | k, hopefully yonglihe will be able to answer on the review. Thanks sean-k-mooney. | |
| 18:04:20 | sean-k-mooney | melwitt: it is piggybacking on that task | |
| 18:04:42 | melwitt | but I see a lot of new methods | |
| 18:04:42 | sean-k-mooney | efried: ok im trying to find where it gets teh list of suspected instances | |
| 18:05:16 | sean-k-mooney | melwitt: ya there are. im not sure if they are all needed. | |
| 18:05:36 | melwitt | yeah, I would think there should be none new needed | |
| 18:06:18 | sean-k-mooney | well we need a new method to query the driver for the instance that are runnin on the host but not in the db | |
| 18:07:07 | sean-k-mooney | and then we can call the old meethods that implement the policy. e.g. reap or log or do nothing whatever you have set in the config | |
| 18:07:31 | melwitt | why? there's a self._get_instances_on_driver method already | |
| 18:08:08 | sean-k-mooney | that is a good question :) | |
| 18:08:46 | sean-k-mooney | i have only properly looked at https://review.opendev.org/#/c/648912/14 whic is doing driver change but i need to look at how that is being used in https://review.opendev.org/#/c/627765/28 | |
| 18:08:51 | melwitt | anyway, just saying skimming those patches I don't see why they're so large | |
| 18:09:29 | melwitt | or rather, I expected not such a large patch for this | |
| 18:13:16 | sean-k-mooney | melwitt: this seam oververly complex https://review.opendev.org/#/c/627765/28/nova/compute/manager.py@8884 | |
| 18:14:32 | sean-k-mooney | also _destroy_orphan_instances is not a greate name for that since it might not destroy anything . | |
| 18:15:02 | sean-k-mooney | melwitt: i was wrong however its adding a new periodici task not extending the existing tasks | |
| 18:41:20 | melwitt | sean-k-mooney: yeah, that's what I had thought. it should get much simpler if it's changed to extend the existing periodic. and I think the suggestion at the ptg from dansmith was to add another enumerated choice to the existing conf option that is something like "reap-unknown" which will reap both known deleted and unknown guests. and otherwise just log unknowns | |
| 18:43:20 | openstackgerrit | Merged openstack/nova master: Fix update_provider_tree signature in reference docs https://review.opendev.org/667408 | |
| 18:43:52 | openstackgerrit | Eric Fried proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | |
| 18:44:18 | mriedem | sean-k-mooney: it's named _destroy* because it's similar to the existing _destroy_running_instances | |
| 18:44:21 | mriedem | which might also not destroy anything | |
| 18:44:38 | sean-k-mooney | mriedem: ah ok | |
| 18:45:12 | sean-k-mooney | my main issue with this is it will only really work for the libvirt dirver | |
| 18:45:17 | sean-k-mooney | and the fake driver | |
| 18:53:52 | mriedem | yup | |
| 18:54:01 | mriedem | i believe i noted something along those lines last time i did a deep review on it | |
| 18:54:07 | openstackgerrit | Miguel Ángel Herranz Trillo proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | |
| 18:54:13 | mriedem | i seem to remember why they needed to lookup by name at the time, and it was libvirt-specific | |
| 18:54:35 | sean-k-mooney | ya im looking at it again | |
| 18:55:02 | sean-k-mooney | its because we can have libvirt domains that are for instance that nova created but have been deleted form the db | |
| 18:55:40 | sean-k-mooney | so to destory the running guest they need to use the libvirt domain name | |
| 18:55:58 | mriedem | having said all that, i'm not opposed to starting with something that could eventually be implemented by other drivers | |
| 18:56:07 | mriedem | as long as it's graceful about other drivers not implementing the necessary interface | |
| 18:56:31 | sean-k-mooney | ya it handels the not implmetned excetion correctly in the manager | |
| 18:56:58 | sean-k-mooney | im think of asking them to use the uuid instead of the name however | |
| 18:57:36 | sean-k-mooney | the instance uuid is set in the libvirt domains uuid field | |
| 18:57:57 | mriedem | yeah uuid is ideal if we can use it | |
| 18:58:14 | sean-k-mooney | but i libvirt allows us to deleate by uuid | |
| 18:58:32 | sean-k-mooney | me might just be pushing the translation into the dirver | |
| 18:59:48 | sean-k-mooney | im also wondering how to deal with the fact we could be leaking plugged interfaces | |
| 19:03:55 | mriedem | heh, | |
| 19:04:01 | mriedem | well we could be leaking all sorts of things | |
| 19:04:07 | mriedem | storage connections | |
| 19:04:42 | sean-k-mooney | will undefineing the domin delete the root disk? | |
| 19:04:55 | sean-k-mooney | or other disks | |
| 19:05:26 | mriedem | i doubt it | |
| 19:05:46 | sean-k-mooney | if we are reaping the orpahn vms that were created by nova but are deleted in the db we really should be cleaning up all there local resouces in that case which this current patch does not attemtp to do | |
| 19:06:03 | mriedem | otherwise we wouldn't need separate calls during driver.destroy to cleanup the volumes via brick and unplug vifs via os-vif | |
| 19:06:19 | mriedem | at some point this could also be cyborg devices and such couldn't it? | |
| 19:06:20 | sean-k-mooney | ya that is a good point | |
| 19:06:26 | sean-k-mooney | ya | |
| 19:07:19 | sean-k-mooney | so if we are going to reap these we need to try and clean up as much of the resouces as we can or just support powering down the instance but not reaping them? | |
| 19:07:50 | openstackgerrit | Merged openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | |
| 19:07:57 | sean-k-mooney | if we just delete the domain the operator has noting ot go on to figure out what they need to clean up manually | |
| 19:09:11 | sean-k-mooney | its tricky however as from the domain we dont know what nova create or not but i think we should at lesast try to do some cleanup | |
| 19:12:52 | sean-k-mooney | anyway im going to get something to eat | |
| 19:13:05 | sean-k-mooney | o/ | |
| 19:19:29 | mriedem | there is meta in the domain that tells us if nova created it | |
| 19:19:42 | sean-k-mooney | yes there is | |
| 19:19:48 | mriedem | for libvirt anyway | |
| 19:20:05 | sean-k-mooney | which i asked yonglihe to check when deleteing the domian | |
| 19:20:13 | sean-k-mooney | in the libvirt case that is | |
| 19:20:48 | efried | mriedem: +1 on the ML note, thanks for that | |