Fix-Con: Automatic Fault Localization and Repair of Deep Learning Model Conversions between Frameworks
Converting deep learning models between frameworks is a common step to maximize model compatibility across devices and to leverage optimization features that may be provided exclusively in one deep learning framework. However, this conversion process may be riddled with bugs, leaving the converted models either undeployable or problematic and considerably degrading their prediction correctness. In this paper, we propose Fix-Con, an automated approach for fault localization and repair during model conversion between deep learning frameworks. Fix-Con is capable of detecting and fixing faults introduced in model input, parameters, hyperparameters, and the model graph during conversion. Fix-Con uses a set of fault types (mined from surveying conversion issues reported in code repositories and forums) to localize potential conversion faults in the converted target model and then repair them appropriately, e.g., by replacing the parameters of the target model with those from the source model. This is done iteratively for every image in the dataset, comparing output label differences between the source model and the converted target model until all differences are resolved. We evaluate the effectiveness of Fix-Con in fixing model conversion bugs of three widely used image recognition models converted across four different deep learning frameworks. Overall, Fix-Con was able to fix 462 out of 755 detected conversion faults, either completely repairing or significantly improving the performance of 14 out of the 15 erroneous conversion cases.
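The compare-and-repair loop described above can be illustrated with a minimal sketch. It is not Fix-Con's implementation: it assumes a toy linear classifier stored as a plain dictionary, covers only the parameter fault type, and every name in it (`predict_label`, `label_mismatches`, `differing_parameters`, `repair_parameters`) is hypothetical.

```python
from copy import deepcopy


def predict_label(model, x):
    """Return the argmax class of a toy linear classifier (assumed model form)."""
    scores = [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(model["weights"], model["bias"])
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def label_mismatches(source, target, inputs):
    """Indices of inputs whose predicted label differs between the two models."""
    return [
        i for i, x in enumerate(inputs)
        if predict_label(source, x) != predict_label(target, x)
    ]


def _flatten(value):
    """Yield the scalar entries of a nested list."""
    if isinstance(value, (list, tuple)):
        for item in value:
            yield from _flatten(item)
    else:
        yield value


def differing_parameters(source, target, tol=1e-6):
    """Localization step: names of parameter tensors that differ beyond tol."""
    return [
        name for name in source
        if any(
            abs(a - b) > tol
            for a, b in zip(_flatten(source[name]), _flatten(target[name]))
        )
    ]


def repair_parameters(source, target, inputs):
    """Repair step: copy differing tensors from the source until labels agree."""
    repaired = deepcopy(target)  # leave the original target untouched
    applied = []
    for name in differing_parameters(source, repaired):
        if not label_mismatches(source, repaired, inputs):
            break  # all label differences are already resolved
        repaired[name] = deepcopy(source[name])
        applied.append(name)
    return repaired, applied


# Hypothetical example: the conversion corrupted the bias of the target model.
source = {"weights": [[1.0, 0.0], [0.0, 1.0]], "bias": [0.0, 0.0]}
target = {"weights": [[1.0, 0.0], [0.0, 1.0]], "bias": [0.0, 5.0]}
inputs = [[2.0, 1.0], [1.0, 2.0], [3.0, 0.5]]

repaired, applied = repair_parameters(source, target, inputs)
print("repaired tensors:", applied)
print("remaining mismatches:", label_mismatches(source, repaired, inputs))
```

The sketch stops as soon as the output labels agree, mirroring the stated stopping condition. Faults in the model input, hyperparameters, or graph would each need their own localization and repair strategy.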