{"id":25929,"date":"2019-07-11T00:39:29","date_gmt":"2019-07-11T04:39:29","guid":{"rendered":"https:\/\/www.dannyadam.com\/blog\/?p=25929"},"modified":"2020-01-14T22:09:10","modified_gmt":"2020-01-15T03:09:10","slug":"compressing-vgg-for-style-transfer","status":"publish","type":"post","link":"https:\/\/www.dannyadam.com\/blog\/2019\/07\/compressing-vgg-for-style-transfer\/","title":{"rendered":"Compressing VGG for Style Transfer"},"content":{"rendered":"\n<table style=\"table-layout: fixed; text-align: center; border: none; border-collapse: collapse;\">\n    <tr>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q0_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q0_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q8_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q8_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q7_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q7_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n    <\/tr>\n    <tr>\n        <td style=\"border: none;\">32-bit float (no 
quantization)<\/td>\n        <td style=\"border: none;\">8-bit<\/td>\n        <td style=\"border: none;\">7-bit<\/td>\n    <\/tr>\n    <tr>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q6_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q6_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q5_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q5_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q4_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q4_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n    <\/tr>\n    <tr>\n        <td style=\"border: none;\">6-bit<\/td>\n        <td style=\"border: none;\">5-bit<\/td>\n        <td style=\"border: none;\">4-bit<\/td>\n    <\/tr>\n    <tr>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a 
href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q3_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q3_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q2_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q2_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q1_elephant.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q1_elephant_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n    <\/tr>\n    <tr>\n        <td style=\"border: none;\">3-bit<\/td>\n        <td style=\"border: none;\">2-bit<\/td>\n        <td style=\"border: none;\">1-bit<\/td>\n    <\/tr>\n<\/table>\n\n\n\n<p>I recently implemented <a href=\"https:\/\/github.com\/dstein64\/pastiche\">pastiche<\/a>\u2014discussed in a prior <a href=\"https:\/\/www.dannyadam.com\/blog\/2019\/06\/pastiche\/\">post<\/a>\u2014for applying neural style transfer. I encountered a size limit when uploading the library to <a href=\"https:\/\/pypi.org\/\">PyPI<\/a>, as a package cannot exceed 60MB. 
The 32-bit floating point weights for the underlying VGG model [<a href=\"https:\/\/www.dannyadam.com\/blog\/2019\/07\/compressing-vgg-for-style-transfer\/#references\">1<\/a>] were contained in an 80MB file. My package was subsequently approved for a size limit increase that could accommodate the VGG weights as-is, but I was still interested in compressing the model.<\/p>\n\n\n\n<p>Various techniques have been proposed for compressing neural networks\u2014including distillation [<a href=\"\/\/www.dannyadam.com\/blog\/2019\/07\/compressing-vgg-for-style-transfer\/#references\">2<\/a>] and quantization [<a href=\"\/\/www.dannyadam.com\/blog\/2019\/07\/compressing-vgg-for-style-transfer\/#references\">3<\/a>,<a href=\"\/\/www.dannyadam.com\/blog\/2019\/07\/compressing-vgg-for-style-transfer\/#references\">4<\/a>]\u2014which have been shown to work well for classification. My problem involved style transfer, so I was not sure how model compression would impact the results.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong><span style=\"text-decoration: underline;\">Experiments<\/span><\/strong><\/h4>\n\n\n\n<p>I decided to experiment with weight quantization, using a scheme where I could store the quantized weights on disk and then decompress them to full 32-bit floats at runtime. This quantization scheme would allow me to continue using my existing code after the model is loaded. I am not targeting environments where memory is a constraint, so I was not particularly interested in approaches that would also reduce the model footprint at runtime. 
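A minimal sketch of this codebook-style scheme, using NumPy and a basic 1-D Lloyd's k-means (illustrative only; the function names are mine, and this is not pastiche's actual code):

```python
import numpy as np

def kmeans_1d(x, k, iters=20):
    # Basic 1-D Lloyd's k-means (illustrative; a dedicated 1-D solver
    # can instead compute a globally optimal clustering).
    centroids = np.quantile(x, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean()
    return labels, centroids

def quantize(weights, bits=8):
    # Cluster a layer's weights into 2**bits values; store each weight
    # as a small integer code plus a shared float32 codebook.
    labels, codebook = kmeans_1d(weights.ravel(), 2 ** bits)
    codes = labels.astype(np.uint8).reshape(weights.shape)  # bits <= 8
    return codes, codebook.astype(np.float32)

def dequantize(codes, codebook):
    # Reconstruct full 32-bit floats at load time, so downstream
    # style-transfer code runs unchanged.
    return codebook[codes]
```

Each layer then costs at most one byte per weight (for bit widths up to 8) plus a small shared codebook, instead of four bytes per weight.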
I used <a href=\"https:\/\/github.com\/dstein64\/kmeans1d\">kmeans1d<\/a>\u2014discussed in a prior <a href=\"https:\/\/www.dannyadam.com\/blog\/2019\/07\/kmeans1d-globally-optimal-efficient-1d-k-means\/\">post<\/a>\u2014for quantizing each layer&#8217;s weights.<\/p>\n\n\n<p><!--more--><\/p>\n\n\n<p>Before I implemented support for loading a quantized VGG model, I first ran experiments to see how different levels of compression would impact style transfer. I did not conduct extensive experiments\u2014just a few style transfers at different levels of compression. <code><a href=\"https:\/\/github.com\/dstein64\/pastiche\/blob\/quantization_experiments\/quantize.py\">quantize.py<\/a><\/code> creates updated VGG models with simulated quantization, and <code><a href=\"https:\/\/github.com\/dstein64\/pastiche\/blob\/quantization_experiments\/quantized_pastiche.sh\">quantized_pastiche.sh<\/a><\/code> runs style transfer using the updated VGG models. These scripts are in a separate <a href=\"https:\/\/github.com\/dstein64\/pastiche\/tree\/quantization_experiments\">branch<\/a> I created for the experiments.<\/p>\n\n\n\n<p>The images at the top of this post were generated with Edvard Munch&#8217;s <a href=\"https:\/\/en.wikipedia.org\/wiki\/The_Scream\">The Scream<\/a> and a <a href=\"https:\/\/photos.dannyadam.com\/Gallery\/i-bB6ZWwz\">photo<\/a> I took at the Pittsburgh Zoo in 2017. The images below were generated with Vincent van Gogh\u2019s <em><a href=\"https:\/\/en.wikipedia.org\/wiki\/The_Starry_Night\">The Starry Night<\/a><\/em> and a <a href=\"https:\/\/photos.dannyadam.com\/Gallery\/i-2RX5VX5\">photo<\/a> I took in Boston in 2015. 
The image captions indicate the compression rate of the VGG model used for the corresponding style transfer.<\/p>\n\n\n\n<table style=\"table-layout: fixed; text-align: center; border: none; border-collapse: collapse;\">\n    <tr>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q0_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q0_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q8_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q8_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q7_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q7_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n    <\/tr>\n    <tr>\n        <td style=\"border: none;\">32-bit float (no quantization)<\/td>\n        <td style=\"border: none;\">8-bit<\/td>\n        <td style=\"border: none;\">7-bit<\/td>\n    <\/tr>\n    <tr>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a 
href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q6_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q6_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q5_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q5_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q4_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q4_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n    <\/tr>\n    <tr>\n        <td style=\"border: none;\">6-bit<\/td>\n        <td style=\"border: none;\">5-bit<\/td>\n        <td style=\"border: none;\">4-bit<\/td>\n    <\/tr>\n    <tr>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q3_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q3_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td 
style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q2_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q2_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n        <td style=\"padding: 0px; padding-left: 5px; padding-right: 5px; padding-top: 2px; padding-top: 10px; border: none;\">\n            <a href=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2019\/07\/q1_boston.jpg\">\n                <img decoding=\"async\" style=\"width:100%; height:auto; display: block;\"\n                     src=\"https:\/\/www.dannyadam.com\/blog\/wp-content\/uploads\/2020\/01\/q1_boston_thumbnail.jpg\">\n            <\/a>\n        <\/td>\n    <\/tr>\n    <tr>\n        <td style=\"border: none;\">3-bit<\/td>\n        <td style=\"border: none;\">2-bit<\/td>\n        <td style=\"border: none;\">1-bit<\/td>\n    <\/tr>\n<\/table>\n\n\n\n<h4 class=\"wp-block-heading\"><strong><span style=\"text-decoration: underline;\">Implementation<\/span><\/strong><\/h4>\n\n\n\n<p>I originally decided to compress the model using 6-bit weights, and ran a few additional style transfers to check the quality at this compression level. I <a href=\"https:\/\/github.com\/dstein64\/pastiche\/commit\/02ad33df4be0e9c99781a7c8f195ce0d87a35191\">modified<\/a> the code to generate and load VGG models with weights quantized to arbitrary bit widths. 
Unfortunately, my implementation noticeably increased model-loading latency, taking almost twenty seconds for a model with weights compressed to 2 bits (I didn&#8217;t test other bit widths, but larger ones would presumably take longer).<\/p>\n\n\n\n<p>I subsequently decided to quantize the weights to 8 bits instead of 6 bits, since this allowed for fast processing using PyTorch&#8217;s built-in <code>uint8<\/code> type. The VGG file size decreased from 80MB to 20MB, well within the 60MB PyPI limit that I originally encountered. Loading the quantized model takes less than 1 second.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"references\"><strong><span style=\"text-decoration: underline;\">References<\/span><\/strong><\/h4>\n\n\n\n<p>[1] Simonyan, Karen, and Andrew Zisserman. \u201cVery Deep Convolutional Networks for Large-Scale Image Recognition.\u201d ArXiv:1409.1556 [Cs], September 4, 2014. <a href=\"http:\/\/arxiv.org\/abs\/1409.1556\">http:\/\/arxiv.org\/abs\/1409.1556<\/a>.<br><br>[2] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. \u201cDistilling the Knowledge in a Neural Network.\u201d ArXiv:1503.02531 [Cs, Stat], March 9, 2015. <a href=\"http:\/\/arxiv.org\/abs\/1503.02531\">http:\/\/arxiv.org\/abs\/1503.02531<\/a>.<br><br>[3] Vanhoucke, Vincent, Andrew Senior, and Mark Z. Mao. \u201cImproving the Speed of Neural Networks on CPUs.\u201d In Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.<br><br>[4] Han, Song, Huizi Mao, and William J. Dally. \u201cDeep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding.\u201d ArXiv:1510.00149 [Cs], October 1, 2015. 
<a href=\"http:\/\/arxiv.org\/abs\/1510.00149\">http:\/\/arxiv.org\/abs\/1510.00149<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>32-bit float (no quantization) 8-bit 7-bit 6-bit 5-bit 4-bit 3-bit 2-bit 1-bit I recently implemented pastiche\u2014discussed in a prior post\u2014for applying neural style transfer. I encountered a size limit when uploading the library to PyPI, as a package cannot exceed 60MB. The 32-bit floating point weights for the underlying VGG model [1] were contained in [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[1],"tags":[46,56,65,72,71],"class_list":["post-25929","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-machine-learning","tag-neural-networks","tag-neural-style-transfer","tag-quantization","tag-vgg"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p1sCC6-6Kd","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/posts\/25929","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/users\/3"}]
,"replies":[{"embeddable":true,"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/comments?post=25929"}],"version-history":[{"count":104,"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/posts\/25929\/revisions"}],"predecessor-version":[{"id":26270,"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/posts\/25929\/revisions\/26270"}],"wp:attachment":[{"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/media?parent=25929"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/categories?post=25929"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dannyadam.com\/blog\/wp-json\/wp\/v2\/tags?post=25929"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}