{"id":164087,"date":"2024-02-19T10:00:12","date_gmt":"2024-02-19T15:00:12","guid":{"rendered":"https:\/\/www.kdnuggets.com\/?p=164087"},"modified":"2024-02-16T12:44:24","modified_gmt":"2024-02-16T17:44:24","slug":"introduction-to-memory-profiling-in-python","status":"publish","type":"post","link":"https:\/\/www.kdnuggets.com\/introduction-to-memory-profiling-in-python","title":{"rendered":"Introduction to Memory Profiling in Python"},"content":{"rendered":"<p> <center><img decoding=\"async\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/c_introduction_memory_profiling_python_1.png\" alt=\"Introduction to Memory Profiling in Python\" width=\"100%\"\/><br \/>\n <font size=\"-1\">Image by Author<\/font><\/center><br \/>\n&nbsp; <\/p>\n<p>Profiling Python code is helpful to understand how the code works and identify opportunities for optimization. You\u2019ve probably profiled your Python scripts for <a href=\"\/profiling-python-code-using-timeit-and-cprofile\" rel=\"noopener\" target=\"_blank\">time-related metrics<\/a>\u2014measuring execution times of specific sections of code.\u00a0<\/p>\n<p>But profiling for memory\u2014to understand memory allocation and deallocation during execution\u2014is just as important. 
Memory profiling can help you identify memory leaks, understand resource utilization, and spot potential issues with scaling.\u00a0<\/p>\n<p>In this tutorial, we\u2019ll explore profiling Python code for memory usage using the Python package <a href=\"https:\/\/pypi.org\/project\/memory-profiler\/\" rel=\"noopener\" target=\"_blank\">memory-profiler<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<h1>Installing the memory-profiler Python Package<\/h1>\n<p>&nbsp;<\/p>\n<p>Let\u2019s start by installing the memory-profiler Python package using pip:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>pip3 install memory-profiler<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<blockquote><p>\n <font size=\"+1\"><strong>Note<\/strong>: Install memory-profiler in a dedicated <a href=\"https:\/\/docs.python.org\/3\/library\/venv.html\" rel=\"noopener\" target=\"_blank\">virtual environment<\/a> for the project instead of in your global environment. 
We'll also be using the plotting capabilities available in memory-profiler to plot the memory usage, which requires <a href=\"https:\/\/matplotlib.org\/\" rel=\"noopener\" target=\"_blank\">matplotlib<\/a>. So make sure you also have matplotlib installed in the project\u2019s virtual environment.<\/font> <\/p><\/blockquote>\n<p>&nbsp;<\/p>\n<h1>Profiling Memory Usage with the @profile Decorator<\/h1>\n<p>&nbsp;<\/p>\n<p>Let's create a Python script (say main.py) with a function <code style=\"background: #F5F5F5;\">process_strs<\/code>:<\/p>\n<ul>\n<li>The function creates two super long Python strings <code style=\"background: #F5F5F5;\">str1<\/code> and <code style=\"background: #F5F5F5;\">str2<\/code> and concatenates them.\u00a0\n<li>The keyword argument <code style=\"background: #F5F5F5;\">reps<\/code> controls the number of times the hardcoded strings are to be repeated to create <code style=\"background: #F5F5F5;\">str1<\/code> and <code style=\"background: #F5F5F5;\">str2<\/code>. 
It defaults to 10**6, which is used if the function call does not specify a value for <code style=\"background: #F5F5F5;\">reps<\/code>.\n<li>We then explicitly delete <code style=\"background: #F5F5F5;\">str2<\/code>.\u00a0\n<li>The function returns the concatenated string <code style=\"background: #F5F5F5;\">str3<\/code>.\n<\/ul>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code># main.py\r\n\r\nfrom memory_profiler import profile\r\n\r\n@profile\r\ndef process_strs(reps=10**6):\r\n\tstr1 = 'python'*reps\r\n\tstr2 = 'programmer'*reps\r\n\tstr3 = str1 + str2\r\n\tdel str2\r\n\treturn str3\r\n\r\nprocess_strs(reps=10**7)<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>Running the script should give you output similar to this:\u00a0<\/p>\n<p>&nbsp;<br \/>\n<center><img decoding=\"async\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/c_introduction_memory_profiling_python_5.png\" alt=\"Introduction to Memory Profiling in Python\" width=\"100%\"\/><\/center><br \/>\n&nbsp; <\/p>\n<p>The output shows the memory used at each line, the increment with each string creation, and the string deletion step freeing up some of the used memory.<\/p>\n<p>&nbsp;<\/p>\n<h2>Running the mprof Command\u00a0<\/h2>\n<p>&nbsp;<\/p>\n<p>Instead of running the Python script as shown above, you can also run the <code style=\"background: #F5F5F5;\">mprof<\/code> command like so:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>mprof run --python main.py<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>When you run this command, you should also see a .dat file with the memory usage data. 
You\u2019ll get a new .dat file each time you run the <code style=\"background: #F5F5F5;\">mprof<\/code> command, identified by its timestamp.<\/p>\n<p>&nbsp;<br \/>\n<center><img decoding=\"async\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/c_introduction_memory_profiling_python_3.png\" alt=\"Introduction to Memory Profiling in Python\" width=\"100%\"\/><\/center><\/p>\n<p>&nbsp;<\/p>\n<h2>Plotting Memory Usage\u00a0<\/h2>\n<p>&nbsp;<\/p>\n<p>Sometimes it's easier to analyze memory usage from a plot instead of looking at numbers. Recall that matplotlib is required to use the plotting capabilities.\u00a0<\/p>\n<p>You can use the <code style=\"background: #F5F5F5;\">mprof plot<\/code> command to plot the data in the .dat file and save it to an image file (here output.png):<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>mprof plot -o output.png<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>By default, <code style=\"background: #F5F5F5;\">mprof plot<\/code> uses the data from the most recent run of the <code style=\"background: #F5F5F5;\">mprof<\/code> command.<\/p>\n<p>&nbsp;<br \/>\n<center><img decoding=\"async\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/c_introduction_memory_profiling_python_2.png\" alt=\"Introduction to Memory Profiling in Python\" width=\"100%\"\/><\/center><br \/>\n&nbsp; <\/p>\n<p>The plot also shows the timestamps.<\/p>\n<p>&nbsp;<\/p>\n<h2>Logging Memory Usage Profile to a Log File<\/h2>\n<p>&nbsp;<\/p>\n<p>Alternatively, you can log the memory usage statistics to a log file of your choice in the working directory. 
Here, we open the log file as the file object <code style=\"background: #F5F5F5;\">mem_logs<\/code>, and set the <code style=\"background: #F5F5F5;\">stream<\/code> argument in the <code style=\"background: #F5F5F5;\">@profile<\/code> decorator to that file object:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code># main.py\r\n\r\nfrom memory_profiler import profile\r\n\r\nmem_logs = open('mem_profile.log','a')\r\n\r\n@profile(stream=mem_logs)\r\ndef process_strs(reps=10**6):\r\n\tstr1 = 'python'*reps\r\n\tstr2 = 'programmer'*reps\r\n\tstr3 = str1 + str2\r\n\tdel str2\r\n\treturn str3\r\n\r\nprocess_strs(reps=10**7)<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>When you run the script now, you should see the mem_profile.log file in your working directory with the following contents:<\/p>\n<p>&nbsp;<br \/>\n<center><img decoding=\"async\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/c_introduction_memory_profiling_python_4.png\" alt=\"Introduction to Memory Profiling in Python\" width=\"100%\"\/><\/center><\/p>\n<p>&nbsp;<\/p>\n<h1>Profiling Using the memory_usage Function\u00a0<\/h1>\n<p>&nbsp;<\/p>\n<p>You can also use the <code style=\"background: #F5F5F5;\">memory_usage()<\/code> function to measure the memory a specific function uses as it executes, sampled at regular time intervals.<\/p>\n<p>The <code style=\"background: #F5F5F5;\">memory_usage<\/code> function takes a tuple containing the function to profile, its positional arguments, and its keyword arguments.<\/p>\n<p>Here, we\u2019d like to find the memory usage of the <code style=\"background: #F5F5F5;\">process_strs<\/code> function with the keyword argument <code style=\"background: #F5F5F5;\">reps<\/code> set to 10**7. 
We also set the sampling interval to 0.1 s:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code># main.py\r\n\r\nfrom memory_profiler import memory_usage\r\n\r\ndef process_strs(reps=10**6):\r\n\tstr1 = 'python'*reps\r\n\tstr2 = 'programmer'*reps\r\n\tstr3 = str1 + str2\r\n\tdel str2\r\n\treturn str3\r\n\r\nprocess_strs(reps=10**7)\r\n\r\nmem_used = memory_usage((process_strs,(),{'reps':10**7}),interval=0.1)\r\nprint(mem_used)<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>Here\u2019s the corresponding output, with memory usage in MiB:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>Output >>>\r\n[21.21875, 21.71875, 147.34375, 277.84375, 173.93359375]<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>You can also adjust the sampling interval based on how often you want the memory usage to be captured. For example, setting the interval to 0.01 s gives a more granular view of the memory used.<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code># main.py\r\n\r\nfrom memory_profiler import memory_usage\r\n\r\ndef process_strs(reps=10**6):\r\n\tstr1 = 'python'*reps\r\n\tstr2 = 'programmer'*reps\r\n\tstr3 = str1 + str2\r\n\tdel str2\r\n\treturn str3\r\n\r\nprocess_strs(reps=10**7)\r\n\r\nmem_used = memory_usage((process_strs,(),{'reps':10**7}),interval=0.01)\r\nprint(mem_used)<\/code><\/pre>\n<\/div>\n<p>&nbsp; <\/p>\n<p>You should see output similar to this:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>Output >>>\r\n[21.40234375, 21.90234375, 33.90234375, 46.40234375, 59.77734375, 72.90234375, 85.65234375, 98.40234375, 112.65234375, 127.02734375, 141.27734375, 155.65234375, 169.77734375, 
184.02734375, 198.27734375, 212.52734375, 226.65234375, 240.40234375, 253.77734375, 266.52734375, 279.90234375, 293.65234375, 307.40234375, 321.27734375, 227.71875, 174.1171875]<\/code><\/pre>\n<\/div>\n<p>&nbsp;<\/p>\n<h1>Conclusion<\/h1>\n<p>&nbsp;<\/p>\n<p>In this tutorial, we learned how to get started with profiling Python scripts for memory usage.<\/p>\n<p>Specifically, we learned how to do this using the memory-profiler package. We used the <code style=\"background: #F5F5F5;\">@profile<\/code> decorator and the <code style=\"background: #F5F5F5;\">memory_usage()<\/code> function to get the memory usage of a sample Python script. We also learned how to plot the memory usage and capture the stats in a log file.<\/p>\n<p>If you\u2019re interested in profiling your Python script for execution times, consider reading <a href=\"\/profiling-python-code-using-timeit-and-cprofile\" rel=\"noopener\" target=\"_blank\">Profiling Python Code Using timeit and cProfile<\/a>.<br \/>\n&nbsp;<br \/>\n&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"So where did all the memory go? 
To find out, learn how to profile your Python code for memory usage using the memory-profiler package.\n","protected":false},"author":390,"featured_media":164090,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_seopress_robots_primary_cat":"none","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","inline_featured_image":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"mc4wp_mailchimp_campaign":[],"footnotes":"","_links_to":"","_links_to_target":""},"categories":[5286],"tags":[203],"class_list":["post-164087","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-kdnuggets-originals","tag-python"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/posts\/164087","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/users\/390"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/comments?post=164087"}],"version-history":[{"count":3,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/posts\/164087\/revisions"}],"predecessor-version":[{"id":164097,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/posts\/164087\/revisions\/164097"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/media\/164090"}],"wp:attachment":[{"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/media?parent=164087"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/categories?post=164087"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdnuggets.com\/wp-json\/wp\/v2\/tags?post=164087"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}